Skip to content

Conversation

astrofrog
Copy link
Member

@astrofrog astrofrog commented Jul 22, 2025

This is very much a WIP and not ready for usage but just opening this to not lose my progress.

I think we need to write down the rules for block sizes, and broadcasted reprojection, and I think it's ok to not cover all arbitrary cases. I'm now using the terminology 'reprojected dimensions' and 'non-reprojected dimensions', where the latter are essentially dimensions where we assume the mapping is 1-to-1 between input and output.

Proposed rules:

  • The dimensionality of the overall reprojection is determined by the shape of the input data (the array itself, not the input data). Call this ndim.
  • If wcs_in has fewer pixel dimensions than ndim, then we assume that the input WCS applies to the last wcs_in.pixel_n_dim dimensions of the input data, and that leading dimensions are the non-reprojected ones. The same applies to wcs_out.
  • shape_out should either have ndim elements, or match the number of reprojected dimensions. If the latter, then any missing elements for non-reprojected dimensions should be set to be the same as the input.
  • block_size, if specified, should also either have ndim elements, or match the number of reprojected dimensions. If the latter, then any missing elements for non-reprojected dimensions should be either set to 1 if block_size equals shape_out for reprojected dimensions, or -1 otherwise
  • block_size should either match shape_out for reprojected or for non-reprojected dimensions (that is, we either chunk over non-reprojected dimensions only, or over reprojected dimensions only). If it matches shape_out for reprojected dimensions, then the remaining block_size should be 1 so that only a single slice is reprojected at a time (yes there may be cases where someone wants to process slices in e.g. time or spectral slices, but this is going to introduce more complexity when the input WCS has dimension ndim so we punt on this for now).
  • If specified, non_reprojected_dims should for now be only leading dimensions starting from 0, so e.g. (0,), (0, 1) and so on. (1,2) is not allowed for now. This could be relaxed in future potentially at the cost of more complexity.
  • In principle one could imagine examples where different dimensions need to be ignored in the input and output, for example if the WCS order is different. I don't really want to cross that bridge, but if we ever do want to do this, the non_reprojected_dims option could take a list of tuples where each tuple gives the (input_dim, output_dim) correspondence.
  • If non_reprojected_dims is specified, then if wcs_in or wcs_out have fewer dimensions than ndim, the difference in the number of dimensions should match the length of non_reprojected_dims.

For cases where the third axis is completely decoupled from the spatial axes in cubes, non_reprojected_dims isn't strictly needed because the WCSes could be easily sliced down to 2D. However, we do need that option for cases where the input WCS is 3D in the sense that each spatial slice might be different (for example due to drift at different times) but where each time still corresponds cleanly to one 2D slice and there is no dependence of e.g. time position on spatial position.

Complex examples that should work:

  • If a 3D dataset is reprojected from a 3D spectral WCS to a 3D time WCS, if the spectral and time axes are the third WCS axis (first numpy axis) then if non_reprojected_dims=(0,), it should work and basically only care about the spatial to spatial conversion. This example doesn't make a huge amount of sense, but another more realistic example is that if one wants to align two spectral cubes spatially without touching the spectral axis, non_reprojected_dims=(0,) could do this.

Copy link

codecov bot commented Jul 22, 2025

Codecov Report

❌ Patch coverage is 80.35714% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.43%. Comparing base (dc4b0e2) to head (0c503ea).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
reproject/common.py 81.63% 9 Missing ⚠️
reproject/mosaicking/coadd.py 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #539      +/-   ##
==========================================
- Coverage   88.30%   87.43%   -0.87%     
==========================================
  Files          28       28              
  Lines        1411     1441      +30     
==========================================
+ Hits         1246     1260      +14     
- Misses        165      181      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@astrofrog astrofrog force-pushed the broadcasting-with-nd-wcs branch from ddcca20 to 37561e9 Compare July 22, 2025 13:47
@astrofrog
Copy link
Member Author

astrofrog commented Jul 22, 2025

Ok so we need to add proper support for non_reprojected_dims to reproject_and_coadd rather than the current hack. And then of course a lot of documentation and tests!


# Determine how many extra broadcasted dimensions are present
n_broadcasted = len(shape_out) - wcs_in.low_level_wcs.pixel_n_dim
n_broadcasted = len(shape_out) - wcs_out.low_level_wcs.pixel_n_dim
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hack currently!

@astrofrog astrofrog force-pushed the broadcasting-with-nd-wcs branch from 37561e9 to 0c503ea Compare August 4, 2025 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant