Behaviour from Dataset.broadcast_like is strange and inconsistent with how arithmetic ops on Datasets actually broadcast #10031
Comments
Broadcasting in the way you request is a no-op in Xarray-land, so you don't need
What we don't support yet is allowing broadcast to insert a size-1 unlabeled dimension of the same name. This I have found useful in combination with Admittedly, our documentation on this should be a lot better. See https://tutorial.xarray.dev/fundamentals/02.3_aligning_data_objects.html. cc @headtr1ck
So the example above was just a minimal reproducer to illustrate the nonsensical / inconsistent behaviour. In practice we run into related issues when broadcasting different datasets in less trivial cases. Also -- you say "Broadcasting in the way you request is a no-op" -- I agree it should be a no-op, but the above clearly illustrates that it isn't, which is kind of the point here, right? The other ticket is about broadcasting a Dataset against a DataArray. I think it's likely the same underlying cause, but if I had to summarize the overall problem, it's that behaviour of
Thanks for the clarification. Yes, I misunderstood your initial post. Apologies for that. I'm closing in favor of #6549.
No worries. Is it intentional that
If it is intentional it would be good to confirm, and to document what the behaviour is with Datasets. I rather suspect it's behaviour that most people wouldn't want or expect, although I may be missing the original motivation.
What happened?
What did you expect to happen?
I expected the shape to be consistent with how an actual arithmetic operation broadcasts:
This looks like a bug to me, but if the behaviour is intentional, can we please document the reason and draw attention to it with big alarm bells in the docs? It's very unexpected and can lead to undesired blow-ups in the size of arrays.
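The original reproducer is not preserved on this page, so the following is only a minimal sketch of the kind of mismatch described, assuming a Dataset whose two variables live on unrelated dimensions. The names `a`, `b`, `x` and `y` are made up for illustration. It prints the per-variable sizes after `Dataset.broadcast_like` next to those after ordinary arithmetic.

```python
import numpy as np
import xarray as xr

# Hypothetical dataset: two variables on unrelated dimensions.
ds = xr.Dataset(
    {
        "a": ("x", np.zeros(3)),
        "b": ("y", np.zeros(4)),
    }
)

# Broadcasting a dataset against itself might be expected to change nothing.
broadcasted = ds.broadcast_like(ds)

# Arithmetic between the same datasets is applied variable by variable.
summed = ds + ds

# Compare the dimensions each variable ends up with.
for name in ds.data_vars:
    print(name, "original:      ", dict(ds[name].sizes))
    print(name, "broadcast_like:", dict(broadcasted[name].sizes))
    print(name, "arithmetic:    ", dict(summed[name].sizes))
```

As the thread describes it, `broadcast_like` expands every variable over all dimensions of the dataset, while the arithmetic result leaves each variable on its own dimension. It is worth running this against your own install to confirm.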
Either way can we please have a version of the broadcast API which broadcasts datasets in the same way that arithmetic operations broadcast them?
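Until something like that exists, one possible workaround is to broadcast variable by variable. The sketch below assumes that is an acceptable approximation of what arithmetic does. `broadcast_like_per_variable` is a hypothetical helper, not part of xarray, built only on `DataArray.broadcast_like`.

```python
import numpy as np
import xarray as xr


def broadcast_like_per_variable(ds: xr.Dataset, other: xr.Dataset) -> xr.Dataset:
    """Hypothetical helper (not an xarray API).

    Broadcast each data variable in `ds` only against the variable of the
    same name in `other`, instead of against every dimension of `other`.
    """
    out = {}
    for name, da in ds.data_vars.items():
        if name in other.data_vars:
            # Only the dimensions of the matching variable are added.
            out[name] = da.broadcast_like(other[name])
        else:
            # Assumption: variables with no counterpart are kept unchanged.
            out[name] = da
    return xr.Dataset(out, attrs=ds.attrs)


# Illustrative data: "a" gains a "z" dimension in `other`, "b" does not.
ds = xr.Dataset({"a": ("x", np.zeros(3)), "b": ("y", np.zeros(4))})
other = xr.Dataset(
    {
        "a": (("x", "z"), np.zeros((3, 2))),
        "b": ("y", np.zeros(4)),
    }
)

result = broadcast_like_per_variable(ds, other)
for name in result.data_vars:
    print(name, dict(result[name].sizes))
```

This is not an exact match for arithmetic: Dataset arithmetic drops variables that are missing from either operand and aligns with an inner join by default, whereas this helper keeps unmatched variables and `broadcast_like` aligns with an outer join.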
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
Anything else we need to know?
This is related to #6549, which has been open as a feature request for ~3 years, although it is not quite the same. Opening a bug anyway with a more focused / minimal illustration of why this makes very little sense.
Environment
xarray: 2025.01.2
pandas: 2.2.3
numpy: 2.2.1
scipy: 1.13.1
netCDF4: 1.4.1
pydap: None
h5netcdf: 999
h5py: 3.11.0
zarr: 2.18.2
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.9.1
cartopy: None
seaborn: 0.12.2
numbagg: None
fsspec: 2023.3.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 0.dev0+unknown
pip: None
conda: None
pytest: None
mypy: None
IPython: 7.34.0
sphinx: None