Dataset.broadcast_like(other) should broadcast against like variables in other #6549

Open
headtr1ck opened this issue Apr 30, 2022 · 6 comments

@headtr1ck
Collaborator

Is your feature request related to a problem?

I am a bit puzzled about how xarray broadcasts Datasets.
It seems to always add all dimensions to all variables.
Is this what you want in general?

See this example:

import xarray as xr

da = xr.DataArray([[1, 2, 3]], dims=("x", "y"))
# <xarray.DataArray (x: 1, y: 3)>
# array([[1, 2, 3]])
ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})
# <xarray.Dataset>
# Dimensions:  (x: 1, z: 2)
# Dimensions without coordinates: x, z
# Data variables:
#     a        (x) int32 1
#     b        (z) int32 2 3
ds.broadcast_like(da)

# returns:
# <xarray.Dataset>
# Dimensions:  (x: 1, y: 3, z: 2)
# Dimensions without coordinates: x, y, z
# Data variables:
#     a        (x, y, z) int32 1 1 1 1 1 1
#     b        (x, y, z) int32 2 3 2 3 2 3

# I think it should return:
# <xarray.Dataset>
# Dimensions:  (x: 1, y: 3, z: 2)
# Dimensions without coordinates: x, y, z
# Data variables:
#     a        (x, y) int32 1 1 1  # notice here without "z" dim
#     b        (x, y, z) int32 2 3 2 3 2 3

Describe the solution you'd like

I would like broadcasting to behave the same way as e.g. a simple addition.
In the example above, da + ds produces the dimensions that I want.

Describe alternatives you've considered

ds + xr.zeros_like(da): this works, but seems more like a "dirty hack".
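
A minimal sketch of the two statements above, reusing the da and ds from the example: plain arithmetic and the zeros_like workaround both broadcast each data variable against da on its own, so "a" should not pick up the "z" dimension. Dimension names are sorted before printing because the sketch makes no assumption about the order in which xarray returns them.

```python
import xarray as xr

da = xr.DataArray([[1, 2, 3]], dims=("x", "y"))
ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})

# Arithmetic broadcasts every data variable against `da` independently.
summed = da + ds

# The workaround adds zeros shaped like `da`: values stay the same,
# but each variable picks up the dimensions of `da`.
workaround = ds + xr.zeros_like(da)

for name in ds.data_vars:
    print(name, sorted(summed[name].dims), sorted(workaround[name].dims))
```

The workaround costs an extra allocation and an addition, which may be part of why it feels like a hack rather than a broadcasting API.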

Additional context

Maybe one can add an option to broadcasting that controls this behavior?

@keewis
Collaborator

keewis commented Apr 30, 2022

see also #6304 which covers xr.broadcast

@headtr1ck
Collaborator Author

> see also #6304 which covers xr.broadcast

I tried adding a join input to Dataset.broadcast_like and passing it to align, but that did not work (at least for join="inner"). Still got the same result...

@headtr1ck
Collaborator Author

related to #6227

@dcherian dcherian changed the title Improved Dataset broadcasting Add a broadcasting mode that inserts size-1 labeled dimensions Feb 6, 2025
@dcherian dcherian changed the title Add a broadcasting mode that inserts size-1 labeled dimensions Add a broadcasting mode that inserts size-1 unlabeled dimension Feb 6, 2025
@dcherian dcherian changed the title Add a broadcasting mode that inserts size-1 unlabeled dimension Improved Dataset.broadcasting Feb 6, 2025
@dcherian
Contributor

dcherian commented Feb 6, 2025

I keep misunderstanding this issue so typing this out to make sure I got it right.

Writing out dimension names in square brackets

ds['a': ['x'], 'b': ['z']].broadcast_like(da: ['x', 'y']) -> ds['a': ['x', 'y'], 'b': ['x', 'y', 'z']]

IIUC the request is to avoid broadcasting the variables in ds against each other, and to only broadcast each variable against da separately. Did I get it right?
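
A rough sketch of that reading, assuming the request is exactly "broadcast each variable against da separately". The helper name broadcast_each_like is hypothetical and not part of xarray; it only combines the existing DataArray.broadcast_like with the Dataset constructor.

```python
import xarray as xr

da = xr.DataArray([[1, 2, 3]], dims=("x", "y"))
ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})


def broadcast_each_like(dataset, other):
    """Hypothetical helper (not an xarray API): broadcast every data
    variable of `dataset` against `other` on its own, never against
    the other variables of `dataset`."""
    return xr.Dataset(
        {name: var.broadcast_like(other) for name, var in dataset.data_vars.items()}
    )


result = broadcast_each_like(ds, da)
for name in result.data_vars:
    print(name, sorted(result[name].dims))
```

Rebuilding the Dataset this way drops dataset-level attributes and any coordinates not attached to a data variable, so it illustrates the requested semantics and is not a drop-in replacement.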

@dcherian dcherian changed the title Improved Dataset.broadcasting Dataset.broadcast_like(other) should broadcast against like variables in other Feb 6, 2025
@dcherian
Contributor

dcherian commented Feb 6, 2025

@mjwillson posted a nice summary in #10031 :

> If I had to summarize the overall problem, it's that the behaviour of xarray.broadcast (and Dataset.broadcast_like etc.) is not consistent with how actual arithmetic operations broadcast, in cases where Datasets are involved.

@alvarosg

alvarosg commented Feb 6, 2025

In the light of #10031, and broadcasting a dataset to itself not behaving as a no-op, should we label this as "bug" rather than "enhancement"?
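
The no-op point can be checked with a short sketch like the one below, reusing the ds from the original example. The outcome depends on the installed xarray version, so no expected output is shown; on versions with the behavior described in this issue, the check would be expected to report that dimensions changed.

```python
import xarray as xr

ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})

# Broadcasting a Dataset against itself; one might expect nothing to change.
same = ds.broadcast_like(ds)

# True only if every variable kept exactly its original dimensions.
is_noop = all(same[name].dims == ds[name].dims for name in ds.data_vars)
print("no-op:", is_noop)
for name in ds.data_vars:
    print(name, ds[name].dims, "->", same[name].dims)
```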

4 participants