Skip to content

xr.merge fails when passing dict #2948

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mathause opened this issue May 7, 2019 · 5 comments · Fixed by #2971
Closed

xr.merge fails when passing dict #2948

mathause opened this issue May 7, 2019 · 5 comments · Fixed by #2971

Comments

@mathause
Copy link
Collaborator

mathause commented May 7, 2019

Code Sample

import xarray as xr

a = xr.DataArray([1, 2], dims=('foo', ), name='a')
b = xr.DataArray([3], dims=('bar', ), name='b')

objects = dict(a=a, b=b)
xr.merge(objects)

Problem description

This returns an AttributeError, while the documentation states that a dict is a valid input.

def merge(objects, compat='no_conflicts', join='outer'):
    """Merge any number of xarray objects into a single Dataset as variables.

    Parameters
    ----------
    objects : Iterable[Union[xarray.Dataset, xarray.DataArray, dict]]
        Merge together all variables from these objects. If any of them are
        DataArray objects, they must have a name.

...

Expected Output

Either the docs or the dict_like_objects loop should be changed.

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.1 | packaged by conda-forge | (default, Feb 18 2019, 01:42:00) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 4.4.138-59-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: en_GB.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.5.0.1
pydap: None
h5netcdf: 0.7.1
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: 1.2.0
PseudonetCDF: None
rasterio: 1.0.22
cfgrib: 0.9.6.2
iris: 2.2.0
bottleneck: 1.2.1
dask: 1.1.5
distributed: 1.26.1
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 41.0.0
pip: 19.0.3
conda: None
pytest: 4.4.0
IPython: 7.4.0
sphinx: 2.0.1

@shoyer
Copy link
Member

shoyer commented May 7, 2019

You can pass an iterable of dicts into merge, not a single dict, e.g.,

>>> xr.merge([objects])
<xarray.Dataset>
Dimensions:  (bar: 1, foo: 2)
Dimensions without coordinates: bar, foo
Data variables:
    a        (foo) int64 1 2
    b        (bar) int64 3

So I think this is working as expected.

@max-sixty
Copy link
Collaborator

+1

Though I think for these we should have better error messages.

Ideally I think we'd a) check it were an iterable, and b) check each item as processed for one of the valid types, raising with a clear message on failures. Thoughts?

@shoyer
Copy link
Member

shoyer commented May 7, 2019

Though I think for these we should have better error messages.

Ideally I think we'd a) check it were an iterable, and b) check each item as processed for one of the valid types, raising with a clear message on failures. Thoughts?

Yes, absolutely!

@mathause
Copy link
Collaborator Author

mathause commented May 8, 2019

Yes, you are right - I did not take the Iterable[...] part in the docs seriously enough.

Would it make sense to pack them?

if isinstance(objects, (xr.DataArray, xr.Dataset, dict)):
    objects = [objects]

Because the alternative is to raise and then the user has to do [objects] himself, which seems a bit nonsensical to me. Although it is also nonsensical to pass a single DataArray or a Dataset to xr.merge, in contrast to dict.

Then we would need to check isinstance(obj, (xr.DataArray, xr.Dataset, dict)) in the in the dict_like_objects-loop and if not raise a ValueError?

@max-sixty
Copy link
Collaborator

I'm trying to think through if there would ever be ambiguity between packed and as specified; I think it's fine.

One note is that I'm not sure it's that helpful though. In the case listed above, it's just as good to put them in a Dataset:

In [2]: xr.merge([objects])
Out[2]:
<xarray.Dataset>
Dimensions:  (bar: 1, foo: 2)
Dimensions without coordinates: bar, foo
Data variables:
    a        (foo) int64 1 2
    b        (bar) int64 3

In [3]: xr.Dataset(objects)
Out[3]:
<xarray.Dataset>
Dimensions:  (bar: 1, foo: 2)
Dimensions without coordinates: bar, foo
Data variables:
    a        (foo) int64 1 2
    b        (bar) int64 3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants