to_netcdf broken encoding: dtype='S1' + chunksizes #2219

Closed
crusaderky opened this issue Jun 7, 2018 · 3 comments
Labels
plan to close (May be closeable, needs more eyeballs)

Comments

@crusaderky
Contributor

import xarray

# Raises ValueError: 'chunksizes' is given for the 1-D variable as the user sees it
xarray.Dataset({'x': ['foo', 'bar', 'baz']}).to_netcdf(
    'foo.nc', engine='h5netcdf',
    encoding={'x': {'dtype': 'S1', 'zlib': True, 'chunksizes': (2, )}})

ValueError: "chunks" must have same rank as dataset shape

Same with engine='netcdf4'. The issue is present in 0.10.6 as well as in 0.10.3.
The problem is evidently that dtype='S1' changes the shape of the variable before it is passed to the backend, but does not make the matching change to any chunksizes setting that was supplied.
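To make the rank mismatch concrete, here is a minimal pure-Python sketch. It assumes that encoding to single characters appends one trailing dimension whose length is the byte width of the longest string; the helper names are hypothetical and are not xarray functions.

```python
def char_encoded_shape(values):
    """Shape of a 1-D list of strings once stored one character per element.

    Assumption: a trailing character dimension is appended, sized to the
    longest string (measured in UTF-8 bytes).
    """
    width = max(len(v.encode("utf-8")) for v in values)
    return (len(values), width)


def rank_matches(chunksizes, shape):
    """True when chunksizes has one entry per dimension of shape."""
    return len(chunksizes) == len(shape)


values = ["foo", "bar", "baz"]
shape = char_encoded_shape(values)

print(shape)                         # (3, 3)
print(rank_matches((2,), shape))     # False: the case that raises
print(rank_matches((2, 3), shape))   # True
```

Under that assumption, the user-supplied (2, ) describes the variable as the user sees it, while the backend receives a two-dimensional array.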

The workaround is to omit chunksizes or set it to True.
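A small sketch of the first workaround, assuming the encoding is an ordinary nested dict as in the example above. The helper `without_chunksizes` is hypothetical, not part of xarray; it only strips the key before the dict is handed to to_netcdf.

```python
import copy


def without_chunksizes(encoding):
    """Return a copy of a per-variable encoding dict with 'chunksizes' removed."""
    cleaned = copy.deepcopy(encoding)
    for var_encoding in cleaned.values():
        var_encoding.pop("chunksizes", None)
    return cleaned


encoding = {"x": {"dtype": "S1", "zlib": True, "chunksizes": (2,)}}
safe_encoding = without_chunksizes(encoding)
print(safe_encoding)  # {'x': {'dtype': 'S1', 'zlib': True}}

# The cleaned dict can then be passed on, e.g.:
# xarray.Dataset({'x': ['foo', 'bar', 'baz']}).to_netcdf(
#     'foo.nc', engine='h5netcdf', encoding=safe_encoding)
```

The original dict is left untouched, so the chunking request can be restored once the underlying issue is resolved.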

@shoyer
Member

shoyer commented Jun 8, 2018

It looks like this version works:

xarray.Dataset({'x': ['foo', 'bar', 'baz']}).to_netcdf(
    'foo.nc', engine='h5netcdf',
    encoding={'x': {'dtype': 'S1', 'zlib': True, 'chunksizes': (2, 3)}})

I suppose we could update chunksizes to accept both versions? Or just clearly document this behavior?

@crusaderky
Contributor Author

crusaderky commented Jun 14, 2018

IMHO the trick that alters the shape of the array is strictly an implementation detail which should not be exposed to the end user. If the implementation of xarray alters the shape of the variable, it should also alter anything that relies on that shape. So I think that chunksizes=(2, 3) should not be accepted as a valid input.
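The two positions in this thread can be sketched as one normalization step. This is a hypothetical helper, not xarray's implementation; it assumes the character dimension is appended last and is chunked at its full width, which matches the (2, 3) that works above. The `allow_encoded_rank` flag is an illustrative name for choosing between "accept both versions" and the stricter behavior.

```python
def normalize_chunksizes(chunksizes, user_shape, char_width, allow_encoded_rank=True):
    """Map user-facing chunksizes onto the character-encoded variable.

    Hypothetical sketch: 'user_shape' is the shape before encoding and
    'char_width' is the length of the appended character dimension.
    """
    chunksizes = tuple(chunksizes)
    user_rank = len(user_shape)
    if len(chunksizes) == user_rank:
        # Chunks given for the shape the user sees: extend for the char dimension.
        return chunksizes + (char_width,)
    if allow_encoded_rank and len(chunksizes) == user_rank + 1:
        # Chunks already given for the encoded shape: pass through unchanged.
        return chunksizes
    raise ValueError(
        f"chunksizes {chunksizes} does not match variable shape {user_shape}"
    )


print(normalize_chunksizes((2,), (3,), 3))     # (2, 3)
print(normalize_chunksizes((2, 3), (3,), 3))   # (2, 3)

try:
    normalize_chunksizes((2, 3), (3,), 3, allow_encoded_rank=False)
except ValueError as err:
    print("rejected:", err)
```

With `allow_encoded_rank=False`, only chunksizes matching the shape the user sees are accepted, so the reshaping stays an implementation detail.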

@max-sixty
Collaborator

As part of keeping our issue count <1000, closing as unlikely to inspire change. Please reopen if anyone disagrees.

@max-sixty closed this as not planned on Aug 28, 2024