Writing Datasets to netCDF4 with "inconsistent" chunks #2254
For reference, here's the full traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-13-6a835b914234> in <module>()
12
13 # Save to a netCDF4 file.
---> 14 dset.to_netcdf("test.nc")
~/dev/xarray/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute)
1148 engine=engine, encoding=encoding,
1149 unlimited_dims=unlimited_dims,
-> 1150 compute=compute)
1151
1152 def to_zarr(self, store=None, mode='w-', synchronizer=None, group=None,
~/dev/xarray/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, writer, encoding, unlimited_dims, compute)
701 # handle scheduler specific logic
702 scheduler = get_scheduler()
--> 703 if (dataset.chunks and scheduler in ['distributed', 'multiprocessing'] and
704 engine != 'netcdf4'):
705 raise NotImplementedError("Writing netCDF files with the %s backend "
~/dev/xarray/xarray/core/dataset.py in chunks(self)
1237 for dim, c in zip(v.dims, v.chunks):
1238 if dim in chunks and c != chunks[dim]:
-> 1239 raise ValueError('inconsistent chunks')
1240 chunks[dim] = c
1241 return Frozen(SortedKeysDict(chunks))
ValueError: inconsistent chunks

So yes, it looks like we could fix this by checking chunks on each array independently like you suggest. There's no reason why all dask arrays need to have the same chunking for storing with to_netcdf().
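A per-variable check along those lines could look like the sketch below. Note that have_chunks is a hypothetical helper name from this issue, and the SimpleNamespace objects are stdlib stand-ins for xarray Variable objects, whose .chunks attribute is a tuple of chunk tuples for dask-backed data and None otherwise:

```python
from types import SimpleNamespace

def have_chunks(variables):
    """True if any variable is dask-backed (its .chunks is not None),
    without requiring chunk sizes to agree across variables the way
    Dataset.chunks does."""
    return any(v.chunks is not None for v in variables)

# Stand-ins for xarray Variable objects: .chunks is a tuple of
# per-dimension chunk tuples for dask-backed data, or None otherwise.
variables = [
    SimpleNamespace(chunks=((5, 5),)),           # chunked in 5s along one dim
    SimpleNamespace(chunks=((2, 2, 2, 2, 2),)),  # same dim, chunked in 2s
    SimpleNamespace(chunks=None),                # numpy-backed, not chunked
]
print(have_chunks(variables))  # True
```

Inside to_netcdf this would replace the dataset.chunks truthiness test, so chunking that differs across variables no longer raises.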
This is because you need to indicate chunks for variables separately, via encoding: http://xarray.pydata.org/en/stable/io.html#writing-encoded-data
I could throw together a pull request if that's all that's involved.
Thanks! I was able to write chunked output to the netCDF file by adding chunksizes to the encoding attribute of the variables. I found I also had to specify original_shape as a workaround for #2198.
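Concretely, that workaround might look like the following; the variable name, shape, and chunk sizes here are made up for illustration:

```python
# Hypothetical variable: a 365 x 720 array chunked as 100 x 360 by dask.
shapes = {"temperature": (365, 720)}
chunks = {"temperature": (100, 360)}

# Per-variable encoding: "chunksizes" tells the netCDF4 backend how to
# chunk the data on disk, and "original_shape" is the workaround for the
# mismatch reported in #2198.
encoding = {
    name: {"chunksizes": chunks[name], "original_shape": shapes[name]}
    for name in shapes
}
print(encoding)

# With a real dataset this would be passed straight to to_netcdf:
# dset.to_netcdf("test.nc", encoding=encoding)
```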
Yes, a pull request would be appreciated!
…nsistent over all arrays. Closes pydata#2254.
Code Sample
The last line results in ValueError: inconsistent chunks (full traceback above).
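The original code sample did not survive extraction; a minimal reproducer consistent with the traceback, with assumed variable names and sizes, could be:

```python
import dask.array as da
import xarray as xr

# Two variables sharing dimension "x" but with different dask chunk sizes.
dset = xr.Dataset({
    "a": (("x",), da.zeros(10, chunks=5)),
    "b": (("x",), da.zeros(10, chunks=2)),
})

# Dataset.chunks cannot report a single chunking for "x", so accessing
# the property raises:
try:
    dset.chunks
except ValueError as err:
    print(type(err).__name__)  # ValueError

# At xarray 0.10.7, this line raised ValueError: inconsistent chunks,
# because to_netcdf consulted dset.chunks:
# dset.to_netcdf("test.nc")
```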
Problem description
This error is triggered by xarray.backends.api.to_netcdf's use of the dataset.chunks property in two places: xarray/backends/api.py line 703 (commit bb581ca) and xarray/backends/api.py line 709 (commit bb581ca).
I'm assuming to_netcdf only needs to know if chunks are being used, not necessarily if they're consistent? If I define a more general check and replace the instances of dataset.chunks with have_chunks, then the netCDF4 file gets written without any problems (although the data seems to be stored contiguously instead of chunked). Is this change as straightforward as I think, or is there something intrinsic about xarray.Dataset objects or writing to netCDF4 that requires consistent chunks?
Output of xr.show_versions()
commit: bb581ca
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-128-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
xarray: 0.10.7
pandas: 0.23.1
numpy: 1.14.5
scipy: None
netCDF4: 1.4.0
h5netcdf: None
h5py: None
Nio: None
zarr: None
bottleneck: None
cyordereddict: None
dask: 0.17.5
distributed: None
matplotlib: None
cartopy: None
seaborn: None
setuptools: 39.2.0
pip: 10.0.1
conda: None
pytest: None
IPython: None
sphinx: None