Fixes #2198: Drop chunksizes only when original_shape is different, not when it isn't found #2207

Merged
merged 10 commits into from
Jun 6, 2019
4 changes: 4 additions & 0 deletions doc/whats-new.rst
@@ -21,6 +21,10 @@ v0.12.2 (unreleased)
 Enhancements
 ~~~~~~~~~~~~

+
+- netCDF chunksizes are now only dropped when original_shape is different,
+  not when it isn't found. (:issue:`2207`)
+  By `Karel van de Plassche <https://github.com/Karel-van-de-Plassche>`_.
 - Enable `@` operator for DataArray. This is equivalent to :py:meth:`DataArray.dot`
   By `Maximilian Roos <https://github.com/max-sixty>`_.
 - Add ``fill_value`` argument for reindex, align, and merge operations
4 changes: 3 additions & 1 deletion xarray/backends/netCDF4_.py
@@ -206,7 +206,9 @@ def _extract_nc4_variable_encoding(variable, raise_on_invalid=False,
         chunks_too_big = any(
             c > d and dim not in unlimited_dims
             for c, d, dim in zip(chunksizes, variable.shape, variable.dims))
-        changed_shape = encoding.get('original_shape') != variable.shape
+        has_original_shape = 'original_shape' in encoding
+        changed_shape = (has_original_shape and
+                         encoding.get('original_shape') != variable.shape)
         if chunks_too_big or changed_shape:
             del encoding['chunksizes']
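
The decision this hunk changes can be sketched outside xarray as a small standalone function. This is only an illustration under assumptions: `should_drop_chunksizes` is a hypothetical name that does not exist in xarray, the early return for a missing `chunksizes` entry is assumed, and `shape`, `dims`, and `unlimited_dims` stand in for the attributes of the real `variable`.

```python
def should_drop_chunksizes(encoding, shape, dims, unlimited_dims=()):
    """Hypothetical helper mirroring the check in this diff (not xarray API).

    Returns True when the stored 'chunksizes' entry should be discarded.
    """
    chunksizes = encoding.get('chunksizes')
    if chunksizes is None:
        # Assumption: nothing to drop when no chunksizes were recorded.
        return False

    # A chunk larger than its dimension is invalid unless the dimension
    # is unlimited.
    chunks_too_big = any(
        c > d and dim not in unlimited_dims
        for c, d, dim in zip(chunksizes, shape, dims))

    # The shape counts as changed only if an original shape was recorded
    # and it differs from the current one.
    has_original_shape = 'original_shape' in encoding
    changed_shape = (has_original_shape and
                     encoding.get('original_shape') != shape)

    return chunks_too_big or changed_shape


# No 'original_shape' recorded: chunksizes are kept.
print(should_drop_chunksizes({'chunksizes': (2,)}, (3,), ('x',)))
# 'original_shape' recorded and different: chunksizes are dropped.
print(should_drop_chunksizes(
    {'chunksizes': (2,), 'original_shape': (5,)}, (3,), ('x',)))
```

With the old one-line check, the first call would have reported a changed shape, because `encoding.get('original_shape')` returns `None`, which never equals the variable's shape.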

12 changes: 12 additions & 0 deletions xarray/tests/test_backends.py
@@ -1134,6 +1134,18 @@ def test_encoding_kwarg_compression(self):

         assert ds.x.encoding == {}

+    def test_keep_chunksizes_if_no_original_shape(self):
+        ds = Dataset({'x': [1, 2, 3]})
+        chunksizes = (2, )
+        ds.variables['x'].encoding = {
+            'chunksizes': chunksizes
+        }
+
+        with self.roundtrip(ds) as actual:
+            assert_identical(ds, actual)
+            assert_array_equal(ds['x'].encoding['chunksizes'],
+                               actual['x'].encoding['chunksizes'])
+
     def test_encoding_chunksizes_unlimited(self):
         # regression test for GH1225
         ds = Dataset({'x': [1, 2, 3], 'y': ('x', [2, 3, 4])})