reindex doesn't preserve chunks #2745

Closed
davidbrochart (Contributor) opened this issue Feb 5, 2019 · 2 comments

The following code creates a small (100x100) chunked DataArray, and then re-indexes it into a huge one (100000x100000):

import xarray as xr
import numpy as np

n = 100
x = np.arange(n)
y = np.arange(n)
da = xr.DataArray(np.zeros(n*n).reshape(n, n), coords=[x, y], dims=['x', 'y']).chunk({'x': n, 'y': n})

n2 = 100000
x2 = np.arange(n2)
y2 = np.arange(n2)
da2 = da.reindex({'x': x2, 'y': y2})
da2

But the re-indexed DataArray has chunksize=(100000, 100000) instead of chunksize=(100, 100):

<xarray.DataArray (x: 100000, y: 100000)>
dask.array<shape=(100000, 100000), dtype=float64, chunksize=(100000, 100000)>
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 ... 99994 99995 99996 99997 99998 99999
  * y        (y) int64 0 1 2 3 4 5 6 ... 99994 99995 99996 99997 99998 99999

This immediately leads to a memory error when trying to, for example, store it to a zarr archive:

ds2 = da2.to_dataset(name='foo')
ds2.to_zarr(store='foo', mode='w')

Re-chunking to 100x100 before storing doesn't help either; it just takes much longer before crashing with a memory error:

da3 = da2.chunk({'x': n, 'y': n})
ds3 = da3.to_dataset(name='foo')
ds3.to_zarr(store='foo', mode='w')
@shoyer (Member) commented Feb 5, 2019

To understand what's happening here, it may be helpful to look at what's going on inside dask:

In [16]: x = np.arange(5)

In [17]: da = xr.DataArray(np.ones(5), coords=[('x', x)]).chunk(-1)

In [18]: da
Out[18]:
<xarray.DataArray (x: 5)>
dask.array<shape=(5,), dtype=float64, chunksize=(5,)>
Coordinates:
  * x        (x) int64 0 1 2 3 4

In [19]: da.reindex({'x': np.arange(20)})
Out[19]:
<xarray.DataArray (x: 20)>
dask.array<shape=(20,), dtype=float64, chunksize=(20,)>
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

In [20]: da.reindex({'x': np.arange(20)}).data.dask
Out[20]: <dask.sharedict.ShareDict at 0x3201d72e8>

In [21]: dict(da.reindex({'x': np.arange(20)}).data.dask)
Out[21]:
{('where-8e0018fae0773d202c09fde132189347', 0): (subgraph_callable,
  ('eq-8167293bb8136be2934a8bf111095d8f', 0),
  array(nan),
  ('getitem-0eab360ba0dee5a5c3fbded0fdfd70e3', 0)),
 ('eq-8167293bb8136be2934a8bf111095d8f', 0): (subgraph_callable,
  ('array-5ddc8bae2e6cf87c0bac846c6da4d27f', 0),
  -1),
 ('array-5ddc8bae2e6cf87c0bac846c6da4d27f',
  0): array([ 0,  1,  2,  3,  4, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1]),
 ('xarray-<this-array>-765734894ab8f05a57335ea1064da549',
  0): (<function dask.array.core.getter(a, b, asarray=True, lock=None)>, 'xarray-<this-array>-765734894ab8f05a57335ea1064da549', (slice(0, 5, None),)),
 'xarray-<this-array>-765734894ab8f05a57335ea1064da549': ImplicitToExplicitIndexingAdapter(array=NumpyIndexingAdapter(array=array([1., 1., 1., 1., 1.]))),
 ('getitem-0eab360ba0dee5a5c3fbded0fdfd70e3',
  0): (<function _operator.getitem(a, b, /)>, ('xarray-<this-array>-765734894ab8f05a57335ea1064da549',
   0), (array([0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]),))}

Xarray isn't controlling chunk sizes directly, but it turns reindex({'x': x2}) into an indexing operation like data[np.array([0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4])] under the covers, with the last element repeated -- in your case 100000 times. See xarray.Variable._getitem_with_mask for the implementation on the xarray side.
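
Here is a minimal sketch of that clamp-then-mask indexing using plain dask.array (purely illustrative, not xarray's actual code; the variable names are made up):

import dask.array as dsa
import numpy as np

data = dsa.ones(5, chunks=5)

# reindexing onto np.arange(20) builds an integer indexer in which missing
# labels are marked with -1
indexer = np.array([0, 1, 2, 3, 4] + [-1] * 15)

# the -1 entries are clamped to the last valid position for the indexing step...
clamped = np.where(indexer == -1, data.shape[0] - 1, indexer)
# ...and masked with the fill value (NaN) afterwards
result = dsa.where(indexer == -1, np.nan, data[clamped])

print(result.chunks)  # ((20,),): a single chunk, mirroring the huge chunk above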

The alternative design would be to append an array of all NaNs along one axis, but on average I think the current implementation is faster and results in more contiguous chunks -- it's quite common to intersperse missing indices with reindex(), and alternating indexed/missing values can result in tiny chunks. Even then I think you would probably run into performance issues -- I don't think dask.array.full() uses a default chunk size.
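
For comparison, a rough sketch of that pad-with-NaN alternative in plain dask.array (again illustrative; chunks are passed to full() explicitly here to sidestep the default-chunking question):

import dask.array as dsa
import numpy as np

data = dsa.ones(5, chunks=5)
n_missing = 15

# build the block of fill values and append it along the axis
padding = dsa.full(n_missing, np.nan, chunks=n_missing)
padded = dsa.concatenate([data, padding])

print(padded.chunks)  # ((5, 15),): contiguous here, but interspersed missing
                      # labels would instead produce many tiny chunks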

We could also conceivably put some heuristics to control chunking for this in xarray, but I'd rather do it upstream in dask.array, if possible (xarray tries to avoid thinking about chunks).

@phofl (Contributor) commented Aug 14, 2024

This was fixed with the latest Dask release 🥳
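
As a quick check against a recent dask/xarray install (illustrative; the exact chunk sizes depend on the dask version and its chunk-size configuration), rerunning the original example should no longer produce a single (100000, 100000) chunk:

import numpy as np
import xarray as xr

n, n2 = 100, 100000
da = xr.DataArray(np.zeros((n, n)), coords=[np.arange(n), np.arange(n)],
                  dims=['x', 'y']).chunk({'x': n, 'y': n})
da2 = da.reindex({'x': np.arange(n2), 'y': np.arange(n2)})
print(da2.chunksizes)  # expect many moderately sized chunks rather than one huge one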
