Description
What happened?
I'm hitting the error below when calculating a rolling mean with a window length of 5 over an (x, y, time)
cube of climate data chunked as (x, y, 1), i.e. one time step per chunk:
ValueError: Moving window (=5) must between 1 and 4, inclusive
What did you expect to happen?
I expected the rolling mean to compute without error, regardless of the chunk size along time.
Minimal Complete Verifiable Example
import dask.array as da
import xarray as xr
import numpy as np
# Dimensions and sizes
nx, ny, nt = 100, 200, 50 # size of x, y, and time dimensions
x = np.linspace(0, 10, nx) # x-coordinates
y = np.linspace(0, 20, ny) # y-coordinates
time = np.linspace(0, 1, nt) # time coordinates
# Generate a random Dask array with lazy computation
data = da.random.random(size=(nx, ny, nt), chunks=(100, 200, 1))
# Create an xarray DataArray with coordinates and attributes
data_array = xr.DataArray(
    data,
    dims=["x", "y", "time"],
    coords={"x": x, "y": y, "time": time},
    name="dummy_data",
    attrs={"units": "arbitrary", "description": "Dummy 3D dataset"},
)
d_rolling = data_array.rolling(time=5).mean()
d_rolling.compute()
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/paolo/miniforge3/envs/hip-analysis-dev-env/lib/python3.10/site-packages/xarray/core/dataarray.py", line 1207, in compute
return new.load(**kwargs)
File "/Users/paolo/miniforge3/envs/hip-analysis-dev-env/lib/python3.10/site-packages/xarray/core/dataarray.py", line 1175, in load
ds = self._to_temp_dataset().load(**kwargs)
File "/Users/paolo/miniforge3/envs/hip-analysis-dev-env/lib/python3.10/site-packages/xarray/core/dataset.py", line 899, in load
evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
File "/Users/paolo/miniforge3/envs/hip-analysis-dev-env/lib/python3.10/site-packages/xarray/namedarray/daskmanager.py", line 85, in compute
return compute(*data, **kwargs) # type: ignore[no-untyped-call, no-any-return]
File "/Users/paolo/miniforge3/envs/hip-analysis-dev-env/lib/python3.10/site-packages/dask/base.py", line 660, in compute
results = schedule(dsk, keys, **kwargs)
File "/Users/paolo/miniforge3/envs/hip-analysis-dev-env/lib/python3.10/site-packages/dask/_task_spec.py", line 739, in __call__
return self.func(*new_argspec, **kwargs)
ValueError: Moving window (=5) must between 1 and 4, inclusive
Anything else we need to know?
The issue occurs only when the chunk size along time is 1 (as in the example above) or 2, as in the following snippet:
data_array = data_array.chunk(time=2)
d_rolling = data_array.rolling(time=3).mean()
d_rolling.compute()
Rechunking to a size of 3 or more along time, or to -1 (a single chunk), computes without error.
The error started happening when I updated my environment; see the installed versions below.
Possibly related to #4922?
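Until this is resolved, the rechunking observation above suggests a possible stopgap. The following is only a minimal sketch of it, not a fix for the underlying problem. It assumes that collapsing the time dimension into a single chunk avoids the error, as it did in my tests. The array sizes are shrunk and the values come from a seeded NumPy array purely for illustration; neither is part of the original example.

```python
import dask.array as da
import numpy as np
import xarray as xr

window = 5

# Small, reproducible stand-in for the climate cube (illustrative sizes only).
values = np.random.default_rng(0).random((4, 6, 20))

# Same problematic layout as in the report: one time step per chunk.
data = da.from_array(values, chunks=(4, 6, 1))
data_array = xr.DataArray(data, dims=["x", "y", "time"])

# Stopgap: merge the time dimension into a single chunk before rolling.
rechunked = data_array.chunk({"time": -1})

# The first window - 1 time steps are NaN because min_periods defaults to the window.
d_rolling = rechunked.rolling(time=window).mean()
result = d_rolling.compute()
```

A single chunk along time means every block spans the full time axis, so for large cubes it may be necessary to chunk x or y more finely to keep block sizes manageable.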
Environment
INSTALLED VERSIONS
commit: None
python: 3.10.16 | packaged by conda-forge | (main, Dec 5 2024, 14:12:04) [Clang 18.1.8 ]
python-bits: 64
OS: Darwin
OS-release: 23.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2
xarray: 2024.11.0
pandas: 2.2.3
numpy: 1.26.4
scipy: 1.14.1
netCDF4: 1.7.2
pydap: None
h5netcdf: None
h5py: 3.12.1
zarr: 2.18.3
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: 1.4.2
dask: 2024.12.0
distributed: 2024.12.0
matplotlib: 3.9.3
cartopy: 0.24.0
seaborn: 0.13.2
numbagg: None
fsspec: 2024.10.0
cupy: None
pint: 0.24.4
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.6.0
pip: 24.3.1
conda: installed
pytest: 8.3.4
mypy: 1.13.0
IPython: 8.30.0
sphinx: None