Description
What happened?
Loading multiple TIFFs into a raster stack via concat, open_mfdataset, open_dataset, or combine_by_coords corrupts the data, so that it neither displays nor writes to disk correctly. The rasters show up as banded output with missing data.
Plotting the individual rasters before stacking looks as expected.
If a slice of the stacked DataArray is written to a file, that file is also corrupted. I confirmed this by loading the file in QGIS.
What did you expect to happen?
The stack should not mangle the data; each time slice should match the corresponding individual raster.
Minimal Complete Verifiable Example
import xarray as xr
import rioxarray as rio
import glob
files = sorted(glob.glob('nomax_tiff/*tiff'))
print(files)
# open each file as a Dataset with a single 'vel' variable and a length-1 time dimension
d0 = rio.open_rasterio(files[0]).to_dataset(name='vel').assign_coords(time=0).expand_dims(dim="time")
d1 = rio.open_rasterio(files[1]).to_dataset(name='vel').assign_coords(time=1).expand_dims(dim="time")
# the individual rasters look fine
d0.vel.plot()
d1.vel.plot()
# stack along time
ds = xr.concat([d0, d1], dim='time')
# the stacked plot is mangled
ds.vel.plot(col='time')
# a slice written to disk is also mangled
ds.drop_vars('band').squeeze().isel(time=0).rio.to_raster('mangled.tiff')
# without calling rioxarray/rasterio directly, it is mangled differently
xr.open_mfdataset(files, concat_dim='time', combine='nested').isel(time=0).band_data.plot()
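A possible diagnostic, offered only as a sketch: one known way to get banded, NaN-filled output from a stack is for the x/y coordinates of the inputs to differ by tiny floating-point amounts, in which case an outer join along x or y interleaves both coordinate sets and pads with NaN. I have not confirmed that this is what happens with the files above. The example below uses synthetic arrays rather than the real TIFFs; `make_raster`, `a`, and `b` are hypothetical names introduced purely for illustration.

```python
import numpy as np
import xarray as xr


def make_raster(x_offset, t):
    """Build a small synthetic 'vel' raster (hypothetical stand-in for one TIFF)."""
    x = np.arange(4, dtype=float) + x_offset
    y = np.arange(3, dtype=float)
    da = xr.DataArray(np.ones((3, 4)), coords={"y": y, "x": x},
                      dims=("y", "x"), name="vel")
    return da.expand_dims(time=[t])


a = make_raster(0.0, 0)
b = make_raster(1e-9, 1)  # x coordinates differ by a tiny offset

# Exact comparison of the coordinate values: False here because of the offset.
coords_match = np.array_equal(a.x.values, b.x.values)

# An outer join takes the union of the x coordinates and fills gaps with NaN.
stacked = xr.concat([a, b], dim="time", join="outer")
n_missing = int(stacked.isnull().sum())

# join="exact" refuses to stack misaligned inputs instead of padding silently.
try:
    xr.concat([a, b], dim="time", join="exact")
    exact_ok = True
except ValueError:
    exact_ok = False

# Assumption: if the grids are meant to be identical, copying the coordinates
# from one input onto the other removes the mismatch before stacking.
b_fixed = b.assign_coords(x=a.x, y=a.y)
fixed = xr.concat([a, b_fixed], dim="time", join="exact")
```

Running the same `np.array_equal` check on `d0.x`/`d1.x` and `d0.y`/`d1.y` from the example above would show whether the real files are affected. If they are, `join="exact"` turns the silent padding into an error, and overwriting the coordinates is only appropriate when the rasters really do share a grid.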
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
No response
Anything else we need to know?
Environment
On a Linux cluster:
xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.8.10 (default, Jun 16 2021, 14:20:20)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.36.2.el7.x86_64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: ('en_CA', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 2022.6.0+computecanada
pandas: 1.4.0
numpy: 1.22.2
scipy: 1.8.0
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: 3.1.0
Nio: None
zarr: None
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.0
cfgrib: 0.9.10.1
iris: None
bottleneck: None
dask: 2022.7.0
distributed: None
matplotlib: 3.5.1
cartopy: 0.20.3
seaborn: 0.11.2
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 46.1.3
pip: 20.0.2
conda: None
pytest: None
IPython: 7.31.1
sphinx: None
This also reproduces on a macOS install, but xr.show_versions() segfaults there and I do not know why. The relevant versions on that machine are:
Python 3.10.4 (main, Jun 14 2022, 14:00:56) [Clang 13.0.0 (clang-1300.0.27.3)] on darwin
rioxarray 0.12.4
xarray 2022.11.0
rasterio 1.3.3