Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

concat breaks raster stack #7273

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
4 tasks
Chrismarsh opened this issue Nov 9, 2022 · 5 comments
Closed
4 tasks

concat breaks raster stack #7273

Chrismarsh opened this issue Nov 9, 2022 · 5 comments

Comments

@Chrismarsh
Copy link

What happened?

Loading multiple tiffs into a raster stack via concat, open_mfdataset, open_dataset, combine_by_coords corrupts the data, causing them to not display correctly, or write out to disk correctly. The rasters show as banded outputs missing data.

image

Plotting individual pre-stacked looks as expected.
image

If a slice of the stacked DA is written to a file, that file is also corrupted. This is confirmed by loading the file in QGIS.

What did you expect to happen?

The stack to not mangle the data

Minimal Complete Verifiable Example

import xarray as xr
import rioxarray as rio
import glob

files = sorted(glob.glob('nomax_tiff/*tiff'))
print(files)

# open the files
d0=rio.open_rasterio(files[0]).to_dataset(name='vel').assign_coords(time=0).expand_dims(dim="time")
d1=rio.open_rasterio(files[1]).to_dataset(name='vel').assign_coords(time=1).expand_dims(dim="time")

# should look fine
d0.vel.plot()
d1.vel.plot()

ds = xr.concat([d0,d1],dim='time')

#mangled
ds.vel.plot(col='time')

#this is also mangled
ds.drop('band').squeeze().isel(time=0).rio.to_raster('mangled.tiff')

#without rioxarray/rasterio it is differently mangled
xr.open_mfdataset(files,concat_dim='time',combine='nested').isel(time=0).band_data.plot()

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

nomax_tiff.zip

Environment

On a linux cluster:

xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.8.10 (default, Jun 16 2021, 14:20:20)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.36.2.el7.x86_64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: ('en_CA', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 2022.6.0+computecanada
pandas: 1.4.0
numpy: 1.22.2
scipy: 1.8.0
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: 3.1.0
Nio: None
zarr: None
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.0
cfgrib: 0.9.10.1
iris: None
bottleneck: None
dask: 2022.7.0
distributed: None
matplotlib: 3.5.1
cartopy: 0.20.3
seaborn: 0.11.2
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 46.1.3
pip: 20.0.2
conda: None
pytest: None
IPython: 7.31.1
sphinx: None

This also reproduces on a macos install. But xr show_versions segfaults. No idea, seems weird.

Python 3.10.4 (main, Jun 14 2022, 14:00:56) [Clang 13.0.0 (clang-1300.0.27.3)] on darwin
rioxarray 0.12.4
xarray 2022.11.0
rasterio 1.3.3

@Chrismarsh Chrismarsh added bug needs triage Issue that has not been reviewed by xarray team member labels Nov 9, 2022
@Chrismarsh
Copy link
Author

Chrismarsh commented Nov 9, 2022

Further investigation suggests this is similar / related to Issue #3681. Using join="override" seems to resolve the problem. However, it is not at all clear why this needs to be done or why it "fixes" the problem. Is this intended behaviour?

edit: An old xarray 0.15.1 doesn't exhibit this behaviour and is correctly concated as expected.

@mathause
Copy link
Collaborator

mathause commented Nov 9, 2022

Could it be that the coordinates of the individual arrays are not exactly the same?

@Chrismarsh
Copy link
Author

Each tiff is generated from the same underlying code so they are expected (and should be) exactly identical. I haven't seen any indication they are different. I just double checked and the two in the example above indicate they are the same
image

is xarray sensitive in a way that wouldn't be obvious with this kind of test?

@mathause
Copy link
Collaborator

I suspect there are floating point inaccuracies which can sometimes trip up xarray. But I would also not expect differences when the coords are created with the same piece of code.

Unfortunately your test does not work - it will first align and then do the subtraction... Any of the following should work:

(d0.y - d1.y).shape == d0.shape
d0.y.values == d1.y.values
xr.align(d0, d1, join="exact")

If this worked earlier

@dcherian dcherian added usage question and removed bug needs triage Issue that has not been reviewed by xarray team member labels Jan 15, 2023
@Chrismarsh
Copy link
Author

Sorry for slow reply, I got pulled onto another task before Christmas and forgot to revisit. Hit this in a different way again today. I get a combo of mangled as well as NaN values.

print((d0.swe.y - d1.swe.y).shape == d0.swe.shape)
print((d0.swe.y - d1.swe.y).shape)
print(d0.swe.y.shape)

False
(2509,)
(2509,)
np.unique(d0.swe.y.values == d1.swe.y.values)

array([ True])
np.unique(d0.swe.x.values == d1.swe.x.values)

array([False,  True]
xr.align(d0, d1, join="exact")
ValueError: cannot align objects with join='exact' where index/labels/sizes are not equal along these coordinates (dimensions): 'x' ('x',)

Looking at x shows X off by 1e-12

np.unique(d0.swe.x.values - d1.swe.x.values)
array([-1.62003744e-12, -1.60582658e-12, -1.59161573e-12, -1.57740487e-12,
       -1.56319402e-12, -1.54898316e-12, -1.53477231e-12, -1.52056145e-12,
       -1.50635060e-12, -1.49213975e-12, -1.47792889e-12, -1.46371804e-12,
       -1.44950718e-12, -1.43529633e-12, -1.42108547e-12, -1.40687462e-12,
       -1.39266376e-12, -1.37845291e-12, -1.36424205e-12, -1.35003120e-12,
       -1.33582034e-12, -1.32160949e-12, -1.30739863e-12, -1.29318778e-12,
       -1.27897692e-12, -1.26476607e-12, -1.25055521e-12, -1.23634436e-12,
       -1.22213351e-12, -1.20792265e-12, -1.19371180e-12, -1.17950094e-12,
       -1.16529009e-12, -1.15107923e-12, -1.13686838e-12, -1.12265752e-12,
       -1.10844667e-12, -1.09423581e-12, -1.08002496e-12, -1.06581410e-12,
       -1.05160325e-12, -1.03739239e-12, -1.02318154e-12, -1.00897068e-12,
       -9.94759830e-13, -9.80548975e-13, -9.66338121e-13, -9.52127266e-13,
       -9.37916411e-13, -9.23705556e-13, -9.09494702e-13, -8.95283847e-13,
       -8.81072992e-13, -8.66862138e-13, -8.52651283e-13, -8.38440428e-13,
       -8.24229573e-13, -8.10018719e-13, -7.95807864e-13, -7.81597009e-13,
       -7.67386155e-13, -7.53175300e-13, -7.38964445e-13, -7.24753590e-13,
       -7.10542736e-13, -6.96331881e-13, -6.82121026e-13, -6.67910172e-13,
       -6.53699317e-13, -6.39488462e-13, -6.25277607e-13, -6.11066753e-13,
       -5.96855898e-13, -5.82645043e-13, -5.68434189e-13, -5.54223334e-13,
       -5.40012479e-13, -5.25801624e-13, -5.11590770e-13, -4.97379915e-13,
       -4.83169060e-13, -4.68958206e-13, -4.54747351e-13, -4.40536496e-13,
       -4.26325641e-13, -4.12114787e-13, -3.97903932e-13, -3.83693077e-13,
       -3.69482223e-13, -3.55271368e-13, -3.41060513e-13, -3.26849658e-13,
       -3.12638804e-13, -2.98427949e-13, -2.84217094e-13, -2.70006240e-13,
       -2.55795385e-13, -2.41584530e-13, -2.27373675e-13, -2.13162821e-13,
       -1.98951966e-13, -1.84741111e-13, -1.70530257e-13, -1.56319402e-13,
       -1.42108547e-13, -1.27897692e-13, -1.13686838e-13, -9.94759830e-14,
       -8.52651283e-14, -7.10542736e-14, -5.68434189e-14, -4.26325641e-14,
       -2.84217094e-14, -1.42108547e-14,  0.00000000e+00])

@pydata pydata locked and limited conversation to collaborators Aug 1, 2023
@andersy005 andersy005 converted this issue into discussion #8038 Aug 1, 2023

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Projects
None yet
Development

No branches or pull requests

3 participants