Saving zarr to remote location lower cases all data_vars #5028
Comments
Interestingly, the same applies to data sets that are created locally and opened remotely. The variables are recognized correctly from the consolidated metadata, but are converted to lower case on opening. A quick example:

    import xarray as xr, numpy as np
    x, y, t = np.arange(500), np.arange(500), np.arange(0,500,6)
    xr.Dataset(
        data_vars={'Group/Variable': (('t', 'y', 'x'), np.ones((t.size, x.size, y.size)))},
        coords={'x': x, 'y': y, 'Time': t}
    ).to_zarr('/tmp/zarr2058', consolidated=True)

Locally, everything is fine (every value should be 1).

    xr.open_zarr('/tmp/zarr2058', group='Group').Variable[0,0,0].compute() # = 1

However, if we access the data via HTTP, the values are not read correctly. To test, start a local web server:

    python3 -m http.server -d /tmp/zarr2058 8080

Open the array; the group and variable are properly cased (and derived from the consolidated metadata):

    xr.open_zarr('http://localhost:8080/', group='Group', consolidated=True)

However, accessing one of the variables sends a lower-case request to the server and returns NaN:

    xr.open_zarr('http://localhost:8080/', group='Group', consolidated=True).Variable[0,0,0].compute() # = nan

The terminal running the web server shows the requested URL, all in lower case.
However, the problem appears to be in Zarr, not (only) in xarray:

    zarr.open_group('http://localhost:8080/')['Group']

behaves exactly the same.

Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
xarray: 0.15.0
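One way to confirm the case mismatch described in this comment is to compare a requested key against the files in the store directory while ignoring case. The following is a minimal sketch using only the standard library. The helper `case_insensitive_matches` is hypothetical (it is not part of zarr or xarray), the temporary directory stands in for the store directory, and the chunk key `Group/Variable/0.0.0` is an illustrative assumption about the on-disk layout.

```python
import os
import tempfile


def case_insensitive_matches(store_root, requested_key):
    """Return stored keys that match requested_key only when case is ignored."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(store_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Express the file path as a "/"-separated key relative to the store root
            key = os.path.relpath(full_path, store_root).replace(os.sep, "/")
            if key != requested_key and key.lower() == requested_key.lower():
                matches.append(key)
    return sorted(matches)


# Stand-in for the store directory; the chunk key below is illustrative.
with tempfile.TemporaryDirectory() as store_root:
    chunk_dir = os.path.join(store_root, "Group", "Variable")
    os.makedirs(chunk_dir)
    with open(os.path.join(chunk_dir, "0.0.0"), "wb") as f:
        f.write(b"")

    # A lower-cased request differs from the stored key only by case
    lowered = case_insensitive_matches(store_root, "group/variable/0.0.0")
    # A correctly cased request has no case-only mismatch
    exact = case_insensitive_matches(store_root, "Group/Variable/0.0.0")

print(lowered)
print(exact)
```

A non-empty result for a requested path suggests that the request was altered in case somewhere between the caller and the server, rather than the data being absent.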
The issue appears to be in Recently, a

Coming back to my last example:

    zarr.open_consolidated('http://localhost:8080/', storage_options={'normalize_keys': False})['Group'].Variable[0,0,0]

works. As a work-around, instantiate the store first:

    xr.open_zarr(store=zarr.storage.FSStore('http://localhost:8080/', normalize_keys=False),
                 group='Group',
                 consolidated=True).Variable[0,0,0].compute() # = 1
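To illustrate why turning off key normalization can change the result, here is a toy model of a key-value store that optionally lower-cases keys before lookup. This is a sketch under stated assumptions: `ToyStore` is hypothetical and is not zarr's or fsspec's actual implementation, the chunk key is illustrative, and returning a fill value for a missing key is a simplification of how an unwritten chunk might be handled.

```python
import math


class ToyStore:
    """Hypothetical key-value store; NOT zarr's or fsspec's real code."""

    def __init__(self, chunks, normalize_keys):
        self._chunks = dict(chunks)  # keys keep the case they were written with
        self._normalize_keys = normalize_keys

    def read_chunk(self, key, fill_value=math.nan):
        # Optionally lower-case the key before lookup, as a stand-in for
        # the key normalization discussed in this thread.
        lookup = key.lower() if self._normalize_keys else key
        # A key that is not found is treated as an unwritten chunk.
        return self._chunks.get(lookup, fill_value)


written = {"Group/Variable/0.0.0": 1.0}  # illustrative chunk key

# With normalization, the lookup uses "group/variable/0.0.0", which was never written
normalized = ToyStore(written, normalize_keys=True).read_chunk("Group/Variable/0.0.0")
# Without normalization, the key is looked up exactly as written
preserved = ToyStore(written, normalize_keys=False).read_chunk("Group/Variable/0.0.0")

print(normalized)
print(preserved)
```

In this toy model the metadata can still list the correctly cased names while every chunk read misses, which is consistent with the behavior reported above.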
A heads up that zarr-developers/zarr-python#755 reverses the default behavior of
Works. Thanks @joshmoore!
So glad this got fixed upstream! That's how it is supposed to work! 🏆 Thanks to everyone for making this happen.
What happened:
I saved a zarr store to a remote location (S3), read it back, and found that the names of the data variables (DataArrays) had been converted to lower case.
What you expected to happen:
The variable names should keep their original case. This does not happen when you save the zarr store locally.
Minimal Complete Verifiable Example:
Anything else we need to know?:
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-1009-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.17.0
pandas: 1.2.3
numpy: 1.20.1
scipy: 1.5.3
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.10.0
h5py: 3.1.0
Nio: None
zarr: 2.6.1
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.1
cfgrib: 0.9.8.5
iris: None
bottleneck: 1.3.2
dask: 2021.03.0
distributed: 2021.03.0
matplotlib: 3.3.4
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: 6.2.2
IPython: 7.21.0
sphinx: None