Skip to content

xr.DataArray.where(drop=True) crashes if the result is False everywhere (h5netcdf engine) #4524

Open
@RondeauG

Description

@RondeauG

Description of the issue:

For the bug to occur, it looks like the following requirements need to be met :

  1. Crashes occured with both xarray v0.16.1 and the current master. The bug does not seem to exist with v0.15.1.
  2. The file must be opened using the h5netcdf engine.
  3. The DataArray, before the call to where(), must contain auxiliary coordinates that will get dropped (in the example below, lon_rho, lat_rho and h). Edit: By "auxiliary coordinate", I mean coordinates that depend on a dimension, without being dimensions themselves.
  4. The call to .where() must give False results everywhere.
  5. The argument drop=True must be used.

Minimal Complete Verifiable Example:

import xarray as xr
ds = xr.tutorial.open_dataset('ROMS_example', engine='h5netcdf')
da = ds["zeta"]
print(da.where(da>1000, drop=True))

This is what da looks like before the call to .where()

print(da)
<xarray.DataArray 'zeta' (ocean_time: 2, eta_rho: 191, xi_rho: 371)>
[141722 values with dtype=float32]
Coordinates:
    lon_rho     (eta_rho, xi_rho) float64 ...
    hc          float64 ...
    h           (eta_rho, xi_rho) float64 ...
    lat_rho     (eta_rho, xi_rho) float64 ...
    Vtransform  int32 ...
  * ocean_time  (ocean_time) datetime64[ns] 2001-08-01 2001-08-08
Dimensions without coordinates: eta_rho, xi_rho
Attributes:
    long_name:  free-surface
    units:      meter
    time:       ocean_time
    field:      free-surface, scalar, series

What happened:

This is the result obtained from the example above with xarray v0.16.1, as well as the current master.

Traceback (most recent call last):
  File "/usr/local/pycharm-community-2017.3.4/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<string>", line 1, in <module>
  File "/home/rondeau/python/github/xarray/xarray/core/common.py", line 1268, in where
    return ops.where_method(self, cond, other)
  File "/home/rondeau/python/github/xarray/xarray/core/ops.py", line 193, in where_method
    return apply_ufunc(
  File "/home/rondeau/python/github/xarray/xarray/core/computation.py", line 1126, in apply_ufunc
    return apply_dataarray_vfunc(
  File "/home/rondeau/python/github/xarray/xarray/core/computation.py", line 268, in apply_dataarray_vfunc
    result_coords = build_output_coords(args, signature, exclude_dims)
  File "/home/rondeau/python/github/xarray/xarray/core/computation.py", line 232, in build_output_coords
    merged_vars, unused_indexes = merge_coordinates_without_align(
  File "/home/rondeau/python/github/xarray/xarray/core/merge.py", line 329, in merge_coordinates_without_align
    return merge_collected(filtered, prioritized)
  File "/home/rondeau/python/github/xarray/xarray/core/merge.py", line 229, in merge_collected
    merged_vars[name] = unique_variable(name, variables, compat)
  File "/home/rondeau/python/github/xarray/xarray/core/merge.py", line 120, in unique_variable
    out = out.set_dims(dim_lengths)
  File "/home/rondeau/python/github/xarray/xarray/core/variable.py", line 1442, in set_dims
    expanded_data = self.data
  File "/home/rondeau/python/github/xarray/xarray/core/variable.py", line 361, in data
    return self.values
  File "/home/rondeau/python/github/xarray/xarray/core/variable.py", line 512, in values
    return _as_array_or_item(self._data)
  File "/home/rondeau/python/github/xarray/xarray/core/variable.py", line 274, in _as_array_or_item
    data = np.asarray(data)
  File "/home/rondeau/.conda/envs/pyscen/lib/python3.8/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home/rondeau/python/github/xarray/xarray/core/indexing.py", line 685, in __array__
    self._ensure_cached()
  File "/home/rondeau/python/github/xarray/xarray/core/indexing.py", line 682, in _ensure_cached
    self.array = NumpyIndexingAdapter(np.asarray(self.array))
  File "/home/rondeau/.conda/envs/pyscen/lib/python3.8/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home/rondeau/python/github/xarray/xarray/core/indexing.py", line 655, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/home/rondeau/.conda/envs/pyscen/lib/python3.8/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home/rondeau/python/github/xarray/xarray/core/indexing.py", line 560, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/home/rondeau/python/github/xarray/xarray/backends/h5netcdf_.py", line 28, in __getitem__
    return indexing.explicit_indexing_adapter(
  File "/home/rondeau/python/github/xarray/xarray/core/indexing.py", line 844, in explicit_indexing_adapter
    raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
  File "/home/rondeau/python/github/xarray/xarray/core/indexing.py", line 858, in decompose_indexer
    return _decompose_outer_indexer(indexer, shape, indexing_support)
  File "/home/rondeau/python/github/xarray/xarray/core/indexing.py", line 1021, in _decompose_outer_indexer
    gains = [
  File "/home/rondeau/python/github/xarray/xarray/core/indexing.py", line 1022, in <listcomp>
    (np.max(k) - np.min(k) + 1.0) / len(np.unique(k))
  File "<__array_function__ internals>", line 5, in amax
  File "/home/rondeau/.conda/envs/pyscen/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 2667, in amax
    return _wrapreduction(a, np.maximum, 'max', axis, None, out,
  File "/home/rondeau/.conda/envs/pyscen/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation maximum which has no identity

What you expected to happen:

This is the result obtained for the example above with xarray v0.15.1

<xarray.DataArray 'zeta' (ocean_time: 0, eta_rho: 0, xi_rho: 0)>
array([], shape=(0, 0, 0), dtype=float32)
Coordinates:
    lon_rho     (eta_rho, xi_rho) float64 
    hc          float64 20.0
    h           (eta_rho, xi_rho) float64 
    lat_rho     (eta_rho, xi_rho) float64 
    Vtransform  int32 2
  * ocean_time  (ocean_time) datetime64[ns] 
Dimensions without coordinates: eta_rho, xi_rho
Attributes:
    long_name:  free-surface
    units:      meter
    time:       ocean_time
    field:      free-surface, scalar, series

Anything else we need to know?:

This following example works as intended. This seems to occur because isel() eliminates auxiliary coordinates.

import xarray as xr
ds = xr.tutorial.open_dataset('ROMS_example', engine='h5netcdf')
da = ds["zeta"].isel(eta_rho=0, xi_rho=0)
print(da.where(da>1000, drop=True))
<xarray.DataArray 'zeta' (ocean_time: 0)>
array([], dtype=float32)
Coordinates:
    lon_rho     float64 -93.6
    hc          float64 20.0
    h           float64 612.1
    lat_rho     float64 27.45
    Vtransform  int32 2
  * ocean_time  (ocean_time) datetime64[ns] 
Attributes:
    long_name:  free-surface
    units:      meter
    time:       ocean_time
    field:      free-surface, scalar, series

Environment:

Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 08:20:52) 
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.2.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4
xarray: 0.15.2.dev58+g0e43ba9.d20200521
pandas: 1.1.3
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.3
pydap: None
h5netcdf: 0.8.0
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.30.0
distributed: 2.30.0
matplotlib: 3.3.1
cartopy: 0.18.0
seaborn: 0.11.0
numbagg: None
pint: 0.11
setuptools: 50.3.0.post20201006
pip: 20.2.3
conda: None
pytest: None
IPython: None
sphinx: None

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions