Unnecessary copy when indexing to obtain a 0d array #2622
Comments
That sounds like a reasonable improvement, as far as I can tell. Do you have any use case in mind that would break if we returned a view for a 0-dimensional array rather than a copy?
I think it would be sufficient to change the basic indexing part.
Not that I can think of. The chance of accidentally modifying data due to the change is very slim; the change will only affect statements that convey an explicit intent to do an in-place modification, but currently fail to do so. For example, say you have a dataset
The change affects statements such as the following:
These statements currently modify However, keep in mind that I'm not a seasoned xarray user. Perhaps I'm overlooking some subtleties.
You're right, I just realized outer indexing never returns a view after all, and vectorized indexing certainly doesn't. My comment was mainly about the different array wrappers for lazy indexing, copy on write, etc., but upon closer inspection of the code, it looks like modifying the |
+1 for modifying |
@danielwe, would you mind sending a fix for it?
Code Sample
Problem description
Indexing into xarray objects creates a view of the underlying data if possible. A surprising exception is when all dimensions are indexed out and the resulting object is 0d. Xarray insists on returning a 0d array rather than a scalar, which suggests (at least to me) that this is also a view whenever possible; however, it is always a copy, and modifying it will never affect the original array.
(The example above is a little contrived, since one could always call da[0] = 99. In my actual use case I am indexing into a Dataset in a way that creates views for all variables except the one that happens to collapse to 0d, and thus I'm unable to use the indexed Dataset to modify that variable in the original Dataset.)
The copy happens because, internally, the 0d array is created by retrieving a scalar from the underlying numpy array and then wrapping a new array around it. However, in numpy a 0d view can be created directly by indexing with Ellipsis/..., as follows:
Thus, a fix that solves my immediate issues and passes all current tests is to modify the following method:
xarray/xarray/core/indexing.py
Lines 1154 to 1163 in 778ffc4
to always append an ellipsis for basic and outer indexing:
I'm not familiar enough with all the indexing variants in xarray to know if this covers all cases of 0d arrays that are currently copies but could be views. If someone wants to share some insight (e.g., some more advanced test cases), I could try and put together a pull request.
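The numpy behaviour described above can be checked with a minimal sketch. It uses plain numpy only and is not the xarray method referenced in this issue. The helper name with_trailing_ellipsis is hypothetical, introduced purely to illustrate the "always append an ellipsis" idea for a basic-indexing key, and it assumes the key does not already contain an Ellipsis.

```python
import numpy as np

a = np.arange(3)

# Integer indexing that removes every dimension returns a numpy scalar,
# which carries its own copy of the value.
scalar = a[0]
wrapped = np.array(scalar)  # wrapping the scalar gives a new, independent 0d array
wrapped[...] = 42           # does not touch `a`

# Appending Ellipsis keeps the result an ndarray: a 0d view into `a`.
view = a[0, ...]
view[...] = 99              # writes through to a[0]


def with_trailing_ellipsis(key):
    """Hypothetical helper: append Ellipsis to a basic-indexing key tuple.

    Assumes `key` does not already contain an Ellipsis.
    """
    return tuple(key) + (Ellipsis,)


b = np.zeros((2, 3))
cell = b[with_trailing_ellipsis((1, 2))]  # 0d view of b[1, 2]
cell[...] = 7.0                           # modifies b in place
```

Here np.shares_memory(view, a) reports whether the 0d result still refers to the original buffer, which is a quick way to tell a view from a copy.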
Expected Output
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-42-lowlatency
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.11.0
pandas: 0.23.0
numpy: 1.14.3
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: 0.6.2
h5py: 2.7.1
Nio: None
zarr: None
cftime: 1.0.0b1
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.5
distributed: 1.21.8
matplotlib: 2.2.2
cartopy: None
seaborn: 0.8.1
setuptools: 39.1.0
pip: 10.0.1
conda: 4.5.12
pytest: 3.5.1
IPython: 6.4.0
sphinx: 1.7.4