BUG: Parsing ellipsis in indexing #93

Closed
jakirkham opened this issue Nov 30, 2016 · 11 comments · Fixed by #172
Labels: bug, release notes done


@jakirkham
Member

Running into some cases where parsing an Ellipsis does not work. Note that Input 5, where an Ellipsis is combined with an integer index, fails with a TypeError, while a bare Ellipsis (Input 6) works.

Side note: There also seems to be something funky with floating-point round-off error (e.g. Input 3 and Input 4 return different values for the same row).

In [1]: import zarr

In [2]: z = zarr.empty(shape=(100, 110), chunks=(10, 11), dtype=float)

In [3]: z[0]
Out[3]: 
array([  6.91088620e-310,   6.91088620e-310,   2.10918838e-316,
         2.10918838e-316,   1.94607893e-316,   5.72938864e-313,
         0.00000000e+000,   3.95252517e-323,   1.57027689e-312,
         1.93101617e-312,   1.25197752e-312,   1.18831764e-312,
         1.12465777e-312,   1.06099790e-312,   2.31297541e-312,
         2.33419537e-312,   1.93101617e-312,   1.93101617e-312,
         1.25197752e-312,   1.18831764e-312,   1.12465777e-312,
         1.03977794e-312,   1.25197752e-312,   2.31297541e-312,
         5.72938864e-313,   1.01855798e-312,   1.08221785e-312,
         1.25197752e-312,   1.25197752e-312,   1.18831764e-312,
         1.97345609e-312,   6.79038653e-313,   1.93101617e-312,
         2.31297541e-312,   5.72938864e-313,   1.01855798e-312,
         1.93101617e-312,   1.93101617e-312,   1.25197752e-312,
         1.18831764e-312,   1.12465777e-312,   1.06099790e-312,
         2.31297541e-312,   5.72938864e-313,   1.01855798e-312,
         1.97345609e-312,   1.93101617e-312,   1.06099790e-312,
         5.72938864e-313,   1.01855798e-312,   2.75859453e-313,
         5.72938864e-313,   1.57027689e-312,   1.93101617e-312,
         1.16709769e-312,   5.72938864e-313,   1.01855798e-312,
         5.72938864e-313,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   4.94065646e-324,   6.91087535e-310,
         0.00000000e+000,   4.94065646e-324,   2.45550626e-321,
         6.91088620e-310,   1.98184750e-316,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         6.91087588e-310,   1.19069821e-321,   2.05561901e-316,
         6.91088620e-310,   6.91070369e-310,   1.03259720e-321,
         6.91088620e-310,   6.91088620e-310,   7.93037613e-120,
         1.44506353e+214,   3.63859382e+185,   2.43896203e-154,
         7.75110916e+228,   4.44743484e+252])

In [4]: z[0, :]
Out[4]: 
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.])

In [5]: z[0, ...]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-27fbc5543985> in <module>()
----> 1 z[0, ...]

/opt/conda2/lib/python2.7/site-packages/zarr/core.pyc in __getitem__(self, item)
    446 
    447         # normalize selection
--> 448         selection = normalize_array_selection(item, self._shape)
    449 
    450         # determine output array shape

/opt/conda2/lib/python2.7/site-packages/zarr/util.pyc in normalize_array_selection(item, shape)
    184         # determine start and stop indices for all axes
    185         selection = tuple(normalize_axis_selection(i, l)
--> 186                           for i, l in zip(item, shape))
    187 
    188         # fill out selection if not completely specified

/opt/conda2/lib/python2.7/site-packages/zarr/util.pyc in <genexpr>((i, l))
    184         # determine start and stop indices for all axes
    185         selection = tuple(normalize_axis_selection(i, l)
--> 186                           for i, l in zip(item, shape))
    187 
    188         # fill out selection if not completely specified

/opt/conda2/lib/python2.7/site-packages/zarr/util.pyc in normalize_axis_selection(item, l)
    163 
    164     else:
--> 165         raise TypeError('expected integer or slice, found: %r' % item)
    166 
    167 

TypeError: expected integer or slice, found: Ellipsis

In [6]: z[...]
Out[6]: 
array([[  6.91088620e-310,   6.91088620e-310,   2.12535499e-316, ...,
          0.00000000e+000,   2.28439709e-032,   6.91088696e-310],
       [  0.00000000e+000,  -5.25530781e-026,   6.91088696e-310, ...,
          6.91087565e-310,   3.95252517e-323,   1.41861741e-316],
       [  6.91087582e-310,   4.44659081e-323,   1.41861622e-316, ...,
          1.41867314e-316,   6.91087582e-310,   2.22329541e-322],
       ..., 
       [  0.00000000e+000,   0.00000000e+000,   0.00000000e+000, ...,
          0.00000000e+000,   0.00000000e+000,   0.00000000e+000],
       [  0.00000000e+000,   0.00000000e+000,   0.00000000e+000, ...,
          0.00000000e+000,   0.00000000e+000,   0.00000000e+000],
       [  0.00000000e+000,   0.00000000e+000,   0.00000000e+000, ...,
          1.41861267e-316,   6.91087582e-310,   6.42285340e-323]])

In [7]: z[:, 0]
Out[7]: 
array([  6.91088620e-310,   6.91088620e-310,   6.91087579e-310,
         6.91087579e-310,   6.91087579e-310,   6.91087579e-310,
         6.91087579e-310,   6.91087579e-310,   6.91087535e-310,
         6.91087535e-310,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         1.82894873e-060,   6.91087535e-310,   3.40411230e-321,
         2.03587931e-316,   6.91088620e-310,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         1.82894873e-060,   6.91087534e-310,   6.91087578e-310,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         1.82894873e-060,   6.91087282e-310,   6.91087282e-310,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         1.82894873e-060,   6.91087578e-310,   6.91087535e-310,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         1.82894873e-060,   6.91087580e-310,   6.91087580e-310,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         1.82894873e-060,   6.91087580e-310,   6.91087566e-310,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         1.82894873e-060,   6.91087282e-310,   6.91087582e-310,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000])
@alimanfoo
Member

alimanfoo commented Nov 30, 2016 via email

@jakirkham
Member Author

Good point. In real life, there was data in the Zarr array copied from an HDF5 file that motivated coming up with this example. Can confirm I have the same issue if I use zarr.zeros instead.

@alimanfoo
Member

alimanfoo commented Nov 30, 2016 via email

@jakirkham
Member Author

Oh no sorry. 😳 That was vague. Was only referring to the issue with Ellipsis.

@alimanfoo
Member

alimanfoo commented Nov 30, 2016 via email

@jakirkham
Member Author

So I have a library, kenjutsu, that I refactored out of some code recently, whose sole purpose is to work with slices. It's pure Python, works with Python 2 and 3, has no dependencies, and is on conda-forge.

There are two functions in there for cleaning up slices as you are doing here. One for a single slice/length and one for a tuple of slices/shape. While it doesn't yet, it would be pretty trivial to extend it to handle Ellipsis and integer index or indices. Would likely handle this on my end anyways so as to skirt around edge cases like this one.

I don't know if that is something worth considering for you or not. Just figured I'd share in any event. If so, we can discuss what the best way to do this might be.
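
For illustration, here is a minimal sketch of the kind of Ellipsis handling being discussed. It is not the kenjutsu or zarr implementation; the function name `expand_ellipsis` and its signature are hypothetical. It only assumes NumPy-like semantics: at most one Ellipsis, which stands in for however many full slices are needed to reach the array's number of dimensions.

```python
def expand_ellipsis(selection, ndim):
    """Return a tuple with exactly `ndim` entries and no Ellipsis.

    Hypothetical helper, not part of zarr or kenjutsu.
    """
    # Allow a bare index such as z[0] or z[...].
    if not isinstance(selection, tuple):
        selection = (selection,)

    # Locate Ellipsis entries by identity, not equality.
    positions = [i for i, s in enumerate(selection) if s is Ellipsis]
    if len(positions) > 1:
        raise IndexError("an index can only have a single ellipsis")

    if positions:
        pos = positions[0]
        # Number of axes the Ellipsis has to stand in for.
        n_missing = ndim - (len(selection) - 1)
        if n_missing < 0:
            raise IndexError("too many indices for array")
        selection = (
            selection[:pos]
            + (slice(None),) * n_missing
            + selection[pos + 1:]
        )

    if len(selection) > ndim:
        raise IndexError("too many indices for array")

    # Pad trailing axes with full slices, so z[0] means z[0, :].
    return selection + (slice(None),) * (ndim - len(selection))


# The failing case from this issue, z[0, ...] on a 2-D array,
# becomes the same selection as z[0, :].
print(expand_ellipsis((0, Ellipsis), 2))
```

Running this step before the per-axis normalization would mean the per-axis code only ever sees integers and slices.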

@alimanfoo
Member

alimanfoo commented Dec 1, 2016 via email

@jakirkham
Member Author

Definitely. Part of the initial motivation for that code was also to determine the shape of the array needed to hold such a slice. So that part may be useful here too.
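
As a rough sketch of that shape calculation (again hypothetical, not the kenjutsu API): assuming the selection already has one integer or slice per axis, the result shape can be derived with the built-in `slice.indices`. The name `selection_shape` is made up for this example, and it assumes Python 3 `range` and plain `int` indices (NumPy integer types would need extra handling).

```python
def selection_shape(selection, shape):
    """Shape of the array needed to hold `selection` taken from `shape`.

    Hypothetical helper; expects one int or slice per axis.
    """
    if len(selection) != len(shape):
        raise IndexError("selection must have one entry per axis")

    out = []
    for sel, length in zip(selection, shape):
        if isinstance(sel, slice):
            # slice.indices clips start/stop to the axis length and
            # fills in defaults, including for negative steps.
            start, stop, step = sel.indices(length)
            out.append(len(range(start, stop, step)))
        elif isinstance(sel, int):
            if not -length <= sel < length:
                raise IndexError("index out of bounds")
            # An integer index drops the axis, so nothing is appended.
        else:
            raise TypeError("expected integer or slice, found: %r" % (sel,))
    return tuple(out)


# z[0, :] on the (100, 110) array from this issue yields one row.
print(selection_shape((0, slice(None)), (100, 110)))
```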

Also FYI have an Ellipsis implementation that seems to work ok. ( jakirkham/kenjutsu#7 )

Yeah, I was looking at that case ( https://github.com/alimanfoo/zarr/issues/78 ). I'll post some thoughts over there and we can discuss.

@jakirkham
Member Author

FYI made a release of kenjutsu version 0.2.0, which solves this particular case and a few others. It should help cut down some of the slice logic in zarr if you would be interested in trying it.

@alimanfoo
Member

alimanfoo commented Dec 5, 2016 via email

@alimanfoo alimanfoo modified the milestone: v2.2 Dec 15, 2016
@jakirkham
Member Author

PR ( https://github.com/alimanfoo/zarr/pull/116 ) should fix this issue.

@alimanfoo alimanfoo mentioned this issue Apr 6, 2017
alimanfoo added a commit that referenced this issue Oct 25, 2017
@alimanfoo alimanfoo added the in progress label Oct 31, 2017
alimanfoo added a commit that referenced this issue Nov 9, 2017
@alimanfoo alimanfoo mentioned this issue Nov 10, 2017
6 tasks
alimanfoo added a commit that referenced this issue Nov 11, 2017
@alimanfoo alimanfoo added the bug and release notes done labels and removed the in progress label Nov 13, 2017