OverflowError if using NaT as fill_value for mM dtypes #342

Closed
thomascoquet opened this issue Nov 21, 2018 · 6 comments

thomascoquet commented Nov 21, 2018

Code sample

import zarr
import numpy as np

z = zarr.full(dtype='m8[s]', 
              fill_value=np.timedelta64('nat'), 
              shape=10)

raises

OverflowError: int too big to convert
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/tom/anaconda3/lib/python3.6/site-packages/zarr/meta.py", line 38, in decode_array_metadata
    fill_value = decode_fill_value(meta['fill_value'], dtype)
  File "/home/tom/anaconda3/lib/python3.6/site-packages/zarr/meta.py", line 159, in decode_fill_value
    return np.array(v, dtype=dtype)[()]
SystemError: <built-in function array> returned a result with an error set
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/tom/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-156-d1f331fc15fd>", line 6, in <module>
    shape=10)
  File "/home/tom/anaconda3/lib/python3.6/site-packages/zarr/creation.py", line 274, in full
    return create(shape=shape, fill_value=fill_value, **kwargs)
  File "/home/tom/anaconda3/lib/python3.6/site-packages/zarr/creation.py", line 123, in create
    cache_metadata=cache_metadata, cache_attrs=cache_attrs, read_only=read_only)
  File "/home/tom/anaconda3/lib/python3.6/site-packages/zarr/core.py", line 123, in __init__
    self._load_metadata()
  File "/home/tom/anaconda3/lib/python3.6/site-packages/zarr/core.py", line 140, in _load_metadata
    self._load_metadata_nosync()
  File "/home/tom/anaconda3/lib/python3.6/site-packages/zarr/core.py", line 155, in _load_metadata_nosync
    meta = decode_array_metadata(meta_bytes)
  File "/home/tom/anaconda3/lib/python3.6/site-packages/zarr/meta.py", line 50, in decode_array_metadata
    raise MetadataError('error decoding metadata: %s' % e)
zarr.errors.MetadataError: error decoding metadata: <built-in function array> returned a result with an error set

Analysis

I think the serialization method for mM dtypes does not allow Not-a-Time (NaT) fill values to be deserialized properly. The encoded value fails to convert back:

np.array(int(np.datetime64('nat').view('u8')), dtype='M8[D]')[()]
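
To make the failure easier to see, here is a minimal sketch (not taken from zarr itself) that prints the integer produced by each view of NaT. It assumes only NumPy; the variable names are illustrative.

```python
import numpy as np

nat = np.datetime64("NaT")

# NaT occupies the bit pattern of the most negative int64.
as_signed = int(nat.view("i8"))
# Reinterpreting the same bits as unsigned gives a large positive number.
as_unsigned = int(nat.view("u8"))

int64_max = np.iinfo(np.int64).max

print("signed view:  ", as_signed)
print("unsigned view:", as_unsigned)
# The unsigned value no longer fits in the int64 storage that datetime64 and
# timedelta64 use, which is why converting it back to an mM dtype overflows.
print("exceeds int64 max:", as_unsigned > int64_max)
```

The signed value stays inside the int64 range, so only the unsigned encoding is affected.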

Resolution

In meta.py / encode_fill_value:

    elif dtype.kind in 'mM':
        if np.isnat(v):
            return 'NaT'
        else:
            return int(v.view('u8'))
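
As a rough illustration of how such a branch could round-trip, the sketch below pairs an encoder with a matching decoder. The function names `encode_time_fill_value` and `decode_time_fill_value` are hypothetical and are not zarr's API. Note one deliberate difference from the snippet above: the sketch uses a signed view (`'i8'`) instead of `'u8'`, on the assumption that negative timedeltas should also keep their value.

```python
import numpy as np


def encode_time_fill_value(v, dtype):
    """Hypothetical helper: turn a datetime64/timedelta64 fill value into JSON-friendly data."""
    dtype = np.dtype(dtype)
    if dtype.kind not in "mM":
        raise TypeError("expected a datetime64 or timedelta64 dtype")
    # Normalize the input to a scalar of the target dtype and unit.
    v = np.asarray(v, dtype=dtype)[()]
    if np.isnat(v):
        return "NaT"
    # Signed view (an assumption of this sketch) so negative values survive.
    return int(v.view("i8"))


def decode_time_fill_value(encoded, dtype):
    """Hypothetical helper: inverse of encode_time_fill_value."""
    dtype = np.dtype(dtype)
    if encoded == "NaT":
        # Build NaT from the dtype's scalar type, then cast to the target unit.
        return np.array(dtype.type("NaT"), dtype=dtype)[()]
    return np.array(encoded, dtype=dtype)[()]


# Round-trip a NaT fill value and an ordinary one.
nat_back = decode_time_fill_value(
    encode_time_fill_value(np.timedelta64("NaT"), "m8[s]"), "m8[s]"
)
ninety_back = decode_time_fill_value(
    encode_time_fill_value(np.timedelta64(90, "s"), "m8[s]"), "m8[s]"
)
print(np.isnat(nat_back), ninety_back)
```

Encoding NaT as a string keeps the stored metadata readable, at the cost of a special case in the decoder.
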

@jakirkham
Member

What happens if you change u8 to f8?

jakirkham commented Nov 21, 2018

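The key point appears to be that NaT is represented as the most negative int64 value (-2**63, the bit pattern with only the sign bit set). Viewing it as an unsigned integer therefore yields 2**63, which overflows when converted back. If a signed integral type is used, this works without issues.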

In [1]: import numpy as np                                                      

In [2]: np.array(int(np.datetime64('nat').view('i8')), dtype='M8[D]')[()]       
Out[2]: numpy.datetime64('NaT')

jakirkham commented Nov 21, 2018

Have PR ( zarr-developers/numcodecs#127 ) and PR ( #344 ) in the works to fix this issue.

Edit: Have verified these fix the case described in the OP. The latter PR includes a round-trip test of NaT as a fill_value.

@jakirkham jakirkham added the bug Potential issues with the zarr-python library label Nov 21, 2018
@jakirkham jakirkham added this to the v2.3 milestone Nov 21, 2018
@thomascoquet
Author

Thank you for the fix!

@jakirkham
Member

Of course. Were you able to give it a try?

@jakirkham
Member

Closing as this should be resolved with PR ( #344 ) and Numcodecs 0.6.2+. Please let us know if you run into any issues.
