NumPy integers not accepted in chunks argument to zarr.zeros #697

Closed
mdickinson opened this issue Feb 2, 2021 · 3 comments · Fixed by #933

Comments

@mdickinson

The following code works with zarr 2.3.2, but not with more-recently-released versions of zarr (tested on zarr 2.4.0 and zarr 2.6.1). It now gives a TypeError as a result of attempting to JSON serialize a NumPy integer.

import numpy as np
import zarr

zarr.zeros((10,), chunks=(np.int64(2),))

With zarr 2.3.2, this creates a new array, as expected. With later versions of zarr, I get the following traceback (here with zarr 2.6.1):

(seismic-zarr) mdickinson@mirzakhani temp % python bug.py             
Traceback (most recent call last):
  File "bug.py", line 4, in <module>
    zarr.zeros((10,), chunks=(np.int64(2),))
  File "/Users/mdickinson/.venvs/seismic-zarr/lib/python3.8/site-packages/zarr/creation.py", line 248, in zeros
    return create(shape=shape, fill_value=0, **kwargs)
  File "/Users/mdickinson/.venvs/seismic-zarr/lib/python3.8/site-packages/zarr/creation.py", line 121, in create
    init_array(store, shape=shape, chunks=chunks, dtype=dtype, compressor=compressor,
  File "/Users/mdickinson/.venvs/seismic-zarr/lib/python3.8/site-packages/zarr/storage.py", line 344, in init_array
    _init_array_metadata(store, shape=shape, chunks=chunks, dtype=dtype,
  File "/Users/mdickinson/.venvs/seismic-zarr/lib/python3.8/site-packages/zarr/storage.py", line 434, in _init_array_metadata
    store[key] = encode_array_metadata(meta)
  File "/Users/mdickinson/.venvs/seismic-zarr/lib/python3.8/site-packages/zarr/meta.py", line 75, in encode_array_metadata
    return json_dumps(meta)
  File "/Users/mdickinson/.venvs/seismic-zarr/lib/python3.8/site-packages/zarr/util.py", line 25, in json_dumps
    return json.dumps(o, indent=4, sort_keys=True, ensure_ascii=True,
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py", line 234, in dumps
    return cls(
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type int64 is not JSON serializable

This caused a regression in some other code using zarr: tests started failing after updating zarr.

Please provide the following:

  • Value of zarr.__version__: 2.6.1
  • Value of numcodecs.__version__: 0.7.3
  • Version of Python interpreter: 3.8.7 (from MacPorts)
  • Operating system (Linux/Windows/Mac): macOS 10.15.7 (Catalina)
  • How Zarr was installed (e.g., "using pip into virtual environment", or "using conda"): using pip into a venv.

NumPy version: 1.20.0 (but I was also able to reproduce with NumPy 1.16.5).

@mdickinson
Author

Related: https://bugs.python.org/issue24313
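For context, the standard library's `json.dumps` accepts a `default` callable that is invoked for objects it cannot serialize. The sketch below shows how such a hook could convert integer-like objects before encoding. It is only an illustration: `index_default` and `IndexLike` are hypothetical names, `IndexLike` stands in for `np.int64` so the snippet runs without NumPy, and nothing here is taken from zarr's code.

```python
import json
import operator


def index_default(obj):
    """Hypothetical `default` hook: turn integer-like objects into int."""
    try:
        # operator.index works for any object implementing __index__,
        # which NumPy integer scalars do.
        return operator.index(obj)
    except TypeError:
        # Mirror the message json itself raises for unsupported objects.
        raise TypeError(
            f"Object of type {type(obj).__name__} is not JSON serializable"
        ) from None


class IndexLike:
    """Stand-in for np.int64 so the sketch runs without NumPy installed."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


meta = {"chunks": [IndexLike(2)], "shape": [10]}
encoded = json.dumps(meta, sort_keys=True, default=index_default)
print(encoded)  # {"chunks": [2], "shape": [10]}
```

A hook like this changes serialization for every value in the metadata, not just chunks, so it is a broader change than converting the chunks on input.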

@mdickinson
Author

Still an issue with zarr 2.10.2:

(zarr) mdickinson@mirzakhani edge % python -VV 
Python 3.9.7 (default, Oct  9 2021, 07:22:04) 
[Clang 12.0.5 (clang-1205.0.22.9)]
(zarr) mdickinson@mirzakhani edge % pip list
Package    Version
---------- -------
asciitree  0.3.3
fasteners  0.16.3
numcodecs  0.9.1
numpy      1.21.3
pip        21.3.1
setuptools 57.4.0
six        1.16.0
zarr       2.10.2
(zarr) mdickinson@mirzakhani edge % python
Python 3.9.7 (default, Oct  9 2021, 07:22:04) 
[Clang 12.0.5 (clang-1205.0.22.9)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import zarr
>>> zarr.zeros((10,), chunks=(np.int64(2),))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mdickinson/.venvs/zarr/lib/python3.9/site-packages/zarr/creation.py", line 268, in zeros
    return create(shape=shape, fill_value=0, **kwargs)
  File "/Users/mdickinson/.venvs/zarr/lib/python3.9/site-packages/zarr/creation.py", line 138, in create
    init_array(store, shape=shape, chunks=chunks, dtype=dtype, compressor=compressor,
  File "/Users/mdickinson/.venvs/zarr/lib/python3.9/site-packages/zarr/storage.py", line 353, in init_array
    _init_array_metadata(store, shape=shape, chunks=chunks, dtype=dtype,
  File "/Users/mdickinson/.venvs/zarr/lib/python3.9/site-packages/zarr/storage.py", line 451, in _init_array_metadata
    store[key] = encode_array_metadata(meta)
  File "/Users/mdickinson/.venvs/zarr/lib/python3.9/site-packages/zarr/meta.py", line 97, in encode_array_metadata
    return json_dumps(meta)
  File "/Users/mdickinson/.venvs/zarr/lib/python3.9/site-packages/zarr/util.py", line 29, in json_dumps
    return json.dumps(o, indent=4, sort_keys=True, ensure_ascii=True,
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 234, in dumps
    return cls(
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type int64 is not JSON serializable

@jakirkham
Member

If you would like to work on a PR, that would be welcome 🙂

My guess is that we can just add some logic to coerce NumPy integers to Python integers when receiving chunks (say, through a utility function).
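A minimal sketch of what such a utility function might look like, assuming chunks arrive as a sequence of integer-like values. The name `normalize_chunk_ints` is hypothetical and is not zarr's actual API; `IndexLike` is a stand-in for `np.int64` so the snippet runs without NumPy. Other forms of the chunks argument are not handled here.

```python
import json
import operator


def normalize_chunk_ints(chunks):
    """Hypothetical helper: return chunks as a tuple of built-in ints."""
    # operator.index accepts anything implementing __index__ (NumPy integer
    # scalars do) and raises TypeError for floats and strings.
    return tuple(operator.index(c) for c in chunks)


class IndexLike:
    """Stand-in for np.int64 so the sketch runs without NumPy installed."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


chunks = normalize_chunk_ints((IndexLike(2), 5))
encoded = json.dumps({"chunks": chunks})
print(encoded)  # {"chunks": [2, 5]}
```

Using `operator.index` rather than `int()` is a deliberate choice in this sketch: `int(2.7)` would silently truncate to 2, whereas `operator.index(2.7)` raises a `TypeError`.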
