GroupMetadata and ArrayV2Metadata should ignore extra keys for zarr v2 metadata #2296

Closed
TomAugspurger opened this issue Oct 2, 2024 · 0 comments · Fixed by #2297
Labels
bug Potential issues with the zarr-python library

Comments

@TomAugspurger
Contributor

Zarr version

v3

Numcodecs version

na

Python Version

na

Operating System

na

Installation

na

Description

Some Zarr v2 implementations, e.g. nczarr, store additional fields at the top level of the metadata files. zarr-python 3.x currently can't read these files because the deserialization methods pass every key to the GroupMetadata and ArrayV2Metadata constructors, which raise a TypeError on any keyword argument they don't recognize.

Steps to reproduce

In [10]: GroupMetadata.from_dict({"zarr_format": 2, "attributes": {"key": "value"}, "extra": "value"})
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[10], line 1
----> 1 GroupMetadata.from_dict({"zarr_format": 2, "attributes": {"key": "value"}, "extra": "value"})

File ~/gh/zarr-developers/zarr-python/src/zarr/core/group.py:119, in GroupMetadata.from_dict(cls, data)
    116 @classmethod
    117 def from_dict(cls, data: dict[str, Any]) -> GroupMetadata:
    118     assert data.pop("node_type", None) in ("group", None)
--> 119     return cls(**data)

TypeError: GroupMetadata.__init__() got an unexpected keyword argument 'extra'

Additional output

We can filter out unexpected keys before calling the constructors. We should also decide what to do when we encounter an unknown key: log a message with the Python logging module, emit a Python warning, and maybe add a config option to control the behavior.
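A minimal sketch of that filtering, assuming a dataclass-style metadata class. `GroupMetadataSketch` is a hypothetical stand-in written for illustration, not zarr-python's actual `GroupMetadata`, and emitting a `UserWarning` is only one of the options listed above:

```python
import warnings
from dataclasses import dataclass, field, fields
from typing import Any


@dataclass(frozen=True)
class GroupMetadataSketch:
    """Hypothetical stand-in for GroupMetadata; not the real zarr class."""

    zarr_format: int = 2
    attributes: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "GroupMetadataSketch":
        data = dict(data)  # copy so the caller's dict is not mutated
        assert data.pop("node_type", None) in ("group", None)

        # Keep only the keys the constructor accepts.
        expected = {f.name for f in fields(cls)}
        extra = sorted(set(data) - expected)
        if extra:
            # Assumption: warn rather than fail; logging or a config
            # switch would slot in here instead.
            warnings.warn(
                f"Ignoring unexpected metadata keys: {extra}",
                UserWarning,
                stacklevel=2,
            )
        return cls(**{k: v for k, v in data.items() if k in expected})
```

With this sketch, the dictionary from the reproduction steps loads and the `"extra"` key is dropped with a warning instead of raising a `TypeError`.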

For Zarr v3, I think the expectation is that extra keys are allowed, but they must be objects with a must_understand field.
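For comparison, a rough sketch of how a v3 reader might treat unknown keys. It assumes that an unknown key can be skipped only when its value is an object whose `must_understand` field is `false`; that reading should be checked against the Zarr v3 specification. `split_extra_keys` and its `known` parameter are hypothetical names, not part of zarr-python:

```python
from typing import Any


def split_extra_keys(
    data: dict[str, Any], known: set[str]
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate known keys from ignorable extras (hypothetical helper)."""
    kept: dict[str, Any] = {}
    ignored: dict[str, Any] = {}
    for key, value in data.items():
        if key in known:
            kept[key] = value
        elif isinstance(value, dict) and value.get("must_understand") is False:
            # Assumption: only an explicit must_understand=false is skippable.
            ignored[key] = value
        else:
            raise ValueError(f"Unexpected metadata key: {key!r}")
    return kept, ignored
```

Under these assumptions a plain string value such as `"extra": "value"` would be rejected, which is the opposite of the lenient behavior proposed for v2.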

TomAugspurger added the bug (Potential issues with the zarr-python library) label Oct 2, 2024
TomAugspurger added a commit to TomAugspurger/zarr-python that referenced this issue Oct 2, 2024
Ignore unexpected keys in Zarr V2 metadata, to enable reading zarr
files written by other systems, which might store additional data
in the top level of the `.zgroup` and `.zarray` files.

Closes zarr-developers#2296