zarr.open_array duplicates path key? #2166

Closed
TomAugspurger opened this issue Sep 10, 2024 · 3 comments · Fixed by #2167
Labels
bug Potential issues with the zarr-python library

Comments

@TomAugspurger
Contributor

Zarr version

v3

Numcodecs version

n/a

Python Version

n/a

Operating System

n/a

Installation

n/a

Description

I'm looking into some of the xarray tests, and noticed this strange(?) change from V2. It seems like zarr.open_array with path="a" will write the data to a/a/, so that the metadata is at key a/a/zarr.json. I would have expected it to be at a/zarr.json. Is this expected?

Steps to reproduce

# v3
import zarr
import shutil
shutil.rmtree("/tmp/data.zarr", ignore_errors=True)

store = zarr.store.LocalStore(root="/tmp/data.zarr", mode="w")
a = zarr.open_array(store=store, path="a", shape=(4,))
b = zarr.open_array(store=store, path="b", shape=(4,))
[x async for x in store.list()]

gives

['a/a/zarr.json', 'b/b/zarr.json']

and the arrays must be opened with zarr.open_array(store=store, path="a/a").
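The async comprehension in the reproduction runs as written only where top-level `await` is available, such as an IPython session. Below is a minimal sketch for a plain script, assuming only that `store.list()` returns an async iterator of keys, as the snippet implies. `collect_keys` and `fake_list` are hypothetical names; `fake_list` stands in for the store so the example is self-contained.

```python
import asyncio
from typing import AsyncIterator, List


async def collect_keys(keys: AsyncIterator[str]) -> List[str]:
    """Drain an async iterator of keys into a sorted list."""
    return sorted([key async for key in keys])


async def fake_list() -> AsyncIterator[str]:
    # Hypothetical stand-in for store.list(); yields the keys reported above.
    for key in ("b/b/zarr.json", "a/a/zarr.json"):
        yield key


# asyncio.run supplies the event loop that IPython provides implicitly.
listed = asyncio.run(collect_keys(fake_list()))
print(listed)
```

With a real store, `collect_keys(store.list())` would presumably take the place of `collect_keys(fake_list())`.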

Contrast that with V2, where you have

import zarr


store = "/tmp/v2.zarr"
a = zarr.open_array(store=store, path="a", shape=(4,))
b = zarr.open_array(store=store, path="b", shape=(4,))

g = zarr.open_group("/tmp/v2.zarr")
list(g.array_keys())

gives ['a', 'b'], and you open the arrays with zarr.open_array(store=store, path="a").

Additional output

No response

@TomAugspurger TomAugspurger added the bug Potential issues with the zarr-python library label Sep 10, 2024
@dcherian
Contributor

I concur that there is something bizarre here. I struggled with it:

# TODO: clean this up
# if path is None and name is None:
#     array_path = None
#     array_name = None
# elif path is None and name is not None:
#     array_path = f"{name}"
#     array_name = f"/{name}"
# elif path is not None and name is None:
#     array_path = path
#     array_name = None
# elif path == "/":
#     assert name is not None
#     array_path = name
#     array_name = "/" + name
# else:
#     assert name is not None
#     array_path = f"{path}/{name}"
#     array_name = "/" + array_path

Would be good to uncomment (and simplify) once someone figures this out :)
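As one possible simplification, the five commented-out branches collapse into three cases. This is only a sketch that mirrors the branches as written; `resolve_array_path` is a hypothetical name, and whether these are the right semantics is exactly the open question in this thread.

```python
from typing import Optional, Tuple


def resolve_array_path(
    path: Optional[str], name: Optional[str]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (array_path, array_name) following the commented-out branches."""
    if name is None:
        # Covers both "no path, no name" and "path given, no name".
        return path, None
    if path is None or path == "/":
        # The name alone locates the array at the root.
        return name, f"/{name}"
    # Otherwise the array lives under the given path.
    array_path = f"{path}/{name}"
    return array_path, f"/{array_path}"


print(resolve_array_path("group", "a"))
```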

@TomAugspurger
Contributor Author

Ah, looks like if path is provided it's appended to store path at

if path is not None:
    store_path = store_path / path

Then when we go to create the array in create we pass it again, despite it already being present in the store_path we provide as store.

I'll see if anything breaks when removing that.
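The double application described above can be reproduced in isolation. The sketch below is not zarr-python's code: `create`, `open_array_buggy`, and `open_array_fixed` are hypothetical stand-ins, and `PurePosixPath` substitutes for the store path object. It only illustrates why joining `path` at both layers yields `a/a/zarr.json`; the actual fix in #2167 may differ.

```python
from pathlib import PurePosixPath
from typing import Optional


def create(store_path: PurePosixPath, path: Optional[str] = None) -> str:
    """Hypothetical create step: returns the key the metadata is written to."""
    if path is not None:
        store_path = store_path / path
    return str(store_path / "zarr.json")


def open_array_buggy(path: Optional[str] = None) -> str:
    store_path = PurePosixPath("")
    if path is not None:
        store_path = store_path / path
    # Bug pattern: path is passed again although store_path already contains it.
    return create(store_path, path=path)


def open_array_fixed(path: Optional[str] = None) -> str:
    store_path = PurePosixPath("")
    if path is not None:
        store_path = store_path / path
    # Pass the already-joined store_path only.
    return create(store_path)


print(open_array_buggy("a"))  # a/a/zarr.json
print(open_array_fixed("a"))  # a/zarr.json
```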

TomAugspurger added a commit to TomAugspurger/zarr-python that referenced this issue Sep 10, 2024
@jhamman
Copy link
Member

jhamman commented Sep 10, 2024

Nice catch @TomAugspurger!
