Skip to content

Add ZARR_V3_API_AVAILABLE environment variable #1007

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 2, 2022

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Apr 11, 2022

attn: @joshmoore

This PR adds a ZARR_V3_API_AVAILABLE environment variable that controls whether various StoreV3 classes are publicly available via the base zarr namespace. There is one new test case corresponding to this (test_top_level_imports). I could not recall from prior discussion if this was what was wanted or if there was something additional?

As implemented here, even without defining the environment variable, the tests would run with both v3 and v2 cases and users could still use zarr version 3 classes via the convenience methods if zarr_version=3 was specified. We could raise an error on zarr_version=3 in any of these functions if the environment variable is not set, but I'm not sure that is helpful?

The second commit just moves v3-specific storage class code into a separate file to make the separation a bit cleaner.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@grlee77
Copy link
Contributor Author

grlee77 commented Apr 11, 2022

Another reorganization I played with in a different branch is to split thematically similar stores into separate files as already done for zarr/storage/absstore.py. In that case storage_v3.py here would just import those using, e.g.

from zarr._storage.absstore import ABSStoreV3
from zarr._storage.consolidatedstore import ConsolidatedMetadataStoreV3
from zarr._storage.directorystore import DirectoryStoreV3
from zarr._storage.dbmstore import DBMStoreV3
from zarr._storage.fsstore import FSStoreV3
from zarr._storage.lmdbstore import LMDBStoreV3
from zarr._storage.lrustore import LRUStoreCacheV3
from zarr._storage.memorystore import KVStoreV3, MemoryStoreV3
from zarr._storage.mongodbstore import MongoDBStoreV3
from zarr._storage.redisstore import RedisStoreV3
from zarr._storage.sqlitestore import SQLiteStoreV3
from zarr._storage.store import BaseStore
from zarr._storage.store_v3 import data_root, meta_root, StoreV3
from zarr._storage.zipstore import ZipStoreV3

the v2 store classes are in these same files as well. I think that is a little better, but has the downside that it is a larger change that would likely create conflicts with many existing PRs.

That branch is built on top of this one here: https://github.com/grlee77/zarr-python/tree/split-stores
I can open a PR for it here if it seems desirable.

@codecov
Copy link

codecov bot commented Apr 11, 2022

Codecov Report

Merging #1007 (a33266b) into master (1dc0352) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1007   +/-   ##
=======================================
  Coverage   99.94%   99.94%           
=======================================
  Files          33       34    +1     
  Lines       13663    13710   +47     
=======================================
+ Hits        13656    13703   +47     
  Misses          7        7           
Impacted Files Coverage Δ
zarr/__init__.py 100.00% <100.00%> (ø)
zarr/_storage/store.py 100.00% <100.00%> (ø)
zarr/_storage/v3.py 100.00% <100.00%> (ø)
zarr/convenience.py 100.00% <100.00%> (ø)
zarr/hierarchy.py 99.78% <100.00%> (+<0.01%) ⬆️
zarr/storage.py 100.00% <100.00%> (ø)
zarr/tests/test_attrs.py 100.00% <100.00%> (ø)
zarr/tests/test_convenience.py 100.00% <100.00%> (ø)
zarr/tests/test_core.py 100.00% <100.00%> (ø)
zarr/tests/test_creation.py 100.00% <100.00%> (ø)
... and 4 more

@jakirkham
Copy link
Member

cc @rabernat @MSanKeys963 (also as we discussed this recently)

@joshmoore
Copy link
Member

There is one new test case corresponding to this (test_top_level_imports). I could not recall from prior discussion if this was what was wanted or if there was something additional?

I'll need to take a closer look, but if the new test is being run with both options by default, this would seem to make sense.

the tests would run with both v3 and v2 cases

👍

users could still use zarr version 3 classes via the convenience methods if zarr_version=3

Ok. need to think about that one, since this is essentially another route to the same thing as the environment variable. Maybe the environment variable is overkill and simply needing to import an explicit package for the moment is enough. I'll look through the existing imports using _v3_ to see if maybe we could use v3_experimental (or v3_dev) until the time comes.

Another reorganization I played with in a different branch...

Did you happen to try: zarr._storage.v3? I'm fine if we move away from _storage as a module but I'd have a preference for having fewer naming schemes than more. (Opinions very much welcome, though)

but has the downside that it is a larger change that would likely create conflicts with many existing PRs.

In general I'm 👍 for splitting storage.py but agreed that let's figure out this first (more pressing) issue and go from there.

@jakirkham
Copy link
Member

@grlee77 did you have thoughts on Josh's questions above? Also would it be possible to resolve the merge conflicts here?

@grlee77
Copy link
Contributor Author

grlee77 commented Apr 28, 2022

@grlee77 did you have thoughts on Josh's questions above?

I will revisit this later today.

Right now the user cannot enable or disable v3 support at run time. The change here just causes __init__.py to not import the v3 stores unless this environment variable is set. Is that sufficient? If so then I think we can just have CI run at least one case with the environment variable defined and the new test will be covered.

I am also fine with moving the new file under zarr/_storage/v3.py as suggested above.

@joshmoore
Copy link
Member

@grlee77: let's call is sufficient for this PR. There are a few ongoing threads around V3 and we can loop back once those are all settled. Thanks!

I am also fine with moving the new file under zarr/_storage/v3.py as suggested above.

👍

@grlee77
Copy link
Contributor Author

grlee77 commented Apr 29, 2022

Okay, I fixed the conflicts and Linux tests now run with the environment variable enabled. In practice it is only the new test for symbol importing that relies on the environment variable. All other v3 tests still get run either way.

Copy link
Member

@joshmoore joshmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @grlee77! Merging this as a finalization of the work that's already on the mainline. Pretty sure there's a general sigh of relief from everyone that you got us to this point. 🙌🏽

The next step is likely a general evaluation of the mainline and a decision on the release status. Unless someone else has a suggestion/proposal/etc., I'd say let's start getting 2.12.0a releases out. 🚀

@joshmoore joshmoore merged commit 27cf315 into zarr-developers:master May 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants