[Diff PR] Sharding storage transformer for v3 #2

Closed
31 commits
92ce212
add sharding storage transformer
jstriebel Aug 18, 2022
f6c87b4
add actual transformer
jstriebel Aug 18, 2022
df2dd71
fixe, and allow partial reads for uncompressed v3 arrays
jstriebel Aug 22, 2022
06ce675
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 22, 2022
4c0807e
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 22, 2022
61db74a
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 22, 2022
83c9389
make lgtm happy
jstriebel Aug 22, 2022
fde61e8
add release note
jstriebel Aug 22, 2022
de4de18
better coverage
jstriebel Aug 23, 2022
0deb2b6
fix hexdigest
jstriebel Aug 23, 2022
d3eda71
improve tests
jstriebel Aug 23, 2022
093926c
fix order of storage transformers
jstriebel Aug 24, 2022
6e2790c
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 24, 2022
9257b85
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 25, 2022
e7b14b7
minor test improvement
jstriebel Aug 25, 2022
a52300c
minor test update
jstriebel Aug 25, 2022
a960481
apply PR feedback
jstriebel Sep 8, 2022
6bc1025
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Dec 12, 2022
92a48d8
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Dec 12, 2022
12dc1ae
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Dec 12, 2022
91f10ff
call ensure_bytes in sharding transformer
jstriebel Dec 12, 2022
73fb0a5
minor fixes
jstriebel Dec 12, 2022
490b962
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Dec 22, 2022
e1960a1
adapt to supports_efficient_get_partial_values property
jstriebel Dec 22, 2022
c1bc26d
add ZARR_V3_SHARDING flag for sharding usage
jstriebel Dec 22, 2022
6f5b35a
fix release notes
jstriebel Dec 22, 2022
070c02c
fix release notes
jstriebel Dec 22, 2022
ef5c020
Merge remote-tracking branch 'scm/storage-transformers-and-partial-ge…
jstriebel Dec 22, 2022
385b5d3
add storage_transformers and get/set_partial_values (#1096)
jstriebel Jan 16, 2023
652653d
Merge remote-tracking branch 'scm/storage-transformers-and-partial-ge…
jstriebel Jan 19, 2023
1ccf052
Merge remote-tracking branch 'origin/main' into sharding-storage-tran…
jstriebel Jan 19, 2023

2 changes: 2 additions & 0 deletions .github/workflows/minimal.yml
@@ -24,6 +24,7 @@ jobs:
         shell: "bash -l {0}"
         env:
           ZARR_V3_EXPERIMENTAL_API: 1
+          ZARR_V3_SHARDING: 1
         run: |
           conda activate minimal
           python -m pip install .
@@ -32,6 +33,7 @@ jobs:
         shell: "bash -l {0}"
         env:
           ZARR_V3_EXPERIMENTAL_API: 1
+          ZARR_V3_SHARDING: 1
         run: |
           conda activate minimal
           rm -rf fixture/
1 change: 1 addition & 0 deletions .github/workflows/python-package.yml
@@ -70,6 +70,7 @@ jobs:
           ZARR_TEST_MONGO: 1
           ZARR_TEST_REDIS: 1
           ZARR_V3_EXPERIMENTAL_API: 1
+          ZARR_V3_SHARDING: 1
         run: |
           conda activate zarr-env
           mkdir ~/blob_emulator
1 change: 1 addition & 0 deletions .github/workflows/windows-testing.yml
@@ -52,6 +52,7 @@ jobs:
         env:
           ZARR_TEST_ABS: 1
           ZARR_V3_EXPERIMENTAL_API: 1
+          ZARR_V3_SHARDING: 1
       - name: Conda info
         shell: bash -l {0}
         run: conda info
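
The workflow changes above export the same two environment variables in every CI job. As a rough illustration of how a local script might check whether such flags are switched on, here is a minimal sketch. The helper name `flag_enabled` and the parsing rule (unset, empty, `0`, and `false` all mean disabled) are assumptions of this sketch and are not taken from the diff.

```python
import os
from typing import Mapping, Optional


def flag_enabled(name: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the environment variable `name` holds a truthy value.

    Hypothetical helper: unset, empty, "0" and "false" (any case) count as
    disabled; every other value counts as enabled.
    """
    env = os.environ if environ is None else environ
    return env.get(name, "0").strip().lower() not in ("0", "false", "")


# Mirror the variables that the CI workflows set (values copied from the diff).
ci_env = {"ZARR_V3_EXPERIMENTAL_API": "1", "ZARR_V3_SHARDING": "1"}

for flag in ("ZARR_V3_EXPERIMENTAL_API", "ZARR_V3_SHARDING"):
    print(flag, flag_enabled(flag, ci_env))
```

To mirror the CI setup locally, both variables would typically be exported in the shell before starting Python, since a library may read such flags at import time.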
4 changes: 3 additions & 1 deletion docs/release.rst
@@ -18,7 +18,9 @@ Unreleased
 Add two features of the [v3 spec](https://zarr-specs.readthedocs.io/en/latest/core/v3.0.html):
 * storage transformers
 * `get_partial_values` and `set_partial_values`
-By :user:`Jonathan Striebel <jstriebel>`; :issue:`1096`.
+* efficient `get_partial_values` implementation for `FSStoreV3`
+* sharding storage transformer
+By :user:`Jonathan Striebel <jstriebel>`; :issue:`1096`, :issue:`1111`.

 .. _release_2.13.6:

29 changes: 29 additions & 0 deletions zarr/_storage/v3.py
@@ -182,6 +182,35 @@ def rmdir(self, path=None):
         if self.fs.isdir(store_path):
             self.fs.rm(store_path, recursive=True)

+    @property
+    def supports_efficient_get_partial_values(self):
+        return True
+
+    def get_partial_values(self, key_ranges):
+        """Get multiple partial values.
+        key_ranges can be an iterable of key, range pairs,
+        where a range specifies two integers range_start and range_length
+        as a tuple, (range_start, range_length).
+        range_length may be None to indicate to read until the end.
+        range_start may be negative to start reading range_start bytes
+        from the end of the file.
+        A key may occur multiple times with different ranges.
+        Inserts None for missing keys into the returned list."""
+        results = []
+        for key, (range_start, range_length) in key_ranges:
+            key = self._normalize_key(key)
+            path = self.dir_path(key)
+            try:
+                if range_start is None or range_length is None:
+                    end = None
+                else:
+                    end = range_start + range_length
+                result = self.fs.cat_file(path, start=range_start, end=end)
+            except self.map.missing_exceptions:
+                result = None
+            results.append(result)
+        return results
+

 class MemoryStoreV3(MemoryStore, StoreV3):

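
The docstring of `get_partial_values` above describes the `(range_start, range_length)` convention in prose. The following is a minimal, self-contained sketch of those semantics over a plain in-memory mapping. It is not the `FSStoreV3` code from the diff: the function name `get_partial_values_sketch`, the `dict`-backed store, and the choice to resolve a negative start to an absolute offset before adding the length are all assumptions made for illustration.

```python
from typing import Iterable, List, Mapping, Optional, Tuple

# One request: a key plus (range_start, range_length); the length may be None.
KeyRange = Tuple[str, Tuple[int, Optional[int]]]


def get_partial_values_sketch(
    store: Mapping[str, bytes], key_ranges: Iterable[KeyRange]
) -> List[Optional[bytes]]:
    """Illustrative in-memory version of the documented range semantics."""
    results: List[Optional[bytes]] = []
    for key, (range_start, range_length) in key_ranges:
        value = store.get(key)
        if value is None:
            # Missing keys yield None in the returned list.
            results.append(None)
            continue
        if range_length is None:
            # No length given: read from range_start to the end of the value.
            results.append(value[range_start:])
            continue
        # Sketch-specific choice: turn a negative start into an absolute
        # offset first, so start + length never wraps around to index 0.
        start = range_start if range_start >= 0 else max(len(value) + range_start, 0)
        results.append(value[start:start + range_length])
    return results


store = {"c0": b"0123456789"}
requests = [
    ("c0", (2, 3)),      # three bytes starting at offset 2
    ("c0", (-4, None)),  # the last four bytes
    ("c0", (-4, 2)),     # two bytes starting four bytes before the end
    ("missing", (0, 1)), # key not in the store
]
print(get_partial_values_sketch(store, requests))
```

The same key appears several times with different ranges, which the docstring explicitly allows, and results come back in request order.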