Commit 2230848

Merge branch 'main' into feature/optional-uvloop
2 parents 23ba156 + 62551c7 commit 2230848

40 files changed: +2563 -487 lines changed

.github/workflows/check_changelogs.yml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ jobs:
       - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
 
       - name: Install uv
-        uses: astral-sh/setup-uv@557e51de59eb14aaaba2ed9621916900a91d50c6 # v6.6.1
+        uses: astral-sh/setup-uv@b75a909f75acd358c2196fb9a5f1299a9a8868a4 # v6.7.0
 
       - name: Check changelog entries
         run: uv run --no-sync python ci/check_changelog_entries.py

.github/workflows/gpu_test.yml

Lines changed: 2 additions & 0 deletions
@@ -30,6 +30,8 @@ jobs:
 
     steps:
       - uses: actions/checkout@v5
+        with:
+          fetch-depth: 0 # grab all branches and tags
       # - name: cuda-toolkit
       #   uses: Jimver/[email protected]
       #   id: cuda-toolkit

changes/1798.feature.rst

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+Add a command-line interface to migrate v2 Zarr metadata to v3. Corresponding functions are also
+provided under zarr.metadata.

changes/2992.bugfix.rst

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Fix a bug preventing ``ones_like``, ``full_like``, ``empty_like``, ``zeros_like`` and ``open_like`` functions from accepting
+an explicit specification of array attributes like shape, dtype, chunks etc. The functions ``full_like``,
+``empty_like``, and ``open_like`` now also more consistently infer a ``fill_value`` parameter from the provided array.
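
To illustrate the behaviour this changelog entry describes, here is a minimal sketch in plain Python of how a ``*_like`` helper can infer attributes from a template array while letting explicit keyword arguments take precedence. ``TemplateArray`` and ``like_kwargs`` are hypothetical names invented for this example; they are not part of zarr, and zarr's actual implementation may well differ.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class TemplateArray:
    """Hypothetical stand-in for an array-like template (not a zarr class)."""

    shape: Tuple[int, ...]
    dtype: str
    chunks: Tuple[int, ...]
    fill_value: Any = 0


def like_kwargs(template: TemplateArray, **overrides: Any) -> Dict[str, Any]:
    """Infer creation arguments from ``template``; explicit overrides win."""
    inferred = {
        "shape": template.shape,
        "dtype": template.dtype,
        "chunks": template.chunks,
        "fill_value": template.fill_value,
    }
    # Later entries win in a dict merge, so caller-supplied values
    # replace the inferred ones instead of clashing with them.
    return {**inferred, **overrides}


template = TemplateArray(shape=(10,), dtype="int32", chunks=(5,), fill_value=-1)
print(like_kwargs(template, shape=(20,)))
```

Merging the inferred values first and the overrides second is one simple way to avoid passing the same argument twice, which is the kind of conflict an explicit ``shape`` or ``dtype`` could otherwise cause.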

changes/3390.misc.rst

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Improve documentation consistency across API functions and remove outdated references to deprecated configuration values that no longer work.

changes/3436.feature.rst

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+Adds a registry for chunk key encodings for extensibility.
+This allows users to implement a custom `ChunkKeyEncoding`, which can be registered via `register_chunk_key_encoding` or as an entry point under `zarr.chunk_key_encoding`.
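
The registry idea can be sketched in plain Python as a mapping from a name to a class. Everything below is hypothetical and only illustrates the pattern: ``register_encoding``, ``get_encoding`` and ``ReversedKeyEncoding`` are names made up for this example, and the real signature of ``register_chunk_key_encoding`` and the ``ChunkKeyEncoding`` interface are not shown in this commit excerpt.

```python
from typing import Dict, Tuple, Type

# Hypothetical registry mapping an encoding name to its class. It only
# illustrates the pattern and is not zarr's actual registry.
_ENCODINGS: Dict[str, Type] = {}


def register_encoding(name: str, cls: Type) -> None:
    """Register ``cls`` under ``name``, refusing to overwrite a different class."""
    existing = _ENCODINGS.get(name)
    if existing is not None and existing is not cls:
        raise ValueError(f"an encoding named {name!r} is already registered")
    _ENCODINGS[name] = cls


def get_encoding(name: str) -> Type:
    """Look up a registered encoding class by name."""
    try:
        return _ENCODINGS[name]
    except KeyError:
        raise KeyError(f"no encoding registered under {name!r}") from None


class ReversedKeyEncoding:
    """Hypothetical encoding that joins chunk coordinates in reverse order."""

    def __init__(self, separator: str = "/") -> None:
        self.separator = separator

    def encode_chunk_key(self, coords: Tuple[int, ...]) -> str:
        return self.separator.join(str(c) for c in reversed(coords))


register_encoding("reversed", ReversedKeyEncoding)
encoding = get_encoding("reversed")(separator=".")
print(encoding.encode_chunk_key((1, 2, 3)))
```

Looking classes up by name is what lets stored metadata refer to an encoding as a plain string while the implementation lives in user code or a third-party package.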

changes/3444.feature.rst

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Trying to open a group at a path where an array already exists now raises a helpful error.

docs/user-guide/cli.rst

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
+.. _user-guide-cli:
+
+Command-line interface
+========================
+
+Zarr-Python provides a command-line interface that enables:
+
+- migration of Zarr v2 metadata to v3
+- removal of v2 or v3 metadata
+
+To see the available commands, run the following in a terminal:
+
+.. code-block:: bash
+
+    $ zarr --help
+
+or to get help on individual commands:
+
+.. code-block:: bash
+
+    $ zarr migrate --help
+
+    $ zarr remove-metadata --help
+
+
+Migrate metadata from v2 to v3
+------------------------------
+
+Migrate to a separate location
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To migrate the metadata of a Zarr array or group from v2 to v3, run:
+
+.. code-block:: bash
+
+    $ zarr migrate v3 path/to/input.zarr path/to/output.zarr
+
+This will write new ``zarr.json`` files to ``output.zarr``, leaving ``input.zarr`` untouched.
+Note that this migrates the entire Zarr hierarchy: if ``input.zarr`` contains multiple groups or arrays,
+a new ``zarr.json`` file is written for each of them.
+
+Migrate in-place
+~~~~~~~~~~~~~~~~
+
+If you'd prefer to migrate the metadata in place, run:
+
+.. code-block:: bash
+
+    $ zarr migrate v3 path/to/input.zarr
+
+This will write new ``zarr.json`` files to ``input.zarr``, leaving the existing v2 metadata untouched.
+
+To open the array or group using the new metadata, use:
+
+.. code-block:: python
+
+    >>> import zarr
+    >>> zarr_with_v3_metadata = zarr.open('path/to/input.zarr', zarr_format=3)
+
+Once you are happy with the conversion, you can run the following to remove the old v2 metadata:
+
+.. code-block:: bash
+
+    $ zarr remove-metadata v2 path/to/input.zarr
+
+Note there is also a shortcut to migrate and remove v2 metadata in one step:
+
+.. code-block:: bash
+
+    $ zarr migrate v3 path/to/input.zarr --remove-v2-metadata
+
+
+Remove metadata
+----------------
+
+Remove v2 metadata using:
+
+.. code-block:: bash
+
+    $ zarr remove-metadata v2 path/to/input.zarr
+
+or v3 with:
+
+.. code-block:: bash
+
+    $ zarr remove-metadata v3 path/to/input.zarr
+
+By default, metadata can only be removed if a valid alternative exists. For example, you can't
+remove v2 metadata unless v3 metadata exists at that location.
+
+To override this behaviour, use ``--force``:
+
+.. code-block:: bash
+
+    $ zarr remove-metadata v3 path/to/input.zarr --force
+
+
+Dry run
+--------
+All commands provide a ``--dry-run`` option that logs the changes a real run would make, without creating
+or modifying any files.
+
+.. code-block:: bash
+
+    $ zarr migrate v3 path/to/input.zarr --dry-run
+
+    Dry run enabled - no new files will be created or changed. Log of files that would be created on a real run:
+    Saving metadata to path/to/input.zarr/zarr.json
+
+
+Verbose
+--------
+You can also add ``--verbose`` **before** any command to see a full log of its actions:
+
+.. code-block:: bash
+
+    $ zarr --verbose migrate v3 path/to/input.zarr
+
+    $ zarr --verbose remove-metadata v2 path/to/input.zarr
+
+
+Equivalent functions
+--------------------
+All features of the command-line interface are also available via functions under
+:mod:`zarr.metadata`.
+
+
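
The removal rule, ``--force`` and ``--dry-run`` described in this guide can be sketched for a single local directory as follows. This is a simplified illustration under stated assumptions, not the code behind the ``zarr`` CLI: ``remove_metadata`` and ``metadata_files`` are hypothetical helpers written for this example, the sketch does not recurse into a hierarchy, and it treats ``.zarray``/``.zgroup`` (v2) and ``zarr.json`` (v3) as the files that mark a node as having metadata.

```python
from pathlib import Path
from typing import List

# File names used by each metadata format for a single array or group.
V2_NODE_FILES = (".zarray", ".zgroup")  # mark a node as having v2 metadata
V2_ALL_FILES = V2_NODE_FILES + (".zattrs", ".zmetadata")
V3_FILES = ("zarr.json",)


def metadata_files(node: Path, version: str, node_only: bool = False) -> List[Path]:
    """Return the existing metadata files of ``version`` ('v2' or 'v3') in ``node``."""
    if version == "v2":
        names = V2_NODE_FILES if node_only else V2_ALL_FILES
    elif version == "v3":
        names = V3_FILES
    else:
        raise ValueError(f"unknown metadata version: {version!r}")
    return [node / name for name in names if (node / name).is_file()]


def remove_metadata(
    node: Path, version: str, force: bool = False, dry_run: bool = False
) -> List[Path]:
    """Hypothetical helper: delete ``version`` metadata from one node.

    Refuses unless the other format is present (or ``force`` is set).
    With ``dry_run`` the files are only reported, never deleted.
    """
    other = "v3" if version == "v2" else "v2"
    targets = metadata_files(node, version)
    if not force and not metadata_files(node, other, node_only=True):
        raise RuntimeError(
            f"refusing to remove {version} metadata: no {other} metadata in {node}"
        )
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Returning the list of affected files in both modes means a dry run and a real run can share the same reporting code, with the deletion being the only step that is skipped.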

docs/user-guide/config.rst

Lines changed: 1 addition & 2 deletions
@@ -28,7 +28,6 @@ Configuration options include the following:
 
 - Default Zarr format ``default_zarr_version``
 - Default array order in memory ``array.order``
-- Default filters, serializers and compressors, e.g. ``array.v3_default_filters``, ``array.v3_default_serializer``, ``array.v3_default_compressors``, ``array.v2_default_filters`` and ``array.v2_default_compressor``
 - Whether empty chunks are written to storage ``array.write_empty_chunks``
 - Async and threading options, e.g. ``async.concurrency`` and ``threading.max_workers``
 - Selections of implementations of codecs, codec pipelines and buffers
@@ -62,7 +61,7 @@ This is the current default configuration::
 'numcodecs.delta': 'zarr.codecs.numcodecs.Delta',
 'numcodecs.fixedscaleoffset': 'zarr.codecs.numcodecs.FixedScaleOffset',
 'numcodecs.fletcher32': 'zarr.codecs.numcodecs.Fletcher32',
-'numcodecs.gZip': 'zarr.codecs.numcodecs.GZip',
+'numcodecs.gzip': 'zarr.codecs.numcodecs.GZip',
 'numcodecs.jenkins_lookup3': 'zarr.codecs.numcodecs.JenkinsLookup3',
 'numcodecs.lz4': 'zarr.codecs.numcodecs.LZ4',
 'numcodecs.lzma': 'zarr.codecs.numcodecs.LZMA',

docs/user-guide/consolidated_metadata.rst

Lines changed: 35 additions & 38 deletions
@@ -49,44 +49,41 @@ that can be used.:
 >>> from pprint import pprint
 >>> pprint(dict(consolidated_metadata.items()))
 {'a': ArrayV3Metadata(shape=(1,),
-data_type=Float64(endianness='little'),
-chunk_grid=RegularChunkGrid(chunk_shape=(1,)),
-chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
-separator='/'),
-fill_value=np.float64(0.0),
-codecs=(BytesCodec(endian=<Endian.little: 'little'>),
-ZstdCodec(level=0, checksum=False)),
-attributes={},
-dimension_names=None,
-zarr_format=3,
-node_type='array',
-storage_transformers=()),
-'b': ArrayV3Metadata(shape=(2, 2),
-data_type=Float64(endianness='little'),
-chunk_grid=RegularChunkGrid(chunk_shape=(2, 2)),
-chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
-separator='/'),
-fill_value=np.float64(0.0),
-codecs=(BytesCodec(endian=<Endian.little: 'little'>),
-ZstdCodec(level=0, checksum=False)),
-attributes={},
-dimension_names=None,
-zarr_format=3,
-node_type='array',
-storage_transformers=()),
-'c': ArrayV3Metadata(shape=(3, 3, 3),
-data_type=Float64(endianness='little'),
-chunk_grid=RegularChunkGrid(chunk_shape=(3, 3, 3)),
-chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
-separator='/'),
-fill_value=np.float64(0.0),
-codecs=(BytesCodec(endian=<Endian.little: 'little'>),
-ZstdCodec(level=0, checksum=False)),
-attributes={},
-dimension_names=None,
-zarr_format=3,
-node_type='array',
-storage_transformers=())}
+data_type=Float64(endianness='little'),
+chunk_grid=RegularChunkGrid(chunk_shape=(1,)),
+chunk_key_encoding=DefaultChunkKeyEncoding(separator='/'),
+fill_value=np.float64(0.0),
+codecs=(BytesCodec(endian=<Endian.little: 'little'>),
+ZstdCodec(level=0, checksum=False)),
+attributes={},
+dimension_names=None,
+zarr_format=3,
+node_type='array',
+storage_transformers=()),
+'b': ArrayV3Metadata(shape=(2, 2),
+data_type=Float64(endianness='little'),
+chunk_grid=RegularChunkGrid(chunk_shape=(2, 2)),
+chunk_key_encoding=DefaultChunkKeyEncoding(separator='/'),
+fill_value=np.float64(0.0),
+codecs=(BytesCodec(endian=<Endian.little: 'little'>),
+ZstdCodec(level=0, checksum=False)),
+attributes={},
+dimension_names=None,
+zarr_format=3,
+node_type='array',
+storage_transformers=()),
+'c': ArrayV3Metadata(shape=(3, 3, 3),
+data_type=Float64(endianness='little'),
+chunk_grid=RegularChunkGrid(chunk_shape=(3, 3, 3)),
+chunk_key_encoding=DefaultChunkKeyEncoding(separator='/'),
+fill_value=np.float64(0.0),
+codecs=(BytesCodec(endian=<Endian.little: 'little'>),
+ZstdCodec(level=0, checksum=False)),
+attributes={},
+dimension_names=None,
+zarr_format=3,
+node_type='array',
+storage_transformers=())}
 
 Operations on the group to get children automatically use the consolidated metadata.:
 
