Open
Labels
bug (Potential issues with the zarr-python library)
Description
Zarr version
3.1.3
Numcodecs version
0.15.1
Python Version
3.12.9
Operating System
Linux
Installation
uv pip install
Description
Timing a range of indexing operations under Zarr 2 and Zarr 3 shows that version 3 is slower than version 2 in every case tested. I have included the code to run the comparisons and some results below. All tests run in memory with compressors disabled.
For example, 1000 calls to data[::step] take 3.9 s with Zarr 3 versus 0.5 s with Zarr 2. This may be due to the overhead of switching between synchronous and asynchronous code, but that is only a guess.
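To put a rough number on the sync/async guess, here is a minimal standard-library sketch, not zarr code. It assumes a design in which each synchronous call hands a coroutine to an event loop running in a background thread and blocks until the result comes back; whether zarr-python 3 does exactly this is an assumption, and the names `read_sync`, `read_async`, and `read_via_loop` are hypothetical.

```python
import asyncio
import threading
import timeit


def read_sync(buf, i):
    """Plain synchronous read: the baseline."""
    return buf[i]


async def read_async(buf, i):
    """The same read expressed as a coroutine."""
    return buf[i]


# A dedicated event loop running in a daemon thread. This is an assumed
# design, used only to illustrate the cost of crossing the sync/async boundary.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()


def read_via_loop(buf, i):
    """Submit the coroutine to the background loop and block on the result."""
    future = asyncio.run_coroutine_threadsafe(read_async(buf, i), loop)
    return future.result()


buf = list(range(100))
n = 1000
direct = timeit.timeit(lambda: read_sync(buf, 5), number=n)
bridged = timeit.timeit(lambda: read_via_loop(buf, 5), number=n)
print(f"direct : {direct / n * 1e6:8.2f} us per call")
print(f"bridged: {bridged / n * 1e6:8.2f} us per call")
```

Comparing the bridged per-call cost with the per-call gap in the results below would indicate whether a single round trip could plausibly account for the slowdown, or whether the cost would have to be paid several times per indexing operation or come from somewhere else.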
Steps to reproduce
import zarr
import numpy as np
import timeit
import inspect

version = int(zarr.__version__.split(".")[0])
values = np.ones(shape=(60632, 4, 1, 10))
root = zarr.group()
if version < 3:
    data = root.create_dataset("data", data=values, shape=values.shape, compressor=None)
else:
    data = root.create_array("data", data=values, compressors=None)

start = 15
end = data.shape[0] - 15
step = data.shape[0] // 10


def set_values():
    data[:] = 2


tests = [
    lambda: data[0:10, :, 0],
    lambda: data[:, 0:3, 0],
    lambda: data[0:10, 0:3, 0],
    lambda: data[:, :, :],
    lambda: data[0],
    lambda: data[0, :],
    lambda: data[0, 0, :],
    lambda: data[0, 0, 0, :],
    lambda: data[start:end:step],
    lambda: data[start:end],
    lambda: data[start:],
    lambda: data[:end],
    lambda: data[::step],
    lambda: set_values(),
]

for i, t in enumerate(tests):
    src = inspect.getsourcelines(t)[0][0].strip().replace("lambda:", "").strip(",")
    elapsed = timeit.timeit(t, number=1000)
    print(f"zarr{version}: {src:22}: {elapsed:10.4f} seconds")
Additional output
Running the test above with Zarr 2.18.7:
zarr2:  data[0:10, :, 0]     :     0.1546 seconds
zarr2:  data[:, 0:3, 0]      :     4.1608 seconds
zarr2:  data[0:10, 0:3, 0]   :     0.1267 seconds
zarr2:  data[:, :, :]        :     6.2516 seconds
zarr2:  data[0]              :     0.1478 seconds
zarr2:  data[0, :]           :     0.1492 seconds
zarr2:  data[0, 0, :]        :     0.0602 seconds
zarr2:  data[0, 0, 0, :]     :     0.0676 seconds
zarr2:  data[start:end:step] :     0.5197 seconds
zarr2:  data[start:end]      :     6.2125 seconds
zarr2:  data[start:]         :     6.2291 seconds
zarr2:  data[:end]           :     6.2185 seconds
zarr2:  data[::step]         :     0.5151 seconds
zarr2:  set_values()         :     3.2621 seconds
and with Zarr 3.1.3:
zarr3:  data[0:10, :, 0]     :     1.1350 seconds
zarr3:  data[:, 0:3, 0]      :     7.1892 seconds
zarr3:  data[0:10, 0:3, 0]   :     0.9352 seconds
zarr3:  data[:, :, :]        :    10.4327 seconds
zarr3:  data[0]              :     1.1225 seconds
zarr3:  data[0, :]           :     1.1331 seconds
zarr3:  data[0, 0, :]        :     0.5101 seconds
zarr3:  data[0, 0, 0, :]     :     0.5125 seconds
zarr3:  data[start:end:step] :     3.8887 seconds
zarr3:  data[start:end]      :    10.4090 seconds
zarr3:  data[start:]         :    10.4113 seconds
zarr3:  data[:end]           :    10.4148 seconds
zarr3:  data[::step]         :     3.9100 seconds
zarr3:  set_values()         :    17.0404 seconds
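To make the two listings easier to compare, the following sketch computes the slowdown factor and the added time per call for each expression. The numbers are copied from the output above; the dictionary names `zarr2`, `zarr3`, `slowdown`, and `added_ms` are only illustrative. Because each expression was timed over 1000 calls, the elapsed seconds equal milliseconds per call.

```python
# Timings copied from the two result listings (seconds per 1000 calls).
zarr2 = {
    "data[0:10, :, 0]": 0.1546, "data[:, 0:3, 0]": 4.1608,
    "data[0:10, 0:3, 0]": 0.1267, "data[:, :, :]": 6.2516,
    "data[0]": 0.1478, "data[0, :]": 0.1492,
    "data[0, 0, :]": 0.0602, "data[0, 0, 0, :]": 0.0676,
    "data[start:end:step]": 0.5197, "data[start:end]": 6.2125,
    "data[start:]": 6.2291, "data[:end]": 6.2185,
    "data[::step]": 0.5151, "set_values()": 3.2621,
}
zarr3 = {
    "data[0:10, :, 0]": 1.1350, "data[:, 0:3, 0]": 7.1892,
    "data[0:10, 0:3, 0]": 0.9352, "data[:, :, :]": 10.4327,
    "data[0]": 1.1225, "data[0, :]": 1.1331,
    "data[0, 0, :]": 0.5101, "data[0, 0, 0, :]": 0.5125,
    "data[start:end:step]": 3.8887, "data[start:end]": 10.4090,
    "data[start:]": 10.4113, "data[:end]": 10.4148,
    "data[::step]": 3.9100, "set_values()": 17.0404,
}

# Ratio of Zarr 3 to Zarr 2 time, and the absolute difference per call.
slowdown = {expr: zarr3[expr] / t2 for expr, t2 in zarr2.items()}
added_ms = {expr: zarr3[expr] - t2 for expr, t2 in zarr2.items()}

for expr in zarr2:
    print(f"{expr:22} {slowdown[expr]:5.1f}x slower, "
          f"+{added_ms[expr]:7.4f} ms per call")
```

The factors range from about 1.7x for the whole-array reads to about 8.5x for the smallest reads. In absolute terms, small reads gain roughly 0.45 to 1 ms per call, whole-array reads about 4.2 ms, and set_values() about 13.8 ms.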