Add asynchronous load method #10327
Conversation
xarray/backends/common.py
```diff
@@ -267,13 +268,23 @@ def robust_getitem(array, key, catch=Exception, max_retries=6, initial_delay=500
         time.sleep(1e-3 * next_delay)


-class BackendArray(NdimSizeLenMixin, indexing.ExplicitlyIndexed):
+class BackendArray(ABC, NdimSizeLenMixin, indexing.ExplicitlyIndexed):
```
As `__getitem__` is required, I feel like `BackendArray` should always have been an ABC.
This class is public API and this is a backwards incompatible change.
It is technically, but only if someone is using this class in a way counter to what the docs explicitly tell you to do (i.e. subclass it).
Regardless, this is orthogonal to the rest of the PR; I can remove it. I was just trying to clean up bad things I found.
Reverted in 6c47e3f
```python
async def async_getitem(key: indexing.ExplicitIndexer) -> np.typing.ArrayLike:
    raise NotImplementedError("Backend does not support asynchronous loading")
```
I've implemented this for the `ZarrArray` class but in theory it could be supported by other backends too.
This might not be the desired behaviour though: this currently means that if you opened a dataset from netCDF and called `ds.load_async` you would get a `NotImplementedError`. Would it be better to quietly just block instead?
Yes absolutely.
Okay I can do that. But can you explain why you feel that this would be better behaviour? Asking for something to be done async and it quietly blocking also seems not great...
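One possible middle ground (a hedged sketch, not what the PR currently does): instead of raising, a default `async_getitem` could run the blocking `__getitem__` on a worker thread, so `load_async` on a sync-only backend still completes concurrently with other work rather than erroring. `SyncOnlyBackendArray` below is a hypothetical stand-in, not a real xarray class:

```python
import asyncio

import numpy as np


class SyncOnlyBackendArray:
    """Hypothetical backend array that only implements a blocking __getitem__."""

    def __getitem__(self, key):
        return np.arange(4)[key]

    async def async_getitem(self, key):
        # Fallback: run the blocking __getitem__ on a worker thread instead of
        # raising NotImplementedError, so load_async quietly degrades.
        return await asyncio.to_thread(self.__getitem__, key)


arr = SyncOnlyBackendArray()
result = asyncio.run(arr.async_getitem(slice(None)))
print(result.tolist())  # [0, 1, 2, 3]
```

The trade-off is exactly the one debated above: the call is still async-shaped from the caller's perspective, but each individual read consumes a thread while it blocks.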
```python
# load everything else concurrently
coros = [
    v.load_async() for k, v in self.variables.items() if k not in chunked_data
]
await asyncio.gather(*coros)
```
We could actually do this same thing inside of the synchronous `ds.load()` too, but it would require:

- Xarray to decide how to call the async code, e.g. with a `ThreadPool` or similar (see Support concurrent loading of variables #8965)
- The backend to support `async_getitem` (it could fall back to synchronous loading if it's not supported)
We should rate-limit all `gather` calls with a Semaphore, using something like this:
```python
async def async_gather(*coros, concurrency: Optional[int] = None, return_exceptions: bool = False) -> list[Any]:
    """Execute a gather while limiting the number of concurrent tasks.

    Args:
        coros: coroutines
            list of coroutines to execute
        concurrency: int
            concurrency limit
            if None, defaults to config_obj.get('async.concurrency', 4)
            if <= 0, no concurrency limit
    """
    if concurrency is None:
        concurrency = int(config_obj.get("async.concurrency", 4))

    if concurrency > 0:
        # use a semaphore to limit the number of concurrent coroutines
        semaphore = asyncio.Semaphore(concurrency)

        async def sem_coro(coro):
            async with semaphore:
                return await coro

        results = await asyncio.gather(*(sem_coro(c) for c in coros), return_exceptions=return_exceptions)
    else:
        results = await asyncio.gather(*coros, return_exceptions=return_exceptions)
    return results
```
Arguably that should be left to the underlying storage layer. Zarr already has its own rate limiting. Why introduce this additional complexity and configuration parameter in Xarray?
Does zarr rate-limit per call or globally though? If it's rate-limited per call, and we make lots of concurrent calls from the xarray API, it will exceed the intended rate set in zarr...
I'm not 100% on what Zarr will do but this will rate limit across Xarray variables. We will undoubtedly want to offer control here, even if the default is None for a start.
There is something funky going on when running the following script:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "arraylake",
#     "yappi",
#     "zarr==3.0.8",
#     "xarray",
#     "icechunk",
# ]
#
# [tool.uv.sources]
# xarray = { git = "https://github.com/TomNicholas/xarray", rev = "async.load" }
# ///
import asyncio
from collections.abc import Iterable
from typing import TypeVar

import numpy as np
import xarray as xr
import zarr
from zarr.abc.store import ByteRequest, Store
from zarr.core.buffer import Buffer, BufferPrototype
from zarr.storage._wrapper import WrapperStore

T_Store = TypeVar("T_Store", bound=Store)


class LatencyStore(WrapperStore[T_Store]):
    """Works the same way as the zarr LoggingStore"""

    latency: float

    def __init__(
        self,
        store: T_Store,
        latency: float = 0.0,
    ) -> None:
        """
        Store wrapper that adds artificial latency to each get call.

        Parameters
        ----------
        store : Store
            Store to wrap
        latency : float
            Amount of artificial latency to add to each get call, in seconds.
        """
        super().__init__(store)
        self.latency = latency

    def __str__(self) -> str:
        return f"latency-{self._store}"

    def __repr__(self) -> str:
        return f"LatencyStore({self._store.__class__.__name__}, '{self._store}', latency={self.latency})"

    async def get(
        self,
        key: str,
        prototype: BufferPrototype,
        byte_range: ByteRequest | None = None,
    ) -> Buffer | None:
        await asyncio.sleep(self.latency)
        return await self._store.get(
            key=key, prototype=prototype, byte_range=byte_range
        )

    async def get_partial_values(
        self,
        prototype: BufferPrototype,
        key_ranges: Iterable[tuple[str, ByteRequest | None]],
    ) -> list[Buffer | None]:
        await asyncio.sleep(self.latency)
        return await self._store.get_partial_values(
            prototype=prototype, key_ranges=key_ranges
        )


memorystore = zarr.storage.MemoryStore({})

shape = 5
X = np.arange(5) * 10
ds = xr.Dataset(
    {
        "data": xr.DataArray(
            np.zeros(shape),
            coords={"x": X},
        )
    }
)
ds.to_zarr(memorystore)

latencystore = LatencyStore(memorystore, latency=0.1)
ds = xr.open_zarr(latencystore, zarr_format=3, consolidated=False, chunks=None)

# no problem for any of these
asyncio.run(ds["data"][0].load_async())
asyncio.run(ds["data"].sel(x=10).load_async())
asyncio.run(ds["data"].sel(x=11, method="nearest").load_async())

# also fine
ds["data"].sel(x=[30, 40]).load()

# broken!
asyncio.run(ds["data"].sel(x=[30, 40]).load_async())
```
xarray/backends/zarr.py
```python
elif isinstance(key, indexing.VectorizedIndexer):
    # TODO
    method = self._vindex
elif isinstance(key, indexing.OuterIndexer):
    # TODO
    method = self._oindex
```
@ianhi almost certainly these need to become async to fix your bug
Outer (also known as "Orthogonal") indexing support added in 5eacdb0, but requires changes to zarr-python: zarr-developers/zarr-python#3083
xarray/tests/test_async.py
```python
# test vectorized indexing
# TODO this shouldn't pass! I haven't implemented async vectorized indexing yet...
indexer = xr.DataArray([2, 3], dims=["x"])
result = await ds.foo[indexer].load_async()
xrt.assert_identical(result, ds.foo[indexer].load())
```
This currently passes, even though it shouldn't, because I haven't added support for async vectorized indexing yet!
I think this means that my test is wrong, and what I'm doing here is apparently not vectorized indexing. I'm unsure what my test would have to look like though 😕
This is an outer indexer. Try `xr.DataArray([[2, 3]], dims=["y", "x"])`.
I since worked this out, but apparently haven't pushed those changes. Note that it requires changes in Zarr too to make async lazy vectorized indexing work
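For reference, the distinction at issue here can be sketched in plain NumPy terms (this snippet is illustrative only and does not touch the PR's branch): outer indexing takes every combination of the per-dimension index lists, while vectorized indexing pairs them up pointwise.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Outer ("orthogonal") indexing: every combination of the two index lists,
# i.e. rows {0, 2} crossed with columns {1, 3}.
outer = a[np.ix_([0, 2], [1, 3])]
print(outer.shape)  # (2, 2)

# Vectorized (pointwise) indexing: indices are paired elementwise,
# selecting the individual points (0, 1) and (2, 3).
points = a[[0, 2], [1, 3]]
print(points.shape)    # (2,)
print(points.tolist()) # [1, 11]
```

In xarray terms, the original test's `xr.DataArray([2, 3], dims=["x"])` indexer shares the array's own dimension, so it decomposes into outer indexing; an indexer with a new dimension is what forces the vectorized path.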
```python
        case "ds":
            return ds


def assert_time_as_expected(
```
Let's instead use mocks to assert the async methods were called; Xarray's job is only to call them.
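A minimal sketch of that mock-based style using the standard library's `AsyncMock` (the `load_via_backend` helper is hypothetical, just to show the assertion pattern rather than timing the load):

```python
import asyncio
from unittest.mock import AsyncMock

backend_array = AsyncMock()
backend_array.async_getitem.return_value = "chunk-bytes"


async def load_via_backend(arr, key):
    # Stand-in for the code path under test: it should await async_getitem.
    return await arr.async_getitem(key)


result = asyncio.run(load_via_backend(backend_array, (slice(None),)))

# Assert only that the async method was awaited with the right key.
backend_array.async_getitem.assert_awaited_once_with((slice(None),))
print(result)  # chunk-bytes
```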
xarray/core/indexing.py
```python
async def _async_ensure_cached(self):
    duck_array = await self.array.async_get_duck_array()
    self.array = as_indexable(duck_array)

def get_duck_array(self):
    self._ensure_cached()
```
`_ensure_cached` seems like pointless indirection; it is only used once. Let's consolidate.
Removed in 884ce13, but I still feel like it could be simplified further. Does it really need to have the side-effect of re-assigning to `self.array`?
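For illustration, here is a self-contained sketch of the cache-on-first-access pattern being discussed (both classes are hypothetical stand-ins, not xarray's actual ones). The re-assignment to `self.array` is precisely the side-effect in question: it swaps the lazy wrapper for in-memory data so later accesses never touch the backend.

```python
import asyncio


class LazyArray:
    """Hypothetical stand-in for a lazy, backend-backed array."""

    def __init__(self, data):
        self._data = data
        self.loads = 0

    async def async_get_duck_array(self):
        self.loads += 1
        await asyncio.sleep(0)  # pretend this is real I/O
        return self._data


class MemoryCachedArray:
    """Cache-on-first-access, consolidated (no separate _ensure_cached helper)."""

    def __init__(self, array):
        self.array = array

    async def async_get_duck_array(self):
        if isinstance(self.array, LazyArray):
            # The side-effect: replace the lazy wrapper with the loaded data.
            self.array = await self.array.async_get_duck_array()
        return self.array


lazy = LazyArray([1, 2, 3])
cached = MemoryCachedArray(lazy)
print(asyncio.run(cached.async_get_duck_array()))  # [1, 2, 3]
print(asyncio.run(cached.async_get_duck_array()))  # [1, 2, 3]
print(lazy.loads)  # 1 -- the backend was only hit once
```

Dropping the re-assignment would mean re-reading from the backend on every access, so some mutation of state is hard to avoid if caching is the goal.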
```python
        return self

    async def load_async(self, **kwargs) -> Self:
        # TODO refactor this to pull out the common chunked_data codepath
```
let's instead just have the sync methods issue a blocking call to the async versions.
I don't think that would solve the use case in xpublish though? You need to be able to asynchronously trigger loading for a bunch of separate dataset objects, which requires an async load api to be exposed, no?
Oh I understand what you mean now, you're not talking about the API, you're just talking about my comment about internal refactoring. You're proposing we do what zarr does internally, which makes sense.
Adds an `.async_load()` method to `Variable`, which works by plumbing async `get_duck_array` all the way down until it finally gets to the async methods zarr v3 exposes.

Needs a lot of refactoring before it could be merged, but it works.
- `whats-new.rst`
- `api.rst`

API:

- `Variable.load_async`
- `DataArray.load_async`
- `Dataset.load_async`
- `DataTree.load_async`
- `load_dataset`?
- `load_dataarray`?