[use case demonstration] Kvikio Direct-to-gpu -> xarray -> xbatcher -> ml model  #87

Open
@jhamman

What is your issue?

Recent developments by @NVIDIA and @dcherian are opening the door for direct-to-GPU data loading in Xarray. Combined with Xbatcher and the TensorFlow or PyTorch data loaders, this could enable a complete workflow, from a Zarr store all the way to ML model training, without ever handling data on a CPU.

Here's a short illustration of the potential workflow:

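import xarray as xr
import xbatcher
import xbatcher.loaders.keras  # import the loader submodule explicitly

# `store`, `xvars`, `yvars`, and `model` are assumed to be defined elsewhere
ds = xr.open_dataset(store, engine="kvikio", consolidated=False)

# Batch the input and target variables along time, 10 steps per batch
x_gen = xbatcher.BatchGenerator(ds[xvars], {'time': 10})
y_gen = xbatcher.BatchGenerator(ds[yvars], {'time': 10})

# Pair the two generators into a dataset for Keras
tf_dataset = xbatcher.loaders.keras.CustomTFDataset(x_gen, y_gen)

model.fit(tf_dataset, ...)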
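
To make the batching step concrete, here is a plain-Python sketch of the idea behind pairing two generators that both window along `time`. It is not Xbatcher's implementation and involves no GPU; the helper names `time_batches` and `paired_batches`, the synthetic data, and the choice to drop an incomplete trailing window are all assumptions made for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def time_batches(values: Sequence[float], batch_size: int) -> List[Sequence[float]]:
    """Split a 1-D series into consecutive, non-overlapping windows.

    Assumption of this sketch: a trailing remainder shorter than
    batch_size is dropped rather than padded.
    """
    n_batches = len(values) // batch_size
    return [values[i * batch_size:(i + 1) * batch_size] for i in range(n_batches)]


def paired_batches(
    x: Sequence[float], y: Sequence[float], batch_size: int
) -> Iterator[Tuple[Sequence[float], Sequence[float]]]:
    """Yield (x_batch, y_batch) pairs that cover the same time window."""
    x_batches = time_batches(x, batch_size)
    y_batches = time_batches(y, batch_size)
    if len(x_batches) != len(y_batches):
        raise ValueError("x and y must produce the same number of batches")
    yield from zip(x_batches, y_batches)


# Hypothetical data: 35 time steps of one feature and one target
x = [float(t) for t in range(35)]
y = [2.0 * v for v in x]

pairs = list(paired_batches(x, y, batch_size=10))
print(len(pairs))        # 3, because the last 5 steps do not fill a batch
print(pairs[1][0][0])    # 10.0, the first time step of the second x batch
```

In the workflow above, the same pairing would happen on arrays that already live on the GPU, so the windowing itself should not require a round trip through host memory.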

This would be awesome to demonstrate in a single example, perhaps as a second tutorial on Xbatcher's documentation site.

xref: xarray-contrib/cupy-xarray#10

cc @dcherian, @negin513, and @weiji14
