What is your issue?
Recent developments by @NVIDIA and @dcherian are opening the door for direct-to-GPU data loading in Xarray. Combined with Xbatcher and the TensorFlow or PyTorch data loaders, this could enable a complete workflow from Zarr all the way to ML model training without ever handling data on a CPU.
Here's a short illustration of the potential workflow:
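import xarray as xr
import xbatcher
from xbatcher.loaders.keras import CustomTFDataset  # the loaders submodule needs an explicit import
ds = xr.open_dataset(store, engine="kvikio", consolidated=False)  # store: path or URL of a Zarr store (placeholder)
x_gen = xbatcher.BatchGenerator(ds[xvars], {'time': 10})  # xvars: feature variable names (placeholder)
y_gen = xbatcher.BatchGenerator(ds[yvars], {'time': 10})  # yvars: target variable names (placeholder)
tf_dataset = CustomTFDataset(x_gen, y_gen)
model.fit(tf_dataset, ...)  # model: a compiled Keras model (placeholder)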
This workflow would be awesome to demonstrate in a single example, perhaps as a second tutorial on Xbatcher's documentation site.
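
For anyone less familiar with the batching step, here is a rough, library-free sketch of the idea behind pairing two generators that both window along `time`. It is not Xbatcher's implementation and it touches no GPU: `time_batches` and `paired_batches` are hypothetical helper names, the data is made up, and dropping the incomplete trailing window is an assumption of the sketch.

```python
from typing import Iterator, Sequence, Tuple


def time_batches(values: Sequence[float], batch_size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive, non-overlapping windows of `batch_size` time steps.

    Trailing steps that do not fill a whole window are dropped. This is an
    assumption of the sketch, chosen so that every batch has the same length.
    """
    n_full = len(values) // batch_size
    for i in range(n_full):
        yield values[i * batch_size:(i + 1) * batch_size]


def paired_batches(
    x: Sequence[float], y: Sequence[float], batch_size: int
) -> Iterator[Tuple[Sequence[float], Sequence[float]]]:
    """Walk two equally long series in lockstep, yielding (x_batch, y_batch)."""
    if len(x) != len(y):
        raise ValueError("x and y must share the same time axis")
    yield from zip(time_batches(x, batch_size), time_batches(y, batch_size))


# Made-up data: 25 time steps of one feature and one target.
x = list(range(25))
y = [2 * v for v in x]

pairs = list(paired_batches(x, y, batch_size=10))
print(len(pairs))                      # 2 (the last 5 steps are dropped)
print(pairs[1][0][0], pairs[1][1][0])  # 10 20
```

The point of the sketch is that features and targets only stay aligned if both generators use the same batch dimensions, which is why `x_gen` and `y_gen` above both use `{'time': 10}`.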