This ends up calling `fillna()` in a loop inside `xarray.core.merge.unique_variable()`, something like:

xarray/xarray/core/merge.py
Lines 147 to 149 in 55e5b5a

This has quadratic behavior if the variables are stored in dask arrays (the dask graph gets one element larger after each loop iteration). This is OK for `merge()` (which typically only has two arguments) but is problematic for dealing with variables that shouldn't be concatenated inside `concat()`, which should be able to handle very long lists of arguments.

I encountered this because `compat='no_conflicts'` is the default for `xarray.combine_nested()`.

I guess there's also the related issue which is that even if we produced the output dask graph by hand without a loop, it still wouldn't be easy to evaluate for a large number of elements. Ideally we would use some sort of tree-reduction to ensure the operation can be parallelized.
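
To make the tree-reduction idea concrete, here is a minimal sketch in plain Python. It does not use xarray or dask: lists with `None` entries stand in for arrays with missing values, and `fill_missing`, `sequential_combine` and `tree_combine` are hypothetical names invented for this illustration, not anything in xarray. It only shows how the depth of the dependency chain differs between the two approaches, under the assumption that the combining step is associative (as a "first non-missing value wins" fill is).

```python
def fill_missing(left, right):
    """Hypothetical stand-in for fillna(): keep `left`, fall back to `right` where `left` is None."""
    return [r if l is None else l for l, r in zip(left, right)]


def sequential_combine(items):
    """Combine in a loop, as in `out = out.fillna(var)`.

    Every step depends on the previous one, so the chain depth is len(items) - 1.
    """
    out = items[0]
    depth = 0
    for item in items[1:]:
        out = fill_missing(out, item)
        depth += 1
    return out, depth


def tree_combine(items):
    """Combine neighbouring pairs level by level.

    Pairs within one level are independent of each other, so they could run in
    parallel; the chain depth is about log2(len(items)).
    """
    level = list(items)
    depth = 0
    while len(level) > 1:
        # Combine (0, 1), (2, 3), ... keeping the original order.
        nxt = [fill_missing(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An odd element out is carried over to the next level unchanged.
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth


# Eight inputs, each holding a single non-missing value at its own position.
items = [[i if j == i else None for j in range(8)] for i in range(8)]

seq_result, seq_depth = sequential_combine(items)
tree_result, tree_depth = tree_combine(items)
print(seq_result == tree_result, seq_depth, tree_depth)
```

Both functions produce the same result here, but the sequential version needs a chain of 7 dependent steps for 8 inputs while the pairwise version needs 3 levels. Whether a real fix would look like this is an open question; a dask-aware implementation would presumably want to build the graph directly rather than loop in Python.
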
xref google/xarray-beam#13