-
Notifications
You must be signed in to change notification settings - Fork 33
Plans for full xarray compatibility? #66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
There are no plans because I don't yet understand how to do it. See discussion in #65. Is it just a matter of special-casing DataArrays so that they are passed through by the |
Thanks for linking that to that discussion, which I hadn't realized was about the same thing. Unfortunately I'm not a wizard, so I wouldn't want to hazard a guess at where the magic happens, or how challenging it would be! My intuition says "not too hard", if one of the active xarray developers could be roped in to assist. If everything can be done with |
I started looking into this after #65 was made. NEP 18 appears to apply only to folks implementing numpy like containers or subclasses. This toolbox only provides a pile of ufuncs that operate on numpy like containers. Xarray does support directly calling ufuncs on its objects . The problem I think is the "dask" argument, passing a value of "allowed" means the dask array(s) are passed directly to the ufunc which is expected to be "dask aware". The argument of "parallelized" is what you need for ufuncs that are not dask aware. So what we actually need to do (I think) is be dask/chunked array aware, and I have no idea how that is done. @gmacgilchrist can you try changing your "allowed" to "parallelized" and see what happens when you remove the |
Yes, I think this was the issue, setting to If |
It looks to me like a few simple changes in our decorator would take care of the ufuncs, which comprise most of the package. I think the special-casing would for objects with `array_ufunc`` methods; if one is found in the argument list, we would just pass everything through to the underlying ufunc (or maybe still handle masked arrays first). What we need to try this is a nice simple self-contained set of test cases that would easily show whether we are doing the right thing, with and without dask. @gmacgilchrist, can you supply such tests? |
Great that you've implemented something that appears to handle this - thank you! What might a test suite look like? I spent a bit of time this morning trying to reproduce, in a very simple configuration, some of the common errors that I was getting previously when passing xarray objects as arguments. Not sophisticated by any stretch, but it could provide some basis for a test suite? You can see the notebook here (I'm not sure why it is loading strangely at the moment, loads fine in nbviewer), I hope that's of some use. |
The toolbox is broadly compatible with xarray DataArrays, but struggles when needing to do things such as broadcast along specific dimensions (a particularly powerful feature of xarray objects), or when xarray objects are separated into "chunks" using dask. The broadcasting issue can be worked around using the xarray
apply_ufunc
command, but this requires writing a wrapper function for eachgsw
function. That approach still fails to handle chunked arrays, so that the workaround is to extract the underlying numpy arrays from the DataArrays and then plug the output of thegsw
function back into a new DataArray. Some example wrapper functions can be seen here.I'm wondering if there are any plans to improve this compatibility with xarray, essentially to allow xarray DataArrays (as well as numpy arrays) to be passed directly to the
gsw
functions, or whether I would be better placed to try to write an overall "wrapper package" to brute force compatibility.The text was updated successfully, but these errors were encountered: