Support for dask arrays #7
Huh, I have no idea why it's using dask. That has never occurred before for me.
I agree. Still very odd though. Have you found a workaround? Or is this holding you up? Yes, dask sounds really promising in the long run, but it's not on my radar to implement it anytime soon.
@spencerahill I'm not sure if you have a pressing use for it currently, but I just wanted to throw some examples up (see this gist) of computing derivative approximations (to arbitrary order of accuracy) using a combination of:
The functions in this notebook enable derivatives to be computed on DataArrays containing either dask or numpy arrays; importantly, they assume that the spacing between points along the dimension one takes a derivative along is constant. I think one could extend these functions to compute upwind estimates of the derivatives as well (as is enabled in infinite-diff), but I haven't done that yet. A couple of (minor) caveats to this approach:
I was playing around with these recently for another use-case, but I figured you might be interested (at least in the implementation details in xarray and dask).
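As background for readers: the stencil weights that sympy's `finite_diff_weights` produces (and that the gist relies on) can also be derived by solving the Vandermonde moment conditions directly. Here is a minimal numpy sketch; `fd_weights` is a hypothetical name, not a function from the gist:

```python
import numpy as np
from math import factorial

def fd_weights(offsets, deriv=1):
    """Finite-difference weights for the ``deriv``-th derivative at x = 0.

    Hypothetical helper (not the gist's code): it solves the same
    Vandermonde moment conditions that sympy's ``finite_diff_weights``
    solves exactly with rationals.  ``offsets`` are the stencil point
    locations relative to the evaluation point, and need not be
    uniformly spaced.
    """
    offsets = np.asarray(offsets, dtype=float)
    n = len(offsets)
    # Moment conditions: sum_i w_i * x_i**k = k! if k == deriv else 0,
    # for k = 0 .. n-1.  Rows of A are the powers of the offsets.
    A = np.vander(offsets, n, increasing=True).T
    b = np.zeros(n)
    b[deriv] = factorial(deriv)
    return np.linalg.solve(A, b)

# 2nd-order centered first derivative on a unit grid: [-1/2, 0, 1/2]
w = fd_weights([-1, 0, 1])
```

Scaling the offsets by the (constant) grid spacing scales the weights by its reciprocal, which is why a uniform-spacing assumption lets the weights be computed once and reused.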
@spencerkclark wow, thanks, this looks really cool! I should have time within a week or so to look into it more carefully. Will get back to you then.
@spencerkclark, I am floored. Although coming from you, I'm not all that surprised! This is so cool. I've been vaguely thinking for the last year or so about something along these lines, i.e. using sympy to generate arbitrary order of accuracy derivatives, and it is amazing to see it done so elegantly. And dask compatible! And with an amazing example Notebook! My big-picture initial reaction is that I want to replace everything possible in infinite-diff with this approach. Or maybe it makes more sense to abandon infinite-diff altogether and start from scratch with this at the core? More minor comments and questions follow. Note that I'm only superficially familiar with actually using sympy.
That's my impression too; I wouldn't worry about it.
I don't think so, in fact that's what makes this whole approach so elegant. Although some additional commenting describing what those calls are doing would help.
How fundamental is this assumption to your approach? The sympy docs link to a paper for arbitrarily-spaced grids just below the function docstring.
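For what it's worth, `finite_diff_weights` itself already accepts arbitrarily located points (it implements the Fornberg algorithm the sympy docs link to), so the uniform-spacing assumption lives in how the weights are applied, not in how they are generated. A quick sketch:

```python
from sympy import finite_diff_weights

# First-derivative weights at x0 = 0 on a deliberately uneven
# 3-point stencil.  finite_diff_weights(deriv_order, x_list, x0)
# returns weights for all derivative orders up to deriv_order and
# all leading subsets of x_list; [-1][-1] picks the full-stencil,
# highest-order result.
weights = finite_diff_weights(1, [-2, 0, 1], 0)[-1][-1]
# exact rational weights: [-1/6, -1/2, 2/3]
```

One can check these by hand: applied to f(x) = x they sum against the points to 1, and applied to f(x) = 1 or f(x) = x^2 they give 0.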
An OOP approach could cut down on the code repetition here. E.g. you have this NotImplementedError within …:

```python
# TODO: Don't hard-code periodic boundary conditions
raise NotImplementedError(
    'xdiff currently only supports periodic boundary '
    'conditions on dask arrays')
```

A brief comment explaining these would help:

```python
weights = forward_diff_weights(accuracy, spacing)
origin = -int(np.ceil(accuracy / 2.))
```

You use this in 3 places, so it's worth turning into a helper function, e.g.:

```python
def _make_domain(accuracy, spacing):
    return np.arange(0., accuracy + 1.) * spacing
```
Do you know why the 8th order methods using dask are noisy, whereas no others are?
Could you explain what's going on here? From reading the docstring (admittedly briefly), it's not clear to me how this would get us upwind estimates.
Glad to see you're enthusiastic about this approach! Thanks a lot for your comments.
I'm totally open to using something along these lines within infinite-diff. My initial hunch here was to keep things as functional as possible (i.e. not OOP), to mimic how many of these functions are implemented in
I think this is pretty fundamental to my approach. The issue lies in using … That being said, it would be very cool if there were a way to leverage …
I agree, thanks for pointing that out. I'll fix that.
Yes, they are a little opaque. Will do.
It's not just the ones with dask; it's all the non-periodic ones. I think that this may actually have to do with floating point error (around order 10^-16). It doesn't show up in the periodic case, because I'm actually using double the grid spacing there. If I use 50 points in the non-periodic case (to make the grid spacing the same), things look smooth for the 8th order case (see the revised notebook).
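The floating-point explanation above can be checked with a toy experiment: for a centered difference of sin, the error shrinks with the grid spacing only until roundoff (on the order of 1e-16 divided by the spacing) takes over. A minimal sketch, not tied to the notebook's code:

```python
import numpy as np

def centered_err(h):
    """Error of the 2nd-order centered difference of sin at x = 1."""
    x = 1.0
    approx = (np.sin(x + h) - np.sin(x - h)) / (2.0 * h)
    return abs(approx - np.cos(x))

# Truncation error falls like h**2 at first, then roundoff
# (~1e-16 / h) dominates at very small spacings and the error
# grows again.
for h in (1e-1, 1e-3, 1e-5, 1e-10):
    print(h, centered_err(h))
```

The same floor appears sooner for higher-order schemes, since their truncation error reaches machine precision at much coarser spacings.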
Yes, I agree this is vague. I'll try and put together a concrete example (as well as add some discussion of the origin definitions) in the next week or so.
This makes sense. I'm starting to feel like infinite-diff, even if its internals remain highly OOP, would be better if the public API was purely functions that internally rely on the various classes, rather than being through instantiating the objects themselves. I think the same would hold for your new approach.
I agree. I think this means though that we should be careful to check the grid and warn prominently or maybe even raise whenever it's not uniform.
That's awesome! Good indicator that your numerics are solid when floating point precision starts to limit your accuracy 😄 So all that being said, what do you think is the best way forward?
@spencerahill I've gone ahead and implemented an upwind differencing function (again for arbitrary accuracy; see the new notebook). It's dask-compatible because I basically just built it as an extension to the … I also cleaned up the logic to compute the …
My sense is to try and work these into infinite-diff and go from there. Does that sound reasonable? Are there any more features infinite-diff needs that could be tricky to adapt to using these functions?
FYI not sure if it makes any difference for this, but in the new v1.13 release np.gradient now supports uneven spacing.
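To illustrate the NumPy 1.13 change mentioned above: `np.gradient` now accepts a coordinate array in place of a scalar spacing, and its interior stencil accounts for the uneven spacing (it is exact for quadratics):

```python
import numpy as np

x = np.array([0.0, 0.5, 1.5, 3.5, 4.0])   # deliberately uneven grid
y = x ** 2                                 # exact derivative is 2*x

# Passing the coordinate array (rather than a scalar spacing)
# requires NumPy >= 1.13; interior points then use a stencil that
# accounts for the uneven spacing.
dydx = np.gradient(y, x)
```

At the interior points this reproduces 2*x exactly; the endpoints use one-sided differences (first-order by default; `edge_order=2` upgrades them).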
Sorry, can you clarify what you mean here? Relatedly, I think, I'm not sure I understand the …
Definitely!
Not that I can think of; derivatives and upwind advection are my only use cases. It's possible that adapting this generic functionality to different geometries (e.g. the sphere) will require some departures from the current approach. But honestly this is so much better than what I have now I will gladly re-do anything that this causes problems for!
Indeed it could be interesting to look at the implementation!
The issue is that higher order upwind schemes (e.g. 3rd order) do not strictly use forward and backward differences. (But again for cases higher than 3rd order I'm not totally sure if my assumption is valid).
Sounds good! At some point when I have a chunk of free time I'll take a dive into this and see where it leads.
Sorry, that's my mistake... I had assumed that it was just forward or backward differencing. I tried doing some googling for stencils for higher order upwind schemes but didn't find anything easily. At the end of the day, your accuracy is clearly improving steadily as you increase the order, and in a similar fashion to the standard differencing cases, so you're clearly either 100% correct or nearly so. Interesting that the minimum errors occur where the derivative is largest for odd orders but at the local extrema for even orders.
@kuchaale found a bug in …
I sort of stumbled upon this accidentally -- for some reason (that I'm currently trying to figure out), some variables passed from aospy to the upwind advection schemes in infinite-diff used dask arrays (instead of standard numpy arrays). This ended up raising a TypeError, because dask arrays do not support item assignment. Do you have designs on supporting dask arrays in infinite-diff in the future? Particularly for computationally intensive things like MSE budgets, dask seems like it could be a valuable tool. This isn't a huge priority, since at the moment we don't rely on dask at all in aospy, but I just wanted to bring it up as I recently encountered it.
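The item-assignment limitation is easy to trip over. One common workaround is to build the result out of slices and concatenation rather than assigning in place. A hypothetical numpy sketch of the pattern (for dask inputs, `da.full` and `da.concatenate` would stand in for the numpy calls):

```python
import numpy as np

def set_edges(arr, value):
    """Return a copy of ``arr`` (1-D) with its first and last elements
    replaced by ``value``, without using item assignment.

    Dask arrays (at least at the time of this thread) raise TypeError
    on ``arr[0] = value``; composing slices sidesteps that entirely.
    """
    edge = np.full(1, value, dtype=arr.dtype)
    return np.concatenate([edge, arr[1:-1], edge])

result = set_edges(np.arange(5.0), 0.0)
# → array([0., 1., 2., 3., 0.])
```

This style keeps a function generic over numpy and dask inputs, at the cost of slightly less direct code.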
Beyond that (which I think ultimately stems from something within aospy or aospy_user), this package seems to be working very well for me so far!
Here's the full traceback for reference: