-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
implement Gradient #2398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement Gradient #2398
Conversation
xarray/core/computation.py
Outdated
* x (x) float64 0.0 0.1 1.1 1.2 | ||
Dimensions without coordinates: y | ||
>>> | ||
>>> xr.gradient(da, ('x', 'y')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Altough this API is similar to numpy's counterpart, I'm wondering whether we really need to support this. The return value is a tuple of DataArrays, which I feel inconsistent to other xarray functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I wrote my own wrapper for gradient of a DataArray, I returned a dataset with each variable being a gradient along one dimension, but even that isn't very elegant ¯_(ツ)_/¯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be another option, but in order to store them to a dataset, we will need to define their names very heuristically...
Maybe we can drop this api as this does not speed up the computation and it's easy to do the same thing manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I don't think it's particularly useful to compute to compute independent gradients with respect to multiple axes at the same time, e.g., du/dx
, du/dy
, etc. If anything, I would be more excited about computing higher order derivatives, e.g., \frac{d^2 u}{dx dy}
xarray/core/dask_array_ops.py
Outdated
@@ -104,3 +104,49 @@ def func(x, window, axis=-1): | |||
index = (slice(None),) * axis + (slice(drop_size, | |||
drop_size + orig_shape[axis]), ) | |||
return out[index] | |||
|
|||
|
|||
def gradient(a, coord, axis, edge_order): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be implemented in the upstream...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean like this? Requires Dask 0.17.3+ though. So IDK if that works for you or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jakirkahm.
I need to compute the gradient with a non uniform coordinate, but dask Array.gradient only supports uniformly spacing coordinate.
I think this can be implemented using overlap as I do here. Is it in the scope of the dask development?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry. Should have read more of the code.
Yeah I think that is in scope. Just not currently supported yet.
Could you please open an issue or perhaps PR over on the Dask repo?
xarray/core/npcompat.py
Outdated
def gradient(f, *varargs, **kwargs): | ||
""" | ||
Return the gradient of an N-dimensional array. | ||
The gradient is computed using second order accurate central differences |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (80 > 79 characters)
xarray/core/npcompat.py
Outdated
""" | ||
Return the gradient of an N-dimensional array. | ||
The gradient is computed using second order accurate central differences | ||
in the interior points and either first or second order accurate one-sides |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xarray/core/npcompat.py
Outdated
f : array_like | ||
An N-dimensional array containing samples of a scalar function. | ||
varargs : list of scalar or array, optional | ||
Spacing between f values. Default unitary spacing for all dimensions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (81 > 79 characters)
xarray/core/npcompat.py
Outdated
Spacing between f values. Default unitary spacing for all dimensions. | ||
Spacing can be specified using: | ||
1. single scalar to specify a sample distance for all dimensions. | ||
2. N scalars to specify a constant sample distance for each dimension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xarray/core/npcompat.py
Outdated
3. N arrays to specify the coordinates of the values along each | ||
dimension of F. The length of the array must match the size of | ||
the corresponding dimension | ||
4. Any combination of N scalars/arrays with the meaning of 2. and 3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (80 > 79 characters)
xarray/core/npcompat.py
Outdated
|
||
# Difference of datetime64 elements results in timedelta64 | ||
if otype == 'M': | ||
# Need to use the full dtype name because it contains unit information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xarray/core/npcompat.py
Outdated
for i, axis in enumerate(axes): | ||
if y.shape[axis] < edge_order + 1: | ||
raise ValueError( | ||
"Shape of array too small to calculate a numerical gradient, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xarray/core/npcompat.py
Outdated
else: | ||
dx1 = dx[i][0:-1] | ||
dx2 = dx[i][1:] | ||
a = -(dx2)/(dx1 * (dx1 + dx2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E226 missing whitespace around arithmetic operator
xarray/core/npcompat.py
Outdated
shape = np.ones(N, dtype=int) | ||
shape[axis] = -1 | ||
a.shape = b.shape = c.shape = shape | ||
# 1D equivalent -- out[1:-1] = a * f[:-2] + b * f[1:-1] + c * f[2:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (83 > 79 characters)
xarray/core/npcompat.py
Outdated
else: | ||
dx1 = dx[i][0] | ||
dx2 = dx[i][1] | ||
a = -(2. * dx1 + dx2)/(dx1 * (dx1 + dx2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E226 missing whitespace around arithmetic operator
I think this will be very welcome functionality! I wonder if we should consider calling this |
xarray/core/dask_array_ops.py
Outdated
depth = {d: 1 if d == axis else 0 for d in range(a.ndim)} | ||
# temporary pad zero at the boundary | ||
boundary = "none" | ||
ag = overlap(a, depth=depth, boundary=boundary) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F841 local variable 'ag' is assigned to but never used
xarray/core/dask_array_ops.py
Outdated
boundary = "none" | ||
ag = overlap(a, depth=depth, boundary=boundary) | ||
|
||
n_chunk = len(a.chunks[axis]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F841 local variable 'n_chunk' is assigned to but never used
xarray/core/dask_array_ops.py
Outdated
array_loc_start = array_loc_stop - np.array(a.chunks[axis]) - 2 | ||
array_loc_stop[-1] -= 1 | ||
array_loc_start[0] = 0 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
W293 blank line contains whitespace
xarray/core/dask_array_ops.py
Outdated
return grad | ||
|
||
return a.map_overlap( | ||
func, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E126 continuation line over-indented for hanging indent
Agreed. Differentiate is nicer. Some other api questions arised during the implementation
|
xarray/core/dask_array_compat.py
Outdated
from numbers import Integral, Real | ||
|
||
|
||
def validate_axis(axis, ndim): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E303 too many blank lines (2)
xarray/core/dask_array_compat.py
Outdated
if not isinstance(axis, Integral): | ||
raise TypeError("Axis value must be an integer, got %s" % axis) | ||
if axis < -ndim or axis >= ndim: | ||
raise AxisError("Axis %d is out of bounds for array of dimension %d" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F821 undefined name 'AxisError'
E501 line too long (80 > 79 characters)
xarray/core/dask_array_compat.py
Outdated
return axis | ||
|
||
|
||
def _gradient_kernel(x, block_id, coord, axis, array_locs, grad_kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E303 too many blank lines (2)
xarray/core/dask_array_compat.py
Outdated
return grad | ||
|
||
|
||
def gradient(f, *varargs, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E303 too many blank lines (2)
xarray/core/dask_array_ops.py
Outdated
@@ -3,7 +3,7 @@ | |||
|
|||
import numpy as np | |||
|
|||
from . import nputils | |||
from . import nputils, npcompat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F401 '.npcompat' imported but unused
any thoughts for this? |
Why would you sort the array? Aren't you taking differences of values and dividing by differences between the matching coordinates? |
Thanks, @dopplershift .
Yes, correct. But if we have closer data points, then the estimate of the gradient becomes more precise. Sorting the array according to the coordinate provides the closest points, resulting in the most precise estimate of the gradient. But I also think users can do it manually before taking the gradient. |
xarray/core/computation.py
Outdated
|
||
if not isinstance(dataarray, DataArray): | ||
raise TypeError( | ||
'Only DataArray is supported. Given {}.'.format(type(dataarray))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused here. You wrote an implementation for Dataset.differentiate, too, and here is a duplicate version of DataArray.differentiate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Maybe the top level function is not necessary here and DataArray.differentiate and Dataset.differentiate would be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I don't think we need the separate function.
xarray/core/utils.py
Outdated
array = array - np.min(array) | ||
else: | ||
array = array - np.zeros(array.shape, dtype=array.dtype) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
W293 blank line contains whitespace
xarray/tests/test_dataset.py
Outdated
npcompat.gradient( | ||
da, utils.to_numeric( | ||
da['x'], offset=True, time_unit='D'), axis=0, edge_order=1), | ||
dims=da.dims, coords=da.coords) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E131 continuation line unaligned for hanging indent
xarray/core/dataarray.py
Outdated
The coordinate to be used to compute the gradient. | ||
edge_order: 1 or 2. Default 1 | ||
N-th order accurate differences at the boundaries. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs time_unit
here
xarray/core/dataset.py
Outdated
coord_data = coord_var.data | ||
if coord_data.dtype.kind in ['m', 'M']: | ||
if time_unit is None: | ||
time_unit = np.datetime_data(coord_data.dtype)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use tuple unpacking here, e.g., time_unit, _ = np.datetime_data(coord_data.dtype)
|
||
@pytest.mark.parametrize('dask', [True, False]) | ||
@pytest.mark.parametrize('edge_order', [1, 2]) | ||
def test_gradient(dask, edge_order): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the long term, hopefully we'll solve this by getting real unit support working, but a time_unit
argument seems like a good choice for now
I think it's ready :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fujiisoup I'm really looking forward to having this built in to xarray! Thanks for the extra work in making things dask-compatible as well.
Eventually it would be nice if this worked on DataArrays with cftime.datetime
coordinates; I think it would be relatively straightforward to modify to_numeric
to enable it (we could probably enable it for interp
at the same time), but I can take care of that later if you'd like.
doc/whats-new.rst
Outdated
@@ -36,6 +36,10 @@ Documentation | |||
Enhancements | |||
~~~~~~~~~~~~ | |||
|
|||
- :py:func:`~xarray.differentiate`, :py:meth:`~xarray.DataArray.differentiate`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think xarray.differentiate
is no longer added.
'us', 'ns', 'ps', 'fs', 'as'} | ||
dtype: target dtype | ||
""" | ||
if array.dtype.kind not in ['m', 'M']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like everywhere to_numeric
is called, this check is already made. Is it redundant to check again here (or conversely are the checks before the function is called redundant)?
doc/computation.rst
Outdated
|
||
.. ipython:: python | ||
a = xr.DataArray(np.arange(8).reshape(4, 2), dims=['x'], | ||
coords=[0.1, 0.11, 0.2, 0.3]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example raises an error when you run it -- I think you need to supply two dimension names.
doc/computation.rst
Outdated
|
||
Xarray objects have some handy methods for the computation with their | ||
coordinates. :py:meth:`~xarray.DataArray.differentiate` computes derivatives by | ||
finite central differences using their coordinates, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"finite central differences" -> "centered finite differences"
This is so great! This is going to simplify my classes about derivatives on the globe even more: my only concern is that students will soon forget what a numerical derivative actually is, and that it's not a trivial to implement ;-) |
Speaking of derivatives on the globe, we might want to include an option for periodic boundary conditions (and possibly for other functions like |
See also #1288 : |
@shoyer, have you seen |
Yes, |
I agree that the features implemented here are broadly useful and belong in xarray. The fundamental question is: how far does xarray itself want to go in supporting vector calculus in curvilinear coordinates (i.e. on the sphere). There is a fair bit of overlap between this new functionality and some of the things that we are trying to do in xgcm: https://xgcm.readthedocs.io/en/latest/grids.html. Xgcm supports periodic boundary conditions, as well as more complex topological connections between array edges. It allows users to reproduce precisely the sort of operations used to take gradients in finite-volume staggered-grid models. |
@rabernat my inclination would be to draw the line at regular grids (with or without periodic boundary conditions), which are pretty commonly encountered in data analysis for any continuous physical systems. We would leave non-regular cases like staggered-grids to add-ons like xgcm -- these grids tend to be more application/PDE specific. |
We should definitely mention specialized tools like xgcm as alternatives in the docstring for these xarray methods. |
Thanks all. Updated. Thanks, @spencerkclark
Thanks. I added this function for something like this extension, though I do not yet fully follow your cftime update. It would be super nice if you could take care of this after merge. @shoyer ,
Agreed. This option is nice not only
Thanks. |
I totally understand; |
I'm going to merge this after the Appveyor tests pass, unless there are any further objections. |
@rabernat let me know if this makes sense to you / if you agree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I'm late to the party on reviewing this. It looks like a great feature, and I know @fujiisoup has put a huge amount of work into it! 👏
My only concern is that the first thing most geoscience users will try with this function is to take a field with lat, lon
dimensions and call differentiate
on it, expecting to obtain the zonal and meridional gradients.
I feel that some mention is needed in the documentation that this feature is limited to simple cartesian geometry, in which the physical locations of the variables are accurately described by the dimension coordinates.
doc/computation.rst
Outdated
coords=[0.1, 0.11, 0.2, 0.3]) | ||
a.differentiate('x') | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a place where we could mention the limitations of differentiate
and gradient
for non-cartesian geometry.
@@ -2289,6 +2292,57 @@ def rank(self, dim, pct=False, keep_attrs=False): | |||
ds = self._to_temp_dataset().rank(dim, pct=pct, keep_attrs=keep_attrs) | |||
return self._from_temp_dataset(ds) | |||
|
|||
def differentiate(self, coord, edge_order=1, datetime_unit=None): | |||
""" Differentiate the array with the second order accurate central | |||
differences. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
@@ -3663,6 +3666,58 @@ def rank(self, dim, pct=False, keep_attrs=False): | |||
attrs = self.attrs if keep_attrs else None | |||
return self._replace_vars_and_dims(variables, coord_names, attrs=attrs) | |||
|
|||
def differentiate(self, coord, edge_order=1, datetime_unit=None): | |||
""" Differentiate with the second order accurate central | |||
differences. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
Thanks, @rabernat for the review. Added the limitation of this method to docs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
diff
via new keywords #1332whats-new.rst
for all changes andapi.rst
for new APIAdded
xr.gradient
,xr.DataArray.gradient
, andxr.Dataset.gradient
according to #1332.