-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
implement Gradient #2398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement Gradient #2398
Conversation
xarray/core/computation.py
Outdated
| * x (x) float64 0.0 0.1 1.1 1.2 | ||
| Dimensions without coordinates: y | ||
| >>> | ||
| >>> xr.gradient(da, ('x', 'y')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Altough this API is similar to numpy's counterpart, I'm wondering whether we really need to support this. The return value is a tuple of DataArrays, which I feel inconsistent to other xarray functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I wrote my own wrapper for gradient of a DataArray, I returned a dataset with each variable being a gradient along one dimension, but even that isn't very elegant ¯_(ツ)_/¯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be another option, but in order to store them to a dataset, we will need to define their names very heuristically...
Maybe we can drop this api as this does not speed up the computation and it's easy to do the same thing manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I don't think it's particularly useful to compute to compute independent gradients with respect to multiple axes at the same time, e.g., du/dx, du/dy, etc. If anything, I would be more excited about computing higher order derivatives, e.g., \frac{d^2 u}{dx dy}
xarray/core/dask_array_ops.py
Outdated
| return out[index] | ||
|
|
||
|
|
||
| def gradient(a, coord, axis, edge_order): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be implemented in the upstream...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean like this? Requires Dask 0.17.3+ though. So IDK if that works for you or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jakirkahm.
I need to compute the gradient with a non uniform coordinate, but dask Array.gradient only supports uniformly spacing coordinate.
I think this can be implemented using overlap as I do here. Is it in the scope of the dask development?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry. Should have read more of the code.
Yeah I think that is in scope. Just not currently supported yet.
Could you please open an issue or perhaps PR over on the Dask repo?
xarray/core/npcompat.py
Outdated
| def gradient(f, *varargs, **kwargs): | ||
| """ | ||
| Return the gradient of an N-dimensional array. | ||
| The gradient is computed using second order accurate central differences |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (80 > 79 characters)
xarray/core/npcompat.py
Outdated
| """ | ||
| Return the gradient of an N-dimensional array. | ||
| The gradient is computed using second order accurate central differences | ||
| in the interior points and either first or second order accurate one-sides |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xarray/core/npcompat.py
Outdated
| f : array_like | ||
| An N-dimensional array containing samples of a scalar function. | ||
| varargs : list of scalar or array, optional | ||
| Spacing between f values. Default unitary spacing for all dimensions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (81 > 79 characters)
xarray/core/npcompat.py
Outdated
| Spacing between f values. Default unitary spacing for all dimensions. | ||
| Spacing can be specified using: | ||
| 1. single scalar to specify a sample distance for all dimensions. | ||
| 2. N scalars to specify a constant sample distance for each dimension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xarray/core/npcompat.py
Outdated
| 3. N arrays to specify the coordinates of the values along each | ||
| dimension of F. The length of the array must match the size of | ||
| the corresponding dimension | ||
| 4. Any combination of N scalars/arrays with the meaning of 2. and 3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (80 > 79 characters)
xarray/core/npcompat.py
Outdated
|
|
||
| # Difference of datetime64 elements results in timedelta64 | ||
| if otype == 'M': | ||
| # Need to use the full dtype name because it contains unit information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xarray/core/npcompat.py
Outdated
| for i, axis in enumerate(axes): | ||
| if y.shape[axis] < edge_order + 1: | ||
| raise ValueError( | ||
| "Shape of array too small to calculate a numerical gradient, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (82 > 79 characters)
xarray/core/npcompat.py
Outdated
| else: | ||
| dx1 = dx[i][0:-1] | ||
| dx2 = dx[i][1:] | ||
| a = -(dx2)/(dx1 * (dx1 + dx2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E226 missing whitespace around arithmetic operator
xarray/core/npcompat.py
Outdated
| shape = np.ones(N, dtype=int) | ||
| shape[axis] = -1 | ||
| a.shape = b.shape = c.shape = shape | ||
| # 1D equivalent -- out[1:-1] = a * f[:-2] + b * f[1:-1] + c * f[2:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E501 line too long (83 > 79 characters)
xarray/core/npcompat.py
Outdated
| else: | ||
| dx1 = dx[i][0] | ||
| dx2 = dx[i][1] | ||
| a = -(2. * dx1 + dx2)/(dx1 * (dx1 + dx2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E226 missing whitespace around arithmetic operator
|
I think this will be very welcome functionality! I wonder if we should consider calling this |
xarray/core/dask_array_ops.py
Outdated
| depth = {d: 1 if d == axis else 0 for d in range(a.ndim)} | ||
| # temporary pad zero at the boundary | ||
| boundary = "none" | ||
| ag = overlap(a, depth=depth, boundary=boundary) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F841 local variable 'ag' is assigned to but never used
xarray/core/dask_array_ops.py
Outdated
| boundary = "none" | ||
| ag = overlap(a, depth=depth, boundary=boundary) | ||
|
|
||
| n_chunk = len(a.chunks[axis]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F841 local variable 'n_chunk' is assigned to but never used
xarray/core/dask_array_ops.py
Outdated
| array_loc_start = array_loc_stop - np.array(a.chunks[axis]) - 2 | ||
| array_loc_stop[-1] -= 1 | ||
| array_loc_start[0] = 0 | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
W293 blank line contains whitespace
xarray/core/dask_array_ops.py
Outdated
| return grad | ||
|
|
||
| return a.map_overlap( | ||
| func, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E126 continuation line over-indented for hanging indent
Agreed. Differentiate is nicer. Some other api questions arised during the implementation
|
| from numbers import Integral, Real | ||
|
|
||
|
|
||
| def validate_axis(axis, ndim): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E303 too many blank lines (2)
xarray/core/dask_array_compat.py
Outdated
| if not isinstance(axis, Integral): | ||
| raise TypeError("Axis value must be an integer, got %s" % axis) | ||
| if axis < -ndim or axis >= ndim: | ||
| raise AxisError("Axis %d is out of bounds for array of dimension %d" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F821 undefined name 'AxisError'
E501 line too long (80 > 79 characters)
| return axis | ||
|
|
||
|
|
||
| def _gradient_kernel(x, block_id, coord, axis, array_locs, grad_kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E303 too many blank lines (2)
| return grad | ||
|
|
||
|
|
||
| def gradient(f, *varargs, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E303 too many blank lines (2)
xarray/core/dask_array_ops.py
Outdated
| import numpy as np | ||
|
|
||
| from . import nputils | ||
| from . import nputils, npcompat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F401 '.npcompat' imported but unused
|
any thoughts for this? |
|
Why would you sort the array? Aren't you taking differences of values and dividing by differences between the matching coordinates? |
|
Thanks, @dopplershift .
Yes, correct. But if we have closer data points, then the estimate of the gradient becomes more precise. Sorting the array according to the coordinate provides the closest points, resulting in the most precise estimate of the gradient. But I also think users can do it manually before taking the gradient. |
xarray/core/computation.py
Outdated
|
|
||
| if not isinstance(dataarray, DataArray): | ||
| raise TypeError( | ||
| 'Only DataArray is supported. Given {}.'.format(type(dataarray))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused here. You wrote an implementation for Dataset.differentiate, too, and here is a duplicate version of DataArray.differentiate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Maybe the top level function is not necessary here and DataArray.differentiate and Dataset.differentiate would be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I don't think we need the separate function.
xarray/core/utils.py
Outdated
| array = array - np.min(array) | ||
| else: | ||
| array = array - np.zeros(array.shape, dtype=array.dtype) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
W293 blank line contains whitespace
xarray/tests/test_dataset.py
Outdated
| npcompat.gradient( | ||
| da, utils.to_numeric( | ||
| da['x'], offset=True, time_unit='D'), axis=0, edge_order=1), | ||
| dims=da.dims, coords=da.coords) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E131 continuation line unaligned for hanging indent
| The coordinate to be used to compute the gradient. | ||
| edge_order: 1 or 2. Default 1 | ||
| N-th order accurate differences at the boundaries. | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs time_unit here
xarray/core/dataset.py
Outdated
| coord_data = coord_var.data | ||
| if coord_data.dtype.kind in ['m', 'M']: | ||
| if time_unit is None: | ||
| time_unit = np.datetime_data(coord_data.dtype)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use tuple unpacking here, e.g., time_unit, _ = np.datetime_data(coord_data.dtype)
|
|
||
| @pytest.mark.parametrize('dask', [True, False]) | ||
| @pytest.mark.parametrize('edge_order', [1, 2]) | ||
| def test_gradient(dask, edge_order): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the long term, hopefully we'll solve this by getting real unit support working, but a time_unit argument seems like a good choice for now
|
I think it's ready :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fujiisoup I'm really looking forward to having this built in to xarray! Thanks for the extra work in making things dask-compatible as well.
Eventually it would be nice if this worked on DataArrays with cftime.datetime coordinates; I think it would be relatively straightforward to modify to_numeric to enable it (we could probably enable it for interp at the same time), but I can take care of that later if you'd like.
doc/whats-new.rst
Outdated
| Enhancements | ||
| ~~~~~~~~~~~~ | ||
|
|
||
| - :py:func:`~xarray.differentiate`, :py:meth:`~xarray.DataArray.differentiate`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think xarray.differentiate is no longer added.
| 'us', 'ns', 'ps', 'fs', 'as'} | ||
| dtype: target dtype | ||
| """ | ||
| if array.dtype.kind not in ['m', 'M']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like everywhere to_numeric is called, this check is already made. Is it redundant to check again here (or conversely are the checks before the function is called redundant)?
|
|
||
| .. ipython:: python | ||
| a = xr.DataArray(np.arange(8).reshape(4, 2), dims=['x'], | ||
| coords=[0.1, 0.11, 0.2, 0.3]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example raises an error when you run it -- I think you need to supply two dimension names.
doc/computation.rst
Outdated
|
|
||
| Xarray objects have some handy methods for the computation with their | ||
| coordinates. :py:meth:`~xarray.DataArray.differentiate` computes derivatives by | ||
| finite central differences using their coordinates, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"finite central differences" -> "centered finite differences"
|
This is so great! This is going to simplify my classes about derivatives on the globe even more: my only concern is that students will soon forget what a numerical derivative actually is, and that it's not a trivial to implement ;-) |
|
Speaking of derivatives on the globe, we might want to include an option for periodic boundary conditions (and possibly for other functions like |
See also #1288 : |
|
@shoyer, have you seen |
Yes, |
|
I agree that the features implemented here are broadly useful and belong in xarray. The fundamental question is: how far does xarray itself want to go in supporting vector calculus in curvilinear coordinates (i.e. on the sphere). There is a fair bit of overlap between this new functionality and some of the things that we are trying to do in xgcm: https://xgcm.readthedocs.io/en/latest/grids.html. Xgcm supports periodic boundary conditions, as well as more complex topological connections between array edges. It allows users to reproduce precisely the sort of operations used to take gradients in finite-volume staggered-grid models. |
|
@rabernat my inclination would be to draw the line at regular grids (with or without periodic boundary conditions), which are pretty commonly encountered in data analysis for any continuous physical systems. We would leave non-regular cases like staggered-grids to add-ons like xgcm -- these grids tend to be more application/PDE specific. |
|
We should definitely mention specialized tools like xgcm as alternatives in the docstring for these xarray methods. |
|
Thanks all. Updated. Thanks, @spencerkclark
Thanks. I added this function for something like this extension, though I do not yet fully follow your cftime update. It would be super nice if you could take care of this after merge. @shoyer ,
Agreed. This option is nice not only
Thanks. |
I totally understand; |
|
I'm going to merge this after the Appveyor tests pass, unless there are any further objections. |
|
@rabernat let me know if this makes sense to you / if you agree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I'm late to the party on reviewing this. It looks like a great feature, and I know @fujiisoup has put a huge amount of work into it! 👏
My only concern is that the first thing most geoscience users will try with this function is to take a field with lat, lon dimensions and call differentiate on it, expecting to obtain the zonal and meridional gradients.
I feel that some mention is needed in the documentation that this feature is limited to simple cartesian geometry, in which the physical locations of the variables are accurately described by the dimension coordinates.
| coords=[0.1, 0.11, 0.2, 0.3]) | ||
| a.differentiate('x') | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a place where we could mention the limitations of differentiate and gradient for non-cartesian geometry.
|
|
||
| def differentiate(self, coord, edge_order=1, datetime_unit=None): | ||
| """ Differentiate the array with the second order accurate central | ||
| differences. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
|
|
||
| def differentiate(self, coord, edge_order=1, datetime_unit=None): | ||
| """ Differentiate with the second order accurate central | ||
| differences. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
|
Thanks, @rabernat for the review. Added the limitation of this method to docs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
diffvia new keywords #1332whats-new.rstfor all changes andapi.rstfor new APIAdded
xr.gradient,xr.DataArray.gradient, andxr.Dataset.gradientaccording to #1332.