implement Gradient #2398

fujiisoup · 2018-09-04T08:11:52Z

Closes Shape preserving diff via new keywords #1332
Tests added
Tests passed
Fully documented, including whats-new.rst for all changes and api.rst for new API

Added xr.gradient, xr.DataArray.gradient, and xr.Dataset.gradient according to #1332.

fujiisoup · 2018-09-04T08:22:39Z

xarray/core/computation.py

+      * x        (x) float64 0.0 0.1 1.1 1.2
+    Dimensions without coordinates: y
+    >>>
+    >>> xr.gradient(da, ('x', 'y'))


Although this API is similar to NumPy's counterpart, I'm wondering whether we really need to support it. The return value is a tuple of DataArrays, which I feel is inconsistent with other xarray functions.

When I wrote my own wrapper for gradient of a DataArray, I returned a dataset with each variable being a gradient along one dimension, but even that isn't very elegant ¯_(ツ)_/¯

That would be another option, but to store them in a dataset we would need to pick their names very heuristically...

Maybe we can drop this api as this does not speed up the computation and it's easy to do the same thing manually.

I agree. I don't think it's particularly useful to compute independent gradients with respect to multiple axes at the same time, e.g., du/dx, du/dy, etc. If anything, I would be more excited about computing higher order derivatives, e.g., \frac{d^2 u}{dx dy}
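To illustrate the point: independent per-axis gradients are just separate single-axis calls, and a mixed second derivative can be approximated by chaining two of them. The sketch below uses plain np.gradient rather than the xarray API under review; the array u and the coordinates x and y are made up for illustration.

```python
import numpy as np

# Hypothetical sample grid: non-uniform in x, uniform in y.
x = np.array([0.0, 0.1, 1.1, 1.2])
y = np.array([0.0, 1.0, 2.0])
u = x[:, None] ** 2 * y[None, :]  # u(x, y) = x**2 * y

# Independent gradients: one single-axis call per dimension.
du_dx = np.gradient(u, x, axis=0, edge_order=2)
du_dy = np.gradient(u, y, axis=1, edge_order=2)

# Mixed second derivative: differentiate du/dx along the other axis.
d2u_dxdy = np.gradient(du_dx, y, axis=1, edge_order=2)
```

Nothing is gained by doing the two first-order calls at once, which is the argument for dropping the tuple-returning form.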

fujiisoup · 2018-09-04T08:23:04Z

xarray/core/dask_array_ops.py

@@ -104,3 +104,49 @@ def func(x, window, axis=-1):
    index = (slice(None),) * axis + (slice(drop_size,
                                           drop_size + orig_shape[axis]), )
    return out[index]
+
+
+def gradient(a, coord, axis, edge_order):


Maybe this should be implemented upstream...

Do you mean like this? Requires Dask 0.17.3+ though. So IDK if that works for you or not.

Thanks @jakirkham.
I need to compute the gradient with a non-uniform coordinate, but dask Array.gradient only supports uniformly spaced coordinates.

I think this can be implemented using overlap, as I do here. Is that within the scope of dask development?

Ah sorry. Should have read more of the code.

Yeah I think that is in scope. Just not currently supported yet.

Could you please open an issue or perhaps PR over on the Dask repo?
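For context, the difference between uniform spacing and a coordinate array can be seen with NumPy alone. This is only a sketch: it uses np.gradient (which accepts coordinate arrays from NumPy 1.13 on), not the dask code in this PR, and it reuses the x values from the docstring example.

```python
import numpy as np

x = np.array([0.0, 0.1, 1.1, 1.2])  # non-uniform coordinate
f = x ** 2                           # samples of f(x) = x**2

# Treating the samples as uniformly spaced (unit spacing) ignores x entirely.
g_uniform = np.gradient(f)

# Passing the coordinate array uses the actual distances between points.
g_coord = np.gradient(f, x, edge_order=2)
```

With the coordinate supplied, the result matches the analytic derivative 2 * x for this quadratic; with unit spacing it does not.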

stickler-ci · 2018-09-04T11:32:13Z

xarray/core/npcompat.py

+    def gradient(f, *varargs, **kwargs):
+        """
+        Return the gradient of an N-dimensional array.
+        The gradient is computed using second order accurate central differences


E501 line too long (80 > 79 characters)

stickler-ci · 2018-09-04T11:32:14Z

xarray/core/npcompat.py

+        """
+        Return the gradient of an N-dimensional array.
+        The gradient is computed using second order accurate central differences
+        in the interior points and either first or second order accurate one-sides


E501 line too long (82 > 79 characters)

stickler-ci · 2018-09-04T11:32:14Z

xarray/core/npcompat.py

+        f : array_like
+            An N-dimensional array containing samples of a scalar function.
+        varargs : list of scalar or array, optional
+            Spacing between f values. Default unitary spacing for all dimensions.


E501 line too long (81 > 79 characters)

stickler-ci · 2018-09-04T11:32:14Z

xarray/core/npcompat.py

+            Spacing between f values. Default unitary spacing for all dimensions.
+            Spacing can be specified using:
+            1. single scalar to specify a sample distance for all dimensions.
+            2. N scalars to specify a constant sample distance for each dimension.


E501 line too long (82 > 79 characters)

stickler-ci · 2018-09-04T11:32:14Z

xarray/core/npcompat.py

+            3. N arrays to specify the coordinates of the values along each
+               dimension of F. The length of the array must match the size of
+               the corresponding dimension
+            4. Any combination of N scalars/arrays with the meaning of 2. and 3.


E501 line too long (80 > 79 characters)

stickler-ci · 2018-09-04T11:32:16Z

xarray/core/npcompat.py

+
+        # Difference of datetime64 elements results in timedelta64
+        if otype == 'M':
+            # Need to use the full dtype name because it contains unit information


E501 line too long (82 > 79 characters)

stickler-ci · 2018-09-04T11:32:16Z

xarray/core/npcompat.py

+        for i, axis in enumerate(axes):
+            if y.shape[axis] < edge_order + 1:
+                raise ValueError(
+                    "Shape of array too small to calculate a numerical gradient, "


E501 line too long (82 > 79 characters)

stickler-ci · 2018-09-04T11:32:16Z

xarray/core/npcompat.py

+            else:
+                dx1 = dx[i][0:-1]
+                dx2 = dx[i][1:]
+                a = -(dx2)/(dx1 * (dx1 + dx2))


E226 missing whitespace around arithmetic operator

stickler-ci · 2018-09-04T11:32:16Z

xarray/core/npcompat.py

+                shape = np.ones(N, dtype=int)
+                shape[axis] = -1
+                a.shape = b.shape = c.shape = shape
+                # 1D equivalent -- out[1:-1] = a * f[:-2] + b * f[1:-1] + c * f[2:]


E501 line too long (83 > 79 characters)
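The flagged comment describes a three-point stencil for non-uniform spacing. As a rough 1D sketch of the interior points (the helper name interior_gradient is made up, and the b and c weights are filled in from the standard three-point formula, which is an assumption since the hunks here only show a):

```python
import numpy as np

def interior_gradient(f, x):
    """Second-order derivative estimate at the interior points of a 1D array."""
    dx = np.diff(x)
    dx1 = dx[:-1]  # spacing to the left neighbour
    dx2 = dx[1:]   # spacing to the right neighbour
    a = -dx2 / (dx1 * (dx1 + dx2))
    b = (dx2 - dx1) / (dx1 * dx2)
    c = dx1 / (dx2 * (dx1 + dx2))
    # 1D equivalent of the commented line in the diff.
    return a * f[:-2] + b * f[1:-1] + c * f[2:]
```

The three weights sum to zero, so a constant input gives a zero derivative, and the estimate is exact for quadratics.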

stickler-ci · 2018-09-04T11:32:16Z

xarray/core/npcompat.py

+                else:
+                    dx1 = dx[i][0]
+                    dx2 = dx[i][1]
+                    a = -(2. * dx1 + dx2)/(dx1 * (dx1 + dx2))


E226 missing whitespace around arithmetic operator
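The line flagged here is the first weight of the one-sided, second-order estimate used at the left boundary when edge_order is 2. A small sketch under the same caveats as before (left_edge_gradient is a hypothetical name, and b and c are assumed from the standard one-sided three-point formula rather than taken from this hunk):

```python
import numpy as np

def left_edge_gradient(f, x):
    """Second-order one-sided derivative estimate at the first point."""
    dx1 = x[1] - x[0]
    dx2 = x[2] - x[1]
    a = -(2.0 * dx1 + dx2) / (dx1 * (dx1 + dx2))
    b = (dx1 + dx2) / (dx1 * dx2)
    c = -dx1 / (dx2 * (dx1 + dx2))
    # Weighted sum of the first three samples.
    return a * f[0] + b * f[1] + c * f[2]
```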

shoyer · 2018-09-04T21:51:44Z

I think this will be very welcome functionality!

I wonder if we should consider calling this xarray.differentiate instead of xarray.gradient. I think the NumPy function is poorly named for differentiating along a single axis at once.

stickler-ci · 2018-09-04T22:29:40Z

xarray/core/dask_array_ops.py

+    depth = {d: 1 if d == axis else 0 for d in range(a.ndim)}
+    # temporary pad zero at the boundary
+    boundary = "none"
+    ag = overlap(a, depth=depth, boundary=boundary)


F841 local variable 'ag' is assigned to but never used

stickler-ci · 2018-09-04T22:29:41Z

xarray/core/dask_array_ops.py

+    boundary = "none"
+    ag = overlap(a, depth=depth, boundary=boundary)
+
+    n_chunk = len(a.chunks[axis])


F841 local variable 'n_chunk' is assigned to but never used

stickler-ci · 2018-09-04T22:29:41Z

xarray/core/dask_array_ops.py

+    array_loc_start = array_loc_stop - np.array(a.chunks[axis]) - 2
+    array_loc_stop[-1] -= 1
+    array_loc_start[0] = 0
+


W293 blank line contains whitespace

stickler-ci · 2018-09-04T22:29:41Z

xarray/core/dask_array_ops.py

+        return grad
+
+    return a.map_overlap(
+            func,


E126 continuation line over-indented for hanging indent
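The idea behind using overlap here is that each chunk needs one neighbouring sample on each side before the gradient is taken, and the extra samples are trimmed afterwards. A NumPy-only sketch of that idea (no dask involved; chunked_gradient and its arguments are hypothetical names, not this PR's code):

```python
import numpy as np

def chunked_gradient(f, x, chunk_sizes, edge_order=1):
    """Differentiate a 1D array chunk by chunk with a halo of one sample."""
    n = len(f)
    pieces = []
    start = 0
    for size in chunk_sizes:
        stop = start + size
        lo = max(start - 1, 0)  # one extra sample on the left, if any
        hi = min(stop + 1, n)   # one extra sample on the right, if any
        g = np.gradient(f[lo:hi], x[lo:hi], edge_order=edge_order)
        # Trim the halo so only this chunk's own points remain.
        pieces.append(g[start - lo:start - lo + size])
        start = stop
    return np.concatenate(pieces)
```

Because the coordinate slice travels with each chunk, the result agrees with differentiating the whole array at once, including for non-uniform coordinates.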

fujiisoup · 2018-09-04T22:41:41Z

I wonder if we should consider calling this xarray.differentiate instead of xarray.gradient. I think the NumPy function is poorly named for differentiating along a single axis at once.

Agreed. Differentiate is nicer.

Some other API questions arose during the implementation:

Do we support differentiate for Dataset? If so, what should we do with variables that do not depend on the target coordinate?
I thought 'keep them as is' is intuitive (and I implemented it that way), but mathematically they should be zero.
Do we need to sort the array before differentiating? np.gradient implicitly assumes the array is sorted (but does nothing about it).
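To make the first question concrete, here is a toy sketch of the 'keep them as is' option. It does not use xarray: a dataset is modelled as a plain dict mapping names to (dims, values) pairs, and differentiate_vars is a hypothetical helper, not the method in this PR.

```python
import numpy as np

def differentiate_vars(variables, coord_name, coord_values):
    """variables maps name -> (dims, values); returns the same structure."""
    result = {}
    for name, (dims, values) in variables.items():
        if coord_name in dims:
            axis = dims.index(coord_name)
            grad = np.gradient(values, coord_values, axis=axis)
            result[name] = (dims, grad)
        else:
            # 'Keep as is' option; the mathematical alternative is zeros.
            result[name] = (dims, values)
    return result
```

Switching to the mathematically strict behaviour would only change the else branch, for example to np.zeros_like(values).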

stickler-ci · 2018-09-06T01:32:13Z

xarray/core/dask_array_compat.py

+    from numbers import Integral, Real
+
+
+    def validate_axis(axis, ndim):


E303 too many blank lines (2)

stickler-ci · 2018-09-06T01:32:13Z

xarray/core/dask_array_compat.py

+        if not isinstance(axis, Integral):
+            raise TypeError("Axis value must be an integer, got %s" % axis)
+        if axis < -ndim or axis >= ndim:
+            raise AxisError("Axis %d is out of bounds for array of dimension %d"


F821 undefined name 'AxisError'
E501 line too long (80 > 79 characters)
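The F821 warning means AxisError is never imported in this module. One way to make the helper self-contained is sketched below. The import fallback chain is an assumption about where different NumPy versions expose AxisError, and the final IndexError alias is purely a hypothetical last resort.

```python
from numbers import Integral

try:
    from numpy.exceptions import AxisError  # newer NumPy releases
except ImportError:
    try:
        from numpy import AxisError  # older NumPy releases
    except ImportError:
        AxisError = IndexError  # hypothetical fallback, not NumPy's class


def validate_axis(axis, ndim):
    """Check an axis argument and return it as a non-negative index."""
    if not isinstance(axis, Integral):
        raise TypeError("Axis value must be an integer, got %s" % axis)
    if axis < -ndim or axis >= ndim:
        raise AxisError(
            "Axis %d is out of bounds for array of dimension %d"
            % (axis, ndim))
    if axis < 0:
        axis += ndim
    return axis
```

Wrapping the raise call this way also keeps the line under the 79-character limit from the E501 warning.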

stickler-ci · 2018-09-06T01:32:13Z

xarray/core/dask_array_compat.py

+        return axis
+
+
+    def _gradient_kernel(x, block_id, coord, axis, array_locs, grad_kwargs):


E303 too many blank lines (2)

stickler-ci · 2018-09-06T01:32:14Z

xarray/core/dask_array_compat.py

+        return grad
+
+
+    def gradient(f, *varargs, **kwargs):


E303 too many blank lines (2)

stickler-ci · 2018-09-06T01:32:14Z

xarray/core/dask_array_ops.py

@@ -3,7 +3,7 @@

 import numpy as np

-from . import nputils
+from . import nputils, npcompat


F401 '.npcompat' imported but unused

fujiisoup · 2018-09-11T23:18:17Z

Any thoughts on this?

dopplershift · 2018-09-12T00:25:54Z

Why would you sort the array? Aren't you taking differences of values and dividing by differences between the matching coordinates?

fujiisoup · 2018-09-12T00:49:23Z

Thanks, @dopplershift .

Aren't you taking differences of values and dividing by differences between the matching coordinates?

Yes, correct. But if we have closer data points, then the estimate of the gradient becomes more precise. Sorting the array according to the coordinate provides the closest points, resulting in the most precise estimate of the gradient.

But I also think users can do it manually before taking the gradient.
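Doing the sort manually is short. A NumPy-only sketch with made-up data, which sorts by the coordinate, differentiates, and then restores the original order:

```python
import numpy as np

x = np.array([0.0, 2.0, 1.0, 3.0])  # unsorted coordinate
f = x ** 2

# Sort by coordinate so each point is differenced against its nearest neighbours.
order = np.argsort(x)
grad_sorted = np.gradient(f[order], x[order], edge_order=2)

# Put the result back in the original (unsorted) order.
grad = np.empty_like(grad_sorted)
grad[order] = grad_sorted
```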

shoyer · 2018-09-12T01:13:23Z

xarray/core/computation.py

+
+    if not isinstance(dataarray, DataArray):
+        raise TypeError(
+            'Only DataArray is supported. Given {}.'.format(type(dataarray)))


I'm a little confused here. You wrote an implementation for Dataset.differentiate, too, and here is a duplicate version of DataArray.differentiate.

Agreed. Maybe the top level function is not necessary here and DataArray.differentiate and Dataset.differentiate would be sufficient.

👍 I don't think we need the separate function.

stickler-ci · 2018-09-12T14:32:51Z

xarray/core/utils.py

+        array = array - np.min(array)
+    else:
+        array = array - np.zeros(array.shape, dtype=array.dtype)
+


W293 blank line contains whitespace

stickler-ci · 2018-09-12T14:32:51Z

xarray/tests/test_dataset.py

+        npcompat.gradient(
+            da, utils.to_numeric(
+                da['x'], offset=True, time_unit='D'), axis=0, edge_order=1),
+            dims=da.dims, coords=da.coords)


E131 continuation line unaligned for hanging indent

shoyer · 2018-09-12T15:46:19Z

xarray/core/dataarray.py

+            The coordinate to be used to compute the gradient.
+        edge_order: 1 or 2. Default 1
+            N-th order accurate differences at the boundaries.
+


needs time_unit here

shoyer · 2018-09-12T15:47:37Z

xarray/core/dataset.py

+        coord_data = coord_var.data
+        if coord_data.dtype.kind in ['m', 'M']:
+            if time_unit is None:
+                time_unit = np.datetime_data(coord_data.dtype)[0]


Use tuple unpacking here, e.g., time_unit, _ = np.datetime_data(coord_data.dtype)
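For reference, np.datetime_data returns a (unit, count) pair, so unpacking reads a little more clearly than indexing with [0]. The sketch below shows the unpacking inside a small conversion helper; datetime_to_float is a hypothetical name and only approximates what this PR's to_numeric does.

```python
import numpy as np

def datetime_to_float(coord, datetime_unit=None):
    """Convert a datetime64/timedelta64 array to floats in the given unit."""
    if coord.dtype.kind not in "mM":
        return coord.astype(float)
    if datetime_unit is None:
        # np.datetime_data returns (unit, count); only the unit is needed.
        datetime_unit, _ = np.datetime_data(coord.dtype)
    if coord.dtype.kind == "M":
        # Differences of datetime64 values are timedelta64.
        coord = coord - coord.min()
    return coord / np.timedelta64(1, datetime_unit)

t = np.array(["2018-09-04", "2018-09-05", "2018-09-07"], dtype="datetime64[D]")
f = np.array([0.0, 2.0, 6.0])
df_dt = np.gradient(f, datetime_to_float(t))  # change per day
df_dt_hours = np.gradient(f, datetime_to_float(t, datetime_unit="h"))  # per hour
```

The choice of unit only rescales the result, which is why exposing it as an argument is enough until real unit support exists.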

shoyer · 2018-09-12T15:48:56Z

xarray/tests/test_dataset.py

+
+@pytest.mark.parametrize('dask', [True, False])
+@pytest.mark.parametrize('edge_order', [1, 2])
+def test_gradient(dask, edge_order):


In the long term, hopefully we'll solve this by getting real unit support working, but a time_unit argument seems like a good choice for now

fujiisoup · 2018-09-19T03:35:51Z

I think it's ready :)

spencerkclark

@fujiisoup I'm really looking forward to having this built in to xarray! Thanks for the extra work in making things dask-compatible as well.

Eventually it would be nice if this worked on DataArrays with cftime.datetime coordinates; I think it would be relatively straightforward to modify to_numeric to enable it (we could probably enable it for interp at the same time), but I can take care of that later if you'd like.

spencerkclark · 2018-09-19T12:56:58Z

doc/whats-new.rst

@@ -36,6 +36,10 @@ Documentation
 Enhancements
 ~~~~~~~~~~~~

+- :py:func:`~xarray.differentiate`, :py:meth:`~xarray.DataArray.differentiate`,


I think xarray.differentiate is no longer added.

spencerkclark · 2018-09-19T13:09:09Z

xarray/core/utils.py

+        'us', 'ns', 'ps', 'fs', 'as'}
+    dtype: target dtype
+    """
+    if array.dtype.kind not in ['m', 'M']:


It looks like everywhere to_numeric is called, this check is already made. Is it redundant to check again here (or conversely are the checks before the function is called redundant)?

shoyer · 2018-09-19T15:56:54Z

doc/computation.rst

+
+.. ipython:: python
+    a = xr.DataArray(np.arange(8).reshape(4, 2), dims=['x'],
+                     coords=[0.1, 0.11, 0.2, 0.3])


This example raises an error when you run it -- I think you need to supply two dimension names.

shoyer · 2018-09-19T15:58:28Z

doc/computation.rst

+
+Xarray objects have some handy methods for the computation with their
+coordinates. :py:meth:`~xarray.DataArray.differentiate` computes derivatives by
+finite central differences using their coordinates,


"finite central differences" -> "centered finite differences"

fmaussion · 2018-09-19T16:03:55Z

This is so great! This is going to simplify my classes about derivatives on the globe even more: my only concern is that students will soon forget what a numerical derivative actually is, and that it's not trivial to implement ;-)

shoyer · 2018-09-19T16:11:51Z

Speaking of derivatives on the globe, we might want to include an option for periodic boundary conditions (and possibly for other functions like interp as well). But we can save that for another PR :).

fmaussion · 2018-09-19T16:14:48Z

But we can save that for another PR :)

See also #1288 : integrate is the next on my list ;) - I can try to give it a go if @fujiisoup doesn't want to do it himself

jakirkham · 2018-09-19T17:38:42Z

@shoyer, have you seen da.pad?

shoyer · 2018-09-19T20:07:35Z

@shoyer, have you seen da.pad?

Yes, pad solves the hard part of periodic boundary conditions -- we should just need to do a bit of book-keeping on the xarray side.
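A rough sketch of what that book-keeping could look like on a uniform periodic grid. It uses np.pad with mode="wrap" as a stand-in for da.pad, and periodic_gradient is a hypothetical helper, not something in this PR.

```python
import numpy as np

def periodic_gradient(f, dx):
    """Centered differences on a uniform grid that wraps around."""
    # Add one wrapped sample on each side, differentiate, then trim the halo.
    padded = np.pad(f, 1, mode="wrap")
    return np.gradient(padded, dx)[1:-1]
```

Every point, including the first and last, then gets a centered difference instead of a one-sided one.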

rabernat · 2018-09-19T20:30:58Z

I agree that the features implemented here are broadly useful and belong in xarray. The fundamental question is: how far does xarray itself want to go in supporting vector calculus in curvilinear coordinates (i.e. on the sphere)?

There is a fair bit of overlap between this new functionality and some of the things that we are trying to do in xgcm: https://xgcm.readthedocs.io/en/latest/grids.html. Xgcm supports periodic boundary conditions, as well as more complex topological connections between array edges. It allows users to reproduce precisely the sort of operations used to take gradients in finite-volume staggered-grid models.

shoyer · 2018-09-19T20:35:43Z

@rabernat my inclination would be to draw the line at regular grids (with or without periodic boundary conditions), which are pretty commonly encountered in data analysis for any continuous physical systems. We would leave non-regular cases like staggered-grids to add-ons like xgcm -- these grids tend to be more application/PDE specific.

shoyer · 2018-09-19T20:36:15Z

We should definitely mention specialized tools like xgcm as alternatives in the docstring for these xarray methods.

fujiisoup · 2018-09-19T22:01:12Z

Thanks all. Updated.

Thanks, @spencerkclark

Eventually it would be nice if this worked on DataArrays with cftime.datetime coordinates; I think it would be relatively straightforward to modify to_numeric to enable it (we could probably enable it for interp at the same time), but I can take care of that later if you'd like.

Thanks. I added this function for something like this extension, though I do not yet fully follow your cftime update. It would be super nice if you could take care of this after merge.

@shoyer ,

we might want to include an option for periodic boundary conditions

Agreed. This option would be nice not only for differentiate but also for interp and rolling.
I think we can add common logic to take care of them.

@fmaussion

See also #1288 : integrate is the next on my list ;) - I can try to give it a go if @fujiisoup doesn't want to do it himself

Thanks.
Actually, I am now moving to another country and will not have enough time for the next few weeks.
I would appreciate it if you could take care of this.

spencerkclark · 2018-09-19T22:23:23Z

I added this function for something like this extension, though I do not yet fully follow your cftime update. It would be super nice if you could take care of this after merge.

I totally understand; cftime objects probably seem fairly esoteric to non climate scientists. I'll be happy to take care of this after this gets merged. Indeed having the relevant logic already contained in to_numeric is very convenient.

shoyer · 2018-09-19T22:33:58Z

I'm going to merge this after the Appveyor tests pass, unless there are any further objections.

shoyer · 2018-09-20T02:14:25Z

@rabernat let me know if this makes sense to you / if you agree.

rabernat

Sorry I'm late to the party on reviewing this. It looks like a great feature, and I know @fujiisoup has put a huge amount of work into it! 👏

My only concern is that the first thing most geoscience users will try with this function is to take a field with lat, lon dimensions and call differentiate on it, expecting to obtain the zonal and meridional gradients.

I feel that some mention is needed in the documentation that this feature is limited to simple cartesian geometry, in which the physical locations of the variables are accurately described by the dimension coordinates.
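A small illustration of the concern, using NumPy only and made-up values: differentiating against a longitude coordinate in degrees yields a rate per degree, and turning that into a physical zonal gradient needs a latitude-dependent factor that a coordinate-based derivative knows nothing about. The Earth radius below is an approximate, assumed value.

```python
import numpy as np

R = 6.371e6  # approximate Earth radius in metres (assumed value)
lat = 60.0   # a single latitude circle, in degrees
lon = np.arange(0.0, 360.0, 1.0)  # longitude in degrees
u = np.sin(np.deg2rad(lon))       # made-up field along that circle

# Differentiating against the raw coordinate gives a rate per degree.
du_dlon = np.gradient(u, lon)

# Converting to a rate per metre needs the local length of one degree.
metres_per_degree = R * np.cos(np.deg2rad(lat)) * np.pi / 180.0
du_dx = du_dlon / metres_per_degree
```

The second step is the part users would have to supply themselves, or get from a tool such as xgcm.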

rabernat · 2018-09-20T02:41:21Z

doc/computation.rst

+                     coords=[0.1, 0.11, 0.2, 0.3])
+    a.differentiate('x')
+
+


This is a place where we could mention the limitations of differentiate and gradient for non-cartesian geometry.

rabernat · 2018-09-20T02:41:43Z

xarray/core/dataarray.py

@@ -2289,6 +2292,57 @@ def rank(self, dim, pct=False, keep_attrs=False):
        ds = self._to_temp_dataset().rank(dim, pct=pct, keep_attrs=keep_attrs)
        return self._from_temp_dataset(ds)

+    def differentiate(self, coord, edge_order=1, datetime_unit=None):
+        """ Differentiate the array with the second order accurate central
+        differences.


rabernat · 2018-09-20T02:41:55Z

xarray/core/dataset.py

@@ -3663,6 +3666,58 @@ def rank(self, dim, pct=False, keep_attrs=False):
        attrs = self.attrs if keep_attrs else None
        return self._replace_vars_and_dims(variables, coord_names, attrs=attrs)

+    def differentiate(self, coord, edge_order=1, datetime_unit=None):
+        """ Differentiate with the second order accurate central
+        differences.


fujiisoup · 2018-09-20T12:40:16Z

Thanks, @rabernat for the review.

Added the limitation of this method to docs.

rabernat

👍

fujiisoup added 5 commits September 4, 2018 09:05

Added xr.gradient, DataArray.gradient, Dataset.gradient

59ff688

Working with np.backend

0adfc68

test is not passing

d665b3c

Docs

e0fa5fd

flake8

218e62d

fujiisoup commented Sep 4, 2018

fujiisoup added 2 commits September 4, 2018 18:07

support environment without dask

888b924

Support numpy < 1.13

a0ab4c2

stickler-ci reviewed Sep 4, 2018

fujiisoup mentioned this pull request Sep 4, 2018

Gradient with coordinate dask/dask#3945

Closed

fujiisoup added 2 commits September 5, 2018 07:18

Support numpy 1.12

c581513

simplify dask.gradient

d6be041

stickler-ci reviewed Sep 4, 2018

lint

a083460

fujiisoup added 4 commits September 5, 2018 10:45

Use npcompat.gradient in tests

267694d

Merge branch 'master' into gradient

bf2f35e

move gradient to dask_array_compat

2a71b62

gradient -> differentiate

b504da8

stickler-ci reviewed Sep 6, 2018

fujiisoup added 5 commits September 6, 2018 11:06

lint

fb356c5

Merge branch 'master' into gradient

1694d3c

Update dask_array_compat

7a0b57f

Added a link from diff

e93b926

Merge branch 'master' into gradient

4c656e0

shoyer reviewed Sep 12, 2018

stickler-ci reviewed Sep 12, 2018

shoyer reviewed Sep 12, 2018

fujiisoup added 4 commits September 13, 2018 07:09

Update via comment. Use utils.to_numeric also in interp

4112cd9

time_unit -> datetime_unit

1c2c88c

Some more info in docs.

cbfecb4

update test

2e8db19

spencerkclark reviewed Sep 19, 2018

shoyer reviewed Sep 19, 2018

fmaussion mentioned this pull request Sep 19, 2018

0.10.9 release #2424

Closed

Update via comments

c31539e

rabernat reviewed Sep 20, 2018

Update docs.

528bcab

rabernat approved these changes Sep 20, 2018

shoyer merged commit ab96954 into pydata:master Sep 21, 2018

spencerkclark mentioned this pull request Sep 23, 2018

Enable use of cftime.datetime coordinates with differentiate and interp #2434

Merged

3 tasks

asross mentioned this pull request Nov 4, 2021

Approximations of higher-order derivatives #5938

Open
