Commit db0f13d

Dataset.map, GroupBy.map, Resample.map (#3459)
* rename dataset.apply to dataset.map, deprecating apply
* use apply in deprecation test
* adjust docs
* add groupby rename, remove deprecation warnings (to pending)
* change internal usages
* formatting
* whatsnew
* docs
* docs
* internal usages
* formatting
* docstring, see also
1 parent ffc3275 commit db0f13d
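
The core pattern in this commit is a rename that keeps the old name working: `map` becomes the real implementation, and `apply` stays as a thin wrapper that emits a `PendingDeprecationWarning` and delegates. The sketch below reproduces that pattern in plain Python. `Collection` is a hypothetical stand-in invented for illustration, not xarray code; only the shape of the wrapper follows the diff.

```python
import warnings


class Collection:
    """Hypothetical Dataset-like container (illustration only, not xarray)."""

    def __init__(self, variables):
        self.variables = dict(variables)

    def map(self, func, args=(), **kwargs):
        """Apply ``func`` to each variable and return a new Collection."""
        return Collection(
            {
                name: func(value, *args, **kwargs)
                for name, value in self.variables.items()
            }
        )

    def apply(self, func, args=(), **kwargs):
        """Backward compatible alias of ``map``."""
        warnings.warn(
            "Collection.apply may be deprecated in the future. "
            "Using Collection.map is encouraged",
            PendingDeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not this wrapper
        )
        # Forward by keyword so the wrapper cannot mix up argument positions.
        return self.map(func, args=args, **kwargs)


ds = Collection({"foo": -1.5, "bar": 2})

# PendingDeprecationWarning is hidden by Python's default warning filters,
# so it is recorded explicitly here to show that the old name still warns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = ds.apply(abs)

new_result = ds.map(abs)
```

Both calls return the same values. The only observable difference is the recorded warning, which is what lets existing callers migrate at their own pace.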

13 files changed

+186 -76 lines changed

doc/computation.rst

+2-2
Original file line number | Diff line number | Diff line change
@@ -462,13 +462,13 @@ Datasets support most of the same methods found on data arrays:
462462
abs(ds)
463463
464464
Datasets also support NumPy ufuncs (requires NumPy v1.13 or newer), or
465-
alternatively you can use :py:meth:`~xarray.Dataset.apply` to apply a function
465+
alternatively you can use :py:meth:`~xarray.Dataset.map` to map a function
466466
to each variable in a dataset:
467467

468468
.. ipython:: python
469469
470470
np.sin(ds)
471-
ds.apply(np.sin)
471+
ds.map(np.sin)
472472
473473
Datasets also use looping over variables for *broadcasting* in binary
474474
arithmetic. You can do arithmetic between any ``DataArray`` and a dataset:

doc/groupby.rst

+8-7
@@ -35,10 +35,11 @@ Let's create a simple example dataset:
3535
3636
.. ipython:: python
3737
38-
ds = xr.Dataset({'foo': (('x', 'y'), np.random.rand(4, 3))},
39-
coords={'x': [10, 20, 30, 40],
40-
'letters': ('x', list('abba'))})
41-
arr = ds['foo']
38+
ds = xr.Dataset(
39+
{"foo": (("x", "y"), np.random.rand(4, 3))},
40+
coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
41+
)
42+
arr = ds["foo"]
4243
ds
4344
4445
If we groupby the name of a variable or coordinate in a dataset (we can also
@@ -93,15 +94,15 @@ Apply
9394
~~~~~
9495

9596
To apply a function to each group, you can use the flexible
96-
:py:meth:`~xarray.DatasetGroupBy.apply` method. The resulting objects are automatically
97+
:py:meth:`~xarray.DatasetGroupBy.map` method. The resulting objects are automatically
9798
concatenated back together along the group axis:
9899

99100
.. ipython:: python
100101
101102
def standardize(x):
102103
return (x - x.mean()) / x.std()
103104
104-
arr.groupby('letters').apply(standardize)
105+
arr.groupby('letters').map(standardize)
105106
106107
GroupBy objects also have a :py:meth:`~xarray.DatasetGroupBy.reduce` method and
107108
methods like :py:meth:`~xarray.DatasetGroupBy.mean` as shortcuts for applying an
@@ -202,7 +203,7 @@ __ http://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#_two_dimen
202203
dims=['ny','nx'])
203204
da
204205
da.groupby('lon').sum(...)
205-
da.groupby('lon').apply(lambda x: x - x.mean(), shortcut=False)
206+
da.groupby('lon').map(lambda x: x - x.mean(), shortcut=False)
206207
207208
Because multidimensional groups have the ability to generate a very large
208209
number of bins, coarse-binning via :py:meth:`~xarray.Dataset.groupby_bins`
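
The `standardize` example in doc/groupby.rst relies on split-apply-combine: the data is split by group label, the function runs on each group, and the results are put back in the original order. The following is a minimal plain-Python sketch of that idea, not xarray's implementation. `grouped_map` is a hypothetical helper, and using the population standard deviation is an assumption about what `x.std()` computes by default.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize(values):
    """Centre a group on its mean and scale by its population std deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def grouped_map(values, labels, func):
    """Split ``values`` by ``labels``, apply ``func`` per group, recombine."""
    # Split: remember the original positions of each group's members.
    positions = defaultdict(list)
    for index, label in enumerate(labels):
        positions[label].append(index)

    # Apply and combine: write each transformed group back to its own slots.
    result = [None] * len(values)
    for indices in positions.values():
        transformed = func([values[i] for i in indices])
        for i, new_value in zip(indices, transformed):
            result[i] = new_value
    return result


values = [1.0, 2.0, 4.0, 3.0]
labels = list("abba")  # same grouping as the ``letters`` coordinate above
out = grouped_map(values, labels, standardize)
```

Each group is standardized independently, so a group whose values are all equal would divide by zero here; real code would need to decide how to handle that case.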

doc/howdoi.rst

+1-1
@@ -44,7 +44,7 @@ How do I ...
4444
* - convert a possibly irregularly sampled timeseries to a regularly sampled timeseries
4545
- :py:meth:`DataArray.resample`, :py:meth:`Dataset.resample` (see :ref:`resampling` for more)
4646
* - apply a function on all data variables in a Dataset
47-
- :py:meth:`Dataset.apply`
47+
- :py:meth:`Dataset.map`
4848
* - write xarray objects with complex values to a netCDF file
4949
- :py:func:`Dataset.to_netcdf`, :py:func:`DataArray.to_netcdf` specifying ``engine="h5netcdf", invalid_netcdf=True``
5050
* - make xarray objects look like other xarray objects

doc/quick-overview.rst

+1-1
@@ -142,7 +142,7 @@ xarray supports grouped operations using a very similar API to pandas (see :ref:
142142
labels = xr.DataArray(['E', 'F', 'E'], [data.coords['y']], name='labels')
143143
labels
144144
data.groupby(labels).mean('y')
145-
data.groupby(labels).apply(lambda x: x - x.min())
145+
data.groupby(labels).map(lambda x: x - x.min())
146146
147147
Plotting
148148
--------

doc/whats-new.rst

+7
@@ -44,6 +44,13 @@ New Features
4444
option for dropping either labels or variables, but using the more specific methods is encouraged.
4545
(:pull:`3475`)
4646
By `Maximilian Roos <https://github.com/max-sixty>`_
47+
- :py:meth:`Dataset.map` & :py:meth:`GroupBy.map` & :py:meth:`Resample.map` have been added for
48+
mapping / applying a function over each item in the collection, reflecting the widely used
49+
and least surprising name for this operation.
50+
The existing ``apply`` methods remain for backward compatibility, though using the ``map``
51+
methods is encouraged.
52+
(:pull:`3459`)
53+
By `Maximilian Roos <https://github.com/max-sixty>`_
4754
- :py:meth:`Dataset.transpose` and :py:meth:`DataArray.transpose` now support an ellipsis (`...`)
4855
to represent all 'other' dimensions. For example, to move one dimension to the front,
4956
use `.transpose('x', ...)`. (:pull:`3421`)

xarray/core/dataarray.py

+8-3
@@ -920,7 +920,7 @@ def copy(self, deep: bool = True, data: Any = None) -> "DataArray":
920920
Coordinates:
921921
* x (x) <U1 'a' 'b' 'c'
922922
923-
See also
923+
See Also
924924
--------
925925
pandas.DataFrame.copy
926926
"""
@@ -1717,7 +1717,7 @@ def stack(
17171717
codes=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]],
17181718
names=['x', 'y'])
17191719
1720-
See also
1720+
See Also
17211721
--------
17221722
DataArray.unstack
17231723
"""
@@ -1765,7 +1765,7 @@ def unstack(
17651765
>>> arr.identical(roundtripped)
17661766
True
17671767
1768-
See also
1768+
See Also
17691769
--------
17701770
DataArray.stack
17711771
"""
@@ -1923,6 +1923,11 @@ def drop(
19231923
"""Backward compatible method based on `drop_vars` and `drop_sel`
19241924
19251925
Using either `drop_vars` or `drop_sel` is encouraged
1926+
1927+
See Also
1928+
--------
1929+
DataArray.drop_vars
1930+
DataArray.drop_sel
19261931
"""
19271932
ds = self._to_temp_dataset().drop(labels, dim, errors=errors)
19281933
return self._from_temp_dataset(ds)

xarray/core/dataset.py

+30-4
@@ -3557,6 +3557,11 @@ def drop(self, labels=None, dim=None, *, errors="raise", **labels_kwargs):
35573557
"""Backward compatible method based on `drop_vars` and `drop_sel`
35583558
35593559
Using either `drop_vars` or `drop_sel` is encouraged
3560+
3561+
See Also
3562+
--------
3563+
Dataset.drop_vars
3564+
Dataset.drop_sel
35603565
"""
35613566
if errors not in ["raise", "ignore"]:
35623567
raise ValueError('errors must be either "raise" or "ignore"')
@@ -4108,14 +4113,14 @@ def reduce(
41084113
variables, coord_names=coord_names, attrs=attrs, indexes=indexes
41094114
)
41104115

4111-
def apply(
4116+
def map(
41124117
self,
41134118
func: Callable,
41144119
keep_attrs: bool = None,
41154120
args: Iterable[Any] = (),
41164121
**kwargs: Any,
41174122
) -> "Dataset":
4118-
"""Apply a function over the data variables in this dataset.
4123+
"""Apply a function to each variable in this dataset
41194124
41204125
Parameters
41214126
----------
@@ -4135,7 +4140,7 @@ def apply(
41354140
Returns
41364141
-------
41374142
applied : Dataset
4138-
Resulting dataset from applying ``func`` over each data variable.
4143+
Resulting dataset from applying ``func`` to each data variable.
41394144
41404145
Examples
41414146
--------
@@ -4148,7 +4153,7 @@ def apply(
41484153
Data variables:
41494154
foo (dim_0, dim_1) float64 -0.3751 -1.951 -1.945 0.2948 0.711 -0.3948
41504155
bar (x) int64 -1 2
4151-
>>> ds.apply(np.fabs)
4156+
>>> ds.map(np.fabs)
41524157
<xarray.Dataset>
41534158
Dimensions: (dim_0: 2, dim_1: 3, x: 2)
41544159
Dimensions without coordinates: dim_0, dim_1, x
@@ -4165,6 +4170,27 @@ def apply(
41654170
attrs = self.attrs if keep_attrs else None
41664171
return type(self)(variables, attrs=attrs)
41674172

4173+
def apply(
4174+
self,
4175+
func: Callable,
4176+
keep_attrs: bool = None,
4177+
args: Iterable[Any] = (),
4178+
**kwargs: Any,
4179+
) -> "Dataset":
4180+
"""
4181+
Backward compatible implementation of ``map``
4182+
4183+
See Also
4184+
--------
4185+
Dataset.map
4186+
"""
4187+
warnings.warn(
4188+
"Dataset.apply may be deprecated in the future. Using Dataset.map is encouraged",
4189+
PendingDeprecationWarning,
4190+
stacklevel=2,
4191+
)
4192+
return self.map(func, keep_attrs, args, **kwargs)
4193+
41684194
def assign(
41694195
self, variables: Mapping[Hashable, Any] = None, **variables_kwargs: Hashable
41704196
) -> "Dataset":

xarray/core/groupby.py

+40-9
@@ -608,7 +608,7 @@ def assign_coords(self, coords=None, **coords_kwargs):
608608
Dataset.swap_dims
609609
"""
610610
coords_kwargs = either_dict_or_kwargs(coords, coords_kwargs, "assign_coords")
611-
return self.apply(lambda ds: ds.assign_coords(**coords_kwargs))
611+
return self.map(lambda ds: ds.assign_coords(**coords_kwargs))
612612

613613

614614
def _maybe_reorder(xarray_obj, dim, positions):
@@ -655,8 +655,8 @@ def lookup_order(dimension):
655655
new_order = sorted(stacked.dims, key=lookup_order)
656656
return stacked.transpose(*new_order, transpose_coords=self._restore_coord_dims)
657657

658-
def apply(self, func, shortcut=False, args=(), **kwargs):
659-
"""Apply a function over each array in the group and concatenate them
658+
def map(self, func, shortcut=False, args=(), **kwargs):
659+
"""Apply a function to each array in the group and concatenate them
660660
together into a new array.
661661
662662
`func` is called like `func(ar, *args, **kwargs)` for each array `ar`
@@ -702,6 +702,21 @@ def apply(self, func, shortcut=False, args=(), **kwargs):
702702
applied = (maybe_wrap_array(arr, func(arr, *args, **kwargs)) for arr in grouped)
703703
return self._combine(applied, shortcut=shortcut)
704704

705+
def apply(self, func, shortcut=False, args=(), **kwargs):
706+
"""
707+
Backward compatible implementation of ``map``
708+
709+
See Also
710+
--------
711+
DataArrayGroupBy.map
712+
"""
713+
warnings.warn(
714+
"GroupBy.apply may be deprecated in the future. Using GroupBy.map is encouraged",
715+
PendingDeprecationWarning,
716+
stacklevel=2,
717+
)
718+
return self.map(func, shortcut=shortcut, args=args, **kwargs)
719+
705720
def _combine(self, applied, restore_coord_dims=False, shortcut=False):
706721
"""Recombine the applied objects like the original."""
707722
applied_example, applied = peek_at(applied)
@@ -765,7 +780,7 @@ def quantile(self, q, dim=None, interpolation="linear", keep_attrs=None):
765780
if dim is None:
766781
dim = self._group_dim
767782

768-
out = self.apply(
783+
out = self.map(
769784
self._obj.__class__.quantile,
770785
shortcut=False,
771786
q=q,
@@ -820,16 +835,16 @@ def reduce_array(ar):
820835

821836
check_reduce_dims(dim, self.dims)
822837

823-
return self.apply(reduce_array, shortcut=shortcut)
838+
return self.map(reduce_array, shortcut=shortcut)
824839

825840

826841
ops.inject_reduce_methods(DataArrayGroupBy)
827842
ops.inject_binary_ops(DataArrayGroupBy)
828843

829844

830845
class DatasetGroupBy(GroupBy, ImplementsDatasetReduce):
831-
def apply(self, func, args=(), shortcut=None, **kwargs):
832-
"""Apply a function over each Dataset in the group and concatenate them
846+
def map(self, func, args=(), shortcut=None, **kwargs):
847+
"""Apply a function to each Dataset in the group and concatenate them
833848
together into a new Dataset.
834849
835850
`func` is called like `func(ds, *args, **kwargs)` for each dataset `ds`
@@ -862,6 +877,22 @@ def apply(self, func, args=(), shortcut=None, **kwargs):
862877
applied = (func(ds, *args, **kwargs) for ds in self._iter_grouped())
863878
return self._combine(applied)
864879

880+
def apply(self, func, args=(), shortcut=None, **kwargs):
881+
"""
882+
Backward compatible implementation of ``map``
883+
884+
See Also
885+
--------
886+
DatasetGroupBy.map
887+
"""
888+
889+
warnings.warn(
890+
"GroupBy.apply may be deprecated in the future. Using GroupBy.map is encouraged",
891+
PendingDeprecationWarning,
892+
stacklevel=2,
893+
)
894+
return self.map(func, shortcut=shortcut, args=args, **kwargs)
895+
865896
def _combine(self, applied):
866897
"""Recombine the applied objects like the original."""
867898
applied_example, applied = peek_at(applied)
@@ -914,7 +945,7 @@ def reduce_dataset(ds):
914945

915946
check_reduce_dims(dim, self.dims)
916947

917-
return self.apply(reduce_dataset)
948+
return self.map(reduce_dataset)
918949

919950
def assign(self, **kwargs):
920951
"""Assign data variables by group.
@@ -923,7 +954,7 @@ def assign(self, **kwargs):
923954
--------
924955
Dataset.assign
925956
"""
926-
return self.apply(lambda ds: ds.assign(**kwargs))
957+
return self.map(lambda ds: ds.assign(**kwargs))
927958

928959

929960
ops.inject_reduce_methods(DatasetGroupBy)

xarray/core/resample.py

+39-4
@@ -1,3 +1,5 @@
1+
import warnings
2+
13
from . import ops
24
from .groupby import DataArrayGroupBy, DatasetGroupBy
35

@@ -173,8 +175,8 @@ def __init__(self, *args, dim=None, resample_dim=None, **kwargs):
173175

174176
super().__init__(*args, **kwargs)
175177

176-
def apply(self, func, shortcut=False, args=(), **kwargs):
177-
"""Apply a function over each array in the group and concatenate them
178+
def map(self, func, shortcut=False, args=(), **kwargs):
179+
"""Apply a function to each array in the group and concatenate them
178180
together into a new array.
179181
180182
`func` is called like `func(ar, *args, **kwargs)` for each array `ar`
@@ -212,7 +214,9 @@ def apply(self, func, shortcut=False, args=(), **kwargs):
212214
applied : DataArray
213215
The result of splitting, applying and combining this array.
214216
"""
215-
combined = super().apply(func, shortcut=shortcut, args=args, **kwargs)
217+
# TODO: the argument order for Resample doesn't match that for its parent,
218+
# GroupBy
219+
combined = super().map(func, shortcut=shortcut, args=args, **kwargs)
216220

217221
# If the aggregation function didn't drop the original resampling
218222
# dimension, then we need to do so before we can rename the proxy
@@ -225,6 +229,21 @@ def apply(self, func, shortcut=False, args=(), **kwargs):
225229

226230
return combined
227231

232+
def apply(self, func, args=(), shortcut=None, **kwargs):
233+
"""
234+
Backward compatible implementation of ``map``
235+
236+
See Also
237+
--------
238+
DataArrayResample.map
239+
"""
240+
warnings.warn(
241+
"Resample.apply may be deprecated in the future. Using Resample.map is encouraged",
242+
PendingDeprecationWarning,
243+
stacklevel=2,
244+
)
245+
return self.map(func=func, shortcut=shortcut, args=args, **kwargs)
246+
228247

229248
ops.inject_reduce_methods(DataArrayResample)
230249
ops.inject_binary_ops(DataArrayResample)
@@ -247,7 +266,7 @@ def __init__(self, *args, dim=None, resample_dim=None, **kwargs):
247266

248267
super().__init__(*args, **kwargs)
249268

250-
def apply(self, func, args=(), shortcut=None, **kwargs):
269+
def map(self, func, args=(), shortcut=None, **kwargs):
251270
"""Apply a function over each Dataset in the groups generated for
252271
resampling and concatenate them together into a new Dataset.
253272
@@ -282,6 +301,22 @@ def apply(self, func, args=(), shortcut=None, **kwargs):
282301

283302
return combined.rename({self._resample_dim: self._dim})
284303

304+
def apply(self, func, args=(), shortcut=None, **kwargs):
305+
"""
306+
Backward compatible implementation of ``map``
307+
308+
See Also
309+
--------
310+
DatasetResample.map
311+
"""
312+
313+
warnings.warn(
314+
"Resample.apply may be deprecated in the future. Using Resample.map is encouraged",
315+
PendingDeprecationWarning,
316+
stacklevel=2,
317+
)
318+
return self.map(func=func, shortcut=shortcut, args=args, **kwargs)
319+
285320
def reduce(self, func, dim=None, keep_attrs=None, **kwargs):
286321
"""Reduce the items in this group by applying `func` along the
287322
pre-defined resampling dimension.
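
The TODO in `DataArrayResample.map` notes that argument order differs between related signatures, and the new `apply` wrappers in this file take `args` before `shortcut` while `DataArrayResample.map` takes `shortcut` before `args`. Forwarding by keyword, as the wrappers do, keeps that mismatch harmless. The sketch below uses hypothetical free functions (`new_map`, `old_apply`, `broken_apply`) to show the difference; it is an illustration of the pitfall, not code from xarray.

```python
def new_map(func, shortcut=False, args=(), **kwargs):
    """Hypothetical target: ``shortcut`` comes before ``args``."""
    return {"shortcut": shortcut, "args": args, "value": func(*args, **kwargs)}


def old_apply(func, args=(), shortcut=None, **kwargs):
    """Hypothetical wrapper: ``args`` comes before ``shortcut``."""
    # Keyword forwarding binds each value to the parameter of the same name,
    # regardless of how the two signatures order their parameters.
    return new_map(func, shortcut=shortcut, args=args, **kwargs)


def broken_apply(func, args=(), shortcut=None, **kwargs):
    """Same wrapper, but forwarding positionally (shown as a counter-example)."""
    # ``args`` lands in ``shortcut`` and ``shortcut`` lands in ``args``.
    return new_map(func, args, shortcut, **kwargs)


def add(a, b):
    return a + b


good = old_apply(add, args=(1, 2))

try:
    broken_apply(add, args=(1, 2))
    swapped_call_failed = False
except TypeError:
    # ``new_map`` received ``args=None`` and tried to unpack it.
    swapped_call_failed = True
```

Keyword forwarding only protects the wrapper's internal call. Callers who pass `shortcut` or `args` positionally to `apply` are still bound by whichever order the wrapper itself declares.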
