Feature/weighted #2922

Merged
merged 60 commits into from
Mar 19, 2020
0f2da8e
weighted for DataArray
mathause Apr 26, 2019
5f64492
remove some commented code
mathause Apr 26, 2019
685e5c4
pep8 and faulty import tests
mathause Apr 26, 2019
c9d612d
add weighted sum, replace 0s in sum_of_wgt
mathause Apr 30, 2019
a20a4cf
weighted: overhaul tests
mathause Apr 30, 2019
26c24b6
weighted: pep8
mathause Apr 30, 2019
f3c6758
weighted: pep8 lines
mathause Apr 30, 2019
25c3c29
weighted update docs
mathause May 2, 2019
5d37d11
weighted: fix typo
mathause May 2, 2019
b1c572b
weighted: pep8
mathause May 8, 2019
d1d1f2c
undo changes to avoid merge conflict
mathause Oct 17, 2019
6be1414
Merge branch 'master' into feature/weighted
mathause Oct 17, 2019
059263c
add weighted to dataarray again
mathause Oct 17, 2019
8b1904b
remove super
mathause Oct 17, 2019
8cad145
overhaul core/weighted.py
mathause Oct 17, 2019
49d4e43
add DatasetWeighted class
mathause Oct 17, 2019
527256e
_maybe_get_all_dims return sorted tuple
mathause Oct 17, 2019
739568f
work on: test_weighted
mathause Oct 17, 2019
f01305d
black and flake8
mathause Oct 17, 2019
2e3880d
Apply suggestions from code review (docs)
mathause Oct 17, 2019
ae8d048
restructure interim
mathause Oct 18, 2019
dc7f605
restructure classes
mathause Oct 18, 2019
c646568
Merge branch 'master' into feature/weighted
mathause Dec 4, 2019
e2ad69e
update weighted.py
mathause Dec 4, 2019
bd4f048
black
mathause Dec 4, 2019
3c7695a
use map; add keep_attrs
mathause Dec 4, 2019
ef07edd
implement expected_weighted; update tests
mathause Dec 4, 2019
064b5a9
add whats new
mathause Dec 4, 2019
fec1a35
Merge branch 'master' into feature/weighted
mathause Dec 4, 2019
72c7942
undo changes to whats-new
mathause Dec 4, 2019
0e91411
F811: noqa where?
mathause Dec 4, 2019
1eb2913
api.rst
mathause Dec 5, 2019
118dfed
add to computation
mathause Dec 5, 2019
e08c921
small updates
mathause Dec 5, 2019
0fafe0b
add example to gallery
mathause Dec 5, 2019
a8d330d
typo
mathause Dec 5, 2019
ae0012f
another typo
mathause Dec 5, 2019
111259b
correct docstring in core/common.py
mathause Dec 5, 2019
5afc6f3
Merge branch 'master' into feature/weighted
mathause Jan 14, 2020
668b54b
typos
mathause Jan 14, 2020
d877022
adjust review
mathause Jan 14, 2020
ead681e
clean tests
mathause Jan 14, 2020
c4598ba
add test nonequal coords
mathause Jan 14, 2020
866fba5
comment on use of dot
mathause Jan 14, 2020
3cc00c1
fix erroneous merge
mathause Jan 14, 2020
8f34167
Merge branch 'master' into feature/weighted
mathause Jan 21, 2020
9f0a8cd
update tests
mathause Jan 21, 2020
98929f1
Merge branch 'master' into feature/weighted
mathause Mar 5, 2020
62c43e6
move example to notebook
mathause Mar 5, 2020
2e8aba2
move whats-new entry to 15.1
mathause Mar 5, 2020
d14f668
some doc updates
mathause Mar 5, 2020
7fa78ae
dot to own function
mathause Mar 5, 2020
3ebb9d4
simplify some tests
mathause Mar 5, 2020
f01d47a
Doc updates
dcherian Mar 17, 2020
4b184f6
very minor changes.
dcherian Mar 17, 2020
1e06adc
fix & add references
dcherian Mar 17, 2020
706579a
doc: return 0/NaN on 0 weights
mathause Mar 17, 2020
b2718db
Merge branch 'feature/weighted' of https://github.com/mathause/xarray…
mathause Mar 17, 2020
4c17108
Merge branch 'master' into feature/weighted
mathause Mar 17, 2020
8acc78e
Update xarray/core/common.py
dcherian Mar 18, 2020
18 changes: 18 additions & 0 deletions doc/api.rst
@@ -165,6 +165,7 @@ Computation
Dataset.groupby_bins
Dataset.rolling
Dataset.rolling_exp
Dataset.weighted
Dataset.coarsen
Dataset.resample
Dataset.diff
@@ -340,6 +341,7 @@ Computation
DataArray.groupby_bins
DataArray.rolling
DataArray.rolling_exp
DataArray.weighted
DataArray.coarsen
DataArray.dt
DataArray.resample
@@ -577,6 +579,22 @@ Rolling objects
core.rolling.DatasetRolling.reduce
core.rolling_exp.RollingExp

Weighted objects
================

.. autosummary::
:toctree: generated/

core.weighted.DataArrayWeighted
core.weighted.DataArrayWeighted.mean
core.weighted.DataArrayWeighted.sum
core.weighted.DataArrayWeighted.sum_of_weights
core.weighted.DatasetWeighted
core.weighted.DatasetWeighted.mean
core.weighted.DatasetWeighted.sum
core.weighted.DatasetWeighted.sum_of_weights


Coarsen objects
===============

86 changes: 85 additions & 1 deletion doc/computation.rst
@@ -1,3 +1,5 @@
.. currentmodule:: xarray

.. _comput:

###########
@@ -241,12 +243,94 @@ You can also use ``construct`` to compute a weighted rolling sum:
To avoid this, use ``skipna=False`` as in the above example.


.. _comput.weighted:

Weighted array reductions
=========================

:py:class:`DataArray` and :py:class:`Dataset` objects include :py:meth:`DataArray.weighted`
and :py:meth:`Dataset.weighted` array reduction methods. They currently
support weighted ``sum`` and weighted ``mean``.

.. ipython:: python

coords = dict(month=('month', [1, 2, 3]))

prec = xr.DataArray([1.1, 1.0, 0.9], dims=('month', ), coords=coords)
weights = xr.DataArray([31, 28, 31], dims=('month', ), coords=coords)

Create a weighted object:

.. ipython:: python

weighted_prec = prec.weighted(weights)
weighted_prec

Calculate the weighted sum:

.. ipython:: python

weighted_prec.sum()

Calculate the weighted mean:

.. ipython:: python

weighted_prec.mean(dim="month")

The weighted sum corresponds to:

.. ipython:: python

weighted_sum = (prec * weights).sum()
weighted_sum

and the weighted mean to:

.. ipython:: python

weighted_mean = weighted_sum / weights.sum()
weighted_mean

However, the functions also take missing values in the data into account:

.. ipython:: python

data = xr.DataArray([np.nan, 2, 4])
weights = xr.DataArray([8, 1, 1])

data.weighted(weights).mean()

Using ``(data * weights).sum() / weights.sum()`` would (incorrectly) result
in 0.6.
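
The difference between the two calculations can be sketched in plain Python. This is only an illustration of the arithmetic, not xarray's implementation; the helper names ``nan_aware_weighted_mean`` and ``naive_weighted_mean`` are hypothetical.

```python
import math


def nan_aware_weighted_mean(values, weights):
    """Weighted mean that drops each NaN value together with its weight."""
    pairs = [(v, w) for v, w in zip(values, weights) if not math.isnan(v)]
    weighted_sum = sum(v * w for v, w in pairs)
    sum_of_weights = sum(w for _, w in pairs)
    return weighted_sum / sum_of_weights


def naive_weighted_mean(values, weights):
    """Skips NaN in the numerator only, so weights of missing data still count."""
    weighted_sum = sum(v * w for v, w in zip(values, weights) if not math.isnan(v))
    return weighted_sum / sum(weights)


data = [math.nan, 2.0, 4.0]
weights = [8, 1, 1]

print(nan_aware_weighted_mean(data, weights))  # (2*1 + 4*1) / (1 + 1) = 3.0
print(naive_weighted_mean(data, weights))      # (2*1 + 4*1) / (8 + 1 + 1) = 0.6
```

The weight of 8 belongs to a missing value, so it has to be excluded from the sum of weights as well as from the weighted sum.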


If the weights add up to 0, ``sum`` returns 0:

.. ipython:: python

data = xr.DataArray([1.0, 1.0])
weights = xr.DataArray([-1.0, 1.0])

data.weighted(weights).sum()

and ``mean`` returns ``NaN``:

.. ipython:: python

data.weighted(weights).mean()
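
A minimal plain-Python sketch of this behaviour, assuming that a zero sum of weights is treated as missing instead of raising a division error. The function names are hypothetical and the code is not taken from ``core/weighted.py``.

```python
import math


def weighted_sum(values, weights):
    """Sum of value * weight over all elements."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Weighted mean that returns NaN when the weights add up to 0."""
    sum_of_weights = sum(weights)
    # Assumption: a zero sum of weights is masked as NaN so that the
    # division below never sees a zero denominator.
    if sum_of_weights == 0:
        return math.nan
    return weighted_sum(values, weights) / sum_of_weights


data = [1.0, 1.0]
weights = [-1.0, 1.0]

print(weighted_sum(data, weights))   # 1.0 * -1.0 + 1.0 * 1.0 = 0.0
print(weighted_mean(data, weights))  # nan
```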


.. note::
``weights`` must be a :py:class:`DataArray` and cannot contain missing values.
Missing values can be replaced manually by ``weights.fillna(0)``.
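
The rule in the note can be pictured with a short plain-Python sketch. ``check_weights`` and ``fill_missing_weights`` are hypothetical helpers written for this illustration; they are not xarray functions, and the error message is an assumption.

```python
import math


def check_weights(weights):
    """Reject weights that contain missing values."""
    if any(math.isnan(w) for w in weights):
        raise ValueError("weights cannot contain missing values")


def fill_missing_weights(weights, fill_value=0.0):
    """Plain-Python counterpart of ``weights.fillna(0)``."""
    return [fill_value if math.isnan(w) else w for w in weights]


weights = [0.5, math.nan, 1.5]

cleaned = fill_missing_weights(weights)
check_weights(cleaned)  # passes: no missing values are left
print(cleaned)          # [0.5, 0.0, 1.5]
```

A weight of 0 means the corresponding data point contributes nothing to the weighted sum or to the sum of weights.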

.. _comput.coarsen:

Coarsen large arrays
====================

``DataArray`` and ``Dataset`` objects include a
:py:class:`DataArray` and :py:class:`Dataset` objects include a
:py:meth:`~xarray.DataArray.coarsen` and :py:meth:`~xarray.Dataset.coarsen`
methods. This supports the block aggregation along multiple dimensions,

1 change: 1 addition & 0 deletions doc/examples.rst
@@ -6,6 +6,7 @@ Examples

examples/weather-data
examples/monthly-means
examples/area_weighted_temperature
examples/multidimensional-coords
examples/visualization_gallery
examples/ROMS_ocean_model
226 changes: 226 additions & 0 deletions doc/examples/area_weighted_temperature.ipynb
@@ -0,0 +1,226 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"toc": true
},
"source": [
"<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Compare-weighted-and-unweighted-mean-temperature\" data-toc-modified-id=\"Compare-weighted-and-unweighted-mean-temperature-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Compare weighted and unweighted mean temperature</a></span><ul class=\"toc-item\"><li><ul class=\"toc-item\"><li><span><a href=\"#Data\" data-toc-modified-id=\"Data-1.0.1\"><span class=\"toc-item-num\">1.0.1&nbsp;&nbsp;</span>Data</a></span></li><li><span><a href=\"#Creating-weights\" data-toc-modified-id=\"Creating-weights-1.0.2\"><span class=\"toc-item-num\">1.0.2&nbsp;&nbsp;</span>Creating weights</a></span></li><li><span><a href=\"#Weighted-mean\" data-toc-modified-id=\"Weighted-mean-1.0.3\"><span class=\"toc-item-num\">1.0.3&nbsp;&nbsp;</span>Weighted mean</a></span></li><li><span><a href=\"#Plot:-comparison-with-unweighted-mean\" data-toc-modified-id=\"Plot:-comparison-with-unweighted-mean-1.0.4\"><span class=\"toc-item-num\">1.0.4&nbsp;&nbsp;</span>Plot: comparison with unweighted mean</a></span></li></ul></li></ul></li></ul></div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Compare weighted and unweighted mean temperature\n",
"\n",
"\n",
"Author: [Mathias Hauser](https://github.com/mathause/)\n",
"\n",
"\n",
"We use the `air_temperature` example dataset to calculate the area-weighted temperature over its domain. This dataset has a regular latitude/longitude grid, thus the grid cell area decreases towards the pole. For this grid we can use the cosine of the latitude as a proxy for the grid cell area.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2020-03-17T14:43:57.222351Z",
"start_time": "2020-03-17T14:43:56.147541Z"
}
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"\n",
"import cartopy.crs as ccrs\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"import xarray as xr"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data\n",
"\n",
"Load the data, convert to Celsius, and resample to daily values"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2020-03-17T14:43:57.831734Z",
"start_time": "2020-03-17T14:43:57.651845Z"
}
},
"outputs": [],
"source": [
"ds = xr.tutorial.load_dataset(\"air_temperature\")\n",
"\n",
"# to celsius\n",
"air = ds.air - 273.15\n",
"\n",
"# resample from 6-hourly to daily values\n",
"air = air.resample(time=\"D\").mean()\n",
"\n",
"air"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot the first timestep:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2020-03-17T14:43:59.887120Z",
"start_time": "2020-03-17T14:43:59.582894Z"
}
},
"outputs": [],
"source": [
"projection = ccrs.LambertConformal(central_longitude=-95, central_latitude=45)\n",
"\n",
"f, ax = plt.subplots(subplot_kw=dict(projection=projection))\n",
"\n",
"air.isel(time=0).plot(transform=ccrs.PlateCarree(), cbar_kwargs=dict(shrink=0.7))\n",
"ax.coastlines()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating weights\n",
"\n",
"For a rectangular grid the cosine of the latitude is proportional to the grid cell area."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2020-03-17T14:44:18.777092Z",
"start_time": "2020-03-17T14:44:18.736587Z"
}
},
"outputs": [],
"source": [
"weights = np.cos(np.deg2rad(air.lat))\n",
"weights.name = \"weights\"\n",
"weights"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Weighted mean"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2020-03-17T14:44:52.607120Z",
"start_time": "2020-03-17T14:44:52.564674Z"
}
},
"outputs": [],
"source": [
"air_weighted = air.weighted(weights)\n",
"air_weighted"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2020-03-17T14:44:54.334279Z",
"start_time": "2020-03-17T14:44:54.280022Z"
}
},
"outputs": [],
"source": [
"weighted_mean = air_weighted.mean((\"lon\", \"lat\"))\n",
"weighted_mean"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plot: comparison with unweighted mean\n",
"\n",
"Note how the weighted mean temperature is higher than the unweighted."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2020-03-17T14:45:08.877307Z",
"start_time": "2020-03-17T14:45:08.673383Z"
}
},
"outputs": [],
"source": [
"weighted_mean.plot(label=\"weighted\")\n",
"air.mean((\"lon\", \"lat\")).plot(label=\"unweighted\")\n",
"\n",
"plt.legend()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": true,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": true
}
},
"nbformat": 4,
"nbformat_minor": 4
}
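
The effect of the cosine-latitude weights in the notebook above can be sketched without xarray. The latitudes and temperatures below are made-up illustrative values, not taken from the `air_temperature` dataset.

```python
import math

# Hypothetical zonal-mean temperatures (°C) on a regular latitude grid.
lats = [75.0, 60.0, 45.0, 30.0, 15.0]
temps = [-25.0, -10.0, 5.0, 15.0, 25.0]

# The cosine of the latitude serves as a proxy for the grid cell area.
weights = [math.cos(math.radians(lat)) for lat in lats]

weighted_mean = sum(t * w for t, w in zip(temps, weights)) / sum(weights)
unweighted_mean = sum(temps) / len(temps)

print(f"weighted:   {weighted_mean:.2f}")
print(f"unweighted: {unweighted_mean:.2f}")
```

Cold high-latitude cells receive small weights, so with values like these the weighted mean comes out higher than the unweighted one, which matches the note in the notebook's final section.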
3 changes: 3 additions & 0 deletions doc/whats-new.rst
@@ -25,6 +25,9 @@ Breaking changes
New Features
~~~~~~~~~~~~

- Weighted array reductions are now supported via the new :py:meth:`DataArray.weighted`
and :py:meth:`Dataset.weighted` methods. See :ref:`comput.weighted`. (:issue:`422`, :pull:`2922`).
By `Mathias Hauser <https://github.com/mathause>`_
- Added support for :py:class:`pandas.DatetimeIndex`-style rounding of
``cftime.datetime`` objects directly via a :py:class:`CFTimeIndex` or via the
:py:class:`~core.accessor_dt.DatetimeAccessor`.