Commit fcc96c9

Merge remote-tracking branch 'upstream/master' into load_nonindex_coords
2 parents 32c9145 + 7611ed9

23 files changed: +442, -213 lines

ci/requirements-py36.yml

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ dependencies:
   - seaborn
   - toolz
   - rasterio
+  - bottleneck
   - pip:
     - coveralls
     - pytest-cov

doc/Makefile

Lines changed: 11 additions & 4 deletions
@@ -2,10 +2,11 @@
 #

 # You can set these variables from the command line.
-SPHINXOPTS    =
-SPHINXBUILD   = sphinx-build
-PAPER         =
-BUILDDIR      = _build
+SPHINXOPTS      =
+SPHINXBUILD     = sphinx-build
+SPHINXATUOBUILD = sphinx-autobuild
+PAPER           =
+BUILDDIR        = _build

 # Internal variables.
 PAPEROPT_a4     = -D latex_paper_size=a4
@@ -18,6 +19,7 @@ I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
 help:
 	@echo "Please use \`make <target>' where <target> is one of"
 	@echo "  html       to make standalone HTML files"
+	@echo "  livehtml   Make standalone HTML files and rebuild the documentation when a change is detected. Also includes a livereload enabled web server"
 	@echo "  dirhtml    to make HTML files named index.html in directories"
 	@echo "  singlehtml to make a single large HTML file"
 	@echo "  pickle     to make pickle files"
@@ -56,6 +58,11 @@ html:
 	@echo
 	@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."

+.PHONY: livehtml
+livehtml:
+	# @echo "$(SPHINXATUOBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html"
+	$(SPHINXATUOBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
+
 .PHONY: dirhtml
 dirhtml:
 	$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml

doc/dask.rst

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ dependency in a future version of xarray.
 For a full example of how to use xarray's dask integration, read the
 `blog post introducing xarray and dask`_.

-.. _blog post introducing xarray and dask: http://continuum.io/blog/xray-dask
+.. _blog post introducing xarray and dask: https://www.anaconda.com/blog/developer-blog/xray-dask-out-core-labeled-arrays-python/

 What is a dask array?
 ---------------------

doc/installing.rst

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ For accelerating xarray

 - `bottleneck <https://github.com/kwgoodman/bottleneck>`__: speeds up
   NaN-skipping and rolling window aggregations by a large factor
-  (1.0 or later)
+  (1.1 or later)
 - `cyordereddict <https://github.com/shoyer/cyordereddict>`__: speeds up most
   internal operations with xarray data structures (for python versions < 3.5)

doc/io.rst

Lines changed: 46 additions & 10 deletions
@@ -38,18 +38,22 @@ module:

     pickle.loads(pkl)

-Pickle support is important because it doesn't require any external libraries
+Pickling is important because it doesn't require any external libraries
 and lets you use xarray objects with Python modules like
-:py:mod:`multiprocessing`. However, there are two important caveats:
+:py:mod:`multiprocessing` or :ref:`Dask <dask>`. However, pickling is
+**not recommended for long-term storage**.

-1. To simplify serialization, xarray's support for pickle currently loads all
-   array values into memory before dumping an object. This means it is not
-   suitable for serializing datasets too big to load into memory (e.g., from
-   netCDF or OPeNDAP).
-2. Pickle will only work as long as the internal data structure of xarray objects
-   remains unchanged. Because the internal design of xarray is still being
-   refined, we make no guarantees (at this point) that objects pickled with
-   this version of xarray will work in future versions.
+Restoring a pickle requires that the internal structure of the types for the
+pickled data remain unchanged. Because the internal design of xarray is still
+being refined, we make no guarantees (at this point) that objects pickled with
+this version of xarray will work in future versions.
+
+.. note::
+
+  When pickling an object opened from a NetCDF file, the pickle file will
+  contain a reference to the file on disk. If you want to store the actual
+  array values, load it into memory first with :py:meth:`~xarray.Dataset.load`
+  or :py:meth:`~xarray.Dataset.compute`.

 .. _dictionary io:

@@ -384,6 +388,38 @@ over the network until we look at particular values:

 .. image:: _static/opendap-prism-tmax.png

+Some servers require authentication before we can access the data. For this
+purpose we can explicitly create a :py:class:`~xarray.backends.PydapDataStore`
+and pass in a `Requests`__ session object. For example for
+HTTP Basic authentication::
+
+    import xarray as xr
+    import requests
+
+    session = requests.Session()
+    session.auth = ('username', 'password')
+
+    store = xr.backends.PydapDataStore.open('http://example.com/data',
+                                            session=session)
+    ds = xr.open_dataset(store)
+
+`Pydap's cas module`__ has functions that generate custom sessions for
+servers that use CAS single sign-on. For example, to connect to servers
+that require NASA's URS authentication::
+
+    import xarray as xr
+    from pydata.cas.urs import setup_session
+
+    ds_url = 'https://gpm1.gesdisc.eosdis.nasa.gov/opendap/hyrax/example.nc'
+
+    session = setup_session('username', 'password', check_url=ds_url)
+    store = xr.backends.PydapDataStore.open(ds_url, session=session)
+
+    ds = xr.open_dataset(store)
+
+__ http://docs.python-requests.org
+__ http://pydap.readthedocs.io/en/latest/client.html#authentication
+
 .. _io.rasterio:

 Rasterio
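
To make the new pickling note above concrete, a minimal sketch (the file name
is a placeholder)::

    import pickle
    import xarray as xr

    ds = xr.open_dataset('example.nc')  # placeholder path; data loads lazily

    # A pickle of a lazily loaded dataset stores a reference to the file on
    # disk, so load the values into memory first to capture them.
    pkl = pickle.dumps(ds.load())
    restored = pickle.loads(pkl)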

doc/whats-new.rst

Lines changed: 42 additions & 8 deletions
@@ -23,14 +23,23 @@ Breaking changes

 - Supplying ``coords`` as a dictionary to the ``DataArray`` constructor without
   also supplying an explicit ``dims`` argument is no longer supported. This
-  behavior was deprecated in version 0.9 but is now an error (:issue:`727`).
+  behavior was deprecated in version 0.9 but will now raise an error
+  (:issue:`727`).
   By `Joe Hamman <https://github.com/jhamman>`_.

+- ``repr`` and the Jupyter Notebook won't automatically compute dask variables.
+  Datasets loaded with ``open_dataset`` won't automatically read coords from
+  disk when calling ``repr`` (:issue:`1522`).
+  By `Guido Imperiale <https://github.com/crusaderky>`_.
+
 Backward Incompatible Changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 - Old numpy < 1.11 and pandas < 0.18 are no longer supported (:issue:`1512`).
   By `Keisuke Fujii <https://github.com/fujiisoup>`_.
+- The minimum supported version bottleneck has increased to 1.1
+  (:issue:`1279`).
+  By `Joe Hamman <https://github.com/jhamman>`_.

 Enhancements
 ~~~~~~~~~~~~
@@ -58,7 +67,7 @@ Enhancements

 - More attributes available in :py:attr:`~xarray.Dataset.attrs` dictionary when
   raster files are opened with :py:func:`~xarray.open_rasterio`.
-  By `Greg Brener <https://github.com/gbrener>`_
+  By `Greg Brener <https://github.com/gbrener>`_.

 - Support for NetCDF files using an ``_Unsigned`` attribute to indicate that a
   a signed integer data type should be interpreted as unsigned bytes
@@ -97,23 +106,41 @@ Enhancements
   other means (:issue:`1459`).
   By `Ryan May <https://github.com/dopplershift>`_.

-- Support passing keyword arguments to ``load``, ``compute``, and ``persist``
-  methods. Any keyword arguments supplied to these methods are passed on to
-  the corresponding dask function (:issue:`1523`).
-  By `Joe Hamman <https://github.com/jhamman>`_.
+- Support passing keyword arguments to ``load``, ``compute``, and ``persist``
+  methods. Any keyword arguments supplied to these methods are passed on to
+  the corresponding dask function (:issue:`1523`).
+  By `Joe Hamman <https://github.com/jhamman>`_.
+
 - Encoding attributes are now preserved when xarray objects are concatenated.
   The encoding is copied from the first object (:issue:`1297`).
   By `Joe Hamman <https://github.com/jhamman>`_ and
-  `Gerrit Holl <https://github.com/gerritholl`_.
+  `Gerrit Holl <https://github.com/gerritholl>`_.
+
+- Changed :py:class:`~xarray.backends.PydapDataStore` to take a Pydap dataset.
+  This permits opening Opendap datasets that require authentication, by
+  instantiating a Pydap dataset with a session object. Also added
+  :py:meth:`xarray.backends.PydapDataStore.open` which takes a url and session
+  object (:issue:`1068`).
+  By `Philip Graae <https://github.com/mrpgraae>`_.
+
+- Support applying rolling window operations using bottleneck's moving window
+  functions on data stored as dask arrays (:issue:`1279`).
+  By `Joe Hamman <https://github.com/jhamman>`_.

 Bug fixes
 ~~~~~~~~~

+- Unsigned int support for reduce methods with skipna=True
+  (:issue:1562).
+  By `Keisuke Fujii` <https://github.com/fujiisoup>`_.
+
 - Fixes to ensure xarray works properly with the upcoming pandas 0.21 release:
+
   - Fix :py:meth:`~xarray.DataArray.isnull` method (:issue:`1549`).
   - :py:meth:`~xarray.DataArray.to_series` and
     :py:meth:`~xarray.Dataset.to_dataframe` should not return a ``pandas.MultiIndex``
     for 1D data (:issue:`1548`).
+  By `Stephan Hoyer <https://github.com/shoyer>`_.

 - :py:func:`~xarray.open_rasterio` method now shifts the rasterio
   coordinates so that they are centered in each pixel.
@@ -141,6 +168,9 @@ Bug fixes
   objects with data stored as ``dask`` arrays (:issue:`1529`).
   By `Joe Hamman <https://github.com/jhamman>`_.

+- ``:py:meth:`~xarray.Dataset.__init__` raises a ``MergeError`` if an
+  coordinate shares a name with a dimension but is comprised of arbitrary
+  dimensions(:issue:`1120`).
 - :py:func:`~xarray.open_rasterio` method now skips rasterio.crs -attribute if
   it is none.
   By `Leevi Annala <https://github.com/leevei>`_.
@@ -149,6 +179,10 @@ Bug fixes
   provided (:issue:`1410`).
   By `Joe Hamman <https://github.com/jhamman>`_.

+- Fix :py:func:`xarray.save_mfdataset` to properly raise an informative error
+  when objects other than ``Dataset`` are provided (:issue:`1555`).
+  By `Joe Hamman <https://github.com/jhamman>`_.
+
 .. _whats-new.0.9.6:

 v0.9.6 (8 June 2017)
@@ -1433,7 +1467,7 @@ a collection of netCDF (using dask) as a single ``xray.Dataset`` object. For
 more on dask, read the `blog post introducing xray + dask`_ and the new
 documentation section :doc:`dask`.

-.. _blog post introducing xray + dask: http://continuum.io/blog/xray-dask
+.. _blog post introducing xray + dask: https://www.anaconda.com/blog/developer-blog/xray-dask-out-core-labeled-arrays-python/

 Dask makes it possible to harness parallelism and manipulate gigantic datasets
 with xray. It is currently an optional dependency, but it may become required

xarray/backends/api.py

Lines changed: 7 additions & 3 deletions
@@ -2,15 +2,14 @@
 from __future__ import division
 from __future__ import print_function
 import os.path
-from distutils.version import LooseVersion
 from glob import glob
 from io import BytesIO
 from numbers import Number


 import numpy as np

-from .. import backends, conventions
+from .. import backends, conventions, Dataset
 from .common import ArrayWriter, GLOBAL_LOCK
 from ..core import indexing
 from ..core.combine import auto_combine
@@ -289,7 +288,7 @@ def maybe_decode_store(store, lock=False):
         store = backends.ScipyDataStore(filename_or_obj,
                                         autoclose=autoclose)
     elif engine == 'pydap':
-        store = backends.PydapDataStore(filename_or_obj)
+        store = backends.PydapDataStore.open(filename_or_obj)
     elif engine == 'h5netcdf':
         store = backends.H5NetCDFStore(filename_or_obj, group=group,
                                        autoclose=autoclose)
@@ -656,6 +655,11 @@ def save_mfdataset(datasets, paths, mode='w', format=None, groups=None,
         raise ValueError("cannot use mode='w' when writing multiple "
                          'datasets to the same path')

+    for obj in datasets:
+        if not isinstance(obj, Dataset):
+            raise TypeError('save_mfdataset only supports writing Dataset '
+                            'objects, recieved type %s' % type(obj))
+
     if groups is None:
         groups = [None] * len(datasets)
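
For illustration, a minimal sketch of how the new ``save_mfdataset`` guard
behaves (the output file names are placeholders)::

    import xarray as xr

    ds = xr.Dataset({'a': ('x', [1, 2, 3])})

    # Writing Dataset objects works as before.
    xr.save_mfdataset([ds], ['out.nc'])

    # Anything else (e.g. a DataArray) now raises the informative TypeError
    # added above instead of failing later in the writer.
    try:
        xr.save_mfdataset([ds['a']], ['bad.nc'])
    except TypeError as e:
        print(e)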

xarray/backends/pydap_.py

Lines changed: 12 additions & 2 deletions
@@ -60,9 +60,19 @@ class PydapDataStore(AbstractDataStore):
     This store provides an alternative way to access OpenDAP datasets that may
     be useful if the netCDF4 library is not available.
     """
-    def __init__(self, url):
+    def __init__(self, ds):
+        """
+        Parameters
+        ----------
+        ds : pydap DatasetType
+        """
+        self.ds = ds
+
+    @classmethod
+    def open(cls, url, session=None):
         import pydap.client
-        self.ds = pydap.client.open_url(url)
+        ds = pydap.client.open_url(url, session=session)
+        return cls(ds)

     def open_store_variable(self, var):
         data = indexing.LazilyIndexedArray(PydapArrayWrapper(var))
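
A short sketch of the revised contract (the server URL is a placeholder): the
constructor now accepts an existing pydap dataset, while the new ``open``
classmethod keeps the old URL-based convenience::

    import pydap.client
    import xarray as xr

    # Instantiate the pydap dataset ourselves, e.g. with a custom session...
    pydap_ds = pydap.client.open_url('http://example.com/data')
    store = xr.backends.PydapDataStore(pydap_ds)
    ds = xr.open_dataset(store)

    # ...or let the classmethod open the URL (optionally with session=...).
    store = xr.backends.PydapDataStore.open('http://example.com/data')
    ds = xr.open_dataset(store)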

xarray/core/common.py

Lines changed: 27 additions & 0 deletions
@@ -474,6 +474,33 @@ def rolling(self, min_periods=None, center=False, **windows):
         Returns
         -------
         rolling : type of input argument
+
+        Examples
+        --------
+        Create rolling seasonal average of monthly data e.g. DJF, JFM, ..., SON:
+
+        >>> da = xr.DataArray(np.linspace(0,11,num=12),
+        ...                   coords=[pd.date_range('15/12/1999',
+        ...                           periods=12, freq=pd.DateOffset(months=1))],
+        ...                   dims='time')
+        >>> da
+        <xarray.DataArray (time: 12)>
+        array([  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,  11.])
+        Coordinates:
+          * time     (time) datetime64[ns] 1999-12-15 2000-01-15 2000-02-15 ...
+        >>> da.rolling(time=3).mean()
+        <xarray.DataArray (time: 12)>
+        array([ nan,  nan,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.])
+        Coordinates:
+          * time     (time) datetime64[ns] 1999-12-15 2000-01-15 2000-02-15 ...
+
+        Remove the NaNs using ``dropna()``:
+
+        >>> da.rolling(time=3).mean().dropna('time')
+        <xarray.DataArray (time: 10)>
+        array([  1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.])
+        Coordinates:
+          * time     (time) datetime64[ns] 2000-02-15 2000-03-15 2000-04-15 ...
         """

         return self._rolling_cls(self, min_periods=min_periods,

xarray/core/dask_array_ops.py

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+"""Define core operations for xarray objects.
+"""
+import numpy as np
+
+try:
+    import dask.array as da
+except ImportError:
+    pass
+
+
+def dask_rolling_wrapper(moving_func, a, window, min_count=None, axis=-1):
+    '''wrapper to apply bottleneck moving window funcs on dask arrays'''
+    # inputs for ghost
+    if axis < 0:
+        axis = a.ndim + axis
+    depth = {d: 0 for d in range(a.ndim)}
+    depth[axis] = window - 1
+    boundary = {d: np.nan for d in range(a.ndim)}
+    # create ghosted arrays
+    ag = da.ghost.ghost(a, depth=depth, boundary=boundary)
+    # apply rolling func
+    out = ag.map_blocks(moving_func, window, min_count=min_count,
+                        axis=axis, dtype=a.dtype)
+    # trim array
+    result = da.ghost.trim_internal(out, depth)
+    return result
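
A minimal sketch of calling this wrapper with one of bottleneck's
moving-window functions, assuming the dask version contemporary with this
commit (``da.ghost`` was later renamed to ``da.overlap``); the array shape and
chunking are illustrative::

    import bottleneck
    import dask.array as da
    import numpy as np

    from xarray.core.dask_array_ops import dask_rolling_wrapper

    # Ghosting gives each block window - 1 neighboring values, so the moving
    # mean stays correct across chunk boundaries.
    a = da.from_array(np.arange(20, dtype=float), chunks=5)
    result = dask_rolling_wrapper(bottleneck.move_mean, a, window=3,
                                  min_count=1)
    print(result.compute())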

xarray/core/dataarray.py

Lines changed: 2 additions & 2 deletions
@@ -447,8 +447,8 @@ def _level_coords(self):
         """
         level_coords = OrderedDict()
         for cname, var in self._coords.items():
-            if var.ndim == 1:
-                level_names = var.to_index_variable().level_names
+            if var.ndim == 1 and isinstance(var, IndexVariable):
+                level_names = var.level_names
                 if level_names is not None:
                     dim, = var.dims
                     level_coords.update({lname: dim for lname in level_names})
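
A sketch of why the extra ``isinstance`` check matters (the data here is
illustrative): only coordinates already backed by an ``IndexVariable`` can
carry MultiIndex level names, so probing other 1-D coordinates via
``to_index_variable()`` would needlessly compute them, e.g. lazy coordinates
behind ``open_dataset`` whenever ``repr`` is taken (see :issue:`1522` above)::

    import numpy as np
    import xarray as xr

    da = xr.DataArray(np.zeros((2, 2)), dims=('x', 'y'),
                      coords={'x': [0, 1],                  # IndexVariable
                              'lat': ('x', [10.0, 20.0])})  # non-index coord

    # _level_coords now skips 'lat' entirely; before this change it was
    # round-tripped through to_index_variable(), forcing lazy data to load.
    print(repr(da))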
