xarray to and from Iris #1750

Merged
merged 35 commits into from
Dec 20, 2017
f7842b3
Added a first go at a converter from xarray.DataArrays objects to Iri…
nparley Apr 1, 2016
e0498c5
Update tests to use original.coords and add extra tests for coord att…
nparley Apr 1, 2016
c261845
Remove set literals just in case. Replace has_key with in. Use AuxCoo…
nparley Apr 2, 2016
edae053
Create dimensions if the Iris cube does not have any
nparley Apr 4, 2016
cd92bca
Add code to convert cell_methods
nparley Apr 8, 2016
44930af
Don't append blank cell method
nparley Apr 11, 2016
6bed306
Update cell method code to use internal Iris functions. Also add tests.
nparley Apr 20, 2016
cd06a2e
Update the API for IRIS change
nparley Apr 21, 2016
15fbbba
Merge remote-tracking branch 'upstream/master'
nparley May 25, 2016
e7f9cb1
Move helper functions outside of main functions
nparley May 25, 2016
03e9076
Merge remote-tracking branch 'upstream/master'
nparley May 26, 2016
877d06f
Update to build dims with mix of Dimension and Auxiliary coordinates
nparley Aug 9, 2016
f48de5a
Merge remote-tracking branch 'upstream/master'
nparley Aug 9, 2016
e42aeb2
Fix import after merge
nparley Aug 9, 2016
338ef6b
Bug fix / refactoring
nparley Aug 10, 2016
46f68ff
Change the dencode_cf method and raise error if coord has no var_name
nparley Aug 11, 2016
bfb2d16
Merge branch 'master' of https://github.com/nparley/xarray into iris_…
duncanwp Nov 30, 2017
edcb3fd
Add missing test and two minor fixes
duncanwp Nov 30, 2017
fb724da
Replacing function calls with set literals
duncanwp Nov 30, 2017
d510244
Adding what's new and updating api.rst
duncanwp Nov 30, 2017
208d129
Adding from_cube method to docs
duncanwp Dec 1, 2017
5d933a8
Make Iris dependency >=1.10 so we can rely on the newer parse_cell_me…
duncanwp Dec 5, 2017
05760aa
Adding Iris section to the I/O docs
duncanwp Dec 5, 2017
90ada4b
Fixing typos
duncanwp Dec 6, 2017
ebf3820
Improving error message
duncanwp Dec 6, 2017
0144215
Test edge cases
duncanwp Dec 6, 2017
3a4fc63
Preserve dask data arrays if at all possible. Convert to (dask) maske…
duncanwp Dec 6, 2017
95b0197
Updates to remove the hard dependency on dask, and to test the conver…
duncanwp Dec 18, 2017
239958b
Merge branch 'master' of https://github.com/pydata/xarray into iris_c…
duncanwp Dec 18, 2017
1470bc5
Minor doc fixes
duncanwp Dec 18, 2017
b457c74
Minor typo
duncanwp Dec 18, 2017
441a84f
Use dask_array_type for dask checks
duncanwp Dec 18, 2017
0dce7f3
Updating to run the iris docs
duncanwp Dec 18, 2017
301b2ef
Fix typo
duncanwp Dec 18, 2017
b62367a
Fixing long lines
duncanwp Dec 19, 2017
2 changes: 1 addition & 1 deletion .travis.yml
@@ -13,7 +13,7 @@ matrix:
   - python: 2.7
     env: CONDA_ENV=py27-min
   - python: 2.7
-    env: CONDA_ENV=py27-cdat+pynio
+    env: CONDA_ENV=py27-cdat+iris+pynio
   - python: 3.4
     env: CONDA_ENV=py34
   - python: 3.5
@@ -22,6 +22,7 @@ dependencies:
   - seaborn
   - toolz
   - rasterio
+  - iris>=1.10
   - zarr
   - pip:
     - coveralls
2 changes: 2 additions & 0 deletions doc/api.rst
@@ -448,6 +448,8 @@ DataArray methods
    DataArray.to_index
    DataArray.to_masked_array
    DataArray.to_cdms2
+   DataArray.to_iris

Review comment (Member): also note the from_iris method

Reply (Contributor Author): Done

+   DataArray.from_iris
    DataArray.to_dict
    DataArray.from_series
    DataArray.from_cdms2
1 change: 1 addition & 0 deletions doc/environment.yml
@@ -17,3 +17,4 @@ dependencies:
   - rasterio=0.36.0
   - sphinx-gallery
   - zarr
+  - iris
32 changes: 32 additions & 0 deletions doc/io.rst
@@ -338,6 +338,38 @@ supported by netCDF4-python: 'standard', 'gregorian', 'proleptic_gregorian' 'nol
 By default, xarray uses the 'proleptic_gregorian' calendar and units of the smallest time
 difference between values, with a reference time of the first time value.

+.. _io.iris:
+
+Iris
+----
+
+The Iris_ tool allows easy reading of common meteorological and climate model formats
+(including GRIB and UK Met Office PP files) into ``Cube`` objects, which are in many ways very
+similar to ``DataArray`` objects, while enforcing a CF-compliant data model. If iris is
+installed, xarray can convert a ``DataArray`` into a ``Cube`` using
+:py:meth:`~xarray.DataArray.to_iris`:
+
+.. ipython:: python
+
+    da = xr.DataArray(np.random.rand(4, 5), dims=['x', 'y'],
+                      coords=dict(x=[10, 20, 30, 40],
+                                  y=pd.date_range('2000-01-01', periods=5)))
+
+    cube = da.to_iris()
+    cube
+
+Conversely, we can create a new ``DataArray`` object from a ``Cube`` using
+:py:meth:`~xarray.DataArray.from_iris`:
+
+.. ipython:: python
+
+    da_cube = xr.DataArray.from_iris(cube)
+    da_cube
+
+
+.. _Iris: http://scitools.org.uk/iris
+
+
 OPeNDAP
 -------
4 changes: 4 additions & 0 deletions doc/whats-new.rst
@@ -25,6 +25,9 @@ Enhancements
   1D co-ordinate (e.g. time) and a 2D co-ordinate (e.g. depth as a function of
   time) (:issue:`1737`).
   By `Deepak Cherian <https://github.com/dcherian>`_.
+- Added :py:meth:`DataArray.to_iris <xarray.DataArray.to_iris>` and :py:meth:`DataArray.from_iris <xarray.DataArray.from_iris>` for
+  converting data arrays to and from Iris_ Cubes with the same data and coordinates (:issue:`621` and :issue:`37`).
+  By `Neil Parley <https://github.com/nparley>`_ and `Duncan Watson-Parris <https://github.com/duncanwp>`_.
 - Use ``pandas.Grouper`` class in xarray resample methods rather than the
   deprecated ``pandas.TimeGrouper`` class (:issue:`1766`).
   By `Joe Hamman <https://github.com/jhamman>`_.
@@ -36,6 +39,7 @@ Enhancements

 .. _Zarr: http://zarr.readthedocs.io/

+.. _Iris: http://scitools.org.uk/iris

 Bug fixes
 ~~~~~~~~~
182 changes: 172 additions & 10 deletions xarray/convert.py
@@ -7,24 +7,42 @@
 import numpy as np

 from .core.dataarray import DataArray
+from .core.pycompat import OrderedDict, range
+from .core.dtypes import get_fill_value
 from .conventions import (
     maybe_encode_timedelta, maybe_encode_datetime, decode_cf)

-ignored_attrs = set(['name', 'tileIndex'])
+cdms2_ignored_attrs = {'name', 'tileIndex'}
+iris_forbidden_keys = {'standard_name', 'long_name', 'units', 'bounds', 'axis',
+                       'calendar', 'leap_month', 'leap_year', 'month_lengths',
+                       'coordinates', 'grid_mapping', 'climatology',
+                       'cell_methods', 'formula_terms', 'compress',
+                       'missing_value', 'add_offset', 'scale_factor',
+                       'valid_max', 'valid_min', 'valid_range', '_FillValue'}
+cell_methods_strings = {'point', 'sum', 'maximum', 'median', 'mid_range',
+                        'minimum', 'mean', 'mode', 'standard_deviation',
+                        'variance'}
+
+
+def encode(var):
+    return maybe_encode_timedelta(maybe_encode_datetime(var.variable))
+
+
+def _filter_attrs(attrs, ignored_attrs):
+    """ Return attrs that are not in ignored_attrs
+    """
+    return dict((k, v) for k, v in attrs.items() if k not in ignored_attrs)


 def from_cdms2(variable):
     """Convert a cdms2 variable into a DataArray
     """
-    def get_cdms2_attrs(var):
-        return dict((k, v) for k, v in var.attributes.items()
-                    if k not in ignored_attrs)
-
     values = np.asarray(variable)
     name = variable.id
-    coords = [(v.id, np.asarray(v), get_cdms2_attrs(v))
+    coords = [(v.id, np.asarray(v),
+               _filter_attrs(v.attributes, cdms2_ignored_attrs))
               for v in variable.getAxisList()]
-    attrs = get_cdms2_attrs(variable)
+    attrs = _filter_attrs(variable.attributes, cdms2_ignored_attrs)
     dataarray = DataArray(values, coords=coords, name=name, attrs=attrs)
     return decode_cf(dataarray.to_dataset())[dataarray.name]

@@ -35,9 +53,6 @@ def to_cdms2(dataarray):
     # we don't want cdms2 to be a hard dependency
     import cdms2

-    def encode(var):
-        return maybe_encode_timedelta(maybe_encode_datetime(var.variable))
-
     def set_cdms2_attrs(var, attrs):
         for k, v in attrs.items():
             setattr(var, k, v)
@@ -53,3 +68,150 @@ def set_cdms2_attrs(var, attrs):
     cdms2_var = cdms2.createVariable(var.values, axes=axes, id=dataarray.name)
     set_cdms2_attrs(cdms2_var, var.attrs)
     return cdms2_var


+def _pick_attrs(attrs, keys):
+    """ Return attrs with keys in keys list
+    """
+    return dict((k, v) for k, v in attrs.items() if k in keys)
+
+
+def _get_iris_args(attrs):
+    """ Converts the xarray attrs into args that can be passed into Iris
+    """
+    # iris.unit is deprecated in Iris v1.9
+    import cf_units
+    args = {'attributes': _filter_attrs(attrs, iris_forbidden_keys)}
+    args.update(_pick_attrs(attrs, ('standard_name', 'long_name',)))
+    unit_args = _pick_attrs(attrs, ('calendar',))
+    if 'units' in attrs:
+        args['units'] = cf_units.Unit(attrs['units'], **unit_args)
+    return args
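
The attribute split performed by `_get_iris_args` can be sketched without Iris or cf_units installed. In this sketch `split_attrs` and the shortened `FORBIDDEN_KEYS` set are hypothetical stand-ins, and the unit is kept as a plain tuple rather than a `cf_units.Unit`:

```python
# Hypothetical, shortened stand-in for iris_forbidden_keys in the diff above.
FORBIDDEN_KEYS = {'standard_name', 'long_name', 'units', 'calendar',
                  'cell_methods', '_FillValue'}


def filter_attrs(attrs, ignored):
    """Return the attrs whose keys are NOT in ignored."""
    return {k: v for k, v in attrs.items() if k not in ignored}


def pick_attrs(attrs, keys):
    """Return the attrs whose keys ARE in keys."""
    return {k: v for k, v in attrs.items() if k in keys}


def split_attrs(attrs):
    """Mimic the shape of _get_iris_args without importing cf_units."""
    args = {'attributes': filter_attrs(attrs, FORBIDDEN_KEYS)}
    args.update(pick_attrs(attrs, ('standard_name', 'long_name')))
    if 'units' in attrs:
        # The code in the diff builds a cf_units.Unit here; a
        # (units, calendar) tuple stands in for it in this sketch.
        args['units'] = (attrs['units'], attrs.get('calendar'))
    return args


example = split_attrs({'standard_name': 'air_temperature',
                       'units': 'K',
                       'history': 'created for illustration'})
print(example)
```

CF-reserved keys are kept out of the free-form `attributes` dictionary, while `standard_name`, `long_name` and `units` are passed on as dedicated keyword arguments.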


+# TODO: Add converting bounds from xarray to Iris and back
+def to_iris(dataarray):
+    """ Convert a DataArray into an Iris Cube
+    """
+    # Iris not a hard dependency
+    import iris
+    from iris.fileformats.netcdf import parse_cell_methods
+    from xarray.core.pycompat import dask_array_type
+
+    dim_coords = []
+    aux_coords = []
+
+    for coord_name in dataarray.coords:
+        coord = encode(dataarray.coords[coord_name])
+        coord_args = _get_iris_args(coord.attrs)
+        coord_args['var_name'] = coord_name
+        axis = None
+        if coord.dims:
+            axis = dataarray.get_axis_num(coord.dims)
+        if coord_name in dataarray.dims:
+            iris_coord = iris.coords.DimCoord(coord.values, **coord_args)
+            dim_coords.append((iris_coord, axis))
+        else:
+            iris_coord = iris.coords.AuxCoord(coord.values, **coord_args)
+            aux_coords.append((iris_coord, axis))
+
+    args = _get_iris_args(dataarray.attrs)
+    args['var_name'] = dataarray.name
+    args['dim_coords_and_dims'] = dim_coords
+    args['aux_coords_and_dims'] = aux_coords
+    if 'cell_methods' in dataarray.attrs:
+        args['cell_methods'] = \
+            parse_cell_methods(dataarray.attrs['cell_methods'])
+
+    # Create the right type of masked array (should be easier after #1769)
+    if isinstance(dataarray.data, dask_array_type):
+        from dask.array import ma as dask_ma
+        masked_data = dask_ma.masked_invalid(dataarray)
+    else:
+        masked_data = np.ma.masked_invalid(dataarray)
+
+    cube = iris.cube.Cube(masked_data, **args)
+
+    return cube
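
The data handling in `to_iris`, together with its counterpart in `from_iris` further down this diff, amounts to a round trip between NaN and masked values. A minimal NumPy-only sketch, assuming float data and assuming NaN as the fill value on the way back (the diff obtains the fill value from `get_fill_value`):

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0])

# to_iris direction: NaN entries become masked entries.
masked = np.ma.masked_invalid(values)

# from_iris direction: masked entries are filled again, here with NaN.
restored = np.ma.filled(masked, np.nan)

print(masked.mask)
print(restored)
```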


+def _iris_obj_to_attrs(obj):
+    """ Return a dictionary of attrs when given an Iris object
+    """
+    attrs = {'standard_name': obj.standard_name,
+             'long_name': obj.long_name}
+    if obj.units.calendar:
+        attrs['calendar'] = obj.units.calendar
+    if obj.units.origin != '1':
+        attrs['units'] = obj.units.origin
+    attrs.update(obj.attributes)
+    return dict((k, v) for k, v in attrs.items() if v is not None)
+
+
+def _iris_cell_methods_to_str(cell_methods_obj):
+    """ Convert Iris cell methods into a string
+    """
+    cell_methods = []
+    for cell_method in cell_methods_obj:
+        names = ''.join(['{}: '.format(n) for n in cell_method.coord_names])
+        intervals = ' '.join(['interval: {}'.format(interval)
+                              for interval in cell_method.intervals])
+        comments = ' '.join(['comment: {}'.format(comment)
+                             for comment in cell_method.comments])
+        extra = ' '.join([intervals, comments]).strip()
+        if extra:
+            extra = ' ({})'.format(extra)
+        cell_methods.append(names + cell_method.method + extra)
+    return ' '.join(cell_methods)
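
To see what string `_iris_cell_methods_to_str` produces, the same steps can be run against a stand-in object. The `CellMethod` namedtuple below is hypothetical and carries only the four attributes the function reads; it is not the Iris class:

```python
from collections import namedtuple

# Hypothetical stand-in for an Iris cell method, carrying only the
# attributes that the conversion reads.
CellMethod = namedtuple('CellMethod',
                        ['method', 'coord_names', 'intervals', 'comments'])


def cell_methods_to_str(cell_methods_obj):
    """Same string-building steps as _iris_cell_methods_to_str."""
    cell_methods = []
    for cell_method in cell_methods_obj:
        names = ''.join('{}: '.format(n) for n in cell_method.coord_names)
        intervals = ' '.join('interval: {}'.format(i)
                             for i in cell_method.intervals)
        comments = ' '.join('comment: {}'.format(c)
                            for c in cell_method.comments)
        extra = ' '.join([intervals, comments]).strip()
        if extra:
            extra = ' ({})'.format(extra)
        cell_methods.append(names + cell_method.method + extra)
    return ' '.join(cell_methods)


methods = (CellMethod('mean', ('time',), ('1 hour',), ()),
           CellMethod('maximum', ('height',), (), ()))
result = cell_methods_to_str(methods)
print(result)  # time: mean (interval: 1 hour) height: maximum
```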


+def from_iris(cube):
+    """ Convert an Iris cube into a DataArray
+    """
+    import iris.exceptions
+    from xarray.core.pycompat import dask_array_type
+
+    name = cube.var_name
+    dims = []
+    for i in range(cube.ndim):
+        try:
+            dim_coord = cube.coord(dim_coords=True, dimensions=(i,))
+            dims.append(dim_coord.var_name)
+        except iris.exceptions.CoordinateNotFoundError:
+            dims.append("dim_{}".format(i))
+
+    coords = OrderedDict()
+
+    for coord in cube.coords():
+        coord_attrs = _iris_obj_to_attrs(coord)
+        coord_dims = [dims[i] for i in cube.coord_dims(coord)]
+        if not coord.var_name:
+            raise ValueError("Coordinate '{}' has no "
+                             "var_name attribute".format(coord.name()))
+        if coord_dims:
+            coords[coord.var_name] = (coord_dims, coord.points, coord_attrs)
+        else:
+            coords[coord.var_name] = ((),
+                                      np.asscalar(coord.points), coord_attrs)
+
+    array_attrs = _iris_obj_to_attrs(cube)
+    cell_methods = _iris_cell_methods_to_str(cube.cell_methods)
+    if cell_methods:
+        array_attrs['cell_methods'] = cell_methods
+
+    # Deal with iris 1.* and 2.*
+    cube_data = cube.core_data() if hasattr(cube, 'core_data') else cube.data
+
+    # Deal with dask and numpy masked arrays
+    if isinstance(cube_data, dask_array_type):
+        from dask.array import ma as dask_ma
+        filled_data = dask_ma.filled(cube_data, get_fill_value(cube.dtype))
+    elif isinstance(cube_data, np.ma.MaskedArray):
+        filled_data = np.ma.filled(cube_data, get_fill_value(cube.dtype))
+    else:
+        filled_data = cube_data
+
+    dataarray = DataArray(filled_data, coords=coords, name=name,
+                          attrs=array_attrs, dims=dims)
+    decoded_ds = decode_cf(dataarray._to_temp_dataset())
+    return dataarray._from_temp_dataset(decoded_ds)
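
The dimension-naming loop at the top of `from_iris` falls back to `dim_<i>` when a cube axis has no dimension coordinate. A sketch of that rule with a plain dict in place of the cube; `name_dims` and `dim_coord_names` are hypothetical, and `KeyError` stands in for `CoordinateNotFoundError`:

```python
def name_dims(ndim, dim_coord_names):
    """Name each axis from its dimension coordinate, else 'dim_<i>'.

    dim_coord_names maps an axis index to the var_name of the dimension
    coordinate on that axis (a stand-in for the cube lookup).
    """
    dims = []
    for i in range(ndim):
        try:
            dims.append(dim_coord_names[i])
        except KeyError:
            # No dimension coordinate on this axis.
            dims.append('dim_{}'.format(i))
    return dims


dims = name_dims(3, {0: 'time', 2: 'longitude'})
print(dims)  # ['time', 'dim_1', 'longitude']
```

Generating a name for every axis means each auxiliary coordinate can be attached to named dimensions even when the cube has anonymous axes.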
13 changes: 13 additions & 0 deletions xarray/core/dataarray.py
@@ -1539,6 +1539,19 @@ def from_cdms2(cls, variable):
         from ..convert import from_cdms2
         return from_cdms2(variable)

+    def to_iris(self):
+        """Convert this array into an iris.cube.Cube
+        """
+        from ..convert import to_iris
+        return to_iris(self)
+
+    @classmethod
+    def from_iris(cls, cube):
+        """Convert an iris.cube.Cube into an xarray.DataArray
+        """
+        from ..convert import from_iris
+        return from_iris(cube)
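
Both new methods import from `..convert` inside the method body, and `to_iris` in `convert.py` in turn imports `iris` only when it is called, so Iris stays an optional dependency. A generic sketch of that deferred-import pattern; `to_optional` is hypothetical, the standard-library `json` module stands in for the optional package, and the explicit `ImportError` message is an addition that the diff does not include:

```python
def to_optional(data):
    """Convert data using a library that is imported only on demand."""
    # Importing here, not at module level, means that merely importing this
    # module never fails when the optional dependency is missing.
    try:
        import json  # stand-in for an optional package such as iris
    except ImportError:
        raise ImportError('the optional dependency is required for '
                          'to_optional()')
    return json.dumps(data)


text = to_optional({'x': [10, 20]})
print(text)  # {"x": [10, 20]}
```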

     def _all_compat(self, other, compat_str):
         """Helper function for equals and identical"""
