xarray to and from Iris #1750
Changes from 32 commits
@@ -22,6 +22,7 @@ dependencies:
   - seaborn
   - toolz
   - rasterio
+  - iris>=1.10
   - zarr
   - pip:
     - coveralls
@@ -338,6 +338,36 @@ supported by netCDF4-python: 'standard', 'gregorian', 'proleptic_gregorian' 'nol
By default, xarray uses the 'proleptic_gregorian' calendar and units of the smallest time
difference between values, with a reference time of the first time value.

.. _io.iris:

Iris
----

The Iris_ tool allows easy reading of common meteorological and climate model formats
(including GRIB and UK MetOffice PP files) into ``Cube`` objects, which are in many ways very
similar to ``DataArray`` objects, while enforcing a CF-compliant data model. If iris is
installed, xarray can convert a ``Cube`` into a ``DataArray`` using
:py:meth:`~xarray.DataArray.from_iris`:

.. ipython:: python
    :verbatim:
Reviewer: I would definitely prefer adding Iris to our docs builds if that's feasible. See: ...

Author: OK, I'll give it a shot. How does the environment work? Are all the examples run in the same interpreter session? And are there pre-built DataArrays I can just use as examples?

Reviewer: We have ... I'm OK without actually printing output here (maybe for now), but in that case I would set things up a little differently so ``verbatim`` prints it properly. Basically, don't include the line that would print the repr for the cube/array unless you would actually make the Python objects.
    da_cube = xr.DataArray.from_iris(cube)
    da_cube
Reviewer: Are these examples working? If you'd like this code to run at build time you'll have to add iris to the doc build environment and define the objects used in the examples. Alternatively, you can use the ``:verbatim:`` option.
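Since ``cube`` is not defined in the snippet above, here is a minimal editorial sketch (not part of the diff) of how a small Iris cube could be built and converted; the names and values are illustrative, and it assumes the converter is exposed as ``xr.DataArray.from_iris`` as in the docs above:

    import numpy as np
    import iris
    import xarray as xr

    # Hypothetical, self-contained example: build a small Cube by hand.
    # var_name is set on the coordinates because from_iris requires it.
    lat = iris.coords.DimCoord([10., 20., 30.], var_name='lat', units='degrees')
    lon = iris.coords.DimCoord([0., 90., 180., 270.], var_name='lon', units='degrees')
    cube = iris.cube.Cube(np.random.rand(3, 4), var_name='temperature',
                          dim_coords_and_dims=[(lat, 0), (lon, 1)])

    da_cube = xr.DataArray.from_iris(cube)  # DataArray named 'temperature', dims ('lat', 'lon')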
Conversely, we can create a new ``Cube`` object from a ``DataArray`` using
:py:meth:`~xarray.DataArray.to_iris`:

.. ipython:: python
    :verbatim:

    cube = da.to_iris()
    cube

.. _Iris: http://scitools.org.uk/iris
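A hedged, self-contained sketch of the reverse direction (again not part of the diff; the variable name, attributes, and values are made up for illustration):

    import numpy as np
    import xarray as xr

    # Hypothetical DataArray carrying CF-style metadata.
    da = xr.DataArray(np.random.rand(3, 4),
                      coords={'lat': [10., 20., 30.], 'lon': [0., 90., 180., 270.]},
                      dims=('lat', 'lon'), name='precip',
                      attrs={'units': 'mm', 'standard_name': 'precipitation_amount'})

    cube = da.to_iris()                            # an iris.cube.Cube
    da_round_trip = xr.DataArray.from_iris(cube)   # back to an xarray.DataArray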
OPeNDAP
-------
@@ -7,24 +7,39 @@
import numpy as np

from .core.dataarray import DataArray
from .core.pycompat import OrderedDict, range
from .core.dtypes import get_fill_value
from .conventions import (
-    maybe_encode_timedelta, maybe_encode_datetime, decode_cf)
+    maybe_encode_timedelta, maybe_encode_datetime, decode_cf, decode_cf_variable)

-ignored_attrs = set(['name', 'tileIndex'])
+cdms2_ignored_attrs = {'name', 'tileIndex'}
iris_forbidden_keys = {'standard_name', 'long_name', 'units', 'bounds', 'axis', 'calendar', 'leap_month', 'leap_year',
                       'month_lengths', 'coordinates', 'grid_mapping', 'climatology', 'cell_methods', 'formula_terms',
                       'compress', 'missing_value', 'add_offset', 'scale_factor', 'valid_max', 'valid_min',
                       'valid_range', '_FillValue'}
cell_methods_strings = {'point', 'sum', 'maximum', 'median', 'mid_range', 'minimum', 'mean', 'mode',
                        'standard_deviation', 'variance'}


def encode(var):
    return maybe_encode_timedelta(maybe_encode_datetime(var.variable))


def _filter_attrs(attrs, ignored_attrs):
    """ Return attrs that are not in ignored_attrs
    """
    return dict((k, v) for k, v in attrs.items() if k not in ignored_attrs)


def from_cdms2(variable):
    """Convert a cdms2 variable into a DataArray
    """
-    def get_cdms2_attrs(var):
-        return dict((k, v) for k, v in var.attributes.items()
-                    if k not in ignored_attrs)

    values = np.asarray(variable)
    name = variable.id
-    coords = [(v.id, np.asarray(v), get_cdms2_attrs(v))
+    coords = [(v.id, np.asarray(v),
+               _filter_attrs(v.attributes, cdms2_ignored_attrs))
              for v in variable.getAxisList()]
-    attrs = get_cdms2_attrs(variable)
+    attrs = _filter_attrs(variable.attributes, cdms2_ignored_attrs)
    dataarray = DataArray(values, coords=coords, name=name, attrs=attrs)
    return decode_cf(dataarray.to_dataset())[dataarray.name]
@@ -35,9 +50,6 @@ def to_cdms2(dataarray):
    # we don't want cdms2 to be a hard dependency
    import cdms2

-    def encode(var):
-        return maybe_encode_timedelta(maybe_encode_datetime(var.variable))

    def set_cdms2_attrs(var, attrs):
        for k, v in attrs.items():
            setattr(var, k, v)
@@ -53,3 +65,147 @@ def set_cdms2_attrs(var, attrs):
    cdms2_var = cdms2.createVariable(var.values, axes=axes, id=dataarray.name)
    set_cdms2_attrs(cdms2_var, var.attrs)
    return cdms2_var


def _pick_attrs(attrs, keys):
    """ Return attrs with keys in keys list
    """
    return dict((k, v) for k, v in attrs.items() if k in keys)


def _get_iris_args(attrs):
    """ Converts the xarray attrs into args that can be passed into Iris
    """
    # iris.unit is deprecated in Iris v1.9
    import cf_units
    args = {'attributes': _filter_attrs(attrs, iris_forbidden_keys)}
    args.update(_pick_attrs(attrs, ('standard_name', 'long_name',)))
    unit_args = _pick_attrs(attrs, ('calendar',))
    if 'units' in attrs:
        args['units'] = cf_units.Unit(attrs['units'], **unit_args)
    return args


# TODO: Add converting bounds from xarray to Iris and back
def to_iris(dataarray):
    """ Convert a DataArray into an Iris Cube
    """
    # Iris not a hard dependency
    import iris
    from iris.fileformats.netcdf import parse_cell_methods
    from xarray.core.pycompat import dask_array_type

    dim_coords = []
    aux_coords = []

    for coord_name in dataarray.coords:
        coord = encode(dataarray.coords[coord_name])
        coord_args = _get_iris_args(coord.attrs)
        coord_args['var_name'] = coord_name
        axis = None
        if coord.dims:
            axis = dataarray.get_axis_num(coord.dims)
        if coord_name in dataarray.dims:
            iris_coord = iris.coords.DimCoord(coord.values, **coord_args)
            dim_coords.append((iris_coord, axis))
        else:
            iris_coord = iris.coords.AuxCoord(coord.values, **coord_args)
            aux_coords.append((iris_coord, axis))

    args = _get_iris_args(dataarray.attrs)
    args['var_name'] = dataarray.name
    args['dim_coords_and_dims'] = dim_coords
    args['aux_coords_and_dims'] = aux_coords
    if 'cell_methods' in dataarray.attrs:
        args['cell_methods'] = parse_cell_methods(dataarray.attrs['cell_methods'])

    # Create the right type of masked array (should be easier after #1769)
    if isinstance(dataarray.data, dask_array_type):
        from dask.array import ma as dask_ma
        masked_data = dask_ma.masked_invalid(dataarray)
Reviewer: just import ...

Author: Much nicer - thanks!
    else:
        masked_data = np.ma.masked_invalid(dataarray)

    cube = iris.cube.Cube(masked_data, **args)

    return cube


def _iris_obj_to_attrs(obj):
    """ Return a dictionary of attrs when given an Iris object
    """
    attrs = {'standard_name': obj.standard_name,
             'long_name': obj.long_name}
    if obj.units.calendar:
        attrs['calendar'] = obj.units.calendar
    if obj.units.origin != '1':
        attrs['units'] = obj.units.origin
    attrs.update(obj.attributes)
    return dict((k, v) for k, v in attrs.items() if v is not None)


def _iris_cell_methods_to_str(cell_methods_obj):
    """ Converts Iris cell methods into a string
    """
    cell_methods = []
    for cell_method in cell_methods_obj:
        names = ''.join(['{}: '.format(n) for n in cell_method.coord_names])
        intervals = ' '.join(['interval: {}'.format(interval)
                              for interval in cell_method.intervals])
        comments = ' '.join(['comment: {}'.format(comment)
                             for comment in cell_method.comments])
        extra = ' '.join([intervals, comments]).strip()
        if extra:
            extra = ' ({})'.format(extra)
        cell_methods.append(names + cell_method.method + extra)
    return ' '.join(cell_methods)
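As an editorial illustration (not part of the diff) of what this helper produces, assuming Iris's ``CellMethod`` class and hypothetical values:

    import iris.coords

    cm = iris.coords.CellMethod('mean', coords=('time',), intervals=('1 hour',))
    # _iris_cell_methods_to_str((cm,)) would return:
    # 'time: mean (interval: 1 hour)'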
def from_iris(cube):
    """ Convert an Iris cube into a DataArray
    """
    import iris.exceptions
    from xarray.core.pycompat import dask_array_type

    name = cube.var_name
    dims = []
    for i in range(cube.ndim):
        try:
            dim_coord = cube.coord(dim_coords=True, dimensions=(i,))
            dims.append(dim_coord.var_name)
        except iris.exceptions.CoordinateNotFoundError:
Reviewer: This line doesn't have any test coverage (see the test sketch after the diff below).
            dims.append("dim_{}".format(i))

    coords = OrderedDict()

    for coord in cube.coords():
        coord_attrs = _iris_obj_to_attrs(coord)
        coord_dims = [dims[i] for i in cube.coord_dims(coord)]
        if not coord.var_name:
            raise ValueError("Coordinate '{}' has no var_name attribute".format(coord.name()))
        if coord_dims:
            coords[coord.var_name] = (coord_dims, coord.points, coord_attrs)
        else:
            coords[coord.var_name] = ((), np.asscalar(coord.points), coord_attrs)

    array_attrs = _iris_obj_to_attrs(cube)
    cell_methods = _iris_cell_methods_to_str(cube.cell_methods)
    if cell_methods:
        array_attrs['cell_methods'] = cell_methods

    # Deal with iris 1.* and 2.*
    cube_data = cube.core_data() if hasattr(cube, 'core_data') else cube.data

    # Deal with dask and numpy masked arrays
    if isinstance(cube_data, dask_array_type):
        from dask.array import ma as dask_ma
        filled_data = dask_ma.filled(cube_data, get_fill_value(cube.dtype))
    elif isinstance(cube_data, np.ma.MaskedArray):
        filled_data = np.ma.filled(cube_data, get_fill_value(cube.dtype))
    else:
        filled_data = cube_data

    dataarray = DataArray(filled_data, coords=coords, name=name,
                          attrs=array_attrs, dims=dims)
    decoded_ds = decode_cf(dataarray._to_temp_dataset())
    return dataarray._from_temp_dataset(decoded_ds)
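The uncovered fallback branch flagged above could be exercised by a small test along these lines (an editorial sketch, not part of the PR; the test name and values are hypothetical):

    import numpy as np
    import iris
    import xarray as xr

    def test_from_iris_dim_without_coord():
        # A cube whose second dimension has no dimension coordinate should
        # fall back to an auto-generated name such as 'dim_1'.
        lat = iris.coords.DimCoord([10., 20.], var_name='lat')
        cube = iris.cube.Cube(np.zeros((2, 3)), var_name='foo',
                              dim_coords_and_dims=[(lat, 0)])
        da = xr.DataArray.from_iris(cube)
        assert da.dims == ('lat', 'dim_1')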
Reviewer: also note the from_iris method

Author: Done