Combine by point coords #3982

Closed
wants to merge 14 commits into from
2 changes: 0 additions & 2 deletions doc/api-hidden.rst
@@ -9,8 +9,6 @@
.. autosummary::
:toctree: generated/

-auto_combine
-
Dataset.nbytes
Dataset.chunks

1 change: 0 additions & 1 deletion doc/api.rst
@@ -21,7 +21,6 @@ Top-level functions
broadcast
concat
merge
-auto_combine
combine_by_coords
combine_nested
where
10 changes: 10 additions & 0 deletions doc/whats-new.rst
@@ -26,6 +26,13 @@ Breaking changes
<https://matplotlib.org/api/prev_api_changes/api_changes_3.1.0.html#passing-a-line2d-s-drawstyle-together-with-the-linestyle-is-deprecated>`_.
(:pull:`3274`)
By `Elliott Sales de Andrade <https://github.com/QuLogic>`_
+- The old :py:func:`auto_combine` function has now been removed in
+favour of the :py:func:`combine_by_coords` and
+:py:func:`combine_nested` functions. This also means that
+the default behaviour of :py:func:`open_mfdataset` has changed to use
+``combine='by_coords'`` as the default argument value. (:issue:`2616`, :pull:`3926`)
+By `Tom Nicholas <https://github.com/TomNicholas>`_.
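For callers migrating away from `auto_combine`, the two replacement functions can be sketched as follows. This is a minimal illustration and not part of the PR; the datasets `ds1` and `ds2` and the variable name `t` are made up for the example.

```python
import xarray as xr

# Two illustrative datasets covering adjacent ranges of the "x" coordinate.
ds1 = xr.Dataset({"t": ("x", [1.0, 2.0])}, coords={"x": [0, 1]})
ds2 = xr.Dataset({"t": ("x", [3.0, 4.0])}, coords={"x": [2, 3]})

# combine_by_coords orders the pieces using their coordinate values,
# so the order of the input list does not matter.
by_coords = xr.combine_by_coords([ds2, ds1])

# combine_nested relies on the order of the input list and needs an
# explicit concat_dim.
nested = xr.combine_nested([ds1, ds2], concat_dim="x")
```

Under these assumptions both calls should produce the same dataset; the difference is whether ordering comes from the coordinates or from the list structure.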


New Features
~~~~~~~~~~~~
@@ -41,6 +48,9 @@ New Features
- Implement :py:meth:`DataArray.idxmax`, :py:meth:`DataArray.idxmin`,
:py:meth:`Dataset.idxmax`, :py:meth:`Dataset.idxmin`. (:issue:`60`, :pull:`3871`)
By `Todd Jennings <https://github.com/toddrjen>`_
+- :py:func:`combine_by_coords` will now also use zero-dimensional point coordinates
+for concatenation where appropriate. (:issue:`3774`)
+By `Tom Nicholas <https://github.com/TomNicholas>`_.
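The entry above describes behaviour proposed in this PR. As a sketch of an explicit equivalent that does not depend on it, a scalar (zero-dimensional) coordinate can be promoted to a length-1 dimension with `Dataset.expand_dims` before combining. The helper `make_slice` and the names `temp`, `time`, and `x` are illustrative assumptions.

```python
import xarray as xr


def make_slice(time, values):
    # Hypothetical helper: one 1D slice along "x", tagged with a scalar "time" coordinate.
    return xr.Dataset({"temp": ("x", values)}, coords={"x": [0, 1], "time": time})


# Deliberately out of order to show that ordering comes from the coordinate values.
slices = [make_slice(1, [3.0, 4.0]), make_slice(0, [1.0, 2.0])]

# Promote the scalar "time" coordinate to a length-1 dimension coordinate.
promoted = [ds.expand_dims("time") for ds in slices]

# combine_by_coords can now concatenate along "time"; "x" is identical
# in every slice, so nothing is concatenated along it.
combined = xr.combine_by_coords(promoted)
```

The explicit `expand_dims` step is what the proposed change would make unnecessary for point coordinates.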


Bug fixes
3 changes: 1 addition & 2 deletions xarray/__init__.py
@@ -15,7 +15,7 @@
from .coding.cftimeindex import CFTimeIndex
from .conventions import SerializationWarning, decode_cf
from .core.alignment import align, broadcast
-from .core.combine import auto_combine, combine_by_coords, combine_nested
+from .core.combine import combine_by_coords, combine_nested
from .core.common import ALL_DIMS, full_like, ones_like, zeros_like
from .core.computation import apply_ufunc, dot, polyval, where
from .core.concat import concat
@@ -46,7 +46,6 @@
"align",
"apply_ufunc",
"as_variable",
-"auto_combine",
"broadcast",
"cftime_range",
"combine_by_coords",
59 changes: 15 additions & 44 deletions xarray/backends/api.py
@@ -4,7 +4,6 @@
from io import BytesIO
from numbers import Number
from pathlib import Path
-from textwrap import dedent
from typing import (
TYPE_CHECKING,
Callable,
@@ -23,7 +22,6 @@
from ..core.combine import (
_infer_concat_order_from_positions,
_nested_combine,
-auto_combine,
combine_by_coords,
)
from ..core.dataarray import DataArray
@@ -717,7 +715,7 @@ def open_mfdataset(
lock=None,
data_vars="all",
coords="different",
-combine="_old_auto",
+combine="by_coords",
autoclose=None,
parallel=False,
join="outer",
@@ -730,9 +728,8 @@ def open_mfdataset(
the datasets into one before returning the result, and if combine='nested' then
``combine_nested`` is used. The filepaths must be structured according to which
combining function is used, the details of which are given in the documentation for
-``combine_by_coords`` and ``combine_nested``. By default the old (now deprecated)
-``auto_combine`` will be used, please specify either ``combine='by_coords'`` or
-``combine='nested'`` in future. Requires dask to be installed. See documentation for
+``combine_by_coords`` and ``combine_nested``. By default ``combine='by_coords'``
+will be used. Requires dask to be installed. See documentation for
details on dask [1]_. Global attributes from the ``attrs_file`` are used
for the combined dataset.

@@ -742,7 +739,7 @@ def open_mfdataset(
Either a string glob in the form ``"path/to/my/files/*.nc"`` or an explicit list of
files to open. Paths can be given as strings or as pathlib Paths. If
concatenation along more than one dimension is desired, then ``paths`` must be a
-nested list-of-lists (see ``manual_combine`` for details). (A string glob will
+nested list-of-lists (see ``combine_nested`` for details). (A string glob will
be expanded to a 1-dimensional list.)
chunks : int or dict, optional
Dictionary with keys given by dimension names and values given by chunk sizes.
@@ -752,15 +749,16 @@ def open_mfdataset(
see the full documentation for more details [2]_.
concat_dim : str, or list of str, DataArray, Index or None, optional
Dimensions to concatenate files along. You only need to provide this argument
-if any of the dimensions along which you want to concatenate is not a dimension
-in the original datasets, e.g., if you want to stack a collection of 2D arrays
-along a third dimension. Set ``concat_dim=[..., None, ...]`` explicitly to
-disable concatenation along a particular dimension.
+if ``combine='by_coords'``, and if any of the dimensions along which you want to
+concatenate is not a dimension in the original datasets, e.g., if you want to
+stack a collection of 2D arrays along a third dimension. Set
+``concat_dim=[..., None, ...]`` explicitly to disable concatenation along a
+particular dimension. Default is None, which for a 1D list of filepaths is
+equivalent to opening the files separately and then merging them with
+``xarray.merge``.
combine : {'by_coords', 'nested'}, optional
Whether ``xarray.combine_by_coords`` or ``xarray.combine_nested`` is used to
-combine all the data. If this argument is not provided, `xarray.auto_combine` is
-used, but in the future this behavior will switch to use
-`xarray.combine_by_coords` by default.
+combine all the data. Default is to use ``xarray.combine_by_coords``.
compat : {'identical', 'equals', 'broadcast_equals',
'no_conflicts', 'override'}, optional
String indicating how to compare variables of the same name for
@@ -853,7 +851,6 @@ def open_mfdataset(
--------
combine_by_coords
combine_nested
-auto_combine
open_dataset

References
@@ -881,11 +878,8 @@ def open_mfdataset(
# If combine='nested' then this creates a flat list which is easier to
# iterate over, while saving the originally-supplied structure as "ids"
if combine == "nested":
-if str(concat_dim) == "_not_supplied":
-raise ValueError("Must supply concat_dim when using " "combine='nested'")
-else:
-if isinstance(concat_dim, (str, DataArray)) or concat_dim is None:
-concat_dim = [concat_dim]
+if isinstance(concat_dim, (str, DataArray)) or concat_dim is None:
+concat_dim = [concat_dim]
combined_ids_paths = _infer_concat_order_from_positions(paths)
ids, paths = (list(combined_ids_paths.keys()), list(combined_ids_paths.values()))
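The bookkeeping in this hunk can be sketched in plain Python. The helper names below (`normalize_concat_dim`, `infer_tile_ids`) are hypothetical stand-ins rather than xarray's private functions, the file names are invented, and the sketch leaves out the `DataArray` case handled by the real code.

```python
def normalize_concat_dim(concat_dim):
    # A single dimension name (or None) is wrapped in a list so that the
    # rest of the code can always iterate over one entry per nesting level.
    if isinstance(concat_dim, str) or concat_dim is None:
        return [concat_dim]
    return list(concat_dim)


def infer_tile_ids(entry, position=()):
    # Walk a nested list-of-lists depth first, yielding (position, leaf)
    # pairs, where position is the tuple of indices leading to the leaf.
    if isinstance(entry, list):
        for i, item in enumerate(entry):
            yield from infer_tile_ids(item, position + (i,))
    else:
        yield position, entry


# A 2x2 nested list of (invented) file paths.
paths = [["a0.nc", "a1.nc"], ["b0.nc", "b1.nc"]]
combined_ids_paths = dict(infer_tile_ids(paths))

# Flatten into parallel lists: the original structure survives in "ids".
ids = list(combined_ids_paths.keys())
flat_paths = list(combined_ids_paths.values())
```

Keeping the position tuples alongside the flat list is what lets the files be opened in a simple loop while the nested structure is still available for the later combine step.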

@@ -917,30 +911,7 @@ def open_mfdataset(

# Combine all datasets, closing them in case of a ValueError
try:
-if combine == "_old_auto":
-# Use the old auto_combine for now
-# Remove this after deprecation cycle from #2616 is complete
-basic_msg = dedent(
-"""\
-In xarray version 0.15 the default behaviour of `open_mfdataset`
-will change. To retain the existing behavior, pass
-combine='nested'. To use future default behavior, pass
-combine='by_coords'. See
-http://xarray.pydata.org/en/stable/combining.html#combining-multi
-"""
-)
-warnings.warn(basic_msg, FutureWarning, stacklevel=2)
-
-combined = auto_combine(
-datasets,
-concat_dim=concat_dim,
-compat=compat,
-data_vars=data_vars,
-coords=coords,
-join=join,
-from_openmfds=True,
-)
-elif combine == "nested":
+if combine == "nested":
# Combined nested list by successive concat and merge operations
# along each dimension, using structure given by "ids"
combined = _nested_combine(