Commit 993ad42

Merge branch 'main' into speedup-dt-accesor

* main:
  Introduce Grouper objects internally (pydata#7561)
  [skip-ci] Add cftime groupby, resample benchmarks (pydata#7795)
  Fix groupby binary ops when grouped array is subset relative to other (pydata#7798)
  adjust the deprecation policy for python (pydata#7793)
  [pre-commit.ci] pre-commit autoupdate (pydata#7803)
  Allow the label run-upstream to run upstream CI (pydata#7787)
  Update asv links in contributing guide (pydata#7801)
  Implement DataArray.to_dask_dataframe() (pydata#7635)
  `ds.to_dict` with data as arrays, not lists (pydata#7739)
  Add lshift and rshift operators (pydata#7741)
  Use canonical name for set_horizonalalignment over alias set_ha (pydata#7786)
  Remove pandas<2 pin (pydata#7785)
  [pre-commit.ci] pre-commit autoupdate (pydata#7783)

2 parents cce3df8 + fde773e

28 files changed: +1133 −422 lines

.github/workflows/upstream-dev-ci.yaml

Lines changed: 57 additions & 0 deletions

@@ -6,6 +6,7 @@ on:
   pull_request:
     branches:
       - main
+    types: [opened, reopened, synchronize, labeled]
   schedule:
     - cron: "0 0 * * *" # Daily “At 00:00” UTC
   workflow_dispatch: # allows you to trigger the workflow run manually
@@ -41,6 +42,7 @@ jobs:
       && (
         (github.event_name == 'schedule' || github.event_name == 'workflow_dispatch')
         || needs.detect-ci-trigger.outputs.triggered == 'true'
+        || contains( github.event.pull_request.labels.*.name, 'run-upstream')
       )
     defaults:
       run:
@@ -92,3 +94,58 @@ jobs:
         uses: xarray-contrib/issue-from-pytest-log@v1
         with:
           log-path: output-${{ matrix.python-version }}-log.jsonl
+
+  mypy-upstream-dev:
+    name: mypy-upstream-dev
+    runs-on: ubuntu-latest
+    needs: detect-ci-trigger
+    if: |
+      always()
+      && (
+        contains( github.event.pull_request.labels.*.name, 'run-upstream')
+      )
+    defaults:
+      run:
+        shell: bash -l {0}
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.10"]
+    steps:
+      - uses: actions/checkout@v3
+        with:
+          fetch-depth: 0 # Fetch all history for all branches and tags.
+      - name: Set up conda environment
+        uses: mamba-org/provision-with-micromamba@v15
+        with:
+          environment-file: ci/requirements/environment.yml
+          environment-name: xarray-tests
+          extra-specs: |
+            python=${{ matrix.python-version }}
+            pytest-reportlog
+            conda
+      - name: Install upstream versions
+        run: |
+          bash ci/install-upstream-wheels.sh
+      - name: Install xarray
+        run: |
+          python -m pip install --no-deps -e .
+      - name: Version info
+        run: |
+          conda info -a
+          conda list
+          python xarray/util/print_versions.py
+      - name: Install mypy
+        run: |
+          python -m pip install mypy --force-reinstall
+      - name: Run mypy
+        run: |
+          python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report
+      - name: Upload mypy coverage to Codecov
+        uses: codecov/[email protected]
+        with:
+          file: mypy_report/cobertura.xml
+          flags: mypy
+          env_vars: PYTHON_VERSION
+          name: codecov-umbrella
+          fail_ci_if_error: false
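Both the test job and the new mypy job above gate on the `run-upstream` PR label through GitHub's `contains()` expression. A plain-Python sketch of what that expression evaluates — the payload shape mirrors GitHub's `pull_request` webhook event, and `should_run_upstream` is an illustrative name, not part of the workflow:

```python
def should_run_upstream(event: dict) -> bool:
    """Return True when the pull request carries the 'run-upstream' label.

    Mirrors contains(github.event.pull_request.labels.*.name, 'run-upstream'):
    labels.*.name flattens the label objects to their names, and contains()
    checks membership in that list.
    """
    labels = event.get("pull_request", {}).get("labels", [])
    return any(label.get("name") == "run-upstream" for label in labels)


event = {"pull_request": {"labels": [{"name": "bug"}, {"name": "run-upstream"}]}}
print(should_run_upstream(event))                        # True
print(should_run_upstream({"pull_request": {"labels": []}}))  # False
```

Because the job also uses `always()`, the label check is the only condition; unlabeled PRs simply skip the job.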

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion

@@ -16,7 +16,7 @@ repos:
         files: ^xarray/
   - repo: https://github.com/charliermarsh/ruff-pre-commit
     # Ruff version.
-    rev: 'v0.0.261'
+    rev: 'v0.0.263'
    hooks:
      - id: ruff
        args: ["--fix"]

asv_bench/asv.conf.json

Lines changed: 1 addition & 0 deletions

@@ -30,6 +30,7 @@
     // determined by looking for tools on the PATH environment
     // variable.
     "environment_type": "conda",
+    "conda_channels": ["conda-forge"],

     // timeout in seconds for installing any dependencies in environment
     // defaults to 10 min

asv_bench/benchmarks/groupby.py

Lines changed: 54 additions & 13 deletions

@@ -18,23 +18,29 @@ def setup(self, *args, **kwargs):
                 "c": xr.DataArray(np.arange(2 * self.n)),
             }
         )
-        self.ds2d = self.ds1d.expand_dims(z=10)
+        self.ds2d = self.ds1d.expand_dims(z=10).copy()
         self.ds1d_mean = self.ds1d.groupby("b").mean()
         self.ds2d_mean = self.ds2d.groupby("b").mean()

     @parameterized(["ndim"], [(1, 2)])
     def time_init(self, ndim):
         getattr(self, f"ds{ndim}d").groupby("b")

-    @parameterized(["method", "ndim"], [("sum", "mean"), (1, 2)])
-    def time_agg_small_num_groups(self, method, ndim):
+    @parameterized(
+        ["method", "ndim", "use_flox"], [("sum", "mean"), (1, 2), (True, False)]
+    )
+    def time_agg_small_num_groups(self, method, ndim, use_flox):
         ds = getattr(self, f"ds{ndim}d")
-        getattr(ds.groupby("a"), method)().compute()
+        with xr.set_options(use_flox=use_flox):
+            getattr(ds.groupby("a"), method)().compute()

-    @parameterized(["method", "ndim"], [("sum", "mean"), (1, 2)])
-    def time_agg_large_num_groups(self, method, ndim):
+    @parameterized(
+        ["method", "ndim", "use_flox"], [("sum", "mean"), (1, 2), (True, False)]
+    )
+    def time_agg_large_num_groups(self, method, ndim, use_flox):
         ds = getattr(self, f"ds{ndim}d")
-        getattr(ds.groupby("b"), method)().compute()
+        with xr.set_options(use_flox=use_flox):
+            getattr(ds.groupby("b"), method)().compute()

     def time_binary_op_1d(self):
         (self.ds1d.groupby("b") - self.ds1d_mean).compute()
@@ -115,15 +121,21 @@ def setup(self, *args, **kwargs):
     def time_init(self, ndim):
         getattr(self, f"ds{ndim}d").resample(time="D")

-    @parameterized(["method", "ndim"], [("sum", "mean"), (1, 2)])
-    def time_agg_small_num_groups(self, method, ndim):
+    @parameterized(
+        ["method", "ndim", "use_flox"], [("sum", "mean"), (1, 2), (True, False)]
+    )
+    def time_agg_small_num_groups(self, method, ndim, use_flox):
         ds = getattr(self, f"ds{ndim}d")
-        getattr(ds.resample(time="3M"), method)().compute()
+        with xr.set_options(use_flox=use_flox):
+            getattr(ds.resample(time="3M"), method)().compute()

-    @parameterized(["method", "ndim"], [("sum", "mean"), (1, 2)])
-    def time_agg_large_num_groups(self, method, ndim):
+    @parameterized(
+        ["method", "ndim", "use_flox"], [("sum", "mean"), (1, 2), (True, False)]
+    )
+    def time_agg_large_num_groups(self, method, ndim, use_flox):
         ds = getattr(self, f"ds{ndim}d")
-        getattr(ds.resample(time="48H"), method)().compute()
+        with xr.set_options(use_flox=use_flox):
+            getattr(ds.resample(time="48H"), method)().compute()


 class ResampleDask(Resample):
@@ -132,3 +144,32 @@ def setup(self, *args, **kwargs):
         super().setup(**kwargs)
         self.ds1d = self.ds1d.chunk({"time": 50})
         self.ds2d = self.ds2d.chunk({"time": 50, "z": 4})
+
+
+class ResampleCFTime(Resample):
+    def setup(self, *args, **kwargs):
+        self.ds1d = xr.Dataset(
+            {
+                "b": ("time", np.arange(365.0 * 24)),
+            },
+            coords={
+                "time": xr.date_range(
+                    "2001-01-01", freq="H", periods=365 * 24, calendar="noleap"
+                )
+            },
+        )
+        self.ds2d = self.ds1d.expand_dims(z=10)
+        self.ds1d_mean = self.ds1d.resample(time="48H").mean()
+        self.ds2d_mean = self.ds2d.resample(time="48H").mean()
+
+
+@parameterized(["use_cftime", "use_flox"], [[True, False], [True, False]])
+class GroupByLongTime:
+    def setup(self, use_cftime, use_flox):
+        arr = np.random.randn(10, 10, 365 * 30)
+        time = xr.date_range("2000", periods=30 * 365, use_cftime=use_cftime)
+        self.da = xr.DataArray(arr, dims=("y", "x", "time"), coords={"time": time})
+
+    def time_mean(self, use_cftime, use_flox):
+        with xr.set_options(use_flox=use_flox):
+            self.da.groupby("time.year").mean()
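The `parameterized` decorator used throughout these benchmarks makes asv time each method once per element of the cross product of its parameter lists — e.g. `method × ndim × use_flox` gives 2 × 2 × 2 = 8 timed cases. A minimal stdlib sketch of that expansion (`expand_params` is a stand-in for illustration, not asv's API):

```python
# Sketch of how asv fans a parameterized benchmark out over the cross
# product of its parameter lists, mimicked here with itertools.product.
from itertools import product


def expand_params(names, values):
    """Yield one keyword-argument dict per parameter combination."""
    for combo in product(*values):
        yield dict(zip(names, combo))


cases = list(
    expand_params(
        ["method", "ndim", "use_flox"],
        [("sum", "mean"), (1, 2), (True, False)],
    )
)
print(len(cases))  # 8: 2 methods x 2 ndims x 2 flox settings
print(cases[0])    # {'method': 'sum', 'ndim': 1, 'use_flox': True}
```

This is why adding the `use_flox` axis above doubles the number of timed cases for each aggregation benchmark rather than adding a single extra run.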

ci/min_deps_check.py

Lines changed: 4 additions & 1 deletion

@@ -29,7 +29,7 @@
     "pytest-timeout",
 }

-POLICY_MONTHS = {"python": 24, "numpy": 18}
+POLICY_MONTHS = {"python": 30, "numpy": 18}
 POLICY_MONTHS_DEFAULT = 12
 POLICY_OVERRIDE: dict[str, tuple[int, int]] = {}
 errors = []
@@ -109,6 +109,9 @@ def metadata(entry):
         (3, 6): datetime(2016, 12, 23),
         (3, 7): datetime(2018, 6, 27),
         (3, 8): datetime(2019, 10, 14),
+        (3, 9): datetime(2020, 10, 5),
+        (3, 10): datetime(2021, 10, 4),
+        (3, 11): datetime(2022, 10, 24),
     }
 )
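The bumped `POLICY_MONTHS` value means a Python minor version stays supported for roughly 30 months after its release. A simplified sketch of that rolling window, using the release dates added above (`is_supported` is illustrative only — the real script derives ages from conda package publication dates):

```python
# Rolling-support sketch: a version passes if it is younger than
# POLICY_MONTHS months at the date being checked.
from datetime import datetime

PYTHON_RELEASES = {
    (3, 9): datetime(2020, 10, 5),
    (3, 10): datetime(2021, 10, 4),
    (3, 11): datetime(2022, 10, 24),
}
POLICY_MONTHS = 30


def is_supported(version: tuple[int, int], today: datetime) -> bool:
    """True if `version` was released fewer than POLICY_MONTHS months before `today`."""
    released = PYTHON_RELEASES[version]
    age_months = (today.year - released.year) * 12 + (today.month - released.month)
    return age_months < POLICY_MONTHS


today = datetime(2023, 5, 1)
print(is_supported((3, 11), today))  # True: ~7 months old
print(is_supported((3, 9), today))   # False: ~31 months old under this naive check
```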

doc/api.rst

Lines changed: 1 addition & 0 deletions

@@ -632,6 +632,7 @@ DataArray methods
    DataArray.from_iris
    DataArray.from_series
    DataArray.to_cdms2
+   DataArray.to_dask_dataframe
    DataArray.to_dataframe
    DataArray.to_dataset
    DataArray.to_dict

doc/contributing.rst

Lines changed: 3 additions & 3 deletions

@@ -829,17 +829,17 @@ Running the performance test suite

 Performance matters and it is worth considering whether your code has introduced
 performance regressions. *xarray* is starting to write a suite of benchmarking tests
-using `asv <https://github.com/spacetelescope/asv>`__
+using `asv <https://github.com/airspeed-velocity/asv>`__
 to enable easy monitoring of the performance of critical *xarray* operations.
 These benchmarks are all found in the ``xarray/asv_bench`` directory.

 To use all features of asv, you will need either ``conda`` or
 ``virtualenv``. For more details please check the `asv installation
-webpage <https://asv.readthedocs.io/en/latest/installing.html>`_.
+webpage <https://asv.readthedocs.io/en/stable/installing.html>`_.

 To install asv::

-    pip install git+https://github.com/spacetelescope/asv
+    python -m pip install asv

 If you need to run a benchmark, change your directory to ``asv_bench/`` and run::

doc/getting-started-guide/installing.rst

Lines changed: 1 addition & 1 deletion

@@ -86,7 +86,7 @@ Minimum dependency versions
 Xarray adopts a rolling policy regarding the minimum supported version of its
 dependencies:

-- **Python:** 24 months
+- **Python:** 30 months
   (`NEP-29 <https://numpy.org/neps/nep-0029-deprecation_policy.html>`_)
 - **numpy:** 18 months
   (`NEP-29 <https://numpy.org/neps/nep-0029-deprecation_policy.html>`_)

doc/user-guide/computation.rst

Lines changed: 4 additions & 0 deletions

@@ -63,6 +63,10 @@ Data arrays also implement many :py:class:`numpy.ndarray` methods:
     arr.round(2)
     arr.T

+    intarr = xr.DataArray([0, 1, 2, 3, 4, 5])
+    intarr << 2  # only supported for int types
+    intarr >> 1
+
 .. _missing_values:

 Missing values

doc/whats-new.rst

Lines changed: 13 additions & 2 deletions

@@ -22,19 +22,26 @@ v2023.05.0 (unreleased)

 New Features
 ~~~~~~~~~~~~
+- Added new method :py:meth:`DataArray.to_dask_dataframe`, convert a dataarray into a dask dataframe (:issue:`7409`).
+  By `Deeksha <https://github.com/dsgreen2>`_.
+- Add support for lshift and rshift binary operators (``<<``, ``>>``) on
+  :py:class:`xr.DataArray` of type :py:class:`int` (:issue:`7727`, :pull:`7741`).
+  By `Alan Brammer <https://github.com/abrammer>`_.


 Breaking changes
 ~~~~~~~~~~~~~~~~
-
+- adjust the deprecation policy for python to once again align with NEP-29 (:issue:`7765`, :pull:`7793`)
+  By `Justus Magin <https://github.com/keewis>`_.

 Deprecations
 ~~~~~~~~~~~~


 Bug fixes
 ~~~~~~~~~
-
+- Fix groupby binary ops when grouped array is subset relative to other (:issue:`7797`).
+  By `Deepak Cherian <https://github.com/dcherian>`_.

 Documentation
 ~~~~~~~~~~~~~
@@ -102,6 +109,10 @@ New Features
 - Added ability to save ``DataArray`` objects directly to Zarr using :py:meth:`~xarray.DataArray.to_zarr`.
   (:issue:`7692`, :pull:`7693`).
   By `Joe Hamman <https://github.com/jhamman>`_.
+- Keyword argument ``data='array'`` to both :py:meth:`xarray.Dataset.to_dict` and
+  :py:meth:`xarray.DataArray.to_dict` will now return data as the underlying array type. Python lists are returned for ``data='list'`` or ``data=True``. Supplying ``data=False`` only returns the schema without data. ``encoding=True`` returns the encoding dictionary for the underlying variable also.
+  (:issue:`1599`, :pull:`7739`).
+  By `James McCreight <https://github.com/jmccreight>`_.

 Breaking changes
 ~~~~~~~~~~~~~~~~
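The `to_dict` entry above can be exercised directly. A hedged sketch, assuming xarray >= 2023.05 and numpy are available:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(3), dims="x", name="a")

# Default behavior: data serialized to nested Python lists.
print(type(da.to_dict()["data"]))              # <class 'list'>

# New in this release: keep the underlying array type instead.
print(type(da.to_dict(data="array")["data"]))  # <class 'numpy.ndarray'>

# Schema only, no data values.
schema = da.to_dict(data=False)
print(sorted(schema))
```

The `data="array"` mode avoids the cost and precision concerns of list round-tripping for large arrays.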
