Commit b83dae1

original API detection & warning
support for isinstance / numeric ops
support for comparison ops
1 parent 5b59fc0 commit b83dae1

8 files changed

+328
-68
lines changed

doc/source/timeseries.rst

+16-12
Original file line number | Diff line number | Diff line change
@@ -68,7 +68,7 @@ Resample:
6868
.. ipython:: python
6969
7070
# Daily means
71-
ts.resample('D', how='mean')
71+
ts.resample('D').mean()
7272
7373
7474
.. _timeseries.overview:
@@ -1091,7 +1091,7 @@ An example of how holidays and holiday calendars are defined:
10911091
Using this calendar, creating an index or doing offset arithmetic skips weekends
10921092
and holidays (i.e., Memorial Day/July 4th). For example, the below defines
10931093
a custom business day offset using the ``ExampleCalendar``. Like any other offset,
1094-
it can be used to create a ``DatetimeIndex`` or added to ``datetime``
1094+
it can be used to create a ``DatetimeIndex`` or added to ``datetime``
10951095
or ``Timestamp`` objects.
10961096

10971097
.. ipython:: python
@@ -1211,6 +1211,11 @@ Converting to Python datetimes
12111211
Resampling
12121212
----------
12131213

1214+
.. warning::
1215+
1216+
The interface to ``.resample`` has changed in 0.18.0 to be more groupby-like and hence more flexible.
1217+
See the :ref:`whatsnew docs <whatsnew_0180.breaking.resample>` for a comparison with prior versions.
1218+
12141219
Pandas has a simple, powerful, and efficient functionality for
12151220
performing resampling operations during frequency conversion (e.g., converting
12161221
secondly data into 5-minutely data). This is extremely common in, but not
@@ -1226,7 +1231,7 @@ See some :ref:`cookbook examples <cookbook.resample>` for some advanced strategi
12261231
12271232
ts = Series(randint(0, 500, len(rng)), index=rng)
12281233
1229-
ts.resample('5Min', how='sum')
1234+
ts.resample('5Min').sum()
12301235
12311236
The ``resample`` function is very flexible and allows you to specify many
12321237
different parameters to control the frequency conversion and resampling
@@ -1237,11 +1242,11 @@ an array and produces aggregated values:
12371242

12381243
.. ipython:: python
12391244
1240-
ts.resample('5Min') # default is mean
1245+
ts.resample('5Min').mean()
12411246
1242-
ts.resample('5Min', how='ohlc')
1247+
ts.resample('5Min').ohlc()
12431248
1244-
ts.resample('5Min', how=np.max)
1249+
ts.resample('5Min').max()
12451250
12461251
Any function available via :ref:`dispatching <groupby.dispatch>` can be given to
12471252
the ``how`` parameter by name, including ``sum``, ``mean``, ``std``, ``sem``,
@@ -1284,18 +1289,17 @@ frequency periods.
12841289
Up Sampling
12851290
~~~~~~~~~~~
12861291

1287-
For upsampling, the ``fill_method`` and ``limit`` parameters can be specified
1288-
to interpolate over the gaps that are created:
1292+
For upsampling, you can specify a way to upsample and the ``limit`` parameter to interpolate over the gaps that are created:
12891293

12901294
.. ipython:: python
12911295
12921296
# from secondly to every 250 milliseconds
12931297
1294-
ts[:2].resample('250L')
1298+
ts[:2].resample('250L').upsample()
12951299
1296-
ts[:2].resample('250L', fill_method='pad')
1300+
ts[:2].resample('250L').ffill()
12971301
1298-
ts[:2].resample('250L', fill_method='pad', limit=2)
1302+
ts[:2].resample('250L').ffill(limit=2)
12991303
13001304
Sparse Resampling
13011305
~~~~~~~~~~~~~~~~~
@@ -1317,7 +1321,7 @@ If we want to resample to the full range of the series
13171321

13181322
.. ipython:: python
13191323
1320-
ts.resample('3T',how='sum')
1324+
ts.resample('3T').sum()
13211325
13221326
We can instead only resample those groups where we have points as follows:
13231327

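The hunks above replace the ``how=`` and ``fill_method=`` keywords with method calls on the object returned by ``.resample(...)``. A minimal sketch of that mapping, assuming a released pandas in which these methods exist. The sample series is made up for illustration, and ``.asfreq()`` stands in for the unfilled upsample because that spelling is known to exist in released versions:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one value per second for two minutes.
rng = pd.date_range("2012-01-01", periods=120, freq="s")
ts = pd.Series(np.arange(len(rng)), index=rng)

# Downsampling: resample() only defines the bins; the method call aggregates.
per_minute_sum = ts.resample("1min").sum()    # formerly how='sum'
per_minute_mean = ts.resample("1min").mean()  # formerly the implicit default
per_minute_ohlc = ts.resample("1min").ohlc()  # formerly how='ohlc'
per_minute_max = ts.resample("1min").max()    # formerly how=np.max

# Upsampling: the fill strategy is chosen explicitly.
head = ts.iloc[:2]
unfilled = head.resample("250ms").asfreq()              # new rows are NaN
padded = head.resample("250ms").ffill()                 # formerly fill_method='pad'
padded_limited = head.resample("250ms").ffill(limit=2)  # fill at most 2 rows per gap

print(per_minute_sum)
print(padded_limited)
```

Because the aggregation is a separate call, the same ``ts.resample("1min")`` object can be reused for several reductions without repeating the binning arguments.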
doc/source/whatsnew/v0.10.0.txt

+53-11
@@ -70,16 +70,59 @@ nfrequencies are unaffected. The prior defaults were causing a great deal of
7070
confusion for users, especially resampling data to daily frequency (which
7171
labeled the aggregated group with the end of the interval: the next day).
7272

73-
Note:
74-
75-
.. ipython:: python
76-
77-
dates = pd.date_range('1/1/2000', '1/5/2000', freq='4h')
78-
series = Series(np.arange(len(dates)), index=dates)
79-
series
80-
series.resample('D', how='sum')
81-
# old behavior
82-
series.resample('D', how='sum', closed='right', label='right')
73+
.. code-block:: python
74+
75+
In [1]: dates = pd.date_range('1/1/2000', '1/5/2000', freq='4h')
76+
77+
In [2]: series = Series(np.arange(len(dates)), index=dates)
78+
79+
In [3]: series
80+
Out[3]:
81+
2000-01-01 00:00:00 0
82+
2000-01-01 04:00:00 1
83+
2000-01-01 08:00:00 2
84+
2000-01-01 12:00:00 3
85+
2000-01-01 16:00:00 4
86+
2000-01-01 20:00:00 5
87+
2000-01-02 00:00:00 6
88+
2000-01-02 04:00:00 7
89+
2000-01-02 08:00:00 8
90+
2000-01-02 12:00:00 9
91+
2000-01-02 16:00:00 10
92+
2000-01-02 20:00:00 11
93+
2000-01-03 00:00:00 12
94+
2000-01-03 04:00:00 13
95+
2000-01-03 08:00:00 14
96+
2000-01-03 12:00:00 15
97+
2000-01-03 16:00:00 16
98+
2000-01-03 20:00:00 17
99+
2000-01-04 00:00:00 18
100+
2000-01-04 04:00:00 19
101+
2000-01-04 08:00:00 20
102+
2000-01-04 12:00:00 21
103+
2000-01-04 16:00:00 22
104+
2000-01-04 20:00:00 23
105+
2000-01-05 00:00:00 24
106+
Freq: 4H, dtype: int64
107+
108+
In [4]: series.resample('D', how='sum')
109+
Out[4]:
110+
2000-01-01 15
111+
2000-01-02 51
112+
2000-01-03 87
113+
2000-01-04 123
114+
2000-01-05 24
115+
Freq: D, dtype: int64
116+
117+
In [5]: # old behavior
118+
In [6]: series.resample('D', how='sum', closed='right', label='right')
119+
Out[6]:
120+
2000-01-01 0
121+
2000-01-02 21
122+
2000-01-03 57
123+
2000-01-04 93
124+
2000-01-05 129
125+
Freq: D, dtype: int64
83126

84127
- Infinity and negative infinity are no longer treated as NA by ``isnull`` and
85128
``notnull``. That they ever were was a relic of early pandas. This behavior
@@ -354,4 +397,3 @@ Adding experimental support for Panel4D and factory functions to create n-dimens
354397
See the :ref:`full release notes
355398
<release>` or issue tracker
356399
on GitHub for a complete list.
357-

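The frozen ``code-block`` above still uses the ``how='sum'`` keyword, since it records what 0.10.0 printed. A hedged sketch of the same comparison written against the method-style API, assuming a current pandas install; it is expected to reproduce the two result columns shown in that block, with the ``closed='right', label='right'`` call standing in for the pre-0.10.0 defaults:

```python
import numpy as np
import pandas as pd

# Same input as the release note: one observation every four hours.
dates = pd.date_range("1/1/2000", "1/5/2000", freq="4h")
series = pd.Series(np.arange(len(dates)), index=dates)

# Defaults for daily bins: each bin is closed and labelled on its left edge.
new_default = series.resample("D").sum()

# Pre-0.10.0 behaviour, requested explicitly: right-closed, right-labelled bins.
old_behavior = series.resample("D", closed="right", label="right").sum()

print(new_default)
print(old_behavior)
```

With left-closed bins the midnight observation belongs to the day it opens; with right-closed bins it is counted in the interval that ends at that midnight, which is why the two result columns differ.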
doc/source/whatsnew/v0.18.0.txt

+45-8
@@ -203,12 +203,12 @@ other anchored offsets like ``MonthBegin`` and ``YearBegin``.
203203
d + pd.offsets.QuarterBegin(n=0, startingMonth=2)
204204

205205

206-
.. _whatsnew_0180.enhancements.resample:
206+
.. _whatsnew_0180.breaking.resample:
207207

208208
Resample API
209209
^^^^^^^^^^^^
210210

211-
Like the change in the window functions API `above :ref:whatsnew_0180.enhancements.moments:`, ``.resample(...)`` is changing to have
211+
Like the change in the window functions API :ref:`above <whatsnew_0180.enhancements.moments>`, ``.resample(...)`` is changing to have
212212
a more groupby-like API. (:issue:`11732`).
213213

214214
.. ipython:: python
@@ -219,7 +219,10 @@ a more groupy-like API. (:issue:`11732`).
219219
index=pd.date_range('2010-01-01 09:00:00', periods=10, freq='s'))
220220
df
221221

222-
Previously you would write a resampling operations:
222+
223+
**Previous API**:
224+
225+
You would write a resampling operation that immediately evaluates.
223226

224227
This defaults to ``how='mean'``
225228

@@ -234,7 +237,7 @@ This defaults to ``how='mean'``
234237
2010-01-01 09:00:06 0.624988 0.609738 0.633165 0.612452
235238
2010-01-01 09:00:08 0.510470 0.534317 0.573201 0.806949
236239

237-
You can also specify a ``how`` directly
240+
You could also specify a ``how`` directly
238241

239242
.. code-block:: python
240243

@@ -247,6 +250,37 @@ You can also specify a ``how`` directly
247250
2010-01-01 09:00:06 1.249976 1.219477 1.266330 1.224904
248251
2010-01-01 09:00:08 1.020940 1.068634 1.146402 1.613897
249252

253+
.. warning::
254+
255+
This change will allow the existing API to work with a deprecation warning in most cases. Here is a typical use case:
256+
257+
.. code-block:: python
258+
259+
In [4]: r = df.resample('2s')
260+
261+
In [6]: r*10
262+
pandas/tseries/resample.py:80: FutureWarning: .resample() is now a deferred operation
263+
use .resample(...).mean() instead of .resample(...)
264+
265+
Out[6]:
266+
A B C D
267+
2010-01-01 09:00:00 4.857476 4.473507 3.570960 7.936154
268+
2010-01-01 09:00:02 8.208011 7.943173 3.640340 5.310957
269+
2010-01-01 09:00:04 4.339846 3.145823 4.241039 6.257326
270+
2010-01-01 09:00:06 6.249881 6.097384 6.331650 6.124518
271+
2010-01-01 09:00:08 5.104699 5.343172 5.732009 8.069486
272+
273+
Assignment operations will raise a ``ValueError``:
274+
275+
.. code-block:: python
276+
277+
In [7]: r.iloc[0] = 5
278+
ValueError: .resample() is now a deferred operation
279+
use .resample(...).mean() instead of .resample(...)
280+
assignment will have no effect as you are working on a copy
281+
282+
**New API**:
283+
250284
Now, you write ``.resample`` as a 2-stage operation like groupby, which
251285
yields a ``Resampler``.
252286

@@ -269,13 +303,13 @@ These are downsampling operations (going from a lower frequency to a higher one)
269303

270304
r.sum()
271305

272-
Furthermore, resample now supports ``getitem`` operations to selectively perform the resample.
306+
Furthermore, resample now supports ``getitem`` operations to perform the resample on specific columns.
273307

274308
.. ipython:: python
275309

276310
r[['A','C']].mean()
277311

278-
and ``.aggregate`` type of operations.
312+
and ``.aggregate`` type operations.
279313

280314
.. ipython:: python
281315

@@ -290,8 +324,11 @@ These accessors can of course, be combined
290324
Upsampling
291325
''''''''''
292326

327+
.. currentmodule:: pandas.tseries.resample
328+
293329
Upsampling operations take you from a lower frequency to a higher frequency. These are now
294-
performed with the ``Resampler`` objects with pad/fill/upsample methods.
330+
performed with the ``Resampler`` objects with :meth:`~Resampler.pad`,
331+
:meth:`~Resampler.ffill`, and :meth:`~Resampler.upsample` methods.
295332

296333
.. ipython:: python
297334

@@ -326,7 +363,7 @@ New API
326363

327364
s.resample('M').ffill()
328365

329-
:: note:
366+
.. note::
330367

331368
In the new API, you can either downsample OR upsample. The prior implementation would allow you to pass an aggregator function (like ``mean``) even though you were upsampling, which caused a bit of confusion.
332369

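The release note above describes ``.resample`` as a two-stage, groupby-like operation. A small sketch of that pattern, assuming a current pandas install; the frame uses deterministic values in place of the random data in the note, so the numbers differ from the outputs quoted there:

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame in the release note.
index = pd.date_range("2010-01-01 09:00:00", periods=10, freq="s")
df = pd.DataFrame(
    np.arange(40, dtype=float).reshape(10, 4),
    columns=list("ABCD"),
    index=index,
)

# Stage 1: describe the bins; nothing is aggregated yet.
r = df.resample("2s")

# Stage 2: choose the reduction.
means = r.mean()
sums = r.sum()

# Column selection and per-column aggregation work as they do on a groupby.
subset_means = r[["A", "C"]].mean()
mixed = r.agg({"A": "sum", "B": "mean"})

print(type(r).__name__)
print(mixed)
```

Keeping ``r`` around makes the deferred nature visible: ``means`` and ``sums`` come from the same binning, and only the final method call decides what is computed.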
doc/source/whatsnew/v0.9.1.txt

+12-5
@@ -112,14 +112,21 @@ API changes
112112
- Upsampling data with a PeriodIndex will result in a higher frequency
113113
TimeSeries that spans the original time window
114114

115-
.. ipython:: python
116-
117-
prng = period_range('2012Q1', periods=2, freq='Q')
115+
.. code-block:: python
118116

119-
s = Series(np.random.randn(len(prng)), prng)
117+
In [1]: prng = period_range('2012Q1', periods=2, freq='Q')
120118

121-
s.resample('M')
119+
In [2]: s = Series(np.random.randn(len(prng)), prng)
122120

121+
In [4]: s.resample('M')
122+
Out[4]:
123+
2012-01 -1.471992
124+
2012-02 NaN
125+
2012-03 NaN
126+
2012-04 -0.493593
127+
2012-05 NaN
128+
2012-06 NaN
129+
Freq: M, dtype: float64
123130

124131
- Period.end_time now returns the last nanosecond in the time interval
125132
(:issue:`2124`, :issue:`2125`, :issue:`1764`)

pandas/core/groupby.py

+2
@@ -302,6 +302,7 @@ def f(self):
302302

303303
class _GroupBy(PandasObject, SelectionMixin):
304304
_group_selection = None
305+
_apply_whitelist = frozenset([])
305306

306307
def __init__(self, obj, keys=None, axis=0, level=None,
307308
grouper=None, exclusions=None, selection=None, as_index=True,
@@ -2697,6 +2698,7 @@ def true_and_notnull(x, *args, **kwargs):
26972698
return filtered
26982699

26992700
def nunique(self, dropna=True):
2701+
""" Returns number of unique elements in the group """
27002702
ids, _, _ = self.grouper.group_info
27012703
val = self.obj.get_values()
27022704

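The docstring added to ``nunique`` says it returns the number of unique elements in the group. A brief illustration of what that means for callers, using the public ``groupby(...).nunique()`` entry point and the ``dropna`` parameter visible in the signature above; the frame and column names are hypothetical:

```python
import pandas as pd

# Hypothetical data: group "b" contains one missing value.
df = pd.DataFrame(
    {
        "key": ["a", "a", "b", "b", "b"],
        "val": [1.0, 1.0, 2.0, 3.0, None],
    }
)

# Default dropna=True: missing values are not counted as a distinct element.
counts = df.groupby("key")["val"].nunique()

# dropna=False: a missing value counts as one additional distinct element.
counts_with_na = df.groupby("key")["val"].nunique(dropna=False)

print(counts)
print(counts_with_na)
```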