WIP: implement reductions for DatetimeArray/TimedeltaArray/PeriodArray #23890

jbrockmendel · 2018-11-25T01:04:30Z

Some idiosyncrasies in signatures between Series vs Index make the parametrized test cases ugly. Do we want to adapt the Index reductions to have a skipna kwarg like the Series and EA methods?

Still needs tests for timedelta and period dtypes.


pep8speaks · 2018-11-25T01:04:46Z

Hello @jbrockmendel! Thanks for submitting the PR.

There are no PEP8 issues in the file pandas/core/arrays/datetimelike.py !
There are no PEP8 issues in the file pandas/core/indexes/period.py !
There are no PEP8 issues in the file pandas/core/nanops.py !
There are no PEP8 issues in the file pandas/core/series.py !
There are no PEP8 issues in the file pandas/tests/indexes/datetimes/test_ops.py !
There are no PEP8 issues in the file pandas/tests/series/test_analytics.py !

TomAugspurger

Is your intent to share the same implementation between the arrays and index classes? Having _make_reduction and _get_reduction_vals be top-level functions instead of bound methods is a bit unfortunate, but OK if we get to have one implementation.

TomAugspurger · 2018-11-25T13:39:43Z

pandas/core/arrays/datetimelike.py

+
+    def method(self, skipna=True, **kwargs):
+        if only_timedelta:
+            raise TypeError('"{meth}" reduction is not valid for {cls}'


I don't understand this. only_timedelta makes me think we shouldn't raise here if the array is timedelta dtype, but right now it looks like we raise unconditionally.

woops, that should be if only_timedelta and not is_timedelta64_dtype(self)
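A minimal sketch of the corrected guard, to make the intended behaviour concrete. `ToyArray`, `is_timedelta_like`, and `make_reduction` are hypothetical stand-ins rather than the PR's actual code, and `skipna` handling is left out:

```python
class ToyArray:
    """Hypothetical stand-in for a datetime-like array (not a pandas class)."""

    def __init__(self, values, kind):
        self.values = list(values)
        self.kind = kind  # "timedelta", "datetime", or "period"


def is_timedelta_like(arr):
    # Plays the role of is_timedelta64_dtype(self) in the review comment.
    return arr.kind == "timedelta"


def make_reduction(name, func, only_timedelta=False):
    """Build a reduction method; some reductions are valid only for timedeltas."""

    def method(self, skipna=True, **kwargs):
        # Raise only when the reduction is timedelta-only AND the array
        # is not timedelta-like, instead of raising unconditionally.
        if only_timedelta and not is_timedelta_like(self):
            raise TypeError('"{meth}" reduction is not valid for {cls}'
                            .format(meth=name, cls=type(self).__name__))
        # skipna is accepted for signature parity but ignored in this toy.
        return func(self.values)

    method.__name__ = name
    return method


ToyArray.sum = make_reduction("sum", sum, only_timedelta=True)
ToyArray.max = make_reduction("max", max)
```

With this guard, `sum` works on a timedelta-like `ToyArray` and raises `TypeError` on a datetime-like one, while `max` works on both.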

TomAugspurger · 2018-11-25T13:41:54Z

pandas/core/arrays/datetimelike.py

+        if vals is NaT:
+            return NaT
+
+        # Try to minimize floating point error by rounding before casting


Does DatetimeIndex do this casting?

The only reductions DatetimeIndex has are min and max, for which it is not relevant.
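For context on the "rounding before casting" comment in the diff: a float result that lands just below a whole number is truncated downward by a plain integer cast. The sketch below illustrates that general point with hypothetical helper names; it is not the code from the PR:

```python
def cast_truncating(x):
    """Cast a float to int by truncation, as a bare int cast does."""
    return int(x)


def cast_rounding(x):
    """Round to the nearest whole number first, then cast."""
    return int(round(x))


# A float that sits one representable step below 29, as can happen when
# a mean of integer timestamps is computed in floating point.
almost_29 = 28.999999999999996

truncated = cast_truncating(almost_29)  # drops the fractional part
rounded = cast_rounding(almost_29)      # recovers the intended integer
```

Here `truncated` ends up one unit too small, while `rounded` gives the intended value.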

pandas/core/series.py

TomAugspurger · 2018-11-25T13:45:34Z

pandas/core/nanops.py

@@ -460,6 +460,14 @@ def nanmean(values, axis=None, skipna=True, mask=None):
    elif is_float_dtype(dtype):
        dtype_sum = dtype
        dtype_count = dtype
+    elif is_datetime64_dtype(dtype) or is_datetime64tz_dtype(dtype):


is_datetime64_any_dtype I think.

Yah, but I'm kind of hoping to get rid of that

TomAugspurger · 2018-11-25T13:46:27Z

pandas/core/nanops.py

@@ -460,6 +460,14 @@ def nanmean(values, axis=None, skipna=True, mask=None):
    elif is_float_dtype(dtype):
        dtype_sum = dtype
        dtype_count = dtype
+    elif is_datetime64_dtype(dtype) or is_datetime64tz_dtype(dtype):
+        from pandas import DatetimeIndex


Why does this need to be boxed in an index? Shouldn't values be a DatetimeArray right now? Or is it not yet since we haven't implemented DTA as an extension array?

I don't know this module especially well, but my assumption is that it could be a numpy array at this point.

I think that ideally nanmean will only be called via Series.mean and our ExtensionArray's .mean methods. Though I may be missing some cases.

I suppose the exception will be Series[datetime64[ns]], which may pass an ndarray here...
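As a rough illustration of what a datetime mean has to do regardless of which container arrives here: drop the missing entries, average the integer offsets from an epoch, and convert back. This is a standard-library sketch in which `None` stands in for NaT; `datetime_mean` is a hypothetical name and this is not how nanops implements it:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
ONE_US = timedelta(microseconds=1)


def datetime_mean(values, skipna=True):
    """Mean of naive datetimes; None plays the role of NaT."""
    mask = [v is None for v in values]
    if any(mask) and not skipna:
        # A missing value propagates when skipna is False.
        return None
    kept = [v for v, missing in zip(values, mask) if not missing]
    if not kept:
        return None
    # Work on exact integer microseconds since the epoch.
    offsets = [(v - EPOCH) // ONE_US for v in kept]
    # Round before converting back to limit floating point error.
    mean_us = round(sum(offsets) / len(offsets))
    return EPOCH + timedelta(microseconds=mean_us)
```

For example, `datetime_mean([datetime(2018, 1, 1), None, datetime(2018, 1, 3)])` skips the missing entry and returns `datetime(2018, 1, 2)`; with `skipna=False` the same input returns `None`.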

jreback · 2018-12-03T01:33:43Z

pandas/core/arrays/datetimelike.py


 from pandas.tseries import frequencies
 from pandas.tseries.offsets import DateOffset, Tick

 from .base import ExtensionOpsMixin


+def _get_reduction_vals(obj, skipna):


you seem to be reinventing the wheel here. we already do all of this in nanops.py for timedelta. I am not sure how this should be integrated here, but this is not the way. (meaning re-write all of the code we already have)

Yes, this is definitely a Proof Of Concept. For now the main question for you is if you're on board with the idea of bringing the Index reduction signatures in line with everything else

yes about the signatures generally. but maybe let's start with min/max as more straightforward? To do what you are suggesting here will require using nanops (the interface can certainly be here), but the implementation is already there (in nanops).

TomAugspurger · 2018-12-08T11:56:56Z

pandas/core/base.py

@@ -795,7 +795,7 @@ def _ndarray_values(self):
    def empty(self):
        return not self.size

-    def max(self):
+    def max(self, skipna=True, axis=None):


In Series, these are all (self, axis=None, skipna=None, ...).
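The ordering difference matters for positional calls. A small sketch using toy classes that copy only the two parameter orders quoted above (`ToyIndex` and `ToySeries` are hypothetical, and the bodies are placeholders):

```python
import inspect


class ToyIndex:
    # Parameter order proposed in this diff.
    def max(self, skipna=True, axis=None):
        return ("skipna", skipna, "axis", axis)


class ToySeries:
    # Parameter order described for Series reductions.
    def max(self, axis=None, skipna=None):
        return ("skipna", skipna, "axis", axis)


def param_order(func):
    """Return the parameter names of a method, excluding self."""
    return [name for name in inspect.signature(func).parameters
            if name != "self"]


index_order = param_order(ToyIndex.max)
series_order = param_order(ToySeries.max)

# The same positional call binds the argument to different parameters.
positional_index = ToyIndex().max(False)
positional_series = ToySeries().max(False)

# Keyword calls are unambiguous under either ordering.
keyword_index = ToyIndex().max(skipna=False)
keyword_series = ToySeries().max(skipna=False)
```

With mismatched orders, `max(False)` sets `skipna` on one class and `axis` on the other, which is the kind of inconsistency that complicates shared parametrized tests.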

jreback · 2018-12-08T12:30:01Z

pandas/core/nanops.py

+        masked_vals = values
+        if mask is not None:
+            masked_vals = values[~mask]
+        the_mean = DatetimeIndex(masked_vals).mean(skipna=skipna)


pls try to follow the patterns in this module
_wrap_results exists for a reason

again i see lots of reinventing the wheel

Nothing in this PR has been updated since previous conversation. I’ll get around to this suggestion before too long.

jbrockmendel · 2019-01-13T23:42:17Z

Closing in favor of #24757.

jbrockmendel added 3 commits November 24, 2018 16:56

implement reductions for datetimelike dtypes

3c3c156

Merge branch 'master' of https://github.com/pandas-dev/pandas into re…

e8a5e8b

…duction2

comment for series case

167989c

gfyoung added the Datetime (Datetime data dtype) label Nov 25, 2018

TomAugspurger reviewed Nov 25, 2018

jbrockmendel added 2 commits November 27, 2018 17:15

standardize signatures on Index reductions

febaf67

Merge ../red into reduction2

e3fce02

jreback requested changes Dec 3, 2018

TomAugspurger mentioned this pull request Dec 4, 2018

REF: DatetimeLikeArray #24024

Merged

12 tasks

TomAugspurger reviewed Dec 8, 2018

jreback requested changes Dec 8, 2018

jbrockmendel mentioned this pull request Dec 15, 2018

standardize signature for Index reductions, implement nanmean for datetime64 dtypes #24293

Merged

4 tasks

jbrockmendel mentioned this pull request Jan 13, 2019

implement+test mean for datetimelike EA/Index/Series #24757

Merged

4 tasks

jbrockmendel closed this Jan 13, 2019

jbrockmendel deleted the reduction2 branch April 5, 2020 17:37
