-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
ENH: implement TimedeltaArray/TimedeltaIIndex sum, median, std #28165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a release note, and a couple test questions. Implementation looks good.
@pytest.mark.parametrize("skipna", [True, False]) | ||
def test_reductions_empty(self, name, skipna): | ||
tdi = pd.TimedeltaIndex([]) | ||
arr = tdi._data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_data
-> .array
?
@@ -384,6 +386,49 @@ def astype(self, dtype, copy=True): | |||
return self | |||
return dtl.DatetimeLikeArrayMixin.astype(self, dtype, copy=copy) | |||
|
|||
def sum( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annotations would be nice for new developments, especially where reasonably easy to add
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like mypy starts complaining about index.py if i add types here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's it saying?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$ mypy pandas/core/arrays/timedeltas.py
pandas/core/indexes/base.py:257: error: Missing return statement
If that’s it then just could just add return None at the end of that func
…Sent from my iPhone
On Aug 27, 2019, at 7:41 PM, jbrockmendel ***@***.***> wrote:
@jbrockmendel commented on this pull request.
In pandas/core/arrays/timedeltas.py:
> @@ -384,6 +386,49 @@ def astype(self, dtype, copy=True):
return self
return dtl.DatetimeLikeArrayMixin.astype(self, dtype, copy=copy)
+ def sum(
$ mypy pandas/core/arrays/timedeltas.py
pandas/core/indexes/base.py:257: error: Missing return statement
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Looks like that func is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this change TimedeltaIndex behavior at all? e.g. I think that these were implemented before? or were they not correct before?
assert isinstance(result, pd.Timedelta) | ||
assert result == expected | ||
|
||
result = arr.std(skipna=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you leave these here on purpose (as this tests median)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like a typo, will fix
ahh I see you are closing an issue (xref my last comment), can you add a whatsnew note describing the change. |
Do any of these (median) need docstrings? I believe that May need to add these to the |
this got lost but I think was ok, can you rebase and we'll look again. |
rebased |
thanks @jbrockmendel I guess this could have a whatsnew note, can you in future / other PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we follow the behaviour of numerical dtypes for empty arrays? (-> sum giving 0 instead of nan)
assert isinstance(result, pd.Timedelta) | ||
assert result == expected | ||
|
||
result = tdi.sum(min_count=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The min_count
was mainly added for the empty case, and for this it does not seem to work:
In [27]: tdi = pd.TimedeltaIndex([])
In [28]: tdi.sum()
Out[28]: NaT
In [29]: tdi.sum(min_count=1)
Out[29]: NaT
In [30]: tdi.sum(min_count=0)
Out[30]: NaT
If we follow the same behaviour as for numerical sum, the result should be 0 instead of NaT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right
The signatures match those in PandasArray.