-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG/ENH: Handle NonexistentTimeError in date rounding #23406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @mroeschke! Thanks for submitting the PR.
|
Codecov Report
@@ Coverage Diff @@
## master #23406 +/- ##
==========================================
+ Coverage 92.18% 92.19% +<.01%
==========================================
Files 161 161
Lines 51184 51192 +8
==========================================
+ Hits 47185 47194 +9
+ Misses 3999 3998 -1
Continue to review full report at Codecov.
|
result = getattr(ts, method)(freq, nonexistent='NaT') | ||
assert result is NaT | ||
|
||
with pytest.raises(pytz.NonExistentTimeError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything meaningful in the error message? (question applies for all of your tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this just raises NonExistentTimeError: [time that was nonexistent]
but I can test for the message as well.
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -227,6 +227,7 @@ Other Enhancements | |||
- :class:`Series` and :class:`DataFrame` now support :class:`Iterable` in constructor (:issue:`2193`) | |||
- :class:`DatetimeIndex` gained :attr:`DatetimeIndex.timetz` attribute. Returns local time with timezone information. (:issue:`21358`) | |||
- :meth:`round`, :meth:`ceil`, and meth:`floor` for :class:`DatetimeIndex` and :class:`Timestamp` now support an ``ambiguous`` argument for handling datetimes that are rounded to ambiguous times (:issue:`18946`) | |||
- :meth:`round`, :meth:`ceil`, and meth:`floor` for :class:`DatetimeIndex` and :class:`Timestamp` now support an ``nonexistent`` argument for handling datetimes that are rounded to nonexistent times (:issue:`22647`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an nonexistent
--> a nonexistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this requires a mini-section to explain behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can link the section in the timeseries.rst
that explains nonexistent
behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome. That works just as well.
|
||
with pytest.raises(pytz.NonExistentTimeError, | ||
message='2018-03-11 02:00:00'): | ||
getattr(s.dt, method)(freq, nonexistent='raise') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm almost inclined to have parameterization on nonexistent
, but up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm inclined to keep the format of this test. IMO parameterization over nonexistent would obfuscate the test too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. Just a thought. 👍
More general question, to what extent are we using pytest.raises
vs tm.assert_raises_regex
?
cc @jreback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok with pytest.raises
to the extent we are just checking the type of the error. so ok here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback : pytest.raises
is also checking the error message here...
(that's why I asked about this)
pandas/_libs/tslibs/conversion.pyx
Outdated
@@ -852,6 +851,8 @@ def tz_localize_to_utc(ndarray[int64_t] vals, object tz, object ambiguous=None, | |||
Py_ssize_t i, idx, pos, ntrans, n = len(vals) | |||
int64_t *tdata | |||
int64_t v, left, right, val, v_left, v_right | |||
int64_t remaining_minutes, new_local |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add with the previous line here
@@ -852,6 +851,8 @@ def tz_localize_to_utc(ndarray[int64_t] vals, object tz, object ambiguous=None, | |||
Py_ssize_t i, idx, pos, ntrans, n = len(vals) | |||
int64_t *tdata | |||
int64_t v, left, right, val, v_left, v_right | |||
int64_t remaining_minutes, new_local | |||
int delta_idx_offset, delta_idx | |||
ndarray[int64_t] result, result_a, result_b, dst_hours |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we use Py_ssize_t for indexers?
@@ -484,6 +484,17 @@ class NaTType(_NaT): | |||
- 'raise' will raise an AmbiguousTimeError for an ambiguous time | |||
|
|||
.. versionadded:: 0.24.0 | |||
nonexistent : 'shift', 'NaT', default 'raise' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if any way to share these doc-strings would be great, maybe templating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel do you happen to know why for some of these NaT methods we aren't just passing is Timestamp.[method].__doc__
instead of copying these docstrings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of the dependency structure. We import NaT in timestznps but not the other way around.
existing time | ||
- 'NaT' will return NaT where there are nonexistent times | ||
- 'raise' will raise an NonExistentTimeError if there are | ||
nonexistent times |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above about the doc-strings
Addressed your comments @jreback. I can make another issue between docstring templating between |
thanks @mroeschke yes would like to template the doc-strings to avoid duplication. |
…xamples * repo_org/master: (66 commits) CLN: doc string (pandas-dev#23469) DOC: Add cookbook entry for triangular correlation matrix (GH22840) (pandas-dev#23032) add number of Errors, Warnings to scripts/validate_docstrings.py (pandas-dev#23150) BUG: Allow freq conversion from dt64 to period (pandas-dev#23460) ENH: Add FrozenList.union and .difference (pandas-dev#23394) REF: cython cleanup, typing, optimizations (pandas-dev#23464) strictness and checks for Timedelta _simple_new (pandas-dev#23433) Fixing flake8 problems new to flake8 3.6.0 (pandas-dev#23472) DOC: Updating the docstring of Series.dot (pandas-dev#22890) TST: Fixturize series/test_analytics.py (pandas-dev#22755) BUG/ENH: Handle NonexistentTimeError in date rounding (pandas-dev#23406) PERF: speed up concat on Series by making _get_axis_number() a classmethod (pandas-dev#23404) REF: Remove DatetimelikeArrayMixin._shallow_copy (pandas-dev#23430) REF: strictness/simplification in DatetimeArray/Index _simple_new (pandas-dev#23431) REF: cython cleanup, typing, optimizations (pandas-dev#23456) TST: tweak Hypothesis configuration and idioms (pandas-dev#23441) BUG: fix HDFStore.append with all empty strings error (GH12242) (pandas-dev#23435) TST: Skip 32bit failing IntervalTree tests (pandas-dev#23442) BUG: Deprecate nthreads argument (pandas-dev#23112) style: fix import format at pandas/core/reshape (pandas-dev#23387) ...
git diff upstream/master -u -- "*.py" | flake8 --diff
Similar strategy as #22647, added a
nonexistent
keyword argument toround
,ceil
, andfloor
to control rounding when encountering aNonexistentTimeError
This is also fixes a bug in the
nonexistent='shift'
implementation in #22644 where dates with timezones with negative UTC offsets got shifted by an additional hour. This bug is naturally tested by these rounding tests