-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Revert handling of i8values to DatetimeIndex #24708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I'll take a look at Tom's edits in a little bit. #24686 should be good to go and handles the only part of |
Travis looks like just a complaint about stacklevel |
Does this correspond to one of the options in the table in #24559? |
The options in those tables represent the desired future behavior. This is changing the current behavior of DTI to match 0.23.4's behavior (i8 values are treated as wall times). We need to figure out
For the RC I only care about
|
Not directly, as this is mainly about reverting the behaviour to 0.23.4. So if we add a deprecation warning, that means that we choose one of the options where DatetimeIndex(int) is unix-epoch (that are still multiple options though). (I think that is one of the aspects we are most sure about / most agreement, but if needed, we can also only here do the revert, and defer adding a deprecation warning until we made a final discussion on which option to choose) |
Double post :-) Yes, agreed on the priorities for the RC. But so the deprecation warning still depends somewhat on which desired future behaviour we want. |
Agreed. We'll need to sort that out. I'm off for ~5 hours. Hopefully you all figure it out in the meantime :) |
Actually, I think the relevant decisions have been made for this PR, right? Does everyone agree with the following axiom?
In words, passing integer values and a timezone is equivalent to passing interger values, localizing to UTC, and converting to the desired tz? In other words, integer values are treated like UTC (unix epochs). All the options in |
On In [2]: pd.Index([946684800000000000]).astype('datetime64[ns, US/Central]')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in astype(self, dtype, copy)
1291 try:
-> 1292 return Index(self.values.astype(dtype, copy=copy), name=self.name,
1293 dtype=dtype)
TypeError: Invalid datetime unit in metadata string "[ns, US/Central]"
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-2-e1a69a3f7ce5> in <module>
----> 1 pd.Index([946684800000000000]).astype('datetime64[ns, US/Central]')
~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in astype(self, dtype, copy)
1294 except (TypeError, ValueError):
1295 msg = 'Cannot cast {name} to dtype {dtype}'
-> 1296 raise TypeError(msg.format(name=type(self).__name__, dtype=dtype))
1297
1298 def _to_safe_for_reshape(self):
TypeError: Cannot cast Int64Index to dtype datetime64[ns, US/Central] Does anyone object to doing essentially In [8]: pd.Index([946684800000000000]).astype('datetime64[ns]').tz_localize("UTC").tz_convert("US/Central")
Out[8]: DatetimeIndex(['1999-12-31 18:00:00-06:00'], dtype='datetime64[ns, US/Central]', freq=None) which is equivalent to the future behavior? |
No objections on my end. So will the |
In [2]: pd.Index([946684800000000000], dtype='datetime64[ns, US/Central]')
Out[2]: DatetimeIndex(['1999-12-31 18:00:00-06:00'], dtype='datetime64[ns, US/Central]', freq=None)
In [3]: pd.Index([946684800000000000]).astype('datetime64[ns]').tz_localize("UTC").tz_convert("US/Central")
Out[3]: DatetimeIndex(['1999-12-31 18:00:00-06:00'], dtype='datetime64[ns, US/Central]', freq=None) I just need to change the implementation to avoid the warning. This will introduce a temporary discrepancy between |
Codecov Report
@@ Coverage Diff @@
## master #24708 +/- ##
===========================================
- Coverage 92.38% 43.06% -49.32%
===========================================
Files 166 166
Lines 52310 52318 +8
===========================================
- Hits 48327 22532 -25795
- Misses 3983 29786 +25803
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #24708 +/- ##
==========================================
+ Coverage 92.38% 92.38% +<.01%
==========================================
Files 166 166
Lines 52321 52335 +14
==========================================
+ Hits 48338 48352 +14
Misses 3983 3983
Continue to review full report at Codecov.
|
See 1456699 for the astype changes. I'll push a whatsnew later tonight, unless anyone beats me to it. We have an outstanding TODO: Should |
Also need the filterwarnings to match our message, so we aren't accidentally silencing errors. |
I think there's a slight preference not to warn in this case. |
Updating UTC handling and writing a release note now. |
I don't think 90afdee will do this trick, but trying anyway. |
Just cloned, will take a look |
I'm guessing it's just strangeness with python 2's warning handling. The tests are fine on their own AFAICT. |
Got it locally. Just narrowing down now. |
|
6633344 may do it, not sure why we would need to skip that test on py2... |
Not fixed yet :/ |
|
I can reproduce all 7 failures on Linux with |
Narrowed it down to just |
That seemed to do it. Removing the hopefully unnecessary skip. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the docstring for the new param
Thanks. Merging in 1 hour if no objections. |
Should we maybe add some specific TODO statement in those places (eg in some astype related code) where you now did some extra if/else that can be cleaned up once deprecation is removed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very minor comments
def test_constructor_with_int_tz(self, klass, box, tz, dtype): | ||
# GH 20997, 20964 | ||
ts = Timestamp('2018-01-01', tz=tz) | ||
result = klass(box([ts.value]), dtype=dtype) | ||
expected = klass([ts]) | ||
assert result == expected | ||
|
||
# This is the desired future behavior | ||
@pytest.mark.xfail(reason="TBD", strict=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be determined, thought I suppose we've determined that the test is the desired future behavior.
Will fix those up and add some TODOs for when the deprecation is enforced. |
All green. Merging. |
@TomAugspurger Thanks! |
TODO:
Questions
pd.Index(i8).dtype(DatetimeTZDtype)
?int_as_wall_time
parameter, or @jbrockmendel's_from_sequence_dti
from jbrockmendel@635b267cc @jbrockmendel @jorisvandenbossche @jreback @mroeschke
xref #24559