Wrong hour of DST end for Europe TZ #11481
Comments
I think it is an issue in the datetime string parsing, instead of the underlying datetime type itself:
not a bug at all. you are just using this in an incorrect manner. you need to localize a time; simply constructing is exactly that, it just takes what you give it.
@jreback I agree that localizing is the better option, but I don't really see how the below can be correct:
1 o'clock should not become 2 o'clock, as both are in the same DST period, right?
you misunderstand what is happening. you are taking a naive time, and simply SETTING it to a tz. you HAVE to localize. If you don't, you get the CURRENT time zone.
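For concreteness, the two call patterns being contrasted here are sketched below; the outputs in the comments are the ones reported later in this thread for pandas 0.17.0, so this is only a sketch of the distinction, not a check against other versions.

import pandas as pd

# Passing tz= to the constructor: the tz is applied inside the constructor
# and the wall time ends up shifted (the behaviour under discussion).
pd.Timestamp('2015-10-25 01:00', tz='Europe/Prague')
# Timestamp('2015-10-25 02:00:00+0200', tz='Europe/Prague')   (as reported below)

# Constructing a naive Timestamp first and then localizing keeps the wall time.
pd.Timestamp('2015-10-25 01:00').tz_localize('Europe/Prague')
# Timestamp('2015-10-25 01:00:00+0200', tz='Europe/Prague')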
In any case, I agree this is not well documented. It is not really clear how the … BTW, … and … (there is actually nothing ambiguous to pick in this case, 1 am can only be 1 am)
@jreback It is true I don't really understand what is happening under the hood, but IMO this is not 'incorrect code' from a user perspective. This is perfectly allowed in our API (why is there otherwise a tz argument?)
I think this is using a different path for the localization when it is passing thru the Timestamp constructor. It should first create it as naive, THEN localize.
I suspect the path for a …
our DST experts …
Thanks for all comments - just from a user perspective - it would be quite weird if Timestamp(x, tz=tz) behaved differently from Timestamp(x).tz_localize(tz) - the result in both cases is a timestamp in the same tz, right? And it should behave correctly - e.g. the DST end is at 2 am in this time zone.
I agree that this is a bug. This is a fall transition, not a spring one, so unless there's political hoo-hah going on, this is an ambiguous time rather than a nonexistent one. I'm not sure which offset the result should have, but it should have the original local time in that offset. This is regardless of whether you follow the …
Also, I'm not sure whether I deserve the designation of "expert". I just have loud opinions. ;D
@ischwabacher hahah, ok, let's revise to 'interested' parties!
Also, I agree that this is a dupe of #8225.
@ischwabacher hmm, isn't #8225 about parsing though? (or is it really just that the parsing is ok, but it's localizing incorrectly?)
@ischwabacher note that the initial example in this thread is not an ambiguous time. '2015-10-25 01:00' is unambiguously defined, as the transition happens from 03:00 -> 02:00, so only the timestamps between 02:00 and 03:00 are ambiguous. As a side note, I would also prefer that …
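To illustrate this with pytz (which the original report says behaves as expected), here is a small sketch; the example times are chosen around the 2015 Europe/Prague transitions.

from datetime import datetime
import pytz
from pytz.exceptions import AmbiguousTimeError, NonExistentTimeError

tz = pytz.timezone('Europe/Prague')

# 01:00 on 2015-10-25 occurs only once (still CEST), so it localizes cleanly.
print(tz.localize(datetime(2015, 10, 25, 1, 0), is_dst=None))  # 2015-10-25 01:00:00+02:00

# 02:30 falls in the 02:00-03:00 window that occurs twice, so pytz refuses to guess.
try:
    tz.localize(datetime(2015, 10, 25, 2, 30), is_dst=None)
except AmbiguousTimeError:
    print('02:30 on 2015-10-25 is ambiguous')

# For contrast, the spring transition (2015-03-29, 02:00 -> 03:00) produces
# wall times that never exist at all.
try:
    tz.localize(datetime(2015, 3, 29, 2, 30), is_dst=None)
except NonExistentTimeError:
    print('02:30 on 2015-03-29 does not exist')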
@jorisvandenbossche I'm not sure about the default handling of an ambiguous time. The latest news from python-dev is PEP 495. One of the themes in the discussion of that PEP was avoiding raising AmbiguousTimeError/NonexistentTimeError. If python-dev wants to avoid exceptions in its environment of individual datetime instances, I think it's definitely worth avoiding in our vectorized context. (FWIW, I think they're wrong, but "they" here refers to Tim and Guido, so the degree of certainty I have in that belief is low.)
@jreback It's definitely a dupe:

In [1]: import pandas as pd
In [2]: pd.__version__
Out[2]: u'0.17.0'
In [3]: pd.Timestamp('2015-10-25 01:00', tz='Europe/Prague')
Out[3]: Timestamp('2015-10-25 02:00:00+0200', tz='Europe/Prague')
In [4]: pd.Timestamp('2015-10-25 1:00', tz='Europe/Prague')
Out[4]: Timestamp('2015-10-25 01:00:00+0200', tz='Europe/Prague')
In [5]: pd.Timestamp('2015-10-25 01:00').tz_localize('Europe/Prague')
Out[5]: Timestamp('2015-10-25 01:00:00+0200', tz='Europe/Prague')
In [6]: pd.Timestamp('2015-10-25 1:00').tz_localize('Europe/Prague')
Out[6]: Timestamp('2015-10-25 01:00:00+0200', tz='Europe/Prague')

Relevant: …
@ischwabacher right, so the parsing is a red herring. ok then!
actually I don't think the problem is parsing at all. It is a naive time when parsed. It is when assigning the tz that it is localized incorrectly, I think.
But given that the result is correct/incorrect depending on slight changes in the string format, it seems that it has something to do with the parsing (or with triggering another code route; so maybe the issue is not in the parsing itself, but only in one of the code paths, depending on which type of parsing was done).
Exactly. If you make the ISO 8601 parser give up, it passes it on to dateutil, which parses it correctly. But you can't get rid of the ISO 8601 parser, because you need a fast path for reading in large CSV files. But that third example baffles me. I hadn't noticed that before; it's exactly the …
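A quick check of the fallback described here; this only shows that dateutil itself treats both spellings identically, while the claim about which internal pandas path each string takes comes from the comments above and is not verified here.

from dateutil import parser

# dateutil parses the zero-padded and non-padded hour to the same naive datetime,
# so the differing pandas results above come from the code path taken
# (ISO 8601 fast path vs. dateutil fallback), not from dateutil itself.
print(parser.parse('2015-10-25 01:00'))  # 2015-10-25 01:00:00
print(parser.parse('2015-10-25 1:00'))   # 2015-10-25 01:00:00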
Further note that the route taken by …
hmm, these use exactly the same parser. the difference is that in the array processing (e.g. [61]), the tz is handled after, while in a Timestamp it is handled as the result of the …
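A sketch of the two routes being contrasted; the array example referred to as [61] is not shown above, so to_datetime followed by tz_localize is used here as a stand-in for the array route.

import pandas as pd

# Scalar route: the tz is applied inside the Timestamp constructor itself.
scalar = pd.Timestamp('2015-10-25 01:00', tz='Europe/Prague')

# Array route: the strings are parsed to naive datetimes first, and the tz is
# applied to the whole array afterwards, which is the ordering described above.
array = pd.to_datetime(['2015-10-25 01:00']).tz_localize('Europe/Prague')

print(scalar)
print(array[0])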
WTB: a debugger that runs the same code in parallel with different inputs and breaks when the code paths diverge.
I don't think that's it, despite the fact that it definitely looks like a bug, since:

In [15]: pd.Timestamp('2015-9-27 03:00:00', tz='Pacific/Chatham')
Out[15]: Timestamp('2015-09-27 03:00:00+1245', tz='Pacific/Chatham')
In [16]: pd.Timestamp('2015-9-27 3:00:00', tz='Pacific/Chatham')
Out[16]: Timestamp('2015-09-27 03:00:00+1245', tz='Pacific/Chatham')
In [17]: pd.Timestamp('2015-9-27 04:00:00', tz='Pacific/Chatham')
Out[17]: Timestamp('2015-09-27 04:00:00+1345', tz='Pacific/Chatham')
In [18]: pd.Timestamp('2015-9-27 4:00:00', tz='Pacific/Chatham')
Out[18]: Timestamp('2015-09-27 04:00:00+1345', tz='Pacific/Chatham')
As @jreback says, tz localization logic after …
xref #11708
In Europe, DST ends at 2 am (this year on Sunday 2015-10-25), when the clock is moved one hour back. However, in Timestamp this transition happens one hour earlier (at 1 am, which is the hour when the transition happens in the US, I think). pytz works as expected.
See the code below (tested in pandas 0.17.0 and 0.16.2):
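The snippet referred to here is not included in this copy of the issue; a minimal reproduction consistent with the outputs quoted in the comments above (pandas 0.17.0) would look roughly like this.

from datetime import datetime
import pytz
import pandas as pd

tz = pytz.timezone('Europe/Prague')

# pytz: 01:00 local time on 2015-10-25 is still CEST; the clocks only go
# back an hour later, at 03:00 CEST -> 02:00 CET.
print(tz.localize(datetime(2015, 10, 25, 1, 0)))  # 2015-10-25 01:00:00+02:00

# pandas 0.17.0: passing the tz to the constructor shifts the wall time
# to 02:00 (the behaviour reported in this issue).
print(pd.Timestamp('2015-10-25 01:00', tz='Europe/Prague'))  # 2015-10-25 02:00:00+02:00

# Localizing a naive Timestamp afterwards gives the expected result.
print(pd.Timestamp('2015-10-25 01:00').tz_localize('Europe/Prague'))  # 2015-10-25 01:00:00+02:00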