PERF: datetime index getters functions are 10 times slower with ZoneInfo vs pytz timezone #64379
kjmin622 wants to merge 8 commits into pandas-dev:main
pandas/_libs/tslibs/tzconversion.pyx
  self.deltas = deltas

- if typ != "pytz" and typ != "dateutil":
+ if typ not in ("pytz", "dateutil", "zoneinfo"):

nitpick: is this going to create an unnecessary python tuple object?
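
For plain CPython at least, an all-constant tuple in a membership test is stored once as a compile-time constant rather than rebuilt on every call. The check below is a minimal sketch of that; the function name `is_unsupported` is hypothetical, and this says nothing definitive about the C code Cython generates for the `.pyx` file, which would need to be inspected separately.

```python
import dis


def is_unsupported(typ):
    # Pure-Python mirror of the membership test from the diff (hypothetical helper).
    return typ not in ("pytz", "dateutil", "zoneinfo")


# In CPython the literal tuple lives in the code object's constants ...
tuple_is_constant = ("pytz", "dateutil", "zoneinfo") in is_unsupported.__code__.co_consts

# ... so the bytecode contains no instruction that builds a tuple at call time.
no_build_tuple = all(
    ins.opname != "BUILD_TUPLE" for ins in dis.get_instructions(is_unsupported)
)

print(tuple_is_constant, no_build_tuple)
```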
pandas/_libs/tslibs/timezones.pyx
  # Daylight Savings

+ cdef object _get_zoneinfo_trans_and_deltas(tzinfo tz):

can we re-use some of this in the treat_tz_as_dateutil path?

Couple of test failures to figure out, but this looks like the right approach.

@jbrockmendel Thank you for your review! I've applied your feedback.

Co-authored-by: mv-python <matusvalo@users.noreply.github.com>

This surprises me. Is there a minimal example?

  dateutil_tz : tzinfo
      A dateutil timezone object with _trans_list and _trans_idx attributes.
  first_offset_seconds : int64_t
      The UTC offset in seconds for the period before the first transition.

This is needed to address the dateutil-vs-zoneinfo discrepancy?
No, this is just extracting common logic into a single function. The discrepancy is addressed by the fallback logic in tzconversion.pyx (the part you asked me to add comments to).

      return utc_val + self.delta
  else:
      pos[0] = bisect_right_i8(self.tdata, utc_val, self.ntrans) - 1
      if self.use_zoneinfo_fallback and pos[0] == 0:

can you add comments explaining why this is necessary
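
As a rough illustration of the lookup in the hunk above: the index of the applicable offset is found by a right bisection over the sorted transition times, minus one, so index 0 means the value falls before the first real transition. The sketch below is pure Python with made-up transition data; `find_offset`, the sentinel first entry, and the `pre_transition_delta` override are assumptions for illustration and only one plausible shape for a fallback, not the actual pandas implementation.

```python
from bisect import bisect_right

# Hypothetical transition table. Entry 0 is a sentinel so that every value
# earlier than the first real transition maps to index 0.
SENTINEL = -(2**63) + 1
trans = [SENTINEL, 1_000, 2_000]   # transition instants (arbitrary units)
deltas = [7_820, 7_200, 10_800]    # UTC offset in effect from each instant


def find_offset(utc_val, pre_transition_delta=None):
    """Return the offset in effect at utc_val.

    If pre_transition_delta is given, it replaces the table's first entry
    for values that fall before the first real transition (pos == 0).
    """
    pos = bisect_right(trans, utc_val) - 1
    if pos == 0 and pre_transition_delta is not None:
        return pre_transition_delta
    return deltas[pos]


print(find_offset(500))          # uses the table's first entry
print(find_offset(500, 7_818))   # uses the override for the pre-transition period
print(find_offset(1_500, 7_818)) # after the first transition, the override is ignored
```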
@jbrockmendel

from zoneinfo import ZoneInfo
from dateutil.tz import gettz
from datetime import datetime

dt = datetime(1900, 1, 1)

# Africa/Lusaka: 2-second difference
print(ZoneInfo("Africa/Lusaka").utcoffset(dt))  # -> 2:10:18
print(gettz("Africa/Lusaka").utcoffset(dt))  # -> 2:10:20

# Africa/Sao_Tome: much larger difference
print(ZoneInfo("Africa/Sao_Tome").utcoffset(dt))  # -> -1 day, 23:23:15
print(gettz("Africa/Sao_Tome").utcoffset(dt))  # -> 0:26:56

Note that this discrepancy may not occur in all environments. Looking at the CI failure results, macOS passes while Ubuntu and Windows fail.
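
To check more zones or dates than the two cases above, a small comparison helper could be used. This is a minimal sketch: `offset_mismatches` is a hypothetical name, and the two fixed-offset `timezone` objects are stand-ins built from the Africa/Lusaka offsets reported above so that the snippet runs without any tz database installed.

```python
from datetime import datetime, timedelta, timezone


def offset_mismatches(tz_a, tz_b, datetimes):
    """Return (dt, offset_a, offset_b) for each datetime where the two
    tzinfo objects report different UTC offsets."""
    out = []
    for dt in datetimes:
        a = tz_a.utcoffset(dt)
        b = tz_b.utcoffset(dt)
        if a != b:
            out.append((dt, a, b))
    return out


# Stand-ins (assumption): fixed offsets equal to the two Africa/Lusaka values above.
tz_like_zoneinfo = timezone(timedelta(hours=2, minutes=10, seconds=18))
tz_like_dateutil = timezone(timedelta(hours=2, minutes=10, seconds=20))

mismatches = offset_mismatches(
    tz_like_zoneinfo, tz_like_dateutil, [datetime(1900, 1, 1)]
)
for dt, a, b in mismatches:
    print(dt, a, b, b - a)
```

In a real environment the stand-ins would be replaced with `ZoneInfo(name)` and `gettz(name)` for the same zone name; whether any mismatch shows up may depend on the platform, as noted above.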

@pganssle thoughts here on how to explain/handle the discrepancy?

I suspect this would also close #58962 |