BUG: DatetimeIndex from datetime.datetime shifted by n minutes #25897

Closed
GrivIN opened this issue Mar 27, 2019 · 2 comments

GrivIN commented Mar 27, 2019

Code Sample, a copy-pastable example if possible

import pandas as pd
import datetime as dt
import pytz

tz = pytz.timezone('America/New_York')

data1 = pd.DataFrame([
    {'date': dt.datetime(2019, 1, 1, tzinfo=tz), 'value': 10},
    {'date': dt.datetime(2019, 1, 1, tzinfo=tz), 'value': 10},
    {'date': dt.datetime(2019, 1, 2, tzinfo=tz), 'value': 20}
]).set_index('date')
print(data1.index)

# DatetimeIndex(['2018-12-31 23:56:00-05:00', '2018-12-31 23:56:00-05:00',
#                '2019-01-01 23:56:00-05:00'],
#               dtype='datetime64[ns, America/New_York]', name='date', freq=None)
# 

data2 = pd.DataFrame([
    {'date': pd.Timestamp(year=2019, month=1, day=1, tzinfo=tz), 'value': 10},
    {'date': pd.Timestamp(year=2019, month=1, day=1, tzinfo=tz), 'value': 10},
    {'date': pd.Timestamp(year=2019, month=1, day=2, tzinfo=tz), 'value': 20}
]).set_index('date')
print(data2.index)

# DatetimeIndex(['2019-01-01 00:00:00-05:00', '2019-01-01 00:00:00-05:00',
#                '2019-01-02 00:00:00-05:00'],
#               dtype='datetime64[ns, America/New_York]', name='date', freq=None)
# 

Problem description

Creating an index from Python datetime objects shifts the time by n minutes, depending on the chosen timezone.

Expected Output

The index created from datetime objects should be the same as the one created from Timestamp objects.

Similar Issues

#1676
#1790

Output of pd.show_versions()

INSTALLED VERSIONS

commit: ac318d2
python: 3.6.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-46-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.25.0.dev0+327.gac318d26c
pytest: None
pip: 9.0.1
setuptools: 39.0.1
Cython: 0.29.6
numpy: 1.16.2
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@Liam3851
Contributor

This call is an incorrect use of a pytz timezone:

dt.datetime(2019, 1, 1, tzinfo=tz)

It should be:

tz.localize(dt.datetime(2019, 1, 1))

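The latter call applies the TZ rules for the given date. As written, your original call interprets 2019-01-01 00:00 in New York local mean (solar) time, which is the default offset pytz attaches when a zone is passed directly as tzinfo, rather than in Eastern Standard Time (the time in effect in New York on that date), hence the 4-minute difference.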
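
To make the difference concrete, here is a minimal sketch comparing the two constructions without pandas. The variable names (`wrong`, `right`) are only for illustration, and the exact default offset pytz attaches is assumed to match the pytz version in the report above.

```python
import datetime as dt

import pytz

tz = pytz.timezone('America/New_York')

# Passing the pytz zone via tzinfo= skips pytz's rule lookup, so the
# datetime carries the zone's default offset (local mean time here).
wrong = dt.datetime(2019, 1, 1, tzinfo=tz)

# localize() looks up the offset in effect on that date (EST, UTC-05:00).
right = tz.localize(dt.datetime(2019, 1, 1))

print(wrong.utcoffset())  # default offset of the zone object
print(right.utcoffset())  # offset chosen by the TZ rules for this date
print(right - wrong)      # gap between the two instants
```

With the pytz version from the report, the gap between the two instants should be the 4 minutes seen in the shifted index.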

mroeschke added this to the No action milestone Mar 27, 2019