Timezone info lost when broadcasting scalar datetime to DataFrame #11682

Closed
ajenkins-cargometrics opened this issue Nov 23, 2015 · 2 comments
Labels: Bug · Indexing (related to indexing on series/frames, not to indexes themselves) · Timezones (timezone data dtype)

@ajenkins-cargometrics
Contributor

I've encountered a bug in pandas 0.16.2: when a datetime.datetime value is broadcast to a whole column of a DataFrame by assignment, the timezone info is lost. Here is an example:

In [1]: import pandas, datetime, pytz

In [2]: df = pandas.DataFrame({'a': [1,2,3]})

In [3]: dt = datetime.datetime.now(pytz.utc)

In [4]: dt.tzinfo
Out[4]: <UTC>

In [5]: df['b'] = dt

In [6]: df
Out[6]: 
   a                          b
0  1 2015-11-23 21:02:54.562175
1  2 2015-11-23 21:02:54.562175
2  3 2015-11-23 21:02:54.562175

In [7]: df['b'][0].tzinfo

Note how dt has a timezone attached, but the values in the 'b' column don't. The problem only occurs when broadcasting a scalar datetime to a column, not when assigning an array or series. Also, the problem only occurs with the builtin datetime.datetime class, not with pandas's Timestamp class.

I've tracked the problem down to the pandas.core.common._infer_dtype_from_scalar function, which is called during the assignment. It contains this code for handling scalar date times:

    elif isinstance(val, (np.datetime64, datetime)) and getattr(val,'tz',None) is None:
        val = lib.Timestamp(val).value
        dtype = np.dtype('M8[ns]')

The problem is that the Timestamp.value property returns an integer, which carries no timezone information, so the timezone is lost. This happens for datetime.datetime but not for pandas.Timestamp because the code checks the 'tz' attribute, which is specific to Timestamp. If the getattr call looked at the 'tzinfo' attribute instead, the check would work correctly for both pandas.Timestamp and datetime.datetime values. So a fix that works for both datetime and Timestamp would be:

    elif isinstance(val, (np.datetime64, datetime)) and getattr(val,'tzinfo',None) is None:
        val = lib.Timestamp(val).value
        dtype = np.dtype('M8[ns]')
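
The two points above can be probed with a minimal, standard-library-only sketch; it is not the pandas code itself. `FakeTimestamp`, `takes_naive_branch`, and `epoch_ns` are hypothetical names introduced only for illustration. `datetime.timezone.utc` stands in for `pytz.utc`, `np.datetime64` is left out, and `epoch_ns` is an assumed analogue of the integer that `Timestamp.value` is described as returning.

```python
from datetime import datetime, timedelta, timezone


class FakeTimestamp(datetime):
    """Hypothetical stand-in for pandas.Timestamp: a datetime subclass
    that exposes its timezone under the extra name `tz`."""

    @property
    def tz(self):
        return self.tzinfo


def takes_naive_branch(val, attr):
    # Mirrors only the condition quoted above; np.datetime64 is left out
    # so that the sketch needs nothing beyond the standard library.
    return isinstance(val, datetime) and getattr(val, attr, None) is None


def epoch_ns(val):
    # Integer nanoseconds since the Unix epoch, an assumed analogue of
    # what the report says Timestamp.value returns.
    delta = val - datetime(1970, 1, 1, tzinfo=timezone.utc)
    seconds = delta.days * 86400 + delta.seconds
    return seconds * 10**9 + delta.microseconds * 1000


aware_dt = datetime(2015, 11, 23, 21, 2, 54, 562175, tzinfo=timezone.utc)
aware_ts = FakeTimestamp(2015, 11, 23, 21, 2, 54, 562175, tzinfo=timezone.utc)
naive_dt = datetime(2015, 11, 23, 21, 2, 54, 562175)

# Point 1: the same instant in two zones maps to one integer, so the
# integer alone cannot say which zone it came from.
shifted_dt = aware_dt.astimezone(timezone(timedelta(hours=-5)))
same_integer = epoch_ns(aware_dt) == epoch_ns(shifted_dt)

# Point 2: which values fall into the timezone-dropping branch under
# each attribute name (order: aware datetime, aware stand-in, naive).
branch = {
    attr: [takes_naive_branch(v, attr) for v in (aware_dt, aware_ts, naive_dt)]
    for attr in ("tz", "tzinfo")
}

print(same_integer)
print(branch)
```

In this sketch, checking `'tz'` sends the timezone-aware `datetime.datetime` into the branch alongside the naive value, while checking `'tzinfo'` sends only the naive value there.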

I checked, and this bug still exists in the latest version of the pandas source. Nevertheless, here is the output of show_versions() on my machine:

In [8]: pandas.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.9.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.16.2
nose: 1.3.7
Cython: 0.23.2
numpy: 1.9.2
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 3.1.0
sphinx: 1.3.1
patsy: None
dateutil: 2.4.2
pytz: 2012c
bottleneck: None
tables: 3.2.1.1
numexpr: 2.4.4
matplotlib: 1.4.2
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.8
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
@jreback
Contributor

jreback commented Nov 24, 2015

this is covered by #11672
thanks for the report

@jreback jreback added Bug Indexing Related to indexing on series/frames, not to indexes themselves Timezones Timezone data dtype labels Nov 24, 2015
@jreback jreback added this to the 0.18.0 milestone Nov 24, 2015
@jreback
Contributor

jreback commented Nov 27, 2015

closed by #11672

@jreback jreback closed this as completed Nov 27, 2015