BUG: error in in _convert_to_indexer while using .loc with tz-aware DateTimeIndex #11679

Closed
lopezco opened this issue Nov 23, 2015 · 7 comments · Fixed by #21612
Labels: Bug; Indexing (related to indexing on series/frames, not to indexes themselves); Testing (pandas testing functions or related to the test suite); Timezones (timezone data dtype)
Milestone

Comments

lopezco commented Nov 23, 2015

addtl example / repro in #13908

Hello,

I encounter some problems using the data provided here: data.txt

When I tried to use .loc on my DateTimeIndex, I got the following error:

KeyError: "['2015-03-01T02:00:00.000000000+0100'] not in index"

Notice that the error occurs only if we have the DST changing time in the DateTimeIndex

While debugging I've found that the error comes from the _convert_to_indexer method at:

# indexing.py
mask = check == -1
if mask.any():
    raise KeyError('%s not in index' % objarr[mask])  # <-- the error is raised here
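
To make the quoted check easier to follow, here is a minimal sketch of the same pattern: look up labels, mark the ones that come back as -1, and raise a KeyError for them. It is a simplified stand-in, not the pandas source; the helper name `labels_to_positions` and the sample index are assumptions made for illustration.

```python
import pandas as pd


def labels_to_positions(index, labels):
    """Hypothetical helper: map labels to integer positions or raise KeyError."""
    labels = pd.Index(labels)
    positions = index.get_indexer(labels)  # -1 marks a label that was not found
    missing = positions == -1
    if missing.any():
        raise KeyError("%s not in index" % list(labels[missing]))
    return positions


# Assumed sample data: a small tz-aware hourly index.
index = pd.date_range("2015-03-01", periods=4, freq=pd.Timedelta(hours=1), tz="UTC")

# Labels taken from the index itself should always be found.
found = labels_to_positions(index, index[:2])

# A label that is genuinely absent triggers the KeyError path.
try:
    labels_to_positions(index, [pd.Timestamp("2016-01-01", tz="UTC")])
    error = None
except KeyError as exc:
    error = exc
```

The bug reported here is that labels taken directly from the index ended up on the KeyError path, which should only be reachable for labels that are really missing.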

Here is the copy-paste example using the data on the .txt attached:

import pandas as pd

data = pd.read_csv("data.txt", parse_dates={'time': ['Date']})
data.set_index('time', inplace=True)
data.index = data.index.tz_localize('Europe/Paris', ambiguous='infer').tz_convert('UTC')

for i in range(1, len(data)):
    try:
        data.loc[data.index[:i], 'value'] = -1
    except Exception as e:
        print i, e
        break
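
For readers without the attachment, the loop above can be sketched against synthetic data. This is only an approximation of the report: the generated hourly index (chosen to span the March 2015 DST change in Europe/Paris), the integer `value` column, and the helper name `assign_by_index_prefix` are all assumptions, and the contents of data.txt are not reproduced.

```python
import pandas as pd


def assign_by_index_prefix(frame, column, value):
    """Hypothetical helper: assign `value` via .loc for each growing index prefix.

    Returns None if every assignment succeeds, otherwise (i, exception)
    for the first prefix length that fails.
    """
    for i in range(1, len(frame)):
        try:
            frame.loc[frame.index[:i], column] = value
        except Exception as exc:
            return i, exc
    return None


# Assumed stand-in for data.txt: 48 hourly points around the DST change,
# created tz-aware in Europe/Paris and then converted to UTC.
index = pd.date_range(
    "2015-03-28", periods=48, freq=pd.Timedelta(hours=1), tz="Europe/Paris"
).tz_convert("UTC")
data = pd.DataFrame({"value": range(len(index))}, index=index)

failure = assign_by_index_prefix(data, "value", -1)
print(failure)
```

On a pandas version affected by this bug, `failure` would be expected to hold the first failing prefix length and the KeyError; on a version that includes the fix it should be None.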

Other important fact: the error occurs for pandas version 0.17.0 and 0.17.1. However, I've managed to make it work with an older version of pandas (0.16.2)

I hope this is explicit enough.

UPDATE

In[19]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-33-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: 1.3.7
pip: 7.1.2
setuptools: 18.5
Cython: 0.23.4
numpy: 1.10.1
scipy: 0.16.1
statsmodels: None
IPython: 4.0.0
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.4.6
matplotlib: 1.5.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None

jreback commented Nov 23, 2015

pls show a copy-pastable example, and pd.show_versions()

@lopezco lopezco changed the title error in in _convert_to_indexer while using .loc for DST changing times (with tz-aware DateTimeIndex) error in in _convert_to_indexer while using .loc with a DataFrame tha has DST changing times in its DateTimeIndex Nov 23, 2015
@briandavidgreen

I'm experiencing a similar issue, specifically when using a DateTimeIndex as a major_axis in a panel. Somewhere inside the pandas indexing, the dtype of a time-zone aware DateTimeIndex is being stripped away. Copy-paste example:
import pandas as pd

##########################################################################################
#Works
axis_0 = ['A','B','C']
axis_1 = pd.DatetimeIndex(start='2015-11-04',end='2015-11-05',freq='120T')
axis_2 = ['d','e','f']

normal_panel = pd.Panel(items = axis_0, major_axis = axis_1, minor_axis = axis_2)

normal_panel.ix[:,:,'d'] = 1.
normal_panel.ix[:,:,'e'] = normal_panel.ix[:,:,'d']

print normal_panel.ix[:,:,'e']

##########################################################################################
#Doesn't work
axis_0 = ['A','B','C']
axis_1 = pd.DatetimeIndex(start='2015-11-04',end='2015-11-05',freq='120T',tz='US/Central')
axis_2 = ['d','e','f']

tz_panel = pd.Panel(items = axis_0, major_axis = axis_1, minor_axis = axis_2)

tz_panel.ix[:,:,'d'] = 1.
tz_panel.ix[:,:,'e'] = tz_panel.ix[:,:,'d']

print tz_panel.ix[:,:,'e']
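
As a rough illustration of what "stripping" the time zone means, the sketch below converts a tz-aware index to its raw NumPy values and back. It does not show where pandas loses the dtype internally; it only shows that the raw values carry no time zone. The zone name "America/Chicago" is used as the canonical name for US/Central, and all variable names are assumptions.

```python
import numpy as np
import pandas as pd

# A tz-aware index like axis_1 in the failing example, built with date_range.
aware = pd.date_range(
    "2015-11-04", "2015-11-05", freq=pd.Timedelta(minutes=120), tz="America/Chicago"
)

raw = aware.values                # NumPy datetime64 values in UTC, no time zone attached
stripped = pd.DatetimeIndex(raw)  # rebuilding from raw values gives a tz-naive index

print(aware.dtype)   # tz-aware dtype
print(raw.dtype)     # plain datetime64 dtype
print(stripped.tz)   # None

# Looking the index up in itself finds every label.
positions = aware.get_indexer(aware)

# The time zone has to be re-attached explicitly to get back to the original labels.
restored = stripped.tz_localize("UTC").tz_convert("America/Chicago")
```

If an indexing code path works on the raw values and compares them against the tz-aware index, the labels no longer line up, which is consistent with the "not in index" error reported in this thread.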

@briandavidgreen

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.17.0
nose: 1.3.7
pip: 7.1.2
setuptools: 18.4
Cython: 0.23.4
numpy: 1.10.1
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 4.0.0
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.4.3
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None

@briandavidgreen

Similarly, I am able to reproduce JoseRLC's issue when slicing on the major_axis:

tz_panel.ix[:,tz_panel.major_axis[0:1],'d'] = tz_panel.ix[:,tz_panel.major_axis[0:1],'e']

KeyError Traceback (most recent call last)
in ()
----> 1 tz_panel.ix[:,tz_panel.major_axis[0:1],'d'] = tz_panel.ix[:,tz_panel.major_axis[0:1],'e']

C:\Anaconda2\lib\site-packages\pandas\core\indexing.pyc in __setitem__(self, key, value)
112
113 def __setitem__(self, key, value):
--> 114 indexer = self._get_setitem_indexer(key)
115 self._setitem_with_indexer(indexer, value)
116

C:\Anaconda2\lib\site-packages\pandas\core\indexing.pyc in _get_setitem_indexer(self, key)
104
105 if isinstance(key, tuple) and not self.ndim < len(key):
--> 106 return self._convert_tuple(key, is_setter=True)
107
108 try:

C:\Anaconda2\lib\site-packages\pandas\core\indexing.pyc in _convert_tuple(self, key, is_setter)
153 else:
154 for i, k in enumerate(key):
--> 155 idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
156 keyidx.append(idx)
157 return tuple(keyidx)

C:\Anaconda2\lib\site-packages\pandas\core\indexing.pyc in _convert_to_indexer(self, obj, axis, is_setter)
1119 mask = check == -1
1120 if mask.any():
-> 1121 raise KeyError('%s not in index' % objarr[mask])
1122
1123 return _values_from_object(indexer)

KeyError: "['2015-11-04T00:00:00.000000000-0600'] not in index"

@lopezco lopezco changed the title error in in _convert_to_indexer while using .loc with a DataFrame tha has DST changing times in its DateTimeIndex error in in _convert_to_indexer while using .loc with DateTimeIndex Nov 24, 2015
@briandavidgreen

@JoseRLC issue title should probably say "...with tz_aware DateTimeIndex" The issue does not occur for non-tz_aware indices.

@lopezco lopezco changed the title error in in _convert_to_indexer while using .loc with DateTimeIndex error in in _convert_to_indexer while using .loc with tz-aware DateTimeIndex Nov 24, 2015

lopezco commented Nov 24, 2015

@briandavidgreen you're right. Sorry!

@jreback jreback added this to the 0.18.1 milestone Mar 1, 2016
@jreback jreback modified the milestones: 0.18.2, 0.18.1 Apr 25, 2016
@jreback jreback added the Indexing Related to indexing on series/frames, not to indexes themselves label Aug 4, 2016
@jreback jreback changed the title error in in _convert_to_indexer while using .loc with tz-aware DateTimeIndex BUG: error in in _convert_to_indexer while using .loc with tz-aware DateTimeIndex Aug 4, 2016
@jorisvandenbossche jorisvandenbossche modified the milestones: Next Major Release, 0.19.0 Aug 21, 2016

jreback commented Jun 28, 2017

originally I linked #13908, which has an example, to this one. It seems fixed on master, so it would be great if someone could check whether this issue is closable as well
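
One way to check this is a small regression-style sketch that runs the same .loc assignment against a tz-naive and a tz-aware index and compares the results. It uses a DataFrame rather than a Panel, so it is not the exact case from the thread; the helper name `set_first_rows`, the column names, and the zone name "America/Chicago" (the canonical name for US/Central) are assumptions.

```python
import pandas as pd
import pandas.testing as tm


def set_first_rows(index, n):
    """Hypothetical helper: copy column 'e' into 'd' for the first n index labels."""
    frame = pd.DataFrame({"d": 0.0, "e": 1.0}, index=index)
    frame.loc[index[:n], "d"] = frame.loc[index[:n], "e"]
    return frame


naive = pd.date_range("2015-11-04", "2015-11-05", freq=pd.Timedelta(minutes=120))
aware = naive.tz_localize("America/Chicago")

naive_result = set_first_rows(naive, 1)
aware_result = set_first_rows(aware, 1)

# Apart from the index itself, both frames should hold the same values.
tm.assert_frame_equal(
    aware_result.reset_index(drop=True), naive_result.reset_index(drop=True)
)
```

If the assertion passes on the version under test, label-based assignment behaves the same for tz-aware and tz-naive indexes in this DataFrame case.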

@TomAugspurger TomAugspurger added the Testing pandas testing functions or related to the test suite label Jun 29, 2017
@jreback jreback modified the milestones: Next Major Release, 0.24.0 Jun 25, 2018
5 participants