
multi-index indexing for 3-level index behaving mysteriously. #2646


Closed
michaelaye opened this issue Jan 6, 2013 · 9 comments · Fixed by #8526
Labels
API Design, Indexing (related to indexing on series/frames, not to indexes themselves), MultiIndex
Comments

@michaelaye
Contributor

This seems to be solvable by using df.sortlevel, but Wesley encouraged me to submit it for further investigation.
I wrote a pretty detailed investigation here:

http://nbviewer.ipython.org/4465051/

with the gist

https://gist.github.com/4465051

In short:
Two-level indexing on a 3-level MultiIndex fails for me when one of the first two levels has more than 9 entries and sortlevel has not been called.
For example:

df.ix['c1','det4'] works as long as the level sizes of the MultiIndex are at most (9,9,x), with x arbitrary.
But if the level sizes are (10,9,x), I am forced to call sortlevel(0) to make it work (see notebook).
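
One possible reason for the threshold at 9, which is not confirmed anywhere in this thread: if the labels are generated as strings such as 'c1' ... 'c10' (an assumption based on the 'c1' and 'det4' keys above), creation order stops matching lexicographic order as soon as a two-digit label appears. A minimal sketch in plain Python, without pandas; the helper names are made up for illustration:

```python
def labels(prefix, n):
    """Build hypothetical labels such as 'c1', 'c2', ..., 'cN'."""
    return [f"{prefix}{i}" for i in range(1, n + 1)]


def in_lexicographic_order(items):
    """True when the items are already in sorted (string) order."""
    return items == sorted(items)


nine = labels("c", 9)
ten = labels("c", 10)

print(in_lexicographic_order(nine))  # True
print(in_lexicographic_order(ten))   # False: 'c10' sorts before 'c2'
print(sorted(ten)[:3])               # ['c1', 'c10', 'c2']
```

If that is what happens here, an index built in creation order would count as unsorted from the tenth label onward, which would fit the observation that sorting makes the lookup work again.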

@michaelaye
Contributor Author

Just a comment: the issue is not that I did not make tuples out of the lev1 and lev2 indices. I checked whether it makes any difference, and all tests behave exactly the same with and without tupling them.
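
This matches how Python itself handles subscripts: `obj[a, b]` and `obj[(a, b)]` both pass the same tuple to `__getitem__`, so explicit tupling cannot change the lookup. A small sketch with a hypothetical `Recorder` class (not part of pandas) that simply echoes back the key it receives:

```python
class Recorder:
    """Hypothetical helper: returns whatever key __getitem__ receives."""

    def __getitem__(self, key):
        return key


r = Recorder()
bare = r["c1", "det4"]       # comma-separated subscript
tupled = r[("c1", "det4")]   # explicit tuple

print(bare == tupled, type(bare).__name__)  # True tuple
```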

@ghost ghost assigned wesm Jan 20, 2013
@wesm
Member

wesm commented Jan 20, 2013

The indexing code needs to be refactored to work (but perhaps not be fast) in the unsorted case. Punting til 0.10.2.
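
As a rough illustration of that trade-off (a sketch in plain Python, not the pandas implementation): on sorted keys a partial key can be located with binary search, while on unsorted keys a linear scan still gives the right answer at O(n) cost. The function names and sample keys below are made up for this example.

```python
from bisect import bisect_left


def find_prefix_scan(keys, prefix):
    """Positions of all keys starting with prefix; works in any order, O(n)."""
    k = len(prefix)
    return [i for i, key in enumerate(keys) if key[:k] == prefix]


def find_prefix_sorted(keys, prefix):
    """Same result via binary search; only valid when keys are sorted."""
    k = len(prefix)
    # A shorter tuple sorts before all of its extensions, so bisect_left
    # lands on the first key carrying this prefix (if there is one).
    i = bisect_left(keys, prefix)
    hits = []
    while i < len(keys) and keys[i][:k] == prefix:
        hits.append(i)
        i += 1
    return hits


keys_sorted = [("c1", "det1", 0), ("c1", "det2", 0),
               ("c1", "det2", 1), ("c2", "det1", 0)]
keys_unsorted = [keys_sorted[i] for i in (3, 1, 0, 2)]

print(find_prefix_sorted(keys_sorted, ("c1", "det2")))   # [1, 2]
print(find_prefix_scan(keys_unsorted, ("c1", "det2")))   # [1, 3]
```

The scan is slower but never depends on ordering, which is the "work, but perhaps not be fast" behaviour described above.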

@ghost

ghost commented Mar 30, 2013

Simpler repro for the first case in the nb:

In [34]: from pandas.util.testing import makeCustomDataframe as mkdf
    ...: df=mkdf(1000,3,r_idx_nlevels=3,r_ndupe_l=[10,10])
    ...: df.ix['R_l0_g19','R_l2_g191']
/home/user1/src/pandas/pandas/core/indexing.py in _getitem_lowerdim(self, tup)
    315                 # raise the error if we are not sorted
    316                 if not ax0.is_lexsorted_for_tuple(tup):
--> 317                     raise e1
    318                 try:
    319                     loc = ax0.get_loc(tup[0])

KeyError: 'MultiIndex lexsort depth 0, key was length 2'

loc also fails:

df.loc['R_l0_g19','R_l2_g191']
KeyError: 'MultiIndex lexsort depth 0, key was length 2'

This also occurs for a 2-level MultiIndex:

df=mkdf(1000,2,r_idx_nlevels=2,r_ndupe_l=[20,20])
df.ix['R_l0_g19','R_l1_g19']

KeyError: 'MultiIndex lexsort depth 0, key was length 2'
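
The message refers to the index's lexsort depth: roughly, how many leading levels are in sorted order. In the tracebacks above, a key of length 2 exceeds a depth of 0, and the lookup raises. A minimal plain-Python sketch of the idea follows; `lexsort_depth` here is an illustrative function, not the pandas internals, and the sample rows are made up.

```python
def lexsort_depth(rows):
    """Largest k such that the rows, cut to their first k fields, never decrease."""
    if not rows:
        return 0
    nlevels = len(rows[0])
    for k in range(nlevels, 0, -1):
        heads = [row[:k] for row in rows]
        if all(a <= b for a, b in zip(heads, heads[1:])):
            return k
    return 0


fully_sorted = [("a", 1), ("a", 2), ("b", 1)]
first_level_only = [("a", 2), ("a", 1), ("b", 1)]
unsorted = [("b", 1), ("a", 1)]

print(lexsort_depth(fully_sorted))      # 2
print(lexsort_depth(first_level_only))  # 1
print(lexsort_depth(unsorted))          # 0
```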

@jreback
Contributor

jreback commented Mar 30, 2013

So is indexing on an unsorted MultiIndex a feature or a bug?

@wesm
Member

wesm commented Apr 12, 2013

Punting again til 0.12. Several people have reported this bug to me but I don't think it's worth blocking 0.11 RC

@jtratner
Contributor

jtratner commented Oct 1, 2013

General bump - this still doesn't work in 0.12 or current master, but one of the error messages has changed from KeyError to IndexingError (complaining about too many indexers).

@michaelaye
Contributor Author

This time for 0.13?

@jreback
Contributor

jreback commented Nov 27, 2013

The indexing code is getting a facelift in 0.14, so this is better handled then.

@jtratner
Contributor

@michaelaye - as @jreback says, we're cleaning up indexing code next release. I'll definitely keep this bug in mind and make sure to get to it soon.


4 participants