multi-index indexing for 3-level index behaving mysteriously. #2646
Comments
Just a comment: the issue is not that I failed to make tuples out of the lev1 and lev2 indices. I checked whether it makes any difference, and all tests behave exactly the same with and without tupling them. |
The indexing code needs to be refactored to work (though perhaps not quickly) in the unsorted case. Punting until 0.10.2. |
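To illustrate what "work, but perhaps not be fast" could mean for an unsorted index, here is a minimal plain-Python sketch. It is not pandas code: `find_prefix` and the row data are hypothetical names made up for this example. It uses binary search when the key prefixes are sorted and falls back to a linear scan when they are not.

```python
from bisect import bisect_left, bisect_right


def find_prefix(rows, key):
    """Return positions of rows whose leading entries equal `key`.

    `rows` is a list of equal-length tuples standing in for a multi-level index.
    """
    k = len(key)
    prefixes = [row[:k] for row in rows]
    # A real index would cache this sortedness check instead of redoing it.
    prefixes_sorted = all(a <= b for a, b in zip(prefixes, prefixes[1:]))
    if prefixes_sorted:
        # Sorted case: matching rows form one contiguous block.
        lo = bisect_left(prefixes, key)
        hi = bisect_right(prefixes, key)
        return list(range(lo, hi))
    # Unsorted case: slower, but still returns the correct answer.
    return [i for i, prefix in enumerate(prefixes) if prefix == key]


unsorted_rows = [("b", "x", 1), ("a", "y", 2), ("b", "x", 3), ("a", "x", 4)]
print(find_prefix(unsorted_rows, ("b", "x")))
print(find_prefix(sorted(unsorted_rows), ("b", "x")))
```

The point of the sketch is only that an unsorted lookup can return the same rows as a sorted one, at the cost of scanning every entry.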
Simpler repro for the first case in the nb:
In [34]: from pandas.util.testing import makeCustomDataframe as mkdf
    ...: df=mkdf(1000,3,r_idx_nlevels=3,r_ndupe_l=[10,10])
    ...: df.ix['R_l0_g19','R_l2_g191']
/home/user1/src/pandas/pandas/core/indexing.py in _getitem_lowerdim(self, tup)
    315 # raise the error if we are not sorted
    316 if not ax0.is_lexsorted_for_tuple(tup):
--> 317 raise e1
    318 try:
    319 loc = ax0.get_loc(tup[0])
KeyError: 'MultiIndex lexsort depth 0, key was length 2'
loc also fails:
df.loc['R_l0_g19','R_l2_g191']
KeyError: 'MultiIndex lexsort depth 0, key was length 2'
This also occurs for a 2-level MultiIndex:
df=mkdf(1000,2,r_idx_nlevels=2,r_ndupe_l=[20,20])
df.ix['R_l0_g19','R_l1_g19']
KeyError: 'MultiIndex lexsort depth 0, key was length 2' |
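The error message compares a "lexsort depth" with the key length. As a rough illustration of what that number measures, here is a plain-Python sketch that computes it for a list of tuples. The `lexsort_depth` function below is a hypothetical helper written for this example, not the pandas implementation, and the sample rows are made up.

```python
def lexsort_depth(rows):
    """Return the largest k such that rows, cut to their first k entries,
    are already in non-decreasing lexicographic order."""
    if not rows:
        return 0
    nlevels = len(rows[0])
    for k in range(nlevels, 0, -1):
        prefixes = [row[:k] for row in rows]
        if all(a <= b for a, b in zip(prefixes, prefixes[1:])):
            return k
    return 0


fully_sorted = [("a", "x"), ("a", "y"), ("b", "x")]
first_level_only = [("a", "y"), ("a", "x"), ("b", "x")]
not_sorted = [("c1", "d1"), ("c2", "d1"), ("c10", "d1")]  # "c10" < "c2" as strings

print(lexsort_depth(fully_sorted))
print(lexsort_depth(first_level_only))
print(lexsort_depth(not_sorted))
```

Under this reading, a key of length 2 needs a depth of at least 2, so a depth of 0 means that not even the first level is in sorted order.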
So is indexing on an unsorted MultiIndex a feature or a bug? |
Punting again til 0.12. Several people have reported this bug to me but I don't think it's worth blocking 0.11 RC |
General bump - this still doesn't work in 0.12 or current master, but one of the error messages has changed from KeyError to IndexingError (complaining about too many indexers). |
This time for 0.13? |
The indexing code is getting a facelift in 0.14, so this is better handled then. |
@michaelaye - as @jreback says, we're cleaning up indexing code next release. I'll definitely keep this bug in mind and make sure to get to it soon. |
This seems to be solvable by using df.sortlevel, but Wesley encouraged me to submit it for further investigation.
I wrote a pretty detailed investigation here:
http://nbviewer.ipython.org/4465051/
with the gist
https://gist.github.com/4465051
In short:
2-level indexing on a 3-level multi-index fails for me when the size of one of the first 2 levels is larger than 9 and sortlevel has not been used.
For example:
df.ix['c1','det4'] works as long as the level sizes of the multi-index are at most (9,9,x), with x being anything.
But if the level sizes are (10,9,x), I am forced to use sortlevel(0) to make it work (see notebook).
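The sorting workaround can be sketched with current pandas, where `sort_index()` and `.loc` stand in for the `sortlevel` and `.ix` calls used in this thread. The labels `c1`..`c10`, `det1`..`det9`, the third level, and the `value` column are assumptions made for illustration; the notebook's actual data is not reproduced here. As strings, `"c10"` sorts before `"c2"`, so ten labels in numeric order are not lexsorted, which may be related to the 9-versus-10 threshold described above, although the thread does not confirm this.

```python
import itertools

import pandas as pd

# Hypothetical labels: 10 x 9 x 2 entries, in numeric (not lexicographic) order.
lev0 = [f"c{i}" for i in range(1, 11)]
lev1 = [f"det{i}" for i in range(1, 10)]
lev2 = ["x", "y"]

index = pd.MultiIndex.from_tuples(
    list(itertools.product(lev0, lev1, lev2)),
    names=["lev0", "lev1", "lev2"],
)
df = pd.DataFrame({"value": range(len(index))}, index=index)

# The index as built is not sorted, because "c10" < "c2" as strings.
print(df.index.is_monotonic_increasing)

# Sorting the index first makes the 2-level lookup on a 3-level index safe.
sorted_df = df.sort_index()
print(sorted_df.index.is_monotonic_increasing)

subset = sorted_df.loc[("c1", "det4")]
print(subset)
```

Sorting changes the row order but keeps each value attached to its original labels, so the selected rows are the same ones the unsorted frame holds under `('c1', 'det4')`.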