Allow level wildcard via slice(None) in df.ix[] with MultiIndex #2425

timcera · 2012-12-04T17:01:51Z

import pandas as pd
from pandas import DataFrame
from numpy.random import randn

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'], ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = DataFrame(randn(3, 8), index=['A', 'B', 'C'], columns=index)
df = df.T  # transpose so the MultiIndex ends up on the rows

The following works:

df.A.ix[slice(None), 'two']

first
bar      0.233442
baz     -1.551486
foo      0.175177
qux      1.524440
Name: A

It is very possible that this is something about a MultiIndex DataFrame that I don't understand, but I think the following should work:

df.ix[slice(None), 'two']

Instead of the expected:

first
bar      0.233442  0.359638  2.061794
baz     -1.551486 -1.888951  0.172784
foo      0.175177 -1.085777 -0.988670
qux      1.524440  0.070535  1.006851

I get a KeyError:

...
/sjr/beodata/local/python_linux/lib/python2.6/site-packages/pandas/core/internals.pyc in _check_have(self, item)
949 def _check_have(self, item):
950 if item not in self.items:
--> 951 raise KeyError('no item named %s' % com.pprint_thing(item))
952
953 def reindex_axis(self, new_axis, method=None, axis=0, copy=True):

KeyError: u'no item named two'


ghost · 2012-12-06T07:03:02Z

AFAICT from the code, the arguments to ix correspond to the axes, so you're asking for all rows
and the column "two", which fails.
Case in point:

In [4]: df.ix[:]
Out[4]: 
                     A         B         C
first second                              
bar   one    -0.720545 -0.382630  0.573031
      two    -0.263034  0.462324 -0.126281
baz   one     1.676899 -0.660316  1.216486
      two    -0.343970  0.234571  0.347938
foo   one    -0.563490 -1.136923 -0.450143
      two    -1.209016  0.044605  0.879672
qux   one    -0.276785  0.563070 -0.133299
      two    -0.449211  0.545187 -0.869852

In [6]: df.ix[:,('A','B')]
Out[6]: 
                     A         B
first second                    
bar   one    -0.720545 -0.382630
      two    -0.263034  0.462324
baz   one     1.676899 -0.660316
      two    -0.343970  0.234571
foo   one    -0.563490 -1.136923
      two    -1.209016  0.044605
qux   one    -0.276785  0.563070
      two    -0.449211  0.545187

This also works

In [10]: df.ix[('foo','two'),:]
Out[10]: 
A   -1.209016
B    0.044605
C    0.879672
Name: (foo, two)

This however doesn't, which is disappointing:

In [7]: df.ix[(slice(None),'two')]
/home/user1/src/pandas/pandas/core/internals.pyc in _check_have(self, item)
   1002     def _check_have(self, item):
   1003         if item not in self.items:
-> 1004             raise KeyError('no item named %s' % com.pprint_thing(item))
   1005 
   1006     def reindex_axis(self, new_axis, method=None, axis=0, copy=True):

KeyError: u'no item named two'

and maybe that is a bug.

If your question comes from a real need,

In [7]: df.xs('two',level=1)
Out[7]: 
              A         B         C
first                              
bar   -0.677739 -0.740875  0.072675
baz    0.061356 -1.522032  1.084492
foo   -0.124634 -2.342294 -0.625460
qux   -0.647809 -0.051477  0.724003

will get you there, until this is otherwise addressed.
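For reference, a self-contained sketch of the xs workaround above. The frame is built with deterministic values instead of randn so the result is easy to check, and from_arrays is used in place of from_tuples purely for brevity. Both choices are assumptions for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])

# Deterministic stand-in data (0.0 .. 23.0) rather than random numbers
df = pd.DataFrame(np.arange(24.0).reshape(8, 3), index=index, columns=['A', 'B', 'C'])

# Cross-section: keep every row whose 'second' level equals 'two'.
# The level can be given by position or by name; it is dropped from the result.
by_position = df.xs('two', level=1)
by_name = df.xs('two', level='second')
```

The result is indexed by 'first' only, which matches the shape of the expected output in the original report.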

p.s. I think you may have mixed up columns and index in your example.

timcera · 2012-12-06T13:45:51Z

Fixed my example about the column/row confusion - forgot a command during copy/paste...

timcera · 2012-12-06T14:21:52Z

I have a 5-level MultiIndex DataFrame and want to select rows by wildcarding one or more of the levels. I figured out a solution yesterday, outside of the pandas framework, by using 'get_level_values' for each level to build a pseudo database. It works, and is actually fast enough.

When trying to use native pandas I was hoping that if 'df.ix' supported 'slice(None)' I could use it for my DataFrame as something like:

df.ix[('PERLND', 112, 'PWATER', slice(None), 2)]

Which would wildcard the fourth level - right? If that makes sense, that is what I think I want.
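A rough sketch of how this kind of selection can be written in later pandas releases, where .ix no longer exists: .loc with pd.IndexSlice accepts a bare slice as a per-level wildcard, and xs accepts a list of levels to fix. The frame below is hypothetical. The level names, the single 'value' column, and every label other than 'PERLND', 112, 'PWATER', and 2 are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical five-level index standing in for the real data
index = pd.MultiIndex.from_product(
    [['IMPLND', 'PERLND'], [111, 112], ['PWATER', 'SEDMNT'], ['x', 'y', 'z'], [1, 2]],
    names=['op', 'num', 'group', 'var', 'lev'])

# Slicing on a MultiIndex needs a sorted index, so sort it explicitly
df = pd.DataFrame({'value': np.arange(len(index), dtype=float)}, index=index).sort_index()

# Wildcard the fourth level with a bare slice; the trailing ':' selects all columns
idx = pd.IndexSlice
wild = df.loc[idx['PERLND', 112, 'PWATER', :, 2], :]

# Alternative: fix levels 0, 1, 2 and 4 with xs, leaving the fourth level free.
# xs drops the fixed levels, so the result is indexed by the fourth level only.
same = df.xs(('PERLND', 112, 'PWATER', 2), level=[0, 1, 2, 4])
```

Both selections return the same three rows here, one per label in the fourth level.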

ghost · 2012-12-12T14:29:11Z

related #1766

jreback · 2013-12-18T19:59:09Z

closing in favor of #4036

jreback mentioned this issue Sep 21, 2013

Method for df #1766

Closed

jreback closed this as completed Dec 18, 2013
