Boolean indexer assignment error #6039

Closed
richardwu0 opened this issue Jan 22, 2014 · 5 comments
@richardwu0

I'm new to Python and pandas. Can someone help me understand the following? After installing pandas 0.13.0, the code below crashes Python. I tested it in pandas 0.12.0, where it works fine.

Thank you!

import pandas as pd
df = pd.DataFrame(range(100), columns=['a'])
msk = df['a']<5
df['a']=df['a'].map(str)
df['a'][msk]='1'
print msk

Here are the current package versions:

from pandas.util.print_versions import show_versions
show_versions()


INSTALLED VERSIONS
------------------
Python: 2.7.6.final.0
OS: Linux
Release: 2.6.18-308.4.1.el5
Processor: x86_64
byteorder: little
LC_ALL: C
LANG: en_US.UTF-8

pandas: 0.13.0
Cython: 0.19.2
Numpy: 1.7.1
Scipy: 0.13.2
statsmodels: 0.5.0
    patsy: 0.2.1
scikits.timeseries: Not installed
dateutil: 1.5
pytz: 2013b
bottleneck: Not installed
PyTables: 3.0.0
    numexpr: 2.2.2
matplotlib: 1.3.1
openpyxl: 1.6.2
xlrd: 0.9.2
xlwt: 0.7.5
xlsxwriter: Not installed
sqlalchemy: 0.8.3
lxml: 3.2.3
bs4: 4.3.1
html5lib: Not installed
bigquery: Not installed
apiclient: Not installed
@jreback
Contributor

jreback commented Jan 22, 2014

This is chained indexing with assignment:

http://pandas.pydata.org/pandas-docs/dev/indexing.html#indexing-view-versus-copy
(it's a bug in numpy < 1.8 and will be fixed in pandas 0.13.1)

In any event, this is much faster and will work in 0.13:

In [4]: df = pd.DataFrame(range(100), columns=['a'])

In [5]: msk = df['a']<5

In [6]: df['a'] = df['a'].astype(str)

In [7]: df.loc[msk,'a'] = '1'
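
For readers comparing the two forms, here is a minimal sketch of why the `.loc` version avoids the problem. The two-step breakdown in the comments is an illustration of how Python evaluates the chained expression, not a description of pandas internals; the names `changed` and `untouched` are introduced here only to inspect the result.

```python
import pandas as pd

df = pd.DataFrame({"a": range(100)})
msk = df["a"] < 5                # boolean mask, computed while 'a' is still numeric
df["a"] = df["a"].astype(str)    # convert the column to strings

# Chained form (from the report): df["a"][msk] = "1"
#   step 1: tmp = df["a"]     -> a __getitem__ call that may return a view or a copy
#   step 2: tmp[msk] = "1"    -> a __setitem__ call on that intermediate object
# Single-call form: one indexing operation on df itself, no intermediate object.
df.loc[msk, "a"] = "1"

changed = df.loc[msk, "a"].tolist()      # rows selected by the mask
untouched = df.loc[~msk, "a"].tolist()   # all remaining rows
```

Because the assignment goes through a single indexer on `df`, there is no ambiguity about whether the target is the original frame or a temporary.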

jreback closed this as completed Jan 22, 2014
@jreback
Contributor

jreback commented Jan 22, 2014

#6031 fixed this

@richardwu0
Author

Thanks!

@richardwu0
Author

@jreback
A related question: the following generates a SettingWithCopyWarning, though it seems to me that it's normal to operate on a DataFrame after selection. Is there a way to avoid the warning other than manually disabling it (pd.set_option('mode.chained_assignment',None))? Thanks.

import pandas as pd
dfc = pd.DataFrame({'a':['aaa','bbb','ccc'],'b':[1,2,3]})
dfc = dfc[dfc['b']>=2]
dfc.b +=1

@jreback
Contributor

jreback commented Jan 22, 2014

To set, ALWAYS use loc/iloc/ix

What you show above does give the warning in 0.13 (I might be able to eliminate it for 0.13.1, but this is quite tricky). Almost all operations on pandas objects return copies; setting (and using inplace) modifies values, so these need to be explicit. (The entire reason for the warning is that some setting operations will operate on a copy, which makes it very non-obvious what is happening.)

In [12]: dfc = pd.DataFrame({'a':['aaa','bbb','ccc'],'b':[1,2,3]})

In [13]: dfc.loc[dfc['b']>=2,'b'] += 1

In [14]: dfc
Out[14]: 
     a  b
0  aaa  1
1  bbb  3
2  ccc  4

[3 rows x 2 columns]

You could also:

dfc.is_copy=None or dfc = dfc.copy() (i.e. actually copy it, which clears the flag)
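
As a rough sketch of the `copy()` option, assuming the goal is to keep working on the filtered rows independently of the original frame (the name `subset` is introduced here for illustration and is not from the thread):

```python
import pandas as pd

dfc = pd.DataFrame({"a": ["aaa", "bbb", "ccc"], "b": [1, 2, 3]})

# Take an explicit copy of the selection, so `subset` is an independent
# DataFrame and modifying it cannot be confused with modifying `dfc`.
subset = dfc[dfc["b"] >= 2].copy()
subset["b"] += 1
```

The original `dfc` is left unchanged here; if the intent is to modify the original rows in place, the `.loc` form shown earlier in this comment is the one to use.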
