BUG: setitem using loc not aligning on index? #56024

Closed
3 tasks done
MarcoGorelli opened this issue Nov 17, 2023 · 12 comments · Fixed by #59340
Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves), Needs Tests (unit test(s) needed to prevent regressions)

Comments

@MarcoGorelli
Member

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

In [5]: df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

In [6]: other = pd.Series([200, 999], index=[1, 0])

In [7]: df
Out[7]:
   a  b
0  1  3
1  2  4

In [8]: other
Out[8]:
1    200
0    999
dtype: int64

In [9]: df.loc[:, 'a'] = other

In [10]: df
Out[10]:
     a  b
0  200  3
1  999  4

Issue Description

When setting a column using .loc with a Series as the value, the index of the Series is ignored: the values are assigned by position instead of being aligned on the index labels.

Expected Behavior

I think operations like this one usually align automatically on the index?

@phofl @jbrockmendel do you know about this one?
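
For comparison, here is a minimal sketch of the alignment I would expect, using plain column assignment (`df["a"] = other`) instead of `.loc`. Plain `__setitem__` with a Series reindexes the Series to the frame's index before assigning, so this shows the result the `.loc` version should arguably produce too. The variable `result` is only there for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
other = pd.Series([200, 999], index=[1, 0])

# Plain column assignment aligns `other` on the index labels of `df`:
# label 0 receives 999 and label 1 receives 200.
df["a"] = other

result = df["a"].tolist()
print(df)
```

With alignment, row 0 ends up with 999 and row 1 with 200, the opposite of what `Out[10]` shows.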

Installed Versions

INSTALLED VERSIONS

commit : 2a953cf
python : 3.10.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.102.1-microsoft-standard-WSL2
Version : #1 SMP Wed Mar 2 00:30:59 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 2.1.3
numpy : 1.25.1
pytz : 2023.3
dateutil : 2.8.2
setuptools : 67.6.1
pip : 23.1.2
Cython : None
pytest : 7.3.1
hypothesis : 6.82.4
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.16.1
pandas_datareader : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat: None
fastparquet : 2023.7.0
fsspec : 2023.6.0
gcsfs : None
matplotlib : 3.7.1
numba : 0.58.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 12.0.1
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.11.0
sqlalchemy : 2.0.20
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

MarcoGorelli added the Bug and Needs Triage labels on Nov 17, 2023

@MarcoGorelli
Member Author

The index is respected, though, when doing the assignment horizontally (setting a row, where the Series aligns on the column labels):

In [15]: df = pd.DataFrame({'a': [1,1,2], 'b': [4,5,6]})

In [16]: df.loc[0, :] = pd.Series({'b': 999, 'a': 888})

In [17]: df
Out[17]:
     a    b
0  888  999
1    1    5
2    2    6

@MarcoGorelli
Member Author

Hey, the non-inplace version (where the new values force a dtype change, so the column cannot be set in place) does actually respect the index 😄

In [25]: df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

In [26]: other = pd.Series([200.5, 999], index=[1, 0])

In [27]: df.loc[:, 'a'] = other

In [28]: df
Out[28]:
       a  b
0  999.0  3
1  200.5  4

MarcoGorelli added the Indexing label and removed the Needs Triage label on Nov 17, 2023

@phofl
Member

phofl commented Nov 17, 2023

Can you check with this df?

df = pd.DataFrame({"a": [1, 2], "b": [3, 4.5]})

We have separate code paths for single-block and multi-block frames.

You are correct, all of those should align.

@MarcoGorelli
Member Author

that one works!

In [38]: df = pd.DataFrame({"a": [1, 2], "b": [3, 4.5]})

In [39]: other = pd.Series([200, 999], index=[1, 0])

In [40]: df.loc[:, 'a'] = other

In [41]: df
Out[41]:
     a    b
0  999  3.0
1  200  4.5

@phofl
Member

phofl commented Nov 17, 2023

Yeah that's what I thought, the single block code path is the culprit (as usual...)
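
Until the single-block path aligns correctly, one way to sidestep the problem is to do the alignment explicitly and then assign a plain array. This is only a workaround sketch, not the fix itself: it assumes every label in `df.index` is present in `other` (otherwise `reindex` introduces NaN and the dtype changes), and the name `aligned` is just for illustration.

```python
import pandas as pd

# All-int64 frame, i.e. the case reported as misbehaving above.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
other = pd.Series([200, 999], index=[1, 0])

# Align on the frame's index up front, so the result no longer depends
# on whether `.loc` setitem aligns the Series itself.
aligned = other.reindex(df.index)

# A NumPy array carries no index, so it is assigned purely by position.
df.loc[:, "a"] = aligned.to_numpy()

print(df)
```

Because the array has no labels, the outcome should be the same on the single-block and multi-block paths.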

@jbrockmendel
Member

possibly related: #51386, #37516.

my default opinion is "if it's ambiguous, let's deprecate allowing it"

@phofl
Member

phofl commented Nov 17, 2023

No, it's not ambiguous; it should work in our current setup.

@phofl
Member

phofl commented Nov 29, 2023

FWIW my PR that was a precursor for @MarcoGorelli's PR about the warning should have fixed this (could you add a test in your other PR, @MarcoGorelli?)

MarcoGorelli added the Needs Tests label on Dec 28, 2023

@MarcoGorelli
Member Author

thanks - looks fixed on main, so have added 'needs tests'
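
For anyone picking this up, a regression test could look roughly like the sketch below. The helper name `check_loc_column_setitem_aligns` and its `b_values` parameter are hypothetical (not existing pandas test code), and it uses the public `pandas.testing.assert_frame_equal` rather than pandas' internal test helpers. Passing `[3, 4]` exercises the single-block case from the original report, which is only expected to pass on a pandas version that includes the fix; `[3, 4.5]` exercises the multi-block case that already aligned.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def check_loc_column_setitem_aligns(b_values):
    """Hypothetical helper: `.loc` column setitem should align on the index."""
    df = pd.DataFrame({"a": [1, 2], "b": b_values})
    other = pd.Series([200, 999], index=[1, 0])

    df.loc[:, "a"] = other

    # Label 0 should receive 999 and label 1 should receive 200.
    expected = pd.DataFrame({"a": [999, 200], "b": b_values})
    assert_frame_equal(df, expected)
```

In a real test module, the two inputs would presumably be covered with `pytest.mark.parametrize`.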

@DipanshiB
Contributor

Hi, is this issue still available to take? Would like to work on this as an open source beginner

@MarcoGorelli
Member Author

yup, go ahead

@DipanshiB
Contributor

take
