
BUG: groupby(level=).shift(axis=1) changes order of original index #44269

Closed
3 tasks done
prutskov opened this issue Nov 1, 2021 · 1 comment
Labels
Bug Groupby Index Related to the Index class or subclasses

Comments


prutskov commented Nov 1, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame({'a': [2,1,2,1], 'b': [7,2,3,9], 'c': [2.0, 3.0, 1.1, 7.8]})
df = df.set_index('a')

result = df.groupby(level=0).shift()
result_axis_1 = df.groupby(level=0).shift(axis=1)

print(df)
print(result)
print(result_axis_1)

Issue Description

Output:

   b    c
a
2  7  2.0
1  2  3.0
2  3  1.1
1  9  7.8
     b    c
a
2  NaN  NaN
1  NaN  NaN
2  7.0  2.0
1  2.0  3.0
    b  c
a
2 NaN  7
2 NaN  3
1 NaN  2
1 NaN  9

groupby.shift(axis=1) does not preserve the original index order: the rows come back grouped by key (2, 2, 1, 1) instead of in the original order (2, 1, 2, 1).
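
A quick way to make this kind of reordering visible without reading the printed frames is to compare the index of the result with the original one. The sketch below does so for the default-axis call, which keeps the order. `same_index_order` is a hypothetical helper name used for illustration, not part of pandas.

```python
import pandas as pd

def same_index_order(original, result):
    """Return True if both frames carry the same index labels in the same order."""
    return list(original.index) == list(result.index)

df = pd.DataFrame({'a': [2, 1, 2, 1], 'b': [7, 2, 3, 9], 'c': [2.0, 3.0, 1.1, 7.8]})
df = df.set_index('a')

# Default-axis shift within each group keeps the original row order.
result = df.groupby(level=0).shift()
order_kept = same_index_order(df, result)
```

The same helper can be applied to any other groupby result to confirm whether the rows came back in their original order.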

The same reordering occurs with groupby.apply(lambda df: df.shift()):

import pandas as pd

df = pd.DataFrame({'a': [2,1,2,1], 'b': [7,2,3,9], 'c': [2.0, 3.0, 1.1, 7.8]})
df = df.set_index('a')

result = df.groupby(level=0).shift()
result_apply = df.groupby(level=0).apply(lambda df: df.shift())

print(df)
print(result)
print(result_apply)

Output:

   b    c
a
2  7  2.0
1  2  3.0
2  3  1.1
1  9  7.8
     b    c
a
2  NaN  NaN
1  NaN  NaN
2  7.0  2.0
1  2.0  3.0
     b    c
a
2  NaN  NaN
2  7.0  2.0
1  NaN  NaN
1  2.0  3.0

shift and apply(shift) return the index in different orders.

Expected Behavior

   b    c
a
2  7  2.0
1  2  3.0
2  3  1.1
1  9  7.8
     b    c
a
2  NaN  NaN
1  NaN  NaN
2  7.0  2.0
1  2.0  3.0
    b  c
a
2 NaN  7
1 NaN  2
2 NaN  3
1 NaN  9
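
A shift along axis=1 moves values across columns within each row, so the grouping should not influence the values. Under that assumption, the expected frame above can be reproduced by shifting the whole frame without a groupby, which keeps the index order. This is a minimal sketch of a possible workaround, not the fix adopted by pandas.

```python
import pandas as pd

df = pd.DataFrame({'a': [2, 1, 2, 1], 'b': [7, 2, 3, 9], 'c': [2.0, 3.0, 1.1, 7.8]})
df = df.set_index('a')

# Shift every row one column to the right; no grouping is involved,
# so the rows stay in their original order.
expected = df.shift(axis=1)

print(expected)
```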

Installed Versions

INSTALLED VERSIONS

commit : 945c9ed
python : 3.8.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-65-generic
Version : #73-Ubuntu SMP Mon Jan 18 17:25:17 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US
LOCALE : en_US.ISO8859-1

pandas : 1.3.4
numpy : 1.21.3
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : None
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.2
IPython : 7.28.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : 2021.10.1
fastparquet : None
gcsfs : None
matplotlib : 3.2.2
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : 0.15.0
pyarrow : 6.0.0
pyxlsb : None
s3fs : 2021.10.1
scipy : 1.7.1
sqlalchemy : 1.4.26
tables : 3.6.1
tabulate : None
xarray : 0.19.0
xlrd : 2.0.1
xlwt : None
numba : None

@prutskov prutskov added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 1, 2021
@mroeschke mroeschke added Groupby Index Related to the Index class or subclasses and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 5, 2021
@rhshadrach
Member

axis=1 in this method is now deprecated. Closing.
