Ability to Sort DataFrame by Transformed Values #6663

Closed
cancan101 opened this issue Mar 18, 2014 · 5 comments
Labels
Duplicate Report Duplicate issue or pull request

Comments

@cancan101
Contributor

Let's say I have a DataFrame of stock returns. It would be great to be able to easily sort the DataFrame by the absolute value of the returns. Currently I do this by inserting a new column, sorting on the new column, and then slicing out the original columns. This is ungainly:

ret["returns_abs"] = ret.returns.abs()
ret.sort("returns_abs", ascending=False)[["returns", "key"]]

It would be great if I could either pass a Series or a function to the sort method:

ret.sort(ret.returns.abs(), ascending=False)

or

ret.sort(lambda x: abs(x.ix["returns"]), ascending=False)
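
For readers on a recent pandas, here is a minimal sketch of the behaviour requested above. It assumes pandas 1.1 or later, where `DataFrame.sort_values` accepts a `key` callable; the contents of `ret` are made up for illustration and are not from the original report.

```python
import pandas as pd

# Hypothetical data standing in for the `ret` frame described above.
ret = pd.DataFrame(
    {"key": ["A", "B", "C", "D"], "returns": [0.02, -0.05, 0.01, -0.03]}
)

# `key` receives the column as a Series and must return a Series of the
# same length; rows are ordered by the transformed values, while the
# original (signed) values are what end up in the result.
by_abs = ret.sort_values("returns", key=lambda s: s.abs(), ascending=False)
print(by_abs)
```

No helper column is created, so there is nothing to slice away afterwards.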
@jreback
Contributor

jreback commented Mar 18, 2014

just reindex

@cancan101
Contributor Author

I assume that I am doing something wrong here:

df = pd.DataFrame(np.random.rand(13, 3), columns=list('abc'))

In [148]: print df.reindex(df.a.abs())

           a   b   c
a                   
0.074637 NaN NaN NaN
0.348153 NaN NaN NaN
0.751475 NaN NaN NaN
0.062545 NaN NaN NaN
0.117850 NaN NaN NaN
0.296180 NaN NaN NaN
0.729935 NaN NaN NaN
0.814809 NaN NaN NaN
0.935224 NaN NaN NaN
0.960110 NaN NaN NaN
0.121138 NaN NaN NaN
0.103864 NaN NaN NaN
0.808862 NaN NaN NaN

[13 rows x 3 columns]
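
A likely explanation for the all-NaN result above: `reindex` treats its argument as index labels to look up, not as values to sort by. The sketch below uses a small made-up frame (not the data from this comment) to show the difference.

```python
import pandas as pd

# Small illustrative frame with the default integer index 0, 1, 2.
df = pd.DataFrame({"a": [0.25, -0.75, 0.5], "b": [1.0, 2.0, 3.0]})

# The absolute values 0.25, 0.75, 0.5 are used as *labels*. None of them
# exists in the index [0, 1, 2], so every requested row is filled with NaN.
by_values = df.reindex(index=df["a"].abs())

# Passing labels that do exist simply reorders the rows.
by_labels = df.reindex(index=[1, 2, 0])

print(by_values)
print(by_labels)
```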

@jreback
Contributor

jreback commented Mar 18, 2014

In [19]: df = pd.DataFrame(np.random.rand(5, 3), columns=list('abc'))-.5

In [20]: df
Out[20]: 
          a         b         c
0  0.086302  0.280969 -0.472230
1 -0.232537 -0.159322  0.079688
2 -0.467258  0.380615 -0.239272
3 -0.334849 -0.011547  0.499909
4 -0.060898  0.180957 -0.002005

[5 rows x 3 columns]

In [21]: df.reindex(index=df.a.abs().order().index)
Out[21]: 
          a         b         c
4 -0.060898  0.180957 -0.002005
0  0.086302  0.280969 -0.472230
1 -0.232537 -0.159322  0.079688
3 -0.334849 -0.011547  0.499909
2 -0.467258  0.380615 -0.239272

[5 rows x 3 columns]
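
Note for later readers: `Series.order()` was subsequently removed from pandas in favour of `Series.sort_values()`. The sketch below restates the same reindex approach with the current name, using the values printed in `Out[20]`. The positional `argsort` variant is an added alternative, not something proposed in this thread.

```python
import pandas as pd

# Values copied from Out[20] above.
df = pd.DataFrame(
    {
        "a": [0.086302, -0.232537, -0.467258, -0.334849, -0.060898],
        "b": [0.280969, -0.159322, 0.380615, -0.011547, 0.180957],
        "c": [-0.472230, 0.079688, -0.239272, 0.499909, -0.002005],
    }
)

# Sort the transformed column, then reuse its index labels to reorder df.
by_label = df.reindex(index=df["a"].abs().sort_values().index)

# Alternative: compute the sorting positions and select rows by position.
positions = df["a"].abs().to_numpy().argsort()
by_position = df.iloc[positions]

print(by_label)
```

The label-based form needs unique index labels, because `reindex` raises on a duplicated index; the positional form does not have that restriction.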

@jreback
Contributor

jreback commented Mar 18, 2014

see also #3942

@jreback
Contributor

jreback commented Mar 18, 2014

this issue is a dupe of that

jreback closed this as completed Mar 18, 2014
jreback added the Dupe label Mar 18, 2014