Not able to change direction of sort on per column basis for DataFrame #928

Closed
mattharrison opened this issue Mar 16, 2012 · 7 comments

@mattharrison

The sort method on DataFrame supports sorting by multiple columns, but only in a single direction (ascending or descending) for all of them. It would be great to be able to specify the direction per column. Perhaps something like this: df.sort(columns=['Name|descending', 'Age']), which would sort by Name in descending order first and then, where names match, by Age in ascending order.

Also, I see no mention of whether the sort is stable. If it is, this can be replicated with multiple sort calls, sorting by the least significant column first.
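
A minimal sketch of the multi-pass idea, for illustration only. It assumes a current pandas release, where `sort_values` replaces the `sort` method discussed in this thread and accepts `kind="mergesort"` to request a stable algorithm. The integer columns `A` and `B` are made-up sample data. The least significant key is sorted first, and the stable second pass keeps that order within ties.

```python
import pandas as pd

# Hypothetical sample data: A is the primary key, B the secondary key.
df = pd.DataFrame({"A": [1, 0, 1, 0], "B": [30, 25, 22, 41]})

# Pass 1: least significant key (B, ascending).
by_b = df.sort_values("B", ascending=True, kind="mergesort")

# Pass 2: most significant key (A, descending). Because mergesort is stable,
# rows with equal A keep the B order established by pass 1.
multi_pass = by_b.sort_values("A", ascending=False, kind="mergesort")

# The same ordering requested in a single call, for comparison.
single_call = df.sort_values(["A", "B"], ascending=[False, True])

print(multi_pass.index.tolist())
print(single_call.index.tolist())
```

Both approaches should give the same row order here; the single call is simpler once per-column directions are available.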

@jhrhew

jhrhew commented Oct 11, 2012

I would like to have this feature too; MS Excel already supports it.

@wesm
Member

wesm commented Oct 11, 2012

I was thinking you would do

df.sort_index(by=['a', 'b'], ascending=[False, True])

any opinions?
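
For readers on a current pandas release: as far as I know, the `by` argument shown above now lives on `sort_values`, so the proposed call would be spelled as in the sketch below. The frame contents are invented for illustration.

```python
import pandas as pd

# Hypothetical data with the column names used in the proposal above.
df = pd.DataFrame({"a": [1, 2, 1, 2], "b": [4, 3, 2, 1]})

# One flag per column: 'a' descending, then 'b' ascending within equal 'a'.
result = df.sort_values(by=["a", "b"], ascending=[False, True])

print(result)
```

The `ascending` list is matched to `by` position by position, so both lists need the same length.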

@jhrhew

jhrhew commented Oct 11, 2012

Thanks for the quick response.
That looks quite intuitive and consistent with the current API if the list is short. However, I cannot think of a good way to keep it readable for a longer list such as:
df.sort_index(by=['a', 'b', 'c', 'd', 'e'], ascending=[False, True, False, True, True])

@wesm
Member

wesm commented Oct 11, 2012

About as short as you can make it, I guess. ascending=[0, 1, 0, 1, 1] would work too.
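
One way to keep a long list readable is to write each column next to its flag and split them just before sorting. The helper below is a hypothetical sketch, not part of pandas: `sort_pairs` is an invented name, and it assumes a current release where `sort_values` is the sorting method. The 0/1 flags are converted to booleans explicitly rather than relying on how a given version treats integers.

```python
import pandas as pd


def sort_pairs(frame, pairs):
    """Sort `frame` by (column, flag) pairs; a truthy flag means ascending.

    Hypothetical convenience wrapper around DataFrame.sort_values.
    """
    columns = [column for column, _ in pairs]
    ascending = [bool(flag) for _, flag in pairs]
    return frame.sort_values(by=columns, ascending=ascending)


# Invented sample data.
df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [1, 2, 1, 2], "c": [5, 6, 7, 8]})

# 'a' descending (0), then 'b' ascending (1).
result = sort_pairs(df, [("a", 0), ("b", 1)])
print(result)
```

Keeping the pairs together avoids having to count positions across two separate lists.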

@jhrhew

jhrhew commented Oct 11, 2012

That's clever; it didn't occur to me. Thanks. Then, I think your proposal serves my need.

@wesm
Member

wesm commented Nov 1, 2012

Got around to this finally:

In [2]: df
Out[2]: 
    A  B         C
0   2  0 -0.696943
1   2  2 -0.304504
2   0  1 -0.686636
3   1  4  0.298355
4   0  3 -1.167454
5   0  0  0.478933
6   2  3 -1.859343
7   2  4  0.040016
8   1  1  0.484016
9   0  4  0.355799
10  0  2 -0.127496
11  1  2  1.078274
12  2  1  1.920544
13  1  3  0.678504
14  1  0  0.210838

In [3]: df.sort
df.sort        df.sort_index  df.sortlevel   

In [3]: df.sort(['A', 'B'])
Out[3]: 
    A  B         C
5   0  0  0.478933
2   0  1 -0.686636
10  0  2 -0.127496
4   0  3 -1.167454
9   0  4  0.355799
14  1  0  0.210838
8   1  1  0.484016
11  1  2  1.078274
13  1  3  0.678504
3   1  4  0.298355
0   2  0 -0.696943
12  2  1  1.920544
1   2  2 -0.304504
6   2  3 -1.859343
7   2  4  0.040016

In [4]: df.sort(['A', 'B'], ascending=[1, 0])
Out[4]: 
    A  B         C
9   0  4  0.355799
4   0  3 -1.167454
10  0  2 -0.127496
2   0  1 -0.686636
5   0  0  0.478933
3   1  4  0.298355
13  1  3  0.678504
11  1  2  1.078274
8   1  1  0.484016
14  1  0  0.210838
7   2  4  0.040016
6   2  3 -1.859343
1   2  2 -0.304504
12  2  1  1.920544
0   2  0 -0.696943

In [5]: df.sort(['A', 'B'], ascending=[1, 1])
Out[5]: 
    A  B         C
5   0  0  0.478933
2   0  1 -0.686636
10  0  2 -0.127496
4   0  3 -1.167454
9   0  4  0.355799
14  1  0  0.210838
8   1  1  0.484016
11  1  2  1.078274
13  1  3  0.678504
3   1  4  0.298355
0   2  0 -0.696943
12  2  1  1.920544
1   2  2 -0.304504
6   2  3 -1.859343
7   2  4  0.040016

In [6]: df.sort(['A', 'B'], ascending=[0, 0])
Out[6]: 
    A  B         C
7   2  4  0.040016
6   2  3 -1.859343
1   2  2 -0.304504
12  2  1  1.920544
0   2  0 -0.696943
3   1  4  0.298355
13  1  3  0.678504
11  1  2  1.078274
8   1  1  0.484016
14  1  0  0.210838
9   0  4  0.355799
4   0  3 -1.167454
10  0  2 -0.127496
2   0  1 -0.686636
5   0  0  0.478933
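
The session above can be reproduced on a current pandas release with the sketch below. It assumes `sort_values` as the modern spelling of `df.sort` and rebuilds only the `A` and `B` columns from the printed frame, since `C` does not affect the ordering.

```python
import pandas as pd

# Columns A and B copied from the frame printed in the session above.
df = pd.DataFrame({
    "A": [2, 2, 0, 1, 0, 0, 2, 2, 1, 0, 0, 1, 2, 1, 1],
    "B": [0, 2, 1, 4, 3, 0, 3, 4, 1, 4, 2, 2, 1, 3, 0],
})

# A ascending, B descending (ascending=[1, 0] in the session).
mixed = df.sort_values(["A", "B"], ascending=[True, False])

# Both columns descending (ascending=[0, 0] in the session).
both_desc = df.sort_values(["A", "B"], ascending=[False, False])

print(mixed.index.tolist())
print(both_desc.index.tolist())
```

Each (A, B) pair is unique in this frame, so the resulting row order does not depend on the stability of the sort.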

@wesm wesm closed this as completed in e301671 Nov 1, 2012
@jhrhew

jhrhew commented Nov 5, 2012

Thanks a lot.
