BUG: in _nsorted for frame with duplicated values index #13428

Closed

@Tux1
Contributor

Tux1 commented Jun 12, 2016

@Tux1
Contributor Author

Tux1 commented Jun 12, 2016

Which tests are failing?

@sinhrks
Member

sinhrks commented Jun 12, 2016

You need to fix the tests so that they pass the flake8 check.

@sinhrks sinhrks added the Bug label Jun 12, 2016
@sinhrks sinhrks added this to the 0.18.2 milestone Jun 12, 2016
@@ -81,3 +81,5 @@ Performance Improvements

Bug Fixes
~~~~~~~~~

- Bug in ``DataFrame._nsorted`` when data-frame has duplicated value index. (:issue:`13412`)
Member

Describe the problem from the user's point of view. Users don't care about _nsorted.

@codecov-io

codecov-io commented Jun 12, 2016

Current coverage is 84.23%

Merging #13428 into master will increase coverage by <.01%

@@             master     #13428   diff @@
==========================================
  Files           138        138          
  Lines         50805      50810     +5   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          42796      42801     +5   
  Misses         8009       8009          
  Partials          0          0          

Powered by Codecov. Last updated by 62b4327...9e47bbe

@TomAugspurger
Contributor

Running `git diff upstream/master | flake8 --diff` will help you track down the changes you have to make.

@jreback
Contributor

jreback commented Jun 13, 2016

I think a simple solution will work here:

(Pdb) p self.loc[ser.index].head(n).sort_values(columns, ascending=ascending,kind='mergesort')
   a  b
1  3  2
(Pdb) p self.loc[ser.index]
   a  b
1  3  2
1  4  1
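
The second output above has two rows because a label-based lookup returns every row carrying the requested label. A minimal sketch of that behaviour, assuming a small frame with duplicated index labels (the values are chosen only to match the output above, and the list `[1]` stands in for `ser.index`):

```python
import pandas as pd

# Assumed example frame: the index labels 0 and 1 each appear twice.
df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [4, 3, 2, 1]}, index=[0, 0, 1, 1])

# Label-based selection: asking for label 1 once still matches both rows
# that carry that label, so the result is longer than the list of labels.
by_label = df.loc[[1]]
print(by_label)
```

So any step that relies on `.loc` with the labels of the selected values can pull in rows that were never selected when the index is not unique.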

@Tux1
Contributor Author

Tux1 commented Jun 13, 2016

@jreback What do you mean? In _nsorted? I don't think so.

Testing. I think you're right

@jreback
Contributor

jreback commented Jun 13, 2016

Maybe the .head() should go at the end. I don't recall the exact guarantees of this.
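
To illustrate why the position of `.head()` matters, here is a small sketch. It assumes the two rows sharing label 1 from the earlier output and sorts on column `b` (the column choice is an assumption). It only shows the order-of-operations effect and is not meant as the fix for the PR:

```python
import pandas as pd

# Assumed input: the two rows that share index label 1.
rows = pd.DataFrame({'a': [3, 4], 'b': [2, 1]}, index=[1, 1])

# head() first keeps whichever row happens to come first, then sorts it.
head_first = rows.head(1).sort_values('b', kind='mergesort')

# Sorting first and truncating afterwards keeps the row with the smallest b.
sort_first = rows.sort_values('b', kind='mergesort').head(1)

print(head_first)
print(sort_first)
```

With `.head()` applied first, the row with `b == 2` survives; with the sort applied first, the row with `b == 1` does.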

@Tux1
Contributor Author

Tux1 commented Jun 13, 2016

I tested it, and your solution doesn't work: it fails the test for this case:
df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [4, 3, 2, 1]}, index=[0, 0, 1, 1])

Any other suggestions?
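
For reference, one way to sidestep duplicated labels entirely is to select rows by position rather than by label. The sketch below runs on the frame from this test case; `nsmallest_by_position` is a hypothetical helper written for illustration, not the code in this PR and not how pandas implements `nsmallest`:

```python
import numpy as np
import pandas as pd


def nsmallest_by_position(frame, n, column):
    """Hypothetical helper: return the n rows with the smallest values in
    `column`, selected by integer position so duplicated labels are harmless."""
    # A stable sort keeps the original order among equal values.
    order = np.argsort(frame[column].to_numpy(), kind='mergesort')
    return frame.iloc[order[:n]]


df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [4, 3, 2, 1]}, index=[0, 0, 1, 1])
result = nsmallest_by_position(df, 3, 'b')
print(result)
```

Because `iloc` addresses rows by position, each selected row appears exactly once, whatever its label.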

@jreback
Contributor

jreback commented Jun 14, 2016

@Tux1 well then play around with it. What you are doing is WAY too complicated for a simple take.

@jreback jreback removed this from the 0.18.2 milestone Jun 14, 2016
@jreback
Contributor

jreback commented Sep 9, 2016

Can you rebase / update?

5 participants