Implement 'keep' parameter for drop_duplicates
#1303
@deepyaman, do you mind rebasing and syncing with master? There are many conflicts as of 30b3334.
Codecov Report
@@            Coverage Diff            @@
##             master    #1303   +/-   ##
=========================================
  Coverage          ?   93.75%
=========================================
  Files             ?       34
  Lines             ?     7254
  Branches          ?        0
=========================================
  Hits              ?     6801
  Misses            ?      453
  Partials          ?        0
I noticed another issue with Series while trying to implement this. 1cb4ba0 changed koalas/databricks/koalas/frame.py Line 2865 in 1cb4ba0
column == index_column ever got activated, so it didn't matter. This PR also attempts to fix this.
@HyukjinKwon Synced, tests added, and ready to go!
ueshin
left a comment
Otherwise, LGTM.
Adding additional doctests decreased code coverage enough to fail the build. T_T
for (msg, pser), keep in product(psers.items(), keeps):
    with self.subTest(msg, keep=keep):
        kser = ks.Series(pser)
nit: we prefer to use ks.from_pandas(pser) instead of ks.Series(pser).
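For anyone following along, here is a rough, standard-library-only sketch of the test pattern in the snippet above: iterate over the product of named inputs and `keep` values, with one `subTest` per combination. The `drop_duplicates` function below is a hypothetical pure-Python stand-in, not the Koalas implementation from this PR. It assumes the pandas meaning of `keep` (`'first'`, `'last'`, or `False`), and the input names and expected values are made up for illustration.

```python
import unittest
from collections import Counter
from itertools import product


def drop_duplicates(values, keep="first"):
    """Hypothetical stand-in that mimics the pandas `keep` semantics on a list."""
    if keep == "first":
        # Keep the first occurrence of each value, in original order.
        seen = set()
        result = []
        for value in values:
            if value not in seen:
                seen.add(value)
                result.append(value)
        return result
    if keep == "last":
        # Keep the last occurrence: dedupe the reversed list, then restore order.
        return drop_duplicates(values[::-1], keep="first")[::-1]
    if keep is False:
        # Drop every value that appears more than once.
        counts = Counter(values)
        return [value for value in values if counts[value] == 1]
    raise ValueError("keep must be 'first', 'last', or False")


class DropDuplicatesTest(unittest.TestCase):
    def test_keep(self):
        # Illustrative inputs keyed by a short message, like `psers` above.
        inputs = {"ints": [1, 2, 1, 3], "strs": ["a", "b", "b"]}
        keeps = ["first", "last", False]
        expected = {
            ("ints", "first"): [1, 2, 3],
            ("ints", "last"): [2, 1, 3],
            ("ints", False): [2, 3],
            ("strs", "first"): ["a", "b"],
            ("strs", "last"): ["a", "b"],
            ("strs", False): ["a"],
        }
        for (msg, values), keep in product(inputs.items(), keeps):
            # Each (input, keep) pair is reported separately on failure.
            with self.subTest(msg, keep=keep):
                self.assertEqual(
                    drop_duplicates(values, keep=keep), expected[(msg, keep)]
                )
```

In the real test the stand-in would be replaced by the Koalas call under test and compared against the pandas result for the same `keep`; the sketch only shows how the `product` loop and `subTest` fit together.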
Thanks! Merging.
Close #1302
TODO: