@RainFung
Contributor

Implement Index.drop_duplicates by using spark drop_duplicates API without keep parameter
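To make the "without keep parameter" remark concrete, here is a minimal pure-Python sketch of the difference between pandas-style `keep` semantics and an order-free de-duplication of the kind a distributed engine performs. It is not the Koalas implementation and it does not call Spark; the function names `drop_duplicates_keep` and `drop_duplicates_distinct` are hypothetical and exist only for illustration.

```python
from collections import Counter


def drop_duplicates_keep(values, keep="first"):
    """Hypothetical model of pandas-style `keep` semantics on a list of index values."""
    if keep == "first":
        seen = set()
        out = []
        for v in values:
            if v not in seen:  # keep only the first occurrence of each value
                seen.add(v)
                out.append(v)
        return out
    if keep == "last":
        # Deduplicate from the right, then restore left-to-right order.
        return drop_duplicates_keep(values[::-1], "first")[::-1]
    if keep is False:
        # Drop every value that occurs more than once.
        counts = Counter(values)
        return [v for v in values if counts[v] == 1]
    raise ValueError("keep must be 'first', 'last' or False")


def drop_duplicates_distinct(values):
    """Hypothetical model of an order-free de-duplication: only the set of values is defined."""
    return set(values)


values = ["a", "b", "a", "c", "b"]
first = drop_duplicates_keep(values, "first")
last = drop_duplicates_keep(values, "last")
unique_only = drop_duplicates_keep(values, False)
distinct = drop_duplicates_distinct(values)
```

In this model, `keep="first"` and `keep="last"` yield the same set of values and differ only in position, which an order-free de-duplication does not track; `keep=False` would need a separate counting step.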

@codecov-io

codecov-io commented Dec 12, 2019

Codecov Report

Merging #1121 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1121      +/-   ##
==========================================
+ Coverage   95.17%   95.17%   +<.01%     
==========================================
  Files          35       35              
  Lines        7048     7051       +3     
==========================================
+ Hits         6708     6711       +3     
  Misses        340      340

Impacted Files | Coverage Δ
databricks/koalas/missing/indexes.py | 100% <ø> (ø) ⬆️
databricks/koalas/indexes.py | 96.64% <100%> (+0.06%) ⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d435072...e87f3a6.

@HyukjinKwon (Member) left a comment

Index.is_interval
Index.is_numeric
Index.is_object
Index.drop_duplicates

Member

Sorry, can you add drop_duplicates to MultiIndex too?

@RainFung (Contributor, Author), Dec 17, 2019

MultiIndex.drop_duplicates has been deprecated in the pandas 0.26 docs: https://dev.pandas.io/docs/reference/indexing.html

Member

Nice!

@softagram-bot

Softagram Impact Report for pull/1121 (head commit: e87f3a6)

⚠️ Copy paste found

ℹ️ indexes.py: Copy paste fragment inside the same file on lines 757, 1136:

            raise NotImplementedError(
                "Doesn't support symmetric_difference between Index & MultiIndex for now")

        sdf_self = self._kdf._s...(truncated 477 chars)

ℹ️ test_indexes.py: Copy paste fragment on line 30 shared with ../test_dataframe.py:


    @property
    def pdf(self):
        return pd.DataFrame({
            'a': [1, 2, 3, 4, 5, 6, 7, 8, 9],
            'b': [4, 5, 6, 3, 2, 1, ...(truncated 160 chars)

ℹ️ test_indexes.py: Copy paste fragment on line 32 shared with ../test_dataframe.py, ../test_numpy_compat.py:

    def pdf(self):
        return pd.DataFrame({
            'a': [1, 2, 3, 4, 5, 6, 7, 8, 9],
            'b': [4, 5, 6, 3, 2, 1, 0, 0, 0],
        }, index=[0, 1, 3, 5, 6, 8, 9, 9, 9])...(truncated 105 chars)

Now that you are on the file, it would be easier to pay back some tech. debt.


@HyukjinKwon HyukjinKwon merged commit a44e734 into databricks:master Dec 17, 2019