TST: #15752 Add drop_duplicates tests for uint, float and bool for Series #17974

jamestran201 · 2017-10-25T02:42:37Z

tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

pep8speaks · 2017-10-25T02:42:39Z

Hello @tmnhat2001! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on October 31, 2017 at 02:40 Hours UTC

jamestran201 · 2017-10-25T02:52:33Z

Hi, this is my first attempt at issue 15752. Please let me know if the test cases are OK. If they are, I'll do the same to the remaining tests.

codecov · 2017-10-25T03:53:29Z

Codecov Report

Merging #17974 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #17974      +/-   ##
==========================================
- Coverage   91.23%   91.22%   -0.02%     
==========================================
  Files         163      163              
  Lines       50113    50113              
==========================================
- Hits        45723    45714       -9     
- Misses       4390     4399       +9

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | `89.03% <ø> (ø)` | ⬆️ |
| #single | `40.31% <ø> (-0.06%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/io/gbq.py | `25% <0%> (-58.34%)` | ⬇️ |
| pandas/core/frame.py | `97.75% <0%> (-0.1%)` | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e1dabf3...523ac99. Read the comment docs.

codecov · 2017-10-25T03:53:59Z

Codecov Report

Merging #17974 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #17974      +/-   ##
==========================================
- Coverage   91.23%   91.23%   -0.01%     
==========================================
  Files         163      163              
  Lines       50113    50114       +1     
==========================================
- Hits        45723    45720       -3     
- Misses       4390     4394       +4

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | `89.04% <ø> (+0.01%)` | ⬆️ |
| #single | `40.24% <ø> (-0.14%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/io/gbq.py | `25% <0%> (-58.34%)` | ⬇️ |
| pandas/core/generic.py | `92.42% <0%> (-0.12%)` | ⬇️ |
| pandas/tseries/frequencies.py | `96% <0%> (-0.11%)` | ⬇️ |
| pandas/core/frame.py | `97.75% <0%> (-0.1%)` | ⬇️ |
| pandas/core/indexing.py | `92.8% <0%> (-0.02%)` | ⬇️ |
| pandas/io/excel.py | `80.39% <0%> (-0.01%)` | ⬇️ |
| pandas/io/stata.py | `93.7% <0%> (-0.01%)` | ⬇️ |
| pandas/tseries/offsets.py | `97.15% <0%> (-0.01%)` | ⬇️ |
| pandas/core/indexes/timedeltas.py | `91.19% <0%> (ø)` | ⬆️ |
| pandas/core/indexes/datetimelike.py | `97.1% <0%> (ø)` | ⬆️ |

... and 20 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e1dabf3...7b035e4. Read the comment docs.

jreback · 2017-10-25T10:19:07Z

pandas/tests/series/test_analytics.py

-            sc = s.copy()
-            sc.drop_duplicates(keep=False, inplace=True)
-            assert_series_equal(sc, s[~expected])
+    @pytest.mark.parametrize('dtype', ['int_', 'uint', 'float_'])


So the parametrization is good here; just pull all the dtypes into it to avoid repeating all of this code multiple times.

@pytest.mark.parametrize('arg', [Series([1, 2, 3, 3], dtype='int_'), Series([1, 2, 3, 3], dtype='uint'), ....., Series(['1', '2', '3', '4'])])

And if you have different expected values, just add another parameter (and expand the tuple to multiple values).
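As a rough illustration of the suggestion above, the sketch below passes whole `Series` objects as parameters so that a single test body covers every dtype. The test name, the values, and the explicit `int64`/`uint64`/`float64` dtype strings are assumptions chosen for this example; they are not taken from the PR's actual test code.

```python
import pandas as pd
import pytest


# Illustrative inputs only: the same values under several dtypes.
# "int64"/"uint64"/"float64" stand in for the dtype names used in the review.
@pytest.mark.parametrize("ser", [
    pd.Series([1, 2, 3, 3], dtype="int64"),
    pd.Series([1, 2, 3, 3], dtype="uint64"),
    pd.Series([1, 2, 3, 3], dtype="float64"),
    pd.Series(["1", "2", "3", "3"]),
])
def test_drop_duplicates_keep_first(ser):
    # With the default keep='first', only the second 3 is flagged as a duplicate.
    expected = pd.Series([False, False, False, True])
    pd.testing.assert_series_equal(ser.duplicated(), expected)
    pd.testing.assert_series_equal(ser.drop_duplicates(), ser[~expected])
```

Because every parameter shares the same duplicate pattern, one expected mask is enough here; inputs with different patterns would need the expected value passed alongside them.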

jamestran201 · 2017-10-26T03:06:50Z

Thanks for the suggestion. I parametrized the test accordingly. I also removed some duplicate test cases to reduce the number of parameters to the test function; otherwise the parametrized block would get very big.

jreback · 2017-10-27T10:22:32Z

pandas/tests/series/test_analytics.py

+            Series([True, False, False]),
+            Series([True, False, True, False])
+        ]
+    )


I would pull the bool tests out and do them separately (they are simpler and don't need as much checking).
Also, just inline the expected results; don't use an if for that. The way to do this is to pass things in as a pair,

e.g.

@pytest.mark.parametrize("tc1, tc2", [(Series([1, 2, 3, 3], dtype='int_'), Series([1, 2, 3, 5, 3, 2, 4])), .....])
def test_drop_duplicates(self, tc1, tc2)
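One way to read this advice, sketched under assumptions: each `keep` option travels with its expected duplicate mask as a pair, so the test body needs no `if`, and the bool case gets its own smaller test. The function names, values, and dtype strings below are hypothetical and are not the code that was merged.

```python
import pandas as pd
import pytest


# Each keep option is paired with the mask it should produce for
# the values [1, 2, 3, 5, 3, 2, 4].
@pytest.mark.parametrize("keep, expected", [
    ("first", pd.Series([False, False, False, False, True, True, False])),
    ("last", pd.Series([False, True, True, False, False, False, False])),
    (False, pd.Series([False, True, True, False, True, True, False])),
])
@pytest.mark.parametrize("dtype", ["int64", "uint64", "float64"])
def test_drop_duplicates_numeric(dtype, keep, expected):
    ser = pd.Series([1, 2, 3, 5, 3, 2, 4], dtype=dtype)
    pd.testing.assert_series_equal(ser.duplicated(keep=keep), expected)
    pd.testing.assert_series_equal(ser.drop_duplicates(keep=keep), ser[~expected])
    # The in-place variant should leave the same result in the copy.
    sc = ser.copy()
    sc.drop_duplicates(keep=keep, inplace=True)
    pd.testing.assert_series_equal(sc, ser[~expected])


# Bools only have two possible values, so one short input is enough.
@pytest.mark.parametrize("keep, expected", [
    ("first", pd.Series([False, False, True, True])),
    ("last", pd.Series([True, True, False, False])),
    (False, pd.Series([True, True, True, True])),
])
def test_drop_duplicates_bool(keep, expected):
    ser = pd.Series([True, False, True, False])
    pd.testing.assert_series_equal(ser.duplicated(keep=keep), expected)
    pd.testing.assert_series_equal(ser.drop_duplicates(keep=keep), ser[~expected])
```

Stacking the two `parametrize` decorators runs every dtype against every `keep` pair, which keeps the parameter list short while still covering the full grid.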

jamestran201 · 2017-10-28T04:21:28Z

Thanks for the input. When you said that bools do not need as much checking, did you mean that we can just use one test case for them?

jreback · 2017-10-31T12:08:45Z

thanks @tmnhat2001 !

I think to close the original issue we need some categorical ones for various dtypes. (can be directly in test_categorical).
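A possible shape for such categorical tests, offered only as a sketch: the same values are built under several NumPy dtypes and wrapped in a `Categorical` before checking `duplicated` and `drop_duplicates`. The test name, values, and dtype list are assumptions for illustration and do not come from this PR.

```python
import numpy as np
import pandas as pd
import pytest


@pytest.mark.parametrize("dtype", ["int64", "uint64", "float64", "bool"])
def test_drop_duplicates_categorical(dtype):
    # [1, 0, 1, 0] becomes [True, False, True, False] for the bool case.
    values = np.array([1, 0, 1, 0], dtype=dtype)
    ser = pd.Series(pd.Categorical(values))
    # With keep='first', the third and fourth entries repeat earlier values.
    expected = pd.Series([False, False, True, True])
    pd.testing.assert_series_equal(ser.duplicated(), expected)
    pd.testing.assert_series_equal(ser.drop_duplicates(), ser[~expected])
```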

jamestran201 · 2017-11-01T01:28:46Z

Thanks! I'll keep working on the remaining test cases.

…ool for Series (pandas-dev#17974)

Add drop_duplicates test for uint, float and bool

194d7cd

tmnhat2001 added 2 commits October 24, 2017 22:47

TST: #15752 resolve PEP8 issues in test_analytics.py

888e937

TST: #15752 resolve PEP8 issues in test_analytics.py

523ac99

jamestran201 mentioned this pull request Oct 25, 2017

TST: full dtype tests for .drop_duplicates and .duplicated #15752

Closed

jreback reviewed Oct 25, 2017

jreback added the Testing label (pandas testing functions or related to the test suite) Oct 25, 2017

TST: #15752 parametrized test_drop_duplicates and removed duplicate code

897b09e

TST #15752 Re-parametrize test_drop_duplicates

1952ba3

jreback requested changes Oct 27, 2017

TST #15752: create separate method to test drop duplicates for bools and fix test parameterization

7b035e4

jamestran201 closed this Oct 31, 2017

jamestran201 reopened this Oct 31, 2017

jreback added this to the 0.22.0 milestone Oct 31, 2017

jreback approved these changes Oct 31, 2017

jreback merged commit 4578a03 into pandas-dev:master Oct 31, 2017

jamestran201 deleted the issue15752 branch November 1, 2017 01:31

GuessWhoSamFoo pushed a commit to GuessWhoSamFoo/pandas that referenced this pull request Nov 1, 2017

TST: pandas-dev#15752 Add drop_duplicates tests for uint, float and bool for Series (pandas-dev#17974)

61d42bf

No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017

TST: pandas-dev#15752 Add drop_duplicates tests for uint, float and bool for Series (pandas-dev#17974)

8e11d13
