
TST: #15752 Add drop_duplicates tests for uint, float and bool for Series #17974


Merged
merged 6 commits into from
Oct 31, 2017

Conversation

jamestran201

@jamestran201 jamestran201 commented Oct 25, 2017

  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@pep8speaks

pep8speaks commented Oct 25, 2017

Hello @tmnhat2001! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on October 31, 2017 at 02:40 Hours UTC

@jamestran201
Author

Hi, this is my first attempt at issue 15752. Please let me know if the test cases are OK. If they are, I'll do the same to the remaining tests.

@codecov

codecov bot commented Oct 25, 2017

Codecov Report

Merging #17974 into master will decrease coverage by 0.01%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master   #17974      +/-   ##
==========================================
- Coverage   91.23%   91.22%   -0.02%     
==========================================
  Files         163      163              
  Lines       50113    50113              
==========================================
- Hits        45723    45714       -9     
- Misses       4390     4399       +9
Flag Coverage Δ
#multiple 89.03% <ø> (ø) ⬆️
#single 40.31% <ø> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️


Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e1dabf3...523ac99. Read the comment docs.

@codecov

codecov bot commented Oct 25, 2017

Codecov Report

Merging #17974 into master will decrease coverage by <.01%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master   #17974      +/-   ##
==========================================
- Coverage   91.23%   91.23%   -0.01%     
==========================================
  Files         163      163              
  Lines       50113    50114       +1     
==========================================
- Hits        45723    45720       -3     
- Misses       4390     4394       +4
Flag Coverage Δ
#multiple 89.04% <ø> (+0.01%) ⬆️
#single 40.24% <ø> (-0.14%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/generic.py 92.42% <0%> (-0.12%) ⬇️
pandas/tseries/frequencies.py 96% <0%> (-0.11%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️
pandas/core/indexing.py 92.8% <0%> (-0.02%) ⬇️
pandas/io/excel.py 80.39% <0%> (-0.01%) ⬇️
pandas/io/stata.py 93.7% <0%> (-0.01%) ⬇️
pandas/tseries/offsets.py 97.15% <0%> (-0.01%) ⬇️
pandas/core/indexes/timedeltas.py 91.19% <0%> (ø) ⬆️
pandas/core/indexes/datetimelike.py 97.1% <0%> (ø) ⬆️
... and 20 more


Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e1dabf3...7b035e4. Read the comment docs.

sc = s.copy()
sc.drop_duplicates(keep=False, inplace=True)
assert_series_equal(sc, s[~expected])
@pytest.mark.parametrize('dtype', ['int_', 'uint', 'float_'])
Contributor


The parametrization is good here; just pull all the dtypes in here to avoid repeating this code multiple times.

@pytest.mark.parametrize('arg',
    [
        Series([1, 2, 3, 3], dtype='int_'),
        Series([1, 2, 3, 3], dtype='uint'),
        .....
        Series(['1', '2', '3', '4']),
    ])

Contributor


And if you have different expected values, just add another parameter (and expand the tuple to multiple values).
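
A minimal sketch of what this suggestion could look like, assuming each input Series is paired with the boolean mask that `duplicated()` is expected to return. The case values, the `CASES` name and the test name are illustrative and not taken from the PR; explicit 64-bit dtype names are used in place of `'int_'`, `'uint'` and `'float_'` so the sketch does not depend on platform aliases.

```python
import pandas as pd
import pytest
from pandas.testing import assert_series_equal

# Hypothetical cases: (input Series, expected mask for duplicated(keep='first')).
CASES = [
    (pd.Series([1, 2, 3, 3], dtype='int64'), [False, False, False, True]),
    (pd.Series([1, 2, 3, 3], dtype='uint64'), [False, False, False, True]),
    (pd.Series([1.0, 2.0, 2.0, 3.0], dtype='float64'),
     [False, False, True, False]),
    (pd.Series(['1', '2', '3', '4']), [False, False, False, False]),
]


@pytest.mark.parametrize('series, expected_mask', CASES)
def test_drop_duplicates_keep_first(series, expected_mask):
    expected = pd.Series(expected_mask)
    # duplicated() flags every repeat after the first occurrence.
    assert_series_equal(series.duplicated(), expected)
    # drop_duplicates() should keep exactly the rows that are not flagged.
    assert_series_equal(series.drop_duplicates(), series[~expected])
```

Adding a dtype then means adding one tuple to the list rather than copying the whole test body.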

@jreback jreback added the Testing pandas testing functions or related to the test suite label Oct 25, 2017
@jamestran201
Author

Thanks for the suggestion. I parametrized the test as you suggested. I also removed some duplicate test cases to reduce the number of parameters to the test function; otherwise the parametrized block would get very big.

Series([True, False, False]),
Series([True, False, True, False])
]
)
Contributor


I would pull the bool tests out and do them separately (as they are simpler and don't need as much checking).
Also, just inline the expected results; don't use an if for that. The way to do this is to pass things in as a pair,

e.g.

@pytest.mark.parametrize("tc1, tc2",
[
   (Series([1, 2, 3, 3], dtype='int_'), Series([1, 2, 3, 5, 3, 2, 4])),
.....
])
def test_drop_duplicates(self, tc1, tc2):
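
As a rough illustration of handling the bool case on its own with the expected results inlined, here is a sketch that pairs each `keep` option with the mask `duplicated()` should return. The input values and the test name are assumptions made for this example, not the code that was merged; the in-place check mirrors the `sc = s.copy()` pattern from the diff above.

```python
import pandas as pd
import pytest
from pandas.testing import assert_series_equal


@pytest.mark.parametrize('keep, expected', [
    ('first', pd.Series([False, False, True, True])),
    ('last', pd.Series([True, True, False, False])),
    (False, pd.Series([True, True, True, True])),
])
def test_drop_duplicates_bool(keep, expected):
    # Hypothetical input: both True and False appear twice.
    tc = pd.Series([True, False, True, False])

    assert_series_equal(tc.duplicated(keep=keep), expected)
    assert_series_equal(tc.drop_duplicates(keep=keep), tc[~expected])

    # The in-place variant should give the same result on a copy.
    sc = tc.copy()
    sc.drop_duplicates(keep=keep, inplace=True)
    assert_series_equal(sc, tc[~expected])
```

Because the expected mask travels with each `keep` value, the test body needs no `if` branches.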

@jamestran201
Author

Thanks for the input. When you said that bools do not need as much checking, did you mean that we can just use one test case for them?

@jamestran201 jamestran201 reopened this Oct 31, 2017
@jreback jreback added this to the 0.22.0 milestone Oct 31, 2017
@jreback jreback merged commit 4578a03 into pandas-dev:master Oct 31, 2017
@jreback
Contributor

jreback commented Oct 31, 2017

thanks @tmnhat2001 !

I think to close the original issue we need some categorical ones for various dtypes. (can be directly in test_categorical).
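
One possible shape for such categorical tests, offered only as a sketch: it assumes a categorical Series should flag duplicates at the same positions as a plain Series built from the same values. The value lists and the test name are hypothetical, and where the test would live is left open.

```python
import pandas as pd
import pytest
from pandas.testing import assert_series_equal


@pytest.mark.parametrize('values', [
    [1, 2, 3, 3],
    [1.5, 2.5, 2.5, 3.5],
    ['a', 'b', 'b', 'c'],
])
def test_drop_duplicates_categorical(values):
    tc = pd.Series(pd.Categorical(values))
    plain = pd.Series(values)

    # Duplicates should be flagged at the same positions as for the plain Series.
    expected = plain.duplicated()
    assert_series_equal(tc.duplicated(), expected)
    # Dropping duplicates should match boolean indexing with the inverted mask.
    assert_series_equal(tc.drop_duplicates(), tc[~expected])
```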

@jamestran201
Author

Thanks! I'll keep working on the remaining test cases.

@jamestran201 jamestran201 deleted the issue15752 branch November 1, 2017 01:31
GuessWhoSamFoo pushed a commit to GuessWhoSamFoo/pandas that referenced this pull request Nov 1, 2017
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
Labels
Testing pandas testing functions or related to the test suite

3 participants