-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: Fix behavior of replace_list with mixed types. #40555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix behavior of replace_list with mixed types. #40555
Conversation
hasan-yaman
commented
Mar 21, 2021
- closes BUG: Change in behavior of replace with integer series and float to_replace #40371
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pr, @hasan-yaman!
Looks like this caused some test errors elsewhere: https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=56659&view=logs&j=b516280a-8c75-541b-8ba0-ae13561718a7&t=a9217cb1-24f6-5625-717d-72085a70e1d3&l=41 |
@pytest.mark.parametrize( | ||
"s, to_replace, value, expected", | ||
[ | ||
(DataFrame([1]), np.array([1.0]), [0], DataFrame([0])), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May want to test this for different container types for to_replace
rather than only np.array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added lists too. Should I add more types?
Thanks for the comments.
Removing if statement here will solve the problem but i am wondering is there are better way to solve it? |
@@ -1659,3 +1658,25 @@ def test_replace_bytes(self, frame_or_series): | |||
expected = obj.copy() | |||
obj = obj.replace({None: np.nan}) | |||
tm.assert_equal(obj, expected) | |||
|
|||
@pytest.mark.parametrize( | |||
"s, to_replace, value, expected", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you give this a multi-letter name instead of 's'
Thanks @hasan-yaman for the PR. can you fix-up failing tests... it appears actual fix has been lost in a subsequent commit and add a release note (will be 1.2.5 but can add to doc\source\whatsnew\v1.2.4.rst in the first instance and move once doc\source\whatsnew\v1.2.5.rst is available) |
this is just a tests, no whatsnew needed. |
IIUC this PR a regression fix for #40371 but the fix, added in the first commit, has been lost. |
Actually, first commit was causing too many problems with the tests, I couldn't fix them. So, I reverted the changes from the first commit. |
The regression was caused by #38097, so the fix probably involves code changed in that PR. cc @jbrockmendel |
what do you mean? if this doesn't have a patch then the tests shouldn't pass (or is this *already) fixed? |
the regression is not fixed. |
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this. |
The tests are not passing. |
to_replace is dispatched to _replace_list if is_list_like(to_replace) _replace_list only accepts lists, but also need to make sure that the list is python floats
or change _can_hold_element |
this maybe a better option since this is where we have the inconsistency
|
changed my mind again! since we have a mature backport branch and may not have any more patch releases on 1.2.x let's not make this change and just patch the code that was added in #38097 to convert a numpy array to a list of python objects. |
([1.0], [1], [0], [0.0]), | ||
], | ||
) | ||
@pytest.mark.parametrize("box", [list, tuple, np.array]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to test pd.array
too? Better safe than sorry. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would fail. EA is not a documented allowed type for the to_replace
argument of the replace
method. to be fair, nor is numpy array or tuple. The safest way to restore the old undocumented behavior is to revert #38097, but not sure that a performance hit for users passing the correct type is the way forward.
Any container, even a list, with numpy scalars will still fail. This is a fix for a specific case for an arguably incorrect usage without a perf hit for correct usage.
# until the issue is fixed properly in can_hold_element | ||
|
||
# error: "Iterable[Any]" has no attribute "tolist" | ||
if hasattr(src_list, "tolist"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could maybe change this to a simple isinstance(src_list, np.array)
will merge (and backport) this shortly if no further comments |
yep looks ok (rebase if u can though) |
merged master an hour ago, no commits to master since then |
Thanks @hasan-yaman |
This comment has been minimized.
This comment has been minimized.
…s. (#41761) Co-authored-by: hasan-yaman <[email protected]>
Co-authored-by: Simon Hawkins <[email protected]>
Co-authored-by: Simon Hawkins <[email protected]>