-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: Fix replace for different dtype equal value #34878
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
if inplace: | ||
return [self] | ||
return [self.copy()] | ||
return self.astype(object).replace( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only want to do this if these are different types, not generally. please show asv's for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the if not self._can_hold_element(to_replace)
check is making sure we have different types.
Not seeing any performance regressions from the change (at least in the replace benchmarks):
(pandas-dev) > asv continuous -f 1.1 upstream/master replace-diff-dtype -b ^replace
· Creating environments
· Discovering benchmarks
·· Uninstalling from conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.
·· Installing 1e014777 <replace-diff-dtype> into conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..
· Running 12 total benchmarks (2 commits * 1 environments * 6 benchmarks)
[ 0.00%] · For pandas commit 9114b4bc <master> (round 1/2):
[ 0.00%] ·· Building for conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[ 0.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 4.17%] ··· Running (replace.Convert.time_replace--)....
[ 20.83%] ··· Running (replace.ReplaceList.time_replace_list--)..
[ 25.00%] · For pandas commit 1e014777 <replace-diff-dtype> (round 1/2):
[ 25.00%] ·· Building for conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[ 25.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 29.17%] ··· Running (replace.Convert.time_replace--)....
[ 45.83%] ··· Running (replace.ReplaceList.time_replace_list--)..
[ 50.00%] · For pandas commit 1e014777 <replace-diff-dtype> (round 2/2):
[ 50.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 54.17%] ··· replace.Convert.time_replace ok
[ 54.17%] ··· ============= ============ ============
-- replace_data
------------- -------------------------
constructor Timestamp Timedelta
============= ============ ============
DataFrame 98.6±3ms 95.2±0.5ms
Series 74.0±0.3ms 74.6±1ms
============= ============ ============
[ 58.33%] ··· replace.FillNa.time_fillna ok
[ 58.33%] ··· ========= =============
inplace
--------- -------------
True 1.04±0.01ms
False 3.01±0.3ms
========= =============
[ 62.50%] ··· replace.FillNa.time_replace ok
[ 62.50%] ··· ========= =============
inplace
--------- -------------
True 1.10±0.02ms
False 4.02±0.07ms
========= =============
[ 66.67%] ··· replace.ReplaceDict.time_replace_series ok
[ 66.67%] ··· ========= ============
inplace
--------- ------------
True 4.27±0.02s
False 4.24±0.04s
========= ============
[ 70.83%] ··· replace.ReplaceList.time_replace_list ok
[ 70.83%] ··· ========= ============
inplace
--------- ------------
True 51.6±0.6μs
False 405±5ms
========= ============
[ 75.00%] ··· replace.ReplaceList.time_replace_list_one_match ok
[ 75.00%] ··· ========= ==========
inplace
--------- ----------
True 89.2±3ms
False 467±4ms
========= ==========
[ 75.00%] · For pandas commit 9114b4bc <master> (round 2/2):
[ 75.00%] ·· Building for conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[ 75.00%] ·· Benchmarking conda-py3.7-Cython0.29.16-matplotlib-numba-numexpr-numpy-odfpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 79.17%] ··· replace.Convert.time_replace ok
[ 79.17%] ··· ============= =========== ===========
-- replace_data
------------- -----------------------
constructor Timestamp Timedelta
============= =========== ===========
DataFrame 102±4ms 103±3ms
Series 80.0±4ms 83.1±4ms
============= =========== ===========
[ 83.33%] ··· replace.FillNa.time_fillna ok
[ 83.33%] ··· ========= =============
inplace
--------- -------------
True 1.10±0.04ms
False 3.04±0.1ms
========= =============
[ 87.50%] ··· replace.FillNa.time_replace ok
[ 87.50%] ··· ========= =============
inplace
--------- -------------
True 1.20±0.04ms
False 4.26±0.2ms
========= =============
[ 91.67%] ··· replace.ReplaceDict.time_replace_series ok
[ 91.67%] ··· ========= ============
inplace
--------- ------------
True 4.26±0.01s
False 4.24±0.01s
========= ============
[ 95.83%] ··· replace.ReplaceList.time_replace_list ok
[ 95.83%] ··· ========= ==========
inplace
--------- ----------
True 49.6±1μs
False 412±6ms
========= ==========
[100.00%] ··· replace.ReplaceList.time_replace_list_one_match ok
[100.00%] ··· ========= ==========
inplace
--------- ----------
True 89.3±2ms
False 464±30ms
========= ==========
BENCHMARKS NOT SIGNIFICANTLY CHANGED.
can u merge master and will look |
@dsaxton can you move release note |
marking as stale 😉 |
doc/source/whatsnew/v1.1.1.rst
Outdated
@@ -46,6 +46,10 @@ Bug fixes | |||
- Bug in ``.groupby(..).rolling(..)`` where passing ``closed`` with column selection would raise a ``ValueError`` (:issue:`35549`) | |||
- Bug in :class:`DataFrame` constructor failing to raise ``ValueError`` in some cases when ``data`` and ``index`` have mismatched lengths (:issue:`33437`) | |||
|
|||
**Reshaping** | |||
|
|||
- Bug where :meth:`DataFrame.replace` and :meth:`Series.replace` would fail when replacement value was of a different dtype but equal to values in the original object (:issue:`34871`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move to 1.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think it's the same issue. If another fix avoids astyping to object I would think it's preferred.
Closing since the other PR looks like a better fix |
Bug noticed when looking at a different issue #34871 (comment)
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff