-
Notifications
You must be signed in to change notification settings - Fork 367
Implement Series.where #922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #922 +/- ##
==========================================
+ Coverage 94.52% 94.53% +<.01%
==========================================
Files 34 34
Lines 6465 6476 +11
==========================================
+ Hits 6111 6122 +11
Misses 354 354
Continue to review full report at Codecov.
|
ueshin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add more tests in test_series to check various patterns? e.g.,
>>> s1 = pd.Series([0, 1, 2, 3, 4])
>>> s2 = pd.Series([100, 200, 300, 400, 500])
>>> s1.where(s2 > 100)
0 NaN
1 1.0
2 2.0
3 3.0
4 4.0
dtype: float64and negative cases?
|
|
||
| return self._with_new_scol(current) | ||
|
|
||
| def where(self, cond, other=np.nan): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@itholic seems like pandas shares the same implementation internally. After this PR is merged, can you move this into _Frame class and implement DataFrame.where as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, i'm going to work right after this PR is merged
|
Seems fine to me otherwise. |
Softagram Impact Report for pull/922 (head commit: b620849)⭐ Change Overview
📄 Full report
Impact Report explained. Give feedback on this report to [email protected] |
| # | 4| 4| true| 500| | ||
| # +-----------------+---+----------------+-----------------+ | ||
| data_col_name = self._internal.column_name_for(self._internal.column_index[0]) | ||
| index_column = self._internal.index_columns[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@itholic, I think this doesn't support multi-level index cases. Can you fix this please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
index_columns can be multiple and we cannot just use the first one only.
| set_option("compute.ops_on_diff_frames", True) | ||
|
|
||
| @classmethod | ||
| def tearDownClass(cls): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@itholic disable this. compute.ops_on_diff_frames is disabled by default because it costs a lot. We should move the test cases into OpsOnDiffFramesEnabledTest
| kser.drop_duplicates().sort_values()) | ||
|
|
||
| def test_where(self): | ||
| pser1 = pd.Series([0, 1, 2, 3, 4], name=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test when compute.ops_on_diff_frames is off? I think we can still use a scalar values for other such as int.

Like pandas Series.where (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.where.html)
implemented function
wherefor series.