Refine Frame._reduce_for_stat_function. #1975
return self._reduce_for_stat_function(var, name="var", axis=axis, numeric_only=numeric_only)

def median(
This is moved here for ease of type annotation; bool is overwritten later.
Where is bool overwritten?
koalas/databricks/koalas/generic.py
Line 1885 in 5b1205b
def bool(self) -> bool:
Got it!
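The shadowing discussed in this thread can be reproduced in isolation. The sketch below is not the Koalas code: `Frame`, `seen_before`, and `seen_after` are hypothetical names used only to show that, inside a class body, the name bool refers to the builtin until a method called bool is defined, and to that method afterwards. Assuming annotations are evaluated eagerly at definition time, an annotation such as numeric_only: bool written below the method would pick up the method rather than the type, which is presumably why the definition was moved above it.

```python
class Frame:
    # At this point in the class body, `bool` still resolves to the builtin type.
    seen_before = bool

    def bool(self):
        """Stand-in for the real method; this body is hypothetical."""
        return True

    # From here on, `bool` inside the class body is the method defined above.
    seen_after = bool


print(Frame.seen_before)  # the builtin bool type
print(Frame.seen_after)   # the Frame.bool function
```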
Codecov Report
@@            Coverage Diff             @@
##           master    #1975      +/-   ##
==========================================
- Coverage   94.60%   94.56%   -0.05%
==========================================
  Files          50       50
  Lines       10905    10935      +30
==========================================
+ Hits        10317    10341      +24
- Misses        588      594       +6
def max(self, axis=None, numeric_only=None) -> Union[Scalar, "Series"]:
def max(
    self, axis: Union[int, str] = None, numeric_only: bool = None
Do we use numeric_only: bool = None rather than numeric_only: bool = False for pandas compatibility?
Yes, I think so. cc @itholic who changed it to None.
Yes, it's for pandas compatibility.
Thanks!
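To illustrate why a None default is not the same as False, here is a toy, pure-Python sketch of a three-state numeric_only flag. It is not how Koalas or pandas implement it: `toy_max` and its fallback rule are assumptions made up for this example, loosely modelled on the pandas documentation of that period, where None was described as trying every column and then falling back to numeric data.

```python
from typing import Dict, List, Optional


def toy_max(columns: Dict[str, List], numeric_only: Optional[bool] = None) -> Dict[str, object]:
    """Hypothetical reducer over named columns; not the Koalas implementation."""

    def is_numeric(values: List) -> bool:
        return all(isinstance(v, (int, float)) for v in values)

    result = {}
    for name, values in columns.items():
        if numeric_only and not is_numeric(values):
            continue  # True: skip non-numeric columns up front
        try:
            result[name] = max(values)
        except TypeError:
            if numeric_only is None:
                continue  # None: try every column, drop the ones that cannot be reduced
            raise  # False: every column is required, so surface the failure
    return result


data = {"a": [1, 3, 2], "b": ["x", "z"], "c": [1, "x"]}
print(toy_max(data))                     # columns a and b; c is dropped
print(toy_max(data, numeric_only=True))  # column a only
```

With numeric_only=False the mixed column "c" raises TypeError instead of being skipped, which is the behavioural difference a False default would introduce in this sketch.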
"""

def count(spark_column, spark_type):
    # Special handle floating point types because Spark's count treats nan as a valid value,
The refactoring is much cleaner.
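The NaN issue mentioned in the code comment can be shown without Spark. The sketch below models a column as a plain Python list, with None standing in for a SQL null; `spark_like_count` and `pandas_like_count` are hypothetical helper names, not functions from Koalas or PySpark. It assumes, as the comment states, that a Spark-style count includes NaN, and that the pandas-compatible result should exclude it for floating point columns.

```python
import math
from typing import List, Optional


def spark_like_count(values: List[Optional[float]]) -> int:
    """Count non-null entries; NaN is not null, so it is counted."""
    return sum(v is not None for v in values)


def pandas_like_count(values: List[Optional[float]], is_floating: bool) -> int:
    """Count entries that are neither null nor NaN.

    For floating point columns, NaN is first mapped to null so that the
    same null-skipping count can be reused.
    """
    if is_floating:
        values = [None if (v is not None and math.isnan(v)) else v for v in values]
    return spark_like_count(values)


column = [1.0, float("nan"), None, 2.5]
print(spark_like_count(column))         # 3
print(pandas_like_count(column, True))  # 2
```

Passing the column type as a flag mirrors the spark_type argument in the quoted signature: only floating point columns can hold NaN, so other types can skip the extra mapping.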
Looks great to me! Left some questions.
Thanks! I'd merge this now.
Great. Thanks!
Refines DataFrame/Series._reduce_for_stat_function to avoid special handling based on a specific function.
Also:
count and support numeric_only parameter.