TST: Test to verify behavior of groupby.std with no numeric columns and numeric_only=True #51761

Merged
merged 9 commits into from
Mar 6, 2023
8 changes: 4 additions & 4 deletions pandas/core/groupby/groupby.py
@@ -1915,7 +1915,7 @@ def std(
.. versionadded:: 1.4.0

numeric_only : bool, default False
- Include only `float`, `int` or `boolean` data.
+ Include only `float`, `int` or `boolean` columns.
Member

I believe data is used here because this docstring is shared by DataFrameGroupBy and SeriesGroupBy; the latter does not have columns. I think this should be reverted.
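
A minimal sketch of the point above, assuming pandas 1.5 or later (the version in which the docstring says `numeric_only` was added); the frame and the column names `a` and `b` are made up for illustration. `SeriesGroupBy` accepts `numeric_only` even though it operates on a single Series rather than on a set of columns, which is why the shared docstring says "data".

```python
import pandas as pd

# Illustrative data; the column names "a" and "b" are assumptions.
df = pd.DataFrame({"a": ["x", "x", "y", "y"], "b": [1.0, 3.0, 2.0, 2.0]})

# DataFrameGroupBy: numeric_only selects among the non-grouping columns.
frame_result = df.groupby("a").std(numeric_only=True)

# SeriesGroupBy: there are no columns to select, only the Series' own data.
series_result = df.groupby("a")["b"].std(numeric_only=True)

print(type(frame_result).__name__)   # DataFrame
print(type(series_result).__name__)  # Series
```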

Contributor Author

Makes sense. I have reverted these changes. But shouldn't we have separate docstrings for DataFrameGroupBy and SeriesGroupBy for more clarity?

Member

We could template these docstrings or just separate them out entirely; but that should be a separate PR.
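
One way the templating idea could look, as a rough sketch only: this is not how pandas shares its docstrings, and `numeric_only_doc`, `NUMERIC_ONLY_DOC`, `FrameGroupByDemo` and `SeriesGroupByDemo` are hypothetical names invented for the example. A small decorator appends the shared parameter description and fills in the one word that differs per class.

```python
# Hypothetical shared template; only the final noun differs per class.
NUMERIC_ONLY_DOC = (
    "numeric_only : bool, default False\n"
    "    Include only `float`, `int` or `boolean` {target}.\n"
)


def numeric_only_doc(target):
    """Return a decorator that appends the specialized parameter text."""

    def decorator(func):
        summary = (func.__doc__ or "").rstrip()
        func.__doc__ = summary + "\n\n" + NUMERIC_ONLY_DOC.format(target=target)
        return func

    return decorator


class FrameGroupByDemo:
    """Stand-in for a frame-like groupby; not the pandas class."""

    @numeric_only_doc(target="columns")
    def std(self, numeric_only=False):
        """Compute standard deviation of groups."""


class SeriesGroupByDemo:
    """Stand-in for a series-like groupby; not the pandas class."""

    @numeric_only_doc(target="data")
    def std(self, numeric_only=False):
        """Compute standard deviation of groups."""


print(FrameGroupByDemo.std.__doc__)
print(SeriesGroupByDemo.std.__doc__)
```

Appending the text avoids calling `str.format` on the whole docstring, which would break on literal braces such as the `{'linear', 'lower', ...}` option list in the `quantile` docstring.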


.. versionadded:: 1.5.0

@@ -1998,7 +1998,7 @@ def var(
.. versionadded:: 1.4.0

numeric_only : bool, default False
- Include only `float`, `int` or `boolean` data.
+ Include only `float`, `int` or `boolean` columns.

.. versionadded:: 1.5.0

@@ -2167,7 +2167,7 @@ def sem(self, ddof: int = 1, numeric_only: bool = False):
Degrees of freedom.

numeric_only : bool, default False
- Include only `float`, `int` or `boolean` data.
+ Include only `float`, `int` or `boolean` columns.

.. versionadded:: 1.5.0

@@ -3093,7 +3093,7 @@ def quantile(
interpolation : {'linear', 'lower', 'higher', 'midpoint', 'nearest'}
Method to use when the desired quantile falls between two points.
numeric_only : bool, default False
- Include only `float`, `int` or `boolean` data.
+ Include only `float`, `int` or `boolean` columns.

.. versionadded:: 1.5.0

19 changes: 19 additions & 0 deletions pandas/tests/groupby/test_groupby.py
@@ -2843,3 +2843,22 @@ def test_obj_with_exclusions_duplicate_columns():
    result = gb._obj_with_exclusions
    expected = df.take([0, 2, 3], axis=1)
    tm.assert_frame_equal(result, expected)


def test_groupby_numeric_only_std_no_result():
    # GH 51080
    dicts_non_numeric = [{"a": "foo", "b": "bar"}, {"a": "car", "b": "dar"}]
    df = DataFrame(dicts_non_numeric)
    dfgb = df.groupby("a")
    result = dfgb.std(numeric_only=True)

    assert result.empty


def test_groupby_std_raises_error():
    # GH 51080
    dicts_non_numeric = [{"a": "foo", "b": "bar"}, {"a": "car", "b": "dar"}]
    df = DataFrame(dicts_non_numeric)
    dfgb = df.groupby("a")
    with pytest.raises(ValueError, match="could not convert string to float: 'bar'"):
        dfgb.std()