DOC: Include SeriesGroupBy ops in API docs #48399

Closed
rhshadrach opened this issue Sep 5, 2022 · 3 comments · Fixed by #48500

Comments

@rhshadrach
Member

rhshadrach commented Sep 5, 2022

https://pandas.pydata.org/pandas-docs/dev/reference/groupby.html

Currently, the documentation for a groupby method lives in one of three places: on GroupBy if SeriesGroupBy and DataFrameGroupBy share an implementation; on DataFrameGroupBy if the method exists on both but the implementations differ; or on whichever of SeriesGroupBy and DataFrameGroupBy defines it, if it exists on only one. This can make locating the docs a bit tricky, and the documentation also contains the line:

> The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly, usually in that the DataFrameGroupBy version usually permits the specification of an axis argument, and often an argument indicating whether to restrict application to columns of a specific data type.

I think it would be better to generate docs for all SeriesGroupBy methods and all DataFrameGroupBy methods individually, even if they entirely overlap.
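To make the current placement rule concrete, here is a minimal sketch. The three classes are hypothetical stand-ins written for this example, not the real pandas classes, and the methods placed on each are illustrative only. The helper `current_doc_location` is likewise an assumed name; it simply encodes the rule described above.

```python
# Hypothetical stand-ins for illustration only; these are NOT the pandas classes.
class GroupBy:
    def mean(self): ...
    def sum(self): ...


class SeriesGroupBy(GroupBy):
    def nunique(self): ...   # exists on both, separate implementation
    def nlargest(self): ...  # exists only here


class DataFrameGroupBy(GroupBy):
    def nunique(self): ...   # exists on both, separate implementation
    def corrwith(self): ...  # exists only here


def public_methods(cls):
    """Names of all public callables reachable on cls, including inherited ones."""
    return {
        name
        for name in dir(cls)
        if not name.startswith("_") and callable(getattr(cls, name))
    }


def current_doc_location(name):
    """Class under which `name` would be documented by the rule described above."""
    on_series = name in public_methods(SeriesGroupBy)
    on_frame = name in public_methods(DataFrameGroupBy)
    if on_series and on_frame:
        # Same function object on both classes means a shared implementation.
        shared = getattr(SeriesGroupBy, name) is getattr(DataFrameGroupBy, name)
        return "GroupBy" if shared else "DataFrameGroupBy"
    if on_series:
        return "SeriesGroupBy"
    if on_frame:
        return "DataFrameGroupBy"
    raise AttributeError(name)


# The proposal: list every public method under each class, overlap included.
proposed = {
    cls.__name__: sorted(public_methods(cls))
    for cls in (SeriesGroupBy, DataFrameGroupBy)
}

for name in ("mean", "nunique", "nlargest", "corrwith"):
    print(name, "->", current_doc_location(name))
print(proposed)
```

With the proposed layout, a reader looking for a method only needs to know which of the two classes they are working with, rather than how the method happens to be implemented.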

This is somewhat related to #6944, as I think it would be then possible to treat GroupBy as an internal pandas object (at least from a documentation perspective). But this isn't logically necessary.

cc @jorisvandenbossche

@rhshadrach rhshadrach added Docs Groupby Needs Discussion Requires discussion from core team before further action labels Sep 5, 2022
@mroeschke
Member

> I think it would be better to generate docs for all SeriesGroupBy methods and all DataFrameGroupBy methods individually, even if they entirely overlap.

+1 for this suggestion. Ideally users shouldn't really need to know about SeriesGroupBy and DataFrameGroupBy objects either, but documenting them separately appears to be the most straightforward way to indicate whether a groupby method is supported for Series or DataFrame.

@rhshadrach
Member Author

I was playing around with this a bit yesterday, and I'm wondering what the best way to organize the documentation is. There are two natural options. I don't love either of them, but I'm finding I prefer the second:

Computations / descriptive stats
--------------------------------
DataFrameGroupBy.mean
SeriesGroupBy.mean
DataFrameGroupBy.sum
SeriesGroupBy.sum

vs

DataFrameGroupBy computations / descriptive stats
-------------------------------------------------
DataFrameGroupBy.mean
DataFrameGroupBy.sum

SeriesGroupBy computations / descriptive stats
----------------------------------------------
SeriesGroupBy.mean
SeriesGroupBy.sum

Another option I considered was a table with two columns, DataFrameGroupBy and SeriesGroupBy, but I think this doesn't work with autosummary.
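As a rough sketch of the second layout, the snippet below generates one section per class, each with its own Sphinx `autosummary` listing. The function name `grouped_autosummary`, the `api/` toctree directory, and the two-method list are assumptions made for this example; this is not how the pandas docs are actually generated.

```python
def grouped_autosummary(section, classes, methods, toctree="api/"):
    """Build reST with one titled autosummary block per class (second layout)."""
    blocks = []
    for cls in classes:
        title = f"{cls} {section}"
        lines = [
            title,
            "-" * len(title),  # reST underline must be at least as long as the title
            "",
            ".. autosummary::",
            f"   :toctree: {toctree}",  # assumed output directory
            "",
        ]
        lines += [f"   {cls}.{method}" for method in methods]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


rst = grouped_autosummary(
    section="computations / descriptive stats",
    classes=["DataFrameGroupBy", "SeriesGroupBy"],
    methods=["mean", "sum"],
)
print(rst)
```

Grouping by class first keeps each listing a plain `autosummary` block, which avoids the two-column table that may not be expressible with `autosummary`.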

@mroeschke
Member

The second looks better to me as well.

@rhshadrach rhshadrach added this to the 1.6 milestone Sep 10, 2022
@rhshadrach rhshadrach added Enhancement and removed Needs Discussion Requires discussion from core team before further action labels Sep 10, 2022
@mroeschke mroeschke modified the milestones: 1.6, 2.0 Oct 13, 2022