DOC: make clear that DataFrame.astype supports Series input #49508

Closed
1 task done
randolf-scholz opened this issue Nov 3, 2022 · 3 comments · Fixed by #49556
Labels
Docs Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@randolf-scholz
Contributor

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.astype.html

Documentation problem

Currently, the parameter specification

dtype: data type, or dict of column name

does not mention that Series are natively supported.

Suggested fix for documentation

Something along the lines of

dtype: string, data type, Series or Mapping of column name

Use a data type like object (string, numpy.dtype, pandas.ExtensionDtype or Python type) to cast entire pandas object to the same type. Alternatively, use a Mapping such as a dictionary of the form {col: dtype, …}, where col is a column label and dtype is the scalar type to cast one or more of the DataFrame’s columns to column-specific types.

One might also want to add: (#43837)

The Mapping is not allowed to contain column names that are not present in the DataFrame.
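A minimal sketch of the restriction on mapping keys, assuming a recent pandas release. The frame and the column name "c" are made up for illustration. Keys may cover only a subset of the columns, but a key that is not a column label should raise a KeyError:

```python
import pandas as pd

# Illustrative frame; the column names are assumptions, not taken from the docs.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Casting only some of the columns is allowed; "b" keeps its original dtype.
partial = df.astype({"a": "float64"})

# A key that does not match any column label is rejected.
try:
    df.astype({"a": "float64", "c": "float64"})
    raised = False
except KeyError:
    raised = True
```

So the mapping may name fewer columns than the frame has, but never a column the frame lacks.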

I also noticed that the relevant code

def is_dict_like(obj) -> bool:
    """
    Check if the object is dict-like.

    Parameters
    ----------
    obj : The object to check

    Returns
    -------
    is_dict_like : bool
        Whether `obj` has dict-like properties.

    Examples
    --------
    >>> is_dict_like({1: 2})
    True
    >>> is_dict_like([1, 2, 3])
    False
    >>> is_dict_like(dict)
    False
    >>> is_dict_like(dict())
    True
    """
    dict_like_attrs = ("__getitem__", "keys", "__contains__")
    return (
        all(hasattr(obj, attr) for attr in dict_like_attrs)
        # [GH 25196] exclude classes
        and not isinstance(obj, type)
    )

is essentially a weaker version of isinstance(obj, collections.abc.Mapping), which I guess was introduced to also catch Series, since Series is not a Mapping subclass.
I would propose considering replacing it with isinstance(obj, Mapping | Series) where appropriate.
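To make the difference concrete, here is a small sketch comparing the three checks on a Series of dtypes. It uses the public `pandas.api.types.is_dict_like` and the tuple form of `isinstance`, which behaves like `Mapping | Series` but also runs on Python versions before 3.10. The variable names are illustrative only:

```python
from collections.abc import Mapping

import pandas as pd
from pandas.api.types import is_dict_like

# df.dtypes is a Series indexed by column label.
dtypes = pd.DataFrame({"a": [1], "b": [2.0]}).dtypes

# Series neither subclasses nor registers with collections.abc.Mapping.
series_is_mapping = isinstance(dtypes, Mapping)

# It does expose __getitem__, keys and __contains__, so the duck-typed check passes.
series_is_dict_like = is_dict_like(dtypes)

# The stricter check proposed above, written as a tuple of types.
proposed_check = isinstance(dtypes, (Mapping, pd.Series))

# A plain dict satisfies all three checks.
dict_is_mapping = isinstance({"a": "int64"}, Mapping)
```

Note that the stricter check would stop accepting third-party objects that are dict-like without being a Mapping or a Series, so whether it is appropriate depends on the call site.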

Related: pandas-dev/pandas-stubs#410

@randolf-scholz randolf-scholz added Docs Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 3, 2022
@randolf-scholz
Contributor Author

randolf-scholz commented Nov 3, 2022

Supporting Series is crucial in order to guarantee self-consistency:

import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df = df.astype(df.dtypes)
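A slightly extended sketch of the same idea, assuming a recent pandas release. The hand-built `target` Series is a hypothetical example rather than anything from the docs:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Round trip: casting to the frame's own dtypes leaves the dtypes unchanged.
same = df.astype(df.dtypes)

# A hand-built Series indexed by column label is treated like a dict of dtypes.
target = pd.Series({"a": "float64", "b": "float32"})
converted = df.astype(target)

# Collect the resulting dtype names for inspection.
converted_dtypes = [str(t) for t in converted.dtypes]
```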

@ramvikrams
Contributor

@randolf-scholz can I do the fix in the documentation?

@randolf-scholz
Contributor Author

@ramvikrams Of course.

mroeschke pushed a commit that referenced this issue Nov 11, 2022
* for #49508 changing Doc for DataFrame.astype

added the series in input in the doc of DataFrame.astype

* up

* up2

* up3

* up4

* up5