
Use better typing for Series.apply based on return type of the callable determining return type of Series.apply() #293


Closed
phofl opened this issue Sep 12, 2022 · 3 comments · Fixed by #343
Labels
Apply (Apply, Aggregate, Transform) · Regression (Functionality that used to work in a prior pandas version) · Series (Series data structure)

Comments

@phofl
Member

phofl commented Sep 12, 2022

Describe the bug

Calling tolist() on the result of Series.apply() fails type checking with

error: "Series[Any]" not callable  [operator]

after the latest pandas-stubs release.

To Reproduce

import pandas as pd
s = pd.Series([1, 2, 3]).apply(lambda x: x).tolist()

mypy reports:
error: "Series[Any]" not callable  [operator]

Please complete the following information:

  • OS: macOS
  • OS version: 12.5.1
  • Python version: 3.10
  • Type checker: mypy 0.960
  • pandas-stubs version: v1.4.4.220906


phofl added the Apply, Regression, and Series labels Sep 12, 2022
@twoertwein
Member

We probably never had annotations for tolist() but it "worked" because __getattr__ was defined on NDFrame with an un-annotated return type (tolist was probably inferred as Any/unknown).

Since #261, DataFrame.__getattr__ returns only a Series so that deprecated (and missing) methods are no longer inferred as Any/Unknown. It returns a Series since DataFrame.some_column_name is a Series.
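
To illustrate the mechanism described above, here is a minimal sketch using toy stand-in classes (these are hypothetical and are not the actual pandas-stubs definitions). It shows how a `__getattr__` annotated to return a Series makes every attribute missing from the class look like a Series to a type checker, so calling such an attribute would be reported as "not callable".

```python
from __future__ import annotations


class Series:
    """Toy stand-in for pandas.Series (hypothetical, not the real stub)."""

    def __init__(self, values: list[int]) -> None:
        self.values = values

    def tolist(self) -> list[int]:
        return list(self.values)


class DataFrame:
    """Toy stand-in for pandas.DataFrame; it deliberately has no tolist()."""

    def __init__(self, columns: dict[str, list[int]]) -> None:
        self._columns = columns

    def __getattr__(self, name: str) -> Series:
        # Only reached for attributes not found by normal lookup. The
        # annotation tells a type checker that every such attribute is a
        # Series, which matches column access like df.some_column_name.
        try:
            return Series(self._columns[name])
        except KeyError:
            raise AttributeError(name) from None


df = DataFrame({"a": [1, 2, 3]})
column = df.a              # typed as Series, and a Series at runtime
values = column.tolist()   # fine: tolist() is defined on Series

# df.tolist is also typed as Series because of __getattr__, so a type
# checker would reject the call df.tolist(): a Series is not callable.
# At runtime this toy class raises AttributeError for it instead.
has_tolist = hasattr(df, "tolist")
```

The trade-off is the one the comment names: typos and removed methods are no longer silently inferred as Any/Unknown, at the cost of a less obvious error message when a missing method is called.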

@Dr-Irv
Collaborator

Dr-Irv commented Sep 12, 2022

> We probably never had annotations for tolist() but it "worked" because __getattr__ was defined on NDFrame with an un-annotated return type (tolist was probably inferred as Any/unknown).

That's not the whole story. We do have tolist() and to_list() defined for Series.

The issue here is that we have typed apply() to return DataFrame | Series, and tolist() doesn't work for a DataFrame.

The solution for now is to cast the result of apply() to a Series:

s = cast(pd.Series, pd.Series([1, 2, 3]).apply(lambda x: x)).tolist()
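
For reference, a self-contained version of that workaround might look like the sketch below. The `isinstance` variant is an addition that is not mentioned in this thread; it is shown only as one other common way to narrow a union type.

```python
from typing import cast

import pandas as pd

applied = pd.Series([1, 2, 3]).apply(lambda x: x)

# cast() only informs the type checker; it performs no conversion or
# check at runtime, so it is the caller's job to be right about the type.
values = cast(pd.Series, applied).tolist()

# Alternative (not from the thread): an isinstance assertion narrows the
# union for the type checker and is also verified at runtime.
assert isinstance(applied, pd.Series)
values_checked = applied.tolist()
```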

Having said that, we can probably type apply() better by specifying the type of the Callable, because the docs say that if the Callable returns a Series, the result is a DataFrame. For that to work with this example, though, lambda x: x would have to be typed, or replaced by a function such as def myfun(x: Scalar) -> Scalar: return x
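
One way to read this idea is as a pair of overloads keyed on the Callable's return type. The sketch below uses toy classes and a simplified `Scalar` alias; all of these names are assumptions made for illustration, and this is not necessarily how the stubs ended up implementing it.

```python
from __future__ import annotations

from typing import Any, Callable, Union, overload

# Simplified stand-in for the stubs' Scalar type (assumption).
Scalar = Union[int, float, str, bool]


class DataFrame:
    """Toy stand-in: one row per Series returned by the callable."""

    def __init__(self, rows: list[Series]) -> None:
        self.rows = rows


class Series:
    """Toy stand-in for pandas.Series (hypothetical)."""

    def __init__(self, values: list[Any]) -> None:
        self.values = values

    # A callable that returns a Series yields a DataFrame ...
    @overload
    def apply(self, func: Callable[[Any], Series]) -> DataFrame: ...

    # ... while a callable that returns a scalar yields a Series.
    @overload
    def apply(self, func: Callable[[Any], Scalar]) -> Series: ...

    def apply(self, func: Callable[[Any], Any]) -> Series | DataFrame:
        results = [func(v) for v in self.values]
        if results and all(isinstance(r, Series) for r in results):
            return DataFrame(results)
        return Series(results)

    def tolist(self) -> list[Any]:
        return list(self.values)


def myfun(x: Scalar) -> Scalar:
    return x


def expand(x: Scalar) -> Series:
    return Series([x, x])


# The declared return type of myfun selects the Series overload, so
# tolist() is accepted without a cast.
as_list = Series([1, 2, 3]).apply(myfun).tolist()

# The declared return type of expand selects the DataFrame overload.
frame = Series([1, 2, 3]).apply(expand)
```

The overloads only help when the callable has a declared return type, which is why the comment suggests replacing the untyped lambda with a typed function.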

So I'm going to update the title of this issue accordingly.

Dr-Irv changed the title from "Series.apply not callable when applying tolist()" to "Use better typing for Series.apply based on return type of the callable determining return type of Series.apply()" Sep 12, 2022
@marcglobality

Same here:

import pandas as pd

def func1(df: pd.DataFrame) -> pd.Series:
    return df.url.apply(get_depth)

def get_depth(url: str) -> int:
    return url.strip("/").count("/") - 2

~ pds ❯ mypy test.py
test.py:4: error: Incompatible return value type (got "Union[Series[Any], DataFrame]", expected "Series[Any]")
Found 1 error in 1 file (checked 1 source file)
