ENH: Export (a subset of?) pandas._typing for type checking #55231

Open
1 of 3 tasks
caneff opened this issue Sep 21, 2023 · 36 comments
Labels
Enhancement, Typing (type annotations, mypy/pyright type checking)

Comments

@caneff
Contributor

caneff commented Sep 21, 2023

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

There are public functions and methods whose arguments are types that are not actually exported. This makes it hard to propagate those types to other functions that then call the pandas ones.

For instance, merge has a how argument that has a type _typing.MergeHow = Literal['cross', 'inner', 'left', 'outer', 'right'], but since _typing is protected, there is no good way to take it as an argument and instead I have to say

def foo(df: pd.DataFrame, ...., how: str):
  ...
  assert how in ['cross', 'inner', 'left', 'outer', 'right']
  pd.merge(..., how=how)

for my type checker to be OK with it. This is both annoyingly verbose and fragile to updates of pandas.
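One way to make the workaround above a little less fragile today, without importing from a private module, is to keep a local copy of the literal and validate against it. The sketch below is an illustration only: the local MergeHow alias and the as_merge_how helper are hypothetical names, and the copy still has to be kept in sync with pandas by hand, which is exactly the maintenance burden this issue is about.

```python
from typing import Literal, cast, get_args

# Hypothetical local copy of pandas._typing.MergeHow (values as quoted above).
# It must be updated by hand if pandas ever changes the accepted values.
MergeHow = Literal["cross", "inner", "left", "outer", "right"]


def as_merge_how(how: str) -> MergeHow:
    """Validate a plain string and narrow it to the MergeHow literal type."""
    allowed = get_args(MergeHow)
    if how not in allowed:
        raise ValueError(f"how must be one of {allowed}, got {how!r}")
    return cast(MergeHow, how)
```

A wrapper can then accept how: MergeHow directly, or call as_merge_how(user_input) at the boundary where an untyped string enters the program.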

Feature Description

Add a typing module that exposes (possibly a subset of) _typing.

I say "possibly a subset" because, from looking at the _typing module, there are clearly types that are for internal use only, and I'm guessing we don't want to make them public so that they can be changed more easily.

I would propose the subset be all types that are used as arguments of public functions and methods.

This way my function above could become:

import pandas as pd
import pandas.typing as pd_typing
def foo(df: pd.DataFrame, ..., how: pd_typing.MergeHow):
   ...
   pd.merge(...., how=how)

and have everything work.

Alternative Solutions

Technically these types are "available" when imported by other modules, so you can access MergeHow via pandas.core.reshape.merge.MergeHow or pandas.core.frame.MergeHow but those are just imports from _typing imported to be used by those modules themselves, not something users should rely on.

Other alternatives

A) Split the public types out of _typing into typing. _typing could then do from typing import * if we don't want to rewrite every place where the newly public types are used.

B) Just make all of _typing public. As someone who is not heavily into pandas internals I have no strong opinion here, but my guess is that there are internal types that we don't want public.

Additional Context

I'm more than willing to take this PR myself; I just want feedback on whether it would be accepted.

@caneff added the Enhancement and Needs Triage labels Sep 21, 2023
@twoertwein
Member

A subset of _typing was exposed in #48577.

@caneff
Contributor Author

caneff commented Sep 22, 2023 via email

@twoertwein
Member

I think pandas.api.typing currently contains mostly classes (they are definitely stable) but I'm honestly not sure how stable the type aliases in pandas._typing are (both their name and value).

We also have pandas-stubs (but you obviously cannot import from there).

@Dr-Irv @rhshadrach

@twoertwein added the Typing label Sep 22, 2023
@Dr-Irv
Contributor

Dr-Irv commented Sep 22, 2023

Not sure why you can't just use from pandas._typing import MergeHow to address your current issue.

Alternatively, we could have a pandas/typing.py for the ones we are willing to export (or maybe put all of those in pandas.api.typing) and keep pandas/_typing.py for internal usage.

There is also the issue of creating the documentation for things like MergeHow. If we put them in a public place (whether pandas.typing or pandas.api.typing), then they would need to be documented, so that any change to the names of the types (if that should ever happen) would be known to the public.

So, this isn't just an issue of whether we start exposing these literals, it's also a documentation issue.

@caneff
Contributor Author

caneff commented Sep 22, 2023 via email

@rhshadrach
Member

I think pandas.api.typing would make sense for this, and find having pandas.typing alongside pandas.api.typing a bit confusing.

But the main concern is with evolution of type hints alongside the library. How do we go about changing a name or removing a type-hint without breaking user code? If type-hints were strings, then I don't think we'd be at risk. For example:

from __future__ import annotations


def foo(x) -> bar:
    return x + 1

foo(1)

is valid Python. But annotations aren't treated as strings by default, and without the future import this code fails. I don't believe we have any way to signal to a user that a type-hint is deprecated or going to change.
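The point about string annotations can be seen in a small sketch. It quotes the annotation explicitly instead of relying on the future import, and the name bar is deliberately left undefined; nothing here is pandas-specific.

```python
import typing


def foo(x) -> "bar":  # "bar" is never defined anywhere
    return x + 1


# The string annotation is stored but not evaluated, so the call works.
result = foo(1)
stored = foo.__annotations__["return"]

# Only resolving the hint (as a runtime introspection tool would) fails.
try:
    typing.get_type_hints(foo)
    resolved = True
except NameError:
    resolved = False

print(result, stored, resolved)
```

This is why an alias that is imported only under TYPE_CHECKING and used in quoted annotations cannot break running code if it is later renamed: the type checker is the only thing that notices.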

@caneff
Contributor Author

caneff commented Sep 23, 2023 via email

@twoertwein
Member

twoertwein commented Sep 23, 2023

Two thoughts:

  • We could separate the stable pandas.api.typing and create a new pandas.api.typing.aliases that is documented to be public but NOT stable (names, and the values too, might change without warning between releases). Typing pandas is already slow, and I would like to avoid more friction (adding deprecation/future warnings for typing changes).
  • Most people probably do not use pandas' internal annotations (only bare pyright people or people who specifically add a py.typed); pylance and presumably most mypy people use pandas-stubs. If we expose type aliases, we need to ensure that they are in sync between pandas and pandas-stubs!

@caneff
Contributor Author

caneff commented Sep 23, 2023 via email

@Dr-Irv
Contributor

Dr-Irv commented Sep 23, 2023

  1. At Google we haven't adopted stubs yet (but it is on my list). We depend on the bare types ATM, and given the size of our codebase it will be a slow transition.

I think you will find that using pandas-stubs will help improve your code, because the typing checks a lot more things than what is available with just using the library's typing, which is really meant for checking the internal code consistency.

Do pandas-stubs not just import from typing anyway? Isn't that how we keep things in sync?

See https://github.com/pandas-dev/pandas-stubs#differences-between-type-declarations-in-pandas-and-pandas-stubs

@caneff
Contributor Author

caneff commented Sep 23, 2023 via email

@Dr-Irv
Contributor

Dr-Irv commented Sep 26, 2023

Yes I agree. But the scale of our codebase makes it daunting, to say the least. Any bit of new typing results in a lot of new typing violations to be handled. Any suggestions for how to maybe introduce them piecewise?

With mypy and pyright, I am pretty sure that you can specify the files to include/exclude for type checking.

Also how would the stubs help with my current issue though? Do the stubs intentionally export MergeHow anywhere?

I think you could do something like this:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from pandas._typing import MergeHow

def foo(df: pd.DataFrame, ..., how: "MergeHow"):
   ...
   pd.merge(...., how=how)
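A fuller version of that pattern, as a sketch: the function name merge_frames and its parameters are made up for illustration, and the import still reaches into the private pandas._typing module, so it carries the stability caveats discussed in this thread. Because both imports sit under TYPE_CHECKING and every annotation is quoted, nothing from pandas._typing is needed at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by the type checker; never executed at runtime.
    import pandas as pd
    from pandas._typing import MergeHow


def merge_frames(
    left: "pd.DataFrame",
    right: "pd.DataFrame",
    on: str,
    how: "MergeHow" = "inner",
) -> "pd.DataFrame":
    """Thin wrapper that forwards a precisely typed `how` to DataFrame.merge."""
    return left.merge(right, on=on, how=how)
```

A caller that passes how="sideways" would be flagged by the type checker, while at runtime the string is simply forwarded to pandas.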

@lithomas1 removed the Needs Triage label Sep 26, 2023
@Dr-Irv
Contributor

Dr-Irv commented Apr 3, 2025

@WillAyd Back in 2019, you added this text in #27050 (slightly edited since then) that now appears at https://pandas.pydata.org/pandas-docs/stable/development/contributing_codebase.html#pandas-specific-types that says:

"Commonly used types specific to pandas will appear in pandas._typing and you should use these where applicable. This module is private for now but ultimately this should be exposed to third party libraries who want to implement type checking against pandas."

pyright has made a change so that any module with a leading underscore is now considered private. So when someone installs pandas-stubs and imports any of the types in pandas._typing, pyright reports an import of a private module.

Ideally, we would move more of the "public" types into a module and document them, and have a correspondence between both pandas and pandas-stubs for the publicly supported types. Because of the change in pyright, this has now become more important. See this comment from the pyright developer: microsoft/pyright#10248 (comment), where he asked how long we think it would take to make that change.

One quick option is we don't worry about the documentation, and just make pandas.typing a non-private module. But then we also have pandas.api.typing so that could be confusing.

In any case, I think we need to now handle the sentence you wrote back then: " This module is private for now but ultimately this should be exposed to third party libraries who want to implement type checking against pandas."

Thoughts? Who else should we include in this conversation? (Adding @rhshadrach to get his thoughts)

@rhshadrach
Member

What are the objects from pandas._typing that users would want to use in practice?

One quick option is we don't worry about the documentation, and just make pandas.typing a non-private module. But then we also have pandas.api.typing so that could be confusing.

Then any future changes to the objects in pandas.typing could potentially break user type-hinting, and I believe there are very limited ways to support deprecations (is this correct?). Instead of making changes to the types there, we would have to just use new ones?

Would we be aligned on "usage of pandas type-hinting objects is subject to breakage in even minor versions of pandas"? Would users?

I'm also negative on having both pandas.typing and pandas.api.typing.

@rhshadrach
Member

rhshadrach commented Apr 6, 2025

Reading over the linked pyright issue, people importing from a private module are breaking Python conventions. They are free to do so at their own peril, but it is not onerous to add a type: ignore in such situations. However I do feel strongly that neither pandas._typing nor the equivalent in pandas-stubs should be considered public as-is, which I think is effectively what is being suggested at the bottom of microsoft/pyright#10248 (comment).

That said, I'm not necessarily averse to exposing some subset in a public location, but the real-world utility should be demonstrated first.

@simonjayhawkins
Member

"Commonly used types specific to pandas will appear in pandas._typing and you should use these where applicable. This module is private for now but ultimately this should be exposed to third party libraries who want to implement type checking against pandas."

Presumably this statement pre-dates pandas-stubs. I guess this was in the early days of pandas typing journey when how best to help users type check their pandas code had not been decided?

I think you will find that using pandas-stubs will help improve your code, because the typing checks a lot more things than what is available with just using the library's typing, which is really meant for checking the internal code consistency.

Agreed. I think that now we should agree that pandas._typing is private, will always be private and that we now have no intentions of exposing this in the future. pandas-stubs is now provided to allow end-users to type check their code.

But then we also have pandas.api.typing so that could be confusing.

I would not be averse to renaming this to avoid confusion. The classes here are public. They are mostly returned from pandas operations and not directly constructed. So effectively we cannot change these classes without breaking backwards compatibility. They ARE part of the API, but were not included in the documented API section because users are not expected to construct these.

The reason that they were put in pandas.api.typing was to correct this problem that they ARE public but were not considered public. IIRC the reason "typing" was used in the naming is because when using type hints for return types in pandas, these return types/classes were incorrectly not considered public.

@simonjayhawkins
Member

I would not be averse to renaming this to avoid confusion.

of course, the simplest solution may be to document these properly and just include them in pandas.api

@Dr-Irv
Contributor

Dr-Irv commented Apr 14, 2025

"Commonly used types specific to pandas will appear in pandas._typing and you should use these where applicable. This module is private for now but ultimately this should be exposed to third party libraries who want to implement type checking against pandas."

Presumably this statement pre-dates pandas-stubs. I guess this was in the early days of pandas typing journey when how best to help users type check their pandas code had not been decided?

Yes, it is from 2019 and pandas-stubs started in 2022.

I think you will find that using pandas-stubs will help improve your code, because the typing checks a lot more things than what is available with just using the library's typing, which is really meant for checking the internal code consistency.

Agreed. I think that now we should agree that pandas._typing is private, will always be private and that we now have no intentions of exposing this in the future. pandas-stubs is now provided to allow end-users to type check their code.

The issue that we now have is that people are doing imports from pandas._typing in their code (to support type checking). So we have to inventory that module (and potentially other private modules in pandas that people might be importing) to determine which ones we want to make public. I have to spend some time doing that inventory.

But then we also have pandas.api.typing so that could be confusing.

I would not be averse to renaming this to avoid confusion. The classes here are public. They are mostly returned from pandas operations and not directly constructed. So effectively we cannot change these classes without breaking backwards compatibility. They ARE part of the API, but were not included in the documented API section because users are not expected to construct these.

The reason that they were put in pandas.api.typing was to correct this problem that they ARE public but were not considered public. IIRC the reason "typing" was used in the naming is because when using type hints for return types in pandas, these return types/classes were incorrectly not considered public.

It's also because they appear in the docs. E.g., we document that pandas.api.typing.DataFrameGroupBy and pandas.api.typing.SeriesGroupBy are returned by groupby, so this allows people to use those types in their type-safe code.

So I actually think that whatever "types" we want to make public should be in pandas.api.typing, and we could consider moving some of the types from pandas._typing over to pandas.api.typing .

@simonjayhawkins
Member

The issue that we now have is that people are doing imports from pandas._typing in their code (to support type checking). So we have to inventory that module (and potentially other private modules in pandas that people might be importing) to determine which ones we want to make public. I have to spend some time doing that inventory.

maybe I'm misunderstanding. This sounds like the tail wagging the dog?

So I actually think that whatever "types" we want to make public should be in pandas.api.typing, and we could consider moving some of the types from pandas._typing over to pandas.api.typing .

Again I must be misunderstanding. It's not the "types" that we WANT to make public. It's anything that our public API returns that de facto is also public?

@WillAyd
Member

WillAyd commented Apr 14, 2025

Sorry for missing the original ping. It has been a while and a lot has changed in the landscape, but I think I agree with @simonjayhawkins as to the original intent of that private module, i.e. the module itself was not necessarily intended to be public in the way we normally expose objects, but rather the types included describe the API that we have

I don't have a strong preference on how to proceed as I haven't been involved in types in quite some time. I am happy to defer to whatever you think is best

@Dr-Irv
Contributor

Dr-Irv commented Apr 14, 2025

The issue that we now have is that people are doing imports from pandas._typing in their code (to support type checking). So we have to inventory that module (and potentially other private modules in pandas that people might be importing) to determine which ones we want to make public. I have to spend some time doing that inventory.

maybe I'm misunderstanding. This sounds like the tail wagging the dog?

At least for right now, the only way I can see to determine which types to expose is to look at the ones that are used in the tests of pandas-stubs.

So I actually think that whatever "types" we want to make public should be in pandas.api.typing, and we could consider moving some of the types from pandas._typing over to pandas.api.typing .

Again I must be misunderstanding. It's not the "types" that we WANT to make public. It's anything that our public API returns that de facto is also public?

It's anything that our public API returns, PLUS anything that is a type of a method/function argument.

For example, in _typing.py, we have:

CompressionOptions = Optional[
    Union[Literal["infer", "gzip", "bz2", "zip", "xz", "zstd", "tar"], CompressionDict]
]

and then a method like DataFrame.to_json() has compression: CompressionOptions in its signature, so a user should be able to write something like:

def compress_my_df(df: pd.DataFrame, copt: CompressionOptions):
    return df.to_json(..., compression = copt)

@simonjayhawkins
Member

It's anything that our public API returns, PLUS anything that is a type of a method/function argument.

I assume aliases used by end users type checking their code would be taken from pandas-stubs and NOT from pandas?

IIRC we discussed ways of generally keeping these in sync, but the types in pandas._typing and the aliases defined there are for type checking the pandas codebase ONLY (i.e. internally for robustness) and end users should NOT be using these directly from pandas?

@Dr-Irv
Contributor

Dr-Irv commented Apr 14, 2025

I assume aliases used by end users type checking their code would be taken from pandas-stubs and NOT from pandas?

As a user, you'd then have to surround any pandas-stubs specific imports with if TYPE_CHECKING, which is a bit annoying, and put any pandas-stubs attributes in your code surrounded by double quotes, because otherwise the type wouldn't exist at runtime.

So it's better if whatever is imported by a user is both in pandas and in pandas-stubs.

IIRC we discussed ways of generally keeping these in sync, but the types in pandas._typing and the aliases defined there are for type checking the pandas codebase ONLY (i.e. internally for robustness) and end users should NOT be using these directly from pandas?

Well, yes, end users should NOT be doing from pandas._typing import ..., but they do...

So the idea (which is the original intent of this issue) is that we pick a subset of pandas._typing that we want to allow users to import.

@simonjayhawkins
Member

So it's better if whatever is imported by a user is both in pandas and in pandas-stubs.

This seems to be a deviation from our original motivation of creating the stubs library in order to decouple the typing for end users from the codebase internal checks?

So the idea (which is the original intent of this issue) is that we pick a subset of pandas._typing that we want to allow users to import.

-1

We should retain the freedom to improve and enhance the internal codebase type checking without the constraints of breaking changes and deprecations?

@Dr-Irv
Contributor

Dr-Irv commented Apr 14, 2025

So it's better if whatever is imported by a user is both in pandas and in pandas-stubs.

This seems to be a deviation from our original motivation of creating the stubs library in order to decouple the typing for end users from the codebase internal checks?

I would say our original motivation was to provide typing support for users of the API above what is handled in the codebase.

But I have maintained a policy that if you import something in your code that is documented, it should work at runtime for users. This works fine if a user doesn't import something from pandas._typing. But users are doing that, and I think there are things in pandas._typing that are useful to expose to users.

So the idea (which is the original intent of this issue) is that we pick a subset of pandas._typing that we want to allow users to import.

-1

We should retain the freedom to improve and enhance the internal codebase type checking without the constraints of breaking changes and deprecations?

Absolutely. But I think there are types in pandas._typing that are useful for users. And I would argue that some of the things in pandas._typing are more useful for users than they are for our internal code base. (e.g., the example I gave with CompressionOptions).

@simonjayhawkins
Member

I appreciate the in-depth conversation on how users are relying on types from pandas._typing even though it was never meant for public consumption. I agree that it may benefit the community if we could clearly expose a well-defined and minimal subset of type aliases—such as MergeHow and CompressionOptions—that are used in public APIs. Placing these in a dedicated, documented module (for example, under pandas.api.typing) would reduce reliance on private internals and maybe help ensure consistency between pandas and pandas-stubs.

Of course, we need to be cautious about committing to a public API that might constrain our ability to evolve internal type checking. Clear documentation and a careful inventory of the types that are safe to expose would go a long way toward balancing stability with internal flexibility.

Overall, this seems like a promising direction to improve type checking support for users while keeping our internal evolution agile.

IIUC @twoertwein and @rhshadrach also seem to have similar reservations to myself.

@Dr-Irv If the proposed solution was labeled as experimental, allowing us to make changes at any time, would that be acceptable to you and the others?

@Dr-Irv
Contributor

Dr-Irv commented Apr 14, 2025

@Dr-Irv If the proposed solution was labeled as experimental, allowing us to make changes at any time, would that be acceptable to you and the others?

Yes, I'd be fine with that.

@simonjayhawkins
Member

I think I prefer @twoertwein's suggestion in #55231 (comment) of using pandas.api.typing.aliases. I think this conveys the intention to limit the scope of exported types to just aliases, while avoiding the potential naming confusion if we make pandas._typing public by just removing the underscore. I think this is also better than putting the aliases directly in pandas.api.typing, as it keeps a clear separation between the stable return types and the experimental method/function argument aliases.

@Dr-Irv
Contributor

Dr-Irv commented Apr 14, 2025

I think I prefer @twoertwein's suggestion in #55231 (comment) of using pandas.api.typing.aliases. I think this conveys the intention to limit the scope of exported types to just aliases, while avoiding the potential naming confusion if we make pandas._typing public by just removing the underscore. I think this is also better than putting the aliases directly in pandas.api.typing, as it keeps a clear separation between the stable return types and the experimental method/function argument aliases.

Seems reasonable to me.

erictraut added a commit to microsoft/pyright that referenced this issue Apr 14, 2025
… check to enforce imports from submodules whose names begin with an underscore. This change was disruptive to users of the popular `pandas` library because it exports a submodule named `_typing`. This was originally intended to be experimental or private, but it was never fully addressed. The pandas maintainers will work to fix this issue (pandas-dev/pandas#55231) over the next year. I'm going to back out this change to pyright in the meantime. (#10322)
@Andrej730

Is there any common practice for naming additional typing-only modules included in the stubs?

I recently ran into a similar reportPrivateImportUsage issue with the _typing module from fake-bpy-module (nutti/fake-bpy-module#362).

To give some background: these are stubs for the Blender Python API. Blender's Python API for the most part lives in C++, and there are lots of enums that the API uses but that are not really exposed to the Python API as such.
So, to make the stubs more readable, fake-bpy-module added a _typing module that groups all those enums, gives them reasonable names, and refers to them by their _typing names across the API. The name _typing was chosen instead of typing to avoid giving users the impression that this module is part of the real bpy module.
And this module is not for private use only: those enum literals can also be used to statically ensure that values are consistent with the API (in some cases they're even necessary to avoid typing errors, due to existing limitations like #8647). Without this module you would need to retype them manually and might miss a new value added in the next API update.

@Dr-Irv
Contributor

Dr-Irv commented Apr 22, 2025

Is there any common practice for naming additional typing-only modules included in the stubs?

As I understand it, the issue is whether you are importing a private-only module (e.g., pandas._typing) in code that imports pandas. It's OK to have a private typing module within the library, but those types should not be imported by external users.

In the case of pandas, we need to create a list of types that are OK to import when pandas is installed (and that should work at runtime as well), and have separate types for use in type checking the library source code.

You can create a stub-only module for typing, as long as only the stubs import that module. Runtime code would fail if you had a stubs-only module.
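The runtime failure is easy to reproduce with a throwaway module. In this sketch the module name stub_only_aliases is hypothetical; the file exists only as a .pyi stub, which type checkers read but the Python import system ignores.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A module that exists only as a stub file (hypothetical name).
    Path(tmp, "stub_only_aliases.pyi").write_text(
        'from typing import Literal\nTimeUnit = Literal["s", "ms", "us", "ns"]\n'
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # The interpreter only looks for .py files, packages, and extensions.
        importlib.import_module("stub_only_aliases")
        runtime_import_failed = False
    except ModuleNotFoundError:
        runtime_import_failed = True
    finally:
        sys.path.remove(tmp)

print(runtime_import_failed)
```

User code that imports such a module therefore has to guard the import with if TYPE_CHECKING, which is the friction described earlier in this thread.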

@Dr-Irv
Contributor

Dr-Irv commented Apr 27, 2025

@rhshadrach Here is a proposal of types to initially put in pandas.api.typing.aliases (or somewhere else), along with reasons why. I determined this by looking at what pyright versions < 1.1.400 were complaining about in the stub tests. Some of them may need to be copied over from the stubs.

| Type to Create | Meaning | Why Create it | Notes |
| --- | --- | --- | --- |
| Components | named tuple of Timedelta components | Result of Timedelta.components | Might be better in pandas.tseries.api |
| DictWrapper | class in pandas._config | Setter for pandas.options | Users may need to make sure they are passing the correct type |
| Display | Possible display options | Result of pandas.options.display | Users may want to type the result |
| DtypeObj | Union[np.dtype, "ExtensionDtype"] | Result of pandas.api.pandas.dtype | Users may want to type the result |
| IntervalClosedType | Literal["left", "right", "both", "neither"] | Result of IntervalIndex.closed | Users may want to type the result |
| Options | dict that describes options | Getter for pandas.options | Users may want to type the result |
| Scalar | Union[PythonScalar, PandasScalar, np.datetime64, np.timedelta64, date] | Return result of various methods | Users may want to type the result |
| TakeIndexer | Union[Sequence[int], Sequence[np.integer], npt.NDArray[np.integer]] | Used as argument for ExtensionArray.take | If creating an extension array, want to type that argument |
| TimeUnit | Literal["s", "ms", "us", "ns"] | Result of Timedelta.unit, Timestamp.unit | Users may want to type the result |

I based this on looking at the tests we do in pandas-stubs that would be useful to have available to import.

I still have to look at some other aliases that are used as arguments to various methods to see what else could be added (e.g., MergeHow that is in the OP of this issue).

Comments and criticism welcome!

@Dr-Irv
Contributor

Dr-Irv commented Apr 28, 2025

I still have to look at some other aliases that are used as arguments to various methods to see what else could be added (e.g., MergeHow that is in the OP of this issue).

I started to look at this and am not sure what to do. In the OP, he wrote:

I would propose the subset be all types that are used as arguments of public functions and methods.

Let's consider a function like read_parquet(). In pandas-stubs, this is typed as (note - this is not up to date with pandas 2.x):

def read_parquet(
    path: FilePath | ReadBuffer[bytes],
    engine: ParquetEngine = ...,
    columns: list[str] | None = ...,
    storage_options: StorageOptions = ...,
    **kwargs: Any,
) -> DataFrame: ...

Here are the relevant types in pandas-stubs._typing.pyi that are defined for that stub:

from os import PathLike
FilePath: TypeAlias = str | PathLike[str]
AnyStr_cov = TypeVar("AnyStr_cov", str, bytes, covariant=True)
class BaseBuffer(Protocol):
    @property
    def mode(self) -> str: ...
    def seek(self, offset: int, whence: int = ..., /) -> int: ...
    def seekable(self) -> bool: ...
    def tell(self) -> int: ...

class ReadBuffer(BaseBuffer, Protocol[AnyStr_cov]):
    def read(self, n: int = ..., /) -> AnyStr_cov: ...

ParquetEngine: TypeAlias = Literal["auto", "pyarrow", "fastparquet"]
StorageOptions: TypeAlias = dict[str, Any] | None

If the goal is to allow someone to write a cover function for read_parquet() (e.g., like the one for merge in the OP), then, on the one hand, we should expose all the relevant types used in the arguments of function declarations. If that's the case, then it's probably easiest to just change pandas/_typing.py to be public via pandas/typing.py and we don't have to worry about this being a private module.

On the other hand, maybe we should be more selective, and only include types like ParquetEngine that are Literal substitutes. I just don't know if only exposing the Literal types is enough or if we should have a rule to also include other types in pandas/_typing.py.
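If the "Literal substitutes only" rule were adopted, picking the candidates could be partly mechanical. The sketch below is only an illustration of that idea: the candidates dict holds stand-in definitions copied from this thread (nothing is imported from pandas), and the helper name literal_aliases is made up.

```python
from collections.abc import Hashable
from os import PathLike
from typing import Any, Literal, Optional, TypeVar, Union, get_origin

# Stand-ins for a few names mentioned in this thread (not imported from pandas).
candidates = {
    "MergeHow": Literal["cross", "inner", "left", "outer", "right"],
    "ParquetEngine": Literal["auto", "pyarrow", "fastparquet"],
    "TimeUnit": Literal["s", "ms", "us", "ns"],
    "FilePath": Union[str, PathLike[str]],
    "StorageOptions": Optional[dict[str, Any]],
    "HashableT2": TypeVar("HashableT2", bound=Hashable),
}


def literal_aliases(namespace: dict[str, Any]) -> list[str]:
    """Return the names whose value is a plain Literal[...] alias."""
    return sorted(
        name for name, tp in namespace.items() if get_origin(tp) is Literal
    )


print(literal_aliases(candidates))
```

With these stand-ins the helper selects MergeHow, ParquetEngine and TimeUnit. A Literal nested inside Optional or Union, as in CompressionOptions, would not be picked up by such a simple check, which is one reason a Literal-only rule may be too narrow.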

@rhshadrach
Member

I think we should not expose an alias like Components as-is. This is a fine internal name, but in my opinion it is not an okay name to expose for users. I think the same about DictWrapper.

If that's the case, then it's probably easiest to just change pandas/_typing.py to be public via pandas/typing.py and we don't have to worry about this being a private module.

We should never expose Manager, P, HashableT2, IndexT and a number of other things in _typing.py.

@Dr-Irv
Contributor

Dr-Irv commented Apr 28, 2025

I think we should not expose an alias like Components as-is. This is a fine internal name, but in my opinion it is not an okay name to expose for users. I think the same about DictWrapper.

I don't agree. If you call Timedelta.components, you'd like to know the type of what is returned. Now maybe we need to not expose it at the top level of typing - it could be pandas.tseries.api.Components .

And for DictWrapper, we could make it pandas.api.typing.DictWrapper .

We should never expose Manager, P, HashableT2, IndexT and a number of other things in _typing.py.

Yes, I agree, but maybe I should just go through all of _typing.py and suggest which ones we expose in pandas.api.typing ??

@rhshadrach
Member

rhshadrach commented Apr 28, 2025

To be sure, I'd be fine with something like TimedeltaComponents, but there are many things that can be referred to as Components, the name is not descriptive enough to stand on its own.
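To illustrate the naming suggestion, here is a sketch of what a descriptively named public type might look like. The class below is hypothetical and is not pandas' actual definition; the field names are an assumption based on the components that Timedelta.components reports, and total_whole_seconds is a made-up consumer.

```python
from typing import NamedTuple


class TimedeltaComponents(NamedTuple):
    """Hypothetical public name for the result of Timedelta.components."""

    days: int
    hours: int
    minutes: int
    seconds: int
    milliseconds: int
    microseconds: int
    nanoseconds: int


def total_whole_seconds(c: TimedeltaComponents) -> int:
    """Example consumer: whole seconds, ignoring the sub-second fields."""
    return ((c.days * 24 + c.hours) * 60 + c.minutes) * 60 + c.seconds
```

A user-facing signature such as total_whole_seconds(c: TimedeltaComponents) reads unambiguously, whereas a bare Components would need its module path to make sense.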


8 participants