
Commit e435594

chadrik, pre-commit-ci[bot], and JelleZijlstra authored
stubgen: unify C extension and pure python stub generators with object oriented design (#15770)
This MR is a major overhaul to `stubgen`. It has been tested extensively in the process of creating stubs for multiple large and varied libraries (detailed below).

## User story

The impetus of this change is as follows: as a maintainer of third-party stubs I do _not_ want to use `stubgen` as a starting point for hand-editing stub files; I want a framework to regenerate stubs against upstream changes to a library.

## Summary of Changes

- Introduces an object-oriented design for C extension stub generation, including a common base class that is shared between inspection-based and parsing-based stub generation.
- Generally unifies and harmonizes the behavior between inspection and parsing approaches. For example, function formatting, import tracking, signature generators, and attribute filtering are now handled with the same code.
- Adds support for `--include-private` and `--export-less` to C extensions (inspection-based generation).
- Adds support for force enabling inspection-based stub generation (the approach used for C extensions) on pure Python code using a new `--inspect-mode` flag. Useful for packages that employ dynamic function or class factories. Also makes it possible to generate stubs for pyc-only modules (yes, this is a real use case).
- Adds an alias `--no-analysis` for `--parse-only` to clarify the purpose of this option.
- Removes filtering of `__version__` attribute from modules: I've encountered a number of cases in real-world code that utilize this attribute.
- Adds a number of tests for inspection mode. Even though these run on pure Python code they increase coverage of the C extension code since it shares much of the same code base.

Below I've compiled some basic information about each stub library that I've created using my changes, and a link to the specialized code for procedurally generating the stubs.

| Library | code type | other notes |
| --- | --- | --- |
| [USD](https://github.com/LumaPictures/cg-stubs/blob/master/usd/stubgen_usd.py) | boost-python | integrates types from doxygen |
| [katana](https://github.com/LumaPictures/cg-stubs/blob/master/katana/stubgen_katana.py) | pyc and C extensions | uses epydoc docstrings. has pyi-only packages |
| [mari](https://github.com/LumaPictures/cg-stubs/blob/master/mari/stubgen_mari.py) | pure Python and C extensions | uses epydoc docstrings |
| [opencolorio](https://github.com/LumaPictures/cg-stubs/blob/master/ocio/stubgen_ocio.py) | pybind11 | |
| [pyside2](https://github.com/LumaPictures/cg-stubs/blob/master/pyside/stubgen_pyside.py) | shiboken | |
| substance_painter | pure Python | basic / non-custom. reads types from annotations |
| pymel | pure Python | integrates types parsed from custom docs |

I know that this is a pretty big PR, and I know it's a lot to go through, but I've spent a huge amount of time on it and I believe this makes mypy's stubgen tool the absolute best available. If it helps, I also have 13 merged mypy PRs under my belt and I'll be around to fix any issues if they come up.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jelle Zijlstra <[email protected]>
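
The shared-base-class idea from the summary can be pictured with a small sketch. This is not mypy's code: `StubGeneratorBase`, `ParsingGenerator`, `InspectionGenerator`, and all of their methods are hypothetical names chosen for illustration. It only shows how two discovery strategies (parsing source versus inspecting a live module) could feed a single filtering and formatting path, which is the kind of unification the commit message describes.

```python
from __future__ import annotations

import ast
import inspect
import types
from abc import ABC, abstractmethod


class StubGeneratorBase(ABC):
    """Hypothetical base: subclasses discover names, the base filters and formats."""

    def __init__(self, include_private: bool = False) -> None:
        self.include_private = include_private

    @abstractmethod
    def function_names(self) -> list[str]:
        """Return the function names found by this discovery strategy."""

    def is_exported(self, name: str) -> bool:
        # Shared attribute filtering, used by both strategies.
        return self.include_private or not name.startswith("_")

    def generate(self) -> str:
        # Shared formatting, used by both strategies.
        names = [n for n in self.function_names() if self.is_exported(n)]
        return "\n".join(f"def {n}(*args, **kwargs): ..." for n in names)


class ParsingGenerator(StubGeneratorBase):
    """Discovers functions by parsing source text."""

    def __init__(self, source: str, include_private: bool = False) -> None:
        super().__init__(include_private)
        self.source = source

    def function_names(self) -> list[str]:
        tree = ast.parse(self.source)
        return [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]


class InspectionGenerator(StubGeneratorBase):
    """Discovers functions by looking at an imported module object."""

    def __init__(self, module: types.ModuleType, include_private: bool = False) -> None:
        super().__init__(include_private)
        self.module = module

    def function_names(self) -> list[str]:
        return [name for name, value in vars(self.module).items() if inspect.isfunction(value)]


SOURCE = "def public():\n    pass\n\ndef _private():\n    pass\n"
module = types.ModuleType("demo")  # hypothetical module built from SOURCE
exec(SOURCE, module.__dict__)

parsed_stub = ParsingGenerator(SOURCE).generate()
inspected_stub = InspectionGenerator(module).generate()
private_stub = InspectionGenerator(module, include_private=True).generate()
print(private_stub)
```

Because filtering and formatting live in the base class, an option such as `include_private` behaves the same way regardless of how the names were discovered.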
1 parent ff9deb3 commit e435594


12 files changed: +2125 -1442 lines changed


docs/source/stubgen.rst

Lines changed: 12 additions & 2 deletions
@@ -127,12 +127,22 @@ alter the default behavior:
     unwanted side effects, such as the running of tests. Stubgen tries to skip test
     modules even without this option, but this does not always work.
 
-.. option:: --parse-only
+.. option:: --no-analysis
 
     Don't perform semantic analysis of source files. This may generate
     worse stubs -- in particular, some module, class, and function aliases may
     be represented as variables with the ``Any`` type. This is generally only
-    useful if semantic analysis causes a critical mypy error.
+    useful if semantic analysis causes a critical mypy error. Does not apply to
+    C extension modules. Incompatible with :option:`--inspect-mode`.
+
+.. option:: --inspect-mode
+
+    Import and inspect modules instead of parsing source code. This is the default
+    behavior for C modules and pyc-only packages. The flag is useful to force
+    inspection for pure Python modules that make use of dynamically generated
+    members that would otherwise be omitted when using the default behavior of
+    code parsing. Implies :option:`--no-analysis` as analysis requires source
+    code.
 
 .. option:: --doc-dir PATH
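
To illustrate why inspection can find members that parsing misses, here is a minimal sketch that is independent of stubgen's actual implementation. It assumes a hypothetical module, `dynmod`, whose source creates two functions in a loop at import time. The helper names `names_by_parsing` and `names_by_inspection` are illustrative only.

```python
from __future__ import annotations

import ast
import types

# Hypothetical module source: two functions are generated at import time.
SOURCE = '''
def static_func():
    return 0

for _name in ("get_x", "get_y"):
    globals()[_name] = lambda _n=_name: _n
'''


def names_by_parsing(source: str) -> set[str]:
    """Collect top-level function names visible to a static parser."""
    tree = ast.parse(source)
    return {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}


def names_by_inspection(source: str) -> set[str]:
    """Execute the module and collect public callables that exist at runtime."""
    module = types.ModuleType("dynmod")
    exec(source, module.__dict__)
    return {
        name
        for name, value in vars(module).items()
        if callable(value) and not name.startswith("_")
    }


parsed = names_by_parsing(SOURCE)
inspected = names_by_inspection(SOURCE)
print(sorted(parsed))     # only the statically defined function
print(sorted(inspected))  # also includes the dynamically generated ones
```

The trade-off is that inspection has to execute the module, so any import-time side effects will run.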

mypy/moduleinspect.py

Lines changed: 4 additions & 0 deletions
@@ -39,6 +39,10 @@ def is_c_module(module: ModuleType) -> bool:
     return os.path.splitext(module.__dict__["__file__"])[-1] in [".so", ".pyd", ".dll"]
 
 
+def is_pyc_only(file: str | None) -> bool:
+    return bool(file and file.endswith(".pyc") and not os.path.exists(file[:-1]))
+
+
 class InspectError(Exception):
     pass
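
A quick way to see what `is_pyc_only` checks: it strips the trailing `c` from a `.pyc` path and reports whether the matching `.py` file is absent. The sketch below copies the function body from the diff and exercises it in a temporary directory. The file names and the `touch` helper are made up for illustration.

```python
from __future__ import annotations

import os
import tempfile


def is_pyc_only(file: str | None) -> bool:
    # Same logic as the function added in the diff above.
    return bool(file and file.endswith(".pyc") and not os.path.exists(file[:-1]))


def touch(path: str) -> None:
    """Create an empty file (illustrative helper)."""
    with open(path, "w"):
        pass


with tempfile.TemporaryDirectory() as tmp:
    # A .pyc with no .py next to it.
    orphan_pyc = os.path.join(tmp, "compiled_only.pyc")
    touch(orphan_pyc)

    # A .pyc that sits beside its .py source.
    paired_pyc = os.path.join(tmp, "with_source.pyc")
    paired_py = os.path.join(tmp, "with_source.py")
    touch(paired_pyc)
    touch(paired_py)

    results = {
        "orphan": is_pyc_only(orphan_pyc),
        "paired": is_pyc_only(paired_pyc),
        "source": is_pyc_only(paired_py),
        "none": is_pyc_only(None),
    }

print(results)
```

Only the orphaned `.pyc` is reported as pyc-only; a missing path (`None`) and a plain `.py` path both yield `False`.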

mypy/stubdoc.py

Lines changed: 90 additions & 10 deletions
@@ -8,11 +8,14 @@
 
 import contextlib
 import io
+import keyword
 import re
 import tokenize
 from typing import Any, Final, MutableMapping, MutableSequence, NamedTuple, Sequence, Tuple
 from typing_extensions import TypeAlias as _TypeAlias
 
+import mypy.util
+
 # Type alias for signatures strings in format ('func_name', '(arg, opt_arg=False)').
 Sig: _TypeAlias = Tuple[str, str]

@@ -35,12 +38,16 @@ class ArgSig:
 
     def __init__(self, name: str, type: str | None = None, default: bool = False):
         self.name = name
-        if type and not is_valid_type(type):
-            raise ValueError("Invalid type: " + type)
         self.type = type
         # Does this argument have a default value?
         self.default = default
 
+    def is_star_arg(self) -> bool:
+        return self.name.startswith("*") and not self.name.startswith("**")
+
+    def is_star_kwarg(self) -> bool:
+        return self.name.startswith("**")
+
     def __repr__(self) -> str:
         return "ArgSig(name={}, type={}, default={})".format(
             repr(self.name), repr(self.type), repr(self.default)
@@ -59,7 +66,80 @@ def __eq__(self, other: Any) -> bool:
 class FunctionSig(NamedTuple):
     name: str
     args: list[ArgSig]
-    ret_type: str
+    ret_type: str | None
+
+    def is_special_method(self) -> bool:
+        return bool(
+            self.name.startswith("__")
+            and self.name.endswith("__")
+            and self.args
+            and self.args[0].name in ("self", "cls")
+        )
+
+    def has_catchall_args(self) -> bool:
+        """Return if this signature has catchall args: (*args, **kwargs)"""
+        if self.args and self.args[0].name in ("self", "cls"):
+            args = self.args[1:]
+        else:
+            args = self.args
+        return (
+            len(args) == 2
+            and all(a.type in (None, "object", "Any", "typing.Any") for a in args)
+            and args[0].is_star_arg()
+            and args[1].is_star_kwarg()
+        )
+
+    def is_catchall_signature(self) -> bool:
+        """Return if this signature is the catchall identity: (*args, **kwargs) -> Any"""
+        return self.has_catchall_args() and self.ret_type in (None, "Any", "typing.Any")
+
+    def format_sig(
+        self,
+        indent: str = "",
+        is_async: bool = False,
+        any_val: str | None = None,
+        docstring: str | None = None,
+    ) -> str:
+        args: list[str] = []
+        for arg in self.args:
+            arg_def = arg.name
+
+            if arg_def in keyword.kwlist:
+                arg_def = "_" + arg_def
+
+            if (
+                arg.type is None
+                and any_val is not None
+                and arg.name not in ("self", "cls")
+                and not arg.name.startswith("*")
+            ):
+                arg_type: str | None = any_val
+            else:
+                arg_type = arg.type
+            if arg_type:
+                arg_def += ": " + arg_type
+                if arg.default:
+                    arg_def += " = ..."
+
+            elif arg.default:
+                arg_def += "=..."
+
+            args.append(arg_def)
+
+        retfield = ""
+        ret_type = self.ret_type if self.ret_type else any_val
+        if ret_type is not None:
+            retfield = " -> " + ret_type
+
+        prefix = "async " if is_async else ""
+        sig = "{indent}{prefix}def {name}({args}){ret}:".format(
+            indent=indent, prefix=prefix, name=self.name, args=", ".join(args), ret=retfield
+        )
+        if docstring:
+            suffix = f"\n{indent}    {mypy.util.quote_docstring(docstring)}"
+        else:
+            suffix = " ..."
+        return f"{sig}{suffix}"
 
 
 # States of the docstring parser.
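
To make the behavior of `format_sig` and `has_catchall_args` concrete, here is a standalone sketch that mirrors their logic without importing mypy. It is a simplification, not the library's API: `ArgSig` is rebuilt as a `NamedTuple`, the two methods become free functions, and the `indent`, `is_async`, and `docstring` parameters are left out. The placeholder type name `Incomplete` is just an example value for `any_val`.

```python
from __future__ import annotations

import keyword
from typing import NamedTuple


class ArgSig(NamedTuple):
    """Simplified stand-in for mypy.stubdoc.ArgSig (assumption: fields only)."""

    name: str
    type: str | None = None
    default: bool = False

    def is_star_arg(self) -> bool:
        return self.name.startswith("*") and not self.name.startswith("**")

    def is_star_kwarg(self) -> bool:
        return self.name.startswith("**")


def has_catchall_args(args: list[ArgSig]) -> bool:
    """Mirror of FunctionSig.has_catchall_args: untyped (*args, **kwargs)."""
    if args and args[0].name in ("self", "cls"):
        args = args[1:]
    return (
        len(args) == 2
        and all(a.type in (None, "object", "Any", "typing.Any") for a in args)
        and args[0].is_star_arg()
        and args[1].is_star_kwarg()
    )


def format_sig(
    name: str, args: list[ArgSig], ret_type: str | None, any_val: str | None = None
) -> str:
    """Mirror of the core of FunctionSig.format_sig, minus indent/async/docstring."""
    parts: list[str] = []
    for arg in args:
        arg_def = arg.name
        # Argument names that collide with Python keywords get a leading underscore.
        if arg_def in keyword.kwlist:
            arg_def = "_" + arg_def
        # Untyped arguments fall back to any_val, except self/cls and star args.
        if (
            arg.type is None
            and any_val is not None
            and arg.name not in ("self", "cls")
            and not arg.name.startswith("*")
        ):
            arg_type: str | None = any_val
        else:
            arg_type = arg.type
        if arg_type:
            arg_def += ": " + arg_type
            if arg.default:
                arg_def += " = ..."
        elif arg.default:
            arg_def += "=..."
        parts.append(arg_def)
    chosen_ret = ret_type if ret_type else any_val
    ret = "" if chosen_ret is None else " -> " + chosen_ret
    return f"def {name}({', '.join(parts)}){ret}: ..."


method_args = [ArgSig("self"), ArgSig("class"), ArgSig("n", "int", True), ArgSig("flag", default=True)]
plain = format_sig("f", method_args, None)
filled = format_sig("f", method_args, None, any_val="Incomplete")
catchall = has_catchall_args([ArgSig("self"), ArgSig("*args"), ArgSig("**kwargs")])
typed = has_catchall_args([ArgSig("*args", "int"), ArgSig("**kwargs")])
print(plain)
print(filled)
```

Note the two spellings of a default: `= ...` with spaces when the argument carries an annotation, and `=...` without spaces when it does not.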
@@ -176,17 +256,17 @@ def add_token(self, token: tokenize.TokenInfo) -> None:
 
                 # arg_name is empty when there are no args. e.g. func()
                 if self.arg_name:
-                    try:
+                    if self.arg_type and not is_valid_type(self.arg_type):
+                        # wrong type, use Any
+                        self.args.append(
+                            ArgSig(name=self.arg_name, type=None, default=bool(self.arg_default))
+                        )
+                    else:
                         self.args.append(
                             ArgSig(
                                 name=self.arg_name, type=self.arg_type, default=bool(self.arg_default)
                             )
                         )
-                    except ValueError:
-                        # wrong type, use Any
-                        self.args.append(
-                            ArgSig(name=self.arg_name, type=None, default=bool(self.arg_default))
-                        )
                 self.arg_name = ""
                 self.arg_type = None
                 self.arg_default = None
@@ -240,7 +320,7 @@ def args_kwargs(signature: FunctionSig) -> bool:
 
 
 def infer_sig_from_docstring(docstr: str | None, name: str) -> list[FunctionSig] | None:
-    """Convert function signature to list of TypedFunctionSig
+    """Convert function signature to list of FunctionSig
 
     Look for function signatures of function in docstring. Signature is a string of
     the format <function_name>(<signature>) -> <return type> or perhaps without
