Skip to content

Improve stub generation so it works better with pybind11 wrappers. #5814

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Oct 24, 2018

Conversation

pkerichang
Copy link
Contributor

  1. Modify walk_packages() so when --recursive option is set, submodules
    in C extensions are also visited properly.

  2. use inspect package to implement is_c_function(), is_c_method(),
    is_c_classmethod(), and is_c_type() so it is more accurate. The
    inspect methods are OR'ed with the old check to prevent breaking
    old code.

  3. Improve infer_sig_from_docstring() so it supports Python-annotation
    style type hints (like a: int) in docstrings.

1. Modify walk_packages() so when --recursive option is set, submodules
   in C extensions are also visited properly.

2. use inspect package to implement is_c_function(), is_c_method(),
   is_c_classmethod(), and is_c_type() so it is more accurate.  The
   inspect methods are OR'ed with the old check to prevent breaking
   old code.

3. Improve infer_sig_from_docstring() so it supports Python-annotation
   style type hints (like a: int) in docstrings.
1. fix bug where signatures may not have any commas.
1. added support for brackets.
2. fix bug in signature generation.
@gvanrossum
Copy link
Member

I appreciate your work, but you don't seem to be adding any tests. You're adding pretty subtle logic, e.g. the added complications to generate_c_function_stub() look like it took you several tries to get right, and I can't say it's easy to follow that code.

I also am not super excited that walk_packages() is now recursive -- I am having a hard time imagining what it does now it has doubled in length. (Maybe at least add a docstring and some comments?)

It would also be nice to show some of the output it used to give on pybind11 (I had never heard of it before and it looks complicated) and what it produces after your improvements.

1. added code to generate property definitions
   (type set to Any for now).

2. update is_c_classmethod() so that it's more robust.

3. In add_typing_import(), check for more common types.
1. Added support for parsing Google/Numpy style docstrings to
   infer property type.

2. Added Any annotation to function return value.
1. when trying to infer function signature from docstring, try to
   infer return type as well.  As the result, infer_sig_from_docstring()
   return type is changed to Optional[Tuple[str, str]].  When
   return type cannot be determined, it returns 'Any'.

2. annotate return type of C extension functions when generating stub
   files.

3. fix bug with is_c_property() check; exclude getsetdescriptor by
   checking for existence of fget attribute.

4. remove unused parameter from infer_prop_type_from_docstring().
1. Added additional unit tests for infer_sig_from_docstring()

2. added unit tests for infer_prop_type_from_docstring()

3. fix multi-line import style in stubgenc.py
@pkerichang
Copy link
Contributor Author

Hello,

  1. thanks for your comments. I created a repository that builds a C++ extension library demonstrating most features of pybind11:
    https://github.com/pkerichang/pybind_example

In the repo there is a pure-python version and a C++ version of the same package, and I ran the stub generation code on them both. The following is the stub generated from the pure-python package, which should be the golden reference (although I don't know why it's labeling cls as Any, then fail to import Any on the type line):

# Stubs for example.subpackage.submodule (Python 3.7)
#
# NOTE: This dynamically typed stub was automatically generated by stubgen.

from typing import Optional, Tuple

answer: int

def add(a: int, b: int=...) -> int: ...

class SampleClass:
    def __init__(self, value: Optional[Tuple[int, float]]=...) -> None: ...
    def add_to_private(self, val: int) -> int: ...
    @classmethod
    def get_desc(cls: Any) -> str: ...
    @property
    def var_private(self) -> int: ...
    @var_private.setter
    def var_private(self, val: int) -> None: ...
    @property
    def var_public(self) -> int: ...
    @var_public.setter
    def var_public(self, val: int) -> None: ...
    @property
    def var_readonly(self) -> int: ...

the stubgen script right now has several problems; it fails to identify sub-packages in the C extension (so the --recursive option doesn't work), and instead mark sub-packages as an "Any" attribute. Also, it produces the following output on the C extension version:

# Stubs for example.subpackage.submodule (Python 3.7)
#
# NOTE: This dynamically typed stub was automatically generated by stubgen.

from typing import Any

SampleClass: Any
answer: int

def add(*args, **kwargs): ...

which shows that it fails to recongize SampleClass as a class, not an attribute. My PR now produces the following output:

# Stubs for example.subpackage.submodule (Python 3.7)
#
# NOTE: This dynamically typed stub was automatically generated by stubgen.

from typing import Tuple, Optional

answer: int

def add(a: int, b: int=3) -> int: ...

class SampleClass(pybind11_object):
    def __init__(self, value: Optional[Tuple[int, float]]=None) -> None: ...
    def add_to_private(self, val: int) -> int: ...
    @classmethod
    def get_desc(cls) -> str: ...
    @property
    def var_private(self) -> int: ...
    @var_private.setter
    def var_private(self, val: int) -> None: ...
    @property
    def var_public(self) -> float: ...
    @var_public.setter
    def var_public(self, val: float) -> None: ...
    @property
    def var_readonly(self) -> int: ...
  1. I would certainly like to add tests to this too. However, I'm unsure about how to proceed. As stubgen needs to parse a C extension package (which is OS dependent), we would need to include sample C extensions for each OS. We can also compile it from source, but then we would need to rely on a C++ compiler. Which approach would you prefer?

@gvanrossum
Copy link
Member

I got stuck trying your example when it said "CMake not installed". Can you rewrite the setup.py to use the default mechanism for building a C extension?

@pkerichang
Copy link
Contributor Author

I updated the repo so there's some instructions in README, and setup.py should work without cmake. Thanks.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good enough (it's okay if stubgen is a little rougher around the edges than the rest of mypy, and if anything you're making it smoother :-).

# using the inspect module
subpackages = [package.__name__ + "." + name
for name, val in inspect.getmembers(package)
if inspect.ismodule(val)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I just realized something: inspect.getmembers() only returns current members. With ordinary Python packages, at least, import pkg does not import all the submodules -- only those that are imported by the __init__.py or have been otherwise imported by other code running before. E.g. I ran this for the http module and it doesn't return anything.

Now this may not be relevant for pybind11 or for your example package -- those seem to be comprised of all C extensions. Is that always the case? Or can a C extension module have a Python submodule? (Perhaps only if it also sets __path__, in which case we should be fine.)

Of course with the old way of doing things, no subpackages would be found either, so I guess it's still a net improvement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I did notice that importing a pybind11 package automatically import all the subpackages too, and was wondering if that's a pybind11 specific detail or it holds true for all C extensions in general. I'll keep this in mind and see if there's a more robust solution.

Thanks for the quick response!

@gvanrossum gvanrossum merged commit 2296af8 into python:master Oct 24, 2018
msullivan added a commit that referenced this pull request Nov 16, 2018
PR #5814 made stubgen try to walk as a module any attribute of a C
extension module of module type. This is too aggressive: mypyc
generated extension modules will often have attributes of module type
from imports that do not correspond to actual modules.

This causes a stubgen test to fail under mypy_mypyc while trying to
import `mypy.errors.os`, since `mypy.errors` imports `os`.

Only walk such a submodule when its `__name__` matches it being a
submodule.
msullivan added a commit that referenced this pull request Nov 16, 2018
PR #5814 made stubgen try to walk as a module any attribute of a C
extension module of module type. This is too aggressive: mypyc
generated extension modules will often have attributes of module type
from imports that do not correspond to actual modules.

This causes a stubgen test to fail under mypy_mypyc while trying to
import `mypy.errors.os`, since `mypy.errors` imports `os`.

Only walk such a submodule when its `__name__` matches it being a
submodule.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants