Should we accept new thirdparty stubs? #2440

srittau · 2018-09-10T15:54:10Z

Now that PEP 561 is supported by mypy I wonder whether we should accept new third party stubs into typeshed. Personally, I think we should steer those contributions to either be included into the parent package or to be distributed as a separate stubs package. This means that the people who actually care about those stubs (i.e. the submitters) can update the stubs more easily without having to go through the typeshed maintainers.

Maybe we could also offer a kind of "review service" (as time allows) for getting those packages off the ground?

gvanrossum · 2018-09-10T16:20:29Z

Agreed. Though I think we could make (carefully considered) exceptions for commonly used packages of low complexity where the upstream maintainers aren't interested in putting out stubs (but do allow us to put them in typeshed). Packages that look like their stubs would require significant work to create or keep up to date should not be accepted into typeshed. Nor should packages without a large user base.

srittau · 2018-09-10T16:44:55Z

Makes sense. For example, it makes sense to have stubs for the packaging infrastructure in typeshed. The same is true for backports of stdlib packages.

JelleZijlstra · 2018-09-10T18:07:19Z

I also agree. I like your "review service" idea; here are a few other things that could perhaps be helpful:

Provide a listing (maybe in typeshed's repo or the mypy docs) of known third-party stub packages of reasonably high quality.
Provide tooling (e.g., an abstraction for running mypy tests) to help authors of stub packages. I think @ethanhs has been thinking about this too.

emmatyping · 2018-09-10T22:20:47Z

I fully agree, the PEP is clear we want to decentralize typed packages so that it scales better :)

I've already started offering help to several early adopters (numpy, attrs, hypothesis, and now typed-django), so if we want to formalize that process, I am all for it.

I don't have a strong opinion on providing a listing of known third-party stub packages. It will certainly help discoverability, but will likely grow unwieldy over time.

As for tooling, I have a few things I am hoping to accomplish:

better testing infrastructure (see https://github.com/ethanhs/pytest-pep484, I have some ideas for this, but have yet to find time to implement anything)
setuptools support for these packages. Ideally there would be a way to easily find pyi files and have them be installed
Warehouse should probably give <pkg>-stubs to the owner of <pkg> (see Handle security implications of PEP 561 type hinting packages pypi/warehouse#4164)
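
To make the setuptools point concrete, here is a minimal sketch of how a stub-only package could gather its .pyi files for installation. The helper name `stub_package_data` and the `demo-stubs` directory are hypothetical; only the `<pkg>-stubs` naming convention comes from PEP 561.

```python
from pathlib import Path
from typing import Dict, List


def stub_package_data(root: str, package: str) -> Dict[str, List[str]]:
    """Collect the .pyi files under root/package for setuptools' package_data.

    Hypothetical helper: paths are returned relative to the package
    directory, with forward slashes, which is the form package_data expects.
    """
    package_dir = Path(root) / package
    stubs = sorted(
        path.relative_to(package_dir).as_posix()
        for path in package_dir.rglob("*.pyi")
    )
    return {package: stubs}
```

In a setup.py this could be passed as `package_data=stub_package_data(".", "demo-stubs")` alongside `packages=["demo-stubs"]`, so that stubs in subdirectories are not silently left out of the distribution.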

Also, I created https://github.com/ethanhs/pep-561 which may be a good place to discuss PEP 561 specific issues.


srittau · 2018-09-11T02:12:23Z

Please have a look at PR #2441, where I try to reword our contributing guidelines with regard to third party stubs.

srittau · 2018-09-11T02:38:20Z

Here is a list of issues and pull requests for new third party libraries. I suggest closing most of them with an explanation:

Stub for flask #28, request for flask stubs (very worthwhile, but outside of typeshed)
No stubs for mysql #146, request for mysql stubs
Missing pyspark third-party library #1494, request for pyspark stubs
Add stubs for cython #1554, PR for Cython stubs, stale since September 2017
Could use stubs for third_party package psutil #1675, request for psutil stubs, psutil is an optional mypy dependency though
Django #1702, request for Django stubs, seems there is also an external effort at https://github.com/TypedDjango/django-stubs
Here is a working stub for the chardet package from PyPI #1707, request for chardet stubs
Creating PyMongo Stubs #1768, request for pymongo stubs
Add stubs for OpenSSL.SSL #2052, request for pyopenssl stubs
No stub for setuptools? #2171, request for setuptools stubs, although if someone was to contribute this, I think it would make sense to have it in typeshed
Add stubs for tabulate #2384, PR for tabulate stubs

ilevkivskyi · 2018-09-11T15:23:17Z

I like the idea of decentralizing things. Another example is https://github.com/dropbox/sqlalchemy-stubs, which is already a fully functional PEP 561 package.

ilevkivskyi · 2018-09-11T15:27:42Z

Btw https://github.com/dropbox/sqlalchemy-stubs is also a nice example of how the mypy test framework can be used to easily add tests/CI for a stub package.

bluetech · 2018-09-22T20:27:36Z

In my opinion as a mypy user, this will not be a good move at this time. Here is how I order the methods of third party stubs distribution by preference:

1. Bundled with the package itself.
2. Typeshed.
3. Separate package (e.g. somepackage-stubs).

I think (1) is uncontroversial. The CONTRIBUTING.md documentation already encourages doing so whenever upstream is receptive.

I prefer (2) over (3) mainly because a separate package adds a dependency that I would otherwise not have. This means that I have to find out that it exists (not bundled with mypy); I have to trust the stubs' author; I have to keep the package updated; it can be abandoned in which case I need to fork or find a replacement; if I find a problem with the stubs and want to contribute a fix, the experience can be highly variable (compared to typeshed), and so I and others are less likely to do so.

TypeScript should be mentioned as a point of comparison. I'd say they are using approach "2⅓" (see DefinitelyTyped). They cannot have a single "typeshed" package because it would be way too big (hopefully typeshed will have this problem too one day), and it would not allow stubs to be versioned separately. So what they do is have a single centrally managed repository, but package each stub under a discoverable, predictable, centrally managed namespace @types/some-package. In my experience this works very well. In fairly big TypeScript projects I have only ever used bundled stubs or @types/ stubs.

A note about huge projects like https://github.com/dropbox/sqlalchemy-stubs and https://github.com/TypedDjango/django-stubs. I think it makes a lot of sense to develop them in a separate repository. But once they are production-ready I would still like to see them merged upstream, or absent that, to typeshed.

gvanrossum · 2018-09-22T20:55:43Z

Also, I think the infrastructure we have for running mypy at Dropbox strongly favors typeshed. (To the extent that we've special-cased the sqlalchemy and numpy stubs in a few places.) If the trend away from typeshed continues we can adapt our infrastructure to it, but I haven't looked into the required effort, as (apart from those two) it hasn't come up yet. The point that decentralized infrastructure means that stub maintainers could walk away and it would be somewhat difficult to transfer maintenance to a new volunteer is also well taken.

JukkaL · 2018-09-24T11:03:55Z

All in all, I basically agree with most points made by @bluetech.

Here is some reasoning:

I worry that separately maintained stubs will get out of date after a while, or even stop working with the most recent version of mypy (or pytype, etc.).
If many packages are not interested in bundling type annotations, there could be hundreds (or even thousands) of separately maintained stub packages, each potentially with a unique contributor agreement, contributor documentation, code review process, style conventions, testing/QA approach, and release policy (or lack thereof). This raises the bar for making a contribution (@bluetech also raised this point).
Finding the right stub package can become a pain, since there can be multiple competing stub packages (say, forks of stale stub projects).
Some stub packages for less popular modules could be hard to find, especially if the package has a generic name that results in bogus Google results.
Setting up a new GitHub project for just a set of stubs for some package seems like a lot of friction. We can make it easier by providing templates, but the point still stands.

I think that we should try to learn from DefinitelyTyped -- they have ~20x the number of commits compared to typeshed, so their model appears to scale well. Here are some things that I was quickly able to learn and that we might want to adopt in some form:

They include thousands of third party packages in a single repository (and this seems to work well for them).
They automatically release new type packages from github master.
They allow installing types for a single package from the repository (as mentioned by @bluetech).
Their readme explains how to install types and how to contribute new types, in quite a lot of detail. The latter could help attract more contributors.
They support writing tests for type definitions and explain this in their readme. Many tests seem pretty simple (example and another example) but they may still be very helpful, as the reviewers can't be expected to be experts on every single package there is.
They have a search engine for type definitions.

Packaging stubs seems like a problem for us. We could have a single monolithic typeshed package as we currently have, but this approach has issues:

If there will eventually be thousands of packages, the package will be huge.
Updates to newer typeshed versions can be painful, since it will be hard/impossible to pin the stub version for some package, in case the most recent version has some problem or there is a non-backward-compatible API change.

Maybe we could somehow automatically generate a separate PyPI package for each new contributed package stub? Or we could have a script that makes this easy to do, even if it requires some manual work. We could automatically generate a version number from the last modified date.
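
As a rough illustration of the date-based version idea, the sketch below derives a `YEAR.MONTH.DAY` version from the newest .pyi file in a stub directory. The function names and the version scheme are assumptions made for this example, not anything typeshed has adopted; file modification times are also not preserved by a git checkout, so a real script would more likely read the date of the last commit touching the directory.

```python
from datetime import datetime, timezone
from pathlib import Path


def version_from_timestamp(timestamp: float) -> str:
    """Turn a POSIX timestamp into a YEAR.MONTH.DAY version string (UTC)."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return f"{moment.year}.{moment.month}.{moment.day}"


def stub_version(stub_dir: str) -> str:
    """Version a stub directory by its most recently modified .pyi file."""
    mtimes = [path.stat().st_mtime for path in Path(stub_dir).rglob("*.pyi")]
    if not mtimes:
        raise ValueError(f"no .pyi files found under {stub_dir}")
    return version_from_timestamp(max(mtimes))
```

A date-based version always increases with newer changes, which would also let users pin the stubs for one package to a known-good day.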

There is also the issue of needing to ask for permission from the package maintainers. This creates some friction. We could work around this by proactively asking for permission from the owners of the top 500 most popular packages on PyPI, for example, and making that list prominently visible somewhere.

JukkaL · 2018-09-24T11:22:48Z

Additional thoughts:

DefinitelyTyped makes it easy to specify strictness options per package (example).
If we released a new stub package version automatically after every commit, it could increase contributor activity, since there is a shorter wait before changes become available.
Having stub tests (even if they are incomplete) will make code reviews easier (this is something we've discussed earlier).
In order to grow typeshed significantly, we'd probably need more active code reviewers. One idea would be to introduce reviewers who focus only on certain packages. It should be easier to ramp up reviewers this way.

srittau · 2018-09-24T11:30:12Z

A few points:

My main issue is scaling. In the long term I believe that mypy and other type checkers should not ship all stubs, just the stdlib stubs and maybe a few select others. This does not mean that stdlib and third party stubs can't be in the same repository, but we'd need to rethink how typeshed packages are published.
I believe that the best way is to publish stubs in the parent package, supported by upstream maintainers. I feel that DefinitelyTyped actually discourages package maintainers from doing that.
I don't know how DefinitelyTyped is handling the contributor bottleneck, but typeshed currently certainly has one.
I don't think that third party stubs in typeshed will necessarily be maintained any better than stubs in an external package. We have quite a few third party stubs that are not in a great state.

That said, I agree that the DefinitelyTyped model seems quite successful. I especially like the predictability of package names and the high quality of the declaration files available. In summary I agree that working towards making typeshed more "DefinitelyTyped"-like is the better approach.

ilevkivskyi · 2018-09-24T14:19:10Z

> They support writing tests for type definitions and explain this in their readme.

One thing that can help with this is publishing the mypy test runner separately as a pytest plugin. With it, one can write nice simple tests like those in https://github.com/dropbox/sqlalchemy-stubs/blob/master/test/test-data/sqlalchemy-basics.test (that repository includes mypy as a submodule, but we can't do this here, because typeshed is already a submodule in mypy).

JukkaL · 2018-09-24T14:49:33Z

> This does not mean that stdlib and third party stubs can't be in the same repository, but we'd need to rethink how typeshed packages are published.

As I mentioned above, I also don't think that publishing all stubs in a single package is scalable.

> I believe that the best way is to publish stubs in the parent package, supported by upstream maintainers.

This is reasonable at least for packages that are annotated, though currently we are missing tooling to generate stubs from annotated code. For packages without inline annotations, it may sometimes be better to have the stubs in typeshed (hopefully maintained by the package maintainer), at least if typeshed provides tooling that makes it easier to test and maintain the stubs.

> I don't think that third party stubs in typeshed will necessarily be maintained any better than stubs in an external package. We have quite a few third party stubs that are not in a great state.

Yes, the stubs can still be of low quality. However, we could have some infrastructure (such as ability to write tests) that would help maintain quality. We'd at least ensure that the stubs don't get completely broken and that they work with recent versions of tools. Being able to easily declare that everything in a stub should be annotated could also be helpful.
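
One way to picture the "everything in a stub should be annotated" check is the sketch below, which parses stub source with the standard `ast` module and lists functions that lack a parameter or return annotation. The function name is hypothetical, and skipping parameters named `self` or `cls` is a simplifying heuristic; a type checker's own strictness options would be more thorough.

```python
import ast
from typing import List

# Parameters conventionally left unannotated in methods (heuristic).
IMPLICIT_PARAMS = {"self", "cls"}


def unannotated_functions(stub_source: str) -> List[str]:
    """Return sorted names of functions missing any annotation."""
    incomplete = set()
    for node in ast.walk(ast.parse(stub_source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        spec = node.args
        params = [*spec.posonlyargs, *spec.args, *spec.kwonlyargs]
        params += [p for p in (spec.vararg, spec.kwarg) if p is not None]
        missing_param = any(
            p.annotation is None and p.arg not in IMPLICIT_PARAMS
            for p in params
        )
        if missing_param or node.returns is None:
            incomplete.add(node.name)
    return sorted(incomplete)
```

Because the check lives in one place, a central repository could run it over every stub at once, which is the scaling argument made in the next paragraph.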

My main argument is that if we have a central repository, any improvements to tooling around stubs immediately help all stubs in the repository (and new stubs), which makes things more scalable. Also, adding a new stub could be as simple as creating a PR with the stubs (+ tests).

Alternatively, we could provide tools or even a project template to make it relatively easy to have similar things for decentralized stubs, but the barrier to entry would still be much higher. Somebody would have to create a GitHub repo, learn about available tooling, set things up correctly (including tests and Travis), keep track of issues, review PRs, etc. This works for big projects where there is a lot of developer interest, such as Django, but I fear that for less popular packages there is nobody willing to do this -- but somebody might be willing to make a one-time contribution with only the stubs (and a few tests).

srittau · 2018-10-01T12:21:05Z

I'm trying to get a feel for how we should proceed. Personally, #2491 ("DefinitelyTypeshed") seems more promising than rejecting third-party stubs in the future. What does everyone else think? Reject 3rd party? DefinitelyTypeshed? Keep the status quo for now? Another option? Whatever we decide, I will try to make some time to work on it.

JukkaL · 2018-10-01T14:30:55Z

I'm in favor of moving forward with #2491.

rspeer · 2018-10-01T21:45:08Z

I have a contributor who is interested in adding .pyi type hints to ordered_set, and is trying to follow PEP 561 to make them get distributed and installed as part of the ordered_set distribution. He has not succeeded, and I don't know if this is a tooling problem or a documentation problem. The two of us can't even figure out what code in Python is supposed to make the py.typed mechanism work.

I was going to suggest contributing the type hints to typeshed instead, but then I found this discussion.

Am I right that PEP 561 is still very future-looking and doesn't work with most people's tooling? Here's a contributor who's very interested in adding type hints, and a maintainer who's interested in merging the PR when it works, and we're both unable to follow the PEP and get type hints installed. If this is the state of things and we're not missing something obvious, then typeshed should keep accepting new packages until its replacement works for more people.
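
For context on the py.typed question: under PEP 561 nothing in the Python interpreter acts on the marker; type checkers look for the file inside the installed package directory. The sketch below imitates that lookup so a maintainer can check whether the marker was actually installed. The function name is hypothetical, and real type checkers have their own search logic.

```python
import importlib.util
from pathlib import Path


def has_py_typed(package: str) -> bool:
    """Report whether an importable package ships a py.typed marker file."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.submodule_search_locations is None:
        # Not installed, or a single-file module with no package directory.
        return False
    return any(
        (Path(location) / "py.typed").is_file()
        for location in spec.submodule_search_locations
    )
```

If this returns False for a freshly installed package, the usual suspect is that py.typed was not listed in the distribution's package data.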

rspeer · 2018-10-01T21:56:35Z

Oh. I figured out why PEP 561 isn't helping us: "This PEP does not support distributing typing information as part of module-only distributions."

So PEP 561 doesn't cover everything. Is this also a case where typeshed wouldn't apply?

srittau · 2018-10-21T11:20:47Z

@rspeer Sorry for the late answer. The best alternative I found is to convert ordered_set into a package, like I did with the asserts package.
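
The conversion can be pictured with the sketch below, which turns a single-file module into a package directory carrying a py.typed marker. The helper name is hypothetical and this is only an illustration of the idea, not necessarily how the asserts package was converted.

```python
from pathlib import Path


def module_to_package(module_file: str) -> Path:
    """Turn name.py into name/__init__.py and add an empty py.typed marker."""
    source = Path(module_file)
    package_dir = source.with_suffix("")  # e.g. ordered_set.py -> ordered_set/
    package_dir.mkdir()
    source.rename(package_dir / "__init__.py")
    (package_dir / "py.typed").touch()
    return package_dir
```

After such a change the setup.py would list the directory under `packages` instead of `py_modules` and include py.typed in `package_data`, so that the marker is installed along with the code.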

srittau · 2018-10-21T11:23:02Z

I am closing this in favor of #2491, since I feel the consensus is towards that solution. Please reopen if you think that there should be more discussion about this.

srittau added a commit to srittau/typeshed that referenced this issue Sep 11, 2018

Don't accept new third party stubs anymore

12d1353

Closes: python#2440

srittau mentioned this issue Sep 11, 2018

Don't accept new third party stubs anymore #2441

Closed

srittau mentioned this issue Sep 27, 2018

Explore building third party stubs as packages #2491

Closed

srittau added the "project: policy" label Oct 1, 2018

srittau closed this as completed Oct 21, 2018
