Should we accept new thirdparty stubs? #2440

srittau opened this issue Sep 10, 2018 · 21 comments
Labels
project: policy Organization of the typeshed project

Comments

@srittau
Collaborator

srittau commented Sep 10, 2018

Now that PEP 561 is supported by mypy, I wonder whether we should accept new third-party stubs into typeshed. Personally, I think we should steer those contributions toward either being included in the parent package or being distributed as a separate stubs package. This means that the people who actually care about those stubs (i.e. the submitters) can update the stubs more easily without having to go through the typeshed maintainers.

Maybe we could also offer a kind of "review service" (as time allows) for getting those packages off the ground?

@gvanrossum
Member

Agreed. Though I think we could make (carefully considered) exceptions for commonly used packages of low complexity where the upstream maintainers aren't interested in putting out stubs (but do allow us to put them in typeshed). Packages that look like their stubs would require significant work to create or keep up to date should not be accepted into typeshed. Nor should packages without a large user base.

@srittau
Collaborator Author

srittau commented Sep 10, 2018

Makes sense. For example, it makes sense to have stubs for the packaging infrastructure in typeshed. The same is true for backports of stdlib packages.

@JelleZijlstra
Member

I also agree. I like your "review service" idea; here are a few other things that could perhaps be helpful:

  • Provide a listing (maybe in typeshed's repo or the mypy docs) of known third-party stub packages of reasonably high quality.
  • Provide tooling (e.g., an abstraction for running mypy tests) to help authors of stub packages. I think @ethanhs has been thinking about this too.

@emmatyping
Member

emmatyping commented Sep 10, 2018

I fully agree, the PEP is clear we want to decentralize typed packages so that it scales better :)

I've already started offering help to several early adopters (numpy, attrs, hypothesis, and now typed-django), so if we want to formalize that process, I am all for it.

I don't have a strong opinion on providing a listing of known third-party stub packages. It will certainly help discoverability, but will likely grow unwieldy over time.

As for tooling, I have a few things I am hoping to accomplish:

Also, I created https://github.com/ethanhs/pep-561 which may be a good place to discuss PEP 561 specific issues.

srittau added a commit to srittau/typeshed that referenced this issue Sep 11, 2018
@srittau
Collaborator Author

srittau commented Sep 11, 2018

Please have a look at PR #2441, where I try to reword our contributing guidelines with regard to third-party stubs.

@srittau
Collaborator Author

srittau commented Sep 11, 2018

Here is a list of issues and pull requests for new third-party libraries. I suggest closing most of them with an explanation:

@ilevkivskyi
Member

I like the idea of decentralizing things. Another example is https://github.com/dropbox/sqlalchemy-stubs, which is already a fully functional PEP 561 package.

@ilevkivskyi
Member

Btw, https://github.com/dropbox/sqlalchemy-stubs is also a nice example of how the mypy test framework can be used to easily add tests/CI for a stub package.

@bluetech
Contributor

In my opinion as a mypy user, this will not be a good move at this time. Here is how I order the methods of third party stubs distribution by preference:

  1. Bundled with the package itself.
  2. Typeshed.
  3. Separate package (e.g. somepackage-stubs).

I think (1) is uncontroversial. The CONTRIBUTING.md documentation already encourages doing so whenever upstream is receptive.

I prefer (2) over (3) mainly because a separate package adds a dependency that I would otherwise not have. This means that I have to find out that it exists (not bundled with mypy); I have to trust the stubs' author; I have to keep the package updated; it can be abandoned in which case I need to fork or find a replacement; if I find a problem with the stubs and want to contribute a fix, the experience can be highly variable (compared to typeshed), and so I and others are less likely to do so.
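
For readers unfamiliar with option (3), here is a minimal sketch of what a separate stub-only distribution could look like under PEP 561. The `-stubs` suffix on the package directory is the PEP's naming convention; the helper name `stub_only_setup_kwargs`, the version string, and the file list are made up for illustration and do not come from this thread.

```python
from typing import Any, Dict, List


def stub_only_setup_kwargs(
    package: str, version: str, stub_files: List[str]
) -> Dict[str, Any]:
    """Build setuptools.setup() keyword arguments for a stub-only package.

    Hypothetical helper: only the "<package>-stubs" directory naming is
    taken from PEP 561; reusing it as the distribution name is a convention.
    """
    stubs_dir = f"{package}-stubs"  # PEP 561: stub-only packages end in "-stubs"
    return {
        "name": stubs_dir,
        "version": version,
        "packages": [stubs_dir],
        # .pyi files are not collected as Python modules, so they are
        # listed explicitly as package data.
        "package_data": {stubs_dir: stub_files},
    }


if __name__ == "__main__":
    # In a real setup.py this dict would be passed to setuptools.setup(**kwargs).
    print(stub_only_setup_kwargs("somepackage", "0.1", ["__init__.pyi"]))
```

A user would then install `somepackage-stubs` next to `somepackage`, which is exactly the extra dependency described above.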


TypeScript should be mentioned as a point of comparison. I'd say they are using approach "2⅓" (see DefinitelyTyped). They cannot have a single "typeshed" package because it would be way too big (hopefully typeshed will have this problem too one day), and it would not allow stubs to be versioned separately. So what they do is have a single centrally managed repository, but package each stub under a discoverable, predictable, centrally managed namespace, @types/some-package. In my experience this works very well. In fairly big TypeScript projects I have only ever used bundled stubs or @types/ stubs.

A note about huge projects like https://github.com/dropbox/sqlalchemy-stubs and https://github.com/TypedDjango/django-stubs. I think it makes a lot of sense to develop them in a separate repository. But once they are production-ready I would still like to see them merged upstream, or absent that, to typeshed.

@gvanrossum
Member

gvanrossum commented Sep 22, 2018 via email

@JukkaL
Contributor

JukkaL commented Sep 24, 2018

All in all, I basically agree with most points made by @bluetech.

Here is some reasoning:

  • I worry that separately maintained stubs will get out of date after a while, or even stop working with the most recent version of mypy (or pytype, etc.).
  • If many packages are not interested in bundling type annotations, there could be hundreds (or even thousands) of separately maintained stub packages, each potentially with a unique contributor agreement, contributor documentation, code review process, style conventions, testing/QA approach, and release policy (or lack thereof). This raises the bar for making a contribution (@bluetech also raised this point).
  • Finding the right stub package can become a pain, since there can be multiple competing stub packages (say, forks of stale stub projects).
  • Some stub packages for less popular modules could be hard to find, especially if the package has a generic name that results in bogus Google results.
  • Setting up a new GitHub project for just a set of stubs for some package seems like a lot of friction. We can make it easier by providing templates, but the point still stands.

I think that we should try to learn from DefinitelyTyped -- they have ~20x the number of commits compared to typeshed, so their model appears to scale well. Here are some things that I was quickly able to learn and that we might want to adopt in some form:

  1. They include thousands of third party packages in a single repository (and this seems to work well for them).
  2. They automatically release new type packages from github master.
  3. They allow installing types for a single package from the repository (as mentioned by @bluetech).
  4. Their readme explains how to install types and how to contribute new types, in quite a lot of detail. The latter could help attract more contributors.
  5. They support writing tests for type definitions and explain this in their readme. Many tests seem pretty simple (example and another example) but they may still be very helpful, as the reviewers can't be expected to be experts on every single package there is.
  6. They have a search engine for type definitions.

Packaging stubs seems like a problem for us. We could have a single monolithic typeshed package as we currently have, but this approach has issues:

  1. If there will eventually be thousands of packages, the package will be huge.
  2. Updates to newer typeshed versions can be painful, since it will be hard/impossible to pin the stub version for some package, in case the most recent version has some problem or there is a non-backward-compatible API change.

Maybe we could somehow automatically generate a separate PyPI package for each new contributed package stub? Or we could have a script that makes this easy to do, even if it requires some manual work. We could automatically generate a version number from the last modified date.
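
The date-based version idea could be sketched roughly as follows. This is only an illustration: the `YYYY.M.D` scheme, the function name `version_from_mtime`, and the choice of the newest `.pyi` file's modification time are all assumptions, since the thread does not pin down a scheme.

```python
from datetime import datetime, timezone
from pathlib import Path


def version_from_mtime(stub_dir: Path) -> str:
    """Return a date-based version string for a directory of stubs.

    Assumption: the version is the UTC date of the most recently
    modified .pyi file, formatted as YEAR.MONTH.DAY.
    """
    mtimes = [p.stat().st_mtime for p in stub_dir.rglob("*.pyi")]
    if not mtimes:
        raise ValueError(f"no .pyi files under {stub_dir}")
    latest = datetime.fromtimestamp(max(mtimes), tz=timezone.utc)
    # No zero padding, so the components compare as plain integers.
    return f"{latest.year}.{latest.month}.{latest.day}"
```

File modification times in a fresh git checkout reflect the checkout time, so a real release script would more likely read the date of the last commit that touched the directory.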

There is also the issue of needing to ask for permission from the package maintainers. This creates some friction. We could work around this by proactively asking for permission from the owners of the top 500 most popular packages on PyPI, for example, and making that list prominently visible somewhere.

@JukkaL
Contributor

JukkaL commented Sep 24, 2018

Additional thoughts:

  • DefinitelyTyped makes it easy to specify strictness options per package (example).
  • If we released a new stub package version automatically after every commit, it could increase contributor activity, since there would be a shorter wait before changes become available.
  • Having stub tests (even if they are incomplete) will make code reviews easier (this is something we've discussed earlier).
  • In order to grow typeshed significantly, we'd probably need more active code reviewers. One idea would be to introduce reviewers who focus only on certain packages. It should be easier to ramp up reviewers this way.

@srittau
Collaborator Author

srittau commented Sep 24, 2018

A few points:

  • My main issue is scaling. In the long term I believe that mypy and other type checkers should not ship all stubs, just the stdlib stubs and maybe a few select others. This does not mean that stdlib and third party stubs can't be in the same repository, but we'd need to rethink how typeshed packages are published.
  • I believe that the best way to publish stubs is in the parent package, supported by upstream maintainers. I feel that DefinitelyTyped actually discourages package maintainers from doing that.
  • I don't know how DefinitelyTyped is handling the contributor bottleneck, but typeshed currently certainly has one.
  • I don't think that third party stubs in typeshed will necessarily be maintained any better than stubs in an external package. We have quite a few third party stubs that are not in a great state.

That said, I agree that the DefinitelyTyped model seems quite successful. I especially like the predictability of package names and the high quality of the declaration files available. In summary I agree that working towards making typeshed more "DefinitelyTyped"-like is the better approach.

@ilevkivskyi
Member

They support writing tests for type definitions and explain this in their readme.

One thing that can help with this is publishing the mypy test runner separately as a pytest plugin. With this, one can write nice, simple tests like in https://github.com/dropbox/sqlalchemy-stubs/blob/master/test/test-data/sqlalchemy-basics.test (it includes mypy as a submodule, but we can't do this here because typeshed is already a submodule in mypy).

@JukkaL
Contributor

JukkaL commented Sep 24, 2018

This does not mean that stdlib and third party stubs can't be in the same repository, but we'd need to rethink how typeshed packages are published.

As I mentioned above, I also don't think that publishing all stubs in a single package is scalable.

I believe that the best way to publish stubs is in the parent package, supported by upstream maintainers.

This is reasonable at least for packages that are annotated, though currently we are missing tooling to generate stubs from annotated code. For packages without inline annotations, it may sometimes be better to have the stubs in typeshed (hopefully maintained by the package maintainer), at least if typeshed provides tooling that makes it easier to test and maintain the stubs.

I don't think that third party stubs in typeshed will necessarily be maintained any better than stubs in an external package. We have quite a few third party stubs that are not in a great state.

Yes, the stubs can still be of low quality. However, we could have some infrastructure (such as the ability to write tests) that would help maintain quality. We'd at least ensure that the stubs don't get completely broken and that they work with recent versions of tools. Being able to easily declare that everything in a stub should be annotated could also be helpful.
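
As a rough illustration of the "everything in a stub should be annotated" check, the sketch below uses only the standard library `ast` module (Python 3.8 or later) to list functions in a stub that are missing a parameter or return annotation. It is not an existing typeshed or mypy tool; the function name `unannotated_functions` and the rule of skipping parameters named `self` and `cls` are assumptions made for this example.

```python
import ast
from typing import List


def unannotated_functions(stub_source: str) -> List[str]:
    """Return the names of functions in a stub that lack some annotation."""
    incomplete = []
    for node in ast.walk(ast.parse(stub_source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        params = args.posonlyargs + args.args + args.kwonlyargs
        params += [a for a in (args.vararg, args.kwarg) if a is not None]
        # Heuristic (assumption): "self" and "cls" are conventionally
        # left unannotated, so they are not counted as missing.
        missing_param = any(
            p.annotation is None for p in params if p.arg not in ("self", "cls")
        )
        if missing_param or node.returns is None:
            incomplete.append(node.name)
    return incomplete
```

A per-package test could read each `.pyi` file, call this function, and fail if the returned list is not empty.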

My main argument is that if we have a central repository, any improvements to tooling around stubs immediately help all stubs in the repository (and new stubs), which makes things more scalable. Also, adding a new stub could be as simple as creating a PR with the stubs (+ tests).

Alternatively, we could provide tools or even a project template to make it relatively easy to have similar things for decentralized stubs, but the barrier to entry would still be much higher. Somebody would have to create a GitHub repo, learn about available tooling, set things up correctly (including tests and Travis), keep track of issues, review PRs, etc. This works for big projects where there is a lot of developer interest, such as Django, but I fear that for less popular packages there is nobody willing to do this -- but somebody might be willing to make a one-time contribution with only the stubs (and a few tests).

@srittau srittau added the project: policy Organization of the typeshed project label Oct 1, 2018
@srittau
Collaborator Author

srittau commented Oct 1, 2018

I'm trying to get a feel for how we should proceed. Personally, I find #2491 ("DefinitelyTypeshed") more promising than rejecting third-party stubs in the future. What does everyone else think? Reject third-party stubs? DefinitelyTypeshed? Keep the status quo for now? Another option? Whatever we decide, I will try to make some time to work on it.

@JukkaL
Contributor

JukkaL commented Oct 1, 2018

I'm in favor of moving forward with #2491.

@rspeer

rspeer commented Oct 1, 2018

I have a contributor who is interested in adding .pyi type hints to ordered_set, and is trying to follow PEP 561 to make them get distributed and installed as part of the ordered_set distribution. He has not succeeded, and I don't know if this is a tooling problem or a documentation problem. The two of us can't even figure out what code in Python is supposed to make the py.typed mechanism work.

I was going to suggest contributing the type hints to typeshed instead, but then I found this discussion.

Am I right that PEP 561 is still very future-looking and doesn't work with most people's tooling? Here's a contributor who's very interested in adding type hints, and a maintainer who's interested in merging the PR when it works, and we're both unable to follow the PEP and get type hints installed. If this is the state of things and we're not missing something obvious, then typeshed should keep accepting new packages until its replacement works for more people.

@rspeer

rspeer commented Oct 1, 2018

Oh. I figured out why PEP 561 isn't helping us: "This PEP does not support distributing typing information as part of module-only distributions."

So PEP 561 doesn't cover everything. Is this also a case where typeshed wouldn't apply?
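
To make the limitation concrete: under PEP 561 nothing happens at runtime; a type checker simply looks for a `py.typed` marker file inside the installed package directory. The sketch below approximates that lookup with `importlib.util.find_spec`. Real type checkers do their own search of site-packages, so treat the function name `ships_inline_types` and this approach as an illustration only.

```python
import importlib.util
from pathlib import Path


def ships_inline_types(name: str) -> bool:
    """Report whether an importable top-level package contains py.typed."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.submodule_search_locations is None:
        # Not installed, or a single-file module: there is no package
        # directory that could hold a py.typed marker.
        return False
    return any(
        (Path(location) / "py.typed").is_file()
        for location in spec.submodule_search_locations
    )
```

A single-file module such as `ordered_set.py` takes the early `return False` branch, which is the gap in the PEP quoted above.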

@srittau
Collaborator Author

srittau commented Oct 21, 2018

@rspeer Sorry for the late answer. The best alternative I found is to convert ordered_set into a package, like I did with the asserts package.
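
One way such a conversion might look on disk is sketched below. This is not necessarily how the asserts package was converted; the helper name `convert_module_to_package` and the handling of a sibling `.pyi` file are assumptions for illustration.

```python
from pathlib import Path


def convert_module_to_package(root: Path, module: str) -> Path:
    """Turn root/<module>.py into root/<module>/__init__.py plus py.typed."""
    source = root / f"{module}.py"
    package_dir = root / module
    package_dir.mkdir()
    source.rename(package_dir / "__init__.py")
    # A sibling stub file, if present, moves along as __init__.pyi.
    stub = root / f"{module}.pyi"
    if stub.exists():
        stub.rename(package_dir / "__init__.pyi")
    # Empty marker file that PEP 561-aware type checkers look for.
    (package_dir / "py.typed").touch()
    return package_dir
```

After the move, `setup.py` would declare `packages=["ordered_set"]` with `package_data={"ordered_set": ["py.typed"]}` (plus any `.pyi` files) instead of `py_modules=["ordered_set"]`, while `import ordered_set` keeps working for users.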

@srittau
Collaborator Author

srittau commented Oct 21, 2018

I am closing this in favor of #2491, since I feel the consensus is towards that solution. Please reopen if you think that there should be more discussion about this.

@srittau srittau closed this as completed Oct 21, 2018