Evaluate ways to determine accuracy and reduce incomplete type stubs #8481
Well, I think we all agree that incomplete and inaccurate stubs are Bad™ :) How do you suggest we improve here? I've spent a fair amount of time over the last few months working on improving several of our existing tools for evaluating accuracy and completeness of stubs (stubtest, flake8-pyi, mypy_primer, etc.), as have most of the other maintainers of typeshed. I plan on continuing to do so. Is there something else we should be doing, in your opinion?
kkirsche previously mentioned that we could do a better job of working with upstream to integrate type hints. Some ways we could improve the odds of that happening:
Some other ideas:

One thing that would be useful is just better stub generation. Existing stub generators assume you want to lovingly handcraft your stubs and do no type inference for you. Maybe Jelle's autotyping would be something of a starting point? I also wouldn't be surprised if it was very easy to get pyright to do this, or if pyright could already do it; it's been a while since I've played with it.

A related category of tool we could use is static validation of type hints against upstream code. For instance, something along the lines of "apply the hints back to the upstream code, type check it, and see what happens". This would take a bit of work to find something with a high enough signal-to-noise ratio, but I think it's doable to create something here that adds value. A similar thing that might be more feasible is type checking an upstream project's tests against the stubs. (I realise I'm basically suggesting …
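As a rough illustration of the "type check an upstream's tests against the stubs" idea, here is a minimal sketch. The directory layout (typeshed stubs for requests under `typeshed/stubs/requests`, upstream tests under `requests/tests`) is an assumption for the example, not an existing tool. It just runs mypy over the project's test suite with `MYPYPATH` pointing at the stubs, so that calls the stubs reject but the library's own tests clearly rely on surface as type errors.

```python
# Sketch: check an upstream project's test suite against its typeshed stubs.
# Assumed paths below are illustrative only.
import os
import subprocess
import sys


def check_tests_against_stubs(stub_dir: str, tests_dir: str) -> int:
    """Type check `tests_dir` with mypy, resolving imports via `stub_dir`."""
    # MYPYPATH adds an extra search path for stubs, so the checker sees the
    # typeshed stubs instead of (or in addition to) the installed package.
    env = dict(os.environ, MYPYPATH=stub_dir)
    result = subprocess.run(
        [sys.executable, "-m", "mypy", "--ignore-missing-imports", tests_dir],
        env=env,
        capture_output=True,
        text=True,
    )
    # Expect a fair amount of noise; the interesting errors are the ones where
    # the tests exercise an API shape that the stubs don't allow.
    print(result.stdout)
    return result.returncode


if __name__ == "__main__":
    raise SystemExit(
        check_tests_against_stubs("typeshed/stubs/requests", "requests/tests")
    )
```

The signal-to-noise problem mentioned above would mostly be about filtering these errors (e.g. ignoring errors caused by the tests' own dynamic tricks rather than by the stubs).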
This actually kinda demonstrates what I think could be done. It sounds like a lot of the goals, plans, etc. are spread out across a variety of repositories and people, and are very individualized. That's 100% reasonable, and people who are contributing in their free time should be allowed to work on what they find most interesting or valuable. With that said, it would be useful for outsiders if these were more clearly documented and grouped, so that others could assist you all. I can't speak for others, but I've found it rather challenging, and at times frustrating, to determine what the team considers a valuable contribution versus a waste of their time.

One idea: use GitHub's milestones to set more tangible goals around type completeness, accuracy, and tooling, so that users can find how they want to contribute to those goals.

Another idea: add an index of tooling to the contributing or readme documentation, covering where each tool lives and what its author's motivation is (as not all are projects under the Python or PyCQA namespace). This could be grouped by organization or author, so that users can more clearly understand when they're crossing boundaries that may come with different expectations than exist under a different author / org.

My third idea would be to expand into new tooling areas that do more analysis of function bodies, to evaluate whether a declared type is more restrictive than the functions it is used in (reasonable in some cases, but possibly indicating a misunderstanding or error in others), and to highlight areas where additional consideration should be given while reviewing the existing types.

Curious what others think about these ideas, or whether I missed something that already exists in these areas.

EDIT: fixed iOS typo from wand to what
I've reached out to psf/requests via their issue tracker (psf/requests#6211) to begin exploring why various projects aren't merging the type stubs from typeshed.
The requests maintainers have said that previous reviews found the type stubs to be inaccurate, which prevents them from merging the type hints into the main codebase.
This issue is to explore how type stub accuracy can be better evaluated, to reduce the maintenance burden that inaccurate or incomplete stubs currently impose on typeshed.
Ultimately, while having stubs may help end users in the short term, inaccurate or incomplete stubs can prevent or delay long term adoption by the upstream team(s).