Skip to content

[red-knot] Specialize str.startswith for string literals #17351

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 11, 2025

Conversation

sharkdp
Copy link
Contributor

@sharkdp sharkdp commented Apr 11, 2025

Summary

Infer precise Boolean literal types for str.startswith calls where the instance and the prefix are both string literals. This allows us to understand sys.platform.startswith(…) branches.

Test Plan

New Markdown tests

@sharkdp sharkdp added the ty Multi-file analysis & type inference label Apr 11, 2025
Copy link
Contributor

github-actions bot commented Apr 11, 2025

mypy_primer results

Changes were detected when running on open source projects
pyinstrument (https://github.com/joerick/pyinstrument)
- warning[lint:possibly-unresolved-reference] /tmp/mypy_primer/projects/pyinstrument/pyinstrument/vendor/appdirs.py:99:33: Name `_get_win_folder` used when possibly not defined
- warning[lint:possibly-unresolved-reference] /tmp/mypy_primer/projects/pyinstrument/pyinstrument/vendor/appdirs.py:152:33: Name `_get_win_folder` used when possibly not defined
- warning[lint:possibly-unresolved-reference] /tmp/mypy_primer/projects/pyinstrument/pyinstrument/vendor/appdirs.py:319:33: Name `_get_win_folder` used when possibly not defined
- error[lint:unresolved-import] /tmp/mypy_primer/projects/pyinstrument/pyinstrument/vendor/appdirs.py:559:28: Module `ctypes` has no member `windll`
- error[lint:unresolved-import] /tmp/mypy_primer/projects/pyinstrument/pyinstrument/vendor/appdirs.py:562:20: Cannot resolve import `com.sun.jna`
+ warning[lint:unresolved-reference] /tmp/mypy_primer/projects/pyinstrument/pyinstrument/vendor/appdirs.py:99:33: Name `_get_win_folder` used when not defined
+ warning[lint:unresolved-reference] /tmp/mypy_primer/projects/pyinstrument/pyinstrument/vendor/appdirs.py:152:33: Name `_get_win_folder` used when not defined
+ warning[lint:unresolved-reference] /tmp/mypy_primer/projects/pyinstrument/pyinstrument/vendor/appdirs.py:319:33: Name `_get_win_folder` used when not defined
- Found 292 diagnostics
+ Found 290 diagnostics

@sharkdp
Copy link
Contributor Author

sharkdp commented Apr 11, 2025

The pyinstrument changes show that we now understand the sys.platform.startswith('java') branch in that file. We still report some diagnostics, because the reachable else branch sets system = sys.platform and then uses that symbol inside nested scopes for further platform checks. We still emit errors in those branches, because the public type of system is Literal["linux"] | Unknown, so we don't infer those platform check branches as unreachable.

So I think those diagnostics are true positives according to our model, where system could be reassigned (not that we would understand that yet, but we would require that symbol to be qualified with Final).

Two false positives further down in the file are now gone, because they use system for platform checks inside the scope it was defined in.

@sharkdp sharkdp marked this pull request as ready for review April 11, 2025 13:24
@sharkdp sharkdp force-pushed the david/str-startswith branch from 3e44581 to f35c255 Compare April 11, 2025 13:25
@AlexWaygood
Copy link
Member

It might be confusing for users if we infer precise types for str.startswith but not str.endswith — should we do that method too?

Overall I'm okay with this, but note that this goes beyond what the spec requires of us when it comes to understanding sys.platform branches

@sharkdp sharkdp marked this pull request as draft April 11, 2025 13:40
@sharkdp sharkdp force-pushed the david/str-startswith branch 2 times, most recently from b89dbc8 to 0042529 Compare April 11, 2025 13:53
@sharkdp sharkdp marked this pull request as ready for review April 11, 2025 13:53
@sharkdp
Copy link
Contributor Author

sharkdp commented Apr 11, 2025

It might be confusing for users if we infer precise types for str.startswith but not str.endswith — should we do that method too?

It's going to be extremely rare that someone actually calls x.startswith(y) where both x and y are string literal types. I don't think that we have to worry about similarly precise endswith inference being a required feature for anyone.

Overall I'm okay with this, but note that this goes beyond what the spec requires of us when it comes to understanding sys.platform branches

Overall I'm okay with this, but note that this goes beyond what the spec requires of us when it comes to understanding sys.platform branches

Yes, but the spec is vague. And there is this issue which aims to improve the spec. It includes this comment:

Maybe it would also make sense to support platform checks using startswith(), as that is officially recommended in the Python docs, and is required to properly support some Unix variants, like FreeBSD, that include the version number in the platform string. See also python/typeshed#11876.

And this comment:

mypy has actually always supported sys.platform.startswith, since 2016 (in addition to == and !=)

And this:

sys.platform.startswith probably makes sense to add since it helps FreeBSD (a Tier 3 platform), is already supported by mypy, and doesn't seem difficult to add to other type checkers.

@sharkdp sharkdp force-pushed the david/str-startswith branch from 0042529 to 1fd746c Compare April 11, 2025 14:00
@sharkdp sharkdp changed the title [red-knot] Type inference for str.startswith [red-knot] Specialize str.startswith for string literals Apr 11, 2025
@sharkdp sharkdp merged commit 47956db into main Apr 11, 2025
23 checks passed
@sharkdp sharkdp deleted the david/str-startswith branch April 11, 2025 14:26
dcreager added a commit that referenced this pull request Apr 11, 2025
* main:
  [red-knot] Specialize `str.startswith` for string literals (#17351)
  [syntax-errors] `yield`, `yield from`, and `await` outside functions (#17298)
  [red-knot] Refresh diagnostics when changing related files (#17350)
  Add `Checker::import_from_typing` (#17340)
  Don't add chaperone space after escaped quote in triple quote (#17216)
  [red-knot] Silence errors in unreachable type annotations / class bases (#17342)
sharkdp added a commit that referenced this pull request Apr 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ty Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants