Fixes #5510
OK, so I noticed during the last couple of years that every other time I
change something about type variables, a few unsafe overload overlap
errors either appear or disappear. At some point I almost stopped
looking at them. The problem is that unsafe overload overlap detection
for generic callables is currently ad hoc. However, as I started working
on it, I discovered a bunch of foundational problems (and a few smaller
issues), so I decided to re-work the unsafe overload overlap detection.
Here is a detailed summary:
* Currently return type compatibility is decided using a regular subtype
check. Although it is technically
correct, in most cases there is nothing wrong if the first overload returns
`list[Subtype]` and the second returns `list[Supertype]`. The whole unsafe
overload story is about runtime values, not static types, so we should
use `is_subset()` instead of `is_subtype()`, which is IIUC easy to
implement: we simply need to consider all invariant types covariant.
* The current implementation only checks for overlap between parameters,
i.e. it checks if there are some calls that are valid for both
overloads. But we also need to check that those common calls will not
always be caught by the first overload. I assume it was not checked
because, naively, we already check elsewhere that the first overload doesn't
completely shadow the second one. But this is not the same: the first
overload may not be more general overall, but when narrowed to common
calls, it may be more general. Example of such a false positive (this is
an oversimplified version of what is often used in situations with many
optional positional arguments):
```python
@overload
def foo(x: object) -> object: ...
@overload
def foo(x: int = ...) -> int: ...
```
* Currently overlap for generic callables is decided using some weird
two-way unification procedure, where we actually keep going (with
non-unified variables, and/or `<never>`) if the right-to-left
unification fails. TBH I never understood this. What we need is to find
some set of type variable values that makes two overloads unsafely
overlapping. Constraint inference may be used as a (good) source of such
guesses, but is not decisive in any way. So instead I simply try all
combinations of upper bounds and values. The main benefit of such an
approach is that it is guaranteed to be free of false positives. If such an
algorithm finds an overlap, it is definitely an overlap. There are, however,
false negatives, but we can incrementally reduce them in the future.
* I am making `Any` overlap nothing when considering overloads.
Currently it overlaps everything (i.e. it is not different from
`object`), but this violates the rule that replacing a precise type with
`Any` should not generate an error. IOW I essentially treat `Any` as
"too dynamic or not imported".
* I extend `None` special-casing to be more uniform. Now essentially it
only overlaps with explicitly optional types. This is important for
descriptor-like signatures.
* Finally, I did a cleanup in `is_overlapping_types()`. Most notably,
flags were not passed down to various (recursive) helpers, and
`ParamSpec`/`Parameters` were treated a bit arbitrarily.
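The "try all combinations of upper bounds and values" idea from the list above can be sketched roughly as follows. This is only an illustration and not the code in this PR: `ToyTypeVar`, `candidate_substitutions()` and `find_unsafe_substitution()` are hypothetical names, types are represented as plain strings, and the actual overlap check is passed in as a predicate.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterator, Optional


@dataclass(frozen=True)
class ToyTypeVar:
    """Hypothetical stand-in for a type variable; types are just strings."""
    name: str
    upper_bound: str = "object"    # unbounded variables are bounded by object
    values: tuple[str, ...] = ()   # non-empty for value-restricted variables


def candidate_types(tv: ToyTypeVar) -> tuple[str, ...]:
    # A value-restricted variable is tried with each of its values,
    # any other variable is tried with its upper bound only.
    return tv.values if tv.values else (tv.upper_bound,)


def candidate_substitutions(tvars: list[ToyTypeVar]) -> Iterator[dict[str, str]]:
    # Cartesian product of the candidates of every variable.
    names = [tv.name for tv in tvars]
    for combo in product(*(candidate_types(tv) for tv in tvars)):
        yield dict(zip(names, combo))


def find_unsafe_substitution(
    tvars: list[ToyTypeVar],
    is_unsafe_overlap: Callable[[dict[str, str]], bool],
) -> Optional[dict[str, str]]:
    # Any hit is a concrete witness, so a reported overlap is a real one.
    # Returning None only means that none of the guesses produced an overlap.
    for subst in candidate_substitutions(tvars):
        if is_unsafe_overlap(subst):
            return subst
    return None
```

Because every reported substitution is a concrete witness, a sketch like this cannot produce false positives; the price is that instantiations outside the enumerated candidates are never tried.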
Pros/cons of the outcome:
* Pro: simple (even if not 100% accurate) mental model
* Pro: all major classes of false positives eliminated
* Pro: a couple of minor false negatives fixed
* Con: two new false negatives added, more details below
So here are the two new false negatives and the motivation for why I think
they are OK. The first example is
```python
T = TypeVar("T")
@overload
def foo(x: str) -> int: ...
@overload
def foo(x: T) -> T: ...
def foo(x):
    if isinstance(x, str):
        return 0
    return x
```
This is obviously unsafe (consider `T = float`), but not flagged after
this PR. I think this is ~fine for two reasons:
* There is no good alternative for a user, so the error is not very
actionable. Using types like `(str | T) -> int | T` is a bad idea
because unions with type variables are not only imprecise, but also
highly problematic for inference.
* The false negative mostly affects unbounded type variables. If a
"suspicious" bound is used (like `bound=float` in this example), the
error will still be reported.
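To make the runtime problem in the first example concrete, here is a small sketch. The instantiation `T = Sequence[str]` and the helper `first_item()` are my own choices for illustration: a `str` value is also a `Sequence[str]`, so a call typed via the second overload can still take the `isinstance(x, str)` branch at runtime.

```python
from typing import Sequence, TypeVar, overload

T = TypeVar("T")


@overload
def foo(x: str) -> int: ...
@overload
def foo(x: T) -> T: ...
def foo(x):
    if isinstance(x, str):
        return 0
    return x


def first_item(items: Sequence[str]) -> str:
    # Statically only the second overload applies (Sequence[str] is not str),
    # so the result is typed as Sequence[str] and indexing it looks fine.
    result = foo(items)
    return result[0]
```

Calling `first_item("abc")` is accepted statically because `str` is a `Sequence[str]`, but `foo` returns `0` at runtime and the indexing fails with a `TypeError`.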
The second example is signatures like
```python
@overload
def foo(x: str, y: str) -> str: ...
@overload
def foo(*args: str) -> int: ...
@overload
def bar(*, x: str, y: str) -> str: ...
@overload
def bar(**kwds: str) -> int: ...
```
These are also unsafe because one can fool mypy with `x: tuple[str, ...]
= ("x", "y"); foo(*x)` and `x: dict[str, str] = {"x": "x", "y": "y"};
bar(**x)`. I think this is OK because such unsafe calls are quite
rare, while this kind of catch-all fallback as the last overload is
relatively common.
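A minimal runtime sketch of the `*args` case, assuming a hypothetical implementation of `foo` that honours both declared signatures (the implementation body is not part of the example above). Per the description above, the checker takes the result of `foo(*x)` from the catch-all overload, while the runtime call receives exactly two strings:

```python
from typing import overload


@overload
def foo(x: str, y: str) -> str: ...
@overload
def foo(*args: str) -> int: ...
def foo(*args):
    # Hypothetical implementation: exactly two arguments give a str,
    # any other number of arguments gives an int.
    if len(args) == 2:
        return args[0] + args[1]
    return len(args)


# The declared type hides the length of the tuple (requires Python 3.9+).
x: tuple[str, ...] = ("x", "y")
result = foo(*x)  # a str at runtime, regardless of the catch-all's declared int
```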
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shantanu <[email protected]>