Commit 7cb733a

Authored by ilevkivskyi, pre-commit-ci[bot], and hauntsaninja
Re-work overload overlap logic (#17392)
Fixes #5510

OK, so over the last couple of years I noticed that every other time I change something about type variables, a few unsafe overload overlap errors either appear or disappear. At some point I almost stopped looking at them. The problem is that unsafe overload overlap detection for generic callables is currently ad hoc. However, as I started working on it, I discovered a bunch of foundational problems (and a few smaller issues), so I decided to re-work the unsafe overload overlap detection. Here is a detailed summary:

* Currently, return type compatibility is decided using a regular subtype check. Although this is technically correct, in most cases there is nothing wrong if the first overload returns `list[Subtype]` and the second returns `list[Supertype]`. The whole unsafe overload story is about runtime values, not static types, so we should use `is_subset()` instead of `is_subtype()`, which is IIUC easy to implement: we simply need to consider all invariant types covariant.
* The current implementation only checks for overlap between parameters, i.e. it checks whether there are some calls that are valid for both overloads. But we also need to check that those common calls will not always be caught by the first overload. I assume this was not checked because, naively, we already check elsewhere that the first overload doesn't completely shadow the second one. But this is not the same: the first overload may not be more general overall, yet be more general once narrowed to the common calls. Example of such a false positive (this is an oversimplified version of what is often used in situations with many optional positional arguments):
  ```python
  @overload
  def foo(x: object) -> object: ...
  @overload
  def foo(x: int = ...) -> int: ...
  ```
* Currently, overlap for generic callables is decided using some weird two-way unification procedure, where we actually keep going (with non-unified variables and/or `<never>`) if the right-to-left unification fails. TBH I never understood this. What we need is to find some set of type variable values that makes two overloads unsafely overlapping. Constraint inference may be used as a (good) source of such guesses, but is not decisive in any way. So instead I simply try all combinations of upper bounds and values. The main benefit of this approach is that it is guaranteed to be free of false positives: if the algorithm finds an overlap, it is definitely an overlap. There are, however, false negatives, but we can incrementally tighten them in the future.
* I am making `Any` overlap nothing when considering overloads. Currently it overlaps everything (i.e. it is no different from `object`), but this violates the rule that replacing a precise type with `Any` should not generate an error. IOW I essentially treat `Any` as "too dynamic or not imported".
* I extend `None` special-casing to be more uniform. Now it essentially only overlaps with explicitly optional types. This is important for descriptor-like signatures.
* Finally, I did a cleanup in `is_overlapping_types()`: most notably, flags were not passed down to various (recursive) helpers, and `ParamSpec`/`Parameters` were treated somewhat arbitrarily.

Pros/cons of the outcome:
* Pro: simple (even if not 100% accurate) mental model
* Pro: all major classes of false positives eliminated
* Pro: a couple of minor false negatives fixed
* Con: two new false negatives added, more details below

So here are the two new false negatives and the motivation for why I think they are OK. The first example is
```python
T = TypeVar("T")

@overload
def foo(x: str) -> int: ...
@overload
def foo(x: T) -> T: ...
def foo(x):
    if isinstance(x, str):
        return 0
    return x
```
This is obviously unsafe (consider `T = float`), but it is not flagged after this PR. I think this is ~fine for two reasons:
* There is no good alternative for a user, so the error is not very actionable. Using types like `(str | T) -> int | T` is a bad idea because unions with type variables are not only imprecise, but also highly problematic for inference.
* The false negative mostly affects unbounded type variables; if a "suspicious" bound is used (like `bound=float` in this example), the error will still be reported.

The second example is signatures like
```python
@overload
def foo(x: str, y: str) -> str: ...
@overload
def foo(*args: str) -> int: ...

@overload
def bar(*, x: str, y: str) -> str: ...
@overload
def bar(**kwds: str) -> int: ...
```
These are also unsafe because one can fool mypy with `x: tuple[str, ...] = ("x", "y"); foo(*x)` and `x: dict[str, str] = {"x": "x", "y": "y"}; bar(**x)`. I think this is OK because while such unsafe calls are quite rare, this kind of catch-all fallback as the last overload is relatively common.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shantanu <[email protected]>
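
The second false negative from the commit message can be observed at runtime with a short script. This is a minimal sketch: the two overload signatures for `foo` are the ones quoted in the message, while the implementation body is a hypothetical one written here only so that the script runs.

```python
from __future__ import annotations

from typing import overload


@overload
def foo(x: str, y: str) -> str: ...
@overload
def foo(*args: str) -> int: ...
def foo(*args: str) -> str | int:
    # Hypothetical implementation (not part of the commit): it behaves the
    # way the two overload signatures advertise.
    if len(args) == 2:
        return args[0] + args[1]
    return len(args)


# Statically, `x` is a tuple of unknown length, so only the `*args` overload
# matches and the call is typed as returning `int`.
x: tuple[str, ...] = ("x", "y")
result = foo(*x)

# At runtime the tuple happens to hold exactly two items, so the two-argument
# branch runs and a `str` comes back instead.
print(type(result).__name__)
```

The mismatch only shows up when a variadic tuple happens to contain exactly two items, which matches the message's point that such calls are rare while the catch-all fallback pattern is common.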
1 parent e1ff8aa commit 7cb733a

16 files changed: +352 -275 lines changed

mypy/checker.py

Lines changed: 122 additions & 73 deletions
@@ -170,7 +170,6 @@
     false_only,
     fixup_partial_type,
     function_type,
-    get_type_vars,
     is_literal_type_like,
     is_singleton_type,
     make_simplified_union,
@@ -787,7 +786,16 @@ def check_overlapping_overloads(self, defn: OverloadedFuncDef) -> None:
                     type_vars = current_class.defn.type_vars if current_class else []
                     with state.strict_optional_set(True):
                         if is_unsafe_overlapping_overload_signatures(sig1, sig2, type_vars):
-                            self.msg.overloaded_signatures_overlap(i + 1, i + j + 2, item.func)
+                            flip_note = (
+                                j == 0
+                                and not is_unsafe_overlapping_overload_signatures(
+                                    sig2, sig1, type_vars
+                                )
+                                and not overload_can_never_match(sig2, sig1)
+                            )
+                            self.msg.overloaded_signatures_overlap(
+                                i + 1, i + j + 2, flip_note, item.func
+                            )
 
             if impl_type is not None:
                 assert defn.impl is not None
@@ -1764,6 +1772,8 @@ def is_unsafe_overlapping_op(
         # second operand is the right argument -- we switch the order of
         # the arguments of the reverse method.
 
+        # TODO: this manipulation is dangerous if callables are generic.
+        # Shuffling arguments between callables can create meaningless types.
         forward_tweaked = forward_item.copy_modified(
             arg_types=[forward_base_erased, forward_item.arg_types[0]],
             arg_kinds=[nodes.ARG_POS] * 2,
@@ -1790,7 +1800,9 @@ def is_unsafe_overlapping_op(
 
         current_class = self.scope.active_class()
         type_vars = current_class.defn.type_vars if current_class else []
-        return is_unsafe_overlapping_overload_signatures(first, second, type_vars)
+        return is_unsafe_overlapping_overload_signatures(
+            first, second, type_vars, partial_only=False
+        )
 
     def check_inplace_operator_method(self, defn: FuncBase) -> None:
         """Check an inplace operator method such as __iadd__.
@@ -2185,7 +2197,7 @@ def get_op_other_domain(self, tp: FunctionLike) -> Type | None:
         if isinstance(tp, CallableType):
             if tp.arg_kinds and tp.arg_kinds[0] == ARG_POS:
                 # For generic methods, domain comparison is tricky, as a first
-                # approximation erase all remaining type variables to bounds.
+                # approximation erase all remaining type variables.
                 return erase_typevars(tp.arg_types[0], {v.id for v in tp.variables})
             return None
         elif isinstance(tp, Overloaded):
@@ -7827,68 +7839,112 @@ def are_argument_counts_overlapping(t: CallableType, s: CallableType) -> bool:
     return min_args <= max_args
 
 
+def expand_callable_variants(c: CallableType) -> list[CallableType]:
+    """Expand a generic callable using all combinations of type variables' values/bounds."""
+    for tv in c.variables:
+        # We need to expand self-type before other variables, because this is the only
+        # type variable that can have other type variables in the upper bound.
+        if tv.id.is_self():
+            c = expand_type(c, {tv.id: tv.upper_bound}).copy_modified(
+                variables=[v for v in c.variables if not v.id.is_self()]
+            )
+            break
+
+    if not c.is_generic():
+        # Fast path.
+        return [c]
+
+    tvar_values = []
+    for tvar in c.variables:
+        if isinstance(tvar, TypeVarType) and tvar.values:
+            tvar_values.append(tvar.values)
+        else:
+            tvar_values.append([tvar.upper_bound])
+
+    variants = []
+    for combination in itertools.product(*tvar_values):
+        tvar_map = {tv.id: subst for (tv, subst) in zip(c.variables, combination)}
+        variants.append(expand_type(c, tvar_map).copy_modified(variables=[]))
+    return variants
+
+
 def is_unsafe_overlapping_overload_signatures(
-    signature: CallableType, other: CallableType, class_type_vars: list[TypeVarLikeType]
+    signature: CallableType,
+    other: CallableType,
+    class_type_vars: list[TypeVarLikeType],
+    partial_only: bool = True,
 ) -> bool:
     """Check if two overloaded signatures are unsafely overlapping or partially overlapping.
 
-    We consider two functions 's' and 't' to be unsafely overlapping if both
-    of the following are true:
+    We consider two functions 's' and 't' to be unsafely overlapping if three
+    conditions hold:
+
+    1.  s's parameters are partially overlapping with t's. i.e. there are calls that are
+        valid for both signatures.
+    2.  for these common calls, some of t's parameters types are wider that s's.
+    3.  s's return type is NOT a subset of t's.
 
-    1.  s's parameters are all more precise or partially overlapping with t's
-    2.  s's return type is NOT a subtype of t's.
+    Note that we use subset rather than subtype relationship in these checks because:
+    * Overload selection happens at runtime, not statically.
+    * This results in more lenient behavior.
+    This can cause false negatives (e.g. if overloaded function returns an externally
+    visible attribute with invariant type), but such situations are rare. In general,
+    overloads in Python are generally unsafe, so we intentionally try to avoid giving
+    non-actionable errors (see more details in comments below).
 
     Assumes that 'signature' appears earlier in the list of overload
     alternatives then 'other' and that their argument counts are overlapping.
     """
     # Try detaching callables from the containing class so that all TypeVars
-    # are treated as being free.
-    #
-    # This lets us identify cases where the two signatures use completely
-    # incompatible types -- e.g. see the testOverloadingInferUnionReturnWithMixedTypevars
-    # test case.
+    # are treated as being free, i.e. the signature is as seen from inside the class,
+    # where "self" is not yet bound to anything.
     signature = detach_callable(signature, class_type_vars)
     other = detach_callable(other, class_type_vars)
 
-    # Note: We repeat this check twice in both directions due to a slight
-    # asymmetry in 'is_callable_compatible'. When checking for partial overlaps,
-    # we attempt to unify 'signature' and 'other' both against each other.
-    #
-    # If 'signature' cannot be unified with 'other', we end early. However,
-    # if 'other' cannot be modified with 'signature', the function continues
-    # using the older version of 'other'.
-    #
-    # This discrepancy is unfortunately difficult to get rid of, so we repeat the
-    # checks twice in both directions for now.
-    #
-    # Note that we ignore possible overlap between type variables and None. This
-    # is technically unsafe, but unsafety is tiny and this prevents some common
-    # use cases like:
-    #     @overload
-    #     def foo(x: None) -> None: ..
-    #     @overload
-    #     def foo(x: T) -> Foo[T]: ...
-    return is_callable_compatible(
-        signature,
-        other,
-        is_compat=is_overlapping_types_no_promote_no_uninhabited_no_none,
-        is_proper_subtype=False,
-        is_compat_return=lambda l, r: not is_subtype_no_promote(l, r),
-        ignore_return=False,
-        check_args_covariantly=True,
-        allow_partial_overlap=True,
-        no_unify_none=True,
-    ) or is_callable_compatible(
-        other,
-        signature,
-        is_compat=is_overlapping_types_no_promote_no_uninhabited_no_none,
-        is_proper_subtype=False,
-        is_compat_return=lambda l, r: not is_subtype_no_promote(r, l),
-        ignore_return=False,
-        check_args_covariantly=False,
-        allow_partial_overlap=True,
-        no_unify_none=True,
-    )
+    # Note: We repeat this check twice in both directions compensate for slight
+    # asymmetries in 'is_callable_compatible'.
+
+    for sig_variant in expand_callable_variants(signature):
+        for other_variant in expand_callable_variants(other):
+            # Using only expanded callables may cause false negatives, we can add
+            # more variants (e.g. using inference between callables) in the future.
+            if is_subset_no_promote(sig_variant.ret_type, other_variant.ret_type):
+                continue
+            if not (
+                is_callable_compatible(
+                    sig_variant,
+                    other_variant,
+                    is_compat=is_overlapping_types_for_overload,
+                    check_args_covariantly=False,
+                    is_proper_subtype=False,
+                    is_compat_return=lambda l, r: not is_subset_no_promote(l, r),
+                    allow_partial_overlap=True,
+                )
+                or is_callable_compatible(
+                    other_variant,
+                    sig_variant,
+                    is_compat=is_overlapping_types_for_overload,
+                    check_args_covariantly=True,
+                    is_proper_subtype=False,
+                    is_compat_return=lambda l, r: not is_subset_no_promote(r, l),
+                    allow_partial_overlap=True,
+                )
+            ):
+                continue
+            # Using the same `allow_partial_overlap` flag as before, can cause false
+            # negatives in case where star argument is used in a catch-all fallback overload.
+            # But again, practicality beats purity here.
+            if not partial_only or not is_callable_compatible(
+                other_variant,
+                sig_variant,
+                is_compat=is_subset_no_promote,
+                check_args_covariantly=True,
+                is_proper_subtype=False,
+                ignore_return=True,
+                allow_partial_overlap=True,
+            ):
+                return True
+    return False
 
 
 def detach_callable(typ: CallableType, class_type_vars: list[TypeVarLikeType]) -> CallableType:
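
The expansion step in `expand_callable_variants` above can be pictured with a toy model that does not depend on mypy. This is only a sketch under stated assumptions: `ToyTypeVar` and `expand_toy_variants` are hypothetical names, and types are plain strings rather than mypy `Type` objects. What it shares with the real function is the core idea: each type variable contributes either its value restrictions or its upper bound, and `itertools.product` enumerates every combination.

```python
from __future__ import annotations

import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTypeVar:
    # Hypothetical stand-in for a type variable: a name, an upper bound,
    # and an optional tuple of value restrictions.
    name: str
    upper_bound: str = "object"
    values: tuple[str, ...] = ()


def expand_toy_variants(
    params: list[str], ret: str, tvars: list[ToyTypeVar]
) -> list[tuple[list[str], str]]:
    """Substitute every combination of values/bounds for the type variables."""
    if not tvars:
        # Fast path: nothing generic to expand.
        return [(params, ret)]
    # A restricted variable contributes each of its values; any other
    # variable contributes only its upper bound.
    choices = [tv.values if tv.values else (tv.upper_bound,) for tv in tvars]
    variants = []
    for combination in itertools.product(*choices):
        subst = {tv.name: value for tv, value in zip(tvars, combination)}
        variants.append(([subst.get(p, p) for p in params], subst.get(ret, ret)))
    return variants


AnyStr = ToyTypeVar("AnyStr", values=("str", "bytes"))
T = ToyTypeVar("T", upper_bound="float")
variants = expand_toy_variants(["AnyStr", "T"], "AnyStr", [AnyStr, T])
for variant in variants:
    print(variant)
```

Here a variable restricted to two values and a bounded variable yield two non-generic variants. The number of variants is the product of the number of choices per variable, so it grows quickly when several value-restricted variables are combined.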
@@ -7897,21 +7953,11 @@ def detach_callable(typ: CallableType, class_type_vars: list[TypeVarLikeType]) -
     A callable normally keeps track of the type variables it uses within its 'variables' field.
     However, if the callable is from a method and that method is using a class type variable,
     the callable will not keep track of that type variable since it belongs to the class.
-
-    This function will traverse the callable and find all used type vars and add them to the
-    variables field if it isn't already present.
-
-    The caller can then unify on all type variables whether the callable is originally from
-    the class or not."""
+    """
     if not class_type_vars:
         # Fast path, nothing to update.
         return typ
-    seen_type_vars = set()
-    for t in typ.arg_types + [typ.ret_type]:
-        seen_type_vars |= set(get_type_vars(t))
-    return typ.copy_modified(
-        variables=list(typ.variables) + [tv for tv in class_type_vars if tv in seen_type_vars]
-    )
+    return typ.copy_modified(variables=list(typ.variables) + class_type_vars)
 
 
 def overload_can_never_match(signature: CallableType, other: CallableType) -> bool:
@@ -8388,21 +8434,24 @@ def get_property_type(t: ProperType) -> ProperType:
     return t
 
 
-def is_subtype_no_promote(left: Type, right: Type) -> bool:
-    return is_subtype(left, right, ignore_promotions=True)
+def is_subset_no_promote(left: Type, right: Type) -> bool:
+    return is_subtype(left, right, ignore_promotions=True, always_covariant=True)
 
 
-def is_overlapping_types_no_promote_no_uninhabited_no_none(left: Type, right: Type) -> bool:
-    # For the purpose of unsafe overload checks we consider list[Never] and list[int]
-    # non-overlapping. This is consistent with how we treat list[int] and list[str] as
-    # non-overlapping, despite [] belongs to both. Also this will prevent false positives
-    # for failed type inference during unification.
+def is_overlapping_types_for_overload(left: Type, right: Type) -> bool:
+    # Note that among other effects 'overlap_for_overloads' flag will effectively
+    # ignore possible overlap between type variables and None. This is technically
+    # unsafe, but unsafety is tiny and this prevents some common use cases like:
+    #     @overload
+    #     def foo(x: None) -> None: ..
+    #     @overload
+    #     def foo(x: T) -> Foo[T]: ...
     return is_overlapping_types(
         left,
         right,
         ignore_promotions=True,
-        ignore_uninhabited=True,
         prohibit_none_typevar_overlap=True,
+        overlap_for_overloads=True,
     )
 
 
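
The difference between the old `is_subtype_no_promote` and the new `is_subset_no_promote` comes down to the `always_covariant=True` flag. The toy checker below is a rough sketch of that idea, not mypy's implementation: `Ty`, `PARENT`, `nominal_subclass`, and `toy_is_subtype` are hypothetical names, and every generic type in this model is assumed to be invariant unless the flag is set.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical nominal hierarchy for the sketch: child -> parent.
PARENT = {"bool": "int", "int": "object", "str": "object"}


@dataclass(frozen=True)
class Ty:
    # A type is a name plus optional type arguments, e.g. list[int].
    name: str
    args: tuple[Ty, ...] = ()


def nominal_subclass(left: str, right: str) -> bool:
    """Walk up the toy hierarchy from `left` looking for `right`."""
    current: str | None = left
    while current is not None:
        if current == right:
            return True
        current = PARENT.get(current)
    return False


def toy_is_subtype(left: Ty, right: Ty, always_covariant: bool = False) -> bool:
    if not left.args and not right.args:
        return nominal_subclass(left.name, right.name)
    if left.name != right.name or len(left.args) != len(right.args):
        # Different constructors: only `object` accepts everything.
        return right == Ty("object")
    if always_covariant:
        # "Subset" mode: compare type arguments covariantly.
        return all(toy_is_subtype(l, r, True) for l, r in zip(left.args, right.args))
    # Default mode: generics are invariant, so arguments must match exactly.
    return left.args == right.args


list_bool = Ty("list", (Ty("bool"),))
list_int = Ty("list", (Ty("int"),))
print(toy_is_subtype(list_bool, list_int))                         # False
print(toy_is_subtype(list_bool, list_int, always_covariant=True))  # True
```

In this model `list[bool]` is not a subtype of `list[int]`, but every runtime value of the first is also a valid value of the second, which is the sense in which the commit message says overload safety is about runtime values rather than static types.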

mypy/constraints.py

Lines changed: 1 addition & 1 deletion
@@ -1055,7 +1055,7 @@ def visit_callable_type(self, template: CallableType) -> list[Constraint]:
                 # like U -> U, should be Callable[..., Any], but if U is a self-type, we can
                 # allow it to leak, to be later bound to self. A bunch of existing code
                 # depends on this old behaviour.
-                and not any(tv.id.raw_id == 0 for tv in cactual.variables)
+                and not any(tv.id.is_self() for tv in cactual.variables)
             ):
                 # If the actual callable is generic, infer constraints in the opposite
                 # direction, and indicate to the solver there are extra type variables

mypy/expandtype.py

Lines changed: 1 addition & 1 deletion
@@ -221,7 +221,7 @@ def visit_instance(self, t: Instance) -> Type:
     def visit_type_var(self, t: TypeVarType) -> Type:
         # Normally upper bounds can't contain other type variables, the only exception is
         # special type variable Self`0 <: C[T, S], where C is the class where Self is used.
-        if t.id.raw_id == 0:
+        if t.id.is_self():
             t = t.copy_modified(upper_bound=t.upper_bound.accept(self))
         repl = self.variables.get(t.id, t)
         if isinstance(repl, ProperType) and isinstance(repl, Instance):
