Skip to content

Overhaul overload semantics, remove erasure, add union math #5084

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
May 30, 2018

Conversation

Michael0x2a
Copy link
Collaborator

Overhaul overload semantics, remove erasure, add union math

This pull request:

  1. Modifies how mypy handles overloads to match the proposal I made in the typing repo.

  2. Starts removing type erasure from overload checks.

  3. Adds support for basic union math.

  4. Makes overloads respect keyword-only args

This pull request does NOT implement the following:

  1. Special-casing descriptors

  2. Improving how operator methods are handled: I've tweaked one or two methods to take advantage of some of the changes I made in this PR, but stayed away from making any drastic changes for now.

  3. Detecting partially overlapping argument types -- e.g. code like:

    @overload
    def f(x: Union[A, B]) -> str: ...
    @overload
    def f(x: Union[B, C]) -> int: ...
    
    @overload
    def g(x: Tuple[int, ...]) -> str: ...
    @overload
    def g(x: Tuple[int, int]) -> int: ...
    
  4. Detecting overlapping "argument counts". For example, we should flag the following as an error since the first alternative could potentially overlap with the second.

    @overload
    def f(*args: int) -> int: ...
    @overload
    def f(x: int, y: int, z: int) -> str: ...
    
  5. The "is-more-precise" relation. It's working in most normal cases but still contains a few bugs, mostly relating to type vars. For example, this currently isn't being flagged as an error:

    class Wrapper(Generic[T]):
        @overload
        def f(self, x: int) -> int: ...  # No error?
        @overload
        def f(self, x: T) -> str: ...
    

    (This PR does the right thing if 'T' isn't bound to a containing class though:)

    class Wrapper:
        @overload
        def f(self, x: int, y: int) -> int: ...  # Error
        @overload
        def f(self, x: T, y: T) -> str: ...
    

    Currently, what I'm doing is using the existing is_more_precise method, which calls is_proper_subtype. I think i'll either need to rewrite is_more_precise to do what I want with typevars or find a way of forcing the two methods to unify their typevars before running the is_proper_subtype check.

The plan is to address these 5 TODOs in future pull requests. Items 1 and 2 are basically orthogonal to the overloads overhaul; items 3, 4, and 5 basically boil down to finding ways to teach mypy to detect if one thing is potentially compatible with another.

For example, mypy contains code to tell if one type is definitely a subtype of another; fixing items 3 and 5 involve writing code to check if a type is potentially a subtype of another.

@Michael0x2a
Copy link
Collaborator Author

Michael0x2a commented May 19, 2018

@ilevkivskyi - I think this is ready to be reviewed whenever you have time. The tests are currently failing, but I that should be fixed once python/typeshed#2150 lands and we do the whole sync typeshed + rebase dance.

(It also looks like there are a few issues with the Python 2 stubs which I'm also investigating.)

Also, just a note: accepting this PR without implementing todos 3, 4, and 5 will mean mypy will have a hole in the type system, at least temporarily. If the team decides they'd rather avoid that, I don't mind working on those todos first: it's the same amount of work for me, so I guess it comes down to what's easier review/work with.

Another option might be to add an optional flag that throws a warning if a call ends up matching multiple overloads with distinct return types -- it'll give us a mechanism for detecting overloads that end up being unsafe/would be a useful knob for people who prefer a more paranoid typechecker.

(I also don't think it'll be too noisy: I tried implementing + running that check on mypy and got back only three errors, mostly relating to the zip builtin function.)

@Michael0x2a
Copy link
Collaborator Author

So, correction: most of the failing tests are due to typeshed issues that are easy to fix.

However, one of the failures exposed an issue that might be harder to address: it boils down to an unfortunate interaction between non-strict optional + removing type erasure. This came up with Python 2's map function which, when simplified, looks roughly like this:

@overload
def map(func: None, 
        iter1: Iterable[T1], 
        iter2: Iterable[T2]) -> Iterable[Tuple[T1, T2]]: ...
        
@overload
def map(func: Callable[[T1, T2], S], 
        iter1: Iterable[T1], 
        iter2: Iterable[T2]) -> Iterable[S]: ...

Basically, this is fine with strict optional since the two types for func are disjoint. It fails if you disable strict optional though, since None is a subtype of everything.

Previously, mypy didn't catch this since the two return types, when erased, look identical. But now that we're getting rid of type erasure...

Not 100% sure what to do here. I guess we could lie and pretend that strict optional always applies when checking overloads? Or special-case overloads + None somehow? These options seem sort of suboptimal though...

Thoughts?

@gvanrossum
Copy link
Member

Presumably when someone writes overloads like that they want None to be treated special regardless of the setting of strict_optional. So I think it should be honored. If map(None, ...) is called the overload should use the first alternative. If the argument is not just None, the first alternative should be skipped -- even if the argument type is Optional[Callable[......]], since the Optional is lost when strict_optional is off.

@ilevkivskyi
Copy link
Member

I guess we could lie and pretend that strict optional always applies when checking overloads? Or special-case overloads + None somehow?

Either way is fine. The idea if someone writes an overload on None one expects it to be a "normal type" (i.e.not magically a subtype of everything). I think we already have some special casing for protocols so that __hash__ = None works consistently with Python semantics independently of --strict-optional.

@ilevkivskyi
Copy link
Member

Also, just a note: accepting this PR without implementing todos 3, 4, and 5 will mean mypy will have a hole in the type system, at least temporarily.

This is totally fine. Having some false negatives temporarily is OK, just be sure there are no false positives, they attract more attention, while false negatives can be easily fixed later without disrupting peoples workflows too much.

@ilevkivskyi
Copy link
Member

@Michael0x2a I will start reviewing this tomorrow or Monday, in the meantime I created a new label to simplify coordination of work on overloads topic-overloads

@ilevkivskyi ilevkivskyi self-requested a review May 20, 2018 06:17
@ilevkivskyi ilevkivskyi self-assigned this May 20, 2018
This pull request:

1.  Modifies how mypy handles overloads to match the proposal
    I made in the typing repo.

2.  Starts removing type erasure from overload checks.

3.  Adds support for basic union math.

4.  Makes overloads respect keyword-only args

This pull request does NOT implement the following:

1.  Special-casing descriptors

2.  Improving how operator methods are handled: I've tweaked one or two
    methods to take advantage of some of the changes I made in this PR,
    but stayed away from making any drastic changes for now.

3.  Detecting partially overlapping argument types -- e.g. code like:

        @overload
        def f(x: Union[A, B]) -> str: ...
        @overload
        def f(x: Union[B, C]) -> int: ...

        @overload
        def g(x: Tuple[int, ...]) -> str: ...
        @overload
        def g(x: Tuple[int, int]) -> int: ...

4.  Detecting overlapping "argument counts". For example, we should
    flag the following as an error since the first alternative could
    potentially overlap with the second.

        @overload
        def f(*args: int) -> int: ...
        @overload
        def f(x: int, y: int, z: int) -> str: ...

5.  The "is-more-precise" relation. It's working in most normal cases
    but still contains a few bugs, mostly relating to type vars.
    For example, this currently isn't being flagged as an error:

        class Wrapper(Generic[T]):
            @overload
            def f(self, x: int) -> int: ...  # No error?
            @overload
            def f(self, x: T) -> str: ...

    (This PR does the right thing if 'T' isn't bound to a containing
    class though:)

        class Wrapper:
            @overload
            def f(self, x: int, y: int) -> int: ...  # Error
            @overload
            def f(self, x: T, y: T) -> str: ...

    Currently, what I'm doing is using the existing `is_more_precise`
    method, which calls `is_proper_subtype`. I think i'll either need to
    rewrite `is_more_precise` to do what I want with typevars or find a
    way of forcing the two methods to unify their typevars before running
    the `is_proper_subtype` check.

The plan is to address these 5 TODOs in future pull requests. Items 1
and 2 are basically orthogonal to the overloads overhaul; items 3, 4,
and 5 basically boil down to finding ways to teach mypy to detect if one
thing is *potentially* compatible with another.

For example, mypy contains code to tell if one type is *definitely* a
subtype of another; fixing items 3 and 5 involve writing code to check
if a type is *potentially* a subtype of another.
@Michael0x2a Michael0x2a force-pushed the overhaul-overloads branch from 75d4c79 to 96486b6 Compare May 20, 2018 20:46
Apparently, in Python 2, 'bytecode' promotes to 'unicode'.
Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks as a very good start to finally make overloads first class members in mypy!

Here are some comments.

After this and subsequent PRs land we will need to check issues with "topic-overloads" label, and check which will be fixed (also add tests cases for those if needed).

mypy/checker.py Outdated
assert isinstance(item, Decorator)
sig1 = self.function_type(item.func)
for j, item2 in enumerate(defn.items[i + 1:]):
with experiments.strict_optional_set(True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a comment why this is needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

mypy/checker.py Outdated
defn.impl)
if overload_can_never_match(sig1, sig2):
self.msg.overloaded_signature_will_never_match(
i + 1, i + j + 2, item2.func)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be just j + 1? Why do you need to add them?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because we're iterating over the remaining overloads using an enumeration: 'j' starts at zero.

(i + j + 2 was also what the code appeared to be doing before, so I figured it'd be fine to just copy it for the new check.)

mypy/messages.py Outdated
context: Context) -> None:
self.fail(
'Overloaded function signature {index2} will never be matched: '
'function {index1}\'s parameter type(s) are the same or broader'.format(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would replace "function" -> "signature" here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

mypy/checker.py Outdated
assert False, "Impl isn't the right type"
# This can happen if we've got an overload with a different
# decorator too -- we gave up on the types.
if impl_type is None or isinstance(impl_type, AnyType) or sig1 is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be more specific in this comment. There are three conditions, it would be good to explain what could be the cause for each of them.

mypy/checker.py Outdated
if impl_type.variables:
unified = unify_generic_callable(impl_type, sig1, ignore_return=False)
if unified is None:
self.fail("Type variable mismatch between " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I am not sure I understand the logic here. Maybe add a comment about why do we need this. Also do we have enough tests for this error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't really understand the logic here either. TBH, I didn't look too hard at it: the only changes I made in my first pass was to just indent everything by a level (because of the strict-optional with statement).

I decided to just remove this error message in my second pass: I was able to find only one test case that produced this error message. (And that test case looked ill-defined anyways -- it was checking the f(x: Any) -> T case I was asking about on gitter earlier.)

(Same sort of thing for your comment about the three conditions up above: that was old logic I decided to just keep. I did remove the sig1 is None bit though -- I think that condition is actually always going to be false.)

#
# However, that means we'll also give up if the original overloads contained
# any unions. This is likely unnecessary -- we only really need to give up if
# there are more then one *synthesized* union arguments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even this is to strict technically speaking. Imagine this series of overloads (I will use shorter notations, all classes below don't have common bases):

(A, B) -> C1
(C, D) -> C2
(A, D) -> C3
(C, B) -> C4

In this case calling with (Union[A, C], Union[B, D]) is safe and should just infer Union[C1, ..., C4]. This however has 0.001% probability to appear in real code. The general algorithm would be something like reverse polynomial expansion, but I propose not to do this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this was a can of worms I really didn't want to open.

I'll leave it up to somebody smarter then me to figure out how to handle cases like these if it ever becomes necessary.

instead of the other way around (contravariantly).

This function is mostly used to check if the left is a subtype of the right which
is why the default is to check the args contravariantly. However, it's occasionally
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe give a short example here when it is useful?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

mypy/types.py Outdated
@@ -674,6 +675,8 @@ class CallableType(FunctionLike):
# 'dict')
'from_type_type', # Was this callable generated by analyzing Type[...]
# instantiation?
'from_overloads', # Was this callable generated by synthesizing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where this flag is actually used.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, forgot to delete that

@@ -1391,7 +1409,7 @@ def r(x: Any) -> Any:...
@overload
def g(x: A) -> A: ...
@overload
def g(x: Tuple[A1, ...]) -> B: ... # E: Overloaded function signatures 2 and 3 overlap with incompatible return types
def g(x: Tuple[A1, ...]) -> B: ... # TODO: This should be an error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to fix this before this PR is merged.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way to fix this in the general case is to add support for identifying partially overlapping types.

As an interim measure though, I added back in some of the existing code for checking overlaps -- it'll check for cases like these at least on the top level.


[builtins fixtures/list.pyi]
[typing fixtures/typing-full.pyi]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need few more tests:

  • For allowing overloads on container item type (Lis[int] -> int, List[str] -> str, etc.) Just to be sure there are no spurious errors, also check what happens if called with List[Any].
  • For type inference with overloads, there are already some tests, I would just add few to more, in particular to be sure that type inference works correctly for union math and for nested calls with generic overloads.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (I hope)

This commit is a superficial one: it modifies the error message for
overlapping alternatives to use the word "signature" instead of
"function", as suggested.

The majority of the changes are updates in tests.
This commit adds extra comments/cleans up some existing ones as prompted
by the first code review.
I originally added the 'from_overloads' field because I thought it would
help us produce better error messages when both union math and the
"select the first match" algorithms fail.

However, I found a slightly better approach for handling errors +
decided to defer more adding more sophicated error messages for a future
PR and forgot to remove this field.
The previous plan was that I would defer checking for partial overlaps
for a future PR.

However, that led to some regressions -- for example, the tuple ellipsis
variance test case.

So, I decided to add in very rudimentary partial overlap checks as a
stop-gap measure to help catch more obviously unsafe signatures. The
checks only attempt to check for overlaps at the top-level, and handle
only unions, tuples, and unrestricted TypeVars.
The previous implementation of union_overload_matches blindly copied
code for handling generics from somewhere (I forget where). In a
nutshell, what the code did was pre-emptively infer what the types of
each callable ought to be based on the argument types before unioning.

I think this was the wrong approach, after doing some more testing. This
new approach does not attempt to do any inference and instead restricts
itself to "combining" TypeVars by name before unioning.

My understanding is that in Mypy, we normally ignore the name of the
TypeVar and instead rely exclusively on inference to figure out how each
TypeVar ought to be handled.

However, here, I think it makes sense to go by name: if a user used the
TypeVar 'T' in two different overload alternatives, they probably meant
for that TypeVar to mean the same thing.

(And if they didn't, well, I don't feel too bad about that: their code
is probably confusing and ought to be re-written.)
The previous version of the code only checked for the presence of 'Any'
at the top level.

This new version does a more robust check on the types so we can handle
inputs like 'List[Any]'.

It also streamlines 'infer_overload_return_type': rather then checking
to see if there's ambiguity on each iteration, we can just do it once at
the end.

This is theoretically a performance win: 'any_causes_overload_ambiguity'
loops over all of the alternatives so unnesting this logic removes
accidentally-quadratic behavior.

In practice, this is probably a wash: overloads are unlikely to be
super-long in practice. We also call 'check_call' more frequently: that
method is probably the more expensive of the two.

(But either way, I think this change does make the code look tidier.)
This commit rearranges the logic so we try performing union math first,
not second.

See the discussion in python#4063 for
details/justification about this change.
Since the code for special-casing overloaded descriptors looked so
simple, I went ahead and just added it. This also gave me an excuse to
add more complex tests.
This commit cleans up and documents the code for checking overload
implementations. In particular:

1.  I think 'sig1' can actually never be None (at least going by
    the types) so there's no need to check it.

2.  I think the second check was actually broken -- it seemed to break
    in some edge cases where the overload implementation returns a
    union.
This commit adds a few more miscellaneous tests.
@Michael0x2a
Copy link
Collaborator Author

@ilevkivskyi -- I think this is ready for a second round of review whenever you have time.

Also, I'm not sure if this will actually make reviewing any easier, but I tried to split up my changes into self-contained commits w/ comments so you can check each distinct change + corresponding tests one-by-one instead of all at once if you prefer.

The biggest change I made was to redo how union-math handles generics. My previous implementation tried to use the caller arguments to infer types before unioning; the new logic tries to defer generics handling as much as possible and just unions together the typevars as they are.

That said, my approach makes the assumption that typevars are internally "consistent" between overload alternatives: if a user uses a typevar T in one alternative and a typevar S in another in the same way, it treats those two TypeVars as distinct even though they could be unified. The testOverloadInferUnionReturnWithInconsistentTypevarNames test case has an example of this.

Let me know if you think this isn't a correct approach.

I also just went ahead and added code to special-case descriptors: I think it ended up being only like 2 lines of code, so it seemed like an easy win.


# TODO: Should we try and handle this differently?
# On one hand, this overload is defined in a weird way so it's arguably
# the user's fault; on the other, there's nothing overtly wrong with it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just open an issue about this so we don't forget about this corner case. Currently this looks too weird to care.

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks solid. TBH, I didn't check all the tests with type variables, but I trust you and I believe there are some unrelated bugs that may interfere with these tests (I don't know if you hit #1317 which I think we really need to fix).

Please consider some minor and optional comments. I think this could be merged after the next iteration, and we can continue working in the subsequent PRs.

Also, I'm not sure if this will actually make reviewing any easier, but I tried to split up my changes into self-contained commits w/ comments so you can check each distinct change + corresponding tests one-by-one instead of all at once if you prefer.

Yes, this was helpful.

# This is normally what we want, but for checking the validity of overload
# implementations, we actually want to use the same variance for both.
#
# TODO: Patch 'is_callable_compatible' and 'unify_generic_callable'?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential way to track these TODOs would be to open a follow-up issue with a list of checkboxes for TODOs you are going to fix n subsequent PRs, like this:

  • Write this list
  • TODO in checker.py about ..
  • TODO in ...

mypy/checker.py Outdated

Two signatures s and t are overlapping if both can be valid for the same
statically typed values and the return types are incompatible.
statically typed values and the return types are incompatible.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think spaces at the start are not needed.

plausible_targets = self.plausible_overload_call_targets(arg_types, arg_kinds,
arg_names, callee)

# Step 2: If the arguments contain a union, we try performing union math first.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please extend the comment here explaining why we need this before the Step 3 (maybe give a link to an issue). Because naively one may expect that performing Step 3 first will give more precise return types.

@@ -2993,3 +3203,44 @@ def map_formals_to_actuals(caller_kinds: List[int],
for actual in actuals:
actual_to_formal[actual].append(formal)
return actual_to_formal


def merge_typevars_in_callables_by_name(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can avoid adding this function here, and instead ensure that during semantic analysis there is a single binding for a type variable across overload variants. This would be a clearer solution IMO, but the current code around TypeVarScope is a bit "opaque" and requires a refactoring anyway, so maybe it makes sense to this in a separate PR, up to you.

mypy/types.py Outdated
special_sig: Optional[str] = _dummy,
from_type_type: bool = _dummy,
from_overloads: bool = _dummy,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot to remove this one.

reveal_type(foo(a))
reveal_type(foo(b))
reveal_type(foo(c))
reveal_type(foo(d))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a test that [] matches the first overload (as IIRC it is defined in your spec)? Also we probably need similar tests for empty user defined covariant collections.

def g(*, p2: A, p1: X) -> Y: ...
def g(*, p1, p2): ...

[case testOverloadWithVariableArgsAreOverlapping-skip]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests cases are too long, please consider splitting at least the few longest in parts.


return arg1, arg2


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid double empty lines in tests.

@ilevkivskyi
Copy link
Member

(Also note there is a minor merge conflict now.)

@Michael0x2a
Copy link
Collaborator Author

@ilevkivskyi -- ok, I made the changes you suggested! I split up the tests + added the new ones you suggested. Let me know if the new test I added for empty covariant collections wasn't what you had in mind.

Regarding the semantic analysis/TypeVarScope related changes: I'd prefer to tackle that in a separate PR, if you don't mind. (My current plan is to plug the correctness holes related to partially overlapping types and arities first, then circle back around to cleanup and polishing.)

@ilevkivskyi
Copy link
Member

@msullivan I am merging this now, but I propose to not include this in the release, there are some things that still need to be polished, we will include this in the next release in three weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants