Refactor reversible operators #5475

Michael0x2a · 2018-08-14T06:37:02Z

This pull request refactors how we handle reversible operators like __add__.

Previously, our code assumed that, given the expression A() + B(), we would always try calling A().__add__(B()) first, followed by B().__radd__(A()) (if the __radd__ method exists).

Unfortunately, this model turned out to be too naive, which caused several mismatches and odd errors while I was refining how we handle overlaps and TypeVars in a subsequent PR.

Specifically, what actually happens is that...

1. When doing A() + A(), we only ever try calling A.__add__, never A.__radd__. This is the case even if __add__ is undefined.
2. If B is a subclass of A, and if B defines an __radd__ method, and we do A() + B(), Python will actually try checking B.__radd__ first, then A.__add__ second.

   This lets a subclass effectively "refine" the desired return type.

   Note that if B only inherits an __radd__ method, Python calls A.__add__ first as usual. Basically, B must provide a genuine refinement over whatever A returns.
3. In all other cases, we call __add__ then __radd__ as usual.
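
The three rules above can be checked at runtime with a small sketch. The classes OnlyRadd and Declines and the helper try_add are hypothetical names introduced only for illustration; the snippet exercises CPython's dispatch order, not the mypy checker code touched by this PR.

```python
class A:
    def __add__(self, other):
        return "A.__add__"

    def __radd__(self, other):
        return "A.__radd__"


class B(A):
    # B defines its own __radd__, so it is tried before A.__add__.
    def __radd__(self, other):
        return "B.__radd__"


class C(A):
    # C only inherits __radd__, so A.__add__ is tried first as usual.
    pass


class OnlyRadd:
    # Hypothetical class with no __add__ at all.
    def __radd__(self, other):
        return "OnlyRadd.__radd__"


class Declines:
    # Hypothetical class whose __add__ always declines the operation.
    def __add__(self, other):
        return NotImplemented


def try_add(left, right):
    """Return left + right, or the string "TypeError" if the addition fails."""
    try:
        return left + right
    except TypeError:
        return "TypeError"


print(try_add(OnlyRadd(), OnlyRadd()))  # rule 1: TypeError, __radd__ is never tried
print(try_add(A(), B()))                # rule 2: B.__radd__
print(try_add(A(), C()))                # rule 2 caveat: A.__add__
print(try_add(Declines(), OnlyRadd()))  # rule 3: OnlyRadd.__radd__
```

The A() + B() case is the one that matters most for type checking: the result comes from the subclass's reverse method even though the left operand has a perfectly usable __add__.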

This pull request modifies both checker.py and checkexpr.py to match this behavior, and adds logic so that we check the calls in the correct order.

This ended up slightly changing a few error messages in certain edge cases.

JelleZijlstra · 2018-08-14T14:12:20Z

mypy/nodes.py

+    '__add__',
+    '__sub__',
+    '__mul__',
+    '__truediv__',


Should this include __div__ in Python 2?

Michael0x2a · 2018-08-16

Hmm, good point. I originally excluded it because none of the other data structures in the lines above and below included __div__ -- it seems like mypy handles __div__ on a mostly ad hoc and inconsistent basis in general. (For example, it seems that previously we never checked whether __rdiv__ was unsafely overlapping with __div__ at all, we don't support from __future__ import division at all...)

Properly fixing that is probably something that's better done in a separate PR, but I went ahead and added it here + made a few other adjustments as a stopgap measure.

ilevkivskyi

Thanks! Looks good. I have several comments, all of them minor.

ilevkivskyi · 2018-08-14T20:29:07Z

mypy/checker.py

@@ -1043,72 +1043,102 @@ def check_overlapping_op_methods(self,
        """Check for overlapping method and reverse method signatures.

        Assume reverse method has valid argument count and kinds.
+
+        Precondition:


This is probably common terminology, but I would make it clear that the caller should ensure this (i.e. this function should not be called otherwise).

ilevkivskyi · 2018-08-14T20:31:33Z

mypy/checker.py

+                                 forward_item: CallableType,
+                                 forward_base: Type,
+                                 reverse_type: CallableType) -> bool:
+        # TODO check argument kinds


I like colons after TODO, NOTE, etc.

ilevkivskyi · 2018-08-14T20:32:32Z

mypy/checker.py

-        # non-overlapping.
+        #    This behavior deviates from how we handle overloads -- many of the
+        #    modules in typeshed seem to define __OP__ methods that shadow the
+        #    corresponding __rOP__ method.


This is a very nice explanation. 👍

ilevkivskyi · 2018-08-14T20:33:10Z

mypy/checker.py

+        #    corresponding __rOP__ method.
+        #
+        # Note: we do not attempt to handle unsafe overlaps related to multiple
+        # inheritance.


Maybe add "as well as we do for overloads" or something of this kind?

ilevkivskyi · 2018-08-14T20:47:13Z

mypy/checker.py

+            # Not a valid operator method -- can't succeed anyway.
+            return False
+
+        # Erase the type if necessary to make sure we don't have a dangling


I would explain "dangling" a bit more, maybe add "(i.e. single)"?

ilevkivskyi · 2018-08-14T21:43:24Z

mypy/checkexpr.py

+        # Sometimes, the variants list is empty. In that case, we fall-back to attempting to
+        # call the __op__ method (even though it's missing).
+
+        if len(errors) == 0:


Why check the length of errors instead of just the length of variants?

Also, `if len(seq) == 0` should be `if not seq`.
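
As a small illustration of the style point, here is a sketch of the idiom. The function name describe is hypothetical and is not taken from checkexpr.py.

```python
def describe(seq):
    """Report whether a sequence is empty using truthiness, not len()."""
    # Empty lists, tuples and strings are falsy, so `not seq` is
    # equivalent to `len(seq) == 0` for these built-in sequence types.
    if not seq:
        return "empty"
    return "non-empty"


print(describe([]))      # empty
print(describe([1, 2]))  # non-empty
```

One caveat: `not seq` is also true for None, whereas `len(None)` raises a TypeError, so the two forms only coincide when the value is known to be a sequence.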

ilevkivskyi · 2018-08-14T21:43:47Z

mypy/checkexpr.py

+                errors.append(local_errors)
+                results.append(result)
+            else:
+                return result


I am not sure I understand how this can ever succeed. Could you please add a comment with an example?

Michael0x2a · 2018-08-16

I originally added that just in case, but it seems we do fall into that case at least once -- for example, adding an assert there makes this test case fail.

I added a TODO.

ilevkivskyi · 2018-08-14T21:45:21Z

mypy/messages.py

+                                             reverse_type: Type,
+                                             reverse_method: str,
+                                             context: Context) -> None:
+        msg = "{rfunc} will not be called when running '{cls} {op} {cls}': must define {ffunc}"


I would say "evaluating" instead of "running" here.

ilevkivskyi · 2018-08-14T21:52:11Z

test-data/unit/check-classes.test

+[case testReverseOperatorOrderingCase5]
+class A:
+    def __add__(self, other: B) -> int: ...
+    def __radd__(self, other: A) -> str: ...


Just to double check: this is not an unsafe overlap because only one of the two can ever be called? Could you please add a clarifying comment here?

Michael0x2a · 2018-08-16

Yep. I added a comment.

ilevkivskyi · 2018-08-14T21:55:26Z

test-data/unit/check-classes.test

+# A refinement made by a parent also counts
+reveal_type(A() + C())  # E: Revealed type is 'builtins.str'
+
+


Could you please add a few tests for operator methods that involve generic classes and overloaded methods (just for completeness)?

Michael0x2a · 2018-08-16

I added a few tests for overloads.

I'm not sure I want to add tests for generics here though, since the way operators handle generics is still broken in this PR. I fixed this in my follow-up PR, and added a few test cases there.

ilevkivskyi

Great, thanks! You can merge this after you fix the last few comments here. Then rebase the overlap PR and I will review it.

ilevkivskyi · 2018-08-16T18:09:58Z

mypy/checkexpr.py

+            # Currently, it seems we still need this to correctly deal with
+            # things like metaclasses?
+            #
+            # E.g. see the pythoneval.testMetaclassOpAccessAny test case.


Could you please add details about this to #5136?

ilevkivskyi · 2018-08-16T18:11:03Z

mypy/checkexpr.py

+                # This is probably related to the TODO in lookup_operator(...)
+                # up above.
+                #
+                # TODO: Remove this extra case


I think this is something deserving a follow-up issue. Could you please open one?

ilevkivskyi · 2018-08-16T18:13:20Z

test-data/unit/check-classes.test

+# gracefully -- it doesn't correctly switch to using __truediv__ when
+# 'from __future__ import division' is included, it doesn't display a very
+# graceful error if __div__ is missing but __truediv__ is present...
+# Also see https://github.com/python/mypy/issues/2048


Could you please add these details to the issue and raise its priority to at least normal? This future import is very important in loads of numeric code.

ilevkivskyi · 2018-08-16T18:14:03Z

mypy/checkexpr.py

+        return self.check_op_local(method, method_type, base_type, arg, context, local_errors)
+
+    def check_op_local(self,
+                       method_name,


method_name should be str.

This pull request adds more robust support for detecting partially overlapping types. Specifically, it detects overlaps with... 1. TypedDicts 2. Tuples 3. Unions 4. TypeVars 5. Generic types containing variations of the above. This new overlapping type check is used for detecting unsafe overlaps with overloads and operator methods as well as for performing reachability/unreachability analysis in `isinstance` and `if x is None` checks and the like. This PR also removes some (now unused) code that used to be used for detecting overlaps with operators. This pull request builds on top of #5474 and #5475 and supersedes #5475.

#5475 introduced a new type of error message ("__rop__ will not be called when evaluating 'a + b'...") that triggers when the user tries evaluating expressions like `foo + foo` where `foo` does not contain an `__add__` method that accepts a value of the same type, but *does* contain an `__radd__` method that does. This pull request removes that error message on the grounds that it's too cryptic and unlikely to be helpful to most mypy users. That error message is useful mainly for people developing libraries containing custom numeric types (or libraries that appropriate operators to create custom DSLs) -- however, most people are not library creators and so will not find this error message useful.

Michael0x2a requested a review from ilevkivskyi August 14, 2018 06:37

Michael0x2a mentioned this pull request Aug 14, 2018

Add partial overload checks #5476

Merged

JelleZijlstra reviewed Aug 14, 2018

Michael0x2a force-pushed the refactor-reversible-operators branch from 4ca1879 to 0243661 August 14, 2018 21:35

ilevkivskyi reviewed Aug 14, 2018

Michael0x2a added 2 commits August 16, 2018 08:17

Respond to code review

76d5a72

Fix bug with op names

24aabd9

ilevkivskyi approved these changes Aug 16, 2018

Add missing type hint

4728dd7

Michael0x2a mentioned this pull request Aug 16, 2018

from __future__ import division is returning int instead of float in python 2 #2048

Closed

Michael0x2a merged commit 7f29b73 into python:master Aug 16, 2018

Michael0x2a mentioned this pull request Aug 16, 2018

Refactor operator method access and protocol checks to use checkmember.py #5136

Open

Michael0x2a deleted the refactor-reversible-operators branch August 27, 2018 05:45

Michael0x2a mentioned this pull request Sep 4, 2018

Remove the 'rop will not be called' error message #5571

Merged
