-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Refactor reversible operators #5475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor reversible operators #5475
Conversation
This pull request adds more robust support for detecting partially overlapping types. Specifically, it detects overlaps with... 1. TypedDicts 2. Tuples 3. Unions 4. TypeVars 5. Generic types containing variations of the above. It also swaps out the code for detecting overlaps with operators and removes some associated (and now unused) code. This pull request builds on top of python#5474 and python#5475 -- once those two PRs are merged, I'll rebase this diff if necessary. This pull request also supercedes python#5475 -- that PR contains basically the same code as these three PRs, just smushed together.
mypy/nodes.py
Outdated
'__add__', | ||
'__sub__', | ||
'__mul__', | ||
'__truediv__', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this include __div__
in Python 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good point. I originally excluded it because none of the other data structures in the lines above and below included __div__
-- it seems like mypy handles __div__
in a mostly ad-hoc and inconsistent basis in general. (For example, it seemed like previously we never checked to see if __rdiv__
was unsafely overlapping with __div__
at all, we don't support from __future__ import division
at all...)
Properly fixing that is probably something that's better done in a separate PR, but I went ahead and added it here + made a few other adjustments as a stopgap measure.
This pull request refactors and reworks how we handle reversible operators like __add__. Specifically, what our code was previously doing was assuming that given the expression `A() + B()`, we would always try calling `A().__add__(B())` first, followed by `B().__radd__(A())` second (if the `__radd__` method exists). Unfortunately, it seems like this model was a little too naive, which caused several mismatches/weird errors when I was working on refining how we handle overlaps and TypeVars in a subsequent PR. Specifically, what actually happens is that... 1. When doing `A() + A()`, we only ever try calling `A.__add__`, never `A.__radd__`. This is the case even if `__add__` is undefined. 2. If `B` is a subclass of `A`, and if `B` defines an `__radd__` method, and we do `A() + B()`, Python will actually try checking `B.__radd__` *first*, then `A.__add__` second. This lets a subclass effectively "refine" the desired return type. Note that if `B` only *inherits* an `__radd__` method, Python calls `A.__add__` first as usual. Basically, `B` must provide a genuine refinement over whatever `A` returns. 3. In all other cases, we call `__add__` then `__radd__` as usual. This pull request modifies both checker.py and checkexpr.py to match this behavior, and adds logic so that we check the calls in the correct order. This ended up slightly changing a few error messages in certain edge cases.
4ca1879
to
0243661
Compare
This pull request adds more robust support for detecting partially overlapping types. Specifically, it detects overlaps with... 1. TypedDicts 2. Tuples 3. Unions 4. TypeVars 5. Generic types containing variations of the above. It also swaps out the code for detecting overlaps with operators and removes some associated (and now unused) code. This pull request builds on top of python#5474 and python#5475 -- once those two PRs are merged, I'll rebase this diff if necessary. This pull request also supercedes python#5475 -- that PR contains basically the same code as these three PRs, just smushed together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Looks good, I have several comments, all of them are minor.
mypy/checker.py
Outdated
@@ -1043,72 +1043,102 @@ def check_overlapping_op_methods(self, | |||
"""Check for overlapping method and reverse method signatures. | |||
|
|||
Assume reverse method has valid argument count and kinds. | |||
|
|||
Precondition: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably a common terminology, but I would make it clear that the caller should ensure this (i.e. this function should not be called otherwise).
mypy/checker.py
Outdated
forward_item: CallableType, | ||
forward_base: Type, | ||
reverse_type: CallableType) -> bool: | ||
# TODO check argument kinds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like colons after TODO
, NOTE
, etc.
mypy/checker.py
Outdated
# non-overlapping. | ||
# This behavior deviates from how we handle overloads -- many of the | ||
# modules in typeshed seem to define __OP__ methods that shadow the | ||
# corresponding __rOP__ method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This a very nice explanation. 👍
mypy/checker.py
Outdated
# corresponding __rOP__ method. | ||
# | ||
# Note: we do not attempt to handle unsafe overlaps related to multiple | ||
# inheritance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add "as well as we do for overloads" or something of this kind?
mypy/checker.py
Outdated
# Not a valid operator method -- can't succeed anyway. | ||
return False | ||
|
||
# Erase the type if necessary to make sure we don't have a dangling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would explain "dangling" a bit more, maybe add "(i.e. single)"?
mypy/checkexpr.py
Outdated
# Sometimes, the variants list is empty. In that case, we fall-back to attempting to | ||
# call the __op__ method (even though it's missing). | ||
|
||
if len(errors) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why checking the length of errors instead of just length of variants?
Also, if len(seq) == 0
should be if not seq
.
errors.append(local_errors) | ||
results.append(result) | ||
else: | ||
return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand how can this ever succeed, could you please add a comment with an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally added that out just in case, but it seems we do fall into that case at least once -- for example, adding an assert there makes this test case fail.
I added a TODO.
mypy/messages.py
Outdated
reverse_type: Type, | ||
reverse_method: str, | ||
context: Context) -> None: | ||
msg = "{rfunc} will not be called when running '{cls} {op} {cls}': must define {ffunc}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say "evaluating" instead of "running" here.
test-data/unit/check-classes.test
Outdated
[case testReverseOperatorOrderingCase5] | ||
class A: | ||
def __add__(self, other: B) -> int: ... | ||
def __radd__(self, other: A) -> str: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just do double check: this is not an unsafe overlap because only one of the two can be ever called? Could you please add a clarifying comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. I added a comment.
test-data/unit/check-classes.test
Outdated
# A refinement made by a parent also counts | ||
reveal_type(A() + C()) # E: Revealed type is 'builtins.str' | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add few tests for operator methods that involve generic classes and overloaded methods (just for completeness).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a few tests for overloads.
I'm not sure if I want to add tests for generics here though, since the way operators handle generics are still broken in this PR. I fixed this in my follow-up PR though, and added a few test cases there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks! You can merge this after you fix the last few comments here. Then rebase the overlap PR and I will review it.
# Currently, it seems we still need this to correctly deal with | ||
# things like metaclasses? | ||
# | ||
# E.g. see the pythoneval.testMetaclassOpAccessAny test case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add details about this to #5136?
# This is probably related to the TODO in lookup_operator(...) | ||
# up above. | ||
# | ||
# TODO: Remove this extra case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is something deserving a follow-up issue. Could you please open one?
# gracefully -- it doesn't correctly switch to using __truediv__ when | ||
# 'from __future__ import division' is included, it doesn't display a very | ||
# graceful error if __div__ is missing but __truediv__ is present... | ||
# Also see https://github.com/python/mypy/issues/2048 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add these details to the issue and raise its priority to at least normal? This future import is very important in loads of numeric code.
mypy/checkexpr.py
Outdated
return self.check_op_local(method, method_type, base_type, arg, context, local_errors) | ||
|
||
def check_op_local(self, | ||
method_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method_name
should be str
.
This pull request adds more robust support for detecting partially overlapping types. Specifically, it detects overlaps with... 1. TypedDicts 2. Tuples 3. Unions 4. TypeVars 5. Generic types containing variations of the above. It also swaps out the code for detecting overlaps with operators and removes some associated (and now unused) code. This pull request builds on top of python#5474 and python#5475 -- once those two PRs are merged, I'll rebase this diff if necessary. This pull request also supercedes python#5475 -- that PR contains basically the same code as these three PRs, just smushed together.
This pull request adds more robust support for detecting partially overlapping types. Specifically, it detects overlaps with... 1. TypedDicts 2. Tuples 3. Unions 4. TypeVars 5. Generic types containing variations of the above. This new overlapping type check is used for detecting unsafe overlaps with overloads and operator methods as well as for performing reachability/unreachability analysis in `isinstance` and `if x is None` checks and the like. This PR also removes some (now unused) code that used to be used for detecting overlaps with operators. This pull request builds on top of #5474 and #5475 and supersedes #5475.
python#5475 introduced a new type of error message ("__rop__ will not be called when evaluating 'a + b'...") that triggers when the user tries evaluating expressions like `foo + foo` where `foo` does not contain an `__add__` method that accepts a value of the same type, but *does* contain an `__radd__` method that does. This pull request removes that error message on the grounds that it's too cryptic and unlikely to be helpful to most mypy users. That error message is useful mainly for people developing libraries containing custom numeric types (or libraries that appropriate operators to create custom DSLs) -- however, most people are not library creators and so will not find this error message useful.
#5475 introduced a new type of error message ("__rop__ will not be called when evaluating 'a + b'...") that triggers when the user tries evaluating expressions like `foo + foo` where `foo` does not contain an `__add__` method that accepts a value of the same type, but *does* contain an `__radd__` method that does. This pull request removes that error message on the grounds that it's too cryptic and unlikely to be helpful to most mypy users. That error message is useful mainly for people developing libraries containing custom numeric types (or libraries that appropriate operators to create custom DSLs) -- however, most people are not library creators and so will not find this error message useful.
This pull request refactors and reworks how we handle reversible operators like
__add__
.Specifically, what our code was previously doing was assuming that given the expression
A() + B()
, we would always try callingA().__add__(B())
first, followed byB().__radd__(A())
second (if the__radd__
method exists).Unfortunately, it seems like this model was a little too naive, which caused several mismatches/weird errors when I was working on refining how we handle overlaps and TypeVars in a subsequent PR.
Specifically, what actually happens is that...
When doing
A() + A()
, we only ever try callingA.__add__
, neverA.__radd__
. This is the case even if__add__
is undefined.If
B
is a subclass ofA
, and ifB
defines an__radd__
method, and we doA() + B()
, Python will actually try checkingB.__radd__
first, thenA.__add__
second.This lets a subclass effectively "refine" the desired return type.
Note that if
B
only inherits an__radd__
method, Python callsA.__add__
first as usual. Basically,B
must provide a genuine refinement over whateverA
returns.In all other cases, we call
__add__
then__radd__
as usual.This pull request modifies both checker.py and checkexpr.py to match this behavior, and adds logic so that we check the calls in the correct order.
This ended up slightly changing a few error messages in certain edge cases.