Skip to content

comparison-overlap not triggered for pathlib.Path #10661

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
NMertsch opened this issue Sep 4, 2023 · 14 comments · Fixed by #10662
Closed

comparison-overlap not triggered for pathlib.Path #10661

NMertsch opened this issue Sep 4, 2023 · 14 comments · Fixed by #10662

Comments

@NMertsch
Copy link

NMertsch commented Sep 4, 2023

--strict-comparison seems to assume that pathlib.Path objects can be equal to anything.

To Reproduce (playground)

# run with --strict-equality
from pathlib import Path

class A: pass

'x' == 1  # comparison-overlap: this is always false
A() == 1  # comparison-overlap: this is always false
Path('x') == 1  # fine for mypy

Expected Behavior
Both lines should trigger 'non-overlapping equality check'

Actual Behavior
Only 'x' == 1 triggers 'non-overlapping equality check'

@hauntsaninja
Copy link
Collaborator

@hauntsaninja hauntsaninja transferred this issue from python/mypy Sep 4, 2023
@AlexWaygood
Copy link
Member

This is because of https://github.com/python/typeshed/pull/7231/files#diff-b5aebc9806d203cfe171e2215b0b4f3106395fc59687382eb9276d7d8976bffd , transferring to typeshed

The pathlib changes in that PR were correct — PurePath does indeed implement a custom __eq__ method: https://github.com/python/cpython/blob/6304d983a0656c1841769bf36e5b42819508d21c/Lib/pathlib.py#L473

I think the false negative here is just due to the heuristics mypy uses in order to avoid false positives for its --strict-equality check (I believe it decides that a comparison between any two objects is okay, as long as one of the two objects is an instance of a class that implements a custom __eq__ method. So I can't see how this is a typeshed issue (though I imagine it would also be a wontfix for mypy).

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Sep 4, 2023

It's not just mypy, pyright appears to use the same heuristic for its reportUnnecessaryComparison strict check.

It seems the main benefit of PurePath.__eq__ is that it avoids a false positive for users who are explicitly calling __eq__ with a keyword argument. This seems very niche. On the other hand, I've definitely made the mistake of trying to compare a Path to a string.

hauntsaninja added a commit to hauntsaninja/typeshed that referenced this issue Sep 4, 2023
@AlexWaygood
Copy link
Member

It's not just mypy, pretty sure pyright uses a similar heuristic for its reportUnnecessaryComparison strict check.

Oh for sure, and I think it's a reasonable heuristic to use! That's why I thought it would probably be a wontfix if we considered it a mypy issue.

It seems the main benefit of PurePath.__eq__ is that it avoids a false positive for users who are explicitly calling __eq__ with a keyword argument. This seems very niche. On the other hand, I've definitely made the mistake of trying to compare a Path to a string.

Sure, I agree that this is an unfortunate side effect of adding PurePath.__eq__ to the stub here, and that in an ideal world, mypy would emit a comparison-overlap error if it saw someone comparing a Path to a str. I agree that that's a common error, and that calling __eq__ with a keyword argument is very uncommon.

I guess we can remove PurePath.__eq__ from the stub if we think it does more harm than good, but I feel somewhat uncomfortable with the basis on which we'd be making the change. Are we saying we should make a judgement call on each __eq__ we include in any stub, and try to determine the effect including that method might have on false-positives/false-negatives in --strict-equality checks? My previous understanding was that we should generally include all __eq__ implemented at runtime in the stubs, with the goal of keeping --strict-equality false positives in type checkers to a minimum.

But maybe I've been looking at it the wrong way :-)

@AlexWaygood
Copy link
Member

I.e. my basis for adding most of the __eq__ methods in #10464 (a more recent PR where I added a bunch of __eq__ and __hash__ methods) was our general policy about avoiding false positives being more important than avoiding false negatives. But if we want to get false negatives down on --strict-equality checks then we may want to look more closely at some of the changes I made there and see if they were worth it

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Sep 4, 2023

I think given we haven't been particularly intentional here, #10464 was definitely the right thing to do.

I guess a philosophy here could be:

But curious what other folks think!

I'm also not actually sure how type checkers handle types on __eq__. E.g., a type checker can't report comparison issues if the signature is def __eq__(self, other: object). But maybe it could if the signature was def __eq__(self, other: T) and type checkers take that to mean __eq__ would return NotImplemented if something other than a T was passed in?

@AlexWaygood
Copy link
Member

AlexWaygood commented Sep 4, 2023

I'm on board with that — it sounds like a reasonably pragmatic approach. Though I'm still not thrilled about making decisions about what goes in the stubs based on these heuristics the type checkers are using, since it feels like the precise heuristics they use could be subject to change in the future. So for cases where we remove or purposefully omit __eq__, I'd want to make sure we add tests and/or document why the __eq__ method is absent (as you're doing in #10662!).

  • I don't care about explicit __eq__ keyword-only false positives and am happy to trade that off for almost any other benefit

Fully agreed that this is the least important consideration

@AlexWaygood
Copy link
Member

I'm also not actually sure how type checkers handle types on __eq__. E.g., a type checker can't report comparison issues if the signature is def __eq__(self, other: object). But maybe it could if the signature was def __eq__(self, other: T) and type checkers take that to mean __eq__ would return NotImplemented if something other than a T was passed in?

Currently mypy and pyright will both complain at you if you have an __eq__ method that is annotated any differently to def __eq__(self, other: T), since that's the annotation we give for object.__eq__. So if we went down this route, it might require a lot of type: ignores 😄

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Sep 5, 2023

Huh, mypy actually has a pretty detailed error message for that one. Yeah, that was meant as a suggestion for how type checkers could change how they interpret __eq__ / what heuristics they use. This actually matters more for real code than in stubs, since it's quite unintuitive that you'd have to use if not TYPE_CHECKING to get comparison errors even though your custom __eq__ only handles instances of your class. Would require a bunch of coordination!

@srittau
Copy link
Collaborator

srittau commented Sep 5, 2023

A few general thoughts. (I don't care about keeping __eq__ too much in this case.)

  • Stubs should generally follow the implementation as closely as possible. There are sometimes reasons not to do that, but this usually brings its own problems.
  • Keeping the different heuristics that type checkers use in different cases in mind is hard.
  • This particular heuristic is fairly weak, i.e. it's easy to lose strict overlap checks.

Maybe we could find a way to help type checkers better with strict overlap checks. For example, we could have __eq__ overloads:

@overload
def __eq__(self, other: PurePath) -> bool: ...
@overload
def __eq__(self, other: object) -> NotImplemented: ...

Of course this is probably a discussion for typing-sig or discourse.

@Akuli
Copy link
Collaborator

Akuli commented Sep 5, 2023

A couple more thoughts:

  • If the __eq__ method is written incorrectly (somewhat rare), it usually errors if you compare to anything else than an instance of the class. I think there's no good way to type-annotate a broken __eq__ method.
  • If the __eq__ method is written correctly, it seems like relying on the type checker's overlap check will do the right thing in almost all practical cases. What is the motivation/reasoning for including __eq__ methods in stubs?

@AlexWaygood
Copy link
Member

AlexWaygood commented Sep 5, 2023

  • What is the motivation/reasoning for including __eq__ methods in stubs?

The motivation is that it can often lead to false positives if they're not included in stubs. Lots of Python classes compare equal to objects of different classes -- it would be really annoying if you couldn't compare a set to a frozenset without a type checker complaining, for example:

>>> {1, 2, 3} == frozenset((1, 2, 3))
True
>>> {1, 2, 3} == dict.fromkeys((1, 2, 3)).keys()
True
>>> from types import MappingProxyType
>>> {1:2, 3:4} == MappingProxyType({1:2, 3:4})
True

See e.g. #9580, where Jukka added types.MappingProxyType.__eq__ to get rid of some false positives when comparing MappingProxyType instances to dicts, or #10297, where a bunch of false-positive errors in the streamlit project were removed by adding typing.Mapping.__eq__

@NMertsch
Copy link
Author

NMertsch commented Sep 5, 2023

Thank you all for the discussion, I learnt a lot! But can someone tell me what's the benefit of the current stub for PurePath.__eq__?

The only typing-relevant difference between the default __eq__ and PurePath.__eq__ I can see is that it uses other as argument name, so it can also be used via Path("x").__eq__(other=Path("y")).

  • This behavior is not documented anywhere.
  • Mypy does not support it, even with the current stub: playground

So if I don't miss anything, removing the stub for __eq__ (as proposed in #10662) leads to the following:

  • type checkers are able to detect if "x" == Path("x") as an issue (it's always False, triggers comparison-overlap in mypy --strict-equality)
  • Nothing else

While it is true that type checkers might implement better heuristics in the future, I don't see any use for the current stub, while #10662 could prevent some nasty bugs.

@AlexWaygood
Copy link
Member

AlexWaygood commented Sep 5, 2023

But can someone tell me what's the benefit of the current stub for PurePath.__eq__?

There would be a small regression caused by removing PurePath.__eq__: see #10662 (comment). In this specific case, the false negative you're experiencing is much more likely to come up than the new false positive we'd create, so I don't really think it matters much. But it's illustrative of the reason why, in the past, we've generally opted to include __eq__ methods in the stubs in most situations.

hauntsaninja added a commit that referenced this issue Sep 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants