re.match(), re.search() and re.fullmatch() cannot be used with AnyStr #9591

Closed
agateau-gg opened this issue Jan 26, 2023 · 5 comments · Fixed by #9592
Comments

@agateau-gg

Given this test file:

from typing import AnyStr

from re import Match, Pattern

def check_re_search(pattern: Pattern[AnyStr], string: AnyStr) -> Match[AnyStr]:
    match = pattern.search(string)
    if match is None:
        raise ValueError(f"'{string!r}' does not match {pattern!r}")
    return match

pyright reports the following error:

No configuration file found.
No pyproject.toml file found.
stubPath /home/agateau/tmp/typings is not a valid directory.
Assuming Python platform Linux
Searching for source files
Found 1 source file
pyright 1.1.291
/home/agateau/tmp/crs.py
  /home/agateau/tmp/crs.py:6:21 - error: Could not bind method "search" because "Pattern[AnyStr@check_re_search]" is not assignable to parameter "self"
    "Pattern[AnyStr@check_re_search]" is incompatible with "Pattern[str]"
      TypeVar "AnyStr@Pattern" is invariant
        Type "AnyStr@check_re_search" cannot be assigned to type "str" (reportGeneralTypeIssues)
  /home/agateau/tmp/crs.py:6:21 - error: Could not bind method "search" because "Pattern[AnyStr@check_re_search]" is not assignable to parameter "self"
    "Pattern[AnyStr@check_re_search]" is incompatible with "Pattern[bytes]"
      TypeVar "AnyStr@Pattern" is invariant
        Type "AnyStr@check_re_search" cannot be assigned to type "bytes" (reportGeneralTypeIssues)

The same code works fine if one replaces all AnyStr with either str or bytes.
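
For reference, a minimal runnable sketch showing that the reported function behaves correctly at runtime for both `str` and `bytes`, so the error above is purely a static typing issue. The patterns and inputs (`\d+`, `"abc 123"`) and the result names are made up for illustration.

```python
import re
from re import Match, Pattern
from typing import AnyStr


def check_re_search(pattern: Pattern[AnyStr], string: AnyStr) -> Match[AnyStr]:
    # Same body as in the report: return the match or raise if there is none.
    match = pattern.search(string)
    if match is None:
        raise ValueError(f"'{string!r}' does not match {pattern!r}")
    return match


# str pattern with str input
str_match = check_re_search(re.compile(r"\d+"), "abc 123")

# bytes pattern with bytes input
bytes_match = check_re_search(re.compile(rb"\d+"), b"abc 123")
```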

I originally reported this on the pyright issue tracker (microsoft/pyright#4534), but it seems the issue is actually here (see microsoft/pyright#4534 (comment)).

Quoting erictraut's comment:

The fix for this would require an additional overload within the typeshed stubs. Something like this:

@overload
def search(self, string: AnyStr | ReadableBuffer, pos: int = ..., endpos: int = ...) -> Match[AnyStr] | None: ...

@AlexWaygood
Member

Interesting, thanks for opening the issue! I'll try making a PR.

@AlexWaygood
Member

AlexWaygood commented Jan 26, 2023

Hmm, there is a snag here: the overload @erictraut suggests would be unsafe, as I believe it would mean type checkers would no longer emit errors for code such as the following, which raises TypeError at runtime:

import re

pat = re.compile("foo")
string = bytearray(b"foo")  # ReadableBuffer
pat.search(string)

I believe this is the very reason why we use overloads here in the first place.
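
A small sketch of the runtime behaviour being described, for anyone who wants to check it locally: a `str` pattern rejects a `bytearray`, while a `bytes` pattern accepts it. The variable names are illustrative only.

```python
import re

str_pat = re.compile("foo")
bytes_pat = re.compile(b"foo")
buffer = bytearray(b"foo")  # a buffer object that is neither str nor bytes

# A str pattern cannot search a bytes-like object: this raises TypeError.
try:
    str_pat.search(buffer)
except TypeError as exc:
    str_pattern_error = exc
else:
    str_pattern_error = None

# A bytes pattern accepts the same buffer and returns a match.
bytes_match = bytes_pat.search(buffer)
```

This is the case the existing `self`-typed overloads catch statically, and the case an `AnyStr | ReadableBuffer` parameter would let through.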

Our general policy in typeshed is to prefer false negatives over false positives, but the lack of type safety here would be pretty unfortunate. Possibly we could compromise here by just using AnyStr as the parameter, instead of AnyStr | ReadableBuffer:

@overload
def search(self, string: AnyStr, pos: int = ..., endpos: int = ...) -> Match[AnyStr] | None: ...
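
To make the shape of that compromise concrete, here is a rough sketch on a hypothetical wrapper class, `PatternSketch`, not the typeshed stub itself. It assumes the existing stubs have one overload for `Pattern[str]` and one for `Pattern[bytes]` (as the pyright output in the report suggests), and adds the plain `AnyStr` overload as a third fallback. `BufferLike` is a stand-in for typeshed's `ReadableBuffer`. Whether a given type checker accepts this overload set without complaints has not been verified here.

```python
import re
import sys
from typing import AnyStr, Generic, Optional, Union, overload

# Stand-in for typeshed's ReadableBuffer alias (an assumption for this sketch).
BufferLike = Union[bytes, bytearray, memoryview]


class PatternSketch(Generic[AnyStr]):
    """Hypothetical wrapper around re.Pattern, used only to show the overloads."""

    def __init__(self, compiled: "re.Pattern[AnyStr]") -> None:
        self._compiled = compiled

    # str pattern: only str input is accepted.
    @overload
    def search(
        self: "PatternSketch[str]", string: str, pos: int = ..., endpos: int = ...
    ) -> Optional["re.Match[str]"]: ...

    # bytes pattern: any buffer is accepted, and the match is typed as bytes.
    @overload
    def search(
        self: "PatternSketch[bytes]", string: BufferLike, pos: int = ..., endpos: int = ...
    ) -> Optional["re.Match[bytes]"]: ...

    # Fallback for generic callers where pattern and string share one AnyStr.
    @overload
    def search(
        self, string: AnyStr, pos: int = ..., endpos: int = ...
    ) -> Optional["re.Match[AnyStr]"]: ...

    def search(self, string, pos=0, endpos=sys.maxsize):
        # Runtime behaviour is delegated to the real compiled pattern.
        return self._compiled.search(string, pos, endpos)


sketch_match = PatternSketch(re.compile("foo")).search("a foo")
```

With the third overload in place, a generic function taking `PatternSketch[AnyStr]` and `AnyStr` has a signature to bind to, while a `str` pattern called with a `bytearray` still matches none of the overloads.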

@JelleZijlstra
Member

Yes, I think AnyStr | ReadableBuffer is wrong. We'd need a separate overload for ReadableBuffer that returns Match[bytes]. IIRC I didn't do it that way before because of a mypy bug, but maybe that's been fixed.

@erictraut
Contributor

Thanks @AlexWaygood. I agree that my suggestion would lead to potential false negatives. I like your suggestion.

@AlexWaygood
Member

> Yes, I think AnyStr | ReadableBuffer is wrong. We'd need a separate overload for ReadableBuffer that returns Match[bytes]. IIRC I didn't do it that way before because of a mypy bug, but maybe that's been fixed.

Indeed, it looks like the mypy bug is still very much with us: #9593 (comment).

4 participants