please support \" and \' as equivalent to " and ' respectively #522

Closed
@zackw

Description

It would enhance compatibility with other regex engines if regex understood \" and \' as equivalents of " and '.

This came up for me when trying to use Rust's regexes from Python (Python's built-in re module doesn't have an equivalent of RegexSet). Python's raw string literals do recognize some backslash escapes for the purpose of deciding where the literal ends, so for instance r"<a href=\"https?://([a-z0-9.-]+)/robots.txt\">" is a single string literal. They don't convert those escapes, though, so the contents of the parsed string are the same as if you had written "<a href=\\\"https?://([a-z0-9.-]+)/robots.txt\\\">". Python's re module understands this to mean the same thing as if the quote characters were not escaped, but regex will reject the \" with a parse error.
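A minimal sketch of the raw-string behaviour described above, using only Python's standard re module. The sample HTML string and the variable names are made up for illustration.

```python
import re

# The raw literal from the description: \" does not end the literal,
# but the backslash stays in the resulting string.
raw = r"<a href=\"https?://([a-z0-9.-]+)/robots.txt\">"

# The same contents written as an ordinary (non-raw) literal.
cooked = "<a href=\\\"https?://([a-z0-9.-]+)/robots.txt\\\">"

assert raw == cooked
assert raw.count("\\") == 2  # both backslashes are still in the pattern

# Python's re treats \" as a literal ", so the pattern matches plain quotes.
html = '<a href="https://example.com/robots.txt">'  # hypothetical input
match = re.search(raw, html)
host = match.group(1)
print(host)  # example.com
```

Passing the same `raw` string unchanged to an engine that does not accept \" is where the failure shows up.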

This could also come up in any other context where a backslash escapes a quote character but is not removed from the input in the process. For instance, some parsers for Windows-style .cfg files do this.
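Until an engine accepts these escapes, one possible workaround is to drop the backslash in front of quote characters before handing the pattern over. The helper below is a hypothetical sketch, not part of any library. It assumes the pattern uses ordinary backslash escaping, and it leaves every other escape pair (including an escaped backslash) untouched.

```python
def strip_quote_escapes(pattern: str) -> str:
    """Rewrite \\" and \\' to bare quotes; keep all other escapes as-is.

    Hypothetical helper for illustration only.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in "\"'":
                out.append(nxt)        # drop the backslash, keep the quote
            else:
                out.append(ch + nxt)   # keep any other escape pair intact
            i += 2                     # consume the whole escape pair
        else:
            out.append(ch)
            i += 1
    return "".join(out)


cleaned = strip_quote_escapes(r"<a href=\"https?://([a-z0-9.-]+)/robots.txt\">")
print(cleaned)  # <a href="https?://([a-z0-9.-]+)/robots.txt">
```

Scanning in pairs matters: in a pattern such as \\" the first backslash escapes the second, so the quote that follows was never escaped and must be left alone.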
