Description
It would enhance compatibility with other regex engines if `regex` understood `\"` and `\'` as equivalents of `"` and `'`.
This came up for me when trying to use Rust's regexes from Python (Python's built-in regex module doesn't have an equivalent of `RegexSet`). Python's raw string literals do recognize some backslash escapes for the purpose of deciding where the literal ends, so for instance `r"<a href=\"https?://([a-z0-9.-]+)/robots.txt\">"` is a single string literal, but they don't convert those escapes, so the contents of the parsed string are the same as if you had written `"<a href=\\\"https?://([a-z0-9.-]+)/robots.txt\\\">"`. Python's built-in regex module understands this to mean the same thing as if the quote characters were not escaped, but `regex` will barf on the `\"`.
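The Python side of this can be checked with a short script. This is only an illustration of the behaviour described above: the variable names and the sample HTML string are made up for the example, and it exercises Python's `re` module only, not the Rust crate.

```python
import re

# The raw literal keeps the backslashes: \" does not end the literal,
# but the backslash is not removed from the string contents either.
raw = r"<a href=\"https?://([a-z0-9.-]+)/robots.txt\">"

# The same contents written as an ordinary (non-raw) literal.
cooked = "<a href=\\\"https?://([a-z0-9.-]+)/robots.txt\\\">"

assert raw == cooked
assert '\\"' in raw  # a literal backslash followed by a double quote

# Python's re treats \" as a plain double quote, so this matches.
sample = '<a href="https://example.com/robots.txt">'  # made-up input
match = re.search(raw, sample)
assert match is not None
print(match.group(1))  # example.com
```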
This could also come up in any other context where a backslash escapes a quote character but is not removed from the input in the process. For instance, some parsers for Windows-style .cfg files do this.
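Until something like this is supported, a caller can strip the redundant backslashes before handing the pattern to the engine. The following is a rough sketch of that idea, not anything provided by `regex` or by Python; the helper name `unescape_quotes` is hypothetical. It walks the pattern one escape pair at a time so that an escaped backslash followed by a quote is left alone.

```python
import re

# Matches one backslash plus the character it escapes. Scanning pair by
# pair means an escaped backslash (\\) is consumed as a unit and cannot be
# mistaken for the start of \" or \'.
_ESCAPE_PAIR = re.compile(r"\\(.)", re.DOTALL)


def unescape_quotes(pattern: str) -> str:
    """Hypothetical helper: rewrite \\" and \\' to bare quote characters."""

    def fix(m: re.Match) -> str:
        ch = m.group(1)
        # Drop the backslash only in front of a quote; keep every other
        # escape (\d, \., \\, ...) exactly as written.
        return ch if ch in "\"'" else m.group(0)

    return _ESCAPE_PAIR.sub(fix, pattern)


cleaned = unescape_quotes(r"<a href=\"https?://([a-z0-9.-]+)/robots.txt\">")
assert cleaned == '<a href="https?://([a-z0-9.-]+)/robots.txt">'
```

This only works around the problem on the caller's side; it does not change what the engine itself accepts.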