Regex: Only escape characters when necessary #1130
Comments
The general advice is reasonable. Personally, though, I think in many cases it makes regexes harder to read, because you have to know the content to be sure of the meaning. Also, ModSecurity sometimes plays by slightly different rules than normal regex engines. In the situation you describe, however, I think it would be fine advice. If you would like to do minimal work, I'd recommend a PR adding this to our contributing guidelines (https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.2/dev/CONTRIBUTING.md). If you're feeling more helpful :-P, we'd love PRs that actually address this in the rules. Let me know if there is more feedback by reopening the issue.
Thanks for the feedback!!!
Feedback was given in person; for the record: when evaluating the RE2 regex library for use with ModSecurity/CRS, it was noticed that 16 regexes used in CRS are rejected by RE2, five of them for the trivial reason of excessive escaping. So to pave the way for replacing PCRE (which has well-known poor worst-case behavior) with RE2, it would be good to clean these rules up.
Are those 16 rules the only thing stopping us from adopting RE2? And should we adopt it?
In terms of regex compilation, yes. Full evaluation would require extensive testing, of course. RE2 should be preferred because of its predictable worst-case behavior; the original paper is here: https://swtch.com/~rsc/regexp/regexp1.html
These are the issues:
1 instance of \Z (at end of text, or before newline at end of text): NOT SUPPORTED
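To make the worst-case point concrete, here is a minimal sketch. It assumes Python's built-in `re` module as a stand-in for a backtracking engine such as PCRE; the pattern and the helper `time_failed_match` are illustrative and not taken from any CRS rule.

```python
import re
import time

# Classic nested-quantifier pattern; illustrative only, not taken from CRS.
PATHOLOGICAL = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return (seconds, match) for matching n 'a's followed by a single 'b'."""
    subject = "a" * n + "b"  # the trailing 'b' forces the overall match to fail
    start = time.perf_counter()
    match = PATHOLOGICAL.match(subject)
    return time.perf_counter() - start, match


# Keep n small: on a backtracking engine the time tends to grow very quickly.
for n in (10, 14, 18):
    seconds, match = time_failed_match(n)
    print(f"n={n:2d}  matched={match is not None}  {seconds:.4f}s")
```

On a backtracking engine the elapsed time for this input typically grows roughly exponentially with n, whereas an automaton-based engine such as RE2 is designed to stay linear in the input length.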
The following rules
942432 942430 942431 942421 942420
seem to escape all characters inside the character class as a matter of course. Most of the backslashes are not needed and are indeed prone to cause problems. Inside character classes, only the caret ^, the hyphen -, the closing bracket ] and the backslash itself are metacharacters. We should heed the advice in http://www.regexguru.com/2008/04/escape-characters-only-when-necessary/, so the regex should be (double quote escaped because it's inside a string, ^ not escaped because it's not at the start of the range):
with N = 2, 3, 6, 8 or 12, depending on the rule.
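As a quick sanity check of the claim about character classes, the sketch below uses Python's `re` module and a made-up set of punctuation characters (not the actual class from the rules above) to compare an over-escaped class with its minimal form. The helper `same_matches` is hypothetical.

```python
import re

# Hypothetical character sets for illustration; NOT the actual CRS class.
OVER_ESCAPED = r"[\!\@\#\$\%\^\&\*\(\)\{\}\;\:\,\.\<\>\?]"
MINIMAL = r"[!@#$%^&*(){};:,.<>?]"  # '^' is literal because it is not first


def same_matches(pattern_a, pattern_b):
    """True if both single-character patterns accept the same ASCII characters."""
    a = re.compile(pattern_a)
    b = re.compile(pattern_b)
    return all(
        (a.fullmatch(ch) is None) == (b.fullmatch(ch) is None)
        for ch in map(chr, range(128))
    )


print(same_matches(OVER_ESCAPED, MINIMAL))

# The hyphen is one of the few characters where the backslash does matter:
print(re.fullmatch(r"[a\-z]", "m") is None)    # literal 'a', '-', 'z' only
print(re.fullmatch(r"[a-z]", "m") is not None)  # range from 'a' to 'z'
```

This only shows equivalence under Python's `re`; ModSecurity's own handling of the rule string (for example the quoting of `"`) would still need to be checked separately.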