Can not get data in group when using regular expression. #2336
Comments
Hi @anhkhoa14592, thank you for the report. I do see a bug in the handling of capture groups that end up with no content, e.g. the group '(id)?', where the match succeeds with 0 occurrences rather than 1. If you are looking for an immediate workaround, in your case you could turn the groups that you do not care about into non-capturing groups, i.e. add '?:' right after the opening parenthesis of each of '(asp|jserv|jw)', '(id)', and '(id|token)'. This should then let you read the content of the capture group '([^\s]+)' using tx.3. I will work on a code fix for this shortly, though.
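The workaround above can be checked outside ModSecurity. The following is a minimal sketch that uses Python's `re` module as a stand-in for the regex engine, which is an assumption, since Python is not what ModSecurity runs internally. The names `WORKAROUND` and `COOKIE` are made up for illustration; the cookie string is taken from the response in the report.

```python
import re

# Workaround pattern: the three groups that are not needed are rewritten
# as non-capturing groups, (?:...), so they no longer take a group number.
WORKAROUND = re.compile(
    r"(?i:(j?sessionid|(php)?sessid|(?:asp|jserv|jw)?session[-_]?(?:id)?"
    r"|cf(?:id|token)|sid)=([^\s]+)\;\s?)"
)

# Cookie value from the Set-Cookie header shown in the report.
COOKIE = "PHPSESSID=ea101040fa9365d3ad6e921d9e1e04da; path=/"

match = WORKAROUND.search(COOKIE)
# Only three capture groups remain; the session value is now group 3.
session_value = match.group(3) if match else None
print(session_value)
```

With only three capture groups left, the session value sits in group 3, which corresponds to tx.3 in the rule.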
Hi @martinhsv, thank you for your support. I hope to receive the fix for this bug :).
Previously, searchAll would stop searching when it encountered an empty matching group in any position. This means that, for example, the regular expression "(a)(b?)(c)" would match the string "ac", but the resulting group list would be ["ac", "a"]. After this change, the resulting list for that regular expression becomes ["ac", "a", "", "c"], as it should have been.
Additionally, this changes the behaviour for multiple matches. For example, when "aaa00bbb" is matched by "[a-z]*", previously only "aaa" would be returned. Now the match list is ["aaa", "", "", "bbb", ""]. The old behaviour was confusing and almost certainly a bug. The new behaviour is the same as in Python's re.findall. For reference, though, Go does it somewhat differently: empty matches at the end of non-empty matches are ignored, so in Go the above example yields ["aaa", "", "bbb"] instead.
This is the root cause of issue owasp-modsecurity#2336, which has already been fixed by replacing the searchAll call there with a new function.
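The two reference results quoted in the commit message can be reproduced with Python's `re` module. This is only an illustration of the expected lists, not of the searchAll implementation itself; the variable names are hypothetical.

```python
import re

# Case 1: an empty group in the middle of a match.
# Group 2, (b?), matches the empty string, and group 3 must still be reported.
m = re.fullmatch(r"(a)(b?)(c)", "ac")
group_list = [m.group(0), *m.groups()]
print(group_list)

# Case 2: multiple matches, some of them empty.
# re.findall keeps the empty matches found between and after the letter runs.
all_matches = re.findall(r"[a-z]*", "aaa00bbb")
print(all_matches)
```

The first list has four entries because the empty second group does not end the enumeration. The second list includes the empty matches at positions 3, 4 and 8 of the input.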
Describe the bug
I tried to extract the value of PHPSESSID with a regular expression (this pattern is from Web Application Defender's Cookbook: Battling Hackers and Protecting Users):
(?i:(j?sessionid|(php)?sessid|(asp|jserv|jw)?session[-_]?(id)?|cf(id|token)|sid)=([^\s]+)\;\s?)
But I cannot get the value from group 6 (TX:6). I tried the pattern in other text editors and everything works fine, but I don't know why it does not work here. Maybe I am missing something?
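For comparison, here is a small sketch of how a reference regex engine numbers the groups of the original pattern. It uses Python's `re` module, which is an assumption made for illustration and not the engine ModSecurity uses; `ORIGINAL` and `COOKIE` are hypothetical names.

```python
import re

# The pattern exactly as given in the report (six capture groups).
ORIGINAL = re.compile(
    r"(?i:(j?sessionid|(php)?sessid|(asp|jserv|jw)?session[-_]?(id)?"
    r"|cf(id|token)|sid)=([^\s]+)\;\s?)"
)

COOKIE = "PHPSESSID=ea101040fa9365d3ad6e921d9e1e04da; path=/"

match = ORIGINAL.search(COOKIE)
groups = match.groups() if match else ()

# Groups 3, 4 and 5 belong to alternatives that were not taken, so they
# carry no content, yet group 6 still holds the session value.
for number, value in enumerate(groups, start=1):
    print(number, repr(value))
```

Groups 3 to 5 are reported as unset here, which is the situation that caused the enumeration to stop early in the behaviour described in this issue.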
Logs and dumps
SecRule
Output of:
Response
HTTP/1.1 200
Server: nginx/1.18.0
Date: Thu, 11 Jun 2020 11:43:06 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
X-Powered-By: PHP/7.4.6
Set-Cookie: PHPSESSID=ea101040fa9365d3ad6e921d9e1e04da; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
AuditLog
Expected Behavior
Based on the results from other text editors, the audit log should contain the value of PHPSESSID, as below: