feat: regex key support for ctl:ruleRemoveTargetById and ctl:ruleRemoveTargetByTag #3526
- 2 test cases: JSON array keys, mixpanel suffix (SecRuleUpdateTargetById parity)
- Expected to fail until regex support is implemented
- OODA baseline: Test 1 parse error, Test 2 HTTP 403 (exclusion not applied)

Made-with: Cursor

- Compile regex at config load, not per-request
- RuleRemoveTargetByIdEntry struct: literal or shared_ptr<Regex>
- Test 2 (ARGS:/mixpanel$/) passes; Test 1 blocked by parser owasp-modsecurity#2927

Made-with: Cursor

…veTargetByTag

Add regex pattern matching in the variable-key position of ctl:ruleRemoveTargetById and ctl:ruleRemoveTargetByTag, enabling exclusions like:

    ctl:ruleRemoveTargetById=932125;ARGS:/^json\.\d+\.JobDescription$/
    ctl:ruleRemoveTargetByTag=XSS;ARGS:/^json\.\d+\.JobDescription$/

JSON body processing generates argument names with dynamic array indices (json.0.Field, json.1.Field, ...). Without regex keys, operators cannot scope exclusions to specific keys without listing every possible index or disabling rules entirely.

Design:
- Regex detected by /pattern/ delimiter in COLLECTION:/pattern/
- Compiled once at config load via Utils::Regex (PCRE2/PCRE1)
- Stored as shared_ptr: zero per-request compilation
- Literal targets continue to work unchanged (no breaking change)
- Shared RuleRemoveTargetSpec struct used by both ById and ByTag
- Lexer REMOVE_RULE_TARGET_VALUE class shared by both actions

Aligns ModSecurity v3 with Coraza (corazawaf/coraza#1561).

Fixes owasp-modsecurity#3505
370f93b to 637ad9c
Known limitation:

| Instead of | Use |
|---|---|
| `\d{2,5}` | `\d\d+` or `\d+` |
| `[a-z]{3}` | `[a-z][a-z][a-z]` or `[a-z]+` |
| `.{1,10}` | `.+` |

The character class does include { and } (updated in latest push), so fixed-count quantifiers like \d{3} work fine — only the comma-containing {m,n} form is affected. This matches the same trade-off the v2 PR makes.
I believe this is acceptable for the target use case (JSON key patterns like ^json\.\d+\.FieldName$ and cookie name patterns like ^sess_[a-f0-9]+$), which don't need {m,n}.
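
To illustrate why only the comma-containing `{m,n}` form is affected, here is a minimal sketch. It assumes a naive comma splitter as a stand-in for the effect of keeping `,` out of the scanner's character class; `splitOnComma` and `exampleUsage` are hypothetical names, and `std::regex` is used in place of the project's `Utils::Regex`. It also shows that the substitutions listed above are broader than the bounded form, which is worth keeping in mind when writing exclusions.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Naive splitter: cuts an action list at every comma.
// Illustration only; the real work is done by the flex scanner.
std::vector<std::string> splitOnComma(const std::string& actions) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : actions) {
        if (c == ',') {
            parts.push_back(current);
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    parts.push_back(current);
    return parts;
}

void exampleUsage() {
    // A comma-free pattern: the list splits into exactly two actions.
    const auto ok = splitOnComma(
        R"(ctl:ruleRemoveTargetById=932125;ARGS:/^json\.\d+\.Field$/,pass)");
    assert(ok.size() == 2);

    // A {m,n} quantifier: the comma inside the pattern cuts it in two.
    const auto broken = splitOnComma(
        R"(ctl:ruleRemoveTargetById=932125;ARGS:/^id_\d{2,5}$/,pass)");
    assert(broken.size() == 3);
    assert(broken[0] == R"(ctl:ruleRemoveTargetById=932125;ARGS:/^id_\d{2)");

    // The comma-free substitute accepts everything the bounded form does,
    // plus longer digit runs.
    const std::regex bounded(R"(^id_\d{2,5}$)");
    const std::regex substitute(R"(^id_\d\d+$)");
    assert(std::regex_search("id_123", bounded));
    assert(std::regex_search("id_123", substitute));
    assert(!std::regex_search("id_1234567", bounded));
    assert(std::regex_search("id_1234567", substitute));
}
```

Calling `exampleUsage()` exercises all three cases; the assertions document the expected behaviour of this sketch, not of the ModSecurity parser itself.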



Summary
Add regex pattern matching in the variable-key position of `ctl:ruleRemoveTargetById` and `ctl:ruleRemoveTargetByTag`. This enables exclusion patterns like:

    ctl:ruleRemoveTargetById=932125;ARGS:/^json\.\d+\.JobDescription$/
    ctl:ruleRemoveTargetByTag=XSS;ARGS:/^json\.\d+\.JobDescription$/

Fixes #3505
Problem
JSON body processing generates argument names with unpredictable numeric indices (`json.0.JobDescription`, `json.1.JobDescription`, ...). Without regex key support, operators must either:

- list every possible index, or
- disable the rules entirely.

This is a common pain point for anyone running CRS with JSON/GraphQL APIs.
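
As a rough illustration of the problem, the sketch below shows one key pattern covering every array index at once. It uses `std::regex` rather than the project's `Utils::Regex`, and `excludedKeys` and `exampleUsage` are hypothetical helper names introduced only for this example.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the argument names a key pattern would cover.
std::vector<std::string> excludedKeys(const std::vector<std::string>& names,
                                      const std::regex& keyPattern) {
    std::vector<std::string> out;
    for (const auto& name : names) {
        if (std::regex_search(name, keyPattern)) {
            out.push_back(name);
        }
    }
    return out;
}

void exampleUsage() {
    // Argument names as a JSON body processor might generate them.
    const std::vector<std::string> names = {
        "json.0.JobDescription", "json.1.JobDescription",
        "json.17.JobDescription", "json.0.Title", "json.x.JobDescription"};

    // One pattern replaces an open-ended list of literal targets.
    const std::regex keyPattern(R"(^json\.\d+\.JobDescription$)");
    const auto hits = excludedKeys(names, keyPattern);

    // Only the numerically indexed JobDescription keys are covered.
    assert(hits.size() == 3);
    assert(hits[2] == "json.17.JobDescription");
}
```

With literal targets only, each of the three matching names would need its own exclusion, and any new index would need another.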
Approach
Following your guidance in the issue discussion, the regex is compiled once at config load — never recompiled per request. This directly addresses the concern about the v2 PR #3121 where regex compilation ran on every exclusion check.
How it works
1. In `init()`, the `/pattern/` delimiter is detected in the target string (e.g. `ARGS:/^json\.\d+\.Field$/`)
2. The pattern is compiled via `Utils::Regex` (PCRE2 by default, PCRE1 with `--with-pcre`) at config load time
3. It is stored as `shared_ptr<Utils::Regex>`: shared across all requests, zero per-request allocation
4. `searchAll()` runs against the short variable-key string (typically 10-40 chars)
5. Literal targets (e.g. `ARGS:password`) continue to work unchanged via the existing `==` / case-insensitive comparison

Shared design for ById and ByTag
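
The compile-once flow and the literal-or-regex target can be sketched as follows. This is a simplified stand-in, not the PR's code: `TargetSpec`, `parseTarget` and `matchesKey` are hypothetical names, `std::regex` replaces `Utils::Regex`, literal comparison is plain `==` rather than case-insensitive, and the sketch assumes the target always has the `COLLECTION:key` form.

```cpp
#include <cassert>
#include <memory>
#include <regex>
#include <string>

// Hypothetical stand-in for a shared target spec: literal key or compiled regex.
struct TargetSpec {
    std::string collection;                      // e.g. "ARGS"
    std::string literalKey;                      // used when keyRegex is null
    std::shared_ptr<const std::regex> keyRegex;  // compiled once, then shared

    bool matchesKey(const std::string& coll, const std::string& key) const {
        if (coll != collection) return false;
        if (keyRegex) return std::regex_search(key, *keyRegex);
        return key == literalKey;
    }
};

// Parses "COLLECTION:key" or "COLLECTION:/pattern/" once, at what would be
// config load time. The regex is never recompiled afterwards.
TargetSpec parseTarget(const std::string& target) {
    const auto colon = target.find(':');
    assert(colon != std::string::npos);  // sketch assumes a key part is present

    TargetSpec spec;
    spec.collection = target.substr(0, colon);
    const std::string key = target.substr(colon + 1);
    if (key.size() > 2 && key.front() == '/' && key.back() == '/') {
        spec.keyRegex = std::make_shared<const std::regex>(
            key.substr(1, key.size() - 2));
    } else {
        spec.literalKey = key;
    }
    return spec;
}

void exampleUsage() {
    const TargetSpec rx = parseTarget(R"(ARGS:/^json\.\d+\.Field$/)");
    assert(rx.keyRegex != nullptr);
    assert(rx.matchesKey("ARGS", "json.3.Field"));
    assert(!rx.matchesKey("ARGS", "json.3.Other"));
    assert(!rx.matchesKey("REQUEST_COOKIES", "json.3.Field"));

    const TargetSpec lit = parseTarget("ARGS:password");
    assert(lit.keyRegex == nullptr);
    assert(lit.matchesKey("ARGS", "password"));

    // Copying the spec shares the compiled pattern instead of recompiling it.
    const TargetSpec copy = rx;
    assert(copy.keyRegex.get() == rx.keyRegex.get());
}
```

Holding the compiled pattern behind a `shared_ptr` means per-request copies of the spec only bump a reference count, which is the property the design relies on.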
Both actions use a common `RuleRemoveTargetSpec` struct with `matchesFullName()` and `matchesKeyWithCollection()` methods, avoiding code duplication.

Lexer change
The scanner character class `REMOVE_RULE_TARGET_VALUE` (previously `REMOVE_RULE_TARGET_BY_ID_VALUE`, used only by ById) is now shared by both ById and ByTag. It includes regex metacharacters (`^ $ + ( ) | ? \`) but not comma, so chained `ctl:` actions still split correctly on `,`.

Context: ModSecurity v2 and Coraza
This PR adds regex key support to the two actions that v3 already has (ById, ByTag). The missing `ctl:ruleRemoveTargetByMsg` is a separate, larger discussion and is intentionally excluded.

Files Changed (11 files)
| File | Change |
|---|---|
| `headers/modsecurity/rule_remove_target_entry.h` | `RuleRemoveTargetSpec`, `ByIdEntry`, `ByTagEntry` structs |
| `headers/modsecurity/transaction.h` | |
| `src/actions/ctl/rule_remove_target_by_id.cc` | Detect `/pattern/`, compile regex in `init()` |
| `src/actions/ctl/rule_remove_target_by_id.h` | `shared_ptr<Regex>` member |
| `src/actions/ctl/rule_remove_target_by_tag.cc` | Detect `/pattern/`, compile regex in `init()` |
| `src/actions/ctl/rule_remove_target_by_tag.h` | `shared_ptr<Regex>` member |
| `src/parser/seclang-scanner.ll` | `REMOVE_RULE_TARGET_VALUE` shared by ById + ByTag |
| `src/rule_with_operator.cc` | `target.matchesFullName()` / `matchesKeyWithCollection()` |
| `test/test-cases/regression/issue-3505.json` | |
| `test/test-cases/regression/issue-3505-crs-ctl-byid-tag-msg.json` | `@detectSQLi` + JSON body |
| `test/test-suite.in` | |

Test Results
7 new tests, all passing:

- `ARGS:/^json\.\d+\.JobDescription$/` excludes dynamic JSON args
- `ARGS:/mixpanel$/` excludes args by suffix pattern
- `ARGS:password`: proves literal targets still work unchanged
- `@detectSQLi`
- `@detectSQLi`

Full regression suite: 5005 total, 4987 pass, 18 skip, 0 fail, 0 error.
Performance
Benchmark: 25,000 iterations × 5 trials, JSON POST with 20 ARGS keys, 3 detection rules (`@detectSQLi`, `@detectXSS`, `@rx`), 2 regex exclusions (one ById, one ByTag).

Scaling with more ARGS keys:
The overhead scales linearly with ARGS count, with no exponential blowup. At 100 keys (an extreme JSON body), the per-request cost is +0.014 ms. The cost is the `searchAll()` call on short variable-name strings against precompiled PCRE2 patterns.

Key design decisions keeping performance in check:
- Regex compiled once at config load and stored as `shared_ptr`
- `searchAll()` runs on short strings (variable names, typically 10-40 chars)
- Literal targets use the existing `==` comparison: no regression for non-regex users

Made with Cursor