
perf: pre-compute lowercase keys to avoid redundant ToLower calls #1531

Open
jptosso wants to merge 6 commits into `main` from `perf/precompute-lower-key`
Conversation

@jptosso
Member

@jptosso jptosso commented Mar 6, 2026

Summary

Eliminates redundant `strings.ToLower` calls from the hot path in `GetField` exception filtering:

  • Pre-computed `lowerKey` in `keyValue` struct — compute `strings.ToLower(key)` once at `Add`/`Set`/`SetIndex` time, surface via `MatchData.LowerKey_`
  • Pre-computed `lowerKeyStr` on `ruleVariableException` — avoids per-iteration `strings.ToLower(ex.KeyStr)` in the exception loop
  • `AddWithLowerKey` method — `AddRequestHeader`/`AddResponseHeader` already compute the lowercase key, now pass it directly to avoid Map.Add doing it again
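
The idea behind the first two bullets can be illustrated with a minimal, self-contained sketch. The types and helpers below (`keyValue`, `exception`, `newEntry`, `newException`, `filter`) are simplified, hypothetical stand-ins and not the actual Coraza code; they only show the lowercase form being computed once at construction time so the filtering loop compares plain strings.

```go
package main

import (
	"fmt"
	"strings"
)

// keyValue is a hypothetical stand-in for a collection entry.
type keyValue struct {
	key      string // original casing, as supplied by the caller
	value    string
	lowerKey string // computed once, at insertion time
}

// exception is a hypothetical stand-in for ruleVariableException.
type exception struct {
	keyStr      string
	lowerKeyStr string // computed once, at construction time
}

func newEntry(key, value string) keyValue {
	return keyValue{key: key, value: value, lowerKey: strings.ToLower(key)}
}

func newException(keyStr string) exception {
	return exception{keyStr: keyStr, lowerKeyStr: strings.ToLower(keyStr)}
}

// filter drops entries whose key matches an exception. It compares only the
// pre-lowercased fields, so the loop itself never calls strings.ToLower.
func filter(entries []keyValue, exceptions []exception) []keyValue {
	kept := make([]keyValue, 0, len(entries))
	for _, e := range entries {
		excluded := false
		for _, ex := range exceptions {
			if e.lowerKey == ex.lowerKeyStr {
				excluded = true
				break
			}
		}
		if !excluded {
			kept = append(kept, e)
		}
	}
	return kept
}

func main() {
	entries := []keyValue{newEntry("Content-Type", "text/html"), newEntry("X-Debug", "1")}
	exceptions := []exception{newException("x-debug")}
	for _, e := range filter(entries, exceptions) {
		fmt.Println(e.key, e.value) // prints "Content-Type text/html"
	}
}
```

The trade-off is one extra string field per entry in exchange for removing the per-comparison lowercase conversion from the nested loop.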

Benchmark

```
main:
BenchmarkTxGetField-10 5461011 215.3 ns/op 400 B/op 8 allocs/op
BenchmarkTxGetFieldWithExceptions-10 2476045 467.5 ns/op 704 B/op 14 allocs/op

branch:
BenchmarkTxGetField-10 5714665 211.3 ns/op 448 B/op 8 allocs/op
BenchmarkTxGetFieldWithExceptions-10 5073336 237.9 ns/op 720 B/op 9 allocs/op
```

| Benchmark | Speed | Allocs |
|---|---|---|
| GetField (no exceptions) | ~2% faster | same |
| GetFieldWithExceptions | -49% (2x faster) | 14 → 9 (-36%) |

Test plan

  • `go test ./internal/collections/ ./internal/corazawaf/ ./internal/corazarules/ -count=1` passes
  • `TestAddWithLowerKey` verifies original case key preservation
  • `TestFindAllPopulatesLowerKey` verifies LowerKey_ population
  • Safe type assertion with comma-ok for non-`*corazarules.MatchData` implementations
  • `AddWithLowerKey` handles case-sensitive maps correctly
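
The comma-ok assertion mentioned in the test plan could look roughly like the sketch below. The `matchData` interface and the `fastMatch`/`otherMatch` types are hypothetical and much smaller than the real ones; the point is only that an implementation without a precomputed key falls back to `strings.ToLower` instead of panicking.

```go
package main

import (
	"fmt"
	"strings"
)

// matchData is a hypothetical minimal interface; the real one has more methods.
type matchData interface {
	Key() string
}

// fastMatch mimics a concrete type that carries a precomputed lowercase key.
type fastMatch struct {
	key      string
	lowerKey string
}

func (m *fastMatch) Key() string { return m.key }

// otherMatch mimics some other implementation without the precomputed field.
type otherMatch struct{ key string }

func (m *otherMatch) Key() string { return m.key }

// lowerKeyOf uses the comma-ok form of the type assertion, so an unexpected
// implementation takes the slow path rather than causing a panic.
func lowerKeyOf(md matchData) string {
	if fm, ok := md.(*fastMatch); ok {
		return fm.lowerKey
	}
	return strings.ToLower(md.Key())
}

func main() {
	fmt.Println(lowerKeyOf(&fastMatch{key: "Content-Type", lowerKey: "content-type"})) // content-type
	fmt.Println(lowerKeyOf(&otherMatch{key: "X-Debug"}))                               // x-debug
}
```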

@jptosso
Member Author

jptosso commented Mar 6, 2026

@copilot comment on this PR

Contributor

Copilot AI commented Mar 6, 2026

@jptosso I've opened a new pull request, #1533, to work on those changes. Once the pull request is ready, I'll request review from you.

@codecov

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 88.09524% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.67%. Comparing base (febb510) to head (d53b2c4).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/collections/map.go | 88.00% | 2 Missing and 1 partial ⚠️ |
| internal/corazawaf/transaction.go | 81.81% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##             main    #1531      +/-   ##
==========================================
- Coverage   86.68%   86.67%   -0.02%
==========================================
  Files         179      179
  Lines        8790     8812      +22
==========================================
+ Hits         7620     7638      +18
- Misses        911      913       +2
- Partials      259      261       +2
```
| Flag | Coverage Δ |
|---|---|
| coraza.no_memoize | 86.75% <88.09%> (-0.02%) ⬇️ |
| coraza.rule.case_sensitive_args_keys | 86.64% <88.09%> (-0.02%) ⬇️ |
| coraza.rule.mandatory_rule_id_check | 86.66% <88.09%> (-0.02%) ⬇️ |
| coraza.rule.multiphase_evaluation | 86.40% <88.09%> (-0.02%) ⬇️ |
| coraza.rule.no_regex_multiline | 86.63% <88.09%> (-0.02%) ⬇️ |
| coraza.rule.rx_prefilter | 86.67% <88.09%> (-0.02%) ⬇️ |
| default | 86.67% <88.09%> (-0.02%) ⬇️ |
| examples+ | 17.36% <59.52%> (+0.09%) ⬆️ |
| examples+coraza.no_memoize | 84.60% <85.71%> (-0.01%) ⬇️ |
| examples+coraza.rule.case_sensitive_args_keys | 84.60% <85.71%> (-0.01%) ⬇️ |
| examples+coraza.rule.mandatory_rule_id_check | 84.71% <85.71%> (-0.01%) ⬇️ |
| examples+coraza.rule.multiphase_evaluation | 86.40% <88.09%> (-0.02%) ⬇️ |
| examples+coraza.rule.no_regex_multiline | 84.53% <85.71%> (-0.01%) ⬇️ |
| examples+coraza.rule.rx_prefilter | 84.68% <85.71%> (-0.01%) ⬇️ |
| examples+no_fs_access | 84.05% <85.71%> (-0.01%) ⬇️ |
| ftw | 86.67% <88.09%> (-0.02%) ⬇️ |
| no_fs_access | 86.02% <88.09%> (-0.02%) ⬇️ |
| tinygo | 86.67% <88.09%> (-0.02%) ⬇️ |


Contributor

Copilot AI left a comment


Pull request overview

Optimizes transaction variable exception filtering by precomputing lowercase keys to avoid repeated strings.ToLower calls in hot paths.

Changes:

  • Added precomputed lowercase key fields (MatchData.LowerKey_, ruleVariableException.lowerKeyStr) and populated them during map lookups / exception creation.
  • Introduced Map.AddWithLowerKey and updated header insertion to reuse already-lowercased keys.
  • Updated GetField exception filtering to use precomputed lowercase values.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| internal/corazawaf/transaction_test.go | Updates test data to include lowerKeyStr for exceptions. |
| internal/corazawaf/transaction.go | Uses precomputed lowercase values and AddWithLowerKey in header paths and exception filtering. |
| internal/corazawaf/rule.go | Adds/stores lowerKeyStr on exceptions at creation time. |
| internal/corazarules/rule_match.go | Adds LowerKey_ to match data to carry precomputed lowercase keys. |
| internal/collections/named.go | Populates LowerKey_ from stored entries when returning matches. |
| internal/collections/map.go | Stores lowerKey per entry and adds AddWithLowerKey; wires LowerKey_ into returned matches. |

@jptosso jptosso marked this pull request as ready for review March 6, 2026 13:19
@jptosso jptosso requested a review from a team as a code owner March 6, 2026 13:19
@jptosso jptosso added this to the v3.4 release milestone Mar 6, 2026
jptosso and others added 5 commits March 12, 2026 01:07
Store pre-computed lowercase keys in keyValue.lowerKey and surface
them via MatchData.LowerKey_. Pre-compute lowerKeyStr on
ruleVariableException at construction time. Add AddWithLowerKey to
Map for callers that already have the lowercase key.

This eliminates per-match strings.ToLower calls in GetField's
exception filtering loop and per-exception ToLower calls, both of
which are on the hot path during rule evaluation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jptosso jptosso force-pushed the perf/precompute-lower-key branch from 96d11b9 to e022a78 Compare March 12, 2026 00:09
@fzipi
Member

fzipi commented Apr 1, 2026

@copilot Can you solve conflicts in this PR?

…-key

# Conflicts:
#	internal/corazawaf/rule.go
#	internal/corazawaf/transaction_test.go

Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
Contributor

Copilot AI commented Apr 1, 2026

> @copilot Can you solve conflicts in this PR?

Resolved in d53b2c4. Two conflicts in rule.go and transaction_test.go:

  • rule.go: preserved lowerKeyStr optimization and also included c.KeyRx (which our branch had accidentally dropped)
  • transaction_test.go: kept BenchmarkTxGetFieldWithExceptions from our branch and added makeTransactionWithJSONArgs, BenchmarkTxGetFieldWithShortRegexException, and BenchmarkTxGetFieldWithMediumRegexException from main

Copilot finished work on behalf of fzipi April 1, 2026 11:50
Copilot AI requested a review from fzipi April 1, 2026 11:50
@fzipi fzipi requested a review from Copilot April 1, 2026 12:04
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Comment on lines 149 to +157
```diff
 // Add adds a new key-value pair to the map.
 func (c *Map) Add(key string, value string) {
-	aVal := keyValue{key: key, value: value}
+	lk := key
 	if !c.isCaseSensitive {
-		key = strings.ToLower(key)
+		lk = strings.ToLower(key)
 	}
+	aVal := keyValue{key: key, value: value, lowerKey: lk}
+	c.data[lk] = append(c.data[lk], aVal)
 }
```
Copilot AI Apr 1, 2026

keyValue.lowerKey is currently set to the storage key (lk), which for case-sensitive maps equals the original key casing. Transaction.GetField now trusts MatchData.LowerKey_ to already be lowercase, so in case-sensitive collections (e.g. ARGS_* with build tag coraza.rule.case_sensitive_args_keys) exception filtering becomes case-sensitive and can fail to match. Consider always storing a lowercase form (e.g. strings.ToLower(key)) in keyValue.lowerKey, while using a separate storageKey (original key vs lowercased key) for the map index.
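
One way to read this suggestion is sketched below. It is a simplified, hypothetical `Map` (the constructor signature and field layout are assumptions, not the real `internal/collections` code) in which `lowerKey` is always lowercase while a separate `storageKey` decides how the entry is indexed.

```go
package main

import (
	"fmt"
	"strings"
)

// keyValue and Map are simplified, hypothetical versions of the real types.
type keyValue struct {
	key      string
	value    string
	lowerKey string
}

type Map struct {
	isCaseSensitive bool
	data            map[string][]keyValue
}

// NewMap here takes only a case-sensitivity flag; this signature is an
// assumption made for the sketch.
func NewMap(caseSensitive bool) *Map {
	return &Map{isCaseSensitive: caseSensitive, data: map[string][]keyValue{}}
}

// Add always stores a lowercase form in lowerKey. Only the index into
// c.data depends on whether the map is case-sensitive.
func (c *Map) Add(key, value string) {
	lowerKey := strings.ToLower(key)
	storageKey := lowerKey
	if c.isCaseSensitive {
		storageKey = key
	}
	c.data[storageKey] = append(c.data[storageKey], keyValue{key: key, value: value, lowerKey: lowerKey})
}

func main() {
	m := NewMap(true) // case-sensitive map
	m.Add("Foo", "1")
	fmt.Println(m.data["Foo"][0].lowerKey) // foo
	fmt.Println(len(m.data["foo"]))        // 0: entry is indexed under the original casing
}
```

With this shape, a case-sensitive map pays for one `strings.ToLower` call per insertion that it did not pay before, in exchange for exception filtering that can rely on `lowerKey` being lowercase in both modes.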

Comment on lines 171 to 207
```diff
 // Sets the value of a key with the array of strings passed. If the key already exists, it will be overwritten.
 func (c *Map) Set(key string, values []string) {
 	originalKey := key
+	lk := key
 	if !c.isCaseSensitive {
-		key = strings.ToLower(key)
+		lk = strings.ToLower(key)
 	}
-	dataSlice, exists := c.data[key]
+	dataSlice, exists := c.data[lk]
 	if !exists || cap(dataSlice) < len(values) {
 		dataSlice = make([]keyValue, len(values))
 	} else {
 		dataSlice = dataSlice[:len(values)] // Reuse existing slice with the same length
 	}
 	for i, v := range values {
-		dataSlice[i] = keyValue{key: originalKey, value: v}
+		dataSlice[i] = keyValue{key: originalKey, value: v, lowerKey: lk}
 	}
-	c.data[key] = dataSlice
+	c.data[lk] = dataSlice
 }

 // SetIndex sets the value of a key at the specified index. If the key already exists, it will be overwritten.
 func (c *Map) SetIndex(key string, index int, value string) {
 	originalKey := key
+	lk := key
 	if !c.isCaseSensitive {
-		key = strings.ToLower(key)
+		lk = strings.ToLower(key)
 	}
-	values := c.data[key]
-	av := keyValue{key: originalKey, value: value}
+	values := c.data[lk]
+	av := keyValue{key: originalKey, value: value, lowerKey: lk}

 	switch {
 	case len(values) == 0:
-		c.data[key] = []keyValue{av}
+		c.data[lk] = []keyValue{av}
 	case len(values) <= index:
-		c.data[key] = append(c.data[key], av)
+		c.data[lk] = append(c.data[lk], av)
 	default:
-		c.data[key][index] = av
+		c.data[lk][index] = av
 	}
```
Copilot AI Apr 1, 2026

Same issue as Add: Set/SetIndex assign keyValue.lowerKey to lk, which is not lowercased for case-sensitive maps. Since MatchData.LowerKey_ is used as the pre-lowercased key in GetField exception filtering, this can change behavior under case-sensitive collections. Suggest computing lowerKey := strings.ToLower(originalKey) unconditionally for the stored keyValue, and using storageKey (lowerKey vs originalKey) only for indexing into c.data.

```go
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		tx.GetField(rvp)
	}
```
Copilot AI Apr 1, 2026

This benchmark includes tx.Close() while the benchmark timer is still running, which can skew ns/op and alloc measurements for GetFieldWithExceptions. Consider calling b.StopTimer() before tx.Close() (and optionally restarting if needed) so the benchmark measures only GetField.
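
A minimal sketch of the timer pattern being suggested, using a hypothetical `resource` type in place of the transaction. Setup is excluded with `b.ResetTimer()` and teardown with `b.StopTimer()`, so only the loop body is measured. `testing.Benchmark` is used here only so the example runs as an ordinary program outside `go test`.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// resource is a hypothetical stand-in for a transaction that needs cleanup.
type resource struct{ closed bool }

func (r *resource) work() string { return strings.ToLower("Content-Type") }
func (r *resource) Close()       { r.closed = true }

// sink keeps the compiler from discarding the result of work().
var sink string

func benchmarkWork(b *testing.B) {
	r := &resource{}
	b.ResetTimer() // exclude setup from the measurement
	for i := 0; i < b.N; i++ {
		sink = r.work()
	}
	b.StopTimer() // exclude teardown from the measurement
	r.Close()
}

func main() {
	res := testing.Benchmark(benchmarkWork)
	fmt.Println("iterations:", res.N, "ns/op:", res.NsPerOp())
}
```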

Suggested change

```diff
-	}
+	}
+	b.StopTimer()
```

Comment on lines +203 to +220
```go
func TestFindAllPopulatesLowerKey(t *testing.T) {
	m := NewMap(variables.ArgsGet)
	m.Add("Content-Type", "text/html")

	results := m.FindAll()
	if len(results) != 1 {
		t.Fatalf("expected 1 result, got %d", len(results))
	}

	// Access the MatchData to check LowerKey_ is populated
	md, ok := results[0].(*corazarules.MatchData)
	if !ok {
		t.Fatal("expected *corazarules.MatchData")
	}
	if md.LowerKey_ != "content-type" {
		t.Errorf("expected LowerKey_ 'content-type', got %q", md.LowerKey_)
	}
}
```
Copilot AI Apr 1, 2026

The new LowerKey_ behavior is only asserted for the default (case-insensitive) Map. Since ARGS collections can be case-sensitive under build tag coraza.rule.case_sensitive_args_keys, it would be useful to add a test that FindAll/FindString still populate LowerKey_ with the lowercase form even when the map itself is case-sensitive (to prevent regressions in exception filtering).

Copilot generated this review using guidance from repository custom instructions.
@fzipi
Member

fzipi commented Apr 3, 2026

@jptosso Can you follow up on Copilot's comments?
