fix(store): use RWMutex #4808
Conversation
The alert store workload is read-heavy, so use `RWMutex` to allow reads to proceed in parallel while writes remain exclusive. A sketch of the switch is shown after the analysis below.

Store usage analysis:

provider/mem/mem.go:
- Reads: `List()` x 4 (`Subscribe`, `SlurpAndSubscribe`, `GetPending`, `countByState`), `Get()` x 2
- Writes: `Set()` x 1, `GC()` via `gcLoop`

dispatch/dispatch.go (aggrGroup):
- Reads: `List()` x 2, `Get()` x 1, `Empty()` x 1
- Writes: `Set()` x 1, `DeleteIfNotModified()` x 1

inhibit/inhibit.go:
- Reads: `Get()` x 2
- Writes: `Set()` x 1

Signed-off-by: Siavash Safi <[email protected]>
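For context, a minimal sketch of the pattern this PR applies (the type, field, and error names here are simplified placeholders, not the actual `store.Alerts` code): read-only methods take the shared lock, the write path keeps the exclusive lock.

```go
// Minimal sketch of the Mutex -> RWMutex switch (assumed shape; the real
// store.Alerts in this repo has more fields, callbacks, and methods).
package store

import (
	"errors"
	"sync"

	"github.com/prometheus/common/model"

	"github.com/prometheus/alertmanager/types"
)

var errNotFound = errors.New("alert not found")

type alertStore struct {
	mtx    sync.RWMutex // previously sync.Mutex
	alerts map[model.Fingerprint]*types.Alert
}

// Get is read-only: RLock lets many readers run concurrently.
func (s *alertStore) Get(fp model.Fingerprint) (*types.Alert, error) {
	s.mtx.RLock()
	defer s.mtx.RUnlock()
	alert, ok := s.alerts[fp]
	if !ok {
		return nil, errNotFound
	}
	return alert, nil
}

// List is also read-only, but copies under RLock so callers never see
// the map mutate underneath them.
func (s *alertStore) List() []*types.Alert {
	s.mtx.RLock()
	defer s.mtx.RUnlock()
	res := make([]*types.Alert, 0, len(s.alerts))
	for _, alert := range s.alerts {
		res = append(res, alert)
	}
	return res
}

// Set mutates the map and therefore still takes the exclusive lock.
func (s *alertStore) Set(alert *types.Alert) error {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	s.alerts[alert.Fingerprint()] = alert
	return nil
}
```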
Spaceman1701 left a comment:
My intuition is that this will improve performance, but I'm not sure. RWMutex can be slower when there's a lot of write contention, which could happen in the alert ingestion flow.
Do you have any benchmarks or profiles of this change?
Generated some benchmarks:

goos: darwin
goarch: arm64
pkg: github.com/prometheus/alertmanager/store
cpu: Apple M3 Pro
│ mutex.txt │ rwmutex.txt │
│ sec/op │ sec/op vs base │
Get-12 5.865n ± 1% 5.565n ± 1% -5.12% (p=0.001 n=7)
Set-12 102.2n ± 2% 103.6n ± 1% +1.37% (p=0.026 n=7)
List/100_alerts-12 172.0n ± 3% 169.6n ± 4% ~ (p=0.090 n=7)
List/1000_alerts-12 174.2n ± 5% 170.5n ± 5% ~ (p=0.710 n=7)
List/10000_alerts-12 173.1n ± 7% 172.6n ± 5% ~ (p=0.557 n=7)
ConcurrentReads-12 76.64n ± 1% 130.30n ± 40% +70.02% (p=0.001 n=7)
ConcurrentWrites-12 258.9n ± 4% 251.6n ± 10% -2.82% (p=0.026 n=7)
ConcurrentList/100_alerts-12 286.3n ± 1% 146.0n ± 27% -49.00% (p=0.001 n=7)
ConcurrentList/1000_alerts-12 286.0n ± 1% 199.4n ± 28% -30.28% (p=0.001 n=7)
MixedReadWrite/50pct_reads-12 169.6n ± 2% 147.9n ± 3% -12.79% (p=0.001 n=7)
MixedReadWrite/90pct_reads-12 92.64n ± 1% 57.49n ± 2% -37.94% (p=0.001 n=7)
MixedReadWrite/99pct_reads-12 78.78n ± 1% 37.83n ± 11% -51.98% (p=0.001 n=7)
geomean 116.5n 98.97n -15.05%

Interestingly, concurrent reads are 70% slower on a read-only workload.

Benchmark code:

// Benchmarks for comparing Mutex vs RWMutex performance.
package store

import (
	"fmt"
	"testing"
	"time"

	"github.com/prometheus/common/model"

	"github.com/prometheus/alertmanager/types"
)
func newBenchmarkAlert(i int) *types.Alert {
return &types.Alert{
Alert: model.Alert{
Labels: model.LabelSet{
"alertname": model.LabelValue("test"),
"instance": model.LabelValue(string(rune('a' + i%26))),
},
},
UpdatedAt: time.Now(),
}
}
// BenchmarkGet measures single-threaded Get performance.
func BenchmarkGet(b *testing.B) {
s := NewAlerts()
// Pre-populate with alerts.
for i := range 1000 {
_ = s.Set(newBenchmarkAlert(i))
}
alert := newBenchmarkAlert(500)
fp := alert.Fingerprint()
b.ResetTimer()
for range b.N {
_, _ = s.Get(fp)
}
}
// BenchmarkSet measures single-threaded Set performance.
func BenchmarkSet(b *testing.B) {
s := NewAlerts()
alert := newBenchmarkAlert(0)
b.ResetTimer()
for range b.N {
_ = s.Set(alert)
}
}
// BenchmarkList measures single-threaded List performance.
func BenchmarkList(b *testing.B) {
for _, size := range []int{100, 1000, 10000} {
b.Run(fmt.Sprintf("%d_alerts", size), func(b *testing.B) {
s := NewAlerts()
for i := range size {
_ = s.Set(newBenchmarkAlert(i))
}
b.ResetTimer()
for range b.N {
_ = s.List()
}
})
}
}
// BenchmarkConcurrentReads measures concurrent read performance.
// This is where RWMutex should outperform Mutex.
func BenchmarkConcurrentReads(b *testing.B) {
s := NewAlerts()
for i := range 1000 {
_ = s.Set(newBenchmarkAlert(i))
}
alert := newBenchmarkAlert(500)
fp := alert.Fingerprint()
b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
for pb.Next() {
_, _ = s.Get(fp)
}
})
}
// BenchmarkConcurrentWrites measures concurrent write performance.
func BenchmarkConcurrentWrites(b *testing.B) {
s := NewAlerts()
alerts := make([]*types.Alert, 100)
for i := range alerts {
alerts[i] = newBenchmarkAlert(i)
}
b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
i := 0
for pb.Next() {
_ = s.Set(alerts[i%len(alerts)])
i++
}
})
}
// BenchmarkConcurrentList measures concurrent List performance.
func BenchmarkConcurrentList(b *testing.B) {
for _, size := range []int{100, 1000} {
b.Run(fmt.Sprintf("%d_alerts", size), func(b *testing.B) {
s := NewAlerts()
for i := range size {
_ = s.Set(newBenchmarkAlert(i))
}
b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
for pb.Next() {
_ = s.List()
}
})
})
}
}
// BenchmarkMixedReadWrite simulates realistic workloads with different read/write ratios.
func BenchmarkMixedReadWrite(b *testing.B) {
for _, readPct := range []int{50, 90, 99} {
b.Run(fmt.Sprintf("%dpct_reads", readPct), func(b *testing.B) {
s := NewAlerts()
for i := range 1000 {
_ = s.Set(newBenchmarkAlert(i))
}
alert := newBenchmarkAlert(500)
fp := alert.Fingerprint()
b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
i := 0
for pb.Next() {
if i%100 < readPct {
_, _ = s.Get(fp)
} else {
_ = s.Set(alert)
}
i++
}
})
})
}
}
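A plausible explanation for the pure-read regression (an assumption, not confirmed by profiles): every `RLock`/`RUnlock` pair performs atomic updates on the `RWMutex`'s shared reader counter, so with many reader goroutines and no writers, all cores contend on the same cache line, whereas an uncontended `Mutex` handoff stays cheaper. A standalone sketch that isolates just the lock operations (hypothetical benchmarks, not part of this PR):

```go
// Standalone micro-benchmarks isolating lock acquisition cost,
// independent of the alert store.
package store

import (
	"sync"
	"testing"
)

// BenchmarkMutexLock measures parallel Lock/Unlock on a plain Mutex.
func BenchmarkMutexLock(b *testing.B) {
	var (
		mu sync.Mutex
		n  int
	)
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			mu.Lock()
			n++ // trivial critical section so the lock isn't optimized away
			mu.Unlock()
		}
	})
	_ = n
}

// BenchmarkRWMutexRLock measures parallel RLock/RUnlock on an RWMutex.
// Each pair still does atomic updates to the shared reader counter, so
// all cores bounce the same cache line even though no writer appears.
func BenchmarkRWMutexRLock(b *testing.B) {
	var (
		mu sync.RWMutex
		n  int
	)
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			mu.RLock()
			_ = n // read-only critical section
			mu.RUnlock()
		}
	})
}
```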
After further analysis, it seems we can switch
That's an interesting result! I guess using

As far as how to move forward here, I don't know for sure. My concern is that we could accidentally introduce a performance regression without realizing it. I think we need some "real world" data before we can be confident that this helps more than it hurts.
I can test this in a local lab environment next week, and in production next year with the canary instance, and compare performance.
That'd make me feel a lot more comfortable with the change. Thanks!