improve bucket.Get requests performance by f41gh7 · Pull Request #92 · VictoriaMetrics/fastcache

f41gh7 · 2025-08-05T09:51:51Z

This commit changes the order of the bucket struct fields and of the mu access
in order to improve CPU cache locality. Previously, mu and the atomic counters sat on different CPU cache lines, which noticeably hurt performance on older CPUs.

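The following benchmark shows Get and Has taking about 24-27% less time per op (31-37% higher throughput), while Set and SetGet show no statistically significant change: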

go test -run=^$ -bench=BenchmarkCache

goos: linux
goarch: amd64
pkg: github.com/VictoriaMetrics/fastcache
cpu: AMD EPYC 7B12
              │    master    │               opt_v3                │
              │    sec/op    │   sec/op     vs base                │
CacheSet-8      4.666m ±  2%   4.674m ± 2%        ~ (p=0.739 n=10)
CacheGet-8      2.364m ±  4%   1.732m ± 7%  -26.74% (p=0.000 n=10)
CacheHas-8      2.171m ±  6%   1.652m ± 2%  -23.89% (p=0.000 n=10)
CacheSetGet-8   11.65m ± 11%   12.03m ± 3%        ~ (p=0.481 n=10)
geomean         4.087m         3.562m       -12.84%

              │    master     │                opt_v3                │
              │      B/s      │     B/s       vs base                │
CacheSet-8      13.39Mi ±  2%   13.37Mi ± 2%        ~ (p=0.755 n=10)
CacheGet-8      26.44Mi ±  5%   36.09Mi ± 7%  +36.50% (p=0.000 n=10)
CacheHas-8      28.80Mi ±  6%   37.83Mi ± 2%  +31.38% (p=0.000 n=10)
CacheSetGet-8   10.73Mi ± 10%   10.39Mi ± 3%        ~ (p=0.469 n=10)
geomean         18.19Mi         20.87Mi       +14.73%

              │    master     │                opt_v3                 │
              │     B/op      │     B/op       vs base                │
CacheSet-8      19.04Ki ± 10%   19.11Ki ±  7%        ~ (p=0.542 n=10)
CacheGet-8      9.598Ki ± 20%   7.006Ki ±  5%  -27.00% (p=0.000 n=10)
CacheHas-8      8.700Ki ± 23%   6.499Ki ±  3%  -25.31% (p=0.000 n=10)
CacheSetGet-8   41.59Ki ±  9%   40.87Ki ± 10%        ~ (p=0.362 n=10)
geomean         16.04Ki         13.73Ki        -14.36%

              │   master    │               opt_v3                │
              │  allocs/op  │  allocs/op   vs base                │
CacheSet-8      32.00 ±  9%   32.50 ±  8%        ~ (p=0.418 n=10)
CacheGet-8      16.00 ± 19%   12.00 ±  8%  -25.00% (p=0.000 n=10)
CacheHas-8      15.00 ± 20%   11.00 ±  9%  -26.67% (p=0.000 n=10)
CacheSetGet-8   71.00 ± 10%   70.00 ± 10%        ~ (p=0.396 n=10)
geomean         27.17         23.41        -13.85%

Signed-off-by: f41gh7 <nik@victoriametrics.com>

Copilot

Pull Request Overview

This PR optimizes CPU cache locality by reordering fields in the bucket struct and adjusting mutex acquisition order to improve performance of bucket.Get requests. Based on benchmark results, this change delivers up to 30% performance improvement for cache operations.

Key changes:

Reordered atomic counter fields (getCalls, setCalls, misses) to be adjacent to the mutex
Moved mutex acquisition before accessing chunks field in Set and Get methods

codecov · 2025-08-05T09:53:13Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.68%. Comparing base (66aca6e) to head (c311ee8).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master      #92   +/-   ##
=======================================
  Coverage   76.68%   76.68%           
=======================================
  Files           4        4           
  Lines         549      549           
=======================================
  Hits          421      421           
  Misses         73       73           
  Partials       55       55


f41gh7 requested review from Copilot, makasim and rtm0 August 5, 2025 09:51

Copilot AI reviewed Aug 5, 2025

Comment thread fastcache.go

rtm0 approved these changes Aug 5, 2025

f41gh7 mentioned this pull request Aug 5, 2025

lib/storage: remove extDB from indexDB VictoriaMetrics/VictoriaMetrics#9431

Merged

2 tasks

makasim approved these changes Aug 5, 2025

f41gh7 merged commit 2693e48 into master Aug 5, 2025
7 checks passed

f41gh7 deleted the get-perf branch August 5, 2025 12:45
