MONITOR TRACE: key-level access tracing with sampling #3655
📝 Walkthrough
Implements MONITOR TRACE: adds the TRACE subcommand and its JSON schema, a public tracer API and header, a core tracer implementation with deterministic per-key sampling and RESP/CSV formatting, instrumentation of the DB read/write/delete hot paths, client/server lifecycle and command-execution wiring, build file updates, and extensive unit tests.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/server.c (1)
3931-3934: ⚡ Quick win
Avoid unconditional tracing context operations on hot path
At lines 3931-3934, `workloadTraceSaveContext()` and `workloadTraceEndCommand()` are called unconditionally even when no tracers are subscribed. While `workloadTraceBeginCommand()` already guards its expensive operations via an early `workloadTraceActive()` check, adding a guard at the call site would prevent unnecessary struct copy/assignment operations when tracing is inactive.
Suggested guard
    - workloadTraceContext prev_trace_ctx = workloadTraceSaveContext();
    - workloadTraceBeginCommand(c);
    + workloadTraceContext prev_trace_ctx;
    + bool trace_active = workloadTraceActive();
    + if (trace_active) {
    +     prev_trace_ctx = workloadTraceSaveContext();
    +     workloadTraceBeginCommand(c);
    + }
      c->cmd->proc(c);
    - workloadTraceEndCommand(&prev_trace_ctx);
    + if (trace_active) {
    +     workloadTraceEndCommand(&prev_trace_ctx);
    + }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/server.c` around lines 3931 - 3934, The code currently calls workloadTraceSaveContext() and workloadTraceEndCommand() unconditionally around command execution; wrap these trace-context operations with a workloadTraceActive() check so they only run when tracing is enabled: call workloadTraceSaveContext(), workloadTraceBeginCommand(c), invoke c->cmd->proc(c), then workloadTraceEndCommand(&prev_trace_ctx) only if workloadTraceActive() returns true (keep workloadTraceBeginCommand's internal guard intact), avoiding unnecessary struct copies when tracing is inactive.
src/server.h (1)
4153-4153: ⚡ Quick win
Add a short contract comment for `objectComputeSize`.
Please document why this API exists and its key assumptions (sampling semantics and DB context) near Line 4153.
Proposed change
    +/* Computes estimated allocation size for MONITOR TRACE event reporting. */
     size_t objectComputeSize(robj *key, robj *o, size_t sample_size, int dbid);
As per coding guidelines, "Document why code exists, not just what it does; document all functions in C code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/server.h` at line 4153, Add a short contract comment above the declaration of objectComputeSize that explains why the API exists and its key assumptions: state that objectComputeSize computes an estimated memory footprint of a Redis object using sampling (describe what sample_size means and that it may inspect only part of the object for large containers), clarify that dbid indicates the DB context used for auxiliary lookups/encoding semantics, and note any preconditions (e.g., key and o must be non-NULL, caller responsibility for synchronization) and the meaning of the return value (estimated bytes). Reference the function signature objectComputeSize(robj *key, robj *o, size_t sample_size, int dbid) in the comment so callers understand sampling semantics and DB-context implications.
src/workload_trace.c (1)
21-23: ⚡ Quick win
Clarify "thread-local" comment - these are file-local statics, not TLS.
The comment "Thread-local trace context" is misleading. These are file-scoped static variables, not thread-local storage (which would use `__thread` or `_Thread_local`). The safety comes from Valkey's single-threaded command execution model, not from TLS.
📝 Proposed clarification
    -/* Thread-local trace context (safe: valkey is single-threaded for commands) */
    +/* Per-command trace context (file-scoped static, safe due to single-threaded command execution) */
     static workloadTraceContext trace_ctx = {0};
     static uint64_t trace_seq_counter = 0;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workload_trace.c` around lines 21 - 23, The comment above the statics trace_ctx and trace_seq_counter incorrectly calls them "Thread-local"; update the comment to say these are file-scoped static variables (not TLS) and that their safety relies on Valkey's single-threaded command execution model; if actual thread-local storage is intended, switch to using __thread or _Thread_local for trace_ctx/trace_seq_counter; otherwise simply rephrase the comment to: "File-local trace context and sequence counter (safe because Valkey runs commands single-threaded), not C thread-local storage."
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/workload_trace.h`:
- Around line 38-58: Add doc comments for every public API declaration in this
header: workloadTraceInit, workloadTraceDetachClient, monitorTraceSetup,
workloadTraceSaveContext, workloadTraceBeginCommand, workloadTraceEndCommand,
workloadTraceEmitRead, workloadTraceEmitWrite, workloadTraceEmitDelete, and
workloadTraceActive. For each function include a short purpose summary,
parameter descriptions (ownership/nullable), return value semantics, important
behavioral notes (thread-safety, when to call, lifecycle/ordering constraints
such as save/begin/end pairing), and any side effects; ensure monitorTraceSetup
documents arg_start meaning and workloadTraceSaveContext/workloadTraceEndCommand
document how the context must be used. Keep comments concise and follow the
project's doc style.
In `@tests/unit/introspection.tcl`:
- Around line 1161-1170: The test "MONITOR TRACE CSV format escapes
binary-unsafe keys" currently uses an ASCII-safe key `normalkey`; update the
test to use a binary-unsafe key (e.g., include quotes, commas, newlines or
non-printable bytes) when calling `r set` (the client returned by
`valkey_deferring_client`) so the CSV-escaping/hex-encoding behavior is
exercised; also adjust the `assert_match` for the read `line` from the `$rd
read` (after `$rd MONITOR TRACE FORMAT csv`) to check for the expected
escaped/quoted or hex-encoded representation of that binary key rather than
matching `"normalkey"`, ensuring the test still closes `$rd` at the end.
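The escaping that the CSV test above should exercise can be sketched as follows. This is a minimal illustration, not the PR's code: the helper name `csv_escape_key` and the exact set of escaped bytes (anything outside printable ASCII, plus quote, comma, and backslash) are assumptions; the PR description only states that non-printable bytes appear as `\xHH`.

```c
#include <stddef.h>

/* Hypothetical helper (name and escape set are assumptions, not the PR's
 * API). Copies `key` into `out`, replacing every byte that is outside
 * printable ASCII, or that could break a CSV field (quote, comma,
 * backslash), with a four-character \xHH escape.
 * Returns the number of characters written (excluding the NUL terminator),
 * or -1 if `out` is too small. */
static int csv_escape_key(const unsigned char *key, size_t len,
                          char *out, size_t outlen) {
    static const char hex[] = "0123456789abcdef";
    size_t w = 0;
    if (outlen == 0) return -1;
    for (size_t i = 0; i < len; i++) {
        unsigned char b = key[i];
        int unsafe = b < 0x20 || b > 0x7e || b == '"' || b == ',' || b == '\\';
        size_t need = unsafe ? 4 : 1;
        /* Always keep one byte in reserve for the terminator. */
        if (w + need + 1 > outlen) return -1;
        if (unsafe) {
            out[w++] = '\\';
            out[w++] = 'x';
            out[w++] = hex[b >> 4];
            out[w++] = hex[b & 0x0f];
        } else {
            out[w++] = (char)b;
        }
    }
    out[w] = '\0';
    return (int)w;
}
```

Under these assumptions, a key such as `a,b` followed by a newline would be emitted as `a\x2cb\x0a`, which is the kind of value the updated test could match instead of `normalkey`.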
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 4048686d-b2b4-4667-af4b-ae2ff4488d8a
📒 Files selected for processing (12)
- cmake/Modules/SourceFiles.cmake
- src/Makefile
- src/commands.def
- src/commands/monitor-trace.json
- src/commands/monitor.json
- src/db.c
- src/networking.c
- src/server.c
- src/server.h
- src/workload_trace.c
- src/workload_trace.h
- tests/unit/introspection.tcl
🧹 Nitpick comments (1)
src/workload_trace.c (1)
106-106: 💤 Low value
Consider breaking long function signatures for readability.
This signature exceeds 90 characters. While functional, breaking it across multiple lines would improve readability.
♻️ Suggested formatting
    -static void emitEventResp(client *tracer, long long ts_us, uint64_t seq, int db_id, const char *cmd, robj *key, int access_type, int key_exists, const char *obj_type, size_t key_bytes, size_t value_bytes) {
    +static void emitEventResp(client *tracer, long long ts_us, uint64_t seq, int db_id,
    +                          const char *cmd, robj *key, int access_type, int key_exists,
    +                          const char *obj_type, size_t key_bytes, size_t value_bytes) {
As per coding guidelines: "Keep line length below 90 characters when reasonable in C code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workload_trace.c` at line 106, The function signature for emitEventResp is too long; reformat its declaration and definition by breaking the parameter list across multiple lines for readability: put the return type and function name on one line and place each logical group of parameters (e.g., tracer/timestamp/seq, db_id/cmd/key/access_type, key_exists/obj_type/key_bytes/value_bytes) on their own indented lines so the line length stays under ~90 chars; update both the prototype and any corresponding definition/usage of emitEventResp to match the new multi-line signature.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/workload_trace.c`:
- Around line 84-87: The subtraction total - zmalloc_size(val) can underflow if
objectComputeSize(...) returns less than zmalloc_size(val); in the block where
samples >= 0 (using sample_size and total from objectComputeSize), clamp the
result to avoid wrapping by returning 0 when total <= zmalloc_size(val),
otherwise return total - zmalloc_size(val); update the code around
objectComputeSize, sample_size, and zmalloc_size(val) to perform this check
before returning.
- Around line 288-322: The seed bytes are derived via memcpy from the native
uint64_t which yields different SipHash keys on big- vs little-endian hosts;
normalize the seed to a uint64_t (e.g. uint64_t seed = (uint64_t)sample_seed)
and populate cfg->sample_seed_bytes in a fixed byte order (explicitly emit the 8
bytes with shifts/masks into sample_seed_bytes[0..7], then duplicate them into
[8..15]) instead of memcpy; this preserves the deterministic RATE/SEED sampling
and handles the -1 wrap consistently; locate this change around
workloadTracerConfig initialization where sample_seed and cfg->sample_seed_bytes
are set (workloadTraceDetachClient and workloadTracerConfig usage).
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/workload_trace.c (2)
84-87: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Clamp the sampled-size subtraction.
If `objectComputeSize(...)` ever returns less than the top-level `robj` allocation, Line 87 wraps `size_t` and TRACE emits a huge `value_bytes`.
Suggested fix
     if (samples >= 0) {
         size_t sample_size = (samples == 0) ? LLONG_MAX : (size_t)samples;
         size_t total = objectComputeSize(key, val, sample_size, dbid);
    -    return total - zmalloc_size(val);
    +    size_t robj_size = zmalloc_size(val);
    +    return total > robj_size ? total - robj_size : 0;
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workload_trace.c` around lines 84 - 87, The subtraction of zmalloc_size(val) from total can underflow because both are size_t; modify the logic in the samples-handling branch (the block using objectComputeSize, sample_size, total and zmalloc_size) to clamp the computed value_bytes to zero when total is <= zmalloc_size(val). In other words, compute value_bytes as (total > zmalloc_size(val)) ? total - zmalloc_size(val) : 0 before returning/emitting, so you never return a wrapped huge size.
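To make the wrap concrete, here is a small standalone sketch of the clamp described above. The helper name `size_sub_clamped` is hypothetical and does not appear in the PR; it only isolates the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the PR): subtracts `part` from `total`
 * and clamps at zero instead of wrapping around. */
static size_t size_sub_clamped(size_t total, size_t part) {
    return total > part ? total - part : 0;
}

/* For contrast: the unclamped form. Unsigned subtraction is defined to wrap
 * modulo SIZE_MAX + 1, so a smaller `total` yields a value near SIZE_MAX. */
static size_t size_sub_wrapping(size_t total, size_t part) {
    return total - part;
}
```

With `total = 40` and `part = 100`, the wrapping form returns `SIZE_MAX - 59`, which is the "huge `value_bytes`" the comment warns about, while the clamped form returns 0.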
288-322: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Normalize `SEED` before deriving the SipHash key.
`SEED -1` silently becomes `UINT64_MAX`, and the native-endian `memcpy` makes the same `RATE`/`SEED` pair sample different keys on little- vs big-endian hosts. That breaks the deterministic sampling contract.
Suggested fix
     } else if (!strcasecmp(objectGetVal(c->argv[i]), "seed") && i + 1 < c->argc) {
         long long s;
         if (getLongLongFromObjectOrReply(c, c->argv[i + 1], &s, NULL) != C_OK) return C_ERR;
    -    sample_seed = (uint64_t)s;
    +    if (s < 0) {
    +        addReplyError(c, "SEED must be >= 0");
    +        return C_ERR;
    +    }
    +    sample_seed = (uint64_t)s;
         i++;
     } else {
         addReplyErrorObject(c, shared.syntaxerr);
         return C_ERR;
     }
    @@
    -    memset(cfg->sample_seed_bytes, 0, 16);
    -    memcpy(cfg->sample_seed_bytes, &sample_seed, 8);
    -    memcpy(cfg->sample_seed_bytes + 8, &sample_seed, 8);
    +    for (int j = 0; j < 8; j++) {
    +        uint8_t byte = (sample_seed >> (j * 8)) & 0xff;
    +        cfg->sample_seed_bytes[j] = byte;
    +        cfg->sample_seed_bytes[j + 8] = byte;
    +    }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workload_trace.c` around lines 288 - 322, The seed is copied via native-endian memcpy causing different SipHash keys across endianness and signed -1 wrapping; instead normalize the 64-bit sample_seed to a uint64_t (e.g., uint64_t useed = (uint64_t)sample_seed) and write its bytes into cfg->sample_seed_bytes in a fixed big-endian order (populate bytes [0..7] using shifts (use (useed >> 56) & 0xFF, etc.), then copy the same 8 bytes into [8..15]) so workloadTracerConfig->sample_seed_bytes is deterministic regardless of host endianness or negative input handling; update the code around workloadTracerConfig initialization where sample_seed and sample_seed_bytes are set and remove the native memcpy calls.
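The byte-order point can be checked in isolation. The sketch below is a standalone version of the loop from the suggested fix; the function name `seed_to_key_le` is hypothetical. It writes the seed least-significant byte first, as the suggested diff does, whereas the agent prompt asks for big-endian order. Either choice gives host-independent output as long as it is fixed; the two just produce different keys.

```c
#include <stdint.h>

/* Hypothetical helper (not the PR's code): expands a 64-bit seed into a
 * 16-byte hash key using shifts, so the result does not depend on host
 * endianness. Bytes are written least-significant first and the same
 * 8 bytes are duplicated into the upper half of the key. */
static void seed_to_key_le(uint64_t seed, uint8_t key[16]) {
    for (int j = 0; j < 8; j++) {
        uint8_t byte = (uint8_t)((seed >> (j * 8)) & 0xff);
        key[j] = byte;
        key[j + 8] = byte;
    }
}
```

For the seed `0x0102030405060708`, `key[0]` is `0x08` and `key[7]` is `0x01` on every host, which a native `memcpy` would only guarantee on little-endian machines.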
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/server.c`:
- Around line 3931-3939: The code samples workloadTraceActive() twice and fails
to restore the saved TLS trace context if the command disables tracing; fix by
capturing the entry state in a boolean (e.g. bool was_active =
workloadTraceActive()), call workloadTraceSaveContext() and
workloadTraceBeginCommand(c) only if was_active, and then call
workloadTraceEndCommand(&prev_trace_ctx) only if was_active after
c->cmd->proc(c); use the existing symbols workloadTraceActive,
workloadTraceSaveContext, workloadTraceBeginCommand, workloadTraceEndCommand,
prev_trace_ctx and c->cmd->proc to locate and update the logic so the saved
context is restored based on the entry state.
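A minimal sketch of the entry-state pattern requested above, using stand-in types and functions. Everything here (`traceContext`, `g_tracers`, `traceSave`, `callCommand`, and so on) is hypothetical scaffolding meant only to mirror the shape of the real call site, not the PR's API.

```c
#include <stdbool.h>

/* Stand-ins for the real tracer state; all names are hypothetical. */
typedef struct { int depth; } traceContext;
static traceContext g_ctx = {0};
static int g_tracers = 0; /* number of subscribed tracers */

static bool traceActive(void) { return g_tracers > 0; }
static traceContext traceSave(void) { return g_ctx; }
static void traceBegin(void) { g_ctx.depth++; }
static void traceEnd(const traceContext *prev) { g_ctx = *prev; }

typedef void (*commandProc)(void);

/* Samples the active state once, at entry, and uses that same answer to
 * decide whether to restore the saved context afterwards. */
static void callCommand(commandProc proc) {
    traceContext prev = {0};
    bool was_active = traceActive();
    if (was_active) {
        prev = traceSave();
        traceBegin();
    }
    proc();
    /* Restore even if proc() unsubscribed the last tracer. */
    if (was_active) traceEnd(&prev);
}

/* A command that turns tracing off while it is running. */
static void cmdDisableTracing(void) { g_tracers = 0; }
```

If the second check called `traceActive()` again instead of reusing `was_active`, running `cmdDisableTracing` would leave `g_ctx.depth` at 1, which is the stale-context case the comment describes.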
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/workload_trace.c`:
- Around line 45-55: The current workloadTraceEndCommand restores the entire
saved trace_ctx (trace_ctx = *prev) which reverts trace_ctx.seq to an older
value and breaks TRACE monotonicity; change workloadTraceEndCommand to restore
all fields from *prev except the seq (i.e., preserve the current trace_ctx.seq
or use the larger of prev->seq and current seq) so nested calls don't reset
sequence numbers—modify workloadTraceEndCommand to copy *prev but keep
trace_ctx.seq as the existing incremented value from
workloadTraceBeginCommand/trace_seq_counter.
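One way to picture the requested change is the sketch below. The struct layout is an assumption: only `seq`, `trace_ctx`, and `trace_seq_counter` are named in the review, while `db_id`, `cmd`, `traceCtx`, and the three function names are hypothetical stand-ins.

```c
#include <stdint.h>

/* Hypothetical context layout; only `seq` is confirmed by the review. */
typedef struct {
    uint64_t seq;
    int db_id;
    const char *cmd;
} traceCtx;

static traceCtx trace_ctx = {0};
static uint64_t trace_seq_counter = 0;

static traceCtx traceSave(void) { return trace_ctx; }

static void traceBegin(int db_id, const char *cmd) {
    trace_ctx.seq = ++trace_seq_counter;
    trace_ctx.db_id = db_id;
    trace_ctx.cmd = cmd;
}

/* Restores every field from the saved context except `seq`, which keeps
 * the larger of the saved and current values so it never moves backwards. */
static void traceEnd(const traceCtx *prev) {
    uint64_t seq = trace_ctx.seq;
    trace_ctx = *prev;
    if (seq > trace_ctx.seq) trace_ctx.seq = seq;
}
```

In a nested call (for example a script invoking a write), the outer command's `cmd` and `db_id` come back after the inner `traceEnd`, but `seq` stays at the inner command's value rather than dropping back.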
…alysis

Add MONITOR TRACE subcommand that streams key access events (READ, WRITE, DELETE) to subscribed clients for offline workload analysis

Features:
- Hooks in lookupKey/signalModifiedKey/dbDelete emit per-key events
- RESP and CSV output formats with object type and size metadata
- Deterministic key-hash sampling (SipHash + threshold) for long traces
- RATE 0.0-1.0 and SEED parameters for reproducible subset tracing
- Zero overhead when no tracer subscribed (inline list length check)
- ~8.1% overhead at 175K ops/sec with single tracer connected on r7g.4xlarge

Usage:
  MONITOR TRACE [SAMPLES n] [FORMAT resp|csv] [RATE r] [SEED s]
  MONITOR TRACE RATE 0.01 SEED 42  -- trace 1% of keys, reproducible

Signed-off-by: Dante Knowles <xdk@amazon.com>
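The "hash + threshold" sampling named in the commit message can be sketched as below. This is an illustration under stated assumptions, not the PR's implementation: it swaps in seeded 64-bit FNV-1a as a stand-in for SipHash to stay short, and the function names and the way `RATE` maps to a threshold are guesses at one reasonable design.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in hash: 64-bit FNV-1a with the seed folded into the offset basis.
 * The PR uses SipHash; FNV-1a is used here only to keep the sketch short. */
static uint64_t hash_key(const void *key, size_t len, uint64_t seed) {
    const unsigned char *p = key;
    uint64_t h = 0xcbf29ce484222325ULL ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Maps a rate in [0.0, 1.0] to a cutoff in the 64-bit hash space
 * (hypothetical mapping: rate * 2^64, saturating at both ends). */
static uint64_t rate_to_threshold(double rate) {
    if (rate >= 1.0) return UINT64_MAX;
    if (rate <= 0.0) return 0;
    return (uint64_t)(rate * 18446744073709551616.0); /* rate * 2^64 */
}

/* A key is traced when its hash falls below the threshold. The decision
 * depends only on (key, seed, threshold), so every access to the same key
 * gets the same answer. */
static int key_is_sampled(const void *key, size_t len, uint64_t seed,
                          uint64_t threshold) {
    if (threshold == UINT64_MAX) return 1; /* RATE 1.0: trace everything */
    return hash_key(key, len, seed) < threshold;
}
```

Because the decision is a pure function of the key and seed, two runs with the same `RATE` and `SEED` select the same subset of keys, which is what makes the traces reproducible.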
MONITOR TRACE: Key-Level Workload Tracing for Valkey
Summary
Adds `MONITOR TRACE`, a new subcommand that streams key-level access events (read, write, delete) to subscribed clients in real time. Unlike `MONITOR`, which captures full command text, `MONITOR TRACE` emits structured per-key records with metadata (timestamps, sizes, access type, object encoding), enabling offline simulation of cache eviction policies and workload characterization without replaying full command streams.
Motivation
Cache tiering and eviction policy design requires understanding real workload access patterns: which keys are hot, how large they are, and what the read/write/delete ratio looks like. Existing tools (`MONITOR`, `SLOWLOG`, `INFO`) don't provide this. `MONITOR TRACE` fills the gap by producing a compact, structured trace suitable for feeding into simulators (e.g., miss-ratio-curve (MRC) generators).
Feature Design
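Command Syntax
MONITOR TRACE [SAMPLES n] [FORMAT resp|csv] [RATE r] [SEED s]
Example
MONITOR TRACE RATE 0.01 SEED 42  -- trace 1% of keys, reproducible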
Trace Record Fields (CSV format)
- ts_us
- seq
- db
- cmd
- key (`\xHH` for non-printable bytes)
- access_type
- key_exists
- obj_type
- key_bytes
- value_bytes
Implementation Hooks
- `lookupKey()`: emits on non-write lookups
- `signalModifiedKey()`: universal mutation point, catches in-place mutations (HSET, LPUSH, etc.)
- `dbGenericDeleteWithDictIndex()`: all deletion paths
Key-Hash Sampling
When `RATE < 1.0`, the tracer uses SipHash on the key to deterministically select which keys to trace. This ensures all accesses to the same key are either traced or skipped (no aliasing with request patterns).
Architecture
The trace context is thread-local (safe: Valkey is single-threaded for commands). A fast-path check (`workloadTraceActive()`) short-circuits when no tracers are subscribed, giving zero overhead in the common case.
Benchmark Results
Methodology
- `memtier_benchmark` with `--rate-limiting` for controlled load
Finding Saturation Point
With P1 (no pipelining) on loopback, the server saturates at ~175K ops/sec regardless of client count (even 5 clients saturate a single core at sub-millisecond RTT). Used `memtier_benchmark --rate-limiting` to cap at exactly 50% for the half-load phases.
Results (averaged over 3 runs)
Saturated (server CPU-bound):
Half rate (~85K ops/sec, rate-limited):
Analysis
MONITOR TRACE is ~30% lighter than MONITOR at saturation: At full CPU load, MONITOR TRACE costs 8.1% throughput vs MONITOR's 11.8%, a relative reduction of (11.8 - 8.1) / 11.8 ≈ 31%. Both format output per-command, but MONITOR must serialize the entire command line (including large values) while MONITOR TRACE emits only compact key-level metadata.
At half load (50% CPU): Both achieve zero throughput impact (rate-limited). MONITOR TRACE adds +5.8% average latency vs MONITOR's +7.3%. The tail (p99.9) shows a clearer gap: +6.0% for TRACE vs +9.6% for MONITOR.
Zero overhead when inactive: The inline `workloadTraceActive()` check short-circuits immediately when no trace clients are connected, so the hot path pays no branch-misprediction cost.
Pipelining amplification note: Earlier tests with P16 pipelining showed ~43% overhead, but I believe this is misleading for the average use case. Pipelining batches 16 commands per event loop iteration, amplifying the per-command trace cost 16x within a single batch. Users with pipelined workloads should expect significantly more overhead when tracing.
Files Changed
- src/workload_trace.h
- src/workload_trace.c
- src/server.h: `workload_tracers` list, client flag, config pointer
- src/server.c: `call()`, command gating for trace clients
- src/db.c
- src/networking.c: `workload_tracer_config`, detach tracer on client free/reset
- src/commands/monitor.json
- src/commands/monitor-trace.json
- src/commands.def
- cmake/Modules/SourceFiles.cmake: `workload_trace.c`
- src/Makefile: `workload_trace.o`
- tests/unit/introspection.tcl
Testing
16 integration tests in `tests/unit/introspection.tcl` covering:
- `key_bytes + value_bytes == MEMORY USAGE` verified for embedded strings, RAW strings, listpack hashes, hashtable hashes, sets, zsets, streams, and with SAMPLES 0/1/5/50