fix: clone label values to prevent buffer reuse-after-free#5116
Merged
Conversation
Reader.LabelValues (no-matcher path) returns yoloString aliases into the pooled TSDB buffer. Once the dataset closes, that buffer is returned to bufferpool and can be immediately reused by a concurrent request — overwriting the strings while they are still held by the aggregator. Fix by cloning each value with strings.Clone before the dataset is closed. Adds Test_LabelValues_NoMatcher_BufferReuse: 20 goroutines × 20 iterations of a no-matcher LabelValues query. Without the fix, concurrent buffer recycling corrupts the returned strings (e.g. "test-app" becomes "codeInt64"). Run with -race to also trigger the detector on the underlying read/write race.
2 tasks
aleks-p
approved these changes
May 6, 2026
This was referenced May 7, 2026
aleks-p
added a commit
that referenced
this pull request
May 7, 2026
…5122) Reader.LabelValues (no-matcher path) returns yoloString aliases into the pooled TSDB buffer. Once the dataset closes, that buffer is returned to bufferpool and can be immediately reused by a concurrent request — overwriting the strings while they are still held by the aggregator. Fix by cloning each value with strings.Clone before the dataset is closed. Adds Test_LabelValues_NoMatcher_BufferReuse: 20 goroutines × 20 iterations of a no-matcher LabelValues query. Without the fix, concurrent buffer recycling corrupts the returned strings (e.g. "test-app" becomes "codeInt64"). Run with -race to also trigger the detector on the underlying read/write race. Co-authored-by: Christian Simon <simon@swine.de>
aleks-p
added a commit
that referenced
this pull request
May 7, 2026
…5121) Reader.LabelValues (no-matcher path) returns yoloString aliases into the pooled TSDB buffer. Once the dataset closes, that buffer is returned to bufferpool and can be immediately reused by a concurrent request — overwriting the strings while they are still held by the aggregator. Fix by cloning each value with strings.Clone before the dataset is closed. Adds Test_LabelValues_NoMatcher_BufferReuse: 20 goroutines × 20 iterations of a no-matcher LabelValues query. Without the fix, concurrent buffer recycling corrupts the returned strings (e.g. "test-app" becomes "codeInt64"). Run with -race to also trigger the detector on the underlying read/write race. Co-authored-by: Christian Simon <simon@swine.de>
simonswine
added a commit
to simonswine/go
that referenced
this pull request
May 8, 2026
Two correctness fixes:
1. GC safety (poolGuardCanMigrate):
Only migrate slice backing arrays whose element type has no Go pointer
fields (PtrBytes == 0). Previously any slice field in a pooled struct
was migrated to mmap pages. If the element type contained Go pointers
(e.g. []SomeStruct with pointer fields), those pointers became invisible
to the GC after migration, causing premature collection and the runtime
assertion "found pointer to free object".
Pure-data slices ([]byte, []uint8, []uint64, []int64 …) have PtrBytes=0
and are unaffected; they are still the common case for pooled byte buffers.
2. directOnly suppression frame index (poolGuardSuppressed):
The direct Pool.Put caller sits at different indices depending on the
code path through poolguard:
• *Struct pools (via poolGuardPutSliceFields): pcs[0]=sync.(*Pool).Put,
pcs[1]=actual caller.
• []byte pools (direct from sync_poolGuardPut): pcs[0]=actual caller.
The previous code always checked pcs[1], so "=" suppressions never fired
for slice-type pools. Fixed to check both pcs[0] and pcs[1], skipping
any frame that starts with "sync.(*Pool)." (the Pool.Put implementation).
With both fixes the Pyroscope V2 integration test (TestMicroServicesIntegrationV2)
runs to completion with a precise 17-entry suppression list:
POOLGUARD_SUPPRESS=\
=connectrpc.com/connect.(*bufferPool).Put,\
=fmt.(*pp).free,\
=github.com/go-kit/log.logfmtLogger.Log,\
=github.com/parquet-go/parquet-go.(*ColumnWriter).writeDataPage.deferwrap1,\
=github.com/parquet-go/parquet-go/internal/memory.(*ChunkBuffer[go.shape.uint8]).Reset,\
=github.com/parquet-go/parquet-go/internal/memory.putSliceToPool[go.shape.int64],\
=github.com/parquet-go/parquet-go/internal/memory.putSliceToPool[go.shape.uint64],\
=github.com/parquet-go/parquet-go/internal/memory.putSliceToPool[go.shape.uint8],\
=github.com/parquet-go/parquet-go.(*memoryBufferPool).PutBuffer,\
=github.com/parquet-go/parquet-go.putBufioReader,\
=google.golang.org/protobuf/internal/filedesc.(*File).unmarshalFull.deferwrap1,\
=google.golang.org/protobuf/internal/filedesc.(*File).unmarshalSeed.deferwrap1,\
=net/http.putBufioReader,\
=net/http.putBufioWriter,\
=regexp.freeBitState,\
=regexp.freeOnePassMachine
The Pyroscope bufferpool violation (grafana/pyroscope#5116) is detected
first and must be added to suppress the false positive before the test
passes; any regression in bufferpool would surface immediately.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
simonswine
added a commit
to simonswine/go
that referenced
this pull request
May 8, 2026
Some packages intentionally access pooled data after Pool.Put within their own carefully managed lifetimes (net/http's bufio.Writer pool, parquet-go's memory pool, protobuf descriptor bytes). Without a way to suppress these, GODEBUG=poolguard=1 fires too many false positives on startup before reaching user code. POOLGUARD_SUPPRESS is a comma-separated list of fully-qualified package path prefixes. A Pool.Put whose call stack contains any frame whose function name starts with a suppressed prefix is silently skipped — the backing array is never moved to guarded pages. POOLGUARD_SUPPRESS=net/http,google.golang.org/protobuf Matching is a simple string prefix against the full function name (e.g. "net/http.putBufioWriter"), so: • "net/http" silences the bufio.Writer pool • "google.golang.org/protobuf" silences descriptor pools • "cmd/compile" silences SSA sparse-set pools • "github.com/parquet-go" silences parquet memory pools With appropriate suppressions, GODEBUG=poolguard=1 catches the exact Pyroscope bufferpool bug (grafana/pyroscope#5116) even in a sequential single-goroutine test: WARNING: accessed pool buffer after Pool.Put Buffer was Put at: github.com/grafana/pyroscope/v2/pkg/util/bufferpool.Put .../bufferpool/pool.go:48 github.com/grafana/pyroscope/v2/pkg/block.(*Object).closeErr .../block/object.go:254 ... Implementation: poolGuardSuppressPatterns is parsed from POOLGUARD_SUPPRESS in poolguard.go's init(). poolGuardPutBacking captures the call stack with callers() before any mmap work, then calls poolGuardSuppressed() which walks the PCs and checks each frame's package path against the pattern list. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
simonswine
added a commit
to simonswine/go
that referenced
this pull request
May 8, 2026
Two correctness fixes:
1. GC safety (poolGuardCanMigrate):
Only migrate slice backing arrays whose element type has no Go pointer
fields (PtrBytes == 0). Previously any slice field in a pooled struct
was migrated to mmap pages. If the element type contained Go pointers
(e.g. []SomeStruct with pointer fields), those pointers became invisible
to the GC after migration, causing premature collection and the runtime
assertion "found pointer to free object".
Pure-data slices ([]byte, []uint8, []uint64, []int64 …) have PtrBytes=0
and are unaffected; they are still the common case for pooled byte buffers.
2. directOnly suppression frame index (poolGuardSuppressed):
The direct Pool.Put caller sits at different indices depending on the
code path through poolguard:
• *Struct pools (via poolGuardPutSliceFields): pcs[0]=sync.(*Pool).Put,
pcs[1]=actual caller.
• []byte pools (direct from sync_poolGuardPut): pcs[0]=actual caller.
The previous code always checked pcs[1], so "=" suppressions never fired
for slice-type pools. Fixed to check both pcs[0] and pcs[1], skipping
any frame that starts with "sync.(*Pool)." (the Pool.Put implementation).
With both fixes the Pyroscope V2 integration test (TestMicroServicesIntegrationV2)
runs to completion with a precise 17-entry suppression list:
POOLGUARD_SUPPRESS=\
=connectrpc.com/connect.(*bufferPool).Put,\
=fmt.(*pp).free,\
=github.com/go-kit/log.logfmtLogger.Log,\
=github.com/parquet-go/parquet-go.(*ColumnWriter).writeDataPage.deferwrap1,\
=github.com/parquet-go/parquet-go/internal/memory.(*ChunkBuffer[go.shape.uint8]).Reset,\
=github.com/parquet-go/parquet-go/internal/memory.putSliceToPool[go.shape.int64],\
=github.com/parquet-go/parquet-go/internal/memory.putSliceToPool[go.shape.uint64],\
=github.com/parquet-go/parquet-go/internal/memory.putSliceToPool[go.shape.uint8],\
=github.com/parquet-go/parquet-go.(*memoryBufferPool).PutBuffer,\
=github.com/parquet-go/parquet-go.putBufioReader,\
=google.golang.org/protobuf/internal/filedesc.(*File).unmarshalFull.deferwrap1,\
=google.golang.org/protobuf/internal/filedesc.(*File).unmarshalSeed.deferwrap1,\
=net/http.putBufioReader,\
=net/http.putBufioWriter,\
=regexp.freeBitState,\
=regexp.freeOnePassMachine
The Pyroscope bufferpool violation (grafana/pyroscope#5116) is detected
first and must be added to suppress the false positive before the test
passes; any regression in bufferpool would surface immediately.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
1 task
1 task
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Reader.LabelValues (no-matcher path) returns yoloString aliases into the
pooled TSDB buffer. Once the dataset closes, that buffer is returned to
bufferpool and can be immediately reused by a concurrent request — overwriting
the strings while they are still held by the aggregator.
Fix by cloning each value with strings.Clone before the dataset is closed.
Adds Test_LabelValues_NoMatcher_BufferReuse: 20 goroutines × 20 iterations
of a no-matcher LabelValues query. Without the fix, concurrent buffer
recycling corrupts the returned strings (e.g. "test-app" becomes "codeInt64").
Run with -race to also trigger the detector on the underlying read/write race.
Note
Medium Risk
Touches a hot query path and changes memory/lifetime behavior to eliminate a potential data race from pooled buffer reuse; correctness improves but adds per-value allocations that could impact performance under high cardinality.
Overview
Fixes the no-matcher
LabelValuesfast path to clone returned strings (strings.Clone) so label values no longer alias a pooled TSDB/index buffer that may be recycled before the report is marshaled.Adds a concurrent regression test (
Test_LabelValues_NoMatcher_BufferReuse) that issues many parallel label-values queries and asserts stable results (and triggers-racewithout the fix) to guard against buffer reuse corruption/data races.Reviewed by Cursor Bugbot for commit 4fc79da. Bugbot is set up for automated code reviews on this repo. Configure here.