Commit 39380e8
runtime: fix block leak due to race in span set
The span set data structure may leak blocks due to a race in the logic that checks whether it's safe to free a block. The simplest example of this race is between two poppers:

1. Popper A claims slot spanSetEntries-2.
2. Popper B claims slot spanSetEntries-1.
3. Popper A gets descheduled before it subtracts from block.used.
4. Popper B subtracts from block.used and sees that it claimed slot spanSetEntries-1, but also that block.used != 0, so it returns.
5. Popper A comes back and subtracts from block.used, but it didn't claim slot spanSetEntries-1, so it also returns.

The spine is left with a stale block pointer and the block later gets overwritten by pushes, never to be re-used again. The problem here is that we designate the claimer of slot spanSetEntries-1 to be the one who frees the block, but that may not be the thread that actually does the last subtraction from block.used. Fixing this problem is tricky, and the fundamental problem there is that block.used is not stable: it may be observed to be zero, but that doesn't necessarily mean you're the last popper!

Do something simpler: keep a counter of how many pops have happened to a given block instead of block.used. This counter monotonically increases when a pop is _completely done_. Because this counter is monotonically increasing, and only increases when a popper is done, we know for sure that whichever popper is the last to increase it (i.e. brings its value to spanSetBlockEntries) is also the last popper in the block. Because the race described above still exists, the last popper may not be the one which claimed the last slot in the block, but we know for certain that nobody else is popping from that block anymore, so we can safely free it. Finally, because pops serialize with pushes to the same slot, we need not worry about concurrent pushers at all.

Updates #37487.
Change-Id: I6697219372774c8ca7d8ee6895eaa230a64ce9e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/230497
Run-TryBot: Michael Knyszek <[email protected]>
Reviewed-by: Michael Pratt <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
1 parent 0ddde4a commit 39380e8

1 file changed (+31, -12)

src/runtime/mspanset.go (31 additions, 12 deletions)

@@ -60,12 +60,10 @@ type spanSetBlock struct {
 	// Free spanSetBlocks are managed via a lock-free stack.
 	lfnode

-	// used represents the number of slots in the spans array which are
-	// currently in use. This number is used to help determine when a
-	// block may be safely recycled.
-	//
-	// Accessed and updated atomically.
-	used uint32
+	// popped is the number of pop operations that have occurred on
+	// this block. This number is used to help determine when a block
+	// may be safely recycled.
+	popped uint32

 	// spans is the set of spans in this block.
 	spans [spanSetBlockEntries]*mspan
@@ -135,7 +133,6 @@ retry:

 	// We have a block. Insert the span atomically, since there may be
 	// concurrent readers via the block API.
-	atomic.Xadd(&block.used, 1)
 	atomic.StorepNoWB(unsafe.Pointer(&block.spans[bottom]), unsafe.Pointer(s))
 }

@@ -202,8 +199,19 @@ claimLoop:
 	// corruption. This way, we'll get a nil pointer access instead.
 	atomic.StorepNoWB(unsafe.Pointer(&block.spans[bottom]), nil)

-	// If we're the last possible popper in the block, free the block.
-	if used := atomic.Xadd(&block.used, -1); used == 0 && bottom == spanSetBlockEntries-1 {
+	// Increase the popped count. If we are the last possible popper
+	// in the block (note that bottom need not equal spanSetBlockEntries-1
+	// due to races) then it's our responsibility to free the block.
+	//
+	// If we increment popped to spanSetBlockEntries, we can be sure that
+	// we're the last popper for this block, and it's thus safe to free it.
+	// Every other popper must have crossed this barrier (and thus finished
+	// popping its corresponding mspan) by the time we get here. Because
+	// we're the last popper, we also don't have to worry about concurrent
+	// pushers (there can't be any). Note that we may not be the popper
+	// which claimed the last slot in the block, we're just the last one
+	// to finish popping.
+	if atomic.Xadd(&block.popped, 1) == spanSetBlockEntries {
 		// Clear the block's pointer.
 		atomic.StorepNoWB(blockp, nil)

@@ -236,10 +244,20 @@ func (b *spanSet) reset() {
 	blockp := (**spanSetBlock)(add(b.spine, sys.PtrSize*uintptr(top)))
 	block := *blockp
 	if block != nil {
-		// Sanity check the used value.
-		if block.used != 0 {
-			throw("found used block in empty span set")
+		// Sanity check the popped value.
+		if block.popped == 0 {
+			// popped should never be zero because that means we have
+			// pushed at least one value but not yet popped if this
+			// block pointer is not nil.
+			throw("span set block with unpopped elements found in reset")
 		}
+		if block.popped == spanSetBlockEntries {
+			// popped should also never be equal to spanSetBlockEntries
+			// because the last popper should have made the block pointer
+			// in this slot nil.
+			throw("fully empty unfreed span set block found in reset")
+		}

 		// Clear the pointer to the block.
 		atomic.StorepNoWB(unsafe.Pointer(blockp), nil)

@@ -270,6 +288,7 @@ func (p *spanSetBlockAlloc) alloc() *spanSetBlock {

 // free returns a spanSetBlock back to the pool.
 func (p *spanSetBlockAlloc) free(block *spanSetBlock) {
+	atomic.Store(&block.popped, 0)
 	p.stack.push(&block.lfnode)
 }