Skip to content

Commit 4179552

Browse files
committed
runtime: smooth cons/mark with a moving average and use actual trigger
This change modifies the pacer in two ways: * It replaces the PI controller used as a smoothing function with a simple two-cycle moving average. * It makes the pacer use the actual GC trigger point for cons/mark (and other) calculations instead of the precomputed one. The second part of this change was attempted in the Go 1.19 release cycle, but could not be done because although it resulted in a better-behaved pacer, it exploited the PI controller's sensitivity to history in a way that was ultimately unfavorable for most applications. This sensitivity is complex to reason about, and forces us into choices that don't really make sense (like using the precomputed trigger over the actual one -- that's really a bug fix). The net effect of this change is intended to be: * The pacer computes a more stable estimate of the actual cons/mark ratio, making it easier to understand. * The pacer is much more accurate at hitting the heap goal, so the GC respects GOGC much more often and reliably. * The pacer forces a more stable rate of GC assists in the steady-state overall. See https://perf.golang.org/search?q=upload:20221106.10 for benchmark results and #53892 for complete context. The benchmarks that regress in memory use appear to be not worth worrying about. In all cases, it appears that the GC was triggering super early, resulting in a lack of adherence to rule of GOGC. The fogleman benchmarks just have a single final GC that triggers early and also happens to be the peak heap size. The tile38 WithinCircle benchmark only has 4 GC cycles in the benchmarked region, so it's very sensitive to pacing. In this case, the old smoothing function is getting lucky by starting way too early, avoiding assists. Meanwhile the 2-cycle moving average is more accurate on the heap goal, but the 1st and 2nd cycle after the initialization phase are operating on a cons/mark from the initialization phase which is much less active, resulting in a cons/mark that's too low, causing the GC to start too late, increasing assists, and therefore latency (but only transiently, at this phase change). I really do think the PI controller is just getting lucky here with a particular history, because I've definitely observed it oscillating wildly in response to a phase change. This change also moves the PI controller out of mgcpacer.go, because it's no longer used there. It now lives in mgcscavenge.go, where it's actually used. Fixes #53892. Change-Id: I3f875a2e40f31f381920f91d8b090556b17a2b16 Reviewed-on: https://go-review.googlesource.com/c/go/+/417558 Run-TryBot: Michael Knyszek <[email protected]> Reviewed-by: Michael Pratt <[email protected]> TryBot-Result: Gopher Robot <[email protected]>
1 parent 26b2c9a commit 4179552

File tree

4 files changed

+125
-188
lines changed

4 files changed

+125
-188
lines changed

src/runtime/mgcpacer.go

+11-143
Original file line numberDiff line numberDiff line change
@@ -136,12 +136,10 @@ type gcControllerState struct {
136136
// Updated at the end of each GC cycle, in endCycle.
137137
consMark float64
138138

139-
// consMarkController holds the state for the mark-cons ratio
140-
// estimation over time.
141-
//
142-
// Its purpose is to smooth out noisiness in the computation of
143-
// consMark; see consMark for details.
144-
consMarkController piController
139+
// lastConsMark is the computed cons/mark value for the previous GC
140+
// cycle. Note that this is *not* the last value of cons/mark, but the
141+
// actual computed value. See endCycle for details.
142+
lastConsMark float64
145143

146144
// gcPercentHeapGoal is the goal heapLive for when next GC ends derived
147145
// from gcPercent.
@@ -372,28 +370,6 @@ type gcControllerState struct {
372370
func (c *gcControllerState) init(gcPercent int32, memoryLimit int64) {
373371
c.heapMinimum = defaultHeapMinimum
374372
c.triggered = ^uint64(0)
375-
376-
c.consMarkController = piController{
377-
// Tuned first via the Ziegler-Nichols process in simulation,
378-
// then the integral time was manually tuned against real-world
379-
// applications to deal with noisiness in the measured cons/mark
380-
// ratio.
381-
kp: 0.9,
382-
ti: 4.0,
383-
384-
// Set a high reset time in GC cycles.
385-
// This is inversely proportional to the rate at which we
386-
// accumulate error from clipping. By making this very high
387-
// we make the accumulation slow. In general, clipping is
388-
// OK in our situation, hence the choice.
389-
//
390-
// Tune this if we get unintended effects from clipping for
391-
// a long time.
392-
tt: 1000,
393-
min: -1000,
394-
max: 1000,
395-
}
396-
397373
c.setGCPercent(gcPercent)
398374
c.setMemoryLimit(memoryLimit)
399375
c.commit(true) // No sweep phase in the first GC cycle.
@@ -416,26 +392,7 @@ func (c *gcControllerState) startCycle(markStartTime int64, procs int, trigger g
416392
c.fractionalMarkTime.Store(0)
417393
c.idleMarkTime.Store(0)
418394
c.markStartTime = markStartTime
419-
420-
// TODO(mknyszek): This is supposed to be the actual trigger point for the heap, but
421-
// causes regressions in memory use. The cause is that the PI controller used to smooth
422-
// the cons/mark ratio measurements tends to flail when using the less accurate precomputed
423-
// trigger for the cons/mark calculation, and this results in the controller being more
424-
// conservative about steady-states it tries to find in the future.
425-
//
426-
// This conservatism is transient, but these transient states tend to matter for short-lived
427-
// programs, especially because the PI controller is overdamped, partially because it is
428-
// configured with a relatively large time constant.
429-
//
430-
// Ultimately, I think this is just two mistakes piled on one another: the choice of a swingy
431-
// smoothing function that recalls a fairly long history (due to its overdamped time constant)
432-
// coupled with an inaccurate cons/mark calculation. It just so happens this works better
433-
// today, and it makes it harder to change things in the future.
434-
//
435-
// This is described in #53738. Fix this for #53892 by changing back to the actual trigger
436-
// point and simplifying the smoothing function.
437-
heapTrigger, heapGoal := c.trigger()
438-
c.triggered = heapTrigger
395+
c.triggered = c.heapLive.Load()
439396

440397
// Compute the background mark utilization goal. In general,
441398
// this may not come out exactly. We round the number of
@@ -498,6 +455,7 @@ func (c *gcControllerState) startCycle(markStartTime int64, procs int, trigger g
498455
c.revise()
499456

500457
if debug.gcpacertrace > 0 {
458+
heapGoal := c.heapGoal()
501459
assistRatio := c.assistWorkPerByte.Load()
502460
print("pacer: assist ratio=", assistRatio,
503461
" (scan ", gcController.heapScan.Load()>>20, " MB in ",
@@ -700,31 +658,12 @@ func (c *gcControllerState) endCycle(now int64, procs int, userForced bool) {
700658
currentConsMark := (float64(c.heapLive.Load()-c.triggered) * (utilization + idleUtilization)) /
701659
(float64(scanWork) * (1 - utilization))
702660

703-
// Update cons/mark controller. The time period for this is 1 GC cycle.
704-
//
705-
// This use of a PI controller might seem strange. So, here's an explanation:
706-
//
707-
// currentConsMark represents the consMark we *should've* had to be perfectly
708-
// on-target for this cycle. Given that we assume the next GC will be like this
709-
// one in the steady-state, it stands to reason that we should just pick that
710-
// as our next consMark. In practice, however, currentConsMark is too noisy:
711-
// we're going to be wildly off-target in each GC cycle if we do that.
712-
//
713-
// What we do instead is make a long-term assumption: there is some steady-state
714-
// consMark value, but it's obscured by noise. By constantly shooting for this
715-
// noisy-but-perfect consMark value, the controller will bounce around a bit,
716-
// but its average behavior, in aggregate, should be less noisy and closer to
717-
// the true long-term consMark value, provided its tuned to be slightly overdamped.
718-
var ok bool
661+
// Update our cons/mark estimate. This is the raw value above, but averaged over 2 GC cycles
662+
// because it tends to be jittery, even in the steady-state. The smoothing helps the GC to
663+
// maintain much more stable cycle-by-cycle behavior.
719664
oldConsMark := c.consMark
720-
c.consMark, ok = c.consMarkController.next(c.consMark, currentConsMark, 1.0)
721-
if !ok {
722-
// The error spiraled out of control. This is incredibly unlikely seeing
723-
// as this controller is essentially just a smoothing function, but it might
724-
// mean that something went very wrong with how currentConsMark was calculated.
725-
// Just reset consMark and keep going.
726-
c.consMark = 0
727-
}
665+
c.consMark = (currentConsMark + c.lastConsMark) / 2
666+
c.lastConsMark = currentConsMark
728667

729668
if debug.gcpacertrace > 0 {
730669
printlock()
@@ -733,9 +672,6 @@ func (c *gcControllerState) endCycle(now int64, procs int, userForced bool) {
733672
print(c.heapScanWork.Load(), "+", c.stackScanWork.Load(), "+", c.globalsScanWork.Load(), " B work (", c.lastHeapScan+c.lastStackScan.Load()+c.globalsScan.Load(), " B exp.) ")
734673
live := c.heapLive.Load()
735674
print("in ", c.triggered, " B -> ", live, " B (∆goal ", int64(live)-int64(c.lastHeapGoal), ", cons/mark ", oldConsMark, ")")
736-
if !ok {
737-
print("[controller reset]")
738-
}
739675
println()
740676
printunlock()
741677
}
@@ -1379,74 +1315,6 @@ func readGOMEMLIMIT() int64 {
13791315
return n
13801316
}
13811317

1382-
type piController struct {
1383-
kp float64 // Proportional constant.
1384-
ti float64 // Integral time constant.
1385-
tt float64 // Reset time.
1386-
1387-
min, max float64 // Output boundaries.
1388-
1389-
// PI controller state.
1390-
1391-
errIntegral float64 // Integral of the error from t=0 to now.
1392-
1393-
// Error flags.
1394-
errOverflow bool // Set if errIntegral ever overflowed.
1395-
inputOverflow bool // Set if an operation with the input overflowed.
1396-
}
1397-
1398-
// next provides a new sample to the controller.
1399-
//
1400-
// input is the sample, setpoint is the desired point, and period is how much
1401-
// time (in whatever unit makes the most sense) has passed since the last sample.
1402-
//
1403-
// Returns a new value for the variable it's controlling, and whether the operation
1404-
// completed successfully. One reason this might fail is if error has been growing
1405-
// in an unbounded manner, to the point of overflow.
1406-
//
1407-
// In the specific case of an error overflow occurs, the errOverflow field will be
1408-
// set and the rest of the controller's internal state will be fully reset.
1409-
func (c *piController) next(input, setpoint, period float64) (float64, bool) {
1410-
// Compute the raw output value.
1411-
prop := c.kp * (setpoint - input)
1412-
rawOutput := prop + c.errIntegral
1413-
1414-
// Clamp rawOutput into output.
1415-
output := rawOutput
1416-
if isInf(output) || isNaN(output) {
1417-
// The input had a large enough magnitude that either it was already
1418-
// overflowed, or some operation with it overflowed.
1419-
// Set a flag and reset. That's the safest thing to do.
1420-
c.reset()
1421-
c.inputOverflow = true
1422-
return c.min, false
1423-
}
1424-
if output < c.min {
1425-
output = c.min
1426-
} else if output > c.max {
1427-
output = c.max
1428-
}
1429-
1430-
// Update the controller's state.
1431-
if c.ti != 0 && c.tt != 0 {
1432-
c.errIntegral += (c.kp*period/c.ti)*(setpoint-input) + (period/c.tt)*(output-rawOutput)
1433-
if isInf(c.errIntegral) || isNaN(c.errIntegral) {
1434-
// So much error has accumulated that we managed to overflow.
1435-
// The assumptions around the controller have likely broken down.
1436-
// Set a flag and reset. That's the safest thing to do.
1437-
c.reset()
1438-
c.errOverflow = true
1439-
return c.min, false
1440-
}
1441-
}
1442-
return output, true
1443-
}
1444-
1445-
// reset resets the controller state, except for controller error flags.
1446-
func (c *piController) reset() {
1447-
c.errIntegral = 0
1448-
}
1449-
14501318
// addIdleMarkWorker attempts to add a new idle mark worker.
14511319
//
14521320
// If this returns true, the caller must become an idle mark worker unless

src/runtime/mgcpacer_test.go

-45
Original file line numberDiff line numberDiff line change
@@ -1019,51 +1019,6 @@ func (f float64Stream) limit(min, max float64) float64Stream {
10191019
}
10201020
}
10211021

1022-
func FuzzPIController(f *testing.F) {
1023-
isNormal := func(x float64) bool {
1024-
return !math.IsInf(x, 0) && !math.IsNaN(x)
1025-
}
1026-
isPositive := func(x float64) bool {
1027-
return isNormal(x) && x > 0
1028-
}
1029-
// Seed with constants from controllers in the runtime.
1030-
// It's not critical that we keep these in sync, they're just
1031-
// reasonable seed inputs.
1032-
f.Add(0.3375, 3.2e6, 1e9, 0.001, 1000.0, 0.01)
1033-
f.Add(0.9, 4.0, 1000.0, -1000.0, 1000.0, 0.84)
1034-
f.Fuzz(func(t *testing.T, kp, ti, tt, min, max, setPoint float64) {
1035-
// Ignore uninteresting invalid parameters. These parameters
1036-
// are constant, so in practice surprising values will be documented
1037-
// or will be other otherwise immediately visible.
1038-
//
1039-
// We just want to make sure that given a non-Inf, non-NaN input,
1040-
// we always get a non-Inf, non-NaN output.
1041-
if !isPositive(kp) || !isPositive(ti) || !isPositive(tt) {
1042-
return
1043-
}
1044-
if !isNormal(min) || !isNormal(max) || min > max {
1045-
return
1046-
}
1047-
// Use a random source, but make it deterministic.
1048-
rs := rand.New(rand.NewSource(800))
1049-
randFloat64 := func() float64 {
1050-
return math.Float64frombits(rs.Uint64())
1051-
}
1052-
p := NewPIController(kp, ti, tt, min, max)
1053-
state := float64(0)
1054-
for i := 0; i < 100; i++ {
1055-
input := randFloat64()
1056-
// Ignore the "ok" parameter. We're just trying to break it.
1057-
// state is intentionally completely uncorrelated with the input.
1058-
var ok bool
1059-
state, ok = p.Next(input, setPoint, 1.0)
1060-
if !isNormal(state) {
1061-
t.Fatalf("got NaN or Inf result from controller: %f %v", state, ok)
1062-
}
1063-
}
1064-
})
1065-
}
1066-
10671022
func TestIdleMarkWorkerCount(t *testing.T) {
10681023
const workers = 10
10691024
c := NewGCController(100, math.MaxInt64)

src/runtime/mgcscavenge.go

+68
Original file line numberDiff line numberDiff line change
@@ -1114,3 +1114,71 @@ func (s *scavengeIndex) mark(base, limit uintptr) {
11141114
func (s *scavengeIndex) clear(ci chunkIdx) {
11151115
s.chunks[ci/8].And(^uint8(1 << (ci % 8)))
11161116
}
1117+
1118+
type piController struct {
1119+
kp float64 // Proportional constant.
1120+
ti float64 // Integral time constant.
1121+
tt float64 // Reset time.
1122+
1123+
min, max float64 // Output boundaries.
1124+
1125+
// PI controller state.
1126+
1127+
errIntegral float64 // Integral of the error from t=0 to now.
1128+
1129+
// Error flags.
1130+
errOverflow bool // Set if errIntegral ever overflowed.
1131+
inputOverflow bool // Set if an operation with the input overflowed.
1132+
}
1133+
1134+
// next provides a new sample to the controller.
1135+
//
1136+
// input is the sample, setpoint is the desired point, and period is how much
1137+
// time (in whatever unit makes the most sense) has passed since the last sample.
1138+
//
1139+
// Returns a new value for the variable it's controlling, and whether the operation
1140+
// completed successfully. One reason this might fail is if error has been growing
1141+
// in an unbounded manner, to the point of overflow.
1142+
//
1143+
// In the specific case of an error overflow occurs, the errOverflow field will be
1144+
// set and the rest of the controller's internal state will be fully reset.
1145+
func (c *piController) next(input, setpoint, period float64) (float64, bool) {
1146+
// Compute the raw output value.
1147+
prop := c.kp * (setpoint - input)
1148+
rawOutput := prop + c.errIntegral
1149+
1150+
// Clamp rawOutput into output.
1151+
output := rawOutput
1152+
if isInf(output) || isNaN(output) {
1153+
// The input had a large enough magnitude that either it was already
1154+
// overflowed, or some operation with it overflowed.
1155+
// Set a flag and reset. That's the safest thing to do.
1156+
c.reset()
1157+
c.inputOverflow = true
1158+
return c.min, false
1159+
}
1160+
if output < c.min {
1161+
output = c.min
1162+
} else if output > c.max {
1163+
output = c.max
1164+
}
1165+
1166+
// Update the controller's state.
1167+
if c.ti != 0 && c.tt != 0 {
1168+
c.errIntegral += (c.kp*period/c.ti)*(setpoint-input) + (period/c.tt)*(output-rawOutput)
1169+
if isInf(c.errIntegral) || isNaN(c.errIntegral) {
1170+
// So much error has accumulated that we managed to overflow.
1171+
// The assumptions around the controller have likely broken down.
1172+
// Set a flag and reset. That's the safest thing to do.
1173+
c.reset()
1174+
c.errOverflow = true
1175+
return c.min, false
1176+
}
1177+
}
1178+
return output, true
1179+
}
1180+
1181+
// reset resets the controller state, except for controller error flags.
1182+
func (c *piController) reset() {
1183+
c.errIntegral = 0
1184+
}

src/runtime/mgcscavenge_test.go

+46
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ package runtime_test
77
import (
88
"fmt"
99
"internal/goos"
10+
"math"
1011
"math/rand"
1112
. "runtime"
1213
"runtime/internal/atomic"
@@ -707,3 +708,48 @@ func TestScavengeIndex(t *testing.T) {
707708
find(0, 0)
708709
})
709710
}
711+
712+
func FuzzPIController(f *testing.F) {
713+
isNormal := func(x float64) bool {
714+
return !math.IsInf(x, 0) && !math.IsNaN(x)
715+
}
716+
isPositive := func(x float64) bool {
717+
return isNormal(x) && x > 0
718+
}
719+
// Seed with constants from controllers in the runtime.
720+
// It's not critical that we keep these in sync, they're just
721+
// reasonable seed inputs.
722+
f.Add(0.3375, 3.2e6, 1e9, 0.001, 1000.0, 0.01)
723+
f.Add(0.9, 4.0, 1000.0, -1000.0, 1000.0, 0.84)
724+
f.Fuzz(func(t *testing.T, kp, ti, tt, min, max, setPoint float64) {
725+
// Ignore uninteresting invalid parameters. These parameters
726+
// are constant, so in practice surprising values will be documented
727+
// or will be other otherwise immediately visible.
728+
//
729+
// We just want to make sure that given a non-Inf, non-NaN input,
730+
// we always get a non-Inf, non-NaN output.
731+
if !isPositive(kp) || !isPositive(ti) || !isPositive(tt) {
732+
return
733+
}
734+
if !isNormal(min) || !isNormal(max) || min > max {
735+
return
736+
}
737+
// Use a random source, but make it deterministic.
738+
rs := rand.New(rand.NewSource(800))
739+
randFloat64 := func() float64 {
740+
return math.Float64frombits(rs.Uint64())
741+
}
742+
p := NewPIController(kp, ti, tt, min, max)
743+
state := float64(0)
744+
for i := 0; i < 100; i++ {
745+
input := randFloat64()
746+
// Ignore the "ok" parameter. We're just trying to break it.
747+
// state is intentionally completely uncorrelated with the input.
748+
var ok bool
749+
state, ok = p.Next(input, setPoint, 1.0)
750+
if !isNormal(state) {
751+
t.Fatalf("got NaN or Inf result from controller: %f %v", state, ok)
752+
}
753+
}
754+
})
755+
}

0 commit comments

Comments
 (0)