-
Notifications
You must be signed in to change notification settings - Fork 18k
Benchmark results vastly different when Tests are skipped with -run=XXX #42696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Can you run both with -count=20 -cpu=1 and see if there is a high interrun variance? Also make sure to run on a quiet computer ideally with other applications and browsers closed. If this benchmark does any disk or network io this might also introduce latencies and if run only one might have received a cache result why the other run didnt. |
@martisch here's a run with those flags: $ go test ./ -bench=BenchmarkGenerate -count=20 -cpu=1
goos: linux
goarch: amd64
pkg: cegen
BenchmarkGenerate 166654 6381 ns/op
BenchmarkGenerate 180555 6240 ns/op
BenchmarkGenerate 181734 6258 ns/op
BenchmarkGenerate 183918 6260 ns/op
BenchmarkGenerate 181689 6275 ns/op
BenchmarkGenerate 181914 6310 ns/op
BenchmarkGenerate 181581 6284 ns/op
BenchmarkGenerate 182289 6284 ns/op
BenchmarkGenerate 181348 6267 ns/op
BenchmarkGenerate 180328 6275 ns/op
BenchmarkGenerate 179772 6259 ns/op
BenchmarkGenerate 183290 6221 ns/op
BenchmarkGenerate 179972 6282 ns/op
BenchmarkGenerate 182958 6257 ns/op
BenchmarkGenerate 182750 6302 ns/op
BenchmarkGenerate 181096 6261 ns/op
BenchmarkGenerate 180844 6294 ns/op
BenchmarkGenerate 181142 6287 ns/op
BenchmarkGenerate 180400 6275 ns/op
BenchmarkGenerate 180426 6266 ns/op
PASS
ok cegen 24.072s $ go test ./ -bench=BenchmarkGenerate -count=20 -cpu=1 -run=XXX
goos: linux
goarch: amd64
pkg: cegen
BenchmarkGenerate 391664 3062 ns/op
BenchmarkGenerate 380072 3045 ns/op
BenchmarkGenerate 388015 3081 ns/op
BenchmarkGenerate 392066 3041 ns/op
BenchmarkGenerate 381025 3093 ns/op
BenchmarkGenerate 397995 3041 ns/op
BenchmarkGenerate 385315 3089 ns/op
BenchmarkGenerate 398268 3041 ns/op
BenchmarkGenerate 389184 3092 ns/op
BenchmarkGenerate 395845 3056 ns/op
BenchmarkGenerate 382033 3090 ns/op
BenchmarkGenerate 394696 3028 ns/op
BenchmarkGenerate 391120 3089 ns/op
BenchmarkGenerate 379633 3033 ns/op
BenchmarkGenerate 337450 3103 ns/op
BenchmarkGenerate 392859 3038 ns/op
BenchmarkGenerate 383972 3086 ns/op
BenchmarkGenerate 392360 3041 ns/op
BenchmarkGenerate 382046 3098 ns/op
BenchmarkGenerate 381825 3042 ns/op
PASS
ok cegen 25.296s My laptop isn't busy right now, I closed other applications prior to running the benchmarks, and the application isn't interacting with the disk or network (code posted in the description). I'm curious what you get when you execute those commands? (edit: updating issue description with those results) |
Thanks. My only hunch now is the test might be setting or caching something that helps the benchmarks. That is quite a bit of code and dependencies which I had not had time to understand and I am currently reluctant to just run it. If somebody can reduce the complexity to have a more minimal reproducer that could help find why there is an effect. |
Hm actually no, I shouldn't do that. If I remove all $ go test ./ -bench=BenchmarkGenerate -cpu=1
goos: linux
goarch: amd64
pkg: cegen
BenchmarkGenerate 390487 3062 ns/op
PASS
ok cegen 1.232ss $ go test ./ -bench=BenchmarkGenerate -cpu=1 -run=XXX
goos: linux
goarch: amd64
pkg: cegen
BenchmarkGenerate 393402 3028 ns/op
PASS
ok cegen 2.215s |
Maybe we can remove test one by one to see if any test in particular turns out to improve the benchmark results? Maybe then we can see if that test has side effects like registering a callback or changing some global state without resetting it back to defaults. For example I can see there is a sync.Pool with global state involved. It might just be that the tests causes the sync.Pool to be filled and causing less allocations in the later benchmark. You can add b.ReportAllocs() before the for loop in the benchmark to see allocations. |
Oh, right, global vars are shared between all tests. I didn't even consider that angle. Thanks for bringing it up 🤦
https://gist.github.com/antoineco/19fda1e474bcc545cffe48ac07f9889e#file-main_test-go-L63-L65 $ go test ./ -bench=BenchmarkGenerate -count=5 -cpu=1
goos: linux
goarch: amd64
pkg: cegen
BenchmarkGenerate 164232 6758 ns/op 13945 B/op 22 allocs/op
BenchmarkGenerate 177374 6586 ns/op 13945 B/op 22 allocs/op
BenchmarkGenerate 180433 6594 ns/op 13945 B/op 22 allocs/op
BenchmarkGenerate 175017 6696 ns/op 13945 B/op 22 allocs/op
BenchmarkGenerate 174841 6692 ns/op 13945 B/op 22 allocs/op
PASS
ok cegen 7.175s $ go test ./ -bench=BenchmarkGenerate -count=5 -cpu=1 -run=XXX
goos: linux
goarch: amd64
pkg: cegen
BenchmarkGenerate 375070 3244 ns/op 192 B/op 8 allocs/op
BenchmarkGenerate 387146 3217 ns/op 192 B/op 8 allocs/op
BenchmarkGenerate 381927 3178 ns/op 192 B/op 8 allocs/op
BenchmarkGenerate 389636 3191 ns/op 192 B/op 8 allocs/op
BenchmarkGenerate 336218 3163 ns/op 192 B/op 8 allocs/op
PASS
ok cegen 6.157s Thanks a lot for investigating. 🙌 Mystery solved! (edit) In case someone read this and the cause of the issue isn't obvious 👇 diff --git a/perf/tools/cegen/main.go b/perf/tools/cegen/main.go
index cf585da..1662866 100644
--- a/perf/tools/cegen/main.go
+++ b/perf/tools/cegen/main.go
@@ -139,6 +139,9 @@ type CloudEventTargetsGenerator struct {
typeAttr string
sourceAttr string
data []byte
+
+ // Once used to initialize the buffer pool on the first call to Generate.
+ bufOnce sync.Once
}
// NewCloudEventTargetsGenerator returns a generator that yields vegeta JSON
@@ -157,9 +160,6 @@ func NewCloudEventTargetsGenerator(url, typeAttr, sourceAttr string, data []byte
// Buffer pool for jwriter.Writer's underlying Buffer and output.
var writerBufPool *sync.Pool
-// Once used to initialize the buffer pool on the first call to Generate.
-var bufOnce sync.Once
-
// Generate returns a target serialized as JSON.
func (g *CloudEventTargetsGenerator) Generate() ([]byte, error) {
var t jsonTarget
@@ -181,7 +181,7 @@ func (g *CloudEventTargetsGenerator) Generate() ([]byte, error) {
t.Body = g.data
// encode inside the Once fn to determine the size of buffers in sync pools
- bufOnce.Do(func() {
+ g.bufOnce.Do(func() {
var jw jwriter.Writer
t.encode(&jw)
dataBytes, _ := jw.BuildBytes() |
@antoineco, FWIW, putting differently-sized buffers in a |
@bcmills that's another interesting point 👍 Although in my case the issue was purely a misuse, as a side effect of me expecting the global var to be "fresh" with every test. |
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
Ran a benchmark that is part of the following program's tests: https://gist.github.com/antoineco/19fda1e474bcc545cffe48ac07f9889e, once without
-run
flag, once with-run=XXX
to skip tests.What did you expect to see?
Similar results in both cases.
What did you see instead?
Vastly different results:
The text was updated successfully, but these errors were encountered: