Disable the locked tracer in the runner integration tests #6532
Just for clarity:
- The locked tracer means you can't replace the global tracer instance.
- There are multiple tests that require replacing the global tracer instance.
- Therefore we can't use the locked tracer at all for these tests (in general).
- We also must make sure they run sequentially and not in parallel.
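The reasoning in the list above can be illustrated with a small language-neutral model, written here in Python for brevity. `GlobalTracer`, `Tracer`, and `run_test_with_mock_tracer` are hypothetical names, not the tracer's real API, and the choice to raise on a locked replacement is an assumption made only for this sketch.

```python
class Tracer:
    """Stand-in for a tracer; only the name matters here."""

    def __init__(self, name):
        self.name = name


class GlobalTracer:
    """Hypothetical holder for the process-wide tracer instance."""

    def __init__(self, initial, locked=False):
        self._instance = initial
        self._locked = locked

    @property
    def instance(self):
        return self._instance

    def replace(self, tracer):
        # Assumption: a locked holder rejects every replacement attempt.
        if self._locked:
            raise RuntimeError("global tracer is locked and cannot be replaced")
        self._instance = tracer


def run_test_with_mock_tracer(holder):
    """Mimics a test that swaps in its own tracer, then restores the old one."""
    previous = holder.instance
    holder.replace(Tracer("mock"))
    try:
        return holder.instance.name
    finally:
        holder.replace(previous)


unlocked = GlobalTracer(Tracer("default"), locked=False)
locked = GlobalTracer(Tracer("default"), locked=True)

# With the lock disabled, the test can install and remove its mock tracer.
unlocked_result = run_test_with_mock_tracer(unlocked)

# With the lock enabled, the same test cannot even start.
try:
    run_test_with_mock_tracer(locked)
    locked_result = "replaced"
except RuntimeError:
    locked_result = "rejected"
```

Because every such test mutates the same shared instance, two of them running at the same time could observe each other's tracer, which is why the tests also have to run sequentially.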
Datadog Report
Branch report: ✅ 0 Failed, 241901 Passed, 1964 Skipped, 18h 49m 31.14s Total Time
Execution-Time Benchmarks Report ⏱️
Execution-time results for samples, comparing this PR (6532) with master. Execution-time benchmarks measure the whole time it takes to execute a program and are intended to measure the one-off costs.
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. For each run, the table gives the mean value and the p99 interval based on the mean and StdDev of the test run.
Sample | Runtime | Scenario | This PR mean (ms) | This PR p99 interval (ms) | master mean (ms) | master p99 interval (ms) |
---|---|---|---|---|---|---|
FakeDbCommand | .NET Framework 4.6.2 | Baseline | 69 | 66-71 | 69 | 63-75 |
FakeDbCommand | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | 982 | 956-1009 | 973 | 953-993 |
FakeDbCommand | .NET Core 3.1 | Baseline | 107 | 105-110 | 107 | 104-110 |
FakeDbCommand | .NET Core 3.1 | CallTarget+Inlining+NGEN | 673 | 656-690 | 675 | 658-692 |
FakeDbCommand | .NET 6 | Baseline | 91 | 89-93 | 91 | 88-93 |
FakeDbCommand | .NET 6 | CallTarget+Inlining+NGEN | 633 | 618-648 | 632 | 618-646 |
HttpMessageHandler | .NET Framework 4.6.2 | Baseline | 194 | 190-198 | 194 | 190-198 |
HttpMessageHandler | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | 1,100 | 1070-1129 | 1,096 | 1070-1123 |
HttpMessageHandler | .NET Core 3.1 | Baseline | 279 | 274-283 | 278 | 272-284 |
HttpMessageHandler | .NET Core 3.1 | CallTarget+Inlining+NGEN | 871 | 843-900 | 870 | 843-897 |
HttpMessageHandler | .NET 6 | Baseline | 266 | 262-270 | 266 | 262-270 |
HttpMessageHandler | .NET 6 | CallTarget+Inlining+NGEN | 842 | 809-875 | 854 | 816-892 |
Benchmarks Report for tracer 🐌
Benchmarks for #6532 compared to master.
Allocation changes below 0.5% are ignored.
Benchmark details
Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.SpanBenchmark - Slower
Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑netcoreapp3.1 | 1.113 | 569.36 | 633.96 |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | StartFinishSpan | net6.0 | 403ns | 0.294ns | 1.14ns | 0.0081 | 0 | 0 | 576 B |
master | StartFinishSpan | netcoreapp3.1 | 569ns | 0.401ns | 1.5ns | 0.00769 | 0 | 0 | 576 B |
master | StartFinishSpan | net472 | 710ns | 0.553ns | 2.14ns | 0.0917 | 0 | 0 | 578 B |
master | StartFinishScope | net6.0 | 541ns | 0.262ns | 1.02ns | 0.00972 | 0 | 0 | 696 B |
master | StartFinishScope | netcoreapp3.1 | 686ns | 0.527ns | 2.04ns | 0.00938 | 0 | 0 | 696 B |
master | StartFinishScope | net472 | 927ns | 0.39ns | 1.46ns | 0.105 | 0 | 0 | 658 B |
#6532 | StartFinishSpan | net6.0 | 399ns | 0.249ns | 0.965ns | 0.00803 | 0 | 0 | 576 B |
#6532 | StartFinishSpan | netcoreapp3.1 | 634ns | 0.439ns | 1.64ns | 0.00799 | 0 | 0 | 576 B |
#6532 | StartFinishSpan | net472 | 668ns | 0.674ns | 2.61ns | 0.0917 | 0 | 0 | 578 B |
#6532 | StartFinishScope | net6.0 | 561ns | 0.415ns | 1.61ns | 0.00972 | 0 | 0 | 696 B |
#6532 | StartFinishScope | netcoreapp3.1 | 723ns | 0.364ns | 1.36ns | 0.00939 | 0 | 0 | 696 B |
#6532 | StartFinishScope | net472 | 895ns | 0.472ns | 1.83ns | 0.104 | 0 | 0 | 658 B |
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | RunOnMethodBegin | net6.0 | 654ns | 0.525ns | 2.03ns | 0.00982 | 0 | 0 | 696 B |
master | RunOnMethodBegin | netcoreapp3.1 | 873ns | 0.651ns | 2.52ns | 0.00942 | 0 | 0 | 696 B |
master | RunOnMethodBegin | net472 | 1.12μs | 0.706ns | 2.73ns | 0.104 | 0 | 0 | 658 B |
#6532 | RunOnMethodBegin | net6.0 | 663ns | 0.52ns | 2.01ns | 0.00962 | 0 | 0 | 696 B |
#6532 | RunOnMethodBegin | netcoreapp3.1 | 942ns | 0.593ns | 2.3ns | 0.00911 | 0 | 0 | 696 B |
#6532 | RunOnMethodBegin | net472 | 1.1μs | 0.659ns | 2.55ns | 0.104 | 0 | 0 | 658 B |
## Summary of changes
Randomize the order of the tests.
## Reason for change
Flaky tests are much harder to fix when we discover them long after they have been written. By randomizing the order of the tests, I'm hoping to make them fail earlier. In practice, this could temporarily increase the overall flakiness, but I expect this will reduce the overall effort spent on fixing tests.
## Implementation details
In `CustomTestFramework`, randomize the list of all tests in each collection, and the collections themselves. The seed is displayed in the output. When a test order causes tests to fail, the seed makes it possible to reproduce that order deterministically.
## Other details
Four other issues were found thanks to this: #6535, #6532, #6511, #6509
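The seeded randomization described above can be sketched as follows. This is a language-neutral illustration in Python, not the code in `CustomTestFramework`; `shuffle_tests` and the collection and test names are hypothetical.

```python
import random


def shuffle_tests(collections, seed):
    """Return the collections and their tests in a seed-determined random order."""
    # A private generator keeps the order dependent on the seed alone.
    rng = random.Random(seed)
    shuffled = [
        (name, rng.sample(tests, k=len(tests)))  # shuffle tests inside a collection
        for name, tests in collections.items()
    ]
    rng.shuffle(shuffled)  # then shuffle the collections themselves
    return shuffled


collections = {
    "CollectionA": ["test_1", "test_2", "test_3"],
    "CollectionB": ["test_4", "test_5"],
}

seed = random.randrange(2**31)
print(f"Test order seed: {seed}")  # log the seed so a failing order can be replayed

first_run = shuffle_tests(collections, seed)
replay = shuffle_tests(collections, seed)  # same seed, same order
```

Feeding the logged seed back in yields the same order, so an order-dependent failure can be replayed as often as needed while it is being debugged.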
Summary of changes
Disable the locked tracer globally in the runner integration tests.
Reason for change
The locked tracer was already disabled in CiRunCommandTests, but it turns out that other tests also need it disabled (we just got lucky with the ordering). Rather than discovering them one by one, let's just disable it globally.
Implementation details
Also added the proper xunit.runner.json file to have more info on the running tests in the CI.
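The contents of the file are not shown here. As a rough sketch only, an `xunit.runner.json` that surfaces more information about running tests might look like the following, assuming the documented xUnit.net v2 settings; the specific values are illustrative assumptions, not the ones used in this PR.

```json
{
  "$schema": "https://xunit.net/schema/current/xunit.runner.schema.json",
  "diagnosticMessages": true,
  "longRunningTestSeconds": 60,
  "parallelizeTestCollections": false
}
```

`diagnosticMessages` turns on the runner's diagnostic output, and `longRunningTestSeconds` makes it report tests that run longer than the given number of seconds. `parallelizeTestCollections` set to `false` is one way to keep collections from running in parallel.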
Other details
Discovered while trying to randomize the order of the tests.