One of the goals of MiniProfiler has always been to add as little overhead as possible while getting our timings. Luckily the ASP.NET team has a Benchmarks repo (aspnet/benchmarks). I've created a minimal fork that adds MiniProfiler (I still need to discuss with that team whether this is something that's even welcome, and if so how we'd want to set it up); the fork is here: NickCraver/benchmarks.
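For reference, enabling the middleware in an ASP.NET Core app looks roughly like this (a minimal sketch; the exact registration and options used in the fork may differ):

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Registers MiniProfiler services; the options here are
        // illustrative defaults, not necessarily what the fork uses.
        services.AddMiniProfiler(options =>
        {
            options.RouteBasePath = "/profiler";
        });
    }

    public void Configure(IApplicationBuilder app)
    {
        // The middleware under test: starts a profiler at the beginning
        // of each request and stops it at the end.
        app.UseMiniProfiler();

        // Stand-in for the benchmark app's plaintext endpoint.
        app.Run(async context =>
        {
            context.Response.ContentType = "text/plain";
            await context.Response.WriteAsync("Hello, World!");
        });
    }
}
```

`app.UseMiniProfiler()` is the piece being measured below: everything else in the pipeline is identical between the two configurations.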
Here's a current benchmark comparison of aspnet/benchmarks with and without the MiniProfiler middleware activated, to get a general baseline of the overhead. Each configuration was measured with four 10-second wrk runs (32 threads, 256 connections, pipeline depth 16) against the /plaintext endpoint.
**Without (151,417.5 RPS over 4 runs)**
```
-bash-4.2$ ./wrk -c 256 -t 32 -d 10 -s ./scripts/pipeline.lua http://10.8.2.111:5000/plaintext -- 16
Running 10s test @ http://10.8.2.111:5000/plaintext
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    20.57ms   51.24ms 524.91ms   90.94%
    Req/Sec     4.83k   689.94    10.74k    82.19%
  1546313 requests in 10.10s, 1.70GB read
Requests/sec: 153109.54
Transfer/sec:    172.16MB
-bash-4.2$
-bash-4.2$ ./wrk -c 256 -t 32 -d 10 -s ./scripts/pipeline.lua http://10.8.2.111:5000/plaintext -- 16
Running 10s test @ http://10.8.2.111:5000/plaintext
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    47.06ms  123.84ms   1.79s    92.94%
    Req/Sec     4.83k     1.05k   12.77k    79.97%
  1523088 requests in 10.10s, 1.67GB read
  Socket errors: connect 0, read 0, write 0, timeout 5
Requests/sec: 150808.75
Transfer/sec:    169.57MB
-bash-4.2$ ./wrk -c 256 -t 32 -d 10 -s ./scripts/pipeline.lua http://10.8.2.111:5000/plaintext -- 16
Running 10s test @ http://10.8.2.111:5000/plaintext
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    23.20ms   49.25ms 409.99ms   89.02%
    Req/Sec     4.68k     0.86k   20.28k    86.68%
  1498335 requests in 10.10s, 1.65GB read
Requests/sec: 148345.87
Transfer/sec:    166.81MB
-bash-4.2$ ./wrk -c 256 -t 32 -d 10 -s ./scripts/pipeline.lua http://10.8.2.111:5000/plaintext -- 16
Running 10s test @ http://10.8.2.111:5000/plaintext
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    22.80ms   49.43ms 414.15ms   89.17%
    Req/Sec     4.85k     0.87k   13.07k    84.68%
  1549396 requests in 10.10s, 1.70GB read
Requests/sec: 153405.85
Transfer/sec:    172.50MB
```
**With MiniProfiler Enabled (115,415.35 RPS over 4 runs)**
```
-bash-4.2$ ./wrk -c 256 -t 32 -d 10 -s ./scripts/pipeline.lua http://10.8.2.111:5000/plaintext -- 16
Running 10s test @ http://10.8.2.111:5000/plaintext
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    49.97ms   87.79ms 435.86ms   83.68%
    Req/Sec     3.55k     1.22k    9.67k    77.44%
  1064787 requests in 10.10s, 1.32GB read
Requests/sec: 105422.74
Transfer/sec:    133.33MB
-bash-4.2$ ./wrk -c 256 -t 32 -d 10 -s ./scripts/pipeline.lua http://10.8.2.111:5000/plaintext -- 16
Running 10s test @ http://10.8.2.111:5000/plaintext
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    54.88ms   92.87ms 493.74ms   82.86%
    Req/Sec     3.83k     1.17k   12.74k    81.83%
  1199836 requests in 10.10s, 1.48GB read
Requests/sec: 118794.48
Transfer/sec:    150.25MB
-bash-4.2$ ./wrk -c 256 -t 32 -d 10 -s ./scripts/pipeline.lua http://10.8.2.111:5000/plaintext -- 16
Running 10s test @ http://10.8.2.111:5000/plaintext
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    69.93ms  108.49ms 926.36ms   81.69%
    Req/Sec     3.83k     1.32k   13.59k    77.07%
  1192225 requests in 10.10s, 1.47GB read
Requests/sec: 118075.61
Transfer/sec:    149.33MB
-bash-4.2$ ./wrk -c 256 -t 32 -d 10 -s ./scripts/pipeline.lua http://10.8.2.111:5000/plaintext -- 16
Running 10s test @ http://10.8.2.111:5000/plaintext
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    52.13ms   94.91ms 908.83ms   83.97%
    Req/Sec     3.89k     1.29k   17.44k    83.52%
  1205621 requests in 10.10s, 1.49GB read
Requests/sec: 119368.59
Transfer/sec:    150.97MB
```
So enabling MiniProfiler takes us from about 151,417.5 RPS down to 115,415.35 RPS on the /plaintext test, i.e. (151,417.5 - 115,415.35) / 151,417.5 ≈ 23.8% overhead. Let's see if we can get that down a bit. This issue is for tracking progress and suggestions.
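For anyone digging in: one way to see the per-request cost in isolation (outside Kestrel and the network) is to micro-benchmark the bare start/step/stop path. A hypothetical BenchmarkDotNet sketch, assuming MiniProfiler's default options and that `Stop(discardResults: true)` skips storing the result:

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using StackExchange.Profiling;

public class ProfilerOverheadBench
{
    // Baseline: the work with no profiling at all.
    [Benchmark(Baseline = true)]
    public string Unprofiled() => "Hello, World!";

    // Roughly what the middleware does around each request:
    // start a profiler, time one step, stop it.
    [Benchmark]
    public string Profiled()
    {
        var profiler = MiniProfiler.StartNew("bench");
        string result;
        using (profiler.Step("request"))
        {
            result = "Hello, World!";
        }
        // Discard so storage cost doesn't dominate the measurement.
        profiler.Stop(discardResults: true);
        return result;
    }
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<ProfilerOverheadBench>();
}
```

Numbers from something like this should point at whether the cost is in profiler allocation, timing capture, or result handling, which would help prioritize where to optimize.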