diff --git a/cmd/benchstat/README.md b/cmd/benchstat/README.md
deleted file mode 100644
index 0a299af..0000000
--- a/cmd/benchstat/README.md
+++ /dev/null
@@ -1,85 +0,0 @@
-# Benchstat
-
-Benchstat computes and compares statistics about benchmarks.
-
-Usage:
-
-    benchstat [options] old.txt [new.txt] [more.txt ...]
-
-Run `benchstat -h` for the list of supported options.
-
-Each input file should contain the concatenated output of a number of runs
-of `go test -bench`. For each different benchmark listed in an input file,
-benchstat computes the mean, minimum, and maximum run time, after removing
-outliers using the interquartile range rule.
-
-If invoked on a single input file, benchstat prints the per-benchmark
-statistics for that file.
-
-If invoked on a pair of input files, benchstat adds to the output a column
-showing the statistics from the second file and a column showing the percent
-change in mean from the first to the second file. Next to the percent
-change, benchstat shows the p-value and sample sizes from a test of the two
-distributions of benchmark times. Small p-values indicate that the two
-distributions are significantly different. If the test indicates that there
-was no significant change between the two benchmarks (defined as p > 0.05),
-benchstat displays a single ~ instead of the percent change.
-
-The -delta-test option controls which significance test is applied: utest
-(Mann-Whitney U-test), ttest (two-sample Welch t-test), or none. The default
-is the U-test, sometimes also referred to as the Wilcoxon rank sum test.
-
-If invoked on more than two input files, benchstat prints the per-benchmark
-statistics for all the files, showing one column of statistics for each
-file, with no column for percent change or statistical significance.
-
-The -html option causes benchstat to print the results as an HTML table.
-
-## Example
-
-Suppose we collect benchmark results from running `go test -bench=Encode`
-five times before and after a particular change.
-
-The file old.txt contains:
-
-    BenchmarkGobEncode   	100	  13552735 ns/op	  56.63 MB/s
-    BenchmarkJSONEncode  	 50	  32395067 ns/op	  59.90 MB/s
-    BenchmarkGobEncode   	100	  13553943 ns/op	  56.63 MB/s
-    BenchmarkJSONEncode  	 50	  32334214 ns/op	  60.01 MB/s
-    BenchmarkGobEncode   	100	  13606356 ns/op	  56.41 MB/s
-    BenchmarkJSONEncode  	 50	  31992891 ns/op	  60.65 MB/s
-    BenchmarkGobEncode   	100	  13683198 ns/op	  56.09 MB/s
-    BenchmarkJSONEncode  	 50	  31735022 ns/op	  61.15 MB/s
-
-The file new.txt contains:
-
-    BenchmarkGobEncode   	 100	  11773189 ns/op	  65.19 MB/s
-    BenchmarkJSONEncode  	  50	  32036529 ns/op	  60.57 MB/s
-    BenchmarkGobEncode   	 100	  11942588 ns/op	  64.27 MB/s
-    BenchmarkJSONEncode  	  50	  32156552 ns/op	  60.34 MB/s
-    BenchmarkGobEncode   	 100	  11786159 ns/op	  65.12 MB/s
-    BenchmarkJSONEncode  	  50	  31288355 ns/op	  62.02 MB/s
-    BenchmarkGobEncode   	 100	  11628583 ns/op	  66.00 MB/s
-    BenchmarkJSONEncode  	  50	  31559706 ns/op	  61.49 MB/s
-    BenchmarkGobEncode   	 100	  11815924 ns/op	  64.96 MB/s
-    BenchmarkJSONEncode  	  50	  31765634 ns/op	  61.09 MB/s
-
-The order of the lines in the file does not matter, except that the output
-lists benchmarks in order of appearance.
-
-If run with just one input file, benchstat summarizes that file:
-
-    $ benchstat old.txt
-    name        time/op
-    GobEncode   13.6ms ± 1%
-    JSONEncode  32.1ms ± 1%
-
-If run with two input files, benchstat summarizes and compares:
-
-    $ benchstat old.txt new.txt
-    name        old time/op  new time/op  delta
-    GobEncode   13.6ms ± 1%  11.8ms ± 1%  -13.31% (p=0.016 n=4+5)
-    JSONEncode  32.1ms ± 1%  31.8ms ± 1%     ~    (p=0.286 n=4+5)
-
-Note that the JSONEncode result is reported as statistically insignificant
-instead of a -0.93% delta.
diff --git a/cmd/benchstat/main.go b/cmd/benchstat/main.go
index b882c06..7c6e76d 100644
--- a/cmd/benchstat/main.go
+++ b/cmd/benchstat/main.go
@@ -79,7 +79,6 @@
 //	name        time/op
 //	GobEncode   13.6ms ± 1%
 //	JSONEncode  32.1ms ± 1%
-//	$
 //
 // If run with two input files, benchstat summarizes and compares:
 //
@@ -87,11 +86,77 @@
 //	name        old time/op  new time/op  delta
 //	GobEncode   13.6ms ± 1%  11.8ms ± 1%  -13.31% (p=0.016 n=4+5)
 //	JSONEncode  32.1ms ± 1%  31.8ms ± 1%     ~    (p=0.286 n=4+5)
-//	$
 //
 // Note that the JSONEncode result is reported as
 // statistically insignificant instead of a -0.93% delta.
 //
+// An example benchmarking workflow, written as a Unix shell script:
+//
+//	oldBin=/tmp/benchmarkBinaryOld
+//	newBin=/tmp/benchmarkBinaryNew
+//	old=/tmp/benchmarkReportOld
+//	new=/tmp/benchmarkReportNew
+//	result=/tmp/benchstatReport
+//
+//	# Build the test executable for the old code.
+//	go test -c -o "$oldBin"
+//
+//	# Switch to the changed code (here, a branch named "fixes").
+//	git checkout fixes
+//
+//	# Build the test executable for the new code.
+//	go test -c -o "$newBin"
+//
+//	# Run tests and benchmarks, alternating between the two binaries.
+//	for i in 0 1 2 3 4 5 6 7 8 9 10 11 12 13; do
+//		printf 'Tests %s starting.\n' "$i"
+//		"$oldBin" -test.bench . >> "$old"
+//		"$newBin" -test.bench . >> "$new"
+//	done
+//
+//	# Create final report with benchstat.
+//	benchstat "$old" "$new" > "$result"
+//
+// Possible variations include disabling tests (pass "-run -" to go
+// test, or "-test.run -" to a test binary), running three instead of
+// two benchmark executables in the loop, and raising the scheduling
+// priority by lowering niceness. Better still, run the binaries
+// under a real-time scheduling policy (see sched_setscheduler and
+// SCHED_FIFO). On Linux, if the chrt program is available, run the
+// test binary under the SCHED_FIFO policy like so:
+//
+//	chrt -f 50 testBinary -test.bench regexp >> out
+//
+// Be aware, though, that a real-time scheduling policy lets a thread
+// run for as long as it "wants". A thread of the running testBinary
+// process, or of one of its children, can therefore monopolize a CPU
+// core. If testBinary or its children are malicious or simply buggy,
+// the result is effectively a denial-of-service attack on your
+// computer.
+//
+// Other Linux-specific tips for reducing benchmark noise include
+// disabling address space layout randomization and disabling Intel
+// turbo mode (both commands require root):
+//
+//	printf 0 > /proc/sys/kernel/randomize_va_space
+//	printf 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
+//
+// If your computer has sufficient cooling, set the Linux "performance"
+// frequency scaling governor for all cores:
+//
+//	for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
+//		printf performance > "$f"
+//	done
+//
+// If your computer has insufficient cooling, lower the maximum
+// frequency of all CPU cores:
+//
+//	# Read the minimum frequency (in kHz) of the first core.
+//	minFreq=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq)
+//
+//	for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq; do
+//		printf '%s' "$minFreq" > "$f"
+//	done
 package main
 
 import (