wwood
diff --git a/README.md b/README.md
Lines changed: 55 additions & 0 deletions
diff --git a/demo/genome_1.fna b/demo/genome_1.fna
Lines changed: 72967 additions & 0 deletions
diff --git a/demo/genome_2.fna b/demo/genome_2.fna
Lines changed: 60313 additions & 0 deletions
diff --git a/demo/genome_3.fna b/demo/genome_3.fna
Lines changed: 76544 additions & 0 deletions
diff --git a/demo/genome_4.fna b/demo/genome_4.fna
Lines changed: 51851 additions & 0 deletions
diff --git a/demo/genome_5.fna b/demo/genome_5.fna
Lines changed: 53947 additions & 0 deletions
diff --git a/demo/genome_6.fna b/demo/genome_6.fna
Lines changed: 72372 additions & 0 deletions
diff --git a/demo/genome_7.fna b/demo/genome_7.fna
Lines changed: 30474 additions & 0 deletions
diff --git a/demo/genome_8.fna b/demo/genome_8.fna
Lines changed: 89586 additions & 0 deletions
diff --git a/demo/output_coverm.tsv b/demo/output_coverm.tsv
Lines changed: 10 additions & 0 deletions
@@ -10,6 +10,7 @@
 		- [Shell completion](#shell-completion)
 	- [Usage](#usage)
 	- [Calculation methods](#calculation-methods)
+	- [Demo](#demo)
 	- [Citation](#citation)
 	- [License](#license)
 
@@ -163,6 +164,60 @@ is in a genome with 2,000,000bp contig with no reads mapped, then the
 trimmed_mean will be 0 as all positions in the 2000bp are in the top 5% of
 positions sorted by coverage.
 
+## Demo
+
+Download a test dataset of 8 genomes and 1 sample of paired-end reads:
+
+```bash
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/sample_1.1.fq.gz
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/sample_1.2.fq.gz
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/genome_1.fna
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/genome_2.fna
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/genome_3.fna
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/genome_4.fna
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/genome_5.fna
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/genome_6.fna
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/genome_7.fna
+wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/genome_8.fna
+```
+
+Run CoverM:
+
+```bash
+coverm genome \
+  --coupled sample_1.1.fq.gz sample_1.2.fq.gz \
+  --genome-fasta-files \
+    genome_1.fna genome_2.fna genome_3.fna genome_4.fna \
+    genome_5.fna genome_6.fna genome_7.fna genome_8.fna \
+  -t 8 \
+  -m mean relative_abundance covered_fraction \
+  -o output_coverm.tsv
+```
+
+This should have created the file `output_coverm.tsv` and logged the following message:
+`coverm::genome] In sample 'sample_1.1.fq.gz', found 48254 reads mapped out of 100000 total (48.25%)`.
+This indicates that 48.25% of the reads from our sample mapped to the genomes, so the provided genomes account for about half of the sequence data in the sample.
+
+Looking in `output_coverm.tsv`, we find columns with the following headings:
+
+- `Genome`: The name of the genome
+- `sample_1.1.fq.gz Mean`: The mean read coverage from sample_1 across the given genome, i.e. the average height across the genome if the aligned reads were stacked on top of each other.
+- `sample_1.1.fq.gz Relative Abundance (%)`: The relative abundance of the genome within sample_1. This metric accounts for differing genome sizes by using the proportion of mean coverage rather than the proportion of reads.
+- `sample_1.1.fq.gz Covered Fraction`: The proportion of the genome that is covered by at least one read.
+
+Each row represents a genome, and the columns represent the coverage metrics calculated for that genome for each provided sample.
+For instance, the row for `genome_1` shows a mean coverage of `0.941`, a relative abundance of `25.9`%, and a covered fraction of `0.528`.
+By contrast, the row for `genome_5` shows a mean coverage of `0`, a relative abundance of `0`%, and a covered fraction of `0`.
+This indicates that `genome_1` is well represented in the sample, while `genome_5` was not detected at all.
+Three other genomes (`genome_2` to `genome_4`) have lower but non-zero coverage, and the remaining three (`genome_6` to `genome_8`) have zero coverage.
+
+You may have noticed that the covered fraction for most genomes is rather low. This is because the sample has been sub-sampled to 100,000 reads.
+The full sample has 76,618,686 reads and produces covered fractions of 1 for all present genomes. Notably, the relative abundances from the full and sub-sampled data are very similar.
+The output from the full sample can be downloaded as follows: `wget https://raw.githubusercontent.com/wwood/CoverM/refs/heads/main/demo/output_coverm_full.tsv`
+
+There is an additional row named `unmapped`, which represents the reads that did not map to any of the provided genomes.
+Among the metrics we selected, only relative abundance applies to this row, and we can see that about 51.7% of the reads were unmapped.
+
 ## Citation
 
 If you use CoverM in your research, please cite the following publication:
 
@@ -0,0 +1,10 @@
+Genome	sample_1.1.fq.gz Mean	sample_1.1.fq.gz Relative Abundance (%)	sample_1.1.fq.gz Covered Fraction
+unmapped	NA	51.746	NA
+genome_1	0.9410575	25.87694	0.52770287
+genome_2	0.40274984	11.074703	0.27789244
+genome_3	0.20988818	5.7714467	0.15818907
+genome_4	0.20114066	5.5309105	0.1509256
+genome_5	0	0	0
+genome_6	0	0	0
+genome_7	0	0	0
+genome_8	0	0	0
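
The relative abundances in the table above can be cross-checked from the other columns. The sketch below assumes that a genome's relative abundance is its share of the summed mean coverage, scaled by the percentage of reads that mapped (100 minus the `unmapped` value). That is a reading consistent with the numbers in this demo, not a statement of CoverM's exact implementation. The function name `relative_abundance` and the temporary file are hypothetical and used only for illustration.

```bash
# Recreate the demo table shown above (values copied from output_coverm.tsv).
tsv="$(mktemp)"
printf '%s\t%s\t%s\t%s\n' \
  'Genome' 'sample_1.1.fq.gz Mean' 'sample_1.1.fq.gz Relative Abundance (%)' 'sample_1.1.fq.gz Covered Fraction' \
  unmapped NA 51.746 NA \
  genome_1 0.9410575 25.87694 0.52770287 \
  genome_2 0.40274984 11.074703 0.27789244 \
  genome_3 0.20988818 5.7714467 0.15818907 \
  genome_4 0.20114066 5.5309105 0.1509256 \
  genome_5 0 0 0 \
  genome_6 0 0 0 \
  genome_7 0 0 0 \
  genome_8 0 0 0 > "$tsv"

# Hypothetical helper: recompute relative abundance (%) from mean coverage.
# Assumption: abundance = (mean / sum of means) * mapped-read percentage.
relative_abundance() {
  awk -F'\t' '
    NR == 1 { next }                               # skip the header row
    $1 == "unmapped" { mapped = 100 - $3; next }   # percentage of reads mapped
    { name[++n] = $1; mean[n] = $2; total += $2 }  # collect per-genome means
    END {
      for (i = 1; i <= n; i++)
        printf "%s\t%.2f\n", name[i], (total > 0 ? mean[i] / total * mapped : 0)
    }
  ' "$1"
}

relative_abundance "$tsv"
```

The recomputed values can be compared against the `Relative Abundance (%)` column. To check a real run, pass the path of your own `output_coverm.tsv` to the function instead of the temporary file.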