A collection of Snakemake wrappers that are not suitable for the wider snakemake-wrappers config.
The wrappers are designed for use in Snakemake workflows and can be used directly from the GitHub repo.
rule dorado_duplex:
    input:
        "data/reads.fastq.gz"
    output:
        "data/reads.duplex.fastq.gz"
    wrapper:
        "https://github.com/damlab/damlab-wrappers/blob/main/dorado/duplex"

For certain tools, you may need to install the tool environment separately.
git clone https://github.com/damlab/damlab-wrappers.git
cd damlab-wrappers
make install

rule strainline_haplotypes:
    input:
        "data/reads.fastq.gz"
    output:
        "data/haplotypes.fasta"
    params:
        prefix = f"{wrappers_path}/strainline/venv"
    wrapper:
        f"file://{wrappers_path}/strainline/strainline"

A complete end-to-end pipeline for processing Nanopore sequencing data from POD5 files to aligned BAMs and QC reports. Ideal for viral near-full-length (NFL) sequencing projects.
Features:
- Duplex and simplex basecalling with Dorado
- Scatter mode for distributed cluster execution
- Automated demultiplexing
- Reference-based alignment
- Haplotype reconstruction with Strainline
- Deletion block detection for identifying defective proviruses
- Comprehensive QC reporting with MultiQC
Documentation: workflows/proviral_nfl.md
Quick Start:
# Create config and samples.csv, then run:
snakemake --snakefile workflows/proviral_nfl.smk --cores 8 --use-conda

Reconstructs proviral haplotypes from long-read Nanopore data using Strainline, with optional downsampling, VADR viral annotation, GenBank file generation, and per-cohort phylogenetic trees.
Features:
- FASTQ or BAM input (BAM auto-converted via cigarmath/bam2fastx)
- Optional read downsampling with filtlong
- Haplotype reconstruction and reference-based clipping with Strainline
- Three VADR annotation modes: pre-built model, NCBI fetch, or fully offline local files
- GenBank flat file and ASN.1 submission file generation per sample
- Optional per-cohort MSA (MUSCLE) → FastTree → phytreeviz phylogenetic trees
Documentation: workflows/proviral_reconstruction.md
Quick Start:
# Create samples.csv and run.meta.yaml, then run:
snakemake -s workflows/proviral_reconstruction.smk --use-conda --cores 8

An automation pipeline for CRISPResso2 CRISPR editing analysis. Accepts reads from paired or single-end FASTQ files, or from a BAM file (with optional region-level slicing for long-read data). Supports automatic pairwise CRISPRessoCompare runs across labelled experiment and control groups.
Features:
- Flexible read input: paired FASTQ, single-end FASTQ, or BAM file
- BAM region slicing via cigarmath/slice for long-read amplicon extraction
- Amplicon supplied as an inline sequence string or a FASTA file path
- Automatic CRISPRessoAggregate report combining all samples into one summary
- Automatic pairwise CRISPRessoCompare for experiment vs. control groups
- Wrappers fetched from GitHub by default; override with a local path
Documentation: workflows/proviral_crispr.md
Quick Start:
# Create samples.csv (and optionally run.meta.yaml), then run:
snakemake -s workflows/proviral_crispr.smk -d /path/to/run --use-conda --cores 8
# Or via the data_scripts makefile:
make proviral-crispr ROOT=/path/to/run MACHINE=Picotte

This package contains wrappers for the Nanopore dorado tool.
- dorado/duplex: Nanopore dorado duplex basecalling tool. Supports GPU acceleration and optional reference-based alignment.
- dorado/simplex: Nanopore dorado simplex basecalling tool. Supports GPU acceleration and optional reference-based alignment.
- dorado/demux: Nanopore dorado demultiplexing tool. Supports various barcoding kits and custom barcode arrangements.
- dorado/aligner: Nanopore dorado aligner tool. Supports GPU acceleration and optional reference-based alignment.
This package contains wrappers for the POD5 file format.
- pod5/convert_fast5: Convert Oxford Nanopore FAST5 files to the newer POD5 format.
- pod5/split_by_channel: Split POD5 files by channel.
This package contains wrappers for CRISPR editing analysis tools.
- CRISPR/crispresso-core: CRISPResso2 editing quantification for a single amplicon. Accepts amplicon sequence as an inline string parameter or as a FASTA input file.
- CRISPR/crispresso-compare: CRISPRessoCompare pairwise comparison of two CRISPResso output directories (e.g. treated vs. control).
- CRISPR/crispresso-aggregate: CRISPRessoAggregate multi-run aggregation into a single combined HTML report and summary plots. Accepts any number of CRISPResso output directories.
This package contains wrappers for VADR viral annotation and related tools.
- hiv/vadr-genbank: Normalise NCBI GenBank flat files and FASTA files for use with v-build.pl. Rewrites the LOCUS name and strips VERSION suffixes so VADR's accession-matching checks pass.
- hiv/vadr-build: Build a VADR homology model from a GenBank accession. Supports NCBI fetch (online) or local file inputs (offline, via hiv/vadr-genbank).
- hiv/vadr-annotate: Annotate viral sequences using a VADR model directory. Includes mode='hiv' to pre-configure --alt_pass for all recoverable HIV alert codes.
- hiv/vadr-tbl2gbk: Convert VADR's passing FASTA and NCBI 5-column feature table to an annotated GenBank flat file (.gbf) and ASN.1 submission file (.sqn) via table2asn.
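
The normalisation described for hiv/vadr-genbank can be pictured with a short Python sketch. This is not the wrapper's code: the function name `normalise_genbank_header` and the exact rewriting rules are assumptions for illustration, based only on the description above (rewrite the LOCUS name, strip the VERSION suffix).

```python
import re


def normalise_genbank_header(text, accession):
    """Rewrite the LOCUS name to `accession` and drop the '.N' VERSION suffix.

    Hypothetical helper: it only touches the LOCUS and VERSION lines.
    """
    out = []
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            match = re.match(r"(LOCUS\s+)(\S+)(\s*)(.*)", line)
            prefix, name, gap, rest = match.groups()
            # Keep the original column width when the new name fits;
            # otherwise leave a single space before the next field.
            width = max(len(name) + len(gap), len(accession) + 1)
            line = prefix + accession.ljust(width) + rest
        elif line.startswith("VERSION"):
            # 'VERSION     K03455.1' -> 'VERSION     K03455'
            line = re.sub(r"^(VERSION\s+\S+?)\.\d+\b", r"\1", line)
        out.append(line)
    return "\n".join(out)


record = (
    "LOCUS       HIVHXB2CG               9719 bp    RNA     linear   VRL\n"
    "ACCESSION   K03455\n"
    "VERSION     K03455.1\n"
)
normalised = normalise_genbank_header(record, "K03455")
print(normalised)
```

After this step the LOCUS name, ACCESSION, and VERSION fields all carry the same bare accession, which is the condition the description says VADR's accession-matching checks need.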
This package contains wrappers for multiple sequence alignment tools.
MSA/muscle: MUSCLE multiple sequence alignment tool.
This package contains wrappers for the Strainline reference-free haplotype assembly tool.
This package requires the strainline tool to be installed via the provided makefile.
- strainline/strainline: Strainline haplotype assembly tool.
- strainline/clipqs: ClipQS tool to orient and clip sequences generated by the Strainline tool.
This package contains wrappers that extend the Seqkit tool.
seqkit/primercheck: Seqkit primer checking tool. Given a primer file and a set of reads, it will check if the reads contain the primers and return the amplicon length.
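
The idea behind the primer check can be sketched in a few lines of Python. This is a conceptual illustration only, not the wrapper's implementation: the function names are hypothetical, and the sketch assumes exact primer matches on the forward strand, with the reverse primer appearing in the read as its reverse complement.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def amplicon_length(read, fwd_primer, rev_primer):
    """Return the primer-to-primer amplicon length, or None if a primer is missing.

    Assumption: exact matching only; mismatches and reads on the
    opposite strand are not handled in this sketch.
    """
    start = read.find(fwd_primer)
    if start == -1:
        return None
    rev_site = reverse_complement(rev_primer)
    # Look for the reverse primer site downstream of the forward primer.
    end = read.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return end + len(rev_site) - start


read = "TT" + "ACGTAC" + "GGGGG" + "GAGCTT" + "CC"
print(amplicon_length(read, "ACGTAC", "AAGCTC"))
```

The returned length includes both primers, so a read with a 5-base insert between two 6-base primers reports 17.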
This package contains wrappers for the Cigarmath library.
- cigarmath/deletion_frequency: Calculate the deletion frequency of a given region.
- cigarmath/deletion_block_detection: Detect large deletion blocks in aligned BAM files, useful for identifying defective proviruses.
- cigarmath/pileup: Calculate per-position coverage depth.
- cigarmath/slice: Extract reads overlapping a genomic region from a BAM file, slicing each read so only bases covering the target window are returned.
- cigarmath/bam2fastx: Convert a BAM/SAM file to FASTA or FASTQ format.
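
As a rough picture of what deletion block detection involves, the sketch below walks a single read's CIGAR string and reports large deletions as reference intervals. It does not use the Cigarmath API: `deletion_blocks`, the 0-based half-open coordinates, and the `min_size=50` threshold are all assumptions chosen for illustration.

```python
import re

# SAM CIGAR operations: a length followed by one operation character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def deletion_blocks(ref_start, cigar, min_size=50):
    """Return (start, end) reference intervals for deletions >= min_size bases.

    ref_start is the 0-based alignment start; intervals are half-open.
    """
    blocks = []
    pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "D" and length >= min_size:
            blocks.append((pos, pos + length))
        if op in REF_CONSUMING:
            pos += length
    return blocks


print(deletion_blocks(100, "50M200D30M5I20M10D40M"))
```

Insertions and clipped bases do not advance the reference position, which is why only the operations in `REF_CONSUMING` move `pos`.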
This package contains wrappers for phylogenetic tree construction tools.
- phylo/FastTree: FastTree phylogenetic tree construction tool.
- phylo/phytreeviz: Phytreeviz tool for visualizing phylogenetic trees.
- phylo/reroot: Rerooting tool to reroot a tree using dendropy.
This package contains wrappers for the barcode and UMI tools.
- barcode/extract: Extract barcodes and UMIs from a BAM file.
- barcode/correct: Correct barcodes in a BAM file.
This package contains wrappers for various AI tools.
huggingface/hiv-bert: Run HIV-BERT models on provided sequences.
A collection of snakemake wrappers of Python APIs for visualization.
visualization/jaspar2logo: Create sequence logos from frequency counts.
picard/addorreplacereadgroups: Picard tool to add or replace read groups in a BAM file. Slightly modified to have an easier way to set parameters.
Run make test to run all tests.