This command tags (assigns) each read in the BAM file to one haplotype in the phased SNP VCF, i.e., reads are tagged as HP:i:1 or HP:i:2. In addition, the haplotype block of each read is stored in the PS tag. The phased VCF can also be generated by other programs as long as the PS or HP tags are encoded. The user can specify --log to additionally output a plain-text file containing the haplotype tag of each read, so the tags can be inspected without parsing the BAM.
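As an illustration of what the tagged output contains, the sketch below pulls the HP and PS tags out of one alignment record in SAM text form (for example, a line printed by `samtools view`). It is a minimal sketch, not part of the tool: the function name `read_haplotag` and the example record are hypothetical, and it assumes both tags are stored as integers, as in HP:i:1.

```python
def read_haplotag(sam_line):
    """Return (read_name, haplotype, phase_set) for one SAM text record.

    haplotype and phase_set are None when the read carries no HP or PS tag.
    """
    fields = sam_line.rstrip("\n").split("\t")
    name = fields[0]
    tags = {}
    # Optional fields start at column 12 and have the form TAG:TYPE:VALUE.
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        tags[tag] = value
    hp = int(tags["HP"]) if "HP" in tags else None
    ps = int(tags["PS"]) if "PS" in tags else None
    return name, hp, ps


# Hypothetical records, made up for illustration only.
tagged = "\t".join(
    ["read1", "0", "chr1", "10001", "60", "5M", "*", "0", "0",
     "ACGTA", "IIIII", "HP:i:1", "PS:i:10001"]
)
untagged = "\t".join(
    ["read2", "0", "chr1", "20001", "60", "5M", "*", "0", "0",
     "ACGTA", "IIIII"]
)

print(read_haplotag(tagged))
print(read_haplotag(untagged))
```

A read that could not be assigned to a haplotype simply has no HP tag, which the sketch reports as `None`.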
longphase-to haplotag \
-r reference.fasta \
-s phased_snp.vcf \
-b alignment.bam \
-t 8 \
-o tagged_bam_prefix

Usage: haplotag [OPTION] ... READSFILE
      --help                          display this help and exit.

require arguments:
      -s, --snp-file=NAME             input SNP vcf file.
      -b, --bam-file=NAME             input bam file.
      -r, --reference=NAME            reference fasta.

optional arguments:
      --tagSupplementary              tag supplementary alignment. default:false
      -q, --qualityThreshold=Num      not tag alignment if the mapping quality less than threshold. default:1
      -p, --percentageThreshold=Num   the alignment will be tagged according to the haplotype corresponding to most alleles.
                                      if the alignment has no obvious corresponding haplotype, it will not be tagged. default:0.6
      -t, --threads=Num               number of thread. default:1
      -o, --out-prefix=NAME           prefix of phasing result. default:result
      --cram                          the output file will be in the cram format. default:bam
      --region=REGION                 tagging include only reads/variants overlapping those regions. default:""(all regions)
                                      input format:chrom (consider entire chromosome)
                                                   chrom:start (consider region from this start to end of chromosome)
                                                   chrom:start-end
      --log                           an additional log file records the result of each read. default:false
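
The options above can be read more concretely through a small sketch. The code below is not taken from the tool's source: `parse_region` and `choose_haplotype` are hypothetical helpers that show one plausible reading of the three `--region` forms and of how `-q` and `-p` might decide whether a read is tagged. In particular, it assumes the percentage is the share of phased alleles on the read that support the majority haplotype, and that a tie leaves the read untagged.

```python
def parse_region(region):
    """Split a --region string into (chrom, start, end).

    Missing parts are returned as None:
      "chrom"           -> whole chromosome
      "chrom:start"     -> from start to the end of the chromosome
      "chrom:start-end" -> bounded interval
    Assumes the chromosome name itself contains no colon.
    """
    if ":" not in region:
        return region, None, None
    chrom, _, span = region.rpartition(":")
    if "-" in span:
        start, end = span.split("-", 1)
        return chrom, int(start), int(end)
    return chrom, int(span), None


def choose_haplotype(hp1_alleles, hp2_alleles, mapq,
                     quality_threshold=1, percentage_threshold=0.6):
    """Return 1, 2, or None (untagged) for one alignment.

    hp1_alleles / hp2_alleles: number of phased SNP alleles on the read
    that match haplotype 1 / haplotype 2 (assumed inputs).
    """
    # -q: skip alignments whose mapping quality is below the threshold.
    if mapq < quality_threshold:
        return None
    total = hp1_alleles + hp2_alleles
    if total == 0 or hp1_alleles == hp2_alleles:
        return None
    # -p: the majority haplotype must account for a large enough share.
    best = max(hp1_alleles, hp2_alleles)
    if best / total < percentage_threshold:
        return None
    return 1 if hp1_alleles > hp2_alleles else 2


print(parse_region("chr1:1000-2000"))
print(choose_haplotype(8, 2, mapq=60))
print(choose_haplotype(5, 4, mapq=60))
```

Under these assumptions, a read with 8 alleles matching haplotype 1 and 2 matching haplotype 2 is tagged as haplotype 1 (share 0.8), while a 5-to-4 split (share about 0.56) falls below the default of 0.6 and stays untagged.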