This repository was archived by the owner on Mar 16, 2022. It is now read-only.

Sequel II System Data Release: HG002 SV and SNVs (HiFi Reads powered by CCS)

tkerelska edited this page Mar 13, 2019 · 11 revisions

SAMPLE

GIAB HG002 extracted DNA

METHODS

  • Shearing: 15 kb with Megaruptor
  • Library prep: TPK 1.0
  • Size selection: Fraction 4 (11 kb) with Sage ELF
  • Sequencing: Sequel II System with "Early Access" binding kit (101-490-800) and chemistry (101-490-900)
  • Run time: 12-hour pre-extension; 30-hour movie
  • CCS: SMRT Link 6.1 "Early Access" Circular Consensus Sequence Analysis (ccs v3.2.1)
  • Reference: hs37d5 (GRCh37 with decoy)
  • Alignment: pbmm2 --preset CCS
  • Variant calling: GATK v4.0.10.1 HaplotypeCaller
  • Variant phasing: WhatsHap v0.17
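
The alignment, variant calling, and phasing steps listed above can be sketched as a chain of command lines. This is a minimal sketch, not the exact commands used for this release: only `--preset CCS` comes from the methods, while the file names, the `--sort` flag, and every other argument are assumptions chosen for illustration. The function only builds the commands, so nothing runs unless you pass them to `subprocess.run`.

```python
from typing import List


def build_pipeline(reference: str, ccs_bam: str, prefix: str) -> List[List[str]]:
    """Return command lines for alignment, variant calling, and phasing.

    All paths are hypothetical placeholders; only `--preset CCS` is taken
    from the methods list.
    """
    aligned = f"{prefix}.aligned.bam"
    vcf = f"{prefix}.vcf.gz"
    phased = f"{prefix}.phased.vcf"
    return [
        # Align CCS reads to the reference and write a sorted BAM.
        ["pbmm2", "align", reference, ccs_bam, aligned, "--preset", "CCS", "--sort"],
        # Call small variants with GATK4 HaplotypeCaller.
        ["gatk", "HaplotypeCaller", "-R", reference, "-I", aligned, "-O", vcf],
        # Phase the called variants using the aligned reads.
        ["whatshap", "phase", "--reference", reference, "-o", phased, vcf, aligned],
    ]


# Hypothetical input names, for illustration only.
commands = build_pipeline("hs37d5.fa", "HG002.ccs.bam", "HG002")
for cmd in commands:
    print(" ".join(cmd))
```

Keeping each command as a list of arguments avoids shell quoting problems if the commands are later executed.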

FOLDERS

  • subreads: Basecalled reads and metadata for three Sequel II SMRT Cells 8M loaded with 11 kb HG002 libraries
  • consensusreads: Circular consensus (CCS) reads and metadata for the runs above
  • consensusalignments: CCS reads above, aligned to hs37d5 with pbmm2
  • gatk4hc: Small variants called with GATK4 HaplotypeCaller and phased with WhatsHap
    • /GIAB_small_variant_v3.3.2_benchmark: Benchmarked against GIAB small variant v3.3.2 with hap.py
    • /GIAB_phasing_benchmark: Benchmarked against 10X/Trio phased variant set
  • pbsv: Structural variants called with SMRT Link Structural Variant Calling (powered by pbsv)
    • /truvari-giab-v0.6: Benchmarked against GIAB structural variant v0.6 with Truvari
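
Benchmarking tools such as hap.py and Truvari summarize a comparison against a truth set as precision, recall, and F1. The sketch below shows how those three numbers follow from true-positive, false-positive, and false-negative counts. It is a simplification that uses a single true-positive count, and the counts in the example are hypothetical, not results from this release.

```python
def benchmark_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall, and F1 from comparison counts."""
    # Precision: fraction of called variants that match the truth set.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of truth-set variants that were called.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = benchmark_metrics(tp=90, fp=10, fn=30)
print(metrics)
```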
    

DOWNLOAD

URLs and md5 checksums are listed in URLs.txt

Note: The subreads directory is 1.2 TB and contains the basecalled Sequel II data in unaligned BAM format.

Download Data: https://downloads.pacbcloud.com/public/dataset/HG002_SV_and_SNV_CCS/
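
Since md5 checksums are published alongside the URLs, a downloaded file can be checked before use. The sketch below is one way to do that in Python. It assumes you have already looked up the expected checksum for a file yourself; the layout of URLs.txt is not described here, so no parsing of it is attempted. The names `md5sum` and `verify` are illustrative.

```python
import hashlib


def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large BAM files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_md5: str) -> bool:
    """Compare a file's md5 digest with the expected hex string."""
    return md5sum(path) == expected_md5.strip().lower()
```

For example, `verify("downloaded.bam", expected)` returns `True` only when the file on disk matches the checksum string passed as `expected`; both names here are placeholders.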
