Last updated: 2025-02-12 Version: 0.9.0
Coverage and total number of variants
  • Sample name: COLO829_60-30
  • Mean tumor coverage (fold): 63.04
  • Mean normal coverage (fold): 32.81
  • Total number of small variants: 45162
    • Total number of SNVs: 43114
    • Total number of small deletions: 1632
    • Total number of small insertions: 416
  • Total number of structural variants (SV): 196
BND DEL DUP INS
68 37 8 20
Tumor purity and ploidy
  • Note that purity and ploidy estimation can be unreliable at low coverage (<30X) and low tumor purity (<50%)
  • Estimated tumor purity and ploidy solutions from Purple:
purity ploidy gender wholeGenomeDuplication
0.99 3 MALE TRUE
Homologous recombination deficiency prediction (CHORD HRD)
  • CHORD has not been tested extensively on long-read datasets, so the prediction may not be accurate.
  • In particular, we observed that CHORD can give incorrect predictions for samples with < 15X effective tumor coverage (effective tumor coverage = tumor coverage * tumor purity).
Sample | Probability of BRCA1-type HRD | Probability of BRCA2-type HRD | Probability of HRD | HRD status | HRD type | Remarks on HRD status | Remarks on HRD type
COLO829_60-30 | 0 | 0 | 0 | HR_proficient | none | NA | NA
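The effective tumor coverage formula in the notes above can be checked with a short calculation. This is a minimal sketch, not part of the pipeline: the function names and the `MIN_EFFECTIVE_COVERAGE` constant are hypothetical, and the 15X threshold is taken from the note above.

```python
# Hypothetical helper names; only the formula and the 15X threshold
# come from the report notes.
MIN_EFFECTIVE_COVERAGE = 15.0


def effective_tumor_coverage(tumor_coverage: float, tumor_purity: float) -> float:
    """Return tumor coverage * tumor purity (purity given as a fraction in [0, 1])."""
    if not 0.0 <= tumor_purity <= 1.0:
        raise ValueError("tumor_purity must be a fraction between 0 and 1")
    return tumor_coverage * tumor_purity


def chord_coverage_ok(tumor_coverage: float, tumor_purity: float) -> bool:
    """True when the effective coverage reaches the 15X level mentioned in the notes."""
    return effective_tumor_coverage(tumor_coverage, tumor_purity) >= MIN_EFFECTIVE_COVERAGE


if __name__ == "__main__":
    # Values reported for this sample: 63.04X mean tumor coverage, purity 0.99.
    print(round(effective_tumor_coverage(63.04, 0.99), 2))
    print(chord_coverage_ok(63.04, 0.99))
```

For this sample the effective coverage is well above 15X, so the low-coverage caveat does not apply here; the caveat about long-read data still does.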
  • For visualization purposes, the plotted major copy number is capped at 5.
  • CNV tool used: Purple.
Small variants (SNV/INDEL) coverage and variant allele frequency (VAF) distribution
Mutational signatures
  • Mutational signatures are estimated using the R package MutationalPatterns, based on SNVs only (INDELs are ignored).
Notes on small variants (SNV/INDEL) filtering
  • Variants are retained if they meet any of the following criteria:
    • IMPACT is HIGH or MODERATE
    • CLIN_SIG contains pathogenic (pathogenic intronic variants are retained)
    • CANCER_TYPE is not NA (variants in genes listed in the IntOGen Cancer Gene Census)
    • MAX_AF (maximum population allele frequency) is less than 3%
  • CANCER_TYPE_ROLE and CANCER_TYPE_CGC_GENE are columns merged from CANCER_TYPE, ROLE and CGC_CANCER_GENE. Multiple values are collapsed into a single entry separated by semicolons to keep the table readable. For example, CANCER_TYPE = “Breast;Prostate” and ROLE = “LoF;Act” means that the gene is a LoF in breast cancer and an Act in prostate cancer.
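To read the semicolon-separated entries position by position, the values can be split and paired. The sketch below only illustrates that interpretation; the function name is hypothetical and it does not reproduce the exact string format the report uses for the merged columns.

```python
def pair_semicolon_columns(cancer_type: str, role: str) -> list[tuple[str, str]]:
    """Pair the i-th cancer type with the i-th role.

    Assumes both strings hold the same number of semicolon-separated
    values, as in the example given in the notes.
    """
    types = cancer_type.split(";")
    roles = role.split(";")
    if len(types) != len(roles):
        raise ValueError("CANCER_TYPE and ROLE have different numbers of entries")
    return list(zip(types, roles))


if __name__ == "__main__":
    for cancer, role in pair_semicolon_columns("Breast;Prostate", "LoF;Act"):
        print(f"{cancer}: {role}")
```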
Small variants (SNV/INDEL) table
Notes on structural variants (SVs)
  • SVs are filtered to only those that are part of the IntOGen Cancer Gene Census (CGC)
  • Annotation is based on AnnotSV. However, to keep the output readable, some columns with very long content (e.g. “_coord” and “_source”) are removed. Please refer to the original AnnotSV output for more information.
    • AnnotSV converts square bracketed notation using the harmonization rule from variant-extractor, which may result in wrong conversion, especially in BND to DEL conversion.
  • Columns with upper-case names are from IntOGen CGC. Please see the README from the IntOGen release for more information.
    • CANCER_TYPE_ROLE and CANCER_TYPE_CGC_GENE are columns merged from CANCER_TYPE, ROLE and CGC_CANCER_GENE. Multiple values are collapsed into a single entry separated by semicolons to keep the table readable. For example, CANCER_TYPE = “Breast;Prostate” and ROLE = “LoF;Act” means that the gene is a LoF in breast cancer and an Act in prostate cancer.
  • Each SV can affect multiple genes. AnnotSV “splits” the different genes into different entries. This is why there are multiple rows with the same AnnotSV_ID.
  • The ALT allele for insertions is shown as “Too long” in the table. Please refer to the original AnnotSV output for more information.
  • Note that Severus can call a duplication as a BND event, and AnnotSV tends to annotate these as DEL events because it does not use the “STRAND” information. Therefore, the “SV_type” column is not very accurate for BND events (these can be recognized by SEVERUS_BND in the ID column).
  • The “SAMPLE” column represents the FORMAT column in the VCF. For Severus this is “GT:GQ:VAF:hVAF:DR:DV”
  • Translocations are defined as all BND pairs that are at least 100 kbp apart:
    • Limitation: It is possible for large duplication or deletion events to be called as BND events.
  • BND pairs are connected by a line:
    • Red line: BND pairs that are in the Mitelman fusion database.
    • Black line: BND pairs involving known genes in the GFF supplied to svpack.
    • Grey line: All other BND pairs.
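The translocation rule and line colors described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: the function names are hypothetical, BND pairs on different chromosomes are assumed to always count as translocations, and red is assumed to take precedence over black when both conditions hold.

```python
MIN_TRANSLOCATION_DISTANCE = 100_000  # 100 kbp, from the notes above


def is_translocation(chrom1: str, pos1: int, chrom2: str, pos2: int,
                     min_distance: int = MIN_TRANSLOCATION_DISTANCE) -> bool:
    """Decide whether a BND pair is drawn as a translocation.

    Assumption: breakends on different chromosomes always qualify;
    breakends on the same chromosome qualify when they are at least
    `min_distance` bp apart.
    """
    if chrom1 != chrom2:
        return True
    return abs(pos2 - pos1) >= min_distance


def line_color(in_mitelman: bool, involves_known_gene: bool) -> str:
    """Pick the line color for a BND pair.

    Assumption: a Mitelman match (red) is checked before the known-gene
    condition (black); everything else is grey.
    """
    if in_mitelman:
        return "red"
    if involves_known_gene:
        return "black"
    return "grey"


if __name__ == "__main__":
    print(is_translocation("chr1", 1_000_000, "chr1", 1_050_000))  # same chromosome, 50 kbp apart
    print(is_translocation("chr1", 1_000_000, "chr7", 2_000_000))  # different chromosomes
    print(line_color(in_mitelman=False, involves_known_gene=True))
```

The same-chromosome branch is where the stated limitation arises: a large deletion or duplication reported as a BND pair passes the distance check and is drawn like a translocation.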
circos plot
Notes on DMR filtering
  • The table shows DMRs, called with DSS in the pipeline output, that overlap promoters of genes in the IntOGen Cancer Gene Census (CGC).
  • Only DMRs with nCG >= 50 that overlap known promoter regions (annotated using annotatr) are shown. The pipeline output contains other annotated regions, such as exonic and intronic CpG islands, but these are not shown.
  • meanMethyl1 refers to the mean methylation level in tumor.
  • meanMethyl2 refers to the mean methylation level in normal.
  • length refers to the length of the DMR.
  • nCG refers to the number of CpG sites in the DMR. By default the workflow requires at least 50 CpG sites in any DMR region.
  • areaStat refers to the area statistic of the DMR. The larger the area statistic, the more significant the DMR is.
  • annot.X columns are produced by annotatr, and all upper-case columns are extracted from the IntOGen Compendium of Cancer Genes TSV file.
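The DMR filter described above (nCG >= 50 and promoter overlap) can be sketched as below. The record layout is an assumption made for illustration: the `annot.type` key and the `hg38_genes_promoters` label are assumed names for the annotatr annotation, and the function names are hypothetical. Only `nCG`, `meanMethyl1` and `meanMethyl2` are column names taken from the notes.

```python
MIN_CPG_SITES = 50  # default minimum number of CpG sites, from the notes above


def keep_dmr(dmr: dict) -> bool:
    """Keep a DMR with enough CpG sites that is annotated as a promoter.

    Assumption: the annotation label is stored under "annot.type" and
    promoter labels end with "promoters".
    """
    return dmr["nCG"] >= MIN_CPG_SITES and dmr["annot.type"].endswith("promoters")


def methylation_difference(dmr: dict) -> float:
    """Tumor minus normal mean methylation (meanMethyl1 - meanMethyl2).

    A positive value means higher methylation in the tumor.
    """
    return dmr["meanMethyl1"] - dmr["meanMethyl2"]


if __name__ == "__main__":
    # Made-up records used only to exercise the functions.
    dmrs = [
        {"nCG": 72, "annot.type": "hg38_genes_promoters", "meanMethyl1": 0.8, "meanMethyl2": 0.2},
        {"nCG": 31, "annot.type": "hg38_genes_promoters", "meanMethyl1": 0.6, "meanMethyl2": 0.3},
        {"nCG": 90, "annot.type": "hg38_genes_exons", "meanMethyl1": 0.1, "meanMethyl2": 0.7},
    ]
    for dmr in filter(keep_dmr, dmrs):
        print(dmr["nCG"], round(methylation_difference(dmr), 2))
```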
Table of DMRs overlapping with promoters of IntOGen CGC genes