Scripts for handling exome reports output from CCM's various pipelines. These are supplementary to those scripts found in `report-scripts` in the CCM repo.

delvinso/exome-report-scripts

About

A series of scripts for aggregating exome and genome reports from CCM's CRE pipeline.

Run the scripts below from the directory where you'd like the output saved, e.g. `python3 get_all_report_paths.py` followed by `python3 copy_reports.py --report_paths=./all_reports-2022-04-21/all-report-paths-2022-04-21.csv`

  1. `get_all_report_paths.py` - traverses several known directories containing exome and exome-like reports and writes the results to two flat files
  • known directories:
    • current_exome: /hpf/largeprojects/ccm_dccforge/dccforge/results
    • old_exome: /hpf/largeprojects/ccmbio/naumenko/project_cheo/DCC_Samples_part1
    • current_genome: /hpf/largeprojects/ccmbio/ccmmarvin_shared/genomes
    • old_genome: /hpf/largeprojects/ccm_dccforge/dccdipg/c4r_wgs/results
    • in_progress_exome: /hpf/largeprojects/ccmbio/ccmmarvin_shared/exomes/in_progress
  • outputs:
    • ./all_reports-yyyy-mm-dd/all-fam-ptp-reports-yyyy-mm-dd.csv - parsed family and participant codenames and the report they belong to
    • ./all_reports-yyyy-mm-dd/all-report-paths-yyyy-mm-dd.csv - report paths
      • sanity check of report counts by type: `df[['report', 'report_type']].dropna().value_counts('report_type')`
  2. `copy_reports.py` - takes the report-paths CSV produced by the script above and copies all reports into a single, nested directory
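
The sanity check listed under the outputs can also be run without pandas. The following is a minimal sketch, assuming the report-paths CSV has `report` and `report_type` columns (as the pandas one-liner implies). The function name `count_reports_by_type` is hypothetical and is not part of this repo.

```python
import csv
from collections import Counter


def count_reports_by_type(csv_path):
    """Count rows per report_type, skipping rows missing either column.

    Hypothetical helper: a standard-library stand-in for the pandas
    sanity check, not a function provided by these scripts.
    """
    counts = Counter()
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            report = (row.get("report") or "").strip()
            report_type = (row.get("report_type") or "").strip()
            # Roughly mirrors dropna(): ignore rows where either field is empty.
            if report and report_type:
                counts[report_type] += 1
    return counts
```

Calling `count_reports_by_type("./all_reports-2022-04-21/all-report-paths-2022-04-21.csv")` returns a `Counter` keyed by report type; an unexpectedly low or missing count for a type suggests that one of the known directories was not traversed.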
