KarriKarthik/Esophageal_Motility_Study

Esophageal Motility Study Pipeline

A pipeline for processing esophageal manometry reports in .docx format. It automates the extraction of patient data, diagnostic tables, embedded images, and textual findings, and writes them to structured files ready for analysis and integration.


Features

  • Extracts patient info and clinical tables to .csv
  • Extracts and renames embedded images from reports
  • Removes grid lines from diagrams using OpenCV
  • Parses diagnostic text sections into structured .json
  • Outputs organized, analysis-ready folders

⚙️ How It Works

1. Text & Table Extraction

  • Reads .docx tables to extract:
    • Patient details
    • Summary metrics
    • Esophageal & UES motility data
  • Saves each as a separate .csv
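A minimal sketch of the table step, assuming only the standard library: a .docx file is a ZIP archive whose body lives in word/document.xml, so the tables can be read straight from that XML. The repository may well use a library such as python-docx instead; the function names and the `names` argument below are hypothetical and are not taken from the project code.

```python
import csv
import io
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used by word/document.xml inside a .docx archive.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def tables_from_xml(document_xml: bytes) -> list:
    """Return every table as a list of rows; each row is a list of cell strings."""
    root = ET.fromstring(document_xml)
    tables = []
    for tbl in root.iter(f"{W}tbl"):  # also visits nested tables, as separate entries
        rows = []
        for tr in tbl.findall(f"{W}tr"):
            cells = []
            for tc in tr.findall(f"{W}tc"):
                # One string per paragraph in the cell, joined with spaces.
                paras = ["".join(t.text or "" for t in p.iter(f"{W}t"))
                         for p in tc.findall(f"{W}p")]
                cells.append(" ".join(s for s in paras if s).strip())
            rows.append(cells)
        tables.append(rows)
    return tables


def rows_to_csv(rows: list) -> str:
    """Serialize one table to CSV text with default (minimal) quoting."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def export_tables(docx_path: str, names: list, out_dir: str) -> None:
    """Write the first len(names) tables to out_dir/<name>.csv (hypothetical helper)."""
    with zipfile.ZipFile(docx_path) as zf:
        tables = tables_from_xml(zf.read("word/document.xml"))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, rows in zip(names, tables):
        (out / f"{name}.csv").write_text(rows_to_csv(rows), encoding="utf-8", newline="")
```

The sketch pairs tables with output names by position, which only works if the reports always list their tables in the same order. A call might look like `export_tables("subj.docx", ["Patient_details", "Esophageal_Manometry_Summary"], "extracted_data")`.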

2. Image Extraction & Naming

  • Extracts diagrams embedded in the document
  • Names them using:
    • Custom defaults (e.g., Swallow Composite)
    • Patient metadata from CSV
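The extraction and naming step could be sketched as follows, assuming the embedded images are stored under word/media/ inside the .docx archive (the usual layout for Word documents). The naming scheme, the `patient_id` argument, and the fallback label `Image` are illustrative assumptions; the project's actual naming rules are not documented here.

```python
import re
import zipfile
from pathlib import Path, PurePosixPath


def _image_number(member: str) -> int:
    """Numeric part of a media name such as word/media/image12.png (0 if absent)."""
    match = re.search(r"(\d+)", PurePosixPath(member).stem)
    return int(match.group(1)) if match else 0


def media_members(names: list) -> list:
    """Archive members under word/media/, ordered by their image number."""
    media = [n for n in names if n.startswith("word/media/") and not n.endswith("/")]
    return sorted(media, key=lambda n: (_image_number(n), n))


def build_filename(patient_id: str, label: str, index: int, suffix: str) -> str:
    """Hypothetical naming scheme: <patient>_<index>_<label with underscores><ext>."""
    safe_label = "_".join(label.split())
    return f"{patient_id}_{index:02d}_{safe_label}{suffix}"


def extract_images(docx_path, out_dir: str, patient_id: str, labels: list) -> list:
    """Copy each embedded image to out_dir under a descriptive name."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx_path) as zf:
        for i, member in enumerate(media_members(zf.namelist()), start=1):
            # Fall back to a generic label when no default is given for this slot.
            label = labels[i - 1] if i <= len(labels) else "Image"
            target = out / build_filename(patient_id, label, i, PurePosixPath(member).suffix)
            target.write_bytes(zf.read(member))
            written.append(target.name)
    return written
```

Media numbering often follows insertion order but is not guaranteed to match the order in which images appear in the document. A stricter version would resolve that order through word/_rels/document.xml.rels.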

3. Image Preprocessing

  • Detects and removes grid lines via:
    • Canny edge detection
    • Hough Transform
    • Inpainting (OpenCV)
  • Saves clean diagrams to processed_images/

4. Diagnostic Text Extraction

  • Captures key sections like:
    • Chicago Classification Findings
    • Procedure, Indications, Impressions
  • Exports to a structured .json
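One way to sketch the section parsing, assuming the report text is already available as a list of paragraph strings and that each section starts with a paragraph beginning with its heading, optionally followed by a colon. The heading list mirrors the sections named above; the function names and the heading-matching rule are assumptions, not the project's confirmed logic.

```python
import json

# Section names taken from the list above; the matching rule is an assumption.
SECTION_HEADINGS = (
    "Chicago Classification Findings",
    "Procedure",
    "Indications",
    "Impressions",
)


def split_sections(paragraphs, headings=SECTION_HEADINGS):
    """Group paragraph text under the most recent recognized heading."""
    lookup = {h.lower(): h for h in headings}
    sections = {}
    current = None
    for raw in paragraphs:
        text = raw.strip()
        if not text:
            continue
        # "Heading: inline text" and a bare "Heading" line are both accepted.
        head, _, rest = text.partition(":")
        key = lookup.get(head.strip().lower())
        if key is not None:
            current = key
            sections.setdefault(current, [])
            if rest.strip():
                sections[current].append(rest.strip())
        elif current is not None:
            sections[current].append(text)
        # Text before the first heading is ignored in this sketch.
    return sections


def to_json(sections) -> str:
    """Serialize the sections as readable JSON, keeping non-ASCII text as is."""
    return json.dumps(sections, indent=2, ensure_ascii=False)
```

Each section maps to a list of strings rather than one joined block, so downstream code can decide how to recombine multi-paragraph findings.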

Output Structure

.
├── extracted_data/
│   ├── Patient_details.csv
│   ├── Esophageal_Manometry_Summary.csv
│   ├── Lower_Esophageal_Sphincter.csv
│   ├── Esophageal_Motility.csv
│   ├── Upper_Esophageal_Sphincter.csv
│   ├── Pharyngeal_UES_Motility.csv
│   ├── Image_filenames.csv
│   └── chicago_classification_findings.json
├── images/
│   └── *.png         # Original extracted images
├── processed_images/
│   └── *.png         # Grid-line removed versions
└── subj.docx         # Input report
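As a quick sanity check after a run, a small helper along these lines could confirm that the files from the tree above were written. The file names come from that tree; the helper itself is a hypothetical addition and is not part of the repository.

```python
from pathlib import Path

# Relative paths copied from the output tree above.
EXPECTED_FILES = (
    "extracted_data/Patient_details.csv",
    "extracted_data/Esophageal_Manometry_Summary.csv",
    "extracted_data/Lower_Esophageal_Sphincter.csv",
    "extracted_data/Esophageal_Motility.csv",
    "extracted_data/Upper_Esophageal_Sphincter.csv",
    "extracted_data/Pharyngeal_UES_Motility.csv",
    "extracted_data/Image_filenames.csv",
    "extracted_data/chicago_classification_findings.json",
)


def missing_outputs(root: str) -> list:
    """Return the expected output files that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]
```

Calling `missing_outputs(".")` from the project root returns an empty list when every expected file is present.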

🔧 Requirements

Install dependencies with:

pip install -r requirements.txt
