youssef-Araby/gastro-ensempling

GastroVision Ensemble Learning Framework

A research framework for evaluating ensemble deep learning methods on the GastroVision medical image dataset.

📁 Project Structure

gastro-ensempling/
├── ensemble_framework/          # Main framework code
│   ├── train_phase1.py         # Phase 1: Baseline model training
│   ├── train_phase2.py         # Phase 2: Ensemble methods (TODO)
│   ├── config.py               # Central configuration
│   ├── models/                 # Model architectures
│   │   ├── model_factory.py    # Create backbone models
│   │   └── ensemble_methods.py # Voting, Stacking, etc.
│   ├── data/                   # Data utilities
│   │   └── data_utils.py       # DataLoaders, transforms
│   ├── training/               # Training utilities
│   │   └── trainer.py          # Trainer class
│   ├── evaluation/             # Evaluation metrics
│   │   └── metrics.py          # MCC, F1, confusion matrices
│   ├── checkpoints/            # Saved model weights
│   │   └── phase1/             # Baseline model checkpoints
│   └── results/                # Experiment results
│       └── phase1/             # Baseline results & predictions
│
├── docs/                       # Documentation
│   └── PHASE1_RESULTS.md       # Phase 1 analysis & takeaways
│
├── scripts/                    # Utility scripts
│   ├── evaluate_all_models.py  # Evaluate saved predictions
│   ├── run_training.py         # Quick training entry point
│   └── setup_and_train.py      # One-command setup for collaborators
│
├── GastroVision/               # Original paper code (reference)
│   ├── Source/                 # Original training scripts
│   └── Split/                  # Official train/val/test CSV splits
│
├── .github/
│   └── copilot-instructions.md # AI coding agent instructions
│
├── RESEARCH_PHASES.md          # Research methodology overview
├── CONTRIBUTING.md             # Collaboration guide
├── requirements.txt            # Python dependencies
└── .gitignore                  # Git ignore patterns

🚀 Quick Start

1. Setup Environment

python -m venv venv
venv\Scripts\activate        # Windows
source venv/bin/activate     # macOS/Linux (use instead of the Windows line)
pip install -r requirements.txt

2. Download Dataset

Download the GastroVision dataset from https://osf.io/84e7f/ and extract it to ensemble_framework/data/Gastrovision-data/

3. Organize Data

python ensemble_framework/organize_data.py

4. Train Baseline Models (Phase 1)

python ensemble_framework/train_phase1.py

5. Run Ensemble Methods (Phase 2)

python ensemble_framework/train_phase2.py  # Coming soon

📊 Current Results (Phase 1)

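| Model           | Test Accuracy | F1 Macro | MCC    |
|-----------------|---------------|----------|--------|
| DenseNet-169 🥇 | 82.72%        | 0.6378   | 0.8062 |
| DenseNet-121    | 82.16%        | 0.6283   | 0.7999 |
| ResNet-50       | 81.97%        | 0.6088   | 0.7979 |
| ResNet-152      | 81.65%        | 0.6216   | 0.7945 |
| EfficientNet-B0 | 80.71%        | 0.6046   | 0.7837 |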

See docs/PHASE1_RESULTS.md for detailed analysis.
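
For readers unfamiliar with the three reported metrics, the following is a minimal pure-Python sketch of how accuracy, macro F1, and multi-class MCC can be computed from label lists. It is an illustration only and is not the code in evaluation/metrics.py; the function names (`accuracy`, `macro_f1`, `multiclass_mcc`) and the toy labels are assumptions made for this example. Macro F1 averages per-class F1 with equal weight per class, so it can sit well below accuracy when some classes are predicted poorly.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient (Gorodkin's generalization)."""
    s = len(y_true)                                       # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_counts = Counter(y_true)                            # true count per class
    p_counts = Counter(y_pred)                            # predicted count per class
    labels = set(t_counts) | set(p_counts)
    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())
    denom = math.sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


# Toy, imbalanced example (hypothetical labels, not GastroVision data).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 0, 1, 0, 0]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred), multiclass_mcc(y_true, y_pred))
```

In the toy example the rare class 2 is never predicted, which pulls macro F1 far below accuracy even though most samples are classified correctly.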

🔬 Research Phases

  1. Phase 1 ✅ - Train 5 baseline models individually
  2. Phase 2 🔄 - Apply ensemble methods (Voting, Stacking)
  3. Phase 3 - Advanced models (ConvNeXt, Swin, ViT)
  4. Phase 4 - Ensemble advanced models
  5. Phase 5 - Hybrid ensemble (best of baseline + advanced)
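
As a rough illustration of the voting ensembles planned for Phase 2, here is a minimal pure-Python sketch of soft voting (averaging class probabilities) and hard voting (majority of predicted labels). It is not the contents of models/ensemble_methods.py, and it does not cover stacking; the function names `soft_vote` and `hard_vote`, the optional `weights` parameter, and the toy probabilities are all assumptions made for this example.

```python
from collections import Counter


def soft_vote(prob_sets, weights=None):
    """Average class probabilities across models, then take the argmax per sample.

    prob_sets[m][i][k] is model m's probability for class k on sample i.
    """
    if weights is None:
        weights = [1.0] * len(prob_sets)  # equal weight per model
    total = sum(weights)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    predictions = []
    for i in range(n_samples):
        avg = [
            sum(w * probs[i][k] for w, probs in zip(weights, prob_sets)) / total
            for k in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


def hard_vote(label_sets):
    """Majority vote over per-model labels; ties go to the smallest label."""
    predictions = []
    for votes in zip(*label_sets):
        counts = Counter(votes)
        best = max(counts.values())
        predictions.append(min(label for label, n in counts.items() if n == best))
    return predictions


# Toy probabilities for 3 hypothetical models, 2 samples, 3 classes.
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
model_b = [[0.4, 0.5, 0.1], [0.1, 0.2, 0.7]]
model_c = [[0.5, 0.2, 0.3], [0.3, 0.4, 0.3]]
models = [model_a, model_b, model_c]

# Per-model hard labels are the argmax of each probability row.
labels = [[max(range(3), key=row.__getitem__) for row in m] for m in models]

print(soft_vote(models))  # [0, 2]
print(hard_vote(labels))  # [0, 1]
```

The two schemes disagree on the second sample: two models weakly prefer class 1, but the third model's high confidence in class 2 dominates the averaged probabilities. Which behavior is preferable on GastroVision is an empirical question for Phase 2.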

📝 Citation

If you use this work, please cite the original GastroVision paper:

@inproceedings{jha2023gastrovision,
  title={GastroVision: A Multi-class Endoscopy Image Dataset...},
  author={Jha, Debesh and Sharma, Vanshali and ...},
  booktitle={ICML Workshop ML4MHD},
  year={2023}
}

📄 License

Research use only. See GastroVision dataset terms.
