A research framework for evaluating ensemble deep learning methods on the GastroVision medical image dataset.
gastro-ensampling/
├── ensemble_framework/ # Main framework code
│ ├── train_phase1.py # Phase 1: Baseline model training
│ ├── train_phase2.py # Phase 2: Ensemble methods (TODO)
│ ├── config.py # Central configuration
│ ├── models/ # Model architectures
│ │ ├── model_factory.py # Create backbone models
│ │ └── ensemble_methods.py # Voting, Stacking, etc.
│ ├── data/ # Data utilities
│ │ └── data_utils.py # DataLoaders, transforms
│ ├── training/ # Training utilities
│ │ └── trainer.py # Trainer class
│ ├── evaluation/ # Evaluation metrics
│ │ └── metrics.py # MCC, F1, confusion matrices
│ ├── checkpoints/ # Saved model weights
│ │ └── phase1/ # Baseline model checkpoints
│ └── results/ # Experiment results
│     └── phase1/ # Baseline results & predictions
│
├── docs/ # Documentation
│ └── PHASE1_RESULTS.md # Phase 1 analysis & takeaways
│
├── scripts/ # Utility scripts
│ ├── evaluate_all_models.py # Evaluate saved predictions
│ ├── run_training.py # Quick training entry point
│ └── setup_and_train.py # One-command setup for collaborators
│
├── GastroVision/ # Original paper code (reference)
│ ├── Source/ # Original training scripts
│ └── Split/ # Official train/val/test CSV splits
│
├── .github/
│ └── copilot-instructions.md # AI coding agent instructions
│
├── RESEARCH_PHASES.md # Research methodology overview
├── CONTRIBUTING.md # Collaboration guide
├── requirements.txt # Python dependencies
└── .gitignore # Git ignore patterns
python -m venv venv
venv\Scripts\activate # Windows
source venv/bin/activate # Linux/macOS
pip install -r requirements.txt

Download from https://osf.io/84e7f/ and extract to ensemble_framework/data/Gastrovision-data/
python ensemble_framework/organize_data.py
python ensemble_framework/train_phase1.py
python ensemble_framework/train_phase2.py # Coming soon

| Model | Test Accuracy | F1 Macro | MCC |
|---|---|---|---|
| DenseNet-169 🥇 | 82.72% | 0.6378 | 0.8062 |
| DenseNet-121 | 82.16% | 0.6283 | 0.7999 |
| ResNet-50 | 81.97% | 0.6088 | 0.7979 |
| ResNet-152 | 81.65% | 0.6216 | 0.7945 |
| EfficientNet-B0 | 80.71% | 0.6046 | 0.7837 |
See docs/PHASE1_RESULTS.md for detailed analysis.
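
For readers unfamiliar with the two less common columns in the table, the sketch below shows how macro F1 and multiclass MCC can be computed from label lists. It is an illustration only, not the code in `evaluation/metrics.py`; the function names `macro_f1` and `multiclass_mcc` are hypothetical, and the labels are toy values rather than GastroVision data.

```python
from collections import Counter
from math import sqrt


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient (hypothetical helper)."""
    n = len(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_counts = Counter(y_true)   # how often each class truly occurs
    pred_counts = Counter(y_pred)   # how often each class is predicted
    labels = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in labels)
    cov_tt = n * n - sum(v * v for v in true_counts.values())
    cov_pp = n * n - sum(v * v for v in pred_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    # By convention, return 0 when the denominator vanishes.
    return cov_tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy labels for three classes, chosen only to exercise the functions.
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 0, 1, 2, 2, 2]
    print(f"F1 macro: {macro_f1(y_true, y_pred):.4f}")
    print(f"MCC:      {multiclass_mcc(y_true, y_pred):.4f}")
```

Macro F1 weights every class equally regardless of its size, so it can sit well below accuracy when small classes are misclassified.
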
- Phase 1 ✅ - Train 5 baseline models individually
- Phase 2 🔄 - Apply ensemble methods (Voting, Stacking)
- Phase 3 - Advanced models (ConvNeXt, Swin, ViT)
- Phase 4 - Ensemble advanced models
- Phase 5 - Hybrid ensemble (best of baseline + advanced)
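
As a rough illustration of the voting step planned for Phase 2, the sketch below combines per-model class probabilities by soft voting (averaging probabilities) and hard voting (majority of predicted labels). Phase 2 is still marked TODO, so this is not the project's implementation; the function names `soft_vote` and `hard_vote`, the input layout, and the toy probabilities are all assumptions. Stacking is not shown.

```python
from collections import Counter


def soft_vote(probs_per_model):
    """Average class probabilities across models, then take the argmax.

    probs_per_model[m][i][k] is the probability model m assigns to
    class k for sample i (an assumed layout, for illustration only).
    """
    n_models = len(probs_per_model)
    n_samples = len(probs_per_model[0])
    predictions = []
    for i in range(n_samples):
        n_classes = len(probs_per_model[0][i])
        mean = [
            sum(model[i][k] for model in probs_per_model) / n_models
            for k in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=mean.__getitem__))
    return predictions


def hard_vote(probs_per_model):
    """Each model votes for its argmax class; the most common label wins.

    Ties go to the label that was voted for first (Counter keeps
    insertion order among equal counts).
    """
    labels_per_model = [
        [max(range(len(row)), key=row.__getitem__) for row in model]
        for model in probs_per_model
    ]
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*labels_per_model)
    ]


if __name__ == "__main__":
    # Three hypothetical models, two samples, three classes.
    probs = [
        [[0.9, 0.1, 0.0], [0.2, 0.2, 0.6]],
        [[0.4, 0.6, 0.0], [0.1, 0.3, 0.6]],
        [[0.4, 0.6, 0.0], [0.3, 0.3, 0.4]],
    ]
    print("soft:", soft_vote(probs))
    print("hard:", hard_vote(probs))
```

The two schemes can disagree: in the first toy sample, one very confident model outweighs two mildly confident ones under soft voting but is outvoted under hard voting.
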
If you use this work, please cite the original GastroVision paper:
@inproceedings{jha2023gastrovision,
title={GastroVision: A Multi-class Endoscopy Image Dataset...},
author={Jha, Debesh and Sharma, Vanshali and ...},
booktitle={ICML Workshop ML4MHD},
year={2023}
}

Research use only. See GastroVision dataset terms.