DECAF is a bioinformatics framework for detecting and removing contamination in environmental DNA (eDNA) sequences. It leverages deep learning to classify amplicon fragments, enhancing the reliability of metabarcoding analyses.
DECAF currently supports ITS barcodes for plant contamination detection.
Future versions aim to support multiple barcodes and taxa.
DECAF is still under active development. For now, it provides a single deep learning model focused on ITS barcodes (ITS1, ITS2). The long-term goal is to offer a suite of models for diverse barcodes (e.g. COI, rbcL, 16S) across various taxonomic levels.
It can be:
- Integrated into existing pipelines like OBITools, AmpliSeq, or QIIME
- Used standalone for rapid filtering/classification of FASTA/FASTQ data
- Helpful in building clean reference databases for metabarcoding
- Barcode support: ITS, ITS1, ITS2
- Input types: Amplicons, ASVs
- Task: Binary classification — plant vs. contaminant
- Output: Filtered FASTA, prediction scores
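To make the classification-then-filter idea concrete, here is a minimal Python sketch of threshold-based filtering. It does not use DECAF's code or API: the function names (`read_fasta`, `filter_by_score`), the dictionary of per-sequence scores, and the rule that a sequence without a score is rejected are all assumptions made for illustration only.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (sequence id, sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (id, sequence) pairs from FASTA-formatted lines."""
    seq_id = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Keep only the first token of the header as the identifier.
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            chunks = []
        else:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def filter_by_score(
    records: Iterable[Record],
    scores: Dict[str, float],
    threshold: float = 0.99,
) -> Tuple[List[Record], List[Record]]:
    """Split records into (kept, rejected) using a score cutoff.

    Hypothetical convention: a record is kept when its score is at or
    above the threshold; a record with no score is treated as 0.0.
    """
    kept: List[Record] = []
    rejected: List[Record] = []
    for seq_id, seq in records:
        if scores.get(seq_id, 0.0) >= threshold:
            kept.append((seq_id, seq))
        else:
            rejected.append((seq_id, seq))
    return kept, rejected
```

A stricter threshold keeps fewer sequences but makes each kept sequence more trustworthy, which is the usual trade-off when cleaning a reference database.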
- Python 3.8 or higher
- Git
- NVIDIA graphics card (recommended for fast processing)
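The prerequisites above can be checked before installing. The following is a small sketch, not part of DECAF: the function name `check_prerequisites` is hypothetical, and looking for `nvidia-smi` on the `PATH` is only a heuristic for the presence of an NVIDIA driver, not proof that a GPU is usable.

```python
import shutil
import sys
from typing import Dict, Tuple


def check_prerequisites(min_python: Tuple[int, int] = (3, 8)) -> Dict[str, bool]:
    """Report which of the listed prerequisites appear to be available."""
    return {
        # Compare the running interpreter against the minimum version.
        "python": sys.version_info >= min_python,
        # Git is needed only when installing from a clone.
        "git": shutil.which("git") is not None,
        # Heuristic: nvidia-smi is normally installed with the NVIDIA driver.
        "nvidia_gpu": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A missing `nvidia_gpu` entry is not an error, since the GPU is listed as recommended rather than required.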
pip install decaf
- Clone the repository:
git clone [email protected]:UMMISCO/decaf.git
cd decaf
- Create a virtual environment:
python -m venv decaf-env
source decaf-env/bin/activate # On Linux/Mac
# decaf-env\Scripts\activate # On Windows
- Install dependencies:
pip install -r requirements.txt
decaf --input_fastq data/test.fasta --output_folder output/ --taxa plants --barcode ITS --cpus 4 --threshold 0.99
For more options and examples, consult the complete documentation.
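When DECAF is called from a larger pipeline, the command can be assembled and launched from Python. The sketch below uses only the flags that appear in the example command in this README. The helper names `build_decaf_command` and `run_decaf` are hypothetical, and the defaults simply mirror that example rather than any documented DECAF defaults.

```python
import shlex
import subprocess
from typing import List


def build_decaf_command(
    input_path: str,
    output_folder: str,
    taxa: str = "plants",
    barcode: str = "ITS",
    cpus: int = 4,
    threshold: float = 0.99,
) -> List[str]:
    """Build the argument list for the decaf command-line tool."""
    return [
        "decaf",
        "--input_fastq", str(input_path),
        "--output_folder", str(output_folder),
        "--taxa", taxa,
        "--barcode", barcode,
        "--cpus", str(cpus),
        "--threshold", str(threshold),
    ]


def run_decaf(input_path: str, output_folder: str, **options) -> subprocess.CompletedProcess:
    """Run decaf; raises CalledProcessError if it exits with a non-zero status."""
    cmd = build_decaf_command(input_path, output_folder, **options)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even where decaf is not installed.
    print(shlex.join(build_decaf_command("data/test.fasta", "output/")))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.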
The complete documentation is available at: https://decaf.readthedocs.io
To generate the documentation locally:
- Install development dependencies:
pip install -r requirements.txt
- Start the documentation server:
mkdocs serve
Then open your browser at: http://127.0.0.1:8000
We welcome contributions to DECAF!
- Open an issue to report bugs or suggest features
- Create a pull request to contribute code
- Follow the code style guidelines
To check your code style:
pip install black
black --check .
To format your code:
black .
DECAF/
├── decaf/          # Main code
│   ├── models/     # Model implementation
│   ├── data/       # Data management
│   └── utils/      # Utility functions
├── tests/          # Unit and integration tests
├── docs/           # Documentation
├── config/         # Configuration files
└── data/           # Example data
DECAF is under the MIT license. See the LICENSE file for more details.
- Auguste_GARDETTE - Lead Developer
- Contributors - All contributors
For any questions or issues, please open an issue on GitHub or contact the development team.