Add new features and improvements to nf-core/mag: Trimmomatic, contigs param for binning, new CONCOCT default params #744

@Pranjal-Bioinfo

Description of feature

1. Include Trimmomatic as an additional preprocessing tool
Currently, the pipeline supports quality trimming only with fastp. While fastp is very efficient, Trimmomatic offers advantages for some datasets, such as more robust handling of paired-end reads and finer control over trimming parameters.
Adding Trimmomatic would let users choose the trimming tool that best fits their dataset.
An optional Trimmomatic step may also attract users who are used to or prefer this tool for preprocessing.

2. Contigs param for binning
Scaffolds may introduce errors due to misassemblies during linking.
Since contigs are the raw output of assemblers, they may provide a more realistic basis for binning based on both sequence composition and coverage.
Scaffolding relies on assumptions that may not hold for complex metagenomes, potentially biasing binning results. A parameter to bin contigs instead of scaffolds would let users avoid this.
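
To illustrate how scaffolds and contigs relate, here is a minimal Python sketch, assuming a scaffold is a set of contigs joined by runs of N as gap placeholders. The function name `split_scaffold` and the `min_gap` threshold are hypothetical and not part of nf-core/mag; in practice an assembler may already write contigs and scaffolds to separate files, in which case the requested parameter would only need to select between them.

```python
import re


def split_scaffold(seq: str, min_gap: int = 10) -> list[str]:
    """Split a scaffold into contig-like pieces at runs of N.

    min_gap is an illustrative threshold (an assumption): only runs of
    at least this many N characters are treated as scaffold gaps.
    """
    gap = re.compile(r"[Nn]{%d,}" % min_gap)
    # Drop empty pieces that arise from leading or trailing gaps.
    return [piece for piece in gap.split(seq) if piece]


if __name__ == "__main__":
    scaffold = "ACGTACGT" + "N" * 10 + "GGCCGGCC"
    print(split_scaffold(scaffold))
```

Each returned piece is gap-free sequence, so composition and coverage statistics for binning would be computed only on assembled bases rather than across an inferred link.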

3. New CONCOCT default params
Currently, the pipeline runs CONCOCT's cut_up_fasta.py script with the parameters -c 1999 -o 1900 to chunk contigs.
These settings yield small chunks of 1,999 bases that overlap by 1,900 bases, which greatly increases the number of fragments and therefore the runtime.
A potential improvement is to use -c 10000 -o 0, which creates larger chunks (10,000 bases) with no overlap.
This is faster because it generates far fewer fragments, and it is also the setting used in the example in the official CONCOCT GitHub repository.
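
To give a rough sense of the difference, here is a small Python sketch that counts fragments under a simplified sliding-window model: a new chunk starts every (chunk size minus overlap) bases. This is an assumption for illustration, not the cut_up_fasta.py implementation, and it ignores options such as merging the last chunk. The function name `count_chunks` is hypothetical.

```python
def count_chunks(contig_length: int, chunk_size: int, overlap: int) -> int:
    """Count window starts for a contig under a simple sliding-window model.

    Assumption: a new chunk begins every (chunk_size - overlap) bases
    until the end of the contig is reached.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    return len(range(0, contig_length, step))


if __name__ == "__main__":
    length = 100_000  # example contig length in bases
    print(count_chunks(length, 1999, 1900))   # current settings: 1011
    print(count_chunks(length, 10_000, 0))    # proposed settings: 10
```

Under this model, a 100,000-base contig is cut into about a hundred times more fragments with -c 1999 -o 1900 than with -c 10000 -o 0, because the window advances only 99 bases at a time.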

Labels

enhancement (New feature or request)