fastq::Reader: from_maybe_gzip_path: New instantiation. #3
Hi,
One thing I've not seen in biological rust-land is kseq.h-style code, where you pass a method a file path (which might be a FIFO such as stdin) and it works out for you whether the input is fastq or fasta, and whether it is gzipped or not.
This PR is an attempt at the "gzip or not" part for the fastq reader. It is unfinished (see TODOs in code) because I have not given enough thought to the architecture here, e.g. should an enum be returned, or a dyn `fastq::Reader`? I figure you might have some ideas already, so I didn't want to go too far down this path without hearing them first.

Tests pass, but the code is not benchmarked. It isn't clear to me whether the gzip reader is using the right buffer size. Multithreading the gzip decompression might also speed things up.
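To make the "gzip or not" question concrete, here is a minimal sketch of one way the detection could work, using only the standard library. It is not the code in this PR: the names `Compression` and `sniff_compression` are hypothetical, and it assumes the input is wrapped in something implementing `BufRead`. It peeks at the first buffered bytes and checks for the gzip magic number (0x1f 0x8b) without consuming them, so it should also work on non-seekable inputs such as stdin.

```rust
use std::io::{self, BufRead};

/// Hypothetical enum: tells the caller which decoder to wrap the stream in.
#[derive(Debug, PartialEq, Eq)]
pub enum Compression {
    Gzip,
    Plain,
}

/// Hypothetical helper: inspect the start of the stream without consuming it.
pub fn sniff_compression<R: BufRead>(reader: &mut R) -> io::Result<Compression> {
    // fill_buf() returns the currently buffered bytes and does not advance
    // the reader, so the bytes stay available to whatever reads next.
    // Caveat: a short read (e.g. from a slow pipe) could return fewer than
    // two bytes, in which case this sketch falls back to Plain.
    let buf = reader.fill_buf()?;

    // Every gzip member starts with the magic bytes 0x1f 0x8b.
    if buf.len() >= 2 && buf[0] == 0x1f && buf[1] == 0x8b {
        Ok(Compression::Gzip)
    } else {
        Ok(Compression::Plain)
    }
}
```

A caller could open the path, wrap the file in a `BufReader`, call `sniff_compression`, and then either hand the reader to the fastq parser directly or wrap it in a gzip decoder first. Whether that choice is then exposed as an enum or as a boxed trait object is exactly the open question above.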
Let me know what you think.
Thanks, ben