Releases: iqbal-lab-org/make_prg
Version 0.5.0
Version 0.4.0
0.4.0 - 02/12/2022
Added
- make_prg update command, which updates PRGs without rebuilding the MSAs and the PRG itself from scratch;
- Trace (-vv) logging level, to track make_prg behaviour (intended for developers only);
- Multithreading support (-t parameter);
- A sample example;
- 365 new tests (from 116 to 481 total tests), with test coverage >99% in non-argument parsing code;
- Precompiled binary;
Changed
- make_prg from_msa: input can now be a single file or a directory. If it is a single file,
  a <prefix>.prg.bin, a <prefix>.prg.fa, a <prefix>.prg.gfa and a <prefix>.update_DS.zip
  files are created. If it is a directory, all files in the directory are scanned and the same
  execution for a single file is done for each input file found. The output files are a collection of the single-input
  execution: a <prefix>.prg.bin.zip file will contain a collection of .prg.bin files, similar to
  <prefix>.prg.gfa.zip and <prefix>.update_DS.zip; <prefix>.prg.fa will be a multi fasta;
- Other make_prg from_msa CLI changes (please run make_prg from_msa -h for a full description of the new parameters):
  - Parameters removed: --prg_name, --seqid, --no_overwrite;
  - Parameters added: -s, --suffix, -F, --force, -t, --threads, -g, --output-graphs;
  - Parameters changed: Replaced --outdir by --output_prefix;
- The recursive clustering and collapse algorithm is now explicitly represented as a tree with internal data
  structures that remember the multiple sequence subalignment at any point of the recursion, as well as several other
  internal data, allowing the serialization and deserialization of the recursion tree at any point. Thus updates can
  be done avoiding any recomputation by firstly saving the state of the recursion tree to disk, and then loading this
  recursion tree, adding denovo sequences to some specific nodes, and triggering recomputation of the modified nodes.
  Any preorder traversal of the recursion tree yields the same order of recursive calls of the previous algorithm,
  thus allowing us to translate the algorithms in the previous version as preorder traversals with custom visit
  operations.
- Moved from setup.py to pyproject.toml
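The recursion-tree idea described above can be illustrated with a minimal sketch. This is not make_prg's actual implementation: the RecursionNode class, its fields, the JSON serialization and the recompute_dirty helper are all hypothetical names chosen for illustration, and the "recomputation" is only a placeholder.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RecursionNode:
    """Hypothetical tree node that remembers its sub-alignment."""
    node_id: int
    subalignment: List[str]
    children: List["RecursionNode"] = field(default_factory=list)
    dirty: bool = False  # True when new sequences were added to this node

    def preorder(self, visit: Callable[["RecursionNode"], None]) -> None:
        # Visit the node first, then its children from left to right,
        # mirroring the order of calls in a plain recursive algorithm.
        visit(self)
        for child in self.children:
            child.preorder(visit)

    def to_dict(self) -> dict:
        return {
            "node_id": self.node_id,
            "subalignment": self.subalignment,
            "dirty": self.dirty,
            "children": [child.to_dict() for child in self.children],
        }

    @classmethod
    def from_dict(cls, data: dict) -> "RecursionNode":
        return cls(
            node_id=data["node_id"],
            subalignment=list(data["subalignment"]),
            children=[cls.from_dict(c) for c in data["children"]],
            dirty=data["dirty"],
        )


def recompute_dirty(root: RecursionNode) -> List[int]:
    """Preorder traversal with a custom visit: touch only modified nodes."""
    recomputed: List[int] = []

    def visit(node: RecursionNode) -> None:
        if node.dirty:
            # Placeholder: a real tool would re-run its clustering here.
            node.dirty = False
            recomputed.append(node.node_id)

    root.preorder(visit)
    return recomputed


# Build a tiny tree, save its state, reload it, and update one node.
root = RecursionNode(0, ["ACGT", "ACTT"], [
    RecursionNode(1, ["AC", "AC"]),
    RecursionNode(2, ["GT", "TT"]),
])
saved_state = json.dumps(root.to_dict())
restored = RecursionNode.from_dict(json.loads(saved_state))

restored.children[1].subalignment.append("CT")  # add a new sequence
restored.children[1].dirty = True

visit_order: List[int] = []
restored.preorder(lambda node: visit_order.append(node.node_id))
recomputed_ids = recompute_dirty(restored)
```

In this sketch only the node that received a new sequence is recomputed, while the saved tree state lets the untouched nodes be reused as they are.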
Fixed
- Several minor bugs;
- Heavy refactoring of almost the whole codebase;
Removed
- Dropped support for python 3.7; supported python versions are: 3.8, 3.9, 3.10, 3.11.
- Dropped support for Mac OS X;
Version 0.2.0
New command-line
- Added:
  - -S, --seqid option to name the PRG sequence, which by default uses the file name.
  - -N shortcut for max nesting
  - -L shortcut for min match length
  - --log to specify the path the log file should be written to. By default, the log now goes to stderr
  - -O, --output-type option to specify which output files are required. Defaults to all
- Removed:
  - summary file
- Changed:
  - --prefix CLI parameter of from_msa subcommand removed in favor of CLI parameters --outdir
    and --prg_name, with sensible defaults (current working directory and MSA file name stem respectively).
    This allows finer control over where to place output files.
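The defaults described for --outdir and --prg_name (current working directory and MSA file name stem) could be resolved roughly as follows. This is a sketch only: resolve_output_location is a hypothetical helper, not a function from make_prg.

```python
from pathlib import Path
from typing import Optional, Tuple


def resolve_output_location(
    msa_path: str,
    outdir: Optional[str] = None,
    prg_name: Optional[str] = None,
) -> Tuple[Path, str]:
    """Hypothetical helper: fall back to the defaults named in the notes."""
    # Default output directory: the current working directory.
    directory = Path(outdir) if outdir is not None else Path.cwd()
    # Default PRG name: the MSA file name without its final extension.
    name = prg_name if prg_name is not None else Path(msa_path).stem
    return directory, name


default_dir, default_name = resolve_output_location("data/geneA.fa")
custom_dir, custom_name = resolve_output_location("data/geneA.fa", "out", "my_prg")
```

Note that Path.stem removes only the last suffix, so a file such as geneA.fa.gz would yield geneA.fa in this sketch.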
Output files
- Output files:
  - No longer contain 'max_nesting' and 'min_match_length' in their names; these appear in the log files,
    and in the .prg fasta header.
  - .bin file now stores even integer markers at site ends; this is the format used by gramtools.
  - Summary file not written by default
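To illustrate what "even integer markers at site ends" can mean, here is a small sketch that rewrites a list of integers so that each site closes with an even marker. The encoding used here is an assumption not stated in these notes: nucleotides are 1 to 4, each site opens with a unique odd marker greater than 4, alleles are separated by that marker plus one, and the older layout closed the site by repeating the odd marker. The function name is hypothetical.

```python
from typing import List


def close_sites_with_even_markers(legacy: List[int]) -> List[int]:
    """Replace the closing odd marker of each site with the even marker."""
    open_sites = set()
    converted = []
    for value in legacy:
        is_site_marker = value > 4 and value % 2 == 1
        if is_site_marker and value in open_sites:
            # Second time we see this odd marker: the site ends here,
            # so emit the matching even marker instead.
            open_sites.remove(value)
            converted.append(value + 1)
        else:
            if is_site_marker:
                open_sites.add(value)  # first occurrence opens the site
            converted.append(value)
    return converted


# One site with two alleles (A or C), under the assumed encoding.
simple = close_sites_with_even_markers([5, 1, 6, 2, 5])
# A site nested inside the first allele of an outer site.
nested = close_sites_with_even_markers([5, 1, 7, 2, 8, 3, 7, 6, 4, 5])
```

Because every site has its own odd marker in this assumed layout, tracking which markers are currently open is enough to handle nested sites.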
Bug fixes
Version 0.1.1
0.1.1 - 2021-01-27
Added
- Dockerfile
- -V option to get version
Changed
- A test that was clustering all unique 5-mers was reduced to all 4-mers as the memory
usage of all 5-mers was causing a segfault when trying to run the tests during the
docker image build.
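As a rough illustration of why dropping from 5-mers to 4-mers shrinks the test input, the sketch below enumerates all k-mers. It assumes a four-letter DNA alphabet (ACGT), which the note above does not state, and all_kmers is a hypothetical helper rather than the project's test code.

```python
from itertools import product
from typing import List


def all_kmers(k: int, alphabet: str = "ACGT") -> List[str]:
    """Enumerate every string of length k over the given alphabet."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


four_mers = all_kmers(4)
five_mers = all_kmers(5)
print(len(four_mers), len(five_mers))  # prints "256 1024"
```

Under this assumption the 4-mer set has a quarter as many sequences as the 5-mer set. The sketch only counts sequences and does not reproduce the clustering memory usage mentioned above.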
Removed
- Singularity file, as it is redundant with the new Dockerfile (which will be hosted on
  quay.io)
- scipy dependency. We never actually explicitly use scipy.