Release v3.0.0 · bacterial-genomics/wf-paired-end-illumina-assembly

What's Changed

Round memory down for SPAdes by @gregorysprenger in #88
Add SKESA parameters to main params.config by @gregorysprenger in #90
Refactor profiles for use in Seqera Tower and add tower yaml for reports by @slsevilla in #92
Dev store downsample data by @chrisgulvik in #94
Dev to main by @chrisgulvik in #96

Consistent metrics reported for each read cleaning step (@chrisgulvik)
Added SeqFu for FastQ format validation (@chrisgulvik)
Checksum (SHA-512) reporting of intermediate and output files (@chrisgulvik)
Report full input paths for each sample (@chrisgulvik)
For assembly depth reporting, added stdev depth metrics; added total paired+single mapped stats (@chrisgulvik)
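
As an illustration of the SHA-512 checksum reporting listed above, the following is a minimal Python sketch, not the pipeline's own implementation. The function names `sha512_of_file` and `checksum_report` and the tab-separated report layout are assumptions made for this example.

```python
import hashlib


def sha512_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-512 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_report(paths):
    """Build one 'digest<TAB>path' line per file (hypothetical report layout)."""
    return [f"{sha512_of_file(p)}\t{p}" for p in paths]
```

Reading in fixed-size chunks keeps memory use flat for large FastQ or FastA files.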

Default uses SeqKit rather than SeqTk for downsampling (@chrisgulvik)
Output structure and filenames revised (@chrisgulvik)
For MLST, exclude by default all MLST databases whose names end in a numeric suffix greater than 1 (*_2, *_3, and so on), so that the original MLST database version is used for each taxon (e.g., leptospira_2 and leptospira_3 are excluded). This avoids mixing database versions within a run, which would occasionally assign one sample to leptospira and another to leptospira_3, making it impossible to immediately compare between samples. (@chrisgulvik)
For MLST, store novel FastA when that situation occurs (@chrisgulvik)
Sample name in outputs and file content no longer contains assembler name (@chrisgulvik)
Changed RDP output to exclude unnecessary data columns such as "Phylum\nphylum", "Genus\ngenus" (@chrisgulvik)
Use both R1 and R2, and only Phred30 bases, as input for the base-pair estimate, giving a more accurate estimation of genome size (@chrisgulvik)
Stats and FastA of contigs discarded during Biopython filtering are now always stored by default (@chrisgulvik)
Output filenames within pipeline_info/ changed to show month by name and include day of the week (@chrisgulvik)
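
The MLST database exclusion described above can be sketched in Python as follows. This is only one reading of the rule (a name ending in an underscore plus an integer greater than 1 is excluded); the regular expression, the function names, and the example list are assumptions, and the pipeline's actual matching logic may differ.

```python
import re

# Assumed pattern: a base name followed by "_<integer>" at the end of the name
_VERSION_SUFFIX = re.compile(r"^.+_(\d+)$")


def is_later_version(scheme):
    """True when the scheme name ends in a numeric suffix greater than 1."""
    match = _VERSION_SUFFIX.match(scheme)
    return bool(match) and int(match.group(1)) > 1


def split_schemes(schemes):
    """Return (kept, excluded) lists, preserving input order."""
    kept = [s for s in schemes if not is_later_version(s)]
    excluded = [s for s in schemes if is_later_version(s)]
    return kept, excluded


# Illustrative names only; "example_taxon" is hypothetical
kept, excluded = split_schemes(
    ["leptospira", "leptospira_2", "leptospira_3", "example_taxon"]
)
```

With this input, `kept` holds `leptospira` and `example_taxon`, while `excluded` holds `leptospira_2` and `leptospira_3`, so every sample in a run is typed against the same database version.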

Order of operations in Trimmomatic process now ensures final output reads have minimum sequence length (default: 50 bp) (@chrisgulvik)
Fixed issue with missing column header names in the .kraken_summary.tsv output files (@chrisgulvik)
Fixed trailing tab character in Kraken1 and Kraken2 output TSV summaries, which made pandas XLSX conversion fail due to different column numbers in header and data (@chrisgulvik)
Fixed RDP VERSION reporting bug by removing spaces (@chrisgulvik)
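
To show why a trailing tab breaks downstream table conversion, here is a small Python sketch using only the standard library `csv` module rather than pandas. The sample rows are made up for illustration, and `drop_trailing_tab` and `field_counts` are hypothetical helper names, not functions from the pipeline.

```python
import csv
import io


def drop_trailing_tab(line):
    """Remove a single trailing tab, leaving all other fields untouched."""
    return line[:-1] if line.endswith("\t") else line


def field_counts(tsv_text):
    """Return the number of tab-separated fields on each row."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [len(row) for row in reader]


# Made-up summary: the data row ends in a tab, the header row does not
raw = "sample\tgenus\tpercent\nS1\tLeptospira\t97.5\t\n"
cleaned = "".join(drop_trailing_tab(line) + "\n" for line in raw.splitlines())

print(field_counts(raw))      # header and data rows disagree
print(field_counts(cleaned))  # rows agree after the fix
```

The trailing tab adds an empty extra field to the data row, so the header and data rows no longer have the same number of columns; removing it restores a rectangular table.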

Coloring of workflow process now corresponds to tab color in XLSX output summary sheet (@chrisgulvik)
Docker container version updates (@chrisgulvik)
Updated description on output files based on new files created as well as some renamed output files (@chrisgulvik)

Full Changelog: v2.3.0...v3.0.0