v3.0.0

@chrisgulvik chrisgulvik released this 20 Nov 17:03
12f5e59

What's Changed

New Contributors

Added

  • Consistent metrics reported for each read cleaning step (@chrisgulvik)
  • Added SeqFu for FastQ format validation (@chrisgulvik)
  • Checksum (SHA-512) reporting of intermediate and output files (@chrisgulvik)
  • Report full input paths for each sample (@chrisgulvik)
  • For assembly depth reporting, added stdev depth metrics; added total paired+single mapped stats (@chrisgulvik)
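
The SHA-512 checksum reporting listed above can be reproduced outside the pipeline for spot checks. The following is a minimal Python sketch, not the pipeline's own code; the function name `sha512_of_file` and the 1 MiB chunk size are assumptions chosen for illustration.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks.

    Hypothetical helper: the name and chunk size are illustrative only.
    Chunked reading keeps memory use flat for large FastQ/FastA files.
    """
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the returned string against a reported checksum for the same file is one way to confirm that an intermediate or output file was not altered or truncated.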

Changed

  • Default uses SeqKit rather than SeqTk for downsampling (@chrisgulvik)
  • Output structure and filenames revised (@chrisgulvik)
  • For MLST, exclude by default all MLST databases whose names match *_ with a number greater than 1 (e.g., leptospira_2 and leptospira_3), so that the original MLST database version is used for each taxon. This avoids inconsistent versions within a run, which would occasionally assign leptospira to one sample and leptospira_3 to another, making it impossible to immediately compare between samples. (@chrisgulvik)
  • For MLST, store novel FastA when that situation occurs (@chrisgulvik)
  • Sample name in outputs and file content no longer contains assembler name (@chrisgulvik)
  • Changed RDP output to exclude unnecessary data columns such as "Phylum\nphylum", "Genus\ngenus" (@chrisgulvik)
  • Use both R1 and R2, and only Phred30 bases, as input for the bp estimate, for more accurate estimation of genome size (@chrisgulvik)
  • Stats and FastA of contigs discarded during biopython filtering are now always stored by default (@chrisgulvik)
  • Output filenames within pipeline_info/ changed to show month by name and include day of the week (@chrisgulvik)
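
The MLST database exclusion rule in the list above can be pictured as a simple name filter. This is a rough Python sketch under the assumption that a "versioned copy" is any database name ending in an underscore followed by a number greater than 1; the function names are hypothetical and the pipeline's actual implementation may differ.

```python
import re

# Assumed pattern: a trailing "_<number>" suffix on the database name
_SUFFIX = re.compile(r"_(\d+)$")


def is_versioned_copy(name):
    """True if the name ends in _N with N > 1 (e.g., leptospira_2)."""
    match = _SUFFIX.search(name)
    return bool(match) and int(match.group(1)) > 1


def keep_original_schemes(names):
    """Drop versioned copies so each taxon keeps its original database."""
    return [name for name in names if not is_versioned_copy(name)]


schemes = ["leptospira", "leptospira_2", "leptospira_3"]
print(keep_original_schemes(schemes))  # ['leptospira']
```

With such a filter, every sample in a run is typed against the same database name, so results stay directly comparable.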

Fixed

  • Order of operations in Trimmomatic process now ensures final output reads have minimum sequence length (default: 50 bp) (@chrisgulvik)
  • Fixed issue with missing column header names in the .kraken_summary.tsv output files (@chrisgulvik)
  • Fixed trailing tab character in Kraken1 and Kraken2 output TSV summaries, which made pandas XLSX conversion fail due to different column numbers in header and data (@chrisgulvik)
  • Fixed RDP VERSION reporting bug by removing spaces (@chrisgulvik)
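
To see why a trailing tab breaks tabular conversion, as in the Kraken summary fix above, the small Python sketch below counts columns before and after removing it. The column names and sample row are invented for illustration, and which rows carried the trailing tab in the real files is an assumption here.

```python
def drop_trailing_tab(line):
    """Remove the line ending and at most one trailing tab."""
    text = line.rstrip("\r\n")
    return text[:-1] if text.endswith("\t") else text


def column_counts(lines):
    """Number of tab-separated fields on each line."""
    return [len(line.rstrip("\r\n").split("\t")) for line in lines]


# Hypothetical summary: the data row ends with a stray tab
raw = ["Sample\tGenus\tSpecies\n", "S1\tLeptospira\tinterrogans\t\n"]
cleaned = [drop_trailing_tab(line) for line in raw]

print(column_counts(raw))      # [3, 4]
print(column_counts(cleaned))  # [3, 3]
```

Only a single trailing tab is removed, so a legitimately empty final field followed by a stray tab is still preserved.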

Updated

  • Coloring of workflow process now corresponds to tab color in XLSX output summary sheet (@chrisgulvik)
  • Docker container version updates (@chrisgulvik)
  • Updated description on output files based on new files created as well as some renamed output files (@chrisgulvik)

Deprecated

  • Removed gene calling from QUAST output summary (@chrisgulvik)

Full Changelog: v2.3.0...v3.0.0