Re-evaluate impact of error correction #273

standage · 2018-06-18T20:23:08Z

Performing error correction drastically reduces the sequence content (specifically the number of distinct k-mers) in each data set, and accordingly the amount of memory required to track k-mer counts accurately. We were pretty enthusiastic about this improvement at one point, but later abandoned it since it led to some false negatives.

I think this decision was based on a small number of manually inspected variants (perhaps even just 1), and not on overall statistics. And in any case, all of the variants involved were SNVs, where our advantage is already marginal. We should re-investigate kevlar's performance on the latest simulations using error-corrected data.
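
For intuition on why error correction shrinks the set of distinct k-mers, here is a minimal toy sketch in plain Python. It does not use kevlar's API, and the reads, the `distinct_kmers` helper, and the value of `k` are all made up for illustration. It shows that a single substitution error in the middle of a read can introduce up to k novel k-mers, each of which would otherwise have to be tracked.

```python
def distinct_kmers(reads, k):
    """Return the set of distinct k-mers across all reads (hypothetical helper)."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers


# Toy data (assumed, not from any real data set): three identical reads,
# then the same reads with one substitution error (C -> T) in the second read.
clean = ["ACGTACGGTCA"] * 3
noisy = ["ACGTACGGTCA", "ACGTATGGTCA", "ACGTACGGTCA"]

k = 5
n_clean = len(distinct_kmers(clean, k))
n_noisy = len(distinct_kmers(noisy, k))

# The single error adds k new k-mers, one for each window that overlaps it.
print(n_clean, n_noisy)  # prints "7 12"
```

In real data the erroneous k-mers are mostly low-abundance, which is why removing them reduces the memory needed for accurate counting. Whether correction also removes k-mers that support true variants is the open question here.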

standage added the optimization label Jun 18, 2018

standage mentioned this issue Jun 18, 2018

[Meta] Memory and runtime performance improvements #272

Open

5 tasks
