Discussion - ANIm files using pyani v0.2 vs 0.3 differ #109

peterjc · 2024-10-08T15:09:48Z

For issue #102 (groundtruth ANIb using pyani v0.2, see PR #108) we need to use pyANI v0.2 rather than v0.3 when we generate the expected matrices. I thought it would be nicer therefore to also use v0.2 for generating the ANIm groundtruth values - assuming they would be the same.

The script to do this for ANIm was essentially a copy of the proposed ANIb script with a search-and-replace.

This PR demonstrates the changes in the ANIm output from v0.2 (specifically the version_0_2 branch as of 4c19123c67b78c01d58869527402aec518ead473) to v0.3 (specifically the main branch as of 6b8b262b028f7d27c2dfdddbeda67e7cb571fee7).

First, there is a trivial difference: the row/col labels switch order from v0.2 to v0.3. Therefore this PR explicitly sorts the matrices by MD5 (as is already done internally in this repo as part of testing against the expected matrices).
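As a rough illustration of the sorting step, here is a minimal sketch that reorders a square labelled matrix so that both rows and columns follow the sorted labels. It assumes the matrix has already been parsed into a list of labels (MD5 checksums here) and a nested list of values; the function name `sort_matrix` is hypothetical and this is not the code used in the PR.

```python
def sort_matrix(labels, rows):
    """Reorder a square matrix so rows and columns follow the sorted labels.

    labels: list of row/column labels (e.g. MD5 checksums), same order for both axes.
    rows:   list of lists, rows[i][j] is the value for (labels[i], labels[j]).
    """
    # Indices of the labels in sorted order.
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    sorted_labels = [labels[i] for i in order]
    # Apply the same permutation to rows and to columns.
    sorted_rows = [[rows[i][j] for j in order] for i in order]
    return sorted_labels, sorted_rows


if __name__ == "__main__":
    labels, rows = sort_matrix(["b", "a"], [[1, 2], [3, 4]])
    print(labels, rows)
```

Applying the same permutation to both axes keeps each value attached to its original pair of genomes, so two matrices sorted this way can be compared cell by cell regardless of the order pyani wrote them in.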

The final commit dc87d7e on this PR shows meaningful differences including:

- Alignment lengths were floats (ending .0) under v0.2, and some were longer than the shorter sequence (!)
- Some coverage values were approximately 1.5 (see the alignment lengths above)
- Knock-on changes to the Hadamard matrix
- Sim-errors were floats (ending .0) under v0.2 with a zero diagonal, but have diagonal values of 1 under v0.3 (!)

i.e.

- This looks like a bug in v0.2 (overly long alignments) which was fixed in v0.3.
- There also appears to be a quirk in v0.3: self-vs-self sim-errors values are 1 rather than 0.
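The two oddities above could be flagged automatically. The following is only a sketch under stated assumptions: the matrices are already loaded as nested lists in the same label order, `genome_lengths` is a hypothetical dict mapping each label to its sequence length, and the function name `find_anomalies` is made up for illustration. It is not part of pyani or of this repo.

```python
def find_anomalies(labels, genome_lengths, aln_lengths, sim_errors):
    """Return a list of (kind, query, subject) tuples for suspicious cells.

    Flags alignment lengths longer than the shorter of the two genomes,
    and non-zero self-vs-self sim-errors values.
    """
    problems = []
    for i, query in enumerate(labels):
        for j, subject in enumerate(labels):
            shorter = min(genome_lengths[query], genome_lengths[subject])
            # An alignment should not be longer than the shorter sequence.
            if aln_lengths[i][j] > shorter:
                problems.append(("long_alignment", query, subject))
        # A genome compared with itself should have no similarity errors.
        if sim_errors[i][i] != 0:
            problems.append(("nonzero_self_sim_error", query, query))
    return problems


if __name__ == "__main__":
    found = find_anomalies(
        ["a", "b"],
        {"a": 100, "b": 150},
        [[100, 150], [150, 150]],
        [[1, 5], [5, 0]],
    )
    for item in found:
        print(item)
```

Run against both the v0.2 and v0.3 outputs, a check like this would be expected to report the long alignments only for v0.2 and the non-zero diagonal only for v0.3, matching the observations above.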

This should probably be done as a separate Python script but this is a proof of principle for now.

This is the same script as for ANIb which has to use v0.2 since currently the method has not been finished in v0.3

Note we have explicitly sorted the rows/cols by MD5 as otherwise the order flipped from pyani v0.2 to v0.3. There are still differences...

codacy-production · 2024-10-08T15:09:53Z

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation	Diff coverage
✅ +0.56% (target: -1.00%)	✅ ∅

Coverage variation details

	Coverable lines	Covered lines	Coverage
Common ancestor commit (`7e41764`)	448	433	96.65%
Head commit (`dc87d7e`)	896 (+448)	871 (+438)	97.21% (+0.56%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

	Coverable lines	Covered lines	Diff coverage
Pull request (#109)	0	0	∅ (not applicable)

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%


codecov · 2024-10-08T15:10:07Z

Codecov Report

All modified and coverable lines are covered by tests ✅

see 1 file with indirect coverage changes

peterjc · 2024-10-08T15:15:19Z

(This test will pass right now as we've not yet merged #107 which will check the pyANI plus ANIm calculations against these expected values)

peterjc · 2024-10-08T15:22:39Z

Confirmed with @kiepczi that this is due to the bug in v0.2 fixed in v0.3 by widdowquinn/pyani#425.

This means we will need to split the makefile fixtures entry as we need both pyani 0.2 (for ANIb) and 0.3 (for ANIm).

See also 7e41764 which I made to stop accidentally using the wrong pyani - similar guard in #108.
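For readers wondering what such a guard might look like, here is one possible sketch in Python. The actual change in 7e41764 is not shown in this thread, so this is an assumption about the general idea only: it reads the installed version via `importlib.metadata` (assuming the distribution is installed under the name `pyani`) and refuses to continue if it does not match the expected release series. The function name `check_pyani_version` and the `installed` override are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pyani_version(required_prefix, installed=None):
    """Raise RuntimeError unless the installed pyani matches required_prefix.

    required_prefix: release series such as "0.2" or "0.3".
    installed:       optional version string, mainly for testing; when None
                     the version is looked up from the installed package.
    """
    if installed is None:
        try:
            installed = version("pyani")  # assumes the distribution name is "pyani"
        except PackageNotFoundError:
            raise RuntimeError("pyani is not installed") from None
    # Match "0.2" itself or "0.2.x", but not e.g. "0.20".
    if not (installed == required_prefix
            or installed.startswith(required_prefix + ".")):
        raise RuntimeError(
            f"Expected pyani {required_prefix}.x but found {installed}"
        )
    return installed


if __name__ == "__main__":
    print(check_pyani_version("0.2", installed="0.2.13.1"))
```

A fixture script for ANIb could call `check_pyani_version("0.2")` at the start, and the ANIm one `check_pyani_version("0.3")`, so that running either in the wrong environment fails early instead of silently regenerating different matrices.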

peterjc · 2024-10-08T15:35:47Z

Closing in favour of issue widdowquinn/pyani#438

See discussion on #109, we need different versions of pyani for ANIm (v0.3) and ANIb (v0.2).

peterjc added 4 commits October 8, 2024 15:50

Sort the expected matrix rows/cols by MD5

4b5d89e

This should probably be done as a separate Python script but this is a proof of principle for now.

Sort ANIm matrices by row/col MD5

ebcd967

Switch to generating ANIm with pyANI v0.2

b5491bc

This is the same script as for ANIb which has to use v0.2 since currently the method has not been finished in v0.3

Regenerate with pyani 0.2.13.1 and mumer 3.1 (on macOS)

dc87d7e

Note we have explicitly sorted the rows/cols by MD5 as otherwise the order flipped from pyani v0.2 to v0.3. There are still differences...

peterjc changed the title ~~Dsicussion - ANIm files using pyani v0.2 vs 0.3 differ~~ Discussion - ANIm files using pyani v0.2 vs 0.3 differ Oct 8, 2024

peterjc mentioned this pull request Oct 8, 2024

Self-vs-self sim-error values of 1 rather than 0 widdowquinn/pyani#438

Open

peterjc closed this Oct 8, 2024

peterjc added a commit that referenced this pull request Oct 8, 2024

Update makefile fixtures for pyani v0.2 vs 0.3

4246b69

See discussion on #109, we need different versions of pyani for ANIm (v0.3) and ANIb (v0.2).

peterjc deleted the regenerate_anim branch October 8, 2024 16:13

peterjc added a commit that referenced this pull request Oct 10, 2024

Update makefile fixtures for pyani v0.2 vs 0.3

470f1e3

See discussion on #109, we need different versions of pyani for ANIm (v0.3) and ANIb (v0.2).
