Handle graphs with overlap? #2

@dirkjanvw

Hi! Your paper sparked my interest in comparing GFA files from alignment-based tools (like minigraph-cactus) with GFA files from de Bruijn graph-based tools (like cuttlefish). I tried to run rs-pancat-compare on them, but I am facing two issues:

  1. [minor] It seems rs-pancat-compare assumes a certain line order in the GFA file: first H-lines, then S-lines, then L-lines, and lastly P-lines. GFA files are not always ordered this way, though. I can work around this by reordering the lines myself, but ideally I shouldn't have to (or I should at least get an easy-to-understand error message).
  2. [major] It seems rs-pancat-compare doesn't handle overlapping nodes, where the overlap is specified as a CIGAR string in the L-line. In de Bruijn graphs, these overlaps are typically k - 1 bases long (e.g. 26M when k = 27). I get errors such as: "# Error: the two paths representing NC_001136.10 have different lengths: 3399825 and 1531933." even though the underlying sequences are definitely identical in length.
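
To make both points concrete, here is a minimal Python sketch of what I mean. It is not taken from rs-pancat-compare, and the function names (`read_gfa`, `overlap_length`, `path_length`) are hypothetical. It assumes GFA 1 with explicit sequences on the S-lines, and it assumes that the overlap to subtract is the number of bases the CIGAR covers on the following segment. Records are grouped by type before use, so line order does not matter, and each path length is the sum of its segment lengths minus the overlaps between consecutive segments.

```python
import re
from collections import defaultdict

CIGAR_OP = re.compile(r"(\d+)([MIDNSHPX=])")
FLIP = {"+": "-", "-": "+"}


def overlap_length(cigar):
    """Bases of the following segment covered by the overlap ('*' counts as 0)."""
    if cigar in ("*", ""):
        return 0
    # Assumption: M, =, X and I are the operations that consume the next segment.
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "M=XI")


def read_gfa(lines):
    """Group records by type first, so H/S/L/P lines may appear in any order."""
    records = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if line and not line.startswith("#"):
            fields = line.split("\t")
            records[fields[0]].append(fields)
    seg_len = {f[1]: len(f[2]) for f in records["S"]}
    links = {(f[1], f[2], f[3], f[4]): f[5] for f in records["L"]}
    paths = {
        f[1]: (f[2].split(","), f[3] if len(f) > 3 else "*")
        for f in records["P"]
    }
    return seg_len, links, paths


def path_length(steps, overlaps, seg_len, links):
    """Length of the sequence spelled by a path, counting overlaps only once."""
    total = sum(seg_len[step[:-1]] for step in steps)
    cigars = overlaps.split(",") if overlaps != "*" else []
    for i in range(len(steps) - 1):
        if cigars:
            cigar = cigars[i]  # overlap given directly on the P-line
        else:
            a, b = steps[i], steps[i + 1]
            key = (a[:-1], a[-1], b[:-1], b[-1])
            # The same link may be written from the opposite strand.
            rev = (b[:-1], FLIP[b[-1]], a[:-1], FLIP[a[-1]])
            cigar = links.get(key, links.get(rev, "*"))
        total -= overlap_length(cigar)
    return total


# Toy de Bruijn-style graph with k = 4 (overlaps of 3M), lines deliberately
# out of order. The path spells ACGTAC.
gfa = [
    "P\tchr1\ts1+,s2+,s3+\t*",
    "L\ts1\t+\ts2\t+\t3M",
    "S\ts1\tACGT",
    "L\ts2\t+\ts3\t+\t3M",
    "S\ts2\tCGTA",
    "S\ts3\tGTAC",
    "H\tVN:Z:1.0",
]
seg_len, links, paths = read_gfa(gfa)
print(path_length(*paths["chr1"], seg_len, links))  # 6
```

Without the overlap correction, the same path would be counted as 12 bases instead of 6, which is the kind of length mismatch I am seeing between the two graphs.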

Perhaps if you continue to improve rs-pancat-compare, these two would be nice features to add :)

Labels: bug, enhancement
