These seqwish-* directories are temporary working files created by seqwish during graph construction runs. They were accidentally committed and are bloating the repository.
Removed:
- 36 files across 12 seqwish-* directories (~10MB total)
- Files: .sqa, .sqi, .sqq (seqwish's intermediate sequence alignment data)
Added to .gitignore:
- seqwish-*/ pattern to prevent future commits
- wfmash-*/ pattern (similar temp directories)
These temporary directories are automatically created by seqwish and should never be tracked in version control.
SweepGA (FastGA + plane sweep filtering) is significantly faster for all-vs-all alignment of sequences ≥100bp, making it better suited as the default aligner for typical pangenome graph construction workflows.
Changes:
- Set default aligner to 'sweepga' (was 'allwave')
- Updated README to reflect SweepGA as default
- Updated build instructions to use --features use-sweepga
- Fixed README citation (removed incorrect co-author)
- Added SweepGA to acknowledgments
AllWave remains available via --aligner allwave for shorter sequences or when wavefront alignment is specifically needed.
Note: SweepGA uses PAF output from FastGA, so orientation detection happens natively in the FastGA alignment phase (strand column in PAF).
b8b0eed to 114e732
ekg added a commit that referenced this pull request on Oct 20, 2025:
SweepGA alignments were not producing good results for graph construction. Reverting back to AllWave as the default aligner.
This reverts:
- Switch default aligner to SweepGA (#8)
- Fix SweepGA filtering: remove min_block_length for graph construction (#9)
SweepGA work is preserved on the 'sweepga-default-experiment' branch for future investigation.
Summary
Switch the default aligner from AllWave to SweepGA for better performance on typical pangenome graph construction workflows.
Rationale
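SweepGA (FastGA + plane sweep filtering) is faster for all-vs-all alignment of sequences ≥100bp.
AllWave remains available via --aligner allwave for shorter sequences or when wavefront alignment is specifically needed.
Changes
Code
- Args: default_value = "sweepga" (was "allwave")
Documentation (README.md)
- Updated README to reflect SweepGA as default
- Updated build instructions to use --features use-sweepga
- Fixed README citation (removed incorrect co-author)
- Added SweepGA to acknowledgments
Technical Note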
SweepGA uses PAF output from FastGA, so orientation detection happens natively in the alignment phase (strand column in PAF). We parse the PAF into AlignmentRecord objects rather than using raw alignment objects.
Testing
✅ Build successful with --features use-sweepga
✅ All tests pass
✅ Verified SweepGA is used by default (FastGA output in logs)
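The Technical Note above says PAF lines are parsed into AlignmentRecord objects, with orientation taken from the PAF strand column. A minimal sketch of that idea follows, using only the standard library. The struct fields and the parse_paf_line function are assumptions made for illustration; the project's actual AlignmentRecord may look quite different. The column positions follow the standard PAF layout (12 mandatory tab-separated columns, strand in column 5).

```rust
/// Hypothetical record type for illustration only; the real
/// AlignmentRecord in the project may carry different fields.
#[derive(Debug, PartialEq)]
struct AlignmentRecord {
    query_name: String,
    query_start: u64,
    query_end: u64,
    /// True when the PAF strand column is "-".
    is_reverse: bool,
    target_name: String,
    target_start: u64,
    target_end: u64,
}

/// Parse one PAF line (tab-separated, at least 12 columns).
/// Hypothetical helper, not taken from the project.
fn parse_paf_line(line: &str) -> Result<AlignmentRecord, String> {
    let cols: Vec<&str> = line.trim_end().split('\t').collect();
    if cols.len() < 12 {
        return Err(format!(
            "expected at least 12 PAF columns, found {}",
            cols.len()
        ));
    }

    // Parse a numeric column, reporting its 1-based position on failure.
    let num = |i: usize| -> Result<u64, String> {
        cols[i]
            .parse::<u64>()
            .map_err(|e| format!("column {}: {}", i + 1, e))
    };

    // Column 5 (index 4) is the relative strand: "+" or "-".
    let is_reverse = match cols[4] {
        "+" => false,
        "-" => true,
        other => return Err(format!("invalid strand '{}'", other)),
    };

    Ok(AlignmentRecord {
        query_name: cols[0].to_string(),
        query_start: num(2)?,
        query_end: num(3)?,
        is_reverse,
        target_name: cols[5].to_string(),
        target_start: num(7)?,
        target_end: num(8)?,
    })
}
```

Because the strand is read straight from the aligner's output, no separate orientation-detection pass is needed in this sketch; a reverse-complement match simply arrives with is_reverse set.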
Impact
Users will get faster alignments by default. Those needing AllWave can explicitly specify --aligner allwave.
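The behavior described in this PR (SweepGA unless --aligner allwave is passed) can be sketched with the standard library alone. The PR itself changes a default_value on an Args field; the Aligner enum, DEFAULT_ALIGNER constant, and select_aligner function below are hypothetical names used only to illustrate the selection logic, not the project's code.

```rust
use std::str::FromStr;

/// Hypothetical enum naming the two aligners discussed in this PR.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Aligner {
    SweepGa,
    AllWave,
}

impl FromStr for Aligner {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "sweepga" => Ok(Aligner::SweepGa),
            "allwave" => Ok(Aligner::AllWave),
            other => Err(format!("unknown aligner '{}'", other)),
        }
    }
}

/// Default proposed by this PR (it was "allwave" before).
const DEFAULT_ALIGNER: &str = "sweepga";

/// Pick the aligner from a list of command-line arguments,
/// falling back to the default when --aligner is absent.
fn select_aligner(args: &[&str]) -> Result<Aligner, String> {
    let mut chosen: &str = DEFAULT_ALIGNER;
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if *arg == "--aligner" {
            // The flag must be followed by a value.
            chosen = *iter
                .next()
                .ok_or_else(|| "--aligner requires a value".to_string())?;
        }
    }
    chosen.parse::<Aligner>()
}
```

With no arguments the sketch yields Aligner::SweepGa, and passing "--aligner", "allwave" yields Aligner::AllWave. Under this design, reverting the default (as the later commit on this page does) is a one-line change to the constant.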