Expose AllWave's full sparsification interface #10
Merged
Exposes all AllWave sparsification strategies through the CLI, following the AllWave naming model:

- 'none' or '1.0' - align all pairs (default)
- 'auto' - automatic sparsification based on sequence count
- 'random:F' - random sampling with probability F (e.g., 'random:0.5')
- 'connectivity:F' - Erdős-Rényi with probability F for connected component
- 'tree:K,K2,F,SIZE' - tree sampling with:
  * K = k-nearest neighbors
  * K2 = k-farthest neighbors
  * F = additional random fraction
  * SIZE = kmer size for Mash distance
  (e.g., 'tree:3,3,0.1,16')

Changes:
- Updated --sparsify flag help text with all options
- Changed default from '1.0' to 'none' (more intuitive)
- Added parse_sparsification() method to handle all formats
- Iterative mode now respects user's sparsification setting
- Falls back to tree:3,3,0.1,16 for iterative mode if not specified
- Backward compatible: plain floats treated as random factors (with warning)

Examples:
    --sparsify none                 # All pairs
    --sparsify random:0.5           # 50% random sampling
    --sparsify tree:3,3,0.1,16      # Tree sampling (default for iterative)
    --sparsify connectivity:0.95    # Erdős-Rényi for connectivity
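To make the accepted formats concrete, here is a minimal Python sketch of a parser for the `--sparsify` values listed above. It is not the code from this PR: the class names, the `[0, 1]` range check on fractions, and the error messages are assumptions made for illustration. Only the spec strings themselves and the "plain float means random factor, with a warning" rule come from the description.

```python
import warnings
from dataclasses import dataclass
from typing import Union


# Hypothetical result types; the real project may model these differently.
@dataclass(frozen=True)
class NoSparsify:
    """Align all pairs."""


@dataclass(frozen=True)
class Auto:
    """Pick a strategy automatically from the sequence count."""


@dataclass(frozen=True)
class Random:
    fraction: float


@dataclass(frozen=True)
class Connectivity:
    probability: float


@dataclass(frozen=True)
class Tree:
    k_nearest: int
    k_farthest: int
    random_fraction: float
    kmer_size: int


Strategy = Union[NoSparsify, Auto, Random, Connectivity, Tree]


def _fraction(text: str, what: str) -> float:
    """Parse a float and check it lies in [0, 1] (range check is an assumption)."""
    value = float(text)  # raises ValueError on malformed input
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{what} must be between 0 and 1, got {text!r}")
    return value


def parse_sparsification(spec: str) -> Strategy:
    """Turn a --sparsify value into a strategy object."""
    spec = spec.strip()
    if spec in ("none", "1.0"):
        return NoSparsify()
    if spec == "auto":
        return Auto()

    name, sep, arg = spec.partition(":")
    if sep:
        if name == "random":
            return Random(_fraction(arg, "random fraction"))
        if name == "connectivity":
            return Connectivity(_fraction(arg, "connectivity probability"))
        if name == "tree":
            parts = arg.split(",")
            if len(parts) != 4:
                raise ValueError("tree spec must be tree:K,K2,F,SIZE")
            k, k2, f, size = parts
            return Tree(int(k), int(k2), _fraction(f, "tree fraction"), int(size))
        raise ValueError(f"unknown sparsification strategy: {name!r}")

    # Backward compatibility: a bare float is treated as a random factor.
    try:
        value = _fraction(spec, "sparsification factor")
    except ValueError:
        raise ValueError(f"unrecognized sparsification spec: {spec!r}") from None
    warnings.warn(f"plain float {spec!r} is deprecated; use 'random:{spec}'")
    return Random(value)
```

For example, `parse_sparsification("tree:3,3,0.1,16")` yields a `Tree` with 3 nearest neighbors, 3 farthest neighbors, a 0.1 random fraction, and k-mer size 16, while `parse_sparsification("0.5")` yields `Random(0.5)` and emits a warning.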
Summary
Exposes all of AllWave's sparsification strategies through the CLI, following the AllWave naming model instead of our custom simplified interface.
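New Sparsification Options
- none or 1.0 - align all pairs (default)
- auto - automatic sparsification based on sequence count
- random:F - random sampling with probability F (e.g., random:0.5)
- connectivity:F - Erdős-Rényi with probability F for connected component
- tree:K,K2,F,SIZE - tree sampling (format below)
Tree Sampling Format
tree:K,K2,F,SIZE where:
- K = k-nearest neighbors
- K2 = k-farthest neighbors
- F = additional random fraction
- SIZE = k-mer size for Mash distance
Example: tree:3,3,0.1,16 creates:
- 3 nearest neighbors
- 3 farthest neighbors
- an additional random fraction of 0.1
- Mash distances computed with k-mer size 16
Iterative Mode Integration
Iterative mode now respects the user's --sparsify setting:
    # Use custom tree sampling for iterative alignment
    ./seqrush -s input.fa -o output.gfa --iterative --sparsify tree:5,5,0.2,16
If no tree sampling is specified, iterative mode defaults to tree:3,3,0.1,16 and shows a note.
Changes
- Updated --sparsify flag help text with all options
- Changed default from '1.0' to 'none' (more intuitive)
- Added parse_sparsification() method to handle all formats
- Iterative mode now respects the user's sparsification setting
- Backward compatible: plain floats are treated as random factors (with a warning)
Testing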
✅ Build successful
✅ All tests pass
✅ Help text updated
✅ Iterative mode uses parsed settings
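The iterative-mode behaviour described under Iterative Mode Integration can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PR's code: the function name `effective_sparsification` and the wording of the note are hypothetical, and the sketch reads "if no tree sampling is specified" as "any non-tree spec is replaced by the default tree spec in iterative mode".

```python
import sys

# Default tree spec for iterative mode, as stated in the PR description.
DEFAULT_ITERATIVE_TREE = "tree:3,3,0.1,16"


def effective_sparsification(spec: str, iterative: bool) -> str:
    """Return the spec that would actually be used (hypothetical helper).

    Assumption: in iterative mode, a user-supplied tree spec is kept as-is,
    and anything else falls back to the default tree spec with a note.
    """
    if iterative and not spec.startswith("tree:"):
        print(
            f"note: iterative mode uses tree sampling; defaulting to {DEFAULT_ITERATIVE_TREE}",
            file=sys.stderr,
        )
        return DEFAULT_ITERATIVE_TREE
    return spec
```

With this reading, `--iterative --sparsify tree:5,5,0.2,16` keeps the custom tree settings, while `--iterative` with the default `none` ends up using `tree:3,3,0.1,16`.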