Sequitur

Sequitur is an experimental genome assembler that blends De Bruijn-graph efficiency with overlap-layout-consensus accuracy. It now lives in this repository as two parallel tracks:

a feature-complete Python reference implementation (used for validation and rapid iteration);
a memory-conscious Rust port aimed at large-scale datasets.

Repository layout

python/ – reference Python implementation, notebooks, and experiment drivers.
rust/ – in-progress Rust port with Cargo project scaffolding.
images/ – figures and poster assets referenced in documentation.
julian jacobs 1605267 masters dissertation.pdf – the original thesis describing the Sequitur methodology in detail.

Archived Python scripts are preserved under python/.archive/ for historical context.

Quick start

Python reference pipeline

cd python
python -m venv .venv
source .venv/bin/activate
pip install numpy scipy biopython fastDamerauLevenshtein networkx pylcs
PYTHONPATH=. python sequitur.py \
	data/input/example.1.fastq \
	data/input/example.2.fastq \
	--reference data/input/example.fasta \
	--output-fasta data/output/example.sequitur.fasta \
	--metrics-csv data/output/example.metrics.csv

See python/README.md for environment notes, experiment runners, and notebook usage.

Rust prototype

cd rust
cargo build --release
cargo run -- --help

Implementation of the Rust modules is ongoing; the CLI currently acts as a placeholder.

Roadmap

Port suffix-array construction, overlap scoring, and matching to Rust (tracked in rust/ and the issue/todo list).
Reconcile results between Python and Rust on toy datasets and realistic read sets.
Package reproducible benchmarks and automate CI checks once the Rust path is feature complete.

Background

If you are new to the project, start with the dissertation PDF and the notebooks in python/ for a detailed walkthrough of the algorithm, experiments, and lessons learned. The natural-language toy datasets provide an intuitive introduction before tackling real FASTQ data.

Contributions are welcome—open an issue or start a discussion if you plan to add new features or fix bugs.

Name		Name	Last commit message	Last commit date
Latest commit History 106 Commits
.devcontainer		.devcontainer
.github		.github
.vscode		.vscode
images		images
python		python
rust		rust
scripts/integration		scripts/integration
tests		tests
.gitignore		.gitignore
.venv		.venv
README.md		README.md
analysis.json		analysis.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Sequitur

Repository layout

Quick start

Python reference pipeline

Rust prototype

Roadmap

Background

About

Uh oh!

Releases

Packages

Uh oh!

Languages

JulianVJacobs/Sequitur

Folders and files

Latest commit

History

Repository files navigation

Sequitur

Repository layout

Quick start

Python reference pipeline

Rust prototype

Roadmap

Background

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages