Young edited this page Jun 9, 2025 · 12 revisions

Welcome to the wiki for Donut Falls!

All "good" bioinformatic tools and workflows attempt to solve a problem. The problem we ran into was that we could not find a completed workflow on nf-core, and we needed a simple workflow to assemble nanopore sequencing reads, with or without corresponding Illumina reads, for downstream analyses.

Notable goals:

  • works for R10.4 and above flow cells
  • portability for easy incorporation into other workflows
  • assembly of nanopore reads with and without polishing
  • assembly with unicycler
  • assembly with more than one assembler
  • easy-to-access QC metrics
    • coverage and circular status are in the resultant fasta files
    • all results go into a multiqc report
  • no gene annotation (our genomes are submitted to PGAP)
  • subsampling to 100X depth to reduce assembly artifacts (the current default subsampling depth is actually 150X)
  • removal of short nanopore reads
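The subsampling and short-read removal goals come down to simple arithmetic: a target depth multiplied by a genome size gives a base budget, and reads are kept at random until that budget is met. The following is a minimal sketch of that idea only, not how rasusa or fastplong are implemented. The function names and the `min_length` cutoff are hypothetical and chosen for illustration.

```python
import random


def target_bases(genome_size: int, depth: float) -> int:
    """Total bases needed to cover a genome of genome_size at the given depth."""
    return int(genome_size * depth)


def subsample_lengths(read_lengths, genome_size, depth, min_length=0, seed=0):
    """Drop reads shorter than min_length, then randomly keep reads
    until the base budget for the requested depth is reached.

    read_lengths is a list of read lengths in bases; min_length is a
    hypothetical cutoff, not a Donut Falls default.
    """
    rng = random.Random(seed)
    pool = [n for n in read_lengths if n >= min_length]
    rng.shuffle(pool)

    budget = target_bases(genome_size, depth)
    kept, total = [], 0
    for n in pool:
        if total >= budget:
            break
        kept.append(n)
        total += n
    return kept


# 150X depth for a 5 Mbp genome corresponds to 750 Mbp of reads.
print(target_bases(5_000_000, 150))
```

With a 5 Mbp genome and the 150X default, the budget works out to 750,000,000 bases. A smaller or larger genome size changes the budget proportionally.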

Missed goals:

  • Time filtering with ontime. We have noticed that we get better reads if we keep only those generated "a little bit" after the run has started, but before the reagents are depleted. Time information, however, is removed for reads in the SRA, so this is difficult to test. We may find ways to incorporate this feature at a later date.
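To illustrate what time filtering means here, the sketch below keeps reads whose start time falls inside a window measured from the earliest read in the run. It is not ontime's implementation. It assumes each FASTQ header carries a `start_time=` field with an ISO 8601 timestamp, which is an assumption about the basecaller output, and the function names and window sizes are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed header field, e.g. "@read1 start_time=2024-01-01T00:30:00Z"
START_TIME = re.compile(r"start_time=(\S+)")


def read_start(header):
    """Return the read's start time, or None if the header has no timestamp."""
    match = START_TIME.search(header)
    if match is None:
        return None
    # Replace a trailing "Z" so older Python versions can parse the timestamp.
    return datetime.fromisoformat(match.group(1).replace("Z", "+00:00"))


def in_window(headers, skip_start, keep_until):
    """Keep headers whose start time lies between run start + skip_start
    and run start + keep_until. Reads without a timestamp are dropped."""
    times = {h: read_start(h) for h in headers}
    known = [t for t in times.values() if t is not None]
    if not known:
        return []
    run_start = min(known)
    lo, hi = run_start + skip_start, run_start + keep_until
    return [h for h, t in times.items() if t is not None and lo <= t <= hi]


headers = [
    "@r1 start_time=2024-01-01T00:00:00Z",
    "@r2 start_time=2024-01-01T02:00:00Z",
    "@r3 start_time=2024-01-02T12:00:00Z",
]
# Hypothetical window: skip the first hour, stop after 24 hours.
print(in_window(headers, timedelta(hours=1), timedelta(hours=24)))
```

Reads downloaded from the SRA would have no `start_time=` field, so this sketch would return an empty list for them, which is the testing difficulty described above.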

Nanopore sequence processing is an actively developing field, so tools were chosen for their acceptance in the field and drawn from the tutorials written by Dr. Ryan Wick in the Trycycler wiki and the Perfect bacterial genome tutorial wiki.

The generated consensus files can then be used in multiple applications, including phylogenetic analysis with Grandeur or submission to NCBI via the genome submissions portal.

This wiki will cover the rationale and steps of this workflow.

Basic diagram of the workflow

---
Donut Falls
---
flowchart TD

subgraph S0[input files]
A[/nanopore fastq/]
U[/"illumina reads (optional)"/]
end

subgraph S1[flye or raven]
B[filtering with fastplong to remove low-quality and short reads]
B --> C[rasusa for random downsampling to 150X coverage for 5M sized genome]
C --> D[assemble with flye and/or raven]
D --> E[rotation with dnaapler]
end


subgraph S2[nanopore-only polishing]
E --> F[mapping to assembly with circulocov]
F --> G[clair3 for polishing]
G --> H[incorporation of changes with bcftools]
end

A --> S1

subgraph S3[unicycler]
I[unicycler hybrid assembly]
end

A --> S3
U --> S3

subgraph S4[polishing with short reads]
J[alignment of short reads to assembly with bwa]
J --> K[polypolish polishing]
K --> L[pypolca polishing]
end

S2 --> S4
U --> S4

style S1 stroke-width:4px
%% mermaid help : https://mermaid.js.org/syntax/flowchart.html

Workflow steps

Donut Falls is meant to be a simple workflow. It really is.

The basic steps are
