Modena is a nanopore-based computational method for detecting a wide spectrum of epigenetic and epitranscriptomic modifications.
It uses an unsupervised learning approach, namely resampling of nanopore signals followed by the Kuiper test. Unlike other unsupervised tools, classification is performed by 1D clustering of scores into two groups.
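To make the two ideas above concrete, here is a minimal, dependency-free sketch of a two-sample Kuiper statistic and of a 1D split of scores into two groups. It is an illustration only, not Modena's implementation: the function names `kuiper_statistic` and `split_two_clusters` are hypothetical, the resampling step is omitted, and the threshold-based two-cluster split is an assumed stand-in for whatever 1D clustering Modena actually uses.

```python
from bisect import bisect_right


def kuiper_statistic(xs, ys):
    """Two-sample Kuiper statistic V = D+ + D- between two samples."""
    xs, ys = sorted(xs), sorted(ys)
    d_plus = d_minus = 0.0
    for v in xs + ys:
        # Empirical CDFs of both samples evaluated at v.
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d_plus = max(d_plus, fx - fy)
        d_minus = max(d_minus, fy - fx)
    return d_plus + d_minus


def _sum_sq(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_two_clusters(scores):
    """Label each score "pos" or "neg" using the single threshold that
    minimises the within-group sum of squares (assumes >= 2 distinct scores)."""
    order = sorted(scores)
    best_cost, threshold = float("inf"), order[-1]
    for i in range(1, len(order)):
        cost = _sum_sq(order[:i]) + _sum_sq(order[i:])
        if cost < best_cost:
            best_cost, threshold = cost, order[i]
    return ["pos" if s >= threshold else "neg" for s in scores]
```

Unlike the Kolmogorov-Smirnov statistic, the Kuiper statistic adds the largest deviation in each direction, so it also responds to differences in spread, not only to shifts in location.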
Important: This version of Modena is v2 beta. To find the stable v1 version of Modena, visit the v1.0.0 git tag.
To install and use Modena, you need at least Python 3.10 and the Poetry package manager. Then run the following commands:
$ git clone https://github.com/sbidin/modena.git
$ cd modena
$ poetry install
$ poetry run python -m modena --help # See options.
$ poetry --directory path/to/modena/dir/ run python -m modena # Run outside modena dir.

Both datasets need to be supplied in blow5 or slow5 format, alongside their
f5c resquiggle output tsv files. If your dataset is in single/multi fast5
format, or pod5 format, you can apply conversions using one of the following
tools:
- single-fast5 to multi-fast5: ont_fast5_api
- multi-fast5 to blow5/slow5: slow5tools
- pod5 to blow5/slow5: blue-crab
To resquiggle your data with f5c, install f5c and run the resquiggle command:
$ f5c resquiggle data.fastq data.blow5 > resquiggled.tsv

Both datasets (in this case a and b) need a blow5 or slow5 file and a
corresponding f5c-resquiggled tsv file.
$ poetry run python -m modena -a a.blow5 -ax a.tsv -b b.blow5 -bx b.tsv -o out.tsv
$ poetry run python -m modena --help # See here for more options.

Modena outputs a simple tsv file with four columns:
- position, int, 1-based
- coverage, int, a count of all reads that contributed to the signal
- distance, float, a two-sample Kuiper-test-based measure (a distance sum)
- label, str, "pos" or "neg", separating positions into two clusters
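
The output can be post-processed with a few lines of Python. The sketch below assumes the four columns appear in the order listed above and tolerates an optional header row, since the text does not say whether one is written; `read_modena_output` and `positive_positions` are hypothetical helper names, not part of Modena.

```python
import csv


def read_modena_output(path):
    """Parse a Modena output tsv into a list of dicts, one per position."""
    rows = []
    with open(path, newline="") as handle:
        for fields in csv.reader(handle, delimiter="\t"):
            # Assumption: skip anything that is not a 4-column data row,
            # such as a possible header or a blank line.
            if len(fields) != 4 or not fields[0].isdigit():
                continue
            position, coverage, distance, label = fields
            rows.append({
                "position": int(position),    # 1-based
                "coverage": int(coverage),
                "distance": float(distance),
                "label": label,               # "pos" or "neg"
            })
    return rows


def positive_positions(rows, min_coverage=0):
    """Return positions labelled "pos" with at least min_coverage reads."""
    return [
        r["position"]
        for r in rows
        if r["label"] == "pos" and r["coverage"] >= min_coverage
    ]
```

For example, `positive_positions(read_modena_output("out.tsv"), min_coverage=20)` would list the candidate modified positions supported by at least 20 reads; the coverage cutoff here is an arbitrary illustration, not a recommended value.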