Commit 6efb19f

Add 'Get Started' guide for finding ORFIs and update documentation navigation
1 parent af3d1ee commit 6efb19f

2 files changed, +72 -0 lines changed
docs/getstarterd.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
## Finding complete and overlapped ORFIs

The main package function is `findorfs`. Under the hood, `findorfs` is an interface to different gene-finding algorithms, which can be plugged in through the `finder` keyword argument. By default it uses `NaiveFinder`, a simple algorithm that finds all (non-outbounded) ORFIs in a DNA sequence (see the [NaiveFinder](https://camilogarciabotero.github.io/GeneFinder.jl/dev/api/#GeneFinder.NaiveFinder-Union{Tuple{Union{BioSequences.LongDNA{N},%20BioSequences.LongSubSeq{BioSequences.DNAAlphabet{N}}}},%20Tuple{N}}%20where%20N) documentation for more details).

> [!NOTE]
> The `minlen` kwarg of the `NaiveFinder` method is set to 6 nt, so it also catches random ORFIs that are not necessarily genes. For example, it considers `dna"ATGTGA"` -> `aa"M*"` a plausible ORFI.

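To make the idea of a naive search and the effect of `minlen` concrete, here is a minimal sketch in plain Julia. It is not GeneFinder's implementation: the helper `naive_orf_ranges` is hypothetical, it works on an uppercase `String` rather than a `LongDNA`, it scans the forward strand only, and it assumes an ORF runs from `ATG` to the first in-frame stop codon (`TAA`, `TAG` or `TGA`).

```julia
# Hypothetical helper for illustration only; not part of GeneFinder.jl.
function naive_orf_ranges(seq::AbstractString; minlen::Int = 6)
    # ATG, then as few whole codons as possible, then the first in-frame stop codon.
    pattern = r"ATG(?:[ACGT]{3})*?T(?:AA|AG|GA)"
    ranges = UnitRange{Int}[]
    # overlap = true lets a later ATG inside an earlier match start its own ORF.
    for m in eachmatch(pattern, seq; overlap = true)
        len = ncodeunits(m.match)
        len >= minlen && push!(ranges, m.offset:(m.offset + len - 1))
    end
    return ranges
end

naive_orf_ranges("ATGTGA")                 # [1:6], the 6 nt "M*" case from the note
naive_orf_ranges("ATGTGA"; minlen = 9)     # empty: a stricter minlen drops it
naive_orf_ranges("ATGCGTCGAATGGCACGGTGA")  # [1:21, 10:21], two overlapping ORFs
```

The last call shows why nested results such as `164:184` and `173:184` can both be reported: an internal `ATG` in the same frame shares the stop codon of the longer ORF.
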
Here is an example of how to use the `findorfs` function with the `NaiveFinder` algorithm:

```julia
using BioSequences, GeneFinder

# > 180195.SAMN03785337.LFLS01000089 -> finds only 1 gene in Prodigal (from Pyrodigal tests)
seq = dna"AACCAGGGCAATATCAGTACCGCGGGCAATGCAACCCTGACTGCCGGCGGTAACCTGAACAGCACTGGCAATCTGACTGTGGGCGGTGTTACCAACGGCACTGCTACTACTGGCAACATCGCACTGACCGGTAACAATGCGCTGAGCGGTCCGGTCAATCTGAATGCGTCGAATGGCACGGTGACCTTGAACACGACCGGCAATACCACGCTCGGTAACGTGACGGCACAAGGCAATGTGACGACCAATGTGTCCAACGGCAGTCTGACGGTTACCGGCAATACGACAGGTGCCAACACCAACCTCAGTGCCAGCGGCAACCTGACCGTGGGTAACCAGGGCAATATCAGTACCGCAGGCAATGCAACCCTGACGGCCGGCGACAACCTGACGAGCACTGGCAATCTGACTGTGGGCGGCGTCACCAACGGCACGGCCACCACCGGCAACATCGCGCTGACCGGTAACAATGCACTGGCTGGTCCTGTCAATCTGAACGCGCCGAACGGCACCGTGACCCTGAACACAACCGGCAATACCACGCTGGGTAATGTCACCGCACAAGGCAATGTGACGACTAATGTGTCCAACGGCAGCCTGACAGTCGCTGGCAATACCACAGGTGCCAACACCAACCTGAGTGCCAGCGGCAATCTGACCGTGGGCAACCAGGGCAATATCAGTACCGCGGGCAATGCAACCCTGACTGCCGGCGGTAACCTGAGC"

orfs = findorfs(seq, finder=NaiveFinder) # use finder=NaiveCollector as an alternative

12-element Vector{ORFI{4, NaiveFinder}}:
ORFI{NaiveFinder}(29:40, '+', 2)
ORFI{NaiveFinder}(137:145, '+', 2)
ORFI{NaiveFinder}(164:184, '+', 2)
ORFI{NaiveFinder}(173:184, '+', 2)
ORFI{NaiveFinder}(236:241, '+', 2)
ORFI{NaiveFinder}(248:268, '+', 2)
ORFI{NaiveFinder}(362:373, '+', 2)
ORFI{NaiveFinder}(470:496, '+', 2)
ORFI{NaiveFinder}(551:574, '+', 2)
ORFI{NaiveFinder}(569:574, '+', 2)
ORFI{NaiveFinder}(581:601, '+', 2)
ORFI{NaiveFinder}(695:706, '+', 2)
```

The `ORFI` structure displays the location, strand, and frame, but currently does not include the sequence *per se*. To extract the sequence of an `ORFI` instance, call the `sequence` method directly on it, or broadcast it over the `orfs` collection using the dot syntax `.`:

```julia
sequence.(orfs)

12-element Vector{LongSubSeq{DNAAlphabet{4}}}:
ATGCAACCCTGA
ATGCGCTGA
ATGCGTCGAATGGCACGGTGA
ATGGCACGGTGA
ATGTGA
ATGTGTCCAACGGCAGTCTGA
ATGCAACCCTGA
ATGCACTGGCTGGTCCTGTCAATCTGA
ATGTCACCGCACAAGGCAATGTGA
ATGTGA
ATGTGTCCAACGGCAGCCTGA
ATGCAACCCTGA
```

Similarly, you can extract the amino acid sequences of the ORFIs using the `translate` function.

```julia
translate.(orfs)

12-element Vector{LongAA}:
MQP*
MR*
MRRMAR*
MAR*
M*
MCPTAV*
MQP*
MHWLVLSI*
MSPHKAM*
M*
MCPTAA*
MQP*
```
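
For readers who want to check a translation by hand, the following is a small sketch of codon-by-codon translation in plain Julia. It is an illustration only and does not reproduce how `translate` is implemented: `translate_sketch`, `BASES` and `AMINO_ACIDS` are hypothetical names, the input is assumed to be an uppercase, in-frame `String` containing only `A`, `C`, `G` and `T`, and the standard genetic code is assumed.

```julia
# Illustrative sketch in plain Julia; it does not use BioSequences' `translate`.
const BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (TCAG order).
const AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Hypothetical helper: translate an in-frame DNA string, one codon at a time.
function translate_sketch(dna::AbstractString)
    length(dna) % 3 == 0 || throw(ArgumentError("length must be a multiple of 3"))
    protein = IOBuffer()
    for i in 1:3:length(dna)
        # Zero-based index of each base in "TCAG"; the codon's slot is 16a + 4b + c.
        a, b, c = (findfirst(==(dna[i + k]), BASES) - 1 for k in 0:2)
        print(protein, AMINO_ACIDS[16a + 4b + c + 1])
    end
    return String(take!(protein))
end

translate_sketch("ATGCAACCCTGA")  # "MQP*", matching the first ORFI above
translate_sketch("ATGTGA")        # "M*"
```

The trailing `*` is the stop codon, which is why even the 6 nt ORFI translates to `M*`.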

docs/make.jl

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ fmt = DocumenterVitepress.MarkdownVitepress(

pgs = [
"Home" => "index.md",
+ "Get Started" => "getstarted.md",
"Finding ORFs" => "naivefinder.md",
"Scoring ORFs" => "features.md",
"A Simple Coding Rule" => "simplecodingrule.md",
