Improve documentation for findorfs function and NaiveFinder algorithm; clarify usage examples and enhance explanations of keyword arguments and output.
docs/src/getstarted.md
+16 -7 (16 additions & 7 deletions)
@@ -1,14 +1,17 @@
1
1
## Finding complete and overlapped ORFIs
2
2
3
-
The main package function is `findorfs`. Under the hood, the `findorfs` function is an interface for different genefinding algorithms that can be plugged using the `finder` keyword argument. By default it uses the`NaiveFinder` algorithm, which is a simple algorithm that finds all (non-outbounded) ORFIs in a DNA sequence (see the [NaiveFinder](https://camilogarciabotero.github.io/GeneFinder.jl/dev/api/#GeneFinder.NaiveFinder-Union{Tuple{Union{BioSequences.LongDNA{N},%20BioSequences.LongSubSeq{BioSequences.DNAAlphabet{N}}}},%20Tuple{N}}%20where%20N) documentation for more details).
3
+
The main function in the GeneFinder package is `findorfs`, which serves as an interface to various gene-finding algorithms. By default, `findorfs` uses the `NaiveFinder` algorithm, a simple approach that detects all non-outbounded Open Reading Frames (ORFs) in a DNA sequence. You can also specify a different algorithm by setting the `finder` keyword argument. For more details on the algorithm, see the [NaiveFinder](https://camilogarciabotero.github.io/GeneFinder.jl/dev/api/#GeneFinder.NaiveFinder-Union{Tuple{Union{BioSequences.LongDNA{N},%20BioSequences.LongSubSeq{BioSequences.DNAAlphabet{N}}}},%20Tuple{N}}%20where%20N) documentation.
4
4
5
5
!!! note
6
-
The `minlen` kwarg in the `NaiveFinder`mehtod has been set to 6nt, so it will catch random ORFIs not necesarily genes thus it might consider `dna"ATGTGA"` -> `aa"M*"` as a plausible ORFI.
6
+
The `minlen` keyword argument in `NaiveFinder` is set to 6 nucleotides (nt). As a result, it may identify short ORFs that aren't necessarily genes, such as `dna"ATGTGA"`, which translates to the amino acid sequence `aa"M*"`.
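
To see why a 6 nt minimum admits `ATGTGA`, the idea behind a naive search can be sketched in plain Julia. This is a minimal illustration on ordinary strings, not GeneFinder's implementation: `naive_orfs` is a hypothetical helper, it scans only the forward strand of an uppercase sequence, and it assumes `ATG` as the only start codon with `TAA`, `TAG`, and `TGA` as stop codons.

```julia
# Hypothetical helper for illustration only; not part of GeneFinder.jl.
# Returns the index ranges of forward-strand ORFs in an uppercase DNA string.
function naive_orfs(seq::AbstractString; minlen::Int = 6)
    # Start codon, then as few whole codons as possible, then the first in-frame stop codon.
    pattern = r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)"
    ranges = UnitRange{Int}[]
    # overlap = true lets a new match begin inside a previous one (overlapping ORFs).
    for m in eachmatch(pattern, seq; overlap = true)
        stop = m.offset + ncodeunits(m.match) - 1
        if length(m.match) >= minlen
            push!(ranges, m.offset:stop)
        end
    end
    return ranges
end

naive_orfs("ATGTGA")             # == [1:6], a start codon directly followed by a stop codon
naive_orfs("CCATGAAATAGATGTGA")  # == [3:11, 12:17]
```

Raising the threshold, for example `naive_orfs("ATGTGA"; minlen = 9)`, drops the six-nucleotide hit and returns an empty vector, which is the trade-off the note describes.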
7
7
8
-
Here is an example of how to use the `findorfs` function with the `NaiveFinder` algorithm:
8
+
9
+
## Usage example
10
+
11
+
Here's an example of using `findorfs` with the `NaiveFinder` algorithm to identify ORFs in a DNA sequence:
9
12
10
13
```julia
11
-
using BioSequences, GeneFinder
14
+
julia> using BioSequences, GeneFinder
12
15
13
16
# > 180195.SAMN03785337.LFLS01000089 -> finds only 1 gene in Prodigal (from Pyrodigal tests)
@@ -30,10 +33,12 @@ orfs = findorfs(seq, finder=NaiveFinder) # use finder=NaiveCollector as an alter
30
33
ORFI{NaiveFinder}(695:706, '+', 2)
31
34
```
32
35
36
+
## Extracting Sequences from ORFIs
37
+
33
38
The `ORFI` structure displays the location, frame, and strand, but currently does not include the sequence *per se*. To extract the sequence of an `ORFI` instance, you can use the `sequence` method directly on it, or you can also broadcast it over the `orfs` collection using the dot syntax `.`:
34
39
35
40
```julia
36
-
sequence.(orfs)
41
+
julia> sequence.(orfs)
37
42
38
43
12-element Vector{LongSubSeq{DNAAlphabet{4}}}:
39
44
ATGCAACCCTGA
@@ -50,10 +55,12 @@ sequence.(orfs)
50
55
ATGCAACCCTGA
51
56
```
52
57
58
+
## Translating ORFIs to Amino Acid Sequences
59
+
53
60
Similarly, you can extract the amino acid sequences of the ORFIs using the `translate` function.
54
61
55
62
```julia
56
-
translate.(orfs)
63
+
julia> translate.(orfs)
57
64
58
65
12-element Vector{LongAA}:
59
66
MQP*
@@ -68,4 +75,6 @@ translate.(orfs)
68
75
M*
69
76
MCPTAA*
70
77
MQP*
71
78
```
79
+
80
+
This returns a vector of translated amino acid sequences, allowing for easy interpretation of each ORF's potential protein product.