<tool id="cshl_fasta_formatter" version="@VERSION@" name="FASTA Width">
<description>formatter</description>
<expand macro="requirements" />
<macros>
<import>fastx_macros.xml</import>
</macros>
<!--
Note:
fasta_formatter also has a tabular output mode (-t),
but Galaxy already contains such a tool, so no need
to offer the user a duplicated tool.
So this XML tool only changes the width (line-wrapping) of a
FASTA file.
-->
<command>
<![CDATA[
zcat -f < '$input' | fasta_formatter -w $width -o '$output'
]]>
</command>
<inputs>
<param format="fasta" name="input" type="data" label="Library to re-format" />
<param name="width" type="integer" value="0" label="New width for nucleotide lines" help="Use 0 to output each sequence on a single line." />
</inputs>
<outputs>
<data format="fasta" name="output" metadata_source="input" />
</outputs>
<tests>
<test>
<!-- Re-format a FASTA file into a single line -->
<param name="input" value="fasta_formatter1.fasta" />
<param name="width" value="0" />
<output name="output" file="fasta_formatter1.out" />
</test>
<test>
<!-- Re-format a FASTA file into multiple lines wrapping at 60 characters -->
<param name="input" value="fasta_formatter1.fasta" />
<param name="width" value="60" />
<output name="output" file="fasta_formatter2.out" />
</test>
</tests>
<help>
**What it does**

This tool re-formats a FASTA file, changing the width of the nucleotide lines.

**TIP:** Outputting each sequence on a single line (with **width = 0**) can be useful for scripting (with **grep**, **awk**, and **perl**). Every odd line is then a sequence identifier, and every even line is the complete nucleotide sequence for that identifier.

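
As a sketch of such scripting, assuming the single-line output has been saved to a file named ``single_line.fasta`` (a hypothetical name used only for illustration), commands like the following might be used to print one record by its identifier and to list the length of each sequence::

    # Print the identifier line and the sequence of one record (assumes width = 0 output)
    grep -A 1 '>Scaffold9299' single_line.fasta

    # Print each identifier followed by the length of its sequence
    awk 'NR % 2 == 1 { id = substr($0, 2) } NR % 2 == 0 { print id "\t" length($0) }' single_line.fasta

These commands rely on the strict two-lines-per-record layout, so they are not expected to give correct results on wrapped FASTA files.
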

--------

**Example**

Input FASTA file (each nucleotide line is 50 characters long)::

    >Scaffold3648
    AGGAATGATGACTACAATGATCAACTTAACCTATCTATTTAATTTAGTTC
    CCTAATGTCAGGGACCTACCTGTTTTTGTTATGTTTGGGTTTTGTTGTTG
    TTGTTTTTTTAATCTGAAGGTATTGTGCATTATATGACCTGTAATACACA
    ATTAAAGTCAATTTTAATGAACATGTAGTAAAAACT
    >Scaffold9299
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG
    TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG
    aactggtctttacctTTAAGTTG

Output FASTA file (with width=80)::

    >Scaffold3648
    AGGAATGATGACTACAATGATCAACTTAACCTATCTATTTAATTTAGTTCCCTAATGTCAGGGACCTACCTGTTTTTGTT
    ATGTTTGGGTTTTGTTGTTGTTGTTTTTTTAATCTGAAGGTATTGTGCATTATATGACCTGTAATACACAATTAAAGTCA
    ATTTTAATGAACATGTAGTAAAAACT
    >Scaffold9299
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTAC
    GTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG

Output FASTA file (with width=0, i.e. a single line per sequence)::

    >Scaffold3648
    AGGAATGATGACTACAATGATCAACTTAACCTATCTATTTAATTTAGTTCCCTAATGTCAGGGACCTACCTGTTTTTGTTATGTTTGGGTTTTGTTGTTGTTGTTTTTTTAATCTGAAGGTATTGTGCATTATATGACCTGTAATACACAATTAAAGTCAATTTTAATGAACATGTAGTAAAAACT
    >Scaffold9299
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG

------

This tool is based on `FASTX-toolkit`__ by Assaf Gordon.

.. __: http://hannonlab.cshl.edu/fastx_toolkit/

</help>
</tool>