merge_pair_seq

Biopiece: merge_pair_seq

Description

merge_pair_seq merges paired sequences in the stream, provided they are interleaved. Sequence names must be in either Illumina 1.3/1.5 format, with a trailing /1 or /2, or Illumina 1.8 format, containing 1: or 2:. The names of the two reads in a pair must match, apart from this mate indicator, for the sequences to be merged.
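
The naming rule can be illustrated with a minimal Python sketch. It is not the Biopieces implementation: the helper names pair_key and is_pair and the two regular expressions are assumptions made for illustration, and the real tool may match names differently.

```python
import re

# Hypothetical patterns (assumptions, not taken from the Biopieces source).
OLD_STYLE = re.compile(r"^(.+)/([12])$")    # Illumina 1.3/1.5: name/1, name/2
NEW_STYLE = re.compile(r"^(\S+) ([12]):")   # Illumina 1.8: "name 1:N:0:14"


def pair_key(seq_name):
    """Return (shared name, mate number), or None if no mate tag is found."""
    match = OLD_STYLE.match(seq_name) or NEW_STYLE.match(seq_name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def is_pair(name1, name2):
    """True if name1 is mate 1 and name2 is mate 2 of the same fragment."""
    key1, key2 = pair_key(name1), pair_key(name2)
    if key1 is None or key2 is None:
        return False
    return key1[0] == key2[0] and (key1[1], key2[1]) == (1, 2)
```

Under these assumptions, two consecutive records would be merged only when is_pair returns True for their SEQ_NAME values.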

An example of a merged record:

SEQ_LEN_RIGHT: 15
SEQ_LEN_LEFT: 15
SCORES: <???9?BBBDBDDBDDFFFFFFHHHIFHFH
SEQ: TAGGGAATCTTGCACAATGGAGGAAACTCT
SEQ_LEN: 30
SEQ_NAME: M01168:16:000000000-A1R9L:1:1101:13906:2139 1:N:0:14
---
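
The SEQ_LEN_LEFT and SEQ_LEN_RIGHT keys record where the two reads meet, so a merged record can be split again. The following Python sketch shows one way to do that, assuming a record is held as a plain dictionary of key/value pairs; the function name split_merged is hypothetical and not part of Biopieces.

```python
def split_merged(record):
    """Split a merged record into ((left_seq, left_scores), (right_seq, right_scores)).

    Assumes `record` is a dict with the keys shown in the example record.
    """
    left_len = int(record["SEQ_LEN_LEFT"])
    right_len = int(record["SEQ_LEN_RIGHT"])
    seq = record["SEQ"]
    scores = record["SCORES"]

    # The two read lengths should add up to the merged sequence length.
    if left_len + right_len != len(seq) or len(scores) != len(seq):
        raise ValueError("record lengths are inconsistent")

    left = (seq[:left_len], scores[:left_len])
    right = (seq[left_len:], scores[left_len:])
    return left, right
```

Applied to the example record, this would give TAGGGAATCTTGCAC as the left read and AATGGAGGAAACTCT as the right read, each 15 bases long.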

Usage

... | merge_pair_seq [options]

Options

[-?          | --help]               #  Print full usage description.
[-I <file!>  | --stream_in=<file!>]  #  Read input from stream file  -  Default=STDIN
[-O <file>   | --stream_out=<file>]  #  Write output to stream file  -  Default=STDOUT
[-v          | --verbose]            #  Verbose output.

Examples

Consider the following FASTQ entries in the file test.fq:

@M01168:16:000000000-A1R9L:1:1101:14862:1868 1:N:0:14
TGGGGAATATTGGACAATGG
+
<??????BDDDDDDDDGGGG
@M01168:16:000000000-A1R9L:1:1101:14862:1868 2:N:0:14
CCTGTTTGCTACCCACGCTT
+
?????BB<-<BDDDDDFEEF
@M01168:16:000000000-A1R9L:1:1101:13906:2139 1:N:0:14
TAGGGAATCTTGCACAATGG
+
<???9?BBBDBDDBDDFFFF
@M01168:16:000000000-A1R9L:1:1101:13906:2139 2:N:0:14
ACTCTTCGCTACCCATGCTT
+
,5<??BB?DDABDBDDFFFF
@M01168:16:000000000-A1R9L:1:1101:14865:2158 1:N:0:14
TAGGGAATCTTGCACAATGG
+
?????BBBBBDDBDDBFFFF
@M01168:16:000000000-A1R9L:1:1101:14865:2158 2:N:0:14
CCTCTTCGCTACCCATGCTT
+
??,<??B?BB?BBBBBFF?F

To merge these interleaved paired-end sequences, use merge_pair_seq:

read_fastq -e base_33 -i test.fq | merge_pair_seq

SEQ_NAME: M01168:16:000000000-A1R9L:1:1101:14862:1868 1:N:0:14
SEQ: TGGGGAATATTGGACAATGGCCTGTTTGCTACCCACGCTT
SEQ_LEN: 40
SCORES: <??????BDDDDDDDDGGGG?????BB<-<BDDDDDFEEF
SEQ_LEN_LEFT: 20
SEQ_LEN_RIGHT: 20
---
SEQ_NAME: M01168:16:000000000-A1R9L:1:1101:13906:2139 1:N:0:14
SEQ: TAGGGAATCTTGCACAATGGACTCTTCGCTACCCATGCTT
SEQ_LEN: 40
SCORES: <???9?BBBDBDDBDDFFFF,5<??BB?DDABDBDDFFFF
SEQ_LEN_LEFT: 20
SEQ_LEN_RIGHT: 20
---
SEQ_NAME: M01168:16:000000000-A1R9L:1:1101:14865:2158 1:N:0:14
SEQ: TAGGGAATCTTGCACAATGGCCTCTTCGCTACCCATGCTT
SEQ_LEN: 40
SCORES: ?????BBBBBDDBDDBFFFF??,<??B?BB?BBBBBFF?F
SEQ_LEN_LEFT: 20
SEQ_LEN_RIGHT: 20
---
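
The output above is consistent with a simple concatenation: the second read's sequence and scores are appended to the first read's, the first read's name is kept, and the two original lengths are stored. The Python sketch below reproduces that behaviour for one pair. It is an illustration based only on the example output, not the Biopieces source, and the function name merge_pair and the dictionary representation are assumptions.

```python
def merge_pair(rec1, rec2):
    """Merge two records (dicts with SEQ_NAME, SEQ, SCORES) as in the example output."""
    seq = rec1["SEQ"] + rec2["SEQ"]
    return {
        "SEQ_NAME": rec1["SEQ_NAME"],            # name of the first read is kept
        "SEQ": seq,
        "SEQ_LEN": len(seq),
        "SCORES": rec1["SCORES"] + rec2["SCORES"],
        "SEQ_LEN_LEFT": len(rec1["SEQ"]),
        "SEQ_LEN_RIGHT": len(rec2["SEQ"]),
    }


# First pair from test.fq
read1 = {
    "SEQ_NAME": "M01168:16:000000000-A1R9L:1:1101:14862:1868 1:N:0:14",
    "SEQ": "TGGGGAATATTGGACAATGG",
    "SCORES": "<??????BDDDDDDDDGGGG",
}
read2 = {
    "SEQ_NAME": "M01168:16:000000000-A1R9L:1:1101:14862:1868 2:N:0:14",
    "SEQ": "CCTGTTTGCTACCCACGCTT",
    "SCORES": "?????BB<-<BDDDDDFEEF",
}

merged = merge_pair(read1, read2)
for key, value in merged.items():
    print(f"{key}: {value}")
print("---")
```

Note that the second read is appended as-is; the example output shows no reverse-complementing or overlap assembly.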

Author

[email protected]

March 2013

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

merge_pair_seq is part of the Biopieces framework.

http://www.biopieces.org
