Martin Asser Hansen edited this page Oct 2, 2015 · 5 revisions

Biopiece: read_2bit

Description

[read_2bit] reads nucleotide sequence entries from 2bit files. The sequence is compressed using two bits per base. An index of the sequences is located at the beginning of the file, and each sequence record carries information about soft and hard masking.

Each resulting Biopiece record has the following form:

SEQ_NAME: test
SEQ_LEN: 10
SEQ: ATCGATCGAC
---
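For downstream scripting, the record layout shown above (one `KEY: value` pair per line, with `---` closing the record) can be parsed with a few lines of Python. This is a minimal sketch and not part of Biopieces: the function name `parse_records` is hypothetical, and it assumes that keys and values are separated by a colon and a single space, exactly as in the example record.

```python
def parse_records(lines):
    """Yield one dict per record from an iterable of text lines."""
    record = {}
    for line in lines:
        line = line.rstrip("\n")
        if line == "---":
            # Record delimiter: emit what has been collected so far.
            if record:
                yield record
            record = {}
        elif line:
            # Assumption: "KEY: value" with a colon-space separator.
            key, _, value = line.partition(": ")
            record[key] = value
    if record:
        # Tolerate a final record that lacks a trailing delimiter.
        yield record


sample = "SEQ_NAME: test\nSEQ_LEN: 10\nSEQ: ATCGATCGAC\n---\n"
records = list(parse_records(sample.splitlines()))
print(records)
```

All values are kept as strings; a caller that needs the length as a number would convert `SEQ_LEN` with `int()`.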

For more about the 2bit format, see:

http://genome.ucsc.edu/FAQ/FAQformat#format7
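To make the two-bits-per-base packing and the two kinds of masking concrete, here is a small Python sketch. It is an illustration only, not the code [read_2bit] uses. The bit codes (T=00, C=01, A=10, G=11, first base in the most significant bits of each byte) follow the UCSC format description; the helper names `unpack_dna` and `apply_blocks` are hypothetical, and the `no_mask` parameter merely mirrors the `-N` option on the assumption that ignoring soft masking leaves the bases in uppercase.

```python
BASES = "TCAG"  # 2-bit codes 0..3 per the UCSC description: T=00, C=01, A=10, G=11


def unpack_dna(packed, dna_size):
    """Decode dna_size bases from bytes holding four bases each."""
    bases = []
    for i in range(dna_size):
        byte = packed[i // 4]
        shift = 6 - 2 * (i % 4)  # first base sits in the two highest bits
        bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)


def apply_blocks(seq, n_blocks=(), mask_blocks=(), no_mask=False):
    """Apply hard masking (N) and, unless no_mask is set, soft masking (lowercase).

    Blocks are (start, size) pairs with 0-based starts.
    """
    chars = list(seq)
    for start, size in n_blocks:
        chars[start:start + size] = "N" * size
    if not no_mask:
        for start, size in mask_blocks:
            chars[start:start + size] = [c.lower() for c in chars[start:start + size]]
    return "".join(chars)


# ATCG -> 10 00 01 11 = 0x87; AC plus padding -> 10 01 00 00 = 0x90
packed = bytes([0x87, 0x87, 0x90])
seq = unpack_dna(packed, 10)
print(seq)                                               # ATCGATCGAC
print(apply_blocks(seq, mask_blocks=[(4, 3)]))           # ATCGatcGAC
print(apply_blocks(seq, mask_blocks=[(4, 3)], no_mask=True))  # ATCGATCGAC
```

A real 2bit reader also has to parse the file header, handle byte order, and follow the index to each sequence record; those steps are left out here.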

Usage

read_2bit [options] -i <2bit file(s)>

Options

[-?          | --help]               #  Print full usage description.
[-i <files!> | --data_in=<files!>]   #  Comma-separated list of files or glob expression to read.
[-n <uint>   | --num=<uint>]         #  Limit number of records to read.
[-N          | --no_mask]            #  Ignore soft masking.
[-I <file!>  | --stream_in=<file!>]  #  Read input stream from file  -  Default=STDIN
[-O <file>   | --stream_out=<file>]  #  Write output stream to file  -  Default=STDOUT
[-v          | --verbose]            #  Verbose output.

Examples

To read all 2bit entries from a file:

read_2bit -i test.2bit

To read only 10 records from a 2bit file:

read_2bit -n 10 -i test.2bit

To read all 2bit entries from multiple files:

read_2bit -i test1.2bit,test2.2bit

To read 2bit entries from multiple files using a glob expression:

read_2bit -i '*.2bit'

See also

[write_2bit]

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

August 2007

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

[read_2bit] is part of the Biopieces framework.

http://www.biopieces.org
