A domain-specific language (transcription between UTF-8 text and DNA bases) based on YAML.
Please refer to coding.py:
DNA_TO_BIN = {
    "A": "00",
    "C": "01",
    "G": "10",
    "T": "11"
}
BIN_TO_DNA = {
    v: k
    for k, v in DNA_TO_BIN.items()
}
DNA_COMPLEMENT = {
    "A": "T",
    "T": "A",
    "C": "G",
    "G": "C"
}

DNA can be installed from PyPI:
pip install dnadsl
or download the repository and run the following from the repository root folder:
pip install .
Run dna --help for help:
$ dna --help
usage: dna [-h] [-m {encode,decode}] [-i INPUT_FILE] [-o OUTPUT_FILE] [-v]

+--------------------------------------------------+
|                       DNA                        |
|            A domain-specific language            |
| (transcription between UTF-8 text and DNA bases) |
|                  based on YAML.                  |
+--------------------------------------------------+

options:
  -h, --help            show this help message and exit
  -m {encode,decode}, --mode {encode,decode}
                        Choose the mode of transcoding.
                        encode: UTF-8 to bases; decode: bases to UTF-8.
                        The default is: encode
  -i INPUT_FILE, --input-file INPUT_FILE
                        The path of the input YAML file.
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        The path of the output YAML file.
  -v, --version         Print the version number of dna and exit.

Convert UTF-8 text to DNA bases, e.g., run
dna -m encode -i input_text.yml -o output_bases.yml:

Input (input_text.yml):
text_utf8: 😄😊

Output (output_bases.yml):
text_utf8: 😄😊
dna:
  positive_strand:
    sequence: TTAAGCTTGCGAGACATTAAGCTTGCGAGAGG
    binary: 11110000 10011111 10011000 10000100 11110000 10011111 10011000 10001010
    text: 😄😊
  negative_strand:
    sequence: AATTCGAACGCTCTGTAATTCGAACGCTCTCC
    binary: 00001111 01100000 01100111 01111011 00001111 01100000 01100111 01110101
    text: "\x0F`g{\x0F`gu"

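The encode example can be reproduced by hand from the mapping tables in coding.py. The following is a minimal sketch, assuming each UTF-8 byte is written as 8 bits and every pair of bits maps to one base, and that the negative strand is the base-wise complement without reversal (which is what the sample output shows). The helper names `text_to_bases` and `complement` are hypothetical and are not necessarily the functions the package itself uses.

```python
# Mapping tables as listed in coding.py.
DNA_TO_BIN = {"A": "00", "C": "01", "G": "10", "T": "11"}
BIN_TO_DNA = {v: k for k, v in DNA_TO_BIN.items()}
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def text_to_bases(text: str) -> str:
    """Hypothetical helper: UTF-8 bytes -> bit string -> bases."""
    # Each byte becomes exactly 8 bits, so the bit string splits evenly into pairs.
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(BIN_TO_DNA[bits[i:i + 2]] for i in range(0, len(bits), 2))


def complement(sequence: str) -> str:
    """Hypothetical helper: base-wise complement, no reversal."""
    return "".join(DNA_COMPLEMENT[base] for base in sequence)


positive = text_to_bases("😄😊")
negative = complement(positive)
print(positive)  # TTAAGCTTGCGAGACATTAAGCTTGCGAGAGG
print(negative)  # AATTCGAACGCTCTGTAATTCGAACGCTCTCC
```

Each byte yields four bases, so the two four-byte emoji produce a 32-base sequence.
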
Convert DNA bases to UTF-8 text, e.g., run
dna -m decode -i input_bases.yml -o output_text.yml:

Input (input_bases.yml):
positive_strand: TGAGGCTCGGCATGTTGTGAGATTTTAAGCTTGCAAGTCG

Output (output_text.yml):
text_utf8: ❤️🐶
dna:
  positive_strand:
    sequence: TGAGGCTCGGCATGTTGTGAGATTTTAAGCTTGCAAGTCG
    binary: 11100010 10011101 10100100 11101111 10111000 10001111 11110000 10011111 10010000 10110110
    text: ❤️🐶
  negative_strand:
    sequence: ACTCCGAGCCGTACAACACTCTAAAATTCGAACGTTCAGC
    binary: 00011101 01100010 01011011 00010000 01000111 01110000 00001111 01100000 01101111 01001001
    text: "\x1Db[\x10Gp\x0F`oI"

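Decoding runs the same mapping in reverse. The sketch below assumes the sequence length is a multiple of four (one byte per four bases) and that the positive strand holds valid UTF-8. `bases_to_bytes` is a hypothetical helper name, and how the package itself handles negative-strand bytes that are not valid UTF-8 is not covered here.

```python
# Mapping tables as listed in coding.py.
DNA_TO_BIN = {"A": "00", "C": "01", "G": "10", "T": "11"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def bases_to_bytes(sequence: str) -> bytes:
    """Hypothetical helper: bases -> bit string -> raw bytes."""
    bits = "".join(DNA_TO_BIN[base] for base in sequence)
    # Assumes len(sequence) is a multiple of 4, so bits split into whole bytes.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


positive = "TGAGGCTCGGCATGTTGTGAGATTTTAAGCTTGCAAGTCG"
# Base-wise complement without reversal, matching the sample output.
negative = "".join(DNA_COMPLEMENT[base] for base in positive)

text = bases_to_bytes(positive).decode("utf-8")
negative_bytes = bases_to_bytes(negative)

print(text)      # ❤️🐶
print(negative)  # ACTCCGAGCCGTACAACACTCTAAAATTCGAACGTTCAGC
```

Every bit of the negative strand is the inverse of the corresponding positive-strand bit, which is why its text is unrelated control and ASCII characters rather than the original emoji.
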
The binaries are created with PyInstaller:
# Package it on Linux
pyinstaller --name DNA --onefile -p dna dna/__main__.py
# Package it on Windows
pyinstaller --name DNA --onefile --icon python.ico -p dna dna/__main__.py

DNA is a free, open-source software package (distributed under the GPLv3 license). The logo used in README.md was downloaded from Wikimedia Commons. The Python icon was downloaded from python.ico.