DNAproDB Processing Pipeline

The DNAproDB processing pipeline is a collection of Python scripts that take a PDB entry (in mmCIF or PDB file format) as input and produce a single JSON file as output. The JSON file describes the structural characteristics of the DNA and protein and the interactions between protein residues and nucleotides in the complex. Annotations from several other databases are included as well. A schematic of the pipeline is shown below.

The mmCIF file format is preferred: its header information is parsed to extract additional annotations, which are missing when the PDB file format is used.

Install Instructions

Clone this repository and create the conda environment from environment.yml, which installs the Python dependencies:

git clone https://github.com/timkartar/dnaprodb.git
cd dnaprodb
conda env create -f environment.yml
conda activate dnaprodb

Dependencies

In addition to the Python packages listed in python-requirements.txt, dnaprodb relies on several command line tools that are invoked by the processing scripts. The following must be compiled or made executable and must be accessible from the system path:

reduce https://github.com/rlabduke/reduce
hbplus (source available in /share/hbplus.tar.gz)
Curves 5.3 (source available in /share/curves5.3_edited.tar.gz)
3DNA (source available in x3dna-v2.3.tar.gz)
pdb2pqr (source available in pdb2pqr-3.5.2.zip)
dssr (available as linux binary in /share/x3dna-dssr)
dssp (available as linux binary in /share/dssp)
SNAP (available as linux binary in /share/x3dna-snap)
msms (available as linux binary in /share/msms)
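Because the processing scripts call these tools by name, it can help to confirm they are reachable before running the pipeline. The following is a minimal sketch, not part of the repository: the executable names in ASSUMED_TOOLS are guesses based on the list above (for example, the Curves and 3DNA binaries are omitted because their executable names are not stated here), so adjust them to match your local installation.

```python
import shutil

# ASSUMPTION: these executable names are illustrative guesses derived from the
# dependency list; edit them to match how the tools are named on your system.
ASSUMED_TOOLS = ["reduce", "hbplus", "x3dna-dssr", "x3dna-snap", "dssp", "msms", "pdb2pqr"]


def find_missing_tools(tools, which=shutil.which):
    """Return the tool names that cannot be resolved on the system path.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in tools if which(name) is None]


def report(tools=ASSUMED_TOOLS):
    """Print one line per tool saying whether it was found on the path."""
    missing = set(find_missing_tools(tools))
    for name in tools:
        status = "MISSING" if name in missing else "ok"
        print(f"{name:12s} {status}")
    return sorted(missing)
```

Calling report() prints a status line for each tool and returns the names that still need to be installed or added to the path.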

Run Instructions

To run the pipeline, invoke the main script processStructure.py from the command line:

./processStructure.py <FILE_NAME> --clean

The output is a single file named <FILE_NAME_PREFIX>.json containing all the DNAproDB output data.
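The run step can also be driven from Python, for example when processing several structures in a row. The sketch below is illustrative only and makes two assumptions that the text above does not confirm: that <FILE_NAME_PREFIX> is the input file name without its final extension, and that the JSON file is written to the current working directory. The helper names are hypothetical, and no particular JSON schema is assumed; the function only lists the top-level keys.

```python
import json
import subprocess
from pathlib import Path


def build_command(structure_file):
    """Build the command line shown above for one input file."""
    return ["./processStructure.py", str(structure_file), "--clean"]


def expected_output_name(structure_file):
    """ASSUMPTION: the prefix is the file name minus its final extension."""
    return Path(structure_file).with_suffix(".json").name


def run_pipeline(structure_file):
    """Run the pipeline on one file and return the top-level JSON keys.

    ASSUMPTION: the output is written to the current working directory.
    """
    subprocess.run(build_command(structure_file), check=True)
    with open(expected_output_name(structure_file)) as handle:
        data = json.load(handle)
    # Make no assumption about the schema; just report what is at the top level.
    return sorted(data) if isinstance(data, dict) else []
```

For instance, run_pipeline("1abc.cif") would invoke the script on a hypothetical file 1abc.cif and then read 1abc.json. With check=True, a non-zero exit status from the script raises subprocess.CalledProcessError instead of failing silently.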
