GenZProt: Chemically Transferable Generative Backmapping of Coarse-Grained Proteins

This repository accompanies our paper "Chemically Transferable Generative Backmapping of Coarse-Grained Proteins" (arxiv link).

We propose a geometric generative model for backmapping all-atom coordinates from (coarse-grained) C_alpha traces of proteins.

Download PED data

cd data
wget https://zenodo.org/record/7683192/files/genzprot_pedfiles.tar.gz
tar -xvf genzprot_pedfiles.tar.gz

Install packages

You can install the requirements via conda. The PyTorch build used is pytorch=1.11.0=py3.8_cuda11.3_cudnn8.2.0_0.

conda env create -f environment.yml

Download and install this package

git clone https://github.com/learningmatter-mit/GenZProt
cd GenZProt
conda activate genzprot
pip install -e .

Backmapping C_alpha traces into all-atom structures

A saved GenZProt checkpoint is located in './ckpt/'. Save your C_alpha traces in .pdb format and pass the file path to the 'ca_trace_path' argument. An all-atom pdb file (with at least one model/frame) is also required to obtain the topology and the C_alpha mapping; pass its path to the 'topology_path' argument. Note that GenZProt does not support backmapping of PTMs, water, or HETATMs. Please remove all PTM atoms (e.g., OXT) and HETATMs from the all-atom pdb file before running the code.
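The cleanup step described above can be sketched in a few lines of Python. This is a minimal illustration, not part of GenZProt: the helper name `strip_unsupported_records` is hypothetical, and it only drops HETATM records and atoms named OXT. It relies on the fixed-column PDB layout (record name in columns 1-6, atom name in columns 13-16). Any other PTM atoms in your file would have to be added to the filter by name.

```python
def strip_unsupported_records(pdb_lines):
    """Return the PDB lines with HETATM records and OXT atoms removed.

    Hypothetical helper for illustration; not shipped with GenZProt.
    """
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()  # record name, PDB columns 1-6
        if record == "HETATM":
            continue
        # Atom name sits in PDB columns 13-16 (0-based slice 12:16).
        if record == "ATOM" and line[12:16].strip() == "OXT":
            continue
        kept.append(line)
    return kept


# Three made-up records: a backbone N, a terminal OXT, and a water HETATM.
example = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  OXT MET A   1      25.112  22.901   3.870  1.00 10.12           O",
    "HETATM    3  O   HOH A 101      30.001  20.500   5.250  1.00 15.00           O",
]

cleaned = strip_unsupported_records(example)
print(len(cleaned))  # only the backbone N record is kept
```

To apply it to a file, read the lines with `open(path).read().splitlines()`, filter them, and write the result to a new .pdb file that you then pass as 'topology_path'.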

Example:

cd scripts
MPATH=../ckpt/model_seed_12345
ca_trace_name=PED00055_CA_trace
ca_trace_path=../data/${ca_trace_name}.pdb
top_path=../data/PED00055.pdb
python inference.py -load_model_path $MPATH -ca_trace_path $ca_trace_path -topology_path $top_path

The results are saved in both .npy (shape = (10, n_cg_samples, n_atoms_truncated, 3)) and .pdb format, in a directory named result_{MPATH}_{ca_trace_name}. Because our algorithm requires the (i-1)th and (i+1)th C_alpha positions to locate the atoms of the ith residue, it does not backmap the first and the last residue. Hence, n_atoms_truncated = n_atom - (n_atom_first_res + n_atom_last_res).
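As a sanity check on the output shape, the truncation formula above can be worked through in Python. This is a sketch under stated assumptions: the function names and the per-residue atom counts are hypothetical, and the counts must follow whatever atom-counting convention n_atom uses for your topology.

```python
def n_atoms_truncated(atoms_per_residue):
    """n_atom - (n_atom_first_res + n_atom_last_res)."""
    if len(atoms_per_residue) < 3:
        raise ValueError("need at least three residues to backmap any residue")
    n_atom = sum(atoms_per_residue)
    return n_atom - (atoms_per_residue[0] + atoms_per_residue[-1])


def expected_result_shape(n_cg_samples, atoms_per_residue):
    """Shape of the saved .npy array as described in this README."""
    return (10, n_cg_samples, n_atoms_truncated(atoms_per_residue), 3)


# Hypothetical four-residue chain with 8, 5, 4, and 9 atoms per residue.
counts = [8, 5, 4, 9]
print(n_atoms_truncated(counts))         # 26 - (8 + 9) = 9
print(expected_result_shape(2, counts))  # (10, 2, 9, 3)
```

The tuple can be compared against `numpy.load(...).shape` for the .npy file written to the result directory.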

Training your own GenZProt

cd scripts
python train_model.py -load_json modelparams/multi.json

Test script

cd scripts
MPATH=../ckpt/model_seed_12345
test_data_path=../data/PED00055e000.pdb
python test_model.py -load_model_path $MPATH -test_data_path $test_data_path
