We build a graph neural network ("Graph"), a PointNet++ ("Point"), and a 3D convolutional network ("Voxel") as baseline methods and evaluate them on the ProteinShake tasks. See the paper for more information on the model architectures.
Table 1: Results of baseline models/representations (columns) on the ProteinShake tasks (rows). The best value per task is marked in bold; values are given as mean and standard deviation over 4 random seeds. The optimal choice of representation depends on the task. Results were obtained on the random split; see the paper's supplement for the other splits.
Task | Graph | Point | Voxel |
---|---|---|---|
Binding Site | 0.721 | 0.609 | - |
Enzyme Class | 0.790 | 0.712 | 0.656 |
Gene Ontology | 0.704 | 0.580 | 0.609 |
Ligand Affinity | 0.670 | 0.683 | 0.690 |
Protein Family | 0.728 | 0.609 | 0.543 |
Protein-Protein Interface | 0.883 | 0.974 | - |
Structural Class | 0.495 | 0.293 | 0.221 |
Structure Similarity | 0.598 | 0.627 | 0.620 |
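The caption of Table 1 reports each entry as a mean and standard deviation over 4 random seeds. A minimal sketch of that aggregation is shown here; the per-seed scores are hypothetical placeholders, and the use of the sample standard deviation (rather than the population one) is an assumption, not something the source specifies.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    return mean(scores), stdev(scores)


# Hypothetical per-seed test scores for one task/representation pair.
# These numbers are illustrative only and do not come from the paper.
seed_scores = [0.70, 0.72, 0.74, 0.72]

avg, spread = summarize(seed_scores)
print(f"{avg:.3f} +/- {spread:.3f}")
```

With only 4 seeds the spread estimate is coarse, so small differences between representations should be read with that in mind.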
Figure 2: Comparison of random, sequence, and structure splits across tasks and representations. Models generalize less well to the sequence and structure splits than to the random split.
Figure 3: Relative improvement due to pre-training across tasks and representations. Performance is substantially improved by pre-training with AlphaFoldDB. Tasks are abbreviated with their initials. Values are relative to the metric values obtained from the supervised model without pre-training.
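As a rough illustration of what "relative to the supervised model without pre-training" can mean, the sketch below computes a fractional change between two metric values. The exact formula used for Figure 3 is not stated here, so treating it as `(pretrained - baseline) / baseline` is an assumption, and the function name and numbers are hypothetical.

```python
def relative_improvement(pretrained, baseline):
    """Fractional change of the pre-trained score relative to the baseline."""
    if baseline == 0:
        raise ValueError("baseline metric must be non-zero")
    return (pretrained - baseline) / baseline


# Hypothetical metric values, not taken from the paper.
change = relative_improvement(0.55, 0.50)
print(f"{change:+.1%}")
```

A positive value means the pre-trained model scores higher than the model trained from scratch on the same task and representation.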
One can use `conda`, `mamba`, or `pip` to install the required packages. The main dependencies are:
- proteinshake
- pytorch
- pyg
- pytorch-lightning
- hydra
An example of installing `ProteinShake_eval` with `mamba` (similar to `conda`, but faster):
mamba create -n proteinshake
mamba activate proteinshake
mamba install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia
mamba install pyg -c pyg
mamba install lightning
pip install hydra-core --upgrade
pip install proteinshake
pip install -e .
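After the installation commands above, a quick sanity check can confirm that the main dependencies are visible to the active environment. This is a minimal sketch: the helper `missing_packages` is hypothetical (not part of the repository), and the import names in `MODULES` are assumptions about how each package is imported. The Lightning package is left out because its import name differs between releases.

```python
from importlib.util import find_spec

# Package name -> import name. The import names are assumptions.
MODULES = {
    "proteinshake": "proteinshake",
    "pytorch": "torch",
    "pyg": "torch_geometric",
    "hydra": "hydra",
}


def missing_packages(modules=MODULES):
    """Return the package names whose import module cannot be found."""
    return [name for name, module in modules.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("All found" if not missing else f"Missing: {', '.join(missing)}")
```

The check only looks up the modules without importing them, so it does not verify that CUDA or the installed versions are compatible.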
The weights for pre-trained models are available in the repository.
Train a graph neural network from scratch for the Enzyme Class prediction task:
python experiments/train.py task=enzyme_class representation=graph
Fine-tune a pre-trained PointNet++ for the Ligand Affinity prediction task:
python experiments/train.py task=ligand_affinity representation=point_cloud pretrained=true
Use `python experiments/train.py` to see more details.
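To run several task/representation combinations, the training commands above can be generated programmatically. The sketch below only builds and prints the command lines; `build_commands` is a hypothetical helper, and it uses only the `task=`, `representation=`, and `pretrained=` values that appear in the examples above. Other values would need to be checked against the repository's configuration.

```python
import itertools
import shlex


def build_commands(tasks, representations, pretrained=False):
    """Return one argv list per (task, representation) pair."""
    commands = []
    for task, representation in itertools.product(tasks, representations):
        argv = [
            "python",
            "experiments/train.py",
            f"task={task}",
            f"representation={representation}",
        ]
        if pretrained:
            argv.append("pretrained=true")
        commands.append(argv)
    return commands


commands = build_commands(
    ["enzyme_class", "ligand_affinity"], ["graph", "point_cloud"]
)
for argv in commands:
    # Print only; pass argv to subprocess.run(argv, check=True) to execute.
    print(shlex.join(argv))
```

Keeping each command as an argv list avoids shell-quoting issues if the commands are later executed with `subprocess`.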
Pre-train a graph neural network with the masked-residue script:
python experiments/pretrain_mask_residues.py representation=graph
Code in this repository is licensed under BSD-3; the model weights are licensed under CC-BY-4.0.