
A deep learning model developed using large and accurate dataset generated via atom-centered potentials approach


MSRG/DeepBDE


DeepBDE: a graph neural network for fast and accurate bond dissociation enthalpies

This repository contains the official implementation of DeepBDE, available on arXiv.

Getting Started

  1. Setup environment

    Create an environment with all necessary dependencies. This can be done using Conda:

    conda create -n "deepbde" python=3.12
    conda activate deepbde
    pip install -r requirements.txt

    DGL needs to be installed separately, as our repo expects it to be built with CUDA support.

    pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/repo.html
  2. Download model and transforms (or dataset CSV if training)

To run predictions

  1. Extract model and transform

    Place these files in the parent directory of this repo.

  2. Inferencing

    Single reaction inferencing - requires the reactant SMILES and a bond index. The bond index is defined by the order in which RDKit arranges the bonds in the molecule.

    Example: split the reactant given by SMILES CCOc1cccc(O)c1 at bond index 1. The products are [O]c1cccc(O)c1 and [CH2]C.

    python infer.py 'CCOc1cccc(O)c1' 1

    Same as above, but supply the expected products to cross-check that the predicted reaction products match them (if not, an error is raised).

    python infer.py 'CCOc1cccc(O)c1' 1 --product_1_smiles '[O]c1cccc(O)c1' --product_2_smiles '[CH2]C'

    Multiple reaction inferencing - requires the reactant SMILES and a list of bond indices. The bond index is defined by the order in which RDKit arranges the bonds in the molecule. If the list contains a single index, this is identical to single reaction inference. The BDEs are returned in the same order as the input bond indices.

    python multi_infer.py 'C[C@H](O)C(=O)O' '[4,5,9]'

    All valid bond inferencing - requires the reactant SMILES only. All valid bond indices are found and printed before the BDE values are output.

    python infer_all.py 'C[C@H](O)C(=O)O'
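The bond indices used by all three commands follow RDKit's internal bond ordering, which you can inspect directly. A minimal sketch (assuming RDKit is installed in the environment; this snippet is not part of the repo):

```python
from rdkit import Chem

# Parse the example reactant and list its bonds in RDKit's order;
# these indices are what infer.py / multi_infer.py expect.
mol = Chem.MolFromSmiles('CCOc1cccc(O)c1')
for bond in mol.GetBonds():
    print(bond.GetIdx(),
          bond.GetBeginAtom().GetSymbol(),
          bond.GetEndAtom().GetSymbol(),
          bond.GetBondTypeAsDouble())
```

For this molecule, bond index 1 is the ethyl C-O bond of the ethoxy group, which is why splitting it yields [O]c1cccc(O)c1 and [CH2]C in the single-reaction example above.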

Training

  1. Encoding dataset and subset split

    Create the encoded dataset used by our training code from a CSV file (see the download link above). This step also creates the train, validation, and test index split that training requires. We use a typical 8:1:1 split as an example.

    python encode_dataset.py --save_dir [save directory] --csv_path [path to dataset csv] --split '[0.8,0.1,0.1]'
  2. Train model given hyperparameters

    Train the model given a set of hyperparameters. This code supports restarting training: if the path contains a pre-existing training save and the remaining arguments are the same, training resumes from the last recorded epoch.

    We show the hyperparameters used in our final model below. The dset_path should point to the dset/ directory created when the encoded dataset is generated. train_indices_path and valid_indices_path should point to the subset index files generated in the previous step.

    python train.py \
        --path [save path for training] \
        --dset_path [path to the dset directory] \
        --train_indices_path [path training indices list file generated] \
        --valid_indices_path [path validation indices list file generated] \
        --device [cpu or cuda] \
        --num_workers 1 \
        --activation_fn 'silu' \
        \
        --graph_hidden_size 256 \
        --graph_inner_layer_sizes '[[256, 256, 256, 256, 256, 256], [256, 256, 256, 256, 256, 256], [256, 256, 256, 256, 256, 256], [256, 256, 256, 256, 256, 256], [256, 256, 256, 256, 256, 256]]' \
        --fc_readout_sizes '[128, 32, 32, 32, 32, 32, 32, 32]' \
        \
        --learn_rate 0.00011796198660219 \
        --epochs 1000 \
        --batch_size 512 \
        \
        --reducelr_factor 0.8 \
        --reducelr_patience 20 \
        --reducelr_threshold 0.01 \
        \
        --min_epochs 1000 \
        --epochs_of_no_mae_drop_before_stop 1000
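The reducelr_* flags describe a reduce-on-plateau learning-rate schedule. As a rough sketch of that behaviour (a simplified stand-in, assuming an absolute improvement threshold on the validation MAE; the repo's actual implementation may differ, e.g. by wrapping PyTorch's ReduceLROnPlateau):

```python
class PlateauLR:
    """Sketch of a reduce-on-plateau schedule: cut the learning rate by
    `factor` once `patience` consecutive epochs pass without the validation
    MAE improving by at least `threshold`."""

    def __init__(self, lr, factor=0.8, patience=20, threshold=0.01):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.threshold = threshold
        self.best = float('inf')
        self.bad_epochs = 0

    def step(self, val_mae):
        if val_mae < self.best - self.threshold:
            # Sufficient improvement: record it and reset the counter.
            self.best = val_mae
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                # Plateau detected: shrink the learning rate.
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```

With the values above (factor 0.8, patience 20, threshold 0.01), twenty consecutive epochs without a 0.01 validation-MAE improvement trigger a 20% learning-rate cut.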

Citation

If you use DeepBDE in your work, please cite our arXiv preprint.
