DGKL (Deep Graph Kernel Learning) is a framework that combines graph neural networks with Gaussian processes to enable accurate uncertainty quantification in catalysis. It provides reliable predictions of adsorption energies with calibrated uncertainty estimates, which is crucial for computational catalyst screening and discovery.
- Dual-level uncertainty quantification: Captures both material-level and atomic-level uncertainties
- State-of-the-art performance: Outperforms ensemble, evidential, and MC Dropout baselines for uncertainty quantification in catalysis
- Multiple GNN backbones: Supports SchNet and PaiNN architectures
- Flexible kernel learning: Learnable graph kernels for improved uncertainty estimates
- Comprehensive benchmarking: Tested on CatHub and OC20 datasets
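The core idea behind the Gaussian-process side of DGKL can be illustrated with a toy exact-GP regression: the posterior variance grows for inputs far from the training data, which is what makes GP-based predictions come with calibrated uncertainty. The sketch below is a minimal NumPy illustration of that principle only, not the DGKL implementation (DGKL learns the kernel on graph representations via a GNN):

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X_train, y_train, X_test, noise=1e-2):
    """Exact GP posterior mean and per-point variance (toy example)."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0) + noise
    return mean, var

# Toy data: predictive uncertainty grows away from the training points
X = np.array([-2.0, -1.0, 0.0, 1.0])
y = np.sin(X)
Xs = np.array([0.5, 5.0])   # one interpolation point, one far extrapolation point
mean, var = gp_predict(X, y, Xs)
```

Here `var[1]` (far from the data) is much larger than `var[0]` (surrounded by data), mirroring how DGKL flags out-of-distribution adsorption structures with high uncertainty.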
- OS: Linux (Ubuntu 20.04+ recommended)
- GPU: NVIDIA GPU with CUDA support (tested on A100, V100, RTX 4060)
- Memory: 16GB+ RAM recommended
- Storage: ~50GB for datasets and models
- Python 3.10+
- PyTorch 2.0+ with CUDA support
- PyTorch Geometric
- GPyTorch 1.11+
- ASE (Atomic Simulation Environment)
- PyTorch Lightning 2.0+
- Weights & Biases (optional, for experiment tracking)
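A quick way to verify that the dependencies above are importable after installation is to check for their module specs without importing them. Note the import names differ from some pip package names (e.g. PyTorch Geometric is imported as torch_geometric); the snippet below is an illustrative helper, not part of the DGKL codebase:

```python
import importlib.util

# Import names for the dependencies listed above.
REQUIRED = ["torch", "torch_geometric", "gpytorch", "ase", "pytorch_lightning"]
OPTIONAL = ["wandb"]  # Weights & Biases, only needed for experiment tracking

def check_deps(names):
    """Return the subset of module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]

missing = check_deps(REQUIRED)
if missing:
    print("Missing required packages:", ", ".join(missing))
else:
    print("All required packages found.")
```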
# Clone the repository
git clone https://github.com/yourusername/DGKL.git
cd DGKL
# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies
poetry install
# Activate the virtual environment
poetry shell

# Create a virtual environment
python -m venv dgkl_env
source dgkl_env/bin/activate # On Linux/Mac
# Install PyTorch (adjust for your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Install other dependencies
pip install torch-geometric gpytorch ase pytorch-lightning wandb

DGKL/
├── cat_uncertainty/ # Core implementation package
│ ├── data/ # Data processing and loading utilities
│ │ ├── datamodule.py # PyTorch Lightning data module
│ │ ├── lmdb_dataset.py # LMDB dataset handler
│ │ └── soap_transform.py # SOAP descriptor transformations
│ ├── dgkl/ # DGKL model implementation
│ │ ├── dgkl.py # Main DGKL model
│ │ ├── kmeans_init.py # K-means initialization for inducing points
│ │ └── train_*.py # Training scripts for different models
│ ├── graph_models/ # Graph neural network implementations
│ │ ├── models/ # GNN architectures (SchNet, PaiNN, etc.)
│ │ ├── graph_trainer.py # PyTorch Lightning trainer
│ │ └── optimizers.py # Custom optimizers
│ ├── ensemble_model/ # Ensemble baseline implementation
│ ├── evidential_model/ # Evidential regression baseline
│ └── monte_carlo_model/ # MC Dropout baseline
├── experiments/ # Experiment scripts and results
│ ├── DGKL/ # DGKL experiments
│ ├── DGKL_Atomic/ # Atomic-level DGKL experiments
│ ├── Ensemble/ # Ensemble baseline experiments
│ ├── Evidential/ # Evidential baseline experiments
│ └── MCD/ # MC Dropout experiments
└── Paper/ # Publication materials
To reproduce the results from our paper, navigate to the experiments/ directory and run the appropriate scripts:
# For DGKL experiments on CatHub with SchNet
cd experiments/DGKL/CatHub_SchNet/comp_1/
python script.py
# For ensemble baseline
cd experiments/Ensemble/CatHub_SchNet/comp_1/
python script.py

Each experiment directory contains:
- script.py: Main training script
- *.log: Training logs
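To run every experiment in one pass rather than entering each directory by hand, the three-level layout above (method/dataset/configuration) can be walked with a short helper. This is an illustrative sketch assuming the directory structure shown, not a script shipped with the repository:

```python
import subprocess
from pathlib import Path

def run_experiments(root="experiments", dry_run=True):
    """Find every script.py under experiments/<method>/<dataset>/<config>/
    and run it from its own directory (the scripts use relative paths)."""
    scripts = sorted(Path(root).glob("*/*/*/script.py"))
    for script in scripts:
        print("Running", script)
        if not dry_run:
            subprocess.run(["python", "script.py"], cwd=script.parent, check=True)
    return scripts

# Dry run: list what would be executed without launching any training
found = run_experiments(dry_run=True)
```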
This work builds upon several excellent open-source projects:
- PyTorch - Deep learning framework
- PyTorch Geometric - Graph neural network library
- PyTorch Lightning - High-level PyTorch wrapper
- GPyTorch - Gaussian processes in PyTorch
- ASE - Atomic Simulation Environment
- Open Catalyst Project - Datasets and benchmarks
If you find our work useful in your research, please consider citing:
@article{mamun2025deep,
title={Deep Graph Kernel Learning for Material and Atomic Level Uncertainty
Quantification in Adsorption Energy Prediction},
author={Mamun, Osman and Yang, Chen and Yue, Shuyi},
journal={ChemRxiv},
year={2025},
doi={10.26434/chemrxiv-2025-pfng2-v2},
note={Preprint}
}

For questions and feedback:
- Lead Author: Osman Mamun ([email protected])
This project is licensed under the MIT License - see the LICENSE file for details.