DGKL (Deep Graph Kernel Learning) is a framework that combines graph neural networks with Gaussian processes to enable accurate uncertainty quantification in catalysis. It provides reliable predictions of adsorption energies with calibrated uncertainty estimates, which is crucial for computational catalyst screening and discovery.
- Dual-level uncertainty quantification: Captures both material-level and atomic-level uncertainties
- State-of-the-art performance: Outperforms ensemble, evidential, and MC Dropout baselines for uncertainty quantification in catalysis
- Multiple GNN backbones: Supports SchNet and PaiNN architectures
- Flexible kernel learning: Learnable graph kernels for improved uncertainty estimates
- Comprehensive benchmarking: Tested on CatHub and OC20 datasets
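The core idea behind the Gaussian-process side of DGKL can be illustrated with a toy exact-GP regression: the posterior variance grows for inputs far from the training data, which is what makes GP-based predictions come with calibrated uncertainty. The sketch below is a minimal NumPy illustration of that principle only, not the DGKL implementation (DGKL learns the kernel on graph representations via a GNN):

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X_train, y_train, X_test, noise=1e-2):
    """Exact GP posterior mean and per-point variance (toy example)."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0) + noise
    return mean, var

# Toy data: predictive uncertainty grows away from the training points
X = np.array([-2.0, -1.0, 0.0, 1.0])
y = np.sin(X)
Xs = np.array([0.5, 5.0])   # one interpolation point, one far extrapolation point
mean, var = gp_predict(X, y, Xs)
```

Here `var[1]` (far from the data) is much larger than `var[0]` (surrounded by data), mirroring how DGKL flags out-of-distribution adsorption structures with high uncertainty.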
- OS: Linux (Ubuntu 20.04+ recommended)
- GPU: NVIDIA GPU with CUDA support (tested on A100, V100, RTX 4060)
- Memory: 16GB+ RAM recommended
- Storage: ~50GB for datasets and models
- Python 3.10+
- PyTorch 2.0+ with CUDA support
- PyTorch Geometric
- GPyTorch 1.11+
- ASE (Atomic Simulation Environment)
- PyTorch Lightning 2.0+
- Weights & Biases (optional, for experiment tracking)
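A quick way to verify that the dependencies above are importable after installation is to check for their module specs without importing them. Note the import names differ from some pip package names (e.g. PyTorch Geometric is imported as torch_geometric); the snippet below is an illustrative helper, not part of the DGKL codebase:

```python
import importlib.util

# Import names for the dependencies listed above.
REQUIRED = ["torch", "torch_geometric", "gpytorch", "ase", "pytorch_lightning"]
OPTIONAL = ["wandb"]  # Weights & Biases, only needed for experiment tracking

def check_deps(names):
    """Return the subset of module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]

missing = check_deps(REQUIRED)
if missing:
    print("Missing required packages:", ", ".join(missing))
else:
    print("All required packages found.")
```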
# Clone the repository
git clone https://github.com/yourusername/DGKL.git
cd DGKL
# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies
poetry install
# Activate the virtual environment
poetry shell

# Create a virtual environment
python -m venv dgkl_env
source dgkl_env/bin/activate # On Linux/Mac
# Install PyTorch (adjust for your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Install other dependencies
pip install torch-geometric gpytorch ase pytorch-lightning wandb

DGKL/
├── cat_uncertainty/ # Core implementation package
│ ├── data/ # Data processing and loading utilities
│ │ ├── datamodule.py # PyTorch Lightning data module
│ │ ├── lmdb_dataset.py # LMDB dataset handler
│ │ └── soap_transform.py # SOAP descriptor transformations
│ ├── dgkl/ # DGKL model implementation
│ │ ├── dgkl.py # Main DGKL model
│ │ ├── kmeans_init.py # K-means initialization for inducing points
│ │ └── train_*.py # Training scripts for different models
│ ├── graph_models/ # Graph neural network implementations
│ │ ├── models/ # GNN architectures (SchNet, PaiNN, etc.)
│ │ ├── graph_trainer.py # PyTorch Lightning trainer
│ │ └── optimizers.py # Custom optimizers
│ ├── ensemble_model/ # Ensemble baseline implementation
│ ├── evidential_model/ # Evidential regression baseline
│ └── monte_carlo_model/ # MC Dropout baseline
├── experiments/ # Experiment scripts and results
│ ├── DGKL/ # DGKL experiments
│ ├── DGKL_Atomic/ # Atomic-level DGKL experiments
│ ├── Ensemble/ # Ensemble baseline experiments
│ ├── Evidential/ # Evidential baseline experiments
│ └── MCD/ # MC Dropout experiments
└── Paper/ # Publication materials
To reproduce the results from our paper, navigate to the experiments/ directory and run the appropriate scripts:
# For DGKL experiments on CatHub with SchNet
cd experiments/DGKL/CatHub_SchNet/comp_1/
python script.py
# For ensemble baseline
cd experiments/Ensemble/CatHub_SchNet/comp_1/
python script.py

Each experiment directory contains:
- script.py: Main training script
- *.log: Training logs
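To run every experiment in one pass rather than entering each directory by hand, the three-level layout above (method/dataset/configuration) can be walked with a short helper. This is an illustrative sketch assuming the directory structure shown, not a script shipped with the repository:

```python
import subprocess
from pathlib import Path

def run_experiments(root="experiments", dry_run=True):
    """Find every script.py under experiments/<method>/<dataset>/<config>/
    and run it from its own directory (the scripts use relative paths)."""
    scripts = sorted(Path(root).glob("*/*/*/script.py"))
    for script in scripts:
        print("Running", script)
        if not dry_run:
            subprocess.run(["python", "script.py"], cwd=script.parent, check=True)
    return scripts

# Dry run: list what would be executed without launching any training
found = run_experiments(dry_run=True)
```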
This work builds upon several excellent open-source projects:
- PyTorch - Deep learning framework
- PyTorch Geometric - Graph neural network library
- PyTorch Lightning - High-level PyTorch wrapper
- GPyTorch - Gaussian processes in PyTorch
- ASE - Atomic Simulation Environment
- Open Catalyst Project - Datasets and benchmarks
If you find our work useful in your research, please consider citing:
@article{mamun2025deep,
title={Deep Graph Kernel Learning for Material and Atomic Level Uncertainty
Quantification in Adsorption Energy Prediction},
author={Mamun, Osman and Yang, Chen and Yue, Shuyi},
journal={ChemRxiv},
year={2025},
doi={10.26434/chemrxiv-2025-pfng2-v2},
note={Preprint}
}

For questions and feedback:
- Lead Author: Osman Mamun ([email protected])
This project is licensed under the MIT License - see the LICENSE file for details.