PMGen: From Structure Prediction to Peptide Generation

PMGen (Peptide MHC Generator) is a comprehensive pipeline for predicting peptide-MHC (pMHC) complex structures and designing optimized peptide sequences.

Key Features

Fast & accurate structure prediction using AlphaFold with template engineering or initial guess mode
Peptide sequence design with structure-aware optimization
MHC pseudo-sequence design for customized allele engineering
Iterative peptide optimization with binding prediction
Mutation screening for systematic variant analysis
Batch processing for multiple pMHC complexes

Installation

Requirements

Python 3.8+
Conda or Mamba
Git (optional)
Modeller (requires a license key)
CUDA-enabled GPU (optional, but recommended for faster AlphaFold predictions; a CPU-only install is also supported)

Setup

git clone https://github.com/soedinglab/PMGen.git
cd PMGen
bash -l install.sh
# Or, for CPU-only support, run: bash -l install.sh --cpu
conda activate PMGen

You will be prompted to enter your Modeller license key. The script automatically:

Creates the PMGen Conda environment
Installs all dependencies
Downloads AlphaFold parameters
Clones PANDORA and ProteinMPNN

Optional: Configure NetMHCpan (Recommended)

Install NetMHCpan and NetMHCIIpan, then edit user_setting.py:

netmhcipan_path = '/path/to/netMHCpan'
netmhciipan_path = '/path/to/netMHCIIpan'

Quick Start

Prepare Input File

Create a tab-separated file (input.tsv) with your pMHC data:

peptide	mhc_seq	mhc_type	anchors	id
GILGFVFTL	GSHSMRYFFTSVSRPGRGEPRFIAVGYVDDTQFVRFDSDAASQRMEPRAPWIEQEGPEYWDGETRKVKAHSQTHRVDLGTLRGYYNQSEAGSHTVQRMYGCDVGSDWRFLRGYHQYAYDGKDYIALKEDLRSWTAADMAAQTTKHKWEAAHVAEQLRAYLEGTCVEWLRRYLENGKETLQRTDAPKTHMTHHAVSDHEATLRCWALSFYPAEITLTWQRDGEDQTQDTELVETRPAGDGTFQKWAAVVVPSGQEQRYTCHVQHEGLPKPLTLRWE	1		sample1
KLGGALQAK	GSHSLKYFHTSVSRPGRGEPRFISVGYVDDTQFVRFDSDAASPRGEPRAPWVEQEGPEYWDRNTQIFKTNTQTYRENLRIALRYYNQSEAGSHIIQRMYGCDLGPDGRLLRGHDQYAYDGKDYIALNEDLRSWTAADTAAQITQRKWEAAREAEQRRAYLEGTCVEWLRRYLKNGNATLLRTDSPKTHMTHHPISDHEATLRCWALGFYPAEITLTWQRDGEDQTQDTELVETRPAGDRTFQKWAAVVVPSGEEQRYTCHVQHEGLPKPLTLRWE	1		sample2

Columns:

peptide: Peptide sequence
mhc_seq: MHC sequence (for MHC-II, the alpha and beta chains separated by /)
mhc_type: 1 for MHC-I, 2 for MHC-II
anchors: Anchor positions (leave empty to have them predicted)
id: Unique identifier
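
The column layout above can be checked before a run. The following is a minimal sketch and not part of PMGen: `validate_rows` and `write_input_tsv` are hypothetical helper names, and the checks (standard 20-letter amino-acid alphabet, exactly one `/` for MHC-II, unique ids) are assumptions drawn from the column descriptions rather than rules confirmed by the pipeline.

```python
import csv

COLUMNS = ["peptide", "mhc_seq", "mhc_type", "anchors", "id"]
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # assumed alphabet


def validate_rows(rows):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows, start=1):
        missing = [c for c in COLUMNS if c not in row]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
            continue
        if not row["peptide"] or set(row["peptide"]) - AMINO_ACIDS:
            problems.append(f"row {i}: peptide has non-standard residues")
        mhc_type = str(row["mhc_type"])
        chains = row["mhc_seq"].split("/")
        if mhc_type not in {"1", "2"}:
            problems.append(f"row {i}: mhc_type must be 1 or 2")
        elif len(chains) != int(mhc_type):
            # MHC-I: one chain; MHC-II: alpha and beta separated by '/'
            problems.append(f"row {i}: expected {mhc_type} chain(s) in mhc_seq")
        if any(not c or set(c) - AMINO_ACIDS for c in chains):
            problems.append(f"row {i}: mhc_seq has non-standard residues")
        if row["id"] in seen_ids:
            problems.append(f"row {i}: duplicate id {row['id']!r}")
        seen_ids.add(row["id"])
    return problems


def write_input_tsv(rows, path):
    """Validate the rows, then write them as a tab-separated file."""
    problems = validate_rows(rows)
    if problems:
        raise ValueError("; ".join(problems))
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=COLUMNS, delimiter="\t",
            lineterminator="\n", extrasaction="ignore",
        )
        writer.writeheader()
        writer.writerows(rows)
```

Calling `write_input_tsv(rows, "input.tsv")` with a list of dictionaries keyed by the five column names produces a file in the format shown above; an empty string in `anchors` leaves that column blank.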

Basic Structure Prediction (Recommended)

Single-threaded mode with initial guess (fastest & most accurate):

python run_PMGen.py \
  --mode wrapper \
  --run single \
  --df input.tsv \
  --output_dir output/ \
  --initial_guess

This is the preferred mode for most users. It uses:

--mode wrapper: Handles one or more predictions per run
--run single: Sequential (non-parallel) processing
--initial_guess: Faster, more accurate AlphaFold mode that skips homology modeling (recommended)

Common Use Cases

1. Structure Prediction with Multiple Models

python run_PMGen.py \
  --mode wrapper \
  --run single \
  --df input.tsv \
  --output_dir output/ \
  --initial_guess \
  --models model_1_ptm model_2_ptm model_3_ptm

2. Peptide Sequence Design

Generate optimized peptide variants:

python run_PMGen.py \
  --mode wrapper \
  --run single \
  --df input.tsv \
  --output_dir output/ \
  --initial_guess \
  --peptide_design \
  --num_sequences_peptide 50 \
  --binder_pred

3. MHC Pseudo-Sequence Design

Customize MHC binding groove residues:

python run_PMGen.py \
  --mode wrapper \
  --run single \
  --df input.tsv \
  --output_dir output/ \
  --initial_guess \
  --only_pseudo_sequence_design \
  --num_sequences_mhc 20

4. Iterative Peptide Optimization

Optimize peptides over multiple rounds:

python run_PMGen.py \
  --mode wrapper \
  --run single \
  --df input.tsv \
  --output_dir output/ \
  --initial_guess \
  --peptide_design \
  --binder_pred \
  --iterative_peptide_gen 3 \
  --fix_anchors

5. Mutation Screening

Systematically test point mutations:

python run_PMGen.py \
  --mode wrapper \
  --run single \
  --df input.tsv \
  --output_dir output/ \
  --initial_guess \
  --mutation_screen \
  --n_mutations 1
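
To make the size of such a screen concrete, the sketch below enumerates every single-point substitution of a peptide. It illustrates the general idea only and is not PMGen's implementation: `single_point_mutants` is a hypothetical helper, and the `fixed_positions` argument (1-based positions left unchanged, for example anchors) is an assumption added for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_point_mutants(peptide, fixed_positions=()):
    """Yield (label, sequence) for each single-residue substitution.

    Labels follow the common <original><position><new> style, e.g. G1A.
    """
    fixed = set(fixed_positions)
    for index, original in enumerate(peptide):
        position = index + 1  # 1-based, as in mutation labels
        if position in fixed:
            continue
        for residue in AMINO_ACIDS:
            if residue != original:
                mutated = peptide[:index] + residue + peptide[index + 1:]
                yield f"{original}{position}{residue}", mutated


mutants = list(single_point_mutants("GILGFVFTL"))
print(len(mutants))  # 9 positions x 19 substitutions = 171
```

A 9-mer therefore yields 171 single mutants; the number of candidates grows quickly when more than one position is mutated at a time.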

Key Options

Flag	Description
`--mode wrapper`	Batch processing mode (recommended)
`--run single`	Sequential processing (recommended)
`--initial_guess`	Fast AF mode without templates (recommended)
`--peptide_design`	Enable peptide sequence generation
`--only_pseudo_sequence_design`	Design MHC binding groove only
`--binder_pred`	Predict binding affinity (requires NetMHCpan)
`--fix_anchors`	Keep anchor positions fixed during design
`--iterative_peptide_gen N`	Run N rounds of optimization
`--mutation_screen`	Systematic mutation analysis
`--num_templates`	Number of structural templates (default: 4)
`--num_recycles`	AlphaFold recycles (default: 3)
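
When many runs share the same options, the command line can be assembled programmatically. The sketch below is a hypothetical convenience wrapper (`build_command` is not part of PMGen); it uses only flags listed in this README and assumes `run_PMGen.py` is in the current working directory.

```python
import shlex
import sys


def build_command(df, output_dir, peptide_design=False, binder_pred=False,
                  fix_anchors=False, iterations=None):
    """Build an argument list for run_PMGen.py in the recommended mode."""
    cmd = [
        sys.executable, "run_PMGen.py",
        "--mode", "wrapper",
        "--run", "single",
        "--df", df,
        "--output_dir", output_dir,
        "--initial_guess",
    ]
    if peptide_design:
        cmd.append("--peptide_design")
    if binder_pred:
        cmd.append("--binder_pred")
    if fix_anchors:
        cmd.append("--fix_anchors")
    if iterations is not None:
        cmd += ["--iterative_peptide_gen", str(iterations)]
    return cmd


cmd = build_command("input.tsv", "output/", peptide_design=True,
                    binder_pred=True, fix_anchors=True, iterations=3)
print(shlex.join(cmd))  # shell-quoted form, for logging or copy-paste
```

Passing the list to `subprocess.run(cmd, check=True)` would launch the run; building a list instead of a string avoids shell-quoting problems with paths that contain spaces.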

Output Structure

output/
├── pandora/              # Template structures
├── alphafold/            # Predicted pMHC structures
├── proteinmpnn/          # Designed sequences
│   └── {id}/
│       ├── peptide_design/
│       └── only_pseudo_sequence_design/
└── best_structures/      # Top-ranked models (if --best_structures used)
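
After a run, the tree above can be summarized with a few lines of Python. This is a sketch under assumptions: the top-level directory names come from the layout shown here, but the `*.pdb` file pattern and the helper name `summarize_output` are guesses made for illustration, so adjust the pattern to match the files actually produced.

```python
from pathlib import Path

# Top-level directories taken from the output layout above.
STAGES = ["pandora", "alphafold", "proteinmpnn", "best_structures"]


def summarize_output(output_dir, pattern="*.pdb"):
    """Map each existing stage directory to its matching files.

    The default pattern is an assumption, not a documented file type.
    """
    root = Path(output_dir)
    summary = {}
    for stage in STAGES:
        stage_dir = root / stage
        if stage_dir.is_dir():
            summary[stage] = sorted(
                p.relative_to(root).as_posix() for p in stage_dir.rglob(pattern)
            )
    return summary
```

For example, `summarize_output("output/")` returns a dictionary keyed by stage name; stages that were not produced in a given run (such as `best_structures/` without `--best_structures`) are simply absent.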
