BrainStOrmics/Perturbation_benchmark

Benchmarking Genetic Perturbation Prediction Models for Transcriptional Response

Description

Exploring the molecular responses of single cells to various influences, such as external stimuli or gene knockouts, is a crucial step toward understanding cellular mechanisms. Although single-cell perturbation experiments remain challenging, bioinformatic tools are rapidly expanding the field through in silico modeling. The recent surge in foundation models further establishes perturbation prediction as an important downstream application, so a critical assessment of the available tools is needed. Here, we benchmark 12 leading algorithms (four conventional approaches and eight foundation models) on 21 datasets of paired perturbed and non-perturbed cells that vary in perturbation type and quality. The results show that the conventional deep learning method GEARS and the foundation model scGPT outperform the other methods across the benchmark as a whole. In certain tasks, the foundation models also show promise, either through their pre-training strategies or when ensembled with GEARS's perturbation embeddings. Our findings further highlight dataset quality, measured by E-distance, as a critical determinant of model performance. This study offers actionable guidance for choosing a suitable toolkit based on dataset characteristics and informs the future development of robust, generalizable models for genetic perturbation prediction.
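
For readers unfamiliar with E-distance, the sketch below shows the standard energy distance between two groups of cells, E = 2·d(X, Y) − d(X, X) − d(Y, Y), where d is the mean pairwise distance. It is only an illustration: the Euclidean metric, the use of raw feature arrays, and the function names are assumptions, and the exact computation used in the manuscript may differ (for example, a different metric or a reduced-dimension embedding).

```python
import numpy as np


def mean_pairwise_distance(a, b):
    """Mean Euclidean distance between every row of `a` and every row of `b`."""
    diff = a[:, None, :] - b[None, :, :]          # shape (n_a, n_b, n_features)
    return float(np.sqrt((diff ** 2).sum(axis=-1)).mean())


def energy_distance(x, y):
    """Energy distance between two samples (rows = cells, columns = features).

    Assumption: plain Euclidean distance on the arrays as given.
    """
    return (2.0 * mean_pairwise_distance(x, y)
            - mean_pairwise_distance(x, x)
            - mean_pairwise_distance(y, y))


# Synthetic stand-ins for control and perturbed cells (not real data).
rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, size=(200, 10))
perturbed = rng.normal(0.5, 1.0, size=(200, 10))

print("control vs perturbed:", energy_distance(control, perturbed))
print("control vs itself:   ", energy_distance(control, control))
```

A larger value indicates that the perturbed cells are more clearly separated from the controls; comparing a sample with itself gives zero.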

Implementation

We ran all 12 benchmarked methods with the default parameters described in their respective publications. The /methods folder contains an example implementation of each method applied to the /demo_data dataset. For details on each method, please refer to its repository: GRN, CPA, GEARS, AttentionPert, scLong, scGPT, scFoundation, scELMo, scBERT-G, Geneformer-G, GenePT-G and ESM2-G.

The /analysis_visualization folder contains the scripts that generate the visualizations and quantitative analyses presented in our manuscript and supplementary materials. To test a model on the /demo_data dataset, download the provided data, update the path to pert_data in the corresponding script in the /methods folder, and run the script.
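
Before editing a script, it can help to confirm that the path you plan to assign to pert_data actually exists. The following is a minimal sketch of such a check; the helper name `resolve_pert_data` and the placeholder path are hypothetical and are not part of this repository, and how each script reads pert_data is not shown here.

```python
from pathlib import Path


def resolve_pert_data(path_str):
    """Return the absolute path to the data, or raise if nothing exists there.

    Hypothetical helper for illustration; not part of the repository.
    """
    path = Path(path_str).expanduser().resolve()
    if not path.exists():
        raise FileNotFoundError(f"pert_data path not found: {path}")
    return path


# Placeholder: replace with the location where the demo data was downloaded.
PERT_DATA = "/path/to/demo_data"

try:
    print("Using data at:", resolve_pert_data(PERT_DATA))
except FileNotFoundError as err:
    print(err)
```

Once the check passes, the same path can be set as pert_data in the chosen script under /methods.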
