whisper

whisper-ppi is a Python package for scoring protein–protein interactions from proximity labeling and affinity purification mass spectrometry datasets.
It uses interpretable features, programmatic weak supervision, and decoy-based false discovery rate (FDR) estimation to identify high-confidence interactors.

Installation

Install from PyPI

pip install whisper-ppi

Install from GitHub

git clone https://github.com/camlab-bioml/whisper
cd whisper
pip install .

Input Format

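A CSV file (the Usage examples below load a tab-separated file with sep="\t"; set the separator to match your file) with:
- One column named Protein
- One column per bait replicate intensity, named BAIT_1, BAIT_2, etc.
Control samples must be identifiable via substrings in their column names (e.g., "EGFP" or "Empty"). These substrings are passed to the feature engineering functions as the controls list.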
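
The naming convention above can be checked before running the package. The following is a minimal sketch, not part of whisper-ppi: the helper name split_columns, the example column names, and the case-sensitive substring match are all assumptions made for illustration. It takes a list of column names (for example, list(intensity_df.columns)), separates control columns from bait replicates, and groups replicates by bait name.

```python
from collections import defaultdict


def split_columns(columns, control_substrings, id_column="Protein"):
    """Hypothetical helper: split column names into controls and bait groups.

    Assumes bait replicate columns follow the "<BAIT>_<n>" pattern and that
    control columns contain one of the given substrings (case-sensitive).
    """
    control_cols = []
    bait_groups = defaultdict(list)
    for col in columns:
        if col == id_column:
            continue  # identifier column, not an intensity column
        if any(sub in col for sub in control_substrings):
            control_cols.append(col)
            continue
        # Split on the last underscore: "BAIT1_2" -> ("BAIT1", "_", "2")
        bait, sep, replicate = col.rpartition("_")
        if not sep or not replicate.isdigit():
            raise ValueError(f"Column {col!r} does not look like BAIT_<n>")
        bait_groups[bait].append(col)
    return control_cols, dict(bait_groups)


# Made-up column names for illustration only
columns = ["Protein", "BAIT1_1", "BAIT1_2", "EGFP_1", "EGFP_2", "Empty_1"]
control_cols, bait_groups = split_columns(columns, ["EGFP", "Empty"])
print(control_cols)
print(bait_groups)
```

A column that matches neither pattern raises a ValueError, which surfaces naming problems before feature engineering is run.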

Usage

# Protein-level
from whisper.protein_features import feature_engineering_protein
from whisper.protein_train import train_and_score_protein
import pandas as pd


# Load intensity table
intensity_df = pd.read_csv("input_intensity_dataset.tsv", sep="\t")

controls = ['EGFP', 'Empty', 'NminiTurbo']

# Run feature engineering
features_df = feature_engineering_protein(intensity_df, controls)

# Optionally save the features so later runs with different settings can reuse them:
# features_df.to_csv("features.csv", index=False)
# ...and reload them instead of re-running feature engineering:
# features_df = pd.read_csv("features.csv")


# Run scoring and FDR estimation
scored_df = train_and_score_protein(features_df, initial_positives=15, initial_negatives=200)


# Peptide-level
from whisper.peptide_features import feature_engineering_peptide
from whisper.peptide_train import train_and_score_peptide
import pandas as pd


# Load intensity table
intensity_df = pd.read_csv("input_intensity_dataset.tsv", sep="\t")

controls = ['EGFP', 'Empty', 'NminiTurbo']

# Run feature engineering
features_df = feature_engineering_peptide(intensity_df, controls)

# features_df = pd.read_csv("features.csv")


# Run scoring and FDR estimation
scored_df = train_and_score_peptide(features_df, initial_positives=15, initial_negatives=200)


# Fragment-level
from whisper.fragment_features import feature_engineering_fragment
from whisper.fragment_train import train_and_score_fragment
import pandas as pd


# Load intensity table
intensity_df = pd.read_csv("input_intensity_dataset.tsv", sep="\t")

controls = ['EGFP', 'Empty', 'NminiTurbo']

# Run feature engineering
features_df = feature_engineering_fragment(intensity_df, controls)

# features_df = pd.read_csv("features.csv")


# Run scoring and FDR estimation
scored_df = train_and_score_fragment(features_df, initial_positives=15, initial_negatives=200)

Output

The final output includes:

- predicted_probability: Probability of each bait–prey interaction being real
- FDR: Estimated false discovery rate
- global_cv_flag: Flag for likely background preys based on variability across all samples
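
A common next step is to keep only interactions below an FDR cutoff and rank them by score. The sketch below uses a made-up stand-in for scored_df rather than real package output. It assumes that FDR is stored as a fraction between 0 and 1 and that a Protein column is carried through to the output; the 0.05 cutoff is an arbitrary example. global_cv_flag is left out because its encoding is not described here.

```python
import pandas as pd

# Made-up stand-in for the DataFrame returned by train_and_score_protein;
# the values and the presence of a "Protein" column are assumptions.
scored_df = pd.DataFrame(
    {
        "Protein": ["P1", "P2", "P3", "P4"],
        "predicted_probability": [0.98, 0.40, 0.91, 0.75],
        "FDR": [0.01, 0.30, 0.04, 0.12],
    }
)

fdr_threshold = 0.05  # example cutoff, assuming FDR is a fraction in [0, 1]

# Keep rows at or below the cutoff, highest-scoring interactions first
hits = (
    scored_df[scored_df["FDR"] <= fdr_threshold]
    .sort_values("predicted_probability", ascending=False)
    .reset_index(drop=True)
)
print(hits)
```

With real output, replace the stand-in DataFrame with the scored_df returned by the scoring function and choose a cutoff appropriate to the experiment.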

Tutorial

Read the full documentation

Citation

This software is authored by: Vesal Kasmaeifar, Kieran R Campbell

Lunenfeld-Tanenbaum Research Institute & University of Toronto
