TF-MInDi: Transcription Factor Motifs and Instances Discovery

TF-MInDi is a Python package for analyzing transcription factor binding patterns from deep learning model attribution scores. It identifies and clusters sequence motifs from contribution scores, maps them to DNA-binding domains, and provides comprehensive visualization tools for regulatory genomics analysis.

Key Features

Seqlet Extraction: Identifies important sequence regions from contribution scores using recursive seqlet calling
Motif Similarity Analysis: Compares extracted seqlets to known motif databases using TomTom
Clustering & Dimensionality Reduction: Groups similar seqlets using Leiden clustering and t-SNE visualization
DNA-Binding Domain Annotation: Maps seqlet clusters to transcription factor families
Pattern Generation: Aligns the seqlets in each cluster and builds a consensus motif from them
Comprehensive Visualization: Region-level contribution plots, t-SNE embeddings, motif logos, and heatmaps
Scalable Processing: Memory-efficient chunked processing for large datasets
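
To give a feel for what "identifying important sequence regions" means, here is a deliberately simplified sketch. It is not the recursive seqlet calling that TF-MInDi uses; it only marks contiguous runs of positions whose absolute attribution exceeds a plain cutoff. The function name `find_high_scoring_spans`, the example scores, and the cutoff are all hypothetical, and the cutoff here does not necessarily mean the same thing as the `threshold` argument in the Quick Start.

```python
def find_high_scoring_spans(scores, threshold):
    """Return half-open (start, end) spans where |score| > threshold.

    Simplified illustration only; NOT TF-MInDi's recursive seqlet calling.
    """
    spans = []
    start = None
    for i, score in enumerate(scores):
        if abs(score) > threshold:
            # Open a new span if we are not already inside one.
            if start is None:
                start = i
        elif start is not None:
            # The run of high-scoring positions ended at i (exclusive).
            spans.append((start, i))
            start = None
    # Close a span that runs to the end of the sequence.
    if start is not None:
        spans.append((start, len(scores)))
    return spans


# Hypothetical per-position attributions, already summed over the 4 bases.
per_position = [0.0, 0.01, 0.2, 0.3, 0.1, 0.0, 0.0, -0.4, -0.2, 0.02]
spans = find_high_scoring_spans(per_position, threshold=0.05)
print(spans)
```

Each span is a candidate region; the package's own seqlet caller should be used for real analyses.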

Quick Start

import tfmindi as tm

# contrib_scores and one_hot_sequences are arrays you supply
# (see Data Requirements below).
# Extract seqlets from contribution scores
seqlets_df, seqlet_matrices = tm.pp.extract_seqlets(
    contrib=contrib_scores,  # (n_examples, 4, length)
    oh=one_hot_sequences,    # (n_examples, 4, length)
    threshold=0.05
)

# Calculate motif similarity
motif_collection = tm.load_motif_collection(
    tm.fetch_motif_collection()
)
similarity_matrix = tm.pp.calculate_motif_similarity(
    seqlet_matrices,
    motif_collection,
    chunk_size=10000
)

# Create AnnData object for analysis
adata = tm.pp.create_seqlet_adata(
    similarity_matrix,
    seqlets_df,
    seqlet_matrices=seqlet_matrices,
    oh_sequences=one_hot_sequences,
    contrib_scores=contrib_scores,
    motif_collection=motif_collection
)

# Cluster seqlets and annotate with DNA-binding domains
tm.tl.cluster_seqlets(adata, resolution=3.0)

# Generate consensus logos for each cluster
patterns = tm.tl.create_patterns(adata)

# Visualize results
tm.pl.tsne(adata, color_by="cluster_dbd")
tm.pl.region_contributions(adata, example_idx=0)
tm.pl.dbd_heatmap(adata)
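
The `chunk_size` argument above suggests that similarity is computed in batches rather than all at once. The sketch below shows the general idea of chunked iteration in plain Python; it is an assumption about the pattern, not TF-MInDi's actual implementation, and `iter_chunks` is a hypothetical helper.

```python
def iter_chunks(items, chunk_size):
    """Yield consecutive slices of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


# Hypothetical stand-in for a long list of seqlets.
seqlets = list(range(7))
chunks = list(iter_chunks(seqlets, chunk_size=3))
print(chunks)
```

Only one chunk needs to be held in working memory at a time, which is what keeps peak memory bounded for large inputs.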

Installation

You need to have Python 3.10 or newer installed on your system.

pip install tfmindi
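
If you are unsure whether your interpreter meets the version requirement, a small check like the following can be run before installing. This is a convenience sketch; `python_is_supported` is a hypothetical helper and not part of TF-MInDi.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in this README


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    print("Python 3.10 or newer is required; found", sys.version.split()[0])
```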

Core Workflow

TF-MInDi follows a scanpy-inspired workflow:

Preprocessing (tm.pp): Extract seqlets, calculate motif similarities, and create an AnnData object
Tools (tm.tl): Cluster seqlets and create consensus patterns
Plotting (tm.pl): Visualize results

Data Requirements

Contribution scores: Attribution values from deep learning models (e.g., DeepSHAP, Integrated Gradients)
One-hot sequences: Corresponding genomic sequences in one-hot encoding
Motif database: Known transcription factor motifs
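
As an illustration of the expected (n_examples, 4, length) layout, the sketch below one-hot encodes DNA strings using only the standard library. The channel order A, C, G, T and the all-zero encoding of ambiguous bases such as N are assumptions made for this example; check the documentation for the convention your model and TF-MInDi expect. `one_hot_encode` is a hypothetical helper.

```python
BASES = "ACGT"  # assumed channel order; verify against your model


def one_hot_encode(seq):
    """Encode a DNA string as 4 rows (one per base) by len(seq) columns.

    Bases outside A/C/G/T (e.g. N) become all-zero columns.
    """
    seq = seq.upper()
    return [[1 if base == channel else 0 for base in seq] for channel in BASES]


# Hypothetical toy sequences of equal length.
sequences = ["ACGT", "GGNA"]
one_hot = [one_hot_encode(s) for s in sequences]

# Nested lists with layout (n_examples, 4, length).
shape = (len(one_hot), len(one_hot[0]), len(one_hot[0][0]))
print(shape)
```

Contribution scores share this layout, with attribution values in place of the 0/1 entries. In practice the nested lists would typically be converted to a NumPy array, for example with `numpy.asarray`, before further processing.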

Getting Started

Please refer to the documentation for detailed tutorials and examples, and to the API documentation in particular.

Release Notes

See the changelog.

Contact

If you found a bug, please use the issue tracker.

Citation

To be announced.
