📖 How to Use TSUMUGI

The Japanese README is available here

TSUMUGI (Trait-driven Surveillance for Mutation-based Gene module Identification) is a web tool that leverages knockout (KO) mouse phenotype data from the International Mouse Phenotyping Consortium (IMPC) to extract and visualize gene modules based on phenotypic similarity.

The tool is publicly available online for anyone to use 👇️

🔗 https://larc-tsukuba.github.io/tsumugi/

The name TSUMUGI comes from the Japanese concept of "weaving together gene groups that form phenotypes."

📖 How to Use TSUMUGI

💬 Top Page

TSUMUGI supports three types of input:

1. Phenotype

When you input a phenotype of interest, TSUMUGI searches for gene groups with similar overall phenotype profiles among genes whose KO mice exhibit that phenotype.
Phenotype names are based on Mammalian Phenotype Ontology (MPO).

List of currently searchable phenotypes in TSUMUGI:
👉 Phenotype List

2. Gene

When you specify a single gene, TSUMUGI searches for other gene groups whose KO mice have similar phenotype profiles to that gene's KO mice.
Gene names follow gene symbols registered in MGI.

List of currently searchable gene names in TSUMUGI:
👉 Gene List

3. Gene List

Accepts multiple genes as input.
Enter one gene symbol per line.

Note

Unlike single Gene input, Gene List input extracts phenotypically similar genes only from among the genes in the submitted list.
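The Gene List behavior described above can be sketched as follows. This is a minimal illustration, not TSUMUGI's actual implementation: the function names, the tuple layout, and the gene symbols `GeneA`, `GeneB`, `GeneC`, and `GeneX` are all hypothetical.

```python
def parse_gene_list(text):
    """Split newline-separated input into a de-duplicated list of gene symbols."""
    genes = []
    seen = set()
    for line in text.splitlines():
        symbol = line.strip()
        if symbol and symbol not in seen:  # skip blank lines and repeats
            seen.add(symbol)
            genes.append(symbol)
    return genes


def pairs_within_list(pairs, genes):
    """Keep only the pairs whose two genes are both in the submitted list."""
    wanted = set(genes)
    return [(g1, g2, sim) for g1, g2, sim in pairs if g1 in wanted and g2 in wanted]


# Hypothetical similarity records: (Gene1, Gene2, Jaccard similarity)
example_pairs = [
    ("GeneA", "GeneB", 0.50),
    ("GeneA", "GeneX", 0.40),  # GeneX is not in the submitted list
    ("GeneB", "GeneC", 0.25),
]

submitted = parse_gene_list("GeneA\nGeneB\n\nGeneC\n")
selected = pairs_within_list(example_pairs, submitted)
print(selected)
```

The pair involving `GeneX` is dropped because only one of its genes was submitted, which is the difference from single Gene input.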

Caution

If no phenotypically similar genes are found, the alert `No similar phenotypes were found among the entered genes.` is displayed and processing stops.

If more than 200 phenotypically similar genes are found, the alert `Too many genes submitted. Please limit the number to 200 or fewer.` is displayed and processing stops to prevent browser overload.

📥 Raw Data Download (`TSUMUGI_{version}_raw_data`)

You can download the raw phenotypic similarity data for gene pairs in Gzip-compressed CSV or Parquet format.

Contents include:

Paired gene names (Gene1, Gene2)
Phenotypic similarity between pairs (Jaccard Similarity)
Number of shared phenotypes between pairs (Number of shared phenotype)
List of shared phenotypes between pairs (List of shared phenotype)

Caution

The file size is approximately 50-100 MB, so the download may take some time.

We recommend using Parquet format when working with Polars or Pandas.
You can load the data as follows:

Polars

# Shell: install Polars and PyArrow using conda
conda create -y -n env-tsumugi polars pyarrow
conda activate env-tsumugi

# Python: load the Parquet file using Polars
# (replace {version} with the version in the downloaded file name)
import polars as pl
df_tsumugi = pl.read_parquet("TSUMUGI_{version}_raw_data.parquet")

Pandas

# Shell: install Pandas and PyArrow using conda
conda create -y -n env-tsumugi pandas pyarrow
conda activate env-tsumugi

# Python: load the Parquet file using Pandas
# (replace {version} with the version in the downloaded file name)
import pandas as pd
df_tsumugi = pd.read_parquet("TSUMUGI_{version}_raw_data.parquet")
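If neither Polars nor Pandas is installed, the Gzip-compressed CSV can also be read with the Python standard library. The sketch below streams rows one at a time, which helps with a file of this size. The function name, the demo file name, and the column headers (`Gene1`, `Gene2`, `Jaccard Similarity`) are assumptions based on the contents listed above; check the header row of the downloaded file for the exact spelling.

```python
import csv
import gzip
import os
import tempfile


def iter_pairs_csv_gz(path):
    """Yield one dict per gene pair from a Gzip-compressed CSV file."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)


# Demo on a tiny stand-in file; the real download would be used the same way.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_raw_data.csv.gz")
    with gzip.open(demo_path, "wt", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Gene1", "Gene2", "Jaccard Similarity"])  # assumed headers
        writer.writerow(["GeneA", "GeneB", "0.5"])
    rows = list(iter_pairs_csv_gz(demo_path))

print(rows)
```

All values arrive as strings when read this way, so numeric columns need an explicit `float()` or `int()` conversion before filtering.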

🌐 Network Visualization

After you submit an input, the page switches to the network view and the network is drawn automatically.

Important

Only gene pairs with 2 or more shared abnormal phenotypes AND a phenotypic similarity of 0.2 or higher are visualized.
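The visualization criterion above can be written as a simple predicate. This is an illustrative sketch; the constant and function names are hypothetical and not taken from the TSUMUGI code base.

```python
MIN_SHARED_PHENOTYPES = 2  # "2 or more shared abnormal phenotypes"
MIN_SIMILARITY = 0.2       # "phenotypic similarity of 0.2 or higher"


def is_visualized(shared_count, similarity):
    """Return True when a gene pair meets both visualization thresholds."""
    return shared_count >= MIN_SHARED_PHENOTYPES and similarity >= MIN_SIMILARITY


print(is_visualized(2, 0.2))   # both thresholds met
print(is_visualized(1, 0.9))   # too few shared phenotypes
print(is_visualized(5, 0.19))  # similarity too low
```

Both conditions must hold, so a pair with high similarity but only one shared phenotype is still excluded.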

Network Panel

Nodes (Points)

Each node represents one gene.
Clicking a node displays a list of abnormal phenotypes observed in that gene's KO mice.
You can freely reposition nodes by dragging them.

Edges (Lines)

Clicking an edge shows details of shared phenotypes.

Control Panel

The control panel on the left lets you adjust how the network is displayed.

Filter by Phenotypic Similarity

The Phenotypes similarity slider sets the threshold on edge phenotypic similarity (Jaccard coefficient) for the gene pairs displayed in the network.
The minimum and maximum similarity values are mapped to a 1-10 scale, allowing 10-level filtering.
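One way to picture the 1-10 conversion is a linear (min-max) rescaling. Whether TSUMUGI uses exactly this mapping is an assumption; the function name and the example range of 0.2 to 1.0 are hypothetical.

```python
def to_ten_level_scale(value, minimum, maximum):
    """Linearly map value from [minimum, maximum] onto the range [1, 10]."""
    if maximum == minimum:
        return 1.0  # degenerate range: every value maps to the lowest level
    fraction = (value - minimum) / (maximum - minimum)
    return 1.0 + 9.0 * fraction


# Assumed example: similarities in the network range from 0.2 to 1.0.
print(to_ten_level_scale(0.2, 0.2, 1.0))
print(to_ten_level_scale(1.0, 0.2, 1.0))
```

Under this assumption the smallest similarity in the network lands on level 1 and the largest on level 10, so moving the slider by one step removes roughly one ninth of the value range.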

Note

For details on phenotypic similarity, please see:
👉 🔍 Calculation Method for Phenotypically Similar Gene Groups

Filter by Phenotype Severity

The Phenotype severity slider filters the displayed nodes by phenotype severity (effect size) in KO mice.
A higher effect size indicates a stronger phenotypic impact.
As with the similarity slider, the effect size range is scaled to 1-10, allowing 10-level filtering.

Note

The Phenotype severity slider is not available when the IMPC phenotype evaluation is binary (present/absent), e.g., abnormal embryo development (a list of binary phenotypes is available here), or when a gene name is used as input.

Specify Genotype

You can specify the genotype of KO mice exhibiting phenotypes:

Homo: Phenotypes seen in homozygous mice
Hetero: Phenotypes seen in heterozygous mice
Hemi: Phenotypes seen in hemizygous mice

Specify Sex

You can extract sex-specific phenotypes:

Female: Female-specific phenotypes
Male: Male-specific phenotypes

Specify Life Stage

You can specify life stages when phenotypes appear:

Embryo: Phenotypes appearing during embryonic stage
Early: Phenotypes appearing at 0-16 weeks of age
Interval: Phenotypes appearing at 17-48 weeks of age
Late: Phenotypes appearing at 49+ weeks of age
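The postnatal age ranges above can be expressed as a small lookup. This is only a sketch: it assumes age is given in whole weeks after birth, and the function name is hypothetical. The Embryo stage is not covered because it is not defined by postnatal age.

```python
def life_stage_from_age(age_weeks):
    """Map a postnatal age in whole weeks to Early, Interval, or Late."""
    if age_weeks < 0:
        raise ValueError("age_weeks must be non-negative for postnatal stages")
    if age_weeks <= 16:
        return "Early"     # 0-16 weeks
    if age_weeks <= 48:
        return "Interval"  # 17-48 weeks
    return "Late"          # 49+ weeks


for age in (0, 16, 17, 48, 49):
    print(age, life_stage_from_age(age))
```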

Markup Panel

Highlight Human Disease-Related Genes (Highlight: Human Disease)

You can highlight genes related to human diseases.
Associations between KO mice and human diseases are taken from public data in the IMPC Disease Models Portal.

Search Gene Names (Search: Specific Gene)

You can search for gene names included in the network.

Adjust Network Display Style (Layout & Display)

You can adjust the following elements:

Network layout (layout)
Font size (Font size)
Edge thickness (Edge width)
Distance between nodes (*Cose layout only) (Node repulsion)

Export

You can export the current network as an image (PNG) or as data (CSV and GraphML).
The CSV includes connected component (module) IDs and the list of phenotypes shown by each gene's KO mice.
GraphML is compatible with the desktop version of Cytoscape, so you can import the network into Cytoscape for further analysis.
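To illustrate what a connected component (module) ID means, the sketch below assigns the same ID to every gene reachable through edges from the same starting gene. It is a generic graph traversal, not TSUMUGI's export code; the function name, the numbering scheme, and the gene symbols are hypothetical.

```python
from collections import defaultdict


def assign_module_ids(edges):
    """Give each gene the ID of the connected component it belongs to."""
    adjacency = defaultdict(set)
    for gene_a, gene_b in edges:
        adjacency[gene_a].add(gene_b)
        adjacency[gene_b].add(gene_a)

    module_of = {}
    next_id = 1
    for start in sorted(adjacency):  # sorted for reproducible numbering
        if start in module_of:
            continue
        module_of[start] = next_id
        stack = [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in module_of:
                    module_of[neighbor] = next_id
                    stack.append(neighbor)
        next_id += 1
    return module_of


example_edges = [("GeneA", "GeneB"), ("GeneB", "GeneC"), ("GeneD", "GeneE")]
modules = assign_module_ids(example_edges)
print(modules)
```

Here `GeneA`, `GeneB`, and `GeneC` form one module and `GeneD` and `GeneE` form another, because no edge links the two groups.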

🔍 Calculation Method for Phenotypically Similar Gene Groups

Data Source

The IMPC dataset used is statistical-results-ALL.csv.gz from Release-23.0.
The columns included in the dataset are described in: Data fields

Preprocessing

Extract gene-phenotype pairs where at least one of the KO mouse phenotype P-values (p_value, female_ko_effect_p_value, or male_ko_effect_p_value) is 0.0001 or below.

Genotype-specific phenotypes are annotated with homo, hetero, or hemi.
Sex-specific phenotypes are annotated with female or male.
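The P-value filter above can be sketched as follows. The three P-value column names come from the text; the record layout, the `gene` and `phenotype` keys, the example values, and the handling of missing values as `None` are assumptions made for illustration.

```python
P_VALUE_THRESHOLD = 1e-4  # "0.0001 or below"
P_VALUE_COLUMNS = ("p_value", "female_ko_effect_p_value", "male_ko_effect_p_value")


def is_significant(record):
    """Return True if any available P-value is at or below the threshold."""
    for column in P_VALUE_COLUMNS:
        value = record.get(column)
        if value is not None and value <= P_VALUE_THRESHOLD:
            return True
    return False


# Hypothetical records; missing P-values are represented as None.
records = [
    {"gene": "GeneA", "phenotype": "abnormal heart morphology",
     "p_value": 5e-6, "female_ko_effect_p_value": None, "male_ko_effect_p_value": None},
    {"gene": "GeneB", "phenotype": "abnormal kidney morphology",
     "p_value": 0.03, "female_ko_effect_p_value": 1e-5, "male_ko_effect_p_value": 0.4},
    {"gene": "GeneC", "phenotype": "abnormal lung morphology",
     "p_value": 0.02, "female_ko_effect_p_value": None, "male_ko_effect_p_value": 0.5},
]

significant_pairs = [(r["gene"], r["phenotype"]) for r in records if is_significant(r)]
print(significant_pairs)
```

`GeneB` is kept even though its overall `p_value` is above the threshold, because its female-specific P-value passes.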

Phenotypic Similarity Calculation

The Jaccard coefficient is used as the phenotypic similarity metric.
It expresses the proportion of shared phenotypes as a value between 0 and 1.

Jaccard(A, B) = |A ∩ B| / |A ∪ B|

For example, suppose gene A and gene B KO mice have the following abnormal phenotypes:

A: {abnormal embryo development, abnormal heart morphology, abnormal kidney morphology}
B: {abnormal embryo development, abnormal heart morphology, abnormal lung morphology}

In this case, there are 2 shared phenotypes and 4 total unique phenotypes, so the Jaccard coefficient is calculated as follows:

Jaccard(A, B) = 2 / 4 = 0.5
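The worked example above can be reproduced with Python sets. This is a minimal sketch of the formula; the function name is illustrative, and returning 0.0 for two empty sets is a convention chosen here, not something stated in this document.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| for two sets of phenotypes."""
    union = a | b
    if not union:
        return 0.0  # convention for two empty sets (an assumption)
    return len(a & b) / len(union)


phenotypes_a = {
    "abnormal embryo development",
    "abnormal heart morphology",
    "abnormal kidney morphology",
}
phenotypes_b = {
    "abnormal embryo development",
    "abnormal heart morphology",
    "abnormal lung morphology",
}

print(len(phenotypes_a & phenotypes_b))     # 2 shared phenotypes
print(len(phenotypes_a | phenotypes_b))     # 4 unique phenotypes in total
print(jaccard(phenotypes_a, phenotypes_b))  # 0.5
```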

✉️ Contact

For questions or requests, please feel free to contact us:

Google Form
👉 Contact Form
For GitHub account holders
👉 GitHub Issue
