Input Format Requirements for Using Pre-trained Model #1

@yuequnwang

Hello EpiFoundation team,

Thank you for developing and sharing EpiFoundation. I have some questions about the input format required for compatibility with your pre-trained parameters.

I have an AnnData object where rows represent cells and columns represent ATAC peaks (binary values indicating open/closed chromatin regions). I'm trying to understand the best way to transform this data to leverage your pre-trained model effectively.

My Questions:
1. Peak Alignment Requirements: Do I need to align my peak set with the peaks used in your pre-training data? For instance, do the columns in my peak-cell matrix need to correspond exactly to the peaks used during pre-training?
2. Peak Identification Format: What format should peak IDs follow to be compatible with your model's embedding layer (Epeak)? Is there a specific format like "chr1:1000-2000"?
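
To make the questions concrete, here is a minimal sketch of the kind of alignment I have in mind. It assumes peak IDs of the form "chr1:1000-2000" and a reference peak list from pre-training; both are my assumptions, not something I found confirmed for EpiFoundation. The names `parse_peak`, `map_to_reference`, and `ref_peaks` are hypothetical and used only for illustration.

```python
import re
from collections import defaultdict

# Assumed ID format "chrom:start-end", e.g. "chr1:1000-2000" (not confirmed by the repo).
PEAK_RE = re.compile(r"^(chr[\w.]+):(\d+)-(\d+)$")


def parse_peak(peak_id):
    """Split an assumed 'chrom:start-end' peak ID into (chrom, start, end)."""
    m = PEAK_RE.match(peak_id)
    if m is None:
        raise ValueError(f"unrecognized peak ID: {peak_id!r}")
    chrom, start, end = m.group(1), int(m.group(2)), int(m.group(3))
    if start >= end:
        raise ValueError(f"empty or reversed interval: {peak_id!r}")
    return chrom, start, end


def map_to_reference(my_peaks, ref_peaks):
    """Map each of my peak IDs to the first overlapping reference peak ID, or None."""
    # Group the (hypothetical) reference peaks by chromosome for lookup.
    by_chrom = defaultdict(list)
    for ref_id in ref_peaks:
        chrom, start, end = parse_peak(ref_id)
        by_chrom[chrom].append((start, end, ref_id))

    mapping = {}
    for peak_id in my_peaks:
        chrom, start, end = parse_peak(peak_id)
        hit = None
        for r_start, r_end, ref_id in by_chrom.get(chrom, []):
            # Intervals treated as half-open [start, end); overlap if they share a base.
            if start < r_end and r_start < end:
                hit = ref_id
                break
        mapping[peak_id] = hit
    return mapping


# Toy data, not taken from any real peak set.
my_peaks = ["chr1:1000-2000", "chr1:5000-5500", "chr2:100-300"]
ref_peaks = ["chr1:1500-2500", "chr2:400-900"]
mapping = map_to_reference(my_peaks, ref_peaks)
```

In this toy case only "chr1:1000-2000" would map to a reference peak; the other two have no overlap. Is something along these lines needed, or does the model expect a different procedure?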

Thank you for your guidance!
