Description
Hello EpiFoundation team,
Thank you for developing and sharing EpiFoundation. I have some questions about the input format required for compatibility with your pre-trained parameters.
I have an AnnData object where rows represent cells and columns represent ATAC peaks (binary values indicating open/closed chromatin regions). I'm trying to understand the best way to transform this data to leverage your pre-trained model effectively.
My Questions:
1. Peak Alignment Requirements: Do I need to align my peak set with the peaks used in your pre-training data? For instance, do the columns in my peak-cell matrix need to correspond exactly to the peaks used during pre-training?
2. Peak Identification Format: What format should peak IDs follow to be compatible with your model's embedding layer (Epeak)? Is there a specific format like "chr1:1000-2000"?
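To make the questions concrete, here is a minimal sketch of the preprocessing I imagine might be needed. It is only my guess, not something taken from the EpiFoundation code: `normalize_peak_id`, `align_to_reference`, and the `reference_peaks` list are hypothetical names, the "chr1:1000-2000" target format is the one I am asking about rather than a confirmed requirement, and matching peaks by exact coordinates is an assumption (overlap-based matching may be what is actually required).

```python
import re

# Accepts 'chr1:1000-2000', 'chr1_1000_2000', or 'chr1-1000-2000'.
PEAK_RE = re.compile(r"^(chr[0-9A-Za-z_]+)[:_-](\d+)[-_](\d+)$")


def normalize_peak_id(peak_id):
    """Rewrite a peak ID as 'chrom:start-end' (assumed target format)."""
    match = PEAK_RE.match(peak_id.strip())
    if match is None:
        raise ValueError(f"Unrecognized peak ID: {peak_id!r}")
    chrom, start, end = match.groups()
    return f"{chrom}:{int(start)}-{int(end)}"


def align_to_reference(matrix, peak_ids, reference_peaks):
    """Reorder the columns of a cell-by-peak matrix to follow reference_peaks.

    Reference peaks missing from peak_ids become all-zero columns (closed);
    peaks absent from the reference are dropped. Matching is by exact ID.
    """
    column_of = {normalize_peak_id(p): j for j, p in enumerate(peak_ids)}
    reference = [normalize_peak_id(p) for p in reference_peaks]
    aligned = [
        [row[column_of[p]] if p in column_of else 0 for p in reference]
        for row in matrix
    ]
    n_shared = sum(p in column_of for p in reference)
    return aligned, n_shared


# Toy data: 2 cells x 3 peaks, binary accessibility.
matrix = [[1, 0, 1],
          [0, 1, 1]]
peak_ids = ["chr1_1000_2000", "chr2:500-900", "chr3-10-20"]
# Hypothetical stand-in for the pre-training peak vocabulary.
reference_peaks = ["chr2:500-900", "chr1:1000-2000", "chrX:1-100"]

aligned, n_shared = align_to_reference(matrix, peak_ids, reference_peaks)
print(aligned, n_shared)
```

If something along these lines is expected, it would also help to know how peaks that overlap a pre-training peak without matching its coordinates exactly should be handled.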
Thank you for your guidance!