Try to create the chicken cisdatabase #50
Comments
Never seen that error myself, but I assume the error comes from a mismatch between gene/region names in the database and the ones you are requesting in pySCENIC (which seems to be |
ENSEMBL switched from GRCg6a (galGal6) to GRCg7b (bGalGal1) in recent releases. ENSGALG00010029927, for example, is an ENSEMBL ID for the GRCg7b assembly, so the coordinates would not match if you are using GRCg6a. ENSEMBL still provides gene annotations for GRCg6a on its FTP server (at least for ENSEMBL 108), but many gene names are missing. We updated the protein-coding gene names for the GRCg6a version (ENSEMBL 108) for one of our recent data sets, in case that is helpful: https://ftp.ncbi.nlm.nih.gov/geo/series/GSE262nnn/GSE262321/suppl/GSE262321%5Fgex%5Fgenes%2Etsv%2Egz
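A quick way to check for this kind of mismatch is to count how many gene identifiers from the expression matrix also occur in the database. The following is only a minimal sketch: it assumes both sets of names have already been loaded as plain Python strings, `overlap_report` is a hypothetical helper (not part of pySCENIC), and every ID in the example except ENSGALG00010029927 is made up.

```python
def overlap_report(expr_genes, db_genes):
    """Summarise how many expression-matrix genes are present in the database."""
    expr = set(expr_genes)
    db = set(db_genes)
    shared = expr & db
    return {
        "n_expr": len(expr),
        "n_db": len(db),
        "n_shared": len(shared),
        # Fraction of expression-matrix genes that the database knows about.
        "frac_expr_in_db": len(shared) / len(expr) if expr else 0.0,
        # A few identifiers that are missing from the database, for inspection.
        "missing_examples": sorted(expr - db)[:5],
    }


# Illustrative input only: apart from ENSGALG00010029927 these IDs are invented.
expr_genes = ["ENSGALG00010029927", "ENSGALG00000000001", "ENSGALG00000000002"]
db_genes = ["ENSGALG00010029927", "ENSGALG00010000003"]
print(overlap_report(expr_genes, db_genes))
```

A low `frac_expr_in_db` would suggest that the expression matrix and the database use identifiers from different assemblies or annotation releases.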
I have made two Feather files, one built from 500 bp around the TSS and the other from 10 kb around the TSS, and found that neither is ideal: only 6 to 7 TFs are detected in the results. What's more, the detected TFs are totally different between the two! I feel sad about the results, hmm...
@LJZYaaa Are you sure that you are using the correct gene annotation (GRCg7b, bGalGal1) together with the matching FASTA file (also GRCg7b, bGalGal1)?
@ghuls I made the FASTA file through Ensembl BioMart as below:
At first glance it looks OK. Now that the gene names in your Feather database are from GRCg7b, make sure to convert the gene names in your expression matrix to GRCg7b too. For pySCENIC it might be better to use the human motif-to-TF annotation than the chicken (GRCg6a) one: https://resources.aertslab.org/cistarget/motif2tf/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl, and likewise for the TF list: https://resources.aertslab.org/cistarget/tf_lists/allTFs_hg38.txt (prune it so that it only contains TFs that are actually in your database). As we mainly work with scATAC data instead of scRNA, I think we might not have a gene-based chicken cisTarget database internally, only region-based ones. So there is a chance that the gene-based version does not work very well.
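The pruning step could look roughly like the sketch below. It assumes the TF list has one name per line and that the gene names in the database are symbols directly comparable to the names in that list (exact, case-sensitive match). `prune_tf_list` is a hypothetical helper, and the input names are made up for illustration.

```python
def prune_tf_list(tf_names, db_genes):
    """Keep only TFs that occur among the database gene names, preserving order."""
    db = set(db_genes)
    seen = set()
    kept = []
    for name in tf_names:
        name = name.strip()  # drop newlines and surrounding whitespace
        if name and name in db and name not in seen:
            kept.append(name)
            seen.add(name)
    return kept


# Illustrative input only: lines as they might be read from a TF list file.
tf_names = ["SOX2\n", "PAX6\n", "\n", "ZNF000\n", "SOX2\n"]
db_genes = {"PAX6", "SOX2", "GAPDH"}
pruned = prune_tf_list(tf_names, db_genes)
print(pruned)  # ['SOX2', 'PAX6']
```

With a real file, an open text file handle can be passed as `tf_names`, since iterating over it yields one line at a time; the pruned names can then be written back out, one per line, and used as the TF list.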
@ghuls OK, I will try it. It would be a great pity if your tool can't be used for non-model species. Anyway, thanks for your reply.
Hello, I'm trying to create a chicken cisTarget database for my single-cell analysis, and I have already created GRCg6a.regions_vs_motifs.rankings.feather from EPD's BED file and the v10_clust motifs. But when I try to run pyscenic ctx with the Feather file and motifs-v10-nr.chicken-m0.00001-o0.0.tbl, it reports the following error:



Could you give me some advice on this error?