psiPerIsoform resulting in empty .psi file #161
Comments
Hi Tasy,
Thanks for your email.
Perhaps the transcript IDs in your GTF and in your expression file are
different? They look different in your screen captures. SUPPA would not be
able to match them.
The expression file should have the transcript IDs without the quotation
marks (" "). The GTF format, on the other hand, puts quotation marks (" ")
around transcript and gene IDs:
see e.g. https://asia.ensembl.org/info/website/upload/gff.html
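One way to check for such a mismatch is to compare the two sets of IDs directly. The following is only a minimal sketch, not part of SUPPA: the function names are made up for illustration, and it assumes a tab-separated GTF with `transcript_id "..."` in the 9th column and an expression table whose first line is a header and whose first column holds the transcript ID.

```python
import re


def gtf_transcript_ids(gtf_lines):
    """Collect transcript_id values from the attribute (9th) column of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def expression_ids(tsv_lines):
    """Collect first-column IDs from an expression table, skipping the header line."""
    lines = iter(tsv_lines)
    next(lines, None)  # assumption: the first line is a header
    return {line.split("\t", 1)[0].strip() for line in lines if line.strip()}


def compare_ids(gtf_lines, tsv_lines):
    """Return the IDs shared by both files and the IDs unique to each."""
    gtf_ids = gtf_transcript_ids(gtf_lines)
    expr_ids = expression_ids(tsv_lines)
    return {
        "shared": gtf_ids & expr_ids,
        "gtf_only": gtf_ids - expr_ids,
        "expression_only": expr_ids - gtf_ids,
    }
```

Calling `compare_ids(open("pacbio_TREM2.gtf"), open("pseudobulk_without_rownames.tsv"))` and printing a few entries from `gtf_only` and `expression_only` should show whether the IDs differ only by quotation marks or in some other way. An empty `shared` set would be consistent with an empty .psi file.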
Please let me know if that fixes it.
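If the IDs in the expression file do turn out to carry quotation marks, something like the sketch below could strip them. This is an illustration under assumptions rather than a SUPPA utility: `strip_id_quotes` is a hypothetical helper, the first line is assumed to be a header and is passed through unchanged, and only the first column of each following row is modified.

```python
def strip_id_quotes(tsv_lines):
    """Yield expression-table lines with quotes removed from the ID column only."""
    for index, line in enumerate(tsv_lines):
        body = line.rstrip("\r\n")
        if index == 0 or not body:
            # Header (assumed to be the first line) and blank lines pass through.
            yield body + "\n"
            continue
        row_id, sep, rest = body.partition("\t")
        # Remove surrounding whitespace, then any leading/trailing quote characters.
        yield row_id.strip().strip("\"'") + sep + rest + "\n"
```

The cleaned lines can be written to a new file with `writelines`, which leaves the original table untouched for comparison.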
best
Eduardo
On Wed, 12 Apr 2023 at 20:04, TinyTasy ***@***.***> wrote:
Dear SUPPA team,
Thank you so much for your amazing tool. It is really helpful for
differential isoform analysis.
I am trying to use SUPPA on PacBio single-cell Iso-Seq data. I aligned my
data with pbmm2 and used pigeon (SQANTI-based) to obtain a .gff file. Using
gffread, I converted my .gff file into a .gtf file. My GTF file looks like
this:
[image: Screenshot from 2023-04-12 11-54-45]
<https://user-images.githubusercontent.com/118251413/231423185-3e586c04-d231-45c4-9a25-3c1c59ba445d.png>
So the PB gene and transcript IDs are in the 9th column of the GTF file.
My expression file is a tab-separated (.tsv) file that contains 268 samples
(pseudobulks) and looks like this:
[image: Screenshot from 2023-04-12 11-56-32]
<https://user-images.githubusercontent.com/118251413/231423691-97b5e60f-9440-410f-9747-1df66fa303b6.png>
If I now execute this command:
python3.4 /vol/projects/agrinko/TREM2_7_03_2022/SUPPA-2.3/suppa.py psiPerIsoform \
  -g /vol/projects/agrinko/TREM2_7_03_2022/data/Trem2_Longread/pacbio_TREM2.gtf \
  -e /vol/projects/agrinko/TREM2_7_03_2022/data/Trem2_Longread/pseudobulk_without_rownames.tsv \
  -o /vol/projects/agrinko/TREM2_7_03_2022/data/Trem2_Longread/psiPerIsoform_output
I get this warning for each transcript:
INFO:psiPerGene:Reading GTF data.
INFO:psiPerGene:Reading Expression data.
INFO:psiPerGene:Calculating inclusion and generating output.
INFO:lib.tools:Expression for transcript "PB.104659.2" not found. Ignoring it in calculation.
INFO:lib.tools:Expression for transcript "PB.104659.16" not found. Ignoring it in calculation.
INFO:lib.tools:Expression for transcript "PB.98879.2" not found. Ignoring it in calculation.
INFO:lib.tools:Expression for transcript "PB.98879.3" not found. Ignoring it in calculation.
...
And my .psi output file is empty; only the sample names are present.
I have already tried several things, such as testing tab-separated .txt files
and .tsv files. I have also tried using the transcript IDs as row names.
Do you have any idea what could be the issue? Any help is greatly
appreciated.
Sincerely,
Tasy
Hello Eduardo! Thank you for your quick reply. I am grateful for your help, thank you! Sincerely,