SUPPA not working on quantification #160
Hi Olivier,
I managed to reproduce your problem.
I looked at the .ioe and expression files and saw that your expression
file has a total of 56980 different transcript IDs,
while your ioe file has 17822 transcript IDs.
Many of the IDs in the ioe file are not in the expression file.
What happens is that, since each event is defined by multiple transcripts,
if some do not appear in the expression, the event may be left as
undefined.
Not being in the expression file is not considered to be the same as having
zero expression, so this may be the cause of having so many NA's.
One thing that you could try is including the zeroes in the expression file
and see whether the event PSIs can be calculated.
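The difference between a missing transcript and a zero value can be sketched in a few lines of Python. This is not SUPPA's actual code: the function name `psi_for_event` and the toy IDs are hypothetical, and it assumes that PSI is the summed expression of the inclusion transcripts divided by the summed expression of all the transcripts of the event.

```python
from typing import Dict, Iterable, Optional


def psi_for_event(
    alternative: Iterable[str],
    total: Iterable[str],
    tpm: Dict[str, float],
) -> Optional[float]:
    """Return the PSI of one event, or None (NA) when it cannot be defined.

    Assumption: the event is undefined as soon as any of its transcripts is
    absent from the expression table, as described above.
    """
    alternative = list(alternative)
    total = list(total)
    # A transcript that is absent from the table is NOT treated as zero.
    if any(t not in tpm for t in alternative + total):
        return None
    denominator = sum(tpm[t] for t in total)
    if denominator == 0:
        return None  # no expression at all for this event
    return sum(tpm[t] for t in alternative) / denominator


if __name__ == "__main__":
    tpm = {"T1": 3.0, "T2": 1.0}
    print(psi_for_event(["T1"], ["T1", "T2"], tpm))        # 0.75
    print(psi_for_event(["T1"], ["T1", "T2", "T3"], tpm))  # None: T3 is absent
    tpm["T3"] = 0.0
    print(psi_for_event(["T1"], ["T1", "T2", "T3"], tpm))  # 0.75: T3 is present with zero
```

In the toy example, adding `T3` with a value of zero is enough to turn the undefined event into a defined one.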
I hope this helps
best
Eduardo
On Mon, 27 Mar 2023 at 21:12, Olivier Feudjio ***@***.***> wrote:
Hello
I am trying to run SUPPA to assess alternative splicing events. I have done
everything as described in the documentation regarding the expression file,
but it still does not work. Here is the command line I am using:
suppa.py psiPerEvent --ioe-file mm10_SE_strict.ioe --expression-file spc.tsv --output-file ./spc_SE
Thank you for helping me out
Best
Hello @EduEyras, I think that, at least for the transcripts that are in the expression file, a PSI should be calculated, but this is not the case: I get only NAs, regardless of whether the transcript ID is present in the expression file or not. Best.
Could you check whether there are any events for which all the transcript IDs
involved in the event are in the expression file? My suspicion is that there
aren't any, and that is why the PSI is NA for all events
E
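One way to run this check is sketched below in Python. The function names are hypothetical, and the sketch assumes tab-separated files in which the expression file starts with a header line of sample names and the ioe file starts with a header line whose first field is `seqname`.

```python
from typing import Iterable, Set, Tuple


def read_expression_ids(lines: Iterable[str]) -> Set[str]:
    """Collect the transcript IDs (first column), skipping the header line."""
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    return {row[0] for row in rows[1:]}


def count_covered_events(
    ioe_lines: Iterable[str], expression_ids: Set[str]
) -> Tuple[int, int]:
    """Return (events whose transcripts are all present, total events)."""
    covered = total = 0
    for line in ioe_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[0] == "seqname":
            continue  # header or malformed line
        # Columns 4 and 5 hold comma-separated transcript IDs.
        transcripts = set(fields[3].split(",")) | set(fields[4].split(","))
        total += 1
        if transcripts <= expression_ids:
            covered += 1
    return covered, total


if __name__ == "__main__":
    expression = ["sample1", "T1\t5.0", "T2\t0.0"]
    ioe = [
        "seqname\tgene_id\tevent_id\talternative_transcripts\ttotal_transcripts",
        "1\tG1\tG1;SE:a\tT1\tT1,T2",
        "1\tG2\tG2;SE:b\tT3\tT3,T4",
    ]
    print(count_covered_events(ioe, read_expression_ids(expression)))
```

With real files, the two arguments could be open file handles, for example `open("mm10_SE_strict.ioe")` and `open("spc.tsv")`. A first number of zero would mean that no event has complete expression data.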
On Tue, 28 Mar 2023 at 02:07, Olivier Feudjio ***@***.***> wrote:
Hello @EduEyras <https://github.com/EduEyras>,
Thank you for your answer.
All of the transcripts in my expression file have a value, of at least
zero.
So if I understand well, you suggest extracting all the transcripts IDs
from the ioe file and for the ones that do not have a value in the
expression file, I should assign zero and see if it is able to work, right?
I think that at least for the transcripts that appear to be in the
expression file, there should be a PSI calculated but this is not the case,
it gives me only NAs, regardless of whether the transcript ID is present in
the expression file or not.
Best.
Thank you for your reply,
I am afraid that I might not understand what you mean or how to check that, can you please elaborate more on your question?
Best
Hi,
the ioe file defines each event by listing the transcripts that include
the alternative exon, followed by all the transcripts that either include
or exclude the exon, e.g. (one tab-separated line):
1	ENSMUSG00000026312	ENSMUSG00000026312;SE:1:110036685-110039327:110039381-110065592:+	ENSMUST00000134301	ENSMUST00000027542,ENSMUST00000134301,ENSMUST00000112701,ENSMUST00000172005,ENSMUST00000131464
So I'm wondering whether the events are NA because expression values are
missing for IDs in the last column (the transcripts that include or exclude the exon)
E.
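The layout of an ioe line can be made concrete with a short Python sketch. The names `IoeEvent` and `parse_ioe_line` are hypothetical, and the field names are only labels chosen for this illustration.

```python
from typing import List, NamedTuple


class IoeEvent(NamedTuple):
    seqname: str
    gene_id: str
    event_id: str
    alternative_transcripts: List[str]  # transcripts including the exon
    total_transcripts: List[str]        # transcripts including or excluding it


def parse_ioe_line(line: str) -> IoeEvent:
    """Split one tab-separated ioe line into its five fields."""
    seqname, gene_id, event_id, alternative, total = line.rstrip("\n").split("\t")
    return IoeEvent(
        seqname, gene_id, event_id, alternative.split(","), total.split(",")
    )


if __name__ == "__main__":
    line = "\t".join([
        "1",
        "ENSMUSG00000026312",
        "ENSMUSG00000026312;SE:1:110036685-110039327:110039381-110065592:+",
        "ENSMUST00000134301",
        "ENSMUST00000027542,ENSMUST00000134301,ENSMUST00000112701,"
        "ENSMUST00000172005,ENSMUST00000131464",
    ])
    event = parse_ioe_line(line)
    # Transcripts that skip the exon are the total set minus the inclusion set.
    excluding = [
        t for t in event.total_transcripts
        if t not in event.alternative_transcripts
    ]
    print("including:", event.alternative_transcripts)
    print("excluding:", excluding)
```

For the example event, every one of the five IDs in the last column would need a row in the expression file.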
I have tried grepping a few IDs from the expression file, and they exist in the ioe file too. Can I please send you both files for you to check? I would be very grateful if you could have a look at them.
Hi,
I made a quick Perl script to parse the expression and ioe files and check
whether any of the transcript IDs in each event are missing. All of the
events have one or more transcript IDs missing from the expression file.
I paste the script below. It is run as follows:
perl check_exp_ioe_files.pl spd1.tsv mm10_SE_strict.ioe
#!/usr/bin/perl -w
use strict;

# Check that every transcript ID used in the ioe file has a row in the expression file.
my ($exp_file, $ioe_file) = @ARGV;
unless ($exp_file && $ioe_file) {
    print STDERR "Usage: $0 exp_file ioe_file\n";
    print STDERR "Script to check that the ioe file has IDs with expression values\n";
    exit(1);
}

# Collect the transcript IDs (first column) of the expression file.
open(my $exp_fh, '<', $exp_file) or die("cannot open $exp_file, $!");
my %ids_exp;
while (my $line = <$exp_fh>) {
    chomp $line;
    # Skip lines without Ensembl IDs (e.g. the header);
    # /ENS/ matches both ENST... and ENSMUST... identifiers.
    next unless ($line =~ /ENS/);
    my @line_array = split(/\t/, $line);
    $ids_exp{$line_array[0]}++;
}
close($exp_fh) or die("cannot close $exp_file");

open(my $ioe_fh, '<', $ioe_file) or die("cannot open $ioe_file, $!");
while (my $line = <$ioe_fh>) {
    chomp $line;
    next unless ($line =~ /ENS/);
    # Example line (tab-separated):
    # 1  ENSMUSG00000039748  ENSMUSG00000039748;SE:1:175733555-175734172:175734226-175735996:+  ENSMUST00000194636  ENSMUST00000039725,ENSMUST00000193610,ENSMUST00000194636
    my ($chr, $gene, $event, $t1, $t_line) = split(/\t/, $line);
    my %missing;
    # Both transcript columns may hold comma-separated lists of IDs.
    foreach my $t (split(",", $t1), split(",", $t_line)) {
        $missing{$t}++ unless $ids_exp{$t};
    }
    my @missing_ids = sort keys %missing;
    if (@missing_ids) {
        print join("\t", $chr, $gene, $event, $t1, $t_line, "missing", @missing_ids), "\n";
    }
    else {
        print join("\t", $chr, $gene, $event, $t1, $t_line, "correct"), "\n";
    }
}
close($ioe_fh) or die("cannot close $ioe_file");
On Wed, 29 Mar 2023 at 02:59, Olivier Feudjio ***@***.***> wrote:
I have tried grepping a few IDs from the expression file, and they
exist in the ioe file too.
Can I please send you both files for you to check?
Here they are:
mm10_SE_strict.zip
<https://github.com/comprna/SUPPA/files/11091460/mm10_SE_strict.zip>
expression files.zip
<https://github.com/comprna/SUPPA/files/11091461/expression.files.zip>
I would be very grateful if you could have a look at them
Hello @EduEyras Best
Hi
Yes, the ioe is fine
I am wondering whether adding the ids with zeroes in the expression file
might help
E
On Wed, 29 Mar 2023 at 21:46, Olivier Feudjio ***@***.***> wrote:
Hello @EduEyras <https://github.com/EduEyras>
Thank you very much for this script. What could the solution be? How
can I solve this problem, please?
At the same time, I am very confused, because the ioe files were generated
from the same GTF file that was used to generate the expression files, so the
transcript IDs are supposed to be the same.
Best
I will try to extract those transcript IDs from the ioe file and assign zeroes to them in the expression file to see if it changes the outcome. Thank you!
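A minimal Python sketch of that padding step follows. The function name `pad_expression_with_zeros` is hypothetical. The sketch assumes tab-separated input, an expression file whose first line is a header, and an ioe header line whose first field is `seqname`.

```python
from typing import Iterable, List


def pad_expression_with_zeros(
    expression_lines: Iterable[str], ioe_lines: Iterable[str]
) -> List[str]:
    """Return the expression lines plus a zero row for each absent ioe transcript."""
    rows = [line.rstrip("\n") for line in expression_lines if line.strip()]
    header, body = rows[0], rows[1:]
    # Number of value columns, taken from the first data row when there is one.
    n_values = len(body[0].split("\t")) - 1 if body else len(header.split("\t"))
    known = {row.split("\t")[0] for row in body}

    missing = []
    for line in ioe_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[0] == "seqname":
            continue  # header or malformed line
        for transcript in fields[3].split(",") + fields[4].split(","):
            if transcript not in known:
                known.add(transcript)
                missing.append(transcript)

    zero_row = "\t".join(["0"] * n_values)
    return rows + [f"{transcript}\t{zero_row}" for transcript in missing]


if __name__ == "__main__":
    expression = ["s1\ts2", "T1\t1.0\t2.0"]
    ioe = [
        "seqname\tgene_id\tevent_id\talternative_transcripts\ttotal_transcripts",
        "1\tG1\tG1;SE:a\tT1\tT1,T2",
    ]
    print("\n".join(pad_expression_with_zeros(expression, ioe)))
```

The padded lines can then be written to a new file and passed to `--expression-file`. Padding treats transcripts that were never quantified as unexpressed, so it is worth keeping the original file for comparison.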