diffsplice index error #196

Open
Amhaslam opened this issue Sep 16, 2024 · 5 comments

Comments

@Amhaslam

Hi,
I am getting the following error while running diffSplice. Please help me resolve it.

Calculating differential analysis between conditions: Suppa_salmon_RB_psi_values and Suppa_salmon_Control_psi_values
ERROR:main:Unknown error: (<class 'IndexError'>, IndexError('list index out of range'), <traceback object at 0x790a335260c0>)

@EduEyras
Member

Could you please send the command line?

Also, please use a copy of the code from GitHub (either a clone or a download). It will contain some bug fixes that might not be available in the conda version.

E.

@MarekGierlinski

Hi,

I have encountered the same error, using the most recent version from GitHub. Here is my command line:

> python3 SUPPA-2.4/suppa.py diffSplice -m empirical -i suppa/events/event_SE_strict.ioe -p suppa/psi/DMSO_4h_SE.psi suppa/psi/ActD_4h_SE.psi -e suppa/tpm/DMSO_4h.txt suppa/tpm/ActD_4h.txt --lower-bound 0.05 -gc -o suppa/diff/ActD_4h_SE.txt

Calculating differential analysis between conditions: DMSO_4h_SE and ActD_4h_SE
ERROR:__main__:Unknown error: (<class 'IndexError'>, IndexError('list index out of range'), <traceback object at 0x2b02ecc8f040>)

The input files are like these:

> head suppa/events/event_SE_strict.ioe
seqname	gene_id	event_id	alternative_transcripts	total_transcripts
1	ENSG00000142611	ENSG00000142611;SE:1:3431108-3431966:3432140-3433677:+	ENST00000270722,ENST00000512462,ENST00000509860	ENST00000378389,ENST00000511072,ENST00000509860,ENST00000512462,ENST00000514189,ENST00000270722
1	ENSG00000142655	ENSG00000142655;SE:1:10495321-10517028:10517131-10536213:+	ENST00000472851	ENST00000356607,ENST00000472851,ENST00000491661
1	ENSG00000232596	ENSG00000232596;SE:1:4571629-4572470:4572592-4593052:+	ENST00000667354,ENST00000659059	ENST00000666685,ENST00000659059,ENST00000667354,ENST00000634256
1	ENSG00000232596	ENSG00000232596;SE:1:4571629-4575322:4575450-4593052:+	ENST00000659083	ENST00000666685,ENST00000659083,ENST00000634256
1	ENSG00000232596	ENSG00000232596;SE:1:4572592-4582277:4582574-4583083:+	ENST00000420522	ENST00000420522,ENST00000661628
1	ENSG00000235054	ENSG00000235054;SE:1:4416548-4423054:4423187-4423348:+	ENST00000635312	ENST00000635002,ENST00000668086,ENST00000635312
1	ENSG00000235054	ENSG00000235054;SE:1:4423187-4423348:4423448-4423994:+	ENST00000635312,ENST00000669931,ENST00000423197,ENST00000659315,ENST00000635642	ENST00000659315,ENST00000423197,ENST00000669931,ENST00000635642,ENST00000635312,ENST00000667352
1	ENSG00000235054	ENSG00000235054;SE:1:4416548-4422760:4423187-4423348:+	ENST00000669931,ENST00000423197	ENST00000423197,ENST00000669931,ENST00000635002,ENST00000668086
1	ENSG00000235054	ENSG00000235054;SE:1:4416548-4422876:4423187-4423348:+	ENST00000659315,ENST00000635642	ENST00000659315,ENST00000635002,ENST00000668086,ENST00000635642

head suppa/psi/DMSO_4h_SE.psi
DMSO_4h_1	DMSO_4h_2	DMSO_4h_3	DMSO_4h_4
ENSG00000000003;SE:X:100630866-100632485:100632568-100633405:-	0.9984909907641735	0.9960692215706672	1.0	0.9993863994390673
ENSG00000000419;SE:20:50936262-50940865:50940933-50942031:-	0.9629173478563579	0.9427937649710495	0.956188066744992	0.9425552007213499
ENSG00000000419;SE:20:50936262-50940865:50940955-50942031:-	0.47221963951960194	0.2933847562014635	0.3657149073452484	0.3038975229653856
ENSG00000000419;SE:20:50940933-50941105:50941209-50942031:-	0.028363373150652894	0.02799640449354926	0.030312514565798154	0.014823359763832214
ENSG00000000419;SE:20:50940933-50941129:50941209-50942031:-	0.07818130233291508	0.08780259000559165	0.054971267248545404	0.07336325358106109
ENSG00000000419;SE:20:50940955-50941105:50941209-50942031:-	0.0	0.035003223060908244	0.01448838433493186	0.10207471483060135
ENSG00000000419;SE:20:50941209-50942031:50942126-50945737:-	0.9832119546137829	1.0000000000000002	0.9618810813223434	0.9769950594008092
ENSG00000000419;SE:20:50942126-50945737:50945762-50945847:-	0.9876598940318572	0.9743969827341252	0.9833589331044317	0.9658016711881765
ENSG00000000457;SE:1:169854964-169855796:169855957-169859041:-	0.4856115420019294	0.5194001770392952	0.533045114862394	0.533167156249122

head suppa/psi/ActD_4h_SE.psi
ActD_4h_1	ActD_4h_2	ActD_4h_3	ActD_4h_4
ENSG00000000003;SE:X:100630866-100632485:100632568-100633405:-	0.9993465716877533	0.9946488099599791	0.9983193877500409	1.0
ENSG00000000419;SE:20:50936262-50940865:50940933-50942031:-	0.936674329975077	0.928984849513219	0.9239805417948307	0.9351968473956108
ENSG00000000419;SE:20:50936262-50940865:50940955-50942031:-	0.3400886395468895	0.22525994489482515	0.34906576990789634	0.2483124442329311
ENSG00000000419;SE:20:50940933-50941105:50941209-50942031:-	0.01568429976434919	0.0169688806521051	0.0232128041190583	0.01772789595641008
ENSG00000000419;SE:20:50940933-50941129:50941209-50942031:-	0.05337329595858719	0.07219034981681514	0.055939850533341266	0.050026735080484164
ENSG00000000419;SE:20:50940955-50941105:50941209-50942031:-	0.012418429736731382	0.01504706810234816	0.0	0.017565223622536257
ENSG00000000419;SE:20:50941209-50942031:50942126-50945737:-	0.9668314225176855	0.9623899468702947	0.9897780452396181	1.0
ENSG00000000419;SE:20:50942126-50945737:50945762-50945847:-	0.9639761546102018	0.962519374887582	0.9574930457839392	0.9560360966187568
ENSG00000000457;SE:1:169854964-169855796:169855957-169859041:-	0.5338348074956553	0.5334716768349987	0.48508218276071796	0.5062566816919817

head suppa/tpm/DMSO_4h.txt
DMSO_4h_1	DMSO_4h_2	DMSO_4h_3	DMSO_4h_4
ENST00000415118	0	0	0	0
ENST00000448914	0	0	0	0
ENST00000434970	0	0	0	0
ENST00000631435	0	0	0	0
ENST00000710614	0	0	0	0
ENST00000605284	0	0	0	0
ENST00000604642	0	0	0	0
ENST00000603077	0	0	0	0
ENST00000603693	0	0	0	0

head suppa/tpm/ActD_4h.txt
ActD_4h_1	ActD_4h_2	ActD_4h_3	ActD_4h_4
ENST00000415118	0	0	0	0
ENST00000448914	0	0	0	0
ENST00000434970	0	0	0	0
ENST00000631435	0	0	0	0
ENST00000710614	0	0	0	0
ENST00000605284	0	0	0	0
ENST00000604642	0	0	0	0
ENST00000603077	0	0	0	0
ENST00000603693	0	0	0	0

Any ideas?

@MarekGierlinski

Just to add to my previous post: when I change -m empirical to -m classical, the code runs with no errors. However, the files it creates suggest that something is amiss:

l suppa/diff
total 12545
drwxr-s--- 2 mgierlinski barton    4096 Oct 16 12:38 .
drwxr-s--- 6 mgierlinski barton    4096 Oct 16 11:47 ..
-rw-r----- 1 mgierlinski barton 4738471 Oct 16 12:38 ActD_4h_SE.dpsi.temp.0
-rw-r----- 1 mgierlinski barton 8098918 Oct 16 12:38 ActD_4h_SE.psivec

The file ActD_4h_SE.dpsi.temp.0 looks like this:

head suppa/diff/ActD_4h_SE.dpsi.temp.0
Event_id	DMSO_4h_SE-ActD_4h_SE_dPSI	DMSO_4h_SE-ActD_4h_SE_p-val
ENSG00000000003;SE:X:100630866-100632485:100632568-100633405:-	-0.0004079606	0.7715034091
ENSG00000000419;SE:20:50936262-50940865:50940933-50942031:-	-0.0199044529	0.1000000000
ENSG00000000419;SE:20:50936262-50940865:50940955-50942031:-	-0.0681225069	0.4800000000
ENSG00000000419;SE:20:50940933-50941105:50941209-50942031:-	-0.0069754429	0.4800000000
ENSG00000000419;SE:20:50940933-50941129:50941209-50942031:-	-0.0156970454	0.2666666667
ENSG00000000419;SE:20:50940955-50941105:50941209-50942031:-	-0.0266339002	0.6549237453
ENSG00000000419;SE:20:50941209-50942031:50942126-50945737:-	-0.0007721702	1.0000000000
ENSG00000000419;SE:20:50942126-50945737:50945762-50945847:-	-0.0177982023	0.1000000000
ENSG00000000457;SE:1:169854964-169855796:169855957-169859041:-	-0.0031446603	0.8857142857

The contents are probably fine, but I'm suspicious about the file name.

@EduEyras
Member

EduEyras commented Nov 2, 2024

Thanks.
The temp.0 suffix makes me think that the process is not running to completion.
Can you see whether there is a problem with any of the entries?
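
One way to look for problem entries is to scan each input file for rows that do not match the expected layout. The following is only a minimal sketch, not part of SUPPA: the function name `find_suspect_rows` is hypothetical, and it assumes the tab-separated layout visible in the `head` output earlier in the thread (a header of sample names, then an ID followed by one value per sample).

```python
def find_suspect_rows(lines):
    """Return (line_number, reason) pairs for rows that look malformed.

    Assumed layout (taken from the `head` output in this thread): the
    first line holds the sample names, and every later line holds an ID
    followed by one tab-separated numeric value per sample.
    """
    if not lines:
        return []
    n_samples = len(lines[0].rstrip("\n").split("\t"))
    suspects = []
    for number, line in enumerate(lines[1:], start=2):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != n_samples + 1:
            suspects.append(
                (number, f"expected {n_samples + 1} fields, found {len(fields)}")
            )
            continue
        for value in fields[1:]:
            try:
                float(value)  # "nan" parses as a float, "NA" does not
            except ValueError:
                suspects.append((number, f"non-numeric value {value!r}"))
                break
    return suspects
```

It could be applied to a file with, for example, `find_suspect_rows(open("suppa/psi/ActD_4h_SE.psi").readlines())`. Rows containing `NA` are reported as non-numeric, which may or may not be a problem for diffSplice.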

@MarekGierlinski

I cannot see any obvious problems with either the input or the output files. The output file whose head I showed (ActD_4h_SE.txt.dpsi.temp.0) contains 57342 lines, and there is no visible corruption. The input files also look clean.

The only unusual thing I noticed is that the PSI files contain rows with NAs, and the corresponding rows in the "psivec" and "dpsi" files contain NaNs. Here is an example:

> grep "ENSG00000293597;SE:1:169018002-169018130:169018303-169020943:-" psi/ActD_4h_SE.psi
ENSG00000293597;SE:1:169018002-169018130:169018303-169020943:-  NA      NA      NA      NA

 > grep "ENSG00000293597;SE:1:169018002-169018130:169018303-169020943:-" diff/ActD_4h_SE.txt.dpsi.temp.0 
ENSG00000293597;SE:1:169018002-169018130:169018303-169020943:-  nan     1.0000000000

 > grep "ENSG00000293597;SE:1:169018002-169018130:169018303-169020943:-" diff/ActD_4h_SE.txt.psivec 
ENSG00000293597;SE:1:169018002-169018130:169018303-169020943:-  nan     nan     nan     nan     nan     nan     nan     nan
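
To see how many events are affected, rather than checking one ID at a time with grep, something like the following could list every row whose values are all missing. This is a sketch under assumptions: `rows_all_missing` is a hypothetical helper, and it treats `NA` and `nan` (case-insensitively) as missing, matching the rows shown above.

```python
def rows_all_missing(lines, missing=("na", "nan")):
    """Return the IDs of rows in which every value is a missing marker.

    The first line is assumed to be the sample-name header and is skipped.
    Fields are split on any whitespace, since event IDs contain none.
    """
    ids = []
    for line in lines[1:]:
        fields = line.split()
        values = fields[1:]
        if values and all(v.lower() in missing for v in values):
            ids.append(fields[0])
    return ids
```

For example, `len(rows_all_missing(open("psi/ActD_4h_SE.psi").readlines()))` would give the number of events with no PSI value in any replicate.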

In the GTF file I found that the exon starting at 1:169018130 and ending at 1:169018303 belongs to two transcripts, ENST00000715538 and ENST00000715540. Neither of them is present in the Salmon output files. I'm guessing that the lack of Salmon quantification for these transcripts causes SUPPA to write NAs in the PSI file and, consequently, NaNs in the differential output. I don't know whether this can cause any problems.
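
The same check can be made for all events at once by comparing the transcript IDs in the .ioe file against those in the expression file. The sketch below is illustrative only: `missing_transcripts` is a hypothetical helper, and it assumes the column layouts shown in the `head` output earlier in the thread (event ID in the third .ioe column, comma-separated transcript lists in the fourth and fifth, and the transcript ID in the first column of the expression file).

```python
def missing_transcripts(ioe_lines, tpm_lines):
    """Map each event ID to the transcripts absent from the expression file.

    Both inputs are lists of lines whose first line is a header.
    Events whose transcripts are all quantified are left out of the result.
    """
    quantified = {
        line.split("\t")[0].strip() for line in tpm_lines[1:] if line.strip()
    }
    missing = {}
    for line in ioe_lines[1:]:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or truncated rows
        event_id = fields[2]
        transcripts = set(fields[3].split(",")) | set(fields[4].split(","))
        absent = sorted(transcripts - quantified)
        if absent:
            missing[event_id] = absent
    return missing
```

If the two files write transcript IDs differently (for example, with and without version suffixes), every event would be reported, so a very large result is itself a hint that the IDs do not match.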
