'IndexError: list index out of range' when running epi.tl.geneactivity() #104

pabloswfly · 2021-08-12T15:39:51Z

Hi,

I am getting the following error when I run epi.tl.geneactivity():

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_105708/3365853572.py in <module>
----> 1 epi.tl.geneactivity(episcanpy_atac, gtf_file, key_added="gene_scores")

~/miniconda3/envs/csg.p/lib/python3.9/site-packages/episcanpy/tools/_geneactivity.py in geneactivity(adata, gtf_file, key_added, upstream, feature_type, annotation, layer_name, raw, copy)
     73         line = line.split('_')
     74         if line[0] not in raw_adata_features.keys():
---> 75             raw_adata_features[line[0]] = [[int(line[1]),int(line[2]), feature_index]]
     76         else:
     77             raw_adata_features[line[0]].append([int(line[1]),int(line[2]), feature_index])

IndexError: list index out of range

This has to do with the way _geneactivity.py iterates over the feature names from the AnnData object. Looking at the code, line = line.split('_') expects each feature name to consist of underscore-separated fields (chromosome, start, end), but in my case each line value looks like this:

chr1:629499-630394
chr1:633580-634634
chr1:778282-779198
chr1:816872-817778
chr1:827063-827952
chr1:844145-844994
chr1:869467-870372
chr1:904350-905199
chr1:920760-921655

So there is no '_' character to split on, the split returns a single element, and line[1] does not exist.
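
For illustration, here is a minimal sketch that reproduces the failure outside episcanpy and shows one way to parse both naming schemes. The helper names split_like_geneactivity and parse_region and the regular expression are hypothetical and not part of episcanpy. The only thing taken from the traceback is the split('_') call.

```python
import re

# Matches "chrom:start-end" or "chrom_start_end". This pattern is an assumption
# based on the feature names shown in this thread, not episcanpy's own parser.
REGION = re.compile(r"^(.+?)[:_](\d+)[-_](\d+)$")


def split_like_geneactivity(name):
    """Mimic the split seen in the traceback: fields separated by '_'."""
    return name.split("_")


def parse_region(name):
    """Return (chromosome, start, end) for either naming scheme."""
    m = REGION.match(name)
    if m is None:
        raise ValueError(f"unrecognised feature name: {name!r}")
    chrom, start, end = m.groups()
    return chrom, int(start), int(end)


fields = split_like_geneactivity("chr1:629499-630394")
print(fields)       # ['chr1:629499-630394'], so fields[1] raises IndexError
print(parse_region("chr1:629499-630394"))   # ('chr1', 629499, 630394)
print(parse_region("chr1_629499_630394"))   # ('chr1', 629499, 630394)
```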

I tried tweaking the code to extract the start and end positions of each feature myself when building the raw_adata_features dictionary, but when I do this I get an empty gene_activity_X matrix later on. I am using the same GTF file as in the example, gencode.v36.annotation.gtf.

Can you help me with this? Thanks!

HelloWorldLTY · 2022-09-20T03:02:56Z

Hi, I think this code will help you:
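var_list = []

for i in adata_atac.var_names:
    # turn "chr1:629499-630394" into "chr1_629499_630394"
    new_i = i.replace('-', '_').replace(':', '_')
    var_list.append(new_i)

# assign the renamed features back before calling epi.tl.geneactivity()
adata_atac.var_names = var_list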

But the number of genes it then finds seems quite small...

DaneseAnna · 2022-09-20T09:29:11Z

Hi,

If the number of genes you obtain is quite small, you should check whether the features in adata.var are sorted by genomic coordinates.

Best,
Anna

HelloWorldLTY · 2022-09-20T12:59:58Z

Hi, I do not quite understand what you mean. Do I need to sort adata.var_names beforehand? Do you have any tutorials covering this part? Thanks.
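
The thread does not spell out what "sorted by coordinates" means in practice. One possible reading, sketched below, is to order features by chromosome, then start, then end. The helpers coordinate_key, sorted_order and is_sorted are hypothetical names, not episcanpy functions. The sketch assumes the feature names have already been converted to the chrom_start_end form, and the choice to put numbered chromosomes before chrX/chrY is an assumption rather than a documented requirement.

```python
import re

# Feature names of the form "chrom_start_end" (assumed, see the renaming above).
FEATURE = re.compile(r"^(?P<chrom>.+)_(?P<start>\d+)_(?P<end>\d+)$")


def coordinate_key(name):
    """Sort key: chromosome (numeric ones first), then start, then end."""
    m = FEATURE.match(name)
    if m is None:
        raise ValueError(f"unrecognised feature name: {name!r}")
    chrom = m.group("chrom")
    label = chrom[3:] if chrom.startswith("chr") else chrom
    # Numbered chromosomes sort by number; everything else sorts alphabetically after.
    chrom_rank = (0, int(label), "") if label.isdigit() else (1, 0, label)
    return (chrom_rank, int(m.group("start")), int(m.group("end")))


def sorted_order(names):
    """Indices that would put the feature names in coordinate order."""
    return sorted(range(len(names)), key=lambda i: coordinate_key(names[i]))


def is_sorted(names):
    """True if the feature names are already in coordinate order."""
    keys = [coordinate_key(n) for n in names]
    return all(a <= b for a, b in zip(keys, keys[1:]))


names = ["chr2_100_200", "chr1_500_600", "chr10_1_5", "chr1_100_200", "chrX_1_2"]
order = sorted_order(names)
print(is_sorted(names))                      # False
print([names[i] for i in order])
# With an AnnData object, the same order could be applied to the columns,
# e.g. adata_atac = adata_atac[:, order].copy()
```

Checking is_sorted(list(adata_atac.var_names)) first shows whether reordering is needed at all.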
