No TAXID for E. coli and incompatibility of the E. coli-specific annotation file #306

@FH96

Hi,

This is my first time doing GO term analysis, so I'm new to it. I want to use goatools for E. coli, but no E. coli taxID is in the gene2go file that is downloaded by default. I tried 562 (E. coli in general), 83333 (E. coli K-12), and 469008 (E. coli BL21), and each one fails with "AssertionError: **FATAL: NO TAXIDS: gene2go".

I therefore downloaded the E. coli annotation file from the Gene Ontology database. However, this GAF file is formatted differently from the gene2go file, and reading it fails with the following error:

**NOTE: DEFAULT TAXID STORED FROM gene2go IS 9606 (human)
1
0) tax_id UniProtKB
1) DB_ID A0A385XJ53
2) GO_ID insA9
3) Evidence_Code involved_in
4) Qualifier GO:0006310
5) GO_term GO_REF:0000043
6) DB_Reference IEA
7) NS UniProtKB-KW:KW-0233
Traceback (most recent call last):
  File "/Users/fh/Documents/project/python code/venv/lib/python3.12/site-packages/goatools/anno/init/reader_genetogo.py", line 74, in init_associations
    taxid = int(vals[0])
            ^^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: 'UniProtKB'
**FATAL: invalid literal for int() with base 10: 'UniProtKB'
**FATAL: ecocyc2.gaf[1]:
UniProtKB A0A385XJ53 insA9 involved_in GO:0006310 GO_REF:0000043 IEA UniProtKB-KW:KW-0233 P Insertion element IS1 9 protein InsA insA9|b4709 protein taxon:83333 20240729 UniProt
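
To illustrate the mismatch, here is a minimal standalone sketch (plain Python, not goatools code). It labels the failing line with the first 15 column names of the GAF format, which shows that the taxon sits in column 13 rather than column 1. The names `GAF_COLUMNS` and `parse_gaf_line` are hypothetical, and the sketch assumes the fields in `ecocyc2.gaf` are tab-separated, as GAF files normally are.

```python
# First 15 GAF columns, in file order (names chosen for this sketch).
GAF_COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
    "DB_Reference", "Evidence_Code", "With_From", "Aspect",
    "DB_Object_Name", "DB_Object_Synonym", "DB_Object_Type",
    "Taxon", "Date", "Assigned_By",
]


def parse_gaf_line(line):
    """Split one tab-separated GAF annotation line into a dict keyed by column name."""
    vals = line.rstrip("\n").split("\t")
    return dict(zip(GAF_COLUMNS, vals))


# The line from the error message, with tab separators assumed.
line = "\t".join([
    "UniProtKB", "A0A385XJ53", "insA9", "involved_in", "GO:0006310",
    "GO_REF:0000043", "IEA", "UniProtKB-KW:KW-0233", "P",
    "Insertion element IS1 9 protein InsA", "insA9|b4709", "protein",
    "taxon:83333", "20240729", "UniProt",
])

rec = parse_gaf_line(line)

# A gene2go-style reader expects an integer tax_id in the first column,
# but the first GAF column is the database name, so int() fails.
try:
    int(rec["DB"])
except ValueError as err:
    print("first column is not a taxid:", err)

# The taxon is in column 13, written as "taxon:<id>" (several ids may be joined by "|").
taxid = int(rec["Taxon"].split("|")[0].split(":")[1])
print(taxid, rec["DB_Object_ID"], rec["GO_ID"], rec["Evidence_Code"])
```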

I can probably rearrange the columns of the E. coli file to match what Gene2GoReader expects, except for "DB reference", which I'm not sure how to map.
I'm also wondering whether there is another way to get a compatible file for E. coli.
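
One possible alternative to rearranging the file is to read the GAF lines directly into a gene-to-GO-terms mapping. This sidesteps the "DB reference" column entirely, since it is not needed for the mapping. The sketch below uses only the standard library; `read_gaf_associations` is a hypothetical helper, not a goatools function, and the second annotation line in the demo is made up to show the taxon filter. Whether the resulting dictionary can be passed to goatools as-is should be checked against the goatools documentation.

```python
from collections import defaultdict


def read_gaf_associations(lines, taxid=None, aspect=None):
    """Return {DB_Object_ID: set of GO IDs} from an iterable of GAF lines.

    taxid  -- keep only annotations whose Taxon column contains this id
    aspect -- keep only one aspect: "P", "F" or "C"
    """
    id2gos = defaultdict(set)
    for line in lines:
        # Skip blank lines and "!" header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        vals = line.rstrip("\n").split("\t")
        if len(vals) < 15:
            continue
        # Column 4 (Qualifier): skip negated annotations such as "NOT|involved_in".
        if "NOT" in vals[3].split("|"):
            continue
        # Column 9 (Aspect) and column 13 (Taxon, e.g. "taxon:83333").
        if aspect is not None and vals[8] != aspect:
            continue
        taxa = {int(t.split(":")[1]) for t in vals[12].split("|")}
        if taxid is not None and taxid not in taxa:
            continue
        id2gos[vals[1]].add(vals[4])
    return dict(id2gos)


demo_lines = [
    "!gaf-version: 2.2\n",
    # The line from ecocyc2.gaf shown in the error message (tabs assumed).
    "\t".join([
        "UniProtKB", "A0A385XJ53", "insA9", "involved_in", "GO:0006310",
        "GO_REF:0000043", "IEA", "UniProtKB-KW:KW-0233", "P",
        "Insertion element IS1 9 protein InsA", "insA9|b4709", "protein",
        "taxon:83333", "20240729", "UniProt",
    ]) + "\n",
    # Hypothetical line with a different taxon, only to exercise the filter.
    "\t".join([
        "UniProtKB", "X00000", "demo", "involved_in", "GO:0006310",
        "GO_REF:0000043", "IEA", "", "P", "made-up entry", "", "protein",
        "taxon:9606", "20240729", "UniProt",
    ]) + "\n",
]

id2gos = read_gaf_associations(demo_lines, taxid=83333, aspect="P")
print(id2gos)
```

For the real file, the same function could be called on an open file handle, for example `with open("ecocyc2.gaf") as fh: id2gos = read_gaf_associations(fh, taxid=83333)`.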

Thank you.
