UTF-8 encoded canopus.tsv #3

@askerdb

Hi,

Thanks for making this software!

I ran the following:

from canopus import Canopus
C = Canopus(sirius="/emc/cbmr/users/chs962/mani_diet_sirius_aligned_canopus")
C.npcSummary().to_csv("npc_summary.csv")

I ran it on a workspace generated with SIRIUS 4.8.2.

It failed with an error about parsing ASCII, which I unfortunately didn't save,
presumably because canopus.tsv is UTF-8 encoded:
-bash-4.2$ file /emc/cbmr/users/chs962/mani_diet_sirius_aligned_canopus/canopus.tsv
/emc/cbmr/users/chs962/mani_diet_sirius_aligned_canopus/canopus.tsv: UTF-8 Unicode text, with very long lines

I applied the following fix:
@@ -771,7 +772,8 @@ class SiriusWorkspace(object):

     def load_ontology_index(self):
         mapping = dict()
-        with Path(self.rootdir, "canopus.tsv").open() as fhandle:
+        print(Path(self.rootdir, "canopus.tsv"))
+        with Path(self.rootdir, "canopus.tsv").open(encoding="utf-8") as fhandle:
             header=None
             ri = None
             coid = None
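
For reference, here is a minimal sketch of what I think is going on. When no `encoding` argument is given, `Path.open()` uses the locale's preferred encoding, which, depending on the Python version and locale settings, may be ASCII. The file name `canopus_sample.tsv` and its contents are made up for illustration and are not taken from a real SIRIUS workspace, and `read_text` is just a hypothetical helper.

```python
import tempfile
from pathlib import Path


def read_text(path, encoding):
    """Read a whole text file using an explicit encoding."""
    with Path(path).open(encoding=encoding) as fhandle:
        return fhandle.read()


# Hypothetical stand-in for canopus.tsv: a UTF-8 file with a non-ASCII character.
tmpdir = tempfile.mkdtemp()
sample = Path(tmpdir, "canopus_sample.tsv")
sample.write_bytes("name\tdescription\nα-amino acid\texample\n".encode("utf-8"))

# Decoding as ASCII fails on the first non-ASCII byte.
try:
    read_text(sample, "ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding as UTF-8 works regardless of the locale.
utf8_text = read_text(sample, "utf-8")
```

Passing `encoding="utf-8"` explicitly, as in the diff above, makes the read independent of the environment the script runs in.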
    
