How to reduce computational resource consumption #611

@tudouzi1111

I hope this message finds you well. I am currently working on a genome analysis project inspired by your paper "Taurine pangenome uncovers a segmental duplication upstream of KIT associated with depigmentation in white-headed cattle." Specifically, I am following a similar approach by partitioning the entire pangenome into windows and calculating the Jaccard similarity within each window.
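To make the per-window computation concrete, here is a minimal sketch of pairwise Jaccard similarity within one window. It assumes each path in the window is reduced to the set of node IDs it traverses. The path names and node IDs are made up for illustration, and the paper's exact similarity definition (for example, weighting nodes by length) may differ.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two node-ID sets."""
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


def pairwise_jaccard(nodes_by_path: dict) -> dict:
    """Return {(path_a, path_b): similarity} for every unordered pair of paths."""
    return {
        (p, q): jaccard(nodes_by_path[p], nodes_by_path[q])
        for p, q in combinations(sorted(nodes_by_path), 2)
    }


# Hypothetical node sets for three paths inside a single window.
window = {
    "A#chr1": {1, 2, 3, 4},
    "B#chr1": {2, 3, 4, 5},
    "C#chr1": {1, 2, 3, 4},
}

for pair, similarity in pairwise_jaccard(window).items():
    print(pair, round(similarity, 3))
```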

However, I have run into a significant problem with this step. The command `odgi extract -i pig.renamed.sorted.og -r A#chr1:1-1000 -o chr1_1_1000.window_subgraph.og` consumes a large amount of computational resources: extracting just one subgraph from the full pangenome graph requires around 10 CPU cores. My pangenome consists of 30 genomes, each approximately 2G in size, so extracting subgraphs in 1000 bp windows across the entire graph is computationally intensive and difficult to scale.
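To show how many extractions this implies, here is a minimal sketch that enumerates the window regions for one path, in the same `PATH:start-end` form as the `-r` argument above. The function name and the path length used here are hypothetical, and the coordinates simply follow the `1-1000` convention of the command as written.

```python
def window_regions(path_name: str, path_length: int, window: int = 1000):
    """Yield region strings covering the path in consecutive fixed-size windows."""
    for start in range(1, path_length + 1, window):
        # The last window is clipped to the end of the path.
        end = min(start + window - 1, path_length)
        yield f"{path_name}:{start}-{end}"


# Hypothetical short path, used only to show the output format.
regions = list(window_regions("A#chr1", 2500))
for region in regions:
    print(region)

print(len(regions), "windows, one extraction each")
```

Each region string corresponds to one separate `odgi extract` call in the current workflow, which is why the cost grows linearly with the number of windows.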

I was wondering if you might have any suggestions or alternative strategies to reduce the computational burden of this step, or to make the extraction process more efficient.

Thank you very much for your time and any insights you can share.
