Querying job_stats on large cluster(s) takes multiple seconds #30

Open
@mjurksa

Hello,

We have identified a performance issue with lustre_exporter when querying metrics on large Lustre file systems that hold a large number of jobstats entries. The problem seems to be that procfs.go repeatedly reads the same job_stats file in procfs, which results in a delay of 4-5 seconds per query.
I think it should be possible to open the file once and scan each line for the needed information, although I'm not well versed in Go and don't know whether this would require a significant refactor of the code.
Is this issue known? Are there any workarounds?
