What is the expected repeatability of cugraph.leiden? #4072

Closed
@jpintar

What is your question?

I know that exact repeatability cannot be expected across multiple runs of cugraph.leiden, even when random_state is set to a constant. But how similar can we expect the results to be?

I've repeated 10 runs on a 15,000-vertex graph on 13 different GPUs, and got an average adjusted Rand index of 0.82 across the runs on a given GPU (min: 0.69, max: 0.95). That corresponds to a difference of ±1 cluster detected on average (and up to a 6-cluster difference in the worst case) for this graph. Is that as consistent as we can hope for?
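For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of one way to summarise run-to-run agreement. It assumes the figures above are pairwise adjusted Rand index values between runs, which the post does not state explicitly. The helper names `adjusted_rand_index` and `summarize_runs` are hypothetical and not part of cuGraph, and the index is computed with the standard library only so the snippet runs without a GPU.

```python
from collections import Counter
from itertools import combinations
from math import comb
from statistics import mean


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same vertices.

    Both arguments are equal-length sequences of cluster labels, ordered
    by the same vertex ordering.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    # Vertex pairs that share a cluster in both partitions, in A, and in B.
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    total_pairs = comb(n, 2)
    expected = pairs_a * pairs_b / total_pairs if total_pairs else 0.0
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:
        # Degenerate case (e.g. both partitions are a single cluster).
        return 1.0
    return (pairs_joint - expected) / (max_index - expected)


def summarize_runs(runs):
    """Mean, min and max ARI over all unordered pairs of runs."""
    scores = [adjusted_rand_index(a, b) for a, b in combinations(runs, 2)]
    return {"mean": mean(scores), "min": min(scores), "max": max(scores)}


# Assumed way to collect one label list per run (needs a GPU, so left as a comment):
#   parts, _ = cugraph.leiden(G, random_state=42)
#   labels = parts.sort_values("vertex")["partition"].to_pandas().tolist()
# Sorting by vertex matters: the index compares labels position by position.

if __name__ == "__main__":
    toy_runs = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
    print(summarize_runs(toy_runs))
```

The index is invariant to how clusters are numbered, so two runs that find the same communities under different partition IDs still score 1.0.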

Code of Conduct

  • I agree to follow cuGraph's Code of Conduct
  • I have searched the open issues and have found no duplicates for this question

Labels: question
