What is your question?
I know that one cannot expect exact repeatability over multiple runs of cugraph.leiden (even when setting random_state to a constant). But how similar can we rely on the results being?
I've repeated 10 runs on a 15,000-vertex graph on 13 different GPUs, and got an average adjusted Rand index (ARI) of 0.82 across the runs on a given GPU (min: 0.69, max: 0.95). For this graph, that corresponds to the number of detected clusters differing by ±1 on average (and by up to 6 clusters in the worst case). Is that as consistent as we can hope for?
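For reference, here is a minimal sketch of how the run-to-run agreement could be summarized. It is not the exact script behind the numbers above: `adjusted_rand_index` and `summarize_runs` are hypothetical helper names, and the adjusted Rand index is written out in pure Python so the snippet runs without a GPU. Each partition is assumed to be a list of cluster labels in the same vertex order.

```python
from collections import Counter
from itertools import combinations
from math import comb
from statistics import mean


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same vertices."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same vertices")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two vertices: nothing to disagree on

    # Vertex pairs placed together in both labelings, in A only, in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # both labelings are trivial (one cluster, or all singletons)
    return (together_both - expected) / (maximum - expected)


def summarize_runs(partitions):
    """Mean, min and max ARI over all pairs of runs (hypothetical helper)."""
    scores = [adjusted_rand_index(a, b) for a, b in combinations(partitions, 2)]
    return mean(scores), min(scores), max(scores)


if __name__ == "__main__":
    # Toy stand-in for repeated Leiden runs on a 6-vertex graph.
    runs = [
        [0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0],  # same clustering, labels swapped
        [0, 0, 1, 1, 2, 2],  # a genuinely different clustering
    ]
    avg, lo, hi = summarize_runs(runs)
    print(f"ARI mean={avg:.2f} min={lo:.2f} max={hi:.2f}")
```

With cuGraph, each entry of `runs` would presumably come from the partition table returned by one `cugraph.leiden` call, sorted by vertex so that labels line up across runs; that wiring is an assumption and is left out here so the sketch stays runnable on CPU.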
Code of Conduct
- I agree to follow cuGraph's Code of Conduct
- I have searched the open issues and have found no duplicates for this question