[GraphBolt] Improve sample_neighbors() on CPU with prob/mask by 6.47%, fixing #7462 #7794
Update: simply removing the .squeeze(1) call will also work, improving performance by 6.47%.
@az15240 A 6.5% speedup from such a simple change is quite good, but I wouldn't say this change fixes the issue altogether. Do you think there are other opportunities to speed it up further?
@mfbalin I think the main reason for the 14x slowdown is that, for certain combinations of parameters, the sampling time is too fast. It is not necessarily because our sampling is bad. In regression, when
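One way to read the point above: if a code path adds a roughly fixed per-call cost, the relative slowdown looks dramatic when the baseline run is very short and shrinks as the baseline grows. The sketch below illustrates that arithmetic only. The function name `slowdown_ratio`, the fixed-overhead model, and every number in it are hypothetical and are not taken from the regression results or from benchmark_graphbolt_sampling.py.

```python
def slowdown_ratio(baseline_s, fixed_overhead_s):
    """Return (baseline + fixed overhead) / baseline, i.e. the apparent slowdown."""
    return (baseline_s + fixed_overhead_s) / baseline_s


# Hypothetical fixed per-call cost in seconds (an assumption, not measured data).
overhead = 0.013

# The same absolute overhead gives very different ratios as the baseline grows.
for baseline in (0.001, 0.01, 0.1, 1.0):
    ratio = slowdown_ratio(baseline, overhead)
    print(f"baseline={baseline:.3f}s  apparent slowdown={ratio:.2f}x")
```

Under this assumed model, a large ratio on a very fast configuration says little about the absolute time lost, so comparing absolute differences alongside ratios may give a fairer picture.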
Description
A one-line change in NonUniformPickOp to make prob/mask sampling faster.
Runtime data is supported by this sheet, obtained by running benchmark_graphbolt_sampling.py locally. Note that this function only works for prob/mask, so I left out data for cases where probs=None.
Checklist
Please feel free to remove inapplicable items for your PR.
Changes