[FEA] Heuristics for running kudo serialization optimally.

We added kudo GPU serialization as an optional feature, off by default, in https://github.com/NVIDIA/spark-rapids/pull/13078. We are concerned that when the GPU is already fully utilized, moving serialization onto the GPU could hurt more than it helps. We therefore want to experiment with heuristics that fall back to CPU kudo in those scenarios and use GPU kudo otherwise. For now we are only looking at the write side, which is what the PR above covers; later we will want to evaluate the read side as well.
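
As a starting point for discussion, here is a minimal sketch of what such a heuristic could look like. Everything in it is hypothetical: the class `KudoSerializerChooser`, the `busyThreshold` parameter, and the idea of feeding in a single GPU utilization estimate in the range 0.0 to 1.0 are assumptions for illustration, not anything from the PR or the plugin. How that utilization estimate would actually be obtained is exactly what still needs experimentation.

```java
// Hypothetical sketch only: none of these names exist in spark-rapids.
final class KudoSerializerChooser {
    enum Mode { CPU, GPU }

    // Assumed tunable: utilization at or above this value counts as "GPU is busy".
    private final double busyThreshold;

    KudoSerializerChooser(double busyThreshold) {
        if (busyThreshold < 0.0 || busyThreshold > 1.0) {
            throw new IllegalArgumentException("busyThreshold must be in [0.0, 1.0]");
        }
        this.busyThreshold = busyThreshold;
    }

    /**
     * Picks the serialization path for one write.
     *
     * @param gpuKudoEnabled whether the optional GPU kudo feature is turned on
     * @param gpuUtilization assumed estimate of current GPU load, 0.0 (idle) to 1.0 (saturated)
     */
    Mode choose(boolean gpuKudoEnabled, double gpuUtilization) {
        // The feature is off by default, so CPU kudo stays the baseline.
        if (!gpuKudoEnabled) {
            return Mode.CPU;
        }
        // No usable signal: stay on the conservative CPU path.
        if (Double.isNaN(gpuUtilization)) {
            return Mode.CPU;
        }
        // When the GPU is already busy, avoid adding serialization work to it.
        return gpuUtilization >= busyThreshold ? Mode.CPU : Mode.GPU;
    }
}
```

With a threshold of 0.9, for example, `choose(true, 0.95)` selects CPU kudo and `choose(true, 0.40)` selects GPU kudo. A single threshold is the simplest possible policy; a real heuristic might also need to consider batch size or how many tasks are waiting on the GPU.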

Additionally, we want a more general picture of how GPU kudo performs relative to CPU kudo across different workloads, and we want to see whether there are other optimizations to make before turning it on more broadly.

[FEA] Heuristics for running kudo serialization optimally. #13171