[FEA] Expose runtime configurability of default stream behavior #17626
Labels
feature request
New feature or request
libcudf
Affects libcudf (C++/CUDA) code.
pylibcudf
Issues specific to the pylibcudf package
Is your feature request related to a problem? Please describe.
Currently all of libcudf operates on the default stream (stream 0) by default, and on cudaStreamPerThread if compiled with CUDF_USE_PER_THREAD_DEFAULT_STREAM. Some consumers of libcudf wish to use the per-thread default stream (PTDS) instead, for reasons such as improved performance. Historically, we have supported this by compiling with CUDA_API_PER_THREAD_DEFAULT_STREAM and CUDF_USE_PER_THREAD_DEFAULT_STREAM because compile-time control was the only reasonable way to achieve this, and consumers like spark-rapids leverage this. However, as #13744 comes to a close we will have a fully stream-ordered API that is also completely tested to ensure that streams are passed through everywhere, so that nothing unintentionally runs on the default stream when the user provides one. This affords us some additional options when it comes to enabling PTDS behavior.
Describe the solution you'd like
We should modify get_default_stream to support runtime configurability, so that it can be made to mean PTDS instead of every thread running on the default stream. This could easily be done in a thread-safe manner using a function-local static whose initializer reads an environment variable. Instead of an environment variable, we could just as easily expose a public API that sets some configuration and must be called before the first call to get_default_stream. The end result would be that we would entirely control the default stream behavior at runtime without needing to build separate binaries to support PTDS. This would allow us to support various newer higher-level APIs, such as pylibcudf, while still supporting Spark's needs.
Describe alternatives you've considered
We could ship separate binaries compiled for PTDS, or change our default to always build with PTDS. The former has generally been rejected on the grounds that it requires double the resources. The latter was previously attempted but rejected because PTDS builds of cudf are not safe drop-in replacements for non-PTDS builds: PTDS allows for race conditions that would not be possible with non-PTDS builds.
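A minimal sketch of the function-local-static approach described in the proposed solution, assuming an environment variable selects the behavior. The names stream_kind, read_stream_kind_from_env, and EXAMPLE_CUDF_USE_PTDS are hypothetical, and stream_kind stands in for rmm::cuda_stream_view so that the sketch compiles without CUDA. It is an illustration under those assumptions, not libcudf's actual implementation.

```cpp
#include <cstdlib>
#include <string>

// Stand-in for rmm::cuda_stream_view so the sketch compiles without CUDA.
// (Hypothetical type, used only for illustration.)
enum class stream_kind { legacy_default, per_thread };

// Hypothetical environment variable name; not an existing libcudf setting.
constexpr char const* ptds_env_var = "EXAMPLE_CUDF_USE_PTDS";

// Reads the environment and decides which stream kind to use.
// Only the exact value "1" enables PTDS in this sketch.
inline stream_kind read_stream_kind_from_env()
{
  char const* value   = std::getenv(ptds_env_var);
  bool const use_ptds = value != nullptr && std::string{value} == "1";
  return use_ptds ? stream_kind::per_thread : stream_kind::legacy_default;
}

// Function-local static: since C++11 the initializer runs exactly once,
// even if several threads make the first call concurrently.
inline stream_kind get_default_stream()
{
  static stream_kind const kind = read_stream_kind_from_env();
  return kind;
}
```

Because the static is initialized on the first call, changing the environment variable afterwards has no effect on get_default_stream. A public setter would face the same constraint, which is why it would have to be called before the first call to get_default_stream.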