Is your feature request related to a problem? Please describe.
{
  // mr here is rmm::mr::pool_memory_resource<rmm::mr::pinned_memory_resource>
  rmm::device_uvector<int> tmps(1024, stream, mr);
  thrust::for_each(rmm::exec_policy_nosync(stream), tmps.begin(), tmps.end(),
                   [] __device__(auto val) { ... });
}
Here, users may expect tmps to follow stream semantics, but it is deallocated as soon as the variable goes out of scope, which is before the asynchronous thrust::for_each call actually finishes.
This behavior can be surprising, and the fix requires an explicit stream synchronization, which may add unnecessary performance overhead.
Describe the solution you'd like
rmm::mr::pool_memory_resource should follow stream semantics even when the upstream memory allocator is the (host) pinned_memory_resource.
Describe alternatives you've considered
Explicitly calling cudaStreamSynchronize() on the stream before tmps goes out of scope.
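
To make the three behaviors discussed above concrete, here is a host-only toy model in plain C++. It is only a sketch and does not use RMM or CUDA: ToyStream, ToyBlock, ToyBuffer, FreePolicy, and kernel_saw_live_buffer are all hypothetical names made up for illustration. The "stream" is a queue that runs its work only on synchronize(), and "freeing" a buffer just flips a flag, so the model shows the ordering problem without claiming anything about how pool_memory_resource is actually implemented.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Toy stand-in for a CUDA stream: tasks are queued and run only on synchronize().
struct ToyStream {
  std::deque<std::function<void()>> queue;
  void enqueue(std::function<void()> task) { queue.push_back(std::move(task)); }
  void synchronize() {
    while (!queue.empty()) {
      auto task = std::move(queue.front());
      queue.pop_front();
      task();
    }
  }
};

// Toy allocation: `live` models whether the memory is still valid to use.
struct ToyBlock {
  bool live = true;
};

// Immediate: free in the destructor right away (the surprising behavior).
// StreamOrdered: enqueue the free behind earlier work (the requested behavior).
enum class FreePolicy { Immediate, StreamOrdered };

// RAII owner, loosely analogous to the `tmps` variable in the issue.
class ToyBuffer {
 public:
  ToyBuffer(ToyStream& stream, FreePolicy policy)
      : stream_(stream), policy_(policy), block_(std::make_shared<ToyBlock>()) {}

  ~ToyBuffer() {
    if (policy_ == FreePolicy::Immediate) {
      block_->live = false;  // freed now, even if queued work still needs it
    } else {
      auto block = block_;
      stream_.enqueue([block] { block->live = false; });  // freed in stream order
    }
  }

  std::shared_ptr<ToyBlock> block() const { return block_; }

 private:
  ToyStream& stream_;
  FreePolicy policy_;
  std::shared_ptr<ToyBlock> block_;
};

// Returns true if the queued "kernel" found the buffer still live when it ran.
bool kernel_saw_live_buffer(FreePolicy policy, bool sync_before_scope_exit) {
  ToyStream stream;
  bool saw_live = false;
  {
    ToyBuffer tmps(stream, policy);
    auto block = tmps.block();
    // Stand-in for the asynchronous thrust::for_each launch.
    stream.enqueue([block, &saw_live] { saw_live = block->live; });
    // Stand-in for the explicit synchronization workaround.
    if (sync_before_scope_exit) { stream.synchronize(); }
  }  // tmps is destroyed here
  stream.synchronize();
  return saw_live;
}

// Exercises the three cases: the hazard, the workaround, and the requested behavior.
void demo() {
  assert(!kernel_saw_live_buffer(FreePolicy::Immediate, false));
  assert(kernel_saw_live_buffer(FreePolicy::Immediate, true));
  assert(kernel_saw_live_buffer(FreePolicy::StreamOrdered, false));
}
```

In this model, only the Immediate policy without a synchronization lets the queued work observe a freed buffer. The synchronization workaround avoids that at the cost of blocking the caller, while the StreamOrdered policy avoids it without blocking, which is the behavior this request asks for.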