UCT/CUDA_IPC: Add process namespace to cuda_ipc rkey #10968
base: master
    if (!ucs_sys_ns_is_default(UCS_SYS_NS_TYPE_PID)) {
        ext_rkey         = (uct_cuda_ipc_extended_rkey_t*)packed;
        ext_rkey->pid_ns = memh->pid_ns;

        ucs_assert(!(getpid() & UCT_CUDA_IPC_RKEY_FLAG_PID_NS));
        packed->pid |= UCT_CUDA_IPC_RKEY_FLAG_PID_NS;
@shasson5 IIUC, if a new version sends to an older version, the receiver is going to use an incorrect pid, right?
Is that intended, or do we assume there is no such scenario?
Yes, that's intended, but it shouldn't cause a problem IMO.
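The exchange above can be illustrated with a minimal sketch of the flag-in-PID scheme. This is not the UCX implementation: `legacy_rkey_t`, `extended_rkey_t`, `RKEY_FLAG_PID_NS` and the three helper functions are hypothetical stand-ins, the `handle` field is a placeholder for the rest of the rkey, and a `pid_ns` of 0 is assumed here to mean "default namespace".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for UCT_CUDA_IPC_RKEY_FLAG_PID_NS (bit 31). */
#define RKEY_FLAG_PID_NS (1u << 31)

/* Hypothetical legacy layout: what an older receiver expects. */
typedef struct {
    uint32_t pid;    /* sender PID; bit 31 doubles as the PID NS flag */
    uint64_t handle; /* placeholder for the rest of the rkey */
} legacy_rkey_t;

/* Hypothetical extended layout: legacy part first, namespace appended. */
typedef struct {
    legacy_rkey_t super;
    uint64_t      pid_ns; /* only on the wire when the flag is set */
} extended_rkey_t;

/* Pack an rkey into buf and return the number of bytes written. */
static size_t rkey_pack(void *buf, uint32_t pid, uint64_t pid_ns,
                        int ns_is_default)
{
    extended_rkey_t ext;

    memset(&ext, 0, sizeof(ext));
    assert(!(pid & RKEY_FLAG_PID_NS)); /* bit 31 must be free in a PID */
    ext.super.pid = pid;
    if (ns_is_default) {
        /* Default namespace: wire format is identical to the legacy one */
        memcpy(buf, &ext.super, sizeof(ext.super));
        return sizeof(ext.super);
    }

    ext.super.pid |= RKEY_FLAG_PID_NS;
    ext.pid_ns     = pid_ns;
    memcpy(buf, &ext, sizeof(ext));
    return sizeof(ext);
}

/* Namespace-aware receiver: strips the flag and reads pid_ns if present. */
static void rkey_unpack_new(const void *buf, uint32_t *pid, uint64_t *pid_ns)
{
    extended_rkey_t ext;

    memset(&ext, 0, sizeof(ext));
    memcpy(&ext.super, buf, sizeof(ext.super));
    *pid_ns = 0; /* assumed marker for "default namespace" */
    if (ext.super.pid & RKEY_FLAG_PID_NS) {
        memcpy(&ext, buf, sizeof(ext));
        *pid_ns = ext.pid_ns;
    }
    *pid = ext.super.pid & ~RKEY_FLAG_PID_NS;
}

/* Older receiver: unaware of the flag, takes the PID field verbatim. */
static uint32_t rkey_unpack_old(const void *buf)
{
    legacy_rkey_t rkey;

    memcpy(&rkey, buf, sizeof(rkey));
    return rkey.pid;
}
```

Under these assumptions, an rkey packed in the default namespace is byte-for-byte what an older receiver expects. Only a sender in a non-default namespace produces a PID with bit 31 set, which an older receiver would read as a wrong PID, matching the scenario raised in the review comment.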
src/uct/cuda/cuda_ipc/cuda_ipc_md.c
Outdated
    /* Indicates whether PID NS is contained in rkey */
    #define UCT_CUDA_IPC_RKEY_FLAG_PID_NS UCS_BIT(31)
I'd elaborate more on how it affects wire compatibility (or why it doesn't).
Walkthrough
The pull request extends CUDA IPC cache and memory management systems with process namespace (pid_ns) awareness. Changes include adding pid_ns fields to cache keys and memory handles, introducing an extended remote key type for backward compatibility, and updating all related function signatures and access paths to propagate and utilize namespace information.
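To show why a cache key needs the namespace, here is a minimal sketch of a key that carries `pid_ns` alongside the PID. The struct `peer_key_t` and the helpers `peer_key_equal` and `peer_key_hash` are hypothetical and do not reflect the actual fields or hashing used by the cuda_ipc cache.

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical cache key: a peer is identified by PID plus PID namespace. */
typedef struct {
    pid_t    pid;
    uint64_t pid_ns;
} peer_key_t;

/* Two keys refer to the same peer only if both PID and namespace match. */
static int peer_key_equal(const peer_key_t *a, const peer_key_t *b)
{
    return (a->pid == b->pid) && (a->pid_ns == b->pid_ns);
}

/* Simple illustrative hash mixing both fields (not the UCX hash). */
static uint64_t peer_key_hash(const peer_key_t *key)
{
    uint64_t h = (uint64_t)key->pid * 0x9E3779B97F4A7C15ull;

    return h ^ (key->pid_ns + (h << 6) + (h >> 2));
}
```

With a PID-only key, two processes that share a PID in different containers would collide on one cache entry. Comparing the namespace as well keeps their entries apart.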
Estimated code review effort: 4 (Complex), ~45 minutes
/azp run
Azure Pipelines successfully started running 5 pipeline(s).
What?
Add process namespace to cuda_ipc rkey
Why?
Support running across multiple containers, where the same process ID can be in use in different containers at the same time.
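For context, on Linux a process can obtain an identifier for its PID namespace from the inode of `/proc/self/ns/pid`. The sketch below is one way to read it. The helper name `read_pid_ns` is hypothetical, and it is an assumption that this resembles how the namespace value carried in the rkey is obtained.

```c
#include <stdint.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: stores the PID namespace id of the calling process.
 * Returns 0 on success, -1 if /proc/self/ns/pid cannot be read
 * (for example on a non-Linux system or with /proc unmounted).
 */
static int read_pid_ns(uint64_t *pid_ns)
{
    struct stat st;

    /* stat() follows the symlink; st_ino is the namespace's inode number */
    if (stat("/proc/self/ns/pid", &st) != 0) {
        return -1;
    }

    *pid_ns = (uint64_t)st.st_ino;
    return 0;
}
```

Two processes in the same PID namespace see the same inode number, while processes in different containers normally see different ones, so the pair (pid, pid_ns) can tell apart processes that share a PID.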