Open
Labels
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
question: Categorizes issue or PR as a support question.
Description
After a node restarts, pods are not guaranteed to come back up in any particular order. As a result, pods that start before nvidia-container-toolkit, including the device plugin, may be unable to access the GPU devices.
Could you please provide some guidance on how to handle this situation? Is there any configuration on the gpu-operator side to ensure that nvidia-container-toolkit starts first?
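To make the ordering problem concrete, one generic workaround is to have a dependent pod block (for example in an init container) until some readiness marker exists on the node. The following is only a minimal sketch under stated assumptions: the function name `wait_for_marker` and the marker path in the usage note are hypothetical illustrations, and nothing in this issue confirms that gpu-operator exposes such a marker at that location.

```python
import os
import time


def wait_for_marker(path, timeout_s=300.0, poll_s=2.0):
    """Poll until `path` exists.

    Returns True as soon as the path is present, or False if `timeout_s`
    seconds pass without it appearing. `path` is whatever readiness marker
    your environment provides (an assumption, not a documented contract).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        # Sleep between checks so the wait does not spin the CPU.
        time.sleep(poll_s)
```

A caller could run something like `wait_for_marker("/run/nvidia/validations/toolkit-ready")` (hypothetical path, please verify against your own nodes) and exit non-zero on `False`, so that Kubernetes restarts the pod instead of letting it start without GPU access.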
Thanks.