Add tolerations only to specific worker pods #539
Comments
No, the mpi-operator doesn't support configuring parameters for a specific worker.
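To illustrate why per-worker settings are not expressible, here is a minimal sketch that builds an MPIJob manifest as a plain Python dict. It assumes the `kubeflow.org/v2beta1` MPIJob layout, where all workers are stamped out from a single `Worker` pod template. The helper `build_mpijob`, the job name, the image name, and the `dedicated=special` taint are hypothetical placeholders, not taken from this thread.

```python
import json


def build_mpijob(name, worker_replicas, worker_tolerations):
    """Return an MPIJob manifest as a dict (hypothetical helper for illustration)."""

    def pod_template(container_name, tolerations=None):
        # The image name is a placeholder, not a real image.
        spec = {
            "containers": [
                {"name": container_name, "image": "example.com/mpi-app:latest"}
            ]
        }
        if tolerations:
            spec["tolerations"] = list(tolerations)
        return {"spec": spec}

    return {
        "apiVersion": "kubeflow.org/v2beta1",  # assumed API version
        "kind": "MPIJob",
        "metadata": {"name": name},
        "spec": {
            "mpiReplicaSpecs": {
                "Launcher": {"replicas": 1, "template": pod_template("launcher")},
                # One template for every worker replica: tolerations set here
                # apply to all workers, with no slot for worker 0 alone.
                "Worker": {
                    "replicas": worker_replicas,
                    "template": pod_template("worker", worker_tolerations),
                },
            }
        },
    }


job = build_mpijob(
    "example-job",
    4,
    [{"key": "dedicated", "operator": "Equal", "value": "special", "effect": "NoSchedule"}],
)
print(json.dumps(job, indent=2))
```

Because the toleration lives in the shared `Worker` template, every worker replica in this sketch tolerates the taint, which is broader than the single-worker placement asked about here.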
/kind question
Thank you very much for your information. Closing the issue.
Why is worker 0 special, in addition to the launcher already being "special"?
In my case, worker0 is special because rank0 needs to access some resources that only exist on specific nodes. It is true that the launcher is already special to some extent, but as I understand it, the launcher won't do any computation, right? BTW, is it possible to run rank0 on the launcher?
That is correct, the launcher just coordinates. Having the workers in their own pods has the advantage that the resources can be exclusive to the worker computations, as opposed to being shared with launcher tasks. Nothing prohibits the launcher pod from being on the same node as workers, but the isolation of pod namespaces gives you better control.
I wonder how common the specialization you mention is. Would we need to add support for an arbitrary number of pod templates?
/reopen
@alculquicondor: Reopened this issue.
This was also discussed in #384
I am currently facing a scenario in which I need to schedule the worker0 pod onto specific nodes. Is there any way to configure custom tolerations or labels only for specific worker pods (such as worker0)? Thank you in advance for any information.