scheduler: perform feasibility checks for system canaries before computing placements #26953
base: main
Conversation
Canaries for system jobs are placed on a `tg.update.canary` percentage of eligible nodes. Some of those nodes may not be feasible, and until now we removed infeasible nodes during placement computation. This meant that if the first eligible node we picked for a canary happened to be infeasible, the scheduler would halt the deployment.

The solution presented here simplifies canary deployments: initially, system jobs that use canary updates get allocations placed on all eligible nodes, but before we start computing actual placements, a method called `evictCanaries` is called (much like `evictAndPlace` is for honoring MaxParallel), which performs a feasibility check on each node, up to the number of required canaries per task group. Feasibility checks are expensive, but this way we only check all the nodes in the worst case (canary=100); otherwise we stop checking once we know we can place enough canaries.
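A minimal sketch of the early-exit loop described above, assuming hypothetical names (`pickCanaryNodes`, `feasible`, and the `Node` type are illustrative, not Nomad's actual scheduler API):

```go
package main

import "fmt"

// Node stands in for a Nomad client node.
type Node struct{ ID string }

// pickCanaryNodes walks the eligible nodes in order and keeps the first
// `required` feasible ones. In the worst case (canary=100%) every node is
// checked, but otherwise the loop exits as soon as enough canaries can be
// placed, which is the cost-saving behavior described above.
func pickCanaryNodes(nodes []*Node, required int, feasible func(*Node) bool) []*Node {
	picked := make([]*Node, 0, required)
	for _, n := range nodes {
		if len(picked) >= required {
			break // enough canaries; skip further expensive checks
		}
		if feasible(n) {
			picked = append(picked, n)
		}
	}
	return picked
}

func main() {
	nodes := []*Node{{"a"}, {"b"}, {"c"}, {"d"}}
	// Pretend node "a" is infeasible; the scheduler should skip it rather
	// than halt the deployment.
	feasible := func(n *Node) bool { return n.ID != "a" }
	for _, n := range pickCanaryNodes(nodes, 2, feasible) {
		fmt.Println("canary on node", n.ID) // b, c
	}
}
```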
```go
// we only know the total amount of placements once we filter out
// infeasible nodes, so for system jobs we do it backwards a bit: the
// "desired" total is the total we were able to place.
if s.deployment != nil {
	s.deployment.TaskGroups[tgName].DesiredTotal += 1
}
```
For system jobs I think we need to make sure we're working from a blank-slate `dstate` for each evaluation. Incrementing this here is adding on top of the desired total from previous evals.
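One hedged reading of this suggestion, with `DeploymentState` and `resetDesiredTotals` as hypothetical stand-ins for the real types:

```go
package main

// DeploymentState mirrors the fields used in the diff above; this is an
// illustrative sketch of the "blank slate" idea, not Nomad's code.
type DeploymentState struct {
	DesiredTotal  int
	HealthyAllocs int
}

// resetDesiredTotals zeroes the per-task-group counters at the start of an
// evaluation, so that the `DesiredTotal += 1` increments count only this
// eval's placements instead of stacking on top of previous evals.
func resetDesiredTotals(taskGroups map[string]*DeploymentState) {
	for _, dstate := range taskGroups {
		dstate.DesiredTotal = 0
	}
}

func main() {
	tgs := map[string]*DeploymentState{"web": {DesiredTotal: 5}}
	resetDesiredTotals(tgs)   // blank slate before computing placements
	tgs["web"].DesiredTotal++ // increments now reflect this eval only
}
```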
```go
// ensure everything is healthy
if dstate, ok := s.deployment.TaskGroups[groupName]; ok {
	if dstate.HealthyAllocs < dstate.DesiredTotal { // Make sure we have enough healthy allocs
		complete = false
	}
}
```
If we're resetting desired total in `computePlacements`, it won't be correctly set when we reach `isDeploymentComplete`, which is called before that.
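A toy sketch of the call-order hazard being pointed out (all names and values here are illustrative):

```go
package main

import "fmt"

// If DesiredTotal is only reset and recomputed inside computePlacements,
// a completeness check that runs first compares HealthyAllocs against a
// stale zero and passes spuriously.
func main() {
	desiredTotal, healthy := 0, 0 // freshly reset, placements not yet computed

	// isDeploymentComplete-style check, running before placements:
	complete := !(healthy < desiredTotal)
	fmt.Println("complete (before placements):", complete) // true, spuriously

	// computePlacements-style update, running afterwards:
	desiredTotal = 3
	fmt.Println("complete (after placements):", !(healthy < desiredTotal)) // false
}
```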