WIP Add good candidate node interface #8531
base: master
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: elmiko. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
i'm still working on some clusterapi-specific unit tests, but they are quite challenging given the mocks that are needed.
cc @sbueringer @fabriziopandini this isn't quite done yet, but the business logic seems to be working as expected.
```go
} else if err != nil && err != cloudprovider.ErrNotImplemented {
	klog.Warningf("Error while checking if node is a candidate for deletion %s: %v", node.Name, err)
	continue
}
nodeGroup, err := ctx.CloudProvider.NodeGroupForNode(node)
```
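For context, the hook this PR adds to the cloud provider presumably looks something like the sketch below; the method name, signature, and package are guesses based on the PR title and the error message in the excerpt above, not taken from the actual diff.

```go
package sketch

import (
	apiv1 "k8s.io/api/core/v1"
)

// goodCandidateChecker is a hypothetical shape for the "good candidate node"
// hook: a provider reports whether a node may still be considered for scale
// down before the autoscaler cordons, drains, or taints anything. Providers
// with no opinion could return cloudprovider.ErrNotImplemented, which the
// calling code above treats as "keep processing the node".
type goodCandidateChecker interface {
	IsNodeGoodScaleDownCandidate(node *apiv1.Node) (bool, error)
}
```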
According to NodeGroupForNode()'s interface comment ("nil if the node should not be processed by cluster autoscaler"), it looks like we could also put the capi rollout logic into this interface's implementation?
ah, good suggestion. that might be a much simpler way to solve this. i'll research that approach, thank you!
Wouldn't this also break scale-ups and potentially a lot of other places where this func is used?
> Wouldn't this also break scale-ups and potentially a lot of other places where this func is used?

yes, unfortunately after doing more research i don't think this would work for us for a couple of reasons:
- there are other uses of NodeGroupForNode that should always return accurate information
- clusterapi, and potentially other providers who perform node updates, need to know that the autoscaler will be deleting a node to make the decision about the deletion process. NodeGroupForNode does not pass this context.

this would only work if we were to assume that any node undergoing an update should be ignored completely by the autoscaler. i'm not sure we can make that assertion.

my hope is that in the future, when something like the Declarative Node Maintenance api has been accepted, we will be able to coordinate using that api.
> 2. clusterapi, and potentially other providers who perform node updates, need to know that the autoscaler will be deleting a node to make the decision about the deletion process. NodeGroupForNode does not pass this context.

Honestly, I don't know. But NodeGroupForNode is called in 21 places, so I do not know what other impact it has and whether it produces new issues elsewhere. I think this would require extensive research and testing.

Our idea so far was to make a surgical change to ensure GetScaleDownCandidates does not return Nodes/Machines of an MD in rollout as scale-down candidates. This means that Nodes/Machines of an MD are simply not considered for scale down, but everything else still works as it does today.

Modifying NodeGroupForNode would significantly increase the blast radius of this fix.
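To make that surgical approach concrete, a filter along the following lines could be applied to the candidate list before anything is cordoned or drained; the function name, signature, and the provider hook are illustrative assumptions, not code from this PR.

```go
package sketch

import (
	apiv1 "k8s.io/api/core/v1"
	"k8s.io/klog/v2"
)

// filterScaleDownCandidates drops nodes that the provider-specific check says
// are not good candidates (for example, Nodes/Machines of a MachineDeployment
// that is currently rolling out). On error the node is skipped with a warning,
// mirroring the error handling in the diff excerpt earlier in this thread.
func filterScaleDownCandidates(nodes []*apiv1.Node, isGoodCandidate func(*apiv1.Node) (bool, error)) []*apiv1.Node {
	candidates := make([]*apiv1.Node, 0, len(nodes))
	for _, node := range nodes {
		ok, err := isGoodCandidate(node)
		if err != nil {
			klog.Warningf("error while checking if node %s is a candidate for deletion: %v", node.Name, err)
			continue
		}
		if ok {
			candidates = append(candidates, node)
		}
	}
	return candidates
}
```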
```go
})
objs, err := c.machineSetInformer.Lister().ByNamespace(r.GetNamespace()).List(selector)
if err != nil {
	return nil, err
```
Maybe wrap the error here to provide a bit of context
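For reference, the suggested wrapping could look roughly like this, using fmt.Errorf with %w; the message wording is only an example.

```go
objs, err := c.machineSetInformer.Lister().ByNamespace(r.GetNamespace()).List(selector)
if err != nil {
	// Wrap with %w so callers can still unwrap the underlying lister error.
	return nil, fmt.Errorf("failed to list MachineSets in namespace %q: %w", r.GetNamespace(), err)
}
```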
@elmiko I think you missed this one (but up to you of course :))
If I understand #8494 correctly, this change is meant to prevent CA from attempting to scale down a node that has already been cordoned and drained by Cluster API (and it's up to Cluster API to remove it). Can Cluster API apply the scale-down disabled annotation when draining? https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/scaledown/eligibility/eligibility.go#L39
No, this is about preventing cluster autoscaler from deleting / scaling down a Node altogether if Cluster API is doing a rollout.
+1 to what @sbueringer is saying, and also this problem is currently confined to the clusterapi provider but it could affect any provider who does node updating in a similar fashion to clusterapi. i think we need to make the autoscaler smarter in these scenarios where a cloud provider needs to have more control over which nodes are being marked for removal during a maintenance window.
This function allows cloud providers to specify when a node is not a good candidate for scaling down. This will occur before the autoscaler has begun to cordon, drain, and taint any node for scale down. Also adds a unit test for the prefiltering node processor.
The initial implementation of this function for clusterapi will return that a node is not a good candidate for scale down when it belongs to a MachineDeployment that is currently rolling out an upgrade.
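The rollout check itself is not spelled out in this description. One plausible shape, assuming the provider works with the unstructured MachineDeployment objects it already caches, is to compare the desired replica count with the number of updated replicas; the field names below come from the Cluster API types, but the exact condition used in the PR may differ.

```go
package sketch

import (
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// machineDeploymentIsRollingOut is an illustrative check, not the exact logic
// from this PR: the MachineDeployment is treated as rolling out while the
// number of up-to-date replicas lags behind the desired replica count.
func machineDeploymentIsRollingOut(md *unstructured.Unstructured) bool {
	desired, found, err := unstructured.NestedInt64(md.Object, "spec", "replicas")
	if err != nil || !found {
		return false
	}
	updated, found, err := unstructured.NestedInt64(md.Object, "status", "updatedReplicas")
	if err != nil || !found {
		// Status not populated yet; conservatively treat it as a rollout.
		return true
	}
	return updated < desired
}
```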
Force-pushed from cae87c7 to 150cf78.
updated with @sbueringer's suggestions.
Answered above.
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #8494
Special notes for your reviewer:
this is a challenging scenario to debug; please see the related issue.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: