-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Add capacity buffers scalable objects, limits and fake pods injection #8540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add capacity buffers scalable objects, limits and fake pods injection #8540
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: abdelrahman882 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
3202792
to
31a1faf
Compare
e547307
to
f9f833b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you checked that the eventing processor does not try to send events for fake pods created from buffer?
cluster-autoscaler/capacitybuffer/translators/pod_template_translator.go
Show resolved
Hide resolved
cluster-autoscaler/capacitybuffer/translators/resource_limits_translator.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/capacitybuffer/translators/resource_limits_translator_test.go
Show resolved
Hide resolved
cluster-autoscaler/capacitybuffer/translators/scalable_objects_translator.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/capacitybuffer/translators/scalable_objects_translator.go
Show resolved
Hide resolved
cluster-autoscaler/capacitybuffer/translators/scalable_objects_translator.go
Outdated
Show resolved
Hide resolved
client: client, | ||
toProvisionFilter: buffersfilter.NewStatusFilter(map[string]string{ | ||
common.ReadyForProvisioningCondition: common.ConditionTrue, | ||
common.ProvisioningCondition: common.ConditionTrue, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should filter out buffers that have not matching generation id to the pod template generation id.
To avoid scaling down cluster in case when they happen to be not updated yet we should have some kind of pod template cache for a short period of time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we excluded buffers with stale generation id we will have scale downs as you mentioned until the controller reacts and if we cached for some period it would be until the generation updated by the controller.
So my suggestion is to
- have CA not reacting on generation change (for buffers and for pods templates)
- The controller will pick up those and filter them to be processed as soon as a loop kicks in
- controller will fix and update the buffer status and CA will correctly react
I think this way it will be smoother as CA will have -most probably- no loop without injection as the fake pods number will change as soon as the controller updates the buffer status
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "CA not reacting on generation change"? What if the autoscaler starts and these do not match from the start of cluster autoscaler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "CA not reacting on generation change"?
By CA I meant the fake pods injector, and by not reacting I mean just inject the stale number of replicas
What if the autoscaler starts and these do not match from the start of cluster autoscaler?
If the autoscaler starts and the generation do not matching, we would have stale injected fake pods until the controller fixes that in ~5s
cluster-autoscaler/processors/capacitybuffer/pod_list_processor.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/processors/capacitybuffer/pod_list_processor.go
Outdated
Show resolved
Hide resolved
samplePod := getPodFromTemplate(samplePodTemplate) | ||
|
||
for i := 1; i <= podCount; i++ { | ||
newPod := fake.WithFakePodAnnotation(samplePod) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to somehow mark a fake pod as originating from buffer vs for example provisioning request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can have separate annotation but we also mark those injected for proactive scale up same way so I believe it's better to do it the same way so the fake pods are handled the same way like the others
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in the end it would be nice to emit an event for a buffer if it triggered scale up in the eventing procesor. There we need to differentiate not only omit these: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/processors/status/eventing_scale_up_processor.go#L39 (and also check which buffer they were generated from)
c742f9c
to
6e7caa0
Compare
6e7caa0
to
a5e31e3
Compare
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR is completing the basic logic for Capacity Buffers API to cover scalable objects, resource limits and fake pods injection.
Which issue(s) this PR fixes:
none
Special notes for your reviewer:
The new parts in this PR:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
proposal doc: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/buffers.md