
Conversation

@abdelrahman882 abdelrahman882 commented Sep 16, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR completes the basic logic for the Capacity Buffers API, covering scalable objects, resource limits, and fake pod injection.

Which issue(s) this PR fixes:

none

Special notes for your reviewer:

The new parts in this PR:

  • Handle scalable object references in the buffer controller
  • Handle resource limits in the buffer controller
  • Refactor client calls to be more efficient
  • Add a pod list processor that injects fake pods to trigger scale up
  • Add flags for the controller and the fake pod injector to main.go

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

proposal doc: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/buffers.md

- [AEP]: https://docs.google.com/document/d/1bcct-luMPP51YAeUhVuFV7MIXud5wqHsVBDah9WuKeo/edit?tab=t.0

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area area/cluster-autoscaler and removed do-not-merge/needs-area labels Sep 16, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: abdelrahman882
Once this PR has been reviewed and has the lgtm label, please assign feiskyer for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Sep 16, 2025
@abdelrahman882 abdelrahman882 changed the title Add capacity buffers scalable objects, limits and integration logic with cluster autoscaler loop Add capacity buffers scalable objects, limits and fake pods injection Sep 16, 2025
@abdelrahman882 abdelrahman882 force-pushed the capacity-buffer-ca branch 2 times, most recently from e547307 to f9f833b Compare September 16, 2025 03:03
Contributor

@jbtk jbtk left a comment


Have you checked that the eventing processor does not try to send events for fake pods created from a buffer?

client: client,
toProvisionFilter: buffersfilter.NewStatusFilter(map[string]string{
common.ReadyForProvisioningCondition: common.ConditionTrue,
common.ProvisioningCondition: common.ConditionTrue,
Contributor


We should filter out buffers whose generation id does not match the pod template's generation id.

To avoid scaling the cluster down when the buffers simply have not been updated yet, we should keep some kind of pod template cache for a short period of time.

Contributor Author


If we excluded buffers with a stale generation id, we would see scale downs, as you mentioned, until the controller reacts; and if we cached for some period, we would need the cache only until the controller updates the generation.

So my suggestion is to:

  1. Have CA not react to generation changes (for buffers and for pod templates)
  2. The controller will pick those up and filter them to be processed as soon as a loop kicks in
  3. The controller will fix and update the buffer status, and CA will react correctly

I think this way it will be smoother, as CA will most probably have no loop without injection, since the fake pod count will change as soon as the controller updates the buffer status.
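The suggested flow (inject the stale count, let the controller reconcile) can be sketched as follows; the type and function names are illustrative stand-ins, not the actual cluster-autoscaler API:

```go
package main

import "fmt"

// BufferStatus is a minimal stand-in for the buffer status fields
// relevant here: the pod-template generation the status was computed
// against, and the replica count to inject.
type BufferStatus struct {
	PodTemplateGeneration int64
	Replicas              int32
}

// replicasToInject returns the replica count to inject for a buffer.
// Intentionally, there is no comparison of status.PodTemplateGeneration
// against liveTemplateGeneration (step 1 of the suggestion above): even
// a stale count is injected, the buffer controller reconciles the
// status shortly after, and the next loop injects the corrected count.
func replicasToInject(status BufferStatus, liveTemplateGeneration int64) int32 {
	return status.Replicas
}

func main() {
	stale := BufferStatus{PodTemplateGeneration: 3, Replicas: 5}
	// Live template is at generation 4, yet the stale count is used,
	// so no scale down happens while the controller catches up.
	fmt.Println(replicasToInject(stale, 4)) // 5
}
```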

Contributor


What do you mean by "CA not reacting on generation change"? What if the autoscaler starts and these do not match from the very start?

Contributor Author

@abdelrahman882 abdelrahman882 Sep 17, 2025


What do you mean by "CA not reacting on generation change"?

By CA I meant the fake pod injector, and by not reacting I mean it just injects the stale number of replicas.

What if the autoscaler starts and these do not match from the start of cluster autoscaler?

If the autoscaler starts and the generations do not match, we would have stale injected fake pods until the controller fixes that, in ~5s.

samplePod := getPodFromTemplate(samplePodTemplate)

for i := 1; i <= podCount; i++ {
newPod := fake.WithFakePodAnnotation(samplePod)
Contributor


Is it possible to somehow mark a fake pod as originating from a buffer vs, for example, a provisioning request?

Contributor Author


We could add a separate annotation, but we also mark the pods injected for proactive scale up this way, so I believe it's better to do it the same way so that these fake pods are handled like all the others.
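For illustration, the shared marking could look like the sketch below; the annotation key and the helper shape are assumptions for this sketch, not the actual constants used by the cluster-autoscaler fake-pod helpers:

```go
package main

import "fmt"

// Illustrative annotation key; the real key lives in the
// cluster-autoscaler fake-pod helpers.
const fakePodAnnotationKey = "cluster-autoscaler.kubernetes.io/fake-pod"

// Pod is a minimal stand-in for the Kubernetes Pod type, carrying
// only the metadata needed for this sketch.
type Pod struct {
	Name        string
	Annotations map[string]string
}

// withFakePodAnnotation returns a copy of the pod marked as fake, the
// same way pods injected for proactive scale up are marked, so all
// injected fake pods are handled uniformly downstream.
func withFakePodAnnotation(p Pod) Pod {
	annotations := make(map[string]string, len(p.Annotations)+1)
	for k, v := range p.Annotations {
		annotations[k] = v
	}
	annotations[fakePodAnnotationKey] = "true"
	p.Annotations = annotations
	return p
}

func main() {
	sample := Pod{Name: "buffer-sample-pod"}
	fakePod := withFakePodAnnotation(sample)
	fmt.Println(fakePod.Annotations[fakePodAnnotationKey]) // true
	fmt.Println(len(sample.Annotations))                   // 0: original pod untouched
}
```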

Contributor


But in the end it would be nice to emit an event for a buffer if it triggered a scale up in the eventing processor. There we need to differentiate, not only omit these: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/processors/status/eventing_scale_up_processor.go#L39 (and also check which buffer they were generated from)
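A sketch of what the eventing processor could do, assuming fake pods carry a marker annotation that also records the originating buffer (both annotation keys below are illustrative, not the real constants):

```go
package main

import "fmt"

// Illustrative annotation keys; the real markers would be defined by
// the fake-pod helpers in cluster-autoscaler.
const (
	fakePodAnnotationKey      = "cluster-autoscaler.kubernetes.io/fake-pod"
	sourceBufferAnnotationKey = "cluster-autoscaler.kubernetes.io/source-buffer"
)

// Pod is a minimal stand-in for the Kubernetes Pod type.
type Pod struct {
	Name        string
	Annotations map[string]string
}

// splitPodsForEventing separates real unschedulable pods (which get
// TriggeredScaleUp events as today) from buffer-injected fakes, and
// groups the fakes by originating buffer so a per-buffer event could
// be emitted instead of per-pod events.
func splitPodsForEventing(pods []Pod) (real []Pod, byBuffer map[string][]Pod) {
	byBuffer = map[string][]Pod{}
	for _, p := range pods {
		if p.Annotations[fakePodAnnotationKey] == "true" {
			buffer := p.Annotations[sourceBufferAnnotationKey]
			byBuffer[buffer] = append(byBuffer[buffer], p)
			continue
		}
		real = append(real, p)
	}
	return real, byBuffer
}

func main() {
	pods := []Pod{
		{Name: "real-pod"},
		{Name: "fake-1", Annotations: map[string]string{
			fakePodAnnotationKey:      "true",
			sourceBufferAnnotationKey: "buffer-a",
		}},
	}
	real, byBuffer := splitPodsForEventing(pods)
	fmt.Println(len(real), len(byBuffer["buffer-a"])) // 1 1
}
```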

@abdelrahman882 abdelrahman882 force-pushed the capacity-buffer-ca branch 7 times, most recently from c742f9c to 6e7caa0 Compare September 17, 2025 05:55