docs: RFC for Capacity Buffer API Support #2611
Conversation
> # Capacity Buffer API Support
Thanks for the design! Couple high level notes:
- I don't think the capacity buffer api exists yet. The sig-autoscaling RFC was merged but I don't think the api itself has been released. Doesn't mean we can't think ahead, but implementation for this will have to wait until the api exists.
- Overall I think the doc is a bit messy. I think it would be a stronger proposal if you started from the CX and derived implementation from that. Similarly, I think the implementation section could be much stronger if it started from the requirements of the existing controllers and worked backwards to the API's that the capacity buffer controller should provide.
> - Using pause containers with resource requests to reserve capacity
> - Over-provisioning through static NodePools
>
> The Kubernetes SIG Autoscaling has standardized a CapacityBuffer API to declare spare capacity/headroom in clusters. Cluster Autoscaler supports this API (autoscaling.x-k8s.io/v1alpha1), providing a vendor-agnostic way to express capacity requirements.
I'm not sure CAS has support for that API yet
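Since the API may not have been released yet, the sketch below only shows what the buffer types might look like; every field name here (PodTemplateRef, Replicas, the Conditions-based status) is an assumption drawn from the general direction of the sig-autoscaling RFC, not a published schema.

```go
// Hypothetical sketch of CapacityBuffer types for autoscaling.x-k8s.io/v1alpha1.
// Field names are assumptions for illustration only; the real schema is
// whatever sig-autoscaling eventually releases.
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type CapacityBuffer struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   CapacityBufferSpec   `json:"spec"`
	Status CapacityBufferStatus `json:"status,omitempty"`
}

type CapacityBufferSpec struct {
	// Reference to a PodTemplate describing the shape of the reserved capacity.
	PodTemplateRef *corev1.LocalObjectReference `json:"podTemplateRef,omitempty"`
	// Number of copies of that template to keep as headroom.
	Replicas *int32 `json:"replicas,omitempty"`
}

type CapacityBufferStatus struct {
	// Conditions reporting whether the buffer was translated and is being maintained.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}
```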
> 1. **Performance-critical applications** where just-in-time provisioning latency is unacceptable
> 2. **Burst workloads** that need immediate scheduling for CI/CD, batch jobs, or event-driven applications
> 3. **High-availability services** that require buffer capacity to handle traffic spikes or node failures
> 4. **Consistent user experience** across different autoscaling solutions in the Kubernetes ecosystem
I don't think we care all that much about consistent UX. In fact, the two autoscaling solutions work very differently. I do think we could say we care about intent driven configuration though
> ## Proposal
>
> Extend Karpenter to support the standard CapacityBuffer API (autoscaling.x-k8s.io/v1alpha1) by integrating buffer capacity into scheduling and consolidation algorithms.
Given that the API is alpha, whatever design we create should include the standard set of alpha protections we use. I don't see where you've discussed feature gating this and the opt-in opt-out behavior, but the RFC should include details on that
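As a rough illustration of the opt-in behavior being asked for, here is a minimal sketch of alpha gating. The gate name `CapacityBuffers` and the environment-variable wiring are hypothetical; a real implementation would reuse whatever feature-gate plumbing Karpenter already uses for its other alpha features, defaulting to off.

```go
// Minimal sketch of opt-in gating for buffer support, assuming a hypothetical
// "CapacityBuffers" feature gate supplied via FEATURE_GATES=CapacityBuffers=true.
package main

import (
	"log"
	"os"
	"strings"
)

// capacityBuffersEnabled reports whether the hypothetical gate was switched on.
func capacityBuffersEnabled() bool {
	for _, kv := range strings.Split(os.Getenv("FEATURE_GATES"), ",") {
		if strings.TrimSpace(kv) == "CapacityBuffers=true" {
			return true
		}
	}
	return false
}

func main() {
	if !capacityBuffersEnabled() {
		// With the gate off, CapacityBuffer objects are ignored entirely,
		// so existing clusters see no behavior change until they opt in.
		log.Println("CapacityBuffers feature gate disabled; skipping buffer controller registration")
		return
	}
	log.Println("CapacityBuffers feature gate enabled; registering buffer controller")
	// ...register the buffer reconciler with the controller manager here...
}
```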
> Key aspects:
> 1. **Virtual Pod Approach**: Follow Cluster Autoscaler's pattern using in-memory virtual pods
I think we can reduce this to a single goal, something along the lines of 'Karpenter respects configured CapacityBuffers, maintaining additional capacity as if they were pods'. 1-3 in this list are implementation details that get us towards that goal
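To make that single goal concrete, a minimal sketch of the virtual-pod translation follows: expand a buffer's pod template into N in-memory pods that the provisioner can treat as pending. The function shape and the `karpenter.sh/capacity-buffer` annotation are illustrative assumptions, not existing Karpenter APIs.

```go
// Sketch of translating a buffer into in-memory "virtual" pods.
package buffer

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Hypothetical marker so buffer-derived pods can be told apart from real pods.
const bufferVirtualPodAnnotation = "karpenter.sh/capacity-buffer"

// VirtualPodsFor expands a pod template into `replicas` unscheduled pods that
// exist only in memory. They are never written to the API server; they are fed
// to the provisioner alongside real pending pods so buffer capacity takes part
// in the same bin-packing decisions. A real implementation would deep-copy the
// template rather than share its maps and pointers.
func VirtualPodsFor(bufferName string, template corev1.PodTemplateSpec, replicas int32) []*corev1.Pod {
	pods := make([]*corev1.Pod, 0, replicas)
	for i := int32(0); i < replicas; i++ {
		pods = append(pods, &corev1.Pod{
			ObjectMeta: metav1.ObjectMeta{
				Name:        fmt.Sprintf("buffer-%s-%d", bufferName, i),
				Namespace:   template.Namespace,
				Labels:      template.Labels,
				Annotations: map[string]string{bufferVirtualPodAnnotation: bufferName},
			},
			Spec: template.Spec,
		})
	}
	return pods
}
```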
> 5. **Graceful Degradation**: If buffer capacity cannot be maintained, prioritize user workloads and log buffer capacity warnings
>
> ### API Integration
I think this section is repeated
> **Revised Protection Strategy**:
>
> 1. **NodeClaim-Level Tracking**: Buffer capacity is tracked at the NodeClaim level, not just pod level
I don't think this is correct. What are you trying to say with this?
> 5. **Update buffer status** with translation results
> 6. **Inject virtual pods** into Karpenter's scheduling pipeline
>
> ### Implementation Phases
Could this section enumerate what requirements must be met before implementation can begin, and then what all functionality is required for the alpha release of buffer support?
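Steps 5 and 6 quoted above imply a handoff between the buffer controller and the provisioning loop. A minimal sketch of that handoff under the virtual-pod approach is below; the in-memory store and the `PodsForScheduling` helper are assumptions for illustration, not existing Karpenter interfaces.

```go
// Sketch of handing buffer-derived virtual pods to the provisioning loop.
package buffer

import corev1 "k8s.io/api/core/v1"

// Store holds the most recent translation of every buffer; the buffer
// controller refreshes it whenever a CapacityBuffer changes and then updates
// the buffer's status with the translation result.
type Store struct {
	virtualPods []*corev1.Pod
}

// SetVirtualPods replaces the current set of buffer-derived pods.
func (s *Store) SetVirtualPods(pods []*corev1.Pod) { s.virtualPods = pods }

// PodsForScheduling merges real pending pods with the buffer's virtual pods,
// so the scheduler sizes capacity for both in a single pass. The provisioner
// would call this instead of using the pending-pod list directly.
func (s *Store) PodsForScheduling(pending []*corev1.Pod) []*corev1.Pod {
	out := make([]*corev1.Pod, 0, len(pending)+len(s.virtualPods))
	out = append(out, pending...)
	out = append(out, s.virtualPods...)
	return out
}
```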
> - Memory overhead of virtual pods
> - Watch performance with many buffers
>
> ## Migration & Compatibility
I think this section is missing a feature flag discussion
> **A**: Follow Karpenter's provisioning behavior - create suitable NodeClaims through scheduler constraint solving
>
> 3. **Q**: How does buffer capacity interact with NodePool limits?
>    **A**: Buffer NodeClaims must respect NodePool resource limits and budget constraints
Again, what is a buffer nodeclaim?
Description
This is a proposal to add support for the standard Kubernetes CapacityBuffer API (autoscaling.x-k8s.io/v1alpha1) to enable pre-provisioned spare capacity in Karpenter clusters.
The RFC introduces a virtual pod approach that integrates buffer capacity into Karpenter's scheduling and consolidation algorithms while maintaining compatibility with the existing Cluster Autoscaler Buffer API.
Related issue: #2571
How was this change tested?
RFC only - implementation will follow in subsequent PRs
Key Features
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.