Description
Proposal
Currently, when scaling to zero, if the scale trigger remains inactive after the cooldownPeriod, KEDA forces the deployment to scale down to 0 regardless of the current number of replicas.
The proposal is to honour the scale-down policy defined in the Horizontal Pod Autoscaler (HPA) before enforcing the scale to zero. If the user configures the following in the ScaledObject:
behavior:
  scaleDown:
    policies:
    - type: Pods
      value: 2
      periodSeconds: 60
then KEDA should first scale the replicas down gradually from N to 1, removing at most 2 pods every 60 seconds, and only then scale from 1 to 0.
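To make the difference concrete, here is a rough sketch in Python that compares the replica counts under the current behaviour and under the proposed one. It is only an illustration, not KEDA or HPA code: the function names `current_sequence` and `proposed_sequence` are made up for this example, each list entry stands for one policy period (60 seconds in the config above), and it assumes the final 1 to 0 step happens as a separate step after the replicas reach 1. The exact timing of that last step is not defined in this proposal.

```python
def current_sequence(replicas: int) -> list[int]:
    """Current behaviour: once the cooldownPeriod has passed, jump straight to 0."""
    return [0] if replicas > 0 else []


def proposed_sequence(replicas: int, pods_per_period: int) -> list[int]:
    """Proposed behaviour (hypothetical model): remove at most
    `pods_per_period` pods per period until 1 replica is left,
    then take the final step from 1 to 0."""
    if pods_per_period <= 0:
        raise ValueError("pods_per_period must be positive")
    if replicas <= 0:
        return []

    steps = []
    # Policy-limited phase: never drop below 1 replica here.
    while replicas > 1:
        replicas = max(1, replicas - pods_per_period)
        steps.append(replicas)
    # Final phase: scale from 1 to 0.
    steps.append(0)
    return steps


if __name__ == "__main__":
    # With the policy above (type: Pods, value: 2) and N = 7 replicas:
    print("current: ", current_sequence(7))
    print("proposed:", proposed_sequence(7, 2))
```

With N = 7 and a step of 2, the proposed sequence is 5, 3, 1, 0, so pods remain available for three more periods instead of disappearing at once.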
Use-Case
Once the trigger has been inactive for the cooldownPeriod, KEDA currently scales the deployment down to zero immediately. If a short burst of traffic arrives right after the cooldownPeriod, the deployment has to scale up from 0 to 1 before it can serve requests, which increases request latency.
If KEDA honoured the HPA scale-down policy, replicas would be removed gradually in the defined steps, so some pods would still be available to serve requests during a short traffic burst, reducing latency and improving response times.
Is this a feature you are interested in implementing yourself?
No
Anything else?
No response