Honour HPA Scale-Down Policy When Scaling Down to Zero Replicas

### Proposal

Currently, if the scale trigger is still inactive once the cooldownPeriod has elapsed, KEDA forces the deployment straight to 0 replicas, regardless of how many replicas are running at that moment.

The proposal is to honour the scale-down policy configured for the Horizontal Pod Autoscaler (HPA) before enforcing the scale to zero. If the user configures the following in the ScaledObject:
```yaml
# Fragment of the ScaledObject spec
advanced:
  horizontalPodAutoscalerConfig:
    behavior:
      scaleDown:
        policies:
        - type: Pods
          value: 2
          periodSeconds: 60
```
then KEDA should first scale the replicas down gradually from N to 1, removing at most 2 pods every 60 seconds, and only then scale from 1 to 0.
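To make the requested behaviour concrete, here is a minimal Python sketch of the step-down schedule for the policy above. It is only a simplified model, not KEDA or HPA code: the function name `scale_down_schedule`, the `floor` parameter, and the assumption that the final 1 to 0 step happens one period after reaching 1 are all hypothetical, and the HPA stabilization window is ignored.

```python
from typing import List, Tuple


def scale_down_schedule(
    replicas: int, pods_per_period: int, period_seconds: int, floor: int = 1
) -> List[Tuple[int, int]]:
    """Return (elapsed_seconds, replicas) steps for the proposed behaviour.

    Hypothetical model: remove at most `pods_per_period` pods per period
    until `floor` is reached, then take the final step from `floor` to 0.
    """
    if pods_per_period <= 0 or period_seconds <= 0:
        raise ValueError("policy value and periodSeconds must be positive")
    elapsed = 0
    steps = [(elapsed, replicas)]
    # Gradual phase: honour the Pods policy down to the floor.
    while replicas > floor:
        elapsed += period_seconds
        replicas = max(floor, replicas - pods_per_period)
        steps.append((elapsed, replicas))
    # Final phase: scale to zero (assumed to take one more period).
    if replicas > 0:
        elapsed += period_seconds
        steps.append((elapsed, 0))
    return steps


# 7 replicas, policy of 2 pods per 60 seconds
print(scale_down_schedule(7, 2, 60))
# -> [(0, 7), (60, 5), (120, 3), (180, 1), (240, 0)]
```

With the current behaviour the same deployment would go from 7 to 0 in a single step at time 0.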

### Use-Case

Once the trigger has been inactive for the cooldownPeriod, KEDA's current behaviour forces the deployment to scale down to zero immediately. If a short burst of traffic arrives right after the cooldownPeriod, the deployment has to scale up from 0 to 1 before it can serve anything, which increases request latency.

If KEDA honours the HPA scale-down policy, replicas scale down gradually in the configured steps, so pods are still available to serve requests during a short traffic burst, which reduces latency and improves response times.
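As a rough illustration of the difference, the sketch below compares how many replicas would be running when a burst arrives some seconds after the cooldownPeriod ends. The function `replicas_at` and its parameters are hypothetical, and the model makes the same simplifying assumptions as the proposal text: a fixed number of pods removed per period, a floor of 1, and a final step to 0 one period after reaching 1.

```python
def replicas_at(
    seconds_after_cooldown: int,
    replicas: int,
    pods_per_period: int,
    period_seconds: int,
    honour_policy: bool,
) -> int:
    """Replicas still running at a given time after the cooldownPeriod ends."""
    if replicas <= 0 or not honour_policy:
        # Current behaviour: forced to zero as soon as the cooldown elapses.
        return 0
    completed_periods = seconds_after_cooldown // period_seconds
    # Periods needed to step from `replicas` down to 1 (ceiling division).
    periods_to_one = -(-(replicas - 1) // pods_per_period)
    if completed_periods <= periods_to_one:
        return max(1, replicas - completed_periods * pods_per_period)
    # One more period after reaching 1, the deployment goes to 0 (assumed).
    return 0


# Burst arrives 90 seconds after the cooldown, starting from 7 replicas.
print(replicas_at(90, 7, 2, 60, honour_policy=False))  # -> 0
print(replicas_at(90, 7, 2, 60, honour_policy=True))   # -> 5
```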

### Is this a feature you are interested in implementing yourself?

No

### Anything else?

_No response_

Honour HPA Scale-Down Policy When Scaling Down to Zero Replicas #7204

Proposal

Use-Case

Is this a feature you are interested in implementing yourself?

Anything else?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Honour HPA Scale-Down Policy When Scaling Down to Zero Replicas #7204

Description

Proposal

Use-Case

Is this a feature you are interested in implementing yourself?

Anything else?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions