@@ -91,7 +91,7 @@ Figure 1. VerticalPodAutoscaler controls the resource requests and limits of Pod

Kubernetes implements vertical pod autoscaling through multiple cooperating components that run intermittently (it is not a continuous process). The VPA consists of three main components:

* The _recommender_, which analyzes resource usage and provides recommendations.
* The _updater_, which updates Pod resource requests, either by evicting Pods or by modifying them in place.
* The VPA _admission controller_ webhook, which applies resource recommendations to new or recreated Pods.

@@ -100,7 +100,6 @@ Once during each period, the Recommender queries the resource utilization for Po
The Recommender analyzes both current and historical resource usage data (CPU and memory) for each Pod targeted by the VerticalPodAutoscaler. It examines:
- Historical consumption patterns over time to identify trends
- Peak usage and variance to ensure sufficient headroom
- Current resource requests compared to actual usage
- Out-of-memory (OOM) events and other resource-related incidents

Based on this analysis, the Recommender calculates three types of recommendations:
@@ -119,9 +118,7 @@ The chosen method depends on the configured update mode, cluster capabilities, a

The _admission controller_ operates as a mutating webhook that intercepts Pod creation requests. It
checks if the Pod is targeted by a VerticalPodAutoscaler and, if so, applies the recommended
resource requests and limits before the Pod is created. More specifically, the admission controller uses the Target recommendation in the VerticalPodAutoscaler resource's `.status.recommendation` stanza as the new resource requests. The admission controller ensures new Pods start with appropriately sized resource allocations, whether they're created during initial deployment, after an eviction by the updater, or due to scaling operations.

The VerticalPodAutoscaler requires a metrics source, such as Kubernetes' Metrics Server {{< glossary_tooltip text="add-on" term_id="addons" >}},
to be installed in the cluster.
@@ -152,25 +149,27 @@ spec:
In the _Off_ update mode, the VPA recommender still analyzes resource usage and generates
recommendations, but these recommendations are not automatically applied to Pods.
The recommendations are only stored in the VPA object's `.status` field. In this mode, only the recommender is active.

You can use a tool such as `kubectl` to view the `.status` and the recommendations in it.
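
As an illustration, the recommendation stored in a VerticalPodAutoscaler's `.status` might look like the following excerpt. This is a sketch only: the container name and every quantity are made-up values, not output from a real cluster.

```yaml
# Illustrative excerpt of a VerticalPodAutoscaler object; all values are hypothetical.
status:
  recommendation:
    containerRecommendations:
    - containerName: app      # hypothetical container name
      lowerBound:
        cpu: 100m
        memory: 128Mi
      target:                 # the recommendation that is applied as resource requests
        cpu: 250m
        memory: 256Mi
      upperBound:
        cpu: "1"
        memory: 1Gi
```

In _Off_ mode these values are informational only; nothing in the cluster acts on them until you change the update mode or apply them yourself.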

### Initial {#updateMode-Initial}

In _Initial_ mode, VPA only sets resource requests when Pods are first created. In this mode, only the recommender and the admission controller components are active. VPA does not update resources for already running Pods, even if recommendations change over time. New recommendations take effect only when Pods are recreated: for example, when you change a workload resource such as a Deployment in a way that triggers Pod recreation, or when you delete the Pods manually.

### Recreate {#updateMode-Recreate}

In _Recreate_ mode, VPA actively manages Pod resources by evicting Pods when their current
resource requests differ significantly from recommendations. When a Pod is evicted, the workload
controller (managing a Deployment, StatefulSet, etc) creates a replacement Pod, and the VPA admission
controller applies the updated resource requests to the new Pod. In this mode all three VPA components are active.

### InPlaceOrRecreate {#updateMode-InPlaceOrRecreate}

In `InPlaceOrRecreate` mode, VPA attempts to update Pod resource requests and limits without restarting the Pod when possible. However, if in-place updates cannot be performed for a particular resource change, VPA falls back to evicting the Pod
(similar to `Recreate` mode) and allowing the workload controller to create a replacement Pod with updated resources. In this mode all three VPA components are active.

In this mode, the updater applies recommendations in-place using the [Resize Container Resources In-Place](/docs/tasks/configure-pod-container/resize-container-resources/) feature.
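
A minimal sketch of a VerticalPodAutoscaler that selects this mode could look like the following. The object name and the target Deployment name are hypothetical, and the manifest assumes that the `autoscaling.k8s.io/v1` API is installed in the cluster with support for this update mode.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa               # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app             # hypothetical workload
  updatePolicy:
    # Try an in-place resize first; fall back to eviction if that is not possible.
    updateMode: InPlaceOrRecreate
```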

### Auto (deprecated) {#updateMode-Auto}

@@ -231,13 +230,19 @@ Valid resource names include `cpu` and `memory`.
The `controlledValues` field determines whether VPA controls resource requests, limits, or both:

RequestsAndLimits
: VPA sets both requests and limits. The limit scales proportionally to the request based on the request-to-limit ratio defined in the Pod spec. This is the default mode.

RequestsOnly
: VPA only sets requests, leaving limits unchanged. Limits are respected and can still trigger throttling or out-of-memory kills if usage exceeds them.

See [requests and limits](/docs/concepts/configuration/manage-resources-containers/#requests-and-limits) to learn more about those two concepts.
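
To make the difference concrete, here is a hedged sketch of a resource policy fragment that restricts VPA to requests only. The wildcard container name and the list of controlled resources are illustrative choices, not a recommendation.

```yaml
# Fragment of a VerticalPodAutoscaler spec; values are illustrative.
spec:
  resourcePolicy:
    containerPolicies:
    - containerName: "*"               # wildcard: applies to every container in the Pod
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsOnly   # VPA changes requests and leaves limits untouched
```

With `RequestsAndLimits` instead, the limit follows the request. As a worked example with assumed numbers: if a container in the Pod spec declares a CPU request of `500m` and a CPU limit of `1` (a 1:2 ratio), and the target recommendation is `700m`, VPA would set the request to `700m` and the limit to `1400m`.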

## LimitRange resources

The admission controller and updater VPA components post-process recommendations to comply with the constraints defined in [LimitRanges](/docs/concepts/policy/limit-range/). Both components check the LimitRange resources of `type` Pod and `type` Container that exist in the cluster.

For example, if the `max` field in a Container LimitRange resource is exceeded, both VPA components lower the limit to the value defined in the `max` field, and the request is proportionally decreased to maintain the request-to-limit ratio in the Pod spec.
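
The following sketch shows a Container LimitRange that would trigger this behavior. The object name, namespace, and quantities are hypothetical.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: example-limits      # hypothetical name
  namespace: default        # hypothetical namespace
spec:
  limits:
  - type: Container
    max:
      cpu: "2"              # no container in this namespace may have a CPU limit above 2
      memory: 2Gi
```

As a worked example under these assumptions: a container declares a CPU request of `500m` and a CPU limit of `1` (a 1:2 ratio), and the target recommendation is `1500m`. Scaling the limit proportionally would give `3`, which exceeds the `max` of `2`. The VPA components would therefore lower the limit to `2` and the request to `1`, which preserves the 1:2 ratio.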

## {{% heading "whatsnext" %}}

If you configure autoscaling in your cluster, you may also want to consider using