Vertical Pod autoscaling automates setting CPU and memory resource requests and limits for containers within Kubernetes Pods. Vertical Pod autoscaling analyzes historical and current resource usage to provide recommendations, which it can either display or automatically apply by updating Pods. This feature improves stability and cost efficiency by right-sizing resource allocations.
Before you begin
Before you configure Vertical Pod Autoscaling, ensure you meet the following prerequisites:
- You have a running bare metal cluster.
- You have `kubectl` access to the cluster.
- Metrics Server is available in the cluster. Bare metal clusters include Metrics Server by default.
Enable vertical Pod autoscaling
Enable vertical Pod autoscaling on your bare metal cluster by setting a preview annotation and configuring the cluster specification:
1. Add or update the preview annotation on the Cluster custom resource. Edit the Cluster custom resource directly, or modify the cluster configuration file and use `bmctl update`:

   ```yaml
   metadata:
     annotations:
       preview.baremetal.cluster.gke.io/vertical-pod-autoscaler: enable
   ```
1. Modify the `spec` of the Cluster custom resource to include the `verticalPodAutoscaling` field and specify the `enableUpdater` and `enableMemorySaver` modes:

   ```yaml
   apiVersion: baremetal.cluster.gke.io/v1
   kind: Cluster
   metadata:
     name: cluster1
     namespace: cluster-cluster1
     annotations:
       preview.baremetal.cluster.gke.io/vertical-pod-autoscaler: enable
   spec:
     # ... other cluster spec fields
     verticalPodAutoscaling:
       enableUpdater: true      # Set to true for automated updates
       enableMemorySaver: true  # Set to true to reduce recommender memory usage
   ```
1. If you modified the cluster configuration file, apply the changes using the following command:

   ```shell
   bmctl update cluster -c CLUSTER_NAME --kubeconfig KUBECONFIG
   ```
Replace the following:

* `CLUSTER_NAME`: the name of your cluster.
* `KUBECONFIG`: the path of your cluster kubeconfig file.
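After the update completes, you can check that the vertical Pod autoscaling components were deployed. The exact Pod names and namespaces are version-dependent, so the following command is a sketch that simply filters for components with `vpa` in their names:

```shell
# Look for vertical Pod autoscaling components (the recommender, plus the
# updater and admission controller when enableUpdater is true). Component
# names and namespaces can vary by cluster version.
kubectl get pods --all-namespaces --kubeconfig KUBECONFIG | grep -i vpa
```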
Create a `VerticalPodAutoscaler` custom resource
After enabling vertical Pod autoscaling on your cluster, define a `VerticalPodAutoscaler` custom resource to target specific workloads:
1. Define a `VerticalPodAutoscaler` resource in the same namespace as the target workload. This custom resource specifies which Pods it targets using `targetRef` and any resource policies:

   ```yaml
   apiVersion: "autoscaling.k8s.io/v1"
   kind: VerticalPodAutoscaler
   metadata:
     name: hamster-vpa
   spec:
     targetRef:
       apiVersion: "apps/v1"
       kind: Deployment
       name: hamster
     resourcePolicy:
       containerPolicies:
       - containerName: '*'
         minAllowed:
           cpu: 100m
           memory: 50Mi
         maxAllowed:
           cpu: 1
           memory: 500Mi
         controlledResources: ["cpu", "memory"]
   ```
1. Apply the `VerticalPodAutoscaler` manifest using the following command:

   ```shell
   kubectl apply -f VPA_MANIFEST \
       --kubeconfig KUBECONFIG
   ```
Replace the following:

* `VPA_MANIFEST`: the path of the `VerticalPodAutoscaler` manifest file.
* `KUBECONFIG`: the path of the cluster kubeconfig file.
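To confirm that the custom resource was created, you can list `VerticalPodAutoscaler` objects in the workload's namespace. This is a sketch; replace the namespace placeholder with the namespace of your target workload:

```shell
# List VerticalPodAutoscaler resources in the workload's namespace.
# Once the recommender has data, the output includes the recommended
# CPU and memory targets for each VPA.
kubectl get vpa -n NAMESPACE --kubeconfig KUBECONFIG
```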
Understand vertical Pod autoscaling modes
Vertical Pod autoscaling operates in different modes that control how it applies resource recommendations.
Recommendation mode
In recommendation mode, vertical Pod autoscaling installs the recommender component. This component analyzes resource usage and publishes recommended values for CPU and memory requests and limits in the `status` section of the `VerticalPodAutoscaler` custom resources you create.
To view resource requests and limits recommendations, use the following command:

```shell
kubectl describe vpa VPA_NAME \
    --kubeconfig KUBECONFIG \
    -n CLUSTER_NAMESPACE
```
Replace the following:
* `VPA_NAME`: the name of the `VerticalPodAutoscaler` that's targeting the workloads for which you're considering resource adjustments.
* `KUBECONFIG`: the path of the cluster kubeconfig file.
* `CLUSTER_NAMESPACE`: the namespace that contains the `VerticalPodAutoscaler` resource.
The response should contain a `Status` section that's similar to the following sample:
```
Status:
  Conditions:
    Last Transition Time:  2025-08-04T23:53:32Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  hamster
      Lower Bound:
        Cpu:     100m
        Memory:  262144k
      Target:
        Cpu:     587m
        Memory:  262144k
      Uncapped Target:
        Cpu:     587m
        Memory:  262144k
      Upper Bound:
        Cpu:     1
        Memory:  500Mi
```
Pods aren't automatically updated in this mode. Use these recommendations to manually update your Pod configurations. This is the default behavior if `enableUpdater` isn't set or is `false`.
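If you only need the recommended values rather than the full description, a JSONPath query can extract the target recommendation directly from the `status` field. The following sketch uses the `hamster-vpa` example; replace the name and namespace with your own:

```shell
# Print only the recommended target values for the first container.
# The field path follows the VerticalPodAutoscaler status schema:
# status.recommendation.containerRecommendations[].target
kubectl get vpa hamster-vpa -n NAMESPACE --kubeconfig KUBECONFIG \
    -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'
```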
Automated update mode
When you set `enableUpdater` to `true`, bare metal lifecycle controllers deploy the vertical Pod autoscaling updater and admission controller components in addition to the recommender. The updater monitors for Pods whose current resource requests deviate significantly from the recommendations.
The update policy in the `VerticalPodAutoscaler` resource specifies how the updater applies the recommendations. By default, the update mode is `Auto`, which means the updater assigns updated resource settings on Pod creation. The following `VerticalPodAutoscaler` sample shows how you set the update mode to `Initial`:
```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Initial"
  # ... resourcePolicy and other fields
```
The updater supports the following five modes:

- `Auto`: The updater evicts the Pod. The admission controller intercepts the creation request for the new Pod and modifies it to use the recommended CPU and memory values provided by the recommender. Updating resources requires recreating the Pod, which can cause disruptions. Use Pod Disruption Budgets, which the updater honors, to manage the eviction process. This mode is equivalent to `Recreate`.
- `Recreate`: The updater evicts Pods and assigns recommended resource requests and limits when the Pod is recreated.
- `InPlaceOrRecreate` (alpha): The updater attempts best-effort in-place updates, but might fall back to recreating the Pod if in-place updates aren't possible. For more information, see the in-place Pod resize documentation.
- `Initial`: The updater only assigns resource requests on Pod creation and never changes them later.
- `Off`: The updater doesn't automatically change the resource requirements of the Pods. The recommendations are calculated and can be inspected in the `VerticalPodAutoscaler` object.
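For example, to keep the autoscaling components installed but pause automated changes for a single workload, you can set that workload's update mode to `Off`; recommendations still appear in the `status` section of its `VerticalPodAutoscaler`. This sketch reuses the `hamster` example names:

```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Off"   # recommendations only; no evictions or resource updates
```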
For more information about the `VerticalPodAutoscaler` custom resource, use `kubectl` to retrieve the `verticalpodautoscalercheckpoints.autoscaling.k8s.io` custom resource definition that's installed on clusters at version 1.33.0 or later.
The following sample shows how resource recommendations might appear in the `Status` section for the `hamster` container. The sample also shows a Pod eviction event, which occurs when the updater evicts a Pod before automatically assigning the recommended resource configuration to the recreated Pod:
```
Spec:
  Resource Policy:
    Container Policies:
      Container Name:  *
      Controlled Resources:
        cpu
        memory
      Max Allowed:
        Cpu:     1
        Memory:  500Mi
      Min Allowed:
        Cpu:     100m
        Memory:  50Mi
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         hamster
  Update Policy:
    Update Mode:  Auto
Status:
  Conditions:
    Last Transition Time:  2025-08-04T23:53:32Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  hamster
      Lower Bound:
        Cpu:     100m
        Memory:  262144k
      Target:
        Cpu:     587m
        Memory:  262144k
      Uncapped Target:
        Cpu:     587m
        Memory:  262144k
      Upper Bound:
        Cpu:     1
        Memory:  500Mi
Events:
  Type    Reason      Age   From         Message
  ----    ------      ----  ----         -------
  Normal  EvictedPod  49s   vpa-updater  VPA Updater evicted Pod hamster-7cb59fb657-lkrk4 to apply resource recommendation.
```
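Because the `Auto` and `Recreate` modes apply recommendations by evicting Pods, a PodDisruptionBudget can bound how many replicas the updater evicts at once. The following sketch assumes the `hamster` Deployment labels its Pods with `app: hamster`; adjust the selector to match your workload's actual labels:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: hamster-pdb
spec:
  minAvailable: 1        # keep at least one replica running during evictions
  selector:
    matchLabels:
      app: hamster       # assumption: the Deployment's Pod label
```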
Memory saver mode
Memory saver mode reduces the memory footprint of the vertical Pod autoscaling recommender component. When you set `enableMemorySaver` to `true`, the recommender only tracks and computes aggregations for Pods that have a matching `VerticalPodAutoscaler` custom resource.
The trade-off is that when you create a new `VerticalPodAutoscaler` custom resource for an existing workload, the recommender takes some time (up to 24 hours) to gather sufficient history to provide accurate recommendations. This mode is `false` by default for most cluster types, but defaults to `true` for edge clusters.
Disable vertical Pod autoscaling
Disable vertical Pod autoscaling by removing its custom resources and configuration from your cluster:

1. Delete any `VerticalPodAutoscaler` custom resources you have created.
1. Modify the Cluster custom resource and remove the entire `verticalPodAutoscaling` section from the `spec`. You can edit the Cluster custom resource directly, or modify the cluster configuration file and use `bmctl update`.
1. Remove the `preview.baremetal.cluster.gke.io/vertical-pod-autoscaler` annotation from the Cluster custom resource.
Limitations
Consider the following limitations when using vertical Pod autoscaling:
- Vertical Pod autoscaling isn't ready for use with JVM-based workloads due to limited visibility into actual memory usage of the workload.
- The updater requires a minimum of two Pod replicas for Deployments to replace Pods with revised resource values.
- The updater doesn't quickly update Pods that are crash-looping due to Out-Of-Memory (OOM) errors.
- The `InPlaceOrRecreate` update policy for Pods is an alpha feature within vertical Pod autoscaling. It attempts best-effort in-place updates, but might fall back to recreating the Pod if in-place updates aren't possible.
What's next
- Explore Pod Disruption Budgets.