Configure vertical Pod autoscaling

Vertical Pod autoscaling automates setting CPU and memory resource requests and limits for containers within Kubernetes Pods. Vertical Pod autoscaling analyzes historical and current resource usage to provide recommendations, which it can either display or automatically apply by updating Pods. This feature improves stability and cost efficiency by right-sizing resource allocations.

Before you begin

Before you configure vertical Pod autoscaling, ensure that you meet the following prerequisites:

  • You have a running bare metal cluster.
  • You have kubectl access to the cluster.
  • Metrics Server is available in the cluster. Bare metal clusters include Metrics Server by default.
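
To check that Metrics Server is serving the resource metrics API, you can run a quick read against it (this assumes kubectl access with your cluster kubeconfig):

    kubectl top nodes --kubeconfig KUBECONFIG

If the command returns CPU and memory usage for each node, the metrics pipeline that vertical Pod autoscaling depends on is available.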

Enable vertical Pod autoscaling

Enable vertical Pod autoscaling on your bare metal cluster by setting a preview annotation and configuring the cluster specification:

  1. Add or update the preview annotation on the Cluster custom resource.

    Edit the Cluster custom resource directly or modify the cluster configuration file and use bmctl update.

    metadata:
      annotations:
        preview.baremetal.cluster.gke.io/vertical-pod-autoscaler: enable
    
  2. Modify the spec of the Cluster custom resource to include the verticalPodAutoscaling field and specify the enableUpdater and enableMemorySaver modes:

    apiVersion: baremetal.cluster.gke.io/v1
    kind: Cluster
    metadata:
      name: cluster1
      namespace: cluster-cluster1
      annotations:
        preview.baremetal.cluster.gke.io/vertical-pod-autoscaler: enable
    spec:
      # ... other cluster spec fields
      verticalPodAutoscaling:
        enableUpdater: true       # Set to true for automated updates
        enableMemorySaver: true   # Set to true to reduce recommender memory usage
    
  3. If you modified the cluster configuration file, apply the changes using the following command:

    bmctl update cluster -c CLUSTER_NAME --kubeconfig KUBECONFIG
    

    Replace the following:

    • CLUSTER_NAME: the name of your cluster.

    • KUBECONFIG: the path of your cluster kubeconfig file.

Create a VerticalPodAutoscaler custom resource

After enabling vertical Pod autoscaling on your cluster, define a VerticalPodAutoscaler custom resource to target specific workloads:

  1. Define a VerticalPodAutoscaler resource in the same namespace as the target workload.

    This custom resource specifies which Pods it targets using targetRef and any resource policies.

    apiVersion: "autoscaling.k8s.io/v1"
    kind: VerticalPodAutoscaler
    metadata:
      name: hamster-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind: Deployment
        name: hamster
      resourcePolicy:
        containerPolicies:
          - containerName: '*'
            minAllowed:
              cpu: 100m
              memory: 50Mi
            maxAllowed:
              cpu: 1
              memory: 500Mi
            controlledResources: ["cpu", "memory"]
    
  2. Apply the VerticalPodAutoscaler manifest using the following command:

    kubectl apply -f VPA_MANIFEST \
        --kubeconfig KUBECONFIG
    

    Replace the following:

    • VPA_MANIFEST: the path of the VerticalPodAutoscaler manifest file.

    • KUBECONFIG: the path of the cluster kubeconfig file.
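
To confirm that the resource was created and that a recommendation is eventually provided, you can list VerticalPodAutoscaler objects in the workload's namespace (NAMESPACE is a placeholder for that namespace):

    kubectl get vpa -n NAMESPACE --kubeconfig KUBECONFIG

The vpa short name is defined by the VerticalPodAutoscaler custom resource definition; the output indicates whether a recommendation has been provided for each object.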

Understand vertical Pod autoscaling modes

Vertical Pod autoscaling operates in different modes that control how it applies resource recommendations.

Recommendation mode

In recommendation mode, vertical Pod autoscaling installs the recommender component. This component analyzes resource usage and publishes recommended values for CPU and memory requests and limits in the status section of the VerticalPodAutoscaler custom resources you create.

To view resource requests and limits recommendations, use the following command:

kubectl describe vpa VPA_NAME \
    --kubeconfig KUBECONFIG \
    -n CLUSTER_NAMESPACE

Replace the following:

• VPA_NAME: the name of the VerticalPodAutoscaler that's targeting the workloads for which you are considering resource adjustments.

• KUBECONFIG: the path of the cluster kubeconfig file.

• CLUSTER_NAMESPACE: the namespace that contains the VerticalPodAutoscaler custom resource.

The response should contain a Status section that's similar to the following sample:

Status:
  Conditions:
    Last Transition Time:  2025-08-04T23:53:32Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  hamster
      Lower Bound:
        Cpu:     100m
        Memory:  262144k
      Target:
        Cpu:     587m
        Memory:  262144k
      Uncapped Target:
        Cpu:     587m
        Memory:  262144k
      Upper Bound:
        Cpu:     1
        Memory:  500Mi

Pods aren't automatically updated in this mode. Use these recommendations to manually update your Pod configurations. This is the default behavior if enableUpdater isn't set or is false.
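
If you want the recommendation in machine-readable form instead of the kubectl describe output, the same values are available under status.recommendation. For example, the following command prints the target values for the first container (field paths follow the autoscaling.k8s.io/v1 schema):

kubectl get vpa VPA_NAME \
    --kubeconfig KUBECONFIG \
    -n CLUSTER_NAMESPACE \
    -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'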

Automated update mode

When you set enableUpdater to true, bare metal lifecycle controllers deploy the vertical Pod autoscaling updater and admission controller components in addition to the recommender. The updater monitors for Pods whose current resource requests deviate significantly from the recommendations.

The update policy in the VerticalPodAutoscaler resource specifies how the updater applies the recommendations. By default, the update mode is Auto, which means the updater assigns updated resource settings on Pod creation. The following VerticalPodAutoscaler sample shows how to set the update mode to Initial:

apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Initial"
  resourcePolicy:
    ...

The updater supports the following five modes:

  • Auto: The updater evicts the Pod. The admission controller intercepts the creation request for the new Pod and modifies it to use the recommended CPU and memory values provided by the recommender. Updating resources requires recreating the Pod, which can cause disruptions. Use Pod Disruption Budgets, which the updater honors, to manage the eviction process. This mode is equivalent to Recreate.

  • Recreate: The updater evicts Pods and assigns recommended resource requests and limits when the Pod is recreated.

  • InPlaceOrRecreate (alpha): The updater attempts best-effort in-place updates, but may fall back to recreating the Pod if in-place updates aren't possible. For more information, see the in-place pod resize documentation.

  • Initial: The updater only assigns resource requests on Pod creation and never changes them later.

  • Off: The updater doesn't automatically change the resource requirements of the Pods. The recommendations are calculated and can be inspected in the VerticalPodAutoscaler object.

For more information about the VerticalPodAutoscaler custom resource, use kubectl to retrieve the verticalpodautoscalercheckpoints.autoscaling.k8s.io custom resource definition, which is installed on clusters at version 1.33.0 or later.

The following sample shows how resource recommendations might appear in the Status section for the hamster container. The sample also shows an example of a Pod eviction event, which occurs when the updater evicts a Pod prior to automatically assigning the recommended resource configuration to the recreated Pod:

Spec:
  Resource Policy:
    Container Policies:
      Container Name:  *
      Controlled Resources:
        cpu
        memory
      Max Allowed:
        Cpu:     1
        Memory:  500Mi
      Min Allowed:
        Cpu:     100m
        Memory:  50Mi
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         hamster
  Update Policy:
    Update Mode:  Auto
Status:
  Conditions:
    Last Transition Time:  2025-08-04T23:53:32Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  hamster
      Lower Bound:
        Cpu:     100m
        Memory:  262144k
      Target:
        Cpu:     587m
        Memory:  262144k
      Uncapped Target:
        Cpu:     587m
        Memory:  262144k
      Upper Bound:
        Cpu:     1
        Memory:  500Mi
Events:
  Type    Reason      Age   From         Message
  ----    ------      ----  ----         -------
  Normal  EvictedPod  49s   vpa-updater  VPA Updater evicted Pod hamster-7cb59fb657-lkrk4 to apply resource recommendation.
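
To list eviction events like the one in this sample across a namespace, you can filter events by the reason string shown above (NAMESPACE is a placeholder for the workload's namespace):

kubectl get events -n NAMESPACE \
    --kubeconfig KUBECONFIG \
    --field-selector reason=EvictedPod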

Memory saver mode

Memory saver mode reduces the memory footprint of the vertical Pod autoscaling recommender component. When you set enableMemorySaver to true, the recommender only tracks and computes aggregations for Pods that have a matching VerticalPodAutoscaler custom resource.

The trade-off is that when you create a new VerticalPodAutoscaler custom resource for an existing workload, the recommender takes some time (up to 24 hours) to gather sufficient history to provide accurate recommendations. This mode is false by default for most cluster types, but defaults to true for edge clusters.
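
For example, a cluster spec that enables memory saver mode while keeping recommendation-only behavior might look like the following sketch (enableUpdater is left false, so Pods aren't evicted):

spec:
  verticalPodAutoscaling:
    enableUpdater: false      # recommendation mode only
    enableMemorySaver: true   # track only Pods with a matching VerticalPodAutoscaler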

Disable vertical Pod autoscaling

Disable vertical Pod autoscaling by removing its custom resources and configuration from your cluster:

  1. Delete any VerticalPodAutoscaler custom resources you have created.

  2. Modify the Cluster custom resource and remove the entire verticalPodAutoscaling section from the spec.

    You can edit the Cluster custom resource directly or modify the cluster configuration file and use bmctl update.

  3. Remove the preview.baremetal.cluster.gke.io/vertical-pod-autoscaler annotation from the Cluster custom resource.
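
As a sketch, steps 1 and 3 can be performed with kubectl; the trailing minus sign on the annotation key removes the annotation, and NAMESPACE, CLUSTER_NAME, and CLUSTER_NAMESPACE are placeholders:

# Delete all VerticalPodAutoscaler resources in a workload namespace
kubectl delete vpa --all -n NAMESPACE --kubeconfig KUBECONFIG

# Remove the preview annotation from the Cluster custom resource
kubectl annotate cluster CLUSTER_NAME -n CLUSTER_NAMESPACE \
    preview.baremetal.cluster.gke.io/vertical-pod-autoscaler- \
    --kubeconfig KUBECONFIG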

Limitations

Consider the following limitations when using vertical Pod autoscaling:

  • Vertical Pod autoscaling isn't ready for use with JVM-based workloads due to limited visibility into actual memory usage of the workload.
  • The updater requires a minimum of two Pod replicas for Deployments to replace Pods with revised resource values.
  • The updater doesn't quickly update Pods that are crash-looping due to Out-Of-Memory (OOM) errors.
  • The InPlaceOrRecreate update policy for Pods is an alpha feature within vertical Pod autoscaling. It attempts best-effort in-place updates, but may fall back to recreating the Pod if in-place updates aren't possible.

What's next