This page shows you how to autoscale your Standard Google Kubernetes Engine (GKE) clusters. To learn about how the cluster autoscaler works, refer to Cluster autoscaler.
With Autopilot clusters, you don't need to worry about provisioning nodes or managing node pools because node pools are automatically provisioned through node auto-provisioning, and are automatically scaled to meet the requirements of your workloads.
Using the cluster autoscaler
The following sections explain how to use cluster autoscaler.
Creating a cluster with autoscaling
You can create a cluster with autoscaling enabled using the Google Cloud CLI or the Google Cloud console.
gcloud
To create a cluster with autoscaling enabled, use the --enable-autoscaling flag and specify --min-nodes and --max-nodes:
gcloud container clusters create CLUSTER_NAME \
--enable-autoscaling \
--num-nodes NUM_NODES \
--min-nodes MIN_NODES \
--max-nodes MAX_NODES \
--region=COMPUTE_REGION
Replace the following:
- CLUSTER_NAME: the name of the cluster to create.
- NUM_NODES: the number of nodes to create in each location.
- MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool per zone. To specify the minimum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-min-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
- MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool per zone. To specify the maximum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-max-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
- COMPUTE_REGION: the Compute Engine region for the new cluster. For zonal clusters, use --zone=COMPUTE_ZONE.
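Because the per-zone limits and the total limits can't be combined, a wrapper script could check its arguments before it calls gcloud. The following is a minimal sketch of such a check. The check_limit_flags function is a hypothetical helper, not part of the gcloud CLI, and it only inspects flag names; gcloud is still expected to do its own validation.

```shell
# Hypothetical pre-flight check: reject argument lists that mix
# per-zone limits (--min-nodes/--max-nodes) with total limits
# (--total-min-nodes/--total-max-nodes).
check_limit_flags() {
  per_zone=0
  total=0
  for arg in "$@"; do
    case "$arg" in
      --min-nodes|--min-nodes=*|--max-nodes|--max-nodes=*)
        per_zone=1 ;;
      --total-min-nodes|--total-min-nodes=*|--total-max-nodes|--total-max-nodes=*)
        total=1 ;;
    esac
  done
  if [ "$per_zone" -eq 1 ] && [ "$total" -eq 1 ]; then
    echo "error: per-zone and total node limits are mutually exclusive" >&2
    return 1
  fi
  return 0
}

# Usage: succeeds for a consistent set of flags.
check_limit_flags --enable-autoscaling --min-nodes=1 --max-nodes=5 && echo "flags look consistent"
```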
Example: Creating a cluster with node autoscaling enabled and min and max nodes
The following command creates a cluster with 90 nodes, or 30 nodes in each of the 3 zones present in the region. Node autoscaling is enabled and resizes the number of nodes based on cluster load. The cluster autoscaler can reduce the size of the default node pool to 15 nodes or increase the node pool to a maximum of 50 nodes per zone.
gcloud container clusters create my-cluster --enable-autoscaling \
--num-nodes=30 \
--min-nodes=15 --max-nodes=50 \
--region=us-central1
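To make the per-zone arithmetic in this example explicit, the totals can be worked out with a short shell sketch. The total_nodes function and the ZONES variable are hypothetical names used only for illustration; the zone count of 3 comes from the example description.

```shell
# Hypothetical helper: multiply a per-zone node count by the number of zones.
total_nodes() {
  per_zone="$1"
  zones="$2"
  echo $(( per_zone * zones ))
}

ZONES=3   # zones in the region, as stated in the example

echo "initial nodes: $(total_nodes 30 "$ZONES")"   # from --num-nodes=30
echo "minimum nodes: $(total_nodes 15 "$ZONES")"   # from --min-nodes=15
echo "maximum nodes: $(total_nodes 50 "$ZONES")"   # from --max-nodes=50
```

With three zones, the per-zone limits of 15 and 50 correspond to a cluster-wide range of 45 to 150 nodes for the default node pool.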
Example: Creating a cluster with node autoscaling enabled and total nodes
The following command creates a cluster with 30 nodes, or 10 nodes in each of the 3 zones present in the region. Node autoscaling is enabled and resizes the number of nodes based on cluster load. In this example, the total size of the cluster can be between 10 and 60 nodes, regardless of how the nodes are spread across zones.
gcloud container clusters create my-cluster --enable-autoscaling \
--num-nodes 10 \
--region us-central1 \
--total-min-nodes 10 --total-max-nodes 60
Console
To create a new cluster in which the default node pool has autoscaling enabled:
Go to the Google Kubernetes Engine page in the Google Cloud console.
Click Create.
Configure your cluster as desired.
From the navigation pane, under Node Pools, click default-pool.
Select the Enable autoscaling checkbox.
Change the values of the Minimum number of nodes and Maximum number of nodes fields as desired.
Click Create.
Adding a node pool with autoscaling
You can create a node pool with autoscaling enabled using the gcloud CLI or the Google Cloud console.
gcloud
To add a node pool with autoscaling to an existing cluster, use the following command:
gcloud container node-pools create POOL_NAME \
--cluster=CLUSTER_NAME \
--enable-autoscaling \
--min-nodes=MIN_NODES \
--max-nodes=MAX_NODES \
--region=COMPUTE_REGION
Replace the following:
- POOL_NAME: the name of the desired node pool.
- CLUSTER_NAME: the name of the cluster in which the node pool is created.
- MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool per zone. To specify the minimum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-min-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
- MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool per zone. To specify the maximum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-max-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
- COMPUTE_REGION: the Compute Engine region of the cluster. For zonal clusters, use --zone=COMPUTE_ZONE.
Example: Adding a node pool with node autoscaling enabled
The following command creates a node pool with node autoscaling that scales the node pool to a maximum of 5 nodes and a minimum of 1 node:
gcloud container node-pools create my-node-pool \
--cluster my-cluster \
--enable-autoscaling \
--min-nodes 1 --max-nodes 5 \
--zone us-central1-c
Console
To add a node pool with autoscaling to an existing cluster:
Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to modify.
Click Add Node Pool.
Configure the node pool as desired.
Under Size, select the Enable autoscaling checkbox.
Change the values of the Minimum number of nodes and Maximum number of nodes fields as desired.
Click Create.
Enabling autoscaling for an existing node pool
You can enable autoscaling for an existing node pool using the gcloud CLI or the Google Cloud console.
gcloud
To enable autoscaling for an existing node pool, use the following command:
gcloud container clusters update CLUSTER_NAME \
--enable-autoscaling \
--node-pool=POOL_NAME \
--min-nodes=MIN_NODES \
--max-nodes=MAX_NODES \
--region=COMPUTE_REGION
Replace the following:
- CLUSTER_NAME: the name of the cluster to update.
- POOL_NAME: the name of the desired node pool. If you have only one node pool, supply default-pool as the value.
- MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool per zone. To specify the minimum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-min-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
- MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool per zone. To specify the maximum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-max-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
- COMPUTE_REGION: the Compute Engine region of the cluster. For zonal clusters, use --zone=COMPUTE_ZONE.
Console
To enable autoscaling for an existing node pool:
Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to modify.
Click the Nodes tab.
Under Node Pools, click the name of the node pool you want to modify, then click Edit.
Under Size, select the Enable autoscaling checkbox.
Change the values of the Minimum number of nodes and Maximum number of nodes fields as desired.
Click Save.
Verifying that autoscaling for the existing node pool is enabled
You can verify that autoscaling is enabled for a node pool using the Google Cloud CLI or the Google Cloud console.
gcloud
Describe the node pool:
gcloud container node-pools describe POOL_NAME --cluster=CLUSTER_NAME | grep -A 1 autoscaling
Replace the following:
- POOL_NAME: the name of the node pool to verify.
- CLUSTER_NAME: the name of the cluster.
If autoscaling is enabled, the output is similar to the following:
autoscaling:
  enabled: true
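If you want to use this check in a script, the same grep-based test can be wrapped in a small function that returns an exit status. This is a sketch only: autoscaling_enabled is a hypothetical helper name, and it assumes the describe output has the shape shown above, with the enabled field on the line directly after autoscaling:.

```shell
# Hypothetical helper: read `node-pools describe` output on stdin and
# succeed only if the line after "autoscaling:" reports "enabled: true".
autoscaling_enabled() {
  grep -A 1 '^autoscaling:' | grep -q 'enabled: true'
}

# Demonstration with sample text in place of real gcloud output.
printf 'autoscaling:\n  enabled: true\n' | autoscaling_enabled && echo "autoscaling is enabled"

# With real output, the call might look like this:
#   gcloud container node-pools describe POOL_NAME --cluster=CLUSTER_NAME | autoscaling_enabled
```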
Console
Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to verify.
Click the Nodes tab.
Under Node Pools, verify the node pool's Autoscaling state.
Creating a node pool that prioritizes optimization of unused reservations
You can use the --location-policy=ANY flag when you create a node pool to instruct the cluster autoscaler to prioritize utilization of unused reservations:
gcloud container node-pools create POOL_NAME \
--cluster=CLUSTER_NAME \
--location-policy=ANY
Replace the following:
- POOL_NAME: the name of the new node pool that you choose.
- CLUSTER_NAME: the name of the cluster.
Disabling autoscaling for an existing node pool
You can disable autoscaling for an existing node pool using the gcloud CLI or the Google Cloud console.
gcloud
To disable autoscaling for a specific node pool, use the --no-enable-autoscaling flag:
gcloud container clusters update CLUSTER_NAME \
--no-enable-autoscaling \
--node-pool=POOL_NAME \
--region=COMPUTE_REGION
Replace the following:
- CLUSTER_NAME: the name of the cluster to update.
- POOL_NAME: the name of the desired node pool.
- COMPUTE_REGION: the Compute Engine region of the cluster. For zonal clusters, use --zone=COMPUTE_ZONE.
The node pool size is then fixed at its current size, which you can update manually.
Console
To disable autoscaling for a specific node pool:
Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to modify.
Click the Nodes tab.
Under Node Pools, click the name of the node pool you want to modify, then click Edit.
Under Size, clear the Enable autoscaling checkbox.
Click Save.
Resizing a node pool
For clusters with autoscaling enabled, the cluster autoscaler automatically resizes node pools within the boundaries specified by either the minimum size (--min-nodes) and maximum size (--max-nodes) values or the minimum total size (--total-min-nodes) and maximum total size (--total-max-nodes). These two pairs of flags are mutually exclusive. You cannot manually resize a node pool by changing these values.
If you want to manually resize a node pool in your cluster that has autoscaling enabled, perform the following:
- Disable autoscaling on the node pool.
- Manually resize the cluster.
- Re-enable autoscaling and specify the minimum and maximum node pool size.
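The three steps above can be sketched as a dry-run script that prints the commands rather than running them, so you can review them first. The print_resize_plan function and its argument order are hypothetical. The update commands mirror the ones shown earlier on this page, and the middle step assumes the gcloud container clusters resize command with its --node-pool and --num-nodes flags.

```shell
# Hypothetical dry-run helper: print the disable, resize, and re-enable
# commands for one node pool instead of executing them.
# Arguments: cluster, node pool, target size, min nodes, max nodes, region.
print_resize_plan() {
  cluster="$1"; pool="$2"; size="$3"; min="$4"; max="$5"; region="$6"
  echo "gcloud container clusters update $cluster --no-enable-autoscaling --node-pool=$pool --region=$region"
  echo "gcloud container clusters resize $cluster --node-pool=$pool --num-nodes=$size --region=$region"
  echo "gcloud container clusters update $cluster --enable-autoscaling --node-pool=$pool --min-nodes=$min --max-nodes=$max --region=$region"
}

# Example values are placeholders for illustration.
print_resize_plan my-cluster my-node-pool 4 1 5 us-central1
```

Printing the plan keeps the sketch safe to run anywhere. To apply it, you would run the printed commands in order after checking the values.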
Preventing Pods from scheduling on selected nodes
You can use startup or status taints to prevent Pods from scheduling on selected nodes, depending on the use case.
This feature is available in GKE version 1.28 and later.
Startup taints
Use startup taints when there is an operation that has to complete before any Pods can run on the node. For example, Pods shouldn't run until driver installation on the node finishes.
Cluster autoscaler treats nodes tainted with startup taints as unready, but takes them into account during scale-up logic, assuming that they will become ready shortly.
Startup taints are defined as all taints with the prefix startup-taint.cluster-autoscaler.kubernetes.io/.
Status taints
Use status taints when GKE shouldn't use a given node to run Pods.
Cluster autoscaler treats nodes tainted with status taints as ready, but ignores them during scale-up logic. Even though the tainted node is ready, no Pods should run on it. If the Pods need more resources, GKE scales up the cluster and ignores the tainted nodes.
Status taints are defined as all taints with the prefix status-taint.cluster-autoscaler.kubernetes.io/.
Ignore taints
Ignore taints are defined as all taints with the prefix ignore-taint.cluster-autoscaler.kubernetes.io/.
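To illustrate how the three prefixes distinguish taint types, the following sketch classifies a taint key by its prefix. The taint_kind function and the driver-install key suffix are made up for illustration. The commented kubectl lines show one way such a taint might be applied and removed, assuming the standard kubectl taint syntax.

```shell
# Hypothetical helper: classify a taint key by the prefixes listed above.
taint_kind() {
  case "$1" in
    startup-taint.cluster-autoscaler.kubernetes.io/*) echo "startup" ;;
    status-taint.cluster-autoscaler.kubernetes.io/*)  echo "status" ;;
    ignore-taint.cluster-autoscaler.kubernetes.io/*)  echo "ignore" ;;
    *)                                                echo "other" ;;
  esac
}

# Example key: the "driver-install" suffix is an assumption, not a GKE name.
KEY="startup-taint.cluster-autoscaler.kubernetes.io/driver-install"
taint_kind "$KEY"   # prints: startup

# Applying and later removing the taint might look like this:
#   kubectl taint nodes NODE_NAME "$KEY=true:NoSchedule"
#   kubectl taint nodes NODE_NAME "$KEY:NoSchedule-"
```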
Troubleshooting
Check if the issue you are running into is caused by one of the limitations for the cluster autoscaler. Otherwise, see the following troubleshooting information for the cluster autoscaler:
Cluster is not downscaling
After the cluster properly scales up and then attempts to scale down, underutilized nodes remain enabled and prevent the cluster from scaling down. This error occurs for one of the following reasons:
Restrictions can prevent a node from being deleted by the autoscaler. GKE might prevent a node's deletion if the node contains a Pod with any of these conditions:
- The Pod's affinity or anti-affinity rules prevent rescheduling.
- In GKE version 1.21 and earlier, the Pod has local storage.
- The Pod is not managed by a Controller such as a Deployment, StatefulSet, Job or ReplicaSet.
To resolve this issue, set up the cluster autoscaler scheduling and eviction rules on your Pods. For more information, see Pod scheduling and disruption.
System Pods are running on a node. To verify that your nodes are running kube-system Pods, perform the following steps:
Go to the Logs Explorer page in the Google Cloud console.
Click Query builder.
Use the following query to find cluster autoscaler visibility log records for your node pool:
resource.labels.location="CLUSTER_LOCATION"
resource.labels.cluster_name="CLUSTER_NAME"
logName="projects/PROJECT_ID/logs/container.googleapis.com%2Fcluster-autoscaler-visibility"
jsonPayload.noDecisionStatus.noScaleDown.nodes.node.mig.nodepool="NODE_POOL_NAME"
Replace the following:
- CLUSTER_LOCATION: the region your cluster is in.
- CLUSTER_NAME: the name of your cluster.
- PROJECT_ID: the ID of the project in which the cluster is created.
- NODE_POOL_NAME: the name of your node pool.
If there are kube-system Pods running on your node pool, the output includes the following:
"no.scale.down.node.pod.kube.system.unmovable"
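If you prefer the command line to the Logs Explorer, the same filter can be assembled in a script and passed to gcloud logging read. The sketch below only builds the filter string. The autoscaler_filter function is a hypothetical helper, the argument values are placeholders, and the clauses are joined with an explicit AND so that the result fits on one line.

```shell
# Hypothetical helper: build the cluster autoscaler visibility log filter.
# Arguments: cluster location, cluster name, project ID, node pool name.
autoscaler_filter() {
  printf '%s AND %s AND %s AND %s\n' \
    "resource.labels.location=\"$1\"" \
    "resource.labels.cluster_name=\"$2\"" \
    "logName=\"projects/$3/logs/container.googleapis.com%2Fcluster-autoscaler-visibility\"" \
    "jsonPayload.noDecisionStatus.noScaleDown.nodes.node.mig.nodepool=\"$4\""
}

# Print the filter for placeholder values.
autoscaler_filter us-central1 my-cluster my-project default-pool

# Reading matching entries might then look like this:
#   gcloud logging read "$(autoscaler_filter us-central1 my-cluster my-project default-pool)" \
#     --project=my-project --limit=20
```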
To resolve this issue, do one of the following:
- Add a PodDisruptionBudget for the kube-system Pods. For more information about manually adding a PodDisruptionBudget for the kube-system Pods, see the Kubernetes cluster autoscaler FAQ.
- Use a combination of node pool taints and tolerations to separate kube-system Pods from your application Pods. For more information, see node auto-provisioning in GKE.
Node pool size mismatching
The following issue can occur when you configure node pool size:
- An existing node pool is smaller than the minimum number of nodes that you specified for the cluster.
The following list describes the common causes of this behavior:
- You specified a new minimum number of nodes that is higher than the existing number of nodes.
- You manually scaled down the node pool or the underlying managed instance group to fewer nodes than the minimum number of nodes.
- Spot VMs in the node pool were preempted.
Scale-down can also be blocked by Pods in either of the following situations:
- The Pod has local storage and the GKE control plane version is lower than 1.22. In GKE clusters with control plane version 1.22 or later, Pods with local storage no longer block scaling down.
- The Pod has the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict": "false".
For more troubleshooting steps during scale-down events, refer to Cluster not scaling down.
When scaling down, the cluster autoscaler respects the Pod termination grace period, up to a maximum of 10 minutes. After 10 minutes, Pods are forcefully terminated.
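As a small worked example of the 10-minute cap, the following sketch computes how long a Pod would be given during scale-down for a configured grace period. The function and variable names are hypothetical, and the 600-second value is the 10-minute maximum stated above.

```shell
# Maximum grace period honored during scale-down: 10 minutes, in seconds.
MAX_GRACE_SECONDS=600

# Hypothetical helper: cap a Pod's termination grace period (in seconds)
# at the scale-down maximum.
effective_grace_seconds() {
  if [ "$1" -gt "$MAX_GRACE_SECONDS" ]; then
    echo "$MAX_GRACE_SECONDS"
  else
    echo "$1"
  fi
}

effective_grace_seconds 30     # prints 30: shorter than the cap, honored in full
effective_grace_seconds 1800   # prints 600: longer than the cap, cut off at 10 minutes
```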
A node pool can be smaller than the minimum number of nodes that you specified for the cluster because the cluster autoscaler uses the minimum number of nodes parameter only when it needs to decide whether to scale down.
To resolve this issue, manually increase the node pool size to at least the minimum number of nodes. For more information, see how to manually resize a cluster.
For more information about the cluster autoscaler and preventing disruptions, see the following questions in the Kubernetes cluster autoscaler FAQ:
- How does scale-down work?
- Does the cluster autoscaler work with PodDisruptionBudget in scale-down?
- What types of Pods can prevent the cluster autoscaler from removing a node?