Persistent volume attach limits for GKE nodes


This document helps you understand how persistent volume attach limits for Compute Engine Persistent Disks and Hyperdisks work on Google Kubernetes Engine (GKE) nodes. Understanding the maximum number of persistent volumes that can be attached to a GKE node is crucial for workload scheduling and node pool sizing. If you want greater control over workload scheduling, especially when you use multiple disk types with varying attach limits on a single instance, you can use a node label to override the default attach limit.

This document is for Storage specialists who create and allocate storage, and GKE administrators who manage workload scheduling and node pool sizing. To learn more about common roles and example tasks referenced in Google Cloud content, see Common GKE Enterprise user roles and tasks.

Overview

In GKE, when you request a PersistentVolume (PV) using the Compute Engine Persistent Disk CSI driver (pd.csi.storage.gke.io), a block storage volume is provisioned from the Google Cloud Persistent Disk service.

Google Cloud offers several types of block storage options including Persistent Disks and Hyperdisks, each with different performance characteristics and pricing.
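
For example, the following is a minimal sketch of how such a volume is typically requested: a StorageClass that uses the pd.csi.storage.gke.io provisioner and a PersistentVolumeClaim that references it. The object names and the pd-balanced disk type are illustrative assumptions, not values defined in this document; each volume bound this way counts toward the node's attach limit.

kubectl apply -f - <<EOF
# Illustrative StorageClass backed by the Compute Engine Persistent Disk CSI driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-pd-balanced        # hypothetical name
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-balanced                # assumed disk type; Hyperdisk types can also be used
volumeBindingMode: WaitForFirstConsumer
---
# Illustrative PersistentVolumeClaim that provisions a disk through the StorageClass above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data               # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: example-pd-balanced
  resources:
    requests:
      storage: 100Gi
EOF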

When you enable the Compute Engine Persistent Disk CSI driver (PDCSI) on GKE clusters, the PDCSI driver calculates and reports the per-node persistent volume attach limit to the kubelet. Based on this information, the Kubernetes scheduler makes scheduling decisions to help ensure that it doesn't schedule too many Pods that require persistent volumes on a node that has reached its attachment capacity. If the PDCSI driver reports an inaccurate attach limit, specifically a number higher than the actual limit, Pods fail to schedule and get stuck in the Pending state. This can happen on third generation machine types such as C3, which have different attach limits for Hyperdisks and Persistent Disks.
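
If a Pod is stuck in Pending for this reason, describing it usually surfaces the scheduler's explanation. The following is a minimal sketch that assumes a hypothetical Pod name; the exact event text can vary by GKE version, but attach limit problems typically appear as a message similar to "node(s) exceed max volume count".

kubectl describe pod POD_NAME
# Inspect the Events section at the end of the output for scheduling failures
# related to the node's volume attach capacity.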

Understand persistent volume attach limits

For machine generations older than the fourth, the Compute Engine PDCSI driver sets an aggregate persistent volume attach limit of 128 disks (127 data disks plus one boot disk) across all machine types. The attach limit applies to both Persistent Disk and Hyperdisk volumes combined. For Hyperdisk, the attach limit is determined by the underlying Compute Engine machine type, the number of vCPUs the machine has, and the specific Hyperdisk type.

For example:

  • For fourth generation machine types like C4, the PDCSI driver reports an accurate default attach limit to Kubernetes, calculated based on the node's vCPU count. The reported attach limit typically falls within a range of 8 to 128 persistent volumes.
  • In contrast, for third generation machine types like C3, the PDCSI driver reports the fixed default attach limit of 128 disks to Kubernetes, even though the actual limit can be lower based on the vCPU count. This mismatch can lead to Pod scheduling failures. You can confirm the limit that a node actually reports by using the check shown after this list.
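
To see the attach limit that the PDCSI driver actually reported for a particular node, you can read the driver's allocatable volume count from the node's CSINode object. This is a minimal sketch; replace NODE_NAME with the name of one of your nodes:

# Prints the number of volumes that the PD CSI driver reports as attachable on the node.
kubectl get csinode NODE_NAME \
    -o jsonpath='{.spec.drivers[?(@.name=="pd.csi.storage.gke.io")].allocatable.count}'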

You can override the default attach limit by using a node label.

For the attach limits of specific machine types and Hyperdisk types, refer to the Compute Engine documentation.

Override the default persistent volume attach limit

If you have node configurations or workload requirements where you want to attach a specific number of persistent volumes to your nodes, you can override the default persistent volume attach limit for a node pool by using the following node label: node-restriction.kubernetes.io/gke-volume-attach-limit-override: VALUE.

You can use this node label on the following GKE versions:

  • 1.32.4-gke.1698000 and later.
  • 1.33.1-gke.1386000 and later.

New node pool

To create a new node pool with a specific persistent volume attach limit, run the following command:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-labels=node-restriction.kubernetes.io/gke-volume-attach-limit-override=VALUE
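
For example, the following sketch creates a hypothetical node pool whose nodes advertise at most 24 attachable persistent volumes. The node pool name, cluster name, and limit value are placeholders chosen for illustration:

gcloud container node-pools create example-pool \
    --cluster=example-cluster \
    --node-labels=node-restriction.kubernetes.io/gke-volume-attach-limit-override=24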

Existing node pool

To modify the current persistent volume attach limit of an existing node pool, run the following command:

gcloud container node-pools update NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-labels=node-restriction.kubernetes.io/gke-volume-attach-limit-override=VALUE

Replace the following:

  • NODE_POOL_NAME: the name of the node pool you want to create or update.
  • CLUSTER_NAME: the name of the cluster for the node pool you want to create or update.
  • VALUE: an integer between 0 and 127 to specify the new number of persistent volumes that can be attached. If you specify a value higher than 127, the node label is ignored and the PDCSI driver uses the default persistent volume attach limit instead. The default limit is 128 for third generation machines and a vCPU count-based value for fourth generation machines.

Verify the override

To verify that the override was applied correctly, check the node labels and the node capacity.

In the following commands, replace NODE_NAME with the name of a node that is part of the specific node pool where you applied the override node label.

  1. Check the node labels:

    kubectl get node NODE_NAME --show-labels
    

    The output should include the label node-restriction.kubernetes.io/gke-volume-attach-limit-override.

  2. Check the node capacity:

    kubectl describe node NODE_NAME
    

    The output should include the attachable-volumes-gce-pd capacity, which should match the override value that you set for the node pool. For more information, see Check allocatable resources on a node. For a quicker, scriptable check, see the one-line example after these steps.
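
If you prefer a single scriptable check instead of reading the full node description, the following sketch prints the reported capacity directly. It assumes the same NODE_NAME placeholder as the steps above:

# Prints only the attachable-volumes-gce-pd capacity value for the node.
kubectl get node NODE_NAME \
    -o jsonpath='{.status.capacity.attachable-volumes-gce-pd}'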

What's next