Quotas and limits

This document lists the quotas and system limits that apply to Google Kubernetes Engine.

Quotas specify the amount of a countable, shared resource that you can use. Quotas are defined by Google Cloud services such as Google Kubernetes Engine.
System limits are fixed values that cannot be changed.

Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.

The Cloud Quotas system does the following:

Monitors your consumption of Google Cloud products and services
Restricts your consumption of those resources
Provides a way to request changes to the quota value and automate quota adjustments

In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.

Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.

To adjust most quotas, use the Google Cloud console. For more information, see Request a quota adjustment.

There are also system limits on GKE resources. System limits can't be changed.

Quotas per project

GKE has the following quotas:

Zonal clusters per zone
Regional clusters per region
API reads
API writes

Note: Clusters created in Autopilot mode are pre-configured as regional clusters.

Check your quota

To view your quotas, go to the Quotas page in the Google Cloud console.

To manage your quotas and request additional quota, see View and manage quotas.

Limits per cluster

The following table describes the limits per GKE cluster.

Any GKE versions specified in the following table apply to both cluster nodes and the control plane.

Limits	GKE Standard cluster	GKE Autopilot cluster
Nodes per cluster	65,000 nodes. If you plan to use this limit, consider the following recommendations when designing your GKE architecture: If you plan to run more than 2,000 nodes, use a regional cluster. Running more than 7,500 nodes is available only for regional clusters that use Private Service Connect and have GKE Dataplane V2 disabled. Contact support to increase this quota limit.	5,000 nodes. If you plan to use this limit, consider the following recommendations when designing your GKE architecture: If you plan to run more than 1,000 nodes, use GKE Autopilot version 1.23 or newer. Running more than 400 nodes might require lifting a cluster size quota for clusters that were created on earlier versions. Contact support for assistance.
Nodes per node pool	1,000 nodes per zone. 2,000 TPU nodes per zone (requires one of the following versions or newer: 1.28.5-gke.135500, 1.29.1-gke.1206000, 1.30).	Not applicable
Nodes in a zone	No node limitations for container-native load balancing with NEG-based Ingress, which is recommended whenever possible. In GKE versions 1.17 and later, NEG-based Ingress is the default mode. 1,000 nodes if you are using Instance Group-based Ingress.	Not applicable
Pods per node¹	256 Pods. Note: For GKE versions earlier than 1.23.5-gke.1300, the limit is 110 Pods.	Set dynamically to any value between 8 and 256. GKE considers the cluster size and the number of workloads to provision the maximum Pods per node. For GKE versions earlier than 1.28, the limit is 32 Pods. For Accelerator class Pods and Performance class Pods, the limit is one Pod per node.
Pods per cluster²	200,000 Pods¹	200,000 Pods
Containers per cluster	400,000 containers	400,000 containers
Etcd database size	6 GB	6 GB
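
As a rough planning aid, the per-cluster limits in the preceding table can be encoded in a small shell function. This is a minimal sketch, not a GKE tool: the function name check_gke_plan and its arguments are hypothetical, and the numbers are copied from the table rather than queried from GKE, so they can go stale.

```shell
# Hypothetical helper: compares a planned cluster size with the per-cluster
# limits listed in the table. Nothing here calls GKE; the limits are
# hard-coded from the table and are an assumption of this sketch.
# Usage: check_gke_plan <standard|autopilot> <nodes> <pods>
check_gke_plan() {
  mode="$1"; nodes="$2"; pods="$3"
  max_pods=200000            # Pods per cluster, both modes

  case "$mode" in
    standard)  max_nodes=65000 ;;
    autopilot) max_nodes=5000 ;;
    *) echo "unknown mode: $mode" >&2; return 2 ;;
  esac

  if [ "$nodes" -gt "$max_nodes" ]; then
    echo "nodes: $nodes exceeds the $mode limit of $max_nodes"
    return 1
  fi
  if [ "$pods" -gt "$max_pods" ]; then
    echo "pods: $pods exceeds the per-cluster limit of $max_pods"
    return 1
  fi

  # Design recommendations for large Standard clusters, from the table.
  if [ "$mode" = standard ]; then
    if [ "$nodes" -gt 7500 ]; then
      echo "note: above 7,500 nodes, use a regional cluster with Private Service Connect and GKE Dataplane V2 disabled"
    elif [ "$nodes" -gt 2000 ]; then
      echo "note: above 2,000 nodes, use a regional cluster"
    fi
  fi
  echo "ok"
}
```

For example, check_gke_plan standard 3000 50000 prints the regional-cluster note followed by ok, and check_gke_plan autopilot 6000 1000 returns a non-zero status because 6,000 nodes exceeds the Autopilot limit.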

If you are a platform administrator, we recommend that you become familiar with how quotas affect large workloads that run on GKE. For additional recommendations, best practices, limits, and quotas for large workloads, see Guidelines for creating scalable clusters.

Resource quotas

For clusters with fewer than 100 nodes, GKE applies a Kubernetes resource quota to every namespace. These quotas protect the cluster's control plane from instability caused by potential bugs in applications deployed to the cluster. You can't remove these quotas because they are enforced by GKE.

GKE automatically updates resource quota values in proportion to the number of nodes. For clusters with more than 100 nodes, GKE removes the resource quota.

To examine resource quotas, use the following command:

kubectl get resourcequota gke-resource-quotas -o yaml

To view the values for a given namespace, specify the namespace by adding the --namespace option.
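
As an illustration of the --namespace option, the command above can be wrapped in a small shell function. This is a sketch only: gke_quota is a hypothetical name, not a kubectl or GKE command, and it assumes that kubectl is already configured for the cluster.

```shell
# Hypothetical wrapper around the command shown above.
# $1: the namespace whose GKE-managed resource quota you want to inspect.
gke_quota() {
  kubectl get resourcequota gke-resource-quotas -o yaml --namespace "$1"
}
```

For example, gke_quota kube-system shows the quota for the kube-system namespace. On clusters where GKE has removed the resource quota, the command is expected to report that the object is not found.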

Notes

¹ The maximum number of Pods per GKE Standard cluster includes system Pods. The number of system Pods varies depending on cluster configuration and enabled features.
² The maximum number of Pods that can fit in a node depends on the size of your Pod resource requests and the capacity of the node. You might not reach every limit at the same time. As a best practice, we recommend that you load test large deployments.