This document lists the quotas and system limits that apply to Google Kubernetes Engine.
- Quotas specify the amount of a countable, shared resource that you can use. Quotas are defined by Google Cloud services such as Google Kubernetes Engine.
- System limits are fixed values that cannot be changed.
Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.
The Cloud Quotas system does the following:
- Monitors your consumption of Google Cloud products and services
- Restricts your consumption of those resources
- Provides a way to request changes to the quota value
In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.
Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.
To adjust most quotas, use the Google Cloud console. For more information, see Request a quota adjustment.
There are also system limits on GKE resources. System limits can't be changed.
Limits per project
In a single project, you can create a maximum of 100 zonal clusters per zone, plus 100 regional clusters per region.
Note: Clusters created in the Autopilot mode are pre-configured as regional clusters.
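As a rough illustration of how the per-project limits combine, the following Python sketch counts planned clusters per location and reports any location that exceeds the limits quoted above. The input shape (a list of `(kind, location)` tuples) and the function name are hypothetical and chosen only for this example; GKE does not expose such a helper.

```python
from collections import Counter

# Per-project limits quoted in this document.
MAX_ZONAL_CLUSTERS_PER_ZONE = 100
MAX_REGIONAL_CLUSTERS_PER_REGION = 100


def find_exceeded_locations(clusters):
    """Return {(kind, location): count} for locations over the limit.

    `clusters` is an iterable of (kind, location) tuples, where kind is
    "zonal" or "regional". This input shape is an assumption made for
    the example.
    """
    limits = {
        "zonal": MAX_ZONAL_CLUSTERS_PER_ZONE,
        "regional": MAX_REGIONAL_CLUSTERS_PER_REGION,
    }
    counts = Counter(clusters)
    return {
        key: count
        for key, count in counts.items()
        if count > limits[key[0]]
    }


# Hypothetical plan: 101 zonal clusters in one zone, 100 regional clusters.
planned = [("zonal", "us-central1-a")] * 101 + [("regional", "us-central1")] * 100
print(find_exceeded_locations(planned))
```

In this plan only the zone is reported, because 100 regional clusters in a region is still within the limit.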
Limits per cluster
The following table describes the limits per GKE cluster.
Any GKE versions specified in the following table apply to both cluster nodes and the control plane.
Limits | GKE Standard cluster | GKE Autopilot cluster |
---|---|---|
Nodes per cluster | 65,000 nodes. If you plan to use this limit, consider the following recommendations when designing your GKE architecture: | 5,000 nodes. If you plan to use this limit, consider the following recommendations when designing your GKE architecture: |
Nodes per node pool | 1,000 nodes per zone; 2,000 TPU nodes per zone (requires the following versions or later: 1.28.5-gke.135500, 1.29.1-gke.1206000, 1.30) | Not applicable |
Nodes in a zone | | Not applicable |
Pods per node¹ | 256 Pods. Note: For GKE versions earlier than 1.23.5-gke.1300, the limit is 110 Pods. | Set dynamically to any value between 8 and 256. GKE considers the cluster size and the number of workloads to provision the maximum Pods per node. |
Pods per cluster² | 200,000 Pods¹ | 200,000 Pods |
Containers per cluster | 400,000 containers | 400,000 containers |
Etcd database size | 6 GB | 6 GB |
If you are a platform administrator, we recommend that you familiarize yourself with how quotas affect large workloads that run on GKE. For additional recommendations, best practices, limits, and quotas for large workloads, see Guidelines for creating scalable clusters.
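For capacity planning, the per-cluster limits can be encoded as a simple lookup. The following Python sketch is only an illustration: the function name and the shape of the plan are hypothetical, and the numbers are copied from the table above. Passing this check is not a guarantee that a cluster will scale to the planned size, because you might not reach every limit at the same time.

```python
# Per-cluster limits copied from the table in this document.
LIMITS = {
    "standard": {"nodes": 65_000, "pods": 200_000, "containers": 400_000},
    "autopilot": {"nodes": 5_000, "pods": 200_000, "containers": 400_000},
}


def check_cluster_plan(mode, nodes, pods, containers):
    """Return a list of messages, one per planned value over its limit.

    `mode` is "standard" or "autopilot". The function and its
    parameters are hypothetical helpers for this example.
    """
    planned = {"nodes": nodes, "pods": pods, "containers": containers}
    return [
        f"{name}: planned {planned[name]:,} exceeds limit {limit:,}"
        for name, limit in LIMITS[mode].items()
        if planned[name] > limit
    ]


# Hypothetical Autopilot plan with too many nodes.
problems = check_cluster_plan("autopilot", nodes=6_000, pods=150_000, containers=300_000)
print(problems)
```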
Limit for API requests
The default rate limit for the Kubernetes Engine API is 3,000 requests per minute, enforced at 100-second intervals.
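One way to reason about this limit is to convert the per-minute rate into a budget for each enforcement interval. The following Python sketch assumes that the per-minute rate scales linearly to the 100-second interval; that reading is an assumption made for this example, not a statement of how the API meters requests. The function names are hypothetical.

```python
REQUESTS_PER_MINUTE = 3000        # default rate limit quoted above
ENFORCEMENT_WINDOW_SECONDS = 100  # enforcement interval quoted above


def requests_per_window(rate_per_minute=REQUESTS_PER_MINUTE,
                        window_seconds=ENFORCEMENT_WINDOW_SECONDS):
    """Requests allowed in one window, assuming linear scaling."""
    return rate_per_minute * window_seconds // 60


def min_spacing_seconds(rate_per_minute=REQUESTS_PER_MINUTE):
    """Delay between requests that keeps an evenly paced client at the rate."""
    return 60 / rate_per_minute


print(requests_per_window())   # budget per 100-second window
print(min_spacing_seconds())   # seconds between evenly spaced requests
```

Under that assumption, a client that spaces its calls at least 0.02 seconds apart stays within the budget of 5,000 requests per 100-second interval.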
Resource quotas
For clusters with under 100 nodes, GKE applies Kubernetes resource quota to every namespace. These quotas protect the cluster's control plane from instability caused by potential bugs in applications deployed to the cluster. You cannot remove these quotas because they are enforced by GKE.
GKE automatically updates resource quota values in proportion to the number of nodes. For clusters with over 100 nodes, GKE removes the resource quota.
To examine resource quotas, use the following command:
kubectl get resourcequota gke-resource-quotas -o yaml
To view the values for a given namespace, specify the namespace by adding the --namespace option.
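If you want to process the quota values in a script, you can request JSON output with `-o json` instead of `-o yaml` and compare the `status.used` and `status.hard` fields of the ResourceQuota object. The following Python sketch shows one way to do that. The sample values are made up for illustration, and the count parser handles only plain integers and a `k` suffix, which is a simplifying assumption about the values that appear in the quota.

```python
import json


def parse_count(quantity):
    """Parse a count such as "1500" or "5k".

    Only plain integers and the "k" suffix are handled; this is a
    simplification for the example, not a full Kubernetes quantity parser.
    """
    text = str(quantity)
    if text.endswith("k"):
        return int(text[:-1]) * 1000
    return int(text)


def quota_usage(resource_quota):
    """Map each resource name to a (used, hard) tuple of integers."""
    status = resource_quota.get("status", {})
    hard = status.get("hard", {})
    used = status.get("used", {})
    return {
        name: (parse_count(used.get(name, "0")), parse_count(limit))
        for name, limit in hard.items()
    }


# Hypothetical output of:
#   kubectl get resourcequota gke-resource-quotas -o json
SAMPLE = json.loads("""
{
  "kind": "ResourceQuota",
  "metadata": {"name": "gke-resource-quotas", "namespace": "default"},
  "status": {
    "hard": {"pods": "1500", "services": "500", "count/jobs.batch": "5k"},
    "used": {"pods": "12", "services": "3", "count/jobs.batch": "0"}
  }
}
""")

for name, (used, hard) in quota_usage(SAMPLE).items():
    print(f"{name}: {used} of {hard}")
```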
Check your quota
Console
- In the Google Cloud console, go to the Quotas page. The Quotas page displays a list of quotas that is prefiltered to GKE quotas.
- To search for the exact quota, use the Filter table. If you don't know the name of the quota, you can use the links on the Quotas page.
gcloud
- To check your quotas, run the following command:
gcloud compute project-info describe --project PROJECT_ID
Replace PROJECT_ID with your own project ID.
- To check your used quota in a region, run the following command:
gcloud compute regions describe example-region
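Both commands accept the standard `--format=json` flag, and their output includes a `quotas` list whose entries carry `metric`, `limit`, and `usage` fields. The following Python sketch filters that list for quotas that are close to their limit. The sample data and the 80% threshold are assumptions chosen for this example.

```python
import json


def quotas_near_limit(description, threshold=0.8):
    """Return (metric, usage, limit) tuples at or above the threshold.

    `description` is the parsed JSON output of one of the describe
    commands above. The default threshold is an arbitrary choice.
    """
    near = []
    for quota in description.get("quotas", []):
        limit = quota.get("limit", 0)
        usage = quota.get("usage", 0)
        if limit > 0 and usage / limit >= threshold:
            near.append((quota["metric"], usage, limit))
    return near


# Hypothetical output of:
#   gcloud compute regions describe example-region --format=json
SAMPLE_REGION = json.loads("""
{
  "name": "example-region",
  "quotas": [
    {"metric": "CPUS", "limit": 24.0, "usage": 20.0},
    {"metric": "IN_USE_ADDRESSES", "limit": 8.0, "usage": 1.0}
  ]
}
""")

print(quotas_near_limit(SAMPLE_REGION))
```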
Notes
1. The maximum number of Pods per GKE Standard cluster includes system Pods. The number of system Pods varies depending on cluster configuration and enabled features.
2. The maximum number of Pods that can fit in a node depends on the size of your Pod resource requests and the capacity of the node. You might not reach every limit at the same time. As a best practice, we recommend that you load test large deployments.