Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.
The Cloud Quotas system does the following:
- Monitors your consumption of Google Cloud products and services
- Restricts your consumption of those resources
- Provides a way to request changes to the quota value and automate quota adjustments
In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.
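When a task fails because a quota is exhausted, a common client-side response is to retry with exponential backoff. The sketch below is illustrative only: `QuotaExceededError` and `call_with_backoff` are hypothetical names, not part of any Google Cloud library, and the sketch assumes the client surfaces quota failures as a distinct exception (quota errors are commonly reported as HTTP 429).

```python
import random
import time


class QuotaExceededError(Exception):
    """Hypothetical stand-in for a quota-exceeded error (for example, HTTP 429)."""


def call_with_backoff(task, max_attempts=5, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep):
    """Run task(), retrying with exponential backoff and jitter on quota errors."""
    for attempt in range(max_attempts):
        try:
            return task()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Double the delay ceiling on each attempt, capped at max_delay,
            # then sleep for a random duration below that ceiling (full jitter).
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists so that the delay can be replaced in tests. Retrying only helps with rate quotas that replenish over time; it doesn't help when a resource-count quota is already fully used.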
Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.
There are also limits on Vertex AI resources. These limits are unrelated to the quota system. Limits can't be changed.
Request quotas
The following quotas apply to Vertex AI requests for a given project and supported region. For example, in a single project, you can have up to 30,000 online prediction requests per minute in one region and another 30,000 online prediction requests per minute in another supported region.
Request type | Requests per minute |
---|---|
Resource management (CRUD) requests¹ | 600 |
Job or long-running operation (LRO) submission requests | 60 |
Online prediction requests² | 30,000 |
Online prediction request throughput | 1.5 GB |
Online explain requests | 600 |
Vertex AI TensorBoard Time Series read requests | 60,000 |
ML Metadata (CRUD) requests | 12,000 |
Generative AI Caching (CRUD) requests | 200 |
Vertex AI Vizier (CRUD) requests | 6,000 |
Vertex AI Feature Store online serving requests | 300,000 |
Vertex ML Metadata requests | 12,000 |
Number of count tokens or compute tokens requests | 3,000 |
1 Resource management requests include any request that isn't a job, an LRO, an online prediction request, a Vertex AI Vizier request, an ML metadata request, a Vertex AI TensorBoard Timeseries Insights API read request, a Vertex AI Feature Store request, a Vertex AI Feature Store streaming request, or a Vector Search request.
2 This quota applies for public endpoints only.
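Because the request quotas are counted per project and per region, a client can keep a separate budget for each region. The following is a minimal sketch of a sliding-window limiter under that assumption. `RegionRateLimiter` is a hypothetical helper, and the way the service itself measures the one-minute window may differ from this client-side approximation.

```python
import time
from collections import defaultdict, deque

ONLINE_PREDICTION_RPM = 30_000  # per project, per region (from the table above)


class RegionRateLimiter:
    """Client-side sliding-window limiter with one independent budget per region."""

    def __init__(self, limit_per_minute=ONLINE_PREDICTION_RPM, clock=time.monotonic):
        self.limit = limit_per_minute
        self.clock = clock
        self.sent = defaultdict(deque)  # region -> timestamps of recent requests

    def try_acquire(self, region):
        """Return True and record the request if the region still has budget."""
        now = self.clock()
        window = self.sent[region]
        # Forget requests that are at least 60 seconds old.
        while window and now - window[0] >= 60.0:
            window.popleft()
        if len(window) >= self.limit:
            return False
        window.append(now)
        return True
```

A caller would check `try_acquire("us-central1")` before sending each request and delay or shed load when it returns `False`. Traffic to a second region draws on that region's own budget.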
Jobs or LROs include the following requests:
- Create or delete a dataset.
- Import or export data to or from a dataset.
- Create an endpoint.
- Create or delete a custom job.
- Create or delete a data-labeling job.
- Create or delete a hyperparameter tuning job.
- Create or delete a batch-prediction job.
- Create or delete a model.
- Upload, delete, or export a model.
- Create or delete a notebook-runtime template.
- Assign, delete, start, or upgrade a notebook runtime.
- Create, delete, or update a model monitor.
- Create or delete a model monitoring job.
For quota information for Generative AI models, see Generative AI on Vertex AI quotas and limits.
AutoML model quotas
The following quotas apply to each data type and objective for a given project and region. For example, in a particular project and region, you can deploy 10 AutoML image classification models and 10 AutoML image object detection models for a total of 20 deployed models.
Image
Classification
Quota | Value |
---|---|
Concurrent training jobs | 5 |
Concurrent training jobs with explainable AI | 2 |
Concurrent batch prediction jobs | 5 |
Concurrent model deployment jobs | 5 |
Concurrent model undeployment jobs | 5 |
Number of deployed models | 10 |
Object detection
Quota | Value |
---|---|
Concurrent training jobs | 5 |
Concurrent batch prediction jobs | 5 |
Number of deployed models | 10 |
Tabular
Quota | Value |
---|---|
Concurrent training jobs | 5 |
Concurrent batch prediction jobs | 5 |
Number of deployed models | 30 |
Text
Classification
Quota | Value |
---|---|
Concurrent training jobs | 5 |
Concurrent batch prediction jobs | 5 |
Number of deployed models | 10 |
Entity extraction
Quota | Value |
---|---|
Concurrent training jobs | 5 |
Concurrent batch prediction jobs | 5 |
Number of deployed models | 10 |
Sentiment analysis
Quota | Value |
---|---|
Concurrent training jobs | 5 |
Concurrent batch prediction jobs | 5 |
Number of deployed models | 10 |
Video
Action Recognition
Quota | Value |
---|---|
Concurrent training jobs | 5 |
Concurrent batch prediction jobs | 5 |
Classification
Quota | Value |
---|---|
Concurrent training jobs | 5 |
Concurrent batch prediction jobs | 5 |
Object tracking
Quota | Value |
---|---|
Concurrent training jobs | 5 |
Concurrent batch prediction jobs | 5 |
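The deployed-model quotas above are counted separately for each data type and objective. As a rough sketch, the lookup below encodes the "Number of deployed models" values from the tables above. The dictionary layout and the function names are illustrative assumptions, and the video tables list no deployed-model quota, so video is left out.

```python
# "Number of deployed models" per (data type, objective), per project and region.
DEPLOYED_MODEL_QUOTA = {
    ("image", "classification"): 10,
    ("image", "object detection"): 10,
    ("tabular", ""): 30,  # the tabular table lists no separate objectives
    ("text", "classification"): 10,
    ("text", "entity extraction"): 10,
    ("text", "sentiment analysis"): 10,
}


def can_deploy(deployed, data_type, objective=""):
    """Return True if one more model of this kind fits under its quota.

    `deployed` maps (data_type, objective) to the number already deployed.
    """
    key = (data_type, objective)
    return deployed.get(key, 0) < DEPLOYED_MODEL_QUOTA[key]


def max_deployed(data_type):
    """Sum the per-objective quotas for one data type."""
    return sum(v for (dt, _), v in DEPLOYED_MODEL_QUOTA.items() if dt == data_type)
```

For example, `max_deployed("image")` adds the classification and object detection quotas, which gives the total of 20 deployed image models mentioned in the introduction to this section.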
Vertex AI Model Registry
The maximum model size supported by Vertex AI Model Registry is 1 TB.
AutoML model limits
The following limits apply to each data type for a given project and region. For example, in a particular project and region, you can include a maximum of 1,000,000 images in a batch request input.
Image
Type of limit | Value |
---|---|
Image file size | Maximum: 30MB |
Images per dataset | Maximum: 1,000,000 |
Labels per dataset | Minimum: 2; Maximum: 5,000 |
Images per label | Minimum: 10; Recommended: 1,000 |
Batch input CSV file size | Maximum: 100MB |
Number of images in batch input | Maximum: 1,000,000 |
Tabular
Type of limit | Value |
---|---|
Maximum size | 100GB |
Number of rows | Between 1,000 and 200,000,000 rows |
Number of columns | Between 2 and 1,000 columns |
Number of concurrently running dataset imports | 5 imports |
CSV file size | Maximum: 10GB per file, up to maximum total amount of 100GB |
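Checking a dataset against the tabular limits before importing it can avoid a failed import. The function below is a sketch of such a check. Its name and parameters are hypothetical, and it assumes decimal gigabytes because the table doesn't say whether GB means 10^9 or 2^30 bytes.

```python
GB = 10**9  # assumption: decimal gigabytes


def check_tabular_dataset(n_rows, n_cols, csv_sizes_bytes):
    """Return a list of limit violations; an empty list means the dataset fits."""
    problems = []
    if not 1_000 <= n_rows <= 200_000_000:
        problems.append(f"rows: {n_rows} is outside 1,000 to 200,000,000")
    if not 2 <= n_cols <= 1_000:
        problems.append(f"columns: {n_cols} is outside 2 to 1,000")
    # Each CSV file is capped at 10GB, and all files together at 100GB.
    oversized = [size for size in csv_sizes_bytes if size > 10 * GB]
    if oversized:
        problems.append(f"{len(oversized)} CSV file(s) exceed 10GB")
    if sum(csv_sizes_bytes) > 100 * GB:
        problems.append("total CSV size exceeds 100GB")
    return problems
```

Returning every violation at once, rather than raising on the first one, lets a caller report all problems with a dataset in a single pass.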
Text
Character counts assume UTF-8 characters.
Type of limit | Classification | Entity extraction | Sentiment analysis |
---|---|---|---|
Training items | 10 to 1,000,000 | 50 to 100,000 | 10 to 100,000 |
Labels per dataset | 2 to 5,000 | 1 to 100 | 2 to 11 |
Length of label name | 1 to 32 | 1 to 32 | Integer from 0 to 10 |
Length of annotated span | N/A | 1 to 100 characters | N/A |
Training items per label | 10 to 1,000,000 | 100 to 100,000 | 10 to 100,000 |
Training item size | 10MB; 5,000,000 characters | 128 KB (text); 20MB (PDF); 10 to 300,000 characters (text) | 128 KB (text); 2MB (PDF); 60,000 characters |
Item sent for prediction | 128 KB (text); 2MB (PDF); 60,000 characters | 20MB | 128 KB (text); 2MB (PDF); 60,000 characters |
Items per batch request | 10,000 | 10,000 | 10,000 |
Video
Type of limit | Value |
---|---|
Maximum video length | 3 hours |
Maximum video file size | 50GB |
Minimum labels per dataset | 2 |
Minimum videos per label | 10 (1,000 is recommended) |
Batch input CSV file size | Maximum: 100MB |
Number of video segments in batch input | Maximum: 1,000 |
Custom-trained model quotas
The following quotas apply to Vertex AI custom-trained models for a given project and region.
Training
Quota | Value |
---|---|
Concurrent custom training pipelines | 2,000 |
Number of N1 and E2 CPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 2,200 |
us-west2 | 20 |
us-west3 | 2,200 |
us-west4 | 20 |
us-central1 | 2,200 |
us-east1 | 2,200 |
us-east4 | 20 |
us-east5 | 450 |
us-south1 | 450 |
northamerica-northeast1 | 2,200 |
northamerica-northeast2 | 20 |
southamerica-east1 | 20 |
southamerica-west1 | 20 |
europe-west2 | 2,200 |
europe-west1 | 2,200 |
europe-west4 | 2,200 |
europe-west6 | 20 |
europe-west3 | 2,200 |
europe-north1, europe-central2 | 20 |
europe-west8 | 20 |
europe-west9 | 450 |
europe-southwest1, asia-south1 | 20 |
asia-southeast1 | 2,200 |
asia-southeast2 | 2,200 |
asia-east2 | 2,200 |
asia-east1 | 2,200 |
asia-northeast1 | 2,200 |
asia-northeast2 | 20 |
australia-southeast1 | 2,200 |
australia-southeast2 | 20 |
asia-northeast3 | 2,200 |
me-west1 | 450 |
me-central1 | 450 |
me-central2 | 450 |
europe-west12 | 450 |
africa-south1 | 450 |
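Before submitting a custom training job, it can help to estimate whether its worker pools fit under the regional CPU quota. The sketch below copies three rows of the N1 and E2 table above and makes two assumptions: that the quota counts vCPUs, and that the trailing number in a predefined machine type name such as `n1-standard-8` is its vCPU count. The function names are hypothetical.

```python
# Subset of the "N1 and E2 CPUs for training" table above.
N1_E2_TRAINING_CPUS = {"us-central1": 2_200, "us-west2": 20, "us-east5": 450}


def vcpus(machine_type):
    """Read the vCPU count from a name such as "n1-standard-8" (gives 8).

    Raises ValueError for names without a numeric suffix, such as "e2-medium".
    """
    return int(machine_type.rsplit("-", 1)[1])


def fits_quota(region, worker_pools, in_use=0):
    """Return True if the pools fit in the region's remaining CPU quota.

    `worker_pools` is a list of (machine_type, replica_count) pairs and
    `in_use` is the number of CPUs already consumed by running jobs.
    """
    needed = sum(vcpus(machine) * replicas for machine, replicas in worker_pools)
    return in_use + needed <= N1_E2_TRAINING_CPUS[region]
```

With these values, two `n1-standard-8` replicas (16 vCPUs) fit in `us-west2`, but three (24 vCPUs) exceed that region's quota of 20.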
Number of N2 CPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 20 |
us-west2 | 20 |
us-west3 | 20 |
us-west4 | 20 |
us-central1 | 450 |
us-east1 | 20 |
us-east4 | 20 |
us-east5 | 450 |
us-south1 | 20 |
northamerica-northeast1 | 20 |
northamerica-northeast2 | 20 |
southamerica-east1 | 20 |
southamerica-west1 | 20 |
europe-west2 | 20 |
europe-west1 | 20 |
europe-west4 | 450 |
europe-west6 | 20 |
europe-west3 | 20 |
europe-north1, europe-central2 | 20 |
europe-west8 | 20 |
europe-west9 | 450 |
europe-southwest1, asia-south1 | 20 |
asia-southeast1 | 20 |
asia-southeast2 | 20 |
asia-east2 | 20 |
asia-east1 | 450 |
asia-northeast1 | 20 |
asia-northeast2 | 20 |
australia-southeast1 | 20 |
australia-southeast2 | 20 |
asia-northeast3 | 20 |
me-west1 | 20 |
me-central1 | 450 |
me-central2 | 450 |
europe-west12 | 450 |
africa-south1 | 450 |
Number of M1 CPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 0 |
us-west2 | 0 |
us-west3 | 0 |
us-west4 | 0 |
us-central1 | 0 |
us-east1 | 0 |
us-east4 | 0 |
us-east5 | 0 |
us-south1 | 0 |
northamerica-northeast1 | 0 |
northamerica-northeast2 | 0 |
southamerica-east1 | 0 |
southamerica-west1 | 0 |
europe-west2 | 0 |
europe-west1 | 0 |
europe-west4 | 0 |
europe-west6 | 0 |
europe-west3 | 0 |
europe-north1, europe-central2 | 0 |
europe-west8 | 0 |
europe-west9 | 0 |
europe-southwest1, asia-south1 | 0 |
asia-southeast1 | 0 |
asia-southeast2 | 0 |
asia-east2 | 0 |
asia-east1 | 0 |
asia-northeast1 | 0 |
asia-northeast2 | 0 |
australia-southeast1 | 0 |
australia-southeast2 | 0 |
asia-northeast3 | 0 |
me-west1 | 0 |
me-central1 | 0 |
me-central2 | 0 |
europe-west12 | 0 |
africa-south1 | 0 |
Number of C2 CPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 20 |
us-west2 | 20 |
us-west3 | 20 |
us-west4 | 20 |
us-central1 | 450 |
us-east1 | 20 |
us-east4 | 20 |
us-east5 | 450 |
us-south1 | 20 |
northamerica-northeast1 | 20 |
northamerica-northeast2 | 20 |
southamerica-east1 | 20 |
southamerica-west1 | 20 |
europe-west2 | 20 |
europe-west1 | 20 |
europe-west4 | 450 |
europe-west6 | 20 |
europe-west3 | 20 |
europe-north1, europe-central2 | 20 |
europe-west8 | 20 |
europe-west9 | 450 |
europe-southwest1, asia-south1 | 20 |
asia-southeast1 | 20 |
asia-southeast2 | 20 |
asia-east2 | 20 |
asia-east1 | 450 |
asia-northeast1 | 20 |
asia-northeast2 | 20 |
australia-southeast1 | 20 |
australia-southeast2 | 20 |
asia-northeast3 | 20 |
me-west1 | 20 |
me-central1 | 20 |
me-central2 | 20 |
europe-west12 | 20 |
africa-south1 | 20 |
Number of A2 CPUs for training, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | Unlimited |
us-east1 | Unlimited |
us-east4 | Unlimited |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | Unlimited |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Unlimited |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Unlimited |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of concurrent A3 CPUs for training, per region | |
---|---|
Region | Value |
us-west1 | Unlimited |
us-west2 | Unlimited |
us-west3 | Unlimited |
us-west4 | Unlimited |
us-central1 | Unlimited |
us-east1 | Unlimited |
us-east4 | Unlimited |
us-east5 | Unlimited |
us-south1 | Unlimited |
northamerica-northeast1 | Unlimited |
northamerica-northeast2 | Unlimited |
southamerica-east1 | Unlimited |
southamerica-west1 | Unlimited |
europe-west2 | Unlimited |
europe-west1 | Unlimited |
europe-west4 | Unlimited |
europe-west6 | Unlimited |
europe-west3 | Unlimited |
europe-north1, europe-central2 | Unlimited |
europe-west8 | Unlimited |
europe-west9 | Unlimited |
europe-southwest1, asia-south1 | Unlimited |
asia-southeast1 | Unlimited |
asia-southeast2 | Unlimited |
asia-east2 | Unlimited |
asia-east1 | Unlimited |
asia-northeast1 | Unlimited |
asia-northeast2 | Unlimited |
australia-southeast1 | Unlimited |
australia-southeast2 | Unlimited |
asia-northeast3 | Unlimited |
me-west1 | Unlimited |
me-central1 | Unlimited |
me-central2 | Unlimited |
europe-west12 | Unlimited |
africa-south1 | Unlimited |
Number of P4 GPUs for training, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | 6 |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 6 |
us-east1 | Not available |
us-east4 | 1 |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | 6 |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 6 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 6 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | 6 |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of T4 GPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 2 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 12 |
us-east1 | 2 |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | 6 |
europe-west1 | Not available |
europe-west4 | 2 |
europe-west6 | Not available |
europe-west3 | 0 |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 1 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | 6 |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | 1 |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of L4 GPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 0 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 0 |
us-east1 | 0 |
us-east4 | 0 |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | 0 |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 0 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 0 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | 0 |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | 0 |
europe-west12 | Not available |
africa-south1 | Not available |
Number of P100 GPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 30 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 56 |
us-east1 | 30 |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | 30 |
europe-west4 | Not available |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Not available |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | 30 |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | 6 |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of V100 GPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 6 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 6 |
us-east1 | Not available |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 6 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Not available |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | 6 |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of A100 GPUs for training, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 8 |
us-east1 | Not available |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 8 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 8 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of concurrent A100 80GB GPUs for training, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 0 |
us-east1 | Not available |
us-east4 | 0 |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 0 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 0 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
If interested, please see the quotas documentation.
Number of concurrent H100 GPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 0 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 0 |
us-east1 | Not available |
us-east4 | 0 |
us-east5 | 0 |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 0 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 0 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | 0 |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of concurrent H100 Mega GPUs for training, per region | |
---|---|
Region | Value |
us-west1 | 0 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | 0 |
us-central1 | 0 |
us-east1 | Not available |
us-east4 | 0 |
us-east5 | 0 |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | 0 |
europe-west4 | 0 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 0 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | 0 |
asia-northeast2 | Not available |
australia-southeast1 | 0 |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
If interested, please see the quotas documentation.
Number of TPU V2 cores for training, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 8 |
us-east1 | Not available |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 8 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Not available |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | 8 |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of TPU V2 pod cores for training, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | Not available |
us-east1 | Not available |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | Not available |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Not available |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of TPU V3 cores for training, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 8 |
us-east1 | Not available |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 8 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Not available |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | 8 |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of TPU V3 pod cores for training, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | Not available |
us-east1 | Not available |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | Not available |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Not available |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
HDD usage (GB) during training, per region | |
---|---|
Region | Value |
us-west1 | 180,000 |
us-west2 | 3,600 |
us-west3 | 180,000 |
us-west4 | 3,600 |
us-central1 | 180,000 |
us-east1 | 180,000 |
us-east4 | 3,600 |
us-east5 | 3,600 |
us-south1 | 180,000 |
northamerica-northeast1 | 180,000 |
northamerica-northeast2 | 3,600 |
southamerica-east1 | 3,600 |
southamerica-west1 | 3,600 |
europe-west2 | 180,000 |
europe-west1 | 180,000 |
europe-west4 | 180,000 |
europe-west6 | 3,600 |
europe-west3 | 180,000 |
europe-north1, europe-central2 | 3,600 |
europe-west8 | 3,600 |
europe-west9 | 180,000 |
europe-southwest1, asia-south1 | 3,600 |
asia-southeast1 | 180,000 |
asia-southeast2 | 180,000 |
asia-east2 | 180,000 |
asia-east1 | 180,000 |
asia-northeast1 | 180,000 |
asia-northeast2 | 3,600 |
australia-southeast1 | 180,000 |
australia-southeast2 | 3,600 |
asia-northeast3 | 180,000 |
me-west1 | 180,000 |
me-central1 | 3,600 |
me-central2 | 3,600 |
europe-west12 | 3,600 |
africa-south1 | 3,600 |
SSD usage (GB) during training, per region | |
---|---|
Region | Value |
us-west1 | 75,000 |
us-west2 | 450 |
us-west3 | 75,000 |
us-west4 | 450 |
us-central1 | 75,000 |
us-east1 | 75,000 |
us-east4 | 450 |
us-east5 | 450 |
us-south1 | 75,000 |
northamerica-northeast1 | 75,000 |
northamerica-northeast2 | 450 |
southamerica-east1 | 450 |
southamerica-west1 | 450 |
europe-west2 | 75,000 |
europe-west1 | 75,000 |
europe-west4 | 75,000 |
europe-west6 | 450 |
europe-west3 | 75,000 |
europe-north1, europe-central2 | 450 |
europe-west8 | 450 |
europe-west9 | 75,000 |
europe-southwest1, asia-south1 | 450 |
asia-southeast1 | 75,000 |
asia-southeast2 | 75,000 |
asia-east2 | 75,000 |
asia-east1 | 75,000 |
asia-northeast1 | 75,000 |
asia-northeast2 | 450 |
australia-southeast1 | 75,000 |
australia-southeast2 | 450 |
asia-northeast3 | 75,000 |
me-west1 | 75,000 |
me-central1 | 450 |
me-central2 | 450 |
europe-west12 | 450 |
africa-south1 | 450 |
Serving
Quota | Value |
---|---|
Number of deployed custom model replicas | 100 |
Number of CPUs for serving, per region | |
---|---|
Region | Value |
us-west1 | 2,200 |
us-west2 | 2,200 |
us-west3 | 2,200 |
us-west4 | 16 |
us-central1 | 2,200 |
us-east1 | 2,200 |
us-east4 | 2,200 |
us-east5 | 16 |
us-south1 | 450 |
northamerica-northeast1 | 2,200 |
northamerica-northeast2 | 450 |
southamerica-east1 | 2,200 |
southamerica-west1 | 450 |
europe-west2 | 2,200 |
europe-west1 | 2,200 |
europe-west4 | 2,200 |
europe-west6 | 2,200 |
europe-west3 | 2,200 |
europe-north1, europe-central2 | 16 |
europe-west8 | 16 |
europe-west9 | 16 |
europe-southwest1, asia-south1 | 16 |
asia-southeast1 | 2,200 |
asia-southeast2 | 2,200 |
asia-east2 | 2,200 |
asia-east1 | 2,200 |
asia-northeast1 | 2,200 |
asia-northeast2 | 16 |
australia-southeast1 | 2,200 |
australia-southeast2 | 16 |
asia-northeast3 | 2,200 |
me-west1 | 450 |
me-central1 | 16 |
me-central2 | 16 |
europe-west12 | 16 |
africa-south1 | 16 |
Number of P100 GPUs for serving, per region | |
---|---|
Region | Value |
us-west1 | 30 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 56 |
us-east1 | 30 |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | 30 |
europe-west4 | Not available |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Not available |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | 30 |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of P4 GPUs for serving, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | 6 |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 6 |
us-east1 | Not available |
us-east4 | 6 |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | 6 |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 6 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 6 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | 6 |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of T4 GPUs for serving, per region | |
---|---|
Region | Value |
us-west1 | 12 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 12 |
us-east1 | 12 |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | 12 |
europe-west1 | Not available |
europe-west4 | 12 |
europe-west6 | Not available |
europe-west3 | 0 |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 6 |
asia-southeast2 | Not available |
asia-east2 | 12 |
asia-east1 | 6 |
asia-northeast1 | 6 |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | 6 |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of L4 GPUs for serving, per region | |
---|---|
Region | Value |
us-west1 | 28 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 28 |
us-east1 | 28 |
us-east4 | 28 |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | 28 |
europe-west1 | 28 |
europe-west4 | 28 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 28 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | 28 |
asia-northeast1 | 28 |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of V100 GPUs for serving, per region | |
---|---|
Region | Value |
us-west1 | 6 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 6 |
us-east1 | Not available |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 6 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Not available |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of A100 GPUs for serving, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 14 |
us-east1 | Not available |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 14 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 14 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | 14 |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | 14 |
me-west1 | 1 |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of A100 80GB GPUs for serving, per region | |
---|---|
Region | Value |
us-west1 | Not available |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 1 |
us-east1 | Not available |
us-east4 | 1 |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 1 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 1 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of H100 GPUs for serving, per region | |
---|---|
Region | Value |
us-west1 | 8 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | 8 |
us-east1 | Not available |
us-east4 | 0 |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | 8 |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | 8 |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Number of v5e TPU Chips for serving, per region | |
---|---|
Region | Value |
us-west1 | 4 |
us-west2 | Not available |
us-west3 | Not available |
us-west4 | Not available |
us-central1 | Not available |
us-east1 | Not available |
us-east4 | Not available |
us-east5 | Not available |
us-south1 | Not available |
northamerica-northeast1 | Not available |
northamerica-northeast2 | Not available |
southamerica-east1 | Not available |
southamerica-west1 | Not available |
europe-west2 | Not available |
europe-west1 | Not available |
europe-west4 | Not available |
europe-west6 | Not available |
europe-west3 | Not available |
europe-north1, europe-central2 | Not available |
europe-west8 | Not available |
europe-west9 | Not available |
europe-southwest1, asia-south1 | Not available |
asia-southeast1 | Not available |
asia-southeast2 | Not available |
asia-east2 | Not available |
asia-east1 | Not available |
asia-northeast1 | Not available |
asia-northeast2 | Not available |
australia-southeast1 | Not available |
australia-southeast2 | Not available |
asia-northeast3 | Not available |
me-west1 | Not available |
me-central1 | Not available |
me-central2 | Not available |
europe-west12 | Not available |
africa-south1 | Not available |
Custom-trained model limits
The following limits apply to Vertex AI custom-trained models for a given project and region.
Serving
Limit | Value |
---|---|
Number of replicas per project | 200 |
Number of containers per cluster | 25,000 |
Vertex AI Feature Store
This section lists the quotas and limits for the following:
- Vertex AI Feature Store
- Vertex AI Feature Store (Legacy)
Vertex AI Feature Store
The following quotas apply to a given project and region. For example, in a single project, you can have 200 online serving nodes in us-central1 and another 50 nodes in us-east4.
Quota | Value |
---|---|
Online serving requests per minute | 300,000 |
Maximum number of FeatureOnlineStore instances | 10 |
Maximum number of search requests per minute | 6,000,000 |
Maximum number of online serving nodes across all Optimized FeatureOnlineStore instances in the project | 80 |
Maximum number of FeatureView instances across all FeatureOnlineStore instances | 30 |
Vertex AI Feature Store also has the following limits. You can't request an increase to any of the limits in the following table:
Limit | Value |
---|---|
Maximum number of FeatureGroup resources in a project and location | 250 |
Maximum number of Feature resources within a FeatureGroup | 10,000 |
Maximum size of feature data per entity | 5 MB |
Number of entity IDs per online serving request (FetchFeatureValues) | 1 |
Maximum length of an entity ID | 4076 characters |
Storage limit for an Optimized online serving node | 200 GB |
Vertex AI Feature Store (Legacy)
The following quotas apply to a given project and region. For example, in a single project, you can have 75 concurrent batch jobs in us-central1 and another 75 jobs in europe-west4.
Quota | Value |
---|---|
Online serving requests per minute | 300,000 |
Streaming ingestion requests per minute | 60,000 |
Streaming ingestion write throughput per minute | 1.2 GB |
Feature creation requests per minute | 100 |
Online serving nodes across all featurestores | 30 |
Concurrent batch jobs (ingestion, serving, and delete feature values combined) | 75 |
Concurrent requests to delete feature values | 1 |
Entity types across all featurestores | 75 |
Vertex AI Feature Store (Legacy) also has the following limits. You can't request an increase to any of the limits in the following table:
Limit | Value |
---|---|
Storage limit for an online serving node | 5 TB |
Total data in the offline store | Unlimited |
Features per entity type | 5,000 |
Number of create, update, and delete featurestore requests per day per project per region | 500 |
For streaming ingestion, the size per request | 1 MB |
For streaming read, the number of entities that can be included per request | 100 |
For batch import, the number of files that can be included per request | 5,000 for Avro or 500 for CSV |
For batch serving and exports, the number of features you can request | 5,000 |
For batch ingestion and streaming ingestion, the oldest timestamp for which feature data can be ingested | 4,000 days from current date |
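Because these limits are enforced per request, client code usually has to split larger workloads before sending them. The following is a minimal sketch, assuming a plain list of entity IDs and the 100-entity streaming read limit from the table; `chunk_entity_ids` is a hypothetical helper and not part of any Vertex AI SDK.

```python
from typing import Iterator, List, Sequence

# Limit from the table above: entities per streaming read request.
MAX_ENTITIES_PER_STREAMING_READ = 100


def chunk_entity_ids(
    entity_ids: Sequence[str],
    max_per_request: int = MAX_ENTITIES_PER_STREAMING_READ,
) -> Iterator[List[str]]:
    """Yield consecutive slices of entity_ids, each within the per-request limit."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(entity_ids), max_per_request):
        yield list(entity_ids[start:start + max_per_request])


batches = list(chunk_entity_ids([f"user_{i}" for i in range(250)]))
print([len(b) for b in batches])  # [100, 100, 50]
```

The same helper could be reused for other per-request limits in the table by passing a different `max_per_request` value.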
The data retention limit in Vertex AI Feature Store (Legacy) has the following default value, which you can override:
Data retention defaults | Default value |
---|---|
Data retention in offline store (oldest feature value timestamp after which the values are deleted) | 4,000 days from the current date |
Data retention in online store (oldest feature value timestamp after which the values are deleted) | 4,000 days from the current date |
You can override the data retention limit in the following ways:
- To override the data retention limit for the online store, set the online_storage_ttl_days parameter while creating or updating a featurestore.
- To override the data retention limit for the offline store, set the offline_storage_ttl_days parameter while creating or updating an entity type.
Vector Search
The following quotas apply to Vector Search for a given project in each region.
Quota | Value |
---|---|
Concurrent index creation operations | 5 |
Concurrent index update operations | 5 |
Number of deployed index nodes | 50 |
Number of deployed index N2D nodes | 5 |
Number of indexes | 100 |
Streaming Update requests per minute | 6,000 |
Streaming Update throughput (in KB) per minute | 120,000 |
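The two Streaming Update quotas apply together: a client can run out of request count or of payload volume first. The following sketch shows one way to track both on the client side with a fixed one-minute window. `StreamingUpdateBudget` is a hypothetical helper, and the window it uses is an assumption; the service may measure usage differently, so treat this as an approximation rather than a guarantee against quota errors.

```python
import time
from typing import Callable

# Quota values from the table above (per project, per region).
MAX_REQUESTS_PER_MINUTE = 6_000
MAX_KB_PER_MINUTE = 120_000


class StreamingUpdateBudget:
    """Client-side fixed-window counter for request count and payload size."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._window_start = clock()
        self._requests = 0
        self._kilobytes = 0.0

    def try_consume(self, payload_kb: float) -> bool:
        """Record one request of payload_kb if it fits in the current minute."""
        now = self._clock()
        if now - self._window_start >= 60:
            # A new one-minute window starts; reset both counters.
            self._window_start = now
            self._requests = 0
            self._kilobytes = 0.0
        if (self._requests + 1 > MAX_REQUESTS_PER_MINUTE
                or self._kilobytes + payload_kb > MAX_KB_PER_MINUTE):
            return False
        self._requests += 1
        self._kilobytes += payload_kb
        return True
```

A caller would check `try_consume(size_kb)` before each update and wait for the next window when it returns `False`.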
Vertex ML Metadata
The following limits apply to Vertex ML Metadata for a given project in each region.
Limit | Value |
---|---|
Maximum serialized size of the metadata field within a resource | 204,800 bytes |
Maximum serialized size of metadata schemas | 204,800 bytes |
Vertex AI Pipelines
The following quotas and limits apply to Vertex AI Pipelines for a given project in each region.
Quota | Value |
---|---|
Running pipeline tasks in parallel* | 600 |
Concurrent pipeline runs* | 300 |
* Pipeline run and task requests beyond this limit are queued until resources are available.
Vertex AI Pipelines has the following limits. Unlike quotas, you can't request an increase to a limit.
Limit | Value |
---|---|
Number of pipeline tasks per job | 10,000 |
Input and output artifacts per pipeline task | 100 |
Input and output artifacts per pipeline job | 10,000 |
Maximum size of JSON payload containing output parameters and artifacts per pipeline task | 131,072 bytes |
Maximum running time for a pipeline task** | 7 days |
** Pipeline tasks running beyond this limit are cancelled.
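Because these limits can't be raised, it can help to check a pipeline definition against them before submitting it. The sketch below assumes that each task is described only by its combined count of input and output artifacts; that combined reading of the limit, and the `check_pipeline_limits` helper itself, are assumptions made for illustration.

```python
from typing import List

# Limits from the table above.
MAX_TASKS_PER_JOB = 10_000
MAX_ARTIFACTS_PER_TASK = 100
MAX_ARTIFACTS_PER_JOB = 10_000


def check_pipeline_limits(artifacts_per_task: List[int]) -> List[str]:
    """Return human-readable limit violations; an empty list means none were found."""
    problems: List[str] = []
    if len(artifacts_per_task) > MAX_TASKS_PER_JOB:
        problems.append(
            f"{len(artifacts_per_task)} tasks exceeds {MAX_TASKS_PER_JOB} per job"
        )
    for index, count in enumerate(artifacts_per_task):
        if count > MAX_ARTIFACTS_PER_TASK:
            problems.append(
                f"task {index} has {count} artifacts, limit is {MAX_ARTIFACTS_PER_TASK}"
            )
    total = sum(artifacts_per_task)
    if total > MAX_ARTIFACTS_PER_JOB:
        problems.append(
            f"{total} artifacts in the job exceeds {MAX_ARTIFACTS_PER_JOB}"
        )
    return problems


print(check_pipeline_limits([10, 20]))  # []
```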
Vertex AI Decision Optimization
The following quotas and limits apply to Vertex AI Decision Optimization for a given project in each region.
Quota | Value |
---|---|
Solve requests per minute | 30 |
Colab Enterprise quotas and limits
Colab Enterprise quotas and limits are listed separately. See Colab Enterprise quotas and limits.
Quota increases
If you want to increase any of your quotas for Vertex AI, you can use the Google Cloud console to request a quota increase.
For more information about submitting a quota increase request, see Working with quotas.
Quotas by region and model
View and edit the quotas in the Google Cloud console
To view and edit the quotas in the Google Cloud console, do the following:
- Go to the Quotas and System Limits page.
- To adjust the quota, copy and paste the property aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model in the Filter. Press Enter.
- Click the three dots at the end of the row, and select Edit quota.
- Enter a new quota value in the pane, and click Submit request.
View the requests per minute (RPM) quotas by region and by model
By default, models 2.0 and later use Dynamic shared quota (DSQ).
Choose a region to view the quotas for each available model:
Increase the quotas
If you want to increase any of your quotas for Generative AI on Vertex AI, you can use the Google Cloud console to request a quota increase.
RAG Engine quotas
For each service to perform retrieval-augmented generation (RAG) using RAG Engine, the following quotas apply, with the quota measured as requests per minute (RPM).
Service | Quota | Metric |
---|---|---|
RAG Engine data management APIs | 60 RPM | VertexRagDataService requests per minute per region |
RetrievalContexts API | 1,500 RPM | VertexRagService retrieve requests per minute per region |
base_model: textembedding-gecko | 1,500 RPM | Online prediction requests per base model per minute per region per base_model. An additional filter for you to specify is base_model: textembedding-gecko |
Service | Limit | Metric |
---|---|---|
Concurrent ImportRagFiles requests | 3 RPM | VertexRagService concurrent import requests per region |
Maximum number of files per ImportRagFiles request | 10,000 | VertexRagService import rag files requests per region |
For more rate limits and quotas, see Generative AI on Vertex AI rate limits.
Batch requests
The quotas and limits for batch prediction requests are the same across all regions.
Concurrent batch prediction request limits
The following table lists the limits for the number of concurrent batch prediction requests:
Limit | Value |
---|---|
Gemini models | 8 |
Concurrent batch prediction request quotas
The following table lists the quotas for the number of concurrent batch prediction requests, which don't apply to Gemini models:
Quota | Value |
---|---|
aiplatform.googleapis.com/textembedding_gecko_concurrent_batch_prediction_jobs | 4 |
Custom-trained model quotas
The following quotas apply to Generative AI on Vertex AI tuned models for a given project and region:
Quota | Value |
---|---|
Restricted image training TPU V3 pod cores per region (supported region: europe-west4) | 64 |
Restricted image training Nvidia A100 80GB GPUs per region (supported regions: us-central1, us-east4) | us-central1: 8; us-east4: 2 |
Text embedding limits
Each text embedding model request can have up to 250 input texts (generating 1 embedding per input text) and 20,000 tokens per request. Only the first 2,048 tokens in each input text are used to compute the embeddings.
For text-embedding-large-exp-03-07, each request can only include a single input text. The quota for this model is listed under the name text-embedding-large-001.
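The per-request limits on text count and tokens can be respected by batching inputs before calling the embedding model. The following is a rough sketch under stated assumptions: `batch_texts` is a hypothetical helper, and `whitespace_token_count` is only a crude stand-in for the model's real tokenizer, so actual token counts will differ. For a model that accepts a single input text per request, pass `max_texts=1`.

```python
from typing import Callable, Iterator, List, Sequence

# Limits from the paragraph above.
MAX_TEXTS_PER_REQUEST = 250
MAX_TOKENS_PER_REQUEST = 20_000


def whitespace_token_count(text: str) -> int:
    """Crude placeholder for a real tokenizer; real token counts will differ."""
    return len(text.split())


def batch_texts(
    texts: Sequence[str],
    count_tokens: Callable[[str], int] = whitespace_token_count,
    max_texts: int = MAX_TEXTS_PER_REQUEST,
    max_tokens: int = MAX_TOKENS_PER_REQUEST,
) -> Iterator[List[str]]:
    """Group texts so each batch stays within the text-count and token limits."""
    batch: List[str] = []
    batch_tokens = 0
    for text in texts:
        tokens = count_tokens(text)
        if tokens > max_tokens:
            raise ValueError("a single input text exceeds the per-request token limit")
        # Close the current batch if adding this text would break either limit.
        if batch and (len(batch) >= max_texts or batch_tokens + tokens > max_tokens):
            yield batch
            batch, batch_tokens = [], 0
        batch.append(text)
        batch_tokens += tokens
    if batch:
        yield batch


sizes = [len(b) for b in batch_texts(["a b c"] * 600)]
print(sizes)  # [250, 250, 100]
```

This sketch counts every token toward the request budget, which is a conservative choice; it does not assume that tokens beyond the first 2,048 of an input are exempt.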
Gen AI evaluation service quotas
The Gen AI evaluation service uses gemini-2.0-flash as a default judge model for model-based metrics. A single evaluation request for a model-based metric might result in multiple underlying requests to the Gen AI evaluation service. Each model's quota is calculated on a per-project basis, which means that any requests directed to gemini-2.0-flash for model inference and model-based evaluation contribute to the quota. Quotas for the Gen AI evaluation service and the underlying judge model are shown in the following table:
Request quota | Default quota |
---|---|
Gen AI evaluation service requests per minute | 1,000 requests per project per region |
Online prediction requests per minute for base_model: gemini-2.0-flash | See Quotas by region and model. |
If you receive an error related to quotas while using the Gen AI evaluation service, you might need to file a quota increase request. See View and manage quotas for more information.
Limit | Value |
---|---|
Gen AI evaluation service request timeout | 60 seconds |
When you use the Gen AI evaluation service for the first time in a new project, you might experience an initial setup delay of up to two minutes. If your first request fails, wait a few minutes and then retry. Subsequent evaluation requests typically complete within 60 seconds.
The maximum input and output tokens for model-based metrics depend on the model used as the judge model. See Google models for a list of models.
Pipeline evaluation quotas
If you receive an error related to quotas while using the evaluation pipelines service, you might need to file a quota increase request. See View and manage quotas for more information. The evaluation pipelines service uses Vertex AI Pipelines to run PipelineJobs. See relevant quotas for Vertex AI Pipelines. The following are general quota recommendations:
Service | Quota | Recommendation |
---|---|---|
Vertex AI API | Concurrent LLM batch prediction jobs per region | Pointwise: 1 * num_concurrent_pipelines; Pairwise: 2 * num_concurrent_pipelines |
Vertex AI API | Evaluation requests per minute per region | 1000 * num_concurrent_pipelines |
Tasks | Quota | Base model | Recommendation |
---|---|---|---|
summarization, question_answering | Online prediction requests per base model per minute per region per base_model | text-bison | 60 * num_concurrent_pipelines |
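The recommendations in the two tables above are simple multiples of the number of concurrent pipelines. The following sketch turns them into a small calculator; the function name and dictionary keys are hypothetical labels chosen for readability, not quota identifiers from the API. The text-bison figure applies to the summarization and question_answering tasks.

```python
from typing import Dict


def recommended_quotas(num_concurrent_pipelines: int, pairwise: bool = False) -> Dict[str, int]:
    """Compute the recommended quota values from the tables above."""
    if num_concurrent_pipelines < 1:
        raise ValueError("num_concurrent_pipelines must be at least 1")
    # Pointwise evaluation needs 1 batch job per pipeline; pairwise needs 2.
    batch_jobs_per_pipeline = 2 if pairwise else 1
    return {
        "concurrent_llm_batch_prediction_jobs": batch_jobs_per_pipeline * num_concurrent_pipelines,
        "evaluation_requests_per_minute": 1000 * num_concurrent_pipelines,
        "text_bison_online_prediction_requests_per_minute": 60 * num_concurrent_pipelines,
    }


print(recommended_quotas(3, pairwise=True))
```

For three concurrent pairwise pipelines, this yields 6 concurrent batch prediction jobs, 3,000 evaluation requests per minute, and 180 text-bison online prediction requests per minute.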
Vertex AI Agent Engine
The following quotas and limits apply to Vertex AI Agent Engine for a given project in each region.
Quota | Value |
---|---|
Create/Delete/Update Vertex AI Agent Engine per minute | 10 |
Query/StreamQuery Vertex AI Agent Engine per minute | 60 |
Maximum number of Vertex AI Agent Engine resources | 100 |
Troubleshoot error code 429
To troubleshoot the 429 error, see Error code 429.
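A common client-side response to quota errors is to retry with exponential backoff and jitter. The following is a generic sketch rather than the documented remedy: `QuotaExceededError` is a hypothetical stand-in for whatever exception your client library raises on HTTP 429, and the attempt count and delay values are arbitrary assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the exception your client raises on HTTP 429."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying on QuotaExceededError with jittered exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return request()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts; surface the error to the caller.
            # Double the delay ceiling each attempt, capped at max_delay_s.
            ceiling = min(max_delay_s, base_delay_s * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
    raise RuntimeError("unreachable")
```

Retrying only smooths over short bursts. If requests are throttled consistently, a quota increase request is the more appropriate fix.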
What's next
- Learn more about Generative AI on Vertex AI quotas and limits.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-04-23 UTC.