Batch prediction quotas and limits

The quotas and limits for batch prediction requests are the same across all regions.

Concurrent batch prediction request limits

The following table lists the limits for the number of concurrent batch prediction requests:

Limit          Value
Gemini models  8

If the number of submitted requests exceeds the limit, the additional requests are placed in a queue and processed when capacity becomes available.

Concurrent batch prediction request quotas

The following table lists the quotas for the number of concurrent batch prediction requests. These quotas don't apply to Gemini models:

Quota                                                                           Value
aiplatform.googleapis.com/textembedding_gecko_concurrent_batch_prediction_jobs  4

If the number of submitted requests exceeds the allocated quota, the additional requests are placed in a queue and processed when quota capacity becomes available.
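
The queueing behavior described for both the limit (8) and the quota (4) can be illustrated with a small local simulation. This is only a sketch of the general idea, not the service's actual scheduler: the function name simulate_start_times is hypothetical, all requests are assumed to be submitted at time 0, and the queue is assumed to be first-in, first-out, which this page does not state.

```python
import heapq


def simulate_start_times(durations, limit):
    """Return the start time of each request when at most `limit` run at once.

    Assumptions (not taken from the documentation): every request is
    submitted at time 0 and queued requests start in submission order.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")

    running = []      # min-heap of finish times for requests currently running
    start_times = []

    for duration in durations:
        if len(running) < limit:
            # A slot is free, so the request starts immediately.
            start = 0.0
        else:
            # All slots are busy: wait until the earliest running request ends.
            start = heapq.heappop(running)
        start_times.append(start)
        heapq.heappush(running, start + duration)

    return start_times


# Ten requests that each run for 5 time units.
gemini_starts = simulate_start_times([5] * 10, limit=8)     # limit from the first table
embedding_starts = simulate_start_times([5] * 10, limit=4)  # quota from the second table
```

Under these assumptions, with a limit of 8 the first eight requests start immediately and the remaining two wait for a slot. With a quota of 4, the same ten requests run in three waves.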

What's next