Gen AI evaluation service quotas

The Gen AI evaluation service uses gemini-2.0-flash as the default judge model for model-based metrics. A single evaluation request for a model-based metric can result in multiple underlying requests to the Gen AI evaluation service. Each model's quota is calculated per project, so requests to gemini-2.0-flash for both model inference and model-based evaluation count against the same quota. Quotas for the Gen AI evaluation service and the underlying judge model are shown in the following table:
Request quota | Default quota
Gen AI evaluation service requests per minute | 1,000 requests per project per region
Online prediction requests per minute for base_model: gemini-2.0-flash | See Quotas by region and model.
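Because a single evaluation request can fan out into several underlying judge-model calls, it helps to estimate the judge-model load before a large run. The sketch below is a rough back-of-the-envelope check, not an official formula: the fan-out factor (judge calls per row per metric) is an assumption, and the actual number depends on your metrics.

```python
# Rough headroom estimate against the per-project judge-model quota.
# calls_per_metric is an assumed fan-out factor; measure it for your
# own model-based metrics before relying on this estimate.

def judge_requests_per_minute(rows_per_minute, num_model_based_metrics,
                              calls_per_metric=1):
    """Estimated judge-model requests per minute for an evaluation run."""
    return rows_per_minute * num_model_based_metrics * calls_per_metric

def within_quota(estimated_rpm, quota_rpm=1000):
    """True if the estimated load fits under the per-minute quota."""
    return estimated_rpm <= quota_rpm

# Example: 200 rows/minute with 3 model-based metrics.
estimate = judge_requests_per_minute(200, 3)
print(estimate, within_quota(estimate))  # 600 True
```

If the estimate exceeds the quota, either throttle the evaluation run or request a quota increase before starting.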

If you receive an error related to quotas while using the Gen AI evaluation service, you might need to file a quota increase request. See View and manage quotas for more information.
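When quota errors come from short traffic spikes rather than sustained overload, retrying with exponential backoff can resolve them without a quota increase. A minimal sketch, assuming a hypothetical `send_request` callable; `QuotaExceededError` here is a stand-in for whatever 429 / RESOURCE_EXHAUSTED exception your client library raises:

```python
import random
import time

class QuotaExceededError(Exception):
    """Stand-in for a 429 / RESOURCE_EXHAUSTED error from the client library."""

def call_with_backoff(send_request, max_attempts=5, base_delay=1.0):
    """Retry send_request with exponential backoff plus jitter on quota errors."""
    for attempt in range(max_attempts):
        try:
            return send_request()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts; surface the error to the caller.
            # Double the delay each attempt and add jitter to avoid
            # synchronized retries across workers.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

If errors persist after backing off, the load is sustained and a quota increase request is the right fix.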

Limit | Value
Gen AI evaluation service request timeout | 60 seconds

When you use the Gen AI evaluation service for the first time in a new project, you might experience an initial setup delay of up to two minutes. If your first request fails, wait a few minutes and then retry. Subsequent evaluation requests typically complete within 60 seconds.
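The cold-start behavior above can be handled with a single delayed retry. A minimal sketch; `run_evaluation` is a hypothetical placeholder for your evaluation call, and the default wait mirrors the roughly two-minute setup window:

```python
import time

def evaluate_with_cold_start_retry(run_evaluation, setup_wait_seconds=120):
    """Run an evaluation; if the first request fails, wait out the
    initial setup window once and retry."""
    try:
        return run_evaluation()
    except Exception:
        # The first request in a new project may fail while the
        # service finishes its initial setup; wait, then retry once.
        time.sleep(setup_wait_seconds)
        return run_evaluation()
```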

The maximum input and output tokens for model-based metrics depend on the model that you use as the judge. See Google models for a list of models.

Vertex AI Pipelines quotas

Each tuning job uses Vertex AI Pipelines. For more information, see Vertex AI Pipelines quotas and limits.

What's next