Quotas

To estimate the quota you need, start from the number of queries per second (QPS) that you send to each API. The following sections list the quotas for the APIs used with each Agent Assist feature.

See the quota page for more information on requesting a quota increase. After you submit a request, Google might contact you for more information and will tell you whether the request is approved or denied.
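
Traffic is usually estimated in QPS, while the default quotas in the following sections are stated in requests per minute, so a quick conversion helps when deciding whether to request an increase. The following Python sketch is illustrative only: the function names and the 20% headroom are assumptions chosen for the example, not part of any Google API.

```python
import math


def required_quota(peak_qps: float, headroom_percent: int = 20) -> int:
    """Per-minute quota needed to serve peak_qps plus a safety margin.

    headroom_percent is a hypothetical safety factor (20 = 20% above peak).
    """
    requests_per_minute = peak_qps * 60
    return math.ceil(requests_per_minute * (100 + headroom_percent) / 100)


def needs_increase(peak_qps: float, default_quota_per_min: int,
                   headroom_percent: int = 20) -> bool:
    """True when the estimated need exceeds the default quota."""
    return required_quota(peak_qps, headroom_percent) > default_quota_per_min
```

For example, a peak of 4 QPS needs 288 requests/min with 20% headroom, which fits a 300 requests/min default, while a peak of 5 QPS needs 360 requests/min and does not.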

Project types

The following quota tables list two types of projects: consumer and resource. See the documentation on using multiple projects for definitions of these two project types.

CCAI transcription

This feature uses either telephony or gRPC integration, each of which has different API quotas.

Telephony integration

See the Dialogflow quotas for the APIs used with telephony integration.

gRPC integration

| Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
|---|---|---|---|---|
| AnalyzeContentOperationsPerMinutePerProject | 300 requests/min | Global | Consumer project | AnalyzeContent/StreamingAnalyzeContent requests. Quota is shared between Dialogflow and Agent Assist. |

Sentiment analysis

| Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
|---|---|---|---|---|
| AnalyzeSentimentOperationsPerMinutePerProject | 300 requests/min | Global | Consumer project | Sentiment analysis requests through AnalyzeContent or StreamingAnalyzeContent. |
| AnalyzeSentimentOperationsPerMinutePerProjectPerRegion | 300 requests/min | Global | Consumer project | AnalyzeSentiment and StreamingAnalyzeSentiment requests. Quota is shared between Dialogflow and Agent Assist. |

Build your own assist

This feature uses the following AI models:

  • text-bison@001 (default limit 0)
  • text-bison@002
  • text-bison-32k@002
  • gemini-1.0-pro
  • gemini-1.5-pro
  • gemini-1.5-pro-001
  • gemini-1.5-flash-001
  • gemini-1.5-flash-002
  • gemini-2.0-flash-001

| Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
|---|---|---|---|---|
| GeneratorSuggestionOperationsPerMinutePerModelType | 10 requests/min | Global | Consumer project | Generator suggestion operations per model type |
| GeneratorSuggestionOperationsPerMinutePerModelTypePerRegion | 10 requests/min | Regional | Consumer project | Generator suggestion operations per model type and region |
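
Because these limits apply per model type, a client that calls several models may want to track each one separately. The following is a minimal client-side sketch, assuming that the model name is a suitable key for "model type" and that a rolling 60-second window approximates how the quota is counted; neither assumption is confirmed by this page. The class and method names are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class PerModelRateLimiter:
    """Sliding-window limiter that counts requests separately per model type."""

    def __init__(self, limit_per_minute: int = 10,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit_per_minute
        self.clock = clock  # injectable so the limiter can be tested
        self._sent: Dict[str, Deque[float]] = defaultdict(deque)

    def try_acquire(self, model_type: str) -> bool:
        """Return True and record the request if it fits the current window."""
        now = self.clock()
        window = self._sent[model_type]
        # Discard timestamps that are 60 seconds old or older.
        while window and now - window[0] >= 60.0:
            window.popleft()
        if len(window) >= self.limit:
            return False
        window.append(now)
        return True
```

A caller would check `try_acquire("gemini-2.0-flash-001")` before sending a generator request and delay or drop the request when it returns False.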

Summarization

AI-generated summarization uses the following models:

  • summarization-1.0
  • summarization-2.0
  • summarization-2.1
  • summarization-3.0
  • summarization-3.1
  • summarization-4.0

The following table shows the quota type and model used for each version of summarization.

| Summarization version | Quota type | Backend model |
|---|---|---|
| Generator 4.0 | Generator based | Pretrained Gemini-2.0-flash-001 |
| Generator 3.1 | Generator based | Lora-tuned gemini-1.5-flash-001 |
| Generator 3.0 | Generator based | Lora-tuned gemini-1.0-pro-002 |
| Generator 2.1 | Generator based | Lora-tuned text-bison-32k@002 |
| Generator 2.0 | Generator based | Lora-tuned text-bison-32k@002 |
| Generator 1.0 | Generator based | Lora-tuned text-bison@001 |
| Baseline v2 | Baseline v2 model | text-bison |
| Baseline v1 | Non Generator based | LongT5 model |
| Custom 2.0 | Non Generator based | LongT5 model |

The quota types in the previous table are reflected in the following quota list for APIs used with summarization.

| Quota type | Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
|---|---|---|---|---|---|
| Generator based | GeneratorSuggestionOperationsPerMinutePerModelTypePerRegion | 10 requests/min | Regional | Consumer project | Generator suggestion operations per model type and region |
| Generator based | GeneratorSuggestionOperationsPerMinutePerModelType | 10 requests/min | Global | Consumer project | Generator suggestion operations per model type |
| Generator based | SuggestConversationSummaryOperationsPerMinutePerProject | 60 requests/min | Global | Resource project | Suggest conversation summary operations |
| Non Generator based | SuggestConversationSummaryOperationsPerMinutePerProjectPerRegion | 0-2 requests/min | Regional | Resource project | Suggest conversation summary operations in each region |
| Baseline v2 model | SuggestSummaryV2BaselineOperationsPerMinutePerProject | 120 requests/min | Global | Resource project | Conversation Summary Suggestion V2 Baseline polling requests |
| Baseline v2 model | SuggestSummaryV2BaselineOperationsPerMinutePerProjectPerRegion | 60 requests/min | Regional | Resource project | Conversation Summary Suggestion V2 Baseline polling requests in each region |
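
Finding the quotas that apply to a given summarization version takes two lookups: version to quota type, then quota type to quota limits. The following Python sketch encodes the two summarization tables in this section as dictionaries. The type and function names are hypothetical, and the values are copied from the tables as listed, so they can go stale if the defaults change.

```python
from typing import Dict, List, NamedTuple


class QuotaLimit(NamedTuple):
    name: str     # quota limit name as listed in the table
    default: str  # default value, in requests/min
    scope: str    # "Global" or "Regional"
    project: str  # project type the quota applies to


# Summarization version -> quota type (first table).
QUOTA_TYPE_BY_VERSION: Dict[str, str] = {
    "Generator 4.0": "Generator based",
    "Generator 3.1": "Generator based",
    "Generator 3.0": "Generator based",
    "Generator 2.1": "Generator based",
    "Generator 2.0": "Generator based",
    "Generator 1.0": "Generator based",
    "Baseline v2": "Baseline v2 model",
    "Baseline v1": "Non Generator based",
    "Custom 2.0": "Non Generator based",
}

# Quota type -> quota limits (second table).
LIMITS_BY_QUOTA_TYPE: Dict[str, List[QuotaLimit]] = {
    "Generator based": [
        QuotaLimit("GeneratorSuggestionOperationsPerMinutePerModelTypePerRegion",
                   "10", "Regional", "Consumer project"),
        QuotaLimit("GeneratorSuggestionOperationsPerMinutePerModelType",
                   "10", "Global", "Consumer project"),
        QuotaLimit("SuggestConversationSummaryOperationsPerMinutePerProject",
                   "60", "Global", "Resource project"),
    ],
    "Non Generator based": [
        QuotaLimit("SuggestConversationSummaryOperationsPerMinutePerProjectPerRegion",
                   "0-2", "Regional", "Resource project"),
    ],
    "Baseline v2 model": [
        QuotaLimit("SuggestSummaryV2BaselineOperationsPerMinutePerProject",
                   "120", "Global", "Resource project"),
        QuotaLimit("SuggestSummaryV2BaselineOperationsPerMinutePerProjectPerRegion",
                   "60", "Regional", "Resource project"),
    ],
}


def limits_for_version(version: str) -> List[QuotaLimit]:
    """Return the quota limits listed for a summarization version's quota type."""
    return LIMITS_BY_QUOTA_TYPE[QUOTA_TYPE_BY_VERSION[version]]
```

For example, `limits_for_version("Baseline v2")` returns the two SuggestSummaryV2Baseline limits, and `limits_for_version("Generator 3.1")` returns the three limits listed as generator based.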

Generative knowledge assist

| Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
|---|---|---|---|---|
| SearchKnowledgeOperationsPerMinutePerProject | 60 requests/min | Global | Consumer project | SearchKnowledge requests |

Proactive generative knowledge assist

| Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
|---|---|---|---|---|
| SuggestKnowledgeAssistOperationsPerMinutePerProject | 60 requests/min | Global | Resource project | KnowledgeAssist requests through AnalyzeContent or SuggestKnowledgeAssist |
| SuggestKnowledgeAssistOperationsPerMinutePerProjectPerRegion | 30 requests/min | Regional | Resource project | KnowledgeAssist requests through AnalyzeContent or SuggestKnowledgeAssist in each region |
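
When a feature has both a global and a regional limit, a traffic plan can be checked against both at once. The following Python sketch assumes that each request counts against its own region's limit and also against the project-wide global limit; this page does not state how the two limits interact, so treat that as an assumption to verify. The function name and the region names in the example are placeholders.

```python
from typing import Dict, List

GLOBAL_LIMIT = 60    # SuggestKnowledgeAssistOperationsPerMinutePerProject
REGIONAL_LIMIT = 30  # SuggestKnowledgeAssistOperationsPerMinutePerProjectPerRegion


def quota_violations(requests_per_min_by_region: Dict[str, int],
                     global_limit: int = GLOBAL_LIMIT,
                     regional_limit: int = REGIONAL_LIMIT) -> List[str]:
    """List the limits that the planned per-region traffic would exceed.

    ASSUMPTION: every request counts against both its region's limit and
    the global limit. Verify this for your project before relying on it.
    """
    problems: List[str] = []
    for region, rate in sorted(requests_per_min_by_region.items()):
        if rate > regional_limit:
            problems.append(f"{region}: {rate} > regional limit {regional_limit}")
    total = sum(requests_per_min_by_region.values())
    if total > global_limit:
        problems.append(f"total: {total} > global limit {global_limit}")
    return problems
```

Under this assumption, a plan of 25 requests/min in each of two regions passes both checks, while 40 requests/min in one region and 25 in another exceeds both the regional limit and the global limit.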

Other API quotas

| Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
|---|---|---|---|---|
| ConversationOperationsPerMinutePerProject | 300 requests/min | Global | Consumer project | Conversation requests other than AnalyzeContent and StreamingAnalyzeContent, for example CreateConversation and CompleteConversation. Quota is shared between Dialogflow and Agent Assist. |
| MessagePollingOperationsPerMinutePerProject | 1,200 requests/min | Global | Consumer project | ListMessages requests. Quota is shared between Dialogflow and Agent Assist. |
| AnswerRecordOperationsPerMinutePerProject | 300 requests/min | Global | Consumer project | AnswerRecord requests |
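
For polling-style calls such as ListMessages, the request rate depends on how many conversations are polled and how often. The following Python sketch shows the arithmetic, assuming each conversation polls at a fixed interval. The function names and the `other_usage_per_min` parameter, which stands in for Dialogflow traffic that shares the same quota, are hypothetical.

```python
import math


def polling_requests_per_minute(conversations: int, interval_seconds: float) -> float:
    """ListMessages calls per minute when each conversation polls at a fixed interval."""
    return conversations * 60.0 / interval_seconds


def max_conversations(quota_per_min: int, interval_seconds: float,
                      other_usage_per_min: float = 0.0) -> int:
    """Largest number of concurrently polled conversations that fits the quota.

    other_usage_per_min models traffic that draws on the same shared quota.
    """
    available = max(quota_per_min - other_usage_per_min, 0.0)
    return math.floor(available * interval_seconds / 60.0)
```

With the default of 1,200 requests/min and a 5-second polling interval, 100 concurrent conversations use the whole quota. If other traffic already uses 300 requests/min and the interval is 2 seconds, the remaining quota covers 30 conversations.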