Accurate quota estimation depends on the number of queries per second (QPS) that you expect to send to each API. The following sections list the quotas for the APIs used with each Agent Assist feature; the quotas are expressed in requests per minute.
See the quota page for more information on requesting a quota increase. After you submit your request, Google might contact you for more information and will inform you when your request is approved or denied.
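Because the quotas on this page are per minute while traffic is usually estimated in QPS, a quick conversion can show whether a default value is likely to be enough. The following is a minimal sketch, not an official sizing method: the function names and the 20% headroom are assumptions chosen for illustration.

```python
import math


def required_per_minute(peak_qps: float, headroom_pct: int = 20) -> int:
    """Convert a peak QPS estimate into requests/min, padded by headroom_pct.

    headroom_pct is a hypothetical safety margin, not a documented value.
    """
    return math.ceil(peak_qps * 60 * (100 + headroom_pct) / 100)


def needs_increase(peak_qps: float, default_per_minute: int,
                   headroom_pct: int = 20) -> bool:
    """Return True when the padded estimate exceeds the default quota."""
    return required_per_minute(peak_qps, headroom_pct) > default_per_minute


# Example: compare estimated traffic with a 300 requests/min default.
print(required_per_minute(4))    # 4 QPS * 60 s * 1.2 headroom
print(needs_increase(4, 300))
print(needs_increase(6, 300))
```

For quotas that are shared between Dialogflow and Agent Assist, the QPS passed in would need to cover traffic from both products.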
Project types
The following quota tables list two types of projects: consumer and resource. See the documentation on using multiple projects for definitions of these two project types.
CCAI transcription
This feature uses either telephony or gRPC integration, each of which has different API quotas.
Telephony integration
See the Dialogflow quotas for the APIs used with telephony integration.
gRPC integration
Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
---|---|---|---|---|
AnalyzeContentOperationsPerMinutePerProject | 300 requests/min | Global | Consumer project | AnalyzeContent/StreamingAnalyzeContent requests. Quota is shared between Dialogflow and Agent Assist. |
Sentiment analysis
Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
---|---|---|---|---|
AnalyzeSentimentOperationsPerMinutePerProject | 300 requests/min | Global | Consumer project | Sentiment analysis requests through AnalyzeContent or StreamingAnalyzeContent. |
AnalyzeSentimentOperationsPerMinutePerProjectPerRegion | 300 requests/min | Global | Consumer project | AnalyzeSentiment and StreamingAnalyzeSentiment requests. Quota is shared between Dialogflow and Agent Assist. |
Build your own assist
This feature uses the following AI models:
- text-bison@001 (default limit 0)
- text-bison@002
- text-bison-32k@002
- gemini-1.0-pro
- gemini-1.5-pro
- gemini-1.5-pro-001
- gemini-1.5-flash-001
- gemini-1.5-flash-002
- gemini-2.0-flash-001
Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
---|---|---|---|---|
GeneratorSuggestionOperationsPerMinutePerModelType | 10 requests/min | Global | Consumer project | Generator suggestion operations per model type |
GeneratorSuggestionOperationsPerMinutePerModelTypePerRegion | 10 requests/min | Regional | Consumer project | Generator suggestion operations per model type and region |
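With a default of 10 requests/min per model type, a caller may want to throttle on the client side rather than rely on rejected requests. The following is a minimal sliding-window sketch under stated assumptions: the class name `PerMinuteLimiter` is hypothetical, and the way the service itself measures the one-minute window is not specified here, so the client-side window is only an approximation.

```python
import time
from collections import deque


class PerMinuteLimiter:
    """Allow at most `limit_per_minute` calls in any rolling 60-second window."""

    def __init__(self, limit_per_minute: int, clock=time.monotonic):
        self.limit = limit_per_minute
        self.clock = clock          # injectable so the limiter can be tested
        self.sent = deque()         # timestamps of recently allowed calls

    def try_acquire(self) -> bool:
        """Return True and record the call if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that are 60 seconds old or older.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return True
        return False


# One limiter per model type, mirroring the per-model-type quota (assumed usage).
limiters = {"gemini-2.0-flash-001": PerMinuteLimiter(10)}
if limiters["gemini-2.0-flash-001"].try_acquire():
    pass  # send the generator suggestion request here
```

A caller that receives `False` can queue the request or retry later; which of these is appropriate depends on the application.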
Summarization
AI-generated summarization uses the following models:
- summarization-1.0
- summarization-2.0
- summarization-2.1
- summarization-3.0
- summarization-3.1
- summarization-4.0
The following table shows the quota type and model used for each version of summarization.
Summarization version | Quota type | Backend model |
---|---|---|
Generator 4.0 | Generator based | Pretrained Gemini-2.0-flash-001 |
Generator 3.1 | Generator based | Lora-tuned gemini-1.5-flash-001 |
Generator 3.0 | Generator based | Lora-tuned gemini-1.0-pro-002 |
Generator 2.1 | Generator based | Lora-tuned text-bison-32k@002 |
Generator 2.0 | Generator based | Lora-tuned text-bison-32k@002 |
Generator 1.0 | Generator based | Lora-tuned text-bison@001 |
Baseline v2 | Baseline v2 model | text-bison |
Baseline v1 | Non Generator based | LongT5 model |
Custom 2.0 | Non Generator based | LongT5 model |
The quota types in the previous table are reflected in the following quota list for APIs used with summarization.
Quota type | Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
---|---|---|---|---|---|
Generator based | GeneratorSuggestionOperationsPerMinutePerModelTypePerRegion | 10 requests/min | Regional | Consumer project | Generator suggestion operations per model type and region |
Generator based | GeneratorSuggestionOperationsPerMinutePerModelType | 10 requests/min | Global | Consumer project | Generator suggestion operations per model type |
Generator based | SuggestConversationSummaryOperationsPerMinutePerProject | 60 requests/min | Global | Resource project | Suggest conversation summary operations |
Non Generator based | SuggestConversationSummaryOperationsPerMinutePerProjectPerRegion | 0-2 requests/min | Regional | Resource project | Suggest conversation summary operations in each region |
Baseline v2 model | SuggestSummaryV2BaselineOperationsPerMinutePerProject | 120 requests/min | Global | Resource project | Conversation Summary Suggestion V2 Baseline polling requests |
Baseline v2 model | SuggestSummaryV2BaselineOperationsPerMinutePerProjectPerRegion | 60 requests/min | Regional | Resource project | Conversation Summary Suggestion V2 Baseline polling requests in each region |
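Finding the quotas that apply to a summarization version is a two-step lookup: version to quota type, then quota type to quota limit names. The following sketch transcribes the two summarization tables above into dictionaries to make that lookup explicit. The dictionary and function names are hypothetical, and the data is copied from this page rather than fetched from any API.

```python
# Summarization version -> quota type (from the first summarization table).
QUOTA_TYPE_BY_VERSION = {
    "Generator 4.0": "Generator based",
    "Generator 3.1": "Generator based",
    "Generator 3.0": "Generator based",
    "Generator 2.1": "Generator based",
    "Generator 2.0": "Generator based",
    "Generator 1.0": "Generator based",
    "Baseline v2": "Baseline v2 model",
    "Baseline v1": "Non Generator based",
    "Custom 2.0": "Non Generator based",
}

# Quota type -> quota limit names (from the second summarization table).
QUOTAS_BY_TYPE = {
    "Generator based": [
        "GeneratorSuggestionOperationsPerMinutePerModelTypePerRegion",
        "GeneratorSuggestionOperationsPerMinutePerModelType",
        "SuggestConversationSummaryOperationsPerMinutePerProject",
    ],
    "Non Generator based": [
        "SuggestConversationSummaryOperationsPerMinutePerProjectPerRegion",
    ],
    "Baseline v2 model": [
        "SuggestSummaryV2BaselineOperationsPerMinutePerProject",
        "SuggestSummaryV2BaselineOperationsPerMinutePerProjectPerRegion",
    ],
}


def quota_names_for(version: str) -> list[str]:
    """Return the quota limit names that apply to a summarization version."""
    return QUOTAS_BY_TYPE[QUOTA_TYPE_BY_VERSION[version]]


print(quota_names_for("Baseline v1"))
```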
Generative knowledge assist
Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
---|---|---|---|---|
SearchKnowledgeOperationsPerMinutePerProject | 60 requests/min | Global | Consumer project | SearchKnowledge requests |
Proactive generative knowledge assist
Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
---|---|---|---|---|
SuggestKnowledgeAssistOperationsPerMinutePerProject | 60 requests/min | Global | Resource project | KnowledgeAssist requests through AnalyzeContent or SuggestKnowledgeAssist |
SuggestKnowledgeAssistOperationsPerMinutePerProjectPerRegion | 30 requests/min | Regional | Resource project | KnowledgeAssist requests through AnalyzeContent or SuggestKnowledgeAssist in each region |
Other API quotas
Quota limit name | Default value | Region | Charging resource/Consumer project | Description |
---|---|---|---|---|
ConversationOperationsPerMinutePerProject | 300 requests/min | Global | Consumer project | Conversation requests other than AnalyzeContent and StreamingAnalyzeContent, such as CreateConversation and CompleteConversation. Quota is shared between Dialogflow and Agent Assist. |
MessagePollingOperationsPerMinutePerProject | 1,200 requests/min | Global | Consumer project | ListMessages requests. Quota is shared between Dialogflow and Agent Assist. |
AnswerRecordOperationsPerMinutePerProject | 300 requests/min | Global | Consumer project | AnswerRecord requests |