Dynamic shared quota (DSQ)

Dynamic shared quota (DSQ) serves your pay-as-you-go (PayGo) requests flexibly, adapting to your workload without requiring you to manage quotas or submit quota increase requests (QIRs). DSQ distributes the available PayGo capacity for a specific model and region among the customers sending requests. Your requests are served as long as capacity is available, with no preset quota limit.

Supported models

The following Gemini models are supported by DSQ:

How DSQ works

DSQ adapts to your traffic patterns without a preset quota and serves your requests as long as capacity is available. With DSQ, you don't need to submit a quota increase request (QIR) when your traffic increases, because there is no fixed quota that would throttle your requests.
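
Because requests are served only while capacity is available, a client may want to retry a request that arrives when capacity is short. The following is a minimal sketch of client-side retry with exponential backoff and jitter. It is not taken from the DSQ documentation: `CapacityUnavailable` is a hypothetical placeholder for whatever error your client library raises in that situation, and `send` is any zero-argument callable you supply that issues the request.

```python
import random
import time


class CapacityUnavailable(Exception):
    """Hypothetical stand-in for the error your client raises when capacity is short."""


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() and retry with exponential backoff plus full jitter.

    send:  zero-argument callable that issues one request (assumed, user-supplied).
    sleep: injectable so the function can be tested without real waiting.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except CapacityUnavailable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay cap doubles each attempt: 1s, 2s, 4s, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries out so clients don't retry in lockstep.
            sleep(random.uniform(0, cap))
```

Passing the request as a callable keeps the sketch independent of any particular SDK. The attempt count and delay values are illustrative defaults, not recommended settings.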

To prevent large traffic spikes from a few customers from interfering with other customers who send smaller, steadier traffic, DSQ applies a traffic control mechanism: a tokens per second (TPS) limit set at the organization level. Unlike a standard quota, this TPS limit doesn't automatically throttle requests above the limit. Instead, DSQ assigns requests different priorities depending on whether they are within or above the TPS limit. As a result, traffic spikes beyond the TPS limit don't interfere with requests that are within the limit.
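
The prioritization idea can be illustrated with a toy model. The sketch below is not the DSQ implementation. It assumes a single one-second window, a fixed pool of shared capacity measured in tokens, and two priority levels, and every name in it (`OrgWindow`, `serve`, `HIGH`, `LOW`) is hypothetical. It shows only that requests above an organization's TPS limit are still served when capacity allows, but are served after requests that are within the limit.

```python
import heapq
import itertools
from dataclasses import dataclass

HIGH, LOW = 0, 1  # a lower number is served first


@dataclass
class OrgWindow:
    """Tokens one organization has sent in the current one-second window."""
    tps_limit: int
    used: int = 0

    def classify(self, tokens: int) -> int:
        # Within the limit -> high priority; above it -> low priority.
        # Requests above the limit are not rejected here.
        priority = HIGH if self.used + tokens <= self.tps_limit else LOW
        self.used += tokens
        return priority


def serve(requests, limits, capacity):
    """Serve (org, tokens) requests from one window against shared capacity.

    Returns the served requests as (org, tokens, priority) in service order.
    """
    windows = {org: OrgWindow(limit) for org, limit in limits.items()}
    arrival = itertools.count()  # tie-breaker keeps arrival order within a priority
    queue = []
    for org, tokens in requests:
        priority = windows[org].classify(tokens)
        heapq.heappush(queue, (priority, next(arrival), org, tokens))

    served, remaining = [], capacity
    while queue:
        priority, _, org, tokens = heapq.heappop(queue)
        if tokens > remaining:
            continue  # no capacity left for this request in this window
        remaining -= tokens
        served.append((org, tokens, priority))
    return served
```

For example, with a TPS limit of 1000 for both organizations and 1200 tokens of shared capacity, `serve([("spiky", 800), ("spiky", 800), ("steady", 200)], {"spiky": 1000, "steady": 1000}, 1200)` serves the first `spiky` request and the `steady` request, while the second `spiky` request (above the limit) is the one left unserved. With 2000 tokens of capacity, all three requests are served.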

Gemini requests with multimodal inputs are subject to the corresponding system rate limits for image, audio, video, and document inputs.

What's next