To view the list of supported models and regions, see Deployments and endpoints. For the list of regions and models that support ML processing, see ML processing.
Features of Single Zone Provisioned Throughput
This section outlines the key features of Single Zone Provisioned Throughput:
Pricing and units are consistent with standard Provisioned Throughput: Single Zone Provisioned Throughput uses the same measure of throughput (generative AI scale units, or GSUs), the same pricing, and the same terms as standard Provisioned Throughput.
Single Zone Provisioned Throughput supports in-region ML processing: All requests are processed in the purchased region, including traffic that exceeds your purchased amount of throughput. Traffic that exceeds your purchased throughput is served from buffer capacity in the region and is billed at the pay-as-you-go rate.
You control overages: You can control how overflow traffic is handled by using the same request headers as with standard Provisioned Throughput.
You can monitor your order: You can monitor your Single Zone Provisioned Throughput order using the existing Provisioned Throughput monitoring capabilities.
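The overage control described above can be sketched as a small helper that builds the per-request header. This section does not name the header, so the header name `X-Vertex-AI-LLM-Request-Type` and its values `dedicated` and `shared` are assumptions taken from the standard Provisioned Throughput documentation. The helper `overflow_headers` is hypothetical and only builds a dictionary; it does not send a request. Verify the header name and values against the current documentation before relying on them.

```python
from typing import Dict, Optional

# Assumption: header name and values as documented for standard
# Provisioned Throughput; they are not stated in this section.
REQUEST_TYPE_HEADER = "X-Vertex-AI-LLM-Request-Type"
ALLOWED_MODES = ("dedicated", "shared")


def overflow_headers(mode: Optional[str] = None) -> Dict[str, str]:
    """Build the HTTP headers that select how a request is served.

    mode=None        -> no header; assumed default behavior, where traffic
                        beyond the purchased throughput spills over to
                        pay-as-you-go.
    mode="dedicated" -> assumed to use only Provisioned Throughput, so
                        excess traffic is rejected instead of billed.
    mode="shared"    -> assumed to use only pay-as-you-go.
    """
    if mode is None:
        return {}
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    return {REQUEST_TYPE_HEADER: mode}


if __name__ == "__main__":
    # Merge the result into the headers of whichever HTTP client you use.
    print(overflow_headers("dedicated"))
```

Keeping the header in one helper makes it straightforward to switch a workload between capped and overflow-allowed behavior without touching the request code itself.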
Limitations
Single Zone Provisioned Throughput has the following limitations:
Single Zone Provisioned Throughput is not a Covered Service and is excluded from the Gemini Online Inference on Vertex AI Service Level Agreement.
Single Zone Provisioned Throughput does not support batch requests or fine-tuning.
Purchase Single Zone Provisioned Throughput
For assistance with purchasing Single Zone Provisioned Throughput, contact your Google Cloud account representative.