Vertex AI in express mode lets you try a subset of Vertex AI features
by using only an express mode API key. This page shows you the REST resources
available for Vertex AI in express mode. Unlike the standard REST resource endpoints on Google Cloud, endpoints that are
available when using Vertex AI in express mode use the global endpoint
Standard Vertex AI endpoint format: Endpoint format for Vertex AI in express mode: aiplatform.googleapis.com
and don't include projects
or locations
. For
example, the following shows the difference between standard and express mode
endpoints for the datasets resource:https://{location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{location}/{model}:generateContent
https://aiplatform.googleapis.com/v1/{model}:generateContent
REST Resource: v1.publishers.models
Methods
countTokens
POST /v1/{endpoint}:countTokens
Perform a token counting.
generateContent
POST /v1/{model}:generateContent
Generate content with multimodal inputs.
streamGenerateContent
POST /v1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.REST Resource: v1beta1.publishers.models
Methods
countTokens
POST /v1beta1/{endpoint}:countTokens
Perform a token counting.
generateContent
POST /v1beta1/{model}:generateContent
Generate content with multimodal inputs.
streamGenerateContent
POST /v1beta1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.
Vertex AI in express mode REST API reference
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-29 UTC.