Veo is the name of the model that supports video generation.
Veo generates a video from a text prompt or an image prompt that
you provide. For more information about Veo, see the
Veo video generation overview. To explore this model in the console,
see the Video Generation model card in the Model Garden.

Try Veo on Vertex AI (Vertex AI Studio)

Supported Models

The Veo API supports the following models:

- veo-2.0-generate-001
- veo-3.0-generate-001
- veo-3.0-fast-generate-001
- veo-3.0-generate-preview (Preview)
- veo-3.0-fast-generate-preview (Preview)

HTTP request
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:predictLongRunning \
-d '{
"instances": [
{
"prompt": string,
"image": {
// Union field can be only one of the following:
"bytesBase64Encoded": string,
"gcsUri": string,
// End of list of possible types for union field.
"mimeType": string
},
"lastFrame": {
// Union field can be only one of the following:
"bytesBase64Encoded": string,
"gcsUri": string,
// End of list of possible types for union field.
"mimeType": string
},
"video": {
// Union field can be only one of the following:
"bytesBase64Encoded": string,
"gcsUri": string,
// End of list of possible types for union field.
"mimeType": string
}
}
],
"parameters": {
"aspectRatio": string,
"durationSeconds": integer,
"enhancePrompt": boolean,
"generateAudio": boolean,
"negativePrompt": string,
"personGeneration": string,
"resolution": string, // Veo 3 models only
"sampleCount": integer,
"seed": uint32,
"storageUri": string
}
}'
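To make the schema above concrete, the following sketch shows one way to assemble and send a text-prompt request from Python. The helper names (`build_veo_request`, `send_veo_request`) are hypothetical, not part of any SDK; the sketch assumes the gcloud CLI is installed and authenticated, and uses only the URL shape and JSON body documented above.

```python
import json
import subprocess
import urllib.request


def build_veo_request(project_id, location, model_id, prompt, parameters=None):
    """Assemble the predictLongRunning URL and JSON body for a text prompt."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}:predictLongRunning"
    )
    body = {"instances": [{"prompt": prompt}], "parameters": parameters or {}}
    return url, body


def send_veo_request(url, body):
    """POST the body using an access token obtained from the gcloud CLI."""
    token = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # The response contains the long-running operation "name".
        return json.load(resp)
```

The request-building step is separated from the authenticated call so that the URL and body can be inspected (or logged) before anything is sent.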
Instances

Field | Description |
---|---|
prompt | string. Required for text-to-video. Optional if an input image prompt is provided (image-to-video). A text string that describes the video that you want to generate. |
image | Union field. Optional. An image to guide video generation. Provide the image as either a bytesBase64Encoded string or a gcsUri string that points to a Cloud Storage bucket location. Image-to-video is supported by the following models: veo-2.0-generate-001 and veo-3.0-generate-preview. |
lastFrame | Union field. Optional. An image to use as the final frame for video in-filling (generating the video content that leads up to the frame). Provide the image as either a bytesBase64Encoded string or a gcsUri string that points to a Cloud Storage bucket location. Supported by veo-2.0-generate-001 only. |
video | Union field. Optional. A Veo-generated video to extend in length. Provide the video as either a bytesBase64Encoded string or a gcsUri string that points to a Cloud Storage bucket location. Supported by veo-2.0-generate-001 only. |
bytesBase64Encoded | string. A Base64-encoded string of the image or video file's bytes. |
gcsUri | string. A string URI to a Cloud Storage bucket location. |
mimeType | string. Required for the image, lastFrame, and video objects. Specifies the MIME type of a video or image. For images, the following MIME types are accepted: image/jpeg and image/png. For videos, the following MIME type is accepted: video/mp4. |

Parameters

Parameter | Description |
---|---|
aspectRatio | string. Optional. Specifies the aspect ratio of generated videos. The following are accepted values: 16:9 (default) and 9:16 (not supported by veo-3.0-generate-preview). |
durationSeconds | integer. Required. The length in seconds of the video files that you want to generate. The accepted values for each model are: veo-2.0-generate-001: 5-8 (the default is 8); veo-3.0-generate-preview: 8. |
enhancePrompt | boolean. Optional. Specifies whether to use Gemini to enhance your prompt. Accepted values are true or false. The default is true. |
generateAudio | boolean. Required for veo-3.0-generate-preview. Specifies whether to generate audio for the video. Accepted values are true or false. This parameter isn't supported by veo-2.0-generate-001. |
negativePrompt | string. Optional. A text string that describes anything you want to discourage the model from generating. |
personGeneration | string. Optional. The safety setting that controls whether people or face generation is allowed. One of the following: allow_adult (default): allow generation of adults only; dont_allow: disallows inclusion of people or faces in images. |
resolution | string. Optional. Veo 3 models only. The resolution of the generated video. Accepted values are 720p (default) or 1080p. |
sampleCount | int. Optional. The number of video samples to generate. Accepted values are 1 to 4. |
seed | uint32. Optional. A number used to initialize the random generation process. Using the same seed, prompt, and other parameters results in the same output videos, making the generation deterministic. The accepted range is 0 to 4,294,967,295. |
storageUri | string. Optional. A Cloud Storage bucket URI to store the output video, in the format gs://BUCKET_NAME/SUBDIRECTORY. If you don't provide a Cloud Storage bucket, base64-encoded video bytes are returned in the response. |
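Because several of these parameters have model-specific ranges, it can be useful to check them client-side before paying the cost of a round trip. The validator below is a hypothetical helper (not part of the API) that encodes the documented ranges for aspectRatio, durationSeconds, sampleCount, and seed:

```python
# Hypothetical client-side validator for the parameter ranges documented above.
VEO2 = "veo-2.0-generate-001"


def validate_parameters(model_id, params):
    """Raise ValueError if a parameter falls outside the documented ranges."""
    errors = []
    ratio = params.get("aspectRatio")
    if ratio is not None and ratio not in ("16:9", "9:16"):
        errors.append(f"aspectRatio must be 16:9 or 9:16, got {ratio!r}")
    duration = params.get("durationSeconds")
    if duration is not None:
        # veo-2.0-generate-001 accepts 5-8 seconds; Veo 3 preview accepts 8.
        allowed = range(5, 9) if model_id == VEO2 else (8,)
        if duration not in allowed:
            errors.append(f"durationSeconds {duration} not allowed for {model_id}")
    count = params.get("sampleCount")
    if count is not None and not 1 <= count <= 4:
        errors.append(f"sampleCount must be 1-4, got {count}")
    seed = params.get("seed")
    if seed is not None and not 0 <= seed <= 4_294_967_295:
        errors.append(f"seed out of uint32 range: {seed}")
    if errors:
        raise ValueError("; ".join(errors))
```

A sketch like this only mirrors the table above; the service remains the authority on what each model version accepts.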
Sample request

Use the following requests to send a text-to-video request or an image-to-video request:

Text-to-video generation request

REST

To test a text prompt by using the Vertex AI Veo API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

- PROJECT_ID: Your Google Cloud project ID.
- MODEL_ID: The model ID to use. Available values:
  - veo-2.0-generate-001
  - veo-3.0-generate-001
  - veo-3.0-fast-generate-001
  - veo-3.0-generate-preview (Preview)
  - veo-3.0-fast-generate-preview (Preview)
- TEXT_PROMPT: The text prompt used to guide video generation.
- OUTPUT_STORAGE_URI: Optional: The Cloud Storage bucket to store the output videos. If not provided, video bytes are returned in the response. For example: gs://video-bucket/output/.
- RESPONSE_COUNT: The number of video files you want to generate. Accepted integer values: 1-4.
- Additional optional parameters

  Use the following optional variables depending on your use case. Add some or all of the following parameters in the "parameters": {} object.

  "parameters": {
    "aspectRatio": "ASPECT_RATIO",
    "negativePrompt": "NEGATIVE_PROMPT",
    "personGeneration": "PERSON_SAFETY_SETTING",
    // "resolution": RESOLUTION, // Veo 3 models only
    "sampleCount": RESPONSE_COUNT,
    "seed": SEED_NUMBER
  }

  - ASPECT_RATIO: string. Optional. Defines the aspect ratio of the generated videos. Values: 16:9 (default, landscape) or 9:16 (portrait).
  - NEGATIVE_PROMPT: string. Optional. A text string that describes what you want to discourage the model from generating.
  - PERSON_SAFETY_SETTING: string. Optional. The safety setting that controls whether people or face generation is allowed. Values: allow_adult (default value): Allow generation of adults only. disallow: Disallows inclusion of people or faces in images.
  - RESOLUTION: string. Optional. Veo 3 models only. The resolution of the generated video. Values: 720p (default) or 1080p.
  - RESPONSE_COUNT: int. Optional. The number of output images requested. Values: 1-4.
  - SEED_NUMBER: uint32. Optional. A number to make generated videos deterministic. Specifying a seed number with your request without changing other parameters guides the model to produce the same videos. Values: 0-4294967295.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning

Request JSON body:

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT"
    }
  ],
  "parameters": {
    "storageUri": "OUTPUT_STORAGE_URI",
    "sampleCount": "RESPONSE_COUNT"
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
  -Method POST `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -InFile request.json `
  -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/a1b07c8e-7b5a-4aba-bb34-3e1ccb8afcc8"
}
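Rather than writing request.json by hand, the body can be generated programmatically. A minimal sketch with hypothetical example values (substitute your own prompt, bucket, and sample count) that produces a file equivalent to the one described above:

```python
import json

# Hypothetical values; substitute your own prompt, bucket, and count.
request_body = {
    "instances": [{"prompt": "A timelapse of clouds over a mountain lake"}],
    "parameters": {
        "storageUri": "gs://video-bucket/output/",
        "sampleCount": 2,
    },
}

# Write request.json for use with the curl or PowerShell commands above.
with open("request.json", "w") as f:
    json.dump(request_body, f, indent=2)
```

Generating the file this way avoids JSON syntax mistakes (stray commas, unescaped quotes in prompts) that are easy to make when editing by hand.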
Image-to-video generation request
REST
To test a text prompt by using the Vertex AI Veo API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- MODEL_ID: The model ID to use. Available values:
  - veo-2.0-generate-001 (GA)
  - veo-3.0-generate-preview (Preview)
- TEXT_PROMPT: The text prompt used to guide video generation.
- INPUT_IMAGE: Base64-encoded bytes string representing the input image. To ensure quality, the input image should be 720p or higher (1280 x 720 pixels) and have a 16:9 or 9:16 aspect ratio. Images of other aspect ratios or sizes may be resized or centrally cropped during the upload process.
- MIME_TYPE: The MIME type of the input image. Only images of the following MIME types are supported: image/jpeg or image/png.
- OUTPUT_STORAGE_URI: Optional: The Cloud Storage bucket to store the output videos. If not provided, video bytes are returned in the response. For example: gs://video-bucket/output/.
- RESPONSE_COUNT: The number of video files you want to generate. Accepted integer values: 1-4.
- DURATION: The length of video files that you want to generate. Accepted integer values are 5-8.
- Additional optional parameters

  Use the following optional variables depending on your use case. Add some or all of the following parameters in the "parameters": {} object.

  "parameters": {
    "aspectRatio": "ASPECT_RATIO",
    "negativePrompt": "NEGATIVE_PROMPT",
    "personGeneration": "PERSON_SAFETY_SETTING",
    // "resolution": RESOLUTION, // Veo 3 models only
    "sampleCount": RESPONSE_COUNT,
    "seed": SEED_NUMBER
  }

  - ASPECT_RATIO: string. Optional. Defines the aspect ratio of the generated videos. Values: 16:9 (default, landscape) or 9:16 (portrait).
  - NEGATIVE_PROMPT: string. Optional. A text string that describes what you want to discourage the model from generating.
  - PERSON_SAFETY_SETTING: string. Optional. The safety setting that controls whether people or face generation is allowed. Values: allow_adult (default value): Allow generation of adults only. disallow: Disallows inclusion of people or faces in images.
  - RESOLUTION: string. Optional. Veo 3 models only. The resolution of the generated video. Values: 720p (default) or 1080p.
  - RESPONSE_COUNT: int. Optional. The number of output images requested. Values: 1-4.
  - SEED_NUMBER: uint32. Optional. A number to make generated videos deterministic. Specifying a seed number with your request without changing other parameters guides the model to produce the same videos. Values: 0-4294967295.
HTTP method and URL:
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning
Request JSON body:
{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "image": {
        "bytesBase64Encoded": "INPUT_IMAGE",
        "mimeType": "MIME_TYPE"
      }
    }
  ],
  "parameters": {
    "storageUri": "OUTPUT_STORAGE_URI",
    "sampleCount": RESPONSE_COUNT
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning" | Select-Object -Expand Content
{
  "name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/a1b07c8e-7b5a-4aba-bb34-3e1ccb8afcc8"
}
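The INPUT_IMAGE replacement expects the raw image bytes as a Base64 string. A small sketch of that encoding step (the helper name is hypothetical):

```python
import base64


def encode_image(path):
    """Read an image file and return the Base64 string for bytesBase64Encoded."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```

The resulting string can be pasted into the "bytesBase64Encoded" field of the request body, or the image can instead be referenced by a gcsUri to avoid inflating the request size (Base64 encoding grows the payload by roughly a third).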
Poll the status of the video generation long-running operation
Check the status of the video generation long-running operation.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- MODEL_ID: The model ID to use.
- OPERATION_ID: The unique operation ID returned in the original generate video request.
HTTP method and URL:
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation
Request JSON body:
{
  "operationName": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID"
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation" | Select-Object -Expand Content
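In practice a client polls fetchPredictOperation on an interval until done is true. The loop below is a minimal sketch with hypothetical names: fetch_operation stands in for whatever function actually POSTs the operation name to the endpoint above and returns the decoded JSON, so the loop itself stays testable without network access.

```python
import time


def wait_for_operation(fetch_operation, operation_name,
                       poll_interval=10.0, timeout=600.0):
    """Poll a Veo long-running operation until it completes.

    fetch_operation is any callable that sends the given operation name to the
    model's :fetchPredictOperation endpoint and returns the decoded JSON.
    """
    deadline = time.monotonic() + timeout
    while True:
        op = fetch_operation(operation_name)
        if op.get("done"):
            return op
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"operation {operation_name} did not finish in {timeout}s")
        time.sleep(poll_interval)
```

Video generation can take a while, so a poll interval of several seconds and a generous timeout are reasonable defaults; the specific values here are illustrative, not service recommendations.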
Response body (generate video request)
Sending a text-to-video or image-to-video request returns the following response:
{
"name": string
}
Response element | Description |
---|---|
name |
The full operation name of the long-running operation that begins after a video generation request is sent. |
Sample response (generate video request)
{
"name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID"
}
Response body (poll long-running operation)
Polling the status of the original video generation long-running operation returns a response similar to the following:
{
"name": string,
"done": boolean,
"response":{
"@type":"type.googleapis.com/cloud.ai.large_models.vision.GenerateVideoResponse",
"raiMediaFilteredCount": integer,
"videos":[
{
"gcsUri": string,
"mimeType": string
},
{
"gcsUri": string,
"mimeType": string
},
{
"gcsUri": string,
"mimeType": string
},
{
"gcsUri": string,
"mimeType": string
}
]
}
}
Response element | Description |
---|---|
bytesBase64Encoded |
A Base64-encoded string that represents the video object's bytes. |
done |
A boolean value that indicates whether the operation is complete. |
encoding |
The video encoding type. |
gcsUri |
The Cloud Storage URI of the generated video. |
name |
The full operation name of the long-running operation that begins after a video generation request is sent. |
raiMediaFilteredCount |
Returns a count of videos that Veo filtered due to
responsible AI policies. If no videos are filtered, the returned count is
0 .
|
raiMediaFilteredReasons |
Lists the reasons that videos were filtered by Veo due to responsible AI policies. For more information, see Safety filter code categories. |
response |
The response body of the long-running operation. |
video |
The generated video. |
Sample response (poll long-running operation)
{
"name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID",
"done":true,
"response":{
"@type":"type.googleapis.com/cloud.ai.large_models.vision.GenerateVideoResponse",
"raiMediaFilteredCount": 0,
"videos":[
{
"gcsUri":"gs://STORAGE_BUCKET/TIMESTAMPED_SUBDIRECTORY/sample_0.mp4",
"mimeType":"video/mp4"
},
{
"gcsUri":"gs://STORAGE_BUCKET/TIMESTAMPED_SUBDIRECTORY/sample_1.mp4",
"mimeType":"video/mp4"
},
{
"gcsUri":"gs://STORAGE_BUCKET/TIMESTAMPED_SUBDIRECTORY/sample_2.mp4",
"mimeType":"video/mp4"
},
{
"gcsUri":"gs://STORAGE_BUCKET/TIMESTAMPED_SUBDIRECTORY/sample_3.mp4",
"mimeType":"video/mp4"
}
]
}
}
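Given a completed operation payload like the sample above, the output URIs and the filtered count can be extracted with a few lines. The helper name below is hypothetical; note that when no storageUri was provided, entries carry bytesBase64Encoded instead of gcsUri, which this sketch simply skips.

```python
def list_video_uris(operation):
    """Return (gcsUri list, raiMediaFilteredCount) from a completed operation."""
    if not operation.get("done"):
        raise ValueError("operation is still running")
    response = operation.get("response", {})
    filtered = response.get("raiMediaFilteredCount", 0)
    videos = response.get("videos", [])
    # Entries without gcsUri (base64-returned videos) are skipped here.
    uris = [v["gcsUri"] for v in videos if "gcsUri" in v]
    return uris, filtered
```

Checking raiMediaFilteredCount alongside the URI list matters: when it is nonzero, fewer videos than sampleCount were delivered, and raiMediaFilteredReasons explains why.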
More information
- For more information about using Veo on Vertex AI, see Generate videos using text and image prompts using Veo.
What's next
- Read Google DeepMind's information on the Veo model.
- Read the blog post "Veo and Imagen 3: Announcing new video and image generation models on Vertex AI".
- Read the blog post "New generative media models and tools, built with and for creators".