Direct Veo video generation using a reference image

Preview

This product or feature is a Generative AI Preview offering, subject to the "Pre-GA Offerings Terms" of the Google Cloud Service Specific Terms, as well as the Additional Terms for Generative AI Preview Products. For this Generative AI Preview offering, Customers may elect to use it for production or commercial purposes, or disclose Generated Output to third-parties, and may process personal data as outlined in the Cloud Data Processing Addendum, subject to the obligations and restrictions described in the agreement under which you access Google Cloud. Pre-GA products are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Veo on Vertex AI lets you use reference images to direct your generated video's content and artistic style. You can use one of the following reference image types:

Asset image: You provide up to three images of a single person, character, or product. Veo preserves the subject's appearance in the output video.
Style image: You provide a single style image. Veo applies the style from your uploaded image in the output video. This feature is only supported by veo-2.0-generate-exp in Preview.
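The limits above can be checked before you send a request. The following is a minimal sketch, not part of any Google SDK: `ReferenceImage` and `check_reference_images` are hypothetical names, and treating asset and style images as mutually exclusive is an assumption based on the "one of the following" wording.

```python
from dataclasses import dataclass

# Limits taken from the descriptions above.
MAX_ASSET_IMAGES = 3
STYLE_MODEL = "veo-2.0-generate-exp"
KNOWN_TYPES = {"asset", "style"}


@dataclass(frozen=True)
class ReferenceImage:
    """Hypothetical container for one reference image."""
    uri: str
    reference_type: str  # "asset" or "style"


def check_reference_images(model: str, images: list) -> None:
    """Raise ValueError if the images break the limits described above."""
    if not images:
        raise ValueError("Provide at least one reference image.")
    types = {image.reference_type for image in images}
    unknown = types - KNOWN_TYPES
    if unknown:
        raise ValueError(f"Unknown reference type(s): {sorted(unknown)}")
    if len(types) > 1:
        # Assumption: a request uses asset images or a style image, not both.
        raise ValueError("Use either asset images or a style image, not both.")
    if types == {"asset"} and len(images) > MAX_ASSET_IMAGES:
        raise ValueError("Provide at most three asset images.")
    if types == {"style"}:
        if len(images) != 1:
            raise ValueError("Provide exactly one style image.")
        if model != STYLE_MODEL:
            raise ValueError(f"Style images require {STYLE_MODEL}.")
```

Calling `check_reference_images` with a valid combination returns nothing. An invalid combination raises `ValueError` with a message that names the broken limit.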

For more information about writing effective text prompts for video generation, see the Veo prompt guide.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Enable the Vertex AI API.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the API

Set up authentication for your environment.

Select the tab for how you plan to use the samples on this page:
Console

When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
Python

To use the Python samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.
For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.
REST

To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Use subject images to generate videos

Do the following:

Console

In the Google Cloud console, go to the Vertex AI Studio > Media Studio page.

Click Veo.
In the Settings pane, select the following settings:
- Model: select one of the following:
  - Veo 2: veo-2.0-generate-exp
  - Veo 3: veo-3.1-generate-preview
    
    Note: veo-3.1-generate-preview only returns 8 second videos when you use reference images.
- Number of results: adjust the slider or enter a value between 1 and 4.
In the Reference section, select Subject, and then click Add.
Choose one to three images on your computer to upload.
Optional: In the Safety section, select one of the following Person generation settings:
- Allow (Adults only): default value. Generate adult people or faces only. Don't generate people or faces of youth or children.
- Don't allow: don't generate people or faces.
Optional: In the Advanced options section, enter a Seed value for randomizing video generation.
In the Write your prompt box, enter your text prompt that describes the videos to generate.
Click Generate.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
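
Before you run the sample, it can help to confirm that the three variables above are visible to Python. This is a small sketch that uses only the standard library. The helper name `missing_vertex_env` is hypothetical, and the check only tests that each variable is non-empty, not that its value is valid.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_GENAI_USE_VERTEXAI",
)


def missing_vertex_env(environ=None) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vertex_env()
    if missing:
        print("Set these variables first:", ", ".join(missing))
    else:
        print("Environment looks ready for the Gen AI SDK.")
```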

import time
from google import genai
from google.genai.types import GenerateVideosConfig, Image, VideoGenerationReferenceImage

client = genai.Client()

# TODO(developer): Replace with your Cloud Storage bucket and prefix.
output_gcs_uri = "gs://your-bucket/your-prefix"

operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",
    prompt="slowly rotate this coffee mug in a 360 degree circle",
    config=GenerateVideosConfig(
        reference_images=[
            VideoGenerationReferenceImage(
                image=Image(
                    gcs_uri="gs://cloud-samples-data/generative-ai/image/mug.png",
                    mime_type="image/png",
                ),
                reference_type="asset",
            ),
        ],
        aspect_ratio="16:9",
        output_gcs_uri=output_gcs_uri,
    ),
)

while not operation.done:
    time.sleep(15)
    operation = client.operations.get(operation)
    print(operation)

if operation.response:
    print(operation.result.generated_videos[0].video.uri)

# Example response:
# gs://your-bucket/your-prefix
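
The polling loop in the sample runs until the operation finishes, however long that takes. One possible variation is to bound the wait with a timeout. The helper below is a hypothetical sketch, not part of the Gen AI SDK. It takes the refresh and completion checks as callables, so the same loop can wrap `client.operations.get` and the `done` attribute used above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until_done(
    operation: T,
    refresh: Callable[[T], T],
    is_done: Callable[[T], bool],
    interval_s: float = 15.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Refresh the operation until is_done() is true or the timeout passes."""
    deadline = clock() + timeout_s
    while not is_done(operation):
        if clock() >= deadline:
            raise TimeoutError(f"Operation not done after {timeout_s} seconds.")
        sleep(interval_s)
        operation = refresh(operation)
    return operation


# With the sample above, the call might look like this (assumed usage):
# operation = poll_until_done(
#     operation, client.operations.get, lambda op: op.done
# )
```

The `sleep` and `clock` parameters exist only so that the loop can be tested without waiting. The default values of 15 and 600 seconds are illustrative.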

REST

After you set up your environment, you can use REST to test a text prompt. The following sample sends a request to the publisher model endpoint.

For more information about the Veo API, see the Veo on Vertex AI API.

Use the following commands to send a video generation request. This request begins a long-running operation and stores output to a Cloud Storage bucket you specify.

Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- MODEL_ID: A string representing the model ID to use. The following are accepted values:
  - Veo 2: veo-2.0-generate-exp
  - Veo 3: veo-3.1-generate-preview
- TEXT_PROMPT: The text prompt used to guide video generation.
- BASE64_ENCODED_IMAGE: A Base64-encoded subject image. You can repeat this field and mimeType to specify up to three subject images.
- IMAGE_MIME_TYPE: The MIME type of the input image. Use one of the following:
  - image/jpeg
  - image/png
  You can repeat this field and bytesBase64Encoded to specify up to three subject images.
- OUTPUT_STORAGE_URI: Optional: The Cloud Storage bucket to store the output videos, for example gs://video-bucket/output/. If not provided, a Base64-encoded video is returned in the response.
- RESPONSE_COUNT: The number of video files you want to generate. Accepted integer values: 1-4.
- Additional optional parameters
  
  Use the following optional variables depending on your use case. Add some or all of the following parameters in the "parameters": {} object.
```
"parameters": {
  "aspectRatio": "ASPECT_RATIO",
  "negativePrompt": "NEGATIVE_PROMPT",
  "personGeneration": "PERSON_SAFETY_SETTING",
  // "resolution": RESOLUTION, // Veo 3 models only
  "sampleCount": RESPONSE_COUNT,
  "seed": SEED_NUMBER
}
```
  - ASPECT_RATIO: Optional: A string value that describes the aspect ratio of the generated videos. You can use the following values:
    - "16:9" for landscape
    - "9:16" for portrait
    The default value is "16:9".
  - NEGATIVE_PROMPT: Optional: A string value that describes content that you want to prevent the model from generating.
  - PERSON_SAFETY_SETTING: Optional: A string value that controls the safety setting for generating people or faces. You can use the following values:
    - "allow_adult": Only allow generation of adult people and faces.
    - "disallow": Doesn't generate people or faces.
    The default value is "allow_adult".
  - RESOLUTION: Optional: A string value that controls the resolution of the generated video. Supported by Veo 3 models only. You can use the following values:
    - "720p"
    - "1080p"
    The default value is "720p".
  - RESPONSE_COUNT: Optional: An integer value that describes the number of videos to generate. The accepted range of values is 1-4.
  - SEED_NUMBER: Optional: A uint32 value that the model uses to generate deterministic videos. Specifying a seed number with your request without changing other parameters guides the model to produce the same videos. The accepted range of values is 0-4294967295.
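The optional parameters and their accepted values can be assembled and checked in code before you write request.json. The following sketch assumes nothing beyond the values listed above. `build_parameters` is a hypothetical helper, and it omits every parameter that you leave unset so that the service defaults apply.

```python
# Accepted values copied from the parameter list above.
ASPECT_RATIOS = ("16:9", "9:16")
PERSON_SETTINGS = ("allow_adult", "disallow")
RESOLUTIONS = ("720p", "1080p")  # Veo 3 models only
MAX_SEED = 4294967295  # largest uint32 value


def _check_choice(name, value, allowed):
    """Raise ValueError if a set value is not in the allowed tuple."""
    if value is not None and value not in allowed:
        raise ValueError(f"{name} must be one of {allowed}, got {value!r}")


def build_parameters(
    *,
    sample_count=1,
    aspect_ratio=None,
    negative_prompt=None,
    person_generation=None,
    resolution=None,
    seed=None,
) -> dict:
    """Return a "parameters" object that contains only the values you set."""
    if not 1 <= sample_count <= 4:
        raise ValueError("sample_count must be between 1 and 4.")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between 0 and {MAX_SEED}.")
    _check_choice("aspect_ratio", aspect_ratio, ASPECT_RATIOS)
    _check_choice("person_generation", person_generation, PERSON_SETTINGS)
    _check_choice("resolution", resolution, RESOLUTIONS)
    candidates = {
        "aspectRatio": aspect_ratio,
        "negativePrompt": negative_prompt,
        "personGeneration": person_generation,
        "resolution": resolution,
        "sampleCount": sample_count,
        "seed": seed,
    }
    return {key: value for key, value in candidates.items() if value is not None}
```

For example, `build_parameters(sample_count=2, aspect_ratio="9:16", seed=0)` returns a dictionary with only the `aspectRatio`, `sampleCount`, and `seed` keys.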
HTTP method and URL:
```
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning
```
Request JSON body:
```
{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      // The following fields can be repeated for up to three total
      // images.
      "referenceImages": [
        {
          "image": {
            "bytesBase64Encoded": "BASE64_ENCODED_IMAGE",
            "mimeType": "IMAGE_MIME_TYPE"
          },
          "referenceType": "asset"
        }
      ]
    }
  ],
  "parameters": {
    "durationSeconds": 8,
    "storageUri": "OUTPUT_STORAGE_URI",
    "sampleCount": RESPONSE_COUNT
  }
}
```
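If you prefer to generate request.json from image files instead of pasting Base64 strings by hand, a short script can build the same body. This is a sketch that uses only the Python standard library. `build_asset_request` is a hypothetical helper and the file names in the usage comment are placeholders.

```python
import base64
import json

ALLOWED_MIME_TYPES = ("image/jpeg", "image/png")


def build_asset_request(prompt, images, storage_uri=None, sample_count=1) -> dict:
    """Build the request body shown above.

    images is a list of (raw_bytes, mime_type) pairs, one per subject image.
    """
    if not 1 <= len(images) <= 3:
        raise ValueError("Provide one to three subject images.")
    reference_images = []
    for data, mime_type in images:
        if mime_type not in ALLOWED_MIME_TYPES:
            raise ValueError(f"Unsupported MIME type: {mime_type}")
        reference_images.append({
            "image": {
                "bytesBase64Encoded": base64.b64encode(data).decode("ascii"),
                "mimeType": mime_type,
            },
            "referenceType": "asset",
        })
    parameters = {"durationSeconds": 8, "sampleCount": sample_count}
    if storage_uri:
        parameters["storageUri"] = storage_uri
    return {
        "instances": [{"prompt": prompt, "referenceImages": reference_images}],
        "parameters": parameters,
    }


# Assumed usage with placeholder file names:
# with open("mug.png", "rb") as f:
#     body = build_asset_request("rotate the mug", [(f.read(), "image/png")])
# with open("request.json", "w") as f:
#     json.dump(body, f)
```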
To send your request, choose one of these options:
curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login , or by using Cloud Shell, which automatically logs you into the gcloud CLI . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning"
PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning" | Select-Object -Expand Content
This request returns a full operation name with a unique operation ID. Use this full operation name to poll the status of the video generation request.
```
{
  "name":
  "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/a1b07c8e-7b5a-4aba-bb34-3e1ccb8afcc8"
}
```
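
The full operation name packs the project, location, model, and operation ID into one path. The sketch below splits a name of the shape shown above and builds the body for a status request. Both function names are hypothetical, and the parser assumes that the name always has exactly the ten path segments that appear in this example.

```python
# Labels expected at the even-numbered positions of the operation name.
EXPECTED_LABELS = {
    0: "projects",
    2: "locations",
    4: "publishers",
    6: "models",
    8: "operations",
}


def parse_operation_name(name: str) -> dict:
    """Split a full operation name into its labeled parts."""
    parts = name.split("/")
    if len(parts) != 10 or any(
        parts[index] != label for index, label in EXPECTED_LABELS.items()
    ):
        raise ValueError(f"Unexpected operation name: {name!r}")
    return {
        "project_id": parts[1],
        "location": parts[3],
        "publisher": parts[5],
        "model_id": parts[7],
        "operation_id": parts[9],
    }


def build_fetch_body(name: str) -> dict:
    """Return the body for a fetchPredictOperation request."""
    parse_operation_name(name)  # Validates the shape before use.
    return {"operationName": name}
```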

Optional: Check the status of the video generation long-running operation.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your Google Cloud project ID.
MODEL_ID: The model ID to use.
OPERATION_ID: The unique operation ID returned in the original generate video request.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation

Request JSON body:

{
  "operationName": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID"
}

To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login , or by using Cloud Shell, which automatically logs you into the gcloud CLI . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation" | Select-Object -Expand Content

This request returns information about the operation, including whether the operation is still running or is done.

Response

{
  "name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID",
  "done": true,
  "response": {
    "raiMediaFilteredCount": 0,
    "@type": "type.googleapis.com/cloud.ai.large_models.vision.GenerateVideoResponse",
    "videos": [
      {
        "gcsUri":"gs://BUCKET_NAME/TIMESTAMPED_FOLDER/sample_0.mp4",
        "mimeType": "video/mp4"
      }
    ]
  }
}
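
Once the status response is parsed as JSON, the fields of interest can be pulled out with a few dictionary lookups. The helper below is a hypothetical sketch based on the response shown above. The `error` check assumes that a failed operation carries an `error` field, as long-running operations on Google APIs generally do. The sample response doesn't show that case.

```python
def summarize_operation(operation: dict) -> dict:
    """Return the done flag, filtered count, and Cloud Storage URIs."""
    done = bool(operation.get("done"))
    if done and "error" in operation:
        # Assumption: failures are reported in an "error" field.
        raise RuntimeError(f"Video generation failed: {operation['error']}")
    response = operation.get("response", {}) if done else {}
    return {
        "done": done,
        "filtered_count": response.get("raiMediaFilteredCount", 0),
        "uris": [
            video["gcsUri"]
            for video in response.get("videos", [])
            if "gcsUri" in video
        ],
    }
```

While the operation is still running, the result has `done` set to `False` and an empty `uris` list, so you can call the helper after every poll.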

Use style images to generate videos

Do the following:

Console

In the Google Cloud console, go to the Vertex AI Studio > Media Studio page.

Click Veo.
In the Settings pane, select the following settings:
- Model: select veo-2.0-generate-exp.
- Number of results: adjust the slider or enter a value between 1 and 4.
In the Reference section, select Style, and then click Add.
Choose an image on your computer to upload.
Optional: In the Safety section, select one of the following Person generation settings:
- Allow (Adults only): default value. Generate adult people or faces only. Don't generate people or faces of youth or children.
- Don't allow: don't generate people or faces.
Optional: In the Advanced options section, enter a Seed value for randomizing video generation.
In the Write your prompt box, enter your text prompt that describes the videos to generate.
Click Generate.

REST

After you set up your environment, you can use REST to test a text prompt. The following sample sends a request to the publisher model endpoint.

For more information about the Veo API, see the Veo on Vertex AI API.

Use the following commands to send a video generation request. This request begins a long-running operation and stores output to a Cloud Storage bucket you specify.

Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- MODEL_ID: A string representing the model ID to use. Use the following value: veo-2.0-generate-exp.
  
  Important: Veo 3.1 models don't support referenceImages.style. Use veo-2.0-generate-exp when using style images.
- TEXT_PROMPT: The text prompt used to guide video generation.
- BASE64_ENCODED_IMAGE: A Base64-encoded style image.
- IMAGE_MIME_TYPE: The MIME type of the input image. Use one of the following:
  - image/jpeg
  - image/png
- OUTPUT_STORAGE_URI: Optional: The Cloud Storage bucket to store the output videos, for example gs://video-bucket/output/. If not provided, video bytes are returned in the response.
- RESPONSE_COUNT: The number of video files you want to generate. Accepted integer values: 1-4.
- Additional optional parameters
  
  Use the following optional variables depending on your use case. Add some or all of the following parameters in the "parameters": {} object.
```
"parameters": {
  "aspectRatio": "ASPECT_RATIO",
  "negativePrompt": "NEGATIVE_PROMPT",
  "personGeneration": "PERSON_SAFETY_SETTING",
  // "resolution": RESOLUTION, // Veo 3 models only
  "sampleCount": RESPONSE_COUNT,
  "seed": SEED_NUMBER
}
```
  - ASPECT_RATIO: Optional: A string value that describes the aspect ratio of the generated videos. You can use the following values:
    - "16:9" for landscape
    - "9:16" for portrait
    The default value is "16:9".
  - NEGATIVE_PROMPT: Optional: A string value that describes content that you want to prevent the model from generating.
  - PERSON_SAFETY_SETTING: Optional: A string value that controls the safety setting for generating people or faces. You can use the following values:
    - "allow_adult": Only allow generation of adult people and faces.
    - "disallow": Doesn't generate people or faces.
    The default value is "allow_adult".
  - RESOLUTION: Optional: A string value that controls the resolution of the generated video. Supported by Veo 3 models only. You can use the following values:
    - "720p"
    - "1080p"
    The default value is "720p".
  - RESPONSE_COUNT: Optional: An integer value that describes the number of videos to generate. The accepted range of values is 1-4.
  - SEED_NUMBER: Optional: A uint32 value that the model uses to generate deterministic videos. Specifying a seed number with your request without changing other parameters guides the model to produce the same videos. The accepted range of values is 0-4294967295.
HTTP method and URL:
```
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning
```
Request JSON body:
```
{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "referenceImages": [
        {
          "image": {
            "bytesBase64Encoded": "BASE64_ENCODED_IMAGE",
            "mimeType": "IMAGE_MIME_TYPE"
          },
          "referenceType": "style"
        }
      ]
    }
  ],
  "parameters": {
    "durationSeconds": 8,
    "storageUri": "OUTPUT_STORAGE_URI",
    "sampleCount": RESPONSE_COUNT
  }
}
```
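Because style images work only with veo-2.0-generate-exp, one way to avoid sending the request to the wrong model is to fix the model ID in code when you build the URL and body. The following is a sketch under that assumption. `build_style_request` is a hypothetical helper that expects an already Base64-encoded image string and uses the us-central1 endpoint shown above.

```python
STYLE_MODEL_ID = "veo-2.0-generate-exp"  # Only model that accepts style images.
LOCATION = "us-central1"
ALLOWED_MIME_TYPES = ("image/jpeg", "image/png")


def build_style_request(
    project_id, prompt, image_b64, mime_type, storage_uri=None, sample_count=1
):
    """Return the (url, body) pair for a style image request."""
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"Unsupported MIME type: {mime_type}")
    url = (
        f"https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{LOCATION}/publishers/google/models/{STYLE_MODEL_ID}"
        ":predictLongRunning"
    )
    parameters = {"durationSeconds": 8, "sampleCount": sample_count}
    if storage_uri:
        parameters["storageUri"] = storage_uri
    body = {
        "instances": [
            {
                "prompt": prompt,
                # A style request carries exactly one reference image.
                "referenceImages": [
                    {
                        "image": {
                            "bytesBase64Encoded": image_b64,
                            "mimeType": mime_type,
                        },
                        "referenceType": "style",
                    }
                ],
            }
        ],
        "parameters": parameters,
    }
    return url, body
```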
To send your request, choose one of these options:
curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login , or by using Cloud Shell, which automatically logs you into the gcloud CLI . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning"
PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning" | Select-Object -Expand Content
This request returns a full operation name with a unique operation ID. Use this full operation name to poll the status of the video generation request.
```
{
  "name":
  "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/a1b07c8e-7b5a-4aba-bb34-3e1ccb8afcc8"
}
```

Optional: Check the status of the video generation long-running operation.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your Google Cloud project ID.
MODEL_ID: The model ID to use.
OPERATION_ID: The unique operation ID returned in the original generate video request.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation

Request JSON body:

{
  "operationName": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID"
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation" | Select-Object -Expand Content

This request returns information about the operation, including whether the operation is still running or is done.

Response

{
  "name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID",
  "done": true,
  "response": {
    "raiMediaFilteredCount": 0,
    "@type": "type.googleapis.com/cloud.ai.large_models.vision.GenerateVideoResponse",
    "videos": [
      {
        "gcsUri":"gs://BUCKET_NAME/TIMESTAMPED_FOLDER/sample_0.mp4",
        "mimeType": "video/mp4"
      }
    ]
  }
}
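
If you omit OUTPUT_STORAGE_URI, the video bytes come back in the response instead of a gcsUri. The sketch below writes such inline videos to local files. It assumes that each video entry carries the bytes in a `bytesBase64Encoded` field, mirroring the request format. The sample response above doesn't show that field, so check it against a real response. `save_inline_videos` is a hypothetical helper.

```python
import base64
from pathlib import Path


def save_inline_videos(operation: dict, out_dir: str) -> list:
    """Decode inline videos from a finished operation and save them as MP4."""
    videos = operation.get("response", {}).get("videos", [])
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, video in enumerate(videos):
        # Assumed field name; entries stored in Cloud Storage are skipped.
        encoded = video.get("bytesBase64Encoded")
        if encoded is None:
            continue
        path = target / f"sample_{index}.mp4"
        path.write_bytes(base64.b64decode(encoded))
        saved.append(path)
    return saved
```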
