Use reference images to guide video generation

Veo on Vertex AI lets you use reference images with veo-2.0-generate-exp to guide the content and artistic style of your generated video. You can use one of the following reference image types:

  • Asset image: You provide up to three images of a single person, character, or product. Veo preserves the subject's appearance in the output video.

  • Style image: You provide a single style image. Veo applies the style from your uploaded image in the output video.
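The two reference types accept different numbers of images. As a minimal sketch, a client could check those limits before sending a request. The function name is hypothetical, and the "asset" and "style" keys mirror the referenceType values used in the REST samples on this page; this is an illustration, not part of any Veo SDK.

```python
# Hypothetical pre-flight check; the limits come from the list above.
MAX_REFERENCE_IMAGES = {"asset": 3, "style": 1}


def check_reference_images(reference_type: str, image_count: int) -> None:
    """Raise ValueError if the image count is not allowed for the type."""
    if reference_type not in MAX_REFERENCE_IMAGES:
        raise ValueError(f"unknown reference type: {reference_type!r}")
    limit = MAX_REFERENCE_IMAGES[reference_type]
    if not 1 <= image_count <= limit:
        raise ValueError(
            f"{reference_type} references accept 1 to {limit} image(s), "
            f"got {image_count}"
        )


check_reference_images("asset", 3)  # passes silently
check_reference_images("style", 1)  # passes silently
```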

For more information about writing effective text prompts for video generation, see the Veo prompt guide.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. Set up authentication for your environment.

    Select the tab for how you plan to use the samples on this page:

    Console

    When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

    REST

    To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

      Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:

      gcloud init

      If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

    For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
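If you prefer to call the REST API from a script rather than from curl, the same gcloud credentials can be reused. The following sketch assumes the gcloud CLI is installed, initialized, and on your PATH; the helper names are hypothetical, and the command lookup may need adjusting on Windows.

```python
import subprocess


def get_access_token() -> str:
    """Return an access token by running `gcloud auth print-access-token`."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if gcloud reports a failure
    )
    return result.stdout.strip()


def build_headers(token: str) -> dict:
    """Build the same two headers that the curl samples on this page send."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json; charset=utf-8",
    }
```

A typical call would be build_headers(get_access_token()). Access tokens expire, so a long-running script should request a fresh one rather than cache it indefinitely.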

Use subject images to generate videos

Do the following:

Console

  1. In the Google Cloud console, go to the Vertex AI Studio > Media Studio page.

    Media Studio

  2. Click Veo.

  3. In the Settings pane, select the following settings:

    • Model: select veo-2.0-generate-exp.

    • Number of results: adjust the slider or enter a value between 1 and 4.

  4. In the Reference section, select Subject, and then click Add.

  5. Choose one to three images on your computer to upload.

  6. Optional: In the Safety section, select one of the following Person generation settings:

    • Allow (Adults only): default value. Generate adult people or faces only. Don't generate people or faces of youths or children.

    • Don't allow: don't generate people or faces.

  7. Optional: In the Advanced options section, enter a Seed value. Reusing the same seed with otherwise unchanged settings guides the model to produce the same videos.

  8. In the Write your prompt box, enter your text prompt that describes the videos to generate.

  9. Click Generate.

REST

After you set up your environment, you can use REST to send a text prompt together with subject images. The following sample sends a request to the publisher model endpoint.

For more information about the Veo API, see the Veo on Vertex AI API.

  1. Use the following commands to send a video generation request. This request begins a long-running operation and stores output to a Cloud Storage bucket you specify.

    Before using any of the request data, make the following replacements:

    • PROJECT_ID: Your Google Cloud project ID.
    • TEXT_PROMPT: The text prompt used to guide video generation.
    • BASE64_ENCODED_IMAGE: A Base64-encoded subject image. You can repeat this field and mimeType to specify up to three subject images.
    • IMAGE_MIME_TYPE: The MIME type of the input image. One of the following:

      • image/jpeg
      • image/png

      You can repeat this field and bytesBase64Encoded to specify up to three subject images.

    • OUTPUT_STORAGE_URI: Optional: The Cloud Storage bucket to store the output videos. If not provided, a Base64-encoded video is returned in the response. For example: gs://video-bucket/output/.
    • RESPONSE_COUNT: The number of video files you want to generate. Accepted integer values: 1-4.
    • Additional optional parameters

      Use the following optional variables depending on your use case. Add some or all of the following parameters in the "parameters": {} object.

      "parameters": {
        "aspectRatio": "ASPECT_RATIO",
        "negativePrompt": "NEGATIVE_PROMPT",
        "personGeneration": "PERSON_SAFETY_SETTING",
        // "resolution": RESOLUTION, // Veo 3 models only
        "sampleCount": RESPONSE_COUNT,
        "seed": SEED_NUMBER
      }
      • ASPECT_RATIO: string. Optional. Defines the aspect ratio of the generated videos. Values: 16:9 (default, landscape) or 9:16 (portrait).
      • NEGATIVE_PROMPT: string. Optional. A text string that describes what you want to discourage the model from generating.
      • PERSON_SAFETY_SETTING: string. Optional. The safety setting that controls whether people or face generation is allowed. Values:
        • allow_adult (default value): Allow generation of adults only.
        • disallow: Disallows inclusion of people or faces in videos.
      • RESOLUTION: string. Optional. Veo 3 models only. The resolution of the generated video. Values: 720p (default) or 1080p.
      • RESPONSE_COUNT: int. Optional. The number of output videos requested. Values: 1-4.
      • SEED_NUMBER: uint32. Optional. A number to make generated videos deterministic. Specifying a seed number with your request without changing other parameters guides the model to produce the same videos. Values: 0 - 4294967295.
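The value ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch, not an official helper: build_parameters is a hypothetical name, and it leaves out resolution because that option is described as applying to Veo 3 models only.

```python
from typing import Optional

# Allowed values as listed above.
ASPECT_RATIOS = ("16:9", "9:16")
PERSON_SETTINGS = ("allow_adult", "disallow")
MAX_SEED = 4294967295  # largest uint32 value


def build_parameters(
    sample_count: int = 1,
    aspect_ratio: Optional[str] = None,
    negative_prompt: Optional[str] = None,
    person_generation: Optional[str] = None,
    seed: Optional[int] = None,
) -> dict:
    """Return a "parameters" object containing only the options that were set."""
    if not 1 <= sample_count <= 4:
        raise ValueError("sampleCount must be between 1 and 4")
    parameters = {"sampleCount": sample_count}
    if aspect_ratio is not None:
        if aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspectRatio must be one of {ASPECT_RATIOS}")
        parameters["aspectRatio"] = aspect_ratio
    if negative_prompt is not None:
        parameters["negativePrompt"] = negative_prompt
    if person_generation is not None:
        if person_generation not in PERSON_SETTINGS:
            raise ValueError(f"personGeneration must be one of {PERSON_SETTINGS}")
        parameters["personGeneration"] = person_generation
    if seed is not None:
        if not 0 <= seed <= MAX_SEED:
            raise ValueError("seed must fit in an unsigned 32-bit integer")
        parameters["seed"] = seed
    return parameters


print(build_parameters(sample_count=2, aspect_ratio="9:16", seed=42))
```

Omitting unset options keeps the request small and lets the service apply its own defaults.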

    HTTP method and URL:

    POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/veo-2.0-generate-001:predictLongRunning
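For scripted use, the endpoint URL can be assembled from its parts and called with the Python standard library. This is a sketch under assumptions: the function names are hypothetical, and the location parameter generalizes a pattern that this page shows only for us-central1. The model ID is a parameter because this page names veo-2.0-generate-exp in the introduction and veo-2.0-generate-001 in the sample URL.

```python
import json
import urllib.request


def predict_long_running_url(
    project_id: str, model_id: str, location: str = "us-central1"
) -> str:
    """Build the predictLongRunning URL in the shape shown above."""
    host = f"https://{location}-aiplatform.googleapis.com"
    return (
        f"{host}/v1/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}:predictLongRunning"
    )


def post_json(url: str, body: dict, token: str) -> dict:
    """POST a JSON body with a bearer token and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


print(predict_long_running_url("my-project", "veo-2.0-generate-001"))
```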

    Request JSON body:

    {
      "instances": [
        {
          "prompt": "TEXT_PROMPT",
          "referenceImages": [
            {
              "image": {
                "bytesBase64Encoded": "BASE64_ENCODED_IMAGE",
                "mimeType": "IMAGE_MIME_TYPE"
              },
              "referenceType": "asset"
            }
          ]
        }
      ],
      "parameters": {
        "durationSeconds": 8,
        "storageUri": "OUTPUT_STORAGE_URI",
        "sampleCount": RESPONSE_COUNT
      }
    }
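Writing the request body by hand gets error-prone once several Base64 strings are involved. The following sketch builds the same structure from raw image bytes; build_asset_request is a hypothetical helper, and the field names are copied from the request body above.

```python
import base64
import json
from typing import List, Optional, Tuple

ALLOWED_MIME_TYPES = ("image/jpeg", "image/png")


def build_asset_request(
    prompt: str,
    images: List[Tuple[bytes, str]],
    storage_uri: Optional[str] = None,
    sample_count: int = 1,
    duration_seconds: int = 8,
) -> dict:
    """Build a request body from one to three (raw_bytes, mime_type) pairs."""
    if not 1 <= len(images) <= 3:
        raise ValueError("provide between one and three subject images")
    reference_images = []
    for raw_bytes, mime_type in images:
        if mime_type not in ALLOWED_MIME_TYPES:
            raise ValueError(f"unsupported MIME type: {mime_type}")
        reference_images.append(
            {
                "image": {
                    "bytesBase64Encoded": base64.b64encode(raw_bytes).decode("ascii"),
                    "mimeType": mime_type,
                },
                "referenceType": "asset",
            }
        )
    parameters = {"durationSeconds": duration_seconds, "sampleCount": sample_count}
    if storage_uri:
        # Without a storage URI, the video is returned in the response instead.
        parameters["storageUri"] = storage_uri
    return {
        "instances": [{"prompt": prompt, "referenceImages": reference_images}],
        "parameters": parameters,
    }


# Placeholder bytes stand in for real image data in this example.
body = build_asset_request("TEXT_PROMPT", [(b"image-bytes", "image/png")])
print(json.dumps(body, indent=2))
```

Saving the json.dumps output to request.json produces a file that the curl and PowerShell commands on this page can send.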
    

    To send your request, choose one of these options:

    curl

    Save the request body in a file named request.json, and execute the following command:

    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/veo-2.0-generate-001:predictLongRunning"

    PowerShell

    Save the request body in a file named request.json, and execute the following command:

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }

    Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/veo-2.0-generate-001:predictLongRunning" | Select-Object -Expand Content

    This request returns a full operation name with a unique operation ID. Use this full operation name to poll the status of the video generation request.
    {
      "name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/veo-2.0-generate-001/operations/a1b07c8e-7b5a-4aba-bb34-3e1ccb8afcc8"
    }
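The operation name already contains the project, location, model, and operation ID that the status check needs. As a sketch, the name can be split with a regular expression and turned into the URL and body for the status request described in the next step; both helper names are hypothetical.

```python
import re

# Matches the operation name format shown in the sample response above.
OPERATION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/publishers/google/models/(?P<model>[^/]+)"
    r"/operations/(?P<operation>[^/]+)$"
)


def parse_operation_name(name: str) -> dict:
    """Split a full operation name into its four components."""
    match = OPERATION_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected operation name: {name!r}")
    return match.groupdict()


def build_fetch_request(name: str):
    """Return the fetchPredictOperation URL and JSON body for an operation."""
    parts = parse_operation_name(name)
    url = (
        f"https://{parts['location']}-aiplatform.googleapis.com/v1"
        f"/projects/{parts['project']}/locations/{parts['location']}"
        f"/publishers/google/models/{parts['model']}:fetchPredictOperation"
    )
    return url, {"operationName": name}


name = (
    "projects/PROJECT_ID/locations/us-central1/publishers/google/models/"
    "veo-2.0-generate-001/operations/a1b07c8e-7b5a-4aba-bb34-3e1ccb8afcc8"
)
print(parse_operation_name(name)["operation"])
```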
    

  2. Optional: Check the status of the video generation long-running operation.

    Before using any of the request data, make the following replacements:

    • PROJECT_ID: Your Google Cloud project ID.
    • MODEL_ID: The model ID used in the original video generation request, for example veo-2.0-generate-001.
    • OPERATION_ID: The unique operation ID returned in the original generate video request.

    HTTP method and URL:

    POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation

    Request JSON body:

    {
      "operationName": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID"
    }
    

    To send your request, choose one of these options:

    curl

    Save the request body in a file named request.json, and execute the following command:

    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation"

    PowerShell

    Save the request body in a file named request.json, and execute the following command:

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }

    Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation" | Select-Object -Expand Content

    This request returns information about the operation, including whether the operation is still running or has finished.
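Because video generation runs as a long-running operation, a script usually repeats the status check until the operation finishes. The sketch below assumes the response carries a boolean done field and, on failure, an error field, as is conventional for Google long-running operations; this page does not show the response, so treat both field names as assumptions. The fetch argument is any callable that performs one status request and returns the decoded JSON.

```python
import time
from typing import Callable


def wait_for_operation(
    fetch: Callable[[], dict],
    interval_seconds: float = 10.0,
    max_attempts: int = 60,
) -> dict:
    """Call fetch() until the operation reports done, then return it."""
    for attempt in range(max_attempts):
        operation = fetch()
        if operation.get("done"):  # assumed field name, see the note above
            if "error" in operation:  # assumed field name
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)  # wait before the next status check
    raise TimeoutError("operation did not finish within the allotted attempts")


# Example with canned responses standing in for real status requests.
canned = iter([{"name": "op"}, {"name": "op", "done": True, "response": {}}])
print(wait_for_operation(lambda: next(canned), interval_seconds=0))
```

A fixed interval keeps the example short; a production script might prefer exponential backoff.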

Use style images to generate videos

Do the following:

Console

  1. In the Google Cloud console, go to the Vertex AI Studio > Media Studio page.

    Media Studio

  2. Click Veo.

  3. In the Settings pane, select the following settings:

    • Model: select veo-2.0-generate-exp.

    • Number of results: adjust the slider or enter a value between 1 and 4.

  4. In the Reference section, select Style, and then click Add.

  5. Choose an image on your computer to upload.

  6. Optional: In the Safety section, select one of the following Person generation settings:

    • Allow (Adults only): default value. Generate adult people or faces only. Don't generate people or faces of youths or children.

    • Don't allow: don't generate people or faces.

  7. Optional: In the Advanced options section, enter a Seed value. Reusing the same seed with otherwise unchanged settings guides the model to produce the same videos.

  8. In the Write your prompt box, enter your text prompt that describes the videos to generate.

  9. Click Generate.

REST

After you set up your environment, you can use REST to send a text prompt together with a style image. The following sample sends a request to the publisher model endpoint.

For more information about the Veo API, see the Veo on Vertex AI API.

  1. Use the following commands to send a video generation request. This request begins a long-running operation and stores output to a Cloud Storage bucket you specify.

    Before using any of the request data, make the following replacements:

    • PROJECT_ID: Your Google Cloud project ID.
    • TEXT_PROMPT: The text prompt used to guide video generation.
    • BASE64_ENCODED_IMAGE: A Base64-encoded style image.
    • IMAGE_MIME_TYPE: The MIME type of the input image. One of the following:
      • image/jpeg
      • image/png
    • OUTPUT_STORAGE_URI: Optional: The Cloud Storage bucket to store the output videos. If not provided, video bytes are returned in the response. For example: gs://video-bucket/output/.
    • RESPONSE_COUNT: The number of video files you want to generate. Accepted integer values: 1-4.
    • Additional optional parameters

      Use the following optional variables depending on your use case. Add some or all of the following parameters in the "parameters": {} object.

      "parameters": {
        "aspectRatio": "ASPECT_RATIO",
        "negativePrompt": "NEGATIVE_PROMPT",
        "personGeneration": "PERSON_SAFETY_SETTING",
        // "resolution": RESOLUTION, // Veo 3 models only
        "sampleCount": RESPONSE_COUNT,
        "seed": SEED_NUMBER
      }
      • ASPECT_RATIO: string. Optional. Defines the aspect ratio of the generated videos. Values: 16:9 (default, landscape) or 9:16 (portrait).
      • NEGATIVE_PROMPT: string. Optional. A text string that describes what you want to discourage the model from generating.
      • PERSON_SAFETY_SETTING: string. Optional. The safety setting that controls whether people or face generation is allowed. Values:
        • allow_adult (default value): Allow generation of adults only.
        • disallow: Disallows inclusion of people or faces in videos.
      • RESOLUTION: string. Optional. Veo 3 models only. The resolution of the generated video. Values: 720p (default) or 1080p.
      • RESPONSE_COUNT: int. Optional. The number of output videos requested. Values: 1-4.
      • SEED_NUMBER: uint32. Optional. A number to make generated videos deterministic. Specifying a seed number with your request without changing other parameters guides the model to produce the same videos. Values: 0 - 4294967295.

    HTTP method and URL:

    POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/veo-2.0-generate-001:predictLongRunning

    Request JSON body:

    {
      "instances": [
        {
          "prompt": "TEXT_PROMPT",
          "referenceImages": [
            {
              "image": {
                "bytesBase64Encoded": "BASE64_ENCODED_IMAGE",
                "mimeType": "IMAGE_MIME_TYPE"
              },
              "referenceType": "style"
            }
          ]
        }
      ],
      "parameters": {
        "durationSeconds": 8,
        "storageUri": "OUTPUT_STORAGE_URI",
        "sampleCount": RESPONSE_COUNT
      }
    }
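When the style image is a file on disk, the Base64 string and MIME type can both be derived from the file. The following is a minimal sketch with hypothetical helper names. It infers the MIME type from the file extension using the standard mimetypes module, which is an assumption about how your files are named, not a check of the actual image data.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Optional

ALLOWED_MIME_TYPES = ("image/jpeg", "image/png")


def style_reference_from_file(path: str) -> dict:
    """Read an image file and return one "style" reference entry."""
    mime_type, _ = mimetypes.guess_type(path)  # based on the file extension
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"{path}: expected a JPEG or PNG file")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {
        "image": {"bytesBase64Encoded": encoded, "mimeType": mime_type},
        "referenceType": "style",
    }


def build_style_request(
    prompt: str,
    image_path: str,
    storage_uri: Optional[str] = None,
    sample_count: int = 1,
) -> dict:
    """Build a request body that carries exactly one style image."""
    parameters = {"durationSeconds": 8, "sampleCount": sample_count}
    if storage_uri:
        parameters["storageUri"] = storage_uri
    return {
        "instances": [
            {
                "prompt": prompt,
                "referenceImages": [style_reference_from_file(image_path)],
            }
        ],
        "parameters": parameters,
    }
```

Taking a single path rather than a list makes the one-image limit for style references part of the function signature.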
    

    To send your request, choose one of these options:

    curl

    Save the request body in a file named request.json, and execute the following command:

    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/veo-2.0-generate-001:predictLongRunning"

    PowerShell

    Save the request body in a file named request.json, and execute the following command:

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }

    Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/veo-2.0-generate-001:predictLongRunning" | Select-Object -Expand Content

    This request returns a full operation name with a unique operation ID. Use this full operation name to poll the status of the video generation request.
    {
      "name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/veo-2.0-generate-001/operations/a1b07c8e-7b5a-4aba-bb34-3e1ccb8afcc8"
    }
    

  2. Optional: Check the status of the video generation long-running operation.

    Before using any of the request data, make the following replacements:

    • PROJECT_ID: Your Google Cloud project ID.
    • MODEL_ID: The model ID used in the original video generation request, for example veo-2.0-generate-001.
    • OPERATION_ID: The unique operation ID returned in the original generate video request.

    HTTP method and URL:

    POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation

    Request JSON body:

    {
      "operationName": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID"
    }
    

    To send your request, choose one of these options:

    curl

    Save the request body in a file named request.json, and execute the following command:

    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation"

    PowerShell

    Save the request body in a file named request.json, and execute the following command:

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }

    Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation" | Select-Object -Expand Content

    This request returns information about the operation, including whether the operation is still running or has finished.
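Once the operation has finished, the generated videos are either in your Cloud Storage bucket or, if you gave no storage URI, returned as Base64 data in the response. The sketch below collects them in both cases. The response layout it reads (a response.videos list whose entries hold either gcsUri or bytesBase64Encoded) and the .mp4 extension are assumptions that this page does not show; check them against the Veo API reference before relying on them.

```python
import base64
from pathlib import Path
from typing import List


def collect_videos(operation: dict, output_dir: str = ".") -> List[str]:
    """Return a Cloud Storage URI or a saved local path for each video."""
    # Assumed response layout; see the note above.
    videos = operation.get("response", {}).get("videos", [])
    locations = []
    for index, video in enumerate(videos):
        if "gcsUri" in video:
            # The video was written to the bucket given as storageUri.
            locations.append(video["gcsUri"])
        elif "bytesBase64Encoded" in video:
            # The video was returned inline, so decode it and save it locally.
            path = Path(output_dir) / f"video_{index}.mp4"
            path.write_bytes(base64.b64decode(video["bytesBase64Encoded"]))
            locations.append(str(path))
    return locations
```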

What's next