Generate videos from an image

You can use Veo on Vertex AI to generate new videos from an image and text prompt. Supported interfaces include the Google Cloud console and the Vertex AI API.

For more information about writing effective text prompts for video generation, see the Veo prompt guide.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. Set up authentication for your environment.

    Select the tab for how you plan to use the samples on this page:

    Console

    When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

    REST

    To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

      After installing the Google Cloud CLI, initialize it by running the following command:

      gcloud init

      If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

    For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Generate videos from an image

Sample input:
  1. Input image¹: a PNG file of a crocheted elephant
  2. Text prompt: the elephant moves around naturally

Sample output: a video of the crocheted elephant

¹ Image generated using Imagen on Vertex AI from the prompt: A crochet elephant in intricate patterns walking on the savanna

You can generate novel videos using only an image as an input, or an image and descriptive text as the inputs. The following samples show you basic instructions for generating videos from an image and a text prompt.

Console

  1. In the Google Cloud console, go to the Vertex AI Studio > Media Studio page.

    Media Studio

  2. Click Video.

  3. Optional: In the Settings pane, configure the following settings:

    • Model: choose a model from the available options.
    • Aspect ratio: choose either 16:9 or 9:16.

    • Number of results: adjust the slider or enter a value between 1 and 4.

    • Video length: select a length between 5 seconds and 8 seconds.

    • Output directory: click Browse to create or select a Cloud Storage bucket to store output files.

  4. Optional: In the Safety section, select one of the following Person generation settings:

    • Allow (Adults only): default value. Generates adult people and faces only; doesn't generate youth or children.

    • Don't allow: don't generate people or faces.

  5. Optional: In the Advanced options section, enter a Seed value. Using the same seed with otherwise identical settings guides the model to produce the same videos.

  6. In the Write your prompt box, click Upload.

  7. Choose a local image to upload and click Select.

  8. In the Write your prompt box, enter a text prompt that describes the videos to generate.

  9. Click Generate.

REST

After you set up your environment, you can use REST to test an image and text prompt. The following sample sends a request to the publisher model endpoint.

For more information about the Veo API, see the Veo on Vertex AI API.

  1. Use the following command to send a video generation request. This request begins a long-running operation and stores output to a Cloud Storage bucket you specify.

    Before using any of the request data, make the following replacements:

    • PROJECT_ID: Your Google Cloud project ID.
    • MODEL_ID: The model ID to use. Available values:
      • veo-2.0-generate-001 (GA)
      • veo-3.0-generate-preview (Preview)
    • TEXT_PROMPT: The text prompt used to guide video generation.
    • INPUT_IMAGE: Base64-encoded bytes string representing the input image. To ensure quality, the input image should be 720p or higher (1280 x 720 pixels) and have a 16:9 or 9:16 aspect ratio. Images of other aspect ratios or sizes may be resized or centrally cropped during the upload process.
    • MIME_TYPE: The MIME type of the input image. Only images of the following MIME types are supported: image/jpeg or image/png.
    • OUTPUT_STORAGE_URI: Optional: The Cloud Storage bucket to store the output videos. If not provided, video bytes are returned in the response. For example: gs://video-bucket/output/.
    • RESPONSE_COUNT: The number of video files you want to generate. Accepted integer values: 1-4.
    • DURATION: The length of video files that you want to generate. Accepted integer values are 5-8.
    • Additional optional parameters

      Use the following optional variables depending on your use case. Add some or all of the following parameters in the "parameters": {} object.

      "parameters": {
        "aspectRatio": "ASPECT_RATIO",
        "negativePrompt": "NEGATIVE_PROMPT",
        "personGeneration": "PERSON_SAFETY_SETTING",
        "sampleCount": RESPONSE_COUNT,
        "seed": SEED_NUMBER
      }
      • ASPECT_RATIO: string. Optional. Defines the aspect ratio of the generated videos. Values: 16:9 (default, landscape) or 9:16 (portrait).
      • NEGATIVE_PROMPT: string. Optional. A text string that describes what you want to discourage the model from generating.
      • PERSON_SAFETY_SETTING: string. Optional. The safety setting that controls whether people or face generation is allowed. Values:
        • allow_adult (default value): Allow generation of adults only.
        • disallow: Don't allow generation of people or faces.
      • RESPONSE_COUNT: int. Optional. The number of output videos requested. Values: 1-4.
      • SEED_NUMBER: uint32. Optional. A number to make generated videos deterministic. Specifying a seed number with your request without changing other parameters guides the model to produce the same videos. Values: 0 - 4294967295.

    HTTP method and URL:

    POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning

    Request JSON body:

    {
      "instances": [
        {
          "prompt": "TEXT_PROMPT",
          "image": {
            "bytesBase64Encoded": "INPUT_IMAGE",
            "mimeType": "MIME_TYPE"
          }
        }
      ],
      "parameters": {
        "storageUri": "OUTPUT_STORAGE_URI",
        "sampleCount": RESPONSE_COUNT
      }
    }
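As an illustration, you can produce the INPUT_IMAGE value and write request.json with a short script. This is a sketch, not part of the official sample; the image path, prompt, bucket, and sample count below are placeholders for your own values:

```python
import base64
import json

def build_request(image_bytes: bytes, prompt: str, storage_uri: str,
                  sample_count: int = 1) -> dict:
    """Assemble the predictLongRunning request body for image-to-video."""
    return {
        "instances": [
            {
                "prompt": prompt,
                "image": {
                    # INPUT_IMAGE: base64-encoded bytes of the input image
                    "bytesBase64Encoded": base64.b64encode(image_bytes).decode("utf-8"),
                    "mimeType": "image/png",
                },
            }
        ],
        "parameters": {
            "storageUri": storage_uri,
            "sampleCount": sample_count,
        },
    }

# In practice, read your own image file:
# image_bytes = open("input.png", "rb").read()
image_bytes = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a real image

with open("request.json", "w") as f:
    json.dump(build_request(image_bytes, "the elephant moves around naturally",
                            "gs://video-bucket/output/", 2), f, indent=2)
```

The resulting request.json can then be passed to the curl or PowerShell commands below with `-d @request.json` or `-InFile request.json`.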
    

    To send your request, choose one of these options:

    curl

    Save the request body in a file named request.json, and execute the following command:

    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning"

    PowerShell

    Save the request body in a file named request.json, and execute the following command:

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }

    Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning" | Select-Object -Expand Content
    This request returns a full operation name with a unique operation ID. Use this full operation name to poll the status of the video generation request.
    {
      "name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/a1b07c8e-7b5a-4aba-bb34-3e1ccb8afcc8"
    }
    

  2. Optional: Check the status of the video generation long-running operation.

    Before using any of the request data, make the following replacements:

    • PROJECT_ID: Your Google Cloud project ID.
    • MODEL_ID: The model ID used in the generation request.
    • OPERATION_NAME: The full operation name returned in the response to the video generation request.

    HTTP method and URL:

    POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation

    Request JSON body:

    {
      "operationName": "OPERATION_NAME"
    }

    To send your request, choose one of these options:

    curl

    Save the request body in a file named request.json, and execute the following command:

    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation"

    PowerShell

    Save the request body in a file named request.json, and execute the following command:

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }

    Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:fetchPredictOperation" | Select-Object -Expand Content

    This request returns the current status of the operation. When the operation is complete, "done" is true and the response includes the generated videos or their Cloud Storage URIs.
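As a sketch of the status check in step 2, the poll loop can be written with the Python standard library against the fetchPredictOperation endpoint. The endpoint placeholders, access token, and operation name are assumptions to fill in from your own request; the JSON helpers run locally as-is:

```python
import json
import time
import urllib.request

# PROJECT_ID and MODEL_ID are placeholders, as in the REST samples above.
ENDPOINT = ("https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/"
            "locations/us-central1/publishers/google/models/"
            "MODEL_ID:fetchPredictOperation")

def poll_request_body(operation_name: str) -> bytes:
    """Encode the status-check request body."""
    return json.dumps({"operationName": operation_name}).encode("utf-8")

def is_done(response_text: str) -> bool:
    """True once the long-running operation reports completion."""
    return json.loads(response_text).get("done", False)

# Sketch of the loop (access_token and operation_name come from your request):
# while True:
#     req = urllib.request.Request(
#         ENDPOINT,
#         data=poll_request_body(operation_name),
#         headers={"Authorization": f"Bearer {access_token}",
#                  "Content-Type": "application/json"},
#     )
#     text = urllib.request.urlopen(req).read().decode("utf-8")
#     if is_done(text):
#         break
#     time.sleep(15)
```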
    

Gen AI SDK for Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import time
from google import genai
from google.genai.types import GenerateVideosConfig, Image

client = genai.Client()

# TODO(developer): Update and uncomment the line below
# output_gcs_uri = "gs://your-bucket/your-prefix"

operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",
    image=Image(
        gcs_uri="gs://cloud-samples-data/generative-ai/image/flowers.png",
        mime_type="image/png",
    ),
    config=GenerateVideosConfig(
        aspect_ratio="16:9",
        output_gcs_uri=output_gcs_uri,
    ),
)

while not operation.done:
    time.sleep(15)
    operation = client.operations.get(operation)
    print(operation)

if operation.response:
    print(operation.result.generated_videos[0].video.uri)

# Example response:
# gs://your-bucket/your-prefix

What's next