Replace the background of an image

This guide shows you how to replace the background of an image using Imagen on Vertex AI. You can either let Imagen on Vertex AI automatically detect the background using object segmentation, or you can provide your own mask for more precise control.

This page covers the following topics:

  • Masking methods: Compare automatic background detection with a user-defined mask.
  • Edit with an automatically detected background mask: Let Imagen detect and replace the background of a product image.
  • Edit with a defined mask area: Provide your own mask for precise control over the replaced area.
  • Limitations: Understand edge cases and a blending workaround.

View Imagen for Editing and Customization model card

Masking methods

The following table compares the two methods for masking the background of an image.

| Method | Description | Pros | Cons | Best for |
| --- | --- | --- | --- | --- |
| Automatic background mask | Imagen automatically identifies and separates the primary subject from the background. | Fast and easy; requires no manual effort. | Less precise; might not work perfectly for complex images with ambiguous foreground and background elements. | Quickly editing standard product photos with clear subjects and simple backgrounds. |
| Defined mask area | You manually specify the area to preserve or replace by providing your own mask image or using the editing tools. | High precision and complete control over the edited area. | Requires manual effort and time to create the mask. | Complex images, images where automatic detection fails, or cases where exact boundaries are critical. |

Product image editing example

The following example shows how you can enhance a product image by modifying its background while preserving the product's appearance.

Image generated with the Imagen product image editing feature from the prompt: on a table in a boutique store. Original image source: Irene Kredenets on Unsplash.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Vertex AI API.

    Enable the API

  5. Set up authentication for your environment.

    Select the tab for how you plan to use the samples on this page:

    Console

    When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

    Python

    To use the Python samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.

      Install the Google Cloud CLI.

      If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

      If you're using a local shell, then create local authentication credentials for your user account:

      gcloud auth application-default login

      You don't need to do this if you're using Cloud Shell.

      If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

    For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.

    REST

    To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

      Install the Google Cloud CLI.

      If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

    For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Edit with an automatically detected background mask

Imagen lets you edit product images with automatic background detection. This can be helpful if you need to modify a product image's background, but preserve the product's appearance. Product image editing uses technology from Google Product Studio (GPS) and is available in Imagen through the console or API.

Image generated with the Imagen product image editing feature from the prompt: on a table in a boutique store. Original image source: Irene Kredenets on Unsplash.

Use the following samples to edit an image with an automatically detected background.

Imagen 3

Use the following samples to send a product image editing request using the Imagen 3 model.

Console

  1. In the Google Cloud console, go to the Vertex AI > Media Studio page.

    Go to Media Studio

  2. Click Upload. In the displayed file dialog, select a file to upload.

  3. Click Inpaint.

  4. In the Parameters panel, click Product Background.

  5. In the editing toolbar, click Extract.

  6. Select one of the mask extraction options:

    • Background elements: detects the background elements and creates a mask around them.
    • Foreground elements: detects the foreground objects and creates a mask around them.
    • People: detects people and creates a mask around them.
  7. Optional: In the Parameters side panel, adjust the following options:

    • Model: the Imagen model to use
    • Number of results: the number of results to generate
    • Negative prompt: items to avoid generating
  8. In the prompt field, enter a prompt to modify the image.

  9. Click Generate.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import RawReferenceImage, MaskReferenceImage, MaskReferenceConfig, EditImageConfig, Image

client = genai.Client()

# TODO(developer): Update and un-comment below line
# output_file = "output-image.png"

raw_ref = RawReferenceImage(
    reference_image=Image.from_file(location='test_resources/suitcase.png'), reference_id=0)
mask_ref = MaskReferenceImage(
    reference_id=1,
    reference_image=None,
    config=MaskReferenceConfig(
        mask_mode="MASK_MODE_BACKGROUND",
    ),
)

image = client.models.edit_image(
    model="imagen-3.0-capability-001",
    prompt="A light blue suitcase in front of a window in an airport",
    reference_images=[raw_ref, mask_ref],
    config=EditImageConfig(
        edit_mode="EDIT_MODE_BGSWAP",
    ),
)

image.generated_images[0].image.save(output_file)

print(f"Created output image using {len(image.generated_images[0].image.image_bytes)} bytes")
# Example response:
# Created output image using 1234567 bytes
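
This sample saves a single result. To request several candidates in one call, you can set number_of_images on EditImageConfig (available in current google-genai releases; verify against your installed version). A minimal sketch under that assumption, reusing the client and reference images from the sample above:

# Sketch: request multiple candidates and save each one. Assumes
# EditImageConfig in your google-genai version accepts number_of_images.
response = client.models.edit_image(
    model="imagen-3.0-capability-001",
    prompt="A light blue suitcase in front of a window in an airport",
    reference_images=[raw_ref, mask_ref],
    config=EditImageConfig(
        edit_mode="EDIT_MODE_BGSWAP",
        number_of_images=4,
    ),
)
for i, generated in enumerate(response.generated_images):
    generated.image.save(f"output-image-{i}.png")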

REST

For more information, see the Edit images API reference.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
  • TEXT_PROMPT: The text prompt that guides what images the model generates. This field is required for both generation and editing.
  • referenceType: A ReferenceImage is an image that provides additional context for image editing. A normal RGB raw reference image (REFERENCE_TYPE_RAW) is required for editing use cases. At most one raw reference image can exist in a request. The output image has the same height and width as the raw reference image. A mask reference image (REFERENCE_TYPE_MASK) is required for masked editing use cases.
  • referenceId: The integer ID of the reference image. In this example, the two reference image objects are of different types, so they have distinct referenceId values (1 and 2).
  • B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string (see the encoding sketch after this list). Size limit: 10 MB.
  • maskImageConfig.maskMode: The mask mode for mask editing. MASK_MODE_BACKGROUND is used to automatically mask out background without a user-provided mask.
  • MASK_DILATION: float. The percentage of image width to dilate this mask by. A value of 0.0 is recommended to avoid extending the foreground product. Minimum: 0; maximum: 1. Default: 0.03.
  • EDIT_STEPS: integer. The number of sampling steps for the base model. For product image editing, start at 75 steps.
  • EDIT_IMAGE_COUNT: integer. The number of edited images to return. Accepted values: 1 to 4. Default: 4.
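
The B64_BASE_IMAGE value is the raw image file encoded as base64. A quick way to produce it with the Python standard library (the file name is a placeholder):

import base64

# Encode a local image file as the base64 string for "bytesBase64Encoded".
with open("suitcase.png", "rb") as f:  # placeholder file name
    b64_base_image = base64.b64encode(f.read()).decode("utf-8")

print(b64_base_image[:60] + "...")  # paste the full string into request.json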

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict

Request JSON body:

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "referenceImages": [
        {
          "referenceType": "REFERENCE_TYPE_RAW",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "B64_BASE_IMAGE"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_MASK",
          "referenceId": 2,
          "maskImageConfig": {
            "maskMode": "MASK_MODE_BACKGROUND",
            "dilation": MASK_DILATION
          }
        }
      ]
    }
  ],
  "parameters": {
    "editConfig": {
      "baseSteps": EDIT_STEPS
    },
    "editMode": "EDIT_MODE_BGSWAP",
    "sampleCount": EDIT_IMAGE_COUNT
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict" | Select-Object -Expand Content

The following sample response is for a product background editing request. The response returns four prediction objects, with the generated image bytes base64-encoded; a decoding snippet follows the sample response.

{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    },
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    }
  ]
}
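
Each prediction carries the generated image as base64-encoded bytes. If you save the response to a file, a short snippet can decode every prediction to a PNG (file names are placeholders):

import base64
import json

# Decode each base64-encoded prediction in the saved response to a PNG file.
with open("response.json") as f:
    response = json.load(f)

for i, prediction in enumerate(response["predictions"]):
    with open(f"edited-{i}.png", "wb") as out:
        out.write(base64.b64decode(prediction["bytesBase64Encoded"]))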

Imagen 2

Use the following samples to send a product image editing request using the Imagen 2 or Imagen model.

Console

  1. In the Google Cloud console, go to the Vertex AI > Media Studio page.

    Go to Media Studio

  2. In the lower task panel, click Edit image.

  3. Click Upload to select your locally stored product image to edit.

  4. In the Parameters panel, select Enable product style image editing.

  5. In the Prompt field (Write your prompt here), enter your prompt.

  6. Click Generate.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


import vertexai
from vertexai.preview.vision_models import Image, ImageGenerationModel

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# input_file = "input-image.png"
# output_file = "output-image.png"
# prompt = "" # The text prompt describing what you want to see in the background.

vertexai.init(project=PROJECT_ID, location="us-central1")

model = ImageGenerationModel.from_pretrained("imagegeneration@006")
base_img = Image.load_from_file(location=input_file)

images = model.edit_image(
    base_image=base_img,
    prompt=prompt,
    edit_mode="product-image",
)

images[0].save(location=output_file, include_generation_parameters=False)

# Optional. View the edited image in a notebook.
# images[0].show()

print(f"Created output image using {len(images[0]._image_bytes)} bytes")
# Example response:
# Created output image using 1234567 bytes
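
The response can contain more than one candidate (the REST form of this mode returns four). To keep them all rather than just the first, a small sketch under the same assumptions as the sample above, iterating the response's images list:

# Save every returned candidate, not just the first one.
for i, img in enumerate(images.images):
    img.save(location=f"output-image-{i}.png", include_generation_parameters=False)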

REST

For more information about imagegeneration model requests, see the imagegeneration model API reference.

To enable product image editing using the Imagen 2 version 006 model (imagegeneration@006), include the following field in the "editConfig": {} object: "editMode": "product-image". This request always returns 4 images.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
  • TEXT_PROMPT: The text prompt that guides what images the model generates. This field is required for both generation and editing.
  • B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
  • PRODUCT_POSITION: Optional. A setting to maintain the original positioning of the detected product or object, or allow the model to reposition it. Available values: reposition (default value), which allows for repositioning, or fixed, which maintains the product position. For non-square input images, the product position behavior is always "reposition", even if "fixed" is set.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict

Request JSON body:

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "image": {
          "bytesBase64Encoded": "B64_BASE_IMAGE"
      }
    }
  ],
  "parameters": {
    "editConfig": {
      "editMode": "product-image",
      "productPosition": "PRODUCT_POSITION",
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict" | Select-Object -Expand Content

The following sample response is for a product background editing request. The response returns four prediction objects, with the generated image bytes base64-encoded.

{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    },
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    }
  ]
}

Edit with a defined mask area

Instead of having Imagen detect the mask automatically, you can provide your own mask to define the area to replace.
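
For the API path, the mask is a black-and-white image with the same dimensions as the base image; white conventionally marks the region to regenerate and black the region to preserve. A minimal sketch for building one with Pillow (file names and rectangle coordinates are placeholders):

from PIL import Image, ImageDraw

# Build a black-and-white mask matching the base image size. White (255)
# marks the background to regenerate; black (0) protects the product area.
base = Image.open("suitcase.png")  # placeholder file name
mask = Image.new("L", base.size, 255)
draw = ImageDraw.Draw(mask)
draw.rectangle((200, 150, 600, 700), fill=0)  # placeholder product bounds
mask.save("suitcase_mask.png")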

Console

  1. In the Google Cloud console, go to the Vertex AI > Media Studio page.

    Go to Media Studio

  2. Click Upload. In the displayed file dialog, select a file to upload.

  3. Click Inpaint.

  4. In the Parameters panel, click Product Background.

  5. Do one of the following:

    • Upload your own mask:
      1. Create a mask on your computer.

      2. Click Upload mask. In the displayed dialog, select a mask to upload.

    • Define your own mask: in the editing toolbar, use the mask tools (box, brush, or invert tool) to specify the area or areas to add content to.

  6. Optional: In the Parameters panel, adjust the following options:

    • Model: the Imagen model to use
    • Number of results: the number of results to generate
    • Negative prompt: items to avoid generating
  7. In the prompt field, enter a prompt to modify the image.

  8. Click Generate.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import RawReferenceImage, MaskReferenceImage, MaskReferenceConfig, EditImageConfig, Image

client = genai.Client()

# TODO(developer): Update and un-comment below line
# output_file = "output-image.png"

raw_ref = RawReferenceImage(
    reference_image=Image.from_file(location='test_resources/suitcase.png'), reference_id=0)
mask_ref = MaskReferenceImage(
    reference_id=1,
    reference_image=Image.from_file(location='test_resources/suitcase_mask.png'),
    config=MaskReferenceConfig(
        mask_mode="MASK_MODE_USER_PROVIDED",
        mask_dilation=0.0,
    ),
)

image = client.models.edit_image(
    model="imagen-3.0-capability-001",
    prompt="A light blue suitcase in an airport",
    reference_images=[raw_ref, mask_ref],
    config=EditImageConfig(
        edit_mode="EDIT_MODE_BGSWAP",
    ),
)

image.generated_images[0].image.save(output_file)

print(f"Created output image using {len(image.generated_images[0].image.image_bytes)} bytes")
# Example response:
# Created output image using 1234567 bytes

REST

For more information, see the Edit images API reference.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
  • TEXT_PROMPT: The text prompt that guides what images the model generates. This field is required for both generation and editing.
  • referenceId: The integer ID of the reference image. In this example, the two reference image objects are of different types, so they have distinct referenceId values (1 and 2).
  • B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
  • B64_MASK_IMAGE: The black and white image you want to use as a mask layer to edit the original image. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
  • MASK_DILATION: float. The percentage of image width to dilate this mask by. A value of 0.0 is recommended to avoid extending the foreground product. Minimum: 0; maximum: 1. Default: 0.03.
  • EDIT_STEPS: integer. The number of sampling steps for the base model. For product image editing, start at 75 steps.
  • EDIT_IMAGE_COUNT: integer. The number of edited images to return. Accepted values: 1 to 4. Default: 4.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict

Request JSON body:

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT": [
        {
          "referenceType": "REFERENCE_TYPE_RAW",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "B64_BASE_IMAGE"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_MASK",
          "referenceId": 2,
          "referenceImage": {
            "bytesBase64Encoded": "B64_MASK_IMAGE"
          },
          "maskImageConfig": {
            "maskMode": "MASK_MODE_USER_PROVIDED",
            "dilation": MASK_DILATION
          }
        }
      ]
    }
  ],
  "parameters": {
    "editConfig": {
      "baseSteps": EDIT_STEPS
    },
    "editMode": "EDIT_MODE_BGSWAP",
    "sampleCount": EDIT_IMAGE_COUNT
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict" | Select-Object -Expand Content

The following sample response is for a product background editing request.

{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    },
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    }
  ]
}

Limitations

Because masks are sometimes incomplete, the model might try to complete the foreground object when small parts are missing at the boundary. As a rare side effect, if the foreground object is already complete, the model might create slight extensions.

To work around this, you can segment the model output and then blend it with the original image. The following line shows the core compositing step with PIL (Pillow); it assumes out_images holds the model output and that image_expanded and mask_expanded are the padded original image and its background mask:

from PIL import Image

blended = Image.composite(out_images[0].resize(image_expanded.size), image_expanded, mask_expanded)
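
A self-contained version of the same idea, using hypothetical file names for the original input, one model output, and a black-and-white background mask (white where the background was regenerated):

from PIL import Image

original = Image.open("suitcase.png").convert("RGB")
edited = Image.open("output-image.png").convert("RGB").resize(original.size)
mask = Image.open("suitcase_mask.png").convert("L").resize(original.size)

# Image.composite takes pixels from the first image where the mask is white
# and from the second image where it is black, so the original foreground
# (black in the mask) is restored exactly.
blended = Image.composite(edited, original, mask)
blended.save("blended.png")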

What's next

Read articles about Imagen and other Generative AI on Vertex AI products.