This page describes inserting objects into an image, also called inpainting. Imagen on Vertex AI lets you specify a mask area to insert objects into an image. You can bring your own mask, or you can let Imagen on Vertex AI generate a mask for you.
Content insertion example
With inpainting, you can use a base image, an image mask, and a text prompt to add content to an existing image.
Inputs

| Base image* to edit | Mask area specified using tools in the Google Cloud console | Text prompt |
|---|---|---|
| (base image) | (masked area) | strawberries |

* Image credit: Alex Lvrs on Unsplash.

Output after specifying a mask area in the Google Cloud console

(Three example output images with strawberries inserted into the masked area.)
View Imagen for Editing and Customization model card
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the Vertex AI API.
- Set up authentication for your environment.
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
Java
To use the Java samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.
- Install the Google Cloud CLI.
- If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- To initialize the gcloud CLI, run the following command:

  gcloud init

- If you're using a local shell, then create local authentication credentials for your user account:

  gcloud auth application-default login

  You don't need to do this if you're using Cloud Shell.

  If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.
Node.js
To use the Node.js samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.
- Install the Google Cloud CLI.
- If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- To initialize the gcloud CLI, run the following command:

  gcloud init

- If you're using a local shell, then create local authentication credentials for your user account:

  gcloud auth application-default login

  You don't need to do this if you're using Cloud Shell.

  If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.
Python
To use the Python samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.
- Install the Google Cloud CLI.
- If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- To initialize the gcloud CLI, run the following command:

  gcloud init

- If you're using a local shell, then create local authentication credentials for your user account:

  gcloud auth application-default login

  You don't need to do this if you're using Cloud Shell.

  If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
After installing the Google Cloud CLI, initialize it by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Insert with a defined mask area
Use the following samples to insert content by using inpainting. In these samples, you specify a base image, a text prompt, and a mask area to modify the base image.
Imagen 3
Use the following samples to send an inpainting request using the Imagen 3 model.
Console
- In the Google Cloud console, go to the Vertex AI > Media Studio page.
- Click Upload. In the displayed file dialog, select a file to upload.
- Click Inpaint.
- Do one of the following:
  - Upload your own mask:
    - Create a mask on your computer.
    - Click Upload mask. In the displayed dialog, select a mask to upload.
  - Define your mask: in the editing toolbar, use the mask tools (box, brush, or invert) to specify the area or areas to add content to.
- Optional: In the Parameters panel, adjust the following options:
  - Model: the Imagen model to use
  - Number of results: the number of results to generate
  - Negative prompt: items to avoid generating
- In the prompt field, enter a prompt to modify the image.
- Click Generate.
Gen AI SDK for Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
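The following is a minimal sketch of an insertion request with a user-provided mask using the Gen AI SDK. It assumes the environment variables above are set; the file names and prompt are placeholders, not part of the official sample.

# Sketch: inpainting insertion with a user-provided mask (Gen AI SDK).
# Assumes the GOOGLE_CLOUD_* environment variables above are set; file
# names and the prompt are placeholders.
from google import genai
from google.genai.types import (
    EditImageConfig,
    Image,
    MaskReferenceConfig,
    MaskReferenceImage,
    RawReferenceImage,
)

client = genai.Client()

# The base image to edit.
raw_ref = RawReferenceImage(
    reference_id=1,
    reference_image=Image.from_file(location="base_image.png"),
)
# The black and white mask marking the area to fill.
mask_ref = MaskReferenceImage(
    reference_id=2,
    reference_image=Image.from_file(location="mask_image.png"),
    config=MaskReferenceConfig(
        mask_mode="MASK_MODE_USER_PROVIDED",
        mask_dilation=0.01,  # Compensates for imperfect input masks.
    ),
)

response = client.models.edit_image(
    model="imagen-3.0-capability-001",
    prompt="strawberries",  # Describe the masked area, not the whole image.
    reference_images=[raw_ref, mask_ref],
    config=EditImageConfig(
        edit_mode="EDIT_MODE_INPAINT_INSERTION",
        number_of_images=1,
    ),
)
response.generated_images[0].image.save("edited_image.png")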
REST
For more information, see the Edit images API reference.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- TEXT_PROMPT: The text prompt that guides what images the model generates. When you use a prompt for inpainting insertion, use a description of the masked area for best results. Avoid single-word prompts. For example, use "a cute corgi" instead of "corgi".
- B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
- B64_MASK_IMAGE: The black and white image you want to use as a mask layer to edit the original image. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
- MASK_DILATION: A float. The percentage of image width to dilate this mask by. A value of 0.01 is recommended to compensate for imperfect input masks.
- EDIT_STEPS: An integer. The number of sampling steps for the base model. For inpainting insertion, start at 35 steps. Increase steps to the upper limit of 75 if the quality doesn't meet your requirements. Increasing steps also increases request latency.
- EDIT_IMAGE_COUNT: The number of edited images. Accepted integer values: 1-4. Default value: 4.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict
Request JSON body:
{ "instances": [ { "prompt": "TEXT_PROMPT", "referenceImages": [ { "referenceType": "REFERENCE_TYPE_RAW", "referenceId": 1, "referenceImage": { "bytesBase64Encoded": "B64_BASE_IMAGE" } }, { "referenceType": "REFERENCE_TYPE_MASK", "referenceId": 2, "referenceImage": { "bytesBase64Encoded": "B64_MASK_IMAGE" }, "maskImageConfig": { "maskMode": "MASK_MODE_USER_PROVIDED", "dilation": MASK_DILATION } } ] } ], "parameters": { "editConfig": { "baseSteps": EDIT_STEPS }, "editMode": "EDIT_MODE_INPAINT_INSERTION", "sampleCount": EDIT_IMAGE_COUNT } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict" | Select-Object -Expand Content
"sampleCount": 2
. The response returns two prediction objects, with
the generated image bytes base64-encoded.
{ "predictions": [ { "bytesBase64Encoded": "BASE64_IMG_BYTES", "mimeType": "image/png" }, { "mimeType": "image/png", "bytesBase64Encoded": "BASE64_IMG_BYTES" } ] }
Imagen 2
Use the following samples to send an inpainting request using the Imagen 2 model.
Console

In the Google Cloud console, go to the Vertex AI > Media Studio page.

Vertex AI SDK for Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Vertex AI SDK for Python API reference documentation.

REST

For more information about imagegeneration model requests, see the imagegeneration model API reference.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- TEXT_PROMPT: The text prompt that guides what images the model generates. This field is required for both generation and editing.
- B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
- B64_MASK_IMAGE: The black and white image you want to use as a mask layer to edit the original image. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
- EDIT_IMAGE_COUNT: The number of edited images. Default value: 4.
- GUIDANCE_SCALE_VALUE: A parameter (integer) that controls how much the model adheres to the text prompt. Larger values increase alignment between the text prompt and generated images, but may compromise image quality. Values: 0-500. Default: 60.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict
Request JSON body:
{ "instances": [ { "prompt": "TEXT_PROMPT", "image": { "bytesBase64Encoded": "B64_BASE_IMAGE" }, "mask": { "image": { "bytesBase64Encoded": "B64_MASK_IMAGE" } } } ], "parameters": { "sampleCount": EDIT_IMAGE_COUNT, "editConfig": { "editMode": "inpainting-insert", "guidanceScale": GUIDANCE_SCALE_VALUE } } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict" | Select-Object -Expand Content
"sampleCount": 2
. The response returns two prediction objects, with
the generated image bytes base64-encoded.
{ "predictions": [ { "bytesBase64Encoded": "BASE64_IMG_BYTES", "mimeType": "image/png" }, { "mimeType": "image/png", "bytesBase64Encoded": "BASE64_IMG_BYTES" } ] }
Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

In this sample, you specify the model as part of an EndpointName. The EndpointName is passed to the predict method, which is called on a PredictionServiceClient. The service returns an edited version of the image, which is then saved locally. For more information about model versions and features, see Imagen models.

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

In this sample, you call the predict method on a PredictionServiceClient. The service generates images, which are then saved locally. For more information about model versions and features, see Imagen models.
Insert with automatic mask detection
Use the following samples to insert content by using inpainting with automatic mask detection. In these samples, you specify a base image and a text prompt; Imagen automatically detects and creates a mask area to modify the base image.
Imagen 3
Use the following samples to send an inpainting request using the Imagen 3 model.
Console
- In the Google Cloud console, go to the Vertex AI > Media Studio page.
- Click Upload. In the displayed file dialog, select a file to upload.
- Click Inpaint.
- In the editing toolbar, click Extract mask.
- Select one of the mask extraction options:
  - Background elements: detects the background elements and creates a mask around them.
  - Foreground elements: detects the foreground objects and creates a mask around them.
  - People: detects people and creates a mask around them.
- Optional: In the Parameters panel, adjust the following options:
  - Model: the Imagen model to use
  - Number of results: the number of results to generate
  - Negative prompt: items to avoid generating
- In the prompt field, enter a new prompt to modify the image.
- Click Generate.
Gen AI SDK for Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
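As with the defined-mask case, the following is a minimal sketch (not the official sample) of an automatic-mask insertion request with the Gen AI SDK. No mask image is supplied; the mask mode tells the model what to segment. File names and the prompt are placeholders.

# Sketch: inpainting insertion with an automatically generated mask.
# Assumes the GOOGLE_CLOUD_* environment variables above are set.
from google import genai
from google.genai.types import (
    EditImageConfig,
    Image,
    MaskReferenceConfig,
    MaskReferenceImage,
    RawReferenceImage,
)

client = genai.Client()

raw_ref = RawReferenceImage(
    reference_id=1,
    reference_image=Image.from_file(location="base_image.png"),
)
# No mask image: the mask mode drives automatic mask creation.
mask_ref = MaskReferenceImage(
    reference_id=2,
    reference_image=None,
    config=MaskReferenceConfig(
        mask_mode="MASK_MODE_BACKGROUND",  # or MASK_MODE_FOREGROUND / _SEMANTIC
        mask_dilation=0.01,
    ),
)

response = client.models.edit_image(
    model="imagen-3.0-capability-001",
    prompt="a sunny beach",  # Placeholder description of the area to fill.
    reference_images=[raw_ref, mask_ref],
    config=EditImageConfig(
        edit_mode="EDIT_MODE_INPAINT_INSERTION",
        number_of_images=1,
    ),
)
response.generated_images[0].image.save("edited_image.png")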
REST
For more information, see the Edit images API reference.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- TEXT_PROMPT: The text prompt that guides what images the model generates. When you use a prompt for inpainting insertion, use a description of the masked area for best results. Avoid single-word prompts. For example, use "a cute corgi" instead of "corgi".
- B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
- MASK_MODE: A string that sets the type of automatic mask creation the model uses. Available values:
  - MASK_MODE_BACKGROUND: Automatically generates a mask using background segmentation.
  - MASK_MODE_FOREGROUND: Automatically generates a mask using foreground segmentation.
  - MASK_MODE_SEMANTIC: Automatically generates a mask using semantic segmentation based on the segmentation classes you specify in the maskImageConfig.maskClasses array. For example:

    "maskImageConfig": {
      "maskMode": "MASK_MODE_SEMANTIC",
      "maskClasses": [175, 176],  // bicycle, car
      "dilation": 0.01
    }

- MASK_DILATION: A float. The percentage of image width to dilate this mask by. A value of 0.01 is recommended to compensate for imperfect input masks.
- EDIT_STEPS: An integer. The number of sampling steps for the base model. For inpainting insertion, start at 35 steps. Increase steps to the upper limit of 75 if the quality doesn't meet your requirements. Increasing steps also increases request latency.
- EDIT_IMAGE_COUNT: The number of edited images. Accepted integer values: 1-4. Default value: 4.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict
Request JSON body:
{ "instances": [ { "prompt": "TEXT_PROMPT", "referenceImages": [ { "referenceType": "REFERENCE_TYPE_RAW", "referenceId": 1, "referenceImage": { "bytesBase64Encoded": "B64_BASE_IMAGE" } }, { "referenceType": "REFERENCE_TYPE_MASK", "referenceId": 2, "maskImageConfig": { "maskMode": "MASK_MODE", "dilation": MASK_DILATION } } ] } ], "parameters": { "editConfig": { "baseSteps": EDIT_STEPS }, "editMode": "EDIT_MODE_INPAINT_INSERTION", "sampleCount": EDIT_IMAGE_COUNT } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict" | Select-Object -Expand Content
"sampleCount": 2
. The response returns two prediction objects, with
the generated image bytes base64-encoded.
{ "predictions": [ { "bytesBase64Encoded": "BASE64_IMG_BYTES", "mimeType": "image/png" }, { "mimeType": "image/png", "bytesBase64Encoded": "BASE64_IMG_BYTES" } ] }
Imagen 2
Use the following samples to send an inpainting request using the Imagen 2 model.
Console
- In the Google Cloud console, go to the Vertex AI > Media Studio page.
- In the lower task panel, click Edit image.
- Click Upload to select your locally stored image to edit.
- In the editing toolbar, click Extract.
- Select one of the mask extraction options:
  - Background elements: detects the background elements and creates a mask around them.
  - Foreground elements: detects the foreground objects and creates a mask around them.
  - People: detects people and creates a mask around them.
- Optional: In the Parameters panel, adjust the Number of results, Negative prompt, Text prompt guidance, or other parameters.
- In the prompt field, enter a prompt to modify the image.
- Click Generate.
Vertex AI SDK for Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Vertex AI SDK for Python API reference documentation.
REST
For more information about imagegeneration model requests, see the imagegeneration model API reference.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- TEXT_PROMPT: The text prompt that guides what images the model generates. This field is required for both generation and editing.
- B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
- EDIT_IMAGE_COUNT: The number of edited images. Default value: 4.
- MASK_TYPE: Prompts the model to generate a mask instead of you needing to provide one. Consequently, when you provide this parameter, you should omit a mask object. Available values:
  - background: Automatically generates a mask for all regions except the primary object, person, or subject in the image.
  - foreground: Automatically generates a mask for the primary object, person, or subject in the image.
  - semantic: Uses automatic segmentation to create a mask area for one or more of the segmentation classes. Set the segmentation classes using the classes parameter and the corresponding class_id values. You can specify up to 5 classes. When you use the semantic mask type, the maskMode object should look like the following:

    "maskMode": {
      "maskType": "semantic",
      "classes": [class_id1, class_id2]
    }

- GUIDANCE_SCALE_VALUE: A parameter (integer) that controls how much the model adheres to the text prompt. Larger values increase alignment between the text prompt and generated images, but may compromise image quality. Values: 0-500. Default: 60.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict
Request JSON body:
{ "instances": [ { "prompt": "TEXT_PROMPT", "image": { "bytesBase64Encoded": "B64_BASE_IMAGE" } } ], "parameters": { "sampleCount": EDIT_IMAGE_COUNT, "editConfig": { "editMode": "inpainting-insert", "maskMode": { "maskType": "MASK_TYPE" }, "guidanceScale": GUIDANCE_SCALE_VALUE } } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict" | Select-Object -Expand Content
"sampleCount": 2
. The response returns two prediction objects, with
the generated image bytes base64-encoded.
{ "predictions": [ { "bytesBase64Encoded": "BASE64_IMG_BYTES", "mimeType": "image/png" }, { "mimeType": "image/png", "bytesBase64Encoded": "BASE64_IMG_BYTES" } ] }
Limitations
The following sections explain limitations of Imagen's insert objects feature.
Modified pixels
Pixels generated by the model that aren't in the mask aren't guaranteed to be identical to the input and are generated at the model's resolution (such as 1024x1024). Very minute changes may exist in the generated image.
If you want perfect preservation of the image, then we recommend that you blend the generated image with the input image, using the mask. Typically, if the input image resolution is 2K or higher, blending the generated image and input image is required.
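As a minimal sketch of that blending step, assuming the Pillow library and a mask that is white in the edited regions, Image.composite keeps generated pixels inside the mask and original pixels everywhere else:

# Sketch: blend the generated image back into the full-resolution input,
# keeping original pixels outside the mask. Assumes Pillow is installed
# and that the mask is white (255) in edited regions; file names are
# placeholders.
from PIL import Image

base = Image.open("base_image.png").convert("RGB")
generated = Image.open("edited_image.png").convert("RGB").resize(base.size)
mask = Image.open("mask_image.png").convert("L").resize(base.size)

# Where the mask is white, take the generated pixels; elsewhere keep the base.
blended = Image.composite(generated, base, mask)
blended.save("blended_image.png")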
Insert limitation
Insert typically matches the base image style. However, certain keywords may trigger outputs that resemble cartoon styles, despite your intention to create a photorealistic output.
One example we've seen in particular is inaccurate colors. For example, "yellow giraffe" tends to produce a cartoon giraffe, because photorealistic giraffes are brown and tan. Similarly, photorealistic but unnatural colors are difficult to generate.
What's next
Read articles about Imagen and other Generative AI on Vertex AI products:
- A developer's guide to getting started with Imagen 3 on Vertex AI
- New generative media models and tools, built with and for creators
- New in Gemini: Custom Gems and improved image generation with Imagen 3
- Google DeepMind: Imagen 3 - Our highest quality text-to-image model