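Insert objects into an image using inpaint

This page describes how to insert objects into an image, a technique also known as inpainting. With Imagen on Vertex AI, you specify a mask area that marks where the object is inserted. You can provide your own mask, or you can let Imagen on Vertex AI generate one for you.

Content insertion example

Inpainting lets you add content to an existing image by using a base image, an image mask, and a text prompt.

Inputs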

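| Base image to edit* | Mask area specified using tools in the Google Cloud console | Text prompt |
|---|---|---|
| (image) | (image) | strawberries |

* Image credit: Alex Lvrs on Unsplash.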

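Output after specifying a mask area in the Google Cloud console

Screenshot: a generated edit showing a glass jar that contains a red liquid. In this screenshot, the lemon slices that were in the foreground of the image are replaced by two strawberries directly in front of the jar.

Screenshot: a generated edit showing a glass jar that contains a red liquid. In this screenshot, the lemon slices that were in the foreground of the image are replaced by three strawberries to the left of the jar.

Screenshot: a generated edit showing a glass jar that contains a red liquid. In this screenshot, the lemon slices that were in the foreground of the image are replaced by two strawberries slightly in front of and to the left of the jar.

View the Imagen for Editing and Customization model card

Before you begin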

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Make sure that billing is enabled for your Google Cloud project.

Enable the Vertex AI API.

Set up authentication for your environment.

Select the tab for how you plan to use the samples on this page:

Console

When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

Java

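To use the Java samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.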

Install the Google Cloud CLI.
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
To initialize the gcloud CLI, run the following command:
```
gcloud init
```
If you're using a local shell, then create local authentication credentials for your user account:
```
gcloud auth application-default login
```
You don't need to do this if you're using Cloud Shell.

If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

Node.js

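To use the Node.js samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.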

Install the Google Cloud CLI.
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
To initialize the gcloud CLI, run the following command:
```
gcloud init
```
If you're using a local shell, then create local authentication credentials for your user account:
```
gcloud auth application-default login
```
You don't need to do this if you're using Cloud Shell.

If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

Python

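To use the Python samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.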

Install the Google Cloud CLI.
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
To initialize the gcloud CLI, run the following command:
```
gcloud init
```
If you're using a local shell, then create local authentication credentials for your user account:
```
gcloud auth application-default login
```
You don't need to do this if you're using Cloud Shell.

If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

REST

To use the REST API samples on this page in a local development environment, you use the credentials that you provide to the gcloud CLI.

After installing the Google Cloud CLI, initialize it by running the following command:

gcloud init

If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

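For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Insert with a defined mask area

Use the following samples to send an inpainting request that inserts content. In these samples, you specify a base image, a text prompt, and a mask area to modify the base image.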

Imagen 3

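Use the following samples to send an inpainting request using the Imagen 3 model.

Console

1. In the Google Cloud console, go to the Vertex AI > Media Studio page.
2. Click Upload. In the file dialog that appears, select a file to upload.
3. Click Inpaint.
4. Do one of the following:
   - Upload your own mask:
     1. Create a mask on your computer.
     2. Click Upload mask. In the dialog that appears, select a mask to upload.
   - Define your mask: In the editing toolbar, use the mask tools (box, brush, or invert tool) to specify the area or areas to add content to.
5. Optional: In the Parameters panel, adjust the following options:
   - Model: The Imagen model to use
   - Number of results: The number of results to generate
   - Negative prompt: Items to avoid generating
6. In the prompt field, enter a prompt to modify the image.
7. Click Generate.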

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
# `Image` is imported here because the sample calls Image.from_file() below.
from google.genai.types import (
    EditImageConfig,
    Image,
    MaskReferenceConfig,
    MaskReferenceImage,
    RawReferenceImage,
)

client = genai.Client()

# TODO(developer): Update and un-comment below line
# output_file = "output-image.png"

raw_ref = RawReferenceImage(
    reference_image=Image.from_file(location='test_resources/fruit.png'), reference_id=0)
mask_ref = MaskReferenceImage(
    reference_id=1,
    reference_image=Image.from_file(location='test_resources/fruit_mask.png'),
    config=MaskReferenceConfig(
        mask_mode="MASK_MODE_USER_PROVIDED",
        mask_dilation=0.01,
    ),
)

image = client.models.edit_image(
    model="imagen-3.0-capability-001",
    prompt="A plate of cookies",
    reference_images=[raw_ref, mask_ref],
    config=EditImageConfig(
        edit_mode="EDIT_MODE_INPAINT_INSERTION",
    ),
)

image.generated_images[0].image.save(output_file)

print(f"Created output image using {len(image.generated_images[0].image.image_bytes)} bytes")
# Example response:
# Created output image using 1234567 bytes

REST

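For more information, see the Edit images API reference.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your Google Cloud project ID.
LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
TEXT_PROMPT: The text prompt that guides what the model generates. For inpainting insertion, describe the masked area for best results. Avoid single-word prompts. For example, use "a cute corgi" instead of "corgi".
B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
B64_MASK_IMAGE: The black and white image to use as a mask layer to edit the original image. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
MASK_DILATION: Float. The percentage of the image width to dilate this mask by. A value of 0.01 is recommended to compensate for imperfect input masks.
EDIT_STEPS: Integer. The number of sampling steps for the base model. For inpainting insertion, start at 35 steps. If the quality doesn't meet your requirements, increase the steps up to the limit of 75. Increasing steps also increases request latency.
EDIT_IMAGE_COUNT: The number of edited images. Accepted integer values: 1-4. Default value: 4.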
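
The B64_BASE_IMAGE and B64_MASK_IMAGE values can be produced with a few lines of Python. The following is a minimal sketch, not part of the official samples. The helper name `encode_image_b64` is hypothetical, and it assumes the 10 MB limit applies to the raw file bytes before encoding.

```python
import base64
from pathlib import Path

# Size limit from the parameter list above. Applying it to the raw file
# (rather than the encoded string) is an assumption of this sketch.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def encode_image_b64(path: str, max_bytes: int = MAX_IMAGE_BYTES) -> str:
    """Read an image file and return its contents as a base64 string."""
    data = Path(path).read_bytes()
    if len(data) > max_bytes:
        raise ValueError(
            f"{path} is {len(data)} bytes; the limit is {max_bytes} bytes"
        )
    # b64encode returns bytes; the JSON request body needs a str.
    return base64.b64encode(data).decode("ascii")
```

The returned string can be pasted into the request body in place of B64_BASE_IMAGE or B64_MASK_IMAGE.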

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict

Request JSON body:

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "referenceImages": [
        {
          "referenceType": "REFERENCE_TYPE_RAW",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "B64_BASE_IMAGE"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_MASK",
          "referenceId": 2,
          "referenceImage": {
            "bytesBase64Encoded": "B64_MASK_IMAGE"
          },
          "maskImageConfig": {
            "maskMode": "MASK_MODE_USER_PROVIDED",
            "dilation": MASK_DILATION
          }
        }
      ]
    }
  ],
  "parameters": {
    "editConfig": {
      "baseSteps": EDIT_STEPS
    },
    "editMode": "EDIT_MODE_INPAINT_INSERTION",
    "sampleCount": EDIT_IMAGE_COUNT
  }
}
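
If you prefer to generate request.json from a script, the body above can be assembled as a Python dictionary. This is a sketch under stated assumptions: the function name `build_inpaint_insert_request` is hypothetical, the field names are copied from the JSON body above, and the lower bound of 1 for the step count is an assumption (the text only states the upper limit of 75).

```python
import json


def build_inpaint_insert_request(prompt, base_b64, mask_b64,
                                 dilation=0.01, steps=35, count=4):
    """Return the Imagen 3 inpainting-insertion request body as a dict."""
    if not 1 <= count <= 4:
        raise ValueError("sampleCount must be between 1 and 4")
    if not 1 <= steps <= 75:  # lower bound of 1 is an assumption
        raise ValueError("baseSteps must be between 1 and 75")
    return {
        "instances": [{
            "prompt": prompt,
            "referenceImages": [
                {
                    "referenceType": "REFERENCE_TYPE_RAW",
                    "referenceId": 1,
                    "referenceImage": {"bytesBase64Encoded": base_b64},
                },
                {
                    "referenceType": "REFERENCE_TYPE_MASK",
                    "referenceId": 2,
                    "referenceImage": {"bytesBase64Encoded": mask_b64},
                    "maskImageConfig": {
                        "maskMode": "MASK_MODE_USER_PROVIDED",
                        "dilation": dilation,
                    },
                },
            ],
        }],
        "parameters": {
            "editConfig": {"baseSteps": steps},
            "editMode": "EDIT_MODE_INPAINT_INSERTION",
            "sampleCount": count,
        },
    }


if __name__ == "__main__":
    body = build_inpaint_insert_request("two ripe strawberries", "BASE", "MASK")
    with open("request.json", "w", encoding="utf-8") as f:
        json.dump(body, f, indent=2)
```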

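To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and then run the following command: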

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict"

PowerShell

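Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and then run the following command: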

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict" | Select-Object -Expand Content
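
The same POST request can also be prepared from Python with only the standard library. The sketch below is illustrative rather than an official sample: the helper name `build_predict_request` is hypothetical, and `access_token` is assumed to be the string printed by `gcloud auth print-access-token`. The function only builds the request object and doesn't send it.

```python
import json
import urllib.request


def build_predict_request(project_id, location, body, access_token,
                          model="imagen-3.0-capability-001"):
    """Build (but don't send) the predict POST request shown above."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model}:predict"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the response body with `json.load`.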

The following sample response is for a request with "sampleCount": 2. The response returns two prediction objects, with the generated image bytes base64-encoded.

{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    }
  ]
}
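
To turn a response like the one above into image files, decode each `bytesBase64Encoded` value. The following is a minimal sketch. The function name `save_predictions` and the output file naming are hypothetical, and the file extension is simply derived from the `mimeType` field.

```python
import base64
from pathlib import Path


def save_predictions(response: dict, out_dir: str, prefix: str = "output") -> list:
    """Decode every prediction in a predict response and write it to disk."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i, prediction in enumerate(response.get("predictions", []), start=1):
        encoded = prediction.get("bytesBase64Encoded")
        if encoded is None:
            continue  # skip entries that carry no image bytes
        # "image/png" -> "png"; fall back to png when mimeType is absent
        extension = prediction.get("mimeType", "image/png").split("/")[-1]
        path = out / f"{prefix}{i}.{extension}"
        path.write_bytes(base64.b64decode(encoded))
        written.append(path)
    return written
```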

Imagen 2

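Note: Starting on June 24, 2025, Imagen versions 1 and 2 are deprecated. The Imagen models imagegeneration@002, imagegeneration@005, and imagegeneration@006 will be removed on September 24, 2025. For more information about migrating to Imagen 3, see Migrate to Imagen 3.

Use the following samples to send an inpainting request using the Imagen 2 model.

Console

1. In the Google Cloud console, go to the Vertex AI > Media Studio page.
2. Click Upload. In the file dialog that appears, select a file to upload.
3. Click Inpaint.
4. Do one of the following:
   - Upload your own mask:
     1. Create a mask on your computer.
     2. Click Upload mask. In the dialog that appears, select a mask to upload.
   - Define your mask: In the editing toolbar, use the mask tools (box, brush, or invert tool) to specify the area or areas to add content to.
5. Optional: In the Parameters panel, adjust the following options:
   - Model: The Imagen model to use
   - Number of results: The number of results to generate
   - Negative prompt: Items to avoid generating
6. In the prompt field, enter a new prompt to modify the image.
7. Click Generate.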

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


import vertexai
from vertexai.preview.vision_models import Image, ImageGenerationModel

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# input_file = "input-image.png"
# mask_file = "mask-image.png"
# output_file = "output-image.png"
# prompt = "red hat" # The text prompt describing what you want to see inserted.

vertexai.init(project=PROJECT_ID, location="us-central1")

model = ImageGenerationModel.from_pretrained("imagegeneration@006")
base_img = Image.load_from_file(location=input_file)
mask_img = Image.load_from_file(location=mask_file)

images = model.edit_image(
    base_image=base_img,
    mask=mask_img,
    prompt=prompt,
    edit_mode="inpainting-insert",
)

images[0].save(location=output_file, include_generation_parameters=False)

# Optional. View the edited image in a notebook.
# images[0].show()

print(f"Created output image using {len(images[0]._image_bytes)} bytes")
# Example response:
# Created output image using 1400814 bytes

REST

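For more information about imagegeneration model requests, see the imagegeneration model API reference.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your Google Cloud project ID.
LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
TEXT_PROMPT: The text prompt that guides what images the model generates. This field is required for both generation and editing.
B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
B64_MASK_IMAGE: The black and white image to use as a mask layer to edit the original image. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
EDIT_IMAGE_COUNT: The number of edited images. Default value: 4.
GUIDANCE_SCALE_VALUE: An integer parameter that controls how closely the model follows the text prompt. Larger values increase alignment between the text prompt and the generated images, but might reduce image quality. Values: 0 - 500. Default: 60.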
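
If you don't have an image editor at hand, a simple rectangular black and white mask can be written as a grayscale PNG with only the Python standard library. This is a sketch, not an official sample. The function name `rectangle_mask_png` is hypothetical, and it assumes that white pixels mark the area to edit and black pixels mark the area to keep; check the API reference for the convention your model expects.

```python
import struct
import zlib


def _chunk(tag: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def rectangle_mask_png(width: int, height: int, box) -> bytes:
    """Return PNG bytes for a black mask with a white rectangle.

    box is (left, top, right, bottom) in pixels; right and bottom are exclusive.
    White (255) marks the editable area, which is an assumption of this sketch.
    """
    left, top, right, bottom = box
    rows = bytearray()
    for y in range(height):
        rows.append(0)  # PNG filter type 0 (no filtering) for this scanline
        for x in range(width):
            inside = left <= x < right and top <= y < bottom
            rows.append(255 if inside else 0)
    # IHDR: width, height, 8-bit depth, color type 0 (grayscale),
    # default compression, default filter, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(bytes(rows)))
        + _chunk(b"IEND", b"")
    )
```

Write the returned bytes to a file such as mask-image.png, then base64-encode that file to produce B64_MASK_IMAGE.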

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict

Request JSON body:

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "image": {
          "bytesBase64Encoded": "B64_BASE_IMAGE"
      },
      "mask": {
        "image": {
          "bytesBase64Encoded": "B64_MASK_IMAGE"
        }
      }
    }
  ],
  "parameters": {
    "sampleCount": EDIT_IMAGE_COUNT,
    "editConfig": {
      "editMode": "inpainting-insert",
      "guidanceScale": GUIDANCE_SCALE_VALUE
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and then run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict"

PowerShell

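Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and then run the following command: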

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict" | Select-Object -Expand Content

The following sample response is for a request with "sampleCount": 2. The response returns two prediction objects, with the generated image bytes base64-encoded.

{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    }
  ]
}

Java

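Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

In this sample, you specify the model as part of an EndpointName. The EndpointName is passed to the predict method, which is called on a PredictionServiceClient. The service returns an edited version of the image, which is then saved locally.

For more information about model versions and features, see Imagen models.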


import com.google.api.gax.rpc.ApiException;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.gson.Gson;
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Base64;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class EditImageInpaintingInsertMaskSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "my-project-id";
    String location = "us-central1";
    String inputPath = "/path/to/my-input.png";
    String maskPath = "/path/to/my-mask.png";
    String prompt =
        ""; // The text prompt describing what you want to see inserted in the mask area.

    editImageInpaintingInsertMask(projectId, location, inputPath, maskPath, prompt);
  }

  // Edit an image using a mask file. Inpainting can insert the object designated by the prompt
  // into the masked area.
  public static PredictResponse editImageInpaintingInsertMask(
      String projectId, String location, String inputPath, String maskPath, String prompt)
      throws ApiException, IOException {
    final String endpoint = String.format("%s-aiplatform.googleapis.com:443", location);
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {

      final EndpointName endpointName =
          EndpointName.ofProjectLocationPublisherModelName(
              projectId, location, "google", "imagegeneration@006");

      // Encode image and mask to Base64
      String imageBase64 =
          Base64.getEncoder().encodeToString(Files.readAllBytes(Paths.get(inputPath)));
      String maskBase64 =
          Base64.getEncoder().encodeToString(Files.readAllBytes(Paths.get(maskPath)));

      // Create the image and image mask maps
      Map<String, String> imageMap = new HashMap<>();
      imageMap.put("bytesBase64Encoded", imageBase64);

      Map<String, String> maskMap = new HashMap<>();
      maskMap.put("bytesBase64Encoded", maskBase64);
      Map<String, Map> imageMaskMap = new HashMap<>();
      imageMaskMap.put("image", maskMap);

      Map<String, Object> instancesMap = new HashMap<>();
      instancesMap.put("prompt", prompt); // [ "prompt", "<my-prompt>" ]
      instancesMap.put(
          "image", imageMap); // [ "image", [ "bytesBase64Encoded", "iVBORw0KGgo...==" ] ]
      instancesMap.put(
          "mask",
          imageMaskMap); // [ "mask", [ "image", [ "bytesBase64Encoded", "iJKDF0KGpl...==" ] ] ]
      instancesMap.put("editMode", "inpainting-insert"); // [ "editMode", "inpainting-insert" ]
      Value instances = mapToValue(instancesMap);

      // Optional parameters
      Map<String, Object> paramsMap = new HashMap<>();
      paramsMap.put("sampleCount", 1);
      Value parameters = mapToValue(paramsMap);

      PredictResponse predictResponse =
          predictionServiceClient.predict(
              endpointName, Collections.singletonList(instances), parameters);

      for (Value prediction : predictResponse.getPredictionsList()) {
        Map<String, Value> fieldsMap = prediction.getStructValue().getFieldsMap();
        if (fieldsMap.containsKey("bytesBase64Encoded")) {
          String bytesBase64Encoded = fieldsMap.get("bytesBase64Encoded").getStringValue();
          Path tmpPath = Files.createTempFile("imagen-", ".png");
          Files.write(tmpPath, Base64.getDecoder().decode(bytesBase64Encoded));
          System.out.format("Image file written to: %s\n", tmpPath.toUri());
        }
      }
      return predictResponse;
    }
  }

  private static Value mapToValue(Map<String, Object> map) throws InvalidProtocolBufferException {
    Gson gson = new Gson();
    String json = gson.toJson(map);
    Value.Builder builder = Value.newBuilder();
    JsonFormat.parser().merge(json, builder);
    return builder.build();
  }
}

Node.js

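Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

In this sample, you call the predict method on a PredictionServiceClient. The service generates images, which are then saved locally. For more information about model versions and features, see Imagen models.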

/**
 * TODO(developer): Update these variables before running the sample.
 */
const projectId = process.env.CAIP_PROJECT_ID;
const location = 'us-central1';
const inputFile = 'resources/woman.png';
const maskFile = 'resources/woman_inpainting_insert_mask.png';
const prompt = 'hat';

const aiplatform = require('@google-cloud/aiplatform');

// Imports the Google Cloud Prediction Service Client library
const {PredictionServiceClient} = aiplatform.v1;

// Import the helper module for converting arbitrary protobuf.Value objects
const {helpers} = aiplatform;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: `${location}-aiplatform.googleapis.com`,
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function editImageInpaintingInsertMask() {
  const fs = require('fs');
  const util = require('util');
  // Configure the parent resource
  const endpoint = `projects/${projectId}/locations/${location}/publishers/google/models/imagegeneration@006`;

  const imageFile = fs.readFileSync(inputFile);
  // Convert the image data to a Buffer and base64 encode it.
  const encodedImage = Buffer.from(imageFile).toString('base64');

  const maskImageFile = fs.readFileSync(maskFile);
  // Convert the image mask data to a Buffer and base64 encode it.
  const encodedMask = Buffer.from(maskImageFile).toString('base64');

  const promptObj = {
    prompt: prompt, // The text prompt describing what you want to see inserted
    editMode: 'inpainting-insert',
    image: {
      bytesBase64Encoded: encodedImage,
    },
    mask: {
      image: {
        bytesBase64Encoded: encodedMask,
      },
    },
  };
  const instanceValue = helpers.toValue(promptObj);
  const instances = [instanceValue];

  const parameter = {
    // Optional parameters
    seed: 100,
    // Controls the strength of the prompt
    // 0-9 (low strength), 10-20 (medium strength), 21+ (high strength)
    guidanceScale: 21,
    sampleCount: 1,
  };
  const parameters = helpers.toValue(parameter);

  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);
  const predictions = response.predictions;
  if (predictions.length === 0) {
    console.log(
      'No image was generated. Check the request parameters and prompt.'
    );
  } else {
    let i = 1;
    for (const prediction of predictions) {
      const buff = Buffer.from(
        prediction.structValue.fields.bytesBase64Encoded.stringValue,
        'base64'
      );
      // Write image content to the output file
      const writeFile = util.promisify(fs.writeFile);
      const filename = `output${i}.png`;
      await writeFile(filename, buff);
      console.log(`Saved image ${filename}`);
      i++;
    }
  }
}
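// Top-level await isn't available in a CommonJS module (this file uses
// require), so handle the returned promise explicitly.
editImageInpaintingInsertMask().catch(err => {
  console.error(err);
  process.exitCode = 1;
});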

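Insert with automatic mask detection

Use the following samples to send an inpainting request that inserts content. In these samples, you specify a base image and a text prompt. Imagen automatically detects and creates a mask area to modify the base image.

Imagen 3

Use the following samples to send an inpainting request using the Imagen 3 model.

Console

1. In the Google Cloud console, go to the Vertex AI > Media Studio page.
2. Click Upload. In the file dialog that appears, select a file to upload.
3. Click Inpaint.
4. In the editing toolbar, click Extract mask.
5. Select one of the mask extraction options:
   - Background elements: Detects the background elements and creates a mask around them.
   - Foreground elements: Detects the foreground objects and creates a mask around them.
   - People: Detects people and creates a mask around them.
6. Optional: In the Parameters panel, adjust the following options:
   - Model: The Imagen model to use
   - Number of results: The number of results to generate
   - Negative prompt: Items to avoid generating
7. In the prompt field, enter a new prompt to modify the image.
8. Click Generate.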

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
# `Image` is imported here because the sample calls Image.from_file() below.
from google.genai.types import (
    EditImageConfig,
    Image,
    MaskReferenceConfig,
    MaskReferenceImage,
    RawReferenceImage,
)

client = genai.Client()

# TODO(developer): Update and un-comment below line
# output_file = "output-image.png"

raw_ref = RawReferenceImage(
    reference_image=Image.from_file(location='test_resources/fruit.png'), reference_id=0)
mask_ref = MaskReferenceImage(
    reference_id=1,
    reference_image=None,
    config=MaskReferenceConfig(
        mask_mode="MASK_MODE_FOREGROUND",
        mask_dilation=0.1,
    ),
)

image = client.models.edit_image(
    model="imagen-3.0-capability-001",
    prompt="A small white ceramic bowl with lemons and limes",
    reference_images=[raw_ref, mask_ref],
    config=EditImageConfig(
        edit_mode="EDIT_MODE_INPAINT_INSERTION",
    ),
)

image.generated_images[0].image.save(output_file)

print(f"Created output image using {len(image.generated_images[0].image.image_bytes)} bytes")
# Example response:
# Created output image using 1234567 bytes

REST

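For more information, see the Edit images API reference.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your Google Cloud project ID.
LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
TEXT_PROMPT: The text prompt that guides what the model generates. For inpainting insertion, describe the masked area for best results. Avoid single-word prompts. For example, use "a cute corgi" instead of "corgi".
B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
MASK_MODE: A string that sets the type of automatic mask creation the model uses. Available values:
- MASK_MODE_BACKGROUND: Automatically generates a mask using background segmentation.
- MASK_MODE_FOREGROUND: Automatically generates a mask using foreground segmentation.
- MASK_MODE_SEMANTIC: Automatically generates a mask using semantic segmentation, based on the segmentation classes you specify in the maskImageConfig.maskClasses array. For example: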
```
          "maskImageConfig": {
            "maskMode": "MASK_MODE_SEMANTIC",
            "maskClasses": [175, 176], // bicycle, car
            "dilation": 0.01
          }
        
```
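MASK_DILATION: Float. The percentage of the image width to dilate this mask by. A value of 0.01 is recommended to compensate for imperfect input masks.
EDIT_STEPS: Integer. The number of sampling steps for the base model. For inpainting insertion, start at 35 steps. If the quality doesn't meet your requirements, increase the steps up to the limit of 75. Increasing steps also increases request latency.
EDIT_IMAGE_COUNT: The number of edited images. Accepted integer values: 1-4. Default value: 4.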
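
The three automatic mask modes differ only in the `maskImageConfig` object, so a small helper can build the mask reference for any of them. This is a sketch under stated assumptions: the function name `build_auto_mask_reference` is hypothetical, the field names are taken from the request body on this page, and rejecting `maskClasses` for the non-semantic modes is a choice made for this sketch rather than documented API behavior.

```python
AUTO_MASK_MODES = {
    "MASK_MODE_BACKGROUND",
    "MASK_MODE_FOREGROUND",
    "MASK_MODE_SEMANTIC",
}


def build_auto_mask_reference(mask_mode, dilation=0.01, mask_classes=None,
                              reference_id=2):
    """Return the REFERENCE_TYPE_MASK entry for an automatically generated mask."""
    if mask_mode not in AUTO_MASK_MODES:
        raise ValueError(f"unsupported mask mode: {mask_mode}")
    config = {"maskMode": mask_mode, "dilation": dilation}
    if mask_mode == "MASK_MODE_SEMANTIC":
        if not mask_classes:
            raise ValueError("MASK_MODE_SEMANTIC requires at least one class ID")
        config["maskClasses"] = list(mask_classes)
    elif mask_classes:
        # Sketch-level guard: class IDs only make sense for semantic masks.
        raise ValueError("maskClasses is only used with MASK_MODE_SEMANTIC")
    # No referenceImage is included: the model generates the mask itself.
    return {
        "referenceType": "REFERENCE_TYPE_MASK",
        "referenceId": reference_id,
        "maskImageConfig": config,
    }
```

For example, `build_auto_mask_reference("MASK_MODE_SEMANTIC", mask_classes=[175, 176])` produces the semantic configuration shown in the parameter list, with the class IDs from that example.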

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict

Request JSON body:

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "referenceImages": [
        {
          "referenceType": "REFERENCE_TYPE_RAW",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "B64_BASE_IMAGE"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_MASK",
          "referenceId": 2,
          "maskImageConfig": {
            "maskMode": "MASK_MODE",
            "dilation": MASK_DILATION
          }
        }
      ]
    }
  ],
  "parameters": {
    "editConfig": {
      "baseSteps": EDIT_STEPS
    },
    "editMode": "EDIT_MODE_INPAINT_INSERTION",
    "sampleCount": EDIT_IMAGE_COUNT
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and then run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict"

PowerShell

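Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and then run the following command: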

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict" | Select-Object -Expand Content

The following sample response is for a request with "sampleCount": 2. The response returns two prediction objects, with the generated image bytes base64-encoded.

{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    }
  ]
}

Imagen 2

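Use the following samples to send an inpainting request using the Imagen 2 model.

Console

1. In the Google Cloud console, go to the Vertex AI > Media Studio page.
2. In the lower task panel, click Edit image.
3. Click Upload to select the locally stored image to edit.
4. In the editing toolbar, click Extract.
5. Select one of the mask extraction options:
   - Background elements: Detects the background elements and creates a mask around them.
   - Foreground elements: Detects the foreground objects and creates a mask around them.
   - People: Detects people and creates a mask around them.
6. Optional: In the Parameters panel, adjust the Number of results, Negative prompt, Text prompt guidance, or other parameters.
7. In the prompt field, enter a prompt to modify the image.
8. Click Generate.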

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


import vertexai
from vertexai.preview.vision_models import Image, ImageGenerationModel

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# input_file = "input-image.png"
# mask_mode = "background" # 'background', 'foreground', or 'semantic'
# output_file = "output-image.png"
# prompt = "beach" # The text prompt describing what you want to see inserted.

vertexai.init(project=PROJECT_ID, location="us-central1")

model = ImageGenerationModel.from_pretrained("imagegeneration@006")
base_img = Image.load_from_file(location=input_file)

images = model.edit_image(
    base_image=base_img,
    mask_mode=mask_mode,
    prompt=prompt,
    edit_mode="inpainting-insert",
)

images[0].save(location=output_file, include_generation_parameters=False)

# Optional. View the edited image in a notebook.
# images[0].show()

print(f"Created output image using {len(images[0]._image_bytes)} bytes")
# Example response:
# Created output image using 1234567 bytes

REST

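For more information about imagegeneration model requests, see the imagegeneration model API reference.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your Google Cloud project ID.
LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
TEXT_PROMPT: The text prompt that guides what images the model generates. This field is required for both generation and editing.
B64_BASE_IMAGE: The base image to edit or upscale. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
EDIT_IMAGE_COUNT: The number of edited images. Default value: 4.
MASK_TYPE: Prompts the model to generate a mask so that you don't need to provide one. When you provide this parameter, omit the mask object. Available values:
- background: Automatically generates a mask for all regions of the image except the primary object, person, or subject.
- foreground: Automatically generates a mask for the primary object, person, or subject in the image.
- semantic: Uses automatic segmentation to create a mask area for one or more segmentation classes. Set the segmentation classes using the classes parameter and the corresponding class_id values. You can specify up to 5 classes. When you use the semantic mask type, the maskMode object should look like the following: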
```
"maskMode": {
  "maskType": "semantic",
  "classes": [class_id1, class_id2]
}
```
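GUIDANCE_SCALE_VALUE: An integer parameter that controls how closely the model follows the text prompt. Larger values increase alignment between the text prompt and the generated images, but might reduce image quality. Values: 0 - 500. Default: 60.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict

Request JSON body: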

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "image": {
        "bytesBase64Encoded": "B64_BASE_IMAGE"
      }
    }
  ],
  "parameters": {
    "sampleCount": EDIT_IMAGE_COUNT,
    "editConfig": {
      "editMode": "inpainting-insert",
      "maskMode": {
        "maskType": "MASK_TYPE"
      },
      "guidanceScale": GUIDANCE_SCALE_VALUE
    }
  }
}
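
The constraints listed for the Imagen 2 parameters (mask type, at most 5 semantic classes, guidance scale from 0 to 500) can be checked before a request is sent. The following is a minimal sketch. The function name `build_imagen2_auto_mask_request` is hypothetical, and the field names mirror the JSON body above.

```python
IMAGEN2_MASK_TYPES = ("background", "foreground", "semantic")


def build_imagen2_auto_mask_request(prompt, base_b64, mask_type,
                                    classes=None, count=4, guidance_scale=60):
    """Return the Imagen 2 inpainting-insert body with an auto-generated mask."""
    if mask_type not in IMAGEN2_MASK_TYPES:
        raise ValueError(f"unsupported mask type: {mask_type}")
    if not 0 <= guidance_scale <= 500:
        raise ValueError("guidanceScale must be between 0 and 500")
    mask_mode = {"maskType": mask_type}
    if mask_type == "semantic":
        if not classes or len(classes) > 5:
            raise ValueError("semantic masks need between 1 and 5 class IDs")
        mask_mode["classes"] = list(classes)
    return {
        # No "mask" object in the instance: the model creates the mask.
        "instances": [{
            "prompt": prompt,
            "image": {"bytesBase64Encoded": base_b64},
        }],
        "parameters": {
            "sampleCount": count,
            "editConfig": {
                "editMode": "inpainting-insert",
                "maskMode": mask_mode,
                "guidanceScale": guidance_scale,
            },
        },
    }
```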

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and then run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict"

PowerShell

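Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and then run the following command: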

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagegeneration@006:predict" | Select-Object -Expand Content

The following sample response is for a request with "sampleCount": 2. The response returns two prediction objects, with the generated image bytes base64-encoded.

{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    }
  ]
}

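Limitations

The following sections describe the limitations of Imagen's object insertion feature.

Modified pixels

Pixels that the model generates outside the mask aren't guaranteed to be identical to the input, and they are generated at the model's resolution (for example, 1024x1024). The generated image might contain very small changes.

To preserve the image exactly, we recommend blending the generated image with the input image by using the mask. Blending is typically needed when the input image has a resolution of 2K or higher.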
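
The blending step can be sketched in plain Python: keep the original pixel wherever the mask is black, and take the generated pixel wherever the mask is non-zero. This is an illustration of the idea rather than a production implementation. The function name `blend_with_mask` is hypothetical, images are represented as nested lists of pixel values, and the sketch assumes the generated image has already been resized to the input image's dimensions and that non-zero mask values mark the edited area. In practice you would likely do the same operation with an imaging library.

```python
def blend_with_mask(original, generated, mask):
    """Combine two same-sized images using a mask.

    original, generated: lists of rows, each row a list of pixel values.
    mask: list of rows of integers; non-zero means "use the generated pixel"
    (an assumption of this sketch).
    """
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("images and mask must have the same height")
    blended = []
    for orig_row, gen_row, mask_row in zip(original, generated, mask):
        if not (len(orig_row) == len(gen_row) == len(mask_row)):
            raise ValueError("images and mask must have the same width")
        blended.append([
            gen_px if m > 0 else orig_px
            for orig_px, gen_px, m in zip(orig_row, gen_row, mask_row)
        ])
    return blended
```

Because unmasked pixels are copied directly from the input, they stay identical to the original regardless of the small changes the model introduces.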

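Insert limitation

The inserted content generally matches the style of the base image. However, some keywords can trigger cartoon-style output even when you intend a photorealistic result.

One example we've observed is inaccurate colors. For example, "a yellow giraffe" tends to produce a cartoon giraffe, because photorealistic giraffes are brown and tan. Similarly, photorealistic but unnatural colors are difficult to generate.

What's next

Read articles about Imagen and other Generative AI on Vertex AI products: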
