Text Embeddings API

Text embeddings API 可将文本数据转换为数值向量。这些向量表示旨在捕获它们所表示字词的语义含义和上下文。

支持的模型：

英语模型	多语言模型
`textembedding-gecko@001`	`textembedding-gecko-multilingual@001`
`textembedding-gecko@003`	`text-multilingual-embedding-002`
`text-embedding-004`
`text-embedding-005`

语法

curl

PROJECT_ID = PROJECT_ID
REGION = us-central1
MODEL_ID = MODEL_ID

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:predict -d \
  '{
    "instances": [
      ...
    ],
    "parameters": {
      ...
    }
  }'

Python

PROJECT_ID = PROJECT_ID
REGION = us-central1
MODEL_ID = MODEL_ID

import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project=PROJECT_ID, location=REGION)

model = TextEmbeddingModel.from_pretrained(MODEL_ID)
embeddings = model.get_embeddings(...)

参数列表

参数
`texts`	`list of union[string, TextEmbeddingInput]` 每个实例都表示要嵌入的一段文本。
`TextEmbeddingInput`	`string` 您要为其生成嵌入的文本。
`auto_truncate`	可选：`bool`。如果设置为 true，输入文本将被截断。如果设置为 false，当输入文本的长度超过模型支持的最大长度时，系统会返回错误。默认值为 true。
`output_dimensionality`	可选：`int`。用于指定输出嵌入大小。如果设置此参数，则输出嵌入将被截断为指定的大小。

请求正文

{
  "instances": [
    { 
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "document title",
      "content": "I would like embeddings for this text!"
    },
  ]
}

参数

参数
`content`	`string` 您要为其生成嵌入的文本。
`task_type`	可选：`string`。用于传达预期的下游应用，以帮助模型生成更好的嵌入。如果留空，则使用默认值 `RETRIEVAL_QUERY`。 `RETRIEVAL_QUERY` `RETRIEVAL_DOCUMENT` `SEMANTIC_SIMILARITY` `CLASSIFICATION` `CLUSTERING` `QUESTION_ANSWERING` `FACT_VERIFICATION` `CODE_RETRIEVAL_QUERY` textembedding-gecko@001 模型不支持 `task_type` 参数。如需详细了解任务类型，请参阅选择嵌入任务类型。
`title`	可选：`string`。用于帮助模型生成更好的嵌入。此参数必须与 `task_type=RETRIEVAL_DOCUMENT` 搭配使用才能生效。

content

string

您要为其生成嵌入的文本。

task_type

可选：string。

用于传达预期的下游应用，以帮助模型生成更好的嵌入。如果留空，则使用默认值 RETRIEVAL_QUERY。

RETRIEVAL_QUERY
RETRIEVAL_DOCUMENT
SEMANTIC_SIMILARITY
CLASSIFICATION
CLUSTERING
QUESTION_ANSWERING
FACT_VERIFICATION
CODE_RETRIEVAL_QUERY

textembedding-gecko@001 模型不支持 task_type 参数。

如需详细了解任务类型，请参阅选择嵌入任务类型。

title

可选：string。

用于帮助模型生成更好的嵌入。此参数必须与 task_type=RETRIEVAL_DOCUMENT 搭配使用才能生效。

taskType

下表介绍了 task_type 参数值及其用例：

`task_type`	说明
`RETRIEVAL_QUERY`	在搜索或检索设置中指定给定文本是查询。
`RETRIEVAL_DOCUMENT`	在搜索或检索设置中指定给定文本是文档。
`SEMANTIC_SIMILARITY`	指定给定文本用于语义文本相似度 (STS)。
`CLASSIFICATION`	指定嵌入用于分类。
`CLUSTERING`	指定嵌入用于聚类。
`QUESTION_ANSWERING`	指定查询嵌入用于回答问题。将 RETRIEVAL_DOCUMENT 用于文本端。
`FACT_VERIFICATION`	指定查询嵌入用于事实验证。
`CODE_RETRIEVAL_QUERY`	指定查询嵌入用于 Java 和 Python 的代码检索。

检索任务：

查询：使用 task_type=RETRIEVAL_QUERY 指示输入文本是搜索查询。语料库：使用 task_type=RETRIEVAL_DOCUMENT 表示输入文本属于要搜索的文档集合的一部分。

相似性任务：

语义相似度：对两段输入文本使用 task_type=SEMANTIC_SIMILARITY，以评估它们的整体含义相似度。

响应正文

{
  "predictions": [
    {
      "embeddings": {
        "statistics": {
          "truncated": boolean,
          "token_count": integer
        },
        "values": [ number ]
      }
    }
  ]
}

响应元素	说明
`embeddings`	根据输入文本生成的结果。
`statistics`	根据输入文本计算的统计信息。
`truncated`	表示输入文本是否超过允许的词元数量上限并被截断。
`tokenCount`	输入文本的词元数。
`values`	`values` 字段包含与输入文本中的字词对应的嵌入向量。

示例响应

{
  "predictions": [
    {
      "embeddings": {
        "values": [
          0.0058424929156899452,
          0.011848051100969315,
          0.032247550785541534,
          -0.031829461455345154,
          -0.055369812995195389,
          ...
        ],
        "statistics": {
          "token_count": 4,
          "truncated": false
        }
      }
    }
  ]
}

示例

嵌入文本字符串

基本用例

以下示例展示了如何获取文本字符串的嵌入。

REST

设置您的环境后，您可以使用 REST 测试文本提示。以下示例会向发布方模型端点发送请求。

在使用任何请求数据之前，请先进行以下替换：

PROJECT_ID：您的项目 ID。
TEXT：您要为其生成嵌入的文本。限制：五个文本，除 textembedding-gecko@001 之外的所有模型的每个文本最多 2,048 个词元。textembedding-gecko@001 的输入词元长度上限为 3072。
AUTO_TRUNCATE：如果设置为 false，则超出词元限制的文本会导致请求失败。默认值为 true。

HTTP 方法和网址：

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-embedding-005:predict

请求 JSON 正文：

{
  "instances": [
    { "content": "TEXT"}
  ],
  "parameters": { 
    "autoTruncate": AUTO_TRUNCATE 
  }
}

如需发送请求，请选择以下方式之一：

curl

注意：以下命令假定您已使用您的用户账号通过运行 gcloud init 或 gcloud auth login 登录 gcloud CLI，或者使用了 Cloud Shell，这会使您自动登录 gcloud CLI。您可以运行 gcloud auth list 来检查当前活跃的账号。

将请求正文保存在名为 request.json 的文件中，然后执行以下命令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-embedding-005:predict"

PowerShell

注意：以下命令假定您已使用您的用户账号通过运行 gcloud init 或 gcloud auth login 登录 gcloud CLI。您可以运行 gcloud auth list 来检查当前活跃的账号。

将请求正文保存在名为 request.json 的文件中，然后执行以下命令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-embedding-005:predict" | Select-Object -Expand Content

您应该收到类似以下内容的 JSON 响应。请注意，为节省空间，系统截断了 values。

响应

{
  "predictions": [
    {
      "embeddings": {
        "statistics": {
          "truncated": false,
          "token_count": 6
        },
        "values": [ ... ]
      }
    }
  ]
}

请注意此示例网址中的以下内容：

使用 generateContent 方法请求在回答完全生成后返回回答。为了降低真人观众对于延迟的感知度，请使用 streamGenerateContent 方法在生成回答时流式传输回答。
多模态模型 ID 位于网址末尾且位于方法之前（例如 gemini-1.5-flash 或 gemini-1.0-pro-vision）。此示例可能还支持其他模型。

Python 版 Vertex AI SDK

如需了解如何安装或更新 Vertex AI SDK for Python，请参阅安装 Vertex AI SDK for Python。如需了解详情，请参阅 Python 版 Vertex AI SDK API 参考文档。

from __future__ import annotations

from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel


def embed_text() -> list[list[float]]:
    """Embeds texts with a pre-trained, foundational model.

    Returns:
        A list of lists containing the embedding vectors for each input text
    """

    # A list of texts to be embedded.
    texts = ["banana muffins? ", "banana bread? banana muffins?"]
    # The dimensionality of the output embeddings.
    dimensionality = 256
    # The task type for embedding. Check the available tasks in the model's documentation.
    task = "RETRIEVAL_DOCUMENT"

    model = TextEmbeddingModel.from_pretrained("text-embedding-005")
    inputs = [TextEmbeddingInput(text, task) for text in texts]
    kwargs = dict(output_dimensionality=dimensionality) if dimensionality else {}
    embeddings = model.get_embeddings(inputs, **kwargs)

    print(embeddings)
    # Example response:
    # [[0.006135190837085247, -0.01462465338408947, 0.004978656303137541, ...], [0.1234434666, ...]],
    return [embedding.values for embedding in embeddings]

Go

在尝试此示例之前，请按照《Vertex AI 快速入门：使用客户端库》中的 Go 设置说明执行操作。如需了解详情，请参阅 Vertex AI Go API 参考文档。

如需向 Vertex AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。

import (
	"context"
	"fmt"
	"io"

	aiplatform "cloud.google.com/go/aiplatform/apiv1"
	"cloud.google.com/go/aiplatform/apiv1/aiplatformpb"

	"google.golang.org/api/option"
	"google.golang.org/protobuf/types/known/structpb"
)

// embedTexts shows how embeddings are set for text-embedding-005 model
func embedTexts(w io.Writer, project, location string) error {
	// location := "us-central1"
	ctx := context.Background()

	apiEndpoint := fmt.Sprintf("%s-aiplatform.googleapis.com:443", location)
	dimensionality := 5
	model := "text-embedding-005"
	texts := []string{"banana muffins? ", "banana bread? banana muffins?"}

	client, err := aiplatform.NewPredictionClient(ctx, option.WithEndpoint(apiEndpoint))
	if err != nil {
		return err
	}
	defer client.Close()

	endpoint := fmt.Sprintf("projects/%s/locations/%s/publishers/google/models/%s", project, location, model)
	instances := make([]*structpb.Value, len(texts))
	for i, text := range texts {
		instances[i] = structpb.NewStructValue(&structpb.Struct{
			Fields: map[string]*structpb.Value{
				"content":   structpb.NewStringValue(text),
				"task_type": structpb.NewStringValue("QUESTION_ANSWERING"),
			},
		})
	}

	params := structpb.NewStructValue(&structpb.Struct{
		Fields: map[string]*structpb.Value{
			"outputDimensionality": structpb.NewNumberValue(float64(dimensionality)),
		},
	})

	req := &aiplatformpb.PredictRequest{
		Endpoint:   endpoint,
		Instances:  instances,
		Parameters: params,
	}
	resp, err := client.Predict(ctx, req)
	if err != nil {
		return err
	}
	embeddings := make([][]float32, len(resp.Predictions))
	for i, prediction := range resp.Predictions {
		values := prediction.GetStructValue().Fields["embeddings"].GetStructValue().Fields["values"].GetListValue().Values
		embeddings[i] = make([]float32, len(values))
		for j, value := range values {
			embeddings[i][j] = float32(value.GetNumberValue())
		}
	}

	fmt.Fprintf(w, "Dimensionality: %d. Embeddings length: %d", len(embeddings[0]), len(embeddings))
	return nil
}

Java

在尝试此示例之前，请按照《Vertex AI 快速入门：使用客户端库》中的 Java 设置说明执行操作。如需了解详情，请参阅 Vertex AI Java API 参考文档。

如需向 Vertex AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。

import static java.util.stream.Collectors.toList;

import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PredictTextEmbeddingsSample {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    // Details about text embedding request structure and supported models are available in:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings
    String endpoint = "us-central1-aiplatform.googleapis.com:443";
    String project = "YOUR_PROJECT_ID";
    String model = "text-embedding-005";
    predictTextEmbeddings(
        endpoint,
        project,
        model,
        List.of("banana bread?", "banana muffins?"),
        "QUESTION_ANSWERING",
        OptionalInt.of(256));
  }

  // Gets text embeddings from a pretrained, foundational model.
  public static List<List<Float>> predictTextEmbeddings(
      String endpoint,
      String project,
      String model,
      List<String> texts,
      String task,
      OptionalInt outputDimensionality)
      throws IOException {
    PredictionServiceSettings settings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();
    Matcher matcher = Pattern.compile("^(?<Location>\\w+-\\w+)").matcher(endpoint);
    String location = matcher.matches() ? matcher.group("Location") : "us-central1";
    EndpointName endpointName =
        EndpointName.ofProjectLocationPublisherModelName(project, location, "google", model);

    // You can use this prediction service client for multiple requests.
    try (PredictionServiceClient client = PredictionServiceClient.create(settings)) {
      PredictRequest.Builder request =
          PredictRequest.newBuilder().setEndpoint(endpointName.toString());
      if (outputDimensionality.isPresent()) {
        request.setParameters(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("outputDimensionality", valueOf(outputDimensionality.getAsInt()))
                        .build()));
      }
      for (int i = 0; i < texts.size(); i++) {
        request.addInstances(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("content", valueOf(texts.get(i)))
                        .putFields("task_type", valueOf(task))
                        .build()));
      }
      PredictResponse response = client.predict(request.build());
      List<List<Float>> floats = new ArrayList<>();
      for (Value prediction : response.getPredictionsList()) {
        Value embeddings = prediction.getStructValue().getFieldsOrThrow("embeddings");
        Value values = embeddings.getStructValue().getFieldsOrThrow("values");
        floats.add(
            values.getListValue().getValuesList().stream()
                .map(Value::getNumberValue)
                .map(Double::floatValue)
                .collect(toList()));
      }
      return floats;
    }
  }

  private static Value valueOf(String s) {
    return Value.newBuilder().setStringValue(s).build();
  }

  private static Value valueOf(int n) {
    return Value.newBuilder().setNumberValue(n).build();
  }
}

Node.js

在尝试此示例之前，请按照《Vertex AI 快速入门：使用客户端库》中的 Node.js 设置说明执行操作。如需了解详情，请参阅 Vertex AI Node.js API 参考文档。

如需向 Vertex AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。

async function main(
  project,
  model = 'text-embedding-005',
  texts = 'banana bread?;banana muffins?',
  task = 'QUESTION_ANSWERING',
  dimensionality = 0,
  apiEndpoint = 'us-central1-aiplatform.googleapis.com'
) {
  const aiplatform = require('@google-cloud/aiplatform');
  const {PredictionServiceClient} = aiplatform.v1;
  const {helpers} = aiplatform; // helps construct protobuf.Value objects.
  const clientOptions = {apiEndpoint: apiEndpoint};
  const location = 'us-central1';
  const endpoint = `projects/${project}/locations/${location}/publishers/google/models/${model}`;

  async function callPredict() {
    const instances = texts
      .split(';')
      .map(e => helpers.toValue({content: e, task_type: task}));
    const parameters = helpers.toValue(
      dimensionality > 0 ? {outputDimensionality: parseInt(dimensionality)} : {}
    );
    const request = {endpoint, instances, parameters};
    const client = new PredictionServiceClient(clientOptions);
    const [response] = await client.predict(request);
    const predictions = response.predictions;
    const embeddings = predictions.map(p => {
      const embeddingsProto = p.structValue.fields.embeddings;
      const valuesProto = embeddingsProto.structValue.fields.values;
      return valuesProto.listValue.values.map(v => v.numberValue);
    });
    console.log('Got embeddings: \n' + JSON.stringify(embeddings));
  }

  callPredict();
}

高级用例

以下示例演示了一些高级功能

使用 task_type 和 title 可提高嵌入质量。
使用参数来控制 API 的行为。

REST

在使用任何请求数据之前，请先进行以下替换：

PROJECT_ID：您的项目 ID。
TEXT：您要为其生成嵌入的文本。限制：五个文本，每个文本最多 3,072 个词元。
TASK_TYPE：用于传达预期的下游应用，以帮助模型生成更好的嵌入。
TITLE：用于帮助模型生成更好的嵌入。
AUTO_TRUNCATE：如果设置为 false，则超出词元限制的文本会导致请求失败。默认值为 true。
OUTPUT_DIMENSIONALITY：用于指定输出嵌入大小。如果设置此参数，则输出嵌入将被截断为指定的大小。

HTTP 方法和网址：

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/textembedding-gecko@003:predict

请求 JSON 正文：

{
  "instances": [
    { "content": "TEXT",
      "task_type": "TASK_TYPE",
      "title": "TITLE"
    },
  ],
  "parameters": {
    "autoTruncate": AUTO_TRUNCATE,
    "outputDimensionality": OUTPUT_DIMENSIONALITY
  }
}

如需发送请求，请选择以下方式之一：

curl

将请求正文保存在名为 request.json 的文件中，然后执行以下命令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/textembedding-gecko@003:predict"

PowerShell

注意：以下命令假定您已使用您的用户账号通过运行 gcloud init 或 gcloud auth login 登录 gcloud CLI。您可以运行 gcloud auth list 来检查当前活跃的账号。

将请求正文保存在名为 request.json 的文件中，然后执行以下命令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/textembedding-gecko@003:predict" | Select-Object -Expand Content

您应该收到类似以下内容的 JSON 响应。请注意，为节省空间，系统截断了 values。

响应

{
  "predictions": [
    {
      "embeddings": {
        "statistics": {
          "truncated": false,
          "token_count": 6
        },
        "values": [ ... ]
      }
    }
  ]
}

Python 版 Vertex AI SDK

如需了解如何安装或更新 Vertex AI SDK for Python，请参阅安装 Vertex AI SDK for Python。如需了解详情，请参阅 Python 版 Vertex AI SDK API 参考文档。

import re

from google.cloud.aiplatform import initializer as aiplatform_init
from vertexai.language_models import TextEmbeddingModel


def tune_embedding_model(
    api_endpoint: str,
    base_model_name: str = "text-embedding-005",
    corpus_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl",
    queries_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl",
    train_label_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv",
    test_label_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv",
):  # noqa: ANN201
    """Tune an embedding model using the specified parameters.
    Args:
        api_endpoint (str): The API endpoint for the Vertex AI service.
        base_model_name (str): The name of the base model to use for tuning.
        corpus_path (str): GCS URI of the JSONL file containing the corpus data.
        queries_path (str): GCS URI of the JSONL file containing the queries data.
        train_label_path (str): GCS URI of the TSV file containing the training labels.
        test_label_path (str): GCS URI of the TSV file containing the test labels.
    """
    match = re.search(r"^(\w+-\w+)", api_endpoint)
    location = match.group(1) if match else "us-central1"
    base_model = TextEmbeddingModel.from_pretrained(base_model_name)
    tuning_job = base_model.tune_model(
        task_type="DEFAULT",
        corpus_data=corpus_path,
        queries_data=queries_path,
        training_data=train_label_path,
        test_data=test_label_path,
        batch_size=128,  # The batch size to use for training.
        train_steps=1000,  # The number of training steps.
        tuned_model_location=location,
        output_dimensionality=768,  # The dimensionality of the output embeddings.
        learning_rate_multiplier=1.0,  # The multiplier for the learning rate.
    )
    return tuning_job

Go

在尝试此示例之前，请按照《Vertex AI 快速入门：使用客户端库》中的 Go 设置说明执行操作。如需了解详情，请参阅 Vertex AI Go API 参考文档。

如需向 Vertex AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。

import (
	"context"
	"fmt"
	"io"

	aiplatform "cloud.google.com/go/aiplatform/apiv1"
	"cloud.google.com/go/aiplatform/apiv1/aiplatformpb"

	"google.golang.org/api/option"
	"google.golang.org/protobuf/types/known/structpb"
)

// embedTexts shows how embeddings are set for text-embedding-005 model
func embedTexts(w io.Writer, project, location string) error {
	// location := "us-central1"
	ctx := context.Background()

	apiEndpoint := fmt.Sprintf("%s-aiplatform.googleapis.com:443", location)
	dimensionality := 5
	model := "text-embedding-005"
	texts := []string{"banana muffins? ", "banana bread? banana muffins?"}

	client, err := aiplatform.NewPredictionClient(ctx, option.WithEndpoint(apiEndpoint))
	if err != nil {
		return err
	}
	defer client.Close()

	endpoint := fmt.Sprintf("projects/%s/locations/%s/publishers/google/models/%s", project, location, model)
	instances := make([]*structpb.Value, len(texts))
	for i, text := range texts {
		instances[i] = structpb.NewStructValue(&structpb.Struct{
			Fields: map[string]*structpb.Value{
				"content":   structpb.NewStringValue(text),
				"task_type": structpb.NewStringValue("QUESTION_ANSWERING"),
			},
		})
	}

	params := structpb.NewStructValue(&structpb.Struct{
		Fields: map[string]*structpb.Value{
			"outputDimensionality": structpb.NewNumberValue(float64(dimensionality)),
		},
	})

	req := &aiplatformpb.PredictRequest{
		Endpoint:   endpoint,
		Instances:  instances,
		Parameters: params,
	}
	resp, err := client.Predict(ctx, req)
	if err != nil {
		return err
	}
	embeddings := make([][]float32, len(resp.Predictions))
	for i, prediction := range resp.Predictions {
		values := prediction.GetStructValue().Fields["embeddings"].GetStructValue().Fields["values"].GetListValue().Values
		embeddings[i] = make([]float32, len(values))
		for j, value := range values {
			embeddings[i][j] = float32(value.GetNumberValue())
		}
	}

	fmt.Fprintf(w, "Dimensionality: %d. Embeddings length: %d", len(embeddings[0]), len(embeddings))
	return nil
}

Java

在尝试此示例之前，请按照《Vertex AI 快速入门：使用客户端库》中的 Java 设置说明执行操作。如需了解详情，请参阅 Vertex AI Java API 参考文档。

如需向 Vertex AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。

import static java.util.stream.Collectors.toList;

import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PredictTextEmbeddingsSample {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    // Details about text embedding request structure and supported models are available in:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings
    String endpoint = "us-central1-aiplatform.googleapis.com:443";
    String project = "YOUR_PROJECT_ID";
    String model = "text-embedding-005";
    predictTextEmbeddings(
        endpoint,
        project,
        model,
        List.of("banana bread?", "banana muffins?"),
        "QUESTION_ANSWERING",
        OptionalInt.of(256));
  }

  // Gets text embeddings from a pretrained, foundational model.
  public static List<List<Float>> predictTextEmbeddings(
      String endpoint,
      String project,
      String model,
      List<String> texts,
      String task,
      OptionalInt outputDimensionality)
      throws IOException {
    PredictionServiceSettings settings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();
    Matcher matcher = Pattern.compile("^(?<Location>\\w+-\\w+)").matcher(endpoint);
    String location = matcher.matches() ? matcher.group("Location") : "us-central1";
    EndpointName endpointName =
        EndpointName.ofProjectLocationPublisherModelName(project, location, "google", model);

    // You can use this prediction service client for multiple requests.
    try (PredictionServiceClient client = PredictionServiceClient.create(settings)) {
      PredictRequest.Builder request =
          PredictRequest.newBuilder().setEndpoint(endpointName.toString());
      if (outputDimensionality.isPresent()) {
        request.setParameters(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("outputDimensionality", valueOf(outputDimensionality.getAsInt()))
                        .build()));
      }
      for (int i = 0; i < texts.size(); i++) {
        request.addInstances(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("content", valueOf(texts.get(i)))
                        .putFields("task_type", valueOf(task))
                        .build()));
      }
      PredictResponse response = client.predict(request.build());
      List<List<Float>> floats = new ArrayList<>();
      for (Value prediction : response.getPredictionsList()) {
        Value embeddings = prediction.getStructValue().getFieldsOrThrow("embeddings");
        Value values = embeddings.getStructValue().getFieldsOrThrow("values");
        floats.add(
            values.getListValue().getValuesList().stream()
                .map(Value::getNumberValue)
                .map(Double::floatValue)
                .collect(toList()));
      }
      return floats;
    }
  }

  private static Value valueOf(String s) {
    return Value.newBuilder().setStringValue(s).build();
  }

  private static Value valueOf(int n) {
    return Value.newBuilder().setNumberValue(n).build();
  }
}

Node.js

在尝试此示例之前，请按照《Vertex AI 快速入门：使用客户端库》中的 Node.js 设置说明执行操作。如需了解详情，请参阅 Vertex AI Node.js API 参考文档。

如需向 Vertex AI 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。

async function main(
  project,
  model = 'text-embedding-005',
  texts = 'banana bread?;banana muffins?',
  task = 'QUESTION_ANSWERING',
  dimensionality = 0,
  apiEndpoint = 'us-central1-aiplatform.googleapis.com'
) {
  const aiplatform = require('@google-cloud/aiplatform');
  const {PredictionServiceClient} = aiplatform.v1;
  const {helpers} = aiplatform; // helps construct protobuf.Value objects.
  const clientOptions = {apiEndpoint: apiEndpoint};
  const location = 'us-central1';
  const endpoint = `projects/${project}/locations/${location}/publishers/google/models/${model}`;

  async function callPredict() {
    const instances = texts
      .split(';')
      .map(e => helpers.toValue({content: e, task_type: task}));
    const parameters = helpers.toValue(
      dimensionality > 0 ? {outputDimensionality: parseInt(dimensionality)} : {}
    );
    const request = {endpoint, instances, parameters};
    const client = new PredictionServiceClient(clientOptions);
    const [response] = await client.predict(request);
    const predictions = response.predictions;
    const embeddings = predictions.map(p => {
      const embeddingsProto = p.structValue.fields.embeddings;
      const valuesProto = embeddingsProto.structValue.fields.values;
      return valuesProto.listValue.values.map(v => v.numberValue);
    });
    console.log('Got embeddings: \n' + JSON.stringify(embeddings));
  }

  callPredict();
}

支持的文本语言

所有文本嵌入模型都支持英语文本，并且已针对英语文本进行了评估。textembedding-gecko-multilingual@001 和 text-multilingual-embedding-002 模型还支持并已在以下语言上进行了评估：

评估的语言：Arabic (ar)、Bengali (bn)、English (en)、Spanish (es)、German (de)、Persian (fa)、Finnish (fi)、French (fr)、Hindi (hi)、Indonesian (id)、Japanese (ja)、Korean (ko)、Russian (ru)、Swahili (sw)、Telugu (te)、Thai (th)、Yoruba (yo)、Chinese (zh)
支持的语言：Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusiasn, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu。

模型版本

如需使用稳定的模型版本，请指定模型版本号，例如 text-embedding-005。不建议指定不含版本号的模型（例如 textembedding-gecko），因为它只是指向其他模型的旧指针，并不稳定。每个稳定版本会在后续稳定版发布日期后的六个月内可用。

下表包含可用的稳定模型版本：

模型名称	发布日期	终止日期
text-embedding-005	2024 年 11 月 18 日	待定。
text-embedding-004	2024 年 5 月 14 日	2025 年 11 月 18 日
text-multilingual-embedding-002	2024 年 5 月 14 日	待定。
textembedding-gecko@003	2023 年 12 月 12 日	2025 年 5 月 14 日
textembedding-gecko-multilingual@001	2023 年 11 月 2 日	2025 年 5 月 14 日
textembedding-gecko@002 （回退，但仍受支持）	2023 年 11 月 2 日	2025 年 4 月 9 日
textembedding-gecko@001 （回退，但仍受支持）	2023 年 6 月 7 日	2025 年 4 月 9 日
multimodalembedding@001	2024 年 2 月 12 日	待定。

如需了解详情，请参阅模型版本和生命周期。

后续步骤

如需查看详细文档，请参阅以下内容：

文本嵌入

Text Embeddings API 使用集合让一切井井有条 根据您的偏好保存内容并对其进行分类。

语法

curl

Python

参数列表

请求正文

taskType

响应正文

示例响应

示例

嵌入文本字符串

基本用例

REST

curl

PowerShell

响应

Python 版 Vertex AI SDK

Go

Java

Node.js

高级用例

REST

curl

PowerShell

响应

Python 版 Vertex AI SDK

Go

Java

Node.js

支持的文本语言

模型版本

后续步骤

Text Embeddings API