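Update a schema

You can update the schema for any data that supports a schema, such as structured data, website data with structured data, or unstructured data with metadata.

You can update the schema in the Google Cloud console or by using the schemas.patch API method. Updating the schema for a website is supported only through the REST API.

To update the schema, you can add new fields, change the indexable, searchable, and retrievable annotations for a field, or mark a field as a key property (for example, title, uri, or description).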

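Update your schema

You can update your schema in the Google Cloud console or by using the API.

Console

To update a schema in the Google Cloud console, follow these steps:

  1. Review the Requirements and limitations section to check that your schema update is valid.

  2. If you are updating field annotations (setting fields as indexable, retrievable, dynamic facetable, searchable, or completable), review Configure field settings for the limitations and requirements of each annotation type.

  3. Check that you have completed data ingestion. Otherwise, the schema might not be available to edit yet.

  4. In the Google Cloud console, go to the Agent Builder page.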

    Agent Builder

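  5. In the navigation menu, click Data Stores.

  6. In the Name column, click the data store that contains the schema you want to update.

  7. Click the Schema tab to view the schema for your data.

    This tab might be empty if this is the first time you are editing the fields.

  8. Click the Edit button.

  9. Update your schema:

    • Map key properties: In the Key properties column of your schema, select the key property to map a field to. For example, if a field called details always contains the description of a document, map that field to the key property Description.

    • Update number of dimensions (Advanced): You can update this setting if you are using custom vector embeddings with Vertex AI Search. See Advanced: Use custom embeddings.

    • Update field annotations: To update the annotations for a field, select or deselect the annotation settings of that field. The available annotations are Retrievable, Indexable, Dynamic Facetable, Searchable, and Completable. Some field settings have limitations. For descriptions and requirements of each annotation type, see Configure field settings.

    • Add a new field: Adding new fields to your schema before you import new documents that contain those fields can reduce the time it takes Vertex AI Agent Builder to reindex your data after the import.

      1. Click Add new fields to expand that section.

      2. Click Add node, and then specify the settings for the new field.

        To indicate an array, set Array to Yes. For example, to add an array of strings, set type to string and Array to Yes.

        For a website data store index, all fields that you add are arrays by default.

  10. Click Save to apply your schema changes.

    Changing the schema triggers reindexing. For large data stores, reindexing can take hours to complete.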
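For orientation, the type and Array settings in the console can be pictured as the JSON schema shapes used in the REST examples on this page. The following Python sketch is an illustration only: field_definition is a hypothetical helper, and the mapping from the console's Array setting to an "array" type with "items" is an assumption based on the categories field in those examples.

```python
def field_definition(field_type, is_array=False):
    """Return a JSON-schema-style definition for a single field.

    Assumption: Array = Yes in the console corresponds to a schema entry
    of type "array" whose "items" carry the scalar type.
    """
    scalar = {"type": field_type}
    if is_array:
        return {"type": "array", "items": scalar}
    return scalar


# A plain string field and an array of strings.
title_field = field_definition("string")
categories_field = field_definition("string", is_array=True)
```

Here categories_field has the same shape as the categories property in the example schemas later on this page, without the key property mapping.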

REST

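To update your schema by using the API, follow these steps:

  1. Review the Requirements and limitations and Limitation examples (REST only) sections to check that your schema changes are valid.

    To update the schema for a data store that contains website data or unstructured data with metadata, skip to step 5 to call the schemas.patch method.

  2. If you are updating field annotations (setting fields as indexable, retrievable, dynamic facetable, or searchable), review Configure field settings for the limitations and requirements of each annotation type.

  3. If you are editing an auto-detected schema, make sure that you have completed data ingestion. Otherwise, the schema might not be available to edit yet.

  4. Find your data store ID. If you already have your data store ID, skip to the next step.

    1. In the Google Cloud console, go to the Agent Builder page, and then click Data Stores in the navigation menu.

      Go to the Data Stores page

    2. Click the name of your data store.

    3. On the Data page for your data store, get the data store ID.

  5. Use the schemas.patch API method to provide your new JSON schema as a JSON object.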

    curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://discoveryengine.googleapis.com/v1beta/projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID/schemas/default_schema" \
    -d '{
      "structSchema": JSON_SCHEMA_OBJECT
    }'
    

    Replace the following:

    • PROJECT_ID: the ID of your Google Cloud project.
    • DATA_STORE_ID: the ID of the Vertex AI Search data store.
    • JSON_SCHEMA_OBJECT: your new JSON schema, as a JSON object. For example:

      {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
          "title": {
            "type": "string",
            "keyPropertyMapping": "title"
          },
          "categories": {
            "type": "array",
            "items": {
              "type": "string",
              "keyPropertyMapping": "category"
            }
          },
          "uri": {
            "type": "string",
            "keyPropertyMapping": "uri"
          }
        }
      }
  6. Optional: Review the schema by following the steps in View a schema definition.
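The curl command in step 5 can also be assembled in code. The following is a minimal Python sketch that uses only the standard library and builds the PATCH request without sending it. build_patch_request is a hypothetical helper name, and the access token is assumed to come from gcloud auth print-access-token, as in the curl command.

```python
import json
import urllib.request

# Same endpoint root as the curl example in step 5.
API_ROOT = "https://discoveryengine.googleapis.com/v1beta"


def build_patch_request(project_id, data_store_id, json_schema, access_token):
    """Build (but do not send) a schemas.patch request for default_schema."""
    url = (
        f"{API_ROOT}/projects/{project_id}/locations/global"
        f"/collections/default_collection/dataStores/{data_store_id}"
        "/schemas/default_schema"
    )
    # The new schema goes into the "structSchema" field of the request body.
    body = json.dumps({"structSchema": json_schema}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send the request. Keeping construction separate from sending makes it possible to inspect the URL and body first.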

C#

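For more information, see the Vertex AI Agent Builder C# API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.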

using Google.Cloud.DiscoveryEngine.V1;
using Google.LongRunning;

public sealed partial class GeneratedSchemaServiceClientSnippets
{
    /// <summary>Snippet for UpdateSchema</summary>
    /// <remarks>
    /// This snippet has been automatically generated and should be regarded as a code template only.
    /// It will require modifications to work:
    /// - It may require correct/in-range values for request initialization.
    /// - It may require specifying regional endpoints when creating the service client as shown in
    ///   https://cloud.google.com/dotnet/docs/reference/help/client-configuration#endpoint.
    /// </remarks>
    public void UpdateSchemaRequestObject()
    {
        // Create client
        SchemaServiceClient schemaServiceClient = SchemaServiceClient.Create();
        // Initialize request argument(s)
        UpdateSchemaRequest request = new UpdateSchemaRequest
        {
            Schema = new Schema(),
            AllowMissing = false,
        };
        // Make the request
        Operation<Schema, UpdateSchemaMetadata> response = schemaServiceClient.UpdateSchema(request);

        // Poll until the returned long-running operation is complete
        Operation<Schema, UpdateSchemaMetadata> completedResponse = response.PollUntilCompleted();
        // Retrieve the operation result
        Schema result = completedResponse.Result;

        // Or get the name of the operation
        string operationName = response.Name;
        // This name can be stored, then the long-running operation retrieved later by name
        Operation<Schema, UpdateSchemaMetadata> retrievedResponse = schemaServiceClient.PollOnceUpdateSchema(operationName);
        // Check if the retrieved long-running operation has completed
        if (retrievedResponse.IsCompleted)
        {
            // If it has completed, then access the result
            Schema retrievedResult = retrievedResponse.Result;
        }
    }
}

Go

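For more information, see the Vertex AI Agent Builder Go API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.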


package main

import (
	"context"

	discoveryengine "cloud.google.com/go/discoveryengine/apiv1"
	discoveryenginepb "cloud.google.com/go/discoveryengine/apiv1/discoveryenginepb"
)

func main() {
	ctx := context.Background()
	// This snippet has been automatically generated and should be regarded as a code template only.
	// It will require modifications to work:
	// - It may require correct/in-range values for request initialization.
	// - It may require specifying regional endpoints when creating the service client as shown in:
	//   https://pkg.go.dev/cloud.google.com/go#hdr-Client_Options
	c, err := discoveryengine.NewSchemaClient(ctx)
	if err != nil {
		// TODO: Handle error.
	}
	defer c.Close()

	req := &discoveryenginepb.UpdateSchemaRequest{
		// TODO: Fill request struct fields.
		// See https://pkg.go.dev/cloud.google.com/go/discoveryengine/apiv1/discoveryenginepb#UpdateSchemaRequest.
	}
	op, err := c.UpdateSchema(ctx, req)
	if err != nil {
		// TODO: Handle error.
	}

	resp, err := op.Wait(ctx)
	if err != nil {
		// TODO: Handle error.
	}
	// TODO: Use resp.
	_ = resp
}

Java

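For more information, see the Vertex AI Agent Builder Java API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.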

import com.google.cloud.discoveryengine.v1.Schema;
import com.google.cloud.discoveryengine.v1.SchemaServiceClient;
import com.google.cloud.discoveryengine.v1.UpdateSchemaRequest;

public class SyncUpdateSchema {

  public static void main(String[] args) throws Exception {
    syncUpdateSchema();
  }

  public static void syncUpdateSchema() throws Exception {
    // This snippet has been automatically generated and should be regarded as a code template only.
    // It will require modifications to work:
    // - It may require correct/in-range values for request initialization.
    // - It may require specifying regional endpoints when creating the service client as shown in
    // https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
    try (SchemaServiceClient schemaServiceClient = SchemaServiceClient.create()) {
      UpdateSchemaRequest request =
          UpdateSchemaRequest.newBuilder()
              .setSchema(Schema.newBuilder().build())
              .setAllowMissing(true)
              .build();
      Schema response = schemaServiceClient.updateSchemaAsync(request).get();
    }
  }
}

Python

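For more information, see the Vertex AI Agent Builder Python API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.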

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import discoveryengine_v1


def sample_update_schema():
    # Create a client
    client = discoveryengine_v1.SchemaServiceClient()

    # Initialize request argument(s)
    request = discoveryengine_v1.UpdateSchemaRequest(
    )

    # Make the request
    operation = client.update_schema(request=request)

    print("Waiting for operation to complete...")

    response = operation.result()

    # Handle the response
    print(response)

Ruby

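For more information, see the Vertex AI Agent Builder Ruby API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.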

require "google/cloud/discovery_engine/v1"

##
# Snippet for the update_schema call in the SchemaService service
#
# This snippet has been automatically generated and should be regarded as a code
# template only. It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
# client as shown in https://cloud.google.com/ruby/docs/reference.
#
# This is an auto-generated example demonstrating basic usage of
# Google::Cloud::DiscoveryEngine::V1::SchemaService::Client#update_schema.
#
def update_schema
  # Create a client object. The client can be reused for multiple calls.
  client = Google::Cloud::DiscoveryEngine::V1::SchemaService::Client.new

  # Create a request. To set request fields, pass in keyword arguments.
  request = Google::Cloud::DiscoveryEngine::V1::UpdateSchemaRequest.new

  # Call the update_schema method.
  result = client.update_schema request

  # The returned object is of type Gapic::Operation. You can use it to
  # check the status of an operation, cancel it, or wait for results.
  # Here is how to wait for a response.
  result.wait_until_done! timeout: 60
  if result.response?
    p result.response
  else
    puts "No response received."
  end
end

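Requirements and limitations

When you update a schema, make sure that the new schema is backward compatible with the schema that you are updating. To replace a schema with one that isn't backward compatible, you need to delete all the documents in the data store, delete the schema, and then create a new schema.

Updating a schema triggers reindexing of all documents. This can take time and incur additional costs:

  • Time. Reindexing a large data store can take hours or days.

  • Expense. Reindexing might incur costs, depending on the parser. For example, reindexing data stores that use the OCR parser or the layout parser incurs costs. For more information, see Document AI feature pricing.

Schema updates don't support the following:

  • Changing a field type. A schema update can't change the type of a field. For example, a field mapped to integer can't be changed to string.
  • Removing a field. Once defined, a field can't be removed. You can continue to add new fields, but you can't remove an existing field.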
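The two unsupported operations above can be checked locally before you send an update. The following Python sketch compares the top-level properties of two schemas. find_incompatible_changes is a hypothetical helper, it covers only the two rules listed in this section, and the service might enforce other rules that it doesn't detect.

```python
def find_incompatible_changes(old_schema, new_schema):
    """List removed fields and changed field types between two schemas.

    Only top-level "properties" are compared. An empty list means that
    neither of the two unsupported operations was found.
    """
    problems = []
    old_props = old_schema.get("properties", {})
    new_props = new_schema.get("properties", {})
    for name, old_def in old_props.items():
        if name not in new_props:
            problems.append(f"field '{name}' was removed")
            continue
        old_type = old_def.get("type")
        new_type = new_props[name].get("type")
        if old_type != new_type:
            problems.append(
                f"field '{name}' changed type from {old_type} to {new_type}"
            )
    return problems


current = {"properties": {"title": {"type": "string"}, "count": {"type": "integer"}}}
proposed = {"properties": {"count": {"type": "string"}}}
print(find_incompatible_changes(current, proposed))
```

Fields that exist only in the new schema are ignored, because adding fields is supported.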

Limitation examples (REST only)

This section shows examples of valid and invalid schema updates. The examples use the following sample JSON schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "title": {
      "type": "string"
    },
    "description": {
      "type": "string",
      "keyPropertyMapping": "description"
    },
    "categories": {
      "type": "array",
      "items": {
        "type": "string",
        "keyPropertyMapping": "category"
      }
    }
  }
}

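Examples of supported updates

The following updates to the example schema are supported.

  • Adding a field. In this example, the field properties.uri has been added to the schema.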

    {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "type": "object",
      "properties": {
        "title": {
          "type": "string"
        },
        "description": {
          "type": "string",
          "keyPropertyMapping": "description"
        },
        "uri": { // Added field. This is supported.
          "type": "string",
          "keyPropertyMapping": "uri"
        },
        "categories": {
          "type": "array",
          "items": {
            "type": "string",
            "keyPropertyMapping": "category"
          }
        }
      }
    }
    
  • Adding or removing key property annotations for title, description, or uri. In this example, keyPropertyMapping has been added to the title field.

    {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "type": "object",
      "properties": {
        "title": {
          "type": "string",
          "keyPropertyMapping": "title" // Added "keyPropertyMapping". This is supported.
        },
        "description": {
          "type": "string",
          "keyPropertyMapping": "description"
        },
        "categories": {
          "type": "array",
          "items": {
            "type": "string",
            "keyPropertyMapping": "category"
          }
        }
      }
    }
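Both supported updates can be produced programmatically from an existing schema. The following Python sketch treats the schema as a plain dictionary and returns modified copies, so the original stays available for comparison. with_added_field and with_key_property are hypothetical helper names, not part of any client library.

```python
import copy


def with_added_field(schema, name, definition):
    """Return a copy of the schema with one new property added."""
    if name in schema.get("properties", {}):
        raise ValueError(f"field '{name}' already exists")
    updated = copy.deepcopy(schema)
    updated.setdefault("properties", {})[name] = definition
    return updated


def with_key_property(schema, name, key_property):
    """Return a copy of the schema with keyPropertyMapping set on a field."""
    if name not in schema.get("properties", {}):
        raise KeyError(name)
    updated = copy.deepcopy(schema)
    updated["properties"][name]["keyPropertyMapping"] = key_property
    return updated


base = {
    "type": "object",
    "properties": {"title": {"type": "string"}},
}
# Add a uri field, then map title to the title key property.
step_one = with_added_field(
    base, "uri", {"type": "string", "keyPropertyMapping": "uri"}
)
step_two = with_key_property(step_one, "title", "title")
```

Because each helper works on a deep copy, base still holds the unmodified schema after both calls.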
    

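Examples of invalid schema updates

The following updates to the example schema aren't supported.

  • Changing a field type. In this example, the type of the title field has been changed from string to number. This isn't supported.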

      {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
          "title": {
            "type": "number" // Changed from string. Not allowed.
          },
          "description": {
            "type": "string",
            "keyPropertyMapping": "description"
          },
          "categories": {
            "type": "array",
            "items": {
              "type": "string",
              "keyPropertyMapping": "category"
            }
          }
        }
      }
    
  • Removing a field. In this example, the title field has been removed. This isn't supported.

      {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
          // "title" is removed. Not allowed.
          "description": {
            "type": "string",
            "keyPropertyMapping": "description"
          },
          "uri": {
            "type": "string",
            "keyPropertyMapping": "uri"
          },
          "categories": {
            "type": "array",
            "items": {
              "type": "string",
              "keyPropertyMapping": "category"
            }
          }
        }
      }
    

What's next