本頁面由 Cloud Translation API 翻譯而成。

取得線上推論和說明

本頁說明如何使用 Google Cloud 控制台或 Vertex AI API，從表格分類或迴歸模型取得線上 (即時) 推論和說明。

線上推論為同步要求，而非非同步要求的批次推論。如要依據應用程式輸入內容發出要求，或是需要及時進行推論，您可以選用線上推論模式。

您必須先將模型部署至端點，才能使用模型提供線上推論。部署模型時，系統會將實體資源與模型建立關聯，讓模型以低延遲的方式提供線上推論結果。

涵蓋的主題包括：

將模型部署至端點
使用已部署的模型取得線上推論
使用已部署的模型取得線上解釋

事前準備

您必須先訓練分類或迴歸模型，並評估其準確度，才能取得線上推論。

將模型部署至端點

您可以將多個模型部署至同一個端點，也可以將模型部署至多個端點。如要進一步瞭解部署模型的選項和用途，請參閱「關於部署模型」一文。

請使用下列其中一種方法部署模型：

Google Cloud 控制台

在 Google Cloud 控制台的 Vertex AI 專區中，前往「Models」頁面。

前往「Models」(模型) 頁面
按一下要部署的模型名稱，開啟模型詳細資料頁面。
選取「Deploy & Test」分頁標籤。

如果模型已部署至任何端點，這些端點會列在「Deploy your model」部分。
按一下「Deploy to endpoint」。
在「Define your endpoint」頁面中，按照下列方式進行設定：
1. 您可以選擇將模型部署至新端點或現有端點。
  - 如要將模型部署至新端點，請選取「Create new endpoint」(建立新端點)，然後為新端點提供名稱。
  - 如要將模型部署至現有端點，請選取「新增至現有端點」，然後從下拉式清單中選取端點。
  - 您可以在端點中新增多個模型，也可以在多個端點中新增模型。瞭解詳情。
2. 按一下「繼續」。
在「Model settings」頁面中，按照下列方式進行設定：
1. 如果您要將模型部署至新端點，請接受「流量分配」為 100。如果您要將模型部署至已部署一或多個模型的現有端點，則必須更新要部署的模型和已部署模型的流量拆分百分比，讓所有百分比加總為 100%。
2. 輸入要為模型提供的運算節點數量下限。
  
  這是這個模型隨時可用的節點數量。無論是用於處理推論負載，還是用於待機 (最少) 節點，系統都會針對所使用的節點收取費用，即使沒有推論流量也一樣。請參閱定價頁面。
3. 選取機器類型。
  
  機器資源越多，推論效能就會越好，但成本也會增加。
4. 瞭解如何變更推論記錄的預設設定。
5. 按一下 [Continue] (繼續)。
在「模型監控」頁面中，按一下「繼續」。
在「監控目標」頁面中，完成以下設定：
1. 輸入訓練資料的位置。
2. 輸入目標欄的名稱。
按一下「Deploy」，將模型部署至端點。

API

使用 Vertex AI API 部署模型時，您必須完成下列步驟：

視需要建立端點。
取得端點 ID。
將模型部署至端點。

建立端點

如果您要將模型部署至現有端點，可以略過這個步驟。

gcloud

以下範例使用 gcloud ai endpoints create 指令：

  gcloud ai endpoints create \
    --region=LOCATION \
    --display-name=ENDPOINT_NAME

更改下列內容：

LOCATION_ID：您使用 Vertex AI 的區域。
ENDPOINT_NAME：端點的顯示名稱。

Google Cloud CLI 工具可能需要幾秒鐘的時間才能建立端點。

REST

使用任何要求資料之前，請先替換以下項目：

LOCATION_ID：您的區域。
PROJECT_ID：您的專案 ID。
ENDPOINT_NAME：端點的顯示名稱。

HTTP 方法和網址：

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints

JSON 要求主體：

{
  "display_name": "ENDPOINT_NAME"
}

如要傳送要求，請展開以下其中一個選項：

curl (Linux、macOS 或 Cloud Shell)

注意：以下指令假設您已使用使用者帳戶登入 gcloud CLI，方法是執行 gcloud init 或 gcloud auth login，或是使用 Cloud Shell，後者會自動登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints"

PowerShell (Windows)

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-11-05T17:45:42.812656Z",
      "updateTime": "2020-11-05T17:45:42.812656Z"
    }
  }
}

"done": true

Java

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.CreateEndpointOperationMetadata;
import com.google.cloud.aiplatform.v1.Endpoint;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateEndpointSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String endpointDisplayName = "YOUR_ENDPOINT_DISPLAY_NAME";
    createEndpointSample(project, endpointDisplayName);
  }

  static void createEndpointSample(String project, String endpointDisplayName)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      LocationName locationName = LocationName.of(project, location);
      Endpoint endpoint = Endpoint.newBuilder().setDisplayName(endpointDisplayName).build();

      OperationFuture<Endpoint, CreateEndpointOperationMetadata> endpointFuture =
          endpointServiceClient.createEndpointAsync(locationName, endpoint);
      System.out.format("Operation name: %s\n", endpointFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      Endpoint endpointResponse = endpointFuture.get(300, TimeUnit.SECONDS);

      System.out.println("Create Endpoint Response");
      System.out.format("Name: %s\n", endpointResponse.getName());
      System.out.format("Display Name: %s\n", endpointResponse.getDisplayName());
      System.out.format("Description: %s\n", endpointResponse.getDescription());
      System.out.format("Labels: %s\n", endpointResponse.getLabelsMap());
      System.out.format("Create Time: %s\n", endpointResponse.getCreateTime());
      System.out.format("Update Time: %s\n", endpointResponse.getUpdateTime());
    }
  }
}

Node.js

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const endpointDisplayName = 'YOUR_ENDPOINT_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function createEndpoint() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const endpoint = {
    displayName: endpointDisplayName,
  };
  const request = {
    parent,
    endpoint,
  };

  // Get and print out a list of all the endpoints for this resource
  const [response] = await endpointServiceClient.createEndpoint(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Create endpoint response');
  console.log(`\tName : ${result.name}`);
  console.log(`\tDisplay name : ${result.displayName}`);
  console.log(`\tDescription : ${result.description}`);
  console.log(`\tLabels : ${JSON.stringify(result.labels)}`);
  console.log(`\tCreate time : ${JSON.stringify(result.createTime)}`);
  console.log(`\tUpdate time : ${JSON.stringify(result.updateTime)}`);
}
createEndpoint();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Vertex AI SDK for Python API 參考說明文件。

def create_endpoint_sample(
    project: str,
    display_name: str,
    location: str,
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint.create(
        display_name=display_name,
        project=project,
        location=location,
    )

    print(endpoint.display_name)
    print(endpoint.resource_name)
    return endpoint

取得端點 ID

您需要端點 ID 才能部署模型。

gcloud

以下範例使用 gcloud ai endpoints list 指令：

  gcloud ai endpoints list \
    --region=LOCATION \
    --filter=display_name=ENDPOINT_NAME

更改下列內容：

LOCATION_ID：您使用 Vertex AI 的區域。
ENDPOINT_NAME：端點的顯示名稱。

請注意「ENDPOINT_ID」欄中的數字。在下一個步驟中使用這個 ID。

REST

使用任何要求資料之前，請先替換以下項目：

LOCATION_ID：您使用 Vertex AI 的區域。
PROJECT_ID：您的專案 ID。
ENDPOINT_NAME：端點的顯示名稱。

HTTP 方法和網址：

GET https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME

如要傳送要求，請展開以下其中一個選項：

curl (Linux、macOS 或 Cloud Shell)

執行下列指令：

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME"

PowerShell (Windows)

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

{
  "endpoints": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID",
      "displayName": "ENDPOINT_NAME",
      "etag": "AMEw9yPz5pf4PwBHbRWOGh0PcAxUdjbdX2Jm3QO_amguy3DbZGP5Oi_YUKRywIE-BtLx",
      "createTime": "2020-04-17T18:31:11.585169Z",
      "updateTime": "2020-04-17T18:35:08.568959Z"
    }
  ]
}

請注意 ENDPOINT_ID。

部署模型

請選取下方對應您語言或環境的分頁：

gcloud

以下範例使用 gcloud ai endpoints deploy-model 指令。

以下範例會將 Model 部署至 Endpoint，但不會使用 GPU 加速預測服務，也不會在多個 DeployedModel 資源之間分割流量：

使用下列任何指令資料之前，請先替換以下項目：

ENDPOINT_ID：端點的 ID。
LOCATION_ID：您使用 Vertex AI 的區域。
MODEL_ID：要部署的模型 ID。
DEPLOYED_MODEL_NAME：DeployedModel 的名稱。您也可以使用 Model 的顯示名稱來命名 DeployedModel。
MACHINE_TYPE：選用。此部署作業的每個節點使用的機器資源。預設為 n1-standard-2。進一步瞭解機器類型。
MIN_REPLICA_COUNT：此部署作業的節點數量下限。節點數量可視推論負載需求增加或減少，但不得超過節點數量上限，也不能少於這個數量。這個值必須大於或等於 1。如果省略 --min-replica-count 標記，值預設為 1。
MAX_REPLICA_COUNT：此部署作業的節點數量上限。節點數量可視推論負載需求增加或減少，但不得超過這個數量，也不能少於節點數量下限。如果省略 --max-replica-count 標記，節點數量上限會設為 --min-replica-count 的值。

執行 gcloud ai endpoints deploy-model 指令：

Linux、macOS 或 Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID\
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --machine-type=MACHINE_TYPE \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --traffic-split=0=100

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID`
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME `
  --machine-type=MACHINE_TYPE `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --traffic-split=0=100

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME ^
  --machine-type=MACHINE_TYPE ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --traffic-split=0=100

流量分配

上述範例中的 --traffic-split=0=100 標記會將 Endpoint 收到的預測流量 100% 傳送至新的 DeployedModel，並以臨時 ID 0 表示。如果您的 Endpoint 已包含其他 DeployedModel 資源，您可以將流量分配給新 DeployedModel 和舊 DeployedModel。例如，如要將 20% 的流量傳送至新的 DeployedModel，並將 80% 的流量傳送至較舊的 DeployedModel，請執行下列指令。

使用下列任何指令資料之前，請先替換以下項目：

OLD_DEPLOYED_MODEL_ID：現有 DeployedModel 的 ID。

執行 gcloud ai endpoints deploy-model 指令：

Linux、macOS 或 Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID\
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \ 
  --machine-type=MACHINE_TYPE \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID`
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME \ 
  --machine-type=MACHINE_TYPE `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME \ 
  --machine-type=MACHINE_TYPE ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

REST

您可以使用 endpoints.predict 方法要求線上推論。

部署模型。

使用任何要求資料之前，請先替換以下項目：

LOCATION_ID：您使用 Vertex AI 的區域。
PROJECT_ID：您的專案 ID。
ENDPOINT_ID：端點的 ID。
MODEL_ID：要部署的模型 ID。
DEPLOYED_MODEL_NAME：DeployedModel 的名稱。您也可以使用 Model 的顯示名稱來命名 DeployedModel。
MACHINE_TYPE：選用。此部署作業的每個節點使用的機器資源。預設為 n1-standard-2。進一步瞭解機器類型。
ACCELERATOR_TYPE：要連結至機器的加速器類型。如果未指定 ACCELERATOR_COUNT 或 ACCELERATOR_COUNT 為零，則為選用項目。不建議用於使用非 GPU 圖片的 AutoML 模型或自訂訓練模型。瞭解詳情。
ACCELERATOR_COUNT：每個備援資料庫可使用的加速器數量。(選用步驟) 如果是使用非 GPU 圖片的 AutoML 模型或自訂訓練模型，則應為零或未指定。
MIN_REPLICA_COUNT：此部署作業的節點數量下限。節點數量可視推論負載需求增加或減少，但不得超過節點數量上限，也不能少於這個數量。這個值必須大於或等於 1。
MAX_REPLICA_COUNT：此部署作業的節點數量上限。節點數量可視推論負載需求增加或減少，但不得超過這個數量，也不能少於節點數量下限。
REQUIRED_REPLICA_COUNT：選用。這項部署作業必須達到節點數量，才能標示為成功。必須大於或等於 1，且小於或等於節點數下限。如未指定，則預設值為節點數量下限。
TRAFFIC_SPLIT_THIS_MODEL：傳送至此端點的預測流量百分比，會路由至透過此作業部署的模型。預設值為 100。所有流量百分比的總和必須為 100。進一步瞭解流量分配。
DEPLOYED_MODEL_ID_N：選用。如果其他模型已部署至這個端點，您必須更新其流量分配百分比，讓所有百分比加總為 100。
TRAFFIC_SPLIT_MODEL_N：已部署模型 ID 鍵的流量分配百分比值。
PROJECT_NUMBER：系統自動產生的專案編號

HTTP 方法和網址：

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel

JSON 要求主體：

{
  "deployedModel": {
    "model": "projects/PROJECT/locations/us-central1/models/MODEL_ID",
    "displayName": "DEPLOYED_MODEL_NAME",
    "dedicatedResources": {
       "machineSpec": {
         "machineType": "MACHINE_TYPE",
         "acceleratorType": "ACCELERATOR_TYPE",
         "acceleratorCount": "ACCELERATOR_COUNT"
       },
       "minReplicaCount": MIN_REPLICA_COUNT,
       "maxReplicaCount": MAX_REPLICA_COUNT,
       "requiredReplicaCount": REQUIRED_REPLICA_COUNT
     },
  },
  "trafficSplit": {
    "0": TRAFFIC_SPLIT_THIS_MODEL,
    "DEPLOYED_MODEL_ID_1": TRAFFIC_SPLIT_MODEL_1,
    "DEPLOYED_MODEL_ID_2": TRAFFIC_SPLIT_MODEL_2
  },
}

如要傳送要求，請展開以下其中一個選項：

curl (Linux、macOS 或 Cloud Shell)

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel"

PowerShell (Windows)

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

{
  "name": "projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployModelOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-10-19T17:53:16.502088Z",
      "updateTime": "2020-10-19T17:53:16.502088Z"
    }
  }
}

Java

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.DedicatedResources;
import com.google.cloud.aiplatform.v1.DeployModelOperationMetadata;
import com.google.cloud.aiplatform.v1.DeployModelResponse;
import com.google.cloud.aiplatform.v1.DeployedModel;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.MachineSpec;
import com.google.cloud.aiplatform.v1.ModelName;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;

public class DeployModelCustomTrainedModelSample {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "PROJECT";
    String endpointId = "ENDPOINT_ID";
    String modelName = "MODEL_NAME";
    String deployedModelDisplayName = "DEPLOYED_MODEL_DISPLAY_NAME";
    deployModelCustomTrainedModelSample(project, endpointId, modelName, deployedModelDisplayName);
  }

  static void deployModelCustomTrainedModelSample(
      String project, String endpointId, String model, String deployedModelDisplayName)
      throws IOException, ExecutionException, InterruptedException {
    EndpointServiceSettings settings =
        EndpointServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();
    String location = "us-central1";

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient client = EndpointServiceClient.create(settings)) {
      MachineSpec machineSpec = MachineSpec.newBuilder().setMachineType("n1-standard-2").build();
      DedicatedResources dedicatedResources =
          DedicatedResources.newBuilder().setMinReplicaCount(1).setMachineSpec(machineSpec).build();

      String modelName = ModelName.of(project, location, model).toString();
      DeployedModel deployedModel =
          DeployedModel.newBuilder()
              .setModel(modelName)
              .setDisplayName(deployedModelDisplayName)
              // `dedicated_resources` must be used for non-AutoML models
              .setDedicatedResources(dedicatedResources)
              .build();
      // key '0' assigns traffic for the newly deployed model
      // Traffic percentage values must add up to 100
      // Leave dictionary empty if endpoint should not accept any traffic
      Map<String, Integer> trafficSplit = new HashMap<>();
      trafficSplit.put("0", 100);
      EndpointName endpoint = EndpointName.of(project, location, endpointId);
      OperationFuture<DeployModelResponse, DeployModelOperationMetadata> response =
          client.deployModelAsync(endpoint, deployedModel, trafficSplit);

      // You can use OperationFuture.getInitialFuture to get a future representing the initial
      // response to the request, which contains information while the operation is in progress.
      System.out.format("Operation name: %s\n", response.getInitialFuture().get().getName());

      // OperationFuture.get() will block until the operation is finished.
      DeployModelResponse deployModelResponse = response.get();
      System.out.format("deployModelResponse: %s\n", deployModelResponse);
    }
  }
}

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Vertex AI SDK for Python API 參考說明文件。

def deploy_model_with_dedicated_resources_sample(
    project,
    location,
    model_name: str,
    machine_type: str,
    endpoint: Optional[aiplatform.Endpoint] = None,
    deployed_model_display_name: Optional[str] = None,
    traffic_percentage: Optional[int] = 0,
    traffic_split: Optional[Dict[str, int]] = None,
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    accelerator_type: Optional[str] = None,
    accelerator_count: Optional[int] = None,
    explanation_metadata: Optional[explain.ExplanationMetadata] = None,
    explanation_parameters: Optional[explain.ExplanationParameters] = None,
    metadata: Optional[Sequence[Tuple[str, str]]] = (),
    sync: bool = True,
):
    """
    model_name: A fully-qualified model resource name or model ID.
          Example: "projects/123/locations/us-central1/models/456" or
          "456" when project and location are initialized or passed.
    """

    aiplatform.init(project=project, location=location)

    model = aiplatform.Model(model_name=model_name)

    # The explanation_metadata and explanation_parameters should only be
    # provided for a custom trained model and not an AutoML model.
    model.deploy(
        endpoint=endpoint,
        deployed_model_display_name=deployed_model_display_name,
        traffic_percentage=traffic_percentage,
        traffic_split=traffic_split,
        machine_type=machine_type,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
        accelerator_type=accelerator_type,
        accelerator_count=accelerator_count,
        explanation_metadata=explanation_metadata,
        explanation_parameters=explanation_parameters,
        metadata=metadata,
        sync=sync,
    )

    model.wait()

    print(model.display_name)
    print(model.resource_name)
    return model

Node.js

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

const automl = require('@google-cloud/automl');
const client = new automl.v1beta1.AutoMlClient();

/**
 * Demonstrates using the AutoML client to create a model.
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const projectId = '[PROJECT_ID]' e.g., "my-gcloud-project";
// const computeRegion = '[REGION_NAME]' e.g., "us-central1";
// const datasetId = '[DATASET_ID]' e.g., "TBL2246891593778855936";
// const tableId = '[TABLE_ID]' e.g., "1991013247762825216";
// const columnId = '[COLUMN_ID]' e.g., "773141392279994368";
// const modelName = '[MODEL_NAME]' e.g., "testModel";
// const trainBudget = '[TRAIN_BUDGET]' e.g., "1000",
// `Train budget in milli node hours`;

// A resource that represents Google Cloud Platform location.
const projectLocation = client.locationPath(projectId, computeRegion);

// Get the full path of the column.
const columnSpecId = client.columnSpecPath(
  projectId,
  computeRegion,
  datasetId,
  tableId,
  columnId
);

// Set target column to train the model.
const targetColumnSpec = {name: columnSpecId};

// Set tables model metadata.
const tablesModelMetadata = {
  targetColumnSpec: targetColumnSpec,
  trainBudgetMilliNodeHours: trainBudget,
};

// Set datasetId, model name and model metadata for the dataset.
const myModel = {
  datasetId: datasetId,
  displayName: modelName,
  tablesModelMetadata: tablesModelMetadata,
};

// Create a model with the model metadata in the region.
client
  .createModel({parent: projectLocation, model: myModel})
  .then(responses => {
    const initialApiResponse = responses[1];
    console.log(`Training operation name: ${initialApiResponse.name}`);
    console.log('Training started...');
  })
  .catch(err => {
    console.error(err);
  });

瞭解如何變更推論記錄的預設設定。

取得作業狀態

部分要求會啟動需要時間才能完成的長時間作業。這些要求會傳回作業名稱，您可以使用該名稱查看作業狀態或取消作業。Vertex AI 提供輔助方法，可針對長時間執行的作業進行呼叫。詳情請參閱「處理長時間執行作業」。

使用已部署的模型取得線上推論結果

如要進行線上推論，請將一或多個測試項目提交給模型進行分析，模型會根據模型的目標傳回結果。使用 Google Cloud 控制台或 Vertex AI API 要求線上推論。

Google Cloud 控制台

在 Google Cloud 控制台的 Vertex AI 專區中，前往「Models」頁面。

前往「Models」(模型) 頁面
在模型清單中，按一下要要求推論的模型名稱。
選取「Deploy & test」分頁標籤。
在「Test your model」部分下方，新增測試項目來要求推論。系統會為您填入基準推論資料，您也可以輸入自己的推論資料，然後按一下「預測」。

推論完成後，Vertex AI 會在控制台中傳回結果。

API：分類

gcloud

建立名為 request.json 的檔案，並在其中加入下列內容：
```
      {
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ]
}
    
```
取代下列內容：
- PREDICTION_DATA_ROW：JSON 物件，其中鍵為功能名稱，值為相應的功能值。舉例來說，如果資料集包含數字、字串陣列和類別，資料列可能會像以下範例要求一樣：
```
"length":3.6,
"material":"cotton",
"tag_array": ["abc","def"]
```
  您必須為訓練中包含的每個特徵提供值。用於預測的資料格式必須與訓練時使用的格式相符。詳情請參閱「預測資料格式」。
執行下列指令：
```
gcloud ai endpoints predict ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
更改下列內容：
- ENDPOINT_ID：端點的 ID。
- LOCATION_ID：您使用 Vertex AI 的區域。

REST

您可以使用 endpoints.predict 方法要求線上推論。

使用任何要求資料之前，請先替換以下項目：

LOCATION_ID：端點所在的區域。例如：us-central1。
PROJECT_ID：您的專案 ID。
ENDPOINT_ID：端點的 ID。
PREDICTION_DATA_ROW：JSON 物件，其中鍵為功能名稱，值為相應的功能值。舉例來說，如果資料集包含數字、字串陣列和類別，資料列可能會像以下範例要求一樣：
```
"length":3.6,
"material":"cotton",
"tag_array": ["abc","def"]
```
您必須為訓練中包含的每個特徵提供值。用於預測的資料格式必須與訓練時使用的格式相符。詳情請參閱「預測資料格式」。
DEPLOYED_MODEL_ID：由 predict 方法輸出，並由 explain 方法接受做為輸入內容。用於產生推論的模型 ID。如果您需要針對先前要求的推論要求說明，且已部署多個模型，您可以使用這個 ID 確保系統會針對先前提供推論的相同模型傳回說明。

HTTP 方法和網址：

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict

JSON 要求主體：

{
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ]
}

如要傳送要求，請選擇以下其中一個選項：

curl

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

PowerShell

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：

   {
     "predictions": [
      {
         "scores": [
           0.96771615743637085,
           0.032283786684274673
         ],
         "classes": [
           "0",
           "1"
         ]
      }
     ]
     "deployedModelId": "2429510197"
   }

Java

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


import com.google.cloud.aiplatform.util.ValueConverter;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.cloud.aiplatform.v1.schema.predict.prediction.TabularClassificationPredictionResult;
import com.google.protobuf.ListValue;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.List;

public class PredictTabularClassificationSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String instance = "[{ “feature_column_a”: “value”, “feature_column_b”: “value”}]";
    String endpointId = "YOUR_ENDPOINT_ID";
    predictTabularClassification(instance, project, endpointId);
  }

  static void predictTabularClassification(String instance, String project, String endpointId)
      throws IOException {
    PredictionServicPredictionServiceSettingsceSettings =
        PredictionServicPredictionServiceSettings          .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServicPredictionServiceClientceClient =
        PredictionServicPredictionServiceClientonServiceSettings)) {
      String location = "us-central1";
      EndpointName endEndpointNameEndpointName.of(EndpointNameation, endpointId);

      ListValue.BuildeListValueue = ListValue.newBuiListValue     JsonFormat.parseJsonFormatinstance, listValue);
      List<Value> instanListValuelistValue.getValuesList();

      Value parametersValuelue.newBuilderValuetListValue(listValue).build();
      PredictResponse PredictResponse =
          predictionServiceClient.predict(endpointName, instanceList, parameters);
      System.out.println("Predict Tabular Classification Response");
      System.out.format("\tDeployed Model Id: %s\n", predictResponse.predictResponse.getDeployedModelId().out.println("Predictions");
      for (Value predictionValueedictResponse.predictResponse.getPredictionsList()larClassificTabularClassificationPredictionResultuilder =
            TabularClassificTabularClassificationPredictionResult       TabularClassificTabularClassificationPredictionResult      (TabularClassificTabularClassificationPredictionResult  ValueConverter.fValueConvertertBuilder, prediction);

        for (int i = 0; i < result.getClasseresult.getClassesCount()   System.out.printf("\tClass: %s", result.getClasseresult.getClasses(i)tem.out.printf("\tScore: %f", result.getScoresresult.getScores(i)   }
    }
  }
}

Node.js

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const endpointId = 'YOUR_ENDPOINT_ID';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');
const {prediction} =
  aiplatform.protos.google.cloud.aiplatform.v1.schema.predict;

// Imports the Google Cloud Prediction service client
const {PredictionServiceClient} = aiplatform.v1;

// Import the helper module for converting arbitrary protobuf.Value objects.
const {helpers} = aiplatform;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function predictTablesClassification() {
  // Configure the endpoint resource
  const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
  const parameters = helpers.toValue({});

  const instance = helpers.toValue({
    petal_length: '1.4',
    petal_width: '1.3',
    sepal_length: '5.1',
    sepal_width: '2.8',
  });

  const instances = [instance];
  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);

  console.log('Predict tabular classification response');
  console.log(`\tDeployed model id : ${response.deployedModelId}\n`);
  const predictions = response.predictions;
  console.log('Predictions :');
  for (const predictionResultVal of predictions) {
    const predictionResultObj =
      prediction.TabularClassificationPredictionResult.fromValue(
        predictionResultVal
      );
    for (const [i, class_] of predictionResultObj.classes.entries()) {
      console.log(`\tClass: ${class_}`);
      console.log(`\tScore: ${predictionResultObj.scores[i]}\n\n`);
    }
  }
}
predictTablesClassification();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Vertex AI SDK for Python API 參考說明文件。

def predict_tabular_classification_sample(
    project: str,
    location: str,
    endpoint_name: str,
    instances: List[Dict],
):
    """
    Args
        project: Your project ID or project number.
        location: Region where Endpoint is located. For example, 'us-central1'.
        endpoint_name: A fully qualified endpoint name or endpoint ID. Example: "projects/123/locations/us-central1/endpoints/456" or
               "456" when project and location are initialized or passed.
        instances: A list of one or more instances (examples) to return a prediction for.
    """
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_name)

    response = endpoint.predict(instances=instances)

    for prediction_ in response.predictions:
        print(prediction_)

API：迴歸

gcloud

建立名為 `request.json` 的檔案，其中含有下列內容：
```
      {
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ]
}
    
```
取代下列內容：
- PREDICTION_DATA_ROW：JSON 物件，其中鍵為功能名稱，值為相應的功能值。舉例來說，如果資料集包含數字、數字陣列和類別，資料列可能會像以下範例要求一樣：
```
"age":3.6,
"sq_ft":5392,
"code": "90331"
```
  您必須為訓練中包含的每個特徵提供值。用於預測的資料格式必須與用於訓練的格式相符。詳情請參閱「預測資料格式」。
執行下列指令：
```
gcloud ai endpoints predict ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
更改下列內容：
- ENDPOINT_ID：端點的 ID。
- LOCATION_ID：您使用 Vertex AI 的區域。

REST

您可以使用 endpoints.predict 方法要求線上推論。

使用任何要求資料之前，請先替換以下項目：

LOCATION_ID：端點所在的區域。例如：us-central1。
PROJECT_ID：您的專案 ID。
ENDPOINT_ID：端點的 ID。
PREDICTION_DATA_ROW：JSON 物件，其中鍵為功能名稱，值為相應的功能值。舉例來說，如果資料集包含數字、數字陣列和類別，資料列可能會像以下範例要求一樣：
```
"age":3.6,
"sq_ft":5392,
"code": "90331"
```
您必須為訓練中包含的每個特徵提供值。用於預測的資料格式必須與用於訓練的格式相符。詳情請參閱「預測資料格式」。
DEPLOYED_MODEL_ID：由 predict 方法輸出，並由 explain 方法接受做為輸入內容。用於產生推論的模型 ID。如果您需要針對先前要求的推論要求說明，且已部署多個模型，您可以使用這個 ID 確保系統會針對先前提供推論的相同模型傳回說明。

HTTP 方法和網址：

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict

JSON 要求主體：

{
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ]
}

如要傳送要求，請選擇以下其中一個選項：

curl

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

PowerShell

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict" | Select-Object -Expand Content

您應該會收到如下的 JSON 回應：


{
  "predictions": [
    [
      {
        "value": 65.14233,
        "lower_bound": 4.6572,
        "upper_bound": 164.0279
      }
    ]
  ],
  "deployedModelId": "DEPLOYED_MODEL_ID"
}

Java

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Java。詳情請參閱 Vertex AI Java API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。


import com.google.cloud.aiplatform.util.ValueConverter;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.cloud.aiplatform.v1.schema.predict.prediction.TabularRegressionPredictionResult;
import com.google.protobuf.ListValue;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.List;

public class PredictTabularRegressionSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String instance = "[{ “feature_column_a”: “value”, “feature_column_b”: “value”}]";
    String endpointId = "YOUR_ENDPOINT_ID";
    predictTabularRegression(instance, project, endpointId);
  }

  static void predictTabularRegression(String instance, String project, String endpointId)
      throws IOException {
    PredictionServicPredictionServiceSettingsceSettings =
        PredictionServicPredictionServiceSettings          .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServicPredictionServiceClientceClient =
        PredictionServicPredictionServiceClientonServiceSettings)) {
      String location = "us-central1";
      EndpointName endEndpointNameEndpointName.of(EndpointNameation, endpointId);

      ListValue.BuildeListValueue = ListValue.newBuiListValue     JsonFormat.parseJsonFormatinstance, listValue);
      List<Value> instanListValuelistValue.getValuesList();

      Value parametersValuelue.newBuilderValuetListValue(listValue).build();
      PredictResponse PredictResponse =
          predictionServiceClient.predict(endpointName, instanceList, parameters);
      System.out.println("Predict Tabular Regression Response");
      System.out.format("\tDisplay Model Id: %s\n", predictResponse.predictResponse.getDeployedModelId().out.println("Predictions");
      for (Value predictionValueedictResponse.predictResponse.getPredictionsList()larRegressioTabularRegressionPredictionResultuilder =
            TabularRegressioTabularRegressionPredictionResult        TabularRegressioTabularRegressionPredictionResult      (TabularRegressioTabularRegressionPredictionResult.fValueConvertertBuilder, prediction);

        System.out.printf("\tUpper bound: %f\n", result.getUpperBresult.getUpperBound()m.out.printf("\tLower bound: %f\n", result.getLowerBresult.getLowerBound()m.out.printf("\tValue: %f\n", result.getValue(result.getValue()
  }
}

Node.js

在試用這個範例之前，請先按照 Vertex AI 快速入門：使用用戶端程式庫中的操作說明設定 Node.js。詳情請參閱 Vertex AI Node.js API 參考說明文件。

如要向 Vertex AI 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證機制」。

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const endpointId = 'YOUR_ENDPOINT_ID';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');
const {prediction} =
  aiplatform.protos.google.cloud.aiplatform.v1.schema.predict;

// Imports the Google Cloud Prediction service client
const {PredictionServiceClient} = aiplatform.v1;

// Import the helper module for converting arbitrary protobuf.Value objects.
const {helpers} = aiplatform;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function predictTablesRegression() {
  // Configure the endpoint resource
  const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
  const parameters = helpers.toValue({});

  // TODO (erschmid): Make this less painful
  const instance = helpers.toValue({
    BOOLEAN_2unique_NULLABLE: false,
    DATETIME_1unique_NULLABLE: '2019-01-01 00:00:00',
    DATE_1unique_NULLABLE: '2019-01-01',
    FLOAT_5000unique_NULLABLE: 1611,
    FLOAT_5000unique_REPEATED: [2320, 1192],
    INTEGER_5000unique_NULLABLE: '8',
    NUMERIC_5000unique_NULLABLE: 16,
    STRING_5000unique_NULLABLE: 'str-2',
    STRUCT_NULLABLE: {
      BOOLEAN_2unique_NULLABLE: false,
      DATE_1unique_NULLABLE: '2019-01-01',
      DATETIME_1unique_NULLABLE: '2019-01-01 00:00:00',
      FLOAT_5000unique_NULLABLE: 1308,
      FLOAT_5000unique_REPEATED: [2323, 1178],
      FLOAT_5000unique_REQUIRED: 3089,
      INTEGER_5000unique_NULLABLE: '1777',
      NUMERIC_5000unique_NULLABLE: 3323,
      TIME_1unique_NULLABLE: '23:59:59.999999',
      STRING_5000unique_NULLABLE: 'str-49',
      TIMESTAMP_1unique_NULLABLE: '1546387199999999',
    },
    TIMESTAMP_1unique_NULLABLE: '1546387199999999',
    TIME_1unique_NULLABLE: '23:59:59.999999',
  });

  const instances = [instance];
  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);

  console.log('Predict tabular regression response');
  console.log(`\tDeployed model id : ${response.deployedModelId}`);
  const predictions = response.predictions;
  console.log('\tPredictions :');
  for (const predictionResultVal of predictions) {
    const predictionResultObj =
      prediction.TabularRegressionPredictionResult.fromValue(
        predictionResultVal
      );
    console.log(`\tUpper bound: ${predictionResultObj.upper_bound}`);
    console.log(`\tLower bound: ${predictionResultObj.lower_bound}`);
    console.log(`\tLower bound: ${predictionResultObj.value}`);
  }
}
predictTablesRegression();

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Vertex AI SDK for Python API 參考說明文件。

def predict_tabular_regression_sample(
    project: str,
    location: str,
    endpoint_name: str,
    instances: List[Dict],
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_name)

    response = endpoint.predict(instances=instances)

    for prediction_ in response.predictions:
        print(prediction_)

解讀預測結果

分類

分類模型會傳回可信度分數。

信心分數會顯示模型將每個類別或標籤與測試項目建立關聯的強度。數字越高，表示模型判斷標籤適用於該項目的信心就越高。您可以自行決定接受模型結果所需的可信度分數門檻。

迴歸

迴歸模型會傳回推論值。對於 BigQuery 目的地，也會傳回推論間隔。推論間隔會提供值範圍，模型有 95% 信心包含實際結果。

使用已部署的模型取得線上說明

您可以要求進行含有說明的推論 (也稱為功能歸因)，瞭解模型如何做出推論。當地特徵重要性值會指出各項特徵對推論結果的影響程度。透過 Vertex Explainable AI，將特徵歸因納入 Vertex AI 推論中。

控制台

當您使用 Google Cloud 主控台要求線上推論時，系統會自動傳回本機特徵重要性值。

如果您使用預先填入的預測值，則本地特徵重要性值都為零。這是因為預先填入的值是基準預測資料，因此傳回的預測值是基準預測值。

gcloud

建立名為 request.json 的檔案，並在當中加入下列內容：
```
{
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ]
}
```
更改下列內容：
- PREDICTION_DATA_ROW：JSON 物件，其中鍵為功能名稱，值為相應的功能值。舉例來說，如果資料集包含數字、字串陣列和類別，資料列可能會像以下範例要求一樣：
```
"length":3.6,
"material":"cotton",
"tag_array": ["abc","def"]
```
  您必須為訓練中包含的每個特徵提供值。用於預測的資料格式必須與訓練時使用的格式相符。詳情請參閱「預測資料格式」。
執行下列指令：
```
gcloud ai endpoints explain ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
更改下列內容：
- ENDPOINT_ID：端點的 ID。
- LOCATION_ID：您使用 Vertex AI 的區域。
如要將說明要求傳送至 Endpoint 上的特定 DeployedModel，您可以選擇指定 --deployed-model-id 標記：
```
gcloud ai endpoints explain ENDPOINT_ID \
  --region=LOCATION \
  --deployed-model-id=DEPLOYED_MODEL_ID \
  --json-request=request.json
```
除了先前說明的預留位置之外，請替換下列項目：
- DEPLOYED_MODEL_ID 選用：您要取得說明的已部署模型 ID。這個 ID 會包含在 predict 方法的回應中。如果您需要針對特定模型要求說明，且有多個模型已部署至同一個端點，您可以使用這個 ID 確保系統會針對該特定模型傳回說明。

REST

以下範例顯示表格分類模型的線上推論要求，其中包含局部特徵歸因。迴歸模型的要求格式相同。

使用任何要求資料之前，請先替換以下項目：

LOCATION：端點所在的區域。例如：us-central1。
PROJECT：您的專案 ID。
ENDPOINT_ID：端點的 ID。
PREDICTION_DATA_ROW：JSON 物件，其中鍵為功能名稱，值為相應的功能值。舉例來說，如果資料集包含數字、字串陣列和類別，資料列可能會像以下範例要求一樣：
```
"length":3.6,
"material":"cotton",
"tag_array": ["abc","def"]
```
您必須為訓練中包含的每個特徵提供值。用於預測的資料格式必須與訓練時使用的格式相符。詳情請參閱「預測資料格式」。
DEPLOYED_MODEL_ID (選用)：您要取得解釋的已部署模型 ID。這個 ID 會包含在 predict 方法的回應中。如果您需要針對特定模型要求說明，且有多個模型已部署至同一個端點，您可以使用這個 ID 確保系統會針對該特定模型傳回說明。

HTTP 方法和網址：

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/endpoints/ENDPOINT_ID:explain

JSON 要求主體：

{
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ],
  "deployedModelId": "DEPLOYED_MODEL_ID"
}

如要傳送要求，請選擇以下其中一個選項：

curl

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/endpoints/ENDPOINT_ID:explain"

PowerShell

注意：下列指令假設您已透過執行 gcloud init 或 gcloud auth login 登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/endpoints/ENDPOINT_ID:explain" | Select-Object -Expand Content

Python 適用的 Vertex AI SDK

如要瞭解如何安裝或更新 Python 適用的 Vertex AI SDK，請參閱「安裝 Python 適用的 Vertex AI SDK」。詳情請參閱 Vertex AI SDK for Python API 參考說明文件。

def explain_sample(project: str, location: str, endpoint_id: str, instance_dict: Dict):

    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_id)

    response = endpoint.explain(instances=[instance_dict], parameters={})

    for explanation in response.explanations:
        print(" explanation")
        # Feature attributions.
        attributions = explanation.attributions
        for attribution in attributions:
            print("  attribution")
            print("   baseline_output_value:", attribution.baseline_output_value)
            print("   instance_output_value:", attribution.instance_output_value)
            print("   output_display_name:", attribution.output_display_name)
            print("   approximation_error:", attribution.approximation_error)
            print("   output_name:", attribution.output_name)
            output_index = attribution.output_index
            for output_index in output_index:
                print("   output_index:", output_index)

    for prediction in response.predictions:
        print(prediction)

取得先前傳回預測結果的說明

由於說明會增加資源使用量，因此建議您只在需要時才要求說明。有時，針對您已收到的推論結果要求說明，可能會有所幫助，因為推論結果可能是異常值或不合理。

如果所有推論都來自同一個模型，您只需重新傳送要求資料，並要求提供說明。不過，如果有多個模型傳回推論，您必須確保將說明要求傳送至正確的模型。您可以在要求中加入已部署模型的 ID deployedModelID，藉此查看特定模型的說明。這項 ID 會包含在原始推論要求的回應中。請注意，部署的模型 ID 與模型 ID 不同。

解讀說明結果

如要計算本地特徵的重要性，系統會先計算基準推論分數。基準值會根據訓練資料計算，其中數值特徵使用中位數值，類別特徵則使用模式。從基準值產生的推論就是基準推論分數。基準值會針對模型計算一次，且不會變更。

針對特定推論，每項特徵的區域特徵重要性會指出，與基準推論分數相比，該特徵對結果的加分或減分幅度。所有特徵重要性值的總和，等於基準推論分數與推論結果之間的差異。

對於分類模型，分數一律會介於 0.0 和 1.0 之間 (含頭尾)。因此，分類模型的本地特徵重要性值一律會介於 -1.0 和 1.0 (含) 之間。

如需功能歸因查詢的範例和詳細資訊，請參閱「分類和迴歸的功能歸因」。

推論和解釋的輸出內容範例

分類

從表格分類模型 (含特徵重要性) 進行線上推論的酬載，與下列範例類似。

0.928652400970459 的 instanceOutputValue 是最高分類別的信心分數，在本例中為 class_a。baselineOutputValue 欄位包含基準推論分數 0.808652400970459。這項功能對這項結果的影響最大。feature_3

{
"predictions": [
  {
    "scores": [
      0.928652400970459,
      0.071347599029541
    ],
    "classes": [
      "class_a",
      "class_b"
    ]
  }
]
"explanations": [
  {
    "attributions": [
      {
        "baselineOutputValue": 0.808652400970459,
        "instanceOutputValue": 0.928652400970459,
        "approximationError":  0.0058915703929231,
        "featureAttributions": {
          "feature_1": 0.012394922231235,
          "feature_2": 0.050212341234556,
          "feature_3": 0.057392736534209,
        },
        "outputIndex": [
          0
        ],
        "outputName": "scores"
      }
    ],
  }
]
"deployedModelId": "234567"
}

迴歸

使用表格迴歸模型的特徵重要性，線上推論的酬載會類似下列範例。

1795.1246466281819 的 instanceOutputValue 是預測值，lower_bound 和 upper_bound 欄位則提供 95% 信賴區間。baselineOutputValue 欄位包含基準推論分數 1788.7423095703125。這項功能對這項結果的影響最大。feature_3

{
"predictions": [
  {
    "value": 1795.1246466281819,
    "lower_bound": 246.32196807861328,
    "upper_bound": 8677.51904296875
  }
]
"explanations": [
  {
    "attributions": [
      {
        "baselineOutputValue": 1788.7423095703125,
        "instanceOutputValue": 1795.1246466281819,
        "approximationError": 0.0038215703911553,
        "featureAttributions": {
          "feature_1": 0.123949222312359,
          "feature_2": 0.802123412345569,
          "feature_3": 5.456264423211472,
        },
        "outputIndex": [
          -1
        ]
      }
    ]
  }
],
"deployedModelId": "345678"
}

後續步驟

瞭解如何匯出模型。

取得線上推論和說明 透過集合功能整理內容 你可以依據偏好儲存及分類內容。

事前準備

將模型部署至端點

Google Cloud 控制台

API

建立端點

gcloud

REST

curl (Linux、macOS 或 Cloud Shell)

PowerShell (Windows)

Java

Node.js

Python 適用的 Vertex AI SDK

取得端點 ID

gcloud

REST

curl (Linux、macOS 或 Cloud Shell)

PowerShell (Windows)

部署模型

gcloud

Linux、macOS 或 Cloud Shell

Windows (PowerShell)

Windows (cmd.exe)

流量分配

Linux、macOS 或 Cloud Shell

Windows (PowerShell)

Windows (cmd.exe)

REST

curl (Linux、macOS 或 Cloud Shell)

PowerShell (Windows)

Java

Python 適用的 Vertex AI SDK

Node.js

取得作業狀態

使用已部署的模型取得線上推論結果

Google Cloud 控制台

API：分類

gcloud

REST

curl

PowerShell

Java

Node.js

Python 適用的 Vertex AI SDK

API：迴歸

gcloud

REST

curl

PowerShell

Java

Node.js

Python 適用的 Vertex AI SDK

解讀預測結果

分類

迴歸

使用已部署的模型取得線上說明

控制台

gcloud

REST

curl

PowerShell

Python 適用的 Vertex AI SDK

解讀說明結果

推論和解釋的輸出內容範例

分類

迴歸

後續步驟

取得線上推論和說明