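Get online inferences from a custom trained model

This page shows you how to get online (real-time) inferences from your custom trained models by using the Google Cloud CLI or the Vertex AI API.

Format your input for online inference

This section shows how to format and encode your inference input instances as JSON, which is required if you use the predict or explain method. You can skip this step if you use the rawPredict method. For guidance on which method to choose, see Send a request to an endpoint.

If you use the Vertex AI SDK for Python to send inference requests, specify the list of instances without the instances field. For example, specify [ ["the","quick","brown"], ... ] instead of { "instances": [ ["the","quick","brown"], ... ] }.

If your model uses a custom container, your input must be formatted as JSON, and there is an additional parameters field that can be used for your container. Learn more about formatting inference input with custom containers.

Format instances as JSON strings

The basic format for online inference is a list of data instances. These can be either plain lists of values or members of a JSON object, depending on how you configured your inputs in your training application. TensorFlow models can accept more complex inputs, while most scikit-learn and XGBoost models expect a list of numbers as input.

This example shows an input tensor and an instance key for a TensorFlow model: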

 {"values": [1, 2, 3, 4], "key": 1}

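The makeup of the JSON string can be complex as long as it follows these rules:

- The top level of instance data must be a JSON object, that is, a dictionary of key-value pairs.
- Individual values in an instance object can be strings, numbers, or lists. You can't embed JSON objects.
- Lists must contain only items of the same type (including other lists). You can't mix string and numerical values.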

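You pass input instances for online inference as the message body of the projects.locations.endpoints.predict call. Learn more about the request body's formatting requirements.

Make each instance an item in a JSON array, and provide the array as the instances field of a JSON object. For example: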

{"instances": [
  {"values": [1, 2, 3, 4], "key": 1},
  {"values": [5, 6, 7, 8], "key": 2}
]}
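
The wrapping step can be sketched in Python with the standard json module. This is a minimal illustration, not part of any Vertex AI library: the helper name build_request_body is hypothetical, and the instances are the two sample instances shown above.

```python
import json


def build_request_body(instances):
    """Wrap a list of instances in the top-level "instances" field.

    Hypothetical helper for illustration only.
    """
    if not isinstance(instances, list):
        raise TypeError("instances must be a list")
    return json.dumps({"instances": instances})


# The two sample instances from the example above.
body = build_request_body([
    {"values": [1, 2, 3, 4], "key": 1},
    {"values": [5, 6, 7, 8], "key": 2},
])
print(body)
```

The resulting string is what you would save as the request file or send as the message body.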

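Encode binary data for inference input

Binary data can't be formatted as the UTF-8 encoded strings that JSON supports. If your inputs contain binary data, you must use base64 encoding to represent it. The following special formatting is required:

Your encoded string must be formatted as a JSON object with a single key named b64. In Python 3, base64 encoding outputs a byte sequence. You must convert this sequence to a string to make it JSON serializable: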
```
{'image_bytes': {'b64': base64.b64encode(jpeg_data).decode()}}
```
In your TensorFlow model code, you must name the aliases for your binary input and output tensors so that they end with "_bytes".
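
As a fuller sketch of the same idea, the following Python snippet encodes some bytes, places them under a tensor alias that ends with _bytes, and confirms that the payload survives a JSON round trip. The helper name encode_binary_input and the byte string are made up for illustration; the byte string is not a real image.

```python
import base64
import json


def encode_binary_input(raw_bytes):
    """Return the {"b64": ...} object for a binary value (hypothetical helper)."""
    # b64encode returns bytes in Python 3, so decode to str for JSON.
    return {"b64": base64.b64encode(raw_bytes).decode("utf-8")}


# Placeholder bytes standing in for real JPEG data.
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"

# The alias ends with "_bytes", as required for binary tensors.
instance = {"image_bytes": encode_binary_input(fake_jpeg)}
body = json.dumps({"instances": [instance]})

# Decoding the b64 field gives back the original bytes.
decoded = base64.b64decode(json.loads(body)["instances"][0]["image_bytes"]["b64"])
print(decoded == fake_jpeg)
```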

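Request and response examples

This section describes the format of the inference request body and of the response body, with examples for TensorFlow, scikit-learn, and XGBoost.

Request body details

TensorFlow

The request body contains data with the following structure (JSON representation):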

{
  "instances": [
    <value>|<simple/nested list>|<object>,
    ...
  ]
}

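The instances[] object is required, and must contain the list of instances to get inferences for.

The structure of each element of the instances list is determined by your model's input definition. Instances can include named inputs (as objects) or can contain only unlabeled values.

Not all data includes named inputs. Some instances are simple JSON values (boolean, number, or string). However, instances are often lists of simple values or complex nested lists.

Below are some examples of request bodies.

CSV data with each row encoded as a string value: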

{"instances": ["1.0,true,\"x\"", "-2.0,false,\"y\""]}

Plain text:

{"instances": ["the quick brown fox", "the lazy dog"]}

Sentences encoded as lists of words (vectors of strings):

{
  "instances": [
    ["the","quick","brown"],
    ["the","lazy","dog"],
    ...
  ]
}

Floating point scalar values:

{"instances": [0.0, 1.1, 2.2]}

Vectors of integers:

{
  "instances": [
    [0, 1, 2],
    [3, 4, 5],
    ...
  ]
}

Tensors (in this case, two-dimensional tensors):

{
  "instances": [
    [
      [0, 1, 2],
      [3, 4, 5]
    ],
    ...
  ]
}

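Images, which can be represented in different ways. In this encoding scheme, the first two dimensions represent the rows and columns of the image, and the third dimension contains a list (vector) of the R, G, and B values for each pixel: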

{
  "instances": [
    [
      [
        [138, 30, 66],
        [130, 20, 56],
        ...
      ],
      [
        [126, 38, 61],
        [122, 24, 57],
        ...
      ],
      ...
    ],
    ...
  ]
}

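Data encoding

JSON strings must be encoded as UTF-8. To send binary data, you must base64-encode the data and mark it as binary. To mark a JSON string as binary, replace it with a JSON object that has a single attribute named b64: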

{"b64": "..."}

The following example shows two serialized tf.Examples instances, which require base64 encoding (fake data, for illustrative purposes only):

{"instances": [{"b64": "X5ad6u"}, {"b64": "IA9j4nx"}]}

The following example shows two JPEG image byte strings, which require base64 encoding (fake data, for illustrative purposes only):

{"instances": [{"b64": "ASa8asdf"}, {"b64": "JLK7ljk3"}]}

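Multiple input tensors

Some models have an underlying TensorFlow graph that accepts multiple input tensors. In this case, use the names of the JSON name/value pairs to identify the input tensors.

For a graph with input tensor aliases "tag" (string) and "image" (base64-encoded string):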

{
  "instances": [
    {
      "tag": "beach",
      "image": {"b64": "ASa8asdf"}
    },
    {
      "tag": "car",
      "image": {"b64": "JLK7ljk3"}
    }
  ]
}

For a graph with input tensor aliases "tag" (string) and "image" (3-dimensional array of 8-bit integers):

{
  "instances": [
    {
      "tag": "beach",
      "image": [
        [
          [138, 30, 66],
          [130, 20, 56],
          ...
        ],
        [
          [126, 38, 61],
          [122, 24, 57],
          ...
        ],
        ...
      ]
    },
    {
      "tag": "car",
      "image": [
        [
          [255, 0, 102],
          [255, 0, 97],
          ...
        ],
        [
          [254, 1, 101],
          [254, 2, 93],
          ...
        ],
        ...
      ]
    },
    ...
  ]
}

scikit-learn

The request body contains data with the following structure (JSON representation):

{
  "instances": [
    <simple list>,
    ...
  ]
}

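The instances[] object is required, and must contain the list of instances to get inferences for. In the following example, each input instance is a list of floats: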

{
  "instances": [
    [0.0, 1.1, 2.2],
    [3.3, 4.4, 5.5],
    ...
  ]
}

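The dimension of input instances must match what your model expects. For example, if your model requires three features, then the length of each input instance must be 3.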
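
One way to catch a dimension mismatch before sending a request is a small client-side check like the following sketch. The function name check_feature_count and the expected_features argument are hypothetical; the expected count has to come from your own knowledge of the model.

```python
def check_feature_count(instances, expected_features):
    """Raise ValueError if any instance has the wrong number of features.

    Hypothetical client-side helper; expected_features is an assumption
    you supply based on how the model was trained.
    """
    for index, instance in enumerate(instances):
        if len(instance) != expected_features:
            raise ValueError(
                f"instance {index} has {len(instance)} features, "
                f"expected {expected_features}"
            )
    return True


# Both sample instances have three features, so the check passes.
ok = check_feature_count([[0.0, 1.1, 2.2], [3.3, 4.4, 5.5]], 3)
print(ok)
```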

XGBoost

The request body contains data with the following structure (JSON representation):

{
  "instances": [
    <simple list>,
    ...
  ]
}

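The instances[] object is required, and must contain the list of instances to get inferences for. In the following example, each input instance is a list of floats: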

{
  "instances": [
    [0.0, 1.1, 2.2],
    [3.3, 4.4, 5.5],
    ...
  ]
}

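The dimension of input instances must match what your model expects. For example, if your model requires three features, then the length of each input instance must be 3.

Vertex AI doesn't support sparse representation of input instances for XGBoost.

The online inference service interprets zeros and NaNs differently. If the value of a feature is zero, use 0.0 in the corresponding input. If the value of a feature is missing, use "NaN" in the corresponding input.

The following example represents an inference request with a single input instance, where the value of the first feature is 0.0, the value of the second feature is 1.1, and the value of the third feature is missing: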

{"instances": [[0.0, 1.1, "NaN"]]}
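
The zero-versus-missing distinction can be sketched in Python as follows. This assumes that missing features are represented on the client side as None or as a float NaN; the helper name encode_xgboost_instance is hypothetical.

```python
import json
import math


def encode_xgboost_instance(features):
    """Convert one feature list, mapping missing values to the string "NaN".

    Hypothetical helper. Assumes missing values arrive as None or float NaN.
    """
    encoded = []
    for value in features:
        if value is None or (isinstance(value, float) and math.isnan(value)):
            encoded.append("NaN")  # missing feature
        else:
            encoded.append(float(value))  # a real zero stays 0.0
    return encoded


body = json.dumps({"instances": [encode_xgboost_instance([0, 1.1, None])]})
print(body)  # {"instances": [[0.0, 1.1, "NaN"]]}
```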

PyTorch

If your model uses a PyTorch prebuilt container, TorchServe's default handlers expect each instance to be wrapped in a data field. For example:

{
  "instances": [
    { "data": <value> },
    { "data": <value> }
  ]
}
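
A minimal sketch of this wrapping in Python, assuming each value is already JSON serializable. The helper name wrap_for_torchserve is hypothetical.

```python
import json


def wrap_for_torchserve(values):
    """Wrap each value in a "data" field (hypothetical helper)."""
    return {"instances": [{"data": value} for value in values]}


body = json.dumps(wrap_for_torchserve([[0.0, 1.1], [2.2, 3.3]]))
print(body)
```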

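Response body details

If the call is successful, the response body contains one inference entry per instance in the request body, given in the same order: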

{
  "predictions": [
    {
      object
    }
  ],
  "deployedModelId": string
}

If inference fails for any instance, the response body contains no inferences. Instead, it contains a single error entry:

{
  "error": string
}

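The predictions[] object contains the list of inferences, one for each instance in the request.

On error, the error string contains a message describing the problem. If an error occurs while processing any instance, the service returns the error instead of an inference list.

Even though there is one inference per instance, the format of an inference is not directly related to the format of an instance. Inferences take whatever format is specified in the outputs collection defined in the model. The collection of inferences is returned in a JSON list. Each member of the list can be a simple value, a list, or a JSON object of any complexity. If your model has more than one output tensor, each inference is a JSON object containing a name/value pair for each output. The names identify the output aliases in the graph.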
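
Because inferences come back in request order, a client can pair each instance with its inference by position. The following Python sketch shows one way to do this, assuming the response arrives as a JSON string with the predictions and error fields described above. The function name pair_predictions and the sample response text are made up for illustration.

```python
import json


def pair_predictions(instances, response_text):
    """Pair each instance with its inference, or raise on an error response.

    Hypothetical helper for illustration only.
    """
    response = json.loads(response_text)
    if "error" in response:
        # An error response carries no inferences at all.
        raise RuntimeError(response["error"])
    predictions = response["predictions"]
    if len(predictions) != len(instances):
        raise ValueError("expected one inference per instance")
    return list(zip(instances, predictions))


# Illustrative response text, not real service output.
sample_response = '{"predictions": [5, 4, 3], "deployedModelId": "123"}'
pairs = pair_predictions([[0.0], [1.0], [2.0]], sample_response)
print(pairs)
```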

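Response body examples

TensorFlow

The following examples show some possible responses:

A simple set of predictions for three input instances, where each prediction is an integer value: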
```
{"predictions":
   [5, 4, 3],
   "deployedModelId": 123456789012345678
}
```
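A more complex set of predictions, each containing two named values that correspond to output tensors named label and scores. The value of label is the predicted category ("car" or "beach"), and scores contains a list of probabilities for that instance across the possible categories.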
```
{
  "predictions": [
    {
      "label": "beach",
      "scores": [0.1, 0.9]
    },
    {
      "label": "car",
      "scores": [0.75, 0.25]
    }
  ],
  "deployedModelId": 123456789012345678
}
```
A response when there is an error processing an input instance:
```
{"error": "Divide by zero"}
```

scikit-learn

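The following examples show some possible responses:

A simple set of predictions for three input instances, where each prediction is an integer value: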
```
{"predictions":
   [5, 4, 3],
   "deployedModelId": 123456789012345678
}
```
A response when there is an error processing an input instance:
```
{"error": "Divide by zero"}
```

XGBoost

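The following examples show some possible responses:

A simple set of predictions for three input instances, where each prediction is an integer value: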
```
{"predictions":
   [5, 4, 3],
   "deployedModelId": 123456789012345678
}
```
A response when there is an error processing an input instance:
```
{"error": "Divide by zero"}
```

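Send a request to an endpoint

There are three ways to send a request:

Inference request: sends a request to predict to get an online inference.
Raw inference request: sends a request to rawPredict, which lets you use an arbitrary HTTP payload instead of following the guidelines described in the "Format your input for online inference" section of this page. You might want to get raw inferences in the following cases:
- You use a custom container that receives requests and sends responses that differ from the guidelines.
- You need lower latency. rawPredict skips the serialization steps and forwards the request directly to the inference container.
- You serve inferences with NVIDIA Triton.
Explanation request: sends a request to explain. If you have configured your Model for Vertex Explainable AI, you can get online explanations. Online explanation requests have the same format as online inference requests, and they return similar responses. The only difference is that online explanation responses include feature attributions as well as inferences.

Send an online inference request to a dedicated public endpoint

Dedicated endpoints can communicate over both the HTTP and gRPC protocols. For gRPC requests, you must include the x-vertex-ai-endpoint-id header so that the endpoint is identified correctly. The following APIs are available through these dedicated endpoints:

Predict
RawPredict
StreamRawPredict
Chat Completion (Model Garden only)

Dedicated endpoints use a new URL path. You can retrieve this path from the dedicatedEndpointDns field in the REST API, or from Endpoint.dedicated_endpoint_dns in the Vertex AI SDK for Python. You can also construct the endpoint path manually by using the following code: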

f"https://ENDPOINT_ID.LOCATION_ID-PROJECT_NUMBER.prediction.vertexai.goog/v1/projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

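Replace the following:

ENDPOINT_ID: The ID of the endpoint.
LOCATION_ID: The region where you are using Vertex AI.
PROJECT_NUMBER: The project number. This is different from the project ID. You can find the project number on the project's Project Settings page in the Google Cloud console.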
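
The manual construction can be wrapped in a small function, sketched below. The function name dedicated_predict_url and the argument values are hypothetical placeholders; the URL pattern itself is the one shown above.

```python
def dedicated_predict_url(endpoint_id, location_id, project_number):
    """Build the dedicated endpoint predict URL from its three parts.

    Hypothetical helper; it only formats the pattern shown above and
    does not check that the endpoint exists.
    """
    host = f"{endpoint_id}.{location_id}-{project_number}.prediction.vertexai.goog"
    path = (
        f"/v1/projects/{project_number}/locations/{location_id}"
        f"/endpoints/{endpoint_id}:predict"
    )
    return f"https://{host}{path}"


# Placeholder identifiers, not real resources.
url = dedicated_predict_url("1234567890", "us-central1", "987654321")
print(url)
```

Note that the project number, not the project ID, appears in both the host name and the path.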

To send an inference request to a dedicated endpoint by using the Vertex AI SDK for Python, set the use_dedicated_endpoint parameter to True:

endpoint.predict(instances=instances, use_dedicated_endpoint=True)

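Send an online inference request to a shared public endpoint

gcloud

The following example uses the gcloud ai endpoints predict command:

Write the following JSON object to a file in your local environment. The filename doesn't matter, but for this example, name the file request.json.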
```
{
 "instances": INSTANCES
}
```
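Replace the following:
- INSTANCES: A JSON array of instances that you want to get inferences for. The format of each instance depends on which inputs your trained ML model expects. For more information, see "Format your input for online inference".
Run the following command: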
```
gcloud ai endpoints predict ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
Replace the following:
- ENDPOINT_ID: The ID of the endpoint.
- LOCATION_ID: The region where you are using Vertex AI.

REST

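Before using any of the request data, make the following replacements:

LOCATION_ID: The region where you are using Vertex AI.
PROJECT_ID: Your project ID.
ENDPOINT_ID: The ID of the endpoint.
INSTANCES: A JSON array of instances that you want to get inferences for. The format of each instance depends on which inputs your trained ML model expects. For more information, see "Format your input for online inference".

HTTP method and URL: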

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict

Request JSON body:

{
  "instances": INSTANCES
}

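To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or that you are using Cloud Shell, which logs you in to the gcloud CLI automatically. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: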

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

PowerShell

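Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: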

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict" | Select-Object -Expand Content

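If successful, you receive a JSON response similar to the following. In the response, expect these values in place of the placeholders:

PREDICTIONS: A JSON array of predictions, one for each instance that you included in your request body.
DEPLOYED_MODEL_ID: The ID of the DeployedModel that served the predictions.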

{
  "predictions": PREDICTIONS,
  "deployedModelId": "DEPLOYED_MODEL_ID"
}

Java

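Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".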


import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.ListValue;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.List;

public class PredictCustomTrainedModelSample {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String instance = "[{ \"feature_column_a\": \"value\", \"feature_column_b\": \"value\"}]";
    String project = "YOUR_PROJECT_ID";
    String endpointId = "YOUR_ENDPOINT_ID";
    predictCustomTrainedModel(project, endpointId, instance);
  }

  static void predictCustomTrainedModel(String project, String endpointId, String instance)
      throws IOException {
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      String location = "us-central1";
      EndpointName endpointName = EndpointName.of(project, location, endpointId);

      // Parse the JSON array of instances into protobuf Value objects.
      ListValue.Builder listValue = ListValue.newBuilder();
      JsonFormat.parser().merge(instance, listValue);
      List<Value> instanceList = listValue.getValuesList();

      PredictRequest predictRequest =
          PredictRequest.newBuilder()
              .setEndpoint(endpointName.toString())
              .addAllInstances(instanceList)
              .build();
      PredictResponse predictResponse = predictionServiceClient.predict(predictRequest);

      System.out.println("Predict Custom Trained model Response");
      System.out.format("\tDeployed Model Id: %s\n", predictResponse.getDeployedModelId());
      System.out.println("Predictions");
      for (Value prediction : predictResponse.getPredictionsList()) {
        System.out.format("\tPrediction: %s\n", prediction);
      }
    }
  }
}

Node.js

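Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".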

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */

// const filename = "YOUR_PREDICTION_FILE_NAME";
// const endpointId = "YOUR_ENDPOINT_ID";
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const util = require('util');
const {readFile} = require('fs');
const readFileAsync = util.promisify(readFile);

// Imports the Google Cloud Prediction Service Client library
const {PredictionServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function predictCustomTrainedModel() {
  // Configure the parent resource
  const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
  const parameters = {
    structValue: {
      fields: {},
    },
  };
  const instanceDict = await readFileAsync(filename, 'utf8');
  const instanceValue = JSON.parse(instanceDict);
  const instance = {
    structValue: {
      fields: {
        Age: {stringValue: instanceValue['Age']},
        Balance: {stringValue: instanceValue['Balance']},
        Campaign: {stringValue: instanceValue['Campaign']},
        Contact: {stringValue: instanceValue['Contact']},
        Day: {stringValue: instanceValue['Day']},
        Default: {stringValue: instanceValue['Default']},
        Deposit: {stringValue: instanceValue['Deposit']},
        Duration: {stringValue: instanceValue['Duration']},
        Housing: {stringValue: instanceValue['Housing']},
        Job: {stringValue: instanceValue['Job']},
        Loan: {stringValue: instanceValue['Loan']},
        MaritalStatus: {stringValue: instanceValue['MaritalStatus']},
        Month: {stringValue: instanceValue['Month']},
        PDays: {stringValue: instanceValue['PDays']},
        POutcome: {stringValue: instanceValue['POutcome']},
        Previous: {stringValue: instanceValue['Previous']},
      },
    },
  };

  const instances = [instance];
  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);

  console.log('Predict custom trained model response');
  console.log(`\tDeployed model id : ${response.deployedModelId}`);
  const predictions = response.predictions;
  console.log('\tPredictions :');
  for (const prediction of predictions) {
    console.log(`\t\tPrediction : ${JSON.stringify(prediction)}`);
  }
}
predictCustomTrainedModel();

Python

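To learn how to install or update the Vertex AI SDK for Python, see "Install the Vertex AI SDK for Python". For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def endpoint_predict_sample(
    project: str, location: str, instances: list, endpoint: str
):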
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint)

    prediction = endpoint.predict(instances=instances)
    print(prediction)
    return prediction

Send an online raw inference request

gcloud

The following examples use the gcloud ai endpoints raw-predict command:

To request inferences with the JSON object in REQUEST specified on the command line:

 gcloud ai endpoints raw-predict ENDPOINT_ID \
     --region=LOCATION_ID \
     --request=REQUEST

To request inferences with an image stored in the file image.jpeg and the appropriate Content-Type header:
```
 gcloud ai endpoints raw-predict ENDPOINT_ID \
     --region=LOCATION_ID \
     --http-headers=Content-Type=image/jpeg \
     --request=@image.jpeg
 
```
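Replace the following:
- ENDPOINT_ID: The ID of the endpoint.
- LOCATION_ID: The region where you are using Vertex AI.
- REQUEST: The contents of the request that you want to get inferences for. The format of the request depends on what your custom container expects, which is not necessarily a JSON object.

Python

To learn how to install or update the Vertex AI SDK for Python, see "Install the Vertex AI SDK for Python". For more information, see the Python API reference documentation.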

from google.cloud import aiplatform_v1


def sample_raw_predict():
    # Create a client
    client = aiplatform_v1.PredictionServiceClient()

    # Initialize request argument(s)
    request = aiplatform_v1.RawPredictRequest(
        endpoint="endpoint_value",
    )

    # Make the request
    response = client.raw_predict(request=request)

    # Handle the response
    print(response)

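The response includes the following HTTP headers:

X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this inference.
X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this inference.

Send an online explanation request

gcloud

The following example uses the gcloud ai endpoints explain command:

Write the following JSON object to a file in your local environment. The filename doesn't matter, but for this example, name the file request.json.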
```
{
 "instances": INSTANCES
}
```
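Replace the following:
- INSTANCES: A JSON array of instances that you want to get inferences for. The format of each instance depends on which inputs your trained ML model expects. For more information, see "Format your input for online inference".
Run the following command: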
```
gcloud ai endpoints explain ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
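Replace the following:
- ENDPOINT_ID: The ID of the endpoint.
- LOCATION_ID: The region where you are using Vertex AI.
Optionally, to send an explanation request to a specific DeployedModel on the Endpoint, specify the --deployed-model-id flag: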
```
gcloud ai endpoints explain ENDPOINT_ID \
  --region=LOCATION_ID \
  --deployed-model-id=DEPLOYED_MODEL_ID \
  --json-request=request.json
```
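In addition to the placeholders described previously, replace the following:
- DEPLOYED_MODEL_ID (optional): The ID of the deployed model that you want to get explanations for. The ID is included in the predict method's response. If you need explanations for a particular model and you have more than one model deployed to the same endpoint, you can use this ID to make sure that the explanations are returned for that particular model.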

REST

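Before using any of the request data, make the following replacements:

LOCATION_ID: The region where you are using Vertex AI.
PROJECT_ID: Your project ID.
ENDPOINT_ID: The ID of the endpoint.
INSTANCES: A JSON array of instances that you want to get inferences for. The format of each instance depends on which inputs your trained ML model expects. For more information, see "Format your input for online inference".

HTTP method and URL: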

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:explain

Request JSON body:

{
  "instances": INSTANCES
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:explain"

PowerShell

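Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: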

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:explain" | Select-Object -Expand Content

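If successful, you receive a JSON response similar to the following. In the response, expect these values in place of the placeholders:

PREDICTIONS: A JSON array of predictions, one for each instance that you included in your request body.
EXPLANATIONS: A JSON array of explanations, one for each prediction.
DEPLOYED_MODEL_ID: The ID of the DeployedModel that served the predictions.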

{
  "predictions": PREDICTIONS,
  "explanations": EXPLANATIONS,
  "deployedModelId": "DEPLOYED_MODEL_ID"
}

Python

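To learn how to install or update the Vertex AI SDK for Python, see "Install the Vertex AI SDK for Python". For more information, see the Python API reference documentation.

from typing import Dict

from google.cloud import aiplatform


def explain_tabular_sample(
    project: str, location: str, endpoint_id: str, instance_dict: Dict
):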

    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_id)

    response = endpoint.explain(instances=[instance_dict], parameters={})

    for explanation in response.explanations:
        print(" explanation")
        # Feature attributions.
        attributions = explanation.attributions
        for attribution in attributions:
            print("  attribution")
            print("   baseline_output_value:", attribution.baseline_output_value)
            print("   instance_output_value:", attribution.instance_output_value)
            print("   output_display_name:", attribution.output_display_name)
            print("   approximation_error:", attribution.approximation_error)
            print("   output_name:", attribution.output_name)
            # Use a distinct loop variable so the list is not shadowed.
            for index in attribution.output_index:
                print("   output_index:", index)

    for prediction in response.predictions:
        print(prediction)

What's next

Learn about online inference logging.
