Get online inferences with TabNet

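This page shows you how to get online (real-time) inferences and explanations from your tabular classification or regression models by using the Google Cloud console or the Vertex AI API.

An online inference is a synchronous request, whereas a batch inference is an asynchronous request. Use online inferences when you make requests in response to application input or in other situations where you need a timely inference.

You must deploy a model to an endpoint before the model can serve online inferences. Deploying a model associates physical resources with the model so that it can serve online inferences with low latency.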

This page covers the following topics:

Deploy a model to an endpoint
Get an online inference using your deployed model

Before you begin

Before you can get online inferences, you must first train a model.

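Deploy a model to an endpoint

You can deploy more than one model to an endpoint, and you can deploy a model to more than one endpoint. For more information about the options and use cases for deploying models, see About model deployment.

Use one of the following methods to deploy a model: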

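Google Cloud console

In the Google Cloud console, in the Vertex AI section, go to the [Models] page.

Click the name of the model you want to deploy to open its details page.
Select the [Deploy & Test] tab.

If the model is already deployed to any endpoints, they are listed in the [Deploy your model] section.
Click [Deploy to endpoint].
On the [Define your endpoint] page, configure the following:
1. You can deploy the model to a new endpoint or to an existing endpoint.
  - To deploy the model to a new endpoint, select [Create new endpoint] and provide a name for the new endpoint.
  - To deploy the model to an existing endpoint, select [Add to existing endpoint] and select the endpoint from the drop-down list.
  - You can add more than one model to an endpoint, and you can add a model to more than one endpoint.
2. Click [Continue].
On the [Model settings] page, configure the following:
1. If you are deploying the model to a new endpoint, set the traffic split to 100. If you are deploying the model to an existing endpoint that already has one or more models deployed, update the traffic split percentages for the model you are deploying and for the already deployed models so that all of the percentages add up to 100%.
2. Enter the minimum number of compute nodes for the model.
  
  This is the number of nodes that are always available to this model. You are charged for the nodes used, whether they are handling inference load or standing by, even when there is no inference traffic. See the pricing page.
3. Select a machine type.
  
  Larger machine resources increase inference performance, but they also increase cost.
4. Learn how to change the default settings for prediction logging.
5. Click [Continue].
On the [Model monitoring] page, click [Continue].
On the [Monitoring objectives] page, configure the following:
1. Enter the location of your training data.
2. Enter the name of the target column.
Click [Deploy] to deploy the model to the endpoint.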

API

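When you deploy a model using the Vertex AI API, you complete the following steps:

Create an endpoint, if needed.
Get the endpoint ID.
Deploy the model to the endpoint.

Create an endpoint

If you are deploying a model to an existing endpoint, you can skip this step.

gcloud

The following example uses the gcloud ai endpoints create command: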

  gcloud ai endpoints create \
    --region=LOCATION_ID \
    --display-name=ENDPOINT_NAME

Replace the following:

LOCATION_ID: The region where you are using Vertex AI.
ENDPOINT_NAME: The display name for the endpoint.

The Google Cloud CLI tool might take a few seconds to create the endpoint.

REST

Before using any of the request data, make the following replacements:

LOCATION_ID: Your region.
PROJECT_ID: Your project ID.
ENDPOINT_NAME: The display name for the endpoint.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints

Request JSON body:

{
  "display_name": "ENDPOINT_NAME"
}

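To send your request, choose one of the following options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or that you are using Cloud Shell, which logs you in to the gcloud CLI automatically. To check the currently active account, run gcloud auth list.

Save the request body in a file named request.json, and run the following command: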

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. To check the currently active account, run gcloud auth list.

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-11-05T17:45:42.812656Z",
      "updateTime": "2020-11-05T17:45:42.812656Z"
    }
  }
}

You can poll the status of the operation until the response includes "done": true.
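The polling loop described above can be sketched in Python as follows. This is a minimal sketch, not part of the Vertex AI client libraries: wait_for_operation and fetch_operation are hypothetical names, and fetch_operation stands in for whatever call you use to retrieve the operation resource and parse its JSON.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    fetch_operation: Callable[[], Dict[str, Any]],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 300.0,
) -> Dict[str, Any]:
    """Poll until the operation JSON contains "done": true.

    fetch_operation is a hypothetical callable that returns the parsed
    JSON of the long-running operation each time it is called.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            # A finished operation carries either "response" or "error".
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("Operation did not finish before the timeout.")
        time.sleep(interval_seconds)
```

The interval and timeout defaults are illustrative assumptions; choose values that suit your deployment.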

Java

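Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.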


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.CreateEndpointOperationMetadata;
import com.google.cloud.aiplatform.v1.Endpoint;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateEndpointSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String endpointDisplayName = "YOUR_ENDPOINT_DISPLAY_NAME";
    createEndpointSample(project, endpointDisplayName);
  }

  static void createEndpointSample(String project, String endpointDisplayName)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      LocationName locationName = LocationName.of(project, location);
      Endpoint endpoint = Endpoint.newBuilder().setDisplayName(endpointDisplayName).build();

      OperationFuture<Endpoint, CreateEndpointOperationMetadata> endpointFuture =
          endpointServiceClient.createEndpointAsync(locationName, endpoint);
      System.out.format("Operation name: %s\n", endpointFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      Endpoint endpointResponse = endpointFuture.get(300, TimeUnit.SECONDS);

      System.out.println("Create Endpoint Response");
      System.out.format("Name: %s\n", endpointResponse.getName());
      System.out.format("Display Name: %s\n", endpointResponse.getDisplayName());
      System.out.format("Description: %s\n", endpointResponse.getDescription());
      System.out.format("Labels: %s\n", endpointResponse.getLabelsMap());
      System.out.format("Create Time: %s\n", endpointResponse.getCreateTime());
      System.out.format("Update Time: %s\n", endpointResponse.getUpdateTime());
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */

// const endpointDisplayName = 'YOUR_ENDPOINT_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function createEndpoint() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const endpoint = {
    displayName: endpointDisplayName,
  };
  const request = {
    parent,
    endpoint,
  };

  // Create the endpoint; the call returns a long-running operation
  const [response] = await endpointServiceClient.createEndpoint(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Create endpoint response');
  console.log(`\tName : ${result.name}`);
  console.log(`\tDisplay name : ${result.displayName}`);
  console.log(`\tDescription : ${result.description}`);
  console.log(`\tLabels : ${JSON.stringify(result.labels)}`);
  console.log(`\tCreate time : ${JSON.stringify(result.createTime)}`);
  console.log(`\tUpdate time : ${JSON.stringify(result.updateTime)}`);
}
createEndpoint();

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def create_endpoint_sample(
    project: str,
    display_name: str,
    location: str,
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint.create(
        display_name=display_name,
        project=project,
        location=location,
    )

    print(endpoint.display_name)
    print(endpoint.resource_name)
    return endpoint

Get the endpoint ID

You need the endpoint ID to deploy the model.

gcloud

The following example uses the gcloud ai endpoints list command:

  gcloud ai endpoints list \
    --region=LOCATION_ID \
    --filter=display_name=ENDPOINT_NAME

Replace the following:

LOCATION_ID: The region where you are using Vertex AI.
ENDPOINT_NAME: The display name for the endpoint.

Note the number that appears in the ENDPOINT_ID column. You use this ID in the next step.

REST

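Before using any of the request data, make the following replacements:

LOCATION_ID: The region where you are using Vertex AI.
PROJECT_ID: Your project ID.
ENDPOINT_NAME: The display name for the endpoint.

HTTP method and URL: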

GET https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME

To send your request, choose one of the following options:

curl (Linux, macOS, or Cloud Shell)

Run the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME"

PowerShell (Windows)

Run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "endpoints": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID",
      "displayName": "ENDPOINT_NAME",
      "etag": "AMEw9yPz5pf4PwBHbRWOGh0PcAxUdjbdX2Jm3QO_amguy3DbZGP5Oi_YUKRywIE-BtLx",
      "createTime": "2020-04-17T18:31:11.585169Z",
      "updateTime": "2020-04-17T18:35:08.568959Z"
    }
  ]
}

Note the ENDPOINT_ID, which is the last segment of the name field.
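As a minimal sketch, the endpoint ID can be pulled out of a list response like the one above with a few lines of Python. The helper name endpoint_ids and the sample values are hypothetical; the only assumption about the response is the name format shown in the example.

```python
from typing import Any, Dict, List


def endpoint_ids(list_response: Dict[str, Any]) -> List[str]:
    """Return the trailing ID of each endpoint's resource name."""
    ids = []
    for endpoint in list_response.get("endpoints", []):
        # name looks like projects/.../locations/.../endpoints/ENDPOINT_ID
        ids.append(endpoint["name"].rsplit("/", 1)[-1])
    return ids


# Illustrative response with made-up identifiers.
sample_list_response = {
    "endpoints": [
        {
            "name": "projects/123/locations/us-central1/endpoints/456",
            "displayName": "my-endpoint",
        }
    ]
}
print(endpoint_ids(sample_list_response))  # prints ['456']
```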

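Deploy the model

Use the method that matches your language or environment.

gcloud

The following examples use the gcloud ai endpoints deploy-model command.

The following example deploys a Model to an Endpoint without using GPUs to accelerate prediction serving and without splitting traffic between multiple DeployedModel resources.

Before using any of the command data below, make the following replacements:

ENDPOINT_ID: The ID for the endpoint.
LOCATION_ID: The region where you are using Vertex AI.
MODEL_ID: The ID for the model to be deployed.
DEPLOYED_MODEL_NAME: A name for the DeployedModel. You can also use the display name of the Model for the DeployedModel.
MACHINE_TYPE: Optional. The machine resources used for each node of this deployment. The default setting is n1-standard-2. For details, see the machine types documentation.
MIN_REPLICA_COUNT: The minimum number of nodes for this deployment. The node count can increase or decrease with the inference load, up to the maximum number of nodes, but never below this number. The value must be 1 or greater. If you omit the --min-replica-count flag, the value defaults to 1.
MAX_REPLICA_COUNT: The maximum number of nodes for this deployment. The node count can increase or decrease with the inference load, up to this number of nodes, but never above it. If you omit the --max-replica-count flag, the maximum number of nodes is set to the value of --min-replica-count.

Run the gcloud ai endpoints deploy-model command:

Linux, macOS, or Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID \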
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --machine-type=MACHINE_TYPE \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --traffic-split=0=100

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID `
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME `
  --machine-type=MACHINE_TYPE `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --traffic-split=0=100

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID ^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME ^
  --machine-type=MACHINE_TYPE ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --traffic-split=0=100

Splitting traffic

The --traffic-split=0=100 flag in the preceding examples sends 100% of the new prediction traffic that the Endpoint receives to the new DeployedModel, which is represented by the temporary ID 0. If the Endpoint already has other DeployedModel resources, you can split traffic between the new DeployedModel and the old ones. For example, to send 20% of traffic to the new DeployedModel and 80% to an older one, run the following command.

Before using any of the command data below, make the following replacements:

OLD_DEPLOYED_MODEL_ID: The ID of the existing DeployedModel.

Run the gcloud ai endpoints deploy-model command:

Linux, macOS, or Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --machine-type=MACHINE_TYPE \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID `
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME `
  --machine-type=MACHINE_TYPE `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID ^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME ^
  --machine-type=MACHINE_TYPE ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80
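The value passed to --traffic-split can be assembled and checked before you run the command. The following Python sketch is illustrative only: format_traffic_split is a hypothetical helper, and the check that percentages add up to 100 follows the rule stated on this page.

```python
from typing import Dict


def format_traffic_split(split: Dict[str, int]) -> str:
    """Format a mapping of DeployedModel ID -> percentage as a flag value.

    The key "0" is the temporary ID of the model being deployed.
    """
    if any(percent < 0 for percent in split.values()):
        raise ValueError("Traffic percentages must not be negative.")
    if sum(split.values()) != 100:
        raise ValueError("Traffic percentages must add up to 100.")
    return ",".join(f"{model_id}={percent}" for model_id, percent in split.items())


print(format_traffic_split({"0": 20, "OLD_DEPLOYED_MODEL_ID": 80}))
# prints 0=20,OLD_DEPLOYED_MODEL_ID=80
```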

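REST

To deploy the model, use the endpoints.deployModel method.

Before using any of the request data, make the following replacements:

LOCATION_ID: The region where you are using Vertex AI.
PROJECT_ID: Your project ID.
ENDPOINT_ID: The ID for the endpoint.
MODEL_ID: The ID for the model to be deployed.
DEPLOYED_MODEL_NAME: A name for the DeployedModel. You can also use the display name of the Model for the DeployedModel.
MACHINE_TYPE: Optional. The machine resources used for each node of this deployment. The default setting is n1-standard-2. For details, see the machine types documentation.
ACCELERATOR_TYPE: The type of accelerator to attach to the machine. Optional if ACCELERATOR_COUNT is not specified or is zero. Not recommended for AutoML models or for custom-trained models that use non-GPU images.
ACCELERATOR_COUNT: Optional. The number of accelerators for each replica to use. For AutoML models or custom-trained models that use non-GPU images, specify 0 or leave it unspecified.
MIN_REPLICA_COUNT: The minimum number of nodes for this deployment. The node count can increase or decrease with the inference load, up to the maximum number of nodes, but never below this number. The value must be 1 or greater.
MAX_REPLICA_COUNT: The maximum number of nodes for this deployment. The node count can increase or decrease with the inference load, up to this number of nodes, but never above it.
REQUIRED_REPLICA_COUNT: Optional. The number of nodes required for this deployment to be marked as successful. Must be 1 or greater and no more than the minimum number of nodes. If not specified, the default value is the minimum number of nodes.
TRAFFIC_SPLIT_THIS_MODEL: The percentage of the prediction traffic to this endpoint that is routed to the model being deployed with this operation. Defaults to 100. All traffic percentages must add up to 100.
DEPLOYED_MODEL_ID_N: Optional. If other models are deployed to this endpoint, you must update their traffic split percentages so that all percentages add up to 100.
TRAFFIC_SPLIT_MODEL_N: The traffic split percentage value for the deployed model ID key.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL: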

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel

Request JSON body:

{
  "deployedModel": {
    "model": "projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID",
    "displayName": "DEPLOYED_MODEL_NAME",
    "dedicatedResources": {
       "machineSpec": {
         "machineType": "MACHINE_TYPE",
         "acceleratorType": "ACCELERATOR_TYPE",
         "acceleratorCount": "ACCELERATOR_COUNT"
       },
       "minReplicaCount": MIN_REPLICA_COUNT,
       "maxReplicaCount": MAX_REPLICA_COUNT,
       "requiredReplicaCount": REQUIRED_REPLICA_COUNT
     }
  },
  "trafficSplit": {
    "0": TRAFFIC_SPLIT_THIS_MODEL,
    "DEPLOYED_MODEL_ID_1": TRAFFIC_SPLIT_MODEL_1,
    "DEPLOYED_MODEL_ID_2": TRAFFIC_SPLIT_MODEL_2
  }
}
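The constraints on the placeholders above can be checked when the request body is assembled. The following Python sketch is an illustration under stated assumptions, not an official helper: build_deploy_request is a hypothetical function, it omits the optional accelerator fields, and it enforces only the rules described on this page (replica bounds and traffic percentages that add up to 100).

```python
import json
from typing import Any, Dict, Optional


def build_deploy_request(
    model: str,
    display_name: str,
    machine_type: str,
    min_replica_count: int,
    max_replica_count: int,
    traffic_split: Dict[str, int],
    required_replica_count: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a deployModel request body as a plain dictionary."""
    if min_replica_count < 1:
        raise ValueError("min_replica_count must be 1 or greater.")
    if max_replica_count < min_replica_count:
        raise ValueError("max_replica_count must not be below min_replica_count.")
    if required_replica_count is not None and not (
        1 <= required_replica_count <= min_replica_count
    ):
        raise ValueError("required_replica_count must be between 1 and the minimum.")
    # An empty split is allowed; otherwise the percentages must total 100.
    if traffic_split and sum(traffic_split.values()) != 100:
        raise ValueError("Traffic percentages must add up to 100.")

    dedicated_resources: Dict[str, Any] = {
        "machineSpec": {"machineType": machine_type},
        "minReplicaCount": min_replica_count,
        "maxReplicaCount": max_replica_count,
    }
    if required_replica_count is not None:
        dedicated_resources["requiredReplicaCount"] = required_replica_count

    return {
        "deployedModel": {
            "model": model,
            "displayName": display_name,
            "dedicatedResources": dedicated_resources,
        },
        "trafficSplit": traffic_split,
    }


deploy_body = build_deploy_request(
    model="projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID",
    display_name="DEPLOYED_MODEL_NAME",
    machine_type="n1-standard-2",
    min_replica_count=1,
    max_replica_count=2,
    traffic_split={"0": 100},
)
print(json.dumps(deploy_body, indent=2))
```

Serializing with json.dumps also avoids hand-editing mistakes such as trailing commas, which make a request body invalid JSON.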

To send your request, choose one of the following options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel"

PowerShell (Windows)

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployModelOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-10-19T17:53:16.502088Z",
      "updateTime": "2020-10-19T17:53:16.502088Z"
    }
  }
}

Java

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.DedicatedResources;
import com.google.cloud.aiplatform.v1.DeployModelOperationMetadata;
import com.google.cloud.aiplatform.v1.DeployModelResponse;
import com.google.cloud.aiplatform.v1.DeployedModel;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.MachineSpec;
import com.google.cloud.aiplatform.v1.ModelName;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;

public class DeployModelCustomTrainedModelSample {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "PROJECT";
    String endpointId = "ENDPOINT_ID";
    String modelName = "MODEL_NAME";
    String deployedModelDisplayName = "DEPLOYED_MODEL_DISPLAY_NAME";
    deployModelCustomTrainedModelSample(project, endpointId, modelName, deployedModelDisplayName);
  }

  static void deployModelCustomTrainedModelSample(
      String project, String endpointId, String model, String deployedModelDisplayName)
      throws IOException, ExecutionException, InterruptedException {
    EndpointServiceSettings settings =
        EndpointServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();
    String location = "us-central1";

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient client = EndpointServiceClient.create(settings)) {
      MachineSpec machineSpec = MachineSpec.newBuilder().setMachineType("n1-standard-2").build();
      DedicatedResources dedicatedResources =
          DedicatedResources.newBuilder().setMinReplicaCount(1).setMachineSpec(machineSpec).build();

      String modelName = ModelName.of(project, location, model).toString();
      DeployedModel deployedModel =
          DeployedModel.newBuilder()
              .setModel(modelName)
              .setDisplayName(deployedModelDisplayName)
              // `dedicated_resources` must be used for non-AutoML models
              .setDedicatedResources(dedicatedResources)
              .build();
      // key '0' assigns traffic for the newly deployed model
      // Traffic percentage values must add up to 100
      // Leave dictionary empty if endpoint should not accept any traffic
      Map<String, Integer> trafficSplit = new HashMap<>();
      trafficSplit.put("0", 100);
      EndpointName endpoint = EndpointName.of(project, location, endpointId);
      OperationFuture<DeployModelResponse, DeployModelOperationMetadata> response =
          client.deployModelAsync(endpoint, deployedModel, trafficSplit);

      // You can use OperationFuture.getInitialFuture to get a future representing the initial
      // response to the request, which contains information while the operation is in progress.
      System.out.format("Operation name: %s\n", response.getInitialFuture().get().getName());

      // OperationFuture.get() will block until the operation is finished.
      DeployModelResponse deployModelResponse = response.get();
      System.out.format("deployModelResponse: %s\n", deployModelResponse);
    }
  }
}

Python

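from typing import Dict, Optional, Sequence, Tuple

from google.cloud import aiplatform
from google.cloud.aiplatform import explain


def deploy_model_with_dedicated_resources_sample(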
    project,
    location,
    model_name: str,
    machine_type: str,
    endpoint: Optional[aiplatform.Endpoint] = None,
    deployed_model_display_name: Optional[str] = None,
    traffic_percentage: Optional[int] = 0,
    traffic_split: Optional[Dict[str, int]] = None,
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    accelerator_type: Optional[str] = None,
    accelerator_count: Optional[int] = None,
    explanation_metadata: Optional[explain.ExplanationMetadata] = None,
    explanation_parameters: Optional[explain.ExplanationParameters] = None,
    metadata: Optional[Sequence[Tuple[str, str]]] = (),
    sync: bool = True,
):
    """
    model_name: A fully-qualified model resource name or model ID.
          Example: "projects/123/locations/us-central1/models/456" or
          "456" when project and location are initialized or passed.
    """

    aiplatform.init(project=project, location=location)

    model = aiplatform.Model(model_name=model_name)

    # The explanation_metadata and explanation_parameters should only be
    # provided for a custom trained model and not an AutoML model.
    model.deploy(
        endpoint=endpoint,
        deployed_model_display_name=deployed_model_display_name,
        traffic_percentage=traffic_percentage,
        traffic_split=traffic_split,
        machine_type=machine_type,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
        accelerator_type=accelerator_type,
        accelerator_count=accelerator_count,
        explanation_metadata=explanation_metadata,
        explanation_parameters=explanation_parameters,
        metadata=metadata,
        sync=sync,
    )

    model.wait()

    print(model.display_name)
    print(model.resource_name)
    return model

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */

// const modelId = 'YOUR_MODEL_ID';
// const deployedModelDisplayName = 'YOUR_DEPLOYED_MODEL_DISPLAY_NAME';
// const endpointId = 'YOUR_ENDPOINT_ID';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

const modelName = `projects/${project}/locations/${location}/models/${modelId}`;
const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;

// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function deployModel() {
  // key '0' assigns traffic for the newly deployed model
  // Traffic percentage values must add up to 100
  const trafficSplit = {0: 100};
  const deployedModel = {
    // format: 'projects/{project}/locations/{location}/models/{model}'
    model: modelName,
    displayName: deployedModelDisplayName,
    // `dedicatedResources` must be used for non-AutoML models
    dedicatedResources: {
      minReplicaCount: 1,
      machineSpec: {
        machineType: 'n1-standard-2',
      },
    },
  };
  const request = {
    endpoint,
    deployedModel,
    trafficSplit,
  };

  // Start the deployment; the call returns a long-running operation
  const [response] = await endpointServiceClient.deployModel(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  console.log('Deploy model response');
  console.log(JSON.stringify(response.result));
}
deployModel();

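Learn how to change the default settings for prediction logging.

Get operation status

Some requests start long-running operations that take time to complete. These requests return an operation name, which you can use to check the status of the operation or to cancel it. Vertex AI provides helper methods for making calls against long-running operations. For more information, see Working with long-running operations.

Get an online inference using your deployed model

To make an online inference, you submit one or more test items to a model for analysis, and the model returns results that are based on the model's objective. Use the Google Cloud console or the Vertex AI API to request an online inference.

Google Cloud console

In the Google Cloud console, in the Vertex AI section, go to the [Models] page.

From the list of models, click the name of the model to request inferences from.
Select the [Deploy & Test] tab.
In the [Test your model] section, add test items to request an inference. The baseline inference data is filled in for you, or you can enter your own inference data and click [Predict].

After the inference is complete, Vertex AI returns the results in the console.

API: Classification

gcloud

Create a file named request.json with the following contents: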
```
{
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ]
}
```
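Replace the following:
- PREDICTION_DATA_ROW: A JSON object whose keys are the feature names and whose values are the corresponding feature values. For example, for a dataset with a number, an array of strings, and a category, the data row might look like the following: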
```
"length":3.6,
"material":"cotton",
"tag_array": ["abc","def"]
```
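  You must provide a value for every feature included in training. The format of the data used for prediction must match the format used for training. For details, see Data format for predictions.
Run the following command: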
```
gcloud ai endpoints predict ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
Replace the following:
- ENDPOINT_ID: The ID for the endpoint.
- LOCATION_ID: The region where you are using Vertex AI.
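Instead of writing request.json by hand, you can generate the request body from feature dictionaries. The following Python sketch is illustrative: build_predict_request is a hypothetical helper, and the feature names are the ones used in the example row above.

```python
import json
from typing import Any, Dict, List


def build_predict_request(rows: List[Dict[str, Any]]) -> str:
    """Serialize feature rows as the JSON body of a predict request."""
    return json.dumps({"instances": rows}, indent=2)


# One row with a number, a category, and an array of strings.
example_row = {"length": 3.6, "material": "cotton", "tag_array": ["abc", "def"]}
request_body = build_predict_request([example_row])
print(request_body)
```

You can then save the printed text as request.json and pass it to the command with --json-request.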

REST

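To request an online inference, use the endpoints.predict method.

Before using any of the request data, make the following replacements:

LOCATION_ID: The region where the endpoint is located. For example, us-central1.
PROJECT_ID: Your project ID.
ENDPOINT_ID: The ID for the endpoint.
PREDICTION_DATA_ROW: A JSON object whose keys are the feature names and whose values are the corresponding feature values. For example, for a dataset with a number, an array of strings, and a category, the data row might look like the following: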
```
"length":3.6,
"material":"cotton",
"tag_array": ["abc","def"]
```
You must provide a value for every feature included in training. The format of the data used for prediction must match the format used for training. For details, see Data format for predictions.
DEPLOYED_MODEL_ID: Output by the predict method. The ID of the model used to generate the inference.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict

Request JSON body:

{
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ]
}

To send your request, choose one of the following options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

   {
     "predictions": [
      {
         "scores": [
           0.96771615743637085,
           0.032283786684274673
         ],
         "classes": [
           "0",
           "1"
         ]
      }
     ],
     "deployedModelId": "2429510197"
   }
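A classification response pairs each entry in classes with the score at the same position in scores. As a minimal sketch, the following Python picks the highest-scoring class for each prediction; top_classes is a hypothetical helper, and the sample data mirrors the response shown above.

```python
from typing import Any, Dict, List, Tuple


def top_classes(response: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Return (class, score) for the highest-scoring class of each prediction."""
    results = []
    for prediction in response["predictions"]:
        # scores[i] is the confidence for classes[i].
        score, label = max(zip(prediction["scores"], prediction["classes"]))
        results.append((label, score))
    return results


sample_predict_response = {
    "predictions": [
        {
            "scores": [0.96771615743637085, 0.032283786684274673],
            "classes": ["0", "1"],
        }
    ],
    "deployedModelId": "2429510197",
}
for label, score in top_classes(sample_predict_response):
    print(f"class={label} score={score:.4f}")
```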

Java


import com.google.cloud.aiplatform.util.ValueConverter;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.cloud.aiplatform.v1.schema.predict.prediction.TabularClassificationPredictionResult;
import com.google.protobuf.ListValue;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.List;

public class PredictTabularClassificationSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String instance = "[{ \"feature_column_a\": \"value\", \"feature_column_b\": \"value\"}]";
    String endpointId = "YOUR_ENDPOINT_ID";
    predictTabularClassification(instance, project, endpointId);
  }

  static void predictTabularClassification(String instance, String project, String endpointId)
      throws IOException {
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      String location = "us-central1";
      EndpointName endpointName = EndpointName.of(project, location, endpointId);

      ListValue.Builder listValue = ListValue.newBuilder();
      JsonFormat.parser().merge(instance, listValue);
      List<Value> instanceList = listValue.getValuesList();

      Value parameters = Value.newBuilder().setListValue(listValue).build();
      PredictResponse predictResponse =
          predictionServiceClient.predict(endpointName, instanceList, parameters);
      System.out.println("Predict Tabular Classification Response");
      System.out.format("\tDeployed Model Id: %s\n", predictResponse.getDeployedModelId());

      System.out.println("Predictions");
      for (Value prediction : predictResponse.getPredictionsList()) {
        TabularClassificationPredictionResult.Builder resultBuilder =
            TabularClassificationPredictionResult.newBuilder();
        TabularClassificationPredictionResult result =
            (TabularClassificationPredictionResult)
                ValueConverter.fromValue(resultBuilder, prediction);

        for (int i = 0; i < result.getClassesCount(); i++) {
          System.out.printf("\tClass: %s", result.getClasses(i));
          System.out.printf("\tScore: %f", result.getScores(i));
        }
      }
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */

// const endpointId = 'YOUR_ENDPOINT_ID';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');
const {prediction} =
  aiplatform.protos.google.cloud.aiplatform.v1.schema.predict;

// Imports the Google Cloud Prediction service client
const {PredictionServiceClient} = aiplatform.v1;

// Import the helper module for converting arbitrary protobuf.Value objects.
const {helpers} = aiplatform;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function predictTablesClassification() {
  // Configure the endpoint resource
  const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
  const parameters = helpers.toValue({});

  const instance = helpers.toValue({
    petal_length: '1.4',
    petal_width: '1.3',
    sepal_length: '5.1',
    sepal_width: '2.8',
  });

  const instances = [instance];
  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);

  console.log('Predict tabular classification response');
  console.log(`\tDeployed model id : ${response.deployedModelId}\n`);
  const predictions = response.predictions;
  console.log('Predictions :');
  for (const predictionResultVal of predictions) {
    const predictionResultObj =
      prediction.TabularClassificationPredictionResult.fromValue(
        predictionResultVal
      );
    for (const [i, class_] of predictionResultObj.classes.entries()) {
      console.log(`\tClass: ${class_}`);
      console.log(`\tScore: ${predictionResultObj.scores[i]}\n\n`);
    }
  }
}
predictTablesClassification();

Python

from typing import Dict, List

from google.cloud import aiplatform


def predict_tabular_classification_sample(
    project: str,
    location: str,
    endpoint_name: str,
    instances: List[Dict],
):
    """
    Args:
        project: Your project ID or project number.
        location: Region where Endpoint is located. For example, 'us-central1'.
        endpoint_name: A fully qualified endpoint name or endpoint ID. Example: "projects/123/locations/us-central1/endpoints/456" or
               "456" when project and location are initialized or passed.
        instances: A list of one or more instances (examples) to return a prediction for.
    """
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_name)

    response = endpoint.predict(instances=instances)

    for prediction_ in response.predictions:
        print(prediction_)

API: Regression

gcloud

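Create a file named request.json with the following contents:
```
{
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ]
}
```
Replace the following:
- PREDICTION_DATA_ROW: A JSON object whose keys are the feature names and whose values are the corresponding feature values. For example, for a dataset with a number, an array of numbers, and a category, the data row might look like the following example request: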
```
"age":3.6,
"sq_ft":5392,
"code": "90331"
```
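  You must provide a value for every feature included in training. The format of the data used for prediction must match the format used for training. For details, see the data format for predictions.
Run the following command: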
```
gcloud ai endpoints predict ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
Replace the following:
- ENDPOINT_ID: The ID of the endpoint.
- LOCATION_ID: The region where you are using Vertex AI.
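
The request file described above can also be generated programmatically. The following is a minimal sketch, not part of the official samples: the helper names `build_predict_request` and `write_request_file` are hypothetical, and the example row simply reuses the illustrative feature values from this section.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def build_predict_request(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap feature rows in the {"instances": [...]} envelope used by request.json."""
    if not rows:
        raise ValueError("at least one instance is required")
    return {"instances": list(rows)}


def write_request_file(rows: List[Dict[str, Any]], path: str = "request.json") -> Path:
    """Serialize the request body to a file that can be passed to --json-request."""
    target = Path(path)
    target.write_text(
        json.dumps(build_predict_request(rows), indent=2), encoding="utf-8"
    )
    return target


# Illustrative row reusing the example feature values from this section.
example_row = {"age": 3.6, "sq_ft": 5392, "code": "90331"}
request_body = build_predict_request([example_row])
```

Calling `write_request_file([example_row])` would then produce a request.json in the current directory. Every feature used in training still needs a value in each row.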

REST

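To request online inference, use the endpoints.predict method.

Before using any of the request data, make the following replacements:

LOCATION_ID: The region where the endpoint is located. For example, us-central1.
PROJECT_ID: Your project ID.
ENDPOINT_ID: The ID of the endpoint.
PREDICTION_DATA_ROW: A JSON object whose keys are the feature names and whose values are the corresponding feature values. For example, for a dataset with a number, an array of numbers, and a category, the data row might look like the following example request: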
```
"age":3.6,
"sq_ft":5392,
"code": "90331"
```
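You must provide a value for every feature included in training. The format of the data used for prediction must match the format used for training. For details, see the data format for predictions.
DEPLOYED_MODEL_ID: Output by the predict method. The ID of the model used to generate the inference.

HTTP method and URL: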

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict

Request JSON body:

{
  "instances": [
    {
      PREDICTION_DATA_ROW
    }
  ]
}
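
Note that LOCATION_ID appears twice in the URL template above, once in the host name and once in the resource path. As a rough sketch of how the placeholders fit together, the hypothetical helper `predict_url` below fills them in. The identifiers passed to it are made up for illustration.

```python
def predict_url(project_id: str, location_id: str, endpoint_id: str) -> str:
    """Fill the placeholders of the endpoints.predict URL template."""
    return (
        f"https://{location_id}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location_id}/"
        f"endpoints/{endpoint_id}:predict"
    )


# Hypothetical identifiers, for illustration only.
url = predict_url("my-project", "us-central1", "1234567890")
```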

To send your request, choose one of the following options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict" | Select-Object -Expand Content

You should receive a JSON response similar to the following:


{
  "predictions": [
    [
      {
        "value": 65.14233
      }
    ]
  ],
  "deployedModelId": "DEPLOYED_MODEL_ID"
}
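
In the response above, each entry of `predictions` is itself a list of objects with a `value` field. The following sketch shows one way to pull the numeric values out of such a payload. The helper `iter_prediction_values` is hypothetical, and it assumes only the response shape shown in this example.

```python
import json
from typing import Any, Iterator, List


def iter_prediction_values(predictions: Any) -> Iterator[float]:
    """Yield every "value" found in a possibly nested predictions list."""
    if isinstance(predictions, dict):
        if "value" in predictions:
            yield float(predictions["value"])
    elif isinstance(predictions, list):
        for item in predictions:
            yield from iter_prediction_values(item)


# The example response from this section, copied verbatim.
response_text = """
{
  "predictions": [
    [
      {
        "value": 65.14233
      }
    ]
  ],
  "deployedModelId": "DEPLOYED_MODEL_ID"
}
"""
response = json.loads(response_text)
values: List[float] = list(iter_prediction_values(response["predictions"]))
print(values)  # [65.14233]
```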

Java


import com.google.cloud.aiplatform.util.ValueConverter;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.cloud.aiplatform.v1.schema.predict.prediction.TabularRegressionPredictionResult;
import com.google.protobuf.ListValue;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.List;

public class PredictTabularRegressionSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String instance = "[{ \"feature_column_a\": \"value\", \"feature_column_b\": \"value\"}]";
    String endpointId = "YOUR_ENDPOINT_ID";
    predictTabularRegression(instance, project, endpointId);
  }

  static void predictTabularRegression(String instance, String project, String endpointId)
      throws IOException {
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      String location = "us-central1";
      EndpointName endpointName = EndpointName.of(project, location, endpointId);

      ListValue.Builder listValue = ListValue.newBuilder();
      JsonFormat.parser().merge(instance, listValue);
      List<Value> instanceList = listValue.getValuesList();

      Value parameters = Value.newBuilder().setListValue(listValue).build();
      PredictResponse predictResponse =
          predictionServiceClient.predict(endpointName, instanceList, parameters);
      System.out.println("Predict Tabular Regression Response");
      System.out.format("\tDeployed Model Id: %s\n", predictResponse.getDeployedModelId());

      System.out.println("Predictions");
      for (Value prediction : predictResponse.getPredictionsList()) {
        TabularRegressionPredictionResult.Builder resultBuilder =
            TabularRegressionPredictionResult.newBuilder();

        TabularRegressionPredictionResult result =
            (TabularRegressionPredictionResult)
                ValueConverter.fromValue(resultBuilder, prediction);

        System.out.printf("\tUpper bound: %f\n", result.getUpperBound());
        System.out.printf("\tLower bound: %f\n", result.getLowerBound());
        System.out.printf("\tValue: %f\n", result.getValue());
      }
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */

// const endpointId = 'YOUR_ENDPOINT_ID';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');
const {prediction} =
  aiplatform.protos.google.cloud.aiplatform.v1.schema.predict;

// Imports the Google Cloud Prediction service client
const {PredictionServiceClient} = aiplatform.v1;

// Import the helper module for converting arbitrary protobuf.Value objects.
const {helpers} = aiplatform;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function predictTablesRegression() {
  // Configure the endpoint resource
  const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
  const parameters = helpers.toValue({});

  // TODO (erschmid): Make this less painful
  const instance = helpers.toValue({
    BOOLEAN_2unique_NULLABLE: false,
    DATETIME_1unique_NULLABLE: '2019-01-01 00:00:00',
    DATE_1unique_NULLABLE: '2019-01-01',
    FLOAT_5000unique_NULLABLE: 1611,
    FLOAT_5000unique_REPEATED: [2320, 1192],
    INTEGER_5000unique_NULLABLE: '8',
    NUMERIC_5000unique_NULLABLE: 16,
    STRING_5000unique_NULLABLE: 'str-2',
    STRUCT_NULLABLE: {
      BOOLEAN_2unique_NULLABLE: false,
      DATE_1unique_NULLABLE: '2019-01-01',
      DATETIME_1unique_NULLABLE: '2019-01-01 00:00:00',
      FLOAT_5000unique_NULLABLE: 1308,
      FLOAT_5000unique_REPEATED: [2323, 1178],
      FLOAT_5000unique_REQUIRED: 3089,
      INTEGER_5000unique_NULLABLE: '1777',
      NUMERIC_5000unique_NULLABLE: 3323,
      TIME_1unique_NULLABLE: '23:59:59.999999',
      STRING_5000unique_NULLABLE: 'str-49',
      TIMESTAMP_1unique_NULLABLE: '1546387199999999',
    },
    TIMESTAMP_1unique_NULLABLE: '1546387199999999',
    TIME_1unique_NULLABLE: '23:59:59.999999',
  });

  const instances = [instance];
  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);

  console.log('Predict tabular regression response');
  console.log(`\tDeployed model id : ${response.deployedModelId}`);
  const predictions = response.predictions;
  console.log('\tPredictions :');
  for (const predictionResultVal of predictions) {
    const predictionResultObj =
      prediction.TabularRegressionPredictionResult.fromValue(
        predictionResultVal
      );
    console.log(`\tUpper bound: ${predictionResultObj.upper_bound}`);
    console.log(`\tLower bound: ${predictionResultObj.lower_bound}`);
    console.log(`\tValue: ${predictionResultObj.value}`);
  }
}
predictTablesRegression();

Python

from typing import Dict, List

from google.cloud import aiplatform


def predict_tabular_regression_sample(
    project: str,
    location: str,
    endpoint_name: str,
    instances: List[Dict],
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_name)

    response = endpoint.predict(instances=instances)

    for prediction_ in response.predictions:
        print(prediction_)

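Interpret prediction results

Classification

Classification models return a confidence score.

The confidence score expresses how strongly the model associates each class or label with a test item. The higher the number, the more confident the model is that the label should be applied to that item. You decide how high the confidence score must be before you accept the model's results.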
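
Choosing that cutoff is an application-level decision. As a minimal sketch, the hypothetical helper `pick_label` below takes the parallel `classes` and `scores` lists returned for one prediction and accepts the top class only if its score reaches a threshold that you choose. The class names, scores, and thresholds are made-up values for illustration.

```python
from typing import List, Optional, Tuple


def pick_label(
    classes: List[str], scores: List[float], threshold: float
) -> Optional[Tuple[str, float]]:
    """Return the top-scoring (class, score) pair, or None if it is below threshold."""
    if len(classes) != len(scores) or not classes:
        raise ValueError("classes and scores must be non-empty and the same length")
    best_index = max(range(len(scores)), key=scores.__getitem__)
    best = (classes[best_index], scores[best_index])
    return best if best[1] >= threshold else None


# Hypothetical prediction; class names and scores are made up.
classes = ["setosa", "versicolor", "virginica"]
scores = [0.91, 0.07, 0.02]

accepted = pick_label(classes, scores, threshold=0.8)   # top score clears 0.8
rejected = pick_label(classes, scores, threshold=0.95)  # top score is below 0.95
```

A stricter threshold trades coverage for precision: fewer predictions are accepted, but the accepted ones carry higher confidence.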

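Regression

Regression models return an inference value.

If your model uses probabilistic inference, the value field contains the minimizer of the optimization objective. For example, if your optimization objective is minimize-rmse, the value field contains the mean. If it is minimize-mae, the value field contains the median.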
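
The link between the objective and the returned statistic is a general property of the two loss functions, not something specific to Vertex AI. The sketch below illustrates it on a small, made-up sample: a grid search over candidate point estimates finds that the RMSE-minimizing guess is the mean and the MAE-minimizing guess is the median. The sample values and the candidate grid are assumptions chosen for illustration.

```python
import statistics
from typing import Callable, Sequence


def rmse(samples: Sequence[float], guess: float) -> float:
    """Root mean squared error of a single point estimate against the samples."""
    return (sum((x - guess) ** 2 for x in samples) / len(samples)) ** 0.5


def mae(samples: Sequence[float], guess: float) -> float:
    """Mean absolute error of a single point estimate against the samples."""
    return sum(abs(x - guess) for x in samples) / len(samples)


def best_guess(
    samples: Sequence[float],
    loss: Callable[[Sequence[float], float], float],
    candidates: Sequence[float],
) -> float:
    """Return the candidate with the lowest loss."""
    return min(candidates, key=lambda c: loss(samples, c))


# Hypothetical skewed samples standing in for a predictive distribution.
samples = [1.0, 2.0, 3.0, 4.0, 100.0]
candidates = [i / 10 for i in range(0, 1001)]  # 0.0, 0.1, ..., 100.0

rmse_minimizer = best_guess(samples, rmse, candidates)  # equals the mean, 22.0
mae_minimizer = best_guess(samples, mae, candidates)    # equals the median, 3.0
```

With skewed data like this, the two objectives give very different point estimates, which is worth keeping in mind when comparing models trained with different objectives.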

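If your model uses probabilistic inference with quantiles, Vertex AI provides quantile values and quantile inferences in addition to the minimizer of the optimization objective. Quantile values are set during model training. Quantile inferences are the inference values associated with the quantile values.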
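
As a rough sketch of how quantile values and quantile inferences pair up, the example below zips two parallel lists into a lookup table. The field names `quantile_values` and `quantile_predictions`, and all of the numbers, are assumptions for illustration; this page does not show the quantile payload, so check them against your actual response.

```python
from typing import Dict, List


def quantile_table(
    quantile_values: List[float], quantile_predictions: List[float]
) -> Dict[float, float]:
    """Map each quantile (for example 0.9) to its associated inference value."""
    if len(quantile_values) != len(quantile_predictions):
        raise ValueError("expected one inference per quantile value")
    return dict(zip(quantile_values, quantile_predictions))


# Hypothetical prediction; field names and numbers are assumptions.
prediction = {
    "value": 65.1,
    "quantile_values": [0.1, 0.5, 0.9],
    "quantile_predictions": [48.2, 64.7, 83.9],
}

table = quantile_table(
    prediction["quantile_values"], prediction["quantile_predictions"]
)
# Width of the central 80% interval implied by the 0.1 and 0.9 quantiles.
interval_width = table[0.9] - table[0.1]
```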

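TabNet is inherently interpretable: it gives users insight into which features it used to reach a decision. The algorithm uses attention, which learns to selectively strengthen the influence of some features while reducing the influence of others through a weighted average. For a particular decision, TabNet decides step by step how much importance to place on each feature, and then combines the steps to produce the final prediction. Attention is multiplicative: larger values indicate that a feature plays a larger role in the prediction, and a value of 0 means that the feature plays no role in that decision. Because TabNet uses multiple decision steps, the attention placed on the features across all steps is linearly combined after appropriate scaling. This linear combination across all of TabNet's decision steps is the total feature importance that TabNet provides.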
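
The linear combination described above can be sketched in a few lines. This is an illustration of the idea only, not the Vertex AI implementation: the per-step attention masks and per-step scaling weights below are hypothetical, and how the service actually scales each step is not specified on this page.

```python
from typing import List


def aggregate_importance(
    step_masks: List[List[float]], step_weights: List[float]
) -> List[float]:
    """Linearly combine per-step attention masks, then normalize to sum to 1."""
    if len(step_masks) != len(step_weights) or not step_masks:
        raise ValueError("need one weight per decision step")
    num_features = len(step_masks[0])
    combined = [0.0] * num_features
    for mask, weight in zip(step_masks, step_weights):
        if len(mask) != num_features:
            raise ValueError("every mask must cover the same features")
        for j, attention in enumerate(mask):
            combined[j] += weight * attention
    total = sum(combined)
    if total == 0:
        return combined
    return [c / total for c in combined]


# Hypothetical attention masks for 3 decision steps over 4 features.
step_masks = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.25, 0.25, 0.5, 0.0],
]
# Hypothetical scaling weights, one per decision step.
step_weights = [2.0, 1.0, 1.0]

importance = aggregate_importance(step_masks, step_weights)
print(importance)  # [0.3125, 0.5625, 0.125, 0.0]
```

The last feature receives zero attention in every step, so its aggregate importance is 0, which matches the statement that a value of 0 means the feature played no role.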

Example inference output

Online inference with feature importance for a tabular regression model returns a payload similar to the following example.

{
   "predictions":[
      {
         "value":0.3723912537097931,
         "feature_importance":{
            "MSSubClass":0.12,
            "MSZoning":0.33,
            "LotFrontage":0.27,
            "LotArea":0.06,
            ...
         }
      }
   ]
}
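
To read such a payload, it often helps to sort the features by importance. The following sketch uses a hypothetical helper, `rank_features`, and includes only the four features visible in the example above; the elided entries are left out rather than guessed.

```python
from typing import Dict, List, Tuple


def rank_features(feature_importance: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (feature, importance) pairs sorted from most to least important."""
    return sorted(feature_importance.items(), key=lambda kv: kv[1], reverse=True)


# Only the four features shown in the example payload; elided ones are omitted.
prediction = {
    "value": 0.3723912537097931,
    "feature_importance": {
        "MSSubClass": 0.12,
        "MSZoning": 0.33,
        "LotFrontage": 0.27,
        "LotArea": 0.06,
    },
}

ranked = rank_features(prediction["feature_importance"])
top_feature, top_score = ranked[0]
for name, score in ranked:
    print(f"{name}: {score}")
```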

What's next

Learn how to export your model.
Learn about pricing for online inference.
