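Monitor features

Vertex AI Feature Store lets you schedule and run feature monitoring jobs to monitor feature data, retrieve feature statistics, and detect feature drift. You can monitor feature data only after you register the feature data source in Feature Registry.

To monitor feature data, create a FeatureMonitor resource under a FeatureGroup resource. When you create the FeatureMonitor resource, you can configure a monitoring schedule that periodically runs monitoring jobs on the feature data. Alternatively, you can manually run a feature monitoring job to monitor the feature data outside the monitoring schedule.

For each monitoring job that runs, Vertex AI Feature Store generates a FeatureMonitorJob resource. You can retrieve this resource to view the feature statistics and information about any drift detected in the feature data.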

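Before you begin

Before you monitor features using Vertex AI Feature Store, complete the prerequisites listed in this section.

Register feature data source

Register your feature data source from BigQuery in Feature Registry by creating feature groups and features. The FeatureMonitor resources used to retrieve and monitor feature statistics are associated with feature groups.

Authenticate to Vertex AI

Authenticate to Vertex AI, unless you've done so already.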

Select the tab for how you plan to use the samples on this page:

Python

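To use the Python samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.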

  1. Install the Google Cloud CLI.

  2. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  3. To initialize the gcloud CLI, run the following command:

    gcloud init
  4. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

    If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

For more information, see Set up authentication for a local development environment.

REST

To use the REST API samples on this page in a local development environment, use the credentials that you provide to the gcloud CLI.

    After installing the Google Cloud CLI, initialize it by running the following command:

    gcloud init

    If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

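For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Create a feature monitor with a monitoring schedule

To retrieve and monitor feature statistics, create a FeatureMonitor resource that specifies the schedule for periodically running feature monitoring jobs and retrieving feature statistics for the features registered in the feature group.

Use the following samples to create a FeatureMonitor resource. To set up multiple schedules for the same feature group, you must create multiple FeatureMonitor resources.

REST

To create a FeatureMonitor resource and schedule feature monitoring jobs, send a POST request by using the featureMonitors.create method.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where you want to create the feature monitor, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group where you want to set up feature monitoring.
  • FEATURE_MONITOR_NAME: A name for the new feature monitor that you want to create.
  • FEATURE_ID_1 and FEATURE_ID_2: The IDs of the features that you want to monitor.
  • DRIFT_THRESHOLD_1 and DRIFT_THRESHOLD_2: The drift threshold for each feature included in the feature monitor. The drift threshold is used to detect anomalies, such as feature drift. Enter a value in the range [0, 1). If you don't enter a value, the threshold is set to 0.3 by default.
    Vertex AI Feature Store compares the snapshots from consecutive feature monitoring job executions and calculates drift using the ML.TFDV_VALIDATE function in BigQuery. To classify anomalies, L-infinity distance is used for categorical features and Jensen-Shannon divergence is used for numerical features.
  • CRON: A cron schedule expression representing how often to run the feature monitoring job. For more information, see cron.

HTTP method and URL: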

POST https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors?feature_monitor_id=FEATURE_MONITOR_NAME

Request JSON body:

{
  "feature_selection_config": {
    "feature_configs": [
      {"feature_id":"FEATURE_ID_1", "drift_threshold": "DRIFT_THRESHOLD_1" },
      {"feature_id":"FEATURE_ID_2", "drift_threshold": "DRIFT_THRESHOLD_2" }
    ]
  },
  "schedule_config": {
    "cron": "CRON"
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors?feature_monitor_id=FEATURE_MONITOR_NAME"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors?feature_monitor_id=FEATURE_MONITOR_NAME" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.CreateFeatureMonitorOperationMetadata",
    "genericMetadata": {
      "createTime": "2024-12-15T19:35:03.975958Z",
      "updateTime": "2024-12-15T19:35:03.975958Z"
    }
  }
}
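For scripted use, the same request can be assembled in Python before it is sent. The following is a minimal sketch using only the standard library; the function name `build_create_feature_monitor_request` and its parameters are hypothetical, and it assumes that `drift_threshold` can be sent as a JSON number. It builds the request object and checks the [0, 1) threshold range described above, but does not send anything.

```python
import json
import urllib.request
from typing import List, Tuple


def build_create_feature_monitor_request(
    project_id: str,
    location_id: str,
    feature_group_name: str,
    feature_monitor_name: str,
    feature_thresholds: List[Tuple[str, float]],
    cron: str,
    access_token: str,
) -> urllib.request.Request:
    """Builds (but does not send) a featureMonitors.create POST request."""
    for feature_id, threshold in feature_thresholds:
        # This page states that drift thresholds must be in the range [0, 1).
        if not 0 <= threshold < 1:
            raise ValueError(f"drift threshold for {feature_id} must be in [0, 1)")

    # Same shape as the JSON request body shown above.
    body = {
        "feature_selection_config": {
            "feature_configs": [
                {"feature_id": feature_id, "drift_threshold": threshold}
                for feature_id, threshold in feature_thresholds
            ]
        },
        "schedule_config": {"cron": cron},
    }
    url = (
        f"https://{location_id}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project_id}/locations/{location_id}/"
        f"featureGroups/{feature_group_name}/featureMonitors"
        f"?feature_monitor_id={feature_monitor_name}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`, with an access token such as the one printed by `gcloud auth print-access-token`.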

Python

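Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.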

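from typing import List, Tuple

from google.cloud import aiplatform
from vertexai.resources.preview import feature_store


def create_feature_monitor_sample(
    project: str,
    location: str,
    existing_feature_group_id: str,
    feature_monitor_id: str,
    feature_selection_configs: List[Tuple[str, float]],
    schedule_config: str,  # Cron string. For example, "0 * * * *" indicates hourly execution.
):
    aiplatform.init(project=project, location=location)
    # Look up the existing feature group, then create the monitor under it.
    feature_group = feature_store.FeatureGroup(existing_feature_group_id)
    feature_monitor = feature_group.create_feature_monitor(
        name=feature_monitor_id,
        feature_selection_configs=feature_selection_configs,
        schedule_config=schedule_config,
    )
    return feature_monitor


create_feature_monitor_sample(
    project="PROJECT_ID",
    location="LOCATION_ID",
    existing_feature_group_id="FEATUREGROUP_NAME",
    feature_monitor_id="FEATURE_MONITOR_NAME",
    feature_selection_configs=[("FEATURE_ID_1", DRIFT_THRESHOLD_1), ("FEATURE_ID_2", DRIFT_THRESHOLD_2)],
    schedule_config="CRON",
)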

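Replace the following:

  • LOCATION_ID: The region where you want to create the feature monitor, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group where you want to set up feature monitoring.
  • FEATURE_MONITOR_NAME: A name for the new feature monitor that you want to create.
  • FEATURE_ID_1 and FEATURE_ID_2: The IDs of the features that you want to monitor.
  • DRIFT_THRESHOLD_1 and DRIFT_THRESHOLD_2: The drift threshold for each feature included in the feature monitor. The drift threshold is used to detect feature drift. Enter a value in the range [0, 1). If you don't enter a value, the threshold is set to 0.3 by default.
    Vertex AI Feature Store compares the data snapshot from the current feature monitoring job with the data snapshot from the previous feature monitoring job. Note that to calculate the distribution deviation, Vertex AI Feature Store uses the ML.TFDV_VALIDATE function in BigQuery.
    The metrics used to compare statistics are L-infinity distance for categorical features and Jensen-Shannon divergence for numerical features.
  • CRON: A cron schedule expression representing how often to run the feature monitoring job. For more information, see cron.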

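Manually run a feature monitoring job

You can skip the wait between consecutive scheduled feature monitoring jobs and manually run a feature monitoring job. This is useful if you want to retrieve monitoring information and detect anomalies in the feature data immediately, instead of waiting for the next scheduled monitoring job to run.

REST

To manually run a feature monitoring job by creating a FeatureMonitorJob resource, send a POST request by using the featureMonitorJobs.create method.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where you want to run the feature monitoring job, such as us-central1.
  • FEATUREGROUP_NAME: The name of the feature group containing the FeatureMonitor resource.
  • PROJECT_ID: Your project ID.
  • FEATURE_MONITOR_NAME: The name of the FeatureMonitor resource for which you want to run the feature monitoring job.

HTTP method and URL: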

POST https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs

To send your request, choose one of these options:

curl

Execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d "" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID"
}
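Later requests on this page need the job ID, which is the last segment of the returned resource name. As an illustration, the following sketch splits a resource name into its labeled segments; `parse_feature_monitor_job_name` is a hypothetical helper, and it assumes the name follows the alternating collection/ID pattern shown in the response above.

```python
from typing import Dict


def parse_feature_monitor_job_name(name: str) -> Dict[str, str]:
    """Splits a FeatureMonitorJob resource name into {collection: id} pairs."""
    parts = name.split("/")
    # Resource names alternate between a collection label and an ID,
    # so a well-formed name has an even number of segments.
    if len(parts) % 2 != 0:
        raise ValueError(f"unexpected resource name: {name}")
    segments = dict(zip(parts[0::2], parts[1::2]))
    if "featureMonitorJobs" not in segments:
        raise ValueError(f"not a FeatureMonitorJob resource name: {name}")
    return segments


job_name = (
    "projects/123/locations/us-central1/featureGroups/my_group/"
    "featureMonitors/my_monitor/featureMonitorJobs/456"
)
segments = parse_feature_monitor_job_name(job_name)
print(segments["featureMonitorJobs"])  # the job ID to use in later requests
```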

Python

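Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.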

from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")

# Look up the feature group and its monitor, then trigger a job on demand.
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")
feature_monitor = feature_group.get_feature_monitor("FEATURE_MONITOR_NAME")
feature_monitor_job = feature_monitor.create_feature_monitor_job()

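Replace the following:

  • LOCATION_ID: The region where you want to run the feature monitoring job, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group containing the FeatureMonitor resource.
  • FEATURE_MONITOR_NAME: The name of the FeatureMonitor resource for which you want to run the feature monitoring job.

Retrieve feature statistics from a monitoring job

To retrieve the feature statistics for all the features in a feature monitoring job, retrieve the FeatureMonitorJob resource by using the feature monitoring job ID generated when the job ran. You can also retrieve the feature statistics for a specific feature from the latest monitoring jobs.

List feature monitoring jobs

The following samples show how to retrieve a list of all the FeatureMonitorJob resources created for a given FeatureMonitor resource.

REST

To retrieve a list of FeatureMonitorJob resources for a specified FeatureMonitor resource, send a GET request by using the featureMonitorJobs.list method.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where the Feature resource is located, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group containing the FeatureMonitor resource.
  • FEATURE_MONITOR_NAME: The name of the FeatureMonitor resource for which you want to list the feature monitoring jobs.

HTTP method and URL: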

GET https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "featureMonitorJobs": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID_1",
      "createTime": "2024-12-18T19:18:18.077161Z",
      "finalStatus": {},
      "featureSelectionConfig": {
        "featureConfigs": [
          {
            "featureId": "feature_name_1",
            "driftThreshold": 0.2
          },
          {
            "featureId": "feature_name_2",
            "driftThreshold": 0.2
          }
        ]
      }
    },
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID_2",
      "createTime": "2024-12-19T19:18:30.859921Z",
      "finalStatus": {},
      "featureSelectionConfig": {
        "featureConfigs": [
          {
            "featureId": "feature_name_1",
            "driftThreshold": 0.2
          },
          {
            "featureId": "feature_name_2",
            "driftThreshold": 0.2
          }
        ]
      }
    }
  ]
}
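A list response like the one above is often used to find the most recent job. The following sketch picks the entry with the latest createTime; the helper names are hypothetical, and it assumes timestamps use the UTC format with a "Z" suffix and microsecond precision shown in the sample response.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    # Assumes the "YYYY-MM-DDTHH:MM:SS.ffffffZ" shape used in the sample response.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_job_name(list_response: dict) -> str:
    """Returns the resource name of the most recently created job."""
    jobs = list_response.get("featureMonitorJobs", [])
    if not jobs:
        raise ValueError("no feature monitor jobs in response")
    newest = max(jobs, key=lambda job: parse_timestamp(job["createTime"]))
    return newest["name"]


response = {
    "featureMonitorJobs": [
        {"name": "example/featureMonitorJobs/1", "createTime": "2024-12-18T19:18:18.077161Z"},
        {"name": "example/featureMonitorJobs/2", "createTime": "2024-12-19T19:18:30.859921Z"},
    ]
}
print(latest_job_name(response))
```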

Python

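Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.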

from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")

# List every monitoring job created for the given feature monitor.
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")
feature_monitor = feature_group.get_feature_monitor("FEATURE_MONITOR_NAME")
feature_monitor_jobs = feature_monitor.list_feature_monitor_jobs()

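Replace the following:

  • LOCATION_ID: The region where the Feature resource is located, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group containing the FeatureMonitor resource.
  • FEATURE_MONITOR_NAME: The name of the FeatureMonitor resource for which you want to list the feature monitoring jobs.

View feature statistics from a monitoring job

The following samples show how to view the feature statistics for all the features in a feature monitoring job. For each feature, the statistics and anomalies are displayed in the FeatureNameStatistics format.

REST

To view the feature statistics from a monitoring job by retrieving a FeatureMonitorJob resource, send a GET request by using the featureMonitorJobs.get method.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where the feature monitoring job was run, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group containing the FeatureMonitor resource.
  • FEATURE_MONITOR_NAME: The name of the FeatureMonitor resource for which the feature monitoring job was run.
  • FEATURE_MONITOR_JOB_ID: The ID of the FeatureMonitorJob resource that you want to retrieve.

HTTP method and URL: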

GET https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID",
  "createTime": "2024-12-19T19:18:18.077161Z",
  "finalStatus": {},
  "jobSummary": {
    "featureStatsAndAnomalies": [
      {
        "featureId": "feature_id_1",
        "featureStats": {
          "name": "feature_name_1",
          "type": "STRING",
          "stringStats": {
            "commonStats": {
              "numNonMissing": "6",
              "minNumValues": "1",
              "maxNumValues": "1",
              "avgNumValues": 1,
              "numValuesHistogram": {
                "buckets": [
                  {
                    "lowValue": 1,
                    "highValue": 1,
                    "sampleCount": 0.6
                  },
                  {
                    "lowValue": 1,
                    "highValue": 1,
                    "sampleCount": 0.6
                  }
                ],
                "type": "QUANTILES"
              },
              "totNumValues": "6"
            },
            "unique": "2",
            "topValues": [
              {
                "value": "59",
                "frequency": 2
              },
              {
                "value": "19",
                "frequency": 1
              }
            ],
            "avgLength": 2,
            "rankHistogram": {
              "buckets": [
                {
                  "label": "59",
                  "sampleCount": 2
                },
                {
                  "lowRank": "1",
                  "highRank": "1",
                  "label": "19",
                  "sampleCount": 1
                }
              ]
            }
          }
        },
        "statsTime": "2024-12-19T19:18:18.077161Z",
        "featureMonitorJobId": "FEATURE_MONITOR_JOB_ID",
        "featureMonitorId": "FEATURE_MONITOR_NAME"
      },
      {
        "featureId": "feature_id_2",
        "featureStats": {
          "name": "feature_name_1",
          "type": "STRING",
          "stringStats": {
            "commonStats": {
              "numNonMissing": "6",
              "minNumValues": "1",
              "maxNumValues": "1",
              "avgNumValues": 1,
              "numValuesHistogram": {
                "buckets": [
                  {
                    "lowValue": 1,
                    "highValue": 1,
                    "sampleCount": 0.6
                  },
                  {
                    "lowValue": 1,
                    "highValue": 1,
                    "sampleCount": 0.6
                  }
                ],
                "type": "QUANTILES"
              },
              "totNumValues": "6"
            },
            "unique": "2",
            "topValues": [
              {
                "value": "59",
                "frequency": 2
              },
              {
                "value": "19",
                "frequency": 1
              }
            ],
            "avgLength": 2,
            "rankHistogram": {
              "buckets": [
                {
                  "label": "59",
                  "sampleCount": 2
                },
                {
                  "lowRank": "1",
                  "highRank": "1",
                  "label": "19",
                  "sampleCount": 1
                }
              ]
            }
          }
        },
        "statsTime": "2024-12-19T19:18:18.077161Z",
        "featureMonitorJobId": "FEATURE_MONITOR_JOB_ID",
        "featureMonitorId": "FEATURE_MONITOR_NAME"
      }
    ]
  },
  "driftBaseFeatureMonitorJobId": "2250003330000300000",
  "driftBaseSnapshotTime": "2024-12-12T16:00:01.211686Z",
  "featureSelectionConfig": {
    "featureConfigs": [
      {
        "featureId": "feature_id_1",
        "driftThreshold": 0.2
      },
      {
        "featureId": "feature_id_2",
        "driftThreshold": 0.2
      }
    ]
  },
  "triggerType": "FEATURE_MONITOR_JOB_TRIGGER_ON_DEMAND"
}
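The response is deeply nested, so a short summary per feature can be easier to read. The following sketch walks a parsed response of the shape shown above and pulls out a few fields for string-typed features; `summarize_string_stats` is a hypothetical helper, and it assumes that int64 values such as `unique` arrive as JSON strings, as they do in the sample.

```python
from typing import List


def summarize_string_stats(job: dict) -> List[dict]:
    """Collects a compact summary for each feature in a FeatureMonitorJob response."""
    summaries = []
    entries = job.get("jobSummary", {}).get("featureStatsAndAnomalies", [])
    for entry in entries:
        stats = entry.get("featureStats", {})
        string_stats = stats.get("stringStats", {})
        top_values = string_stats.get("topValues", [])
        summaries.append(
            {
                "feature_id": entry.get("featureId"),
                "type": stats.get("type"),
                # int64 fields are encoded as strings in the sample response.
                "unique": int(string_stats["unique"]) if "unique" in string_stats else None,
                # The sample lists top values in descending frequency order.
                "top_value": top_values[0]["value"] if top_values else None,
            }
        )
    return summaries
```

Features of other types (for example, numerical features) would come back with `unique` and `top_value` set to None in this sketch, because only `stringStats` is inspected.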

Python

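Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.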

from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")

feature_monitor = feature_group.get_feature_monitor("FEATURE_MONITOR_NAME")
feature_monitor_job = feature_monitor.get_feature_monitor_job("FEATURE_MONITOR_JOB_ID")

# Retrieve feature stats and anomalies
feature_stats_and_anomalies = feature_monitor_job.feature_stats_and_anomalies
print(feature_stats_and_anomalies)

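Replace the following:

  • LOCATION_ID: The region where the feature monitoring job was run, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group containing the FeatureMonitor resource.
  • FEATURE_MONITOR_NAME: The name of the FeatureMonitor resource for which the feature monitoring job was run.
  • FEATURE_MONITOR_JOB_ID: The ID of the FeatureMonitorJob resource that you want to retrieve.

View feature statistics for a feature

You can retrieve the feature statistics for a specific feature from the most recent feature monitoring jobs by retrieving the feature details and specifying the number of monitoring jobs from which to retrieve the statistics. The statistics and anomalies are displayed in the FeatureNameStatistics format.

The following samples show how to view the feature statistics for a specific feature from a specified number of recent feature monitoring jobs.

REST

To view the feature statistics for a specific feature in a Feature resource, send a GET request by using the features.get method and specify the number of monitoring jobs from which to retrieve the statistics.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where the feature monitoring job was run, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group containing the feature.
  • FEATURE_NAME: The name of the Feature resource for which you want to retrieve the feature statistics.
  • LATEST_STATS_COUNT: The number of latest monitoring jobs from which to retrieve the feature statistics.

HTTP method and URL: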

GET https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/features/FEATURE_NAME?feature_stats_and_anomaly_spec.latest_stats_count=LATEST_STATS_COUNT

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/features/FEATURE_NAME?feature_stats_and_anomaly_spec.latest_stats_count=LATEST_STATS_COUNT"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/features/FEATURE_NAME?feature_stats_and_anomaly_spec.latest_stats_count=LATEST_STATS_COUNT" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/features/FEATURE_NAME",
  "createTime": "2024-12-19T21:17:23.373559Z",
  "updateTime": "2024-12-19T21:17:23.373559Z",
  "etag": "sample_etag",
  "featureStatsAndAnomaly": [
    {
      "featureStats": {
        "name": "FEATURE_NAME",
        "type": "STRING",
        "stringStats": {
          "commonStats": {
            "numNonMissing": "4",
            "minNumValues": "1",
            "maxNumValues": "1",
            "avgNumValues": 1,
            "numValuesHistogram": {
              "buckets": [
                {
                  "lowValue": 1,
                  "highValue": 1,
                  "sampleCount": 0.4
                },
                {
                  "lowValue": 1,
                  "highValue": 1,
                  "sampleCount": 0.4
                },
                {
                  "lowValue": 1,
                  "highValue": 1,
                  "sampleCount": 0.4
                },
                {
                  "lowValue": 1,
                  "highValue": 1,
                  "sampleCount": 0.4
                }
              ],
              "type": "QUANTILES"
            },
            "totNumValues": "4"
          },
          "unique": "4",
          "topValues": [
            {
              "value": "feature_value_1",
              "frequency": 1
            },
            {
              "value": "feature_value_2",
              "frequency": 1
            },
            {
              "value": "feature_value_3",
              "frequency": 1
            },
            {
              "value": "feature_value_4",
              "frequency": 1
            }
          ],
          "avgLength": 4,
          "rankHistogram": {
            "buckets": [
              {
                "label": "label_1",
                "sampleCount": 1
              },
              {
                "lowRank": "1",
                "highRank": "1",
                "label": "label_2",
                "sampleCount": 1
              },
              {
                "lowRank": "2",
                "highRank": "2",
                "label": "label_3",
                "sampleCount": 1
              },
              {
                "lowRank": "3",
                "highRank": "3",
                "label": "label_4",
                "sampleCount": 1
              }
            ]
          }
        }
      },
      "driftDetectionThreshold": 0.1,
      "statsTime": "2024-12-19T22:00:02.734796Z",
      "featureMonitorJobId": "feature_monitor_job_id_1",
      "featureMonitorId": "feature_monitor_name_1"
    }
  ],
  "versionColumnName": "version_column_name"
}
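When the features.get request is issued from a script, the URL with its query parameter can be assembled programmatically instead of by hand. The following sketch uses only the standard library; `build_feature_stats_url` is a hypothetical helper that mirrors the URL pattern shown earlier in this section.

```python
from urllib.parse import quote, urlencode


def build_feature_stats_url(
    project_id: str,
    location_id: str,
    feature_group_name: str,
    feature_name: str,
    latest_stats_count: int,
) -> str:
    """Builds the features.get URL that requests stats from recent monitoring jobs."""
    # Percent-encode each path segment in case an ID contains reserved characters.
    path = (
        f"projects/{quote(project_id, safe='')}"
        f"/locations/{quote(location_id, safe='')}"
        f"/featureGroups/{quote(feature_group_name, safe='')}"
        f"/features/{quote(feature_name, safe='')}"
    )
    query = urlencode(
        {"feature_stats_and_anomaly_spec.latest_stats_count": int(latest_stats_count)}
    )
    return f"https://{location_id}-aiplatform.googleapis.com/v1beta1/{path}?{query}"


print(build_feature_stats_url("my-project", "us-central1", "my_group", "my_feature", 3))
```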

Python

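Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.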

from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")

# Fetch the feature along with stats from the latest monitoring jobs.
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")
feature_stats_and_anomalies = feature_group.get_feature("FEATURE_NAME", latest_stats_count=LATEST_STATS_COUNT)
print(feature_stats_and_anomalies)

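Replace the following:

  • LOCATION_ID: The region where the feature monitoring job was run, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group containing the FeatureMonitor resource.
  • FEATURE_NAME: The name of the feature for which you want to retrieve the feature statistics.
  • LATEST_STATS_COUNT: The number of latest monitoring jobs from which to retrieve the feature statistics.

Example use case: Use feature monitoring to detect feature drift

You can use feature monitoring to detect an anomaly in feature data called feature drift. A drift is a significant and unforeseen change to the feature data in BigQuery over time. Vertex AI Feature Store helps you identify feature drift by comparing the snapshot taken when the monitoring job runs with the data snapshot from the previous monitoring job execution.

For any feature included in the feature monitor, if the difference between the two snapshots exceeds the threshold specified in the drift_threshold parameter, Vertex AI Feature Store identifies a feature drift and returns the following information in the FeatureMonitorJob resource:

  • The driftDetected parameter is set to true.

  • The distribution deviation between the two snapshots. For numerical features, Vertex AI Feature Store calculates this value using Jensen-Shannon divergence. For categorical features, Vertex AI Feature Store calculates this value using L-infinity distance.

  • The threshold exceeded by the distribution deviation.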
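To build intuition for the two metrics named above, the following sketch computes their textbook definitions on small discrete distributions. It is an illustration only: the function names are hypothetical, the base-2 logarithm is an assumption, and the sketch does not claim to reproduce how ML.TFDV_VALIDATE bins numerical values or computes its result.

```python
import math
from typing import Dict


def l_infinity_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Largest absolute difference in probability across all categories."""
    keys = set(p) | set(q)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def jensen_shannon_divergence(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Jensen-Shannon divergence with base-2 logs, which bounds it to [0, 1]."""
    keys = set(p) | set(q)
    # The mixture distribution m is the midpoint of p and q.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a: Dict[str, float]) -> float:
        # Kullback-Leibler divergence of a from m; zero-probability terms contribute 0.
        return sum(
            a[k] * math.log2(a[k] / m[k]) for k in keys if a.get(k, 0.0) > 0
        )

    return 0.5 * kl(p) + 0.5 * kl(q)


previous = {"red": 0.5, "blue": 0.5}   # hypothetical baseline snapshot
current = {"red": 0.8, "blue": 0.2}    # hypothetical current snapshot
threshold = 0.3                        # default threshold stated on this page

deviation = l_infinity_distance(previous, current)
print(deviation, deviation > threshold)
```

Both functions expect each input to be a probability distribution (non-negative values that sum to 1), and at least one category must be present.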

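The following samples show how to retrieve a FeatureMonitorJob resource and verify whether drift has been detected.

REST

To retrieve a FeatureMonitorJob resource, send a GET request by using the featureMonitorJobs.get method.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where the feature monitoring job was run, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group containing the FeatureMonitor resource.
  • FEATURE_MONITOR_NAME: The name of the FeatureMonitor resource for which the feature monitoring job was run.
  • FEATURE_MONITOR_JOB_ID: The ID of the FeatureMonitorJob resource that you want to retrieve.

HTTP method and URL: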

GET https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID",
  "createTime": "2024-12-14T19:45:30.026522Z",
  "finalStatus": {},
  "jobSummary": {
    "featureStatsAndAnomalies": [
      {
        "featureId": "feature_id_1",
        "featureStats": {
          "name": "feature_name_1",
          "type": "STRING",
          "stringStats": {
            "commonStats": {
              "numNonMissing": "3",
              "minNumValues": "1",
              "maxNumValues": "1",
              "avgNumValues": 1,
              "numValuesHistogram": {
                "buckets": [
                  {
                    "lowValue": 1,
                    "highValue": 1,
                    "sampleCount": 0.9
                  },
                  {
                    "lowValue": 1,
                    "highValue": 1,
                    "sampleCount": 0.9
                  },
                  {
                    "lowValue": 1,
                    "highValue": 1,
                    "sampleCount": 0.9
                  }
                ],
                "type": "QUANTILES"
              },
              "totNumValues": "3"
            },
            "unique": "3",
            "topValues": [
              {
                "value": "sample_value_1",
                "frequency": 1
              },
              {
                "value": "sample_value_2",
                "frequency": 1
              },
              {
                "value": "sample_value_3",
                "frequency": 1
              }
            ],
            "avgLength": 3,
            "rankHistogram": {
              "buckets": [
                {
                  "label": "sample_label_1",
                  "sampleCount": 1
                },
                {
                  "lowRank": "1",
                  "highRank": "1",
                  "label": "sample_label_2",
                  "sampleCount": 1
                },
                {
                  "lowRank": "2",
                  "highRank": "3",
                  "label": "sample_label_3",
                  "sampleCount": 1
                }
              ]
            }
          }
        },
        "distributionDeviation": 0.1388880008888000,
        "driftDetectionThreshold": 0.1,
        "driftDetected": true,
        "statsTime": "2024-12-15T19:45:37.026522Z",
        "featureMonitorJobId": "FEATURE_MONITOR_JOB_ID",
        "featureMonitorId": "FEATURE_MONITOR_NAME"
      }
    ]
  },
  "driftBaseFeatureMonitorJobId": "2250003330000300000",
  "driftBaseSnapshotTime": "2024-12-12T18:18:18.077161Z",
  "description": "sample_feature_monitor_job_description",
  "featureSelectionConfig": {
    "featureConfigs": [
      {
        "featureId": "feature_name",
        "driftThreshold": 0.1
      }
    ]
  },
  "triggerType": "FEATURE_MONITOR_JOB_TRIGGER_ON_DEMAND"
}
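When the REST response is processed in a script, the drifted features can be filtered out of the parsed JSON. The following sketch assumes a response of the shape shown above; `drifted_features` is a hypothetical helper, and it treats a missing driftDetected field as "no drift", which is an assumption about how the API omits false values.

```python
from typing import List, Optional, Tuple


def drifted_features(job: dict) -> List[Tuple[str, Optional[float], Optional[float]]]:
    """Returns (featureId, distributionDeviation, driftDetectionThreshold) per drifted feature."""
    entries = job.get("jobSummary", {}).get("featureStatsAndAnomalies", [])
    return [
        (
            entry["featureId"],
            entry.get("distributionDeviation"),
            entry.get("driftDetectionThreshold"),
        )
        for entry in entries
        # Assumption: an absent driftDetected field means no drift was flagged.
        if entry.get("driftDetected", False)
    ]


job = {
    "jobSummary": {
        "featureStatsAndAnomalies": [
            {
                "featureId": "feature_id_1",
                "distributionDeviation": 0.1388880008888,
                "driftDetectionThreshold": 0.1,
                "driftDetected": True,
            },
            {"featureId": "feature_id_2", "driftDetectionThreshold": 0.2},
        ]
    }
}
for feature_id, deviation, threshold in drifted_features(job):
    print(feature_id, deviation, threshold)
```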

Python

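Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.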

from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")
feature_monitor = feature_group.get_feature_monitor("FEATURE_MONITOR_NAME")
feature_monitor_job = feature_monitor.get_feature_monitor_job("FEATURE_MONITOR_JOB_ID")

# Retrieve feature stats and anomalies
feature_stats_and_anomalies = feature_monitor_job.feature_stats_and_anomalies
print(feature_stats_and_anomalies)

# Check whether drifts are detected
for feature_stat in feature_monitor_job.feature_stats_and_anomalies:
    print("feature: ", feature_stat.feature_id)
    print("distribution deviation: ", feature_stat.distribution_deviation)
    print("drift detected: ", feature_stat.drift_detected)

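Replace the following:

  • LOCATION_ID: The region where the feature monitoring job was run, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the feature group containing the FeatureMonitor resource.
  • FEATURE_MONITOR_NAME: The name of the FeatureMonitor resource for which the feature monitoring job was run.
  • FEATURE_MONITOR_JOB_ID: The ID of the FeatureMonitorJob resource that you want to retrieve.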