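Vertex AI Feature Store lets you schedule and run feature monitoring jobs to monitor feature data, retrieve feature statistics, and detect feature drift. You can monitor feature data only after you register the feature data source in Feature Registry.
To monitor feature data, create a FeatureMonitor resource under a FeatureGroup resource. When you create the FeatureMonitor resource, you can configure a monitoring schedule that periodically runs monitoring jobs on the feature data. Alternatively, you can run a feature monitoring job manually to monitor the feature data outside the monitoring schedule.
For each monitoring job that runs, Vertex AI Feature Store generates a FeatureMonitorJob resource. You can retrieve this resource to view the feature statistics and information about any drift detected in the feature data.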
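Before you begin
Before you monitor features by using Vertex AI Feature Store, complete the prerequisites listed in this section.
Register the feature data source
Register your feature data source from BigQuery in Feature Registry by creating feature groups and features. The FeatureMonitor resources used to retrieve and monitor feature statistics are associated with feature groups.
Authenticate to Vertex AI
Authenticate to Vertex AI, if you haven't already done so.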
Select the tab for how you plan to use the samples on this page:
Python
To use the Python samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.
- Install the Google Cloud CLI.
- If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- To initialize the gcloud CLI, run the following command:
  gcloud init
- If you're using a local shell, then create local authentication credentials for your user account:
  gcloud auth application-default login
  You don't need to do this if you're using Cloud Shell.
  If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
For more information, see Set up authentication for a local development environment.
REST
To use the REST API samples on this page in a local development environment, use the credentials that you provide to the gcloud CLI.
After installing the Google Cloud CLI, initialize it by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
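Create a feature monitor with a monitoring schedule
To retrieve and monitor feature statistics, create a FeatureMonitor resource that specifies the schedule for periodically running feature monitoring jobs and retrieving feature statistics for the features registered in the feature group.
Use the following samples to create a FeatureMonitor resource. To set up multiple schedules for the same set of features, you must create multiple FeatureMonitor resources.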
REST
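To create a FeatureMonitor resource and schedule feature monitoring jobs, send a POST request by using the featureMonitors.create method.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where you want to create the feature monitor, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group for which you're setting up feature monitoring.
- FEATURE_MONITOR_NAME: Name for the new feature monitor that you want to create.
- FEATURE_ID_1 and FEATURE_ID_2: IDs of the features that you want to monitor.
- DRIFT_THRESHOLD_1 and DRIFT_THRESHOLD_2: Drift threshold for each feature included in the feature monitor. The drift threshold is used to detect anomalies, such as feature drift. Enter a value in the range [0, 1). If you don't enter a value, the threshold defaults to 0.3.
  Vertex AI Feature Store compares the snapshots from consecutive feature monitoring job runs and calculates drift by using the ML.TFDV_VALIDATE function in BigQuery. To classify anomalies, the L-infinity distance is used for categorical features and the Jensen-Shannon divergence is used for numerical features.
- CRON: Cron schedule expression that represents how often the feature monitoring job runs. For more information, see cron.
HTTP method and URL: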
POST https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors?feature_monitor_id=FEATURE_MONITOR_NAME
Request JSON body:
{
  "feature_selection_config": {
    "feature_configs": [
      {"feature_id": "FEATURE_ID_1", "drift_threshold": "DRIFT_THRESHOLD_1"},
      {"feature_id": "FEATURE_ID_2", "drift_threshold": "DRIFT_THRESHOLD_2"}
    ]
  },
  "schedule_config": {
    "cron": "CRON"
  }
}
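The URL and request body for featureMonitors.create can also be assembled programmatically. The following is a minimal sketch, not part of the Vertex AI SDK: the helper names feature_monitor_create_url and feature_monitor_request_body are hypothetical, the range check only mirrors the [0, 1) range described in the placeholder list, and leaving out drift_threshold relies on the documented default of 0.3 being applied by the service.

```python
import json
from typing import Dict, List, Optional, Tuple


def feature_monitor_create_url(
    project: str, location: str, feature_group: str, feature_monitor_id: str
) -> str:
    """Builds the featureMonitors.create URL (hypothetical helper)."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/"
        f"featureGroups/{feature_group}/featureMonitors"
        f"?feature_monitor_id={feature_monitor_id}"
    )


def feature_monitor_request_body(
    features: List[Tuple[str, Optional[float]]], cron: str
) -> Dict:
    """Builds the request body; a threshold of None is left out of the body."""
    feature_configs = []
    for feature_id, threshold in features:
        config = {"feature_id": feature_id}
        if threshold is not None:
            # Mirror the documented range for drift thresholds.
            if not 0 <= threshold < 1:
                raise ValueError(
                    f"drift threshold for {feature_id} must be in [0, 1)"
                )
            config["drift_threshold"] = threshold
        feature_configs.append(config)
    return {
        "feature_selection_config": {"feature_configs": feature_configs},
        "schedule_config": {"cron": cron},
    }


# Illustrative feature IDs and an hourly schedule.
body = feature_monitor_request_body(
    [("feature_a", 0.2), ("feature_b", None)], "0 * * * *"
)
print(json.dumps(body, indent=2))
```

Writing the printed JSON to request.json gives a file that can be used with the curl and PowerShell commands on this page.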
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors?feature_monitor_id=FEATURE_MONITOR_NAME"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors?feature_monitor_id=FEATURE_MONITOR_NAME" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.CreateFeatureMonitorOperationMetadata", "genericMetadata": { "createTime": "2024-12-15T19:35:03.975958Z", "updateTime": "2024-12-15T19:35:03.975958Z" } } }
Python
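Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.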
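from typing import List, Tuple

from google.cloud import aiplatform
from vertexai.resources.preview import feature_store


def create_feature_monitor_sample(
    project: str,
    location: str,
    existing_feature_group_id: str,
    feature_monitor_id: str,
    feature_selection_configs: List[Tuple[str, float]],
    schedule_config: str,  # Cron string. For example, "0 * * * *" indicates hourly execution.
):
    aiplatform.init(project=project, location=location)
    feature_group = feature_store.FeatureGroup(existing_feature_group_id)
    # Each tuple pairs a feature ID with its drift threshold.
    feature_monitor = feature_group.create_feature_monitor(
        name=feature_monitor_id,
        feature_selection_configs=feature_selection_configs,
        schedule_config=schedule_config,
    )
    return feature_monitor


create_feature_monitor_sample(
    project="PROJECT_ID",
    location="LOCATION_ID",
    existing_feature_group_id="FEATUREGROUP_NAME",
    feature_monitor_id="FEATURE_MONITOR_NAME",
    feature_selection_configs=[("FEATURE_ID_1", DRIFT_THRESHOLD_1), ("FEATURE_ID_2", DRIFT_THRESHOLD_2)],
    schedule_config="CRON",
)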
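Replace the following:
- LOCATION_ID: Region where you want to create the feature monitor, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group for which you're setting up feature monitoring.
- FEATURE_MONITOR_NAME: Name for the new feature monitor that you want to create.
- FEATURE_ID_1 and FEATURE_ID_2: IDs of the features that you want to monitor.
- DRIFT_THRESHOLD_1 and DRIFT_THRESHOLD_2: Drift threshold for each feature included in the feature monitor. The drift threshold is used to detect feature drift. Enter a value between 0 and 1. If you don't enter a value, the threshold defaults to 0.3.
  Vertex AI Feature Store compares the data snapshot from the current feature monitoring job with the data snapshot from the previous feature monitoring job. To calculate the distribution deviation, Vertex AI Feature Store uses the ML.TFDV_VALIDATE function in BigQuery.
  The metric used to compare statistics is the L-infinity distance for categorical features and the Jensen-Shannon divergence for numerical features.
- CRON: Cron schedule expression that represents how often the feature monitoring job runs. For more information, see cron.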
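To make the two drift metrics concrete, the following sketch computes the L-infinity distance and the Jensen-Shannon divergence between two normalized distributions. It illustrates the metrics only and is not the implementation used by ML.TFDV_VALIDATE. The function names, the dictionary representation of a distribution, and the choice of base-2 logarithms (which keeps the divergence in the range 0 to 1) are assumptions made for this example.

```python
import math
from typing import Dict


def l_infinity_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Largest absolute difference in probability over all categories."""
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def jensen_shannon_divergence(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Jensen-Shannon divergence using base-2 logarithms."""
    buckets = set(p) | set(q)
    # Mixture distribution: the bucket-wise average of p and q.
    mixture = {b: 0.5 * (p.get(b, 0.0) + q.get(b, 0.0)) for b in buckets}

    def kl_to_mixture(a: Dict[str, float]) -> float:
        # Kullback-Leibler divergence from a to the mixture; empty buckets add 0.
        return sum(
            a[b] * math.log2(a[b] / mixture[b]) for b in a if a[b] > 0
        )

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical category frequencies from two consecutive snapshots.
previous = {"red": 0.5, "blue": 0.5}
current = {"red": 0.8, "blue": 0.2}
threshold = 0.2

distance = l_infinity_distance(previous, current)
print("L-infinity distance:", distance)
print("exceeds threshold:", distance > threshold)
print("Jensen-Shannon divergence:", jensen_shannon_divergence(previous, current))
```

For a numerical feature, the dictionary keys would stand for histogram buckets rather than category values.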
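Manually run a feature monitoring job
You can skip the wait between consecutive scheduled feature monitoring jobs and run a feature monitoring job manually. This is useful if you want to retrieve monitoring information and detect anomalies in the feature data immediately, instead of waiting for the next scheduled monitoring job to run.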
REST
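To run a feature monitoring job manually, create a FeatureMonitorJob resource by sending a POST request by using the featureMonitorJobs.create method.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where you want to run the feature monitoring job, such as us-central1.
- FEATUREGROUP_NAME: Name of the feature group that contains the FeatureMonitor resource.
- PROJECT_ID: Your project ID.
- FEATURE_MONITOR_NAME: Name of the FeatureMonitor resource for which you want to run the feature monitoring job.
HTTP method and URL:
POST https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs
To send your request, choose one of these options:
curl
Execute the following command: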
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d "" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID" }
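The name field in this response carries the ID of the new job, which you need later to retrieve the FeatureMonitorJob resource. As a minimal sketch, assuming the resource name always follows the path layout shown in the response, a hypothetical helper such as parse_feature_monitor_job_name could split it into its components:

```python
from typing import Dict


def parse_feature_monitor_job_name(name: str) -> Dict[str, str]:
    """Splits a FeatureMonitorJob resource name into its path components."""
    parts = name.split("/")
    # Resource names alternate collection names and IDs:
    # projects/P/locations/L/featureGroups/G/featureMonitors/M/featureMonitorJobs/J
    expected_collections = [
        "projects",
        "locations",
        "featureGroups",
        "featureMonitors",
        "featureMonitorJobs",
    ]
    if len(parts) != 10 or parts[0::2] != expected_collections:
        raise ValueError(f"unexpected resource name: {name}")
    keys = [
        "project",
        "location",
        "feature_group",
        "feature_monitor",
        "feature_monitor_job",
    ]
    return dict(zip(keys, parts[1::2]))


# Illustrative resource name with made-up IDs.
parsed = parse_feature_monitor_job_name(
    "projects/123/locations/us-central1/featureGroups/my_group"
    "/featureMonitors/my_monitor/featureMonitorJobs/456"
)
print(parsed["feature_monitor_job"])
```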
Python
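Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.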
from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")
feature_monitor = feature_group.get_feature_monitor("FEATURE_MONITOR_NAME")
# Trigger a monitoring job immediately, outside the configured schedule.
feature_monitor_job = feature_monitor.create_feature_monitor_job()
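Replace the following:
- LOCATION_ID: Region where you want to run the feature monitoring job, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group that contains the FeatureMonitor resource.
- FEATURE_MONITOR_NAME: Name of the FeatureMonitor resource for which you want to run the feature monitoring job.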
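Retrieve feature statistics from a monitoring job
To retrieve the feature statistics for all the features in a feature monitoring job, retrieve the FeatureMonitorJob resource by using the feature monitoring job ID generated when the job ran. You can also retrieve the feature statistics for a specific feature from the most recent monitoring jobs.
List feature monitoring jobs
The following samples show how to retrieve a list of all the FeatureMonitorJob resources created for a given FeatureMonitor resource.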
REST
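To retrieve a list of FeatureMonitorJob resources for a specific FeatureMonitor resource, send a GET request by using the featureMonitorJobs.list method.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the Feature resource is located, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group that contains the FeatureMonitor resource.
- FEATURE_MONITOR_NAME: Name of the FeatureMonitor resource for which you want to list the feature monitoring jobs.
HTTP method and URL: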
GET https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "featureMonitorJobs": [ { "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID_1", "createTime": "2024-12-18T19:18:18.077161Z", "finalStatus": {}, "featureSelectionConfig": { "featureConfigs": [ { "featureId": "feature_name_1", "driftThreshold": 0.2 }, { "featureId": "feature_name_2", "driftThreshold": 0.2 } ] } }, { "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID_2", "createTime": "2024-12-19T19:18:30.859921Z", "finalStatus": {}, "featureSelectionConfig": { "featureConfigs": [ { "featureId": "feature_name_1", "driftThreshold": 0.2 }, { "featureId": "feature_name_2", "driftThreshold": 0.2 } ] } } ] }
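After listing the jobs, a common next step is to pick the most recent one. The following is a minimal sketch that works on the parsed JSON response as a Python dictionary. The helper name latest_feature_monitor_job is hypothetical, and the timestamp parsing assumes that createTime uses microsecond precision with a trailing Z, as in the sample response.

```python
from datetime import datetime
from typing import Dict, Optional


def _parse_time(timestamp: str) -> datetime:
    # Replace the trailing "Z" so that datetime.fromisoformat accepts the
    # value on Python versions earlier than 3.11.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def latest_feature_monitor_job(response: Dict) -> Optional[Dict]:
    """Returns the job with the newest createTime, or None if the list is empty."""
    jobs = response.get("featureMonitorJobs", [])
    if not jobs:
        return None
    return max(jobs, key=lambda job: _parse_time(job["createTime"]))


# Trimmed-down response with illustrative job names.
sample_response = {
    "featureMonitorJobs": [
        {"name": "jobs/1", "createTime": "2024-12-18T19:18:18.077161Z"},
        {"name": "jobs/2", "createTime": "2024-12-19T19:18:30.859921Z"},
    ]
}
latest = latest_feature_monitor_job(sample_response)
print(latest["name"])
```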
Python
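Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.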
from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")
feature_monitor = feature_group.get_feature_monitor("FEATURE_MONITOR_NAME")
# List every monitoring job created for this feature monitor.
feature_monitor_jobs = feature_monitor.list_feature_monitor_jobs()
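Replace the following:
- LOCATION_ID: Region where the Feature resource is located, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group that contains the FeatureMonitor resource.
- FEATURE_MONITOR_NAME: Name of the FeatureMonitor resource for which you want to list the feature monitoring jobs.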
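View feature statistics from a monitoring job
The following samples show how to view the feature statistics for all the features in a feature monitoring job. For each feature, the statistics and anomalies are displayed in the FeatureNameStatistics format.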
REST
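To view the feature statistics from a monitoring job by retrieving a FeatureMonitorJob resource, send a GET request by using the featureMonitorJobs.get method.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the feature monitoring job was run, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group that contains the FeatureMonitor resource.
- FEATURE_MONITOR_NAME: Name of the FeatureMonitor resource for which the feature monitoring job was run.
- FEATURE_MONITOR_JOB_ID: ID of the FeatureMonitorJob resource that you want to retrieve.
HTTP method and URL: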
GET https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID", "createTime": "2024-12-19T19:18:18.077161Z", "finalStatus": {}, "jobSummary": { "featureStatsAndAnomalies": [ { "featureId": "feature_id_1", "featureStats": { "name": "feature_name_1", "type": "STRING", "stringStats": { "commonStats": { "numNonMissing": "6", "minNumValues": "1", "maxNumValues": "1", "avgNumValues": 1, "numValuesHistogram": { "buckets": [ { "lowValue": 1, "highValue": 1, "sampleCount": 0.6 }, { "lowValue": 1, "highValue": 1, "sampleCount": 0.6 } ], "type": "QUANTILES" }, "totNumValues": "6" }, "unique": "2", "topValues": [ { "value": "59", "frequency": 2 }, { "value": "19", "frequency": 1 } ], "avgLength": 2, "rankHistogram": { "buckets": [ { "label": "59", "sampleCount": 2 }, { "lowRank": "1", "highRank": "1", "label": "19", "sampleCount": 1 } ] } } }, "statsTime": "2024-12-19T19:18:18.077161Z", "featureMonitorJobId": "FEATURE_MONITOR_JOB_ID", "featureMonitorId": "FEATURE_MONITOR_NAME" }, { "featureId": "feature_id_2", "featureStats": { "name": "feature_name_1", "type": "STRING", "stringStats": { "commonStats": { "numNonMissing": "6", "minNumValues": "1", "maxNumValues": "1", "avgNumValues": 1, "numValuesHistogram": { "buckets": [ { "lowValue": 1, "highValue": 1, "sampleCount": 0.6 }, { "lowValue": 1, "highValue": 1, "sampleCount": 0.6 } ], "type": "QUANTILES" }, "totNumValues": "6" }, "unique": "2", "topValues": [ { "value": "59", "frequency": 2 }, { "value": "19", "frequency": 1 } ], "avgLength": 2, "rankHistogram": { "buckets": [ { "label": "59", "sampleCount": 2 }, { "lowRank": "1", "highRank": "1", "label": "19", "sampleCount": 1 } ] } } }, "statsTime": "2024-12-19T19:18:18.077161Z", "featureMonitorJobId": "FEATURE_MONITOR_JOB_ID", "featureMonitorId": "FEATURE_MONITOR_NAME" } ] }, "driftBaseFeatureMonitorJobId": "2250003330000300000", "driftBaseSnapshotTime": 
"2024-12-12T16:00:01.211686Z", "featureSelectionConfig": { "featureConfigs": [ { "featureId": "feature_id_1", "driftThreshold": 0.2 }, { "featureId": "feature_id_2", "driftThreshold": 0.2 } ] }, "triggerType": "FEATURE_MONITOR_JOB_TRIGGER_ON_DEMAND" }
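The response nests the per-feature statistics several levels deep. As a rough sketch of how to read it, the following hypothetical helper, top_values_by_feature, walks the parsed JSON (a Python dictionary) and collects the most frequent values for each string feature. The field names are taken from the sample response; features without stringStats yield an empty list.

```python
from typing import Dict, List, Tuple


def top_values_by_feature(job: Dict) -> Dict[str, List[Tuple[str, float]]]:
    """Maps each feature ID to its (value, frequency) pairs from topValues."""
    summary = {}
    entries = job.get("jobSummary", {}).get("featureStatsAndAnomalies", [])
    for entry in entries:
        string_stats = entry.get("featureStats", {}).get("stringStats", {})
        summary[entry["featureId"]] = [
            (item["value"], item["frequency"])
            for item in string_stats.get("topValues", [])
        ]
    return summary


# Trimmed-down job response containing only the fields read by the helper.
sample_job = {
    "jobSummary": {
        "featureStatsAndAnomalies": [
            {
                "featureId": "feature_id_1",
                "featureStats": {
                    "type": "STRING",
                    "stringStats": {
                        "topValues": [
                            {"value": "59", "frequency": 2},
                            {"value": "19", "frequency": 1},
                        ]
                    },
                },
            }
        ]
    }
}
top_values = top_values_by_feature(sample_job)
print(top_values)
```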
Python
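Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.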
from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")
feature_monitor = feature_group.get_feature_monitor("FEATURE_MONITOR_NAME")
feature_monitor_job = feature_monitor.get_feature_monitor_job("FEATURE_MONITOR_JOB_ID")

# Retrieve feature stats and anomalies
feature_stats_and_anomalies = feature_monitor_job.feature_stats_and_anomalies
print(feature_stats_and_anomalies)
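Replace the following:
- LOCATION_ID: Region where the feature monitoring job was run, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group that contains the FeatureMonitor resource.
- FEATURE_MONITOR_NAME: Name of the FeatureMonitor resource for which the feature monitoring job was run.
- FEATURE_MONITOR_JOB_ID: ID of the FeatureMonitorJob resource that you want to retrieve.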
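View feature statistics for a feature
You can retrieve the feature statistics for a specific feature from the most recent feature monitoring jobs by retrieving the feature details and specifying the number of monitoring jobs from which to retrieve the statistics. The statistics and anomalies are displayed in the FeatureNameStatistics format.
The following samples show how to view the feature statistics for a specific feature from a specified number of recent feature monitoring jobs.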
REST
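To view the feature statistics for a specific feature in a Feature resource, send a GET request by using the features.get method and specify the number of monitoring jobs from which to retrieve the statistics.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the feature monitoring job was run, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group that contains the feature.
- FEATURE_NAME: Name of the Feature resource for which you want to retrieve the feature statistics.
- LATEST_STATS_COUNT: Number of most recent monitoring jobs from which to retrieve the feature statistics.
HTTP method and URL: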
GET https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/features/FEATURE_NAME?feature_stats_and_anomaly_spec.latest_stats_count=LATEST_STATS_COUNT
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/features/FEATURE_NAME?feature_stats_and_anomaly_spec.latest_stats_count=LATEST_STATS_COUNT"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/features/FEATURE_NAME?feature_stats_and_anomaly_spec.latest_stats_count=LATEST_STATS_COUNT" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/features/FEATURE_NAME", "createTime": "2024-12-19T21:17:23.373559Z", "updateTime": "2024-12-19T21:17:23.373559Z", "etag": "sample_etag", "featureStatsAndAnomaly": [ { "featureStats": { "name": "FEATURE_NAME", "type": "STRING", "stringStats": { "commonStats": { "numNonMissing": "4", "minNumValues": "1", "maxNumValues": "1", "avgNumValues": 1, "numValuesHistogram": { "buckets": [ { "lowValue": 1, "highValue": 1, "sampleCount": 0.4 }, { "lowValue": 1, "highValue": 1, "sampleCount": 0.4 }, { "lowValue": 1, "highValue": 1, "sampleCount": 0.4 }, { "lowValue": 1, "highValue": 1, "sampleCount": 0.4 } ], "type": "QUANTILES" }, "totNumValues": "4" }, "unique": "4", "topValues": [ { "value": "feature_value_1", "frequency": 1 }, { "value": "feature_value_2", "frequency": 1 }, { "value": "feature_value_3", "frequency": 1 }, { "value": "feature_value_4", "frequency": 1 } ], "avgLength": 4, "rankHistogram": { "buckets": [ { "label": "label_1", "sampleCount": 1 }, { "lowRank": "1", "highRank": "1", "label": "label_2", "sampleCount": 1 }, { "lowRank": "2", "highRank": "2", "label": "label_3", "sampleCount": 1 }, { "lowRank": "3", "highRank": "3", "label": "label_4", "sampleCount": 1 } ] } } }, "driftDetectionThreshold": 0.1, "statsTime": "2024-12-19T22:00:02.734796Z", "featureMonitorJobId": "feature_monitor_job_id_1", "featureMonitorId": "feature_monitor_name_1" } ], "versionColumnName": "version_column_name" }
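Each entry in featureStatsAndAnomaly corresponds to one monitoring job, so the list can be read as a short history of the feature. Note that in the sample response, integer counts such as unique arrive as JSON strings and need converting before arithmetic. The sketch below is illustrative only: the helper name unique_counts_by_run is hypothetical, and it handles string features only, skipping entries without stringStats.

```python
from typing import Dict, List, Tuple


def unique_counts_by_run(feature: Dict) -> List[Tuple[str, str, int]]:
    """Returns (statsTime, featureMonitorJobId, unique value count) per job."""
    rows = []
    for entry in feature.get("featureStatsAndAnomaly", []):
        string_stats = entry.get("featureStats", {}).get("stringStats")
        if string_stats is None:
            continue  # Not a string feature; nothing to report in this sketch.
        rows.append(
            (
                entry["statsTime"],
                entry["featureMonitorJobId"],
                int(string_stats["unique"]),  # Sent as a string in the JSON.
            )
        )
    return rows


# Trimmed-down Feature response containing only the fields read by the helper.
sample_feature = {
    "featureStatsAndAnomaly": [
        {
            "featureStats": {"type": "STRING", "stringStats": {"unique": "4"}},
            "statsTime": "2024-12-19T22:00:02.734796Z",
            "featureMonitorJobId": "feature_monitor_job_id_1",
        }
    ]
}
history = unique_counts_by_run(sample_feature)
print(history)
```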
Python
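Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.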
from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")
# Request the feature together with statistics from its most recent monitoring jobs.
feature_stats_and_anomalies = feature_group.get_feature("FEATURE_NAME", latest_stats_count=LATEST_STATS_COUNT)
print(feature_stats_and_anomalies)
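Replace the following:
- LOCATION_ID: Region where the feature monitoring job was run, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group that contains the FeatureMonitor resource.
- FEATURE_NAME: Name of the feature for which you want to retrieve the feature statistics.
- LATEST_STATS_COUNT: Number of most recent monitoring jobs from which to retrieve the feature statistics.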
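Example use case: Use feature monitoring to detect feature drift
You can use feature monitoring to detect an anomaly in feature data called feature drift. Drift is a significant and unforeseen change in the feature data in BigQuery over time. Vertex AI Feature Store helps you identify feature drift by comparing the snapshot taken when the monitoring job runs with the data snapshot from the previous monitoring job run.
If, for any feature included in the feature monitor, the difference between the two snapshots exceeds the threshold specified in the drift_threshold parameter, Vertex AI Feature Store identifies a feature drift and returns the following information in the FeatureMonitorJob resource:
- The driftDetected parameter is set to true.
- The distribution deviation between the two snapshots. For numerical features, Vertex AI Feature Store calculates this value by using the Jensen-Shannon divergence. For categorical features, it calculates this value by using the L-infinity distance.
- The threshold that the distribution deviation exceeded.
The following samples show how to retrieve a FeatureMonitorJob resource and verify whether drift is detected.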
REST
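To retrieve a FeatureMonitorJob resource, send a GET request by using the featureMonitorJobs.get method.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the feature monitoring job was run, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group that contains the FeatureMonitor resource.
- FEATURE_MONITOR_NAME: Name of the FeatureMonitor resource for which the feature monitoring job was run.
- FEATURE_MONITOR_JOB_ID: ID of the FeatureMonitorJob resource that you want to retrieve.
HTTP method and URL: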
GET https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/featureMonitors/FEATURE_MONITOR_NAME/featureMonitorJobs/FEATURE_MONITOR_JOB_ID", "createTime": "2024-12-14T19:45:30.026522Z", "finalStatus": {}, "jobSummary": { "featureStatsAndAnomalies": [ { "featureId": "feature_id_1", "featureStats": { "name": "feature_name_1", "type": "STRING", "stringStats": { "commonStats": { "numNonMissing": "3", "minNumValues": "1", "maxNumValues": "1", "avgNumValues": 1, "numValuesHistogram": { "buckets": [ { "lowValue": 1, "highValue": 1, "sampleCount": 0.9 }, { "lowValue": 1, "highValue": 1, "sampleCount": 0.9 }, { "lowValue": 1, "highValue": 1, "sampleCount": 0.9 } ], "type": "QUANTILES" }, "totNumValues": "3" }, "unique": "3", "topValues": [ { "value": "sample_value_1", "frequency": 1 }, { "value": "sample_value_2", "frequency": 1 }, { "value": "sample_value_3", "frequency": 1 } ], "avgLength": 3, "rankHistogram": { "buckets": [ { "label": "sample_label_1", "sampleCount": 1 }, { "lowRank": "1", "highRank": "1", "label": "sample_label_2", "sampleCount": 1 }, { "lowRank": "2", "highRank": "3", "label": "sample_label_3", "sampleCount": 1 } ] } } }, "distributionDeviation": 0.1388880008888000, "driftDetectionThreshold": 0.1, "driftDetected": true, "statsTime": "2024-12-15T19:45:37.026522Z", "featureMonitorJobId": "FEATURE_MONITOR_JOB_ID", "featureMonitorId": "FEATURE_MONITOR_NAME" } ] }, "driftBaseFeatureMonitorJobId": "2250003330000300000", "driftBaseSnapshotTime": "2024-12-12T18:18:18.077161Z", "description": "sample_feature_monitor_job_description", "featureSelectionConfig": { "featureConfigs": [ { "featureId": "feature_name", "driftThreshold": 0.1 } ] }, "triggerType": "FEATURE_MONITOR_JOB_TRIGGER_ON_DEMAND" }
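To act on a response like this one, you can filter the per-feature entries on driftDetected. The following is a minimal sketch over the parsed JSON as a Python dictionary. The helper name drifted_features is hypothetical, and treating a missing driftDetected field as false is an assumption based on the other sample responses on this page, where the field is absent.

```python
from typing import Dict, List, Tuple


def drifted_features(job: Dict) -> List[Tuple[str, float, float]]:
    """Returns (featureId, distributionDeviation, driftDetectionThreshold) for drifted features."""
    drifted = []
    entries = job.get("jobSummary", {}).get("featureStatsAndAnomalies", [])
    for entry in entries:
        # Assumption: an absent driftDetected field means no drift was detected.
        if entry.get("driftDetected", False):
            drifted.append(
                (
                    entry["featureId"],
                    entry["distributionDeviation"],
                    entry["driftDetectionThreshold"],
                )
            )
    return drifted


# Trimmed-down job response: one drifted feature and one stable feature.
sample_job = {
    "jobSummary": {
        "featureStatsAndAnomalies": [
            {
                "featureId": "feature_id_1",
                "distributionDeviation": 0.1388880008888,
                "driftDetectionThreshold": 0.1,
                "driftDetected": True,
            },
            {"featureId": "feature_id_2", "driftDetectionThreshold": 0.1},
        ]
    }
}
for feature_id, deviation, threshold in drifted_features(sample_job):
    print(f"{feature_id}: deviation {deviation} exceeds threshold {threshold}")
```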
Python
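Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.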
from google.cloud import aiplatform
from vertexai.resources.preview import feature_store

aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")
feature_group = feature_store.FeatureGroup("FEATUREGROUP_NAME")
feature_monitor = feature_group.get_feature_monitor("FEATURE_MONITOR_NAME")
feature_monitor_job = feature_monitor.get_feature_monitor_job("FEATURE_MONITOR_JOB_ID")

# Retrieve feature stats and anomalies
feature_stats_and_anomalies = feature_monitor_job.feature_stats_and_anomalies
print(feature_stats_and_anomalies)

# Check whether drifts are detected
for feature_stats in feature_stats_and_anomalies:
    print("feature: ", feature_stats.feature_id)
    print("distribution deviation: ", feature_stats.distribution_deviation)
    print("drift detected: ", feature_stats.drift_detected)
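Replace the following:
- LOCATION_ID: Region where the feature monitoring job was run, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATUREGROUP_NAME: Name of the feature group that contains the FeatureMonitor resource.
- FEATURE_MONITOR_NAME: Name of the FeatureMonitor resource for which the feature monitoring job was run.
- FEATURE_MONITOR_JOB_ID: ID of the FeatureMonitorJob resource that you want to retrieve.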
Last updated 2025-07-09 UTC.