Prepare image training data for object detection

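This page describes how to prepare image training data for use in a Vertex AI dataset to train an image object detection model.

The object detection section below covers the data requirements, the input/output schema file, and the data import file formats (JSON Lines and CSV) that the schema defines.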

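Permissions

To use images stored in a Cloud Storage bucket, you must grant the Storage Object Viewer role on that bucket to the Vertex AI service agent. A service agent is a Google-managed service account that Vertex AI uses to access data on your behalf. For more details, see "Service agents".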

Object detection

Data requirements

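	General image requirements
Supported file types	JPEG PNG GIF BMP ICO
Types of images	AutoML models are optimized for photographs of objects in the real world.
Training image file size (MB)	Maximum 30 MB.
Prediction image file* size (MB)	Maximum 1.5 MB.
Image size (pixels)	Suggested maximum of 1024 pixels by 1024 pixels. For images much larger than 1024 x 1024 pixels, some image quality might be lost during Vertex AI's image normalization process.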
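
The file-level limits in the table above can be checked locally before upload. The following is a minimal sketch, not part of Vertex AI: the function name `check_image_file`, the extension set, and the choice of 1 MB = 2**20 bytes are all assumptions made for illustration. The extension set mirrors the table only; the schema file later on this page lists additional MIME types.

```python
import os

# Extensions assumed to correspond to the file types listed in the table.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".ico"}
MB = 1024 * 1024  # assumption: 1 MB is counted as 2**20 bytes


def check_image_file(filename, size_bytes, for_prediction=False):
    """Return a list of problems found; an empty list means the file passes."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    # Training images may be up to 30 MB, prediction images up to 1.5 MB.
    limit_mb = 1.5 if for_prediction else 30
    if size_bytes > limit_mb * MB:
        problems.append(f"file exceeds {limit_mb} MB")
    return problems
```

A 2 MB file therefore passes as a training image but is flagged when `for_prediction=True`.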

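	Label and bounding box requirements
The following requirements apply to datasets used to train AutoML models.
Label instances for training	Minimum 10 annotations (instances).
Annotation requirements	Each label must have at least 10 images, and each image must have at least one annotation (bounding box and label). For model training, however, about 1,000 annotations per label are recommended. In general, the more images per label you have, the better the model performs.
Label ratio (most common label to least common label)	The model trains best when the most common label applies to at most 100 times more images than the least common label. To improve model performance, we recommend removing very low-frequency labels.
Bounding box edge length	At least 0.01 times the length of the corresponding side of the image. For example, a 1000 x 900 pixel image requires bounding boxes of at least 10 x 9 pixels. Minimum bounding box size: 8 x 8 pixels. Note: The final pixel size of a bounding box is adjusted during preprocessing. For details, see "Internal image preprocessing" below.
The following requirements apply to datasets used to train AutoML or custom-trained models.
Bounding boxes per distinct image	Maximum 500.
Bounding boxes returned from a prediction request	100 by default, 500 maximum.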
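
The two edge-length rules above (relative and absolute) can be combined into one check. This is a minimal sketch: the function name `box_meets_minimum` is hypothetical, and it assumes that each box edge is compared against the matching image side, as the 1000 x 900 example suggests.

```python
def box_meets_minimum(x_min, y_min, x_max, y_max, image_width, image_height):
    """Return True if a normalized box satisfies both documented minimums."""
    # Convert the normalized extents to pixels.
    box_w = (x_max - x_min) * image_width
    box_h = (y_max - y_min) * image_height
    # Rule 1: each edge is at least 0.01 times the matching image side.
    relative_ok = box_w >= 0.01 * image_width and box_h >= 0.01 * image_height
    # Rule 2: absolute minimum of 8 x 8 pixels.
    absolute_ok = box_w >= 8 and box_h >= 8
    return relative_ok and absolute_ok
```

On small images the 8 x 8 pixel floor is the stricter rule; on images wider or taller than 800 pixels the 0.01 ratio takes over for that side.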

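	Training data and dataset requirements
The following requirements apply to datasets used to train AutoML models.
Training image characteristics	The training data should be as close as possible to the data on which predictions will be made. For example, if your use case involves blurry, low-resolution images (such as images from a security camera), your training data should also consist of blurry, low-resolution images. In general, also consider providing training images with varied angles, resolutions, and backgrounds. Vertex AI models generally can't predict labels that humans can't assign. So if a human can't be trained to assign a label after looking at an image for 1-2 seconds, the model likely can't be trained to do it either.
Internal image preprocessing	After images are imported, Vertex AI preprocesses the data. The preprocessed images are the data actually used to train the model. Preprocessing (resizing) applies only when the image's smaller side is greater than 1024 pixels. In that case, the smaller side is scaled down to 1024 pixels, and the larger side and all specified bounding boxes are scaled down by the same ratio. Consequently, any scaled-down annotation (bounding box and label) smaller than 8 x 8 pixels is removed. Images whose smaller side is less than or equal to 1024 pixels are not resized.
The following requirements apply to datasets used to train AutoML or custom-trained models.
Images in each dataset	Maximum 150,000
Total annotated bounding boxes in each dataset	Maximum 1,000,000
Number of labels in each dataset	Minimum 1, maximum 1,000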
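
The internal preprocessing rule described in the table above can be worked through numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes an annotation is dropped as soon as either scaled edge falls under 8 pixels, which the source does not spell out.

```python
def preprocess_scale(width, height, max_smaller_side=1024):
    """Return the scale factor applied to an image and its bounding boxes."""
    smaller = min(width, height)
    if smaller <= max_smaller_side:
        return 1.0  # no resizing when the smaller side is already <= 1024
    return max_smaller_side / smaller


def box_survives(box_w_px, box_h_px, width, height):
    """Return True if a box (in original pixels) is still >= 8 x 8 after scaling."""
    scale = preprocess_scale(width, height)
    return box_w_px * scale >= 8 and box_h_px * scale >= 8
```

For a 4096 x 2048 image the factor is 0.5, so a 12 x 12 pixel box shrinks to 6 x 6 and would be removed, while a 16 x 16 box shrinks to exactly 8 x 8 and is kept.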

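YAML schema file

Use the following publicly accessible schema file to import image object detection annotations (bounding boxes and labels). This schema file determines the format of the data input files. Its structure follows the OpenAPI schema.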

gs://google-cloud-aiplatform/schema/dataset/ioformat/image_bounding_box_io_format_1.0.0.yaml

Full schema file

title: ImageBoundingBox
description: >
  Import and export format for importing/exporting images together with bounding
  box annotations. Can be used in Dataset.import_schema_uri field.
type: object
required:
- imageGcsUri
properties:
  imageGcsUri:
    type: string
    description: >
      A Cloud Storage URI pointing to an image. Up to 30MB in size.
      Supported file mime types: `image/jpeg`, `image/gif`, `image/png`,
      `image/webp`, `image/bmp`, `image/tiff`, `image/vnd.microsoft.icon`.
  boundingBoxAnnotations:
    type: array
    description: Multiple bounding box Annotations on the image.
    items:
      type: object
      description: >
        Bounding box annotation. `xMin`, `xMax`, `yMin`, and `yMax` are relative
        to the image size, and the point 0,0 is in the top left of the image.
      properties:
        displayName:
          type: string
          description: >
            It will be imported as/exported from AnnotationSpec's display name,
            i.e. the name of the label/class.
        xMin:
          description: The leftmost coordinate of the bounding box.
          type: number
          format: double
        xMax:
          description: The rightmost coordinate of the bounding box.
          type: number
          format: double
        yMin:
          description: The topmost coordinate of the bounding box.
          type: number
          format: double
        yMax:
          description: The bottommost coordinate of the bounding box.
          type: number
          format: double
        annotationResourceLabels:
          description: Resource labels on the Annotation.
          type: object
          additionalProperties:
            type: string
  dataItemResourceLabels:
    description: Resource labels on the DataItem.
    type: object
    additionalProperties:
      type: string
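
A record can be checked against the main points of this schema before import. The following is a rough sketch rather than a full OpenAPI validator: the function name `validate_record` is hypothetical, and it assumes that every annotation carries all four coordinates and that they lie in [0, 1], which the schema itself does not mark as required. Coordinates given as numeric strings are accepted as well as numbers.

```python
def validate_record(record):
    """Return a list of problems for one parsed import record (a dict)."""
    problems = []
    # imageGcsUri is the only field listed under `required` in the schema.
    if not isinstance(record.get("imageGcsUri"), str):
        problems.append("imageGcsUri is required and must be a string")
    for i, box in enumerate(record.get("boundingBoxAnnotations", [])):
        coords = {}
        for key in ("xMin", "xMax", "yMin", "yMax"):
            try:
                coords[key] = float(box[key])
            except (KeyError, TypeError, ValueError):
                problems.append(f"annotation {i}: {key} missing or not numeric")
        if len(coords) == 4:
            # Assumption: coordinates are normalized and min <= max.
            if not 0.0 <= coords["xMin"] <= coords["xMax"] <= 1.0:
                problems.append(f"annotation {i}: x range invalid")
            if not 0.0 <= coords["yMin"] <= coords["yMax"] <= 1.0:
                problems.append(f"annotation {i}: y range invalid")
    return problems
```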

Input files

JSON Lines

One JSON object on each line:



{
  "imageGcsUri": "gs://bucket/filename.ext",
  "boundingBoxAnnotations": [
    {
      "displayName": "OBJECT1_LABEL",
      "xMin": "X_MIN",
      "yMin": "Y_MIN",
      "xMax": "X_MAX",
      "yMax": "Y_MAX",
      "annotationResourceLabels": {
        "aiplatform.googleapis.com/annotation_set_name": "displayName",
        "env": "prod"
      }
    },
    {
      "displayName": "OBJECT2_LABEL",
      "xMin": "X_MIN",
      "yMin": "Y_MIN",
      "xMax": "X_MAX",
      "yMax": "Y_MAX"
    }
  ],
  "dataItemResourceLabels": {
    "aiplatform.googleapis.com/ml_use": "training/test/validation"
  }
}

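Field notes:

imageGcsUri: The only required field.
annotationResourceLabels: Can contain any number of key-value string pairs. The only system-reserved key-value pair is the following:
- "aiplatform.googleapis.com/annotation_set_name" : "value"
Where value is the display name of an existing annotation set in the dataset.
dataItemResourceLabels: Can contain any number of key-value string pairs. The only system-reserved key-value pair is the following, which specifies the machine learning use set of the data item:
- "aiplatform.googleapis.com/ml_use" : "training/test/validation"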
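
Records in this shape can be generated with a small helper. This is a minimal sketch with a hypothetical function name, `make_record`. It writes the coordinates as JSON numbers, following the schema's `type: number`; the examples on this page quote the values as strings instead, so treat the choice as an assumption.

```python
import json

ML_USE_KEY = "aiplatform.googleapis.com/ml_use"


def make_record(image_uri, boxes, ml_use=None):
    """Build one JSON Lines record.

    boxes is a list of (label, x_min, y_min, x_max, y_max) tuples with
    normalized coordinates; ml_use is "training", "test", or "validation".
    """
    record = {"imageGcsUri": image_uri}
    if boxes:
        record["boundingBoxAnnotations"] = [
            {"displayName": label, "xMin": x_min, "yMin": y_min,
             "xMax": x_max, "yMax": y_max}
            for (label, x_min, y_min, x_max, y_max) in boxes
        ]
    if ml_use is not None:
        record["dataItemResourceLabels"] = {ML_USE_KEY: ml_use}
    # json.dumps without indent keeps the record on a single line.
    return json.dumps(record)
```

Writing one returned string per line to a file produces a JSON Lines input file.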

JSON Lines example - `object_detection.jsonl`:



{"imageGcsUri": "gs://bucket/filename1.jpeg", "boundingBoxAnnotations": [{"displayName": "Tomato", "xMin": "0.3", "yMin": "0.3", "xMax": "0.7", "yMax": "0.6"}], "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "test"}}
{"imageGcsUri": "gs://bucket/filename2.gif", "boundingBoxAnnotations": [{"displayName": "Tomato", "xMin": "0.8", "yMin": "0.2", "xMax": "1.0", "yMax": "0.4"},{"displayName": "Salad", "xMin": "0.0", "yMin": "0.0", "xMax": "1.0", "yMax": "1.0"}], "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "training"}}
{"imageGcsUri": "gs://bucket/filename3.png", "boundingBoxAnnotations": [{"displayName": "Baked goods", "xMin": "0.5", "yMin": "0.7", "xMax": "0.8", "yMax": "0.8"}], "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "training"}}
{"imageGcsUri": "gs://bucket/filename4.tiff", "boundingBoxAnnotations": [{"displayName": "Salad", "xMin": "0.1", "yMin": "0.2", "xMax": "0.8", "yMax": "0.9"}], "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "validation"}}
...
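
A file like this can be scanned to see how many images carry each label, which helps when checking the per-label minimum and the label ratio from the requirements section. The sketch below is illustrative; the function name `images_per_label` is hypothetical and the input is passed as text rather than read from Cloud Storage.

```python
import json
from collections import Counter


def images_per_label(jsonl_text):
    """Count, for each label, the number of records (images) that use it."""
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # A set ensures an image with two boxes of one label counts once.
        labels = {box["displayName"]
                  for box in record.get("boundingBoxAnnotations", [])}
        counts.update(labels)
    return counts
```

Dividing the largest count by the smallest gives the label ratio that the requirements recommend keeping at or below 100.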

CSV

CSV format:

[ML_USE],GCS_FILE_PATH,[LABEL],[BOUNDING_BOX]*

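List of columns

ML_USE (Optional). Used for data splitting when training a model. Use TRAINING, TEST, or VALIDATION. For more information about manual data splitting, see "About data splits for AutoML models".
GCS_FILE_PATH. This field contains the Cloud Storage URI of the image. Cloud Storage URIs are case-sensitive.
LABEL. Labels must start with a letter and contain only letters, numbers, and underscores.
BOUNDING_BOX. A bounding box for an object in the image. Specifying a bounding box involves more than one column.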

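A. X_MIN,Y_MIN
B. X_MAX,Y_MIN
C. X_MAX,Y_MAX
D. X_MIN,Y_MAX

Each vertex is specified by x and y coordinate values. Coordinates are normalized float values in the range [0,1]; 0.0 is X_MIN or Y_MIN, and 1.0 is X_MAX or Y_MAX.

For example, a bounding box for the entire image is expressed as (0.0,0.0,,,1.0,1.0,,) or (0.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0).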

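The bounding box for an object can be specified in one of two ways:
1. Two vertices (two sets of x,y coordinates) that are diagonally opposite corners of the rectangle:
  A. X_MIN,Y_MIN
  C. X_MAX,Y_MAX
  as shown in this example:
  A,,C,
  X_MIN,Y_MIN,,,X_MAX,Y_MAX,,
2. All four vertices specified:
  X_MIN,Y_MIN,X_MAX,Y_MIN,X_MAX,Y_MAX,X_MIN,Y_MAX
  If the four specified vertices don't form a rectangle parallel to the image edges, Vertex AI specifies vertices that do form such a rectangle.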
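
Both forms occupy the same eight columns, so one routine can reduce either to a single (x_min, y_min, x_max, y_max) tuple. The following is a sketch with a hypothetical function name, `parse_bounding_box`. Taking the minimum and maximum of the supplied coordinates yields the axis-aligned enclosing rectangle; this is an illustrative choice and not necessarily how Vertex AI resolves vertices that aren't parallel to the image edges.

```python
def parse_bounding_box(fields):
    """Convert the 8 bounding box columns (strings) to (x_min, y_min, x_max, y_max).

    Empty strings mark the vertices omitted in the two-vertex form.
    """
    if len(fields) != 8:
        raise ValueError("expected 8 bounding box columns")
    # Even positions hold x values, odd positions hold y values.
    xs = [float(v) for v in fields[0::2] if v.strip() != ""]
    ys = [float(v) for v in fields[1::2] if v.strip() != ""]
    if not xs or not ys:
        raise ValueError("no coordinates supplied")
    for value in xs + ys:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"coordinate {value} is outside [0, 1]")
    return min(xs), min(ys), max(xs), max(ys)
```

For instance, `parse_bounding_box("0.0,0.0,,,1.0,1.0,,".split(","))` and the equivalent four-vertex row both describe the whole image.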

CSV example - `object_detection.csv`:

test,gs://bucket/filename1.jpeg,Tomato,0.3,0.3,,,0.7,0.6,,
training,gs://bucket/filename2.gif,Tomato,0.8,0.2,,,1.0,0.4,,
gs://bucket/filename2.gif
gs://bucket/filename3.png,Baked goods,0.5,0.7,0.8,0.7,0.8,0.8,0.5,0.8
validation,gs://bucket/filename4.tiff,Salad,0.1,0.2,,,0.8,0.9,,
...
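
Because ML_USE, LABEL, and BOUNDING_BOX are all optional, rows in a file like this vary in length. The sketch below splits rows into their parts with Python's standard `csv` module. The function name `parse_rows` is hypothetical, and detecting the ML_USE column by checking whether the first field starts with `gs://` is an assumption made for illustration.

```python
import csv
import io


def parse_rows(csv_text):
    """Split each CSV row into ml_use, uri, label, and raw bounding box fields."""
    parsed = []
    for fields in csv.reader(io.StringIO(csv_text)):
        if not fields:
            continue  # skip blank lines
        ml_use = None
        # Assumption: a first field that is not a gs:// URI is the ML_USE column.
        if not fields[0].startswith("gs://"):
            ml_use = fields.pop(0) or None
        if not fields:
            raise ValueError("row has no GCS_FILE_PATH")
        parsed.append({
            "ml_use": ml_use,
            "uri": fields[0],
            "label": fields[1] if len(fields) > 1 else None,
            "bbox": fields[2:],  # 8 columns when a box is present
        })
    return parsed
```

A row that contains only a URI, such as the third row in the example, yields an entry with no label and an empty `bbox` list.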

Create a dataset