本頁面由 Cloud Translation API 翻譯而成。

使用自訂容器建立執行個體

本頁說明如何根據自訂容器建立 Vertex AI Workbench 執行個體。

總覽

Vertex AI Workbench 執行個體支援使用衍生自 Google 提供基本容器的自訂容器。您可以修改這些基本容器，製作自訂容器映像檔，並使用這些自訂容器建立 Vertex AI Workbench 執行個體。

基礎容器會透過主機虛擬機器 (VM) 中的 Container-Optimized OS 進行設定。主機映像檔是從 cos-stable 映像檔系列建構而來。

限制

規劃專案時，請注意下列限制：

自訂容器必須衍生自Google 提供的基本容器。如果使用的容器不是衍生自基礎容器，相容性問題的風險就會增加，我們也無法支援您使用 Vertex AI Workbench 執行個體。
Vertex AI Workbench 執行個體不支援使用多個容器。
從使用者自管筆記本和代管筆記本取得的自訂容器支援的中繼資料，與 Vertex AI Workbench 執行個體搭配使用時，可能會有不同行為。
代管自訂容器的 VM 是以 Container-Optimized OS 執行，因此您與主機互動的方式會受到限制。舉例來說，Container-Optimized OS 不含套件管理員。也就是說，對主機執行的套件必須在含有掛接點的容器上執行。這會影響從受管理 Notebook 執行個體和使用者管理的 Notebook 執行個體遷移的啟動後指令碼，因為主機包含的工具遠多於 Container-Optimized OS。
Vertex AI Workbench 執行個體會使用 nerdctl (containerd CLI) 執行自訂容器。這是與影像串流服務相容的必要條件。使用中繼資料值新增的任何容器參數，都必須符合 nerdctl 支援的項目。
Vertex AI Workbench 執行個體已設定為從 Artifact Registry 或公開容器存放區提取資料。如要將執行個體設定為從私人存放區提取內容，您必須手動設定 containerd 使用的憑證。

基本容器

標準基礎容器

標準基礎容器支援所有 Vertex AI Workbench 功能，並包含下列項目：

預先安裝的資料科學套件。
類似於深度學習容器的 Cuda 程式庫。
Google Cloud JupyterLab 整合，例如 Dataproc 和 BigQuery 整合。
常見的系統套件，例如 curl 或 git。
以中繼資料為準的 JupyterLab 設定。
以 Micromamba 為基礎的核心管理。

規格

標準基礎容器的規格如下：

基礎映像檔：nvidia/cuda:12.6.1-cudnn-devel-ubuntu24.04
映像檔大小：約 22 GB
URI：us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest

纖薄底座容器

精簡的基本容器提供一組最低設定，可允許與執行個體建立 Proxy 連線。不含標準 Vertex AI Workbench 功能和套件，但下列項目除外：

JupyterLab
以中繼資料為準的 JupyterLab 設定
以 Micromamba 為基礎的核心管理

其他套件或 JupyterLab 擴充功能必須個別安裝及管理。

規格

精簡版基礎容器的規格如下：

基礎映像檔：marketplace.gcr.io/google/ubuntu24.04
映像檔大小：約 2 GB
URI：us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container-slim:latest

事前準備

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Notebooks API.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the API

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Notebooks API.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the API

必要的角色

如要取得建立自訂容器 Vertex AI Workbench 執行個體所需的權限，請要求管理員授予下列 IAM 角色：

筆記本執行者 (roles/notebooks.runner) 使用者帳戶
如要從 Artifact Registry 存放區提取映像檔，請在服務帳戶上啟用 Artifact Registry 讀取者 (roles/artifactregistry.reader)。

如要進一步瞭解如何授予角色，請參閱「管理專案、資料夾和機構的存取權」。

您或許還可透過自訂角色或其他預先定義的角色取得必要權限。

建立自訂容器

如要建立自訂容器，以便搭配 Vertex AI Workbench 執行個體使用，請按照下列步驟操作：

建立衍生容器，該容器衍生自 Google 提供的基本容器映像檔。
建構容器並推送至 Artifact Registry。建立 Vertex AI Workbench 執行個體時，您會使用容器的 URI。舉例來說，URI 可能如下所示： gcr.io/PROJECT_ID/IMAGE_NAME。

建立執行個體

您可以透過 Google Cloud 控制台或 Google Cloud CLI，根據自訂容器建立 Vertex AI Workbench 執行個體。

主控台

如要根據自訂容器建立 Vertex AI Workbench 執行個體，請按照下列步驟操作：

前往 Google Cloud 控制台的「Instances」(執行個體) 頁面。

前往「Instances」(執行個體) 頁面
按一下「建立新標籤」。
在「New instance」對話方塊中，按一下「Advanced options」。
在「建立執行個體」對話方塊的「環境」部分，選取「使用自訂容器」。
按一下「Docker container image」(Docker 容器映像檔) 的「Select」(選取)。
在「Select container image」(選取容器映像檔) 對話方塊中，瀏覽至要使用的容器映像檔，然後按一下「Select」(選取)。
(選用步驟) 如要使用開機後指令碼，請輸入要使用的開機後指令碼路徑。
(選用步驟) 為執行個體新增中繼資料。詳情請參閱「自訂容器中繼資料」。
(選用步驟) 在「網路」部分，自訂網路設定。詳情請參閱「網路設定選項」。
完成執行個體建立對話方塊的其餘部分，然後按一下「建立」。

Vertex AI Workbench 會建立執行個體並自動啟動。執行個體可供使用時，Vertex AI Workbench 會啟用「Open JupyterLab」(開啟 JupyterLab) 連結。

gcloud

使用下列任何指令資料之前，請先替換以下項目：

INSTANCE_NAME：Vertex AI Workbench 執行個體的名稱；開頭須為英文字母，後面最多可接 62 個小寫英文字母、數字或連字號 (-)，但結尾不得為連字號
PROJECT_ID：您的專案 ID
LOCATION：您希望執行個體所在的區域
CUSTOM_CONTAINER_PATH：容器映像檔存放區的路徑，例如：gcr.io/PROJECT_ID/IMAGE_NAME
METADATA：要套用至這個執行個體的自訂中繼資料；例如，如要指定開機後指令碼，可以使用 post-startup-script 中繼資料標記，格式如下："--metadata=post-startup-script=gs://BUCKET_NAME/hello.sh"

執行下列指令：

Linux、macOS 或 Cloud Shell

gcloud workbench instances create INSTANCE_NAME \
    --project=PROJECT_ID \
    --location=LOCATION \
    --container-repository=CUSTOM_CONTAINER_URL \
    --container-tag=latest \
    --metadata=METADATA

Windows (PowerShell)

gcloud workbench instances create INSTANCE_NAME `
    --project=PROJECT_ID `
    --location=LOCATION `
    --container-repository=CUSTOM_CONTAINER_URL `
    --container-tag=latest `
    --metadata=METADATA

Windows (cmd.exe)

gcloud workbench instances create INSTANCE_NAME ^
    --project=PROJECT_ID ^
    --location=LOCATION ^
    --container-repository=CUSTOM_CONTAINER_URL ^
    --container-tag=latest ^
    --metadata=METADATA

如要進一步瞭解如何透過指令列建立執行個體的指令，請參閱 gcloud CLI 說明文件。

Vertex AI Workbench 會建立執行個體並自動啟動。執行個體可供使用時，Vertex AI Workbench 會在 Google Cloud 控制台中啟用「Open JupyterLab」(開啟 JupyterLab) 連結。

網路設定選項

除了一般網路選項之外，使用自訂容器的 Vertex AI Workbench 執行個體也必須能存取 Artifact Registry 服務。

如果您已關閉虛擬私有雲的公開 IP 存取權，請確認已啟用私人 Google 存取權。

啟用映像檔串流

系統會佈建自訂容器主機，與 Google Kubernetes Engine (GKE) 中的映像檔串流互動，以便更快提取容器，並在大型容器快取至 GKE 遠端檔案系統後，縮短初始化時間。

如要查看啟用圖片串流的相關規定，請參閱「需求條件」。通常，只要啟用 Container File System API，就能搭配 Vertex AI Workbench 執行個體使用映像檔串流。

啟用 Container File System API

主機 VM 如何執行自訂容器

主機 VM 會使用 Kubernetes 命名空間下的 nerdctl 載入及執行容器，而不是使用 Docker 執行自訂容器。這樣一來，Vertex AI Workbench 就能對自訂容器使用映像檔串流。

# Runs the custom container.
sudo /var/lib/google/nerdctl/nerdctl --snapshotter=gcfs -n k8s.io run --name payload-container

安裝範例：自訂容器，搭配自訂預設核心

以下範例說明如何建立預先安裝 pip 套件的新核心。

建立新的自訂容器：

FROM us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest

ENV MAMBA_ROOT_PREFIX=/opt/micromamba

RUN micromamba create -n ENVIRONMENT_NAME -c conda-forge python=PYTHON_VERSION -y

SHELL ["micromamba", "run", "-n", "ENVIRONMENT_NAME", "/bin/bash", "-c"]

RUN micromamba install -c conda-forge pip -y
RUN pip install PACKAGE
RUN pip install ipykernel
RUN python -m ipykernel install --prefix /opt/micromamba/envs/ENVIRONMENT_NAME --name ENVIRONMENT_NAME --display-name KERNEL_NAME
# Creation of a micromamba kernel automatically creates a python3 kernel
# that must be removed if it's in conflict with the new kernel.
RUN rm -rf "/opt/micromamba/envs/ENVIRONMENT_NAME/share/jupyter/kernels/python3"

將新容器新增至 Artifact Registry：

gcloud auth configure-docker REGION-docker.pkg.dev
docker build -t REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY_NAME/IMAGE_NAME .
docker push REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY_NAME/IMAGE_NAME:latest

建立執行個體：

gcloud workbench instances create INSTANCE_NAME  \
    --project=PROJECT_ID \
    --location=ZONE \
    --container-repository=REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY_NAME/IMAGE_NAME \
    --container-tag=latest

自訂容器的持續性核心

Vertex AI Workbench 自訂容器只會將資料磁碟掛接到每個容器內的 /home/USER 目錄，其中 jupyter 是預設使用者。也就是說，/home/USER 以外的任何變更都是暫時性的，重新啟動後不會保留。如要讓已安裝的套件保留在特定核心中，可以在 /home/USER 目錄中建立核心。

如要在 /home/USER 目錄中建立核心：

建立 micromamba 環境：

micromamba create -p /home/USER/ENVIRONMENT_NAME -c conda-forge python=3.11 -y
micromamba activate /home/USER/ENVIRONMENT_NAME
pip install ipykernel
pip install -r ~/requirement.txt
python -m ipykernel install --prefix "/home/USER/ENVIRONMENT_NAME" --display-name "Example Kernel"

更改下列內容：

USER：使用者目錄名稱，預設為 jupyter
ENVIRONMENT_NAME：環境名稱
PYTHON_VERSION：Python 版本，例如 3.11

等待 30 秒到 1 分鐘，讓核心重新整理。

更新基本容器的啟動作業

Vertex AI Workbench 執行個體的基礎容器 (us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest) 會執行 /run_jupyter.sh，啟動 JupyterLab。

如果您在衍生容器中修改容器的啟動程序，必須附加 /run_jupyter.sh 才能執行 JupyterLab 的預設設定。

以下是 Dockerfile 的修改範例：

# DockerFile
FROM us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest

CP startup_file.sh /
# Ensure that you have the correct permissions and startup is executable.
RUN chmod 755 /startup_file.sh && \
    chown jupyter:jupyter /startup_file.sh

# Override the existing CMD directive from the base container.
CMD ["/startup_file.sh"]

# /startup_file.sh

echo "Running startup scripts"
...

/run_jupyter.sh

更新基本容器中的 JupyterLab 設定

如要修改基礎容器的 JupyterLab 設定，請執行下列操作：

確認 JupyterLab 已設定為通訊埠 8080。我們的 Proxy 代理程式已設定為將任何要求轉送至通訊埠 8080，如果 Jupyter 伺服器未監聽正確的通訊埠，執行個體就會發生佈建問題。
在 jupyterlab micromamba 環境下修改 JupyterLab 套件。我們提供獨立的套件環境來執行 JupyterLab 和外掛程式，確保與核心環境不會發生任何依附元件衝突。如要安裝其他 JupyterLab 擴充功能，請務必在 jupyterlab 環境中安裝。例如：
```
# DockerFile
FROM us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest
RUN micromamba activate jupyterlab && \
  jupyter nbextension install nbdime
```

自訂容器中繼資料

除了可套用至 Vertex AI Workbench 執行個體的標準中繼資料清單，使用自訂容器的執行個體還包含下列中繼資料，可管理有效負載容器的例項化：

功能	說明	中繼資料鍵	接受的值和預設值
在容器映像檔上啟用 Cloud Storage FUSE	將 `/dev/fuse` 掛接到容器，並啟用 `gcsfuse` 以供容器使用。	`container-allow-fuse`	`true`：啟用 Cloud Storage FUSE。 `false` (預設)：不啟用 Cloud Storage FUSE。
其他容器執行參數	在 `nerdctl run` 中附加額外容器參數，其中 `nerdctl` 是 Containerd CLI。	`container-custom-params`	容器執行參數的字串。示例： `--v /mnt/disk1:/mnt/disk1`。
其他容器環境旗標	將環境變數儲存至 `/mnt/stateful_partition/workbench/container_env` 下的旗標，並附加至 `nerdctl run`。	`container-env-file`	容器環境變數字串。示例： `CONTAINER_NAME=derivative-container`。

升級自訂容器

執行個體首次啟動時，會從儲存在 custom-container-payload 中繼資料中的 URI 提取容器映像檔。如果您使用 :latest 標記，容器會在每次重新啟動時更新。custom-container-payload 中繼資料值無法直接修改，因為這是受保護的中繼資料鍵。

如要更新執行個體的自訂容器映像檔，可以使用 Google Cloud CLI、Terraform 或 Notebooks API 支援的下列方法。

gcloud

您可以使用下列指令，更新 Vertex AI Workbench 執行個體上的自訂容器映像檔中繼資料：

gcloud workbench instances update INSTANCE_NAME \
    --container-repository=CONTAINER_URI \
    --container-tag=CONTAINER_TAG

Terraform

您可以變更 terraform 設定中的 container_image 欄位，更新容器有效負載。

如要瞭解如何套用或移除 Terraform 設定，請參閱「基本 Terraform 指令」。

resource "google_workbench_instance" "default" {
  name     = "workbench-instance-example"
  location = "us-central1-a"

  gce_setup {
    machine_type = "n1-standard-1"
    container_image {
      repository = "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container"
      family  = "latest"
    }
  }
}

Notebooks API

使用 instances.patch 方法，並在 updateMask 中變更 gce_setup.container_image.repository 和 gce_setup.container_image.tag。

執行診斷工具

診斷工具會檢查及驗證各種 Vertex AI Workbench 服務的狀態。詳情請參閱「診斷工具執行的工作」。

使用自訂容器建立 Vertex AI Workbench 執行個體時，診斷工具無法做為使用者可執行的指令碼，在主機環境中使用。而是編譯成二進位檔，並載入 Google 執行階段容器，該容器專為在 Container-Optimized OS 環境中執行診斷服務而建構。請參閱 Container-Optimized OS 總覽。

如要執行診斷工具，請完成下列步驟：

使用 SSH 連線至 Vertex AI Workbench 執行個體。

在 SSH 終端機中執行下列指令：

sudo docker exec diagnostic-service ./diagnostic_tool

如要查看其他指令選項，請執行下列指令：

sudo docker exec diagnostic-service ./diagnostic_tool --help

如要進一步瞭解診斷工具的選項，請參閱監控健康狀態說明文件。

如要使用 REST API 執行診斷工具，請參閱 REST API 說明文件。

存取執行個體

您可以透過 Proxy 網址存取執行個體。

建立執行個體並啟用後，您可以使用 gcloud CLI 取得 Proxy 網址。

使用下列任何指令資料之前，請先替換以下項目：

INSTANCE_NAME：Vertex AI Workbench 執行個體的名稱
PROJECT_ID：您的專案 ID
LOCATION：執行個體所在的區域

執行下列指令：

Linux、macOS 或 Cloud Shell

gcloud workbench instances describe INSTANCE_NAME \
--project=PROJECT_ID \
--location=LOCATION | grep proxy-url

Windows (PowerShell)

gcloud workbench instances describe INSTANCE_NAME `
--project=PROJECT_ID `
--location=LOCATION | grep proxy-url

Windows (cmd.exe)

gcloud workbench instances describe INSTANCE_NAME ^
--project=PROJECT_ID ^
--location=LOCATION | grep proxy-url

proxy-url: 7109d1b0d5f850f-dot-datalab-vm-staging.googleusercontent.com

describe 指令會傳回 Proxy 網址。如要存取執行個體，請在網路瀏覽器中開啟 Proxy 網址。

如要進一步瞭解如何透過指令列描述執行個體，請參閱 gcloud CLI 說明文件。