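Connect a TPU to a Shared VPC network
Configure the VPC host project
You must grant the TPU service account in the service project permission to manage resources in the host project. You can do this with the "TPU Shared VPC Agent" role (roles/tpu.xpnAgent). Run the following gcloud command to add this role binding, replacing host-project-id with the ID of the host project and your-service-project-number with the project number of the service project.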
gcloud projects add-iam-policy-binding host-project-id \
  --member=serviceAccount:service-your-service-project-number@gcp-sa-tpu.iam.gserviceaccount.com \
  --role=roles/tpu.xpnAgent
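The --member value above is built from the service project's number, not its ID. The following is a minimal sketch of how that string can be assembled. The helper name tpu_agent_member and the project number 123456789012 are made up for illustration; the email pattern is the one shown in the command above.

```shell
# Hypothetical helper: build the IAM member string for the TPU service
# account of a service project, given that project's number.
tpu_agent_member() {
  printf 'serviceAccount:service-%s@gcp-sa-tpu.iam.gserviceaccount.com\n' "$1"
}

# Example with a made-up project number. With gcloud configured, the real
# number can be looked up with:
#   gcloud projects describe your-service-project-id --format='value(projectNumber)'
MEMBER="$(tpu_agent_member 123456789012)"
echo "$MEMBER"
```

The resulting string can then be passed to the --member flag of gcloud projects add-iam-policy-binding.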
Create a TPU VM connected to a Shared VPC network
First, determine which accelerator types and versions are available in the zone:
gcloud compute tpus accelerator-types list --zone zone
gcloud compute tpus versions list --zone zone
You can connect a TPU VM to a Shared VPC network when you create the TPU. Use the --network flag to specify the Shared VPC network in the host project:
gcloud compute tpus tpu-vm create tpu-name \
  --zone zone \
  --accelerator-type accelerator-type \
  --network projects/host-project-id/global/networks/host-network \
  --version tpu-image-version \
  --project your-service-project-id
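The --network value is the full resource path of the network in the host project, while --project names the service project. As a small sketch, the path can be assembled from its two parts. The helper name shared_vpc_network_path is hypothetical, and host-project-id and host-network are the same placeholders used in the command above.

```shell
# Hypothetical helper: build the full network resource path from the
# host project ID ($1) and the network name ($2).
shared_vpc_network_path() {
  printf 'projects/%s/global/networks/%s\n' "$1" "$2"
}

# Placeholders from the command above; substitute real values.
NETWORK="$(shared_vpc_network_path host-project-id host-network)"
echo "$NETWORK"
```

Keeping the host project ID and the network name as separate inputs makes it harder to pass the service project's own network by mistake.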
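You can use the gcloud describe command to verify that the TPU VM is connected to the Shared VPC network:
gcloud compute tpus tpu-vm describe tpu-name --zone zone
The response includes the network that the TPU VM is connected to: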
acceleratorType: v3-8
apiVersion: V2
cidrBlock: 10.128.0.0/20
createTime: '2022-06-17T21:32:13.859274143Z'
health: HEALTHY
id: '0000000000000000000'
name: projects/my-project/locations/us-central1-b/nodes/my-tpu
networkConfig:
  enableExternalIps: true
  network: projects/my-project/global/networks/default
  subnetwork: projects/my-project/regions/us-central1/subnetworks/default
networkEndpoints:
- accessConfig:
    externalIp: 000.000.000.000
  ipAddress: 10.128.0.104
  port: 8470
runtimeVersion: tpu-vm-tf-2.8.0
schedulingConfig: {}
serviceAccount:
  email: 00000000000-compute@developer.gserviceaccount.com
  scope:
  - https://www.googleapis.com/auth/devstorage.read_write
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/service.management
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/cloud-platform
  - https://www.googleapis.com/auth/pubsub
shieldedInstanceConfig: {}
state: READY
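To check the result programmatically, one option is to read the network field from the response and compare its project segment with the expected host project. The sketch below assumes the value has the form projects/PROJECT/global/networks/NAME, as in the sample response; the helper name network_host_project is hypothetical, and the sample value is copied from the output above.

```shell
# Hypothetical helper: extract the project ID from a network path of the
# form projects/PROJECT/global/networks/NAME (assumed format).
network_host_project() {
  path="${1#projects/}"        # drop the leading "projects/"
  printf '%s\n' "${path%%/*}"  # keep everything before the next "/"
}

# Sample value from the response above. With gcloud configured, the field
# can be read directly with:
#   gcloud compute tpus tpu-vm describe tpu-name --zone zone \
#     --format='value(networkConfig.network)'
ACTUAL="$(network_host_project projects/my-project/global/networks/default)"
EXPECTED="my-project"   # the host project you expect; placeholder here

if [ "$ACTUAL" = "$EXPECTED" ]; then
  echo "TPU VM network is in project $EXPECTED"
else
  echo "TPU VM network is in project $ACTUAL, expected $EXPECTED" >&2
fi
```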
Delete the TPU VM
When you are done with the TPU VM, be sure to delete it:
gcloud compute tpus tpu-vm delete tpu-name --zone zone