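Legacy AI Platform Prediction is deprecated and will no longer be available on Google Cloud after January 31, 2025. All models, associated metadata, and deployments will be deleted after January 31, 2025. Migrate your resources to Vertex AI to get new machine learning features that are unavailable in AI Platform.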

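Getting started: Serving PyTorch predictions with a custom container

This tutorial shows you how to use a custom container to deploy a PyTorch machine learning (ML) model that serves online predictions.

In this tutorial, you deploy a container running PyTorch's TorchServe tool in order to serve predictions from a digit recognition model provided by TorchServe that has been pre-trained on the MNIST dataset. You can then use AI Platform Prediction to classify images of digits.

Before you begin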

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Make sure that billing is enabled for your Google Cloud project.

Enable the AI Platform Training & Prediction and Artifact Registry APIs.

In the Google Cloud console, activate Cloud Shell.

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

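We recommend that you use Cloud Shell to interact with Google Cloud throughout this tutorial. If you want to use a different Bash shell instead of Cloud Shell, perform the following additional configuration: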

Install the Google Cloud CLI.
To initialize the gcloud CLI, run the following command:
```
gcloud init
```
Follow the Artifact Registry documentation to install Docker.

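Building and pushing the container image

To use a custom container, you must specify a Docker container image that meets the custom container requirements. This section describes how to create the container image and push it to Artifact Registry.

Download model artifacts

Model artifacts are files created by ML training that you can use to serve predictions. They contain, at a minimum, the structure and weights of your trained ML model. The format of model artifacts depends on which ML framework you use for training.

For this tutorial, instead of training from scratch, download example model artifacts provided by TorchServe.

To clone the TorchServe repository and navigate to the directory with the model artifacts, run the following commands in your shell: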

```
git clone https://github.com/pytorch/serve.git \
  --branch=v0.3.0 \
  --depth=1

cd serve/examples/image_classifier/mnist
```

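This directory contains three important files to build into your container image:

- mnist.py: defines the structure of the trained neural network
- mnist_cnn.pt: contains a state_dict with feature weights and other outputs from training
- mnist_handler.py: extends how TorchServe handles prediction requests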

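Create an Artifact Registry repository

Create an Artifact Registry repository to store the container image that you will create in the next section. Run the following command in your shell: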

```
gcloud beta artifacts repositories create getting-started-pytorch \
  --repository-format=docker \
  --location=REGION
```

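Replace REGION with the region where you want Artifact Registry to store your container image. Later, you must create an AI Platform Prediction model resource on a regional endpoint matching this region, so choose a region where AI Platform Prediction has a regional endpoint; for example, us-central1.

After completing the operation, this command prints the following output: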

```
Created repository [getting-started-pytorch].
```

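Build the container image

TorchServe provides a Dockerfile for building a container image that runs TorchServe. However, instead of using this Dockerfile to install all of TorchServe's dependencies, you can speed up the build process by deriving your container image from one of the TorchServe images that the TorchServe team has pushed to Docker Hub.

In the directory with the model artifacts, create a new Dockerfile by running the following command in your shell: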
```
cat > Dockerfile <<END
FROM pytorch/torchserve:0.3.0-cpu

COPY mnist.py mnist_cnn.pt mnist_handler.py /home/model-server/

USER root
RUN printf "\nservice_envelope=json" >> /home/model-server/config.properties
USER model-server

RUN torch-model-archiver \
  --model-name=mnist \
  --version=1.0 \
  --model-file=/home/model-server/mnist.py \
  --serialized-file=/home/model-server/mnist_cnn.pt \
  --handler=/home/model-server/mnist_handler.py \
  --export-path=/home/model-server/model-store

CMD ["torchserve", \
     "--start", \
     "--ts-config=/home/model-server/config.properties", \
     "--models", \
     "mnist=mnist.mar"]
END
```
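These Docker instructions do the following:
- The FROM instruction derives the current container image from an existing TorchServe image.
- The COPY instruction copies the model artifacts and the prediction handler from your local directory into the /home/model-server/ directory of the container image.
- The first RUN instruction edits the configuration file from the parent image to support AI Platform Prediction's preferred input format for predictions.

  Specifically, this instruction configures TorchServe to expect a JSON service envelope for prediction requests.

  Editing this configuration file requires permission that the model-server user (which was created in the parent image) does not have. The USER instructions tell Docker to run as the root user to edit the configuration file and then to switch back to the model-server user for the instructions that follow.
- The second RUN instruction uses the Torch model archiver, which is already installed in the container image, to create a model archive from the files that you copied into the image. It saves this model archive in /home/model-server/model-store/ with the filename mnist.mar.

  If you want to change the container image (for example, to perform custom preprocessing or postprocessing in the request handler), you can use additional RUN instructions to install dependencies.
- The CMD instruction starts the TorchServe HTTP server. It references the configuration file from the parent image and enables serving for one model named mnist. This model loads the mnist.mar file created by the second RUN instruction.

  This instruction overrides the parent image's CMD instruction. It is important to override the CMD instruction and not the ENTRYPOINT instruction, because the parent image's ENTRYPOINT script runs the command passed in CMD and also adds extra logic to prevent Docker from exiting.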
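
As a rough illustration of what a JSON service envelope means for the handler, the following Python sketch unwraps the "instances" list from a request body and wraps the results under "predictions". It is a simplified model of the behavior, not TorchServe's actual implementation; `handle_enveloped_request` and `fake_predict` are hypothetical names introduced only for this example.

```python
import json
from typing import Any, Callable, List


def handle_enveloped_request(
    body: str, predict: Callable[[List[Any]], List[Any]]
) -> str:
    """Unwrap a JSON service envelope, run predict, and wrap the results.

    Simplified illustration only; this is not TorchServe's code.
    """
    request = json.loads(body)
    instances = request["instances"]  # one entry per input to classify
    outputs = predict(instances)      # expected: one output per instance
    return json.dumps({"predictions": outputs})


def fake_predict(instances: List[Any]) -> List[int]:
    # Hypothetical stand-in for the MNIST handler: always answers 3.
    return [3 for _ in instances]
```

With a single-instance body, `handle_enveloped_request(body, fake_predict)` produces the same response shape that this tutorial shows for real predictions.
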
To build the container image based on your new Dockerfile and tag it with a name compatible with your Artifact Registry repository, run the following command in your shell:
```
docker build \
  --tag=REGION-docker.pkg.dev/PROJECT_ID/getting-started-pytorch/serve-mnist \
  .
```
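Replace the following:
- REGION: the region of your Artifact Registry repository, as specified in a previous section
- PROJECT_ID: the ID of your Google Cloud project
The command might run for several minutes.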
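
The same image name appears in several later commands, so it can help to build it in one place. The following Python sketch assembles the name from the pattern used in this tutorial; the function `image_uri` and its placeholder check are assumptions made for illustration and are not part of any Google Cloud tool.

```python
def image_uri(
    region: str,
    project_id: str,
    repository: str = "getting-started-pytorch",
    image: str = "serve-mnist",
) -> str:
    """Return REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE."""
    # Guard against pasting the tutorial's placeholders unchanged.
    for name, value in (("region", region), ("project_id", project_id)):
        if not value or value in ("REGION", "PROJECT_ID"):
            raise ValueError(f"replace the {name} placeholder with a real value")
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}"
```

For example, `image_uri("us-central1", "my-project")` yields a name in the repository created earlier, where `my-project` stands in for your own project ID.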

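Run the container locally (optional)

Before you push your container image to Artifact Registry in order to use it with AI Platform Prediction, you can run it as a container in your local environment to verify that the server works as expected:

To run the container image as a container locally, run the following command in your shell: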
```
docker run -d -p 8080:8080 --name=local_mnist \
  REGION-docker.pkg.dev/PROJECT_ID/getting-started-pytorch/serve-mnist
```
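Replace the following, as you did in the previous section:
- REGION: the region of your Artifact Registry repository, as specified in a previous section
- PROJECT_ID: the ID of your Google Cloud project
This command runs a container in detached mode, mapping port 8080 of the container to port 8080 of the local environment. (The parent image, from which you derived your container image, configures TorchServe to use port 8080.)
To send the container's server a health check, run the following command in your shell: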
```
curl localhost:8080/ping
```
If successful, the server returns the following response:
```
{
  "status": "Healthy"
}
```
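
If you prefer to script this check, a minimal Python sketch using only the standard library might look like the following. It assumes the container from the previous step is listening on localhost:8080; `is_healthy` and `ping` are hypothetical helper names.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if a /ping response body reports status "Healthy"."""
    try:
        return json.loads(body).get("status") == "Healthy"
    except (json.JSONDecodeError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def ping(url: str = "http://localhost:8080/ping", timeout: float = 5.0) -> bool:
    """Send the health check; requires the local container to be running."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_healthy(response.read().decode("utf-8"))
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False
```

Calling `ping()` right after `docker run` may return False for a short time while TorchServe is still starting, so a retry loop around it can be useful.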

To send the container's server a prediction request, run the following commands in your shell:

```
cat > instances.json <<END
{
  "instances": [
    {
      "data": {
        "b64": "$(base64 --wrap=0 test_data/3.png)"
      }
    }
  ]
}
END

curl -X POST \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @instances.json \
  localhost:8080/predictions/mnist
```

This request uses one of the test images included with the TorchServe example.

If successful, the server returns the following prediction:

```
{"predictions": [3]}
```
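
The request body can also be produced without the shell's `base64 --wrap=0` option, which may not be available on non-GNU systems. The following Python sketch builds the same JSON structure from raw image bytes; `build_request_body` is a hypothetical helper name.

```python
import base64
import json


def build_request_body(image_bytes: bytes) -> str:
    """Wrap image bytes in the instances/data/b64 structure used in this tutorial."""
    encoded = base64.b64encode(image_bytes).decode("ascii")  # single line, no wrapping
    return json.dumps({"instances": [{"data": {"b64": encoded}}]})
```

To recreate instances.json, read test_data/3.png in binary mode, pass the bytes to `build_request_body`, and write the returned string to the file.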

To stop the container, run the following command in your shell:
```
docker stop local_mnist
```

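Push the container image to Artifact Registry

Configure Docker to access Artifact Registry. Then push your container image to your Artifact Registry repository.

To give your local Docker installation permission to push to Artifact Registry in your chosen region, run the following command in your shell: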
```
gcloud auth configure-docker REGION-docker.pkg.dev
```
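Replace REGION with the region where you created your repository in a previous section.
To push the container image that you just built to Artifact Registry, run the following command in your shell: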
```
docker push REGION-docker.pkg.dev/PROJECT_ID/getting-started-pytorch/serve-mnist
```
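Replace the following, as you did in the previous section:
- REGION: the region of your Artifact Registry repository, as specified in a previous section
- PROJECT_ID: the ID of your Google Cloud project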

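Deploying the container

This section walks through creating a model and a model version on AI Platform Prediction in order to serve predictions. The model version runs your container image as a container in order to serve predictions.

This tutorial provides specific configuration options to use when you create your model and model version. If you want to learn about different configuration options, read Deploying models.

Create a model

To create a model resource, run the following command in your shell: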

```
gcloud beta ai-platform models create getting_started_pytorch \
  --region=REGION \
  --enable-logging \
  --enable-console-logging
```

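Replace REGION with the region where you created your Artifact Registry repository in a previous section.

Create a model version

To create a model version resource, run the following command in your shell: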

```
gcloud beta ai-platform versions create v1 \
  --region=REGION \
  --model=getting_started_pytorch \
  --machine-type=n1-standard-4 \
  --image=REGION-docker.pkg.dev/PROJECT_ID/getting-started-pytorch/serve-mnist \
  --ports=8080 \
  --health-route=/ping \
  --predict-route=/predictions/mnist
```

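Replace the following:

- REGION: the region where you created your Artifact Registry repository and AI Platform Prediction model in previous sections
- PROJECT_ID: the ID of your Google Cloud project

The container-related flags in this command do the following:

- --image: the URI of your container image.
- --ports: the port that your container's HTTP server listens for requests on. The parent image, from which you derived your container image, configures TorchServe to use port 8080.
- --health-route: the path that your container's HTTP server listens for health checks on. TorchServe always listens for health checks on the /ping path.
- --predict-route: the path that your container's HTTP server listens for prediction requests on. TorchServe always listens for prediction requests on the /predictions/MODEL path.

MODEL is the name of the model that you specified when you started TorchServe. In this case, the name is mnist, which you set in this Docker instruction from a previous section: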
```
CMD ["torchserve", \
  "--start", \
  "--ts-config=/home/model-server/config.properties", \
  "--models", \
  "mnist=mnist.mar"]
```
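
To make the link between the `--models` argument and the `--predict-route` flag concrete, the following Python sketch derives the route from a single `name=archive` pair such as the one in the CMD instruction. The helper `predict_route` is hypothetical, and the sketch assumes exactly one model is listed.

```python
def predict_route(models_arg: str) -> str:
    """Derive /predictions/MODEL from one "MODEL=ARCHIVE" pair."""
    model_name, separator, archive = models_arg.partition("=")
    if not separator or not model_name or not archive:
        raise ValueError(f"expected MODEL=ARCHIVE, got {models_arg!r}")
    return f"/predictions/{model_name}"
```

For the value used in this tutorial, `predict_route("mnist=mnist.mar")` gives the path passed to `--predict-route`.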

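Getting a prediction

The TorchServe example files that you downloaded in a previous section include test images. The container's TorchServe configuration expects to receive prediction requests in JSON format, with the image as a base64-encoded string in the data.b64 field of each instance.

For example, to classify test_data/3.png, run the following commands in your shell: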

```
cat > instances.json <<END
{
  "instances": [
    {
      "data": {
        "b64": "$(base64 --wrap=0 test_data/3.png)"
      }
    }
  ]
}
END

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @instances.json \
  https://REGION-ml.googleapis.com/v1/projects/PROJECT_ID/models/getting_started_pytorch/versions/v1:predict
```

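Replace the following:

- REGION: the region where you created your AI Platform Prediction model in a previous section
- PROJECT_ID: the ID of your Google Cloud project

If successful, the model version returns the following prediction: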

```
{"predictions": [3]}
```
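
For callers that are not using curl, the following Python sketch shows one way to assemble the same request with the standard library. The URL pattern and headers are taken from the curl command in this section; the helper names are hypothetical, and the access token is assumed to come from `gcloud auth print-access-token` or a similar source.

```python
import json
import urllib.request


def predict_url(region: str, project_id: str,
                model: str = "getting_started_pytorch", version: str = "v1") -> str:
    """Build the regional endpoint URL for a model version's predict method."""
    return (f"https://{region}-ml.googleapis.com/v1/projects/{project_id}"
            f"/models/{model}/versions/{version}:predict")


def build_request(url: str, body: bytes, access_token: str) -> urllib.request.Request:
    """Prepare (but do not send) the authenticated POST request."""
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


def extract_predictions(response_body: str) -> list:
    """Return the predictions list, or raise if the response has none."""
    payload = json.loads(response_body)
    if "predictions" not in payload:
        raise ValueError(f"no predictions in response: {payload}")
    return payload["predictions"]
```

Passing the prepared request to `urllib.request.urlopen` sends it; the response body can then be decoded and handed to `extract_predictions`.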

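Cleaning up

To avoid incurring further AI Platform Prediction charges and Artifact Registry charges, delete the Google Cloud resources that you created during this tutorial:

To delete your model version, run the following command in your shell: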
```
gcloud ai-platform versions delete v1 \
  --region=REGION \
  --model=getting_started_pytorch \
  --quiet
```
Replace REGION with the region where you created your model in a previous section.
To delete your model, run the following command in your shell:
```
gcloud ai-platform models delete getting_started_pytorch \
  --region=REGION \
  --quiet
```
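Replace REGION with the region where you created your model in a previous section.
To delete your Artifact Registry repository and the container image in it, run the following command in your shell: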
```
gcloud beta artifacts repositories delete getting-started-pytorch \
  --location=REGION \
  --quiet
```
Replace REGION with the region where you created your Artifact Registry repository in a previous section.
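
If you want to script the cleanup, the following Python sketch lists the three commands from this section in the order in which the tutorial runs them (model version, then model, then repository). It only builds the argument lists; `cleanup_commands` is a hypothetical helper, and running the commands (for example with `subprocess.run`) is left to the caller.

```python
from typing import List


def cleanup_commands(region: str) -> List[List[str]]:
    """Return the tutorial's delete commands as argument lists, in order."""
    return [
        # 1. Delete the model version.
        ["gcloud", "ai-platform", "versions", "delete", "v1",
         f"--region={region}", "--model=getting_started_pytorch", "--quiet"],
        # 2. Delete the model.
        ["gcloud", "ai-platform", "models", "delete", "getting_started_pytorch",
         f"--region={region}", "--quiet"],
        # 3. Delete the Artifact Registry repository and the image in it.
        ["gcloud", "beta", "artifacts", "repositories", "delete",
         "getting-started-pytorch", f"--location={region}", "--quiet"],
    ]
```

Because every command includes `--quiet`, none of them prompts for confirmation, so review the list before executing it.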

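What's next

- If you want to design your own container image, either from scratch or by deriving from an existing third-party container image, read Custom container requirements.
- Learn more about using a custom container for prediction, including compatibility with other AI Platform Prediction features and the configuration options that you can specify for your container during deployment.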