This page describes how to use Vertex AI to export your AutoML tabular model to Cloud Storage, download the model to an on-premises server or a server hosted by another cloud provider, and then use Docker to make the model available for predictions.
For information about exporting image and video Edge models, see Export AutoML Edge models.
After exporting your tabular model, if you want to import it back into Vertex AI, see Import models to Vertex AI.
Limitations
Exporting AutoML tabular models has the following limitations:
You can export AutoML tabular classification and regression models only. Exporting AutoML tabular forecasting models is not supported.
Vertex Explainable AI is not available using exported tabular models. If you need to use Vertex Explainable AI, you must serve predictions from a model hosted by Vertex AI.
The exported tabular model can run only on x86 architecture CPUs that support Advanced Vector Extensions (AVX) instruction sets.
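Before you export, you might want to confirm that the target server meets the AVX requirement. The following is a minimal sketch, assuming a Linux host that exposes CPU flags in /proc/cpuinfo; the function names are illustrative and not part of any Vertex AI tooling.

```python
def has_avx(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists avx."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Only the 'flags' lines enumerate CPU features on x86 Linux.
        if key.strip() == "flags" and "avx" in value.split():
            return True
    return False


def host_supports_avx(path: str = "/proc/cpuinfo") -> bool:
    """Check the local host; returns False if the file cannot be read."""
    try:
        with open(path, encoding="utf-8") as f:
            return has_avx(f.read())
    except OSError:
        return False
```

On hosts without /proc/cpuinfo (for example, non-Linux systems), this sketch reports False, so treat a negative result there as "unknown" rather than "unsupported".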
Export process
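The steps for exporting your model are:
- Set up your environment.
- Export the model.
- Pull and run the model server.
- Request predictions.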
Before you begin
Before you can complete this task, you must have completed the following tasks:
- Set up your project as described in Setting up the cloud environment.
- Train the model that you want to download.
- Install and initialize the Google Cloud CLI on the server you will use to run the exported model.
- Install Docker on your server.
Export the model
Console
In the Google Cloud console, in the Vertex AI section, go to the Models page.
Click the tabular model you want to export to open its details page.
Click Export in the button bar to export your model.
Select or create a Cloud Storage folder in the desired location.
The bucket must meet the bucket requirements.
You cannot export a model to a top-level bucket. You must use at least one level of folder.
For best results, create a new, empty folder. You will copy the entire contents of the folder in a later step.
Click Export.
You will download the exported model to your server in the next section.
REST
You use the models.export method to export a model to Cloud Storage.
Before using any of the request data, make the following replacements:
- LOCATION: Your region.
- PROJECT: Your project ID.
- MODEL_ID: The ID of the model you want to export.
- GCS_DESTINATION: Your destination folder in Cloud Storage. For example, gs://export-bucket/exports.
You cannot export a model to a top-level bucket. You must use at least one level of folder.
The folder must conform to the bucket requirements.
For best results, create a new folder. You will copy the entire contents of the folder in a later step.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/models/MODEL_ID:export
Request JSON body:
{ "outputConfig": { "exportFormatId": "tf-saved-model", "artifactDestination": { "outputUriPrefix": "GCS_DESTINATION" } } }
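If you prefer to generate the URL and request body programmatically, the following is a minimal sketch. It fills in the placeholders described above and rejects a destination that is only a top-level bucket. The helper names (build_export_request, write_request_file) are hypothetical; only the URL pattern and JSON fields come from this page.

```python
import json
from urllib.parse import urlparse


def build_export_request(location, project, model_id, gcs_destination):
    """Return (url, body) for models.export; raise ValueError on a bad destination."""
    parsed = urlparse(gcs_destination)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError("destination must look like gs://BUCKET/FOLDER")
    # A top-level bucket has an empty path (or just "/"), which is not allowed.
    if not parsed.path.strip("/"):
        raise ValueError("destination must include at least one folder level")
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/models/{model_id}:export"
    )
    body = {
        "outputConfig": {
            "exportFormatId": "tf-saved-model",
            "artifactDestination": {"outputUriPrefix": gcs_destination},
        }
    }
    return url, body


def write_request_file(body, path="request.json"):
    """Save the body so it can be sent with curl or PowerShell."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(body, f, indent=2)
```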
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/models/MODEL_ID:export"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/models/MODEL_ID:export" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION/models/MODEL_ID/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.aiplatform.v1.ExportModelOperationMetadata", "genericMetadata": { "createTime": "2020-10-12T20:53:40.130785Z", "updateTime": "2020-10-12T20:53:40.130785Z" }, "outputInfo": { "artifactOutputUri": "gs://OUTPUT_BUCKET/model-MODEL_ID/EXPORT_FORMAT/YYYY-MM-DDThh:mm:ss.sssZ" } } }
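The response carries two values you are likely to need later: the operation name (to check status) and the Cloud Storage URI of the exported artifacts. The following is a minimal sketch of extracting them, assuming the response has the shape shown above; parse_export_response is an illustrative name.

```python
import json


def parse_export_response(response_text):
    """Pull the operation name, operation ID, and output URI from the response."""
    data = json.loads(response_text)
    operation_name = data["name"]
    return {
        "operation_name": operation_name,
        # The operation ID is the last path segment of the operation name.
        "operation_id": operation_name.rsplit("/", 1)[-1],
        # outputInfo may be absent in some responses, so read it defensively.
        "artifact_output_uri": data.get("metadata", {})
        .get("outputInfo", {})
        .get("artifactOutputUri"),
    }
```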
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
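The following is a minimal sketch of an export with the Vertex AI SDK for Python, not an official sample. It assumes the SDK's Model class provides an export_model method that accepts export_format_id and artifact_destination; confirm the signature in the Python API reference before relying on it. The function names export_tabular_model and main are illustrative, and PROJECT, LOCATION, and MODEL_ID are the same placeholders used in the REST instructions.

```python
def export_tabular_model(model, gcs_destination):
    """Export `model` in tf-saved-model format to a Cloud Storage folder.

    `model` is expected to be a google.cloud.aiplatform.Model, or any object
    that exposes the same export_model method.
    """
    return model.export_model(
        export_format_id="tf-saved-model",
        artifact_destination=gcs_destination,
    )


def main():
    # Imported here so the helper above can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project="PROJECT", location="LOCATION")
    model = aiplatform.Model(model_name="MODEL_ID")
    print(export_tabular_model(model, "gs://export-bucket/exports"))
```

Call main() after you set up Application Default Credentials and replace the placeholders.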
Get the status of an export operation
Some requests start long-running operations that require time to complete. These requests return an operation name, which you can use to view the operation's status or cancel the operation. Vertex AI provides helper methods to make calls against long-running operations. For more information, see Working with long-running operations.
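As an illustration of checking status, the following sketch polls an operation until it reports completion. It assumes the operation resource follows the usual long-running operation shape, with a boolean done field and an error field on failure. fetch_operation is a hypothetical callable that you supply (for example, a wrapper around an authenticated GET on the operation name); it is not a Vertex AI helper.

```python
import time


def wait_for_operation(
    fetch_operation,
    operation_name,
    poll_seconds=10.0,
    timeout_seconds=1800.0,
    sleep=time.sleep,
):
    """Poll until the operation is done; return the final operation dict."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch_operation(operation_name)
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{operation_name} did not finish in time")
        sleep(poll_seconds)
```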
Pull and run the model server
In this task, you will download your exported model from Cloud Storage and start the Docker container, so your model is ready to receive prediction requests.
To pull and run the model server:
On the machine where you will run the model, change to the directory where you want to save the exported model.
Download the exported model:
gcloud storage cp <gcs-destination> . --recursive
Where <gcs-destination> is the path to the location of the exported model in Cloud Storage.
The model is copied to your current directory, under the following path:
./model-<model-id>/tf-saved-model/<export-timestamp>
The path may contain either tf-saved-model or custom-trained.
Rename the directory so the timestamp is removed.
mv model-<model-id>/tf-saved-model/<export-timestamp> model-<model-id>/tf-saved-model/<new-dir-name>
The timestamp makes the directory invalid for Docker.
Pull the model server Docker image.
sudo docker pull MODEL_SERVER_IMAGE
The model server image to pull is listed in the environment.json file in the exported model directory. It should have the following path:
./model-<model-id>/tf-saved-model/<new-dir-name>/environment.json
If no environment.json file is present, use:
MULTI_REGION-docker.pkg.dev/vertex-ai/automl-tabular/prediction-server-v1
Replace MULTI_REGION with us, europe, or asia to select which Docker repository you want to pull the Docker image from. Each repository provides the same Docker image, but choosing the Artifact Registry multi-region closest to the machine where you are running Docker might reduce latency.
Start the Docker container, using the directory name you just created:
docker run -v `pwd`/model-<model-id>/tf-saved-model/<new-dir-name>:/models/default -p 8080:8080 -it MODEL_SERVER_IMAGE
You can stop the model server at any time by using Ctrl-C.
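The rename and run steps above can be scripted. The following is a minimal sketch, assuming the export directory contains exactly one timestamped subdirectory. A timestamp such as 2020-10-12T20:53:40.130785Z contains colons, which presumably clash with the host:container separator of the docker -v flag. All function names here are illustrative, and the fallback image is the one given above for exports without an environment.json file.

```python
import os


def fallback_image(multi_region):
    """Build the fallback model server image URI for us, europe, or asia."""
    if multi_region not in ("us", "europe", "asia"):
        raise ValueError("multi_region must be us, europe, or asia")
    return f"{multi_region}-docker.pkg.dev/vertex-ai/automl-tabular/prediction-server-v1"


def remove_timestamp(format_dir, new_dir_name="model"):
    """Rename the single timestamped subdirectory of format_dir; return its new path."""
    subdirs = [
        name for name in os.listdir(format_dir)
        if os.path.isdir(os.path.join(format_dir, name))
    ]
    if len(subdirs) != 1:
        raise RuntimeError(f"expected one export directory, found {len(subdirs)}")
    target = os.path.join(format_dir, new_dir_name)
    os.rename(os.path.join(format_dir, subdirs[0]), target)
    return target


def docker_run_command(model_dir, image, host_port=8080):
    """Return the docker run argument list that mounts model_dir at /models/default."""
    mount = f"{os.path.abspath(model_dir)}:/models/default"
    return ["docker", "run", "-v", mount, "-p", f"{host_port}:8080", "-it", image]
```

You can pass the returned list to subprocess.run, or join it with spaces to inspect the command before you run it.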
Update the model server Docker container
Because you download the model server Docker container when you export the model, you must explicitly update the model server to get updates and bug fixes. You should update the model server periodically, using the following command:
docker pull MODEL_SERVER_IMAGE
Make sure the Docker image URI matches the URI of the Docker image that you pulled previously.
Get predictions from the exported model
The model server in the Vertex AI image container handles prediction requests and returns prediction results.
Batch prediction is not available for exported models.
Prediction data format
You provide the data (payload field) for your prediction request in the following JSON format:
{ "instances": [ { "column_name_1": value, "column_name_2": value, … } , … ] }
The following example shows a request with three columns: a categorical column, a numeric array, and a struct. The request includes two rows.
{ "instances": [ { "categorical_col": "mouse", "num_array_col": [ 1, 2, 3 ], "struct_col": { "foo": "piano", "bar": "2019-05-17T23:56:09.05Z" } }, { "categorical_col": "dog", "num_array_col": [ 5, 6, 7 ], "struct_col": { "foo": "guitar", "bar": "2019-06-17T23:56:09.05Z" } } ] }
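If your rows already exist as Python dictionaries, you can generate the payload instead of writing it by hand. The following is a minimal sketch; build_prediction_payload is an illustrative name, and the column names are taken from the example above.

```python
import json


def build_prediction_payload(rows):
    """Wrap a sequence of {column_name: value} dicts in the 'instances' envelope."""
    rows = list(rows)
    if not rows:
        raise ValueError("at least one row is required")
    for index, row in enumerate(rows):
        if not isinstance(row, dict):
            raise TypeError(f"row {index} must be a dict of column names to values")
    return json.dumps({"instances": rows})


example_rows = [
    {
        "categorical_col": "mouse",
        "num_array_col": [1, 2, 3],
        "struct_col": {"foo": "piano", "bar": "2019-05-17T23:56:09.05Z"},
    },
    {
        "categorical_col": "dog",
        "num_array_col": [5, 6, 7],
        "struct_col": {"foo": "guitar", "bar": "2019-06-17T23:56:09.05Z"},
    },
]
payload = build_prediction_payload(example_rows)
```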
Make the prediction request
Put your request data into a text file, for example, /tmp/request.json.
The number of rows of data in the prediction request, called the mini-batch size, affects the prediction latency and throughput. The larger the mini-batch size, the higher the latency and throughput. For reduced latency, use a smaller mini-batch size. For increased throughput, increase the mini-batch size. The most commonly used mini-batch sizes are 1, 32, 64, 128, 256, 512, and 1024.
Request the prediction:
curl -X POST --data @/tmp/request.json http://localhost:8080/predict
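To experiment with mini-batch sizes, you can split your rows into batches and send each batch as its own request. The following is a minimal sketch that uses only the Python standard library and the same local endpoint as the curl command above. The function names are illustrative, and the sketch assumes that the server returns a JSON object with a predictions list, as described in the results format on this page.

```python
import json
import urllib.request


def mini_batches(rows, batch_size):
    """Yield consecutive slices of rows, each with at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def predict(rows, url="http://localhost:8080/predict"):
    """POST one mini-batch to the model server and return its predictions."""
    data = json.dumps({"instances": rows}).encode("utf-8")
    # No Content-Type header is set, which mirrors the curl --data command.
    request = urllib.request.Request(url, data=data, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["predictions"]


def predict_in_batches(rows, batch_size=32, send=predict):
    """Send rows in mini-batches and return all predictions in input order."""
    predictions = []
    for batch in mini_batches(rows, batch_size):
        predictions.extend(send(batch))
    return predictions
```

The send parameter lets you substitute a stub for the HTTP call, for example when you want to test batching without a running container.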
Prediction results format
The format of your results depends on your model objective.
Classification model results
Prediction results for classification models (binary and multi-class) return a probability score for each potential value of the target column. You must determine how you want to use the scores. For example, to get a binary classification from the provided scores, you would identify a threshold value. If there are two classes, "A" and "B", you should classify the example as "A" if the score for "A" is greater than the chosen threshold, and "B" otherwise. For imbalanced datasets, the threshold might approach 100% or 0%.
The results payload for a classification model looks similar to this example:
{ "predictions": [ { "scores": [ 0.539999994635582, 0.2599999845027924, 0.2000000208627896 ], "classes": [ "apple", "orange", "grape" ] }, { "scores": [ 0.23999999463558197, 0.35999998450279236, 0.40000002086278963 ], "classes": [ "apple", "orange", "grape" ] } ] }
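How you turn scores into labels is up to you. The following is a minimal sketch of two common choices, assuming the scores and classes lists are aligned by position as in the example above: taking the highest-scoring class, and applying a threshold to a two-class prediction. The function names are illustrative.

```python
def top_class(prediction):
    """Return (class, score) for the highest-scoring class."""
    scores, classes = prediction["scores"], prediction["classes"]
    best = max(range(len(scores)), key=scores.__getitem__)
    return classes[best], scores[best]


def classify_binary(prediction, positive_class, threshold):
    """Return positive_class if its score exceeds threshold, else the other class."""
    classes = prediction["classes"]
    if len(classes) != 2 or positive_class not in classes:
        raise ValueError("expected a two-class prediction containing positive_class")
    score = prediction["scores"][classes.index(positive_class)]
    if score > threshold:
        return positive_class
    return next(c for c in classes if c != positive_class)
```

For an imbalanced dataset, you would typically pick the threshold by evaluating candidate values on held-out data rather than defaulting to 0.5.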
Regression model results
A predicted value is returned for each valid row of the prediction request. Prediction intervals are not returned for exported models.
The results payload for a regression model looks similar to this example:
{ "predictions": [ { "value": -304.3663330078125, "lower_bound": -56.32196807861328, "upper_bound": 126.51904296875 }, { "value": -112.3663330078125, "lower_bound": 16.32196807861328, "upper_bound": 255.51904296875 } ] }
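The following is a minimal sketch of reading a regression response in Python. Because the text above says that prediction intervals are not returned while the example payload includes lower_bound and upper_bound, the sketch treats both bound fields as optional. summarize_regression is an illustrative name.

```python
def summarize_regression(response):
    """Return one dict per prediction with the value and any bounds present."""
    rows = []
    for prediction in response["predictions"]:
        rows.append({
            "value": prediction["value"],
            # Bounds are read defensively; they are None when absent.
            "lower_bound": prediction.get("lower_bound"),
            "upper_bound": prediction.get("upper_bound"),
        })
    return rows
```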
What's next
- Learn how to import your exported tabular model back into Vertex AI.