# Export model artifacts for prediction

| **Preview:** Online Prediction is a Preview feature that is available as-is and is not recommended for production environments. Google provides no service-level agreements (SLA) or technical support commitments for Preview features. For more information, see GDC's [feature stages](/distributed-cloud/hosted/docs/latest/gdch/resources/feature-stages).
Google Distributed Cloud (GDC) air-gapped offers
[prebuilt containers](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#available-container-images)
to serve online predictions from models trained using the following
machine learning (ML) frameworks:

- TensorFlow
- PyTorch
To use one of these prebuilt containers, you must save your model as one or
more *model artifacts* that comply with the requirements of the prebuilt
container. These requirements apply whether or not your model artifacts are
created on Distributed Cloud.
## Before you begin

Before exporting model artifacts, perform the following steps:

1. Create and train a prediction model targeting one of the
   [supported containers](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#available-container-images).
2. If you don't have a project,
   [set up a project for Vertex AI](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-set-up-project).
3. Work with your Infrastructure Operator (IO) to
   [create the prediction cluster](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/prediction-user-cluster).

   The IO creates the cluster for you, associates it with your project, and
   assigns the appropriate node pools within the cluster, considering the
   resources you need for online predictions.
4. [Create a storage bucket for your project](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/create-storage-buckets).
5. Create the Vertex AI Default Serving
   (`vai-default-serving-sa`) service account within your project. For
   information about service accounts, see
   [Set up service accounts](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-set-up-project#set-up-service).
6. Grant the Project Bucket Object Viewer (`project-bucket-object-viewer`) role
   to the Vertex AI Default Serving (`vai-default-serving-sa`)
   service account for the storage bucket you created. For information
   about granting bucket access to service accounts, see
   [Grant bucket access](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/grant-obtain-storage-access#grant_bucket_access).
7. To get the permissions that you need to access Online Prediction,
   ask your Project IAM Admin to grant you the Vertex AI
   Prediction User (`vertex-ai-prediction-user`) role. For information about
   this role, see
   [Prepare IAM permissions](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-ao-permissions).
## Framework-specific requirements for exporting to prebuilt containers

Depending on
[the ML framework you plan to use for prediction](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#available-container-images),
you must export model artifacts in different formats. The following sections
describe the acceptable model formats for each ML framework.

| **Important:** To access the URLs listed on this page, you must connect to the internet. The URLs are provided to access outside of your air-gapped environment.

### TensorFlow

If you [use TensorFlow to train a model](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#tf),
export your model as a
[TensorFlow SavedModel directory](https://www.tensorflow.org/guide/saved_model).
There are several ways to export `SavedModels` from TensorFlow training
code. The following list describes a few ways that work for various
TensorFlow APIs:

- If you use Keras for training,
  [use `tf.keras.Model.save` to export a SavedModel](https://www.tensorflow.org/guide/keras/save_and_serialize#whole-model_saving_loading).

- If you use an Estimator for training,
  [use `tf.estimator.Estimator.export_saved_model` to export a SavedModel](https://www.tensorflow.org/guide/estimator#savedmodels_from_estimators).

- Otherwise,
  [use `tf.saved_model.save`](https://www.tensorflow.org/guide/saved_model#saving_a_custom_model)
  or
  [use `tf.compat.v1.saved_model.SavedModelBuilder`](https://www.tensorflow.org/api_docs/python/tf/compat/v1/saved_model/builder).

If you are not using Keras or an Estimator, then make sure to
[use the `serve` tag and `serving_default` signature when you export your SavedModel](https://www.tensorflow.org/tfx/serving/serving_basic#train_and_export_tensorflow_model)
to ensure Vertex AI can use your model artifacts to serve
predictions. Keras and Estimator handle this task automatically.
Learn more about
[specifying signatures during export](https://www.tensorflow.org/guide/saved_model#specifying_signatures_during_export).

To serve predictions using these artifacts, create a `Model` with the
[prebuilt container for prediction](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#tf)
matching the version of TensorFlow that you used for training.
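As a minimal sketch of the Keras path (the model architecture and export path
are illustrative placeholders, and this assumes TensorFlow 2.x with Keras 2
semantics, where `Model.save` can write the SavedModel format):

    import tensorflow as tf

    # Placeholder model; substitute your trained model.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    # Writes a SavedModel directory containing the `serve` tag and a
    # `serving_default` signature. (Keras 3, bundled with newer TensorFlow
    # releases, uses `model.export("export/my-model/1")` instead.)
    model.save("export/my-model/1", save_format="tf")

After export, the directory contains `saved_model.pb` plus `variables/`,
which together form the model artifacts that the prebuilt TensorFlow
container loads.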
### PyTorch
If you [use PyTorch to train a model](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#pt),
you must package the model artifacts including either a
[default](https://pytorch.org/serve/#default-handlers) or
[custom](https://pytorch.org/serve/custom_service.html)
handler by creating an archive file using
[Torch model archiver](https://github.com/pytorch/serve/tree/master/model-archiver).
The prebuilt PyTorch images expect the archive to be named `model.mar`, so make
sure you set the model name to *model*.
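As an illustrative sketch (the network, file names, and the choice of the
`image_classifier` default handler are placeholder assumptions, not part of
this guide), you could serialize the model with TorchScript and then archive
it:

    import torch

    # Placeholder network; substitute your trained model.
    model = torch.nn.Sequential(torch.nn.Linear(784, 10))
    model.eval()

    # TorchScript makes the archive self-contained, so torch-model-archiver
    # does not need a separate --model-file with the class definition.
    torch.jit.script(model).save("my_model.pt")

    # Then, from a shell, archive with the model name `model` so the output
    # file is model.mar:
    #
    #   torch-model-archiver --model-name model --version 1.0 \
    #       --serialized-file my_model.pt --handler image_classifier \
    #       --export-path model_store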
For information about optimizing the memory usage, latency, or throughput of a
PyTorch model served with TorchServe, see the
[PyTorch performance guide](https://github.com/pytorch/serve/blob/master/docs/performance_guide.md).
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-25 UTC."],[[["\u003cp\u003eOnline Prediction is a Preview feature not recommended for production use, with no SLAs or technical support commitments from Google.\u003c/p\u003e\n"],["\u003cp\u003eGoogle Distributed Cloud (GDC) air-gapped provides prebuilt containers for serving online predictions from models trained using TensorFlow or PyTorch.\u003c/p\u003e\n"],["\u003cp\u003eTo use prebuilt containers, models must be saved as compliant model artifacts, and the process requires setting up a project, creating a prediction cluster, and establishing a storage bucket.\u003c/p\u003e\n"],["\u003cp\u003eDepending on whether you use TensorFlow or PyTorch, model artifacts must be exported in different formats, such as a TensorFlow SavedModel directory or a PyTorch archive file named \u003ccode\u003emodel.mar\u003c/code\u003e.\u003c/p\u003e\n"],["\u003cp\u003eModels must be uploaded to a designated storage bucket with a specific structure: \u003ccode\u003es3://<BUCKET_NAME>/<MODEL_ID>/<MODEL_VERSION_ID>\u003c/code\u003e.\u003c/p\u003e\n"]]],[],null,["# Export model artifacts for prediction\n\n| **Preview:** Online Prediction is a Preview feature that is available as-is and is not recommended for production environments. Google provides no service-level agreements (SLA) or technical support commitments for Preview features. For more information, see GDC's [feature stages](/distributed-cloud/hosted/docs/latest/gdch/resources/feature-stages).\n\nGoogle Distributed Cloud (GDC) air-gapped offers\n[prebuilt containers](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#available-container-images)\nto serve online predictions from models trained using the following\nmachine learning (ML) frameworks:\n\n- TensorFlow\n- PyTorch\n\nTo use one of these prebuilt containers, you must save your model as one or\nmore *model artifacts* that comply with the requirements of the prebuilt\ncontainer. These requirements apply whether or not your model artifacts are\ncreated on Distributed Cloud.\n\nBefore you begin\n----------------\n\nBefore exporting model artifacts, perform the following steps:\n\n1. Create and train a prediction model targeting one of the [supported containers](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#available-container-images).\n2. If you don't have a project, [set up a project for Vertex AI](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-set-up-project).\n3. Work with your Infrastructure Operator (IO) to\n [create the prediction cluster](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/prediction-user-cluster).\n\n The IO creates the cluster for you, associates it with your project, and\n assigns the appropriate node pools within the cluster, considering the\n resources you need for online predictions.\n4. [Create a storage bucket for your project](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/create-storage-buckets).\n\n5. Create the Vertex AI Default Serving\n (`vai-default-serving-sa`) service account within your project. 
For\n information about service accounts, see\n [Set up service accounts](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-set-up-project#set-up-service).\n\n6. Grant the Project Bucket Object Viewer (`project-bucket-object-viewer`) role\n to the Vertex AI Default Serving (`vai-default-serving-sa`)\n service account for the storage bucket you created. For information\n about granting bucket access to service accounts, see\n [Grant bucket access](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/grant-obtain-storage-access#grant_bucket_access).\n\n7. To get the permissions that you need to access Online Prediction,\n ask your Project IAM Admin to grant you the Vertex AI\n Prediction User (`vertex-ai-prediction-user`) role. For information about\n this role, see [Prepare IAM permissions](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-ao-permissions).\n\nFramework-specific requirements for exporting to prebuilt containers\n--------------------------------------------------------------------\n\nDepending on\n[the ML framework you plan to use for prediction](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#available-container-images),\nyou must export model artifacts in different formats. The following sections\ndescribe the acceptable model formats for each ML framework.\n| **Important:** To access the URLs listed on this page, you must connect to the internet. The URLs are provided to access outside of your air-gapped environment.\n\n### TensorFlow\n\nIf you [use TensorFlow to train a model](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#tf),\nexport your model as a [TensorFlow SavedModel directory](https://www.tensorflow.org/guide/saved_model).\n\nThere are several ways to export `SavedModels` from TensorFlow training\ncode. The following list describes a few ways that work for various\nTensorFlow APIs:\n\n- If you use Keras for training,\n [use `tf.keras.Model.save` to export a SavedModel](https://www.tensorflow.org/guide/keras/save_and_serialize#whole-model_saving_loading).\n\n- If you use an Estimator for training,\n [use `tf.estimator.Estimator.export_saved_model` to export a SavedModel](https://www.tensorflow.org/guide/estimator#savedmodels_from_estimators).\n\n- Otherwise,\n [use `tf.saved_model.save`](https://www.tensorflow.org/guide/saved_model#saving_a_custom_model)\n or\n [use `tf.compat.v1.saved_model.SavedModelBuilder`](https://www.tensorflow.org/api_docs/python/tf/compat/v1/saved_model/builder).\n\nIf you are not using Keras or an Estimator, then make sure to\n[use the `serve` tag and `serving_default` signature when you export your SavedModel](https://www.tensorflow.org/tfx/serving/serving_basic#train_and_export_tensorflow_model)\nto ensure Vertex AI can use your model artifacts to serve\npredictions. 
Keras and Estimator handle this task automatically.\nLearn more about\n[specifying signatures during export](https://www.tensorflow.org/guide/saved_model#specifying_signatures_during_export).\n\nTo serve predictions using these artifacts, create a `Model` with the\n[prebuilt container for prediction](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#tf)\nmatching the version of TensorFlow that you used for training.\n\n### PyTorch\n\nIf you [use PyTorch to train a model](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-online-predictions#pt),\nyou must package the model artifacts including either a\n[default](https://pytorch.org/serve/#default-handlers) or\n[custom](https://pytorch.org/serve/custom_service.html)\nhandler by creating an archive file using\n[Torch model archiver](https://github.com/pytorch/serve/tree/master/model-archiver).\nThe prebuilt PyTorch images expect the archive to be named `model.mar`, so make\nsure you set the model name to *model*.\n\nFor information about optimizing the memory usage, latency, or throughput of a\nPyTorch model served with TorchServe, see the\n[PyTorch performance guide](https://github.com/pytorch/serve/blob/master/docs/performance_guide.md).\n\nUpload your model\n-----------------\n\nYou must upload your model to [the storage bucket you created](#storage-bucket).\nFor more information about uploading objects to storage buckets, see\n[Upload and download storage objects in projects](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/upload-download-storage-objects).\n\nThe path to the storage bucket of your model must have the following structure: \n\n s3://\u003cvar translate=\"no\"\u003eBUCKET_NAME\u003c/var\u003e/\u003cvar translate=\"no\"\u003eMODEL_ID\u003c/var\u003e/\u003cvar translate=\"no\"\u003eMODEL_VERSION_ID\u003c/var\u003e\n\nFor export details, see the\n[framework-specific requirements for exporting to prebuilt containers](#framework-specific-requirements)."]]
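As an illustration of that path structure, a minimal upload sketch using
`boto3` against the bucket's S3-compatible API; the endpoint URL, credential
setup, bucket name, and IDs are placeholder assumptions:

    import boto3

    # Placeholder endpoint and names; substitute your bucket's values and
    # configure credentials as described in the storage documentation.
    s3 = boto3.client("s3", endpoint_url="https://objectstorage.example.com")

    # The object key follows the required MODEL_ID/MODEL_VERSION_ID layout.
    s3.upload_file(
        Filename="model_store/model.mar",
        Bucket="my-models-bucket",
        Key="my-model/1/model.mar",
    )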