Vertex AI offers online predictions on Google Distributed Cloud (GDC)
air-gapped through the Online Prediction API. A prediction is the output of a
trained machine-learning model. Specifically, online predictions are
synchronous requests made to your model endpoint.

Online Prediction lets you upload, deploy, serve, and make requests using your
own prediction models on a set of supported containers.

Use Online Prediction when making requests in response to application input or
in situations requiring timely inference.

You can use the Online Prediction API by applying Kubernetes custom resources
to the dedicated prediction cluster that your Infrastructure Operator (IO)
creates for you.
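
As a rough illustration of what applying a custom resource to the prediction
cluster can look like, the following Python sketch builds a manifest as JSON
(which `kubectl` accepts alongside YAML) and assembles the matching
`kubectl apply` command. The API group, kind, spec fields, file name, and
kubeconfig path are hypothetical placeholders chosen for this example, not the
documented Online Prediction schema. Substitute the values from the Online
Prediction API reference and the kubeconfig of your prediction cluster.

```python
import json
import shlex

# Hypothetical placeholders: the API group/version, kind, and spec fields are
# illustrative assumptions, not the documented Online Prediction schema.
API_VERSION = "prediction.example.com/v1"
KIND = "ExampleDeployedModel"


def build_custom_resource(name, namespace, spec):
    """Return a Kubernetes-style custom resource manifest as a dictionary."""
    return {
        "apiVersion": API_VERSION,
        "kind": KIND,
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


def kubectl_apply_command(manifest_path, kubeconfig):
    """Build the argument list for applying a manifest file with kubectl."""
    return ["kubectl", "--kubeconfig", kubeconfig, "apply", "-f", manifest_path]


# Example values are assumptions made up for this sketch.
manifest = build_custom_resource(
    name="my-model",
    namespace="my-project",
    spec={"modelPath": "s3://my-bucket/model", "framework": "tensorflow"},
)

# Serialize the manifest; write this text to the file passed to kubectl.
manifest_json = json.dumps(manifest, indent=2)
command = kubectl_apply_command(
    "deployed-model.json", "prediction-cluster.kubeconfig"
)

print(manifest_json)
print(shlex.join(command))
```

The sketch only prints the manifest and the command rather than running
`kubectl`, so you can review both before applying anything to the cluster.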
Last updated 2025-04-09 UTC.