# Learn about online predictions

Vertex AI offers online predictions on Google Distributed Cloud (GDC) air-gapped through the Online Prediction API. A prediction is the output of a trained machine-learning model. Online predictions, specifically, are synchronous requests made to your model endpoint.

Online Prediction lets you upload, deploy, and serve your own prediction models, and send requests to them, on [a set of supported containers](#available-container-images).

Use Online Prediction when you make requests in response to application input or in other situations that require timely inference.
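Because an online prediction is a single synchronous request, the client's main job is to assemble one payload and wait for the response. The following is a minimal sketch of building such a payload. It assumes the endpoint accepts a JSON object with an `instances` list and an optional `parameters` object, which is the general Vertex AI convention and may differ for your serving container. `build_prediction_request` is a hypothetical helper, not part of any GDC SDK.

```python
import json
from typing import Any, Dict, List, Optional


def build_prediction_request(
    instances: List[Any],
    parameters: Optional[Dict[str, Any]] = None,
) -> str:
    """Serialize a prediction request body as a JSON string.

    Assumption: the endpoint expects {"instances": [...]} with an optional
    "parameters" object. Check the request-formatting guide for the schema
    your container actually requires.
    """
    if not instances:
        raise ValueError("at least one instance is required")
    body: Dict[str, Any] = {"instances": instances}
    if parameters is not None:
        body["parameters"] = parameters
    return json.dumps(body)


# Hypothetical input: one instance with three numeric features.
request_body = build_prediction_request([[1.0, 2.0, 3.0]])
print(request_body)
```

The resulting string would be sent as the body of one HTTP request to the model endpoint, and the caller blocks until the prediction comes back.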
You use the Online Prediction API by applying Kubernetes custom resources to the dedicated [prediction cluster](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/prediction-user-cluster) that your Infrastructure Operator (IO) creates for you.
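Every Kubernetes object, custom resources included, shares the same top-level shape: `apiVersion`, `kind`, `metadata`, and `spec`. The sketch below assembles a manifest with that shape. The API group, kind, names, and `spec` fields are all placeholders chosen for illustration, not the real Online Prediction resource definitions, so take the actual values from the deployment guide.

```python
import json
from typing import Any, Dict


def build_custom_resource(
    api_version: str,
    kind: str,
    name: str,
    namespace: str,
    spec: Dict[str, Any],
) -> Dict[str, Any]:
    """Assemble the standard top-level fields of a Kubernetes object."""
    return {
        "apiVersion": api_version,
        "kind": kind,
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


# Every value below is a placeholder, not a real GDC resource definition.
manifest = build_custom_resource(
    api_version="example.invalid/v1",
    kind="ExampleDeployedModel",
    name="my-model",
    namespace="my-project",
    spec={"modelPath": "PLACEHOLDER"},
)

manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Since `kubectl apply -f` accepts JSON as well as YAML, saving this output to a file and applying it with a kubeconfig that targets the prediction cluster would be one way to submit such a resource.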
Before getting online predictions, you must [export model artifacts](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-export-model-artifacts) and [deploy the model to an endpoint](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-deploy-model). Deploying the model associates compute resources with it so that it can serve online predictions with low latency.

Then, you can get online predictions from a custom-trained model by [formatting](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-format-online-prediction) and [sending](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/quickstart-op) a request.

Available container images
--------------------------

Supported containers for Online Prediction in Distributed Cloud include TensorFlow (version 2.14, CPU and GPU) and PyTorch (version 2.1, CPU and GPU).

Preview: Online Prediction is a Preview feature that is available as-is and is not recommended for production environments. Google provides no service-level agreements (SLA) or technical support commitments for Preview features. For more information, see GDC's [feature stages](/distributed-cloud/hosted/docs/latest/gdch/resources/feature-stages).

Last updated 2025-09-04 UTC.