Deploys a model to a new endpoint.
Endpoint
POST https://{service-endpoint}/v1beta1/{destination}:deploy
Where {service-endpoint} is one of the supported service endpoints.
Path parameters
destination
string
Required. The resource name of the location to deploy the model in. Format: projects/{project}/locations/{location}
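As a rough illustration of how the path parameter fits into the request URL, the sketch below assembles the URL from a project and location. The helper name `build_deploy_url`, the sample project ID, and the sample service endpoint value are assumptions made for this example and are not part of the API.

```python
import re

# Documented destination format: projects/{project}/locations/{location}
_DESTINATION_RE = re.compile(r"^projects/[^/]+/locations/[^/]+$")


def build_deploy_url(service_endpoint: str, project: str, location: str) -> str:
    """Return the deploy URL for a location (hypothetical helper)."""
    destination = f"projects/{project}/locations/{location}"
    # Reject empty segments or segments containing "/" before building the URL.
    if not _DESTINATION_RE.match(destination):
        raise ValueError(f"invalid destination: {destination!r}")
    return f"https://{service_endpoint}/v1beta1/{destination}:deploy"


# Sample values only; substitute a supported service endpoint and your own project.
url = build_deploy_url(
    "us-central1-aiplatform.googleapis.com", "my-project", "us-central1"
)
print(url)
```

The request itself is sent with the POST method to this URL.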
Request body
The request body contains data with the following structure:
modelConfig
object (ModelConfig)
Optional. The model config to use for the deployment. If not specified, the default model config will be used.
endpointConfig
object (EndpointConfig)
Optional. The endpoint config to use for the deployment. If not specified, the default endpoint config will be used.
deployConfig
object (DeployConfig)
Optional. The deploy config to use for the deployment. If not specified, the default deploy config will be used.
artifacts
Union type
artifacts can be only one of the following:
publisherModelName
string
The Model Garden model to deploy. Format: publishers/{publisher}/models/{publisherModel}@{versionId}, or publishers/hf-{hugging-face-author}/models/{hugging-face-model-name}@001.
huggingFaceModelId
string
The Hugging Face model to deploy. Format: Hugging Face model id like google/gemma-2-2b-it.
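The union constraint on artifacts can be checked on the client before sending a request. The following is a minimal sketch, assuming that exactly one of the two identifiers must be supplied and that the three optional config objects are passed under the keys `modelConfig`, `endpointConfig`, and `deployConfig` (names inferred from the type names on this page). The helper `build_deploy_body` is hypothetical.

```python
from typing import Any, Dict, Optional


def build_deploy_body(
    publisher_model_name: Optional[str] = None,
    hugging_face_model_id: Optional[str] = None,
    model_config: Optional[Dict[str, Any]] = None,
    endpoint_config: Optional[Dict[str, Any]] = None,
    deploy_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a deploy request body as a dict (hypothetical helper)."""
    # artifacts is a union: only one of the two identifiers may be set.
    # Requiring at least one is an assumption of this sketch.
    if (publisher_model_name is None) == (hugging_face_model_id is None):
        raise ValueError(
            "set exactly one of publisher_model_name or hugging_face_model_id"
        )

    body: Dict[str, Any] = {}
    if publisher_model_name is not None:
        body["publisherModelName"] = publisher_model_name
    else:
        body["huggingFaceModelId"] = hugging_face_model_id

    # Assumed field names; unset configs are omitted so defaults apply.
    optional = {
        "modelConfig": model_config,
        "endpointConfig": endpoint_config,
        "deployConfig": deploy_config,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = build_deploy_body(
    hugging_face_model_id="google/gemma-2-2b-it",
    model_config={"acceptEula": True},
)
print(body)
```

Leaving a config out of the body, rather than sending an empty object, keeps the request aligned with the documented "if not specified, the default is used" behavior.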
Response body
If successful, the response body contains an instance of Operation.
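Because the method returns an Operation rather than the deployed resource, a caller typically inspects the operation's state. The sketch below assumes the standard long-running operation JSON shape with `done`, `error`, and `response` fields; the helper name `operation_result` and the exception class are hypothetical, and the sample payloads are made up for illustration.

```python
from typing import Any, Dict, Optional


class OperationFailed(Exception):
    """Raised when a finished operation carries an error (hypothetical)."""


def operation_result(operation: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the response of a finished operation, or None if still running."""
    # An operation that is not done yet has neither a response nor an error.
    if not operation.get("done", False):
        return None
    if "error" in operation:
        err = operation["error"]
        raise OperationFailed(f"{err.get('code')}: {err.get('message')}")
    return operation.get("response", {})


# Illustrative payloads, not real API output.
pending = {"name": "operations/example", "done": False}
finished = {"name": "operations/example", "done": True, "response": {"ok": True}}

print(operation_result(pending))
print(operation_result(finished))
```

A caller would re-fetch the operation by its name until the helper returns a value other than None.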
ModelConfig
The model config to use for the deployment.
acceptEula
boolean
Optional. Whether the user accepts the End User License Agreement (EULA) for the model.
huggingFaceAccessToken
string
Optional. The Hugging Face read access token used to access the model artifacts of gated models.
huggingFaceCacheEnabled
boolean
Optional. If true, the model is deployed from a cached version instead of downloading the model artifacts directly from Hugging Face. This is suitable for VPC-SC users with limited internet access.
modelDisplayName
string
Optional. The user-specified display name of the uploaded model. If not set, a default name will be used.
containerSpec
object
Optional. The specification of the container that is to be used when deploying. If not set, the default container spec will be used.
JSON representation |
---|
{
"acceptEula": boolean,
"huggingFaceAccessToken": string,
"huggingFaceCacheEnabled": boolean,
"modelDisplayName": string,
"containerSpec": {
object ( |
EndpointConfig
The endpoint config to use for the deployment.
endpointDisplayName
string
Optional. The user-specified display name of the endpoint. If not set, a default name will be used.
dedicatedEndpointEnabled
boolean
Optional. If true, the endpoint is exposed through a dedicated DNS name [Endpoint.dedicated_endpoint_dns]. Requests sent to the dedicated DNS are isolated from other users' traffic and have better performance and reliability. Note: Once you enable a dedicated endpoint, you can no longer send requests to the shared DNS {region}-aiplatform.googleapis.com. This limitation will be removed soon.
JSON representation |
---|
{ "endpointDisplayName": string, "dedicatedEndpointEnabled": boolean } |
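The dedicated endpoint setting changes which host later requests should target. As a hedged sketch, the function below picks a host from an endpoint description. It assumes the endpoint is available as a dict carrying `dedicatedEndpointEnabled` and a `dedicatedEndpointDns` key (the camelCase form of the Endpoint.dedicated_endpoint_dns field referenced above). The helper name and the sample DNS value are hypothetical.

```python
from typing import Any, Dict


def request_host(endpoint: Dict[str, Any], region: str) -> str:
    """Choose the host for requests to an endpoint (hypothetical helper)."""
    dedicated_dns = endpoint.get("dedicatedEndpointDns")
    # With a dedicated endpoint enabled, the shared DNS can no longer be used.
    if endpoint.get("dedicatedEndpointEnabled"):
        if not dedicated_dns:
            raise ValueError("dedicated endpoint enabled but no dedicated DNS set")
        return dedicated_dns
    # Otherwise fall back to the shared regional DNS.
    return f"{region}-aiplatform.googleapis.com"


shared = {"endpointDisplayName": "demo"}
dedicated = {
    "endpointDisplayName": "demo",
    "dedicatedEndpointEnabled": True,
    "dedicatedEndpointDns": "example.dedicated.invalid",  # made-up value
}

print(request_host(shared, "us-central1"))
print(request_host(dedicated, "us-central1"))
```

Raising an error when the dedicated DNS is missing, instead of silently using the shared DNS, reflects the note that the shared DNS stops accepting requests once a dedicated endpoint is enabled.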
DeployConfig
The deploy config to use for the deployment.
dedicatedResources
object
Optional. The dedicated resources to use for the endpoint. If not set, the default resources will be used.
fastTryoutEnabled
boolean
Optional. If true, enable the QMT fast tryout feature for this model if possible.
JSON representation |
---|
{
"dedicatedResources": {
object ( |