Method: projects.locations.deploy

Deploys a model to a new endpoint.

Endpoint

POST https://{service-endpoint}/v1beta1/{destination}:deploy

Where {service-endpoint} is one of the supported service endpoints.

Path parameters

destination string

Required. The resource name of the location to deploy the model in. Format: projects/{project}/locations/{location}
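
As a rough illustration of how the destination path parameter fits into the endpoint template, the following Python sketch assembles the request URL. The helper name deploy_url, the sample host, and the sample project and location values are assumptions made for this example; they are not part of the API.

```python
def deploy_url(service_endpoint: str, project: str, location: str) -> str:
    """Build the deploy URL from the documented endpoint template."""
    # destination follows the documented format:
    # projects/{project}/locations/{location}
    destination = f"projects/{project}/locations/{location}"
    return f"https://{service_endpoint}/v1beta1/{destination}:deploy"


# Hypothetical values, for illustration only.
url = deploy_url("us-central1-aiplatform.googleapis.com", "my-project", "us-central1")
print(url)
```

The request itself is sent with the POST method to the resulting URL.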

Request body

The request body contains data with the following structure:

Fields
modelConfig object (ModelConfig)

Optional. The model config to use for the deployment. If not specified, the default model config will be used.

endpointConfig object (EndpointConfig)

Optional. The endpoint config to use for the deployment. If not specified, the default endpoint config will be used.

deployConfig object (DeployConfig)

Optional. The deploy config to use for the deployment. If not specified, the default deploy config will be used.

artifacts Union type
The artifacts to deploy. artifacts can be only one of the following:
publisherModelName string

The Model Garden model to deploy. Format: publishers/{publisher}/models/{publisherModel}@{versionId}, or publishers/hf-{hugging-face-author}/models/{hugging-face-model-name}@001.

huggingFaceModelId string

The Hugging Face model to deploy. Format: a Hugging Face model ID, for example google/gemma-2-2b-it.
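
Because artifacts is a union, a client may want to check the request body before sending it. The sketch below is one possible way to do that in Python. The helper name build_deploy_body is hypothetical, and rejecting a body with no artifact at all is an assumption of this sketch: the reference only states that at most one member of the union can be set.

```python
from typing import Any, Dict, Optional


def build_deploy_body(
    publisher_model_name: Optional[str] = None,
    hugging_face_model_id: Optional[str] = None,
    model_config: Optional[Dict[str, Any]] = None,
    endpoint_config: Optional[Dict[str, Any]] = None,
    deploy_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a request body, enforcing the artifacts union client-side."""
    chosen = [v for v in (publisher_model_name, hugging_face_model_id) if v]
    # Assumption: exactly one artifact must be present. The reference only
    # says the two members are mutually exclusive.
    if len(chosen) != 1:
        raise ValueError(
            "Set exactly one of publisherModelName or huggingFaceModelId."
        )

    body: Dict[str, Any] = {}
    if publisher_model_name:
        body["publisherModelName"] = publisher_model_name
    else:
        body["huggingFaceModelId"] = hugging_face_model_id

    # Optional configs are omitted when unset so the service defaults apply.
    if model_config is not None:
        body["modelConfig"] = model_config
    if endpoint_config is not None:
        body["endpointConfig"] = endpoint_config
    if deploy_config is not None:
        body["deployConfig"] = deploy_config
    return body


body = build_deploy_body(hugging_face_model_id="google/gemma-2-2b-it")
print(body)
```

Leaving out the three optional config objects matches the documented behavior that defaults are used when they are not specified.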

Response body

If successful, the response body contains an instance of Operation.

ModelConfig

The model config to use for the deployment.

Fields
acceptEula boolean

Optional. Whether the user accepts the End User License Agreement (EULA) for the model.

huggingFaceAccessToken string

Optional. The Hugging Face read access token used to access the model artifacts of gated models.

huggingFaceCacheEnabled boolean

Optional. If true, the model is deployed from a cached copy of the model artifacts instead of downloading them directly from Hugging Face. This is suitable for VPC-SC users with limited internet access.

modelDisplayName string

Optional. The user-specified display name of the uploaded model. If not set, a default name will be used.

containerSpec object (ModelContainerSpec)

Optional. The specification of the container to use when deploying. If not set, the default container spec will be used.

JSON representation
{
  "acceptEula": boolean,
  "huggingFaceAccessToken": string,
  "huggingFaceCacheEnabled": boolean,
  "modelDisplayName": string,
  "containerSpec": {
    object (ModelContainerSpec)
  }
}
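
One way to produce this JSON from client code is a small typed wrapper that emits only the fields that were set. The following Python sketch assumes a hypothetical ModelConfig dataclass and to_json_dict method; neither comes from an official client library, and the display name used is a placeholder.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ModelConfig:
    """Hypothetical client-side mirror of the ModelConfig message."""

    accept_eula: Optional[bool] = None
    hugging_face_access_token: Optional[str] = None
    hugging_face_cache_enabled: Optional[bool] = None
    model_display_name: Optional[str] = None
    # ModelContainerSpec is passed through as a plain dict in this sketch.
    container_spec: Optional[Dict[str, Any]] = None

    def to_json_dict(self) -> Dict[str, Any]:
        """Map to the documented camelCase names, dropping unset fields."""
        fields = {
            "acceptEula": self.accept_eula,
            "huggingFaceAccessToken": self.hugging_face_access_token,
            "huggingFaceCacheEnabled": self.hugging_face_cache_enabled,
            "modelDisplayName": self.model_display_name,
            "containerSpec": self.container_spec,
        }
        return {name: value for name, value in fields.items() if value is not None}


config = ModelConfig(accept_eula=True, model_display_name="my-model")
print(config.to_json_dict())
```

Filtering on None rather than on truthiness keeps an explicit false value, such as acceptEula set to False, in the output.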

EndpointConfig

The endpoint config to use for the deployment.

Fields
endpointDisplayName string

Optional. The user-specified display name of the endpoint. If not set, a default name will be used.

dedicatedEndpointEnabled boolean

Optional. If true, the endpoint is exposed through a dedicated DNS name [Endpoint.dedicated_endpoint_dns]. Requests to the dedicated DNS name are isolated from other users' traffic and have better performance and reliability. Note: Once you enable the dedicated endpoint, you can no longer send requests to the shared DNS name {region}-aiplatform.googleapis.com. This limitation will be removed soon.

JSON representation
{
  "endpointDisplayName": string,
  "dedicatedEndpointEnabled": boolean
}

DeployConfig

The deploy config to use for the deployment.

Fields
dedicatedResources object (DedicatedResources)

Optional. The dedicated resources to use for the endpoint. If not set, the default resources will be used.

fastTryoutEnabled boolean

Optional. If true, enable the QMT fast tryout feature for this model if possible.

JSON representation
{
  "dedicatedResources": {
    object (DedicatedResources)
  },
  "fastTryoutEnabled": boolean
}
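
To show how the three config objects nest inside a single request, here is a minimal Python sketch that serializes a complete body. It uses only fields described on this page; the display name is a placeholder, the chosen boolean values are arbitrary, and dedicatedResources is left out because its structure is not defined here.

```python
import json

# Illustrative request body. The model ID is the example given in the
# huggingFaceModelId field description; all other values are placeholders.
request_body = {
    "huggingFaceModelId": "google/gemma-2-2b-it",
    "modelConfig": {
        "acceptEula": True,
    },
    "endpointConfig": {
        "endpointDisplayName": "my-endpoint",
        "dedicatedEndpointEnabled": True,
    },
    "deployConfig": {
        # dedicatedResources omitted so the default resources are used.
        "fastTryoutEnabled": True,
    },
}

payload = json.dumps(request_body, indent=2)
print(payload)
```

The resulting string can be sent as the body of the POST request with any HTTP client, together with whatever authorization the service requires.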