Full name: projects.locations.deploy
Deploys a model to a new endpoint.
Endpoint
POST https://{service-endpoint}/v1/{destination}:deploy
Where {service-endpoint} is one of the supported service endpoints.
Path parameters
destination
string
Required. The resource name of the Location to deploy the model in. Format: projects/{project}/locations/{location}
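As an illustration of how the path parameter slots into the endpoint, the following is a minimal sketch that assembles the request URL. The helper name `build_deploy_url` is hypothetical, and the example host assumes the regional `{region}-aiplatform.googleapis.com` pattern mentioned under EndpointConfig; check the list of supported service endpoints for the host that applies to you.

```python
def build_deploy_url(service_endpoint: str, project: str, location: str) -> str:
    """Assemble the URL for the deploy method (illustrative helper, not part of any SDK)."""
    # Project and location are single path segments in the destination format.
    for name, value in (("project", project), ("location", location)):
        if not value or "/" in value:
            raise ValueError(f"{name} must be a non-empty value without '/'")
    # destination format: projects/{project}/locations/{location}
    destination = f"projects/{project}/locations/{location}"
    return f"https://{service_endpoint}/v1/{destination}:deploy"


# Assumed example host; substitute one of the supported service endpoints.
print(build_deploy_url("us-central1-aiplatform.googleapis.com", "my-project", "us-central1"))
# https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1:deploy
```

The request itself is sent with the POST method.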
Request body
The request body contains data with the following structure:
modelConfig
object (ModelConfig)
Optional. The model config to use for the deployment. If not specified, the default model config will be used.
endpointConfig
object (EndpointConfig)
Optional. The endpoint config to use for the deployment. If not specified, the default endpoint config will be used.
deployConfig
object (DeployConfig)
Optional. The deploy config to use for the deployment. If not specified, the default deploy config will be used.
artifacts
Union type
artifacts can be only one of the following:
publisherModelName
string
The Model Garden model to deploy. Format: publishers/{publisher}/models/{publisherModel}@{versionId}, or publishers/hf-{hugging-face-author}/models/{hugging-face-model-name}@001.
huggingFaceModelId
string
The Hugging Face model to deploy. Format: a Hugging Face model ID such as google/gemma-2-2b-it.
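Because `artifacts` is a union, a request body carries either `publisherModelName` or `huggingFaceModelId`, never both. The sketch below builds such a body as a plain dictionary. The function name is hypothetical, the top-level keys `modelConfig`, `endpointConfig`, and `deployConfig` are assumed from the type names documented on this page, and requiring exactly one artifact (rather than at most one) is also an assumption of this sketch.

```python
from typing import Any, Dict, Optional


def build_deploy_request(
    *,
    publisher_model_name: Optional[str] = None,
    hugging_face_model_id: Optional[str] = None,
    model_config: Optional[Dict[str, Any]] = None,
    endpoint_config: Optional[Dict[str, Any]] = None,
    deploy_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable deploy request body (illustrative only)."""
    # Enforce the union: the two artifact fields are mutually exclusive.
    chosen = [v for v in (publisher_model_name, hugging_face_model_id) if v is not None]
    if len(chosen) != 1:
        raise ValueError("set exactly one of publisher_model_name or hugging_face_model_id")

    body: Dict[str, Any] = {}
    if publisher_model_name is not None:
        body["publisherModelName"] = publisher_model_name
    else:
        body["huggingFaceModelId"] = hugging_face_model_id

    # Optional configs are omitted when unset so that the service defaults apply.
    for key, value in (
        ("modelConfig", model_config),        # assumed field name
        ("endpointConfig", endpoint_config),  # assumed field name
        ("deployConfig", deploy_config),      # assumed field name
    ):
        if value is not None:
            body[key] = value
    return body


request_body = build_deploy_request(
    hugging_face_model_id="google/gemma-2-2b-it",
    model_config={"acceptEula": True},
)
```

Leaving an optional config out of the dictionary, rather than sending an empty object, mirrors the "if not specified, the default ... will be used" wording of the field descriptions.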
Response body
If successful, the response body contains an instance of Operation.
ModelConfig
The model config to use for the deployment.
acceptEula
boolean
Optional. Whether the user accepts the End User License Agreement (EULA) for the model.
huggingFaceAccessToken
string
Optional. The Hugging Face read access token used to access the model artifacts of gated models.
huggingFaceCacheEnabled
boolean
Optional. If true, the model is deployed from a cached version instead of downloading the model artifacts directly from Hugging Face. This is suitable for VPC-SC users with limited internet access.
modelDisplayName
string
Optional. The user-specified display name of the uploaded model. If not set, a default name will be used.
containerSpec
object
Optional. The specification of the container to use when deploying. If not set, the default container spec will be used.
modelUserId
string
Optional. The ID to use for the uploaded Model, which will become the final component of the model resource name. When not provided, Vertex AI will generate a value for this ID. When a Model Registry model is provided, this field will be ignored.
This value may be up to 63 characters, and valid characters are [a-z0-9_-]. The first character cannot be a number or hyphen.
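The constraints on modelUserId can be checked on the client before sending a request. The following is a minimal sketch of such a check; the function name is hypothetical and the regular expression is derived only from the rules stated above (at most 63 characters from [a-z0-9_-], first character not a digit or hyphen). The service may apply additional validation.

```python
import re

# First character: a lowercase letter or underscore (a digit or hyphen is not allowed).
# Remaining characters: up to 62 more from [a-z0-9_-], for a maximum length of 63.
_MODEL_USER_ID_RE = re.compile(r"[a-z_][a-z0-9_-]{0,62}")


def is_valid_model_user_id(model_user_id: str) -> bool:
    """Return True if the value satisfies the documented modelUserId rules."""
    # fullmatch requires the whole string to match, so trailing characters are rejected.
    return _MODEL_USER_ID_RE.fullmatch(model_user_id) is not None


print(is_valid_model_user_id("gemma-2-2b-it"))  # True
print(is_valid_model_user_id("2b-model"))       # False: starts with a digit
print(is_valid_model_user_id("Gemma"))          # False: uppercase is not in [a-z0-9_-]
```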
JSON representation |
---|
{
"acceptEula": boolean,
"huggingFaceAccessToken": string,
"huggingFaceCacheEnabled": boolean,
"modelDisplayName": string,
"containerSpec": {
object ( |
EndpointConfig
The endpoint config to use for the deployment.
endpointDisplayName
string
Optional. The user-specified display name of the endpoint. If not set, a default name will be used.
dedicatedEndpointEnabled
(deprecated)
boolean
Optional. Deprecated. Use dedicatedEndpointDisabled instead. If true, the endpoint will be exposed through a dedicated DNS [Endpoint.dedicated_endpoint_dns]. Your requests to the dedicated DNS will be isolated from other users' traffic and will have better performance and reliability. Note: Once you enable a dedicated endpoint, you won't be able to send requests to the shared DNS {region}-aiplatform.googleapis.com. This limitation will be removed soon.
dedicatedEndpointDisabled
boolean
Optional. By default, if dedicated endpoint is enabled, the endpoint will be exposed through a dedicated DNS [Endpoint.dedicated_endpoint_dns]. Your requests to the dedicated DNS will be isolated from other users' traffic and will have better performance and reliability. Note: Once you enable a dedicated endpoint, you won't be able to send requests to the shared DNS {region}-aiplatform.googleapis.com. This limitation will be removed soon.
If this field is set to true, the dedicated endpoint will be disabled and the deployed model will be exposed through the shared DNS {region}-aiplatform.googleapis.com.
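Since dedicatedEndpointEnabled is deprecated, a client that wants the shared DNS can express that through dedicatedEndpointDisabled alone. The sketch below shows one way to build an EndpointConfig dictionary under that approach. The function and parameter names are hypothetical, and leaving the flag unset simply defers to whatever default the service applies.

```python
from typing import Any, Dict, Optional


def build_endpoint_config(
    display_name: Optional[str] = None,
    use_dedicated_endpoint: bool = True,
) -> Dict[str, Any]:
    """Build an EndpointConfig dict without touching the deprecated field (illustrative)."""
    config: Dict[str, Any] = {}
    if display_name:
        config["endpointDisplayName"] = display_name
    # Only opt out explicitly; omitting the field keeps the service default.
    if not use_dedicated_endpoint:
        config["dedicatedEndpointDisabled"] = True
    return config


shared_dns_config = build_endpoint_config("demo-endpoint", use_dedicated_endpoint=False)
print(shared_dns_config)
# {'endpointDisplayName': 'demo-endpoint', 'dedicatedEndpointDisabled': True}
```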
JSON representation |
---|
{ "endpointDisplayName": string, "dedicatedEndpointEnabled": boolean, "dedicatedEndpointDisabled": boolean } |
DeployConfig
The deploy config to use for the deployment.
dedicatedResources
object
Optional. The dedicated resources to use for the endpoint. If not set, the default resources will be used.
fastTryoutEnabled
boolean
Optional. If true, enable the QMT fast tryout feature for this model if possible.
systemLabels
map (key: string, value: string)
Optional. System labels for Model Garden deployments. These labels are managed by Google and for tracking purposes only.
JSON representation |
---|
{
"dedicatedResources": {
object ( |