Deploying an index to an endpoint includes the following three tasks:
- Create an IndexEndpoint if needed, or reuse an existing IndexEndpoint.
- Get the IndexEndpoint ID.
- Deploy the index to the IndexEndpoint.
Create an IndexEndpoint within your VPC network
If you are deploying an Index to an existing IndexEndpoint, you can skip this step.
Before you use an index to serve online vector matching queries, you must deploy the Index to an IndexEndpoint within your VPC Network Peering network. The first step is to create an IndexEndpoint. You can deploy more than one index to an IndexEndpoint that shares the same VPC network.
gcloud
The following example uses the gcloud ai index-endpoints create command.
Before using any of the command data below, make the following replacements:
- INDEX_ENDPOINT_NAME: Display name of the index endpoint.
- VPC_NETWORK_NAME: The Google Compute Engine network name to which the index endpoint should be peered.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai index-endpoints create \
  --display-name=INDEX_ENDPOINT_NAME \
  --network=VPC_NETWORK_NAME \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai index-endpoints create `
  --display-name=INDEX_ENDPOINT_NAME `
  --network=VPC_NETWORK_NAME `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai index-endpoints create ^
  --display-name=INDEX_ENDPOINT_NAME ^
  --network=VPC_NETWORK_NAME ^
  --region=LOCATION ^
  --project=PROJECT_ID
You should receive a response similar to the following:
The Google Cloud CLI tool might take a few minutes to create the IndexEndpoint.
REST
Before using any of the request data, make the following replacements:
- INDEX_ENDPOINT_NAME: Display name of the index endpoint.
- VPC_NETWORK_NAME: The Google Compute Engine network name to which the index endpoint should be peered.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints
Request JSON body:
{ "display_name": "INDEX_ENDPOINT_NAME", "network": "VPC_NETWORK_NAME" }
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateIndexEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:09:56.641107Z",
      "updateTime": "2022-01-13T04:09:56.641107Z"
    }
  }
}
You can poll for the status of the operation until the response includes "done": true.
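The REST flow above can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_create_request and poll_until_done are hypothetical, and get_operation is a caller-supplied function (also hypothetical) that fetches the long-running operation and returns it as a dict. The sketch only builds the URL and body shown above and loops until "done" is true; it sends no request itself.

```python
import json
import time
from typing import Callable, Dict, Tuple


def build_create_request(project_id: str, location: str,
                         display_name: str, network: str) -> Tuple[str, str]:
    """Return the (url, json_body) pair for the create call shown above."""
    url = (f"https://{location}-aiplatform.googleapis.com/v1/"
           f"projects/{project_id}/locations/{location}/indexEndpoints")
    body = json.dumps({"display_name": display_name, "network": network})
    return url, body


def poll_until_done(get_operation: Callable[[], Dict],
                    interval_s: float = 10.0,
                    max_attempts: int = 60) -> Dict:
    """Call get_operation() until the returned dict contains "done": true.

    The interval and attempt budget are illustrative defaults, not
    values recommended by the service.
    """
    for _ in range(max_attempts):
        operation = get_operation()
        if operation.get("done"):
            return operation
        time.sleep(interval_s)
    raise TimeoutError("operation did not finish within the polling budget")
```

Injecting get_operation keeps the polling logic independent of how you authenticate and send HTTP requests.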
Terraform
The following sample uses the vertex_ai_index_endpoint Terraform resource to create an index endpoint.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Console
Use these instructions to create an index endpoint.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- A list of your active indexes is displayed.
- On the top of the page, select the Index endpoints tab. Your index endpoints are displayed.
- Click Create new index endpoint. The Create a new index endpoint panel opens.
- Enter a display name for the index endpoint.
- In the Region field, select a region from the drop-down.
- In the Access field, select Private.
- Enter your peered VPC network details. Enter the full name of the Compute Engine network to which the index endpoint should be peered. The format should be projects/{project_num}/global/networks/{network_id}.
- Click Create.
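The full network name format used in the Console steps above can be assembled programmatically. The following is a minimal sketch; the function name full_network_name is hypothetical, and the numeric check assumes that {project_num} refers to the numeric project number rather than the project ID.

```python
def full_network_name(project_number: str, network_id: str) -> str:
    """Build projects/{project_num}/global/networks/{network_id}."""
    # Assumption: {project_num} is the numeric project number, so a
    # non-numeric value (such as a project ID) is rejected early.
    if not project_number.isdigit():
        raise ValueError("project_number must be the numeric project number")
    return f"projects/{project_number}/global/networks/{network_id}"
```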
Deploy an index
gcloud
This example uses the gcloud ai index-endpoints deploy-index command.
Before using any of the command data below, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- DEPLOYED_INDEX_ID: A user specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers or underscores. See DeployedIndex.id for format guidelines.
- DEPLOYED_INDEX_ENDPOINT_NAME: Display name of the deployed index endpoint.
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID \
  --deployed-index-id=DEPLOYED_INDEX_ID \
  --display-name=DEPLOYED_INDEX_ENDPOINT_NAME \
  --index=INDEX_ID \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID `
  --deployed-index-id=DEPLOYED_INDEX_ID `
  --display-name=DEPLOYED_INDEX_ENDPOINT_NAME `
  --index=INDEX_ID `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID ^
  --deployed-index-id=DEPLOYED_INDEX_ID ^
  --display-name=DEPLOYED_INDEX_ENDPOINT_NAME ^
  --index=INDEX_ID ^
  --region=LOCATION ^
  --project=PROJECT_ID
You should receive a response similar to the following:
The Google Cloud CLI tool might take a few minutes to create the IndexEndpoint.
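Before you run the command, you might want to check a candidate DEPLOYED_INDEX_ID locally. The following sketch encodes only the rule stated in the replacement list above (starts with a letter; contains only letters, numbers, or underscores). The function name is hypothetical, and DeployedIndex.id might impose further limits, such as a maximum length, that this sketch does not check.

```python
import re

# Only the rule stated above: a leading letter followed by letters,
# digits, or underscores. Other limits from DeployedIndex.id are not modeled.
_DEPLOYED_INDEX_ID = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_deployed_index_id(candidate: str) -> bool:
    """Return True if candidate satisfies the stated format rule."""
    return _DEPLOYED_INDEX_ID.fullmatch(candidate) is not None
```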
REST
Before using any of the request data, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- DEPLOYED_INDEX_ID: A user specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers or underscores. See DeployedIndex.id for format guidelines.
- DEPLOYED_INDEX_ENDPOINT_NAME: Display name of the deployed index endpoint.
- INDEX_ID: The ID of the index.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex
Request JSON body:
{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_ENDPOINT_NAME"
  }
}
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-10-19T17:53:16.502088Z",
      "updateTime": "2022-10-19T17:53:16.502088Z"
    },
    "deployedIndexId": "DEPLOYED_INDEX_ID"
  }
}
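As a rough illustration of the deployIndex request above, the following sketch assembles the URL and JSON body from the same placeholders. The function name build_deploy_request is hypothetical; the sketch sends nothing and leaves authentication and the HTTP call to you.

```python
import json
from typing import Tuple


def build_deploy_request(project_id: str, location: str,
                         index_endpoint_id: str, deployed_index_id: str,
                         index_id: str, display_name: str) -> Tuple[str, str]:
    """Return the (url, json_body) pair for the deployIndex call shown above."""
    url = (f"https://{location}-aiplatform.googleapis.com/v1/"
           f"projects/{project_id}/locations/{location}/"
           f"indexEndpoints/{index_endpoint_id}:deployIndex")
    body = {
        "deployedIndex": {
            "id": deployed_index_id,
            # Full resource name of the index, as in the request body above.
            "index": f"projects/{project_id}/locations/{location}/indexes/{index_id}",
            "displayName": display_name,
        }
    }
    return url, json.dumps(body, indent=2)
```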
Terraform
The following sample uses the vertex_ai_index_endpoint_deployed_index Terraform resource to create a deployed index endpoint.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Console
Use these instructions to deploy your index to an endpoint.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- A list of your active indexes is displayed.
- Select the name of the index you want to deploy. The index details page opens.
- From the index details page, click Deploy to endpoint. The index deployment panel opens.
- Enter a display name - this name acts as an ID and can't be updated.
- From the Endpoint drop-down, select the endpoint you want to deploy this index to. Note: The endpoint is unavailable if the index is already deployed to it.
- Optional: In the Machine type field, select either standard or high-memory.
- Optional: Select Enable autoscaling to automatically resize the number of nodes based on the demands of your workloads. The default number of replicas is 2 if autoscaling is disabled.
- Click Deploy to deploy your index to the endpoint. Note: It takes around 30 minutes to be deployed.
Enable autoscaling
Vector Search supports autoscaling, which can automatically resize the number of nodes based on the demands of your workloads. When demand is high, nodes are added to the node pool, which won't exceed the maximum size you designate. When demand is low, the node pool scales back down to a minimum size that you designate. You can check the actual nodes in use and the changes by monitoring the current replicas.
To enable autoscaling, specify the maxReplicaCount and minReplicaCount when you deploy your index:
gcloud
The following example uses the gcloud ai index-endpoints deploy-index command.
Before using any of the command data below, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- DEPLOYED_INDEX_ID: A user specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers or underscores. See DeployedIndex.id for format guidelines.
- DEPLOYED_INDEX_NAME: Display name of the deployed index.
- INDEX_ID: The ID of the index.
- MIN_REPLICA_COUNT: Minimum number of machine replicas the deployed index will always be deployed on. If specified, the value must be equal to or larger than 1.
- MAX_REPLICA_COUNT: Maximum number of machine replicas the deployed index could be deployed on.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID \
  --deployed-index-id=DEPLOYED_INDEX_ID \
  --display-name=DEPLOYED_INDEX_NAME \
  --index=INDEX_ID \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID `
  --deployed-index-id=DEPLOYED_INDEX_ID `
  --display-name=DEPLOYED_INDEX_NAME `
  --index=INDEX_ID `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID ^
  --deployed-index-id=DEPLOYED_INDEX_ID ^
  --display-name=DEPLOYED_INDEX_NAME ^
  --index=INDEX_ID ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --region=LOCATION ^
  --project=PROJECT_ID
REST
Before using any of the request data, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- DEPLOYED_INDEX_ID: A user specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers or underscores. See DeployedIndex.id for format guidelines.
- DEPLOYED_INDEX_NAME: Display name of the deployed index.
- INDEX_ID: The ID of the index.
- MIN_REPLICA_COUNT: Minimum number of machine replicas the deployed index will always be deployed on. If specified, the value must be equal to or larger than 1.
- MAX_REPLICA_COUNT: Maximum number of machine replicas the deployed index could be deployed on.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex
Request JSON body:
{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_NAME",
    "automaticResources": {
      "minReplicaCount": MIN_REPLICA_COUNT,
      "maxReplicaCount": MAX_REPLICA_COUNT
    }
  }
}
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-10-19T17:53:16.502088Z",
      "updateTime": "2023-10-19T17:53:16.502088Z"
    },
    "deployedIndexId": "DEPLOYED_INDEX_ID"
  }
}
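The autoscaling request body above can be assembled with a small helper. This is a sketch under stated assumptions: the function name is hypothetical, the check that minReplicaCount is at least 1 comes from the replacement list above, and the check that maxReplicaCount is not smaller than minReplicaCount is an assumption added for illustration.

```python
from typing import Dict


def deployed_index_with_autoscaling(deployed_index_id: str,
                                    index_resource_name: str,
                                    display_name: str,
                                    min_replica_count: int,
                                    max_replica_count: int) -> Dict:
    """Build the deployIndex body with an automaticResources block."""
    if min_replica_count < 1:
        raise ValueError("minReplicaCount must be equal to or larger than 1")
    # Assumption (not stated in the text above): max must not be below min.
    if max_replica_count < min_replica_count:
        raise ValueError("maxReplicaCount must not be smaller than minReplicaCount")
    return {
        "deployedIndex": {
            "id": deployed_index_id,
            "index": index_resource_name,
            "displayName": display_name,
            "automaticResources": {
                "minReplicaCount": min_replica_count,
                "maxReplicaCount": max_replica_count,
            },
        }
    }
```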
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Console
You can only enable autoscaling from the console during index deployment.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- A list of your active indexes is displayed.
- Select the name of the index you want to deploy. The index details page opens.
- From the index details page, click Deploy to endpoint. The index deployment panel opens.
- Enter a display name - this name acts as an ID and can't be updated.
- From the Endpoint drop-down, select the endpoint you want to deploy this index to. Note: The endpoint is unavailable if the index is already deployed to it.
- Optional: In the Machine type field, select either standard or high-memory.
- Optional: Select Enable autoscaling to automatically resize the number of nodes based on the demands of your workloads. The default number of replicas is 2 if autoscaling is disabled.
- If neither minReplicaCount nor maxReplicaCount is set, both are set to 2 by default.
- If only maxReplicaCount is set, minReplicaCount is set to 2 by default.
- If only minReplicaCount is set, maxReplicaCount is set to equal minReplicaCount.
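The three default rules above can be written out as a small function. This is only a model of the documented defaults for illustration; the function name resolve_replica_counts is hypothetical and the code does not call the service.

```python
from typing import Optional, Tuple

DEFAULT_REPLICAS = 2  # default stated in the rules above


def resolve_replica_counts(min_replica_count: Optional[int] = None,
                           max_replica_count: Optional[int] = None) -> Tuple[int, int]:
    """Return (min, max) after applying the documented defaults."""
    if min_replica_count is None and max_replica_count is None:
        # Neither value set: both default to 2.
        return DEFAULT_REPLICAS, DEFAULT_REPLICAS
    if min_replica_count is None:
        # Only max set: min defaults to 2.
        return DEFAULT_REPLICAS, max_replica_count
    if max_replica_count is None:
        # Only min set: max is set equal to min.
        return min_replica_count, min_replica_count
    return min_replica_count, max_replica_count
```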
Mutate a DeployedIndex
You can use the MutateDeployedIndex API to update the deployment resources (for example, minReplicaCount and maxReplicaCount) of an already deployed index.
- Users are not allowed to change the machineType after the index is deployed.
- If maxReplicaCount is not specified in the request, the DeployedIndex will keep using the existing maxReplicaCount.
gcloud
The following example uses the gcloud ai index-endpoints mutate-deployed-index command.
Before using any of the command data below, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- DEPLOYED_INDEX_ID: A user specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers or underscores. See DeployedIndex.id for format guidelines.
- MIN_REPLICA_COUNT: Minimum number of machine replicas the deployed index will always be deployed on. If specified, the value must be equal to or larger than 1.
- MAX_REPLICA_COUNT: Maximum number of machine replicas the deployed index could be deployed on.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID \
  --deployed-index-id=DEPLOYED_INDEX_ID \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID `
  --deployed-index-id=DEPLOYED_INDEX_ID `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID ^
  --deployed-index-id=DEPLOYED_INDEX_ID ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --region=LOCATION ^
  --project=PROJECT_ID
REST
Before using any of the request data, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- DEPLOYED_INDEX_ID: A user specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers or underscores. See DeployedIndex.id for format guidelines.
- MIN_REPLICA_COUNT: Minimum number of machine replicas the deployed index will always be deployed on. If specified, the value must be equal to or larger than 1.
- MAX_REPLICA_COUNT: Maximum number of machine replicas the deployed index could be deployed on.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:mutateDeployedIndex
Request JSON body:
{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_NAME",
    "min_replica_count": "MIN_REPLICA_COUNT",
    "max_replica_count": "MAX_REPLICA_COUNT"
  }
}
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-10-19T17:53:16.502088Z",
      "updateTime": "2020-10-19T17:53:16.502088Z"
    },
    "deployedIndexId": "DEPLOYED_INDEX_ID"
  }
}
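To reason about the result of a mutate call, the rule that an unspecified maxReplicaCount keeps its existing value can be modeled locally. The following is a sketch only: the function name is hypothetical, the existing counts are passed in as a plain dict, and the behavior for an unspecified minReplicaCount is not described in this section, so the sketch requires it.

```python
from typing import Dict, Optional


def replica_counts_after_mutate(existing: Dict[str, int],
                                min_replica_count: int,
                                max_replica_count: Optional[int] = None) -> Dict[str, int]:
    """Model the replica counts of a DeployedIndex after a mutate request."""
    if min_replica_count < 1:
        raise ValueError("minReplicaCount must be equal to or larger than 1")
    # Documented rule: when maxReplicaCount is omitted from the request,
    # the DeployedIndex keeps using its existing maxReplicaCount.
    if max_replica_count is None:
        new_max = existing["maxReplicaCount"]
    else:
        new_max = max_replica_count
    return {"minReplicaCount": min_replica_count, "maxReplicaCount": new_max}
```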
Terraform
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands. For more information, see the Terraform provider reference documentation.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Deployment settings that impact performance
The following deployment settings can affect latency, availability, and cost when using Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.
Setting | Performance impact |
---|---|
Machine type | The hardware selection has a direct interaction with the shard size selected. Depending on shard choices you specified at index creation time, each machine type offers a tradeoff between performance and cost. Reference the pricing page to determine the hardware available and pricing. In general, performance increases in the following order: |
Minimum replica count | If you have workloads that drop to low levels and then quickly increase to higher levels, consider setting |
Maximum replica count | maxReplicaCount primarily lets you control usage cost. You can choose to prevent increasing costs beyond a certain threshold, with the tradeoff of allowing increased latency and reducing availability. |
List IndexEndpoints
To list your IndexEndpoint resources and view the information of any associated DeployedIndex instances, run the following code:
gcloud
The following example uses the gcloud ai index-endpoints list command.
Before using any of the command data below, make the following replacements:
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai index-endpoints list \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai index-endpoints list `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai index-endpoints list ^
  --region=LOCATION ^
  --project=PROJECT_ID
REST
Before using any of the request data, make the following replacements:
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints
You should receive a JSON response similar to the following:
{
  "indexEndpoints": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID",
      "displayName": "INDEX_ENDPOINT_DISPLAY_NAME",
      "deployedIndexes": [
        {
          "id": "DEPLOYED_INDEX_ID",
          "index": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
          "displayName": "DEPLOYED_INDEX_DISPLAY_NAME",
          "createTime": "2021-06-04T02:23:40.178286Z",
          "privateEndpoints": {
            "matchGrpcAddress": "GRPC_ADDRESS"
          },
          "indexSyncTime": "2022-01-13T04:22:00.151916Z",
          "automaticResources": {
            "minReplicaCount": 2,
            "maxReplicaCount": 10
          }
        }
      ],
      "etag": "AMEw9yP367UitPkLo-khZ1OQvqIK8Q0vLAzZVF7QjdZ5O3l7Zow-mzBo2l6xmiuuMljV",
      "createTime": "2021-03-17T04:47:28.460373Z",
      "updateTime": "2021-06-04T02:23:40.930513Z",
      "network": "VPC_NETWORK_NAME"
    }
  ]
}
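If you process the list response in a script, a summary of each DeployedIndex can be extracted from the JSON. The following sketch assumes the response has the shape shown above; the function name summarize_index_endpoints and the keys of the returned dicts are hypothetical.

```python
import json
from typing import Dict, List


def summarize_index_endpoints(response_text: str) -> List[Dict]:
    """Return one summary dict per DeployedIndex found in the list response."""
    data = json.loads(response_text)
    summary = []
    for endpoint in data.get("indexEndpoints", []):
        # An endpoint with nothing deployed may omit "deployedIndexes".
        for deployed in endpoint.get("deployedIndexes", []):
            resources = deployed.get("automaticResources", {})
            summary.append({
                "endpoint": endpoint["name"],
                "deployed_index_id": deployed["id"],
                "index": deployed["index"],
                "min_replicas": resources.get("minReplicaCount"),
                "max_replicas": resources.get("maxReplicaCount"),
            })
    return summary
```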
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Console
Use these instructions to view a list of your index endpoints.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- On the top of the page, select the Index endpoints tab.
- All of the existing index endpoints are displayed.
For more information, see the reference documentation for IndexEndpoint.
Undeploy an index
To undeploy an index, run the following code:
gcloud
The following example uses the gcloud ai index-endpoints undeploy-index command.
Before using any of the command data below, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- DEPLOYED_INDEX_ID: A user specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers or underscores. See DeployedIndex.id for format guidelines.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID \
  --deployed-index-id=DEPLOYED_INDEX_ID \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID `
  --deployed-index-id=DEPLOYED_INDEX_ID `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID ^
  --deployed-index-id=DEPLOYED_INDEX_ID ^
  --region=LOCATION ^
  --project=PROJECT_ID
REST
Before using any of the request data, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- DEPLOYED_INDEX_ID: A user specified string to uniquely identify the deployed index. It must start with a letter and contain only letters, numbers or underscores. See DeployedIndex.id for format guidelines.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:undeployIndex
Request JSON body:
{ "deployed_index_id": "DEPLOYED_INDEX_ID" }
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.UndeployIndexOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:09:56.641107Z",
      "updateTime": "2022-01-13T04:09:56.641107Z"
    }
  }
}
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Console
Use these instructions to undeploy an index.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- A list of your active indexes is displayed.
- Select the index you want to undeploy. The index details page opens.
- Under the Deployed indexes section, identify the index endpoint from which you want to undeploy the index.
- Click the options menu that is in the same row as the index endpoint and select Undeploy.
- A confirmation screen opens. Click Undeploy. Note: It can take up to 30 minutes to be undeployed.
Delete an IndexEndpoint
Before you delete an IndexEndpoint, you must undeploy all indexes deployed to the endpoint.
gcloud
The following example uses the gcloud ai index-endpoints delete command.
Before using any of the command data below, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai index-endpoints delete INDEX_ENDPOINT_ID \
  --region=LOCATION \
  --project=PROJECT_ID
Windows (PowerShell)
gcloud ai index-endpoints delete INDEX_ENDPOINT_ID `
  --region=LOCATION `
  --project=PROJECT_ID
Windows (cmd.exe)
gcloud ai index-endpoints delete INDEX_ENDPOINT_ID ^
  --region=LOCATION ^
  --project=PROJECT_ID
REST
Before using any of the request data, make the following replacements:
- INDEX_ENDPOINT_ID: The ID of the index endpoint.
- LOCATION: The region where you are using Vertex AI.
- PROJECT_ID: Your Google Cloud project ID.
- PROJECT_NUMBER: Your project's automatically generated project number.
HTTP method and URL:
DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeleteOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:36:19.142203Z",
      "updateTime": "2022-01-13T04:36:19.142203Z"
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
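Because every deployed index must be undeployed before the endpoint can be deleted, the REST calls need to run in a fixed order. The following sketch lists the requests in that order, using the undeployIndex and DELETE URLs from this page. The function name plan_endpoint_deletion and the tuple layout are hypothetical, and the sketch sends nothing; in practice you would wait for each undeploy operation to finish before issuing the delete.

```python
from typing import Dict, List, Optional, Tuple

# (HTTP method, URL, JSON body or None)
Request = Tuple[str, str, Optional[Dict[str, str]]]


def plan_endpoint_deletion(project_id: str, location: str,
                           index_endpoint_id: str,
                           deployed_index_ids: List[str]) -> List[Request]:
    """Return the undeploy requests followed by the final delete request."""
    base = (f"https://{location}-aiplatform.googleapis.com/v1/"
            f"projects/{project_id}/locations/{location}/"
            f"indexEndpoints/{index_endpoint_id}")
    plan: List[Request] = [
        ("POST", f"{base}:undeployIndex", {"deployed_index_id": deployed_id})
        for deployed_id in deployed_index_ids
    ]
    # The delete comes last, after all indexes have been undeployed.
    plan.append(("DELETE", base, None))
    return plan
```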
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Console
Use these instructions to delete an index endpoint.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- On the top of the page, select the Index endpoints tab.
- All of the existing index endpoints are displayed.
- Click the options menu that is in the same row as the index endpoint you want to delete and select Delete.
- A confirmation screen opens. Click Delete. Your index endpoint is now deleted.