After you've created and deployed the index, you can run queries to get the nearest neighbors.
The following examples show match queries that find the top nearest neighbors using the k-nearest neighbors (k-NN) algorithm.
Example queries for public endpoint
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Command-line
The publicEndpointDomainName listed below can be found at Deploy and is formatted as <number>.<region>-<number>.vdb.vertexai.goog.
$ curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://1957880287.us-central1-181224308459.vdb.vertexai.goog/v1/projects/181224308459/locations/us-central1/indexEndpoints/3370566089086861312:findNeighbors -d '{deployed_index_id: "test_index_public1", queries: [{datapoint: {datapoint_id: "0", feature_vector: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}, neighbor_count: 5}]}'
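The same request can also be assembled from Python. The following is a minimal sketch that uses only the standard library: it builds the findNeighbors URL and JSON body from the curl command above but doesn't send anything. The helper name `build_find_neighbors_request` is hypothetical, and a real call would need a valid access token for the `Authorization` header.

```python
import json
import urllib.request


def build_find_neighbors_request(domain, project, location, endpoint_id,
                                 deployed_index_id, feature_vector,
                                 neighbor_count=5, access_token=None):
    """Build (but don't send) a findNeighbors POST request.

    Hypothetical helper: field names mirror the curl example above.
    """
    url = (
        f"https://{domain}/v1/projects/{project}/locations/{location}"
        f"/indexEndpoints/{endpoint_id}:findNeighbors"
    )
    body = {
        "deployed_index_id": deployed_index_id,
        "queries": [
            {
                "datapoint": {
                    "datapoint_id": "0",
                    "feature_vector": list(feature_vector),
                },
                "neighbor_count": neighbor_count,
            }
        ],
    }
    headers = {"Content-Type": "application/json"}
    if access_token:
        # Same token that `gcloud auth print-access-token` prints.
        headers["Authorization"] = f"Bearer {access_token}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_find_neighbors_request(
    domain="1957880287.us-central1-181224308459.vdb.vertexai.goog",
    project="181224308459",
    location="us-central1",
    endpoint_id="3370566089086861312",
    deployed_index_id="test_index_public1",
    feature_vector=[1] * 100,
)
print(request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` after supplying an access token.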
This curl example demonstrates how to call the endpoint from HTTP(S) clients, although the public endpoint supports two protocols: RESTful and grpc_cli.
$ curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://1957880287.us-central1-181224308459.vdb.vertexai.goog/v1/projects/${PROJECT_ID}/locations/us-central1/indexEndpoints/${INDEX_ENDPOINT_ID}:readIndexDatapoints -d '{deployed_index_id:"test_index_public1", ids: ["606431", "896688"]}'
This curl example demonstrates how to query with token and numeric restricts.
$ curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://${PUBLIC_ENDPOINT_DOMAIN}/v1/projects/${PROJECT_ID}/locations/${LOCATION}/indexEndpoints/${INDEX_ENDPOINT_ID}:findNeighbors -d '{deployed_index_id:"'"${DEPLOYED_INDEX_ID}"'", queries: [{datapoint: {datapoint_id:"x", feature_vector: [1, 1], "sparse_embedding": {"values": [111.0,111.1,111.2], "dimensions": [10,20,30]}, numeric_restricts: [{namespace: "int-ns", value_int: -2, op: "GREATER"}, {namespace: "int-ns", value_int: 4, op: "LESS_EQUAL"}, {namespace: "int-ns", value_int: 0, op: "NOT_EQUAL"}], restricts: [{namespace: "color", allow_list: ["red"]}]}}]}'
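Restricts are easier to read when they are built programmatically. The following sketch assembles the token and numeric restricts from the curl example above as plain Python dictionaries. The helper names (`token_restrict`, `numeric_restrict`, `build_filtered_query`) and the deployed index ID `my_deployed_index` are hypothetical; only the field names and operator strings come from the example.

```python
import json


def token_restrict(namespace, allow_list):
    """Token restrict: allow datapoints tagged with one of these tokens."""
    return {"namespace": namespace, "allow_list": list(allow_list)}


def numeric_restrict(namespace, value_int, op):
    """Numeric restrict: compare a datapoint's integer value using `op`."""
    return {"namespace": namespace, "value_int": value_int, "op": op}


def build_filtered_query(datapoint_id, feature_vector, restricts, numeric_restricts):
    """Combine a dense vector with token and numeric restricts in one query."""
    return {
        "datapoint": {
            "datapoint_id": datapoint_id,
            "feature_vector": list(feature_vector),
            "restricts": list(restricts),
            "numeric_restricts": list(numeric_restricts),
        }
    }


query = build_filtered_query(
    "x",
    [1, 1],
    restricts=[token_restrict("color", ["red"])],
    numeric_restricts=[
        numeric_restrict("int-ns", -2, "GREATER"),
        numeric_restrict("int-ns", 4, "LESS_EQUAL"),
        numeric_restrict("int-ns", 0, "NOT_EQUAL"),
    ],
)
# "my_deployed_index" is a placeholder, not a real deployed index ID.
body = {"deployed_index_id": "my_deployed_index", "queries": [query]}
print(json.dumps(body, indent=2))
```

Read together, the three numeric restricts appear to select datapoints whose `int-ns` value is greater than -2, at most 4, and not equal to 0, while the token restrict limits results to the `red` token in the `color` namespace.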
Console
Use these instructions to query an index deployed to a public endpoint from the console.
- In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
- Select the index you want to query. The Index info page opens.
- Scroll down to the Deployed indexes section and select the deployed index you want to query. The Deployed index info page opens.
- From the Query index section, select whether to query by a dense embedding value, a sparse embedding value, a hybrid embedding value (dense and sparse embeddings), or a specific data point.
- Enter the query parameters for the type of query you selected. For example, if you're querying by a dense embedding, enter the embedding vector to query by.
- Execute the query using the provided curl command, or by running with Cloud Shell.
- If using Cloud Shell, select Run in Cloud Shell.
- The results return nearest neighbors.
Hybrid queries
Hybrid search uses both dense and sparse embeddings, so that searches are based on a combination of keyword search and semantic search.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
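As a rough illustration that doesn't depend on the SDK, the sketch below builds a hybrid query datapoint as a plain dictionary. The `feature_vector` and `sparse_embedding` field names follow the curl example earlier on this page. The helper `build_hybrid_datapoint` and its dimension-to-weight input format are assumptions made for this example.

```python
def build_hybrid_datapoint(datapoint_id, dense_vector, sparse_weights):
    """Build a datapoint carrying both a dense and a sparse embedding.

    sparse_weights maps a sparse dimension index to its weight
    (a hypothetical input format chosen for this sketch).
    """
    # Sort dimensions so that values and dimensions stay aligned.
    dimensions = sorted(sparse_weights)
    return {
        "datapoint_id": datapoint_id,
        "feature_vector": list(dense_vector),
        "sparse_embedding": {
            "values": [float(sparse_weights[d]) for d in dimensions],
            "dimensions": dimensions,
        },
    }


datapoint = build_hybrid_datapoint(
    "x",
    dense_vector=[1, 1],
    sparse_weights={30: 111.2, 10: 111.0, 20: 111.1},
)
print(datapoint["sparse_embedding"])
```

Keeping the sparse weights in a mapping until the last moment avoids the most common mistake with this format, which is letting the `values` and `dimensions` lists drift out of alignment.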
Queries with filtering and crowding
Filtering vector matches lets you restrict your nearest neighbor results to specific categories. Filters can also designate categories to exclude from your results.
Per-crowding neighbor limits can increase result diversity by limiting the number of results returned from any single crowding tag in your index data.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
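To build intuition for per-crowding neighbor limits, the following sketch simulates the effect locally on an already-ranked result list. It is a conceptual illustration only, not the service's implementation: in a real query the limit is enforced by Vector Search. The function name `apply_crowding_limit` and the tuple layout are assumptions.

```python
from collections import Counter


def apply_crowding_limit(ranked_neighbors, per_tag_limit, neighbor_count):
    """Keep at most `per_tag_limit` results per crowding tag.

    ranked_neighbors: list of (datapoint_id, distance, crowding_tag),
    ordered from closest to farthest.
    """
    seen = Counter()
    kept = []
    for datapoint_id, distance, tag in ranked_neighbors:
        if seen[tag] >= per_tag_limit:
            continue  # this crowding tag already has enough results
        seen[tag] += 1
        kept.append((datapoint_id, distance, tag))
        if len(kept) == neighbor_count:
            break
    return kept


ranked = [
    ("a", 0.1, "red"),
    ("b", 0.2, "red"),
    ("c", 0.3, "red"),
    ("d", 0.4, "blue"),
    ("e", 0.5, "green"),
]
diverse = apply_crowding_limit(ranked, per_tag_limit=2, neighbor_count=4)
print([datapoint_id for datapoint_id, _, _ in diverse])
```

With a limit of two per tag, the third `red` result is skipped and the slots go to the `blue` and `green` datapoints instead, which is the diversity effect described above.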
Query-time settings that impact performance
The following query-time parameters can affect latency, availability, and cost when using Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.
For parameter definitions, see Index configuration parameters.
Parameter | About | Performance impact |
---|---|---|
approximateNeighborsCount | Tells the algorithm the number of approximate results to retrieve from each shard. The value of … | Increasing the value of … Decreasing the value of … |
setNeighborCount | Specifies the number of results that you want the query to return. | Values less than or equal to 300 remain performant in most use cases. For larger values, test for your specific use case. |
fractionLeafNodesToSearch | Controls the percentage of leaf nodes to visit when searching for nearest neighbors. This is related to the leafNodeEmbeddingCount in that the more embeddings per leaf node, the more data examined per leaf. | Increasing the value of … Decreasing the value of … |
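The sketch below shows one way to attach these query-time settings to a single query and flag values worth testing. The request field names `approximate_neighbor_count` and `fraction_leaf_nodes_to_search_override` are assumptions based on the public REST API and aren't confirmed on this page, as is the 0.0 to 1.0 range check. The helper `build_tuned_query` is hypothetical.

```python
def build_tuned_query(datapoint, neighbor_count,
                      approximate_neighbor_count=None,
                      fraction_leaf_nodes_to_search=None):
    """Return (query, notes) for a query with optional tuning overrides."""
    notes = []
    query = {"datapoint": datapoint, "neighbor_count": neighbor_count}

    # The table above says values up to 300 remain performant in most cases.
    if neighbor_count > 300:
        notes.append("neighbor_count > 300: test latency for your use case")

    if approximate_neighbor_count is not None:
        # Assumed REST field name for approximateNeighborsCount.
        query["approximate_neighbor_count"] = approximate_neighbor_count

    if fraction_leaf_nodes_to_search is not None:
        # Assumption: the fraction is expressed as a value from 0.0 to 1.0.
        if not 0.0 <= fraction_leaf_nodes_to_search <= 1.0:
            raise ValueError("fraction_leaf_nodes_to_search must be in [0.0, 1.0]")
        # Assumed REST field name for fractionLeafNodesToSearch.
        query["fraction_leaf_nodes_to_search_override"] = fraction_leaf_nodes_to_search

    return query, notes


query, notes = build_tuned_query(
    {"datapoint_id": "0", "feature_vector": [1, 1]},
    neighbor_count=500,
    approximate_neighbor_count=1000,
    fraction_leaf_nodes_to_search=0.1,
)
print(query)
print(notes)
```

Returning notes rather than raising for large `neighbor_count` values reflects the guidance above: larger values aren't invalid, they just need testing.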
What's next
To see an end-to-end example of how to create an index, how to deploy it to a public endpoint, and how to query, see the official notebook: Using Vector Search and Vertex AI Embeddings for Text for StackOverflow Questions.
- Learn how to Update and rebuild your index
- Learn how to Filter vector matches
- Learn how to Monitor an index