This guide shows you how to connect your Vertex AI RAG Engine to Vertex AI Vector Search. This page covers the following topics: Before you use Vertex AI RAG Engine with Vector Search, set up the Vertex AI SDK. For instructions, see Setup. To integrate Vector Search with Vertex AI RAG Engine, you must start with an empty Vector Search index. To be compatible with a RAG Corpus, your Vector Search index must meet the following criteria: To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python.
For more information, see the
Python API reference documentation.
Before you begin
Set up Vertex AI Vector Search
Create a Vector Search index
STREAM_UPDATE
. For more information, see Create stream index.DOT_PRODUCT_DISTANCE
or COSINE_DISTANCE
.Python
Create an index endpoint
Vertex AI RAG Engine supports Public endpoints.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Deploy the index
Before you can query the index, you must deploy it to an index endpoint.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
If this is the first time that you're deploying an index to an index endpoint, it takes approximately 30 minutes to build and initiate the backend. Subsequent deployments to the same endpoint take only a few seconds. To see the status of the index deployment, go to the Vector Search Console, select the Index endpoints tab, and choose your index endpoint.
You need the full resource names of your index and index endpoint for the following steps. The formats are as follows:
projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/indexes/${INDEX_ID}
projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/indexEndpoints/${INDEX_ENDPOINT_ID}
Use Vector Search with Vertex AI RAG Engine
After you set up your Vector Search index and endpoint, you can use them as the vector database for your RAG application.
Create a RAG corpus
When you create the RAG corpus, specify the full resource names for your index and index endpoint. Vertex AI RAG Engine automatically associates the RAG corpus with your Vector Search index. Vertex AI RAG Engine also validates that the index meets the required criteria. If the requirements aren't met, the request is rejected.
Python
Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
# TODO(developer): Update and un-comment the following lines:
# CORPUS_DISPLAY_NAME = "YOUR_CORPUS_DISPLAY_NAME"
# Full index/indexEndpoint resource name
# Index: projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/indexes/${INDEX_ID}
# IndexEndpoint: projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/indexEndpoints/${INDEX_ENDPOINT_ID}
# INDEX_RESOURCE_NAME = "YOUR_INDEX_ENDPOINT_RESOURCE_NAME"
# INDEX_NAME = "YOUR_INDEX_RESOURCE_NAME"
# Call CreateRagCorpus API to create a new RagCorpus
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://${LOCATION_ID}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/ragCorpora -d '{
"display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
"rag_vector_db_config" : {
"vertex_vector_search": {
"index":'\""${INDEX_NAME}"\"'
"index_endpoint":'\""${INDEX_ENDPOINT_NAME}"\"'
}
}
}'
# Call ListRagCorpora API to verify the RagCorpus is created successfully
curl -sS -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://${LOCATION_ID}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/${LOCATION_ID}/ragCorpora"
Import files
Use the ImportRagFiles
API to import files from Cloud Storage or Google Drive. The files are embedded and stored in your Vector Search index.
REST
# TODO(developer): Update and uncomment the following lines:
# RAG_CORPUS_ID = "your-rag-corpus-id"
#
# Google Cloud Storage bucket/file location.
# For example, "gs://rag-fos-test/"
# GCS_URIS= "your-gcs-uris"
# Call ImportRagFiles API to embed files and store in the BigQuery table
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
"import_rag_files_config": {
"gcs_source": {
"uris": '\""${GCS_URIS}"\"'
},
"rag_file_chunking_config": {
"chunk_size": 512
}
}
}'
# Call ListRagFiles API to verify the files are imported successfully
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Retrieve contexts
After you import the files, retrieve relevant contexts from the Vector Search index by using the RetrieveContexts
API.
REST
# TODO(developer): Update and uncomment the following lines:
# RETRIEVAL_QUERY="your-retrieval-query"
#
# Full RAG corpus resource name
# Format:
# "projects/${PROJECT_NUMBER}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}"
# RAG_CORPUS_RESOURCE="your-rag-corpus-resource"
# Call RetrieveContexts API to retrieve relevant contexts
curl -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/us-central1:retrieveContexts \
-d '{
"vertex_rag_store": {
"rag_resources": {
"rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"',
},
},
"query": {
"text": '\""${RETRIEVAL_QUERY}"\"',
"similarity_top_k": 10
}
}'
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Generate content
To generate content, call the Vertex AI GenerateContent
API with a Gemini model. When you specify the RAG_CORPUS_RESOURCE
in the request, the API automatically retrieves data from the Vector Search index to ground the model's response.
REST
# TODO(developer): Update and uncomment the following lines:
# MODEL_ID=gemini-2.0-flash
# GENERATE_CONTENT_PROMPT="your-generate-content-prompt"
# GenerateContent with contexts retrieved from the FeatureStoreOnline index
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_NUMBER}/locations/us-central1/publishers/google/models/${MODEL_ID}:generateContent \
-d '{
"contents": {
"role": "user",
"parts": {
"text": '\""${GENERATE_CONTENT_PROMPT}"\"'
}
},
"tools": {
"retrieval": {
"vertex_rag_store": {
"rag_resources": {
"rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"',
},
"similarity_top_k": 8,
"vector_distance_threshold": 0.32
}
}
}
}'
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.