Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see Model versions and lifecycle.
Vertex AI RAG Engine is a component of the
Vertex AI platform that facilitates Retrieval-Augmented
Generation (RAG). RAG Engine enables large language models (LLMs) to access
and incorporate data from external knowledge sources, such as documents and
databases. By using RAG, LLMs can generate more accurate and informative
responses.
This table lists the parameters used to create a RAG corpus.
Body Request
Parameters
display_name
Required: string
The display name of the RAG corpus.
description
Optional: string
The description of the RAG corpus.
encryption_spec
Optional: Immutable: string
The name of the CMEK key used to encrypt at-rest data that's related to the RAG corpus. The key name applies only to the RagManaged option for the vector database. You can set this field only when you create the corpus; after that, it can't be updated or deleted.
Format: projects/{project}/locations/{location}/collections/{collection}/engines/{engine}/servingConfigs/{serving_config}
or
projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/servingConfigs/{serving_config}
vectorDbConfig
Parameters
rag_managed_db
oneof vector_db: vectorDbConfig.RagManagedDb
If no vector database is specified, rag_managed_db is the default vector database.
pinecone
oneof vector_db: vectorDbConfig.Pinecone
Specifies your Pinecone instance.
pinecone.index_name
string
This is the name used to create the Pinecone index that's used with the RAG corpus.
This value can't be changed after it's set. You can leave it empty in
the CreateRagCorpus API call, and set it with a non-empty
value in a follow-up UpdateRagCorpus API call.
vertex_vector_search
oneof vector_db: vectorDbConfig.VertexVectorSearch
Specifies your Vertex Vector Search instance.
vertex_vector_search.index
string
This is the resource name of the Vector Search index that's used with the RAG corpus.
This value can't be changed after it's set. You can leave it empty in
the CreateRagCorpus API call, and set it with a non-empty
value in a follow-up UpdateRagCorpus API call.
vertex_vector_search.index_endpoint
string
This is the resource name of the Vector Search index endpoint that's used with the RAG corpus.
This value can't be changed after it's set. You can leave it empty in
the CreateRagCorpus API call, and set it with a non-empty
value in a follow-up UpdateRagCorpus API call.
api_auth.api_key_config.api_key_secret_version
string
This is the full resource name of the secret that is stored in Secret Manager,
which contains your Pinecone API key.
The embedding model to use for the RAG corpus. This value can't be
changed after it's set. If you leave it empty, we use text-embedding-005
as the embedding model.
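The create parameters above can be sketched as a small request-body builder. This is only an illustration: the dictionary layout (including the `vectorDbConfig` key) and the helper name `build_create_corpus_body` are assumptions, not the confirmed wire format. It shows the two rules the table states: `display_name` is required, and the vector database is a oneof, so at most one option can be set.

```python
from typing import Any, Optional

# The three vector database options named in the parameter table.
VECTOR_DB_OPTIONS = {"rag_managed_db", "pinecone", "vertex_vector_search"}


def build_create_corpus_body(
    display_name: str,
    description: Optional[str] = None,
    vector_db: Optional[dict[str, dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Assemble a body dict, enforcing the required field and the oneof.

    Hypothetical helper: the dict layout is an assumption for illustration.
    """
    if not display_name:
        raise ValueError("display_name is required")

    body: dict[str, Any] = {"display_name": display_name}
    if description is not None:
        body["description"] = description

    if vector_db is not None:
        # vector_db is a oneof: exactly one option may be present.
        if len(vector_db) != 1:
            raise ValueError("set exactly one vector database option")
        (option,) = vector_db
        if option not in VECTOR_DB_OPTIONS:
            raise ValueError(f"unknown vector database option: {option}")
        body["vectorDbConfig"] = dict(vector_db)
    # When vector_db is omitted, the table says rag_managed_db is the default.
    return body


body = build_create_corpus_body(
    "test_corpus",
    description="Corpus Description",
    # The index name can be left empty at creation and set in a later update.
    vector_db={"pinecone": {"index_name": ""}},
)
print(body)
```

Validating the oneof on the client side is a design choice that surfaces mistakes before a request is sent; the service performs its own validation regardless.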
Update a RAG corpus
This table lists the parameters used to update a RAG corpus.
Body Request
Parameters
display_name
Optional: string
The display name of the RAG corpus.
description
Optional: string
The description of the RAG corpus.
rag_vector_db.pinecone.index_name
string
This is the name used to create the Pinecone index that's used with the RAG corpus.
If your RagCorpus was created with a Pinecone
configuration, and this field has never been set before, then you can update
the Pinecone instance's index name.
rag_vector_db.vertex_vector_search.index
string
This is the resource name of the Vector Search index that's used with the RAG corpus.
List RAG corpora
This table lists the parameters used to list RAG corpora.
Parameters
page_size
Optional: int
The standard list page size.
page_token
Optional: string
The standard list page token. Typically obtained from ListRagCorporaResponse.next_page_token of the previous VertexRagDataService.ListRagCorpora call.
Get a RAG corpus
This table lists parameters used to get a RAG corpus.
Parameters
name
string
The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}
Delete a RAG corpus
This table lists parameters used to delete a RAG corpus.
Parameters
name
string
The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}
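Because the get and delete calls both take the full resource name, it can help to build and check that string in one place. The following is a minimal sketch based only on the format shown above; the helper names `format_corpus_name` and `parse_corpus_name` are hypothetical, and the rule that an ID contains no slash is an assumption.

```python
import re
from typing import NamedTuple

# Mirrors: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}
_CORPUS_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<rag_corpus_id>[^/]+)$"
)


class CorpusName(NamedTuple):
    project: str
    location: str
    rag_corpus_id: str


def format_corpus_name(project: str, location: str, rag_corpus_id: str) -> str:
    """Build the full RagCorpus resource name from its components."""
    return f"projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}"


def parse_corpus_name(name: str) -> CorpusName:
    """Split a RagCorpus resource name, raising ValueError if it is malformed."""
    match = _CORPUS_NAME.match(name)
    if match is None:
        raise ValueError(f"not a RagCorpus resource name: {name!r}")
    return CorpusName(**match.groupdict())


name = format_corpus_name("my-project", "us-central1", "1234567890")
print(name)
print(parse_corpus_name(name))
```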
If this field isn't set, RAG uses the default parser.
max_embedding_requests_per_min
Optional: int32
The maximum number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page on the project to set an appropriate value.
If unspecified, a default value of 1,000 QPM is used.
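To get a feel for what a per-minute cap implies, the arithmetic below estimates a lower bound on how long a job needs to issue a given number of embedding requests. It is a rough sketch: the request count is a hypothetical input that you would have to estimate yourself, and the result ignores parsing, network latency, and any other processing time.

```python
import math

DEFAULT_EMBEDDING_QPM = 1000  # default stated in the parameter description


def min_import_minutes(
    num_embedding_requests: int, qpm: int = DEFAULT_EMBEDDING_QPM
) -> int:
    """Lower bound, in whole minutes, to issue the requests at the given cap."""
    if num_embedding_requests < 0:
        raise ValueError("num_embedding_requests must be non-negative")
    if qpm <= 0:
        raise ValueError("qpm must be positive")
    return math.ceil(num_embedding_requests / qpm)


print(min_import_minutes(25_000))           # at the default cap
print(min_import_minutes(25_000, qpm=900))  # at a lower, custom cap
```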
GoogleDriveSource
resource_ids.resource_id
Required: string
The ID of the Google Drive resource.
resource_ids.resource_type
Required: string
The type of the Google Drive resource.
SlackSource
channels.channels
Repeated: SlackSource.SlackChannels.SlackChannel
Slack channel information, including the channel ID and the time range to import.
channels.channels.channel_id
Required: string
The Slack channel ID.
channels.channels.start_time
Optional: google.protobuf.Timestamp
The starting timestamp for messages to import.
channels.channels.end_time
Optional: google.protobuf.Timestamp
The ending timestamp for messages to import.
channels.api_key_config.api_key_secret_version
Required: string
The full resource name of the secret that is stored in Secret Manager,
which contains a Slack channel access token that has access to the Slack channel IDs.
See: https://api.slack.com/tutorials/tracks/getting-a-token.
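The start_time and end_time fields are google.protobuf.Timestamp values, which are written as RFC 3339 strings in JSON. The sketch below converts Python datetimes into that form and assembles one channel entry. The dictionary layout and the helper names are assumptions for illustration, and the channel ID is a made-up example.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def to_timestamp_json(moment: datetime) -> str:
    """Format an aware datetime as an RFC 3339 UTC string with a "Z" suffix.

    Sub-second precision is dropped in this sketch.
    """
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def slack_channel(
    channel_id: str,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> dict[str, Any]:
    """Build one channel entry; channel_id is required, the range is optional."""
    if not channel_id:
        raise ValueError("channel_id is required")
    if start is not None and end is not None and start > end:
        raise ValueError("start_time must not be after end_time")

    channel: dict[str, Any] = {"channel_id": channel_id}
    if start is not None:
        channel["start_time"] = to_timestamp_json(start)
    if end is not None:
        channel["end_time"] = to_timestamp_json(end)
    return channel


channel = slack_channel(
    "C0123456789",  # hypothetical channel ID
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
    end=datetime(2024, 2, 1, tzinfo=timezone.utc),
)
print(channel)
```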
The full resource name of the secret that is stored in Secret Manager,
which contains the Jira API key.
See: https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/
The path of the SharePoint folder to download from.
share_point_sources.sharepoint_folder_id
oneof in folder_source: string
The ID of the SharePoint folder to download from.
share_point_sources.drive_name
oneof in drive_source: string
The name of the drive to download from.
share_point_sources.drive_id
oneof in drive_source: string
The ID of the drive to download from.
share_point_sources.client_id
string
The Application ID for the app registered in Microsoft Azure Portal.
The application must also be configured with the MS Graph permissions
"Files.Read.All", "Sites.Read.All", and "BrowserSiteLists.Read.All".
The maximum number of requests the job is allowed to make to the Document AI processor per minute.
Consult https://cloud.google.com/document-ai/quotas and the Quota page
for your project to set an appropriate value here. If unspecified, a default
value of 120 QPM is used.
The maximum number of requests the job is allowed to make to the LLM per minute.
To set an appropriate value for your project, see the model quota section and the Quota page
for your project. If unspecified, a default
value of 5,000 QPM is used.
Get a RAG file
This table lists parameters used to get a RAG file.
Parameters
name
string
The name of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}
Delete a RAG file
This table lists parameters used to delete a RAG file.
Parameters
name
string
The name of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}
Retrieval and prediction parameters
This section lists the retrieval and prediction parameters.
Retrieval parameters
This table lists the parameters for the retrieveContexts API.
Parameters
parent
Required: string
The resource name of the Location from which to retrieve RagContexts.
The caller must have permission to make calls in the project.
Format: projects/{project}/locations/{location}
vertex_rag_store
VertexRagStore
The data source for Vertex RagStore.
query
Required: RagQuery
Single RAG retrieve query.
VertexRagStore
rag_resources
list: RagResource
The representation of the RAG source. Use it to specify either a corpus
only or specific RagFiles. Only one corpus, or multiple files
from one corpus, is supported.
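The one-corpus restriction on rag_resources can be checked before a request is sent. The sketch below uses a local `RagResource` dataclass as a stand-in for the SDK type, with the field names `rag_corpus` and `rag_file_ids` assumed to mirror the SDK; `validate_rag_resources` is a hypothetical helper, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass
class RagResource:
    """Local stand-in for the SDK type; field names are assumed to match it."""

    rag_corpus: str
    rag_file_ids: list[str] = field(default_factory=list)


def validate_rag_resources(resources: list[RagResource]) -> str:
    """Return the single corpus name, or raise if zero or several are present."""
    corpora = {resource.rag_corpus for resource in resources}
    if len(corpora) != 1:
        raise ValueError(f"expected exactly one corpus, found {len(corpora)}")
    return next(iter(corpora))


corpus = "projects/my-project/locations/us-central1/ragCorpora/111"
resources = [RagResource(rag_corpus=corpus, rag_file_ids=["file-1", "file-2"])]
print(validate_rag_resources(resources))
```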
from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# display_name = "test_corpus"
# description = "Corpus Description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure backend_config
backend_config = rag.RagVectorDbConfig(
    rag_embedding_model_config=rag.RagEmbeddingModelConfig(
        vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
            publisher_model="publishers/google/models/text-embedding-005"
        )
    )
)

corpus = rag.create_corpus(
    display_name=display_name,
    description=description,
    backend_config=backend_config,
)
print(corpus)
# Example response:
# RagCorpus(name='projects/1234567890/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description', embedding_model_config=...
# ...
Update a RAG corpus example
You can update your RAG corpus with a new display name, description, and vector
database configuration. However, you can't change the following
parameters in your RAG corpus:
The vector database type. For example, you can't change the vector database
from Weaviate to Vertex AI Feature Store.
If you're using the managed database option, you can't update the vector
database configuration.
These examples demonstrate how to update a RAG corpus.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
CORPUS_ID: The corpus ID of your RAG corpus.
CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
CORPUS_DESCRIPTION: The description of the RAG corpus.
INDEX_NAME: The resource name of the
Vector Search Index. Format:
projects/{project}/locations/{location}/indexes/{index}.
INDEX_ENDPOINT_NAME: The resource name of the
Vector Search index endpoint. Format:
projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}.
You should receive a successful status code (2xx).
List RAG corpora example
These code samples demonstrate how to list all of the RAG corpora.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
PAGE_SIZE: The standard list page size. You can adjust
the number of RAG corpora returned per page by updating the page_size
parameter.
PAGE_TOKEN: The standard list page token. Typically obtained
from ListRagCorporaResponse.next_page_token of the previous
VertexRagDataService.ListRagCorpora call.
from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

corpora = rag.list_corpora()
print(corpora)
# Example response:
# ListRagCorporaPager<rag_corpora {
#   name: "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/2305843009213693952"
#   display_name: "test_corpus"
#   create_time {
# ...
Get a RAG corpus example
These code samples demonstrate how to get a RAG corpus.
REST
Before using any of the request data, make the following replacements:
A successful response returns the RagCorpus resource.
The get and list commands are used in an example to demonstrate how
RagCorpus uses the rag_embedding_model_config field within the
vector_db_config, which points to the embedding model you have chosen.
from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

corpus = rag.get_corpus(name=corpus_name)
print(corpus)
# Example response:
# RagCorpus(name='projects/[PROJECT_ID]/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description',
# ...
Delete a RAG corpus example
These code samples demonstrate how to delete a RAG corpus.
REST
Before using any of the request data, make the following replacements:
from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag.delete_corpus(name=corpus_name)
print(f"Corpus {corpus_name} deleted.")
# Example response:
# Successfully deleted the RagCorpus.
# Corpus projects/[PROJECT_ID]/locations/us-central1/ragCorpora/123456789012345 deleted.
File management examples
This section provides examples of how to use the API to manage RAG files.
Upload a RAG file example
These code samples demonstrate how to upload a RAG file.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The corpus ID of your RAG corpus.
LOCAL_FILE_PATH: The local path to the file to be
uploaded.
from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# path = "path/to/local/file.txt"
# display_name = "file_display_name"
# description = "file description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.upload_file(
    corpus_name=corpus_name,
    path=path,
    display_name=display_name,
    description=description,
)
print(rag_file)
# RagFile(name='projects/[PROJECT_ID]/locations/us-central1/ragCorpora/1234567890/ragFiles/09876543',
# display_name='file_display_name', description='file description')
Import RAG files example
Files and folders can be imported from Drive or
Cloud Storage. You can use response.metadata to view partial
failures, request time, and response time in the SDK's response object.
The response.skipped_rag_files_count refers to the number of files that
were skipped during import. A file is skipped when all of the following conditions are
met:
The file has already been imported.
The file hasn't changed.
The chunking configuration for the file hasn't changed.
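The skip conditions listed above can be read as a single boolean check. The sketch below is purely illustrative: `ImportRecord` and `should_skip` are hypothetical names, and using a content hash to detect whether a file has changed is an assumption, since this page does not say how RAG Engine detects changes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImportRecord:
    """Hypothetical snapshot of a file at import time."""

    content_hash: str  # assumption: stands in for "the file hasn't changed"
    chunk_size: int
    chunk_overlap: int


def should_skip(previous: Optional[ImportRecord], candidate: ImportRecord) -> bool:
    """Skip only when the file was imported before and nothing has changed."""
    if previous is None:
        return False  # the file has never been imported
    file_unchanged = previous.content_hash == candidate.content_hash
    chunking_unchanged = (previous.chunk_size, previous.chunk_overlap) == (
        candidate.chunk_size,
        candidate.chunk_overlap,
    )
    return file_unchanged and chunking_unchanged


first_import = ImportRecord("abc123", chunk_size=1024, chunk_overlap=256)
print(should_skip(None, first_import))          # never imported before
print(should_skip(first_import, first_import))  # same file, same chunking
```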
Python
from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# paths = ["https://drive.google.com/file/123", "gs://my_bucket/my_files_dir"]  # Supports Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.import_files(
    corpus_name=corpus_name,
    paths=paths,
    transformation_config=rag.TransformationConfig(
        rag.ChunkingConfig(chunk_size=1024, chunk_overlap=256)
    ),
    import_result_sink="gs://sample-existing-folder/sample_import_result_unique.ndjson",  # Optional: This must be an existing Cloud Storage bucket folder, and the filename must be unique (non-existent).
    llm_parser=rag.LlmParserConfig(
        model_name="gemini-2.5-pro-preview-05-06",
        max_parsing_requests_per_min=100,
    ),  # Optional
    max_embedding_requests_per_min=900,  # Optional
)
print(f"Imported {response.imported_rag_files_count} files.")
REST
Before using any of the request data, make the following replacements:
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The corpus ID of your RAG corpus.
FOLDER_RESOURCE_ID: The resource ID of your
Drive folder.
GCS_URIS: A list of Cloud Storage locations.
Example: gs://my-bucket1.
CHUNK_SIZE: Number of tokens each chunk should have.
CHUNK_OVERLAP: Number of tokens overlap between chunks.
EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's
access to your embedding model. Example: 1,000.
A successful response returns the ImportRagFilesOperationMetadata resource.
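To illustrate how CHUNK_SIZE and CHUNK_OVERLAP relate, the sketch below splits a token list into fixed-size windows in which each chunk repeats the last few tokens of the previous one. It is a generic sliding-window illustration, not RAG Engine's actual chunker, and whitespace-separated words stand in for real tokens.

```python
def chunk_tokens(
    tokens: list[str], chunk_size: int, chunk_overlap: int
) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start : start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


words = "a b c d e f g h i j".split()
for chunk in chunk_tokens(words, chunk_size=4, chunk_overlap=2):
    print(chunk)
```

A larger overlap preserves more context across chunk boundaries at the cost of producing more chunks, and therefore more embedding requests.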
The following sample demonstrates how to import a file from
Cloud Storage. Use the max_embedding_requests_per_min control field
to limit the rate at which RAG Engine calls the embedding model during the
ImportRagFiles indexing process. The field has a default value of 1000 calls
per minute.
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The corpus ID of your RAG corpus.
GCS_URIS: A list of Cloud Storage locations.
Example: gs://my-bucket1.
CHUNK_SIZE: Number of tokens each chunk should have.
CHUNK_OVERLAP: Number of tokens overlap between chunks.
EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's
access to your embedding model. Example: 1,000.
The following sample demonstrates how to import a file from
Drive. Use the max_embedding_requests_per_min control field to
limit the rate at which RAG Engine calls the embedding model during the
ImportRagFiles indexing process. The field has a default value of 1000 calls
per minute.
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The corpus ID of your RAG corpus.
FOLDER_RESOURCE_ID: The resource ID of your
Drive folder.
CHUNK_SIZE: Number of tokens each chunk should have.
CHUNK_OVERLAP: Number of tokens overlap between chunks.
EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's
access to your embedding model. Example: 1,000.
from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

files = rag.list_files(corpus_name=corpus_name)
for file in files:
    print(file.display_name)
    print(file.name)
# Example response:
# g-drive_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/222222222222
# g_cloud_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/333333333333
Get a RAG file example
These code samples demonstrate how to get a RAG file.
REST
Before using any of the request data, make the following replacements:
from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.get_file(name=file_name)
print(rag_file)
# Example response:
# RagFile(name='projects/1234567890/locations/us-central1/ragCorpora/11111111111/ragFiles/22222222222',
# display_name='file_display_name', description='file description')
Delete a RAG file example
These code samples demonstrate how to delete a RAG file.
REST
Before using any of the request data, make the following replacements:
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
RAG_FILE_ID: The ID of the RagFile resource. Format:
projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}.
from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag.delete_file(name=file_name)
print(f"File {file_name} deleted.")
# Example response:
# Successfully deleted the RagFile.
# File projects/1234567890/locations/us-central1/ragCorpora/1111111111/ragFiles/2222222222 deleted.
Retrieval query example
When a user asks a question or provides a prompt, the retrieval component in RAG
searches through its knowledge base to find information that is relevant to the
query.
from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag.RagRetrievalConfig(
                top_k=10,
                filter=rag.utils.resources.Filter(vector_distance_threshold=0.5),
            ),
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-2.0-flash-001", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)
# Example response:
# The sky appears blue due to a phenomenon called Rayleigh scattering.
# Sunlight, which contains all colors of the rainbow, is scattered
# by the tiny particles in the Earth's atmosphere....
# ...
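The two retrieval settings in the sample, top_k and vector_distance_threshold, can be pictured as a filter followed by a cutoff. The sketch below is an illustration under stated assumptions rather than the service's implementation: it assumes that a smaller distance means a closer match, that only contexts with a distance strictly below the threshold are kept, and that the closest top_k of those are returned. `ScoredContext` and `select_contexts` are hypothetical names.

```python
from typing import NamedTuple


class ScoredContext(NamedTuple):
    text: str
    distance: float  # assumption: smaller means closer to the query


def select_contexts(
    candidates: list[ScoredContext],
    top_k: int,
    vector_distance_threshold: float,
) -> list[ScoredContext]:
    """Keep contexts under the distance threshold, then the top_k closest."""
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    within = [c for c in candidates if c.distance < vector_distance_threshold]
    within.sort(key=lambda c: c.distance)
    return within[:top_k]


candidates = [
    ScoredContext("Rayleigh scattering explains the blue sky.", 0.21),
    ScoredContext("Sunsets look red because of longer light paths.", 0.38),
    ScoredContext("The ocean reflects the color of the sky.", 0.47),
    ScoredContext("Bread rises because of yeast.", 0.92),
]
for context in select_contexts(candidates, top_k=2, vector_distance_threshold=0.5):
    print(context.distance, context.text)
```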
Project management examples
Tier is a project-level setting available under the RagEngineConfig
resource and impacts RAG corpora using RagManagedDb. To get the tier
configuration, use GetRagEngineConfig. To update the tier configuration,
use UpdateRagEngineConfig.
For more information on managing your tier configuration, see Manage tiers.
Get project configuration
The following code samples demonstrate how to read your RagEngineConfig:
Console
In the Google Cloud console, go to the RAG Engine page.
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine. The Configure RAG Engine pane appears. You can see
the tier that's selected for your RAG Engine.
Click Cancel.
Python
from vertexai import rag
import vertexai

# TODO(developer): Replace the placeholder values
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config = rag.rag_data.get_rag_engine_config(
    name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
)

print(rag_engine_config)
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine. The Configure RAG Engine pane appears.
Select the tier that you want your RAG Engine to run on.
Click Save.
Python
from vertexai import rag
import vertexai

# TODO(developer): Replace the placeholder values
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

# Select the Scaled tier
new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Scaled()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine. The Configure RAG Engine pane appears.
Select the tier that you want your RAG Engine to run on.
Click Save.
Python
from vertexai import rag
import vertexai

# TODO(developer): Replace the placeholder values
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

# Select the Basic tier
new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Basic()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine. The Configure RAG Engine pane appears.
Click Delete RAG Engine. A confirmation dialog appears.
Verify that you're about to delete your data in RAG Engine by typing delete, then
click Confirm.
Click Save.
Python
from vertexai import rag
import vertexai

# TODO(developer): Replace the placeholder values
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

# Select the Unprovisioned tier
new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Unprovisioned()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)
Last updated 2025-08-27 UTC.