Use a Weaviate database with LlamaIndex on Vertex AI for RAG

This page shows you how to connect your LlamaIndex on Vertex AI for RAG corpus to your Weaviate database.

You can use your Weaviate database instance, which is an open source database, with LlamaIndex on Vertex AI for RAG to index and conduct a vector-based similarity search. A similarity search is a way to find pieces of text that are similar to the text that you're looking for, which requires the use of an embedding model. The embedding model produces vector data for each piece of text being compared. The similarity search is used to retrieve semantic contexts for grounding to return the most accurate content from your LLM.
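As a rough illustration of the similarity search described above, the following Python sketch ranks a few text chunks by cosine similarity to a query vector. The three-dimensional vectors and chunk names are made-up stand-ins for real embedding model output, and cosine similarity is only one common metric; this page doesn't state which metric your Weaviate collection uses.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(query_vector, chunk_vectors, top_k=2):
    """Return the names of the top_k chunks, most similar first."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in chunk_vectors.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical toy embeddings; a real embedding model returns far longer vectors.
chunks = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.7, 0.7, 0.0],
    "chunk-c": [0.0, 0.0, 1.0],
}
query = [1.0, 0.1, 0.0]
top = rank_chunks(query, chunks)
print(top)
```

In a RAG setup, the chunks ranked highest in this way are the ones passed to the LLM as grounding context.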

With LlamaIndex on Vertex AI for RAG, you can continue to use your fully-managed vector database instance, which you are responsible for provisioning. LlamaIndex on Vertex AI for RAG uses the vector database for storage, index management, and search.


Consider the following steps before using the Weaviate database:

  1. You must create, configure, and deploy your Weaviate database instance and collection. Follow the instructions in Create your Weaviate collection to set up a collection based on your schema.
  2. You must provide a Weaviate API key, which allows LlamaIndex on Vertex AI for RAG to interact with the Weaviate database. LlamaIndex on Vertex AI for RAG supports the API key-based AuthN and AuthZ, which connects to your Weaviate database and supports an HTTPS connection.
  3. LlamaIndex on Vertex AI for RAG doesn't store and manage your Weaviate API key. Instead, you must do the following:
    1. Store your key in the Google Cloud Secret Manager.
    2. Grant your project's service account permissions to access your secret.
    3. Provide LlamaIndex on Vertex AI for RAG access to your secret's resource name.
    4. When you interact with your Weaviate database, LlamaIndex on Vertex AI for RAG accesses your secret resource using your service account.
  4. The LlamaIndex on Vertex AI for RAG corpus and the Weaviate collection have a one-to-one mapping. RAG files are stored in a Weaviate database collection. When a call is made to the CreateRagCorpus API or the UpdateRagCorpus API, the RAG corpus is associated with the database collection.
  5. In addition to dense embeddings-based semantic searches, the hybrid search is also supported with LlamaIndex on Vertex AI for RAG through a Weaviate database. You can also adjust the weight between dense and sparse vector similarity in a hybrid search.

Provision the Weaviate database

Before using the Weaviate database with LlamaIndex on Vertex AI for RAG, you must do the following:

  1. Configure and deploy your Weaviate database instance.
  2. Prepare the HTTPS endpoint.
  3. Create your Weaviate collection.
  4. Use your API key to provision Weaviate using AuthN and AuthZ.
  5. Provision your LlamaIndex on Vertex AI for RAG service account.

Configure and deploy your Weaviate database instance

To configure and deploy your Weaviate database instance, follow the official Weaviate quickstart guide. Optionally, you can use the Google Cloud Marketplace guide instead.

You can set up your Weaviate instance anywhere, as long as the Weaviate endpoint is accessible from your project. You then fully manage your Weaviate database instance.

Because LlamaIndex on Vertex AI for RAG isn't involved in any stage of your Weaviate database instance lifecycle, it is your responsibility to grant permissions to LlamaIndex on Vertex AI for RAG so it can store and search for data in your Weaviate database. It is also your responsibility to ensure that the data in your database can be used by LlamaIndex on Vertex AI for RAG. For example, if you change your data, LlamaIndex on Vertex AI for RAG isn't responsible for any unexpected behaviors because of those changes.

Prepare the HTTPS endpoint

During Weaviate provisioning, ensure that you create an HTTPS endpoint. Although HTTP connections are supported, we prefer that LlamaIndex on Vertex AI for RAG and Weaviate database traffic use an HTTPS connection.
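Before you register an endpoint, a quick local check can confirm which scheme it uses. The following Python sketch uses only the standard library; the function name and the example URLs are hypothetical.

```python
from urllib.parse import urlparse


def check_endpoint(endpoint):
    """Return the scheme ('https' or 'http') of a Weaviate endpoint URL.

    Raises ValueError if the value isn't an HTTP(S) URL with a host.
    """
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a usable HTTP(S) endpoint: {endpoint!r}")
    return parsed.scheme


# Hypothetical endpoint; replace with the endpoint created during provisioning.
scheme = check_endpoint("https://weaviate.example.com")
if scheme != "https":
    print("Warning: HTTP is supported, but HTTPS is preferred.")
```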

Create your Weaviate collection

Because the LlamaIndex on Vertex AI for RAG corpus and the Weaviate collection have a one-to-one mapping, you must create a collection in your Weaviate database before associating your collection with the LlamaIndex on Vertex AI for RAG corpus. This one-time association is made when you call the CreateRagCorpus API or the UpdateRagCorpus API.

When creating a collection in Weaviate, you must use the following schema:

Property name       Data type
fileId              text
corpusId            text
chunkId             text
chunkDataType       text
chunkData           text
fileOriginalUri     text
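The schema above could be assembled programmatically before it is sent to Weaviate. The following Python sketch builds the payload as a dictionary. It assumes that the collection name goes under a "class" key in the Weaviate schema payload, and it applies the rule, stated later on this page, that the collection name starts with a capital letter. The function name is hypothetical.

```python
import json

# The six properties required by the schema table above.
REQUIRED_PROPERTIES = [
    "fileId",
    "corpusId",
    "chunkId",
    "chunkDataType",
    "chunkData",
    "fileOriginalUri",
]


def build_collection_schema(collection_name):
    """Build a schema payload in which every required property has type text."""
    if not collection_name or not collection_name[0].isupper():
        raise ValueError("collection name must start with a capital letter")
    return {
        "class": collection_name,  # assumed key for the collection name
        "properties": [
            {"name": name, "dataType": ["text"]} for name in REQUIRED_PROPERTIES
        ],
    }


schema = build_collection_schema("MyFakeCollectionName")
print(json.dumps(schema, indent=2))
```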

Use your API key to provision Weaviate using AuthN and AuthZ

Provisioning the Weaviate API key involves the following steps:

  1. Create the Weaviate API key.
  2. Configure Weaviate using your Weaviate API key.
  3. Store your Weaviate API key in Secret Manager.

Create the API key

LlamaIndex on Vertex AI for RAG can only connect to your Weaviate database instances by using your API key for authentication and authorization. You must follow the Weaviate official guide to authentication to configure the API key-based authentication in your Weaviate database instance.

If creating the Weaviate API key requires identity information from LlamaIndex on Vertex AI for RAG, you must first create your first corpus, and then use your LlamaIndex on Vertex AI for RAG service account as the identity.

Store your API key in Secret Manager

An API key holds Sensitive Personally Identifiable Information (SPII), which is subject to legal requirements. If the SPII data is compromised or misused, an individual might experience a significant risk or harm. To minimize risks to an individual while using LlamaIndex on Vertex AI for RAG, don't store and manage your API key, and avoid sharing the unencrypted API key.

To protect SPII, do the following:

  1. Store your API key in Secret Manager.
  2. Grant your LlamaIndex on Vertex AI for RAG service account the permissions to your secret(s), and manage the access control at the secret resource level.
    1. Navigate to your project's permissions.
    2. Enable the option Include Google-provided role grants.
    3. Find the service account, which has the format

      service-{project number}@gcp-sa-{env-}

    4. Edit the service account's principals.
    5. Add the Secret Manager Secret Accessor role to the service account.
  3. During the creation or update of the RAG corpus, pass the secret resource name to LlamaIndex on Vertex AI for RAG, and store the secret resource name.

When you make API requests to your Weaviate database instance(s), LlamaIndex on Vertex AI for RAG uses each service account to read the API key that corresponds to your secret resources in Secret Manager from your project(s).

Provision your LlamaIndex on Vertex AI for RAG service account

When you create the first resource in your project, LlamaIndex on Vertex AI for RAG creates a dedicated service account. You can find your service account from your project's IAM page. The service account follows this format:

service-{project number}@gcp-sa-{env-}

When integrating with the Weaviate database, your service account is used in the following scenarios:

  • You can use your service account to generate your Weaviate API key for authentication. In some cases, generating the API key doesn't require any user information, which means that a service account isn't required when generating the API key.
  • You can bind your service account with the API key in your Weaviate database to configure the authentication (AuthN) and authorization (AuthZ). However, your service account isn't required.
  • You can store the API key in Secret Manager in your project, and you can grant your service account permissions to these secret resources.
  • LlamaIndex on Vertex AI for RAG uses service accounts to access the API key from the Secret Manager in your projects.

Set up your Google Cloud console environment

The setup steps are listed per language. The sign-in, project, API, Cloud Shell, and authentication steps are common to all languages; only the final installation steps differ.

Python

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Vertex AI API.

    Enable the API

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  5. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

  6. Install or update the Vertex AI SDK for Python by running the following command:

    pip3 install --upgrade "google-cloud-aiplatform>=1.38"


Node.js

  Complete the common steps in the first list above (sign in, select or create a project, enable the Vertex AI API, activate Cloud Shell, and create local authentication credentials if you're using a local shell). Then install or update the Vertex AI SDK for Node.js by running the following command:

    npm install @google-cloud/vertexai


Java

  Complete the common steps in the first list above (sign in, select or create a project, enable the Vertex AI API, activate Cloud Shell, and create local authentication credentials if you're using a local shell). Then, to add google-cloud-vertexai as a dependency, add the appropriate code for your environment:

    Maven with BOM

    Add the following XML to your pom.xml:


    Maven without BOM

    Add the following XML to your pom.xml:


    Gradle without BOM

    Add the following to your build.gradle:

    implementation ''


Go

  Complete the common steps in the first list above (sign in, select or create a project, enable the Vertex AI API, activate Cloud Shell, and create local authentication credentials if you're using a local shell). Then do the following:

  1. Review the available Vertex AI API Go packages to determine which package best meets your project's needs:

    • Package vertexai (recommended)

      vertexai is a human-authored package that provides access to common capabilities and features.

      This package is recommended as the starting point for most developers building with the Vertex AI API. To access capabilities and features not yet covered by this package, use the auto-generated aiplatform package instead.

    • Package aiplatform

      aiplatform is an auto-generated package.

      This package is intended for projects that require access to Vertex AI API capabilities and features not yet provided by the human-authored vertexai package.

  2. Install the desired Go package based on your project's needs by running one of the following commands:

    # Human-authored package. Recommended for most developers.
    go get
    # Auto-generated package.
    go get


gcloud CLI

  Complete the sign-in, project, API, and Cloud Shell steps in the first list above. Then do the following:

  1. Configure environment variables by entering the following. Replace PROJECT_ID with the ID of your Google Cloud project.
  2. Provision the endpoint:

    gcloud beta services identity create --project=${PROJECT_ID}

  3. Optional: If you are using Cloud Shell and you are asked to authorize Cloud Shell, click Authorize.

Prepare your Vertex AI RAG corpus

To access data from your Weaviate database, LlamaIndex on Vertex AI for RAG must have access to a RAG corpus. This section provides the steps for creating a single RAG corpus and additional RAG corpora.

Use CreateRagCorpus and UpdateRagCorpus APIs

You must specify the following fields when calling the CreateRagCorpus and UpdateRagCorpus APIs:

  • vector_db.weaviate: The vector database configuration is chosen after you call the CreateRagCorpus API. The vector database configuration contains all configuration fields. If this field isn't set, then vector_db.rag_managed_db is set by default.
  • weaviate.http_endpoint: The HTTPS or HTTP Weaviate endpoint is created during provisioning of the Weaviate database instance.
  • weaviate.collection_name: The name of the collection that is created during the Weaviate instance provisioning. The name must start with a capital letter.
  • api_auth.api_key_config: The configuration specifies to use an API key to authorize your access to the vector database.
  • api_key_config.api_key_secret_version: The resource name of the secret that is stored in Secret Manager, which contains your Weaviate API key.
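The fields above can be assembled into a request body before calling the API. The following Python sketch builds the body as a dictionary and checks two constraints mentioned on this page: the collection name starts with a capital letter, and the secret reference looks like a Secret Manager version resource name. The function name and all example values are hypothetical; the field layout follows the curl sample later on this page.

```python
import json
import re

# Secret version resource names look like
# projects/{project}/secrets/{secret}/versions/{version}.
SECRET_VERSION_RE = re.compile(r"^projects/[^/]+/secrets/[^/]+/versions/[^/]+$")


def build_create_corpus_body(display_name, http_endpoint, collection_name, secret_version):
    """Assemble a CreateRagCorpus request body for a Weaviate-backed corpus."""
    if not collection_name[:1].isupper():
        raise ValueError("collection_name must start with a capital letter")
    if not SECRET_VERSION_RE.match(secret_version):
        raise ValueError("api_key_secret_version is not a secret version resource name")
    return {
        "display_name": display_name,
        "rag_vector_db_config": {
            "weaviate": {
                "http_endpoint": http_endpoint,
                "collection_name": collection_name,
            },
            "api_auth": {
                "api_key_config": {"api_key_secret_version": secret_version}
            },
        },
    }


# Hypothetical values; replace them with your own.
body = build_create_corpus_body(
    display_name="my-weaviate-corpus",
    http_endpoint="https://weaviate.example.com",
    collection_name="MyFakeCollectionName",
    secret_version="projects/my-project/secrets/weaviate-key/versions/latest",
)
print(json.dumps(body, indent=2))
```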

You can create your RAG corpus and associate it with the Weaviate collection in your database instance. However, you might need the service account to generate your API key and to configure your Weaviate database instance. When you create your first RAG corpus, the service account is generated. After you create your first RAG corpus, the association between the Weaviate database and the API key might not yet be ready for use when you create another RAG corpus.

If your database and key aren't ready to be associated with your RAG corpus, do the following:

  1. Set the weaviate field in rag_vector_db_config.

    • You can't change the associated vector database.
    • Leave both the http_endpoint and the collection_name fields empty. Both fields can be updated at a later time.
  2. If you don't have your API key stored in Secret Manager, then you can leave the api_auth field empty. When you call the UpdateRagCorpus API, you can update the api_auth field. Weaviate requires that the following be done:

    1. Set the api_key_config in the api_auth field.
    2. Set the api_key_secret_version of your Weaviate API key in Secret Manager. The api_key_secret_version field uses the following format:

      projects/{project}/secrets/{secret}/versions/{version}

  3. If you specify fields that can only be set one time, like http_endpoint or collection_name, you can't change them unless you delete your RAG corpus, and create your RAG corpus again. Other fields like the API key field, api_key_secret_version, can be updated.

  4. When you call UpdateRagCorpus, you can set the vector_db field, but the vector_db must already have been set to weaviate by your CreateRagCorpus API call. Otherwise, the system chooses the RAG Managed Database option, which is the default, and this choice can't be changed when you call the UpdateRagCorpus API. When you call UpdateRagCorpus with the vector_db field partially set, you can update only the fields that are marked as mutable (also referred to as changeable).

This table lists the WeaviateConfig mutable and immutable fields that are used in your code.

Field name                 Mutable or immutable
http_endpoint              Immutable once set
collection_name            Immutable once set
api_key_authentication     Mutable
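To catch a rejected update before it is sent, the immutability rules in the table can be checked on the client side. The following Python sketch is one way to do that. The function name and the dictionaries are hypothetical, and the service remains the authority on what it accepts.

```python
# Fields from the table above that can't change after they hold a value.
IMMUTABLE_ONCE_SET = ("http_endpoint", "collection_name")


def find_immutable_violations(current, requested):
    """Return the immutable fields that the requested update would change.

    A field that is still empty in the current configuration may be set once,
    so it is reported only if it already has a different value.
    """
    violations = []
    for field in IMMUTABLE_ONCE_SET:
        old_value = current.get(field)
        new_value = requested.get(field)
        if old_value and new_value is not None and new_value != old_value:
            violations.append(field)
    return violations


# Hypothetical corpus state: the endpoint is set, the collection name is not.
current_config = {"http_endpoint": "https://weaviate.example.com", "collection_name": ""}
requested_config = {
    "http_endpoint": "https://other.example.com",
    "collection_name": "MyFakeCollectionName",
}
violations = find_immutable_violations(current_config, requested_config)
print(violations)
```

The API key reference isn't checked here because the table lists it as mutable.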

Create the first RAG corpus

When the LlamaIndex on Vertex AI for RAG service account doesn't exist, do the following:

  1. Create a RAG corpus in LlamaIndex on Vertex AI for RAG with an empty Weaviate configuration, which initiates LlamaIndex on Vertex AI for RAG provisioning to create a service account.
  2. Find your LlamaIndex on Vertex AI for RAG service account, whose name follows this format:

    service-{project number}@gcp-sa-{env-}

  3. Grant your service account access to the secret that is stored in your project's Secret Manager and contains your Weaviate API key.
  4. Get the following information after Weaviate provisioning completes:
    • Your Weaviate HTTPS or HTTP endpoint.
    • The name of your Weaviate collection.
  5. Call the CreateRagCorpus API to create a RAG corpus with an empty Weaviate configuration, and call the UpdateRagCorpus API to update the RAG corpus with the following information:
    • Your Weaviate HTTPS or HTTP endpoint.
    • The name of your Weaviate collection.
    • The API key resource name.

Create another RAG corpus

When the LlamaIndex on Vertex AI for RAG service account exists, do the following:

  1. Get your LlamaIndex on Vertex AI for RAG service account from your project's permissions.
  2. Enable the option Include Google-provided role grants.
  3. Find your LlamaIndex on Vertex AI for RAG service account, whose name follows this format:

    service-{project number}@gcp-sa-{env-}

  4. Grant your service account access to the secret that is stored in your project's Secret Manager and contains your Weaviate API key.
  5. During Weaviate provisioning, get the following information:
    • The Weaviate HTTPS or HTTP endpoint.
    • The name of your Weaviate collection.
  6. Create a RAG corpus in LlamaIndex on Vertex AI for RAG, and connect with your Weaviate collection by doing one of the following:
    1. Make a CreateRagCorpus API call to create a RAG corpus with a populated Weaviate configuration, which is the preferred option.
    2. Make a CreateRagCorpus API call to create a RAG corpus with an empty Weaviate configuration, and make an UpdateRagCorpus API call to update the RAG corpus with the following information:
      • Weaviate database HTTP endpoint
      • Weaviate Collection name
      • API key


This section presents sample code that demonstrates how to set up your Weaviate database, Secret Manager, the RAG corpus, and the RAG file. Sample code is also provided to demonstrate how to import files, to retrieve context, to get content, and to delete the RAG corpus and RAG files.

Set up your Weaviate database

This code sample demonstrates how to set up your Weaviate collection.

# TODO(developer): Update and un-comment below lines.
# The HTTPS/HTTP Weaviate endpoint you created during provisioning.

# Your Weaviate API Key.

# Select your Weaviate collection name, which roughly corresponds to a Vertex AI Knowledge Engine Corpus.
# For example, "MyFakeCollectionName"
# Note that the first letter needs to be capitalized.
# Otherwise, Weaviate will capitalize it for you.

# Create a collection in Weaviate which includes the required schema fields shown below.
echo '{
  "properties": [
    { "name": "fileId", "dataType": [ "text" ] },
    { "name": "corpusId", "dataType": [ "text" ] },
    { "name": "chunkId", "dataType": [ "text" ] },
    { "name": "chunkDataType", "dataType": [ "text" ] },
    { "name": "chunkData", "dataType": [ "text" ] },
    { "name": "fileOriginalUri", "dataType": [ "text" ] }
  ]
}' | curl \
    -X POST \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer ${WEAVIATE_API_KEY}" \
    -d @- \

Set up your Secret Manager

To set up your Secret Manager, you must enable Secret Manager, and set permissions.

Enable your Secret Manager

To enable your Secret Manager, do the following:

  1. Go to the Secret Manager page.

    Go to Secret Manager

  2. Click + Create Secret.
  3. Enter the Name of your secret. Secret names can only contain English letters (A-Z), numbers (0-9), dashes (-), and underscores (_).
  4. Specifying the following fields is optional:
    1. To upload the file with your secret, click Browse.
    2. Read the Replication policy.
    3. If you want to manually manage the locations for your secret, then check Manually manage locations for this secret. At least one region must be selected.
    4. Select your encryption option.
    5. If you want to manually set your rotation period, then check Set rotation period.
    6. If you want to specify Publish or subscribe topic(s) to receive event notifications, click Add topics.
    7. By default, the secret never expires. If you want to set an expiration date, then check Set expiration date.
    8. By default, secret versions are destroyed upon request. To delay the destruction of secret versions, check Set duration for delayed destruction.
    9. If you want to use labels to organize and categorize your secrets, then click + Add label.
    10. If you want to use annotations to attach non-identifying metadata to your secrets, then click + Add annotation.
  5. Click Create secret.

Set permissions

  1. You must grant Secret Manager permissions to your service account.

  2. In the IAM & Admin section of your Google Cloud console, find your service account, and click the pencil icon to edit.

  3. In the Role field, select Secret Manager Secret Accessor.

This code sample demonstrates how to set up your Secret Manager.

# TODO(developer): Update and un-comment below lines.
# Select a resource name for your Secret, which will contain your API Key.

# Create a secret in SecretManager.
curl "${PROJECT_ID}/secrets?secretId=${SECRET_NAME}" \
    --request "POST" \
    --header "authorization: Bearer $(gcloud auth print-access-token)" \
    --header "content-type: application/json" \
    --data "{\"replication\": {\"automatic\": {}}}"

# Your Weaviate API Key.
# Encode your WEAVIATE_API_KEY using base 64, without a trailing newline or line wrapping.
SECRET_DATA=$(printf '%s' "${WEAVIATE_API_KEY}" | base64 | tr -d '\n')

# Create a new version of your secret which uses SECRET_DATA as payload
curl "${PROJECT_ID}/secrets/${SECRET_NAME}:addVersion" \
    --request "POST" \
    --header "authorization: Bearer $(gcloud auth print-access-token)" \
    --header "content-type: application/json" \
    --data "{\"payload\": {\"data\": \"${SECRET_DATA}\"}}"
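The same payload can be prepared in Python, which makes it easy to confirm that the encoded value decodes back to exactly the original key. One detail worth care is trailing whitespace: a newline included in the payload becomes part of the stored secret. This sketch only builds and checks the request body; the function name and the sample key are hypothetical, and no request is sent.

```python
import base64
import json


def build_add_version_body(api_key):
    """Build the JSON body for adding a secret version that holds api_key."""
    encoded = base64.b64encode(api_key.encode("utf-8")).decode("ascii")
    return {"payload": {"data": encoded}}


# Hypothetical key value, for illustration only.
version_body = build_add_version_body("example-weaviate-key")
print(json.dumps(version_body))

# Round trip: the stored bytes match the key exactly, with no trailing newline.
decoded = base64.b64decode(version_body["payload"]["data"]).decode("utf-8")
```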

Use the RAG corpus

This code sample demonstrates how to create a RAG corpus.

# TODO(developer): Update and un-comment below lines.
PROJECT_ID = "your-project-id"
# The HTTPS/HTTP Weaviate endpoint you created during provisioning.

# Your Weaviate collection name, which roughly corresponds to a Vertex AI Knowledge Engine Corpus.
# For example, "MyFakeCollectionName"
# Note that the first letter needs to be capitalized.
# Otherwise, Weaviate will capitalize it for you.

# The Secret Manager resource name of the secret version containing the API Key for your Weaviate endpoint.
# For example, projects/{project}/secrets/{secret}/versions/latest

# Select a Corpus display name.

# Call CreateRagCorpus API and set all Vector DB Config parameters for Weaviate to create a new corpus associated to your selected Weaviate collection.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
${PROJECT_ID}/locations/us-central1/ragCorpora \
-d '{
      "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
      "rag_vector_db_config" : {
              "weaviate": {
                    "http_endpoint": '\""${HTTP_ENDPOINT_NAME}"\"',
                    "collection_name": '\""${WEAVIATE_COLLECTION_NAME}"\"'
              },
              "api_auth" : {
                    "api_key_config": {
                          "api_key_secret_version": '\""${APIKEY_SECRET_VERSION}"\"'
                    }
              }
      }
}'

# TODO(developer): Update and un-comment below lines
# Get operation_id returned in CreateRagCorpus.

# Poll Operation status until done = true in the response.
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
${PROJECT_ID}/locations/us-central1/operations/${OPERATION_ID}

# Call ListRagCorpora API to verify the RAG corpus is created successfully.
curl -sS -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
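The polling step in the sample above can be wrapped in a small loop. The following Python sketch is transport-agnostic: it takes a hypothetical fetch_operation callable that returns the parsed operation JSON, so the HTTP call itself is left to whichever client you use. The interval and attempt limits are arbitrary illustration values.

```python
import time


def poll_until_done(fetch_operation, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_operation until the returned operation reports done = true.

    fetch_operation is a caller-supplied function that returns the operation
    as a dictionary. Raises TimeoutError if the operation never completes.
    """
    for attempt in range(max_attempts):
        operation = fetch_operation()
        if operation.get("done"):
            return operation
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"operation not done after {max_attempts} attempts")


# Demonstration with canned responses instead of real HTTP calls.
fake_responses = iter([{"done": False}, {}, {"done": True, "name": "op-1"}])
result = poll_until_done(lambda: next(fake_responses), sleep=lambda seconds: None)
print(result)
```

Passing the sleep function as a parameter keeps the loop testable without real delays.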

Use the RAG file

The RAG API handles the file upload, import, listing, and deletion.

Retrieve context

The RetrieveContexts RAG API can retrieve the context information related to a query from your uploaded files.

Get the content

Use the Prediction RAG APIs to generate content from a selected model related to a text query.

Hybrid search is supported with Weaviate database, which combines both semantic and keyword searches to improve the relevance of search results. During the retrieval of search results, a combination of similarity scores from semantic (a dense vector) and keyword matching (a sparse vector) produces the final ranked results.
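To build intuition for how the weighting affects ranking, the following Python sketch combines a dense score and a sparse score with a single weight, alpha, where 1 means dense only and 0 means sparse only. This is a deliberately simplified weighted sum with made-up scores; it isn't a description of the fusion algorithm that Weaviate actually applies.

```python
def hybrid_score(dense_score, sparse_score, alpha=0.5):
    """Blend two scores: alpha = 1 is dense only, alpha = 0 is sparse only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in the range [0, 1]")
    return alpha * dense_score + (1.0 - alpha) * sparse_score


def rank(candidates, alpha):
    """Return candidate names ordered by blended score, best first."""
    return sorted(
        candidates,
        key=lambda name: hybrid_score(*candidates[name], alpha),
        reverse=True,
    )


# Hypothetical (dense, sparse) scores for two documents.
candidates = {"doc-a": (0.9, 0.1), "doc-b": (0.4, 0.8)}
balanced = rank(candidates, alpha=0.5)     # keyword match lifts doc-b
dense_heavy = rank(candidates, alpha=0.9)  # semantic match lifts doc-a
print(balanced, dense_heavy)
```

With balanced weighting, the document with the strong keyword match ranks first; shifting the weight toward the dense score reverses the order.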

Hybrid search using the LlamaIndex on Vertex AI for RAG retrieval API

This is an example of how to enable a hybrid search using the LlamaIndex on Vertex AI for RAG retrieval API.

  • PROJECT_ID: Your Google Cloud project ID.
  • RAG_CORPUS_RESOURCE: The full resource name for your RAG corpus in the format of projects/*/locations/us-central1/ragCorpora/*.
  • DISTANCE_THRESHOLD: A threshold set for a vector search distance in the range of [0, 1.0]. The default value is set to 0.3.
  • ALPHA: The alpha value controls the weight between semantic and keyword search results. The range is [0, 1] where 0 is a sparse vector search and 1 is a dense vector search. The default value is 0.5, which balances sparse and dense vector searches.
  • RETRIEVAL_QUERY: Your retrieval query.
  • TOP_K: The number of top k results to be retrieved.
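The parameters above could be assembled into a request body in Python before the call is made. The following sketch validates the documented ranges and uses the documented defaults for the distance threshold and alpha. The function name and the example values are hypothetical, and the field layout mirrors the JSON body shown in this section.

```python
import json


def build_retrieval_body(rag_corpus, query_text, top_k, distance_threshold=0.3, alpha=0.5):
    """Build a hybrid-search retrieval request body as a dictionary."""
    if not 0.0 <= distance_threshold <= 1.0:
        raise ValueError("distance_threshold must be in the range [0, 1.0]")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in the range [0, 1]")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return {
        "vertex_rag_store": {
            "rag_resources": {"rag_corpus": rag_corpus},
            "vector_distance_threshold": distance_threshold,
        },
        "query": {
            "text": query_text,
            "similarity_top_k": top_k,
            "ranking": {"alpha": alpha},
        },
    }


# Hypothetical resource name and query.
retrieval_body = build_retrieval_body(
    rag_corpus="projects/my-project/locations/us-central1/ragCorpora/123",
    query_text="What is hybrid search?",
    top_k=5,
    alpha=0.7,
)
print(json.dumps(retrieval_body, indent=2))
```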

This example demonstrates how to call the HTTP method in a URL.


This code sample demonstrates how to use the request JSON body.

    {
      "vertex_rag_store": {
        "rag_resources": {
            "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"'
        },
        "vector_distance_threshold": ${DISTANCE_THRESHOLD}
      },
      "query": {
        "text": '\""${RETRIEVAL_QUERY}"\"',
        "similarity_top_k": ${TOP_K},
        "ranking": { "alpha" : ${ALPHA}}
      }
    }

Use hybrid search and LlamaIndex on Vertex AI for RAG for grounded generation

This is an example of how to use hybrid search and LlamaIndex on Vertex AI for RAG for grounded generation.

  • PROJECT_ID: Your Google Cloud project ID.
  • RAG_CORPUS_RESOURCE: Your RAG corpus full resource name in the format of projects/*/locations/us-central1/ragCorpora/*.
  • DISTANCE_THRESHOLD: A threshold set for a vector search distance in the range of [0, 1.0]. The default value is set to 0.3.
  • ALPHA: The alpha value controls the weight between semantic and keyword search results. The range is [0, 1] where 0 is a sparse vector search and 1 is a dense vector search. The default value is 0.5, which balances sparse and dense vector searches.
  • INPUT_PROMPT: Your input prompt.
  • TOP_K: The number of top k results to be retrieved.

This example demonstrates how to call the HTTP method in a URL.


This code sample demonstrates how to use the request JSON body.

    {
      "contents": {
        "role": "user",
        "parts": {
          "text": '\""${INPUT_PROMPT}"\"'
        }
      },
      "tools": {
        "retrieval": {
          "vertex_rag_store": {
            "rag_resources": {
                "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"'
            },
            "similarity_top_k": ${TOP_K},
            "vector_distance_threshold": ${DISTANCE_THRESHOLD},
            "ranking": { "alpha" : ${ALPHA}}
          }
        }
      }
    }

What's next