Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see Model versions and lifecycle.

Grounding with Vertex AI Search

This guide shows you how to ground Gemini model responses to your own data by using Vertex AI Search. When you ground a model, you connect it to your own data sources, such as website content or documents. This technique, also known as retrieval-augmented generation (RAG), helps the model provide more accurate and relevant responses by reducing hallucinations.

This guide covers the following topics:

Prerequisites: Set up your environment by enabling APIs and configuring permissions.
Create a data store: Learn how to create a data store from either website content or documents.
Generate grounded responses: Use the API or Vertex AI Studio to get model responses grounded to your data.
Understand your response: Interpret the output, including grounding metadata and confidence scores.

You can ground a model with up to 10 Vertex AI Search data sources. You can also combine this capability with Grounding with Google Search.

The following diagram summarizes the overall workflow:

Supported models

This section lists the models that support grounding with your data.

Gemini 2.5 Flash-Lite
Gemini 2.5 Flash with Live API native audio (Preview)
Gemini 2.0 Flash with Live API (Preview)
Gemini 2.5 Pro
Gemini 2.5 Flash
Gemini 2.0 Flash

Prerequisites

Before you ground model responses with your data, complete the following setup steps:

Configure permissions: In the Google Cloud console, go to the IAM page and make sure that your project has the discoveryengine.servingConfigs.search permission. This permission is required for the grounding service to work.
Enable the API: Enable AI Applications and activate the API.
Create a data store: Create an AI Applications data store to hold your grounding data.

For more information, see the Introduction to Vertex AI Search.

Enable AI Applications

In the Google Cloud console, go to the AI Applications page.
Read and agree to the terms of service, then click Continue and activate the API.

Important: You must accept the discovery solutions data use terms for every project that you want to use AI Applications with.

AI Applications is available in the global location and in the eu and us multi-regions. To learn more, see AI Applications locations.

Create a data store

To create a data store in AI Applications, you can use either website data or documents from Cloud Storage.

Option	Description	Use Case
Website data	Creates a data store by indexing content directly from specified website URLs.	When your knowledge base is publicly available on one or more websites and you want to keep it synchronized with the live content.
Documents in Cloud Storage	Creates a data store from unstructured documents (like PDF, HTML, TXT) stored in a Cloud Storage bucket.	When your knowledge base consists of internal or proprietary documents that are not publicly hosted on a website.

Website

In the Google Cloud console, go to the Create Data Store page.
On the Website Content card, click Select.
Select the Advanced website indexing checkbox.
In the Sites to include field, add the URLs to index. You can optionally add URLs to the Sites to exclude field.
Click Continue.
On the Configure your data store page, complete the following:
1. Select a Location for your data store.
2. Enter a Data store name. The data store ID is automatically generated from this name. You will use this ID later.
Click Create.

Documents

In the Google Cloud console, go to the Create Data Store page.
On the Cloud Storage card, click Select.
In the Import data from Cloud Storage pane, select Unstructured documents.
Select a Synchronization frequency.
Enter the Cloud Storage path to the folder or file you want to import.
Click Continue.
On the Configure your data store page, complete the following:
1. Select a Location for your data store.
2. Enter a Data store name. The data store ID is automatically generated.
3. Optional: To configure parsing and chunking, expand the Document Processing Options section. For more information, see Parse documents.
Click Create.

Generate grounded responses with your data store

You can ground a model with your data by using up to 10 data stores. The following instructions show you how to generate a grounded response.

If you don't know your data store ID, follow these steps:

In the Google Cloud console, go to the AI Applications page and in the navigation menu, click Data stores.
Click the name of your data store.
On the Data page for your data store, get the data store ID.

Console

To ground your model output to AI Applications by using Vertex AI Studio in the Google Cloud console, follow these steps:

In the Google Cloud console, go to the Vertex AI Studio Freeform page.
To turn on grounding, click the Grounding: your data toggle.
Click Customize.
1. Select Vertex AI Search as your source.
2. Enter the path to your data store in the following format, replacing project_id with your project ID and data_store_id with the ID of your data store:

  projects/project_id/locations/global/collections/default_collection/dataStores/data_store_id
Click Save.
Enter your prompt in the text box, and click Submit.

Responses to your prompts are now grounded in your AI Applications data store.
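The data store path used above follows a fixed pattern, so it can be assembled from the project ID and the data store ID. The following is a minimal sketch of that assembly, not part of any Google library: the function name datastore_path, its location parameter, and the slash check are illustrative assumptions.

```python
# Hypothetical helper: builds the resource path in the format shown in this guide.
# The function name and the validation rule are assumptions for illustration.
def datastore_path(project_id: str, data_store_id: str, location: str = "global") -> str:
    """Return the full resource path for a Vertex AI Search data store."""
    for label, value in (("project_id", project_id), ("data_store_id", data_store_id)):
        # Reject empty values and values that would break the path structure.
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty ID without slashes: {value!r}")
    return (
        f"projects/{project_id}/locations/{location}"
        f"/collections/default_collection/dataStores/{data_store_id}"
    )


# Example with placeholder IDs.
print(datastore_path("my-project", "my-data-store"))
```

You can paste the printed path into the Vertex AI Studio field or pass it as the datastore value in an API request.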

Python

Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google import genai
from google.genai.types import (
    GenerateContentConfig,
    HttpOptions,
    Retrieval,
    Tool,
    VertexAISearch,
)

client = genai.Client(http_options=HttpOptions(api_version="v1"))

# Full resource path of your Vertex AI Search data store.
# Replace the project number and data store ID with your own values.
datastore = "projects/111111111111/locations/global/collections/default_collection/dataStores/data-store-id"

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="How do I make an appointment to renew my driver's license?",
    config=GenerateContentConfig(
        tools=[
            # Use Vertex AI Search Tool
            Tool(
                retrieval=Retrieval(
                    vertex_ai_search=VertexAISearch(
                        datastore=datastore,
                    )
                )
            )
        ],
    ),
)

print(response.text)
# Example response:
# 'The process for making an appointment to renew your driver's license varies depending on your location. To provide you with the most accurate instructions...'

REST

To test a text prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

LOCATION: The region to process the request.
PROJECT_ID: Your project ID.
MODEL_ID: The model ID of the multimodal model.
TEXT: The text instructions to include in the prompt.
DATA_STORE_ID: The ID of your Vertex AI Search data store.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent

Request JSON body:

{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "TEXT"
    }]
  }],
  "tools": [{
    "retrieval": {
      "vertexAiSearch": {
        "datastore": "projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID"
      }
    }
  }],
  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID"
}
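If you prefer to generate the request instead of editing the placeholders by hand, the URL and body above can be built programmatically. The following sketch uses only the Python standard library and mirrors the structure shown in this section. The function name build_grounded_request and the example values (including the us-central1 region) are assumptions for illustration.

```python
import json


# Hypothetical helper: mirrors the REST URL and JSON body shown in this guide.
def build_grounded_request(project_id, location, model_id, data_store_id, text):
    """Return (url, body) for a generateContent call grounded on one data store."""
    datastore = (
        f"projects/{project_id}/locations/global"
        f"/collections/default_collection/dataStores/{data_store_id}"
    )
    model = (
        f"projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}"
    )
    url = f"https://{location}-aiplatform.googleapis.com/v1beta1/{model}:generateContent"
    body = {
        "contents": [{"role": "user", "parts": [{"text": text}]}],
        "tools": [{"retrieval": {"vertexAiSearch": {"datastore": datastore}}}],
        "model": model,
    }
    return url, body


# Example with placeholder values (assumed, not taken from a real project).
url, body = build_grounded_request(
    "my-project",
    "us-central1",
    "gemini-2.5-flash",
    "my-data-store",
    "How do I make an appointment to renew my driver's license?",
)
print(url)
# Serializing with json.dumps keeps every string value correctly quoted.
print(json.dumps(body, indent=2))
```

You can save the printed JSON as request.json and send it with the commands that follow.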

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "You can make an appointment on the website https://dmv.gov/"
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        "..."
      ],
      "groundingMetadata": {
        "retrievalQueries": [
          "How to make appointment to renew driving license?"
        ],
        "groundingChunks": [
          {
            "retrievedContext": {
              "uri": "https://vertexaisearch.cloud.google.com/grounding-api-redirect/AXiHM.....QTN92V5ePQ==",
              "title": "dmv"
            }
          }
        ],
        "groundingSupport": [
          {
            "segment": {
              "startIndex": 25,
              "endIndex": 147
            },
            "segment_text": "ipsum lorem ...",
            "supportChunkIndices": [1, 2],
            "confidenceScore": [0.9541752, 0.97726375]
          },
          {
            "segment": {
              "startIndex": 294,
              "endIndex": 439
            },
            "segment_text": "ipsum lorem ...",
            "supportChunkIndices": [1],
            "confidenceScore": [0.9541752, 0.9325467]
          }
        ]
      }
    }
  ],
  "usageMetadata": {
    "..."
  }
}

Understand your response

The API response includes the LLM-generated text and, if grounding is successful, grounding metadata. Grounding metadata identifies which parts of the response were derived from your data source.

Grounding metadata might not be provided if the source has low relevance or if the information in the model's response is incomplete.

The response contains the following fields:

Role: Indicates the sender of the answer. For a grounded response, the role is always model.
Text: The grounded answer generated by the LLM.
Grounding metadata: Contains information about the grounding source. This object includes the following:
- Grounding chunks: A list of results from your data store that support the answer.
- Grounding supports: Information about a specific claim within the answer that can be used for citations. This includes:
  - Segment: The part of the model's answer that is substantiated by a grounding chunk.
  - Grounding chunk index: The index of the grounding chunk in the grounding_chunks list that corresponds to this claim.
  - Confidence scores: A value from 0 to 1 that indicates how grounded the claim is in the provided grounding chunks. This is not available for Gemini 2.5 and later.
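To turn grounding metadata into citations, you can walk the grounding supports and look up each referenced chunk. The following is a minimal sketch that operates on the parsed JSON response as a Python dictionary. It assumes the key names used in the sample response in this guide (groundingSupport, supportChunkIndices, confidenceScore) and zero-based chunk indices. The keys in a real response might differ, so check them against your actual output. The function name extract_citations is hypothetical.

```python
# Hypothetical helper: key names follow the sample response in this guide
# and are assumptions; verify them against a real API response.
def extract_citations(response: dict) -> list[dict]:
    """Map each grounded segment to the data store results that support it."""
    citations = []
    for candidate in response.get("candidates", []):
        # Grounding metadata can be absent, for example when relevance is low.
        metadata = candidate.get("groundingMetadata") or {}
        chunks = metadata.get("groundingChunks", [])
        for support in metadata.get("groundingSupport", []):
            segment = support.get("segment", {})
            sources = []
            for index in support.get("supportChunkIndices", []):
                # Skip indices that do not point at a returned chunk.
                if 0 <= index < len(chunks):
                    context = chunks[index].get("retrievedContext", {})
                    sources.append(
                        {"title": context.get("title"), "uri": context.get("uri")}
                    )
            citations.append(
                {
                    "start": segment.get("startIndex"),
                    "end": segment.get("endIndex"),
                    "sources": sources,
                    "scores": support.get("confidenceScore", []),
                }
            )
    return citations


# Example with made-up data shaped like the sample response.
sample = {
    "candidates": [
        {
            "groundingMetadata": {
                "groundingChunks": [
                    {"retrievedContext": {"uri": "https://example.com/dmv", "title": "dmv"}}
                ],
                "groundingSupport": [
                    {
                        "segment": {"startIndex": 0, "endIndex": 20},
                        "supportChunkIndices": [0],
                        "confidenceScore": [0.95],
                    }
                ],
            }
        }
    ]
}
for citation in extract_citations(sample):
    print(citation["start"], citation["end"], citation["sources"], citation["scores"])
```

Because the function returns an empty list when grounding metadata is missing, callers can treat an ungrounded response the same way as a response with no citations.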

What's next

To learn how to send chat prompt requests, see Multiturn chat.
To learn about responsible AI best practices and Vertex AI's safety filters, see Safety best practices.
