Grounding Gemini to your data
This page explains how to ground responses by using your data from Vertex AI Search. To do retrieval-augmented generation (RAG) by connecting your model to your website data or your sets of documents, use Grounding with Vertex AI Search. Grounding to your data supports a maximum of 10 Vertex AI Search data sources and can be combined with Grounding with Google Search.
Supported models
This section lists the models that support grounding with your data.
Prerequisites
Before you can ground model output to your data, do the following:
- In the Google Cloud console, go to the IAM page, and search for the discoveryengine.servingConfigs.search permission, which is required for the grounding service to work.
- Enable AI Applications and activate the API.
- Create an AI Applications data source and application.
See the Introduction to Vertex AI Search for more information.
Enable AI Applications
- In the Google Cloud console, go to the AI Applications page.
- Read and agree to the terms of service, then click Continue and activate the API.
AI Applications is available in the global location, or the eu and us multi-region. To learn more, see AI Applications locations.
Create a data store in AI Applications
To create a data store in AI Applications, you can choose to ground with website data or documents.
Website
- Open the Create Data Store page from the Google Cloud console.
- In the Website Content box, click Select. The Specify the websites for your data store pane displays.
- If Advanced website indexing isn't checked, then select the Advanced website indexing checkbox to turn it on.
- In the Specify URL patterns to index section, enter the URL patterns to index.
- Click Continue. The Configure your data store pane displays.
- In the Configure your data store pane, finish configuring your data store.
Documents
- Open the Create Data Store page from the Google Cloud console.
- In the Cloud Storage box, click Select. The Import data from Cloud Storage pane displays.
- In the Unstructured documents (PDF, HTML, TXT and more) section, select Unstructured documents (PDF, HTML, TXT and more).
- Select a Synchronization frequency option.
- Select a Select a folder or a file you want to import option, and enter the path in the field.
- Click Continue. The Configure your data store pane displays.
- In the Configure your data store pane, click Create.
Generate grounded responses with your data store
Use the following instructions to ground a model with your data. A maximum of 10 data stores is supported.
If you don't know your data store ID, follow these steps:
- In the Google Cloud console, go to the AI Applications page and in the navigation menu, click Data stores.
- Click the name of your data store.
- On the Data page for your data store, get the data store ID.
Console
To ground your model output to AI Applications, use Vertex AI Studio in the Google Cloud console. Enter your data store path in the following format: projects/project_id/locations/global/collections/default_collection/dataStores/data_store_id. Your prompt responses are then grounded to AI Applications.
Python
Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
To test a text prompt by using the Vertex AI API, send a POST request to the publisher model endpoint. Before using any of the request data, replace the uppercase placeholders (PROJECT_ID, LOCATION, MODEL_ID, TEXT, and DATA_STORE_ID) with your own values. The HTTP method and URL, the request JSON body, and a sample JSON response follow, in that order.
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent
{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "TEXT"
    }]
  }],
  "tools": [{
    "retrieval": {
      "vertexAiSearch": {
        "datastore": "projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID"
      }
    }
  }],
  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID"
}
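The request above can also be assembled programmatically. The following is a minimal sketch that uses only the Python standard library to build the URL and JSON body shown in this section. The function names `data_store_path` and `build_request`, and the example argument values, are hypothetical and chosen for illustration; the sketch only builds the request and does not send it or handle authentication.

```python
import json


def data_store_path(project_id: str, data_store_id: str) -> str:
    """Return the data store path in the format used by the request body."""
    return (
        f"projects/{project_id}/locations/global"
        f"/collections/default_collection/dataStores/{data_store_id}"
    )


def build_request(project_id, location, model_id, data_store_id, text):
    """Build the generateContent URL and JSON body (hypothetical helper)."""
    model = (
        f"projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}"
    )
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"{model}:generateContent"
    )
    body = {
        "contents": [{"role": "user", "parts": [{"text": text}]}],
        "tools": [{
            "retrieval": {
                "vertexAiSearch": {
                    # The value must be a JSON string, so it is quoted on output.
                    "datastore": data_store_path(project_id, data_store_id)
                }
            }
        }],
        "model": model,
    }
    return url, body


if __name__ == "__main__":
    # All argument values here are placeholders, not real resources.
    url, body = build_request(
        "my-project", "us-central1", "my-model-id", "my-data-store",
        "How do I renew my driving license?",
    )
    print("POST", url)
    print(json.dumps(body, indent=2))
```

Building the body as a dictionary and serializing it with `json.dumps` avoids quoting mistakes in the `datastore` value.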
{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "You can make an appointment on the website https://dmv.gov/"
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        "..."
      ],
      "groundingMetadata": {
        "retrievalQueries": [
          "How to make appointment to renew driving license?"
        ],
        "groundingChunks": [
          {
            "retrievedContext": {
              "uri": "https://vertexaisearch.cloud.google.com/grounding-api-redirect/AXiHM.....QTN92V5ePQ==",
              "title": "dmv"
            }
          }
        ],
        "groundingSupport": [
          {
            "segment": {
              "startIndex": 25,
              "endIndex": 147
            },
            "segment_text": "ipsum lorem ...",
            "supportChunkIndices": [1, 2],
            "confidenceScore": [0.9541752, 0.97726375]
          },
          {
            "segment": {
              "startIndex": 294,
              "endIndex": 439
            },
            "segment_text": "ipsum lorem ...",
            "supportChunkIndices": [1],
            "confidenceScore": [0.9541752, 0.9325467]
          }
        ]
      }
    }
  ],
  "usageMetadata": {
    "..."
  }
}
Understand your response
The response from both APIs includes the LLM-generated text, which is called a candidate. If your model prompt successfully grounds to your data source, then the response includes grounding metadata, which identifies the parts of the response that were derived from your data. However, this metadata might not be provided for several reasons, in which case the prompt response isn't grounded. These reasons include low source relevance or incomplete information within the model's response.
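Because grounding metadata can be absent, it helps to check for it before you display citations. The following is a minimal sketch that assumes the response has already been parsed into a Python dictionary with the field names shown in the sample response. The function name `split_candidate` is hypothetical, and treating metadata without any grounding chunks as ungrounded is an assumption of this sketch, not documented behavior.

```python
def split_candidate(response: dict):
    """Return (text, grounding_metadata_or_None) for the first candidate."""
    candidates = response.get("candidates") or []
    if not candidates:
        return "", None
    candidate = candidates[0]
    parts = candidate.get("content", {}).get("parts", [])
    # Join the text of all parts; parts without text contribute nothing.
    text = "".join(part.get("text", "") for part in parts)
    metadata = candidate.get("groundingMetadata")
    # Assumption: metadata with no grounding chunks counts as ungrounded.
    if not metadata or not metadata.get("groundingChunks"):
        return text, None
    return text, metadata


if __name__ == "__main__":
    sample = {
        "candidates": [{
            "content": {"role": "model", "parts": [{"text": "Example answer"}]}
        }]
    }
    text, metadata = split_candidate(sample)
    print(text, "(grounded)" if metadata else "(not grounded)")
```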
The following is a breakdown of the output data:
- Role: Indicates the sender of the grounded answer. Because the response always contains grounded text, the role is always model.
- Text: The grounded answer generated by the LLM.
- Grounding metadata: Information about the grounding source, which contains the following elements:
  - Grounding chunks: A list of results from your index that support the answer.
  - Grounding supports: Information about a specific claim within the answer that can be used to show citations:
    - Segment: The part of the model's answer that is substantiated by a grounding chunk.
    - Grounding chunk index: The index of the grounding chunk in the grounding chunks list that corresponds to this claim.
    - Confidence scores: A number from 0 to 1 that indicates how grounded the claim is in the provided set of grounding chunks. Not available for Gemini 2.5 and later.
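To show citations, each grounding support can be paired with the grounding chunks it points to. The following is a minimal sketch that assumes the grounding metadata is a Python dictionary using the field names from the sample response on this page (`groundingSupport`, `supportChunkIndices`, `confidenceScore`); the field names returned by your API version might differ. The function name `list_citations` is hypothetical. Indices that fall outside the chunk list are skipped, and scores default to an empty list because confidence scores aren't available for every model.

```python
def list_citations(metadata: dict) -> list:
    """Pair each supported segment with the chunks that substantiate it."""
    chunks = metadata.get("groundingChunks", [])
    citations = []
    for support in metadata.get("groundingSupport", []):
        segment = support.get("segment", {})
        sources = []
        for index in support.get("supportChunkIndices", []):
            # Skip indices that don't resolve to a chunk in the list.
            if 0 <= index < len(chunks):
                context = chunks[index].get("retrievedContext", {})
                sources.append({
                    "title": context.get("title"),
                    "uri": context.get("uri"),
                })
        citations.append({
            "start": segment.get("startIndex"),
            "end": segment.get("endIndex"),
            "sources": sources,
            # Might be empty, for example when scores aren't returned.
            "scores": support.get("confidenceScore", []),
        })
    return citations


if __name__ == "__main__":
    example = {
        "groundingChunks": [
            {"retrievedContext": {"uri": "https://example.com/a", "title": "a"}}
        ],
        "groundingSupport": [
            {"segment": {"startIndex": 0, "endIndex": 10},
             "supportChunkIndices": [0]}
        ],
    }
    for citation in list_citations(example):
        print(citation)
```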
What's next
- To learn how to send chat prompt requests, see Multiturn chat.
- To learn about responsible AI best practices and Vertex AI's safety filters, see Safety best practices.