This page describes pricing and billing for Vertex AI RAG Engine, based on the components that you use, such as models, reranking, and vector storage.
For more information, see the Vertex AI RAG Engine overview page.
Pricing and billing
Vertex AI RAG Engine is free to use. However, the components that you configure with Vertex AI RAG Engine might incur charges.
The following table explains how billing works for each RAG component.
Component | How billing works with Vertex AI RAG Engine |
---|---|
Data ingestion | Vertex AI RAG Engine supports ingesting data from different data sources, such as local file uploads, Cloud Storage, and Google Drive. Accessing files in these data sources from Vertex AI RAG Engine is free, but the data sources themselves might charge for data transfer, such as data egress costs. |
Data transformation (file parsing) |
|
Data transformation (file chunking) | Vertex AI RAG Engine supports fixed-size chunking, which is free. |
Embedding generation | Vertex AI RAG Engine orchestrates the embedding generation using the embedding model that you specified, and your project is billed for the costs associated with that model. For more pricing information, see Cost of building and deploying AI models in Vertex AI. |
Data indexing and retrieval |
Vertex AI RAG Engine supports two categories of vector databases for vector search:
A RAG-managed database has two purposes:
A RAG-managed database uses a Spanner instance as the backend. For each of your projects, Vertex AI RAG Engine provisions a customer-specific Google Cloud project and manages RAG-managed resources that are stored in Vertex AI RAG Engine, so that your data is physically isolated. If you choose the
If any RAG corpus in your project uses a RAG-managed database for vector search, you are charged for the RAG-managed Spanner instance. Vertex AI RAG Engine surfaces the Spanner costs from the corresponding RAG-managed project in your Google Cloud project, so that you can see and pay the Spanner instance costs. For more pricing details, see Spanner pricing. |
Reranking for Vertex AI RAG Engine | The following ranking tools are supported after retrieval:
|
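The table notes that fixed-size chunking is free while embedding generation is billed at the rate of the embedding model that you specify. To get a feel for how chunking settings affect the volume sent to the embedding model, the following is a minimal sketch, not the Vertex AI RAG Engine implementation. The function names, the word-based unit, and the `price_per_1k_units` parameter are all assumptions for illustration; actual billing units and rates come from the pricing page for your chosen model.

```python
from typing import List


def fixed_size_chunks(
    words: List[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Split a list of words into fixed-size chunks with overlap.

    Hypothetical helper: it works on words for simplicity and is only
    an illustration of fixed-size chunking in general.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks: List[List[str]] = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        # Stop once a chunk reaches the end of the input.
        if start + chunk_size >= len(words):
            break
    return chunks


def estimate_embedding_cost(
    chunks: List[List[str]], price_per_1k_units: float
) -> float:
    """Estimate embedding cost from the total units across all chunks.

    price_per_1k_units is a placeholder that you supply; it is not a
    real published rate.
    """
    total_units = sum(len(chunk) for chunk in chunks)
    return total_units / 1000 * price_per_1k_units
```

Because overlapping chunks repeat some content, a larger overlap increases the total volume that is embedded, and therefore the embedding cost, even though the chunking step itself is free.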
What's next
- To learn how to use the Vertex AI SDK to run Vertex AI RAG Engine tasks, see RAG quickstart for Python.
- To learn about grounding, see Grounding overview.
- To learn more about the responses from RAG, see Retrieval and Generation Output of Vertex AI RAG Engine.
- To learn about the RAG architecture: