This page introduces RagManagedDb, its underlying technology, and how RagManagedDb is used in Vertex AI RAG Engine. It also describes the tiers that are available to tune performance, which might affect your costs, and provides instructions for deleting your Vertex AI RAG Engine data, which stops billing.
Overview
Vertex AI RAG Engine uses RagManagedDb, an enterprise-ready, fully managed Google Spanner instance. Vertex AI RAG Engine uses it for resource storage, and you can optionally choose it as the vector database for your RAG corpora.
Through Spanner, Vertex AI RAG Engine offers a consistent, highly available, and highly scalable database to support your application. To learn more about Google Spanner, see Spanner.
Vertex AI RAG Engine stores your RAG corpus and RAG file resource metadata in RagManagedDb, regardless of your choice of vector database. Vector databases are only used for storage and retrieval of embeddings. In addition to resource storage, RagManagedDb can also be used to store and manage vector representations of your documents. The vector database is then used to retrieve relevant documents based on each document's semantic similarity to a given query.
Manage tiers
Vertex AI RAG Engine lets you scale your RagManagedDb instance based on your usage and performance requirements by choosing between two tiers. A third tier lets you delete your Vertex AI RAG Engine data.
The tier is a project-level setting in the RagEngineConfig resource that affects RAG corpora that use RagManagedDb. The following tiers are available in RagEngineConfig:
Scaled tier: This tier offers production-scale performance along with autoscaling functionality. It's suitable for customers with large amounts of data or performance-sensitive workloads. Internally, this tier sets the Spanner instance to an autoscaling configuration with a minimum of 1 node (1,000 processing units) and a maximum of 10 nodes (10,000 processing units).
Basic tier (default): This tier is a cost-effective, low-compute option that might be suitable for the following cases:
- Experimenting with RagManagedDb.
- Small data size.
- Latency-insensitive workloads.
- Using Vertex AI RAG Engine only with other vector databases.
To offer the Basic tier, RagManagedDb sets the underlying Spanner instance to a fixed configuration of 100 processing units, which is equivalent to 0.1 nodes.
Unprovisioned tier: This tier deletes the RagManagedDb and its underlying Spanner instance. The Unprovisioned tier disables the Vertex AI RAG Engine service and deletes your data held within this service regardless of the vector database used for your RagCorpora. This stops the billing of the service. For more information on billing, see Vertex AI RAG Engine billing.
After the data is deleted, the data can't be recovered. To start using Vertex AI RAG Engine again, you must update the tier by calling the UpdateRagEngineConfig API.
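The capacity figures quoted for the Basic and Scaled tiers can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of any SDK: the dictionary and the `tier_nodes` helper are hypothetical names introduced here, and the only facts used are the processing-unit figures stated above (1 node corresponds to 1,000 processing units).

```python
# Spanner sizing as stated in the tier descriptions: 1 node = 1,000 processing units.
PROCESSING_UNITS_PER_NODE = 1000

# (minimum, maximum) processing units per tier, taken from the text above.
# The dictionary layout is illustrative only and not an API structure.
TIER_PROCESSING_UNITS = {
    "basic": (100, 100),       # fixed configuration
    "scaled": (1000, 10000),   # autoscaling range
}


def tier_nodes(tier: str) -> tuple[float, float]:
    """Return the (minimum, maximum) node count for a tier name."""
    low, high = TIER_PROCESSING_UNITS[tier]
    return (low / PROCESSING_UNITS_PER_NODE, high / PROCESSING_UNITS_PER_NODE)


if __name__ == "__main__":
    for name in TIER_PROCESSING_UNITS:
        print(name, tier_nodes(name))
```

The Unprovisioned tier is left out on purpose, because it removes the instance rather than sizing it.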
Get the project configuration
The following code samples demonstrate how to use the GetRagEngineConfig API for each type of tier:
Version 1 (v1) API code samples.
v1beta1 API code samples.
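As a rough illustration of a GetRagEngineConfig call over REST, the sketch below only builds the request and does not send it. The regional endpoint pattern, the `ragEngineConfig` resource path, and the helper name `build_get_request` are assumptions made for this example, so verify them against the API reference before relying on them. The access token is a placeholder that you supply yourself.

```python
import urllib.request


def build_get_request(project_id: str, location: str, access_token: str,
                      api_version: str = "v1") -> urllib.request.Request:
    """Build (but do not send) a GET request for the project's RagEngineConfig.

    The URL layout is an assumption based on the usual Vertex AI regional
    endpoint pattern; it is not confirmed by this page.
    """
    # Assumed resource name of the project-level configuration.
    name = f"projects/{project_id}/locations/{location}/ragEngineConfig"
    url = f"https://{location}-aiplatform.googleapis.com/{api_version}/{name}"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {access_token}"},
    )


if __name__ == "__main__":
    # Placeholder values; replace them with your own project, region, and token.
    request = build_get_request("my-project", "us-central1", "TOKEN")
    print(request.get_method(), request.full_url)
```

Passing `api_version="v1beta1"` targets the v1beta1 surface under the same assumed path. To send the request, pass it to `urllib.request.urlopen` with a valid token.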
Update the project configuration
The following code samples demonstrate how to use the UpdateRagEngineConfig API for each type of tier:
Version 1 (v1) API code samples.
v1beta1 API code samples.
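An UpdateRagEngineConfig call over REST might look like the sketch below, which builds a PATCH request without sending it. The endpoint pattern, the `ragManagedDbConfig` field, the per-tier keys (`basic`, `scaled`, `unprovisioned`), and the helper name `build_update_request` are all assumptions for illustration, so confirm them in the API reference. Keep in mind that selecting the Unprovisioned tier deletes your data permanently.

```python
import json
import urllib.request

# Tier keys assumed to be accepted in the request body (not confirmed here).
ASSUMED_TIER_KEYS = ("basic", "scaled", "unprovisioned")


def build_update_request(project_id: str, location: str, tier: str,
                         access_token: str,
                         api_version: str = "v1") -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets the RagManagedDb tier."""
    if tier not in ASSUMED_TIER_KEYS:
        raise ValueError(f"unknown tier: {tier!r}")
    # Assumed resource name and endpoint layout, as in a typical Vertex AI call.
    name = f"projects/{project_id}/locations/{location}/ragEngineConfig"
    url = f"https://{location}-aiplatform.googleapis.com/{api_version}/{name}"
    # Assumed body shape: the chosen tier is an empty object under its key.
    body = {"ragManagedDbConfig": {tier: {}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; replace them with your own project, region, and token.
    request = build_update_request("my-project", "us-central1", "scaled", "TOKEN")
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Rejecting unknown tier names before building the request avoids sending a malformed body to the service.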
What's next
- To learn how to use the RAG API v1, the default, see RAG API v1.
- To learn how to use the RAG API v1beta1, see RAG API v1beta1.
- To learn more about RagManagedDb and how to manage your tier configuration as well as the RAG corpus-level retrieval strategy, see Use RagManagedDb with Vertex AI RAG Engine.