Before you work with Vertex AI Agent Engine Memory Bank, you must set up your environment. Note that although Memory Bank is part of Agent Engine, you don't need to deploy your code to Agent Engine Runtime to use Memory Bank.
Set up your Google Cloud project
Every project can be identified in two ways: the project number or the project ID. The PROJECT_NUMBER is automatically created when you create the project, whereas the PROJECT_ID is created by you, or whoever created the project. To set up a project:
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Vertex AI API.
Get the required roles
To get the permissions that you need to use Vertex AI Agent Engine, ask your administrator to grant you the Vertex AI User (roles/aiplatform.user) IAM role on your project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
If you're making requests to Memory Bank from an agent deployed on Google Kubernetes Engine or Cloud Run, make sure that your service account has the necessary permissions. The Reasoning Engine Service Agent already has the necessary permissions to read and write memories, so outbound requests from Agent Engine Runtime should already have permission to access Memory Bank.
Set up your environment
This section assumes that you have set up a Python development environment, or are using a runtime with a Python development environment (such as Colab).
Install libraries
Install the Vertex AI SDK:
pip install "google-cloud-aiplatform>=1.104.0"
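To confirm that the installed SDK meets the minimum version before running the later snippets, a check along these lines may help. This is a minimal sketch: the helpers `parse_version`, `meets_minimum`, and `installed_sdk_version` are illustrative names, not part of the Vertex AI SDK, and the parser assumes plain dotted numeric versions such as 1.104.0 (pre-release suffixes are ignored rather than compared).

```python
from importlib import metadata

MIN_VERSION = "1.104.0"  # Minimum version taken from the install command.


def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '1.104.0' into a tuple of ints.

    Hypothetical helper: parsing stops at the first component that is
    not purely numeric, so suffixes like 'rc1' are dropped.
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_VERSION) -> bool:
    """Compare two versions after padding them to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_sdk_version():
    """Return the installed google-cloud-aiplatform version, or None."""
    try:
        return metadata.version("google-cloud-aiplatform")
    except metadata.PackageNotFoundError:
        return None
```

For example, `meets_minimum(installed_sdk_version() or "0")` evaluates to False when the package is missing or older than the minimum.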
Authentication
Authentication instructions depend on whether you're using Vertex AI in express mode:
- If you're not using Vertex AI in express mode, follow the instructions at Authenticate to Vertex AI.
- If you're using Vertex AI in express mode, set up authentication by setting the API key in the environment:
import os

os.environ["GOOGLE_API_KEY"] = "API_KEY"
Set up a Vertex AI SDK client
Run the following code to set up a Vertex AI SDK client:
import vertexai

client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION",
)
where:
- PROJECT_ID is your project ID.
- LOCATION is one of the supported regions for Memory Bank.
Configure your Agent Engine instance
To get started with Memory Bank, you first need an Agent Engine instance.
You can do one of the following:
Create or update an Agent Engine instance: If you create or update an instance, you can override Agent Engine's defaults to make the following modifications to the instance:
- Set the configuration for how Memory Bank generates and manages memories.
- Deploy your agent to Agent Engine Runtime.
Use an existing instance
If you don't need to modify an existing Agent Engine instance, run the following to configure the instance for Memory Bank:
agent_engine = client.agent_engines.get(name="AGENT_ENGINE_NAME")
Replace the following:
- AGENT_ENGINE_NAME: The name of the Agent Engine. It should be in the format projects/.../locations/.../reasoningEngines/... (see the supported regions for Memory Bank).
Create or update an instance
When you create or update an Agent Engine instance, you can override Agent Engine's defaults to make the following modifications:
- Set the configuration for how Memory Bank generates and manages memories.
- Deploy your agent to Agent Engine Runtime.
Create
Memory Bank is enabled by default when you create an Agent Engine instance. You can use the instance in any environment, including Google Kubernetes Engine and Cloud Run. You need the Agent Engine name that identifies the Memory Bank and sufficient permission to call Memory Bank.
agent_engine = client.agent_engines.create()
New instances are empty unless you first create or generate memories.
Update
Update an existing Agent Engine instance when you want to change it while keeping the memories that are already stored in it. For example, you can change the Memory Bank configuration or deploy your agent to Agent Engine Runtime.
agent_engine = client.agent_engines.update(
    # If you have an existing AgentEngine, you can access the name using `agent_engine.api_resource.name`.
    name="AGENT_ENGINE_NAME",
    # Optional.
    agent_engine=...,
    # Optional.
    config=...
)
Replace the following:
- AGENT_ENGINE_NAME: The name of the Agent Engine. It should be in the format projects/.../locations/.../reasoningEngines/... (see the supported regions for Memory Bank).
Set your Memory Bank configuration
You can configure your Memory Bank to customize how memories are generated and managed. If the configuration is not provided, then Memory Bank uses the default settings.
# Optional.
similarity_search_config = {
    "similarity_search_config": {
        "embedding_model": "EMBEDDING_MODEL",
    }
}

# Optional.
generation_config = {
    "generation_config": {
        "model": "LLM_MODEL",
    }
}

context_spec = {
    "context_spec": {
        "memory_bank_config": {
            **similarity_search_config,
            **generation_config
        }
    }
}

# Create an Agent Engine with a Memory Bank Config.
agent_engine = client.agent_engines.create(
    config={
        # Optional.
        **context_spec
    }
)
Replace the following:
- EMBEDDING_MODEL: The Google text embedding model to use for similarity search, in the format projects/{project}/locations/{location}/publishers/google/models/{model}. Memory Bank uses text-embedding-005 as the default model. If you expect user conversations to be in non-English languages, use a model that supports multiple languages, such as gemini-embedding-001 or text-multilingual-embedding-002, to improve retrieval quality.
- LLM_MODEL: The Google LLM model to use for extracting and consolidating memories, in the format projects/{project}/locations/{location}/publishers/google/models/{model}. Memory Bank uses gemini-2.0-flash-001 as the default model.
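Since both model fields expect the full resource path rather than a short model name, the configuration can be assembled with a small helper. This is a minimal sketch under stated assumptions: `model_resource_name` and `build_context_spec` are hypothetical helpers, not SDK functions, and the dictionary layout simply mirrors the configuration shown above. Unset options are omitted so that Memory Bank falls back to its defaults.

```python
def model_resource_name(project: str, location: str, model: str) -> str:
    """Expand a short model name into the documented resource format."""
    return (
        f"projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}"
    )


def build_context_spec(project, location, embedding_model=None, llm_model=None):
    """Build the optional context_spec dict; return {} if nothing is set."""
    memory_bank_config = {}
    if embedding_model:
        memory_bank_config["similarity_search_config"] = {
            "embedding_model": model_resource_name(
                project, location, embedding_model
            ),
        }
    if llm_model:
        memory_bank_config["generation_config"] = {
            "model": model_resource_name(project, location, llm_model),
        }
    if not memory_bank_config:
        return {}
    return {"context_spec": {"memory_bank_config": memory_bank_config}}
```

The result can be unpacked into the `config` argument in the same way as `context_spec`, for example `config={**build_context_spec("PROJECT_ID", "LOCATION", embedding_model="gemini-embedding-001")}`.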
Deploy your agent with memory to Agent Engine
Although Memory Bank can be used in any runtime, you can also use Memory Bank with Agent Engine Runtime to read and write memories from your deployed agent.
To deploy an agent with Memory Bank on Vertex AI Agent Engine Runtime, first set up your environment for Agent Engine Runtime. Then, prepare your agent for deployment on Agent Engine Runtime with memory integration. Your deployed agent should make calls to read and write memories as needed.
AdkApp
If you're using the Agent Engine Agent Development Kit template, the agent uses the VertexAiMemoryBankService by default when deployed to Agent Engine Runtime. This means that your agent automatically uses Memory Bank to manage memories.
from google.adk.agents import Agent
from vertexai.preview.reasoning_engines import AdkApp

# Develop an agent using the ADK template.
adk_agent = Agent(...)

adk_app = AdkApp(
    agent=adk_agent,
    ...
)

# Deploy the agent to Agent Engine Runtime.
agent_engine = client.agent_engines.create(
    agent_engine=adk_app,
    config={
        # Replace STAGING_BUCKET with your Cloud Storage staging bucket.
        "staging_bucket": "STAGING_BUCKET",
        "requirements": ["google-cloud-aiplatform[agent_engines,adk]"],
        # Optional.
        **context_spec
    }
)
When run locally, the ADK template uses InMemoryMemoryService by default. If you want to use different Agent Engine instances for memories and deployment, or use an AdkApp with Memory Bank locally, you can override the instance Memory Bank uses:
from google.adk.memory import VertexAiMemoryBankService

def memory_bank_service_builder():
    return VertexAiMemoryBankService(
        project="PROJECT_ID",
        location="LOCATION",
        agent_engine_id="AGENT_ENGINE_ID"
    )

adk_app = AdkApp(
    agent=adk_agent,
    # Override the default memory service.
    memory_service_builder=memory_bank_service_builder
)

agent_engine = client.agent_engines.create(
    agent_engine=adk_app,
    config={
        # Replace STAGING_BUCKET with your Cloud Storage staging bucket.
        "staging_bucket": "STAGING_BUCKET",
        "requirements": ["google-cloud-aiplatform[agent_engines,adk]"],
        # Optional.
        **context_spec
    }
)
Replace the following:
- PROJECT_ID: Your project ID.
- LOCATION: Your region. See the supported regions for Memory Bank.
- AGENT_ENGINE_ID: The Agent Engine ID to use for Memory Bank. For example, 456 in projects/my-project/locations/us-central1/reasoningEngines/456.
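If you only have the full Agent Engine name at hand, the three values in the list above can be split out of it. The following is a minimal sketch; `memory_service_kwargs` is a hypothetical helper, and it assumes the name follows the projects/.../locations/.../reasoningEngines/... layout exactly. Note that the project segment of a name returned by the API may be a project number rather than a project ID.

```python
def memory_service_kwargs(agent_engine_name: str) -> dict:
    """Split a full Agent Engine name into project, location, and ID."""
    parts = agent_engine_name.split("/")
    well_formed = (
        len(parts) == 6
        and parts[0] == "projects"
        and parts[2] == "locations"
        and parts[4] == "reasoningEngines"
        and all(parts)  # Reject empty segments such as a trailing slash.
    )
    if not well_formed:
        raise ValueError(f"Unexpected Agent Engine name: {agent_engine_name!r}")
    return {
        "project": parts[1],
        "location": parts[3],
        "agent_engine_id": parts[5],
    }
```

The keys match the keyword arguments passed to VertexAiMemoryBankService in the builder, so the result could be unpacked there with `**memory_service_kwargs(name)`.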
For more information about using Memory Bank with ADK, refer to the Quickstart with Agent Development Kit.
Custom agent
You can use Memory Bank with your custom agent deployed on Agent Engine Runtime.
If you want to use the same Agent Engine instance for both Memory Bank and the Agent Engine Runtime, you can read the environment variables GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, and GOOGLE_CLOUD_AGENT_ENGINE_ID to infer the Agent Engine name from the environment:
import os

project = os.environ.get("GOOGLE_CLOUD_PROJECT")
location = os.environ.get("GOOGLE_CLOUD_LOCATION")
agent_engine_id = os.environ.get("GOOGLE_CLOUD_AGENT_ENGINE_ID")
agent_engine_name = f"projects/{project}/locations/{location}/reasoningEngines/{agent_engine_id}"
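Because `os.environ.get` returns None for an unset variable, the f-string would silently produce a name such as projects/None/locations/... when the code runs outside Agent Engine Runtime. A stricter variant might fail early instead. This is a minimal sketch; `agent_engine_name_from_env` is a hypothetical helper, and the optional `environ` parameter exists only to make it easy to test.

```python
import os

_REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_CLOUD_AGENT_ENGINE_ID",
)


def agent_engine_name_from_env(environ=None) -> str:
    """Build the Agent Engine name, raising if any variable is unset."""
    env = os.environ if environ is None else environ
    missing = [key for key in _REQUIRED_VARS if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return (
        f"projects/{env['GOOGLE_CLOUD_PROJECT']}"
        f"/locations/{env['GOOGLE_CLOUD_LOCATION']}"
        f"/reasoningEngines/{env['GOOGLE_CLOUD_AGENT_ENGINE_ID']}"
    )
```

Raising at startup makes a misconfigured environment visible immediately, rather than as a failed Memory Bank request later.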