Memory Bank lets you construct long-term memories from conversations between the user and your agent. Memory Bank then consolidates and self-curates memories for a specific user by adding, updating, and removing memories over time. This page describes how to trigger memory generation and get the memories.
When you trigger memory generation, Memory Bank automatically performs the following operations:
- Extraction: Extracts information about the user from their conversations with the agent.
- Consolidation: Identifies whether existing memories with the same scope should be deleted or updated. Memory Bank checks that new memories are not duplicative or contradictory before merging them with existing memories.
Not all user-agent interactions result in memories being created or updated. Memory Bank only persists information that is judged to be valuable for future interactions, which can include the following types of information:
- Information about the user, like preferences, names, relationships, hobbies, and important dates. For example, "I work at Google", "I prefer the aisle seat", or "My wedding anniversary is on December 31".
- Key conversation events and task outcomes. For example, "I booked plane tickets for a round trip between JFK and SFO. I leave on June 1, 2025 and return on June 7, 2025."
- Information that the user explicitly asks the agent to remember. For example, if the user says "Remember that I primarily use Python," Memory Bank generates a memory such as "I primarily use Python."
Memories can only be extracted from text, inline files, and file data in the source content. All other content, including function calls and responses, is ignored when generating memories.
Memories can be extracted from images, video, and audio provided by the user. If the context provided by the multimodal input is judged by Memory Bank to be meaningful for future interactions, then a textual memory may be created including information extracted from the input. For example, if the user provides an image of a golden retriever with the text "This is my dog" then Memory Bank generates a memory such as "My dog is a golden retriever."
Set up your environment
Before you get started with generating memories, you need to set up your environment and Agent Engine instance. This section presumes that you have set up a Python development environment, or are using a runtime with a Python development environment (such as Colab).
Import libraries
Install the Vertex AI SDK:
pip install "google-cloud-aiplatform>=1.100.0"
Set up a Vertex AI SDK client
import vertexai

client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION",
)
Replace the following:
- PROJECT_ID: Your project ID.
- LOCATION: Your region. Only us-central1 is supported for Vertex AI Agent Engine Memory Bank.
Configure your Agent Engine instance
To fetch memories from Memory Bank, you first need an instance of an Agent Engine. You can either create a new instance or get an existing instance. New instances are empty unless you first create or generate memories.
Create
agent_engine = client.agent_engines.create()
Use existing
agent_engine = client.agent_engines.get(name="AGENT_ENGINE_NAME")
Replace the following:
- AGENT_ENGINE_NAME: The name of the Agent Engine, in the format projects/.../locations/.../reasoningEngines/... You can only use Agent Engine instances in us-central1 for Vertex AI Agent Engine Memory Bank.
Generate memories
You can trigger memory generation using GenerateMemories at the end of a session or at regular intervals within a session. Memory generation extracts key context from source conversations and combines it with existing memories for the same scope. For example, you can create session-level memories by using a scope such as {"user_id": "123", "session_id": "456"}. Memories with the same scope can be consolidated and retrieved together.
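To illustrate how scopes partition memories, the following minimal sketch groups plain dictionaries by scope locally. It does not call Memory Bank: the `scope_key` and `group_by_scope` helpers and the sample records are hypothetical, and the sketch assumes that "same scope" means the scope dictionaries have identical keys and values.

```python
from collections import defaultdict


def scope_key(scope):
    """Return a hashable, order-independent key for a scope dictionary."""
    return frozenset(scope.items())


def group_by_scope(records):
    """Group hypothetical memory records by their exact scope."""
    groups = defaultdict(list)
    for record in records:
        groups[scope_key(record["scope"])].append(record["fact"])
    return dict(groups)


# Hypothetical records; in Memory Bank the scope is set when memories are generated.
records = [
    {"scope": {"user_id": "123"}, "fact": "I prefer the aisle seat"},
    {"scope": {"user_id": "123", "session_id": "456"}, "fact": "I booked a round trip"},
    # Key order differs, but the scope is the same as the previous record.
    {"scope": {"session_id": "456", "user_id": "123"}, "fact": "I leave on June 1, 2025"},
]

groups = group_by_scope(records)
```

Under this assumption, the user-level record and the two session-level records land in separate groups, so a session-level scope would not be consolidated with user-level memories.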
When calling GenerateMemories, you must provide the source conversation either through Agent Engine Sessions or directly in JSON format:
Agent Engine Sessions
With Agent Engine Sessions, Memory Bank uses session events as the source conversation for memory generation.
To scope the generated memories, Memory Bank extracts and uses the user ID from the session by default. For example, the memories' scope is stored as {"user_id": "123"} if the session's user_id is "123". You can also provide a scope directly, which overrides using the session's user_id as the scope.
client.agent_engines.generate_memories(
    name=agent_engine.api_resource.name,
    vertex_session_source={
        "session": "SESSION_NAME"
    },
    # Optional when using Agent Engine Sessions. Defaults to {"user_id": session.user_id}.
    scope=SCOPE,
    config={
        "wait_for_completion": True
    }
)
Replace the following:
- SESSION_NAME: The session name.
- (Optional) SCOPE: A dictionary representing the scope of the generated memories. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation. If not provided, {"user_id": session.user_id} is used.
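Because the scope is optional for session-based generation, one way to handle it is to add the `scope` argument only when you want to override the default. The following is a minimal sketch: `session_request_kwargs` is a hypothetical helper (not part of the SDK), and it assumes that omitting the `scope` keyword is equivalent to accepting the default described above.

```python
def session_request_kwargs(agent_engine_name, session_name, scope=None, wait=True):
    """Build keyword arguments for a session-based generate_memories call."""
    kwargs = {
        "name": agent_engine_name,
        "vertex_session_source": {"session": session_name},
        "config": {"wait_for_completion": wait},
    }
    # Only pass scope when overriding the default {"user_id": session.user_id}.
    if scope is not None:
        kwargs["scope"] = scope
    return kwargs


# Uses the session's user_id as the scope (assumed default behavior).
default_kwargs = session_request_kwargs("AGENT_ENGINE_NAME", "SESSION_NAME")

# Overrides the scope explicitly.
scoped_kwargs = session_request_kwargs(
    "AGENT_ENGINE_NAME", "SESSION_NAME", scope={"session_id": "MY_SESSION"}
)

# With a configured client (not run here):
# client.agent_engines.generate_memories(**default_kwargs)
```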
JSON format
Provide the source conversation directly in JSON format if you're using a different session storage from Agent Engine Sessions:
client.agent_engines.generate_memories(
    name=agent_engine.api_resource.name,
    direct_contents_source={
        "events": EVENTS
    },
    scope=SCOPE,
    config={
        "wait_for_completion": True
    }
)
Replace the following:
- EVENTS: List of Content dictionaries. For example:
[
    {
        "content": {
            "role": "user",
            "parts": [
                {"text": "I work with LLM agents!"}
            ]
        }
    }
]
- SCOPE: A dictionary representing the scope of the generated memories. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation.
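If your conversation history is stored as simple role and text pairs, you can convert it into the EVENTS structure before calling GenerateMemories. The following is a minimal sketch: `to_events` is a hypothetical helper, the sample turns are illustrative, and the "model" role for agent turns is an assumption based on the usual Content role names.

```python
def to_events(turns):
    """Convert (role, text) pairs into a list of event dictionaries."""
    events = []
    for role, text in turns:
        events.append({"content": {"role": role, "parts": [{"text": text}]}})
    return events


# Illustrative conversation; replace with turns from your own session storage.
turns = [
    ("user", "I work with LLM agents!"),
    ("model", "That's great! What kind of agents do you build?"),
]

EVENTS = to_events(turns)

# With a configured client (not run here):
# client.agent_engines.generate_memories(
#     name=agent_engine.api_resource.name,
#     direct_contents_source={"events": EVENTS},
#     scope={"user_id": "123"},
#     config={"wait_for_completion": True},
# )
```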
GenerateMemories returns an AgentEngineGenerateMemoriesOperation containing a list of generated memories:
AgentEngineGenerateMemoriesOperation(
    name="projects/.../locations/.../reasoningEngines/.../operations/...",
    done=True,
    response=GenerateMemoriesResponse(
        generatedMemories=[
            GenerateMemoriesResponseGeneratedMemory(
                memory=Memory(
                    "name": "projects/.../locations/.../reasoningEngines/.../memories/..."
                ),
                action=<GenerateMemoriesResponseGeneratedMemoryAction.CREATED: "CREATED">,
            ),
            GenerateMemoriesResponseGeneratedMemory(
                memory=Memory(
                    "name": "projects/.../locations/.../reasoningEngines/.../memories/..."
                ),
                action=<GenerateMemoriesResponseGeneratedMemoryAction.UPDATED: "UPDATED">,
            ),
            GenerateMemoriesResponseGeneratedMemory(
                memory=Memory(
                    "name": "projects/.../locations/.../reasoningEngines/.../memories/..."
                ),
                action=<GenerateMemoriesResponseGeneratedMemoryAction.DELETED: "DELETED">,
            ),
        ]
    )
)
Each generated memory includes the action that was performed on that memory:
- CREATED: Indicates that a new memory was added, representing a novel concept that wasn't captured by existing memories.
- UPDATED: Indicates that an existing memory was updated, which happens if the memory covered similar concepts as the newly extracted information. The memory's fact may be updated with new information or remain the same.
- DELETED: Indicates that the existing memory was deleted, because its information was contradictory to new information extracted from the conversation.
For CREATED or UPDATED memories, you can use GetMemories to retrieve the full content of the memory. Retrieving DELETED memories results in a 404 error.
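Before retrieving full memory content, it can help to filter out DELETED entries so that you avoid the 404 error. The following is a minimal sketch that works on stand-in objects rather than a live response: `retrievable_memory_names` is a hypothetical helper, the memory names are placeholders, and the attribute names (`memory.name`, `action`) are assumed to mirror the response shape shown above.

```python
from types import SimpleNamespace


def retrievable_memory_names(generated_memories):
    """Return names of memories whose action is CREATED or UPDATED."""
    names = []
    for item in generated_memories:
        # The action may be an enum member or a plain string; normalize to a string.
        action = getattr(item.action, "value", item.action)
        if action in ("CREATED", "UPDATED"):
            names.append(item.memory.name)
    return names


# Stand-in objects mimicking the generated memories list (placeholder names).
sample = [
    SimpleNamespace(memory=SimpleNamespace(name="MEMORY_NAME_1"), action="CREATED"),
    SimpleNamespace(
        memory=SimpleNamespace(name="MEMORY_NAME_2"),
        action=SimpleNamespace(value="UPDATED"),  # enum-like stand-in
    ),
    SimpleNamespace(memory=SimpleNamespace(name="MEMORY_NAME_3"), action="DELETED"),
]

names = retrievable_memory_names(sample)
```

Only the names returned by the helper would then be passed to a retrieval call; the DELETED entry is skipped.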