Vertex AI Agent Engine Memory Bank overview

Vertex AI Agent Engine Memory Bank lets you dynamically generate long-term memories based on users' conversations with your agent. Long-term memories are pieces of personalized information that can be accessed across multiple sessions for a particular user. The agent can use these memories to personalize responses to the user and maintain continuity across sessions.

Features of Memory Bank include the following:

  • Persistent storage of memories that can be accessed from multiple environments. You can use Vertex AI Agent Engine Sessions and Memory Bank with your deployed agent on Vertex AI Agent Engine, from your local environment, or with other deployment options.

  • Large language model (LLM)-based extraction of memories from sessions.

  • Asynchronous, remote generation of memories, so the agent doesn't need to wait for memories to be generated.

  • Similarity search-based retrieval of memories scoped to a user.

  • If you use Vertex AI Agent Engine Memory Bank with Agent Development Kit, your agent automatically reads and writes long-term memories for you (see the sketch after this list).
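
For example, the Agent Development Kit (ADK) integration attaches Sessions and Memory Bank as services on the ADK runner. The following is a minimal sketch rather than a definitive setup: it assumes an existing Agent Engine instance, and the constructor parameters of `VertexAiSessionService` and `VertexAiMemoryBankService` (and how `app_name` maps to your engine) should be verified against the ADK reference.

```python
from google.adk.agents import Agent
from google.adk.memory import VertexAiMemoryBankService
from google.adk.runners import Runner
from google.adk.sessions import VertexAiSessionService

# Placeholders: your project, region, and Agent Engine ID.
PROJECT = "my-project"
LOCATION = "us-central1"
AGENT_ENGINE_ID = "1234567890"

# Sessions stores the conversation history; Memory Bank stores the
# long-term memories generated from it.
session_service = VertexAiSessionService(
    project=PROJECT, location=LOCATION, agent_engine_id=AGENT_ENGINE_ID
)
memory_service = VertexAiMemoryBankService(
    project=PROJECT, location=LOCATION, agent_engine_id=AGENT_ENGINE_ID
)

agent = Agent(
    model="gemini-2.0-flash",
    name="assistant",
    instruction="Answer the user, using any retrieved memories for personalization.",
)

# With both services attached, the runner persists events to Sessions and
# reads and writes Memory Bank on the agent's behalf.
runner = Runner(
    agent=agent,
    app_name=AGENT_ENGINE_ID,
    session_service=session_service,
    memory_service=memory_service,
)
```

With this wiring, ADK can load relevant memories into the model context and write new ones without explicit calls from your agent code; the exact behavior depends on your ADK version and configuration.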

Vertex AI Agent Engine Memory Bank conceptual overview

Memory Bank integrates with Vertex AI Agent Engine Sessions to generate memories from stored sessions using the following process (minimal code sketches for these calls follow the list):

  1. (Sessions) CreateSession: At the start of each conversation, create a new session. The conversation history used by the agent is scoped to this session. A session contains the chronological sequence of messages and actions (SessionEvents) for an interaction between a user and your agent. All sessions must have a user ID; the extracted memories (see GenerateMemories) for this session are mapped to this user.

  2. (Sessions) AppendEvent: As the user interacts with the agent, events (such as user messages, agent responses, and tool actions) are uploaded to Sessions. The events persist conversation history and create a record of the conversation that can be used to generate memories.

  3. (Sessions) ListEvents: As the user interacts with the agent, the agent retrieves the conversation history.

  4. (Memory Bank) Generate or create memories:

    • GenerateMemories: At a specified interval (such as the end of every session or the end of every turn), the agent can trigger memories to be generated using conversation history. Facts about the user are automatically extracted from the conversation history so that they're available for current or future sessions.

    • CreateMemory: Your agent can write memories directly to Memory Bank. For example, the agent can decide when a memory should be written and what information should be saved (memory-as-a-tool). Use CreateMemory when you want your agent to have more control over what facts are extracted.

  5. (Memory Bank) RetrieveMemories: As the user interacts with your agent, the agent can retrieve memories saved about that user. You can either retrieve all memories (simple retrieval) or only the memories most relevant to the current conversation (similarity search retrieval). Then you can insert the retrieved memories into your prompt.
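
Steps 1 through 3 map onto the Sessions REST methods. The following Python sketch calls the v1beta1 REST API directly with `requests`; the project, location, engine, and user IDs are placeholders, and the exact field names (for example, `userId` and `invocationId`) should be verified against the Agent Engine Sessions REST reference.

```python
import requests
import google.auth
import google.auth.transport.requests

# Placeholders: your project, region, and Agent Engine (reasoningEngine) ID.
PROJECT, LOCATION, ENGINE = "my-project", "us-central1", "1234567890"
BASE = (
    f"https://{LOCATION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT}"
    f"/locations/{LOCATION}/reasoningEngines/{ENGINE}"
)

# Authenticate with Application Default Credentials.
creds, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
creds.refresh(google.auth.transport.requests.Request())
headers = {"Authorization": f"Bearer {creds.token}"}

# 1. CreateSession: every session is scoped to a user ID. The call returns a
#    long-running operation whose name embeds the new session's resource name:
#    projects/.../reasoningEngines/.../sessions/SESSION_ID/operations/OP_ID
op = requests.post(f"{BASE}/sessions", headers=headers,
                   json={"userId": "user-123"}).json()
session_name = op["name"].split("/operations/")[0]
session_url = f"https://{LOCATION}-aiplatform.googleapis.com/v1beta1/{session_name}"

# 2. AppendEvent: persist each conversation turn as a SessionEvent.
requests.post(
    f"{session_url}:appendEvent",
    headers=headers,
    json={
        "author": "user",
        "invocationId": "inv-1",
        "timestamp": "2025-01-01T00:00:00Z",
        "content": {"role": "user", "parts": [{"text": "I prefer window seats."}]},
    },
)

# 3. ListEvents: read back the chronological conversation history.
events = requests.get(f"{session_url}/events", headers=headers).json()
```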
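For step 4, the following sketch continues the one above (reusing `BASE`, `headers`, and `session_name`). The request shapes, such as `vertexSessionSource` and `scope`, are assumptions to verify against the Memory Bank REST reference; GenerateMemories also returns a long-running operation, because extraction is asynchronous.

```python
# BASE, headers, and session_name as defined in the Sessions sketch above.

# 4a. GenerateMemories: extract facts about this user from the stored
#     session. Generation runs asynchronously, so this returns a
#     long-running operation that you can poll if needed.
requests.post(
    f"{BASE}/memories:generate",
    headers=headers,
    json={
        "vertexSessionSource": {"session": session_name},
        "scope": {"user_id": "user-123"},
    },
)

# 4b. CreateMemory: alternatively, write a fact directly (memory-as-a-tool)
#     when the agent decides what should be saved.
requests.post(
    f"{BASE}/memories",
    headers=headers,
    json={
        "fact": "The user prefers window seats.",
        "scope": {"user_id": "user-123"},
    },
)
```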
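Finally, step 5 is a single call. This sketch again reuses `BASE` and `headers`; the `similaritySearchParams` and `retrievedMemories` field names are assumptions to check against the REST reference.

```python
# BASE and headers as defined in the Sessions sketch above.

# 5. RetrieveMemories: fetch memories scoped to this user. Sending
#    simpleRetrievalParams instead returns all of the user's memories
#    rather than a similarity-ranked subset.
resp = requests.post(
    f"{BASE}/memories:retrieve",
    headers=headers,
    json={
        "scope": {"user_id": "user-123"},
        "similaritySearchParams": {"searchQuery": "seating preferences"},
    },
).json()

# Insert the retrieved facts into the agent's prompt.
facts = [m["memory"]["fact"] for m in resp.get("retrievedMemories", [])]
prompt_context = "Known facts about this user:\n" + "\n".join(facts)
```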

Security considerations

In addition to the security responsibilities outlined in Vertex AI shared responsibility, consider the risk of prompt injection and memory poisoning when your agent uses long-term memories. Memory poisoning occurs when false or malicious information is stored in Memory Bank; the agent might then act on that information in future sessions.

To mitigate the risk of memory poisoning, you can do the following:

  • Model Armor: Use Model Armor to inspect prompts that are sent to Memory Bank and content that comes from your agent (a sketch of this check follows this list).

  • Adversarial testing: Proactively test your LLM application for prompt injection vulnerabilities by simulating attacks. This is typically known as "red teaming."

  • Sandbox execution: If the agent can execute code or interact with external or critical systems, perform these actions in a sandboxed environment with strict access controls and human review.
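
As one example of the Model Armor mitigation, you could screen a candidate fact with a Model Armor template before writing it with CreateMemory. This is a minimal sketch, not an endorsed pipeline: it assumes an existing Model Armor template and reuses `BASE` and `headers` from the earlier sketches, and the endpoint shape and field names (such as `userPromptData` and `filterMatchState`) are assumptions to verify against the Model Armor reference.

```python
import requests

# Placeholders: your project, region, and Model Armor template ID.
PROJECT, LOCATION, TEMPLATE = "my-project", "us-central1", "my-template"
MA_URL = (
    f"https://modelarmor.{LOCATION}.rep.googleapis.com/v1/projects/{PROJECT}"
    f"/locations/{LOCATION}/templates/{TEMPLATE}:sanitizeUserPrompt"
)

candidate_fact = "The user prefers window seats."

# Screen the text before it becomes a long-term memory.
result = requests.post(
    MA_URL,
    headers=headers,  # bearer-token headers, as in the earlier sketches
    json={"userPromptData": {"text": candidate_fact}},
).json()

# Write the memory only if no Model Armor filter matched
# (field names are assumptions to verify).
if result["sanitizationResult"]["filterMatchState"] == "NO_MATCH_FOUND":
    requests.post(
        f"{BASE}/memories",
        headers=headers,
        json={"fact": candidate_fact, "scope": {"user_id": "user-123"}},
    )
```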

For more information, see Google's Approach for Secure AI Agents.