Set up the environment for Memory Bank

Before you work with Vertex AI Agent Engine Memory Bank, you must set up your environment. Note that although Memory Bank is part of Agent Engine, you don't need to deploy your code to Agent Engine Runtime to use Memory Bank.

Set up your Google Cloud project

Every project can be identified in two ways: the project number or the project ID. The PROJECT_NUMBER is automatically created when you create the project, whereas the PROJECT_ID is created by you, or whoever created the project. To set up a project:

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Vertex AI API.

    Enable the API

Get the required roles

To get the permissions that you need to use Vertex AI Agent Engine, ask your administrator to grant you the Vertex AI User (roles/aiplatform.user) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

If you're making requests to Memory Bank from an agent deployed on Google Kubernetes Engine or Cloud Run, make sure that your service account has the necessary permissions. The Reasoning Engine Service Agent already has the necessary permissions to read and write memories, so outbound requests from Agent Engine Runtime should already have permission to access Memory Bank.

Set up your environment

This section assumes that you have set up a Python development environment, or are using a runtime with a Python development environment (such as Colab).

Install libraries

Install the Vertex AI SDK:

  pip install "google-cloud-aiplatform>=1.104.0"
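
To confirm that the installed SDK meets the minimum version in the command above, you could run a check like the following. This is a minimal sketch that assumes plain numeric version strings such as 1.104.0; the helper names (parse_version, meets_minimum, installed_sdk_version) are illustrative and are not part of the Vertex AI SDK.

```python
from importlib import metadata
from typing import Optional, Tuple

MINIMUM_VERSION = "1.104.0"  # minimum version from the install command


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.104.0' into (1, 104, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MINIMUM_VERSION) -> bool:
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_sdk_version() -> Optional[str]:
    """Return the installed google-cloud-aiplatform version, or None if absent."""
    try:
        return metadata.version("google-cloud-aiplatform")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_sdk_version()
    if version is None:
        print("google-cloud-aiplatform is not installed")
    else:
        print(version, "OK" if meets_minimum(version) else "too old")
```

The comparison is deliberately simple and treats pre-release suffixes conservatively. For full version-specifier semantics, a dedicated version-parsing library is the safer choice.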

Authentication

Authentication instructions depend on whether you're using Vertex AI in express mode:

  • If you're not using Vertex AI in express mode, follow the instructions at Authenticate to Vertex AI.

  • If you're using Vertex AI in express mode, set up authentication by setting the API key in the environment:

      import os

      os.environ["GOOGLE_API_KEY"] = "API_KEY"
    

Set up a Vertex AI SDK client

Run the following code to set up a Vertex AI SDK client:

import vertexai
client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION",
)

Replace the following:

  • PROJECT_ID is your project ID.
  • LOCATION is one of the supported regions for Memory Bank.

Configure your Agent Engine instance

To get started with Memory Bank, you first need an Agent Engine instance.

You can do one of the following:

Use an existing instance

If you don't need to modify an existing Agent Engine instance, run the following to retrieve the instance for use with Memory Bank:

agent_engine = client.agent_engines.get(name="AGENT_ENGINE_NAME")

Replace the following:

  • AGENT_ENGINE_NAME: The name of the Agent Engine. It should be in the format projects/.../locations/.../reasoningEngines/.... See the supported regions for Memory Bank.
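
Because the name must follow the projects/.../locations/.../reasoningEngines/... format, it can help to check a value before passing it to the SDK. The following is a minimal sketch; the function name and the regular expression are assumptions for illustration and only check the overall shape, not whether the resource exists.

```python
import re

# Shape of a full Agent Engine resource name, as described above.
_AGENT_ENGINE_NAME = re.compile(
    r"projects/[^/]+/locations/[^/]+/reasoningEngines/[^/]+"
)


def is_agent_engine_name(name: str) -> bool:
    """Return True if `name` looks like a full Agent Engine resource name."""
    return _AGENT_ENGINE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    candidate = "projects/my-project/locations/us-central1/reasoningEngines/456"
    print(is_agent_engine_name(candidate))
```

A bare ID such as 456 fails this check, which catches the common mistake of passing the ID where the full name is expected.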

Create or update an instance

When you create or update an Agent Engine instance, you can override Agent Engine's defaults to make the following modifications:

Create

Memory Bank is enabled by default when you create an Agent Engine instance. You can use the instance in any environment, including Google Kubernetes Engine and Cloud Run. You need the Agent Engine name that identifies the Memory Bank and sufficient permission to call Memory Bank.

  agent_engine = client.agent_engines.create()

New instances are empty unless you first create or generate memories.

Update

You can update an existing Agent Engine instance while keeping the memories that are already stored in it. For example, you can change the Memory Bank configuration or deploy your agent to Agent Engine Runtime.

  agent_engine = client.agent_engines.update(
      # If you have an existing AgentEngine, you can access the name using `agent_engine.api_resource.name`.
      name="AGENT_ENGINE_NAME",
      # Optional.
      agent_engine=...,
      # Optional.
      config=...,
  )

Replace the following:

  • AGENT_ENGINE_NAME: The name of the Agent Engine. It should be in the format projects/.../locations/.../reasoningEngines/.... See the supported regions for Memory Bank.

Set your Memory Bank configuration

You can configure your Memory Bank to customize how memories are generated and managed. If the configuration is not provided, then Memory Bank uses the default settings.

# Optional.
similarity_search_config = {
    "similarity_search_config": {
        "embedding_model": "EMBEDDING_MODEL",
    }
}

# Optional.
generation_config = {
    "generation_config": {
        "model": "LLM_MODEL",
    }
}

context_spec = {
    "context_spec": {
        "memory_bank_config": {
            **similarity_search_config,
            **generation_config,
        }
    }
}

# Create an Agent Engine with a Memory Bank config.
agent_engine = client.agent_engines.create(
    config={
        # Optional.
        **context_spec,
    }
)

Replace the following:

  • EMBEDDING_MODEL: The Google text embedding model to use for similarity search, in the format projects/{project}/locations/{location}/publishers/google/models/{model}. Memory Bank uses text-embedding-005 as the default model. If you expect user conversations to be in non-English languages, use a model that supports multiple languages, such as gemini-embedding-001 or text-multilingual-embedding-002, to improve retrieval quality.

  • LLM_MODEL: The Google LLM model to use for extracting and consolidating memories, in the format projects/{project}/locations/{location}/publishers/google/models/{model}. Memory Bank uses gemini-2.0-flash-001 as the default model.
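
Both placeholders expect a fully qualified model resource name, so it can be convenient to build them from a short model ID. The following is a minimal sketch that assembles the same dictionary structure as the configuration above; the helper names (model_resource_name, build_context_spec) are hypothetical and are not part of the Vertex AI SDK.

```python
from typing import Optional


def model_resource_name(project: str, location: str, model: str) -> str:
    """Build the resource name format described for EMBEDDING_MODEL and LLM_MODEL."""
    return f"projects/{project}/locations/{location}/publishers/google/models/{model}"


def build_context_spec(
    project: str,
    location: str,
    embedding_model: Optional[str] = None,
    llm_model: Optional[str] = None,
) -> dict:
    """Return a context_spec dict containing only the overrides that were given.

    Returns an empty dict when no override is given, so that Memory Bank
    falls back to its default settings.
    """
    memory_bank_config = {}
    if embedding_model:
        memory_bank_config["similarity_search_config"] = {
            "embedding_model": model_resource_name(project, location, embedding_model)
        }
    if llm_model:
        memory_bank_config["generation_config"] = {
            "model": model_resource_name(project, location, llm_model)
        }
    if not memory_bank_config:
        return {}
    return {"context_spec": {"memory_bank_config": memory_bank_config}}


if __name__ == "__main__":
    spec = build_context_spec(
        "my-project", "us-central1", embedding_model="gemini-embedding-001"
    )
    print(spec)
```

The returned dictionary can be unpacked into the config argument in the same way as context_spec, for example config={**spec}.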

Deploy your agent with memory to Agent Engine

Although Memory Bank can be used in any runtime, you can also use Memory Bank with Agent Engine Runtime to read and write memories from your deployed agent.

To deploy an agent with Memory Bank on Vertex AI Agent Engine Runtime, first set up your environment for Agent Engine Runtime. Then, prepare your agent to be deployed on Agent Engine Runtime with memory integration. Your deployed agent should make calls to read and write memories as needed.

AdkApp

If you're using the Agent Engine Agent Development Kit template, the agent uses the VertexAiMemoryBankService by default when deployed to Agent Engine Runtime. This means that your agent automatically uses Memory Bank to manage memories.

from google.adk.agents import Agent
from vertexai.preview.reasoning_engines import AdkApp

# Develop an agent using the ADK template.
adk_agent = Agent(...)

adk_app = AdkApp(
    agent=adk_agent,
    # Add other AdkApp arguments as needed.
)

# Deploy the agent to Agent Engine Runtime.
agent_engine = client.agent_engines.create(
    agent_engine=adk_app,
    config={
        # Placeholder: replace STAGING_BUCKET with your Cloud Storage staging bucket.
        "staging_bucket": "STAGING_BUCKET",
        "requirements": ["google-cloud-aiplatform[agent_engines,adk]"],
        # Optional. Defined in "Set your Memory Bank configuration".
        **context_spec,
    },
)

When run locally, the ADK template uses InMemoryMemoryService by default. If you want to use different Agent Engine instances for memories and deployment, or use an AdkApp with Memory Bank locally, you can override the instance Memory Bank uses:

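from google.adk.memory import VertexAiMemoryBankService

def memory_bank_service_builder():
    # Point the memory service at the Agent Engine instance that stores memories.
    return VertexAiMemoryBankService(
        project="PROJECT_ID",
        location="LOCATION",
        agent_engine_id="AGENT_ENGINE_ID",
    )

adk_app = AdkApp(
    agent=adk_agent,
    # Override the default memory service.
    memory_service_builder=memory_bank_service_builder,
)

agent_engine = client.agent_engines.create(
    agent_engine=adk_app,
    config={
        # Placeholder: replace STAGING_BUCKET with your Cloud Storage staging bucket.
        "staging_bucket": "STAGING_BUCKET",
        "requirements": ["google-cloud-aiplatform[agent_engines,adk]"],
        # Optional. Defined in "Set your Memory Bank configuration".
        **context_spec,
    },
)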

Replace the following:

  • PROJECT_ID: Your project ID.
  • LOCATION: Your region. See the supported regions for Memory Bank.
  • AGENT_ENGINE_ID: The Agent Engine ID to use for Memory Bank. For example, 456 in projects/my-project/locations/us-central1/reasoningEngines/456.
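
If you only have the full resource name, the ID is the final path segment. The following is a minimal sketch of extracting it; the function name is hypothetical and is not part of the SDK or ADK.

```python
def agent_engine_id_from_name(name: str) -> str:
    """Return the trailing ID from a full Agent Engine resource name."""
    _, separator, agent_engine_id = name.rpartition("/reasoningEngines/")
    # Reject inputs without the expected segment, or with extra path parts.
    if not separator or not agent_engine_id or "/" in agent_engine_id:
        raise ValueError(f"Not a full Agent Engine resource name: {name!r}")
    return agent_engine_id


if __name__ == "__main__":
    full_name = "projects/my-project/locations/us-central1/reasoningEngines/456"
    print(agent_engine_id_from_name(full_name))
```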

For more information about using Memory Bank with ADK, refer to the Quickstart with Agent Development Kit.

Custom agent

You can use Memory Bank with your custom agent deployed on Agent Engine Runtime.

If you want to use the same Agent Engine instance for both Memory Bank and the Agent Engine Runtime, you can read the environment variables GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, and GOOGLE_CLOUD_AGENT_ENGINE_ID to infer the Agent Engine name from the environment:

import os

project = os.environ.get("GOOGLE_CLOUD_PROJECT")
location = os.environ.get("GOOGLE_CLOUD_LOCATION")
agent_engine_id = os.environ.get("GOOGLE_CLOUD_AGENT_ENGINE_ID")

agent_engine_name = f"projects/{project}/locations/{location}/reasoningEngines/{agent_engine_id}"
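
When one of these variables is unset, os.environ.get returns None and the f-string silently produces a name containing the text "None". A slightly more defensive variant fails early instead. This is a minimal sketch; the function name and error message are illustrative, and the optional environ parameter exists only to make the function easy to test.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_CLOUD_AGENT_ENGINE_ID",
)


def agent_engine_name_from_env(environ: Optional[Mapping[str, str]] = None) -> str:
    """Build the Agent Engine name from the environment, or raise if incomplete."""
    env = os.environ if environ is None else environ
    missing = [var for var in REQUIRED_VARS if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return (
        f"projects/{env['GOOGLE_CLOUD_PROJECT']}"
        f"/locations/{env['GOOGLE_CLOUD_LOCATION']}"
        f"/reasoningEngines/{env['GOOGLE_CLOUD_AGENT_ENGINE_ID']}"
    )
```

Calling agent_engine_name_from_env() with no arguments reads from the process environment.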

What's next