This tutorial demonstrates how to make REST API calls directly to Vertex AI Agent Engine Sessions and Memory Bank to create and use sessions and long-term memories. Use the REST API if you don't want an agent framework to orchestrate calls for you, or you want to integrate Sessions and Memory Bank with agent frameworks other than Agent Development Kit (ADK).
For the quickstart using ADK, see Quickstart with Agent Development Kit.
This tutorial uses the following steps:
- Create your Vertex AI Agent Engine instance to access Vertex AI Agent Engine Sessions and Memory Bank.
- Create memories using the following options:
  - Generate memories using Vertex AI Agent Engine Memory Bank: Write sessions and events to Vertex AI Agent Engine Sessions as sources for Vertex AI Agent Engine Memory Bank to generate memories.
  - Upload memories directly: Write your own memories or have your agent build memories if you want full control over what information is persisted.
- Retrieve memories.
- Clean up.
Before you begin
To complete the steps demonstrated in this tutorial, you must first set up your project and environment.
Set up your project
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the Vertex AI API.
- If you selected a project, make sure you have the Vertex AI User (roles/aiplatform.user) IAM role on the project.
Authenticate to Vertex AI
To use the Python samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.
- Install the Google Cloud CLI.
- If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- To initialize the gcloud CLI, run the following command:
gcloud init
- If you're using a local shell, then create local authentication credentials for your user account:
gcloud auth application-default login
You don't need to do this if you're using Cloud Shell.
If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.
Import libraries
Install the Vertex AI SDK. Quote the requirement so that the shell doesn't treat >= as an output redirection:
pip install "google-cloud-aiplatform>=1.100.0"
Create your Vertex AI Agent Engine instance
To access Vertex AI Agent Engine Sessions and Vertex AI Agent Engine Memory Bank, you first need to create a Vertex AI Agent Engine instance. You don't need to deploy an agent to start using Sessions and Memory Bank. Without agent deployment, creating a Vertex AI Agent Engine instance should take a few seconds.
import vertexai
client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION",
)
agent_engine = client.agent_engines.create()
Replace the following:
- PROJECT_ID: Your project ID.
- LOCATION: Your region. Only us-central1 is supported for Vertex AI Agent Engine Memory Bank.
Generate memories from Vertex AI Agent Engine Sessions
After setting up Vertex AI Agent Engine Sessions and Memory Bank, you can create sessions and append events to them. Memories are generated as facts from the user's conversation with the agent so that they're available for future user interactions. For more information, see Generate and retrieve memories.
Create a session with an opaque user ID. Any memories generated from this session are automatically keyed by the scope {"user_id": "USER_ID"} unless you explicitly provide a scope when generating memories.
from google.cloud import aiplatform_v1beta1

sessions_client = aiplatform_v1beta1.SessionServiceClient(
    client_options={
        "api_endpoint": "https://LOCATION-aiplatform.googleapis.com"
    },
    transport="rest"
)

session_lro = sessions_client.create_session(
    parent=AGENT_ENGINE_NAME,
    session={"user_id": "USER_ID"}
)

# Drop the last two path segments of the operation name to get the session name.
session_name = "/".join(session_lro.operation.name.split("/")[0:-2])
Replace the following:
- LOCATION: Your region. Only us-central1 is supported for Vertex AI Agent Engine Memory Bank.
- AGENT_ENGINE_NAME: The name of the Vertex AI Agent Engine instance that you created or an existing Vertex AI Agent Engine instance. The name should be in the following format: projects/{your project}/locations/{your location}/reasoningEngines/{your reasoning engine}.
- USER_ID: An identifier for your user. Any memories generated from this session are automatically keyed by the scope {"user_id": "USER_ID"} unless you explicitly provide a scope when generating memories.
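To see what the last line of the session sample does, the following sketch applies the same split-and-join to a made-up operation name. It assumes that the operation name has the form SESSION_NAME/operations/OPERATION_ID. The resource path and the helper name session_name_from_operation are hypothetical and aren't part of the SDK.

```python
def session_name_from_operation(operation_name: str) -> str:
    """Drop the trailing "operations/OPERATION_ID" segments from an operation name."""
    segments = operation_name.split("/")
    # Same slice as in the session sample: keep all but the last two segments.
    return "/".join(segments[0:-2])


# Hypothetical operation name, for illustration only.
example_operation = (
    "projects/my-project/locations/us-central1/"
    "reasoningEngines/123/sessions/456/operations/789"
)
print(session_name_from_operation(example_operation))
```

Under that assumption, the result is the session resource name that you pass to later calls such as append_event.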
Iteratively upload events to your session. Events can include any interactions between your user, agent, and tools. The ordered list of events represents your session's conversation history. This conversation history is used as the source material for generating memories for that particular user.
from datetime import datetime, timezone

event = aiplatform_v1beta1.SessionEvent(
    author="user",  # Required by Sessions.
    invocation_id="1",  # Required by Sessions.
    # Required by Sessions. The trailing "Z" denotes UTC, so format a UTC time.
    timestamp=datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ'),
    content=aiplatform_v1beta1.Content(
        role="user",
        parts=[aiplatform_v1beta1.Part(text="Hello")]
    )
)

sessions_client.append_event(name=session_name, event=event)
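The timestamp string in the event sample ends in Z, which denotes UTC. If your event times come from timezone-aware datetime objects, a small helper such as the following could convert them to UTC before formatting. This is a minimal sketch; utc_timestamp is a hypothetical name and not part of the SDK.

```python
from datetime import datetime, timedelta, timezone


def utc_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC string ending in "Z"."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Convert to UTC first so that the "Z" suffix is accurate.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A time recorded at UTC+2 is shifted back two hours when formatted.
local_time = datetime(2025, 6, 26, 3, 15, 29, tzinfo=timezone(timedelta(hours=2)))
print(utc_timestamp(local_time))  # 2025-06-26T01:15:29Z
```

Rejecting naive datetime objects is a design choice here: a naive value carries no offset, so labeling it with Z would be a guess.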
To generate memories from your conversation history, trigger a memory generation request for the session:
client.agent_engines.generate_memories(
    name=agent_engine.api_resource.name,
    vertex_session_source={
        "session": session_name
    },
    # Optional when using Agent Engine Sessions. Defaults to {"user_id": session.user_id}.
    scope=SCOPE
)
Replace the following:
- (Optional) SCOPE: A dictionary representing the scope of the generated memories, with a maximum of 5 key-value pairs and no * characters. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation. If not provided, {"user_id": session.user_id} is used.
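The scope limits described above can be checked locally before you send a request. The following is a minimal sketch; validate_scope is a hypothetical helper, not an SDK function, and it assumes that the * restriction applies to both keys and values.

```python
def validate_scope(scope: dict[str, str]) -> None:
    """Raise ValueError if the scope breaks the documented limits."""
    if len(scope) > 5:
        raise ValueError(f"scope has {len(scope)} key-value pairs; the maximum is 5")
    for key, value in scope.items():
        # Assumption: '*' is disallowed in keys as well as in values.
        if "*" in key or "*" in value:
            raise ValueError(f"scope entry {key!r}: {value!r} contains a '*' character")


validate_scope({"session_id": "MY_SESSION"})  # Passes silently.
```

A check like this only catches the two limits named in this tutorial; the service remains the authority on what it accepts.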
Upload memories
As an alternative to generating memories from raw dialogue, you can upload memories or have your agents add them directly using CreateMemory. Rather than Memory Bank extracting information from your content, you provide the facts that should be stored about your user directly. We recommend that you write facts about users in first person (for example, I am a software engineer).
memory = client.agent_engines.create_memory(
    name=agent_engine.api_resource.name,
    fact="This is a fact.",
    scope={"user_id": "123"}
)
"""
Returns an AgentEngineMemoryOperation containing the created Memory like:
AgentEngineMemoryOperation(
    done=True,
    metadata={
        "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.CreateMemoryOperationMetadata",
        "genericMetadata": {
            "createTime": "2025-06-26T01:15:29.027360Z",
            "updateTime": "2025-06-26T01:15:29.027360Z"
        }
    },
    name="projects/.../locations/us-central1/reasoningEngines/.../memories/.../operations/...",
    response=Memory(
        create_time=datetime.datetime(2025, 6, 26, 1, 15, 29, 27360, tzinfo=TzInfo(UTC)),
        fact="This is a fact.",
        name="projects/.../locations/us-central1/reasoningEngines/.../memories/...",
        scope={
            "user_id": "123"
        },
        update_time=datetime.datetime(2025, 6, 26, 1, 15, 29, 27360, tzinfo=TzInfo(UTC))
    )
)
"""
Retrieve and use memories
You can retrieve memories for your user and include them in your system instructions to give the LLM access to your personalized context.
For more information about retrieving memories using a scope-based method, see Fetch memories.
# Retrieve all memories for User ID 123.
retrieved_memories = list(
    client.agent_engines.retrieve_memories(
        name=agent_engine.api_resource.name,
        scope={"user_id": "123"}
    )
)
You can use Jinja to convert your structured memories into a prompt:
from jinja2 import Template
template = Template("""
<MEMORIES>
Here is some information about the user:
{% for retrieved_memory in data %}* {{ retrieved_memory.memory.fact }}
{% endfor %}</MEMORIES>
""")
prompt = template.render(data=retrieved_memories)
"""
Output:
<MEMORIES>
Here is some information about the user:
* This is a fact.
</MEMORIES>
"""
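If you'd rather not add a templating dependency, a plain Python function can build a similar block. The following is a minimal sketch: build_memory_prompt is a hypothetical helper, and the SimpleNamespace objects are stand-ins that only mimic the memory.fact attribute path used in the template above, not real SDK response objects.

```python
from types import SimpleNamespace


def build_memory_prompt(retrieved_memories) -> str:
    """Render retrieved memories as a <MEMORIES> block for a system instruction."""
    lines = ["<MEMORIES>", "Here is some information about the user:"]
    # Each item is expected to expose the stored fact at item.memory.fact.
    lines.extend(f"* {item.memory.fact}" for item in retrieved_memories)
    lines.append("</MEMORIES>")
    return "\n".join(lines)


# Stand-in objects, for illustration only.
fake_memories = [
    SimpleNamespace(memory=SimpleNamespace(fact="This is a fact.")),
    SimpleNamespace(memory=SimpleNamespace(fact="I am a software engineer.")),
]
print(build_memory_prompt(fake_memories))
```

You can then prepend the returned string to your system instructions in the same way as the rendered template.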
Clean up
To clean up all resources used in this tutorial, you can delete the Google Cloud project that you used for it.
Otherwise, you can delete the individual resources you created in this tutorial, as follows:
Use the following code sample to delete the Vertex AI Agent Engine instance, which also deletes any sessions or memories associated with the Vertex AI Agent Engine instance.
agent_engine.delete(force=True)
Delete any locally created files.