Quickstart with REST API

This tutorial demonstrates how to make REST API calls directly to Vertex AI Agent Engine Sessions and Memory Bank to create and use sessions and long-term memories. Use the REST API if you don't want an agent framework to orchestrate calls for you, or you want to integrate Sessions and Memory Bank with agent frameworks other than Agent Development Kit (ADK).

For the quickstart using ADK, see Quickstart with Agent Development Kit.

This tutorial uses the following steps:

  1. Create your Vertex AI Agent Engine instance to access Vertex AI Agent Engine Sessions and Memory Bank.
  2. Create memories, either by generating them from Vertex AI Agent Engine Sessions or by uploading them directly.
  3. Retrieve memories.
  4. Clean up.

Before you begin

To complete the steps demonstrated in this tutorial, you must first set up your project and environment.

Set up your project

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the Vertex AI API.

    Enable the API

  5. If you selected a project, make sure you have the Vertex AI User (roles/aiplatform.user) IAM role on the project.

Authenticate to Vertex AI

To use the Python samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.

  1. Install the Google Cloud CLI.

  2. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  3. To initialize the gcloud CLI, run the following command:

    gcloud init
  4. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

    If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

For more information, see Set up ADC for a local development environment in the Google Cloud authentication documentation.
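
If you want to confirm that the login step produced local credentials before you run the samples, the following is a minimal sketch that looks for an ADC file on disk. It assumes the documented ADC locations: the file named by the GOOGLE_APPLICATION_CREDENTIALS environment variable, or the application_default_credentials.json file that the gcloud CLI writes under your user configuration directory. The function name find_adc_file is hypothetical and isn't part of any Google library. The check doesn't apply to Cloud Shell, where credentials aren't read from a local file, and it doesn't account for a custom gcloud configuration directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ADC_FILENAME = "application_default_credentials.json"


def find_adc_file(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the first ADC credential file that exists, or None."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    candidates = []
    # An explicitly configured credential file is checked first.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        candidates.append(Path(explicit))
    # File written by `gcloud auth application-default login`.
    if os.name == "nt" and env.get("APPDATA"):
        candidates.append(Path(env["APPDATA"]) / "gcloud" / ADC_FILENAME)
    else:
        candidates.append(home / ".config" / "gcloud" / ADC_FILENAME)

    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_adc_file()
    print(found if found else "No local ADC file found.")
```

A return value of None in a local shell usually means that gcloud auth application-default login hasn't been run yet for your user account.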

Import libraries

Install the Vertex AI SDK for Python. Quote the requirement so that your shell doesn't treat >= as an output redirect:

pip install "google-cloud-aiplatform>=1.100.0"

Create your Vertex AI Agent Engine instance

To access Vertex AI Agent Engine Sessions and Vertex AI Agent Engine Memory Bank, you first need to create a Vertex AI Agent Engine instance. You don't need to deploy an agent to start using Sessions and Memory Bank. Without agent deployment, creating a Vertex AI Agent Engine instance should take a few seconds.

import vertexai

client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION",
)

agent_engine = client.agent_engines.create()

Replace the following:

  • PROJECT_ID: Your project ID.
  • LOCATION: Your region. Only us-central1 is supported for Vertex AI Agent Engine Memory Bank.
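
The later samples identify the instance by its full resource name (agent_engine.api_resource.name) and call a regional endpoint built from LOCATION. The following is a minimal sketch for checking a resource name before you pass it to those calls. It assumes the name format projects/.../locations/.../reasoningEngines/... that appears in the sample output on this page. The helper names parse_agent_engine_name and regional_endpoint are hypothetical and aren't part of the Vertex AI SDK.

```python
import re

# Assumed format, taken from the resource names shown in this tutorial.
_AGENT_ENGINE_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/reasoningEngines/(?P<engine_id>[^/]+)$"
)


def parse_agent_engine_name(name: str) -> dict:
    """Split a full Agent Engine resource name into its components."""
    match = _AGENT_ENGINE_NAME.match(name)
    if match is None:
        raise ValueError(f"Not an Agent Engine resource name: {name!r}")
    return match.groupdict()


def regional_endpoint(location: str) -> str:
    """Build the regional API endpoint used by the session samples."""
    return f"https://{location}-aiplatform.googleapis.com"


if __name__ == "__main__":
    parts = parse_agent_engine_name(
        "projects/my-project/locations/us-central1/reasoningEngines/123"
    )
    print(parts["location"], regional_endpoint(parts["location"]))
```

Parsing the name first catches common mistakes, such as a singular reasoningEngine segment or a missing project prefix, before a request is sent.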

Generate memories from Vertex AI Agent Engine Sessions

After setting up Vertex AI Agent Engine Sessions and Memory Bank, you can create sessions and append events to them. Memories are generated as facts from the user's conversation with the agent so that they're available for future user interactions. For more information, see Generate and retrieve memories.

  1. Create a session with an opaque user ID. Any memories generated from this session are automatically keyed by the scope {"user_id": "USER_ID"} unless you explicitly provide a scope when generating memories.

    from google.cloud import aiplatform_v1beta1
    
    sessions_client = aiplatform_v1beta1.SessionServiceClient(
      client_options={
        "api_endpoint": "https://LOCATION-aiplatform.googleapis.com"
      },
      transport="rest"
    )
    
    session_lro = sessions_client.create_session(
      parent=AGENT_ENGINE_NAME,
      session={"user_id": "USER_ID"}
    )
    session_name = "/".join(session_lro.operation.name.split("/")[0:-2])
    

    Replace the following:

    • LOCATION: Your region. Only us-central1 is supported for Vertex AI Agent Engine Memory Bank.

    • AGENT_ENGINE_NAME: The name of the Vertex AI Agent Engine instance that you created or an existing Vertex AI Agent Engine instance. The name should be in the following format: projects/{your project}/locations/{your location}/reasoningEngines/{your reasoning engine}.

    • USER_ID: An identifier for your user. Any memories generated from this session are automatically keyed by the scope {"user_id": "USER_ID"} unless you explicitly provide a scope when generating memories.

  2. Iteratively upload events to your session. Events can include any interactions between your user, agent, and tools. The ordered list of events represents your session's conversation history. This conversation history is used as the source material for generating memories for that particular user.

    from datetime import datetime, timezone

    event = aiplatform_v1beta1.SessionEvent(
        author="user",  # Required by Sessions.
        invocation_id="1",  # Required by Sessions.
        # Required by Sessions. Use UTC so that the trailing "Z" is accurate.
        timestamp=datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ'),
        content=aiplatform_v1beta1.Content(
            role="user",
            parts=[aiplatform_v1beta1.Part(text="Hello")]
        )
    )
    
    sessions_client.append_event(name=session_name, event=event)
    
  3. To generate memories from your conversation history, trigger a memory generation request for the session:

    client.agent_engines.generate_memories(
      name=agent_engine.api_resource.name,
      vertex_session_source={
        "session": session_name
      },
      # Optional when using Agent Engine Sessions. Defaults to {"user_id": session.user_id}.
      scope=SCOPE
    )
    

Replace the following:

  • (Optional) SCOPE: A dictionary representing the scope of the generated memories, with a maximum of 5 key-value pairs and no * characters. For example, {"session_id": "MY_SESSION"}. Only memories with the same scope are considered for consolidation. If not provided, {"user_id": session.user_id} is used.
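
Two details in the preceding steps are easy to get wrong: deriving the session name from the create operation, and keeping a custom scope within the stated limits. The following is a minimal sketch of both, using only the Python standard library. It assumes that the operation name ends in /operations/{id}, as the slicing in step 1 implies, and it checks only the two scope rules stated on this page (at most 5 key-value pairs, no * characters). The function names are hypothetical and aren't part of the Vertex AI SDK.

```python
MAX_SCOPE_ENTRIES = 5  # Limit stated in this tutorial.


def session_name_from_operation(operation_name: str) -> str:
    """Drop the trailing 'operations/{id}' segments to get the session name."""
    parts = operation_name.split("/")
    if len(parts) < 3 or parts[-2] != "operations":
        raise ValueError(f"Unexpected operation name: {operation_name!r}")
    return "/".join(parts[:-2])


def validate_scope(scope: dict) -> dict:
    """Check a scope against the limits described above and return it."""
    if len(scope) > MAX_SCOPE_ENTRIES:
        raise ValueError(f"Scope has more than {MAX_SCOPE_ENTRIES} entries.")
    for key, value in scope.items():
        if "*" in str(key) or "*" in str(value):
            raise ValueError("Scope keys and values can't contain '*'.")
    return scope


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    operation = (
        "projects/my-project/locations/us-central1/"
        "reasoningEngines/123/sessions/456/operations/789"
    )
    print(session_name_from_operation(operation))
    print(validate_scope({"session_id": "MY_SESSION"}))
```

Validating the scope locally gives you a clear error before the memory generation request is sent. The service might enforce additional rules that this sketch doesn't cover.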

Upload memories

As an alternative to generating memories from raw dialogue, you can upload memories or have your agents add them directly using CreateMemory. Rather than Memory Bank extracting information from your content, you provide the facts that should be stored about your user directly. We recommend that you write facts about users in first person (for example, I am a software engineer).

memory = client.agent_engines.create_memory(
    name=agent_engine.api_resource.name,
    fact="This is a fact.",
    scope={"user_id": "123"}
)

"""
Returns an AgentEngineMemoryOperation containing the created Memory like:

AgentEngineMemoryOperation(
  done=True,
  metadata={
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.CreateMemoryOperationMetadata",
    "genericMetadata": {
      "createTime": "2025-06-26T01:15:29.027360Z",
      "updateTime": "2025-06-26T01:15:29.027360Z"
    }
  },
  name="projects/.../locations/us-central1/reasoningEngines/.../memories/.../operations/...",
  response=Memory(
    create_time=datetime.datetime(2025, 6, 26, 1, 15, 29, 27360, tzinfo=TzInfo(UTC)),
    fact="This is a fact.",
    name="projects/.../locations/us-central1/reasoningEngines/.../memories/...",
    scope={
      "user_id": "123"
    },
    update_time=datetime.datetime(2025, 6, 26, 1, 15, 29, 27360, tzinfo=TzInfo(UTC))
  )
)
"""

Retrieve and use memories

You can retrieve memories for your user and include them in your system instructions to give the LLM access to your personalized context.

For more information about retrieving memories using a scope-based method, see Fetch memories.

# Retrieve all memories for User ID 123.
retrieved_memories = list(
    client.agent_engines.retrieve_memories(
        name=agent_engine.api_resource.name,
        scope={"user_id": "123"}
    )
)

You can use Jinja to convert your structured memories into a prompt:


from jinja2 import Template

template = Template("""
<MEMORIES>
Here is some information about the user:
{% for retrieved_memory in data %}* {{ retrieved_memory.memory.fact }}
{% endfor %}</MEMORIES>
""")

prompt = template.render(data=retrieved_memories)

"""
Output:

<MEMORIES>
Here is some information about the user:
* This is a fact.
</MEMORIES>
"""
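
If you prefer not to add a templating dependency, the following is a minimal sketch that builds a similar block with plain string formatting and appends it to a system instruction. It assumes only that each retrieved item exposes memory.fact, as the Jinja template above does. The SimpleNamespace objects are stand-ins for real retrieval results, and render_memories and the base instruction text are hypothetical.

```python
from types import SimpleNamespace


def render_memories(retrieved_memories) -> str:
    """Format retrieved memories as a <MEMORIES> block for a prompt."""
    lines = ["<MEMORIES>", "Here is some information about the user:"]
    lines += [f"* {item.memory.fact}" for item in retrieved_memories]
    lines.append("</MEMORIES>")
    return "\n".join(lines)


# Stand-in for the list returned by the retrieval call shown earlier.
example_memories = [
    SimpleNamespace(memory=SimpleNamespace(fact="This is a fact.")),
]

# Hypothetical base instruction; replace it with your agent's own.
system_instruction = "You are a helpful assistant.\n" + render_memories(
    example_memories
)
print(system_instruction)
```

Keeping the memories inside a delimited block makes it easier for the model to distinguish stored facts from the rest of the instruction.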

Clean up

To clean up all resources used in this project, you can delete the Google Cloud project you used for the quickstart.

Otherwise, you can delete the individual resources you created in this tutorial, as follows:

  1. Use the following code sample to delete the Vertex AI Agent Engine instance, which also deletes any sessions or memories associated with the Vertex AI Agent Engine instance.

    agent_engine.delete(force=True)
    
  2. Delete any locally created files.

What's next