Instrument ADK applications with OpenTelemetry

This document describes how to instrument an AI agent that was built with the Agent Development Kit (ADK) framework. This instrumentation, which leverages OpenTelemetry, lets you collect user prompts, agent responses, and agent choices.

The ADK framework is itself instrumented with OpenTelemetry, which captures telemetry from key steps in your agent's execution. This provides valuable application observability out of the box. However, this observability might not be sufficient for your application's use case. To get more fine-grained observability, you can use OpenTelemetry instrumentation libraries to capture telemetry from other parts of your app, or write your own custom instrumentation to capture application-specific data.

For example, in your application you could write instrumentation code to:

  • Track resource consumption of agent-invoked tools.
  • Track application-specific validation failures, business rule violations, or custom error recovery mechanisms.
  • Track quality scores for agent responses based on your domain-specific criteria.

Instrument your generative AI application to collect telemetry

To instrument your AI agent to collect log, metric, and trace data, do the following:

  1. Install OpenTelemetry packages
  2. Configure OpenTelemetry to collect and send telemetry
  3. Write a custom entry point to use configured OpenTelemetry

The remainder of this section describes these steps.

Install OpenTelemetry packages

Add the following OpenTelemetry instrumentations and exporter packages:

pip install 'opentelemetry-instrumentation-google-genai' \
  'opentelemetry-instrumentation-sqlite3' \
  'opentelemetry-exporter-gcp-logging' \
  'opentelemetry-exporter-gcp-monitoring' \
  'opentelemetry-exporter-otlp-proto-grpc' \
  'opentelemetry-instrumentation-vertexai>=2.0b0'

Log and metric data is sent to your Google Cloud project by using the Cloud Logging API or the Cloud Monitoring API. The opentelemetry-exporter-gcp-logging and opentelemetry-exporter-gcp-monitoring libraries invoke endpoints in those APIs.

Trace data is sent to Google Cloud by using the Telemetry (OTLP) API, which supports the OTLP format. Data received through this endpoint is also stored in the OTLP format. The opentelemetry-exporter-otlp-proto-grpc library invokes the Telemetry (OTLP) API endpoint.

Configure OpenTelemetry to collect and send telemetry

In your ADK agent's initialization code, add code to configure OpenTelemetry to capture and send telemetry to your Google Cloud project:

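import google.auth
import google.auth.transport.requests
import grpc
from google.auth.transport.grpc import AuthMetadataPlugin
from opentelemetry import _events as events
from opentelemetry import _logs as logs
from opentelemetry import metrics, trace
from opentelemetry.exporter.cloud_logging import CloudLoggingExporter
from opentelemetry.exporter.cloud_monitoring import CloudMonitoringMetricsExporter
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.google_genai import GoogleGenAiSdkInstrumentor
from opentelemetry.instrumentation.sqlite3 import SQLite3Instrumentor
from opentelemetry.instrumentation.vertexai import VertexAIInstrumentor
from opentelemetry.sdk._events import EventLoggerProvider
from opentelemetry.sdk._logs import LoggerProvider
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor


def setup_opentelemetry() -> None: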
    credentials, project_id = google.auth.default()
    resource = Resource.create(
        attributes={
            SERVICE_NAME: "adk-sql-agent",
            # The project to send spans to
            "gcp.project_id": project_id,
        }
    )

    # Set up OTLP auth
    request = google.auth.transport.requests.Request()
    auth_metadata_plugin = AuthMetadataPlugin(credentials=credentials, request=request)
    channel_creds = grpc.composite_channel_credentials(
        grpc.ssl_channel_credentials(),
        grpc.metadata_call_credentials(auth_metadata_plugin),
    )

    # Set up OpenTelemetry Python SDK
    tracer_provider = TracerProvider(resource=resource)
    tracer_provider.add_span_processor(
        BatchSpanProcessor(
            OTLPSpanExporter(
                credentials=channel_creds,
                endpoint="https://telemetry.googleapis.com:443/v1/traces",
            )
        )
    )
    trace.set_tracer_provider(tracer_provider)

    logger_provider = LoggerProvider(resource=resource)
    logger_provider.add_log_record_processor(
        BatchLogRecordProcessor(CloudLoggingExporter())
    )
    logs.set_logger_provider(logger_provider)

    event_logger_provider = EventLoggerProvider(logger_provider)
    events.set_event_logger_provider(event_logger_provider)

    reader = PeriodicExportingMetricReader(CloudMonitoringMetricsExporter())
    meter_provider = MeterProvider(metric_readers=[reader], resource=resource)
    metrics.set_meter_provider(meter_provider)

    # Load instrumentors
    SQLite3Instrumentor().instrument()
    # ADK uses Vertex AI and Google Gen AI SDKs.
    VertexAIInstrumentor().instrument()
    GoogleGenAiSdkInstrumentor().instrument()

Write a custom entry point to use configured OpenTelemetry

To use OpenTelemetry for instrumentation, create a custom entry point for your ADK application. The custom entry point must configure OpenTelemetry before it launches the ADK agent.

In the sample application, the main function acts as a custom entry point that initializes OpenTelemetry and then launches the FastAPI server, which lets you interact with the agent.

def main() -> None:
    # Make sure to set:
    # OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
    # OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true
    # in order to capture full prompts and responses and log messages.
    # For this sample, these can be set by loading the `opentelemetry.env` file.
    setup_opentelemetry()

    # Call the function to get the FastAPI app instance.
    # Ensure that the agent directory name is the name of the directory containing agent subdirectories,
    # where each subdirectory represents a single agent with __init__.py and agent.py files.
    # For this example this would be the current directory containing main.py.
    # Note: Calling this method attempts to set the global tracer provider, which has already been
    # set by the setup_opentelemetry() function.
    app = get_fast_api_app(
        agents_dir=AGENT_DIR,
        session_service_uri=SESSION_DB_URL,
        allow_origins=ALLOWED_ORIGINS,
        web=SERVE_WEB_INTERFACE,
    )

    # Launch the web interface on port 8080.
    uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
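
The comments in main name two environment variables that must be set to true for prompts, responses, and log messages to be captured. A minimal sketch of a startup check follows. The function name missing_telemetry_settings is hypothetical and isn't part of the sample; the check only compares string values and doesn't validate anything else about your configuration.

```python
import os
from typing import Mapping

# Variables named in the comments of main(); both are expected to be "true".
REQUIRED_TRUE = (
    "OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED",
    "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT",
)


def missing_telemetry_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or not equal to "true"."""
    return [
        name
        for name in REQUIRED_TRUE
        if env.get(name, "").strip().lower() != "true"
    ]


if __name__ == "__main__":
    missing = missing_telemetry_settings()
    if missing:
        print("Set these variables to true before starting:", ", ".join(missing))
```

You could call a check like this at the top of main, before setup_opentelemetry(), so that a misconfigured environment is reported before the agent starts.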

Download and run the sample application

This sample code implements a generative AI agent that is built using ADK. The agent is instrumented with OpenTelemetry, configured to send metrics, traces and logs to your Google Cloud project. The telemetry sent to your project includes generative AI prompts and responses.

ADK agent persona

The generative AI agent is defined as a SQL expert that has full access to an ephemeral SQLite database. The agent is built with the Agent Development Kit and accesses a database using the SQLDatabaseToolkit. The database is initially empty.

Before you begin

  1. In the Google Cloud console, activate Cloud Shell.

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  2. Enable the Vertex AI, Telemetry, Cloud Logging, Cloud Monitoring, and Cloud Trace APIs:

    gcloud services enable aiplatform.googleapis.com telemetry.googleapis.com logging.googleapis.com monitoring.googleapis.com cloudtrace.googleapis.com

  3. To get the permissions that you need for the sample application to write log, metric, and trace data, ask your administrator to grant you the following IAM roles on your project:

Launch the application

To launch the sample application, do the following:

  1. In the Google Cloud console, activate Cloud Shell.

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  2. Clone the repository:

    git clone https://github.com/GoogleCloudPlatform/opentelemetry-operations-python.git
    
  3. Go to the sample directory:

    cd opentelemetry-operations-python/samples/adk-sql-agent
    
  4. Create a virtual environment and run the sample:

    python -m venv venv/
    source venv/bin/activate
    pip install -r requirements.txt
    env $(cat opentelemetry.env | xargs) python main.py
    

    The application displays a message similar to the following:

    Application startup complete
    Uvicorn running on http://0.0.0.0:8080
    
  5. To interact with the agent, open a browser to the address listed in the previous step.

  6. Expand Select an agent and select sql_agent from the list of agents.
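
Step 4 loads opentelemetry.env by using env and xargs. If you'd rather load the file from Python, a minimal sketch follows. It assumes that the file contains simple KEY=VALUE lines with optional # comments, which is an assumption about the file's format, and the function names are hypothetical.

```python
import os
from pathlib import Path


def parse_env_lines(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines, # comments, and lines without '='."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values can contain '='.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def load_env_file(path: str = "opentelemetry.env") -> dict[str, str]:
    """Read the file and set any variables that aren't already in the environment."""
    settings = parse_env_lines(Path(path).read_text(encoding="utf-8"))
    for key, value in settings.items():
        os.environ.setdefault(key, value)
    return settings
```

Unlike the env command, this sketch doesn't override variables that are already set in the environment. If you use it, call load_env_file() before OpenTelemetry is configured.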

Engage with the agent

To engage with the agent, ask it a question or give it a command. For example, you might ask the question:

What can you do for me?

Similarly, because the sql_agent has the persona of a SQL expert, you might ask it to create tables for your applications and write queries to operate on the created tables. The agent can only create an ephemeral database, which is backed by a .db file that is created on the machine running the application.

The following illustrates a sample interaction between the sql_agent and the user:

Create a table for me to store weather data and also insert sample data in
the table. At the end show all data in the table you created.

Display of interaction with the sql_agent.

The actions performed by generative AI agents aren't deterministic, so you might see a different response for the same prompt.

Exit the application

To exit the application, press Ctrl-C in the shell that you used to launch the application.

View the traces, metrics, and logs

This section describes how you can view generative AI events.

Before you begin

To get the permissions that you need to view your log, metric, and trace data, ask your administrator to grant you the following IAM roles on your project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

View telemetry

To view the generative AI events created by the application, use the Trace Explorer page:

  1. In the Google Cloud console, go to the Trace Explorer page:

    You can also find this page by using the search bar.

  2. In the toolbar, select Add filter, select Span name, and then select call_llm.

    The following illustrates the Trace Explorer page after filtering the data:

    Display of trace spans.

    If you've never used Cloud Trace before, then Google Cloud Observability needs to create a database to store your trace data. The creation of the database can take a few minutes and during that period, no trace data is available to view.

  3. To explore your span and log data, in the Spans table, select a span.

    The Details page opens. This page displays the associated trace and its spans. The table on the page displays detailed information for the span you selected. This information includes the following:

    • The GenAI tab displays events for generative AI agents. To learn more about these events, see View generative AI events.

      The following screenshot illustrates a trace, where one span has the name call_llm. That span invokes the large language model (LLM) that powers this agent. For this sample, the LLM is Gemini. The Gemini span includes generative AI events:

      Display of generative AI events.

    • The Logs & Events tab lists log entries and events that are associated with the span. If you want to view the log data in the Logs Explorer, then in the toolbar of this tab, select View logs.

      The log data includes the response of the sql_agent. For example, for the sample run, the JSON payload includes the following content:

      {
        "logName": "projects/my-project/logs/otel_python_inprocess_log_name_temp",
        "jsonPayload": {
          "content": {
            "role": "model",
            "parts": [
              {
                "executable_code": null,
                "inline_data": null,
                "thought": null,
                "video_metadata": null,
                "code_execution_result": null,
                "function_response": null,
                "thought_signature": null,
                "text": "Okay, I will create a table named `weather` with columns `id`, `city`, `temperature`, and `date`. Then I will insert some sample rows into the table and display all the data in the table.\n",
                "file_data": null,
                "function_call": null
              }
            ]
          }
        },
        ...
      }
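
To pull the agent's text out of a log entry like the preceding one, you could walk the jsonPayload.content.parts list. The following sketch assumes that the entry has already been loaded into a Python dict with the shape shown in the sample payload. The function name response_text is hypothetical, and the entry in the example is trimmed and illustrative.

```python
import json


def response_text(entry: dict) -> str:
    """Concatenate the non-null "text" fields of jsonPayload.content.parts."""
    content = (entry.get("jsonPayload") or {}).get("content") or {}
    parts = content.get("parts") or []
    return "".join(part["text"] for part in parts if part.get("text"))


if __name__ == "__main__":
    # Trimmed, illustrative entry that follows the shape of the sample payload.
    raw = """
    {
      "jsonPayload": {
        "content": {
          "role": "model",
          "parts": [
            {"text": "Okay, I will create a table named `weather`.\\n", "function_call": null},
            {"text": null, "function_call": null}
          ]
        }
      }
    }
    """
    print(response_text(json.loads(raw)), end="")
```

Parts whose text field is null, such as parts that carry only a function call, are skipped.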
      

The sample is instrumented to send metric data to your Google Cloud project, but it doesn't generate any metrics.