This guide shows you how to configure request-response logging for foundation and fine-tuned models in Vertex AI. The logs are saved to a BigQuery table for viewing and analysis.

The following diagram summarizes the workflow for setting up request-response logging.

You can log requests and responses in Vertex AI using two different methods, depending on whether you use a base foundation model or a fine-tuned model. The following table compares the two methods.

To configure request-response logs for base foundation models, you can use the REST API or the Python SDK. Logging configurations can take a few minutes to take effect.

To enable request-response logs for a base foundation model, select one of the following tabs. For Anthropic models, only the REST API is supported for logging configuration; to enable logging, set the publisher to anthropic and the model name to one of the supported Claude models.

To create or update a PublisherModelConfig, use the setPublisherModelConfig method.
Before using any of the request data,
make the following replacements:
HTTP method and URL:
Request JSON body:
To send your request, choose one of these options:
Save the request body in a file named request.json, and execute the following command:
Save the request body in a file named request.json, and execute the following command:

You should receive a JSON response similar to the following:

Get logging configuration

To get the request-response logging configuration for a foundation model, use the REST API. To get the request-response logging configuration, use the fetchPublisherModelConfig method.
Before using any of the request data,
make the following replacements:
HTTP method and URL:
To send your request, choose one of these options:
Execute the following command:
Execute the following command:
You should receive a JSON response similar to the following:

Disable logging

To disable request-response logging for a foundation model, use the REST API or the Python SDK. To disable logging, use the setPublisherModelConfig method.
Before using any of the request data,
make the following replacements:
HTTP method and URL:
Request JSON body:
To send your request, choose one of these options:
Save the request body in a file named request.json, and execute the following command:
Save the request body in a file named request.json, and execute the following command:

You should receive a JSON response similar to the following:

Request-response logs for fine-tuned models

You can configure request-response logs for fine-tuned models by using the REST API or Python SDK. To enable request-response logs for a fine-tuned model, select one of the following tabs.

Use this method to update the request-response logging configuration for an endpoint. You can enable request-response logging when you create an endpoint using projects.locations.endpoints.create, or patch an existing endpoint using projects.locations.endpoints.patch. Requests and responses are logged at the endpoint level, so requests sent to any deployed models under the same endpoint are logged.

When you create or patch an endpoint, populate the predictRequestResponseLoggingConfig field. To view the BigQuery table schema, see Logging table schema. The following example shows a configuration:

Get logging configuration

To get the request-response logging configuration for a fine-tuned model, use the REST API.
Before using any of the request data,
make the following replacements:
HTTP method and URL:
To send your request, choose one of these options:
Execute the following command:
Execute the following command:
You should receive a JSON response similar to the following:

To disable the request-response logging configuration for the endpoint, use the Python SDK or the REST API.

In BigQuery, the logs are recorded using the following schema. Request-response pairs larger than the BigQuery Write API 10 MB row limit are not recorded.
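Because oversized pairs are silently dropped rather than logged, it can be useful to estimate payload size before relying on the logs for auditing. The following is a minimal sketch of such a check; the helper name is our own, and the estimate is approximate since the logged row also includes metadata columns.

```python
import json

# BigQuery's Write API rejects rows larger than 10 MB, so Vertex AI drops
# request-response pairs above that size instead of logging them.
BQ_ROW_LIMIT_BYTES = 10 * 1024 * 1024

def will_be_logged(request_payload: dict, response_payload: dict) -> bool:
    """Rough check: estimate the serialized size of a request-response pair."""
    size = (len(json.dumps(request_payload).encode("utf-8"))
            + len(json.dumps(response_payload).encode("utf-8")))
    return size <= BQ_ROW_LIMIT_BYTES

# A small pair fits comfortably under the limit.
small = {"contents": [{"role": "user", "parts": [{"text": "hello"}]}]}
print(will_be_logged(small, {"candidates": []}))  # True

# A pair carrying ~11 MB of inline text would be dropped.
big = {"contents": [{"parts": [{"text": "x" * (11 * 1024 * 1024)}]}]}
print(will_be_logged(big, {"candidates": []}))  # False
```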
Choose a logging method
| Model type | Description | Configuration method | Use case |
| --- | --- | --- | --- |
| Base foundation models | Logging is configured directly on a specific publisher model (for example, gemini-1.5-pro-001). The configuration applies to all requests made to that model version across the project. | Use the PublisherModelConfig resource through the REST API or Python SDK. | You want to log requests for a general-purpose, pre-trained model without deploying it to a dedicated endpoint. |
| Fine-tuned models | Logging is configured on the endpoint where your fine-tuned model is deployed. The configuration applies to all requests sent to that specific endpoint. | Set the predictRequestResponseLoggingConfig field when creating or patching an Endpoint resource. | You have a fine-tuned model deployed to a Vertex AI endpoint and want to log its specific traffic. |
Supported API methods for logging
Request-response logs are supported for all Gemini models that use generateContent or streamGenerateContent. The following partner models that use rawPredict or streamRawPredict are also supported:
Request-response logs for base foundation models
Enable request-response logging
For Anthropic models, only the REST API is supported for logging configuration. To enable logging, set the publisher to anthropic and the model name to one of the supported Claude models.

Python SDK

Use this method to create or update a PublisherModelConfig:

from vertexai.generative_models import GenerativeModel

publisher_model = GenerativeModel('gemini-2.0-pro-001')
# Set logging configuration
publisher_model.set_request_response_logging_config(
enabled=True,
sampling_rate=1.0,
bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
enable_otel_logging=True
)
REST API
To create or update a PublisherModelConfig, use the setPublisherModelConfig method.

Before using any of the request data, make the following replacements:
- ENDPOINT_PREFIX: the model's region followed by -. For example, us-central1-. If using the global endpoint, leave blank. Request-response logging is supported for all regions supported by the model.
- LOCATION: a region supported by the model, or global.
- PROJECT_ID: your project ID.
- PUBLISHER: the model publisher. For example, google.
- MODEL: the model name. For example, gemini-2.0-flash-001.
- SAMPLING_RATE: the fraction of requests to log, as a number between 0 and 1.
- BQ_URI: the BigQuery table to use for logging. If you specify only a project name, a new dataset is created with the name logging_ENDPOINT_DISPLAY_NAME_ENDPOINT_ID, where ENDPOINT_DISPLAY_NAME follows the BigQuery naming rules. If you don't specify a table name, a new table is created with the name request_response_logging.
POST https://ENDPOINT_PREFIXaiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig
{
"publisherModelConfig": {
"loggingConfig": {
"enabled": true,
"samplingRate": SAMPLING_RATE,
"bigqueryDestination": {
"outputUri": "BQ_URI"
},
"enableOtelLogging": true
}
}
}
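If you prefer to assemble this request in code rather than with curl, the following is a minimal sketch. The helper name and structure are our own; it only shows how the ENDPOINT_PREFIX, LOCATION, and body placeholders fit together, and leaves sending the request (with an OAuth access token) to your preferred HTTP client.

```python
# Illustrative helper (not part of any SDK): build the setPublisherModelConfig
# URL and JSON body from the placeholder values described above.

def build_set_config_request(project_id, publisher, model,
                             sampling_rate, bq_uri,
                             region=None, enable_otel_logging=True):
    # For the global endpoint the region prefix is left blank; otherwise it
    # is the region followed by "-", for example "us-central1-".
    endpoint_prefix = f"{region}-" if region else ""
    location = region or "global"
    url = (f"https://{endpoint_prefix}aiplatform.googleapis.com/v1beta1/"
           f"projects/{project_id}/locations/{location}/"
           f"publishers/{publisher}/models/{model}:setPublisherModelConfig")
    body = {
        "publisherModelConfig": {
            "loggingConfig": {
                "enabled": True,
                "samplingRate": sampling_rate,
                "bigqueryDestination": {"outputUri": bq_uri},
                "enableOtelLogging": enable_otel_logging,
            }
        }
    }
    return url, body

url, body = build_set_config_request(
    "my-project", "google", "gemini-2.0-flash-001",
    sampling_rate=1.0, bq_uri="bq://my-project.my_dataset.my_table",
    region="us-central1")
print(url)
```

You would then POST the body as JSON to the returned URL with a bearer access token, exactly as the curl example shows.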
curl

Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://ENDPOINT_PREFIXaiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig"

PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://ENDPOINT_PREFIXaiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig" | Select-Object -Expand Content

Get logging configuration
REST API
To get the request-response logging configuration, use the fetchPublisherModelConfig method.

Before using any of the request data, make the following replacements:

- PUBLISHER: the model publisher. For example, google.
- MODEL: the model name. For example, gemini-2.0-flash-001.

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig
curl
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig"

PowerShell
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig" | Select-Object -Expand Content

Disable logging
Python SDK
publisher_model.set_request_response_logging_config(
enabled=False,
sampling_rate=0,
bigquery_destination=''
)
REST API
To disable logging, use the setPublisherModelConfig method.

Before using any of the request data, make the following replacements:

- PUBLISHER: the model publisher. For example, google.
- MODEL: the model name. For example, gemini-2.0-flash-001.

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig
{
"publisherModelConfig": {
"loggingConfig": {
"enabled": false
}
}
}
curl

Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig"

PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig" | Select-Object -Expand Content

Request-response logs for fine-tuned models
Enable request-response logs
Python SDK
from vertexai.generative_models import GenerativeModel

tuned_model = GenerativeModel("projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID")
# Set logging configuration
tuned_model.set_request_response_logging_config(
enabled=True,
sampling_rate=1.0,
bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
enable_otel_logging=True
)
REST API
You can enable request-response logging when you create an endpoint using projects.locations.endpoints.create, or patch an existing endpoint using projects.locations.endpoints.patch.

When you create or patch an endpoint, populate the predictRequestResponseLoggingConfig field of the Endpoint resource with the following entries:
- enabled: Set to true to enable request-response logging.
- samplingRate: To reduce storage costs, you can set a number between 0 and 1 to define the fraction of requests to log. For example, a value of 1 logs all requests, and a value of 0.1 logs 10% of requests.
- bigqueryDestination: The BigQuery table to use for logging. If you specify only a project name, Vertex AI creates a new dataset with the name logging_ENDPOINT_DISPLAY_NAME_ENDPOINT_ID, where ENDPOINT_DISPLAY_NAME follows the BigQuery naming rules. If you don't specify a table name, Vertex AI creates a new table with the name request_response_logging.
- enableOtelLogging: Set to true to enable OpenTelemetry (OTEL) logging in addition to the default request-response logging.
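As an illustration of the default destination names described above, here is a small sketch. The sanitizer is an assumption for illustration only; the documentation says the display name "follows the BigQuery naming rules" but does not specify the exact transformation Vertex AI applies.

```python
import re

# Illustrative sketch of the default BigQuery destination names. The exact
# mapping Vertex AI uses is not specified; this sanitizer is an assumption.

def default_logging_destination(endpoint_display_name: str, endpoint_id: str):
    # BigQuery dataset names allow only letters, numbers, and underscores,
    # so replace anything else with an underscore.
    safe_name = re.sub(r"[^A-Za-z0-9_]", "_", endpoint_display_name)
    dataset = f"logging_{safe_name}_{endpoint_id}"
    table = "request_response_logging"  # default table name
    return dataset, table

dataset, table = default_logging_destination("my tuned endpoint", "1234567890")
print(dataset)  # logging_my_tuned_endpoint_1234567890
print(table)    # request_response_logging
```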
{
"predictRequestResponseLoggingConfig": {
"enabled": true,
"samplingRate": 0.5,
"bigqueryDestination": {
"outputUri": "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME"
},
"enableOtelLogging": true
}
}
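To make the fields concrete, the following sketch builds the configuration shown above and simulates what a samplingRate of 0.5 means in practice. The function name is our own, and the simulation is only illustrative; how Vertex AI samples requests server-side is not specified here.

```python
import random

# Sketch: assemble the predictRequestResponseLoggingConfig block and
# illustrate the effect of samplingRate. Helper name is illustrative.

def logging_config(sampling_rate: float, bq_uri: str, otel: bool = True) -> dict:
    if not 0 <= sampling_rate <= 1:
        raise ValueError("samplingRate must be between 0 and 1")
    return {
        "predictRequestResponseLoggingConfig": {
            "enabled": True,
            "samplingRate": sampling_rate,
            "bigqueryDestination": {"outputUri": bq_uri},
            "enableOtelLogging": otel,
        }
    }

config = logging_config(0.5, "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME")

# With samplingRate 0.5, each request is logged with probability 0.5,
# so roughly half of 10,000 requests would end up in BigQuery.
random.seed(0)
logged = sum(random.random() < 0.5 for _ in range(10_000))
print(logged)  # roughly 5,000
```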
Get logging configuration
REST API
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID
curl
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID"

PowerShell
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID" | Select-Object -Expand Content

Disable logging
Python SDK
from vertexai.generative_models import GenerativeModel

tuned_model = GenerativeModel("projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID")
# Disable request-response logging
tuned_model.set_request_response_logging_config(
enabled=False,
sampling_rate=1.0,
bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
enable_otel_logging=False
)
REST API
{
"predictRequestResponseLoggingConfig": {
"enabled": false
}
}
Logging table schema
| Field name | Type | Notes |
| --- | --- | --- |
| endpoint | STRING | Resource name of the endpoint to which the tuned model is deployed. |
| deployed_model_id | STRING | Deployed model ID for a tuned model deployed to an endpoint. |
| logging_time | TIMESTAMP | The time that logging is performed. This is roughly the time that the response is returned. |
| request_id | NUMERIC | The auto-generated integer request ID based on the API request. |
| request_payload | STRING | Included for partner model logging and backward compatibility with the Vertex AI endpoint request-response log. |
| response_payload | STRING | Included for partner model logging and backward compatibility with the Vertex AI endpoint request-response log. |
| model | STRING | Model resource name. |
| model_version | STRING | The model version. This is often "default" for Gemini models. |
| api_method | STRING | generateContent, streamGenerateContent, rawPredict, or streamRawPredict |
| full_request | JSON | The full GenerateContentRequest. |
| full_response | JSON | The full GenerateContentResponse. |
| metadata | JSON | Any metadata of the call; contains the request latency. |
| otel_log | JSON | Logs in OpenTelemetry schema format. Only available if otel_logging is enabled in the logging configuration. |

What's next
Log requests and responses
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-27 UTC.