Analyze log volume with Log Analytics

This document describes how you can use Log Analytics to estimate the billable volume of your log entries. You can write queries that report and aggregate your billable volume by different dimensions, like resource type or application name, and then chart and view the query results.

How to query for billable volume

The billable volume of a log entry, which is the size that is reported to Cloud Billing, is available through the storage_bytes field. In your queries, you can use the storage_bytes field in the same way that you use any schema field whose data type is INTEGER. For example, you can include it in SELECT clauses, in CASE statements, and in common table expressions.
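
For example, the following query is a minimal sketch that classifies log entries by their billable volume in a common table expression; it assumes that you replace TABLE with the name of a log view table, and the 1,024-byte threshold is chosen only for illustration:

WITH sized_entries AS (
  SELECT
    log_id,
    storage_bytes,
    -- Classify each entry by its billable volume; the threshold is illustrative.
    CASE
      WHEN storage_bytes >= 1024 THEN 'large'
      ELSE 'small'
    END AS size_class
  FROM
    `TABLE`
)
SELECT
  size_class,
  COUNT(*) AS entries,
  SUM(storage_bytes) AS total_bytes
FROM
  sized_entries
GROUP BY
  size_class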

Because Cloud Billing uses the billable volume when determining your costs, you can write queries that help you understand the sources of your costs. For example, you can write queries that help you determine which applications are writing the most log entries. To learn how to relate billable volume to cost, see Cloud Logging pricing summary and Cloud Logging pricing.

The billable volume of a log entry isn't the size of the LogEntry object that was sent to the Cloud Logging API. The billable volume includes bytes that are required for serialization and metadata.

Before you begin

  1. To get the permissions that you need to use Log Analytics to run queries and view logs, ask your administrator to grant you the following IAM roles on your project:

    • To query the _Required and _Default log buckets: Logs Viewer (roles/logging.viewer)
    • To query custom log buckets: Logs View Accessor (roles/logging.viewAccessor)

    For more information about granting roles, see Manage access.

    You might also be able to get the required permissions through custom roles or other predefined roles.

  2. For the log views that you want to query, go to the Logs Storage page and verify that the log buckets that store those log views are upgraded to use Log Analytics. If necessary, upgrade the log bucket.

    In the Google Cloud console, go to the Logs Storage page:

    Go to Logs Storage

    If you use the search bar to find this page, then select the result whose subheading is Logging.
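
    If you prefer to work from the command line, the following gcloud CLI command is a sketch of one way to upgrade an existing bucket; BUCKET_ID and LOCATION are placeholders that you replace with your own values:

    gcloud logging buckets update BUCKET_ID \
      --location=LOCATION \
      --enable-analytics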

  3. Optional: If you want to query your log data by using a BigQuery dataset, then create a linked BigQuery dataset.
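
    As a sketch, assuming that the bucket is already upgraded to use Log Analytics, you can create the linked dataset by using the gcloud CLI; LINK_ID, BUCKET_ID, and LOCATION are placeholders:

    gcloud logging links create LINK_ID \
      --bucket=BUCKET_ID \
      --location=LOCATION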

Sample queries

This section provides example queries that analyze data from a single log view. If you store data in multiple log views and you want to compute aggregate values across those views, then you need to combine the views by using the UNION operator, as shown in the following sketch.
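
For illustration, the following query is a sketch that assumes two hypothetical log view tables, TABLE_1 and TABLE_2; it combines both views with UNION ALL and then sums the stored bytes for each log name:

SELECT
  log_id AS log_name,
  SUM(storage_bytes) AS total_bytes
FROM (
  -- Combine the rows from both log views before aggregating.
  SELECT log_id, storage_bytes FROM `TABLE_1`
  UNION ALL
  SELECT log_id, storage_bytes FROM `TABLE_2`
)
GROUP BY
  log_name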

You can query your log entries by using the Log Analytics page or anywhere that you can query BigQuery datasets, which includes the BigQuery Studio and Looker Studio pages, and the bq command-line tool.

To use the sample queries, replace TABLE with the name of the table that corresponds to the log view that you want to query.

Query for log volume by app

To compute the total bytes per day, per app, for your log entries that were written against a Google Kubernetes Engine resource and that have a JSON payload, use the following query:

SELECT
  TIMESTAMP_TRUNC(timestamp, DAY) AS day,
  JSON_VALUE(labels["k8s-pod/app"]) AS app_id,
  SUM(storage_bytes) AS total_bytes
FROM
  `TABLE`
WHERE
  json_payload IS NOT NULL
  AND resource.type="k8s_container"
GROUP BY ALL
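
Because Cloud Logging pricing is expressed in GiB, you might want to convert the byte counts. The following variant is a sketch based on the previous query; it divides the total by 2^30, the number of bytes in one GiB:

SELECT
  TIMESTAMP_TRUNC(timestamp, DAY) AS day,
  JSON_VALUE(labels["k8s-pod/app"]) AS app_id,
  -- Convert bytes to gibibytes (1 GiB = 2^30 bytes).
  SUM(storage_bytes) / POW(2, 30) AS total_gib
FROM
  `TABLE`
WHERE
  json_payload IS NOT NULL
  AND resource.type="k8s_container"
GROUP BY ALL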

To visualize the data, you can create a chart. In the following example, the data is displayed as a stacked bar chart. Each bar on the chart displays the total number of bytes stored, organized by app. In this example, the frontend app is generating the most log data:

Example chart showing results of querying for log volume by app.

Query for log volume by log name

To list the number of stored bytes and the log name for each log entry that has a JSON payload and that was written against a Google Kubernetes Engine resource, use the following query:

SELECT
  log_id AS log_name,
  storage_bytes
FROM
  `TABLE`
WHERE
  json_payload IS NOT NULL
  AND resource.type="k8s_container"

The previous query doesn't aggregate the results; instead, there is one row for each log entry, and that row contains a log name and the number of stored bytes. If you chart this data, then you can visualize the proportion of your log data that is written to different logs:

Example chart showing results of querying for log volume by log name.

The previous chart shows that most log data is written to the log named stdout.
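
If you want one row for each log name instead of one row for each log entry, you can aggregate the results. The following query is a sketch of an aggregated variant of the previous query:

SELECT
  log_id AS log_name,
  SUM(storage_bytes) AS total_bytes
FROM
  `TABLE`
WHERE
  json_payload IS NOT NULL
  AND resource.type="k8s_container"
GROUP BY
  log_name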

Use the bq command-line tool to query for log volume by log name

You can include the storage_bytes field in queries that you run through the BigQuery Studio page or by using the bq command-line tool.

The following query reports the log name and the number of stored bytes for each log entry:

bq query --use_legacy_sql=false 'SELECT log_id AS log_name,
  storage_bytes FROM `TABLE`'

The result of this query is similar to the following:

+----------+---------------+
| log_name | storage_bytes |
+----------+---------------+
| stdout   |           716 |
| stdout   |           699 |
| stdout   |           917 |
| stdout   |           704 |

Each row corresponds to one log entry. The value of the storage_bytes column is the billable volume for that log entry.
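
To aggregate the results at the command line, group by the log name in the same way. The following command is a sketch that reports the total stored bytes for each log name:

bq query --use_legacy_sql=false 'SELECT log_id AS log_name,
  SUM(storage_bytes) AS total_bytes FROM `TABLE` GROUP BY log_name'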

Limitations

The storage_bytes field is available only when the following are true:

  • The log bucket is upgraded to use Log Analytics.
  • Your query is executed on the Log Analytics page or anywhere you can query BigQuery datasets, which includes the BigQuery Studio and Looker Studio pages, and the bq command-line tool.
  • The log entry was written on or after January 1, 2024.