Query a linked BigQuery dataset

This document describes how to create a linked dataset for a log bucket, and how to query and view logs in that dataset. For example, you might create a linked dataset so that you can run queries on Log Analytics by using your reserved BigQuery slots. Similarly, you might want a linked dataset so that you can write queries in BigQuery or in Looker Studio that join your log data with other business datata.

If you aren't familiar with Log Analytics, then see Log Analytics overview.

When you need a linked dataset

You don't need a linked BigQuery dataset to issue queries to a log view on the Log Analytics page when you use the default Cloud Logging service. You can save and share your queries, and you can save the results to a custom dashboard.

You do need a linked BigQuery dataset when you want to do any of the following:

A linked BigQuery dataset for a log bucket lets BigQuery read the data in the log bucket.

If you run your queries on reserved BigQuery slots, then your queries are subject to capacity compute pricing. Also, if you query your data by using a service other than Log Analytics, then your queries might be subject to other charges based on that service. See the pricing page for the service that you are using.

How log buckets and log views map to BigQuery objects

When you create a linked BigQuery dataset for a log bucket, a new dataset becomes available to BigQuery. You specify the name of the linked dataset during the linking process. Any log views within the log bucket are automatically mapped to virtual views in the dataset. The name of a virtual view is the same as the corresponding log view.

For example, suppose the project my_project has a log bucket named example, and that this log bucket has a log view named _AllLogs. If you create a linked dataset for that log bucket with the name my_dataset, then the Explorer page of BigQuery displays a my_project entry. A child of the my_project entry is a dataset with the name my_dataset, and that dataset lists a virtual view named _AllLogs.

Access control implications when using linked datasets

When a principal queries a view on a linked BigQuery dataset, that query runs through the BigQuery permission layer, not the Cloud Logging permission layer. Therefore, the BigQuery roles and permissions granted to the principal on the linked dataset determine whether they can query the dataset.

You can't restrict a principal to a specific virtual view within a linked dataset. When a principal has been granted access to the linked dataset, they can query every view in that dataset.

Before you begin

This section describes steps that you must complete before you can use Log Analytics.

Configure log buckets

Ensure that your log buckets have been upgraded to use Log Analytics:

  1. In the Google Cloud console, go to the Logs Storage page:

    Go to Logs Storage

    If you use the search bar to find this page, then select the result whose subheading is Logging.

  2. For each log bucket that has a log view that you want to query, ensure that the Log Analytics available column displays Open. If Upgrade is shown, then click Upgrade and complete the dialog.

Configure IAM roles and permissions

This section describes the IAM roles or permissions that are required to use Log Analytics:

  • To get the permissions that you need to use Log Analytics and query log views, ask your administrator to grant you the following IAM roles on your project:

    • To query the _Required and _Default log buckets: Logs Viewer (roles/logging.viewer)
    • To query all log views in a project: Logs View Accessor (roles/logging.viewAccessor)

    You can restrict a principal to a specific log view either by adding an IAM condition to the Logs View Accessor role grant made at the project level, or by adding an IAM binding to the policy file of the log view. For more information, see Control access to a log view.

    These are the same permissions that you need to view log entries on the Logs Explorer page. For information about additional roles that you need to query views on user-defined buckets or to query the _AllLogs view of the _Default log bucket, see Cloud Logging roles.

  • To get the permissions that you need to create and query linked datasets, ask your administrator to grant you the following IAM roles on the project that stores the log bucket:

    • To create and view linked datasets: Logs Configuration Writer (roles/logging.configWriter)
    • To run queries on linked datasets using reserved BigQuery slots:
    • To view linked datasets in BigQuery Studio, grant all roles mentioned in this step, and the following role. You can grant this role to a dataset or to your project: BigQuery Data Viewer (roles/bigquery.dataViewer)

Verify your BigQuery edition

If you want to query your log data from the Log Analytics page by using reserved BigQuery slots and if you are using projects with Virtual Private Cloud (VPC) Service Controls, then ensure you are using the Enterprise Edition of BigQuery. For more information, see Understand BigQuery editions.

When you want to use the capabilities of BigQuery to analyze your log data, upgrade a log bucket to use Log Analytics, and then create a linked dataset. With this configuration, Logging stores your log data but BigQuery can read the log data.

Google Cloud console

To create a link to a BigQuery dataset for an existing log bucket, do the following:

  1. In the Google Cloud console, go to the Logs Storage page:

    Go to Logs Storage

    If you use the search bar to find this page, then select the result whose subheading is Logging.

  2. Locate the log bucket and verify that the Log Analytics available column displays Open.

    If this column displays Upgrade, then the log bucket hasn't been upgraded to use Log Analytics. Configure Log Analytics:

    1. Click Upgrade.
    2. Click Confirm in the dialog.

    After the upgrade completes, proceed to the next step.

  3. On the log bucket, click More, and then click Edit bucket.

    The Edit log bucket dialog opens.

  4. Select Create a new BigQuery dataset that links to this bucket and enter the name for the new dataset.

    The dataset name must be unique for each Google Cloud project. If you enter the name of an existing dataset, then you receive the following error: Dataset name must be unique in the selected region.

  5. Click Done and then click Update bucket.

    After Logging displays the linked dataset name on the Logs Storage page, it might take several minutes before BigQuery recognizes the dataset.

gcloud

To create a linked dataset for a log bucket that is upgraded to use Log Analytics, run the gcloud logging links create command:

gcloud logging links create LINK_ID --bucket=BUCKET_ID --location=LOCATION

The LINK_ID that you provide is used as the name of the BigQuery dataset, and the value of this field must be unique for your Google Cloud project.

The links create command is asynchronous. The return of an asynchronous method is an Operation object, and it contains information about the progress of the method. When the method completes, the Operation object contains the status. For more information, see Asynchronous API methods.

The links create command takes several minutes to complete.

For example, the following command creates a linked dataset named mylink for the log bucket named my-bucket:

gcloud logging links create mylink --bucket=my-bucket --location=global

The dataset name must be unique for each Google Cloud project. If you attempt to create a dataset with the same name as an existing dataset, then you receive the following error:

BigQuery dataset with name "LINK_ID" already exists.

If you attempt to create a linked dataset for a log bucket that isn't upgraded to use Log Analytics, then the following error is reported:

A link can only be created for an analytics-enabled bucket.

REST

To create a linked a BigQuery dataset for an existing log bucket that is upgraded use Log Analytics, call the asynchronous projects.locations.buckets.links.create method of the Cloud Logging API.

Prepare the arguments to the method as follows:

  1. Construct the request body for the create command. The request body is formatted as a Link object.
  2. For the query parameter of the command, use linkId=LINK_ID. The LINK_ID that you provide is used as the name of the BigQuery dataset, and the value of this field must be unique for your Google Cloud project..

The response to the asynchronous methods is an Operation object. This object contains information about the progress of the method. When the method completes, the Operation object contains the status. For more information, see Asynchronous API methods.

The links.create method takes several minutes to complete.

The dataset name must be unique for each Google Cloud project. If you attempt to create a dataset with the same name as an existing dataset, then you receive the following error:

BigQuery dataset with name "LINK_ID" already exists.

If you attempt to create a linked dataset for a log bucket that isn't upgraded to use Log Analytics, then the following error is reported:

A link can only be created for an analytics-enabled bucket.

Run queries from BigQuery

When you have a log bucket that uses Log Analytics and linked datasets, you can view and query your linked datasets by using the BigQuery Studio page. With this configuration, you can analyze your datasets using commands, workflows, and datasets available only in BigQuery Studio.

To open BigQuery from the Log Analytics and then query linked dataset, do the following:

  1. Ensure that a linked dataset exists for the log bucket which hosts the log view that you want to query:

    1. In the Google Cloud console, go to the Logs Storage page:

      Go to Logs Storage

      If you use the search bar to find this page, then select the result whose subheading is Logging.

    2. Locate the log bucket and verify that the BigQuery linked dataset column displays a URL. If the entry is empty, then you must create a linked dataset. For information on these steps, see Create a linked BigQuery dataset

  2. In the Google Cloud console, go to the Log Analytics page:

    Go to Log Analytics

    If you use the search bar to find this page, then select the result whose subheading is Logging.

  3. In the Log views list, find the log view, and then select Query. The Query pane is populated with a default query. You can also enter a query in the Query pane, or edit a displayed query.

  4. In the toolbar, expand the button labeled Run query or Run on BigQuery, and select Open in BigQuery.

    The BigQuery Studio page opens. The FROM statement of the query is modified to specify the path to the log view on the linked dataset by using the BigQuery Table path syntax.

    You can also edit the displayed query.

  5. On the BigQuery Studio page, click Run query.

Run Log Analytics queries on your reserved BigQuery slots

When you want to minimize the execution time of your Log Analytics queries, run them on your reserved BigQuery slots. The Log Analytics page is preconfigured to use the default Log Analytics service, which means that your queries must compete for limited slot space with other queries. When there aren't available slots, your the execution of your query is delayed. You can eliminate these delays by running your queries on reserved BigQuery slots.

To issue a SQL query to a log view, do the following:

  1. Ensure that a linked dataset exists for the log bucket which hosts the log view that you want to query:

    1. In the Google Cloud console, go to the Logs Storage page:

      Go to Logs Storage

      If you use the search bar to find this page, then select the result whose subheading is Logging.

    2. Locate the log bucket and verify that the BigQuery linked dataset column displays a URL. If the entry is empty, then you must create a linked dataset. For information on these steps, see Create a linked BigQuery dataset

  2. Ensure that you have configured your reserved BigQuery slots:

    1. Create a reservation with dedicated slots
    2. Create reservation assignments
  3. Configure Log Analytics to run your queries on your reserved BigQuery slots:

    1. In the Google Cloud console, go to the Log Analytics page:

      Go to Log Analytics

      If you use the search bar to find this page, then select the result whose subheading is Logging.

    2. In the Log views list, find the view, and then select Query or enter a query.

      If the query pane displays an error message that references the FROM statement, then the table can't be resolved to a specific log view. For information about how to resolve this failure, see Error FROM clause must contain exactly one log view.

    3. Go to the toolbar and ensure that a button labeled Run on BigQuery is displayed.

      If the toolbar displays Run Query, then click Settings and select BigQuery.

      If the Run on BigQuery button is disabled, then you need to create a linked dataset.

    4. Run your query.

      You can use the toolbar options to format your query, clear the query, and open the BigQuery SQL reference documentation.

What's next