Use the BigQuery advanced runtime

This document describes how to enable and disable the BigQuery advanced runtime, and how to evaluate the advanced runtime's effects on your query performance.

Roles and permissions

To get the permissions that you need to specify a configuration setting, ask your administrator to grant you the BigQuery Admin (roles/bigquery.admin) IAM role on your project or organization. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Features

Enabling the advanced runtime on a BigQuery project turns on features in the query processor that reduce query latency and slot consumption at no additional cost.

Enhanced vectorization

Vectorized execution is a query processing model that operates on columns of data in blocks that align with CPU cache size and uses single instruction, multiple data (SIMD) instructions. Enhanced vectorization extends the vectorized query execution in BigQuery to the following aspects of query processing:

  • By leveraging specialized data encodings within the Capacitor storage format, filter evaluation operations can be executed on the encoded data.
  • Specialized encodings are propagated through the query plan, which allows more data to be processed while it's still encoded.
  • By implementing expression folding to evaluate deterministic functions and constant expressions, BigQuery can simplify complex predicates into constant values.

Short query optimizations

BigQuery typically executes queries in a distributed environment using a shuffle intermediate layer. Short query optimizations dynamically identify queries that can be run as a single stage, reducing latency and slot consumption. Specialized encodings can be used more effectively when a query is run in a single stage. These optimizations are most effective when used with optional job creation mode, which minimizes job startup, maintenance, and result retrieval latency.

Eligibility for short query optimization is dynamic and influenced by the following factors:

  • The predicted size of the data scan.
  • The amount of data movement required.
  • The selectivity of query filters.
  • The type and physical layout of the data in storage.
  • The overall query structure.
  • The historical statistics of past query executions.

Enable the advanced runtime

To enable the advanced runtime for your project or organization, use the ALTER PROJECT or ALTER ORGANIZATION statement to change the default configuration. In the statement, set the query_runtime argument to 'advanced'. For example:

ALTER PROJECT PROJECT_NAME
SET OPTIONS (
  `region-LOCATION.query_runtime` = 'advanced'
);

Replace the following:

  • PROJECT_NAME: the name of the project
  • LOCATION: the location of the project

It can take several minutes for the change to take effect.

Once you've enabled the advanced runtime, qualifying queries in the project or organization use the advanced runtime regardless of which user created the query job.

Disable the advanced runtime

To disable the advanced runtime for your project or organization, use the ALTER PROJECT or ALTER ORGANIZATION statement to change the default configuration. In the statement, set the query_runtime argument to NULL. For example:

ALTER PROJECT PROJECT_NAME
SET OPTIONS (
  `region-LOCATION.query_runtime` = NULL
);

Replace the following:

  • PROJECT_NAME: the name of the project
  • LOCATION: the location of the project

It can take several minutes for the change to take effect.

Evaluate query performance

You can use the administrative job explorer and INFORMATION_SCHEMA views to evaluate the effect of the advanced runtime on query execution time and slot usage.

Follow these steps to evaluate query performance with and without the advanced runtime enabled:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. Open a new tab in the query editor.

  3. Disable the use of cached query results for that query tab.

  4. Type or copy your test queries into the query tab.

  5. Run your test queries a few times to establish baseline performance. After each run, determine the query performance metrics as follows:

    1. View the query execution details in the administrative job explorer.
    2. Retrieve job performance data from the INFORMATION_SCHEMA.JOBS_BY_USER view by running the following query in a new query tab:

      SELECT
          job_id,
          end_time - start_time AS duration,
          total_slot_ms,
          query
      FROM
        `region-us`.INFORMATION_SCHEMA.JOBS_BY_USER
      WHERE
          creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
          AND job_type='QUERY'
          AND total_slot_ms IS NOT NULL
      ORDER BY
          creation_time DESC,
          query ASC
      LIMIT 1000;
      
  6. Enable the advanced runtime.

  7. Repeat Step 5.

  8. Compare the query latency and slot usage metrics for the test queries from before and after you enabled the advanced runtime.