Use the BigQuery advanced runtime
This document describes how to enable and disable the BigQuery advanced runtime, and how to evaluate the advanced runtime's effects on your query performance.
Roles and permissions
To get the permissions that
you need to specify a configuration setting,
ask your administrator to grant you the
BigQuery Admin (roles/bigquery.admin
)
IAM role on your project or organization.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Features
Enabling the advanced runtime on a BigQuery project turns on features in the query processor that reduce query latency and slot consumption at no additional cost.
Enhanced vectorization
Vectorized execution is a query processing model that operates on columns of data in blocks that align with CPU cache size and uses single instruction, multiple data (SIMD) instructions. Enhanced vectorization extends the vectorized query execution in BigQuery to the following aspects of query processing:
- By leveraging specialized data encodings within the Capacitor storage format, filter evaluation operations can be executed on the encoded data.
- Specialized encodings are propagated through the query plan, which allows more data to be processed while it's still encoded.
- By implementing expression folding to evaluate deterministic functions and constant expressions, BigQuery can simplify complex predicates into constant values.
Short query optimizations
BigQuery typically executes queries in a distributed environment using a shuffle intermediate layer. Short query optimizations dynamically identify queries that can be run as a single stage, reducing latency and slot consumption. Specialized encodings can be used more effectively when a query is run in a single stage. These optimizations are most effective when used with optional job creation mode, which minimizes job startup, maintenance, and result retrieval latency.
Eligibility for short query optimization is dynamic and influenced by the following factors:
- The predicted size of the data scan.
- The amount of data movement required.
- The selectivity of query filters.
- The type and physical layout of the data in storage.
- The overall query structure.
- The historical statistics of past query executions.
Enable the advanced runtime
To enable the advanced runtime for your project or organization, use the
ALTER PROJECT
or
ALTER ORGANIZATION
statement to change the
default configuration. In the
statement, set the query_runtime
argument to 'advanced'
. For example:
ALTER PROJECTPROJECT_NAME
SET OPTIONS ( `region-LOCATION
.query_runtime` = 'advanced' );
Replace the following:
PROJECT_NAME
: the name of the projectLOCATION
: the location of the project
It can take several minutes for the change to take effect.
Once you've enabled the advanced runtime, qualifying queries in the project or organization use the advanced runtime regardless of which user created the query job.
Disable the advanced runtime
To disable the advanced runtime for your project or organization, use the
ALTER PROJECT
or ALTER ORGANIZATION
statement to change the
default configuration. In the
statement, set the query_runtime
argument to NULL
. For example:
ALTER PROJECTPROJECT_NAME
SET OPTIONS ( `region-LOCATION
.query_runtime` = NULL );
Replace the following:
PROJECT_NAME
: the name of the projectLOCATION
: the location of the project
It can take several minutes for the change to take effect.
Evaluate query performance
You can use the
administrative job explorer and
INFORMATION_SCHEMA
views to
evaluate the effect of the advanced runtime on query execution time and slot
usage.
Follow these steps to evaluate query performance with and without the advanced runtime enabled:
In the Google Cloud console, go to the BigQuery page.
Open a new tab in the query editor.
Disable the use of cached query results for that query tab.
Type or copy your test queries into the query tab.
Run your test queries a few times to establish baseline performance. After each run, determine the query performance metrics as follows:
- View the query execution details in the administrative job explorer.
Retrieve job performance data from the
INFORMATION_SCHEMA.JOBS_BY_USER
view by running the following query in a new query tab:SELECT job_id, end_time - start_time AS duration, total_slot_ms, query FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_USER WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY) AND job_type='QUERY' AND total_slot_ms IS NOT NULL ORDER BY creation_time DESC, query ASC LIMIT 1000;
Repeat Step 5.
Compare the query latency and slot usage metrics for the test queries from before and after you enabled the advanced runtime.