JOBS_TIMELINE view
The INFORMATION_SCHEMA.JOBS_TIMELINE
view contains near real-time
BigQuery metadata by timeslice for all jobs submitted in the
current project. This view contains currently running and completed jobs.
Required permissions
To query the INFORMATION_SCHEMA.JOBS_TIMELINE
view, you need the
bigquery.jobs.listAll
Identity and Access Management (IAM) permission for the project.
Each of the following predefined IAM roles includes the required
permission:
- Project Owner
- BigQuery Admin
For more information about BigQuery permissions, see Access control with IAM.
Schema
When you query the INFORMATION_SCHEMA.JOBS_TIMELINE_BY_*
views, the query
results contain one row for every second of execution of every
BigQuery job. Each period starts on a whole-second interval and
lasts exactly one second.
The INFORMATION_SCHEMA.JOBS_TIMELINE_BY_*
view has the following schema:
Column name | Data type | Value |
---|---|---|
period_start |
TIMESTAMP |
Start time of this period. |
period_slot_ms |
INTEGER |
Slot milliseconds consumed in this period. |
project_id |
STRING |
(Clustering column) ID of the project. |
project_number |
INTEGER |
Number of the project. |
user_email |
STRING |
(Clustering column) Email address or service account of the user who ran the job. |
job_id |
STRING |
ID of the job. For example, bquxjob_1234 . |
job_type |
STRING |
The type of the job. Can be QUERY , LOAD ,
EXTRACT , COPY , or null . Job
type null indicates an internal job, such as script job
statement evaluation or materialized view refresh. |
statement_type |
STRING |
The type of query statement, if valid. For example,
SELECT , INSERT , UPDATE , or
DELETE . |
priority |
STRING |
The priority of this job. Valid values include INTERACTIVE and
BATCH . |
parent_job_id |
STRING |
ID of the parent job, if any. |
job_creation_time |
TIMESTAMP |
(Partitioning column) Creation time of this job. Partitioning is based on the UTC time of this timestamp. |
job_start_time |
TIMESTAMP |
Start time of this job. |
job_end_time |
TIMESTAMP |
End time of this job. |
state |
STRING |
Running state of the job at the end of this period. Valid states
include PENDING , RUNNING , and
DONE . |
reservation_id |
STRING |
Name of the primary reservation assigned to this job at the end of this period, if applicable. |
edition |
STRING |
The edition associated with the reservation assigned to this job. For more information about editions, see Introduction to BigQuery editions. |
total_bytes_billed |
INTEGER |
If the project is configured to use on-demand pricing, then this field contains the total bytes billed for the job. If the project is configured to use flat-rate pricing, then you are not billed for bytes and this field is informational only. |
total_bytes_processed |
INTEGER |
Total bytes processed by the job. |
error_result |
RECORD |
Details of error (if any) as an
ErrorProto.
|
cache_hit |
BOOLEAN |
Whether the query results of this job were from a cache. |
period_shuffle_ram_usage_ratio |
FLOAT |
Shuffle usage ratio in the selected time period. |
period_estimated_runnable_units |
INTEGER |
Units of work that can be scheduled immediately in this period. Additional slots for these units of work accelerate your query, provided no other query in the reservation needs additional slots. |
transaction_id |
STRING |
ID of the transaction in which this job ran, if any. (Preview) |
Data retention
This view contains currently running jobs and the job history of the past 180 days.
Scope and syntax
Queries against this view must include a region qualifier. If you do not specify a regional qualifier, metadata is retrieved from all regions. The following table explains the region scope for this view:
View name | Resource scope | Region scope |
---|---|---|
[PROJECT_ID.]`region-REGION`.INFORMATION_SCHEMA.JOBS_TIMELINE[_BY_PROJECT] |
Project level | REGION |
Optional: PROJECT_ID
: the ID of your
Google Cloud project. If not specified, the default project is used.
REGION
: any dataset region name.
For example, `region-us`
.
Examples
To run the query against a project other than your default project, add the project ID in the following format:
`PROJECT_ID`.`region-REGION_NAME`.INFORMATION_SCHEMA.VIEW
`myproject`.`region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE
.
The following example calculates the slot utilization for every second in the last day:
SELECT period_start, SUM(period_slot_ms) AS total_slot_ms, FROM `reservation-admin-project.region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE WHERE period_start BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY) AND CURRENT_TIMESTAMP() GROUP BY period_start ORDER BY period_start DESC;
+---------------------+---------------+ | period_start | total_slot_ms | +---------------------+---------------+ | 2020-07-29 03:52:14 | 122415176 | | 2020-07-29 03:52:15 | 141107048 | | 2020-07-29 03:52:16 | 173335142 | | 2020-07-28 03:52:17 | 131107048 | +---------------------+---------------+
You can check usage for a particular reservation with
WHERE reservation_id = "…"
. For script jobs, the parent job also reports the
total slot usage from its children jobs. To avoid double counting, use
WHERE statement_type != "SCRIPT"
to exclude the parent job.
Example: Number of RUNNING
and PENDING
jobs over time
To run the query against a project other than your default project, add the project ID in the following format:
`PROJECT_ID`.`region-REGION_NAME`.INFORMATION_SCHEMA.VIEW
`myproject`.`region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE
.
The following example computes the number of RUNNING
and PENDING
jobs at every
second in the last day:
SELECT period_start, SUM(IF(state = "PENDING", 1, 0)) as PENDING, SUM(IF(state = "RUNNING", 1, 0)) as RUNNING FROM `reservation-admin-project.region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE WHERE period_start BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY) AND CURRENT_TIMESTAMP() GROUP BY period_start;
The result is similar to the following:
+---------------------+---------+---------+ | period_start | PENDING | RUNNING | +---------------------+---------+---------+ | 2020-07-29 03:52:14 | 7 | 27 | | 2020-07-29 03:52:15 | 1 | 21 | | 2020-07-29 03:52:16 | 5 | 21 | | 2020-07-29 03:52:17 | 4 | 22 | +---------------------+---------+---------+
Example: Resource usage by jobs at a specific point in time
To run the query against a project other than your default project, add the project ID in the following format:
`PROJECT_ID`.`region-REGION_NAME`.INFORMATION_SCHEMA.VIEW
`myproject`.`region-us`.INFORMATION_SCHEMA.JOBS
.
The following example returns the job_id
of all jobs running at a specific
point in time together with their resource usage during that one-second period:
SELECT job_id, period_slot_ms FROM `reservation-admin-project.region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE_BY_PROJECT WHERE period_start = '2020-07-29 03:52:14' AND statement_type != 'SCRIPT';
The result is similar to the following:
+------------------+ | job_id | slot_ms | +------------------+ | job_1 | 2415176 | | job_2 | 4417245 | | job_3 | 427416 | | job_4 | 1458122 | +------------------+
Example: Match slot usage behavior from administrative resource charts
You can use
administrative resource charts to
monitor your organization's health, slot usage, and BigQuery jobs
performance over time. The following example queries the
INFORMATION_SCHEMA.JOBS_TIMELINE
view for a slot usage timeline at one-hour
intervals, similar to the information that is available in administrative
resource charts.
WITH snapshot_data AS ( SELECT UNIX_MILLIS(period_start) AS period_start, IFNULL(SUM(period_slot_ms), 0) AS period_slot_ms, DIV(UNIX_MILLIS(period_start), 3600000 * 1) * 3600000 * 1 AS time_ms FROM ( SELECT * FROM `user_proj.region-US`.INFORMATION_SCHEMA.JOBS_TIMELINE_BY_ORGANIZATION WHERE ((job_creation_time >= TIMESTAMP_SUB(@start_time, INTERVAL 1200 MINUTE) AND job_creation_time < TIMESTAMP(@end_time)) AND period_start >= TIMESTAMP(@start_time) AND period_start < TIMESTAMP(@end_time)) AND (statement_type != "SCRIPT" OR statement_type IS NULL) AND REGEXP_CONTAINS(reservation_id, "^user_proj:") ) GROUP BY period_start, time_ms ), data_by_time AS ( SELECT time_ms, SUM(period_slot_ms) / (3600000 * 1) AS submetric_value FROM snapshot_data GROUP BY time_ms ) SELECT time_ms, IFNULL(submetric_value, 0) AS submetric_value, "Slot Usage" AS resource_id, IFNULL(SUM(submetric_value) OVER () / (TIMESTAMP_DIFF(@end_time, @start_time, HOUR) / 1), 0) AS overall_average_slot_count FROM ( SELECT time_ms * 3600000 * 1 AS time_ms FROM UNNEST(GENERATE_ARRAY(DIV(UNIX_MILLIS(@start_time), 3600000 * 1), DIV(UNIX_MILLIS(@end_time), 3600000 * 1) - 1, 1)) AS time_ms ) LEFT JOIN data_by_time USING(time_ms) ORDER BY time_ms DESC;