JOBS_TIMELINE view

The INFORMATION_SCHEMA.JOBS_TIMELINE view contains near real-time BigQuery metadata by timeslice for all jobs submitted in the current project. This view contains currently running and completed jobs.

Required permissions

To query the INFORMATION_SCHEMA.JOBS_TIMELINE view, you need the bigquery.jobs.listAll Identity and Access Management (IAM) permission for the project. Each of the following predefined IAM roles includes the required permission:

  • Project Owner
  • BigQuery Admin

For more information about BigQuery permissions, see Access control with IAM.

Schema

When you query the INFORMATION_SCHEMA.JOBS_TIMELINE_BY_* views, the query results contain one row for every second of execution of every BigQuery job. Each period starts on a whole-second interval and lasts exactly one second.

The INFORMATION_SCHEMA.JOBS_TIMELINE_BY_* view has the following schema:

Column name Data type Value
period_start TIMESTAMP Start time of this period.
period_slot_ms INTEGER Slot milliseconds consumed in this period.
project_id STRING (Clustering column) ID of the project.
project_number INTEGER Number of the project.
user_email STRING (Clustering column) Email address or service account of the user who ran the job.
job_id STRING ID of the job. For example, bquxjob_1234.
job_type STRING The type of the job. Can be QUERY, LOAD, EXTRACT, COPY, or null. Job type null indicates an internal job, such as script job statement evaluation or materialized view refresh.
statement_type STRING The type of query statement, if valid. For example, SELECT, INSERT, UPDATE, or DELETE.
priority STRING The priority of this job. Valid values include INTERACTIVE and BATCH.
parent_job_id STRING ID of the parent job, if any.
job_creation_time TIMESTAMP (Partitioning column) Creation time of this job. Partitioning is based on the UTC time of this timestamp.
job_start_time TIMESTAMP Start time of this job.
job_end_time TIMESTAMP End time of this job.
state STRING Running state of the job at the end of this period. Valid states include PENDING, RUNNING, and DONE.
reservation_id STRING Name of the primary reservation assigned to this job at the end of this period, if applicable.
edition STRING The edition associated with the reservation assigned to this job. For more information about editions, see Introduction to BigQuery editions.
total_bytes_billed INTEGER If the project is configured to use on-demand pricing, then this field contains the total bytes billed for the job. If the project is configured to use flat-rate pricing, then you are not billed for bytes and this field is informational only.
total_bytes_processed INTEGER Total bytes processed by the job.
error_result RECORD Details of error (if any) as an ErrorProto.
cache_hit BOOLEAN Whether the query results of this job were from a cache.
period_shuffle_ram_usage_ratio FLOAT Shuffle usage ratio in the selected time period.
period_estimated_runnable_units INTEGER Units of work that can be scheduled immediately in this period. Additional slots for these units of work accelerate your query, provided no other query in the reservation needs additional slots.
transaction_id STRING ID of the transaction in which this job ran, if any. (Preview)

Data retention

This view contains currently running jobs and the job history of the past 180 days.

Scope and syntax

Queries against this view must include a region qualifier. If you do not specify a regional qualifier, metadata is retrieved from all regions. The following table explains the region scope for this view:

View name Resource scope Region scope
[PROJECT_ID.]`region-REGION`.INFORMATION_SCHEMA.JOBS_TIMELINE[_BY_PROJECT] Project level REGION
Replace the following:

  • Optional: PROJECT_ID: the ID of your Google Cloud project. If not specified, the default project is used.

  • REGION: any dataset region name. For example, `region-us`.

  • Examples

    To run the query against a project other than your default project, add the project ID in the following format:

    `PROJECT_ID`.`region-REGION_NAME`.INFORMATION_SCHEMA.VIEW
    ; for example, `myproject`.`region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE.

    The following example calculates the slot utilization for every second in the last day:

    SELECT
      period_start,
      SUM(period_slot_ms) AS total_slot_ms,
    FROM
      `reservation-admin-project.region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE
    WHERE
      period_start BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY) AND CURRENT_TIMESTAMP()
    GROUP BY
      period_start
    ORDER BY
      period_start DESC;
    Note: `INFORMATION_SCHEMA` view names are case-sensitive. The result is similar to the following:
    +---------------------+---------------+
    |    period_start     | total_slot_ms |
    +---------------------+---------------+
    | 2020-07-29 03:52:14 |     122415176 |
    | 2020-07-29 03:52:15 |     141107048 |
    | 2020-07-29 03:52:16 |     173335142 |
    | 2020-07-28 03:52:17 |     131107048 |
    +---------------------+---------------+
    

    You can check usage for a particular reservation with WHERE reservation_id = "…". For script jobs, the parent job also reports the total slot usage from its children jobs. To avoid double counting, use WHERE statement_type != "SCRIPT" to exclude the parent job.

    Example: Number of RUNNING and PENDING jobs over time

    To run the query against a project other than your default project, add the project ID in the following format:

    `PROJECT_ID`.`region-REGION_NAME`.INFORMATION_SCHEMA.VIEW
    ; for example, `myproject`.`region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE.

    The following example computes the number of RUNNING and PENDING jobs at every second in the last day:

    SELECT
      period_start,
      SUM(IF(state = "PENDING", 1, 0)) as PENDING,
      SUM(IF(state = "RUNNING", 1, 0)) as RUNNING
    FROM
      `reservation-admin-project.region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE
    WHERE
      period_start BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY) AND CURRENT_TIMESTAMP()
    GROUP BY
      period_start;

    The result is similar to the following:

    +---------------------+---------+---------+
    |    period_start     | PENDING | RUNNING |
    +---------------------+---------+---------+
    | 2020-07-29 03:52:14 |       7 |      27 |
    | 2020-07-29 03:52:15 |       1 |      21 |
    | 2020-07-29 03:52:16 |       5 |      21 |
    | 2020-07-29 03:52:17 |       4 |      22 |
    +---------------------+---------+---------+
    

    Example: Resource usage by jobs at a specific point in time

    To run the query against a project other than your default project, add the project ID in the following format:

    `PROJECT_ID`.`region-REGION_NAME`.INFORMATION_SCHEMA.VIEW
    ; for example, `myproject`.`region-us`.INFORMATION_SCHEMA.JOBS.

    The following example returns the job_id of all jobs running at a specific point in time together with their resource usage during that one-second period:

    SELECT
      job_id,
      period_slot_ms
    FROM
      `reservation-admin-project.region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE_BY_PROJECT
    WHERE
      period_start = '2020-07-29 03:52:14'
      AND statement_type != 'SCRIPT';

    The result is similar to the following:

    +------------------+
    | job_id | slot_ms |
    +------------------+
    | job_1  | 2415176 |
    | job_2  | 4417245 |
    | job_3  |  427416 |
    | job_4  | 1458122 |
    +------------------+
    

    Example: Match slot usage behavior from administrative resource charts

    You can use administrative resource charts to monitor your organization's health, slot usage, and BigQuery jobs performance over time. The following example queries the INFORMATION_SCHEMA.JOBS_TIMELINE view for a slot usage timeline at one-hour intervals, similar to the information that is available in administrative resource charts.

    WITH
      snapshot_data AS (
        SELECT
          UNIX_MILLIS(period_start) AS period_start,
          IFNULL(SUM(period_slot_ms), 0) AS period_slot_ms,
          DIV(UNIX_MILLIS(period_start), 3600000 * 1) * 3600000 * 1 AS time_ms
        FROM
          (
            SELECT
              *
            FROM
              `user_proj.region-US`.INFORMATION_SCHEMA.JOBS_TIMELINE_BY_ORGANIZATION
            WHERE
              ((job_creation_time >= TIMESTAMP_SUB(@start_time, INTERVAL 1200 MINUTE) AND job_creation_time < TIMESTAMP(@end_time)) AND
              period_start >= TIMESTAMP(@start_time) AND period_start < TIMESTAMP(@end_time)) AND (statement_type !=
              "SCRIPT" OR statement_type IS NULL) AND REGEXP_CONTAINS(reservation_id, "^user_proj:")
          )
        GROUP BY period_start, time_ms
      ),
      data_by_time AS (
        SELECT
          time_ms,
          SUM(period_slot_ms) / (3600000 * 1) AS submetric_value
        FROM
          snapshot_data
        GROUP BY time_ms
      )
    SELECT
      time_ms,
      IFNULL(submetric_value, 0) AS submetric_value,
      "Slot Usage" AS resource_id,
      IFNULL(SUM(submetric_value) OVER () / (TIMESTAMP_DIFF(@end_time, @start_time, HOUR) / 1), 0) AS overall_average_slot_count
    FROM
      (
        SELECT
          time_ms * 3600000 * 1 AS time_ms
        FROM
          UNNEST(GENERATE_ARRAY(DIV(UNIX_MILLIS(@start_time), 3600000 * 1), DIV(UNIX_MILLIS(@end_time), 3600000 *
              1) - 1, 1)) AS time_ms
      )
      LEFT JOIN
      data_by_time
      USING(time_ms)
    ORDER BY time_ms DESC;