Class QueryJob (3.33.0)

QueryJob(job_id, query, client, job_config=None)

Asynchronous job: query tables.

Parameters

Name Description
job_id str

The job's ID, within the project belonging to client.

query str

SQL query string.

client google.cloud.bigquery.client.Client

A client which holds credentials and project configuration for the dataset (which requires a project).

job_config Optional[google.cloud.bigquery.job.QueryJobConfig]

Extra configuration options for the query job.

Properties

allow_large_results

billing_tier

Returns
Type Description
Optional[int] Billing tier used by the job, or None if job is not yet complete.

cache_hit

Return whether or not query results were served from cache.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.cache_hit

Returns
Type Description
Optional[bool] whether the query results were returned from cache, or None if job is not yet complete.

clustering_fields

configuration

The configuration for this query job.

connection_properties

See connection_properties.

Added in version 2.29.0.

create_disposition

create_session

See create_session.

Added in version 2.29.0.

ddl_operation_performed

ddl_target_routine

Optional[google.cloud.bigquery.routine.RoutineReference]: Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_target_routine

ddl_target_table

default_dataset

destination

destination_encryption_configuration

google.cloud.bigquery.encryption_configuration.EncryptionConfiguration: Custom encryption configuration for the destination table (e.g., Cloud KMS keys), or None if using default encryption.

See destination_encryption_configuration.

dry_run

See dry_run.

estimated_bytes_processed

Returns
Type Description
Optional[int] Estimated number of bytes processed by the query, or None if job is not yet complete.

flatten_results

maximum_billing_tier

maximum_bytes_billed

num_dml_affected_rows

Returns
Type Description
Optional[int] number of DML rows affected by the job, or None if job is not yet complete.

priority

See priority.

query

query_id

[Preview] ID of a completed query.

This ID is auto-generated and not guaranteed to be populated.

query_parameters

query_plan

Returns
Type Description
List[google.cloud.bigquery.job.QueryPlanEntry] mappings describing the query plan, or an empty list if the query has not yet completed.
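
As an illustration of reading query_plan, the sketch below picks the stage that read the most records. It relies on the name and records_read attributes of QueryPlanEntry; the helper busiest_stage and the SimpleNamespace stand-ins are hypothetical and exist only so the snippet runs without a BigQuery project.

```python
from types import SimpleNamespace


def busiest_stage(query_plan):
    """Return the plan entry that read the most records, or None for an empty plan."""
    # records_read may be None for a stage that has not reported statistics yet,
    # so treat a missing value as zero when comparing stages.
    return max(query_plan, key=lambda stage: stage.records_read or 0, default=None)


# Hypothetical stand-ins for QueryPlanEntry objects (not real API output).
plan = [
    SimpleNamespace(name="S00: Input", records_read=1200, records_written=300),
    SimpleNamespace(name="S01: Output", records_read=300, records_written=10),
]

stage = busiest_stage(plan)
print(stage.name, stage.records_read)
```

With a real job, the same helper would be called as busiest_stage(job.query_plan) once the job has finished; before that the plan is an empty list and the helper returns None.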

range_partitioning

referenced_tables

Returns
Type Description
List[Dict] mappings describing the tables referenced by the query, or an empty list if the query has not yet completed.

schema

The schema of the results.

Present only for successful dry run of non-legacy SQL queries.

schema_update_options

search_stats

Returns a SearchStats object.

slot_millis

Union[int, None]: Slot-milliseconds used by this query job.

statement_type

Returns
Type Description
Optional[str] type of statement used by the job, or None if job is not yet complete.

table_definitions

time_partitioning

timeline

List(TimelineEntry): Return the query execution timeline from job statistics.

total_bytes_billed

Returns
Type Description
Optional[int] Total bytes billed for the job, or None if job is not yet complete.

total_bytes_processed

Returns
Type Description
Optional[int] Total bytes processed by the job, or None if job is not yet complete.
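
Because cache_hit, total_bytes_processed and total_bytes_billed are all None until the job completes, code that reports on a job needs to handle both states. A minimal sketch, assuming a hypothetical helper summarize_stats and SimpleNamespace stand-ins in place of real QueryJob objects:

```python
from types import SimpleNamespace

GIB = 2**30  # bytes per gibibyte


def summarize_stats(job):
    """Collect cost-related statistics from a QueryJob-like object.

    Values stay None while the job has not completed yet.
    """
    processed = job.total_bytes_processed
    billed = job.total_bytes_billed
    return {
        "cache_hit": job.cache_hit,
        "gib_processed": None if processed is None else processed / GIB,
        "gib_billed": None if billed is None else billed / GIB,
    }


# Hypothetical stand-ins: one job still running, one job finished.
pending = SimpleNamespace(cache_hit=None, total_bytes_processed=None, total_bytes_billed=None)
done = SimpleNamespace(cache_hit=False, total_bytes_processed=3 * GIB, total_bytes_billed=4 * GIB)

print(summarize_stats(pending))
print(summarize_stats(done))
```

The byte counts in the stand-ins are made-up numbers chosen for easy arithmetic, not measurements from a real query.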

udf_resources

undeclared_query_parameters

Return undeclared query parameters from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.undeclared_query_parameters

Returns
Type Description
List[Union[google.cloud.bigquery.query.ArrayQueryParameter, google.cloud.bigquery.query.ScalarQueryParameter, google.cloud.bigquery.query.StructQueryParameter]] Undeclared parameters, or an empty list if the query has not yet completed.

use_legacy_sql

use_query_cache

write_disposition

Methods

from_api_repr

from_api_repr(resource: dict, client: Client) -> QueryJob

Factory: construct a job given its API representation

Parameters
Name Description
resource Dict

Job representation returned from the API.

client google.cloud.bigquery.client.Client

Client which holds credentials and project configuration for the dataset.

Returns
Type Description
google.cloud.bigquery.job.QueryJob Job parsed from resource.

result

result(page_size: typing.Optional[int] = None, max_results: typing.Optional[int] = None, retry: typing.Optional[google.api_core.retry.retry_unary.Retry] = <google.api_core.retry.retry_unary.Retry object>, timeout: typing.Optional[typing.Union[float, object]] = <object object>, start_index: typing.Optional[int] = None, job_retry: typing.Optional[google.api_core.retry.retry_unary.Retry] = <google.api_core.retry.retry_unary.Retry object>) -> typing.Union[RowIterator, google.cloud.bigquery.table._EmptyRowIterator]

Start the job and wait for it to complete and get the result.

Parameters
Name Description
page_size Optional[int]

The maximum number of rows in each page of results from this request. Non-positive values are ignored.

max_results Optional[int]

The maximum total number of rows from this request.

retry Optional[google.api_core.retry.Retry]

How to retry the call that retrieves rows. This only applies to making RPC calls. It isn't used to retry failed jobs. This has a reasonable default that should only be overridden with care. If the job state is DONE, retrying is aborted early even if the results are not available, as this will not change anymore.

timeout Optional[Union[float, google.api_core.future.polling.PollingFuture._DEFAULT_VALUE]]

The number of seconds to wait for the underlying HTTP transport before using retry. If None, wait indefinitely unless an error is returned. If unset, only the underlying API calls have their default timeouts, but we still wait indefinitely for the job to finish.

start_index Optional[int]

The zero-based index of the starting row to read.

job_retry Optional[google.api_core.retry.Retry]

How to retry failed jobs. The default retries rate-limit-exceeded errors. Passing None disables job retry. Not all jobs can be retried. If job_id was provided to the query that created this job, then the job returned by the query will not be retryable, and an exception will be raised if non-None non-default job_retry is also provided.

Exceptions
Type Description
google.cloud.exceptions.GoogleAPICallError If the job failed and retries aren't successful.
concurrent.futures.TimeoutError If the job did not complete in the given timeout.
TypeError If a non-None, non-default job_retry is provided and the job is not retryable.
Returns
Type Description
google.cloud.bigquery.table.RowIterator Iterator of row data (Row objects). During each page, the iterator will have the total_rows attribute set, which counts the total number of rows in the result set (this is distinct from the total number of rows in the current page: iterator.page.num_items). If the query is a special query that produces no results, e.g. a DDL query, an _EmptyRowIterator instance is returned.

to_api_repr

to_api_repr()

Generate a resource for _begin.

to_arrow

to_arrow(
    progress_bar_type: typing.Optional[str] = None,
    bqstorage_client: typing.Optional[bigquery_storage.BigQueryReadClient] = None,
    create_bqstorage_client: bool = True,
    max_results: typing.Optional[int] = None,
) -> pyarrow.Table

[Beta] Create a pyarrow.Table by loading all pages of a table or query.

Parameters
Name Description
progress_bar_type Optional[str]

If set, use the tqdm library (https://tqdm.github.io/) to display a progress bar while the data downloads. Install the tqdm package to use this feature. Possible values of progress_bar_type include: None (no progress bar); 'tqdm' (use the tqdm.tqdm function to print a progress bar to sys.stdout); 'tqdm_notebook' (use the tqdm.notebook.tqdm function to display a progress bar as a Jupyter notebook widget); 'tqdm_gui' (use the tqdm.tqdm_gui function to display a progress bar as a graphical dialog box).

bqstorage_client Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]

A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires google-cloud-bigquery-storage library. Reading from a specific partition or snapshot is not currently supported by this method.

create_bqstorage_client Optional[bool]

If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied. Added in version 1.24.0.

max_results Optional[int]

Maximum number of rows to include in the result. No limit by default. Added in version 2.21.0.

Exceptions
Type Description
ValueError If the pyarrow library cannot be imported. Added in version 1.17.0.

to_dataframe

to_dataframe(
    bqstorage_client: typing.Optional[bigquery_storage.BigQueryReadClient] = None,
    dtypes: typing.Optional[typing.Dict[str, typing.Any]] = None,
    progress_bar_type: typing.Optional[str] = None,
    create_bqstorage_client: bool = True,
    max_results: typing.Optional[int] = None,
    geography_as_object: bool = False,
    bool_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.BOOL_DTYPE,
    int_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.INT_DTYPE,
    float_dtype: typing.Optional[typing.Any] = None,
    string_dtype: typing.Optional[typing.Any] = None,
    date_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.DATE_DTYPE,
    datetime_dtype: typing.Optional[typing.Any] = None,
    time_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.TIME_DTYPE,
    timestamp_dtype: typing.Optional[typing.Any] = None,
    range_date_dtype: typing.Optional[
        typing.Any
    ] = DefaultPandasDTypes.RANGE_DATE_DTYPE,
    range_datetime_dtype: typing.Optional[
        typing.Any
    ] = DefaultPandasDTypes.RANGE_DATETIME_DTYPE,
    range_timestamp_dtype: typing.Optional[
        typing.Any
    ] = DefaultPandasDTypes.RANGE_TIMESTAMP_DTYPE,
) -> pandas.DataFrame

Return a pandas DataFrame from a QueryJob

Parameters
Name Description
bqstorage_client Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]

A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the fastavro and google-cloud-bigquery-storage libraries. Reading from a specific partition or snapshot is not currently supported by this method.

dtypes Optional[Map[str, Union[str, pandas.Series.dtype]]]

A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.

progress_bar_type Optional[str]

If set, use the tqdm library (https://tqdm.github.io/) to display a progress bar while the data downloads. Install the tqdm package to use this feature. See the progress_bar_type parameter of to_arrow for the possible values. Added in version 1.11.0.

create_bqstorage_client Optional[bool]

If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied. Added in version 1.24.0.

max_results Optional[int]

Maximum number of rows to include in the result. No limit by default. Added in version 2.21.0.

geography_as_object Optional[bool]

If True, convert GEOGRAPHY data to shapely geometry objects. If False (default), don't cast geography data to shapely geometry objects. Added in version 2.24.0.

bool_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.BooleanDtype()) to convert BigQuery Boolean type, instead of relying on the default pandas.BooleanDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("bool"). BigQuery Boolean type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#boolean_type Added in version 3.8.0.

int_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.Int64Dtype()) to convert BigQuery Integer types, instead of relying on the default pandas.Int64Dtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("int64"). A list of BigQuery Integer types can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#integer_types Added in version 3.8.0.

float_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.Float32Dtype()) to convert BigQuery Float type, instead of relying on the default numpy.dtype("float64"). If you explicitly set the value to None, then the data type will be numpy.dtype("float64"). BigQuery Float type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#floating_point_types Added in version 3.8.0.

string_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.StringDtype()) to convert BigQuery String type, instead of relying on the default numpy.dtype("object"). If you explicitly set the value to None, then the data type will be numpy.dtype("object"). BigQuery String type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#string_type Added in version 3.8.0.

date_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.date32())) to convert BigQuery Date type, instead of relying on the default db_dtypes.DateDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("datetime64[ns]") or object if out of bound. BigQuery Date type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#date_type Added in version 3.10.0.

datetime_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.timestamp("us"))) to convert BigQuery Datetime type, instead of relying on the default numpy.dtype("datetime64[ns]"). If you explicitly set the value to None, then the data type will be numpy.dtype("datetime64[ns]") or object if out of bound. BigQuery Datetime type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#datetime_type Added in version 3.10.0.

time_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.time64("us"))) to convert BigQuery Time type, instead of relying on the default db_dtypes.TimeDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("object"). BigQuery Time type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#time_type Added in version 3.10.0.

timestamp_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.timestamp("us", tz="UTC"))) to convert BigQuery Timestamp type, instead of relying on the default numpy.dtype("datetime64[ns, UTC]"). If you explicitly set the value to None, then the data type will be numpy.dtype("datetime64[ns, UTC]") or object if out of bound. BigQuery Timestamp type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#timestamp_type Added in version 3.10.0.

range_date_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype, such as pandas.ArrowDtype(pyarrow.struct([("start", pyarrow.date32()), ("end", pyarrow.date32())])), to convert the BigQuery RANGE<DATE> type.

range_datetime_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype, such as pandas.ArrowDtype(pyarrow.struct([("start", pyarrow.timestamp("us")), ("end", pyarrow.timestamp("us"))])), to convert the BigQuery RANGE<DATETIME> type.

range_timestamp_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype, such as pandas.ArrowDtype(pyarrow.struct([("start", pyarrow.timestamp("us", tz="UTC")), ("end", pyarrow.timestamp("us", tz="UTC"))])), to convert the BigQuery RANGE<TIMESTAMP> type.

Exceptions
Type Description
ValueError If the pandas library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. Also if geography_as_object is True, but the shapely library cannot be imported.
Returns
Type Description
pandas.DataFrame A pandas.DataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table's schema.

to_geodataframe

to_geodataframe(
    bqstorage_client: typing.Optional[bigquery_storage.BigQueryReadClient] = None,
    dtypes: typing.Optional[typing.Dict[str, typing.Any]] = None,
    progress_bar_type: typing.Optional[str] = None,
    create_bqstorage_client: bool = True,
    max_results: typing.Optional[int] = None,
    geography_column: typing.Optional[str] = None,
    bool_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.BOOL_DTYPE,
    int_dtype: typing.Optional[typing.Any] = DefaultPandasDTypes.INT_DTYPE,
    float_dtype: typing.Optional[typing.Any] = None,
    string_dtype: typing.Optional[typing.Any] = None,
) -> geopandas.GeoDataFrame

Return a GeoPandas GeoDataFrame from a QueryJob

Parameters
Name Description
bqstorage_client Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]

A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the fastavro and google-cloud-bigquery-storage libraries. Reading from a specific partition or snapshot is not currently supported by this method.

dtypes Optional[Map[str, Union[str, pandas.Series.dtype]]]

A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.

progress_bar_type Optional[str]

If set, use the tqdm library (https://tqdm.github.io/) to display a progress bar while the data downloads. Install the tqdm package to use this feature. See to_dataframe for details. Added in version 1.11.0.

create_bqstorage_client Optional[bool]

If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied. Added in version 1.24.0.

max_results Optional[int]

Maximum number of rows to include in the result. No limit by default. Added in version 2.21.0.

geography_column Optional[str]

If there is more than one GEOGRAPHY column, identifies which one to use to construct a GeoPandas GeoDataFrame. This option can be omitted if there's only one GEOGRAPHY column.

bool_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.BooleanDtype()) to convert BigQuery Boolean type, instead of relying on the default pandas.BooleanDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("bool"). BigQuery Boolean type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#boolean_type

int_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.Int64Dtype()) to convert BigQuery Integer types, instead of relying on the default pandas.Int64Dtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("int64"). A list of BigQuery Integer types can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#integer_types

float_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.Float32Dtype()) to convert BigQuery Float type, instead of relying on the default numpy.dtype("float64"). If you explicitly set the value to None, then the data type will be numpy.dtype("float64"). BigQuery Float type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#floating_point_types

string_dtype Optional[pandas.Series.dtype, None]

If set, indicate a pandas ExtensionDtype (e.g. pandas.StringDtype()) to convert BigQuery String type, instead of relying on the default numpy.dtype("object"). If you explicitly set the value to None, then the data type will be numpy.dtype("object"). BigQuery String type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#string_type

Exceptions
Type Description
ValueError If the geopandas library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. Added in version 2.24.0.
Returns
Type Description
geopandas.GeoDataFrame A geopandas.GeoDataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table's schema.