RowIterator(
client,
api_request,
path,
schema,
page_token=None,
max_results=None,
page_size=None,
extra_params=None,
table=None,
selected_fields=None,
total_rows=None,
first_page_response=None,
)
A class for iterating through HTTP/JSON API row list responses.
Parameters
Name | Description |
client |
Optional[google.cloud.bigquery.Client]
The API client instance. This should always be non- |
api_request |
Callable[google.cloud._http.JSONConnection.api_request]
The function to use to make API requests. |
path |
str
The method path to query for the list of items. |
schema |
Sequence[Union[ SchemaField, Mapping[str, Any] ]]
The table's schema. If any item is a mapping, its content must be compatible with from_api_repr. |
page_token |
str
A token identifying a page in a result set to start fetching results from. |
max_results |
Optional[int]
The maximum number of results to fetch. |
page_size |
Optional[int]
The maximum number of rows in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API. |
extra_params |
Optional[Dict[str, object]]
Extra query string parameters for the API call. |
table |
Optional[Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, ]]
The table which these rows belong to, or a reference to it. Used to call the BigQuery Storage API to fetch rows. |
selected_fields |
Optional[Sequence[google.cloud.bigquery.schema.SchemaField]]
A subset of columns to select from this table. |
total_rows |
Optional[int]
Total number of rows in the table. |
first_page_response |
Optional[dict]
API response for the first page of results. These are returned when the first page is requested. |
Inheritance
builtins.object > google.api_core.page_iterator.Iterator > google.api_core.page_iterator.HTTPIterator > RowIterator
Properties
pages
Iterator of pages in the response.
Exceptions
Type | Description |
ValueError | If the iterator has already been started. |
Returns
Type | Description |
types.GeneratorType[google.api_core.page_iterator.Page] | A generator of page instances. |
schema
List[google.cloud.bigquery.schema.SchemaField]: The subset of columns to be read from the table.
total_rows
int: The total number of rows in the table.
Methods
__iter__
__iter__()
Iterator for each item returned.
Exceptions
Type | Description |
ValueError | If the iterator has already been started. |
Returns
Type | Description |
types.GeneratorType[Any] | A generator of items from the API. |
to_arrow
to_arrow(
progress_bar_type: str = None,
bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None,
create_bqstorage_client: bool = True,
)
[Beta] Create a `pyarrow.Table` by loading all pages of a table or query.
Parameters
Name | Description |
progress_bar_type |
Optional[str]
If set, use the |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires |
create_bqstorage_client |
Optional[bool]
If |
to_arrow_iterable
to_arrow_iterable(bqstorage_client: bigquery_storage.BigQueryReadClient = None, max_queue_size: int = <object object>)
[Beta] Create an iterable of `pyarrow.RecordBatch`, to process the table as a stream.
Parameters
Name | Description |
max_queue_size |
Optional[int]
The maximum number of result pages to hold in the internal queue when streaming query results over the BigQuery Storage API. Ignored if Storage API is not used. By default, the max queue size is set to the number of BQ Storage streams created by the server. If |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This method requires the |
Returns
Type | Description |
pyarrow.RecordBatch | A generator of `pyarrow.RecordBatch`. |
New in version 2.31.0.
to_dataframe
to_dataframe(
bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None,
dtypes: Dict[str, Any] = None,
progress_bar_type: str = None,
create_bqstorage_client: bool = True,
geography_as_object: bool = False,
)
Create a pandas DataFrame by loading all pages of a query.
Parameters
Name | Description |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This method requires |
dtypes |
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary of column names pandas |
progress_bar_type |
Optional[str]
If set, use the |
create_bqstorage_client |
Optional[bool]
If |
geography_as_object |
Optional[bool]
If |
Exceptions
Type | Description |
ValueError | If the `pandas` library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. Also if `geography_as_object` is `True`, but the `shapely` library cannot be imported. |
Returns
Type | Description |
pandas.DataFrame | A `pandas.DataFrame` populated with row data and column headers from the query results. The column headers are derived from the destination table's schema. |
to_dataframe_iterable
to_dataframe_iterable(bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None, dtypes: Dict[str, Any] = None, max_queue_size: int = <object object>)
Create an iterable of pandas DataFrames, to process the table as a stream.
Parameters
Name | Description |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This method requires |
dtypes |
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary of column names pandas |
max_queue_size |
Optional[int]
The maximum number of result pages to hold in the internal queue when streaming query results over the BigQuery Storage API. Ignored if Storage API is not used. By default, the max queue size is set to the number of BQ Storage streams created by the server. If |
Exceptions
Type | Description |
ValueError | If the `pandas` library cannot be imported. |
Returns
Type | Description |
pandas.DataFrame | A generator of `pandas.DataFrame`. |
to_geodataframe
to_geodataframe(
bqstorage_client: bigquery_storage.BigQueryReadClient = None,
dtypes: Dict[str, Any] = None,
progress_bar_type: str = None,
create_bqstorage_client: bool = True,
geography_column: Optional[str] = None,
)
Create a GeoPandas GeoDataFrame by loading all pages of a query.
Parameters
Name | Description |
dtypes |
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary of column names pandas |
progress_bar_type |
Optional[str]
If set, use the |
create_bqstorage_client |
Optional[bool]
If |
geography_column |
Optional[str]
If there is more than one GEOGRAPHY column, identifies which one to use to construct a geopandas GeoDataFrame. This option can be omitted if there is only one GEOGRAPHY column. |
bqstorage_client |
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This method requires the |
Exceptions
Type | Description |
ValueError | If the `geopandas` library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. |
Returns
Type | Description |
geopandas.GeoDataFrame | A `geopandas.GeoDataFrame` populated with row data and column headers from the query results. The column headers are derived from the destination table's schema. |
New in version 2.24.0.
__init__
__init__(
client,
api_request,
path,
schema,
page_token=None,
max_results=None,
page_size=None,
extra_params=None,
table=None,
selected_fields=None,
total_rows=None,
first_page_response=None,
)
Initialize self. See help(type(self)) for accurate signature.