google.cloud.bigquery.table.RowIterator

Methods

to_arrow([progress_bar_type, bqstorage_client])

[Beta] Create a pyarrow.Table by loading all pages of a table or query.

to_dataframe([bqstorage_client, dtypes, …])

Create a pandas DataFrame by loading all pages of a query.

Attributes

pages

Iterator of pages in the response.

schema

The subset of columns to be read from the table.

total_rows

The total number of rows in the table.


class google.cloud.bigquery.table.RowIterator(client, api_request, path, schema, page_token=None, max_results=None, page_size=None, extra_params=None, table=None, selected_fields=None)

Bases: google.api_core.page_iterator.HTTPIterator

A class for iterating through HTTP/JSON API row list responses.

Parameters
  • client (google.cloud.bigquery.Client) – The API client.

  • api_request (Callable[google.cloud._http.JSONConnection.api_request]) – The function to use to make API requests.

  • path (str) – The method path to query for the list of items.

  • page_token (str) – A token identifying a page in a result set to start fetching results from.

  • max_results (int, optional) – The maximum number of results to fetch.

  • page_size (int, optional) – The maximum number of rows in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API.

  • extra_params (Dict[str, object]) – Extra query string parameters for the API call.

  • table (Union[Table, TableReference]) – Optional. The table which these rows belong to, or a reference to it. Used to call the BigQuery Storage API to fetch rows.

  • selected_fields (Sequence[google.cloud.bigquery.schema.SchemaField]) – Optional. A subset of columns to select from this table.

property pages

Iterator of pages in the response.

Returns

A generator of page instances.

Return type

types.GeneratorType[google.api_core.page_iterator.Page]

Raises

ValueError – If the iterator has already been started.
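
As a rough illustration of page-wise consumption, the sketch below counts the rows on each page. `rows_per_page` is a hypothetical helper, not part of the library; it assumes only that its argument exposes the `pages` property described above and that each page is iterable.

```python
from typing import Any, List


def rows_per_page(row_iterator: Any) -> List[int]:
    """Return the number of rows found on each page of ``row_iterator``.

    Hypothetical helper: it relies only on the ``pages`` property and on
    each page being iterable.
    """
    counts = []
    for page in row_iterator.pages:  # each page is consumed exactly once
        counts.append(sum(1 for _row in page))
    return counts
```

Because `pages` raises ValueError once iteration has started, a helper like this should be given a fresh iterator.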

property schema

The subset of columns to be read from the table.

Type

List[google.cloud.bigquery.schema.SchemaField]
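
A minimal sketch of inspecting the schema before downloading any data. `schema_as_dict` is a hypothetical helper; it assumes each schema entry has the `name` and `field_type` attributes of `SchemaField`.

```python
from typing import Any, Dict


def schema_as_dict(row_iterator: Any) -> Dict[str, str]:
    """Map each column name to its BigQuery field type, in schema order.

    Hypothetical helper: reads only the ``schema`` property.
    """
    return {field.name: field.field_type for field in row_iterator.schema}
```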

to_arrow(progress_bar_type=None, bqstorage_client=None)

[Beta] Create a pyarrow.Table by loading all pages of a table or query.

Parameters
  • progress_bar_type (Optional[str]) –

    If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

    Possible values of progress_bar_type include:

    None

    No progress bar.

    'tqdm'

    Use the tqdm.tqdm() function to print a progress bar to sys.stderr.

    'tqdm_notebook'

    Use the tqdm.tqdm_notebook() function to display a progress bar as a Jupyter notebook widget.

    'tqdm_gui'

    Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.

  • bqstorage_client (google.cloud.bigquery_storage_v1beta1.BigQueryStorageClient) –

    Beta feature. Optional. A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is billable.

    This method requires the pyarrow and google-cloud-bigquery-storage libraries.

    Reading from a specific partition or snapshot is not currently supported by this method.

Returns

pyarrow.Table

A pyarrow.Table populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.

Raises

ValueError – If the pyarrow library cannot be imported.

New in version 1.17.0.
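
Since the progress bar depends on the tqdm package being installed, one possible pattern is to request it only when tqdm is importable. The sketch below is an assumption-laden illustration: `pick_progress_bar` and `download_as_arrow` are hypothetical names, and only the `to_arrow` keyword arguments come from the signature above.

```python
import importlib.util
from typing import Any, Optional


def pick_progress_bar() -> Optional[str]:
    """Return 'tqdm' when the tqdm package is installed, otherwise None."""
    return "tqdm" if importlib.util.find_spec("tqdm") is not None else None


def download_as_arrow(row_iterator: Any, bqstorage_client: Any = None) -> Any:
    """Load all pages into a pyarrow.Table through ``to_arrow()``.

    Hypothetical wrapper: passes through an optional BigQuery Storage
    client and enables the progress bar only if tqdm is available.
    """
    return row_iterator.to_arrow(
        progress_bar_type=pick_progress_bar(),
        bqstorage_client=bqstorage_client,
    )
```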

to_dataframe(bqstorage_client=None, dtypes=None, progress_bar_type=None)

Create a pandas DataFrame by loading all pages of a query.

Parameters
  • bqstorage_client (google.cloud.bigquery_storage_v1beta1.BigQueryStorageClient) –

    Beta feature. Optional. A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is billable.

    This method requires the pyarrow and google-cloud-bigquery-storage libraries.

    Reading from a specific partition or snapshot is not currently supported by this method.

    Caution: There is a known issue reading small anonymous query result tables with the BigQuery Storage API. When a problem is encountered reading a table, the tabledata.list method from the BigQuery API is used instead.

  • dtypes (Map[str, Union[str, pandas.Series.dtype]]) – Optional. A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the specified column. Otherwise, the default pandas behavior is used.

  • progress_bar_type (Optional[str]) –

    If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

    Possible values of progress_bar_type include:

    None

    No progress bar.

    'tqdm'

    Use the tqdm.tqdm() function to print a progress bar to sys.stderr.

    'tqdm_notebook'

    Use the tqdm.tqdm_notebook() function to display a progress bar as a Jupyter notebook widget.

    'tqdm_gui'

    Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.

    New in version 1.11.0.

Returns

A DataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.

Return type

pandas.DataFrame

Raises

ValueError – If the pandas library cannot be imported, or the google.cloud.bigquery_storage_v1beta1 module is required but cannot be imported.
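
The `dtypes` mapping can be prepared ahead of the call. The sketch below is one hedged way to do it: `download_as_dataframe` is a hypothetical wrapper, and dropping overrides for columns absent from `schema` is a choice made for this example, not documented library behavior.

```python
from typing import Any, Dict, Optional


def download_as_dataframe(
    row_iterator: Any,
    dtypes: Optional[Dict[str, Any]] = None,
    bqstorage_client: Any = None,
) -> Any:
    """Load all pages into a pandas.DataFrame through ``to_dataframe()``.

    Hypothetical wrapper: keeps only the dtype overrides whose column
    name appears in the iterator's schema.
    """
    known = {field.name for field in row_iterator.schema}
    overrides = {col: dt for col, dt in (dtypes or {}).items() if col in known}
    return row_iterator.to_dataframe(
        bqstorage_client=bqstorage_client,
        dtypes=overrides,
        progress_bar_type=None,  # no progress bar
    )
```

A call might look like `download_as_dataframe(rows, dtypes={"score": "float32"})`, where `rows` and the `score` column are placeholders.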

property total_rows

The total number of rows in the table.

Type

int