google.cloud.bigquery.job.QueryJob#

Methods

add_done_callback(fn) Add a callback to be executed when the operation is complete.
cancel([client]) API call: cancel job via a POST request
cancelled() Check if the job has been cancelled.
done([retry]) Refresh the job and check if it is complete.
exception([timeout]) Get the exception from the operation, blocking if necessary.
exists([client, retry]) API call: test for the existence of the job via a GET request
from_api_repr(resource, client) Factory: construct a job given its API representation
reload([client, retry]) API call: refresh job properties via a GET request.
result([timeout, page_size, retry]) Start the job and wait for it to complete and get the result.
running() True if the operation is currently running.
set_exception(exception) Set the Future’s exception.
set_result(result) Set the Future’s result.
to_api_repr() Generate a resource for _begin().
to_arrow([progress_bar_type, bqstorage_client]) [Beta] Create a pyarrow.Table by loading all pages of a table or query.
to_dataframe([bqstorage_client, dtypes, …]) Return a pandas DataFrame from a QueryJob

Attributes

allow_large_results See google.cloud.bigquery.job.QueryJobConfig.allow_large_results.
billing_tier Return billing tier from job statistics, if present.
cache_hit Return whether or not query results were served from cache.
clustering_fields See google.cloud.bigquery.job.QueryJobConfig.clustering_fields.
create_disposition See google.cloud.bigquery.job.QueryJobConfig.create_disposition.
created Datetime at which the job was created.
ddl_operation_performed Return the DDL operation performed.
ddl_target_routine Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.
ddl_target_table Return the DDL target table, present for CREATE/DROP TABLE/VIEW queries.
default_dataset See google.cloud.bigquery.job.QueryJobConfig.default_dataset.
destination See google.cloud.bigquery.job.QueryJobConfig.destination.
destination_encryption_configuration Custom encryption configuration for the destination table.
dry_run See google.cloud.bigquery.job.QueryJobConfig.dry_run.
ended Datetime at which the job finished.
error_result Error information about the job as a whole.
errors Information about individual errors generated by the job.
estimated_bytes_processed Return the estimated number of bytes processed by the query.
etag ETag for the job resource.
flatten_results See google.cloud.bigquery.job.QueryJobConfig.flatten_results.
job_id ID of the job.
job_type Type of job
labels Labels for the job.
location Location where the job runs.
maximum_billing_tier See google.cloud.bigquery.job.QueryJobConfig.maximum_billing_tier.
maximum_bytes_billed See google.cloud.bigquery.job.QueryJobConfig.maximum_bytes_billed.
num_dml_affected_rows Return the number of DML rows affected by the job.
path URL path for the job’s APIs.
priority See google.cloud.bigquery.job.QueryJobConfig.priority.
project Project bound to the job.
query The query text used in this query job.
query_parameters See google.cloud.bigquery.job.QueryJobConfig.query_parameters.
query_plan Return query plan from job statistics, if present.
referenced_tables Return referenced tables from job statistics, if present.
schema_update_options See google.cloud.bigquery.job.QueryJobConfig.schema_update_options.
self_link URL for the job resource.
slot_millis Slot-milliseconds used by this query job.
started Datetime at which the job was started.
state Status of the job.
statement_type Return statement type from job statistics, if present.
table_definitions See google.cloud.bigquery.job.QueryJobConfig.table_definitions.
time_partitioning See google.cloud.bigquery.job.QueryJobConfig.time_partitioning.
timeline Return the query execution timeline from job statistics.
total_bytes_billed Return total bytes billed from job statistics, if present.
total_bytes_processed Return total bytes processed from job statistics, if present.
udf_resources See google.cloud.bigquery.job.QueryJobConfig.udf_resources.
undeclared_query_parameters Return undeclared query parameters from job statistics, if present.
use_legacy_sql See google.cloud.bigquery.job.QueryJobConfig.use_legacy_sql.
use_query_cache See google.cloud.bigquery.job.QueryJobConfig.use_query_cache.
user_email E-mail address of user who submitted the job.
write_disposition See google.cloud.bigquery.job.QueryJobConfig.write_disposition.


class google.cloud.bigquery.job.QueryJob(job_id, query, client, job_config=None)[source]#

Bases: google.cloud.bigquery.job._AsyncJob

Asynchronous job: query tables.

Parameters:
  • job_id (str) – the job’s ID, within the project belonging to client.
  • query (str) – SQL query string
  • client (google.cloud.bigquery.client.Client) – A client which holds credentials and project configuration for the dataset (which requires a project).
  • job_config (QueryJobConfig) – (Optional) Extra configuration options for the query job.
add_done_callback(fn)#

Add a callback to be executed when the operation is complete.

If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.

Parameters:fn (Callable[Future]) – The callback to execute when the operation is complete.
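
A minimal sketch of the callback pattern. Running a real QueryJob needs credentials, so this uses concurrent.futures.Future from the standard library as a stand-in; the only assumption is that both expose add_done_callback(fn) and call fn with the completed future. The names on_done and completed are illustrative and not part of the library.

```python
from concurrent.futures import Future

completed = []  # collects every future handed to the callback


def on_done(future):
    """Callback: receives the finished future (a QueryJob in real use)."""
    completed.append(future)


# Stand-in for a QueryJob; a real job would come from the BigQuery client.
job = Future()
job.add_done_callback(on_done)

# Completing the stand-in triggers the registered callback.
job.set_result("rows")
print(len(completed))
```

With a real job the callback runs on the helper polling thread mentioned above, so it should avoid long-running or blocking work.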
allow_large_results#

See google.cloud.bigquery.job.QueryJobConfig.allow_large_results.

billing_tier#

Return billing tier from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.billingTier

Return type:int or None
Returns:billing tier used by the job, or None if job is not yet complete.
cache_hit#

Return whether or not query results were served from cache.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.cacheHit

Return type:bool or None
Returns:whether the query results were returned from cache, or None if job is not yet complete.
cancel(client=None)#

API call: cancel job via a POST request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel

Parameters:client (Client or NoneType) – the client to use. If not passed, falls back to the client stored on the current job.
Return type:bool
Returns:Boolean indicating that the cancel request was sent.
cancelled()#

Check if the job has been cancelled.

This always returns False. It’s not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.

Return type:bool
Returns:False
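
Because cancelled() always returns False, detecting a cancellation has to rely on other job properties. The sketch below inspects error_result; it assumes that a cancelled job reports the reason 'stopped', which is taken from BigQuery's error table rather than from this page, so treat it as a best-effort heuristic. The helper looks_cancelled and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def looks_cancelled(job):
    """Best-effort check; assumes a cancelled job reports reason 'stopped'."""
    error = job.error_result  # mapping or None, per the attribute docs
    return bool(error) and error.get("reason") == "stopped"


# Hypothetical stand-ins so the sketch runs without credentials.
stopped_job = SimpleNamespace(
    error_result={"reason": "stopped", "message": "Job cancelled"}
)
clean_job = SimpleNamespace(error_result=None)

print(looks_cancelled(stopped_job), looks_cancelled(clean_job))
```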
clustering_fields#

See google.cloud.bigquery.job.QueryJobConfig.clustering_fields.

create_disposition#

See google.cloud.bigquery.job.QueryJobConfig.create_disposition.

created#

Datetime at which the job was created.

Return type:datetime.datetime, or NoneType
Returns:the creation time (None until set from the server).
ddl_operation_performed#

Return the DDL operation performed.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.ddlOperationPerformed

Type:Optional[str]
ddl_target_routine#

Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/JobStatistics

Type:Optional[google.cloud.bigquery.routine.RoutineReference]
ddl_target_table#

Return the DDL target table, present for CREATE/DROP TABLE/VIEW queries.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.ddlTargetTable

Type:Optional[google.cloud.bigquery.table.TableReference]
default_dataset#

See google.cloud.bigquery.job.QueryJobConfig.default_dataset.

destination#

See google.cloud.bigquery.job.QueryJobConfig.destination.

destination_encryption_configuration#

Custom encryption configuration for the destination table.

Custom encryption configuration (e.g., Cloud KMS keys) or None if using default encryption.

See google.cloud.bigquery.job.QueryJobConfig.destination_encryption_configuration.

Type:google.cloud.bigquery.table.EncryptionConfiguration
done(retry=<google.api_core.retry.Retry object>)[source]#

Refresh the job and check if it is complete.

Return type:bool
Returns:True if the job is complete, False otherwise.
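
One way to use done() is a bounded polling loop, sketched below. The helper wait_until_done, its poll_interval and max_polls parameters, and the FakeJob class are all hypothetical; FakeJob exists only so the snippet runs without a BigQuery project.

```python
import time


def wait_until_done(job, poll_interval=1.0, max_polls=60):
    """Poll job.done() until it returns True or max_polls is exhausted."""
    for _ in range(max_polls):
        if job.done():
            return True
        time.sleep(poll_interval)
    return False


class FakeJob:
    """Hypothetical stand-in: reports completion after a fixed number of polls."""

    def __init__(self, polls_needed=3):
        self.polls_needed = polls_needed
        self.polls = 0

    def done(self):
        self.polls += 1
        return self.polls >= self.polls_needed


fake = FakeJob()
finished = wait_until_done(fake, poll_interval=0.0)
print(finished, fake.polls)
```

Since each done() call refreshes the job, a real loop should keep poll_interval well above zero.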
dry_run#

See google.cloud.bigquery.job.QueryJobConfig.dry_run.

ended#

Datetime at which the job finished.

Return type:datetime.datetime, or NoneType
Returns:the end time (None until set from the server).
error_result#

Error information about the job as a whole.

Return type:mapping, or NoneType
Returns:the error information (None until set from the server).
errors#

Information about individual errors generated by the job.

Return type:list of mappings, or NoneType
Returns:the error information (None until set from the server).
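
The following sketch turns error_result and errors into readable lines. It assumes each mapping carries 'reason' and 'message' keys, as in the BigQuery REST error structure; the helper describe_failure and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def describe_failure(job):
    """Return human-readable error lines; empty if no error is set."""
    lines = []
    if job.error_result:
        lines.append(
            "job failed: {}: {}".format(
                job.error_result.get("reason", "unknown"),
                job.error_result.get("message", ""),
            )
        )
    # errors may be None until the server has populated it.
    for err in job.errors or []:
        lines.append("  detail: {}".format(err.get("message", "")))
    return lines


# Hypothetical stand-ins so the sketch runs without credentials.
failed = SimpleNamespace(
    error_result={"reason": "invalidQuery", "message": "Syntax error"},
    errors=[{"reason": "invalidQuery", "message": "Syntax error at [1:8]"}],
)
pending = SimpleNamespace(error_result=None, errors=None)

print("\n".join(describe_failure(failed)))
```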
estimated_bytes_processed#

Return the estimated number of bytes processed by the query.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.estimatedBytesProcessed

Return type:int or None
Returns:estimated number of bytes processed by the query, or None if job is not yet complete.
etag#

ETag for the job resource.

Return type:str, or NoneType
Returns:the ETag (None until set from the server).
exception(timeout=None)#

Get the exception from the operation, blocking if necessary.

Parameters:timeout (int) – How long to wait for the operation to complete. If None, wait indefinitely.
Returns:The operation’s error.
Return type:Optional[google.api_core.exceptions.GoogleAPICallError]
exists(client=None, retry=<google.api_core.retry.Retry object>)#

API call: test for the existence of the job via a GET request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters:
  • client (Client or NoneType) – the client to use. If not passed, falls back to the client stored on the current job.
  • retry (google.api_core.retry.Retry) – (Optional) How to retry the RPC.
Return type:bool
Returns:Boolean indicating existence of the job.

flatten_results#

See google.cloud.bigquery.job.QueryJobConfig.flatten_results.

classmethod from_api_repr(resource, client)[source]#

Factory: construct a job given its API representation

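Parameters:
  • resource (dict) – the job’s API representation, as returned from the API.
  • client (google.cloud.bigquery.client.Client) – A client which holds credentials and project configuration for the dataset.
Return type:google.cloud.bigquery.job.QueryJob
Returns:Job parsed from resource.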

job_id#

ID of the job.

Type:str
job_type#

Type of job

Return type:str
Returns:one of ‘load’, ‘copy’, ‘extract’, ‘query’
labels#

Labels for the job.

Type:Dict[str, str]
location#

Location where the job runs.

Type:str
maximum_billing_tier#

See google.cloud.bigquery.job.QueryJobConfig.maximum_billing_tier.

maximum_bytes_billed#

See google.cloud.bigquery.job.QueryJobConfig.maximum_bytes_billed.

num_dml_affected_rows#

Return the number of DML rows affected by the job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.numDmlAffectedRows

Return type:int or None
Returns:number of DML rows affected by the job, or None if job is not yet complete.
path#

URL path for the job’s APIs.

Return type:str
Returns:the path based on project and job ID.
priority#

See google.cloud.bigquery.job.QueryJobConfig.priority.

project#

Project bound to the job.

Return type:str
Returns:the project (derived from the client).
query#

The query text used in this query job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.query

Type:str
query_parameters#

See google.cloud.bigquery.job.QueryJobConfig.query_parameters.

query_plan#

Return query plan from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.queryPlan

Return type:list of QueryPlanEntry
Returns:mappings describing the query plan, or an empty list if the query has not yet completed.
referenced_tables#

Return referenced tables from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.referencedTables

Return type:list of dict
Returns:mappings describing the referenced tables, or an empty list if the query has not yet completed.
reload(client=None, retry=<google.api_core.retry.Retry object>)#

API call: refresh job properties via a GET request.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters:
  • client (Client or NoneType) – the client to use. If not passed, falls back to the client stored on the current job.
  • retry (google.api_core.retry.Retry) – (Optional) How to retry the RPC.
result(timeout=None, page_size=None, retry=<google.api_core.retry.Retry object>)[source]#

Start the job and wait for it to complete and get the result.

Parameters:
  • timeout (float) – How long (in seconds) to wait for job to complete before raising a concurrent.futures.TimeoutError.
  • page_size (int) – (Optional) The maximum number of rows in each page of results from this request. Non-positive values are ignored.
  • retry (google.api_core.retry.Retry) – (Optional) How to retry the call that retrieves rows.
Returns:

Iterator of Row objects holding the row data. During each page, the iterator will have the total_rows attribute set, which counts the total number of rows in the result set (this is distinct from the number of rows in the current page: iterator.page.num_items).

Return type:

google.cloud.bigquery.table.RowIterator

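Raises:
  • google.cloud.exceptions.GoogleCloudError – If the job failed.
  • concurrent.futures.TimeoutError – If the job did not complete in the given timeout.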
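
A short sketch of calling result() with a timeout and handling the concurrent.futures.TimeoutError mentioned above. The helper fetch_rows and the SlowJob and QuickJob classes are hypothetical stand-ins; a real QueryJob would come from the BigQuery client.

```python
import concurrent.futures


def fetch_rows(job, timeout=None, page_size=None):
    """Return all rows as a list, or None if the wait timed out."""
    try:
        rows = job.result(timeout=timeout, page_size=page_size)
    except concurrent.futures.TimeoutError:
        return None
    # Iterating the returned iterator pulls every page of results.
    return list(rows)


class SlowJob:
    """Hypothetical stand-in whose result() never finishes in time."""

    def result(self, timeout=None, page_size=None):
        raise concurrent.futures.TimeoutError()


class QuickJob:
    """Hypothetical stand-in that yields two rows immediately."""

    def result(self, timeout=None, page_size=None):
        return iter([("alice", 1), ("bob", 2)])


print(fetch_rows(QuickJob()))
```

Materializing every row with list() is only reasonable for small result sets; for large ones, iterate the returned object page by page instead.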
running()#

True if the operation is currently running.

schema_update_options#

See google.cloud.bigquery.job.QueryJobConfig.schema_update_options.

self_link#

URL for the job resource.

Return type:str, or NoneType
Returns:the URL (None until set from the server).
set_exception(exception)#

Set the Future’s exception.

set_result(result)#

Set the Future’s result.

slot_millis#

Slot-milliseconds used by this query job.

Type:Union[int, None]
started#

Datetime at which the job was started.

Return type:datetime.datetime, or NoneType
Returns:the start time (None until set from the server).
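
The created, started and ended timestamps can be combined to measure how long a job ran. The sketch below assumes all three are datetime.datetime values or None, as documented; the helper run_time and the stand-in objects are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def run_time(job):
    """Return ended - started, or None while either timestamp is unset."""
    if job.started is None or job.ended is None:
        return None
    return job.ended - job.started


# Hypothetical stand-ins so the sketch runs without credentials.
finished_job = SimpleNamespace(
    started=datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    ended=datetime(2020, 1, 1, 12, 0, 5, tzinfo=timezone.utc),
)
running_job = SimpleNamespace(
    started=datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    ended=None,
)

print(run_time(finished_job))
```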
state#

Status of the job.

Return type:str, or NoneType
Returns:the state (None until set from the server).
statement_type#

Return statement type from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.statementType

Return type:str or None
Returns:type of statement used by the job, or None if job is not yet complete.
table_definitions#

See google.cloud.bigquery.job.QueryJobConfig.table_definitions.

time_partitioning#

See google.cloud.bigquery.job.QueryJobConfig.time_partitioning.

timeline#

Return the query execution timeline from job statistics.

Type:List[TimelineEntry]
to_api_repr()[source]#

Generate a resource for _begin().

to_arrow(progress_bar_type=None, bqstorage_client=None)[source]#

[Beta] Create a pyarrow.Table by loading all pages of a table or query.

Parameters:
  • progress_bar_type (Optional[str]) –

    If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

    Possible values of progress_bar_type include:

    None
    No progress bar.
    'tqdm'
    Use the tqdm.tqdm() function to print a progress bar to sys.stderr.
    'tqdm_notebook'
    Use the tqdm.tqdm_notebook() function to display a progress bar as a Jupyter notebook widget.
    'tqdm_gui'
    Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.
  • bqstorage_client (google.cloud.bigquery_storage_v1beta1.BigQueryStorageClient) –

    Beta Feature. Optional. A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API.

    This method requires the pyarrow and google-cloud-bigquery-storage libraries.

    Reading from a specific partition or snapshot is not currently supported by this method.

Returns:

pyarrow.Table

A pyarrow.Table populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.

Raises:

ValueError – If the pyarrow library cannot be imported.

New in version 1.17.0.

to_dataframe(bqstorage_client=None, dtypes=None, progress_bar_type=None)[source]#

Return a pandas DataFrame from a QueryJob

Parameters:
  • bqstorage_client (google.cloud.bigquery_storage_v1beta1.BigQueryStorageClient) –

    Alpha Feature. Optional. A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API.

    This method requires the fastavro and google-cloud-bigquery-storage libraries.

    Reading from a specific partition or snapshot is not currently supported by this method.

    Caution: There is a known issue reading small anonymous query result tables with the BQ Storage API. Write your query results to a destination table to work around this issue.

  • dtypes (Map[str, Union[str, pandas.Series.dtype]]) – Optional. A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.
  • progress_bar_type (Optional[str]) –

    If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

    See google.cloud.bigquery.table.RowIterator.to_dataframe() for details.

    New in version 1.11.0.

Returns:

A DataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.

Raises:

ValueError – If the pandas library cannot be imported.
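
A sketch of calling to_dataframe() with a dtypes mapping while guarding against the documented ValueError. The fallback to plain rows is one possible design, not library behavior; the helper load_frame and the NoPandasJob class are hypothetical stand-ins so the snippet runs without pandas or credentials.

```python
def load_frame(job, dtypes=None):
    """Prefer a DataFrame; fall back to a list of rows if pandas is missing."""
    try:
        return job.to_dataframe(dtypes=dtypes)
    except ValueError:
        # Documented failure mode: the pandas library cannot be imported.
        return list(job.result())


class NoPandasJob:
    """Hypothetical stand-in that behaves as if pandas were not installed."""

    def to_dataframe(self, bqstorage_client=None, dtypes=None, progress_bar_type=None):
        raise ValueError("pandas is required")

    def result(self):
        return iter([{"name": "alice", "score": 1.5}])


fallback = load_frame(NoPandasJob(), dtypes={"score": "float32"})
print(fallback)
```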

total_bytes_billed#

Return total bytes billed from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.totalBytesBilled

Return type:int or None
Returns:total bytes billed for the job, or None if job is not yet complete.
total_bytes_processed#

Return total bytes processed from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.totalBytesProcessed

Return type:int or None
Returns:total bytes processed by the job, or None if job is not yet complete.
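
The byte counters and cache_hit are None until the job completes, so code that reports usage has to handle the missing case. A minimal sketch follows; the helpers billed_gib and usage_summary and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace

GIB = 2 ** 30  # bytes per gibibyte


def billed_gib(job):
    """Return billed bytes in GiB, or None while the statistic is absent."""
    if job.total_bytes_billed is None:
        return None
    return job.total_bytes_billed / GIB


def usage_summary(job):
    """Collect the usage statistics, keeping None for values not yet set."""
    return {
        "cache_hit": job.cache_hit,
        "processed_bytes": job.total_bytes_processed,
        "billed_gib": billed_gib(job),
    }


# Hypothetical stand-ins so the sketch runs without credentials.
finished = SimpleNamespace(
    cache_hit=False, total_bytes_processed=3 * GIB, total_bytes_billed=3 * GIB
)
pending = SimpleNamespace(
    cache_hit=None, total_bytes_processed=None, total_bytes_billed=None
)

print(usage_summary(finished))
```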
udf_resources#

See google.cloud.bigquery.job.QueryJobConfig.udf_resources.

undeclared_query_parameters#

Return undeclared query parameters from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#statistics.query.undeclaredQueryParameters

Return type:list of ArrayQueryParameter, ScalarQueryParameter, or StructQueryParameter
Returns:undeclared parameters, or an empty list if the query has not yet completed.
use_legacy_sql#

See google.cloud.bigquery.job.QueryJobConfig.use_legacy_sql.

use_query_cache#

See google.cloud.bigquery.job.QueryJobConfig.use_query_cache.

user_email#

E-mail address of user who submitted the job.

Return type:str, or NoneType
Returns:the e-mail address (None until set from the server).
write_disposition#

See google.cloud.bigquery.job.QueryJobConfig.write_disposition.