Deletes the batch workload resource. If the batch is not in a CANCELLED, SUCCEEDED or FAILED State, the delete operation fails and the response returns FAILED_PRECONDITION.

Method Details

analyze(name, body=None, x__xgafv=None)

Analyze a Batch for possible recommendations and insights.

Args:
name: string, Required. The fully qualified name of the batch to analyze in the format "projects/PROJECT_ID/locations/DATAPROC_REGION/batches/BATCH_ID" (required)
body: object, The request body.
The object takes the form of:

{ # A request to analyze a batch workload.
"requestId": "A String", # Optional. A unique ID used to identify the request. If the service receives two AnalyzeBatchRequests (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#google.cloud.dataproc.v1.AnalyzeBatchRequest) with the same request_id, the second request is ignored and the Operation that corresponds to the first request created and stored in the backend is returned. Recommendation: Set this value to a UUID (https://en.wikipedia.org/wiki/Universally_unique_identifier). The value must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). The maximum length is 40 characters.
"requestorId": "A String", # Optional. The requestor ID is used to identify if the request comes from a GCA investigation or the old Ask Gemini Experience.
}

x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format

Returns:
An object of the form:

{ # This resource represents a long-running operation that is the result of a network API call.
"done": True or False, # If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.
"error": { # The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC (https://github.com/grpc). Each Status message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the API Design Guide (https://cloud.google.com/apis/design/errors). # The error result of the operation in case of failure or cancellation.
"code": 42, # The status code, which should be an enum value of google.rpc.Code.
"details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use.
{
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
],
"message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.
},
"metadata": { # Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any.
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
"name": "A String", # The server-assigned name, which is only unique within the same service that originally returns it. If you use the default HTTP mapping, the name should be a resource name ending with operations/{unique_id}.
"response": { # The normal, successful response of the operation. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is standard Get/Create/Update, the response should be the resource. For other methods, the response should have the type XxxResponse, where Xxx is the original method name. For example, if the original method name is TakeSnapshot(), the inferred response type is TakeSnapshotResponse.
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
}
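
A sketch of how a caller might interpret the returned operation object, following the field descriptions above: "done" says whether the operation has finished, and a finished operation carries either "error" or "response". The function name operation_outcome and the sample payloads are hypothetical; the payloads are hand-written dictionaries, not real service output.

```python
def operation_outcome(operation):
    """Return a (status, detail) tuple for an Operation dictionary.

    status is "running", "failed" or "succeeded"; detail is None, an error
    summary string, or the response dictionary respectively.
    """
    if not operation.get("done", False):
        return "running", None
    if "error" in operation:
        error = operation["error"]
        return "failed", f"{error.get('code')}: {error.get('message')}"
    return "succeeded", operation.get("response", {})


# Hypothetical payloads shaped like the object described above.
pending = {"name": "operations/example-1", "done": False}
failed = {
    "name": "operations/example-2",
    "done": True,
    # 9 is FAILED_PRECONDITION in google.rpc.Code.
    "error": {"code": 9, "message": "batch is still running"},
}
finished = {"name": "operations/example-3", "done": True, "response": {}}

for op in (pending, failed, finished):
    print(op["name"], operation_outcome(op))
```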

close()

Close httplib2 connections.

create(parent, batchId=None, body=None, requestId=None, x__xgafv=None)

Creates a batch workload that executes asynchronously.

Args:
parent: string, Required. The parent resource where this batch will be created. (required)
body: object, The request body.
The object takes the form of:

{ # A representation of a batch workload in the service.
"createTime": "A String", # Output only. The time when the batch was created.
"creator": "A String", # Output only. The email address of the user who created the batch.
"environmentConfig": { # Environment configuration for a workload. # Optional. Environment configuration for the batch execution.
"executionConfig": { # Execution configuration for a workload. # Optional. Execution configuration for a workload.
"authenticationConfig": { # Authentication configuration for a workload is used to set the default identity for the workload execution. The config specifies the type of identity (service account or user) that will be used by workloads to access resources on the project(s). # Optional. Authentication configuration used to set the default identity for the workload execution. The config specifies the type of identity (service account or user) that will be used by workloads to access resources on the project(s).
"userWorkloadAuthenticationType": "A String", # Optional. Authentication type for the user workload running in containers.
},
"idleTtl": "A String", # Optional. Applies to sessions only. The duration to keep the session alive while it's idling. Exceeding this threshold causes the session to terminate. This field cannot be set on a batch workload. Minimum value is 10 minutes; maximum value is 14 days (see JSON representation of Duration (https://developers.google.com/protocol-buffers/docs/proto3#json)). Defaults to 1 hour if not set. If both ttl and idle_ttl are specified for an interactive session, the conditions are treated as OR conditions: the workload will be terminated when it has been idle for idle_ttl or when ttl has been exceeded, whichever occurs first.
"kmsKey": "A String", # Optional. The Cloud KMS key to use for encryption.
"networkTags": [ # Optional. Tags used for network traffic control.
"A String",
],
"networkUri": "A String", # Optional. Network URI to connect workload to.
"serviceAccount": "A String", # Optional. Service account used to execute the workload.
"stagingBucket": "A String", # Optional. A Cloud Storage bucket used to stage workload dependencies, config files, and store workload output and other ephemeral data, such as Spark history files. If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location according to the region where your workload is running, and then create and manage project-level, per-location staging and temporary buckets. This field requires a Cloud Storage bucket name, not a gs://... URI to a Cloud Storage bucket.
"subnetworkUri": "A String", # Optional. Subnetwork URI to connect workload to.
"ttl": "A String", # Optional. The duration after which the workload will be terminated, specified as the JSON representation for Duration (https://protobuf.dev/programming-guides/proto3/#json). When the workload exceeds this duration, it will be unconditionally terminated without waiting for ongoing work to finish. If ttl is not specified for a batch workload, the workload will be allowed to run until it exits naturally (or run forever without exiting). If ttl is not specified for an interactive session, it defaults to 24 hours. If ttl is not specified for a batch that uses 2.1+ runtime version, it defaults to 4 hours. Minimum value is 10 minutes; maximum value is 14 days. If both ttl and idle_ttl are specified (for an interactive session), the conditions are treated as OR conditions: the workload will be terminated when it has been idle for idle_ttl or when ttl has been exceeded, whichever occurs first.
},
"peripheralsConfig": { # Auxiliary services configuration for a workload. # Optional. Peripherals configuration that workload has access to.
"metastoreService": "A String", # Optional. Resource name of an existing Dataproc Metastore service. Example: projects/[project_id]/locations/[region]/services/[service_id]
"sparkHistoryServerConfig": { # Spark History Server configuration for the workload. # Optional. The Spark History Server configuration for the workload.
"dataprocCluster": "A String", # Optional. Resource name of an existing Dataproc Cluster to act as a Spark History Server for the workload. Example: projects/[project_id]/regions/[region]/clusters/[cluster_name]
},
},
},
"labels": { # Optional. The labels to associate with this batch. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a batch.
"a_key": "A String",
},
"name": "A String", # Output only. The resource name of the batch.
"operation": "A String", # Output only. The resource name of the operation associated with this batch.
"pysparkBatch": { # A configuration for running an Apache PySpark (https://spark.apache.org/docs/latest/api/python/getting_started/quickstart.html) batch workload. # Optional. PySpark batch config.
"archiveUris": [ # Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
"A String",
],
"args": [ # Optional. The arguments to pass to the driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
"A String",
],
"fileUris": [ # Optional. HCFS URIs of files to be placed in the working directory of each executor.
"A String",
],
"jarFileUris": [ # Optional. HCFS URIs of jar files to add to the classpath of the Spark driver and tasks.
"A String",
],
"mainPythonFileUri": "A String", # Required. The HCFS URI of the main Python file to use as the Spark driver. Must be a .py file.
"pythonFileUris": [ # Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
"A String",
],
},
"runtimeConfig": { # Runtime configuration for a workload. # Optional. Runtime configuration for the batch execution.
"autotuningConfig": { # Autotuning configuration of the workload. # Optional. Autotuning configuration of the workload.
"scenarios": [ # Optional. Scenarios for which tunings are applied.
"A String",
],
},
"cohort": "A String", # Optional. Cohort identifier. Identifies families of the workloads having the same shape, e.g. daily ETL jobs.
"containerImage": "A String", # Optional. Custom container image for the job runtime environment. If not specified, a default container image will be used.
"properties": { # Optional. A mapping of property names to values, which are used to configure workload execution.
"a_key": "A String",
},
"repositoryConfig": { # Configuration for dependency repositories # Optional. Dependency repository configuration.
"pypiRepositoryConfig": { # Configuration for PyPi repository # Optional. Configuration for PyPi repository.
"pypiRepository": "A String", # Optional. PyPi repository address
},
},
"version": "A String", # Optional. Version of the batch runtime.
},
"runtimeInfo": { # Runtime information about workload execution. # Output only. Runtime information about batch execution.
"approximateUsage": { # Usage metrics represent approximate total resources consumed by a workload. # Output only. Approximate workload resource usage, calculated when the workload completes (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)). Note: This metric calculation may change in the future, for example, to capture cumulative workload resource consumption during workload execution (see the Dataproc Serverless release notes (https://cloud.google.com/dataproc-serverless/docs/release-notes) for announcements, changes, fixes and other Dataproc developments).
"acceleratorType": "A String", # Optional. DEPRECATED Accelerator type being used, if any
"milliAcceleratorSeconds": "A String", # Optional. DEPRECATED Accelerator usage in (milliAccelerator x seconds) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
"milliDcuSeconds": "A String", # Optional. DCU (Dataproc Compute Units) usage in (milliDCU x seconds) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
"shuffleStorageGbSeconds": "A String", # Optional. Shuffle storage usage in (GB x seconds) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
"updateTime": "A String", # Optional. The timestamp of the usage metrics.
},
"currentUsage": { # The usage snapshot represents the resources consumed by a workload at a specified time. # Output only. Snapshot of current workload resource usage.
"acceleratorType": "A String", # Optional. Accelerator type being used, if any
"milliAccelerator": "A String", # Optional. Milli (one-thousandth) accelerator. (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing))
"milliDcu": "A String", # Optional. Milli (one-thousandth) Dataproc Compute Units (DCUs) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
"milliDcuPremium": "A String", # Optional. Milli (one-thousandth) Dataproc Compute Units (DCUs) charged at premium tier (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
"shuffleStorageGb": "A String", # Optional. Shuffle Storage in gigabytes (GB). (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing))
"shuffleStorageGbPremium": "A String", # Optional. Shuffle Storage in gigabytes (GB) charged at premium tier. (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing))
"snapshotTime": "A String", # Optional. The timestamp of the usage snapshot.
},
"diagnosticOutputUri": "A String", # Output only. A URI pointing to the location of the diagnostics tarball.
"endpoints": { # Output only. Map of remote access endpoints (such as web interfaces and APIs) to their URIs.
"a_key": "A String",
},
"outputUri": "A String", # Output only. A URI pointing to the location of the stdout and stderr of the workload.
"propertiesInfo": { # Properties of the workload organized by origin. # Optional. Properties of the workload organized by origin.
"autotuningProperties": { # Output only. Properties set by autotuning engine.
"a_key": { # Annotated property value.
"annotation": "A String", # Annotation, comment or explanation why the property was set.
"overriddenValue": "A String", # Optional. Value which was replaced by the corresponding component.
"value": "A String", # Property value.
},
},
},
},
"sparkBatch": { # A configuration for running an Apache Spark (https://spark.apache.org/) batch workload. # Optional. Spark batch config.
"archiveUris": [ # Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
"A String",
],
"args": [ # Optional. The arguments to pass to the driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
"A String",
],
"fileUris": [ # Optional. HCFS URIs of files to be placed in the working directory of each executor.
"A String",
],
"jarFileUris": [ # Optional. HCFS URIs of jar files to add to the classpath of the Spark driver and tasks.
"A String",
],
"mainClass": "A String", # Optional. The name of the driver main class. The jar file that contains the class must be in the classpath or specified in jar_file_uris.
"mainJarFileUri": "A String", # Optional. The HCFS URI of the jar file that contains the main class.
},
"sparkRBatch": { # A configuration for running an Apache SparkR (https://spark.apache.org/docs/latest/sparkr.html) batch workload. # Optional. SparkR batch config.
"archiveUris": [ # Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
"A String",
],
"args": [ # Optional. The arguments to pass to the Spark driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
"A String",
],
"fileUris": [ # Optional. HCFS URIs of files to be placed in the working directory of each executor.
"A String",
],
"mainRFileUri": "A String", # Required. The HCFS URI of the main R file to use as the driver. Must be a .R or .r file.
},
"sparkSqlBatch": { # A configuration for running Apache Spark SQL (https://spark.apache.org/sql/) queries as a batch workload. # Optional. SparkSql batch config.
"jarFileUris": [ # Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.
"A String",
],
"queryFileUri": "A String", # Required. The HCFS URI of the script that contains Spark SQL queries to execute.
"queryVariables": { # Optional. Mapping of query variable names to values (equivalent to the Spark SQL command: SET name="value";).
"a_key": "A String",
},
},
"state": "A String", # Output only. The state of the batch.
"stateHistory": [ # Output only. Historical state information for the batch.
{ # Historical state information.
"state": "A String", # Output only. The state of the batch at this point in history.
"stateMessage": "A String", # Output only. Details about the state at this point in history.
"stateStartTime": "A String", # Output only. The time when the batch entered the historical state.
},
],
"stateMessage": "A String", # Output only. Batch state details, such as a failure description if the state is FAILED.
"stateTime": "A String", # Output only. The time when the batch entered its current state.
"uuid": "A String", # Output only. A batch UUID (Universally Unique Identifier). The service generates this value when it creates the batch.
}

batchId: string, Optional. The ID to use for the batch, which will become the final component of the batch's resource name. This value must be 4-63 characters. Valid characters are /[a-z][0-9]-/.
requestId: string, Optional. A unique ID used to identify the request. If the service receives two CreateBatchRequests (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#google.cloud.dataproc.v1.CreateBatchRequest) with the same request_id, the second request is ignored and the Operation that corresponds to the first Batch created and stored in the backend is returned. Recommendation: Set this value to a UUID (https://en.wikipedia.org/wiki/Universally_unique_identifier). The value must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). The maximum length is 40 characters.
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
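
A hedged sketch of preparing the arguments for create, using only fields documented above. The helpers (check_batch_id, pyspark_batch_body), the bucket and file names, and the reading of the batchId rule as "4 to 63 characters drawn from a-z, 0-9 and hyphen" are assumptions made for illustration. The parent format in the trailing comment is inferred from the batch name format and is likewise an assumption.

```python
import re

# One reading of the batchId rule: 4-63 characters from a-z, 0-9 and "-".
BATCH_ID_PATTERN = re.compile(r"[a-z0-9-]{4,63}")
MIN_TTL_SECONDS = 10 * 60            # documented minimum: 10 minutes
MAX_TTL_SECONDS = 14 * 24 * 60 * 60  # documented maximum: 14 days
MAX_LABELS = 32                      # documented label limit per batch


def check_batch_id(batch_id):
    """Return batch_id unchanged if it satisfies the assumed pattern."""
    if not BATCH_ID_PATTERN.fullmatch(batch_id):
        raise ValueError(f"invalid batchId: {batch_id!r}")
    return batch_id


def pyspark_batch_body(main_python_file_uri, args=(), ttl_seconds=None, labels=None):
    """Build a request body for a PySpark batch workload."""
    labels = dict(labels or {})
    if len(labels) > MAX_LABELS:
        raise ValueError("no more than 32 labels can be associated with a batch")
    body = {
        "pysparkBatch": {
            "mainPythonFileUri": main_python_file_uri,  # the only required field
            "args": list(args),
        }
    }
    if labels:
        body["labels"] = labels
    if ttl_seconds is not None:
        if not MIN_TTL_SECONDS <= ttl_seconds <= MAX_TTL_SECONDS:
            raise ValueError("ttl must be between 10 minutes and 14 days")
        # The JSON representation of Duration is seconds with an "s" suffix.
        body["environmentConfig"] = {"executionConfig": {"ttl": f"{ttl_seconds}s"}}
    return body


# Hypothetical values, for illustration only.
batch_id = check_batch_id("wordcount-001")
body = pyspark_batch_body(
    "gs://example-bucket/jobs/wordcount.py",
    args=["gs://example-bucket/input"],
    ttl_seconds=3600,
    labels={"team": "data"},
)

# With a service object (an assumption, not shown here), the call would be:
# operation = service.projects().locations().batches().create(
#     parent="projects/my-project/locations/us-central1",
#     batchId=batch_id, body=body).execute()
```

Output-only fields such as createTime, state and runtimeInfo are left out of the body because the service populates them.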

Returns:
An object of the form:

delete(name, x__xgafv=None)

Deletes the batch workload resource. If the batch is not in a CANCELLED, SUCCEEDED or FAILED State, the delete operation fails and the response returns FAILED_PRECONDITION.

Args:
  name: string, Required. The fully qualified name of the batch to delete in the format "projects/PROJECT_ID/locations/DATAPROC_REGION/batches/BATCH_ID" (required)
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # A generic empty message that you can re-use to avoid defining duplicated empty messages in your APIs. A typical example is to use it as the request or the response type of an API method. For instance: service Foo { rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); }
}
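
Because delete fails with FAILED_PRECONDITION unless the batch is in a CANCELLED, SUCCEEDED or FAILED state, a caller may want to check the state returned by get first. The following is a minimal sketch of that check; the names TERMINAL_STATES and can_delete are hypothetical, and "RUNNING" is used only as an example of a state outside the three listed.

```python
# States in which, per the description above, a batch can be deleted.
TERMINAL_STATES = frozenset({"CANCELLED", "SUCCEEDED", "FAILED"})


def can_delete(batch):
    """Return True if the batch dictionary reports a deletable state."""
    return batch.get("state") in TERMINAL_STATES


# Hand-written batch dictionaries shaped like the object returned by get.
finished_batch = {"name": "projects/p/locations/r/batches/b1", "state": "SUCCEEDED"}
active_batch = {"name": "projects/p/locations/r/batches/b2", "state": "RUNNING"}

print(can_delete(finished_batch))
print(can_delete(active_batch))

# With a service object (an assumption, not shown here), the guarded call would be:
# if can_delete(batch):
#     service.projects().locations().batches().delete(name=batch["name"]).execute()
```

The check is advisory only: the state can change between get and delete, so the FAILED_PRECONDITION response still needs handling.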

get(name, x__xgafv=None)

Gets the batch workload resource representation.

Args:
  name: string, Required. The fully qualified name of the batch to retrieve in the format "projects/PROJECT_ID/locations/DATAPROC_REGION/batches/BATCH_ID" (required)
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # A representation of a batch workload in the service.
  "createTime": "A String", # Output only. The time when the batch was created.
  "creator": "A String", # Output only. The email address of the user who created the batch.
  "environmentConfig": { # Environment configuration for a workload. # Optional. Environment configuration for the batch execution.
    "executionConfig": { # Execution configuration for a workload. # Optional. Execution configuration for a workload.
      "authenticationConfig": { # Authentication configuration for a workload is used to set the default identity for the workload execution. The config specifies the type of identity (service account or user) that will be used by workloads to access resources on the project(s). # Optional. Authentication configuration used to set the default identity for the workload execution. The config specifies the type of identity (service account or user) that will be used by workloads to access resources on the project(s).
        "userWorkloadAuthenticationType": "A String", # Optional. Authentication type for the user workload running in containers.
      },
      "idleTtl": "A String", # Optional. Applies to sessions only. The duration to keep the session alive while it's idling. Exceeding this threshold causes the session to terminate. This field cannot be set on a batch workload. Minimum value is 10 minutes; maximum value is 14 days (see JSON representation of Duration (https://developers.google.com/protocol-buffers/docs/proto3#json)). Defaults to 1 hour if not set. If both ttl and idle_ttl are specified for an interactive session, the conditions are treated as OR conditions: the workload will be terminated when it has been idle for idle_ttl or when ttl has been exceeded, whichever occurs first.
      "kmsKey": "A String", # Optional. The Cloud KMS key to use for encryption.
      "networkTags": [ # Optional. Tags used for network traffic control.
        "A String",
      ],
      "networkUri": "A String", # Optional. Network URI to connect workload to.
      "serviceAccount": "A String", # Optional. Service account used to execute the workload.
      "stagingBucket": "A String", # Optional. A Cloud Storage bucket used to stage workload dependencies, config files, and store workload output and other ephemeral data, such as Spark history files. If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location according to the region where your workload is running, and then create and manage project-level, per-location staging and temporary buckets. This field requires a Cloud Storage bucket name, not a gs://... URI to a Cloud Storage bucket.
      "subnetworkUri": "A String", # Optional. Subnetwork URI to connect workload to.
      "ttl": "A String", # Optional. The duration after which the workload will be terminated, specified as the JSON representation for Duration (https://protobuf.dev/programming-guides/proto3/#json). When the workload exceeds this duration, it will be unconditionally terminated without waiting for ongoing work to finish. If ttl is not specified for a batch workload, the workload will be allowed to run until it exits naturally (or run forever without exiting). If ttl is not specified for an interactive session, it defaults to 24 hours. If ttl is not specified for a batch that uses 2.1+ runtime version, it defaults to 4 hours. Minimum value is 10 minutes; maximum value is 14 days. If both ttl and idle_ttl are specified (for an interactive session), the conditions are treated as OR conditions: the workload will be terminated when it has been idle for idle_ttl or when ttl has been exceeded, whichever occurs first.
    },
    "peripheralsConfig": { # Auxiliary services configuration for a workload. # Optional. Peripherals configuration that workload has access to.
      "metastoreService": "A String", # Optional. Resource name of an existing Dataproc Metastore service. Example: projects/[project_id]/locations/[region]/services/[service_id]
      "sparkHistoryServerConfig": { # Spark History Server configuration for the workload. # Optional. The Spark History Server configuration for the workload.
        "dataprocCluster": "A String", # Optional. Resource name of an existing Dataproc Cluster to act as a Spark History Server for the workload. Example: projects/[project_id]/regions/[region]/clusters/[cluster_name]
      },
    },
  },
  "labels": { # Optional. The labels to associate with this batch. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a batch.
    "a_key": "A String",
  },
  "name": "A String", # Output only. The resource name of the batch.
  "operation": "A String", # Output only. The resource name of the operation associated with this batch.
  "pysparkBatch": { # A configuration for running an Apache PySpark (https://spark.apache.org/docs/latest/api/python/getting_started/quickstart.html) batch workload. # Optional. PySpark batch config.
    "archiveUris": [ # Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
      "A String",
    ],
    "args": [ # Optional. The arguments to pass to the driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
      "A String",
    ],
    "fileUris": [ # Optional. HCFS URIs of files to be placed in the working directory of each executor.
      "A String",
    ],
    "jarFileUris": [ # Optional. HCFS URIs of jar files to add to the classpath of the Spark driver and tasks.
      "A String",
    ],
    "mainPythonFileUri": "A String", # Required. The HCFS URI of the main Python file to use as the Spark driver. Must be a .py file.
    "pythonFileUris": [ # Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
      "A String",
    ],
  },
  "runtimeConfig": { # Runtime configuration for a workload. # Optional. Runtime configuration for the batch execution.
    "autotuningConfig": { # Autotuning configuration of the workload. # Optional. Autotuning configuration of the workload.
      "scenarios": [ # Optional. Scenarios for which tunings are applied.
        "A String",
      ],
    },
    "cohort": "A String", # Optional. Cohort identifier. Identifies families of the workloads having the same shape, e.g. daily ETL jobs.
    "containerImage": "A String", # Optional. Custom container image for the job runtime environment. If not specified, a default container image will be used.
    "properties": { # Optional. A mapping of property names to values, which are used to configure workload execution.
      "a_key": "A String",
    },
    "repositoryConfig": { # Configuration for dependency repositories # Optional. Dependency repository configuration.
      "pypiRepositoryConfig": { # Configuration for PyPi repository # Optional. Configuration for PyPi repository.
        "pypiRepository": "A String", # Optional. PyPi repository address
      },
    },
    "version": "A String", # Optional. Version of the batch runtime.
  },
  "runtimeInfo": { # Runtime information about workload execution. # Output only. Runtime information about batch execution.
    "approximateUsage": { # Usage metrics represent approximate total resources consumed by a workload. # Output only. Approximate workload resource usage, calculated when the workload completes (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)). Note: This metric calculation may change in the future, for example, to capture cumulative workload resource consumption during workload execution (see the Dataproc Serverless release notes (https://cloud.google.com/dataproc-serverless/docs/release-notes) for announcements, changes, fixes and other Dataproc developments).
      "acceleratorType": "A String", # Optional. DEPRECATED Accelerator type being used, if any
      "milliAcceleratorSeconds": "A String", # Optional. DEPRECATED Accelerator usage in (milliAccelerator x seconds) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
      "milliDcuSeconds": "A String", # Optional. DCU (Dataproc Compute Units) usage in (milliDCU x seconds) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
      "shuffleStorageGbSeconds": "A String", # Optional. Shuffle storage usage in (GB x seconds) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
      "updateTime": "A String", # Optional. The timestamp of the usage metrics.
    },
    "currentUsage": { # The usage snapshot represents the resources consumed by a workload at a specified time. # Output only. Snapshot of current workload resource usage.
      "acceleratorType": "A String", # Optional. Accelerator type being used, if any
      "milliAccelerator": "A String", # Optional. Milli (one-thousandth) accelerator. (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing))
      "milliDcu": "A String", # Optional. Milli (one-thousandth) Dataproc Compute Units (DCUs) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
      "milliDcuPremium": "A String", # Optional. Milli (one-thousandth) Dataproc Compute Units (DCUs) charged at premium tier (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
      "shuffleStorageGb": "A String", # Optional. Shuffle Storage in gigabytes (GB). (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing))
      "shuffleStorageGbPremium": "A String", # Optional. Shuffle Storage in gigabytes (GB) charged at premium tier. (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing))
      "snapshotTime": "A String", # Optional. The timestamp of the usage snapshot.
    },
    "diagnosticOutputUri": "A String", # Output only. A URI pointing to the location of the diagnostics tarball.
    "endpoints": { # Output only. Map of remote access endpoints (such as web interfaces and APIs) to their URIs.
      "a_key": "A String",
    },
    "outputUri": "A String", # Output only. A URI pointing to the location of the stdout and stderr of the workload.
    "propertiesInfo": { # Properties of the workload organized by origin. # Optional. Properties of the workload organized by origin.
      "autotuningProperties": { # Output only. Properties set by autotuning engine.
        "a_key": { # Annotated property value.
          "annotation": "A String", # Annotation, comment or explanation why the property was set.
          "overriddenValue": "A String", # Optional. Value which was replaced by the corresponding component.
          "value": "A String", # Property value.
        },
      },
    },
  },
  "sparkBatch": { # A configuration for running an Apache Spark (https://spark.apache.org/) batch workload. # Optional. Spark batch config.
    "archiveUris": [ # Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
      "A String",
    ],
    "args": [ # Optional. The arguments to pass to the driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
      "A String",
    ],
    "fileUris": [ # Optional. HCFS URIs of files to be placed in the working directory of each executor.
      "A String",
    ],
    "jarFileUris": [ # Optional. HCFS URIs of jar files to add to the classpath of the Spark driver and tasks.
      "A String",
    ],
    "mainClass": "A String", # Optional. The name of the driver main class. The jar file that contains the class must be in the classpath or specified in jar_file_uris.
    "mainJarFileUri": "A String", # Optional. The HCFS URI of the jar file that contains the main class.
  },
  "sparkRBatch": { # A configuration for running an Apache SparkR (https://spark.apache.org/docs/latest/sparkr.html) batch workload. # Optional. SparkR batch config.
    "archiveUris": [ # Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
      "A String",
    ],
    "args": [ # Optional. The arguments to pass to the Spark driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
      "A String",
    ],
    "fileUris": [ # Optional. HCFS URIs of files to be placed in the working directory of each executor.
      "A String",
    ],
    "mainRFileUri": "A String", # Required. The HCFS URI of the main R file to use as the driver. Must be a .R or .r file.
  },
  "sparkSqlBatch": { # A configuration for running Apache Spark SQL (https://spark.apache.org/sql/) queries as a batch workload. # Optional. SparkSql batch config.
    "jarFileUris": [ # Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.
      "A String",
    ],
    "queryFileUri": "A String", # Required. The HCFS URI of the script that contains Spark SQL queries to execute.
    "queryVariables": { # Optional. Mapping of query variable names to values (equivalent to the Spark SQL command: SET name="value";).
      "a_key": "A String",
    },
  },
  "state": "A String", # Output only. The state of the batch.
  "stateHistory": [ # Output only. Historical state information for the batch.
    { # Historical state information.
      "state": "A String", # Output only. The state of the batch at this point in history.
      "stateMessage": "A String", # Output only. Details about the state at this point in history.
      "stateStartTime": "A String", # Output only. The time when the batch entered the historical state.
    },
  ],
  "stateMessage": "A String", # Output only. Batch state details, such as a failure description if the state is FAILED.
  "stateTime": "A String", # Output only. The time when the batch entered a current state.
  "uuid": "A String", # Output only. A batch UUID (Unique Universal Identifier). The service generates this value when it creates the batch.
}
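
The usage counters under runtimeInfo are typed as strings in this reference, and the state field determines whether a batch can be deleted (delete fails with FAILED_PRECONDITION unless the state is CANCELLED, SUCCEEDED or FAILED). The following is a minimal sketch of reading both from a batch resource that has already been parsed into a Python dict. The helper names is_terminal and dcu_hours and the sample values are illustrative assumptions, not part of the client library, and the sketch assumes milliDcuSeconds holds a decimal integer.

```python
# Terminal states named in the delete() description; a batch in any other
# state cannot be deleted.
TERMINAL_STATES = {"CANCELLED", "SUCCEEDED", "FAILED"}


def is_terminal(batch):
    """Return True if the batch dict reports a terminal state."""
    return batch.get("state") in TERMINAL_STATES


def dcu_hours(batch):
    """Convert runtimeInfo.approximateUsage.milliDcuSeconds to DCU-hours.

    Returns None when the usage field is absent.
    """
    usage = batch.get("runtimeInfo", {}).get("approximateUsage", {})
    raw = usage.get("milliDcuSeconds")
    if raw is None:
        return None
    # (milliDCU x seconds) -> (DCU x seconds) -> (DCU x hours)
    return int(raw) / 1000 / 3600


# Hypothetical response fragment, for illustration only.
example_batch = {
    "state": "SUCCEEDED",
    "runtimeInfo": {"approximateUsage": {"milliDcuSeconds": "7200000"}},
}
```

Here dcu_hours(example_batch) evaluates to 2.0, since 7,200,000 milliDCU-seconds is 7,200 DCU-seconds, or two DCU-hours.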

list(parent, filter=None, orderBy=None, pageSize=None, pageToken=None, x__xgafv=None)

Lists batch workloads.

Args:
  parent: string, Required. The parent, which owns this collection of batches. (required)
  filter: string, Optional. A filter for the batches to return in the response. A filter is a logical expression constraining the values of various fields in each batch resource. Filters are case sensitive, and may contain multiple clauses combined with logical operators (AND/OR). Supported fields are batch_id, batch_uuid, state, create_time, and labels. For example, state = RUNNING and create_time < "2023-01-01T00:00:00Z" filters for batches in state RUNNING that were created before 2023-01-01, and state = RUNNING and labels.environment=production filters for batches in a RUNNING state that have a production environment label. See https://google.aip.dev/assets/misc/ebnf-filtering.txt for a detailed description of the filter syntax and a list of supported comparisons.
  orderBy: string, Optional. Field(s) on which to sort the list of batches. Currently the only supported sort orders are unspecified (empty) and create_time desc to sort by most recently created batches first. See https://google.aip.dev/132#ordering for more details.
  pageSize: integer, Optional. The maximum number of batches to return in each response. The service may return fewer than this value. The default page size is 20; the maximum page size is 1000.
  pageToken: string, Optional. A page token received from a previous ListBatches call. Provide this token to retrieve the subsequent page.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format
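
The filter argument is a plain string, so it can be assembled from its clauses before the call. The following is a minimal sketch covering three of the supported fields (state, create_time, labels). The function name build_batch_filter and its parameters are illustrative assumptions. Clauses are joined with uppercase AND, following the operator names given above; the inline examples in this reference write the operator in lowercase, so the joiner is left configurable.

```python
def build_batch_filter(state=None, created_before=None, labels=None, joiner="AND"):
    """Assemble a ListBatches filter string from optional clauses.

    state:          a batch state name such as "RUNNING"
    created_before: an RFC 3339 timestamp string, compared with create_time
    labels:         a dict of label key -> required value
    joiner:         logical operator placed between clauses
    """
    clauses = []
    if state is not None:
        clauses.append(f"state = {state}")
    if created_before is not None:
        # Timestamps are quoted, as in the documented example.
        clauses.append(f'create_time < "{created_before}"')
    # Sort label keys so the resulting string is deterministic.
    for key, value in sorted((labels or {}).items()):
        clauses.append(f"labels.{key}={value}")
    return f" {joiner} ".join(clauses)


running_in_prod = build_batch_filter(
    state="RUNNING", labels={"environment": "production"}
)
```

The resulting string would be passed as filter=running_in_prod to list(), optionally together with orderBy="create_time desc" and a pageSize of at most 1000.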

Returns:
  An object of the form:

    { # A list of batch workloads.
  "batches": [ # Output only. The batches from the specified collection.
    { # A representation of a batch workload in the service.
      "createTime": "A String", # Output only. The time when the batch was created.
      "creator": "A String", # Output only. The email address of the user who created the batch.
      "environmentConfig": { # Environment configuration for a workload. # Optional. Environment configuration for the batch execution.
        "executionConfig": { # Execution configuration for a workload. # Optional. Execution configuration for a workload.
          "authenticationConfig": { # Authentication configuration for a workload is used to set the default identity for the workload execution. The config specifies the type of identity (service account or user) that will be used by workloads to access resources on the project(s). # Optional. Authentication configuration used to set the default identity for the workload execution. The config specifies the type of identity (service account or user) that will be used by workloads to access resources on the project(s).
            "userWorkloadAuthenticationType": "A String", # Optional. Authentication type for the user workload running in containers.
          },
          "idleTtl": "A String", # Optional. Applies to sessions only. The duration to keep the session alive while it's idling. Exceeding this threshold causes the session to terminate. This field cannot be set on a batch workload. Minimum value is 10 minutes; maximum value is 14 days (see JSON representation of Duration (https://developers.google.com/protocol-buffers/docs/proto3#json)). Defaults to 1 hour if not set. If both ttl and idle_ttl are specified for an interactive session, the conditions are treated as OR conditions: the workload will be terminated when it has been idle for idle_ttl or when ttl has been exceeded, whichever occurs first.
          "kmsKey": "A String", # Optional. The Cloud KMS key to use for encryption.
          "networkTags": [ # Optional. Tags used for network traffic control.
            "A String",
          ],
          "networkUri": "A String", # Optional. Network URI to connect workload to.
          "serviceAccount": "A String", # Optional. Service account that used to execute workload.
          "stagingBucket": "A String", # Optional. A Cloud Storage bucket used to stage workload dependencies, config files, and store workload output and other ephemeral data, such as Spark history files. If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location according to the region where your workload is running, and then create and manage project-level, per-location staging and temporary buckets. This field requires a Cloud Storage bucket name, not a gs://... URI to a Cloud Storage bucket.
          "subnetworkUri": "A String", # Optional. Subnetwork URI to connect workload to.
          "ttl": "A String", # Optional. The duration after which the workload will be terminated, specified as the JSON representation for Duration (https://protobuf.dev/programming-guides/proto3/#json). When the workload exceeds this duration, it will be unconditionally terminated without waiting for ongoing work to finish. If ttl is not specified for a batch workload, the workload will be allowed to run until it exits naturally (or run forever without exiting). If ttl is not specified for an interactive session, it defaults to 24 hours. If ttl is not specified for a batch that uses 2.1+ runtime version, it defaults to 4 hours. Minimum value is 10 minutes; maximum value is 14 days. If both ttl and idle_ttl are specified (for an interactive session), the conditions are treated as OR conditions: the workload will be terminated when it has been idle for idle_ttl or when ttl has been exceeded, whichever occurs first.
        },
        "peripheralsConfig": { # Auxiliary services configuration for a workload. # Optional. Peripherals configuration that workload has access to.
          "metastoreService": "A String", # Optional. Resource name of an existing Dataproc Metastore service.Example: projects/[project_id]/locations/[region]/services/[service_id]
          "sparkHistoryServerConfig": { # Spark History Server configuration for the workload. # Optional. The Spark History Server configuration for the workload.
            "dataprocCluster": "A String", # Optional. Resource name of an existing Dataproc Cluster to act as a Spark History Server for the workload.Example: projects/[project_id]/regions/[region]/clusters/[cluster_name]
          },
        },
      },
      "labels": { # Optional. The labels to associate with this batch. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a batch.
        "a_key": "A String",
      },
      "name": "A String", # Output only. The resource name of the batch.
      "operation": "A String", # Output only. The resource name of the operation associated with this batch.
      "pysparkBatch": { # A configuration for running an Apache PySpark (https://spark.apache.org/docs/latest/api/python/getting_started/quickstart.html) batch workload. # Optional. PySpark batch config.
        "archiveUris": [ # Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
          "A String",
        ],
        "args": [ # Optional. The arguments to pass to the driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
          "A String",
        ],
        "fileUris": [ # Optional. HCFS URIs of files to be placed in the working directory of each executor.
          "A String",
        ],
        "jarFileUris": [ # Optional. HCFS URIs of jar files to add to the classpath of the Spark driver and tasks.
          "A String",
        ],
        "mainPythonFileUri": "A String", # Required. The HCFS URI of the main Python file to use as the Spark driver. Must be a .py file.
        "pythonFileUris": [ # Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
          "A String",
        ],
      },
      "runtimeConfig": { # Runtime configuration for a workload. # Optional. Runtime configuration for the batch execution.
        "autotuningConfig": { # Autotuning configuration of the workload. # Optional. Autotuning configuration of the workload.
          "scenarios": [ # Optional. Scenarios for which tunings are applied.
            "A String",
          ],
        },
        "cohort": "A String", # Optional. Cohort identifier. Identifies families of the workloads having the same shape, e.g. daily ETL jobs.
        "containerImage": "A String", # Optional. Optional custom container image for the job runtime environment. If not specified, a default container image will be used.
        "properties": { # Optional. A mapping of property names to values, which are used to configure workload execution.
          "a_key": "A String",
        },
        "repositoryConfig": { # Configuration for dependency repositories # Optional. Dependency repository configuration.
          "pypiRepositoryConfig": { # Configuration for PyPi repository # Optional. Configuration for PyPi repository.
            "pypiRepository": "A String", # Optional. PyPi repository address
          },
        },
        "version": "A String", # Optional. Version of the batch runtime.
      },
      "runtimeInfo": { # Runtime information about workload execution. # Output only. Runtime information about batch execution.
        "approximateUsage": { # Usage metrics represent approximate total resources consumed by a workload. # Output only. Approximate workload resource usage, calculated when the workload completes (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).Note: This metric calculation may change in the future, for example, to capture cumulative workload resource consumption during workload execution (see the Dataproc Serverless release notes (https://cloud.google.com/dataproc-serverless/docs/release-notes) for announcements, changes, fixes and other Dataproc developments).
          "acceleratorType": "A String", # Optional. DEPRECATED Accelerator type being used, if any
          "milliAcceleratorSeconds": "A String", # Optional. DEPRECATED Accelerator usage in (milliAccelerator x seconds) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
          "milliDcuSeconds": "A String", # Optional. DCU (Dataproc Compute Units) usage in (milliDCU x seconds) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
          "shuffleStorageGbSeconds": "A String", # Optional. Shuffle storage usage in (GB x seconds) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
          "updateTime": "A String", # Optional. The timestamp of the usage metrics.
        },
        "currentUsage": { # The usage snapshot represents the resources consumed by a workload at a specified time. # Output only. Snapshot of current workload resource usage.
          "acceleratorType": "A String", # Optional. Accelerator type being used, if any
          "milliAccelerator": "A String", # Optional. Milli (one-thousandth) accelerator. (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing))
          "milliDcu": "A String", # Optional. Milli (one-thousandth) Dataproc Compute Units (DCUs) (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
          "milliDcuPremium": "A String", # Optional. Milli (one-thousandth) Dataproc Compute Units (DCUs) charged at premium tier (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing)).
          "shuffleStorageGb": "A String", # Optional. Shuffle Storage in gigabytes (GB). (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing))
          "shuffleStorageGbPremium": "A String", # Optional. Shuffle Storage in gigabytes (GB) charged at premium tier. (see Dataproc Serverless pricing (https://cloud.google.com/dataproc-serverless/pricing))
          "snapshotTime": "A String", # Optional. The timestamp of the usage snapshot.
        },
        "diagnosticOutputUri": "A String", # Output only. A URI pointing to the location of the diagnostics tarball.
        "endpoints": { # Output only. Map of remote access endpoints (such as web interfaces and APIs) to their URIs.
          "a_key": "A String",
        },
        "outputUri": "A String", # Output only. A URI pointing to the location of the stdout and stderr of the workload.
        "propertiesInfo": { # Properties of the workload organized by origin. # Optional. Properties of the workload organized by origin.
          "autotuningProperties": { # Output only. Properties set by autotuning engine.
            "a_key": { # Annotatated property value.
              "annotation": "A String", # Annotation, comment or explanation why the property was set.
              "overriddenValue": "A String", # Optional. Value which was replaced by the corresponding component.
              "value": "A String", # Property value.
            },
          },
        },
      },
      "sparkBatch": { # A configuration for running an Apache Spark (https://spark.apache.org/) batch workload. # Optional. Spark batch config.
        "archiveUris": [ # Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
          "A String",
        ],
        "args": [ # Optional. The arguments to pass to the driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
          "A String",
        ],
        "fileUris": [ # Optional. HCFS URIs of files to be placed in the working directory of each executor.
          "A String",
        ],
        "jarFileUris": [ # Optional. HCFS URIs of jar files to add to the classpath of the Spark driver and tasks.
          "A String",
        ],
        "mainClass": "A String", # Optional. The name of the driver main class. The jar file that contains the class must be in the classpath or specified in jar_file_uris.
        "mainJarFileUri": "A String", # Optional. The HCFS URI of the jar file that contains the main class.
      },
      "sparkRBatch": { # A configuration for running an Apache SparkR (https://spark.apache.org/docs/latest/sparkr.html) batch workload. # Optional. SparkR batch config.
        "archiveUris": [ # Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
          "A String",
        ],
        "args": [ # Optional. The arguments to pass to the Spark driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
          "A String",
        ],
        "fileUris": [ # Optional. HCFS URIs of files to be placed in the working directory of each executor.
          "A String",
        ],
        "mainRFileUri": "A String", # Required. The HCFS URI of the main R file to use as the driver. Must be a .R or .r file.
      },
      "sparkSqlBatch": { # A configuration for running Apache Spark SQL (https://spark.apache.org/sql/) queries as a batch workload. # Optional. SparkSql batch config.
        "jarFileUris": [ # Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.
          "A String",
        ],
        "queryFileUri": "A String", # Required. The HCFS URI of the script that contains Spark SQL queries to execute.
        "queryVariables": { # Optional. Mapping of query variable names to values (equivalent to the Spark SQL command: SET name="value";).
          "a_key": "A String",
        },
      },
      "state": "A String", # Output only. The state of the batch.
      "stateHistory": [ # Output only. Historical state information for the batch.
        { # Historical state information.
          "state": "A String", # Output only. The state of the batch at this point in history.
          "stateMessage": "A String", # Output only. Details about the state at this point in history.
          "stateStartTime": "A String", # Output only. The time when the batch entered the historical state.
        },
      ],
      "stateMessage": "A String", # Output only. Batch state details, such as a failure description if the state is FAILED.
      "stateTime": "A String", # Output only. The time when the batch entered a current state.
      "uuid": "A String", # Output only. A batch UUID (Unique Universal Identifier). The service generates this value when it creates the batch.
    },
  ],
  "nextPageToken": "A String", # A token, which can be sent as page_token to retrieve the next page. If this field is omitted, there are no subsequent pages.
  "unreachable": [ # Output only. List of Batches that could not be included in the response. Attempting to get one of these resources may indicate why it was not included in the list response.
    "A String",
  ],
}
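
Because each response carries at most one page, a caller that wants every batch has to follow nextPageToken until it is omitted. The following is a minimal sketch of that loop, which also accumulates the unreachable entries. The names collect_batches and fetch_page are illustrative assumptions: fetch_page stands for any callable that takes a page token (or None for the first page) and returns one response dict of the form above. The fake pages exist only to make the sketch runnable without network access.

```python
def collect_batches(fetch_page):
    """Follow nextPageToken across pages.

    Returns a tuple (batches, unreachable) aggregated over all pages.
    """
    batches, unreachable = [], []
    token = None
    while True:
        response = fetch_page(token)
        batches.extend(response.get("batches", []))
        unreachable.extend(response.get("unreachable", []))
        token = response.get("nextPageToken")
        if not token:
            # An omitted token means there are no subsequent pages.
            return batches, unreachable


# Hypothetical two-page result set, for illustration only.
_FAKE_PAGES = {
    None: {"batches": [{"name": "b1"}], "nextPageToken": "t1"},
    "t1": {"batches": [{"name": "b2"}], "unreachable": ["projects/p/locations/r"]},
}


def _fake_fetch(token):
    return _FAKE_PAGES[token]


all_batches, skipped = collect_batches(_fake_fetch)
```

With the Python client, fetch_page would typically wrap a call such as list(parent=parent, pageToken=token).execute() on the batches resource; that wiring is left out here because it needs credentials and a real project.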
