serverless-spark-create-pyspark-batch

A “serverless-spark-create-pyspark-batch” tool submits a PySpark batch workload to run asynchronously.

About

A serverless-spark-create-pyspark-batch tool submits a PySpark batch workload to a Google Cloud Serverless for Apache Spark source. The workload runs asynchronously and typically takes around a minute to start; its status can be polled with the get batch tool.

It’s compatible with the following sources:

  • serverless-spark

serverless-spark-create-pyspark-batch accepts the following parameters:

  • mainFile: The path to the main Python file, as a gs://… URI.
  • args: Optional. A list of arguments passed to the main file.
  • version: Optional. The Serverless runtime version to execute with.
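For illustration, an invocation of this tool might supply parameters like the following (the bucket path, arguments, and runtime version here are hypothetical, not values from this page):

```json
{
  "mainFile": "gs://my-bucket/jobs/wordcount.py",
  "args": ["--input", "gs://my-bucket/data/"],
  "version": "2.2"
}
```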

Custom Configuration

This tool supports custom runtimeConfig and environmentConfig settings, which can be specified in a tools.yaml file. These configurations are parsed as YAML and passed to the Dataproc API.

Note: If your project requires custom runtime or environment configuration, you must write a custom tools.yaml; you cannot use the serverless-spark prebuilt config.

Example tools.yaml

tools:
  - name: "serverless-spark-create-pyspark-batch"
    kind: "serverless-spark-create-pyspark-batch"
    source: "my-serverless-spark-source"
    runtimeConfig:
      properties:
        spark.driver.memory: "1024m"
    environmentConfig:
      executionConfig:
        networkUri: "my-network"

Response Format

The response is an operation metadata JSON object corresponding to the batch operation metadata. Example:

{
  "batch": "projects/myproject/locations/us-central1/batches/aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "batchUuid": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "createTime": "2025-11-19T16:36:47.607119Z",
  "description": "Batch",
  "labels": {
    "goog-dataproc-batch-uuid": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    "goog-dataproc-location": "us-central1"
  },
  "operationType": "BATCH",
  "warnings": [
    "No runtime version specified. Using the default runtime version."
  ]
}
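Since the workload runs asynchronously, a client typically extracts the batch ID from this metadata and uses it to poll for status with the get batch tool. A minimal sketch of that extraction, using the example response above (the parsing approach is an assumption, not part of the tool itself):

```python
import json

# Hypothetical response returned by serverless-spark-create-pyspark-batch,
# abbreviated to the fields needed here (same shape as the example above).
response = """
{
  "batch": "projects/myproject/locations/us-central1/batches/aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "batchUuid": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "operationType": "BATCH"
}
"""

metadata = json.loads(response)

# The "batch" field is a full resource name; its last path segment is the
# batch ID to pass when polling for completion.
batch_id = metadata["batch"].rsplit("/", 1)[-1]
print(batch_id)  # aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
```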

Reference

field              type      required  description
kind               string    true      Must be “serverless-spark-create-pyspark-batch”.
source             string    true      Name of the source the tool should use.
description        string    false     Description of the tool that is passed to the LLM.
runtimeConfig      map       false     Runtime config for all batches created with this tool.
environmentConfig  map       false     Environment config for all batches created with this tool.
authRequired       string[]  false     List of auth services required to invoke this tool.
Last modified December 4, 2025: chore(main): release 0.22.0 (#1997) (cb4529c)