Dataproc Clusters

Google Cloud Dataproc lets you provision and manage clusters that run Apache Spark and Apache Hadoop.

About

The Dataproc Clusters source allows Toolbox to interact with Dataproc clusters hosted on Google Cloud.

Requirements

IAM Permissions

Dataproc uses Identity and Access Management (IAM) to control user and group access to Dataproc resources.

Toolbox uses your Application Default Credentials (ADC) to authenticate and authorize requests to Dataproc. The IAM identity associated with your ADC must have the permissions required for the actions you intend to perform. Common roles include roles/dataproc.viewer (read-only access) and roles/dataproc.editor (access to create, update, and delete resources). To set up ADC, follow the Google Cloud documentation on Application Default Credentials.
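Before starting Toolbox, it can help to confirm that an ADC file is present on the machine. The following is a minimal sketch, not part of Toolbox: the function name find_local_adc_file is hypothetical, and it checks only the two local file locations that ADC documents (the GOOGLE_APPLICATION_CREDENTIALS environment variable and the file written by gcloud auth application-default login). It does not detect credentials supplied by a metadata server, does not honor a custom gcloud configuration directory, and does not verify that the identity holds any Dataproc role.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def find_local_adc_file(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the local ADC credential file if one exists, else None.

    Hypothetical helper for illustration; it only inspects local files.
    """
    env = os.environ if env is None else env

    # 1. An explicit credential file named by the environment variable.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        path = Path(explicit)
        # Assumption: a variable that points at a missing file counts as "not found".
        return path if path.is_file() else None

    # 2. The well-known file created by `gcloud auth application-default login`.
    if sys.platform == "win32":
        base = Path(env.get("APPDATA", "")) / "gcloud"
    else:
        base = Path.home() / ".config" / "gcloud"
    well_known = base / "application_default_credentials.json"
    return well_known if well_known.is_file() else None


if __name__ == "__main__":
    found = find_local_adc_file()
    print(f"ADC file: {found}" if found else "No local ADC file found")
```

A result of None does not necessarily mean ADC is unavailable: on Compute Engine, GKE, or Cloud Run, credentials typically come from the attached service account rather than a local file.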

Example

kind: sources
name: my-dataproc-source
type: dataproc
project: my-project
region: us-central1

Reference

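| field   | type   | required | description                                    |
|---------|--------|----------|------------------------------------------------|
| type    | string | true     | Must be "dataproc".                            |
| project | string | true     | ID of the GCP project with Dataproc resources. |
| region  | string | true     | Region containing Dataproc resources.          |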
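The field rules above can be expressed as a small pre-flight check. This is a sketch under stated assumptions, not Toolbox's own validation: validate_dataproc_source is a hypothetical helper, and it takes the source definition as an already-parsed dictionary, so no YAML library is required.

```python
from typing import Any, Dict, List

# Fields the reference marks as required, all of type string.
REQUIRED_FIELDS = ("type", "project", "region")


def validate_dataproc_source(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems in a source config; the list is empty if none are found.

    Hypothetical helper for illustration; Toolbox performs its own validation.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        value = config.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"'{field}' is required and must be a non-empty string")

    # The reference states that type must be exactly "dataproc".
    source_type = config.get("type")
    if isinstance(source_type, str) and source_type.strip() and source_type != "dataproc":
        problems.append("'type' must be \"dataproc\"")
    return problems


# The example source from this page, as a parsed dictionary.
example = {
    "kind": "sources",
    "name": "my-dataproc-source",
    "type": "dataproc",
    "project": "my-project",
    "region": "us-central1",
}
print(validate_dataproc_source(example))  # prints []
```

Returning a list of problems instead of raising on the first one lets a caller report every misconfigured field in a single pass.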