Dataproc Clusters
Google Cloud Dataproc Clusters lets you provision and manage Apache Spark and Hadoop clusters.
About
The Dataproc Clusters source allows Toolbox to interact with Dataproc Clusters hosted on Google Cloud.
Available Tools
| tool | description |
|---|---|
| dataproc-get-cluster | Get a specific Dataproc cluster. |
| dataproc-list-clusters | List and filter Dataproc clusters. |
| dataproc-get-job | Get a specific Dataproc job. |
| dataproc-list-jobs | List and filter Dataproc jobs. |
Requirements
IAM Permissions
Dataproc uses Identity and Access Management (IAM) to control user and group access to Dataproc resources.
Toolbox will use your Application Default Credentials (ADC) to authorize and authenticate when interacting with Dataproc. When using this method, you need to ensure the IAM identity associated with your ADC has the correct permissions for the actions you intend to perform. Common roles include roles/dataproc.editor or roles/dataproc.viewer. Follow this guide to set up your ADC.
Example
kind: sources
name: my-dataproc-source
type: dataproc
project: my-project
region: us-central1
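As a rough sketch of how one of the available tools might be attached to this source, the fragment below assumes that tool entries follow the same kind/name/type layout as the source entry and point back to it through a `source` field. The tool name `list-my-clusters` and the description text are hypothetical, and the field set is an assumption, so check it against the tool's own reference page.

```yaml
# Hypothetical tool entry; the field names here are assumed, not confirmed by this page.
kind: tools
name: list-my-clusters            # illustrative name, choose your own
type: dataproc-list-clusters      # one of the tools listed under Available Tools
source: my-dataproc-source        # must match the source name defined above
description: List Dataproc clusters in the configured project and region.
```

Since the project and region are set on the source, a tool defined this way would presumably not need to repeat them.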
Reference
| field | type | required | description |
|---|---|---|---|
| type | string | true | Must be "dataproc". |
| project | string | true | ID of the GCP project with Dataproc resources. |
| region | string | true | Region containing Dataproc resources. |