Create a Dataproc cluster by using the gcloud CLI
=================================================

This page shows you how to use the Google Cloud CLI [gcloud](/sdk/gcloud/reference/dataproc) command-line tool to create a Dataproc cluster, run an [Apache Spark](http://spark.apache.org/) job in the cluster, and then modify the number of workers in the cluster.

A convenient way to run the `gcloud` command-line tool is from [Cloud Shell](https://console.cloud.google.com/?cloudshell=true), which has the Google Cloud CLI pre-installed. Cloud Shell is free for Google Cloud customers. To use Cloud Shell, you need a Google Cloud project.

You can find out how to do the same or similar tasks with [Quickstarts Using the API Explorer](/dataproc/docs/quickstarts/create-cluster-template), with the Google Cloud console in [Create a Dataproc cluster by using the Google Cloud console](/dataproc/docs/quickstarts/create-cluster-console), and with the client libraries in [Create a Dataproc cluster by using client libraries](/dataproc/docs/quickstarts/create-cluster-client-libraries).
Before you begin
----------------

- Sign in to your Google Cloud account. If you're new to Google Cloud, [create an account](https://console.cloud.google.com/freetrial) to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the [project selector page](https://console.cloud.google.com/projectselector2/home/dashboard), select or create a Google Cloud project.

  **Note**: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

- [Verify that billing is enabled for your Google Cloud project](/billing/docs/how-to/verify-billing-enabled#confirm_billing_is_enabled_on_a_project).
- [Enable the Dataproc API](https://console.cloud.google.com/flows/enableapi?apiid=dataproc&redirect=https://console.cloud.google.com).
Create a cluster
----------------

To create a cluster called `example-cluster`, run the following command:

```
gcloud dataproc clusters create example-cluster --region=REGION
```

The command output confirms cluster creation:

```
Waiting for cluster creation operation...done.
Created [... example-cluster]
```
For information on selecting a region, see [Available regions & zones](/compute/docs/regions-zones/regions-zones#available). To see a list of available regions, you can run the `gcloud compute regions list` command. To learn about regional endpoints, see [Regional endpoints](/dataproc/docs/concepts/regional-endpoints).
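If you prefer to script cluster creation instead of running `gcloud` by hand, the client-library quickstart linked above covers the same flow in detail. As a rough illustration only, a minimal sketch using the `google-cloud-dataproc` Python package (the `PROJECT_ID` and `REGION` values and the machine shapes are placeholder assumptions, not values prescribed by this page) might look like this:

```
# Minimal sketch: create a Dataproc cluster with the Python client library.
# Assumes `pip install google-cloud-dataproc` and application default credentials.
from google.cloud import dataproc_v1

PROJECT_ID = "your-project-id"  # placeholder
REGION = "us-central1"          # placeholder; match the --region you would pass to gcloud

# The client must target the regional endpoint for the chosen region.
client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
)

cluster = {
    "project_id": PROJECT_ID,
    "cluster_name": "example-cluster",
    # A small example shape; gcloud applies its own defaults when none are given.
    "config": {
        "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-2"},
        "worker_config": {"num_instances": 2, "machine_type_uri": "n1-standard-2"},
    },
}

# create_cluster returns a long-running operation; result() blocks until it completes.
operation = client.create_cluster(
    request={"project_id": PROJECT_ID, "region": REGION, "cluster": cluster}
)
result = operation.result()
print(f"Cluster created: {result.cluster_name}")
```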
Submit a job
------------

To submit a sample Spark job that calculates a rough value for `pi`, run the following command:

```
gcloud dataproc jobs submit spark --cluster example-cluster \
    --region=REGION \
    --class org.apache.spark.examples.SparkPi \
    --jars file:///usr/lib/spark/examples/jars/spark-examples.jar -- 1000
```

This command specifies the following:

- You want to run a [`spark`](/sdk/gcloud/reference/dataproc/jobs/submit/spark) job on the `example-cluster` cluster in the specified region.
- The `class` containing the main method for the job's pi-calculating application.
- The location of the jar file containing your job's code.
- Any parameters you want to pass to the job, in this case the number of tasks, which is `1000`.

**Note**: Parameters passed to the job must follow a double dash (`--`). For more information, see the [Google Cloud CLI documentation](/sdk/gcloud/reference/dataproc/jobs/submit/spark).

The job's running and final output is displayed in the terminal window:

```
Waiting for job output...
...
Pi is roughly 3.14118528
...
Job finished successfully.
```
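The same submission can also be scripted. A minimal sketch, again assuming the `google-cloud-dataproc` Python package and the placeholder `PROJECT_ID`/`REGION` values from the earlier example:

```
# Minimal sketch: submit the SparkPi example job with the Python client library.
from google.cloud import dataproc_v1

PROJECT_ID = "your-project-id"  # placeholder
REGION = "us-central1"          # placeholder

job_client = dataproc_v1.JobControllerClient(
    client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
)

job = {
    "placement": {"cluster_name": "example-cluster"},
    "spark_job": {
        "main_class": "org.apache.spark.examples.SparkPi",
        "jar_file_uris": ["file:///usr/lib/spark/examples/jars/spark-examples.jar"],
        "args": ["1000"],  # the arguments that follow the double dash in the gcloud command
    },
}

# submit_job_as_operation returns a long-running operation that completes when
# the job finishes; the result includes where the driver output was written.
operation = job_client.submit_job_as_operation(
    request={"project_id": PROJECT_ID, "region": REGION, "job": job}
)
response = operation.result()
print(f"Job finished. Driver output: {response.driver_output_resource_uri}")
```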
[[["Leicht verständlich","easyToUnderstand","thumb-up"],["Mein Problem wurde gelöst","solvedMyProblem","thumb-up"],["Sonstiges","otherUp","thumb-up"]],[["Schwer verständlich","hardToUnderstand","thumb-down"],["Informationen oder Beispielcode falsch","incorrectInformationOrSampleCode","thumb-down"],["Benötigte Informationen/Beispiele nicht gefunden","missingTheInformationSamplesINeed","thumb-down"],["Problem mit der Übersetzung","translationIssue","thumb-down"],["Sonstiges","otherDown","thumb-down"]],["Zuletzt aktualisiert: 2025-08-22 (UTC)."],[[["\u003cp\u003eThis guide demonstrates how to create a Dataproc cluster using the \u003ccode\u003egcloud\u003c/code\u003e command-line tool.\u003c/p\u003e\n"],["\u003cp\u003eYou can use the \u003ccode\u003egcloud\u003c/code\u003e command to submit an Apache Spark job to a cluster to execute code, such as a sample job that calculates the value of \u003ccode\u003epi\u003c/code\u003e.\u003c/p\u003e\n"],["\u003cp\u003eThe number of workers within an existing Dataproc cluster can be adjusted with the \u003ccode\u003egcloud\u003c/code\u003e update command.\u003c/p\u003e\n"],["\u003cp\u003eAfter you are finished with your Dataproc cluster, it can be deleted using the \u003ccode\u003egcloud\u003c/code\u003e command to prevent continued resource usage charges.\u003c/p\u003e\n"]]],[],null,["# Quickstart: Create a Dataproc cluster by using the gcloud CLI\n\nCreate a Dataproc cluster by using the gcloud CLI\n=================================================\n\nThis page shows you how to use the Google Cloud CLI\n[gcloud](/sdk/gcloud/reference/dataproc) command-line tool to create a\nDataproc cluster, run a [Apache Spark](http://spark.apache.org/) job\nin the cluster, then modify the number of workers in the cluster.\n| A convenient way to run the `gcloud` command-line tool is from [Cloud Shell](https://console.cloud.google.com/?cloudshell=true), which has the Google Cloud CLI pre-installed. Cloud Shell is free for Google Cloud customers. To use Cloud Shell, you need a Google Cloud project.\n\nYou can find out how to do the same or similar tasks with\n[Quickstarts Using the API Explorer](/dataproc/docs/quickstarts/create-cluster-template),\nthe Google Cloud console in\n[Create a Dataproc cluster by using the Google Cloud console](/dataproc/docs/quickstarts/create-cluster-console),\nand using the client libraries in\n[Create a Dataproc cluster by using client libraries](/dataproc/docs/quickstarts/create-cluster-client-libraries).\n\nBefore you begin\n----------------\n\n- Sign in to your Google Cloud account. If you're new to Google Cloud, [create an account](https://console.cloud.google.com/freetrial) to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.\n- In the Google Cloud console, on the project selector page,\n select or create a Google Cloud project.\n\n | **Note**: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. 
Update a cluster
----------------

To change the number of workers in the cluster to five, run the following command:

```
gcloud dataproc clusters update example-cluster \
    --region=REGION \
    --num-workers 5
```

The command output displays your cluster's details. For example:

```
workerConfig:
...
  instanceNames:
  - example-cluster-w-0
  - example-cluster-w-1
  - example-cluster-w-2
  - example-cluster-w-3
  - example-cluster-w-4
  numInstances: 5
statusHistory:
...
- detail: Add 3 workers.
```

To decrease the number of worker nodes to the original value, use the same command:

```
gcloud dataproc clusters update example-cluster \
    --region=REGION \
    --num-workers 2
```
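Scaling can be scripted as well. A minimal sketch under the same assumptions as the earlier Python examples; the update mask path shown here is the field Dataproc uses for the primary worker count:

```
# Minimal sketch: change the number of primary workers with the Python client library.
from google.cloud import dataproc_v1

PROJECT_ID = "your-project-id"  # placeholder
REGION = "us-central1"          # placeholder

client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
)

operation = client.update_cluster(
    request={
        "project_id": PROJECT_ID,
        "region": REGION,
        "cluster_name": "example-cluster",
        # Only the fields named in the update mask are changed.
        "cluster": {"config": {"worker_config": {"num_instances": 5}}},
        "update_mask": {"paths": ["config.worker_config.num_instances"]},
    }
)
updated = operation.result()
print(f"Workers: {updated.config.worker_config.num_instances}")
```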
Clean up
--------

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

1. To delete your `example-cluster`, run the [`clusters delete`](/sdk/gcloud/reference/dataproc/clusters/delete) command (a client-library equivalent is sketched at the end of this page):

   ```
   gcloud dataproc clusters delete example-cluster \
       --region=REGION
   ```

2. To confirm and complete the cluster deletion, press `y` and then press `Enter` when prompted.

What's next
-----------

- Learn how to [write and run a Spark Scala job](/dataproc/docs/tutorials/spark-scala).
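For completeness, the cluster deletion from the Clean up section can also be done programmatically. A minimal sketch, under the same placeholder assumptions as the earlier Python examples:

```
# Minimal sketch: delete the cluster with the Python client library.
from google.cloud import dataproc_v1

PROJECT_ID = "your-project-id"  # placeholder
REGION = "us-central1"          # placeholder

client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
)

# delete_cluster also returns a long-running operation; wait for it to finish.
operation = client.delete_cluster(
    request={
        "project_id": PROJECT_ID,
        "region": REGION,
        "cluster_name": "example-cluster",
    }
)
operation.result()
print("Cluster deleted.")
```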