Create a Dataproc cluster by using the gcloud CLI
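This page shows you how to use the Google Cloud CLI gcloud command-line tool to create a Dataproc cluster, run an Apache Spark job in the cluster, and then modify the number of workers in the cluster.
A convenient way to run the gcloud command-line tool is from Cloud Shell (https://console.cloud.google.com/?cloudshell=true), which has the Google Cloud CLI pre-installed. Cloud Shell is free for Google Cloud customers. To use Cloud Shell, you need a Google Cloud project.
The same or similar tasks are also covered in the quickstarts that use the API Explorer, the Google Cloud console, and the client libraries.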
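Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project. If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing one, so that you can delete the project and all of its resources afterward.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Dataproc API.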
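Create a cluster
To create a cluster called example-cluster, run the following command:
gcloud dataproc clusters create example-cluster --region=REGION
The command output confirms cluster creation: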
Waiting for cluster creation operation...done.
Created [... example-cluster]
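Before submitting jobs, it can help to confirm that the new cluster has actually reached the RUNNING state. The following is a minimal sketch and not part of the original quickstart: the helper names `cluster_state` and `cluster_is_running` are hypothetical, and `us-central1` in the usage line is only an example region. It assumes `gcloud dataproc clusters describe` with a `--format` projection on the `status.state` field.

```shell
# Print the current state (for example, RUNNING) of a Dataproc cluster.
# cluster_state is a hypothetical helper name, not a gcloud command.
cluster_state() {
  local cluster="$1" region="$2"
  gcloud dataproc clusters describe "${cluster}" \
    --region="${region}" \
    --format="value(status.state)"
}

# Succeed only if the cluster reports the RUNNING state.
cluster_is_running() {
  [ "$(cluster_state "$1" "$2")" = "RUNNING" ]
}
```

For example, `cluster_is_running example-cluster us-central1 && echo "ready"` prints a message only when the cluster is ready to accept jobs.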
For information on selecting a region, see Available regions & zones.
To see a list of available regions, run the
gcloud compute regions list command.
To learn about regional endpoints, see
Regional endpoints.
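Every command on this page passes `--region=REGION` explicitly. As an alternative, a default Dataproc region can be stored in the gcloud configuration. The sketch below is an illustration under assumptions: the helper name `set_default_region` is hypothetical, and it relies on the `dataproc/region` property of `gcloud config set`.

```shell
# Store a default Dataproc region so later commands can omit --region.
# set_default_region is a hypothetical helper name.
set_default_region() {
  local region="$1"
  if [ -z "${region}" ]; then
    echo "usage: set_default_region REGION" >&2
    return 1
  fi
  gcloud config set dataproc/region "${region}"
}
```

A call such as `set_default_region us-central1` (an example region) would then apply to subsequent `gcloud dataproc` commands in the same configuration. Passing `--region` explicitly, as the rest of this page does, keeps each command self-describing.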
Submit a job
To submit a sample Spark job that calculates a rough value for pi, run the following command:
```
gcloud dataproc jobs submit spark --cluster example-cluster \
    --region=REGION \
    --class org.apache.spark.examples.SparkPi \
    --jars file:///usr/lib/spark/examples/jars/spark-examples.jar -- 1000
```

This command specifies the following:

- You want to run a [`spark`](/sdk/gcloud/reference/dataproc/jobs/submit/spark) job on the `example-cluster` cluster in the specified region
- The `class` containing the main method for the job's pi-calculating application
- The location of the jar file containing your job's code
- Any parameters you want to pass to the job; in this case, the number of tasks, which is `1000`

Parameters passed to the job must follow a double dash (`--`). For more information, see the [Google Cloud CLI documentation](/sdk/gcloud/reference/dataproc/jobs/submit/spark).

The job's running and final output is displayed in the terminal window:

```
Waiting for job output...
...
Pi is roughly 3.14118528
...
Job finished successfully.
```

Update a cluster

To change the number of workers in the cluster to five, run the following command:

```
gcloud dataproc clusters update example-cluster \
    --region=REGION \
    --num-workers 5
```

The command output displays your cluster's details. For example:

```
workerConfig:
...
  instanceNames:
  - example-cluster-w-0
  - example-cluster-w-1
  - example-cluster-w-2
  - example-cluster-w-3
  - example-cluster-w-4
  numInstances: 5
statusHistory:
...
- detail: Add 3 workers.
```

To decrease the number of worker nodes to the original value, use the same command:

```
gcloud dataproc clusters update example-cluster \
    --region=REGION \
    --num-workers 2
```

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

1. To delete your `example-cluster`, run the [`clusters delete`](/sdk/gcloud/reference/dataproc/clusters/delete) command:

   ```
   gcloud dataproc clusters delete example-cluster \
       --region=REGION
   ```

2. To confirm and complete the cluster deletion, press y and then press Enter when prompted.

What's next

- Learn how to [write and run a Spark Scala job](/dataproc/docs/tutorials/spark-scala).

Last updated 2025-08-22 UTC.
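The submit, update, and delete commands on this page can be wrapped in small shell functions so that the cluster name, region, and worker count become arguments. This is a sketch under stated assumptions, not an official script: the function names `submit_pi_job`, `resize_cluster`, and `delete_cluster` are hypothetical, and `delete_cluster` adds gcloud's global `--quiet` flag, which skips the interactive confirmation that the manual procedure asks for.

```shell
# Hypothetical helpers that wrap the quickstart commands.

# Submit the SparkPi example; the task count defaults to 1000.
submit_pi_job() {
  local cluster="$1" region="$2" tasks="${3:-1000}"
  gcloud dataproc jobs submit spark --cluster "${cluster}" \
    --region="${region}" \
    --class org.apache.spark.examples.SparkPi \
    --jars file:///usr/lib/spark/examples/jars/spark-examples.jar -- "${tasks}"
}

# Change the number of workers in an existing cluster.
resize_cluster() {
  local cluster="$1" region="$2" workers="$3"
  gcloud dataproc clusters update "${cluster}" \
    --region="${region}" \
    --num-workers "${workers}"
}

# Delete the cluster without prompting (assumption: --quiet is wanted here).
delete_cluster() {
  local cluster="$1" region="$2"
  gcloud dataproc clusters delete "${cluster}" \
    --region="${region}" --quiet
}
```

With an example region such as `us-central1`, a run might look like `submit_pi_job example-cluster us-central1`, then `resize_cluster example-cluster us-central1 5`, and finally `delete_cluster example-cluster us-central1`. Remove `--quiet` to keep the confirmation prompt when deleting.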