Create a Dataproc cluster by using the Google Cloud console
This page shows you how to use the Google Cloud console to create a Dataproc cluster, run a basic Apache Spark job in the cluster, and then modify the number of workers in the cluster.
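To follow step-by-step guidance for this task directly in the Google Cloud console, open the Guide me walkthrough:
https://console.cloud.google.com/freetrial?redirectPath=/?walkthrough_id=dataproc--quickstart-dataproc-console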
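Before you begin
Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.
Verify that billing is enabled for your Google Cloud project.
Enable the Dataproc API.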
Create a cluster
In the Google Cloud console, go to the Dataproc Clusters page.
Click Create cluster.
In the Create Dataproc cluster dialog, click Create in the Cluster on Compute Engine row.
In the Cluster name field, enter example-cluster.
In the Region and Zone lists, select a region and zone.
The region you select (for example, us-east1 or europe-west1) isolates the resources that Dataproc uses, such as virtual machine (VM) instances and the Cloud Storage and metadata storage locations, within that region. For more information, see Available regions and zones and Regional endpoints.
For all the other options, use the default settings.
To create the cluster, click Create.
Your new cluster appears in a list on the Clusters page. The status is Provisioning until the cluster is ready to use, and then the status changes to Running. Provisioning the cluster might take a couple of minutes.
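For readers who later want to script the same setup, the console choices above can be written down as plain data. The following is a minimal sketch, not the console's own implementation: the field names (`project_id`, `cluster_name`, `config`) and the regional endpoint pattern follow the Dataproc API as commonly documented, and `my-project` is a placeholder. Check both against the current API reference before relying on them.

```python
def regional_endpoint(region: str) -> str:
    """Return the regional Dataproc API endpoint for a region such as us-east1.

    Assumption: endpoints follow the "<region>-dataproc.googleapis.com" pattern.
    """
    return f"{region}-dataproc.googleapis.com:443"


def build_cluster(project_id: str, cluster_name: str) -> dict:
    """Describe a cluster that keeps every optional setting at its default.

    The empty "config" mirrors the quickstart step "use the default settings".
    """
    return {
        "project_id": project_id,
        "cluster_name": cluster_name,
        "config": {},
    }


# "my-project" is a hypothetical project ID used only for illustration.
cluster = build_cluster("my-project", "example-cluster")
print(regional_endpoint("us-east1"))
print(cluster)
```

The sketch only builds the data. Sending it to Dataproc would require an authenticated client, which is outside the scope of this page.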
Submit a Spark job
Submit a Spark job that estimates a value of Pi:
In the Dataproc navigation menu, click Jobs.
On the Jobs page, click Submit job, and then do the following:
In the Job ID field, use the default setting, or provide an ID that is unique to your Google Cloud project.
In the Cluster drop-down, select example-cluster.
For Job type, select Spark.
In the Main class or jar field, enter org.apache.spark.examples.SparkPi.
In the Jar files field, enter file:///usr/lib/spark/examples/jars/spark-examples.jar.
In the Arguments field, enter 1000 to set the number of tasks.
Note: The Spark job estimates Pi by using the Monte Carlo method. It generates x and y points on a coordinate plane that models a circle enclosed by a unit square. The more x-y pairs it generates, the greater the accuracy of the estimation. This estimation uses Dataproc worker nodes to parallelize the computation.
Click Submit.
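The form fields above correspond to a small job description. As a rough sketch, assuming the snake_case field names of the Dataproc Job resource (`placement`, `spark_job`, `main_class`, `jar_file_uris`, `args`), the same submission could be expressed as a Python dictionary. The helper name `build_spark_pi_job` is hypothetical and exists only for this illustration.

```python
SPARK_EXAMPLES_JAR = "file:///usr/lib/spark/examples/jars/spark-examples.jar"


def build_spark_pi_job(cluster_name: str, tasks: int = 1000) -> dict:
    """Build a job description that matches the console form in this section."""
    return {
        # Cluster drop-down
        "placement": {"cluster_name": cluster_name},
        # Job type: Spark
        "spark_job": {
            # Main class or jar
            "main_class": "org.apache.spark.examples.SparkPi",
            # Jar files
            "jar_file_uris": [SPARK_EXAMPLES_JAR],
            # Arguments are passed to the job as strings
            "args": [str(tasks)],
        },
    }


job = build_spark_pi_job("example-cluster")
print(job)
```

Leaving the job ID out mirrors the "use the default setting" option in the console form.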
Your job is displayed on the Job details page. The job status is Starting or Running while the job runs, and then it changes to Succeeded after the job finishes.
To avoid scrolling in the output, click Line wrap: off. The output is similar to the following:
Pi is roughly 3.1416759514167594
To view job details, click the Configuration tab.
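The "Pi is roughly" line in the job output comes from a random-sampling (Monte Carlo) estimate: points are drawn uniformly in a square, and the fraction that falls inside the inscribed circle approximates Pi/4. The sketch below is a single-process Python stand-in for that idea, not the SparkPi source. The sample count and seed are arbitrary choices for illustration.

```python
import random


def estimate_pi(samples: int, seed: int = 0) -> float:
    """Estimate Pi by sampling points uniformly in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    inside = 0
    for _ in range(samples):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # Count the point if it lies inside the circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # Area ratio circle/square is Pi/4, so scale the hit fraction by 4.
    return 4.0 * inside / samples


print(f"Pi is roughly {estimate_pi(200_000)}")
```

On the cluster, the same sampling is split across worker nodes, which is why a larger argument gives a more accurate estimate without a proportional increase in wall-clock time.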
Update a cluster
Update your cluster by changing the number of worker instances:
In the Dataproc navigation menu, click Clusters.
In the list of clusters, click example-cluster.
On the Cluster details page, click the Configuration tab.
Your cluster settings are displayed.
Click Edit.
In the Worker nodes field, enter 5.
Click Save.
Your cluster is now updated. To decrease the number of worker nodes to the original value, follow the same procedure.
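The edit above changes a single setting, the worker count. As a hedged sketch, an equivalent update request names only that field in an update mask so that nothing else on the cluster is touched. The field path `config.worker_config.num_instances` and the request layout are assumptions based on the Dataproc API's update-cluster request. The helper name and `my-project` are hypothetical.

```python
def build_worker_resize(project_id: str, region: str,
                        cluster_name: str, num_workers: int) -> dict:
    """Build an update request that changes only the number of worker nodes."""
    return {
        "project_id": project_id,
        "region": region,
        "cluster_name": cluster_name,
        # New value for the one field being changed.
        "cluster": {
            "cluster_name": cluster_name,
            "config": {"worker_config": {"num_instances": num_workers}},
        },
        # The mask restricts the update to the worker count.
        "update_mask": {"paths": ["config.worker_config.num_instances"]},
    }


# Matches the console step "In the Worker nodes field, enter 5".
resize = build_worker_resize("my-project", "us-east1", "example-cluster", 5)
print(resize)
```

Scaling back down uses the same request shape with the original worker count.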
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
To delete the cluster, on the Cluster details page for example-cluster, click Delete.
To confirm that you want to delete the cluster, click Delete.
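What's next
Try this quickstart by using other tools:
Use the API Explorer (/dataproc/docs/quickstarts/create-cluster-template).
Use the Google Cloud CLI (/dataproc/docs/quickstarts/create-cluster-gcloud).
Learn how to create robust firewall rules when you create a project (/dataproc/docs/concepts/configuring-clusters/network).
Learn how to write and run a Spark Scala job (/dataproc/docs/tutorials/spark-scala).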