Serverless for Apache Spark overview

With Serverless for Apache Spark, you can run Spark workloads without having to provision and manage your own Dataproc cluster.
There are two ways to run Serverless for Apache Spark workloads: batch workloads and interactive sessions.
Batch workloads

Submit a batch workload to the Serverless for Apache Spark service using the Google Cloud console, Google Cloud CLI, or the Dataproc API. The service runs the workload on managed compute infrastructure, autoscaling resources as needed. Serverless for Apache Spark charges apply only to the time when the workload is running.
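For example, here is a minimal sketch of submitting a PySpark batch workload with the Dataproc Python client library; the project ID, region, and Cloud Storage script URI are placeholders, not values taken from this page:

```python
# Sketch: submit a Serverless for Apache Spark batch workload with the
# google-cloud-dataproc client library. Project, region, and script URI
# are illustrative placeholders.
from google.cloud import dataproc_v1

project_id = "my-project"                      # placeholder
region = "us-central1"                         # placeholder
script_uri = "gs://my-bucket/wordcount.py"     # placeholder PySpark script

# Use the regional Dataproc endpoint for the chosen region.
client = dataproc_v1.BatchControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

batch = dataproc_v1.Batch(
    pyspark_batch=dataproc_v1.PySparkBatch(main_python_file_uri=script_uri)
)

# create_batch returns a long-running operation; result() waits for the
# workload to finish and returns the final Batch resource.
operation = client.create_batch(
    parent=f"projects/{project_id}/locations/{region}",
    batch=batch,
)
print(f"Batch finished with state: {operation.result().state.name}")
```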
Interactive sessions

Write and run code in Jupyter notebooks during a Serverless for Apache Spark interactive session. You can create a notebook session in the following ways:

Run PySpark code in BigQuery Studio notebooks. Use the BigQuery Python notebook to create a Spark-Connect-based Serverless for Apache Spark interactive session. Each BigQuery notebook can have only one active Serverless for Apache Spark session associated with it.
Use the Dataproc JupyterLab plugin
to create multiple Jupyter notebook sessions from templates that you create
and manage. When you install the plugin on a local machine or a Compute Engine VM, different cards corresponding to different Spark kernel configurations appear on the JupyterLab launcher page. Click a card to create a Serverless for Apache Spark notebook session,
then start writing and testing your code in the notebook (a short example follows this section).
The Dataproc JupyterLab plugin also lets you
use the JupyterLab launcher page to take the following actions:
Create Dataproc on Compute Engine clusters.
Submit jobs to Dataproc on Compute Engine clusters.
View Google Cloud and Spark logs.
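As an illustration of what you might run in such a notebook session, the code is ordinary PySpark; the DataFrame below is made-up sample data, and in a plugin-created session the Spark session is typically already provided by the kernel:

```python
# Illustrative PySpark cell for a Serverless for Apache Spark notebook
# session; the data is made up for demonstration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("notebook-demo").getOrCreate()

df = spark.createDataFrame(
    [("alice", 3), ("bob", 5), ("alice", 7)],
    ["name", "score"],
)
df.groupBy("name").sum("score").show()
```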
Serverless for Apache Spark compared to Dataproc on Compute Engine
If you want to provision and manage infrastructure, and then run workloads on Spark and other open source processing frameworks, use Dataproc on Compute Engine.
The following table lists key differences between Dataproc on
Compute Engine and Serverless for Apache Spark.
[[["Fácil de entender","easyToUnderstand","thumb-up"],["Meu problema foi resolvido","solvedMyProblem","thumb-up"],["Outro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Informações incorretas ou exemplo de código","incorrectInformationOrSampleCode","thumb-down"],["Não contém as informações/amostras de que eu preciso","missingTheInformationSamplesINeed","thumb-down"],["Problema na tradução","translationIssue","thumb-down"],["Outro","otherDown","thumb-down"]],["Última atualização 2025-08-25 UTC."],[[["\u003cp\u003eDataproc Serverless allows the execution of Spark workloads without the need to provision and manage a Dataproc cluster, offering two methods: Spark Batch and Spark Interactive.\u003c/p\u003e\n"],["\u003cp\u003eDataproc Serverless for Spark Batch allows users to submit batch workloads via the Google Cloud console, CLI, or API, with the service managing resource scaling and only charging for active workload execution time.\u003c/p\u003e\n"],["\u003cp\u003eDataproc Serverless for Spark Interactive enables the writing and running of code within Jupyter notebooks, accessible through the Dataproc JupyterLab plugin, which also provides functionalities for creating and managing Dataproc on Compute Engine clusters.\u003c/p\u003e\n"],["\u003cp\u003eCompared to Dataproc on Compute Engine, Dataproc Serverless for Spark provides serverless capabilities, faster startup times, and interactive sessions, while Compute Engine offers greater infrastructure control and supports other open-source frameworks.\u003c/p\u003e\n"],["\u003cp\u003eDataproc Serverless adheres to data residency, CMEK, and VPC-SC security requirements and supports various Spark batch workload types including PySpark, Spark SQL, Spark R, and Spark (Java or Scala).\u003c/p\u003e\n"]]],[],null,["# Serverless for Apache Spark overview\n\n| **Dataproc Serverless** is now **Google Cloud Serverless for Apache Spark**. Until updated, some documents will refer to the previous name.\n\n\u003cbr /\u003e\n\nServerless for Apache Spark lets you run Spark workloads without requiring you\nto provision and manage your own Dataproc cluster.\nThere are two ways to run Serverless for Apache Spark workloads:\n\n- [Batch workloads](#spark-batch)\n- [Interactive sessions](#spark-interactive)\n\nBatch workloads\n---------------\n\nSubmit a batch workload to the Serverless for Apache Spark service using the\nGoogle Cloud console, Google Cloud CLI, or Dataproc API. The service\nruns the workload on a managed compute infrastructure, autoscaling resources\nas needed. [Serverless for Apache Spark charges](/dataproc-serverless/pricing) apply\nonly to the time when the workload is executing.\n\nTo get started, see\n[Run an Apache Spark batch workload](/dataproc-serverless/docs/quickstarts/spark-batch).\n| You can schedule a Spark batch workload as part of an [Airflow](https://airflow.apache.org/) or [Cloud Composer](/composer) workflow using an [Airflow batch operator](https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/cloud/dataproc.html#create-a-batch). See [Run Serverless for Apache Spark workloads with Cloud Composer](/composer/docs/composer-2/run-dataproc-workloads) for more information.\n\nInteractive sessions\n--------------------\n\nWrite and run code in Jupyter notebooks during a Serverless for Apache Spark for\nSpark interactive session. 
You can create a notebook session in the following\nways:\n\n- [Run PySpark code in BigQuery Studio notebooks](/bigquery/docs/use-spark).\n Use the BigQuery Python notebook to create a\n [Spark-Connect-based](https://spark.apache.org/docs/latest/spark-connect-overview.html)\n Serverless for Apache Spark interactive session. Each BigQuery\n notebook can have only one active Serverless for Apache Spark session associated\n with it.\n\n- [Use the Dataproc JupyterLab plugin](/dataproc-serverless/docs/quickstarts/jupyterlab-sessions)\n to create multiple Jupyter notebook sessions from templates that you create\n and manage. When you install the plugin on a local machine or Compute Engine\n VM, different cards that correspond to different Spark kernel configurations\n appear on the JupyterLab launcher page. Click a card to create a Serverless for Apache Spark\n notebook session, then start writing and testing your code in the notebook.\n\n The Dataproc JupyterLab plugin also lets you\n use the JupyterLab launcher page to take the following actions:\n - Create Dataproc on Compute Engine clusters.\n - Submit jobs to Dataproc on Compute Engine clusters.\n - View Google Cloud and Spark logs.\n\nServerless for Apache Spark compared to Dataproc on Compute Engine\n------------------------------------------------------------------\n\nIf you want to provision and manage infrastructure, and then execute\nworkloads on Spark and other open source processing frameworks, use\n[Dataproc on Compute Engine](/dataproc/docs).\nThe following table lists key differences between the Dataproc on\nCompute Engine and Serverless for Apache Spark.\n\nSecurity compliance\n-------------------\n\nServerless for Apache Spark adheres to all [data residency](/terms/data-residency),\n[CMEK](/dataproc-serverless/docs/guides/cmek-serverless),\n[VPC-SC](/dataproc-serverless/docs/concepts/network#s8s-and-vpc-sc-networks),\nand other security requirements that Dataproc is compliant with.\n\nBatch workload capabilities\n---------------------------\n\nYou can run the following Serverless for Apache Spark batch workload types:\n\n- PySpark\n- Spark SQL\n- Spark R\n- Spark (Java or Scala)\n\nYou can specify [Spark properties](/dataproc-serverless/docs/concepts/properties)\nwhen you submit a Serverless for Apache Spark batch workload."]]
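As a hedged sketch of that last point, Spark properties can be attached to a batch through its runtime configuration when submitting with the Python client library; the property names and values here are illustrative only, and the script URI is a placeholder:

```python
# Sketch: attach Spark properties to a batch via RuntimeConfig.properties.
# The property values are illustrative; consult the Spark properties guide
# for the settings Serverless for Apache Spark supports.
from google.cloud import dataproc_v1

batch = dataproc_v1.Batch(
    pyspark_batch=dataproc_v1.PySparkBatch(
        main_python_file_uri="gs://my-bucket/job.py"  # placeholder
    ),
    runtime_config=dataproc_v1.RuntimeConfig(
        properties={
            "spark.executor.cores": "4",
            "spark.dynamicAllocation.maxExecutors": "10",
        }
    ),
)
# Submit the batch with BatchControllerClient.create_batch as shown earlier.
```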