# Use Spark SQL with Dataproc Metastore

This page shows you an example of using Spark SQL with a Dataproc Metastore service. In this example, you launch a Spark SQL session on a Dataproc cluster and run some sample commands to create a database and table.

Before you begin
----------------

- Create a [Dataproc Metastore service](/dataproc-metastore/docs/create-service).
- Attach the [Dataproc Metastore service to a Dataproc cluster](/dataproc-metastore/docs/attach-dataproc).

Connect to Spark SQL
--------------------

To start using Spark SQL, use SSH to connect to the Dataproc cluster that's associated with your Dataproc Metastore service. After you connect to the cluster with SSH, you can run Spark commands to manage your metadata.

**To connect to Spark SQL**

1. In the Google Cloud console, go to the [VM Instances](https://console.cloud.google.com/compute/instances) page.
2. In the list of virtual machine instances, click **SSH** in the row of the Dataproc VM instance that you want to connect to.

A browser window opens in your home directory on the node with an output similar to the following:

    Connected, host fingerprint: ssh-rsa ...
    Linux cluster-1-m 3.16.0-0.bpo.4-amd64 ...
    ...
    example-cluster@cluster-1-m:~$

To start the Spark shell and create a database and table, run the following commands in the SSH session:

1. Start the Spark shell.

        spark-shell

2. Create a database called `myDB`.

        spark.sql("create database myDB");

3. Use the database you created.

        spark.sql("use myDB");

4. Create a table called `myTable`.

        spark.sql("create table myTable(id int,name string)");

5. List the tables under `myDB`.

        spark.sql("show tables").show();

6. Describe the columns in the table you created.

        spark.sql("desc myTable").show();

Running these commands shows an output similar to the following:

    $ spark-shell

    scala> spark.sql("create database myDB");

    scala> spark.sql("use myDB");

    scala> spark.sql("create table myTable(id int,name string)");

    scala> spark.sql("show tables").show();

    +--------+---------+-----------+
    |database|tableName|isTemporary|
    +--------+---------+-----------+
    |    myDB|  myTable|      false|
    +--------+---------+-----------+

    scala> spark.sql("desc myTable").show();

    +--------+---------+-------+
    |col_name|data_type|comment|
    +--------+---------+-------+
    |      id|      int|   null|
    |    name|   string|   null|
    +--------+---------+-------+

What's next
-----------

- [Import metadata](/dataproc-metastore/docs/import-metadata)
- [Export metadata](/dataproc-metastore/docs/export-metadata)
- [Use Apache Hive](/dataproc-metastore/docs/use-hive)

Last updated 2025-09-04 UTC.
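If you prefer the command line to the console's **SSH** button, the connection and the Spark SQL steps above can also be scripted. The following is a minimal sketch, assuming a cluster whose master VM is named `cluster-1-m` in zone `us-central1-a` (substitute your own VM name and zone); the script file name is arbitrary:

```shell
# Connect to the cluster's master node over SSH with the gcloud CLI
# (an alternative to clicking SSH in the Google Cloud console).
gcloud compute ssh cluster-1-m --zone=us-central1-a

# Once connected, the interactive spark-shell steps can be run
# non-interactively by passing a Scala script with -i:
cat > metastore-example.scala <<'EOF'
spark.sql("create database if not exists myDB")
spark.sql("use myDB")
spark.sql("create table if not exists myTable(id int, name string)")
spark.sql("show tables").show()
EOF
spark-shell -i metastore-example.scala
```

Because the script uses `if not exists`, it is safe to rerun against the same Dataproc Metastore service.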