# Use Spark SQL with Dataproc Metastore

Last updated (UTC): 2025-08-27.

This page shows you an example of using Spark SQL with a Dataproc Metastore
service. In this example, you launch a Spark SQL session on a Dataproc cluster
and run some sample commands to create a database and a table.

Before you begin
----------------

- Create a [Dataproc Metastore service](/dataproc-metastore/docs/create-service).
- Attach the [Dataproc Metastore service to a Dataproc cluster](/dataproc-metastore/docs/attach-dataproc).

Connect to Spark SQL
--------------------

To start using Spark SQL, use SSH to connect to the Dataproc cluster that's
associated with your Dataproc Metastore service. After you connect to
the cluster with SSH, you can run Spark commands to manage your metadata.

**To connect to Spark SQL**

1. In the Google Cloud console, go to the [VM
   Instances](https://console.cloud.google.com/compute/instances) page.
2. In the list of virtual machine instances, click **SSH** in the row of the Dataproc VM instance that you want to connect to.

A browser window opens in your home directory on the node with output similar
to the following:

    Connected, host fingerprint: ssh-rsa ...
    Linux cluster-1-m 3.16.0-0.bpo.4-amd64 ...
    ...
    example-cluster@cluster-1-m:~$

To start the Spark shell and create a database and table, run the following commands in the SSH session:

1. Start the Spark shell.

        spark-shell

2. Create a database called `myDB`.

        spark.sql("create database myDB");

3. Use the database you created.

        spark.sql("use myDB");

4. Create a table called `myTable`.

        spark.sql("create table myTable(id int,name string)");

5. List the tables in `myDB`.

        spark.sql("show tables").show();

6. Describe the columns of the table you created.

        spark.sql("desc myTable").show();

Running these commands shows output similar to the following:

    $ spark-shell

    scala> spark.sql("create database myDB");

    scala> spark.sql("use myDB");

    scala> spark.sql("create table myTable(id int,name string)");

    scala> spark.sql("show tables").show();

    +--------+---------+-----------+
    |database|tableName|isTemporary|
    +--------+---------+-----------+
    |    myDB|  myTable|      false|
    +--------+---------+-----------+

    scala> spark.sql("desc myTable").show();

    +--------+---------+-------+
    |col_name|data_type|comment|
    +--------+---------+-------+
    |      id|      int|   null|
    |    name|   string|   null|
    +--------+---------+-------+

What's next
-----------

- [Import metadata](/dataproc-metastore/docs/import-metadata)
- [Export metadata](/dataproc-metastore/docs/export-metadata)
- [Use Apache Hive](/dataproc-metastore/docs/use-hive)
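If you want to run the same session non-interactively, one option is to save the commands in a Scala script and pass it to the Spark shell with its `-i` flag. The following is a minimal sketch, assuming you run it on a Dataproc cluster attached to your Dataproc Metastore service; the file name `metastore_demo.scala` is just an example, and the `if not exists` clauses are an addition that makes the script safe to rerun:

```scala
// metastore_demo.scala -- sketch of the session above, run with:
//   spark-shell -i metastore_demo.scala
// `spark` is the preconfigured SparkSession that spark-shell provides.

spark.sql("create database if not exists myDB")
spark.sql("use myDB")
spark.sql("create table if not exists myTable(id int, name string)")

// List tables in the current database and describe the new table.
spark.sql("show tables").show()
spark.sql("desc myTable").show()

// Exit the shell when running as a script.
System.exit(0)
```

Because the cluster is attached to Dataproc Metastore, the database and table definitions created this way are stored in the metastore service rather than in a cluster-local Hive metastore, so they remain visible to other clusters that share the same service.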