Use BigLake metastore with Spark stored procedures
==================================================

*Last updated 2025-08-17 UTC.*

This document explains how to use [Apache Spark stored procedures](/bigquery/docs/spark-procedures) with
BigLake metastore.

Before you begin
----------------

1. Enable billing for your Google Cloud project. Learn how to [check if billing is enabled on a project](/billing/docs/how-to/verify-billing-enabled).
2. Enable the BigQuery and Dataproc APIs.

   [Enable the APIs](https://console.cloud.google.com/flows/enableapi?apiid=bigquery.googleapis.com,dataproc.googleapis.com)
3. Optional: Learn more about the following:

   - Understand how BigLake metastore works and why you should use it.
   - Learn how [BigQuery Spark stored procedures work](/bigquery/docs/spark-procedures) and complete the before-you-begin tasks.

### Required roles

To use Spark stored procedures, review the required roles for [stored procedures](/bigquery/docs/spark-procedures#required_roles)
and grant the necessary roles.

To get the permissions that you need to use Spark and stored procedures with BigLake metastore as a metadata store, ask your administrator to grant you the following IAM roles:

- Create BigLake metastore tables in Spark:
  - [BigQuery Data Editor](/iam/docs/roles-permissions/bigquery#bigquery.dataEditor) (`roles/bigquery.dataEditor`) on the Spark Connection service account in the project
  - [Storage Object Admin](/iam/docs/roles-permissions/storage#storage.objectAdmin) (`roles/storage.objectAdmin`) on the Spark Connection service account in the project
- Query BigLake metastore tables in BigQuery:
  - [BigQuery Data Viewer](/iam/docs/roles-permissions/bigquery#bigquery.dataViewer) (`roles/bigquery.dataViewer`) on the project
  - [BigQuery User](/iam/docs/roles-permissions/bigquery#bigquery.user) (`roles/bigquery.user`) on the project
  - [Storage Object Viewer](/iam/docs/roles-permissions/storage#storage.objectViewer) (`roles/storage.objectViewer`) on the project

For more information about granting roles, see [Manage access to projects, folders, and organizations](/iam/docs/granting-changing-revoking-access).

You might also be able to get the required permissions through [custom roles](/iam/docs/creating-custom-roles) or other [predefined roles](/iam/docs/roles-overview#predefined).
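As a command-line sketch of the first set of grants, assuming the gcloud and bq CLIs are installed and authenticated and that a Spark connection already exists, you can bind the table-creation roles to the connection's service account. `CONNECTION_SA` below is a placeholder for the `serviceAccountId` value that `bq show --connection` returns:

```shell
# Find the service account behind the Spark connection
# (look for the serviceAccountId field in the output).
bq show --connection PROJECT_ID.REGION.SPARK_CONNECTION_ID

# Grant the roles needed to create BigLake metastore tables from Spark.
# CONNECTION_SA is a placeholder for the service account found above.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:CONNECTION_SA" \
    --role="roles/bigquery.dataEditor"

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:CONNECTION_SA" \
    --role="roles/storage.objectAdmin"
```

The project-level viewer and user roles for querying can be granted to your own principal the same way, with `--member="user:YOUR_EMAIL"`.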
Create and run a stored procedure
---------------------------------

The following example shows you how to create and run a stored procedure with
BigLake metastore.

1. Go to the **BigQuery** page.

   [Go to BigQuery](https://console.cloud.google.com/bigquery)

2. In the query editor, add the following sample code for the [`CREATE PROCEDURE`
   statement](/bigquery/docs/reference/standard-sql/data-definition-language#create_procedure).

   ```googlesql
   CREATE OR REPLACE PROCEDURE
   `PROJECT_ID.BQ_DATASET_ID.PROCEDURE_NAME`()
   WITH CONNECTION `PROJECT_ID.REGION.SPARK_CONNECTION_ID` OPTIONS (engine='SPARK',
   runtime_version='1.1',
   properties=[("spark.sql.catalog.CATALOG_NAME.warehouse",
   "WAREHOUSE_DIRECTORY"),
   ("spark.sql.catalog.CATALOG_NAME.gcp_location",
   "LOCATION"),
   ("spark.sql.catalog.CATALOG_NAME.gcp_project",
   "PROJECT_ID"),
   ("spark.sql.catalog.CATALOG_NAME",
   "org.apache.iceberg.spark.SparkCatalog"),
   ("spark.sql.catalog.CATALOG_NAME.catalog-impl",
   "org.apache.iceberg.gcp.bigquery.BigQueryMetastoreCatalog"),
   ("spark.jars.packages",
   "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.6.1")],
   jar_uris=["gs://spark-lib/bigquery/iceberg-bigquery-catalog-1.6.1-1.0.1-beta.jar"])
   LANGUAGE python AS R"""
   from pyspark.sql import SparkSession
   spark = SparkSession \
     .builder \
     .appName("BigLake Metastore Iceberg") \
     .getOrCreate()
   spark.sql("USE CATALOG_NAME;")
   spark.sql("CREATE NAMESPACE IF NOT EXISTS NAMESPACE_NAME;")
   spark.sql("USE NAMESPACE_NAME;")
   spark.sql("CREATE TABLE TABLE_NAME (id int, data string) USING ICEBERG LOCATION 'WAREHOUSE_DIRECTORY'")
   spark.sql("DESCRIBE TABLE_NAME;")
   spark.sql("INSERT INTO TABLE_NAME VALUES (1, \"first row\");")
   spark.sql("SELECT * FROM TABLE_NAME;")
   spark.sql("ALTER TABLE TABLE_NAME ADD COLUMNS (newDoubleCol double);")
   spark.sql("DESCRIBE TABLE_NAME;")
   """;
   CALL `PROJECT_ID.BQ_DATASET_ID.PROCEDURE_NAME`();
   ```

   Replace the following:

   - `PROJECT_ID`: the ID of your Google Cloud project.
   - `BQ_DATASET_ID`: the ID of the dataset in BigQuery that contains the procedure.
   - `PROCEDURE_NAME`: the name of the procedure that you're creating or replacing.
   - `REGION`: the location of your Spark connection.
   - `LOCATION`: the location of your BigQuery resources.
   - `SPARK_CONNECTION_ID`: the ID of your Spark connection.
   - `CATALOG_NAME`: the name of the catalog that you're using.
   - `WAREHOUSE_DIRECTORY`: the URI of the Cloud Storage folder that contains your data warehouse.
   - `NAMESPACE_NAME`: the namespace that you're using.
   - `TABLE_NAME`: the name of the table that you're creating.
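After the procedure runs, the Iceberg table is registered in BigLake metastore. As a sketch, assuming the Spark namespace surfaces in BigQuery as a dataset of the same name, you can verify the result with a query such as:

```googlesql
-- Hypothetical verification query; NAMESPACE_NAME and TABLE_NAME
-- are the values used inside the stored procedure.
SELECT * FROM `PROJECT_ID.NAMESPACE_NAME.TABLE_NAME`;
```

Running this query requires the query-side roles listed earlier (BigQuery Data Viewer, BigQuery User, and Storage Object Viewer).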
What's next
-----------

- Set up [optional BigLake metastore features](/bigquery/docs/blms-features).
- [View and query tables from Spark in the BigQuery console](/bigquery/docs/blms-query-tables).