Use transactional tables with Dataproc Metastore
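Apache Hive metastores in Dataproc Metastore support transactions with ACID semantics. For more information, see Hive Transactions (https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-ACIDandTransactionsinHive).
These transactions are enabled by default on Hive 3.
Note: The Enterprise tier supports fully managed asynchronous compaction workflows, which are accessible through the Canary release channel.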
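Configurations
You must set both server-side and client-side configurations to enable transaction support.
Note: These configurations are set by default for Dataproc Metastore services created with Hive version 3.1.2.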
Server-side configurations
Dataproc Metastore sets the following server-side configurations by default when it creates the service. To override them, enter Key and Value overrides under Metastore config overrides.
Caution: Changing these configurations while creating a Hive 3 Dataproc Metastore service might cause unexpected errors or prevent the service from working correctly.
metastore.compactor.initiator.on: Whether to run the initiator and cleaner threads on the Dataproc Metastore service.
Set to true to enable the initiator.
metastore.compactor.worker.threads: The number of compactor worker threads to run on the Dataproc Metastore service.
Set to a positive number to enable compaction. A higher value can degrade service performance, especially on the Developer tier. If you need to adjust this number, we recommend a lower value, such as 8.
hive.metastore.event.db.notification.api.auth: Whether the Dataproc Metastore service authorizes calls to the database notification APIs.
Set to false. If set to true, only the superusers in the proxy settings have permission. For more information about superuser proxy privileges, see Metastore notification API security (https://cwiki.apache.org/confluence/display/Hive/HiveReplicationv2Development#HiveReplicationv2Development-MetastorenotificationAPIsecurity).
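As a rough illustration of applying an override from the command line instead of the console, the following sketch only assembles and prints a `gcloud metastore services update` command; it does not run it. The service name `my-metastore` and location `us-central1` are hypothetical placeholders, and the `--update-hive-metastore-configs` flag is assumed to be available in your gcloud version, so check `gcloud metastore services update --help` before relying on it.

```shell
#!/usr/bin/env bash
# Dry-run sketch: builds (but does not execute) a gcloud command that
# overrides one Hive metastore configuration on an existing service.
# The service name and location passed below are hypothetical placeholders.

build_override_command() {
  local service="$1" location="$2" key="$3" value="$4"
  printf 'gcloud metastore services update %s --location=%s --update-hive-metastore-configs=%s=%s\n' \
    "$service" "$location" "$key" "$value"
}

# Example: set the compactor worker thread count to 8, the value suggested above.
override_cmd="$(build_override_command my-metastore us-central1 metastore.compactor.worker.threads 8)"
echo "$override_cmd"
```

Printing the command first makes it easy to review the key and value before changing a running service, which matters given the caution about overriding these defaults.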
Client-side configurations
You set client-side configurations in the Hive client, as described in Validate transactions.
hive.support.concurrency: Set to true to support insert, update, and delete transactions.
hive.exec.dynamic.partition.mode: In strict mode, you must specify at least one static partition, which guards against accidentally overwriting all partitions. In nonstrict mode, all partitions can be dynamic.
Set to nonstrict to support insert, update, and delete transactions.
hive.txn.manager: Set to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.
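One way to avoid retyping these three settings in every session is to keep them in an initialization file. The sketch below writes the properties to a file and prints a `hive -i` invocation that would load that file at startup. The file name `acid-init.hql` is an arbitrary choice for illustration, and the script does not start Hive itself.

```shell
#!/usr/bin/env bash
# Sketch: store the client-side ACID settings in a Hive init file.
# The file name is a hypothetical choice; nothing here starts Hive.

init_file="${TMPDIR:-/tmp}/acid-init.hql"

cat > "$init_file" <<'EOF'
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.support.concurrency=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
EOF

# Print the command that would open a Hive session with these settings applied.
echo "hive -i $init_file"
```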
Validate transactions
You can validate Hive transactions by using a Dataproc cluster that is attached to a Dataproc Metastore service running Hive 3.
You must create the Dataproc cluster with Hive 3, in the same project as the Dataproc Metastore service. The Dataproc 2.0 images, 2.0-ubuntu18 and 2.0-debian10, support Hive 3 and transactions. Use the --image-version flag to select a 2.0 image. For example:
gcloud dataproc clusters create DATAPROC_CLUSTER_ID \
    --dataproc-metastore=projects/PROJECT_ID/locations/LOCATION/services/SERVICE \
    --region=REGION \
    --image-version 2.0-debian10
The following steps show how to validate transactions in a Dataproc Metastore service that is used by a Dataproc cluster.
Use SSH to connect to the Dataproc cluster. You can do this from a browser or from the command line.
Run the hive command to open the Hive client:
$> hive
Set the client-side configurations to enable Hive ACID support for transactions in the Hive client session:
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.support.concurrency=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
Create a transactional table to insert into and update, as in the following example.
Create the table:
create table student (id int, name string, age int)
STORED AS ORC TBLPROPERTIES ('transactional' = 'true');
Check whether the table is transactional:
describe formatted <tableName>;
A list of the table properties is printed. A transactional table has transactional=true in its table parameters.
Insert data into the table:
INSERT INTO student VALUES
(1, 'Alice', 10),
(2, 'Bob', 10),
(3, 'Charlie', 10);
Observe the delta folder created under the student directory in the service's warehouse directory. Running multiple insert or update statements creates multiple delta folders.
View the running compactions and their statuses. The Hive metastore runs a thread called the initiator every five minutes; it checks for tables that are due for compaction and requests compaction for them.
show compactions;
To start a manual compaction (either minor or major):
ALTER TABLE student COMPACT 'minor';
ALTER TABLE student COMPACT 'major';
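When you inspect the student directory before and after compaction, the subdirectory names indicate what each one holds. The helper below is a minimal sketch that classifies a name by its prefix. It assumes the usual Hive ACID layout, in which inserts write `delta_` directories, updates and deletes also write `delete_delta_` directories, and a major compaction produces a `base_` directory. The function name `classify_acid_dir` and the sample directory names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: classify a Hive ACID subdirectory name by its prefix.
# Assumes the usual ACID layout (delta_, delete_delta_, and base_ prefixes).

classify_acid_dir() {
  case "$1" in
    delete_delta_*) echo "delete_delta" ;;  # rows removed by updates or deletes
    delta_*)        echo "delta" ;;         # rows written by inserts or updates
    base_*)         echo "base" ;;          # for example, output of a major compaction
    *)              echo "other" ;;         # anything else, such as temporary files
  esac
}

# Hypothetical names, as they might appear under the student directory.
for dir in delta_0000001_0000001_0000 delete_delta_0000002_0000002_0000 base_0000002; do
  printf '%s: %s\n' "$dir" "$(classify_acid_dir "$dir")"
done
```

The `delete_delta_` pattern is checked before `delta_` only for readability; the two prefixes do not overlap, so either order gives the same result.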