Use transactional tables with Dataproc Metastore
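Apache Hive metastores in Dataproc Metastore support transactions with ACID
semantics. For more information, see Hive Transactions.
These transactions are enabled by default on Hive 3.
Note: The Enterprise tier supports fully managed asynchronous compaction
workflows, which are accessible through the Canary release channel.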
Configurations
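To enable transaction support, you must set both server side and client side
configurations.
Note: These configurations are set by default for Dataproc Metastore services
created with Hive version 3.1.2.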
Server side configurations
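Dataproc Metastore sets the following server side configurations by default
when it creates the service. To override them, enter Key and Value pairs under
Metastore config overrides.
Caution: Changing these configurations while creating a Hive 3
Dataproc Metastore service may result in unexpected errors or cause the service
to not work correctly.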
metastore.compactor.initiator.on — Whether to run the initiator
and cleaner threads on the Dataproc Metastore service.
Set to true to enable the initiator.
metastore.compactor.worker.threads — The number of compactor
worker threads to run on the Dataproc Metastore service.
Set to a positive number to enable the compactor. A higher number may degrade
the performance of the service, especially on the Developer tier. If you need
to change this number, we recommend a lower value, such as 8.
hive.metastore.event.db.notification.api.auth — Whether the
Dataproc Metastore service should authorize against database
notification-related APIs.
Set to false. If set to true, only the superusers in the proxy
settings have permission. See Metastore notification API security
for more information on the superuser proxy privilege.
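As a rough illustration of the key and value pairs described above, the following shell sketch assembles the three server side overrides and checks that the worker thread count is a positive integer before printing them. The variable names, the value 4, and the is_positive_int helper are assumptions made for this example and are not part of Dataproc Metastore. The script only prints the pairs; you would still enter them under Metastore config overrides.

```shell
#!/bin/sh
# Sketch only: prints the server side overrides as KEY=VALUE pairs.
# WORKER_THREADS=4 is a hypothetical value chosen for illustration.
WORKER_THREADS=4

# Hypothetical helper: succeeds only when its argument is a positive integer.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  [ "$1" -gt 0 ]
}

if ! is_positive_int "$WORKER_THREADS"; then
  echo "metastore.compactor.worker.threads must be a positive integer" >&2
  exit 1
fi

# One KEY=VALUE pair per line, using the values described in this section.
OVERRIDES="metastore.compactor.initiator.on=true
metastore.compactor.worker.threads=${WORKER_THREADS}
hive.metastore.event.db.notification.api.auth=false"

printf '%s\n' "$OVERRIDES"
```

Checking the thread count up front is a design choice for this sketch: the compactor is only enabled by a positive number, so a typo such as 0 or an empty value is caught before the pairs are copied anywhere.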
Client side configurations
Client side configurations are set in the Hive client as described in
Validate transactions.
hive.support.concurrency — Set to true to support insert,
update, and delete transactions.
hive.exec.dynamic.partition.mode — In strict mode, you must
specify at least one static partition, which guards against accidentally
overwriting all partitions. In nonstrict mode, all partitions can be dynamic.
Set to nonstrict to support insert, update, and delete transactions.
hive.txn.manager — Set to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.
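One possible way to avoid retyping these settings in every session is to keep them in a Hive initialization file. The sketch below writes the three client side settings to a file and prints the command that starts the Hive client with it. The file name hive_acid_init.hql and its location are assumptions made for this example; the -i option of the hive command loads an initialization file.

```shell
#!/bin/sh
# Sketch only: store the client side settings in a Hive init file.
# The file name and location are hypothetical choices for this example.
INIT_FILE="${TMPDIR:-/tmp}/hive_acid_init.hql"

# Quoted heredoc delimiter: the statements are written exactly as shown.
cat > "$INIT_FILE" <<'EOF'
SET hive.support.concurrency=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
EOF

# The script does not start Hive itself; run the printed command on a
# machine where the hive client is installed, such as the Dataproc cluster.
echo "Start the client with: hive -i $INIT_FILE"
```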
Validate transactions
You can validate Hive transactions using a Dataproc cluster that uses a
Dataproc Metastore service on Hive 3.
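You must create the Dataproc cluster with Hive 3 and in the same project as the
Dataproc Metastore service. The Dataproc 2.0 images, 2.0-ubuntu18 and
2.0-debian10, support Hive 3 and transactions. Use the --image-version flag to
select a 2.0 image. For example:
gcloud dataproc clusters create DATAPROC_CLUSTER_ID \
    --dataproc-metastore=projects/PROJECT_ID/locations/LOCATION/services/SERVICE \
    --region=REGION \
    --image-version 2.0-debian10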
The following steps validate transactions in a Dataproc Metastore service that
a Dataproc cluster uses.
SSH into the Dataproc cluster. You can do this from either a
browser or from the command line.
Run the command hive to open the Hive client:
$>hive
In the Hive client session, set the client side configurations that enable
Hive ACID transaction support:
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.support.concurrency=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
Create a transactional table to insert into and update, as in the following
example.
Create a transactional table:
CREATE TABLE student (id int, name string, age int)
STORED AS ORC TBLPROPERTIES ('transactional' = 'true');
Check if the table is transactional:
DESCRIBE FORMATTED student;
A list of the table properties is printed. A transactional table has
transactional=true in its table parameters.
Insert data into the table:
INSERT INTO student VALUES
(1, 'Alice', 10),
(2, 'Bob', 10),
(3, 'Charlie', 10);
Observe the delta folder created under the student directory in the
warehouse directory of the service. Multiple delta folders are created
if you run multiple insert or update statements.
View which compactions are running and their statuses. The Hive metastore runs
a thread called the initiator every five minutes. It checks for tables that are
due for compaction and requests compaction for them.
SHOW COMPACTIONS;
To start a manual compaction (either minor or major):
ALTER TABLE student COMPACT 'minor';
ALTER TABLE student COMPACT 'major';
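The steps above can also be collected into one script file and run without an interactive session. The following is a minimal sketch under stated assumptions: the file name validate_transactions.hql, the IF NOT EXISTS clause, and the UPDATE and DELETE statements are additions made for illustration, chosen so that more than one write reaches the table. The -f option of the hive command runs the statements in a file.

```shell
#!/bin/sh
# Sketch only: gather the validation statements into a single HiveQL file.
# The file name and location are hypothetical choices for this example.
HQL_FILE="${TMPDIR:-/tmp}/validate_transactions.hql"

# Quoted heredoc delimiter: the statements are written exactly as shown.
cat > "$HQL_FILE" <<'EOF'
-- Client side settings for ACID support
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.support.concurrency=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Transactional table from the steps above
CREATE TABLE IF NOT EXISTS student (id int, name string, age int)
  STORED AS ORC TBLPROPERTIES ('transactional' = 'true');

-- Each write adds a delta or delete_delta folder under the table directory
INSERT INTO student VALUES (1, 'Alice', 10), (2, 'Bob', 10);
UPDATE student SET age = 11 WHERE id = 1;
DELETE FROM student WHERE id = 2;

-- Confirm the table property, then check compaction status
DESCRIBE FORMATTED student;
SHOW COMPACTIONS;
EOF

# The script does not start Hive itself; run the printed command on the cluster.
echo "Run on the cluster with: hive -f $HQL_FILE"
```

In the DESCRIBE FORMATTED output, the Location field shows the table directory where the delta folders are written.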