Dataproc clusters can be created on Compute Engine
sole-tenant nodes. A sole-tenant node
is a Compute Engine server that is dedicated to hosting your project's
VMs only. Creating a Dataproc cluster on a sole-tenant node
keeps the cluster's VMs physically separate from VMs in other projects. The
clusters function as standard Dataproc clusters, but with
additional hardware isolation to address security and compliance concerns.
Dataproc sole-tenant node clusters are created in a
user-specified sole-tenant node group. Each cluster's master, worker, and
secondary worker instances are created within this node group.
Node group autoscaling recommendations:

- Make sure the node group's `max-nodes` value is large enough for the `maxInstances` of the clusters you will create in the sole-tenant node group.
- Use the default or `migrate-within-node-group` node group maintenance policy; with the `restart-in-place` policy, VMs may be unavailable for up to one hour.
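The sizing recommendation above can be roughed out with a little arithmetic. The following is a minimal sketch, not an official sizing method: it assumes vCPU count is the only limit on how many VMs fit on a node, and every value and name in it (the `required_nodes` helper, the instance counts, the VM and node vCPU sizes) is hypothetical.

```shell
#!/usr/bin/env bash
# Rough capacity estimate: how many sole-tenant nodes a cluster may need at full scale.
# Assumption: vCPU count is the only packing limit (memory and other limits are ignored).
required_nodes() {
  local instances="$1" vm_vcpus="$2" node_vcpus="$3"
  local vms_per_node=$(( node_vcpus / vm_vcpus ))
  if [ "$vms_per_node" -lt 1 ]; then
    echo "VM shape does not fit on the node type" >&2
    return 1
  fi
  # Ceiling division: nodes needed to host all instances.
  echo $(( (instances + vms_per_node - 1) / vms_per_node ))
}

# Hypothetical values for illustration only.
MASTERS=1
WORKER_MAX_INSTANCES=4
SECONDARY_MAX_INSTANCES=5
VM_VCPUS=16
NODE_VCPUS=80
MAX_NODES=3

TOTAL_INSTANCES=$(( MASTERS + WORKER_MAX_INSTANCES + SECONDARY_MAX_INSTANCES ))
NEEDED=$(required_nodes "$TOTAL_INSTANCES" "$VM_VCPUS" "$NODE_VCPUS")

if [ "$NEEDED" -le "$MAX_NODES" ]; then
  echo "max-nodes=$MAX_NODES covers the estimated $NEEDED node(s)"
else
  echo "max-nodes=$MAX_NODES is below the estimated $NEEDED node(s)"
fi
```

If several clusters share the node group, the same estimate would need to be summed across all of them before comparing against `max-nodes`.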
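### gcloud Command

To create a sole-tenant cluster, pass the `--node-group` flag to the
[gcloud dataproc clusters create](/sdk/gcloud/reference/dataproc/clusters/create)
command.

Flag notes:

- `--region` (required): Must match the region of the sole-tenant node group.
- `--node-group` (required): Specify either the sole-tenant node group name ("node-group-name") or the sole-tenant node group resource URI ("projects/project-id/zones/zone/nodeGroups/node-group-name").
- `--zone` (required): Must match the zone of the sole-tenant node group.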
    gcloud dataproc clusters create cluster-name \
        --region=region \
        --zone=zone \
        --node-group=node-group-name-or-URI \
        ... other args
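As an illustration of how the three flags fit together, here is a minimal sketch that builds the node group resource URI in the form given in the flag notes, checks that the zone name belongs to the region, and prints the resulting command for review rather than running it. The project, region, zone, node group, and cluster names are hypothetical, as are the helper functions `node_group_uri` and `zone_in_region`. The zone check assumes the usual Compute Engine naming pattern of region name plus a one-letter suffix.

```shell
#!/usr/bin/env bash
# Build the sole-tenant node group resource URI:
# projects/project-id/zones/zone/nodeGroups/node-group-name
node_group_uri() {
  local project_id="$1" zone="$2" node_group="$3"
  printf 'projects/%s/zones/%s/nodeGroups/%s\n' "$project_id" "$zone" "$node_group"
}

# Succeeds if the zone name ($1) is the region name ($2) plus "-" and one letter.
zone_in_region() {
  case "$1" in
    "$2"-[a-z]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical values for illustration only.
PROJECT_ID="example-project"
REGION="us-central1"
ZONE="us-central1-a"
NODE_GROUP="example-node-group"
CLUSTER_NAME="example-cluster"

NODE_GROUP_URI="$(node_group_uri "$PROJECT_ID" "$ZONE" "$NODE_GROUP")"

if zone_in_region "$ZONE" "$REGION"; then
  # Print the command instead of executing it, so it can be reviewed first.
  echo gcloud dataproc clusters create "$CLUSTER_NAME" \
    --region="$REGION" \
    --zone="$ZONE" \
    --node-group="$NODE_GROUP_URI"
else
  echo "Zone $ZONE is not in region $REGION" >&2
fi
```

This check only compares the two names with each other. It does not confirm that the node group actually exists in that zone, which would require querying Compute Engine.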
### REST API

Create a sole-tenant cluster using a
[clusters.create](/dataproc/docs/reference/rest/v1/projects.regions.clusters/create)
request that specifies the
[NodeGroupAffinity.nodeGroupUri](/dataproc/docs/reference/rest/v1/ClusterConfig#NodeGroupAffinity)
of the sole-tenant node group.

Note: The cluster zone specified in the [`zoneUri`](/dataproc/docs/reference/rest/v1/ClusterConfig#GceClusterConfig.FIELDS.zone_uri)
field must match the sole-tenant node group zone.

To examine or construct the JSON body of a Dataproc API clusters request, open the Dataproc [Create a cluster](https://console.cloud.google.com/dataproc/clustersAdd) page in the Google Cloud console, fill in the applicable fields, then click the **Equivalent REST** button at the bottom of the left panel to view the POST request with the completed JSON request body.

### Console

Currently, creating a sole-tenant Dataproc cluster is
not supported in the Google Cloud console.

**Caution:** Dataproc defaults to preemptible VMs for [secondary workers](/dataproc/docs/concepts/compute/secondary-vms), which Compute Engine sole-tenant nodes do not support. If you plan to use secondary workers with your sole-tenant cluster, either by [manually adding them](/dataproc/docs/concepts/compute/secondary-vms#using_secondary_workers) or by using an [autoscaling policy](/dataproc/docs/concepts/configuring-clusters/autoscaling#gcloud-command) that scales up secondary workers, you must set the secondary worker type to non-preemptible.

Before you create a sole-tenant cluster, complete these steps:

1. See [Before you begin](/compute/docs/nodes/provisioning-sole-tenant-vms#before-you-begin).
2. [Create a sole-tenant node template](/compute/docs/nodes/provisioning-sole-tenant-vms#creating_a_sole-tenant_node_template).
3. [Create a sole-tenant node group](/compute/docs/nodes/provisioning-sole-tenant-vms#creating_a_soletenant_node_group). Use [autoscaling node groups](/compute/docs/nodes/autoscaling-node-groups) if you will create [autoscaling clusters](/dataproc/docs/concepts/configuring-clusters/autoscaling) in the sole-tenant node group.
4. Review the [sole-tenant node VM restrictions](/compute/docs/nodes/sole-tenant-nodes#restrictions).

Last updated 2025-08-25 UTC.