Dataproc service account based Secure Multi-tenancy
Dataproc service account based Secure Multi-tenancy
(called "secure multi-tenancy" below) lets multiple users share a cluster.
You map a set of users to service accounts when you create the cluster.
With secure multi-tenancy, users can submit interactive workloads to the
cluster with isolated user identities.
When a user submits a job to the cluster, the job:
runs as a specific OS user with a specific Kerberos principal
accesses Google Cloud resources using the mapped service account credentials
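As a rough illustration of job submission, the sketch below assembles a gcloud dataproc jobs submit pyspark command, which submits through the Jobs API. The cluster name, region, and script location are placeholder assumptions, not values from this page. The function prints the command for review instead of running it; the job would be attributed to whichever user is authenticated in gcloud when the command is run.

```shell
# Placeholder values (assumptions for illustration only).
CLUSTER="my-cluster"
REGION="us-central1"
MAIN_PY="gs://my-bucket/jobs/word_count.py"

# Print the submit command so it can be reviewed before it is run.
build_submit_cmd() {
  printf 'gcloud dataproc jobs submit pyspark %s --cluster=%s --region=%s\n' \
    "$MAIN_PY" "$CLUSTER" "$REGION"
}

build_submit_cmd
```

To run the printed command, copy it into a shell where a mapped user is signed in.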
Considerations and limitations
When you create a cluster with secure multi-tenancy enabled:
You can submit jobs only through the Dataproc Jobs API.
The cluster is available only to users with mapped service accounts. For
example, unmapped users cannot run jobs on the cluster.
Service accounts can be mapped only to Google users, not Google groups.
The Dataproc Component Gateway is not enabled.
Direct SSH access to the cluster and Compute Engine features, such
as the ability to run startup scripts on cluster VMs, are blocked. Also,
jobs cannot run with sudo privileges.
Kerberos is
enabled and configured on the cluster for secure intra-cluster communication.
End user authentication through Kerberos is not supported.
Dataproc Workflows are not supported.
Create a secure multi-tenancy cluster
To create a Dataproc secure multi-tenancy cluster, use
the --secure-multi-tenancy-user-mapping
flag to specify a list of user-to-service-account mappings.
Example:
The following command creates a cluster, with user bob@my-company.com
mapped to service account service-account-for-bob@iam.gserviceaccount.com
and user alice@my-company.com mapped to service account service-account-for-alice@iam.gserviceaccount.com.
gcloud dataproc clusters create my-cluster \
--secure-multi-tenancy-user-mapping="bob@my-company.com:service-account-for-bob@iam.gserviceaccount.com,alice@my-company.com:service-account-for-alice@iam.gserviceaccount.com" \
--scopes=https://www.googleapis.com/auth/iam \
--service-account=cluster-service-account@iam.gserviceaccount.com \
--region=region \
other args ...
Alternatively, you can store the list of user-to-service-account mappings in
a local or Cloud Storage YAML or JSON file. Use the
--identity-config-file flag to specify the file location.
Sample identity config file:
user_service_account_mapping:
  bob@my-company.com: service-account-for-bob@iam.gserviceaccount.com
  alice@my-company.com: service-account-for-alice@iam.gserviceaccount.com
Sample command to create the cluster using the --identity-config-file flag.
The flag value is either a local path or a Cloud Storage URI; this example
uses a Cloud Storage URI:
gcloud dataproc clusters create my-cluster \
--identity-config-file=gs://bucket/path/to/identity-config-file \
--scopes=https://www.googleapis.com/auth/iam \
--service-account=cluster-service-account@iam.gserviceaccount.com \
--region=region \
other args ...
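If you keep the mappings as user:service-account pairs, a small helper can generate the identity config file. The following is a minimal sketch, assuming the YAML layout with a top-level user_service_account_mapping key that maps each user to a service account. The function name write_identity_config and the output file name are hypothetical.

```shell
# Hypothetical output path for the generated file.
CONFIG_FILE="identity-config.yaml"

# Write one "user: service-account" line per pair under the top-level key.
# Each argument after the output path has the form user:service-account.
write_identity_config() {
  out="$1"; shift
  printf 'user_service_account_mapping:\n' > "$out"
  for pair in "$@"; do
    user="${pair%%:*}"   # text before the first colon
    sa="${pair#*:}"      # text after the first colon
    printf '  %s: %s\n' "$user" "$sa" >> "$out"
  done
}

write_identity_config "$CONFIG_FILE" \
  "bob@my-company.com:service-account-for-bob@iam.gserviceaccount.com" \
  "alice@my-company.com:service-account-for-alice@iam.gserviceaccount.com"

cat "$CONFIG_FILE"
```

The generated file can then be passed to the --identity-config-file flag, either from the local path or after copying it to Cloud Storage.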
Notes:
As shown in the preceding commands, cluster --scopes must include at least
https://www.googleapis.com/auth/iam, which is necessary for the cluster
service account to perform impersonation.
The cluster service account must have permissions to impersonate the
service accounts mapped to the users (see
Service account permissions).
Recommendation: Use different cluster service accounts for
different clusters to allow each cluster service account to impersonate
only a limited, intended group of mapped user service accounts.
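One possible way to grant the impersonation permission is an IAM policy binding on each mapped service account. The sketch below prints one gcloud iam service-accounts add-iam-policy-binding command per mapped service account so you can review them before running anything. It assumes that the Service Account Token Creator role (roles/iam.serviceAccountTokenCreator) is the role you want to grant; check the Service account permissions page for the role that fits your setup. The function name print_grant_cmds is hypothetical.

```shell
# Placeholder cluster service account, matching the sample commands on this page.
CLUSTER_SA="cluster-service-account@iam.gserviceaccount.com"

# Assumption: roles/iam.serviceAccountTokenCreator is the role to grant.
# Print (do not run) one binding command per mapped user service account.
print_grant_cmds() {
  for user_sa in "$@"; do
    printf 'gcloud iam service-accounts add-iam-policy-binding %s --member=serviceAccount:%s --role=roles/iam.serviceAccountTokenCreator\n' \
      "$user_sa" "$CLUSTER_SA"
  done
}

print_grant_cmds \
  "service-account-for-bob@iam.gserviceaccount.com" \
  "service-account-for-alice@iam.gserviceaccount.com"
```

Granting the role per mapped service account, instead of at the project level, keeps each cluster service account limited to the intended group, in line with the recommendation above.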