Set up a cross-project deployment

You can set up a cross-project deployment for Dataproc Metastore to separate the following resources:

  • The Dataproc Metastore service.
  • The Dataproc cluster attached to the Dataproc Metastore service.
  • The network used by the Dataproc cluster.

Before you begin

Required roles

To get the permissions that you need to create a Dataproc Metastore and a Dataproc cluster, ask your administrator to grant you the following IAM roles:

  • To grant full control of Dataproc Metastore resources: Dataproc Metastore Editor (roles/metastore.editor) on the metastore project.

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

For more information about specific Dataproc Metastore roles and permissions, see Manage access with IAM.

About cross-project deployments

A cross-project deployment for Dataproc Metastore can consist of the following configurations:

  • Two projects:

    • Project one contains the Dataproc cluster (cluster project).
    • Project two contains the Dataproc Metastore service and the network (metastore project and network project).
  • Two projects:

    • Project one contains the Dataproc cluster and the Dataproc Metastore service (cluster project and metastore project).
    • Project two contains the network (network project).
  • Three projects:

    • Project one contains the Dataproc cluster (cluster project).
    • Project two contains the Dataproc Metastore service (metastore project).
    • Project three contains the network (network project).

The following diagram provides an overview of the possible project configurations you can use.

Figure: Overview of the possible project configurations when deploying a Dataproc Metastore service and a Dataproc cluster.

Cross-project permissions

Before you set up cross-project permissions, determine whether your configuration needs them. You must set up additional cross-project permissions in either of the following cases:

  • Your Dataproc cluster and Dataproc Metastore service are in separate projects.

  • Your Dataproc Metastore service and network are in separate projects.

Set up cross-project permissions

If the Dataproc cluster and the Dataproc Metastore service are in separate projects, grant the following role:

  • roles/metastore.user to the cluster project's Dataproc service agent account. Add this binding to the metastore project's IAM policy. This configuration applies to both the Thrift and gRPC endpoint protocols.

If the network and the Dataproc Metastore service are in separate projects, grant the following role:

  • roles/metastore.serviceAgent to the metastore project's service agent. Add this binding to the network project's IAM policy. This configuration applies only to the Thrift endpoint protocol.
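
The two conditional grants above can be sketched as a small shell helper that prints the gcloud commands a given project layout would need. This is a minimal sketch, not a definitive procedure: the project IDs and numbers are hypothetical placeholders, the function name is made up for illustration, and the service agent address formats (dataproc-accounts and gcp-sa-metastore) are assumptions that you should confirm on the IAM page of each project. The script only prints commands so that you can review them before running anything.

```shell
#!/usr/bin/env bash
# Sketch only: every project ID and number below is a hypothetical placeholder.
CLUSTER_PROJECT_ID="cluster-project"
CLUSTER_PROJECT_NUMBER="111111111111"
METASTORE_PROJECT_ID="metastore-project"
METASTORE_PROJECT_NUMBER="222222222222"
NETWORK_PROJECT_ID="network-project"

# Print the IAM grants that a given project layout needs.
# Arguments: cluster ID, cluster number, metastore ID, metastore number, network ID.
print_cross_project_grants() {
  local cluster_id="$1" cluster_num="$2" metastore_id="$3" metastore_num="$4" network_id="$5"

  # Cluster and metastore in separate projects: the cluster project's Dataproc
  # service agent (assumed address format) gets roles/metastore.user on the
  # metastore project.
  if [ "$cluster_id" != "$metastore_id" ]; then
    echo "gcloud projects add-iam-policy-binding ${metastore_id}" \
      "--member=serviceAccount:service-${cluster_num}@dataproc-accounts.iam.gserviceaccount.com" \
      "--role=roles/metastore.user"
  fi

  # Metastore and network in separate projects (Thrift endpoint only): the
  # metastore project's service agent (assumed address format) gets
  # roles/metastore.serviceAgent on the network project.
  if [ "$metastore_id" != "$network_id" ]; then
    echo "gcloud projects add-iam-policy-binding ${network_id}" \
      "--member=serviceAccount:service-${metastore_num}@gcp-sa-metastore.iam.gserviceaccount.com" \
      "--role=roles/metastore.serviceAgent"
  fi
}

# Three-project layout: both grants are printed.
print_cross_project_grants "$CLUSTER_PROJECT_ID" "$CLUSTER_PROJECT_NUMBER" \
  "$METASTORE_PROJECT_ID" "$METASTORE_PROJECT_NUMBER" "$NETWORK_PROJECT_ID"
```

If all three resources share one project, the helper prints nothing, which matches the case where no additional cross-project permissions are needed.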

Console

To find your project number:

  1. Navigate to the IAM & Admin Settings tab.

  2. From the project list at the top of the page, select the project you want to use to create the Dataproc cluster.

  3. Note the project number.
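
The project number can also be looked up from the command line. The following sketch builds the lookup command for a given project ID and prints it for review. The project ID shown and the helper name are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# CLUSTER_PROJECT_ID is a hypothetical placeholder for the cluster project's ID.
CLUSTER_PROJECT_ID="cluster-project"

# Build the lookup command. The --format expression limits the output of
# `gcloud projects describe` to the project number.
project_number_command() {
  printf 'gcloud projects describe %s --format="value(projectNumber)"\n' "$1"
}

# Printed for review rather than executed; run the printed command yourself.
project_number_command "$CLUSTER_PROJECT_ID"
```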

Configure the permissions:

  1. Navigate to the IAM tab.

  2. From the project list at the top of the page, select the metastore project.

  3. Click Add.

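  4. Enter the cluster project's Dataproc service agent account in the New Principals field. The account name includes the project number that you noted, typically in the form service-PROJECT_NUMBER@dataproc-accounts.iam.gserviceaccount.com.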

  5. From the Roles menu, select Dataproc Metastore > Dataproc Metastore Viewer.

  6. Click Add.

After you complete these steps, you can create a Dataproc cluster that's attached to a Dataproc Metastore service. The Dataproc cluster's network or subnetwork configuration must match the network or subnetwork of the Dataproc Metastore service.

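For example, the following command creates a Dataproc Metastore service and identifies its network by full resource path, so the network can belong to a different project (HOST_PROJECT):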

gcloud metastore services create SERVICE \
     --network=projects/HOST_PROJECT/global/networks/NETWORK_ID
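
A matching Dataproc cluster could then be created along the following lines. This is a hedged sketch: all names are hypothetical placeholders, it assumes the cluster and the Dataproc Metastore service are in the same region, and it assumes your setup accepts the network as a full resource path (some Shared VPC setups need --subnet with a subnetwork path instead). The script prints the command rather than running it.

```shell
#!/usr/bin/env bash
# All values below are hypothetical placeholders.
CLUSTER_NAME="example-cluster"
REGION="us-central1"
METASTORE_PROJECT_ID="metastore-project"
SERVICE="example-metastore"
NETWORK_PROJECT_ID="network-project"
NETWORK_ID="example-network"

# Full resource name of the Dataproc Metastore service.
metastore_resource() {
  printf 'projects/%s/locations/%s/services/%s' "$1" "$2" "$3"
}

# Full resource name of the network, in the same form as the example above.
network_resource() {
  printf 'projects/%s/global/networks/%s' "$1" "$2"
}

# Print the cluster creation command for review before running it.
echo "gcloud dataproc clusters create ${CLUSTER_NAME}" \
  "--region=${REGION}" \
  "--dataproc-metastore=$(metastore_resource "$METASTORE_PROJECT_ID" "$REGION" "$SERVICE")" \
  "--network=$(network_resource "$NETWORK_PROJECT_ID" "$NETWORK_ID")"
```

Using the same network path for both the service and the cluster is one way to satisfy the requirement that their network configurations match.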

What's next