VPC Service Controls with Dataplex Universal Catalog

This document describes how to secure your Dataplex Universal Catalog services using VPC Service Controls (VPC-SC).

VPC Service Controls provides additional security for your Dataplex Universal Catalog services to help mitigate the risk of data exfiltration. Using VPC Service Controls, you can add projects to service perimeters that protect resources and services from requests that cross the perimeter. For more information, see Overview of VPC Service Controls.

Dataplex Universal Catalog resources are exposed on the dataplex.googleapis.com API, which lets you perform service-level operations, such as creation and deletion of services. You set up VPC Service Controls with Dataplex Universal Catalog by restricting connectivity to this API surface.

Supported features

When you secure Dataplex Universal Catalog with a service perimeter, the following features are supported:

  • Lakes: you can create and manage Dataplex Universal Catalog lakes within the perimeter.

  • Assets: you can manage assets within the perimeter.

  • Metadata resources in Catalog: you can manage metadata resources within the perimeter.

  • Metadata export: you can export metadata to Dataproc Metastore, which requires additional VPC-SC configuration. For more information, see Export metadata to Dataproc Metastore.

  • Data Insights: you can run data profiling, data quality, and metadata insights scans.

  • Data Lineage: you can track data lineage by using the restricted Virtual IP (VIP).

  • Dataplex Search: you can use the SearchEntries API. When you use Dataplex Universal Catalog search in a project protected by a service perimeter, search results include only resources that are within the same perimeter. For more information, see Search scope and Isolate search results by environment using VPC Service Controls.

Limitations

You can create Dataplex Universal Catalog resources before setting up the VPC Service Controls security perimeter, but those resources won't have perimeter protection.

By design, VPC Service Controls prevents resources within a service perimeter from accessing data and services outside that perimeter. For example, Dataplex Universal Catalog data profiling and data quality scans can't access data sources, such as BigQuery tables or Cloud Storage files, that are outside the service perimeter. Similarly, when you export data scan results, the destination BigQuery table must be in a project that the perimeter protects. For information about granting access to your protected resources from outside the perimeter, see Create an access level.

Set up VPC Service Controls

To protect Dataplex Universal Catalog resources with VPC Service Controls, perform the following steps:

  1. Configure the VPC network.
  2. Create a service perimeter.
  3. Add more projects to the service perimeter.
  4. Add the Dataplex API to the service perimeter.
  5. Optional: Create an access level.

Configure the Virtual Private Cloud (VPC) network

You can configure your VPC network so that Private Google Access is restricted to a service perimeter. With this configuration, hosts on your VPC or on-premises network can communicate only with Google APIs and services that VPC Service Controls supports, and only in ways that conform to the associated perimeter's policy.

For more information, see Setting up private connectivity to Google APIs and services.
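One common way to apply this restriction is to resolve `*.googleapis.com` to the restricted VIP (`restricted.googleapis.com`, 199.36.153.4/30) through a private Cloud DNS zone. The following is a minimal sketch under stated assumptions, not a complete network setup: the zone name `restricted-apis` and the network name `my-vpc` are hypothetical, and the `run` helper is an illustrative wrapper that only prints each command unless you set `DRY_RUN=0`. Routes and firewall rules for the VIP range are assumed to be configured separately.

```shell
#!/bin/sh
# Hypothetical names; replace with your own values.
ZONE_NAME="${ZONE_NAME:-restricted-apis}"
NETWORK="${NETWORK:-my-vpc}"

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

configure_restricted_vip_dns() {
  # Private zone that overrides googleapis.com inside the VPC network.
  run gcloud dns managed-zones create "$ZONE_NAME" \
    --description="Route Google APIs through the restricted VIP" \
    --dns-name=googleapis.com. \
    --visibility=private \
    --networks="$NETWORK"

  # A record for the restricted VIP range (199.36.153.4/30).
  run gcloud dns record-sets create restricted.googleapis.com. \
    --zone="$ZONE_NAME" --type=A --ttl=300 \
    --rrdatas=199.36.153.4,199.36.153.5,199.36.153.6,199.36.153.7

  # Send every other googleapis.com hostname to the restricted VIP.
  run gcloud dns record-sets create "*.googleapis.com." \
    --zone="$ZONE_NAME" --type=CNAME --ttl=300 \
    --rrdatas=restricted.googleapis.com.
}

configure_restricted_vip_dns
```

Review the printed commands first, then rerun the script with `DRY_RUN=0` to apply them.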

Create a service perimeter

When you create a service perimeter, you select the Dataplex Universal Catalog projects that you want the perimeter to protect.

To create a service perimeter, follow the instructions in Create a service perimeter.
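If you prefer the command line, perimeter creation can be sketched with `gcloud access-context-manager` as follows. This is a hedged example rather than the canonical procedure: the organization ID, policy ID, perimeter name `dataplex_perimeter`, and project number are all hypothetical placeholders, and the `run` helper is an illustrative wrapper that prints each command unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Hypothetical values; replace with your own.
ORGANIZATION_ID="${ORGANIZATION_ID:-123456789012}"
POLICY_ID="${POLICY_ID:-987654321}"
PERIMETER_ID="${PERIMETER_ID:-dataplex_perimeter}"
PROJECT_NUMBER="${PROJECT_NUMBER:-111111111111}"

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

create_dataplex_perimeter() {
  # List access policies to find the policy ID for the organization.
  run gcloud access-context-manager policies list \
    --organization="$ORGANIZATION_ID"

  # Create a perimeter around the project. Projects are referenced
  # by project number, not by project ID.
  run gcloud access-context-manager perimeters create "$PERIMETER_ID" \
    --title="Dataplex perimeter" \
    --resources="projects/$PROJECT_NUMBER" \
    --policy="$POLICY_ID"
}

create_dataplex_perimeter
```

This sketch only creates the perimeter and places one project inside it; restricting the Dataplex API is a separate step.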

Add more projects to the service perimeter

To add existing Dataplex Universal Catalog projects to the perimeter, follow the instructions in Update a service perimeter.
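As a rough command-line equivalent, the `--add-resources` flag of `gcloud access-context-manager perimeters update` appends a project to an existing perimeter. The values below are hypothetical placeholders, and the `run` helper is an illustrative wrapper that prints the command unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Hypothetical values; replace with your own.
POLICY_ID="${POLICY_ID:-987654321}"
PERIMETER_ID="${PERIMETER_ID:-dataplex_perimeter}"
PROJECT_NUMBER="${PROJECT_NUMBER:-222222222222}"

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Adds one project (given by project number) to the perimeter.
add_project_to_perimeter() {
  run gcloud access-context-manager perimeters update "$PERIMETER_ID" \
    --policy="$POLICY_ID" \
    --add-resources="projects/$1"
}

add_project_to_perimeter "$PROJECT_NUMBER"
```

If you know only the project ID, `gcloud projects describe PROJECT_ID --format="value(projectNumber)"` returns the matching project number.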

Add the Dataplex API to the service perimeter

To mitigate the risk of data being exfiltrated from Dataplex Universal Catalog (for example, through Dataplex API methods), you must restrict the Dataplex API.

To add the Dataplex API as a restricted service, follow these steps:

Console

  1. In the Google Cloud console, go to the VPC Service Controls page.

  2. On the VPC Service Controls page, in the table, click the name of the service perimeter that you want to modify.

  3. Click Edit Perimeter.

  4. On the Edit Service Perimeter page, click Add Services.

  5. Add Dataplex API.

  6. Click Save.

gcloud

  • Use the gcloud access-context-manager perimeters update command:

    gcloud access-context-manager perimeters update PERIMETER_ID \
    --policy=POLICY_ID \
    --add-restricted-services=dataplex.googleapis.com
    

    Replace the following:

    • PERIMETER_ID: the ID of the perimeter or the fully qualified identifier for the perimeter
    • POLICY_ID: the ID of the access policy
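To confirm that the update took effect, you can describe the perimeter and look for `dataplex.googleapis.com` among its restricted services. The following sketch assumes the same `PERIMETER_ID` and `POLICY_ID` placeholders as the update command; the helper function `dataplex_is_restricted` is a hypothetical name introduced for illustration, and the live query runs only when `gcloud` is installed.

```shell
#!/bin/sh
# Placeholders; replace with your own values.
PERIMETER_ID="${PERIMETER_ID:-PERIMETER_ID}"
POLICY_ID="${POLICY_ID:-POLICY_ID}"

# Reads a perimeter description on stdin and succeeds only if the
# Dataplex API appears in it.
dataplex_is_restricted() {
  grep -q 'dataplex\.googleapis\.com'
}

# Query the live perimeter only when gcloud is available.
if command -v gcloud >/dev/null 2>&1; then
  if gcloud access-context-manager perimeters describe "$PERIMETER_ID" \
       --policy="$POLICY_ID" \
       --format="value(status.restrictedServices)" | dataplex_is_restricted; then
    echo "dataplex.googleapis.com is restricted"
  else
    echo "dataplex.googleapis.com is NOT restricted"
  fi
fi
```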

Optional: Create an access level

To permit external access to protected resources inside a perimeter, you can use access levels. Access levels apply only to requests for protected resources coming from outside the service perimeter. You can't use access levels to give protected resources permission to access data and services outside the perimeter.

For more information, see Allow access to protected resources from outside a perimeter.
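As an illustration, a basic access level that admits requests from a single IP range can be created and attached to the perimeter as sketched below. Every value is an assumption made for the example: the level name `corp_network`, the documentation-only range 203.0.113.0/24, the perimeter name, and the policy ID. The `run` helper is an illustrative wrapper that prints each command unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Hypothetical values; replace with your own.
POLICY_ID="${POLICY_ID:-987654321}"
PERIMETER_ID="${PERIMETER_ID:-dataplex_perimeter}"
LEVEL_NAME="${LEVEL_NAME:-corp_network}"
SPEC_FILE="${SPEC_FILE:-${TMPDIR:-/tmp}/corp_network_level.yaml}"

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

allow_corp_network() {
  # Basic level condition: requests must originate from this CIDR range.
  cat > "$SPEC_FILE" <<'EOF'
- ipSubnetworks:
  - 203.0.113.0/24
EOF

  # Create the access level from the condition file.
  run gcloud access-context-manager levels create "$LEVEL_NAME" \
    --title="Corporate network" \
    --basic-level-spec="$SPEC_FILE" \
    --policy="$POLICY_ID"

  # Attach the access level to the perimeter.
  run gcloud access-context-manager perimeters update "$PERIMETER_ID" \
    --policy="$POLICY_ID" \
    --add-access-levels="$LEVEL_NAME"
}

allow_corp_network
```

The access level affects only inbound requests from outside the perimeter; it does not let resources inside the perimeter reach data or services outside it.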

What's next