Dataplex pricing

Dataplex pricing is based on pay-as-you-go usage. Dataplex currently charges based on the following SKUs:

  • Dataplex processing (standard and premium)
  • Dataplex shuffle storage
  • Metadata storage
  • Data Catalog API calls

The following is a high-level overview of how each key Dataplex capability is billed:

Capability                                    | Dataplex processing | Dataplex shuffle storage | Metadata storage
Cloud Storage metadata harvesting             | Standard            | N/A                      | N/A
Data exploration workbench                    | Premium             | Yes                      | N/A
Data lineage                                  | Premium             | N/A                      | Yes
Data quality                                  | Premium             | N/A                      | Yes, if published to Data Catalog
Data profiling                                | Premium             | N/A                      | Yes, if published to Data Catalog
Enrich metadata in BigQuery universal catalog | N/A                 | N/A                      | Yes
Enrich metadata in Data Catalog               | N/A                 | N/A                      | Yes

In addition, Data Catalog API calls are billed according to the Data Catalog API charges.

Other usage

Data organization features in Dataplex (lake, zone, and asset setup), as well as security policy application and propagation, are provided free of charge.

In addition, some Dataplex features (including scheduled data quality and data ingestion tasks, and Dataplex managed connectors for ingesting metadata from Cloud SQL and Looker) trigger job execution on Dataproc Serverless, BigQuery, Dataflow, and Cloud Scheduler. That usage is charged according to the Dataproc, BigQuery, Dataflow, and Cloud Scheduler pricing models respectively, and the charges appear under those services instead of under Dataplex.

Dataplex processing pricing

Dataplex standard and premium processing are metered in Data Compute Units (DCUs). The DCU-hour is an abstract billing unit for Dataplex; the actual metering depends on the individual features you use.

Dataplex standard processing pricing

The Dataplex standard tier covers the data discovery functionality, which discovers metadata across Dataplex-managed data. Prices vary by the region of your choice.

Dataplex free tier

As part of the Google Cloud Free Tier, Dataplex offers some resources free of charge up to a specific limit. These free usage limits are available during and after the free trial period. If you go over these usage limits and are no longer in the free trial period, you will be charged according to the pricing as described in the sections above.

Resource            | Monthly free usage limit
Dataplex processing | 100 DCU-hours
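As a rough illustration of how the free tier interacts with the per-DCU-hour billing, the following hypothetical sketch subtracts the 100 DCU-hour monthly free limit before applying a rate. The `processing_charge` helper and the $0.089 rate (the us-central1 premium rate quoted in the lineage example later on this page) are assumptions for illustration; actual rates vary by region and tier.

```python
FREE_DCU_HOURS = 100          # monthly free usage limit from the table above
RATE_PER_DCU_HOUR = 0.089     # assumed example rate; varies by region and tier

def processing_charge(dcu_hours_used: float) -> float:
    """Charge only the DCU-hours that exceed the monthly free tier."""
    billable = max(0.0, dcu_hours_used - FREE_DCU_HOURS)
    return billable * RATE_PER_DCU_HOUR

print(round(processing_charge(80), 2))   # within the free tier -> 0.0
print(round(processing_charge(250), 2))  # 150 billable DCU-hours -> 13.35
```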

Dataplex premium processing pricing

The Dataplex premium processing tier covers data lineage, data quality, data profiling, and data exploration workbench.

DCU charges for each feature are calculated as follows:

  • Auto data quality scans:
    • The DCU-hour consumption is proportional to the processing involved in profiling the data and computing the data quality metrics. This is billed per second, with a minimum of one minute.
    • The charge depends on the number of rows, the number of columns, the amount of data that you've scanned, the data quality rule configuration, the partitioning and clustering settings on the table, and the frequency of the scan.
    • Several scan options, such as sampling, row filtering, and incremental scans, can reduce the cost of auto data quality scans.
    • To separate data quality charges from other charges in the Dataplex premium processing SKU, on the Cloud Billing report, use the label goog-dataplex-workload-type with value DATA_QUALITY.
    • To filter aggregate charges, use the following labels available in billing export in BigQuery:
      • goog-dataplex-datascan-data-source-dataplex-entity
      • goog-dataplex-datascan-data-source-dataplex-lake
      • goog-dataplex-datascan-data-source-dataplex-zone
      • goog-dataplex-datascan-data-source-project
      • goog-dataplex-datascan-data-source-region
      • goog-dataplex-datascan-id
      • goog-dataplex-datascan-job-id
  • Data profiling scans:
    • The DCU-hour consumption is proportional to the processing involved in profiling the data. This is billed per second, with a minimum of one minute.
    • The charge depends on the number of rows, the number of columns, the amount of data scanned, the partitioning and clustering settings on the table, and the frequency of the scan.
    • There are several options to reduce the cost of data profiling scans:
      • Sampling
      • Incremental scans
      • Column filtering
      • Row filtering
    • To separate data profiling charges from other charges in the Dataplex premium processing SKU, on the Cloud Billing report, use the label goog-dataplex-workload-type with value DATA_PROFILE.
    • To filter aggregate charges, use the following labels available in billing export in BigQuery:
      • goog-dataplex-datascan-data-source-dataplex-entity
      • goog-dataplex-datascan-data-source-dataplex-lake
      • goog-dataplex-datascan-data-source-dataplex-zone
      • goog-dataplex-datascan-data-source-project
      • goog-dataplex-datascan-data-source-region
      • goog-dataplex-datascan-id
      • goog-dataplex-datascan-job-id
  • Data Lineage:
    • The DCU-hour consumption is proportional to the processing involved to automatically parse lineage.
    • To separate data lineage charges from other charges in the Dataplex premium processing SKU, on the Cloud Billing report, use the label goog-dataplex-workload-type with value LINEAGE.
    • Calls to the Data Lineage API with an Origin sourceType value other than CUSTOM incur additional charges.
  • Data exploration workbench:
    • The DCU-hour is calculated based on the compute consumption of the session.

Data lineage pricing example

User A enables data lineage to track lineage for BigQuery in their project. The project is in the us-central1 location. During one month, data lineage consumes 100 DCU-hours of Dataplex premium processing and generates 1 GiB of data lineage metadata. The cost is:

  100 * $0.089           // 100 DCU-hours of Dataplex premium processing at $0.089 per DCU-hour
+ (1 GiB - 1 MiB) * $2   // storage of 1 GiB of lineage metadata, minus 1 MiB of free storage per month
  ---------------------
  = $10.90
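The arithmetic above can be sketched in a few lines. This is a minimal illustration using the rates quoted in the example ($0.089 per DCU-hour, $2 per GiB-month, 1 MiB of free storage); the variable names are my own.

```python
DCU_RATE = 0.089              # USD per DCU-hour (us-central1, per the example)
STORAGE_RATE = 2.0            # USD per GiB per month
FREE_STORAGE_GIB = 1 / 1024   # 1 MiB of free storage, expressed in GiB

processing = 100 * DCU_RATE                       # 100 DCU-hours -> $8.90
storage = (1 - FREE_STORAGE_GIB) * STORAGE_RATE   # ~1023/1024 GiB -> ~$2.00
total = processing + storage

print(round(total, 2))  # -> 10.9
```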

Dataplex shuffle storage pricing

Shuffle storage pricing covers any disk storage specified in the environments configured for the data exploration workbench.

Catalog pricing

This section describes the pricing for universal catalog and Data Catalog. For more information about the differences between universal catalog and Data Catalog, see Universal catalog versus Data Catalog.

Universal catalog charges apply to metadata storage for universal catalog, including metadata stored for data lineage. These charges are effective as of August 1, 2024.

Data Catalog charges apply to metadata storage for Data Catalog and API calls made to the Data Catalog API.

Metadata storage and API call charges accrue daily. You can view unbilled usage in the Google Cloud console.

Metadata storage pricing

Dataplex uses the metadata storage SKU to charge for metadata storage. Metadata storage is measured in gibibytes (GiB), where 1 GiB is 1,073,741,824 bytes. Universal catalog and Data Catalog measure the average amount of the stored metadata during a short time interval. For billing, these measurements are combined into a one-month average, which is multiplied by the monthly rate.
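To make the averaging concrete, here is a hypothetical sketch of the billing model described above: short-interval storage measurements (in GiB) are combined into a one-month average, which is then multiplied by the monthly rate. The `monthly_storage_charge` helper and the daily sample values are illustrative assumptions, not the actual metering implementation.

```python
RATE_PER_GIB_MONTH = 2.0  # assumed $2 per GiB per month

def monthly_storage_charge(samples_gib: list[float]) -> float:
    """Average the interval measurements over the month, then apply the rate."""
    average = sum(samples_gib) / len(samples_gib)
    return average * RATE_PER_GIB_MONTH

# Example: 3 GiB stored for the first 15 days, 5 GiB for the last 15 days.
# The monthly average is 4 GiB, so the charge is 4 * $2 = $8.
print(monthly_storage_charge([3.0] * 15 + [5.0] * 15))  # -> 8.0
```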

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

Universal catalog storage pricing

Metadata storage charges (including those for entries and aspects) are billed to the project where the respective resource was created.

Monthly average storage | Price (USD)
Any                     | $2 per GiB per month

When a resource in Data Catalog is made simultaneously available in universal catalog, you are charged for only one active instance of that resource.

Data Catalog storage pricing

Monthly average storage | Price (USD)
Up to 1 MiB             | No charge
Over 1 MiB              | $2 per GiB per month

API pricing

This section describes the pricing for universal catalog and Data Catalog APIs.

Universal catalog API charges

As users interact with universal catalog, API calls for the following are free of charge:

  • Creating and managing universal catalog resources
  • Creating and managing lineage resources, except for lineage that is automatically harvested
  • Catalog search

Data Catalog API charges

Data Catalog API calls are billed as described in the following table:

API calls                  | Price (USD)
Up to 1 million in a month | No charge
Over 1 million in a month  | $10 per 100,000 API calls

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.
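The tiered Data Catalog API pricing above can be sketched as follows. This is a hypothetical helper, and it assumes calls beyond the free million are billed pro rata at $10 per 100,000 calls; the actual rounding granularity is not specified on this page.

```python
FREE_CALLS = 1_000_000   # first 1 million calls per month are free
RATE_PER_100K = 10.0     # USD per 100,000 calls beyond the free tier

def api_charge(calls: int) -> float:
    """Charge only the calls beyond the monthly free tier, pro rata."""
    billable = max(0, calls - FREE_CALLS)
    return billable / 100_000 * RATE_PER_100K

print(api_charge(900_000))    # within the free tier -> 0.0
print(api_charge(1_250_000))  # 250,000 billable calls -> 25.0
```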

Universal catalog pricing examples

This section provides examples of how to calculate the universal catalog cost.

Small aspects

  • User A creates and applies small aspects (1024 bytes each). For $10 per month, the user can store 5 GiB of metadata, which corresponds approximately to 5M aspects. Assuming one aspect per table, this amounts to a total of 5M tables with aspects.

  • User B creates 5M aspects of 1 KB each on the 10th of the month and deletes them on the 20th. The metadata is stored for one-third of the month, so the cost is $3.33:

5 GiB * $2 per GiB per month
* 1/3 month
= $3.33
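User B's prorated charge works out as a simple product. This minimal sketch assumes $2 per GiB-month and treats the 10-day window as one-third of a 30-day month, as in the example; the variable names are my own.

```python
RATE_PER_GIB_MONTH = 2.0

gib_stored = 5.0              # 5M aspects of 1 KB each -> ~5 GiB
fraction_of_month = 10 / 30   # created on the 10th, deleted on the 20th

charge = gib_stored * RATE_PER_GIB_MONTH * fraction_of_month
print(round(charge, 2))  # -> 3.33
```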

Large aspects

  • User C creates and applies large aspects (10 KB each). For $10 per month, the user can store 5 GiB of metadata, which corresponds to approximately 500k aspects. Assuming one aspect per table, it amounts to a total of 500k tables with aspects.

  • User D creates 10 aspect types (for example, ETL, data governance, and data quality), and applies large aspects (10 KB each) using each of the 10 aspect types. For $10 per month, the user can store 5 GiB of metadata, which corresponds to approximately 500k aspects. Assuming 10 aspects per table, it amounts to a total of 50k tables with aspects.
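The capacity math behind both aspect examples is the same: divide the storage that a budget buys by the aspect size. The following is a hypothetical sketch assuming $2 per GiB-month and binary units (1 GiB = 1024³ bytes, 1 KB = 1024 bytes, as in the examples above); `aspects_for_budget` is an illustrative helper, not an API.

```python
GIB = 1024 ** 3
budget_gib = 10 / 2.0  # $10 per month at $2 per GiB-month buys 5 GiB

def aspects_for_budget(aspect_bytes: int) -> int:
    """How many aspects of a given size fit in the storage the budget buys."""
    return int(budget_gib * GIB // aspect_bytes)

print(aspects_for_budget(1024))       # 1 KB aspects  -> 5,242,880 (~5M)
print(aspects_for_budget(10 * 1024))  # 10 KB aspects -> 524,288 (~500k)
```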

What's next

Request a custom quote

With Google Cloud's pay-as-you-go pricing, you only pay for the services you use. Connect with our sales team to get a custom quote for your organization.
Contact sales