Dataplex Universal Catalog is a unified, intelligent governance solution for data and AI assets in Google Cloud. Through Dataplex Universal Catalog, you can use AI to simplify data queries, quality assurance, and business insights.
Dataplex Universal Catalog performs governance at scale. Take, for example, a global retail company that generates large amounts of sales, inventory, and customer data stored in Cloud Storage, Spanner, and Pub/Sub. With data distributed across these systems, managing governance, ensuring quality, and maintaining compliance become complex and time-consuming. Dataplex Universal Catalog simplifies this process by providing a central view to discover, profile, validate, track the lineage of, and control access to organizational data assets.
Why use Dataplex Universal Catalog?
Dataplex Universal Catalog governs data through the following features:
- Metadata cataloging. Retrieve metadata for Google Cloud resources (in BigQuery, Cloud SQL, Spanner, Vertex AI, Pub/Sub, Dataform, and Dataproc Metastore) and for third-party resources that you bring into Dataplex Universal Catalog, giving you a snapshot of your data assets.
- Data discovery. Scan for structured and unstructured data in Cloud Storage buckets to extract and catalog their metadata.
- Data insights. Use AI to generate natural-language questions about your data that help you uncover patterns, assess data quality, and perform statistical analyses.
- Data profiling. Identify common characteristics of the column data in your BigQuery tables, for example, typical data values, data distribution, and null counts, which can inform data classification and quality assurance.
- Data quality. Define and measure the quality of the data in your BigQuery tables, by validating data against organizational policies and logging alerts if data doesn't meet quality criteria.
- Business glossary. Manage business-related terminology and definitions across your organization, and attach terms to table columns to promote a consistent understanding of data usage.
- Data lineage. Track how data moves through your systems: where it comes from, where it is passed to, and what transformations are applied to it.
Dataplex Universal Catalog supports an end-to-end data lifecycle, from distributed discovery to business insights. Governance features are also available through BigQuery.
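To make the data profiling and data quality features more concrete, the following is a minimal conceptual sketch in plain Python. It does not call any Dataplex Universal Catalog API; the function names `profile_column` and `check_max_null_ratio`, the sample column, and the 10% threshold are all hypothetical and serve only to illustrate the kind of statistics a profile might contain (null counts, distinct values, typical values) and how a quality rule could be evaluated against them.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: row count, null count, distinct count, top values.

    Illustrative only; this is not how Dataplex Universal Catalog computes profiles.
    """
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(counts),
        "top_values": counts.most_common(3),
    }


def check_max_null_ratio(profile, max_ratio):
    """Hypothetical quality rule: pass when nulls make up at most max_ratio of rows."""
    if profile["row_count"] == 0:
        return True  # an empty column has no nulls to violate the rule
    return profile["null_count"] / profile["row_count"] <= max_ratio


# Made-up sample data standing in for a table column.
country = ["US", "DE", None, "US", "FR", "US", None, "DE"]

profile = profile_column(country)
passed = check_max_null_ratio(profile, max_ratio=0.1)

print(profile)
print("quality check passed:", passed)
```

In this sketch the profile feeds the quality rule, which mirrors how profiling results can inform the quality criteria you choose to enforce.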
What's next
- Learn about metadata management in Dataplex Universal Catalog.
- Learn how to search for data assets in Dataplex Universal Catalog.
- Learn how to manage entries and ingest custom sources.
- Learn how to import metadata into Dataplex Universal Catalog.
- Learn about BigQuery governance.