Data Mesh concepts

This page explains how the relevant Data Mesh concepts are implemented within Google Cloud Cortex Framework, as a foundation for the detailed guide on deploying Data Mesh. For a deeper understanding of general Data Mesh concepts, see Build a modern, distributed Data Mesh with Google Cloud.

| Term | Google Cloud product | Description | Cortex Application context |
|---|---|---|---|
| Lake | Dataplex - Manage Lakes | Top-level unit for organizing data within a Data Mesh. | A data source, for example, SAP ECC, Salesforce, or Google Ads. |
| Zone | Dataplex | Second-level unit for organizing data within a Lake. | Specific processing layers within a data source, like raw versus CDC. |
| Dataplex Asset | Dataplex | Reference to data that is stored in Cloud Storage or BigQuery and is associated with a zone. This is a reference to the data asset and not the data itself. | Reference to BigQuery datasets registered in zones. |
| Label | Dataplex | Arbitrary key-value pairs that can be applied to lakes or zones. | Label entire lakes or zones (rather than tables or columns) with metadata that can be viewed in Dataplex or used for custom applications. |
| Data Catalog | Dataplex | Technical and business metadata that can be used to help discover, understand, or manage data assets within a warehouse. | Annotate tables or columns (rather than lakes or zones) with rich metadata tags that can be used in Dataplex search or custom applications. |
| Catalog Tag Templates | Dataplex - Tag Templates | A template defining the available fields and their types in a tag. | Define a set of templates for uses like tagging data assets with lines of business. |
| Catalog Tag | Dataplex | A set of fields and their values that contain metadata applicable to a table or column. An instance of a tag template. | Annotate a table or column with metadata values relevant to that asset, such as a particular line of business. |
| Catalog Glossary | Dataplex - Glossaries | A dictionary of terms that can be defined and associated with BigQuery columns. | Define terms or acronyms used in BigQuery Assets. Note that this is planned for the future and is not currently supported. |
| Policy Taxonomy | BigQuery - Policy Tags | A hierarchy of policy tags. | Organize related policy tags that can be used for access control into a hierarchy with inherited permissions. |
| Policy Tag | BigQuery | A tag that is applied to specific columns within a BigQuery table or view. Policy tags at any level in the hierarchy can be applied. Only one policy tag can be applied to a particular column. | Annotate columns with tags that are used for column-level access control. Principals on the policy tag define the 'Fine-Grained' or 'Unmasked' Readers who can see the raw column data. |
| Data Policy | BigQuery | Policies applied to a Policy Tag that define who can view the masked column data and how it is masked. | Principals on the Data Policy define the 'Masked readers' who can see the masked column data. Anyone who has neither masked nor unmasked reader privileges can't query the column. |
| Masking Rule | BigQuery | Rules applied to a Data Policy that define how the data is masked, for example, hashing, showing a default value, or showing the last four characters. | Applied situationally to sensitive columns. |
| Row Access Policy | BigQuery | SQL statements that define which groups can query rows within tables based on specific column values. | Used for row-level access control when asset-level and column-level control is insufficient. |
| Metadata Resource | Cortex Data Mesh term | Any of the metadata objects listed in this table that are used in the Data Mesh. This is specifically the metadata and not the data in BigQuery itself. | Together, these resources make up the Cortex Data Mesh. |
| BigQuery Asset | Cortex Data Mesh term | BigQuery table or view. | Existing Cortex BigQuery objects that we would like to govern with the Data Mesh. |
| BigQuery Asset Annotation | Cortex Data Mesh term | Metadata resources applied to a specific BigQuery table or view. | Associate metadata with BigQuery Assets to enable discovery and access control. |
| Resource Specification (spec) | Cortex Data Mesh term | A YAML file defining a Metadata Resource or BigQuery Asset Annotation. | The full set of resource specs codifies the Data Mesh configuration to be deployed. |
| Data Lineage | Dataplex | A graph representing BigQuery Asset dependencies. | Not defined by the Cortex Data Mesh; however, it is a relevant Dataplex tool that helps users discover BigQuery Asset data sources. |
| Lineage Event | Dataplex | A point in time when an operation occurred to move data between BigQuery Assets. Contains a list of Links. | Automatically created for supported BigQuery and Composer operations. |
| Lineage Link | Dataplex | An edge representing data flowing from a source to a target asset as part of a Lineage Event. | It can be analyzed to support use cases beyond the lineage visualization graphs that are presented in the console. |
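
The relationship between Policy Tags, Data Policies, and Masking Rules can be easier to follow as a decision procedure. The following is a minimal Python sketch of that logic only; it does not call BigQuery or reflect its API. The class names, the example principals, and the three masking functions are hypothetical and simplified (for instance, it assumes one Data Policy per Policy Tag), and are meant only to mirror the behavior described in the table: unmasked readers see raw data, masked readers see masked data, and everyone else cannot query the column.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Optional, Set


# Illustrative masking rules modeled on the examples named in the table.
# Simplified stand-ins, not BigQuery's actual masking functions.
def mask_hash(value: str) -> str:
    """Replace the value with its SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def mask_default(value: str) -> str:
    """Replace the value with a fixed default (an empty string here)."""
    return ""


def mask_last_four(value: str) -> str:
    """Keep the last four characters and hide the rest with 'X'."""
    return "X" * max(len(value) - 4, 0) + value[-4:]


@dataclass
class DataPolicy:
    """Hypothetical model: a masking rule plus its masked readers."""
    masking_rule: Callable[[str], str]
    masked_readers: Set[str] = field(default_factory=set)


@dataclass
class PolicyTag:
    """Hypothetical model: a column's policy tag and its unmasked readers."""
    name: str
    unmasked_readers: Set[str] = field(default_factory=set)
    data_policy: Optional[DataPolicy] = None  # assumption: at most one


def read_column(principal: str, value: str, tag: Optional[PolicyTag]) -> str:
    """Return what `principal` sees when querying one column value."""
    if tag is None:
        return value  # untagged column: no column-level restriction
    if principal in tag.unmasked_readers:
        return value  # fine-grained (unmasked) reader sees raw data
    policy = tag.data_policy
    if policy is not None and principal in policy.masked_readers:
        return policy.masking_rule(value)  # masked reader sees masked data
    raise PermissionError(f"{principal} cannot query columns tagged {tag.name!r}")


if __name__ == "__main__":
    tag = PolicyTag(
        name="card_number",
        unmasked_readers={"auditor@example.com"},
        data_policy=DataPolicy(mask_last_four, {"analyst@example.com"}),
    )
    card = "4111111111111111"
    print(read_column("auditor@example.com", card, tag))
    print(read_column("analyst@example.com", card, tag))
```

In this sketch the unmasked readers live on the tag and the masked readers live on the data policy, which mirrors the split between the Policy Tag and Data Policy rows above. A principal listed in neither set triggers the `PermissionError` branch, corresponding to a column that cannot be queried.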

Once you are familiar with these Data Mesh concepts, see the Data Mesh User Guide for Cortex Data Foundation.