Data lineage considerations
This document provides information on data lineage compliance and limitations.
Data lineage is enabled on a per-project basis, not a
per-system basis.
This means that after you enable the Data Lineage API, lineage information
can be automatically reported for multiple systems in the project, depending on
each system's product-level lineage control.
Automatic lineage tracking is supported for the following systems:
Product-level lineage controls in Google Cloud supported systems

| System | Available lineage controls |
| --- | --- |
| BigQuery, Cloud Data Fusion | When the Data Lineage API is enabled in a project, you can't restrict lineage tracking to only BigQuery or only Cloud Data Fusion. |
| Cloud Composer | Cloud Composer uses environment-level data lineage integration control. Data lineage is automatically enabled for all new Cloud Composer environments, provided they meet the requirements. For existing environments, you can enable or disable data lineage integration in environment settings. See Data lineage with Dataplex Universal Catalog for more information. |
| Dataflow | Dataflow jobs can capture lineage events and publish them to the Data Lineage API. See Use data lineage in Dataflow for more information. |
| Dataproc | Dataproc Spark jobs can capture lineage events and publish them to the Data Lineage API. See Data lineage Dataproc integration for more information. |
| Vertex AI | Data lineage is automatically enabled for Vertex AI artifacts and parameters, such as models, datasets, pipeline templates, and components. The lineage of a pipeline includes factors that contributed to its creation, as well as artifacts and metadata derived afterwards. See Track the lineage of pipeline artifacts for more information. |
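
The controls in the table above can be read as a small decision model. The following Python sketch is illustrative only: it calls no Google Cloud API, and the names `CONTROL`, `reports_lineage`, and `systems_reporting` are hypothetical. It assumes that Dataflow and Dataproc need a job-level opt-in (the table only says their jobs "can" capture lineage) and that Vertex AI has no switch beyond the project's API setting.

```python
# Illustrative model only: none of these names come from a Google Cloud API.
PROJECT_ONLY = "project-only"  # no switch beyond the project's Data Lineage API
OPT_IN = "system-opt-in"       # an extra environment- or job-level setting applies

CONTROL = {
    "BigQuery": PROJECT_ONLY,
    "Cloud Data Fusion": PROJECT_ONLY,
    "Cloud Composer": OPT_IN,   # environment-level integration setting
    "Dataflow": OPT_IN,         # assumption: treated as a per-job opt-in
    "Dataproc": OPT_IN,         # assumption: treated as a per-job opt-in
    "Vertex AI": PROJECT_ONLY,  # assumption: described as automatically enabled
}


def reports_lineage(system, api_enabled, system_setting_on=False):
    """Return True if this model expects the system to report lineage."""
    if not api_enabled:
        # Nothing is reported unless the Data Lineage API is enabled in the project.
        return False
    if CONTROL[system] == PROJECT_ONLY:
        # These systems cannot be excluded individually.
        return True
    return system_setting_on


def systems_reporting(api_enabled, settings):
    """List the systems expected to report lineage, in alphabetical order.

    Systems missing from `settings` are treated as switched off in this sketch.
    """
    return sorted(
        name
        for name in CONTROL
        if reports_lineage(name, api_enabled, settings.get(name, False))
    )
```

For example, `systems_reporting(True, {"Cloud Composer": False, "Dataflow": True})` keeps BigQuery and Cloud Data Fusion in the result regardless of the settings passed in, which mirrors the first table row: enabling the API for the project is the only control those two systems have.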
Billing impact
Before you enable the Data Lineage API on a project, review
the impact on your billing charges. Because the API is enabled
on a per-project basis, lineage can be reported for multiple systems in the
project (see the previous section for details).
For more information about how data lineage is charged, see
Dataplex Universal Catalog pricing.
For BigQuery Omni, lineage processing
is distributed to specific regions, and costs depend on the regions where
the processing is performed.
Important: See Supported systems (/dataplex/docs/about-data-lineage#lineage-supported-systems) for details on the support status of the systems covered in this document. When a new system becomes available, depending on the level of that system's lineage control, the Data Lineage API can automatically start harvesting lineage data.
Data lineage compliance
- Data lineage records metadata about data movement but doesn't capture the data itself. See the data lineage information model (/dataplex/docs/about-data-lineage#information-model) and the Data Lineage API reference (/dataplex/docs/reference/data-lineage/rest) for details on what fields are included in the metadata.
- Data lineage as part of Dataplex Universal Catalog offers VPC-SC support.
- Dataplex Universal Catalog doesn't offer the ability to use Customer Managed Encryption Keys to protect the harvested lineage metadata.
Data lineage limitations
When you select a node in the lineage graph, the node details side panel
is empty when:
1. the resource is located in another organization, or
2. the user is not a member of the organization hosting the resource.
Last updated 2025-08-29 UTC.