# Integrate with OpenLineage

Last updated (UTC): 2025-08-19.

[OpenLineage](https://openlineage.io/) is an open platform
for collecting and analyzing data lineage information. Using an open standard
for lineage data, OpenLineage captures lineage events from data pipeline
components that use an OpenLineage API to report on runs, jobs, and datasets.

Through the Data Lineage API, you can import OpenLineage events to display
in the Dataplex Universal Catalog web interface alongside lineage information from
Google Cloud services, such as BigQuery, Cloud Composer,
Cloud Data Fusion, and Dataproc.

To import OpenLineage events that use the
[OpenLineage specification](https://github.com/OpenLineage/OpenLineage/blob/main/spec/OpenLineage.json),
use the [`ProcessOpenLineageRunEvent`](/dataplex/docs/reference/data-lineage/rest/v1/projects.locations/processOpenLineageRunEvent)
REST API method, and map OpenLineage facets to Data Lineage API attributes.

Limitations
-----------

- The Data Lineage API supports OpenLineage major versions 1 and 2.

- The Data Lineage API doesn't support the following:

  - Any subsequent OpenLineage release with message format changes
  - `DatasetEvent`
  - `JobEvent`

- The maximum size of a single message is 5 MB.

- The length of each [Fully Qualified Name](/dataplex/docs/fully-qualified-names)
  in inputs and outputs is limited to 4000 characters.

- [Links](/dataplex/docs/reference/data-lineage/rest/v1/projects.locations.processes.runs.lineageEvents#EventLink)
  are grouped by events with 100 links. The maximum aggregate number of links
  is 1000.

- Dataplex Universal Catalog displays a lineage graph for each job run, showing the inputs
  and outputs of lineage events. It doesn't support lower-level processes like
  Spark stages.

OpenLineage mapping
-------------------

The REST API method [`ProcessOpenLineageRunEvent`](/dataplex/docs/reference/data-lineage/rest/v1/projects.locations/processOpenLineageRunEvent)
maps OpenLineage attributes to Data Lineage API attributes as follows:

Import an OpenLineage event
---------------------------

If you haven't yet set up OpenLineage, see
[Getting started](https://openlineage.io/getting-started/).

To import an OpenLineage event into Dataplex Universal Catalog, call the REST API method
[`ProcessOpenLineageRunEvent`](/dataplex/docs/reference/data-lineage/rest/v1/projects.locations/processOpenLineageRunEvent):

    # Assumes the gcloud CLI is installed and authenticated.
    # Replace {project} and {location} with your own values.
    curl -X POST "https://datalineage.googleapis.com/v1/projects/{project}/locations/{location}:processOpenLineageRunEvent" \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      --data '{"eventTime":"2023-04-04T13:21:16.098Z","eventType":"COMPLETE","inputs":[{"name":"somename","namespace":"somenamespace"}],"job":{"name":"somename","namespace":"somenamespace"},"outputs":[{"name":"somename","namespace":"somenamespace"}],"producer":"someproducer","run":{"runId":"somerunid"},"schemaURL":"https://openlineage.io/spec/1-0-5/OpenLineage.json#/$defs/RunEvent"}'

Analyze information from OpenLineage
------------------------------------

To analyze the imported OpenLineage events, see
[View lineage graphs in Dataplex Universal Catalog UI](/dataplex/docs/use-lineage#view-lineage).

Stored data
-----------

The Data Lineage API doesn't store all facet data from OpenLineage messages.
It stores the following facet fields:

- `spark_version`
  - `openlineage-spark-version`
  - `spark-version`
- all `spark.logicalPlan.*`
- `environment-properties` (custom Google Cloud lineage facet)
  - `origin.sourcetype` and `origin.name`
  - `spark.app.id`
  - `spark.app.name`
  - `spark.batch.id`
  - `spark.batch.uuid`
  - `spark.cluster.name`
  - `spark.cluster.region`
  - `spark.job.id`
  - `spark.job.uuid`
  - `spark.project.id`
  - `spark.query.node.name`
  - `spark.session.id`
  - `spark.session.uuid`

In addition to these facet fields, the Data Lineage API stores the following core information from each message:

- `eventTime`
- `run.runId`
- `job.namespace`
- `job.name`
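
Example: building the import request in Python
----------------------------------------------

The import call to `ProcessOpenLineageRunEvent` described on this page can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names `build_run_event` and `build_request` are hypothetical, the access token is assumed to come from your own authentication setup, and reading the 5 MB message limit as 5,000,000 bytes is a conservative assumption. The sketch only constructs the request; it doesn't send it.

```python
import json
import uuid
from datetime import datetime, timezone
from urllib.request import Request

# Endpoint template taken from the import example on this page.
ENDPOINT = (
    "https://datalineage.googleapis.com/v1/projects/{project}"
    "/locations/{location}:processOpenLineageRunEvent"
)

# The page states a 5 MB limit per message; interpreting that as
# 5,000,000 bytes is an assumption chosen to err on the safe side.
MAX_MESSAGE_BYTES = 5_000_000


def build_run_event(job_namespace, job_name, inputs, outputs,
                    event_type="COMPLETE", producer="someproducer",
                    run_id=None):
    """Return a RunEvent dict shaped like the sample payload on this page.

    `inputs` and `outputs` are lists of (namespace, name) tuples.
    """
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "eventTime": now.replace("+00:00", "Z"),
        "eventType": event_type,
        "inputs": [{"namespace": ns, "name": n} for ns, n in inputs],
        "job": {"namespace": job_namespace, "name": job_name},
        "outputs": [{"namespace": ns, "name": n} for ns, n in outputs],
        "producer": producer,
        # A random UUID is generated when the caller gives no run ID.
        "run": {"runId": run_id or str(uuid.uuid4())},
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/$defs/RunEvent",
    }


def build_request(event, project, location, access_token):
    """Serialize the event, check the size limit, and build a POST request."""
    body = json.dumps(event).encode("utf-8")
    if len(body) > MAX_MESSAGE_BYTES:
        raise ValueError(
            f"event is {len(body)} bytes; the limit is {MAX_MESSAGE_BYTES}"
        )
    return Request(
        ENDPOINT.format(project=project, location=location),
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the event, pass the result of `build_request(...)` to `urllib.request.urlopen`. Checking the payload size before sending lets an oversized event be rejected locally rather than spending a network round trip on it.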