View advanced pipeline logs in Cloud Logging

This page describes how to enable Cloud Logging for your Cloud Data Fusion Dataproc clusters and access advanced pipeline logs.

Enable Dataproc Cloud Logging

To view pipeline and cluster issues in Cloud Logging, enable advanced logs in new or existing Cloud Data Fusion instances. To enable advanced logs in an existing instance, do the following:

  1. In the Google Cloud console, go to the Cloud Data Fusion Instances page.

    Go to Instances

  2. Click the instance name.

  3. In the Advanced monitoring and logging section, for Dataproc Cloud Logging, click Edit.

  4. In the Cloud Logging window, select the Enable Cloud Logging checkbox.

  5. Click Save.

View logs

Every Cloud Data Fusion pipeline run is assigned a unique RunID. After you deploy and run your pipeline, find its RunID. Then, in Logging, use the RunID to view your pipeline logs.

Get the pipeline's RunID

  1. Go to your instance:
    1. In the Google Cloud console, go to the Cloud Data Fusion page.

    2. To open the instance in the Cloud Data Fusion Studio, click Instances, and then click View instance.

      Go to Instances

  2. Click List.
  3. Click the pipeline for which you want to get the RunID.
  4. Click Summary.
  5. In the Run history section, click Table.
  6. To copy the RunID, right-click it, and then click Copy.

View the logs in Logs Explorer

  1. In the Google Cloud console, go to the Cloud Logging > Logs Explorer page:

    Go to Logs Explorer

  2. In the All resources drop-down, select Cloud Dataproc Cluster > cdap-PIPELINE_NAME-YOUR_RUNID.
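The resource selection in step 2 can also be written as a text query in the Logs Explorer query editor. The following is a minimal sketch, not an official recipe. It assumes that the cluster name follows the cdap-PIPELINE_NAME-YOUR_RUNID pattern literally, and that Dataproc clusters appear under the `cloud_dataproc_cluster` monitored resource type with a `cluster_name` label. The function name and the sample values are hypothetical.

```python
def dataproc_cluster_query(pipeline_name: str, run_id: str) -> str:
    """Build a Logs Explorer query for the Dataproc cluster of one pipeline run."""
    # Assumption: the cluster name is exactly cdap-PIPELINE_NAME-YOUR_RUNID.
    # Check the name shown in the resource drop-down before relying on this.
    cluster_name = f"cdap-{pipeline_name}-{run_id}"
    # Separate lines in a Logging query are combined with an implicit AND.
    return (
        'resource.type="cloud_dataproc_cluster"\n'
        f'resource.labels.cluster_name="{cluster_name}"'
    )


if __name__ == "__main__":
    # Hypothetical placeholder values, for illustration only.
    print(dataproc_cluster_query("PIPELINE_NAME", "YOUR_RUNID"))
```

Paste the printed query into the query editor and run it. If no entries appear, compare the generated cluster name with the one listed in the resource drop-down.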

Optional: Filter the logs

Use the filter menus to narrow the results. You can filter by log severity level or by component, such as datafusion-pipeline-logs.
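The same filters can be expressed as query text instead of menu selections. The sketch below appends a severity threshold and a log-name restriction to an existing query. It assumes that datafusion-pipeline-logs is a log ID that the `log_id()` query function can match, which this page does not confirm. The function name and default values are hypothetical.

```python
def add_log_filters(
    base_query: str,
    min_severity: str = "ERROR",
    log_id: str = "datafusion-pipeline-logs",
) -> str:
    """Append severity and log-name clauses to a Logs Explorer query."""
    clauses = [
        base_query,
        # Keep entries at or above the given severity level.
        f"severity>={min_severity}",
        # Assumption: the component name is usable as a log ID.
        f'log_id("{log_id}")',
    ]
    # Drop an empty base query; each remaining line is ANDed with the others.
    return "\n".join(clause for clause in clauses if clause)


if __name__ == "__main__":
    print(add_log_filters('resource.type="cloud_dataproc_cluster"', "WARNING"))
```

Lowering `min_severity` to WARNING or INFO widens the result set, which can help when an error has no obvious cause at the ERROR level.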

Optional: Download the logs

Click Download logs.

For more information, see downloading log entries.

What's next