Troubleshoot Cloud Data Fusion

This page shows you how to resolve issues with Cloud Data Fusion.

Retrieve error information for a failed pipeline run

When a pipeline run fails, you can retrieve detailed error information. Cloud Data Fusion version 6.11.0 classifies each pipeline error by category, reason, and message. This classification speeds up troubleshooting and reduces the need to examine complex logs.

To get error details, follow these steps:

Console

  1. In the Google Cloud console, open your Cloud Data Fusion instance and go to your pipeline on the Studio page.

  2. On the node where the error occurred, click View errors.

  3. Review the error details, which include the error category, error reason, and error message.

  4. Optional: To download raw logs for further analysis, click Download raw logs.

  5. Optional: To view raw logs, click View logs.

REST

Send a POST request to the following endpoint:

 curl -X POST \
   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
   -H "Content-Type: application/json" \
   "${CDAP_ENDPOINT}/v3/namespaces/NAMESPACE_ID/apps/PIPELINE_NAME/workflows/DataPipelineWorkflow/runs/RUN_ID/classify"

Replace the following:

  • NAMESPACE_ID: the ID of the namespace
  • PIPELINE_NAME: the name of the pipeline
  • RUN_ID: the run ID of the pipeline
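The same request can be sent from a script. The following is a minimal Python sketch, not an official client: the function names `build_classify_url` and `classify_run` are hypothetical, the `endpoint` argument is assumed to hold the same value as `CDAP_ENDPOINT` in the curl command, and `token` is assumed to be an access token such as the output of `gcloud auth print-access-token`.

```python
import json
import urllib.parse
import urllib.request


def build_classify_url(endpoint, namespace_id, pipeline_name, run_id):
    """Build the URL of the error classification endpoint for one pipeline run."""
    # Percent-encode each path segment so unusual characters cannot alter the path.
    namespace, pipeline, run = (
        urllib.parse.quote(part, safe="")
        for part in (namespace_id, pipeline_name, run_id)
    )
    return (
        f"{endpoint.rstrip('/')}/v3/namespaces/{namespace}/apps/{pipeline}"
        f"/workflows/DataPipelineWorkflow/runs/{run}/classify"
    )


def classify_run(endpoint, token, namespace_id, pipeline_name, run_id):
    """Send the POST request and return the decoded JSON response."""
    request = urllib.request.Request(
        build_classify_url(endpoint, namespace_id, pipeline_name, run_id),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping URL construction in its own function makes it possible to check the path without sending a request.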

The following is a sample response for a plugin error:

 [
    {
       "stageName": "Stage Name",
       "errorCategory": "Plugin-x",
       "errorReason": "Input path gs://x does not exist",
       "errorMessage": "Input path gs://x does not exist",
       "errorType": "SYSTEM/USER/UNKNOWN",
       "dependency": "true/false"
    }
 ]

You can also [view advanced logs for your pipelines](/data-fusion/docs/how-to/viewing-stackdriver-logs).
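The classification response shown in the sample is a JSON array with one object per error, so it can be summarized with a few lines of code. The following Python sketch groups error reasons by `errorType`. The function name `summarize_errors` is hypothetical, and the concrete values `"USER"` and `"false"` are assumptions chosen for illustration, because the sample only lists the possible values.

```python
import json
from collections import defaultdict

# Sample payload modeled on the documented response. The errorType and
# dependency values are illustrative assumptions, not real output.
SAMPLE_RESPONSE = """
[
  {
    "stageName": "Stage Name",
    "errorCategory": "Plugin-x",
    "errorReason": "Input path gs://x does not exist",
    "errorMessage": "Input path gs://x does not exist",
    "errorType": "USER",
    "dependency": "false"
  }
]
"""


def summarize_errors(entries):
    """Group one-line error descriptions by error type."""
    summary = defaultdict(list)
    for entry in entries:
        # Fall back to UNKNOWN when an entry has no errorType field.
        error_type = entry.get("errorType", "UNKNOWN")
        stage = entry.get("stageName", "?")
        category = entry.get("errorCategory", "?")
        reason = entry.get("errorReason", "")
        summary[error_type].append(f"{stage} [{category}]: {reason}")
    return dict(summary)


if __name__ == "__main__":
    for error_type, lines in summarize_errors(json.loads(SAMPLE_RESPONSE)).items():
        print(error_type)
        for line in lines:
            print("  " + line)
```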

Resolve issues with creating a Cloud Data Fusion instance

When you create a Cloud Data Fusion instance, you might see the following error:

Read access to project PROJECT_ID was denied.

To resolve this issue, disable and re-enable the Cloud Data Fusion API, and then create the instance.
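The disable and re-enable step can also be scripted. The following Python sketch assumes that the gcloud CLI is installed and authenticated, and that the Cloud Data Fusion API is identified by the service name `datafusion.googleapis.com`. The helper names are hypothetical. By default the sketch only prints the commands, so that nothing is disabled by accident.

```python
import subprocess

# Assumed service name of the Cloud Data Fusion API.
DATA_FUSION_API = "datafusion.googleapis.com"


def reenable_api_commands(project_id):
    """Return the gcloud commands that disable and then enable the API."""
    return [
        ["gcloud", "services", "disable", DATA_FUSION_API, f"--project={project_id}"],
        ["gcloud", "services", "enable", DATA_FUSION_API, f"--project={project_id}"],
    ]


def reenable_api(project_id, dry_run=True):
    """Print the commands, or run them in order when dry_run is False."""
    for command in reenable_api_commands(project_id):
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(command, check=True)
```

After both commands succeed, create the instance as described above.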