Create a pipeline monitoring dashboard using Cloud Monitoring
Learn how to use Cloud Monitoring to create a dashboard to monitor pipelines.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the Cloud Data Fusion, BigQuery, Cloud Storage, and Dataproc APIs.
To create custom dashboards, you must be granted the Monitoring Editor (roles/monitoring.editor) IAM role on the service account. For more information about granting roles, see Manage access.
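As a rough illustration of what granting this role means, the following Python sketch adds a Monitoring Editor binding to an IAM policy represented as a plain dictionary. It assumes the standard IAM policy shape (a `bindings` list of `role` and `members` entries). The helper name `add_binding` and the service account address are hypothetical, and the sketch doesn't call any Google Cloud API.

```python
import copy

MONITORING_EDITOR = "roles/monitoring.editor"


def add_binding(policy, role, member):
    """Return a copy of an IAM policy dict with `member` granted `role`."""
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        # Reuse an existing unconditional binding for the same role.
        if binding.get("role") == role and "condition" not in binding:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            return updated
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical service account, used only for illustration.
member = "serviceAccount:dashboard-admin@example-project.iam.gserviceaccount.com"
policy = add_binding({"bindings": []}, MONITORING_EDITOR, member)
print(policy)
```

In practice, you would grant the role through the console or your usual IAM tooling, as described in Manage access.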
Create a Cloud Data Fusion instance with Cloud Logging enabled
To use Cloud Logging with your Cloud Data Fusion pipeline, create a Cloud Data Fusion instance with Cloud Logging enabled:
Go to the Cloud Data Fusion Instances page and click Create instance.
In the Instance name field, enter a name for your new instance.
From the Region drop-down, select the Google Cloud region in which you want to create the instance.
From the Version drop-down, select a Cloud Data Fusion version.
Select an Edition.
Expand Advanced options.
In the Logging and monitoring section, select Enable Stackdriver logging service.
Click Create.
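If you prefer to script instance creation, the console steps above correspond roughly to a create request against the Cloud Data Fusion REST API. The following Python sketch only builds the request URL and body and doesn't send anything. It assumes the v1 `instances` endpoint and the `type` and `enableStackdriverLogging` fields of the Instance resource; verify both against the API reference before relying on them. The project, region, and instance names are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_create_instance_request(project_id, region, instance_id, edition="BASIC"):
    """Build the URL and JSON body for a create-instance request (not sent)."""
    parent = f"projects/{project_id}/locations/{region}"
    query = urlencode({"instanceId": instance_id})
    url = f"https://datafusion.googleapis.com/v1/{parent}/instances?{query}"
    body = {
        "type": edition,  # assumed to map to the Edition choice in the console
        "enableStackdriverLogging": True,  # assumed field behind the logging checkbox
    }
    return url, body


# Hypothetical values, used only for illustration.
url, body = build_create_instance_request("example-project", "us-central1", "my-instance")
print(url)
print(json.dumps(body, indent=2))
```

The version is left out of the body so that the service chooses its default. Sending the request also requires an authenticated HTTP client, which this sketch omits.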
Create a log-based metric
Go to the Cloud Logging Log-based metrics page.
Click Create metric.
On the Create a metric page, do the following:
- For Metric type, select Counter.
- In the Log-based metric name field, enter pipeline_logs.
- In the Units field, enter 1.
- In the Build filter field, enter the following:
resource.type="cloud_dataproc_cluster" log_name=~"projects/.*/logs/datafusion-pipeline-logs"
In the Labels section, click Add label and create the following labels. After entering each label, click Done, and click Add label again to create the next label.
| Label name | Label type | Field name |
| --- | --- | --- |
| Project | STRING | resource.labels.project_id |
| Message | STRING | jsonPayload.message |
| LoggerName | STRING | labels.loggerName |
| ClusterName | STRING | resource.labels.cluster_name |
| SparkPhase | STRING | labels.".workflowSparkId" |
| Region | STRING | resource.labels.region |
| Pipeline | STRING | labels.".applicationId" |
| RunId | STRING | labels.".runId" |
| Namespace | STRING | labels.".namespaceId" |
| LogLevel | STRING | labels.levelName |
Click Create metric.
The newly created metric appears in the user-defined metrics table. If the metric isn't immediately visible, refresh the page.
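The metric defined in the preceding steps can also be written down as a JSON definition, which is convenient for review or version control. The following Python sketch assembles one from the filter and labels on this page. It assumes the LogMetric resource shape used by the Cloud Logging API (`filter`, `metricDescriptor`, and `labelExtractors` with `EXTRACT(...)` expressions) and that a counter metric is a DELTA metric with INT64 values. The helper name `build_log_metric` is hypothetical, and nothing is sent to any API.

```python
import json

# Filter copied from the Build filter step.
FILTER = (
    'resource.type="cloud_dataproc_cluster" '
    'log_name=~"projects/.*/logs/datafusion-pipeline-logs"'
)

# Label name -> log entry field, copied from the labels step.
LABEL_FIELDS = {
    "Project": "resource.labels.project_id",
    "Message": "jsonPayload.message",
    "LoggerName": "labels.loggerName",
    "ClusterName": "resource.labels.cluster_name",
    "SparkPhase": 'labels.".workflowSparkId"',
    "Region": "resource.labels.region",
    "Pipeline": 'labels.".applicationId"',
    "RunId": 'labels.".runId"',
    "Namespace": 'labels.".namespaceId"',
    "LogLevel": "labels.levelName",
}


def build_log_metric(name, log_filter, label_fields):
    """Assemble a counter-style log-based metric definition as a dict."""
    return {
        "name": name,
        "filter": log_filter,
        "metricDescriptor": {
            "metricKind": "DELTA",  # assumed kind for counter metrics
            "valueType": "INT64",
            "unit": "1",
            "labels": [
                {"key": key, "valueType": "STRING"} for key in label_fields
            ],
        },
        # Each label value is extracted from the matching log entry field.
        "labelExtractors": {
            key: f"EXTRACT({field})" for key, field in label_fields.items()
        },
    }


metric = build_log_metric("pipeline_logs", FILTER, LABEL_FIELDS)
print(json.dumps(metric, indent=2))
```

Keeping the label mapping in one dictionary makes it easier to spot a mistyped field name than re-entering ten labels by hand.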
The dashboard that you install in the following section contains these charts:
- All pipelines
- Completed pipelines
- Failed pipelines
- All pipeline runs
- Completed pipeline runs
- Failed pipeline runs
- Dataproc clusters for runs
After a metric is created, it might take up to 24 hours to start displaying the time series data.
Install the dashboard
Download the JSON file to your local machine.
Go to the Cloud Monitoring Dashboards page.
Click Create dashboard.
Click Dashboard settings > JSON > JSON editor.
In a text editor, open the JSON file that you downloaded.
Copy the content of the downloaded JSON file and paste it into the JSON editor, replacing the content that the JSON editor contains by default.
Click Apply changes.
This refreshes the dashboard. Cloud Data Fusion pipelines that ran after the metric was created appear in the dashboard. If no pipelines ran after the metric was created, the dashboard is empty.
Autosave is enabled by default. If autosave is disabled, click Save to save the dashboard.
Click Close editor.
Your new dashboard appears in the list of dashboards on the Monitoring overview page.
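Before you paste the downloaded file into the JSON editor, it can help to confirm that the file is well-formed. The following Python sketch is one way to do that. It assumes the general shape of a Cloud Monitoring dashboard definition: a JSON object with a `displayName` and exactly one layout field (`gridLayout`, `mosaicLayout`, `rowLayout`, or `columnLayout`). The contents of the downloaded file aren't shown on this page, so the sample text and the helper name `check_dashboard_json` are hypothetical.

```python
import json

LAYOUT_KEYS = ("gridLayout", "mosaicLayout", "rowLayout", "columnLayout")


def check_dashboard_json(text):
    """Return a list of problems found in dashboard JSON text (empty if none)."""
    try:
        dashboard = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(dashboard, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    if not dashboard.get("displayName"):
        problems.append("missing displayName")
    layouts = [key for key in LAYOUT_KEYS if key in dashboard]
    if len(layouts) != 1:
        problems.append(f"expected exactly one layout field, found {len(layouts)}")
    return problems


# Hypothetical minimal dashboard, used only for illustration.
sample = '{"displayName": "Pipeline monitoring", "mosaicLayout": {"tiles": []}}'
print(check_dashboard_json(sample))
```

To check the real file, read it with `open(path, encoding="utf-8").read()` and pass the text to the function. A non-empty result suggests that the download was truncated or edited.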
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
Delete the Cloud Data Fusion instance
Follow these instructions to delete your Cloud Data Fusion instance.
Delete the project
The easiest way to eliminate billing is to delete the project that you created for the tutorial.
To delete the project:
- In the Google Cloud console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
What's next
- Learn more about Cloud Monitoring.