This guide explains how to monitor Cloud Data Fusion instances and pipelines. You can view key metrics in the Cloud Data Fusion monitoring dashboard. For additional aggregation or filtering, you can view the metrics in Cloud Monitoring.
Before you begin
Before you start monitoring Cloud Data Fusion instances and pipelines, review the pricing and grant the required roles.
Pricing
Cloud Monitoring usage incurs charges. For more information, see Google Cloud Observability pricing.
Required roles
To get the permissions that you need to view metrics, ask your administrator to grant you the Monitoring Viewer (roles/monitoring.viewer) IAM role on your project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
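As a rough illustration of what granting this role amounts to, the sketch below adds a Monitoring Viewer binding to an IAM policy held as a plain Python dictionary in the `bindings` / `role` / `members` shape that IAM policies use. The function name `add_viewer_binding` and the member `user:alice@example.com` are hypothetical. In practice, an administrator would apply the change through the console, the gcloud CLI, or the Resource Manager API rather than by editing a dictionary.

```python
import copy

MONITORING_VIEWER = "roles/monitoring.viewer"


def add_viewer_binding(policy: dict, member: str) -> dict:
    """Return a copy of `policy` in which `member` holds Monitoring Viewer."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        # Reuse an existing unconditional binding for the role if there is one.
        if binding.get("role") == MONITORING_VIEWER and "condition" not in binding:
            if member not in binding.setdefault("members", []):
                binding["members"].append(member)
            return updated
    bindings.append({"role": MONITORING_VIEWER, "members": [member]})
    return updated


if __name__ == "__main__":
    # Hypothetical starting policy and member, for illustration only.
    policy = {"bindings": [{"role": "roles/viewer", "members": ["user:bob@example.com"]}]}
    print(add_viewer_binding(policy, "user:alice@example.com"))
```

Returning a copy keeps the sketch free of side effects, which makes it easy to compare the policy before and after the change.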
Monitor in Cloud Data Fusion
View and filter metrics for your instances and pipelines directly in Cloud Data Fusion using the monitoring dashboard:
1. In the Google Cloud console, go to the Cloud Data Fusion page.
2. Click Instances, and then click the instance's name to go to the Instance details page.
3. Click the Monitoring tab. This displays a dashboard with charts for your instance and its pipelines.
The dashboard displays the following views:
- Overview: Monitors Cloud Data Fusion system metrics.
- Instance: Monitors all Cloud Data Fusion instances in the project.
- Pipeline: Provides information about pipeline runs and performance.
For detailed information about the metrics in any view, hover over the widget and click More > View in Metrics explorer.
Widgets in the monitoring dashboard
The following sections describe the widgets available in the Overview, Instance, and Pipeline views.
Overview view widgets
This table describes the widgets in the Overview view:
Widget | Description |
---|---|
Appfabric service health | Service availability for Appfabric. |
Metric services health | Service availability for metrics. |
Wrangler service health | Service availability for Wrangler. |
Dataset executor health | Service availability for the dataset executor. |
Pipeline studio health | Service availability for Pipeline Studio. |
Runtime service health | Service availability for runtime. |
Log saver service health | Service availability for log saver. |
Metadata service health | Service availability for metadata. |
Pipeline view widgets
This table describes the widgets in the Pipeline view:
Widget | Description |
---|---|
Successful pipeline runs | Cumulative count of successful pipeline runs. |
Failed pipeline runs | Cumulative count of failed pipeline runs. |
Killed pipeline runs | Cumulative count of killed pipeline runs. |
Rejected pipeline runs | Cumulative count of rejected pipeline runs. |
Successful pipeline run time | Time taken for successful pipeline runs to complete. |
Pipeline start latency | Time taken for a pipeline run to reach a "Running" state. |
Dataproc provisioning latency | Time taken to provision the Dataproc cluster. |
Dataproc API request count | Cumulative count of API requests made to Dataproc. |
Successful preview run time | Time taken for successful preview runs to complete. |
Preview runs | Number of preview runs. |
Pipeline bytes read | Cumulative count of bytes read by a pipeline. |
Pipeline bytes written | Cumulative count of bytes written by a pipeline. |
Pipeline bytes shuffled | Cumulative count of bytes shuffled in a pipeline. |
Plugin records processed in | Cumulative count of records entering a plugin. |
Plugin records processed out | Cumulative count of records exiting a plugin. |
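Several of these widgets report cumulative counts, so the number of runs, bytes, or records in a given window is the difference between two samples rather than the sample value itself. The following is a minimal sketch of that arithmetic, assuming samples arrive as (timestamp, value) pairs sorted by time and that a decrease in value means the counter restarted from zero. The function name `deltas` and the sample data are hypothetical and are not taken from Cloud Data Fusion.

```python
from typing import List, Tuple

Sample = Tuple[int, int]  # (Unix timestamp in seconds, cumulative value)


def deltas(samples: List[Sample]) -> List[Sample]:
    """Convert cumulative samples into per-interval increases."""
    result = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        # Assumption: a drop in value means the counter restarted from zero.
        increase = cur - prev if cur >= prev else cur
        result.append((ts, increase))
    return result


if __name__ == "__main__":
    # Made-up cumulative counts of failed pipeline runs, one sample per minute.
    failed_runs = [(0, 2), (60, 2), (120, 5), (180, 1)]
    print(deltas(failed_runs))  # [(60, 0), (120, 3), (180, 1)]
```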
Instance view widgets
This table describes the widgets in the Instance view:
Widget | Description |
---|---|
Concurrent pipelines running | Number of pipelines running concurrently. |
Concurrent pipelines launched | Number of pipelines in a provisioning or starting state. |
API requests received | Cumulative count of API requests received. |
API responses count | Cumulative count of API responses sent. |
Authorization check count | Cumulative count of authorization checks. |
Authorization check time | Latency of authorization checks. |
Deployed pipeline count | Number of deployed pipelines. |
Draft pipeline count | Number of draft pipelines. |
Namespace count | Number of namespaces. |
View metrics for advanced analysis in Cloud Monitoring
For additional filtering and aggregation options, view your metrics in Cloud Monitoring:
1. In the Google Cloud console, go to the Cloud Monitoring Metrics explorer page.
2. Select the Cloud Data Fusion monitored resource.
3. Choose a metric.
4. Select filters and aggregation.
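The selections in these steps correspond to a Cloud Monitoring filter expression, which is also the form the Monitoring API accepts. The sketch below builds such a filter using only the standard library. The resource type `datafusion.googleapis.com/Pipeline`, the metric type `datafusion.googleapis.com/pipeline/runs_completed_count`, and the label `resource.labels.instance_id` are assumptions used for illustration; confirm the exact names in the Cloud Data Fusion metrics overview before relying on them. The helper name `build_filter` is hypothetical.

```python
from typing import Dict, Optional


def build_filter(metric_type: str, resource_type: str,
                 labels: Optional[Dict[str, str]] = None) -> str:
    """Build a Cloud Monitoring filter string from a metric, a resource, and labels."""
    clauses = [
        f'metric.type = "{metric_type}"',
        f'resource.type = "{resource_type}"',
    ]
    # Sort label keys so the output is stable across runs.
    for key, value in sorted((labels or {}).items()):
        clauses.append(f'{key} = "{value}"')
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(build_filter(
        "datafusion.googleapis.com/pipeline/runs_completed_count",  # assumed metric type
        "datafusion.googleapis.com/Pipeline",  # assumed resource type
        {"resource.labels.instance_id": "my-instance"},  # assumed label and value
    ))
```

The resulting string could be supplied as the `filter` of a `projects.timeSeries.list` request, together with a time interval and an aggregation.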
What's next
- To learn about filters and metrics, see the Cloud Data Fusion metrics overview.