Monitor Cloud Data Fusion system, instance, and pipeline health

This guide explains how to monitor Cloud Data Fusion instances and pipelines. You can view key metrics in the Cloud Data Fusion monitoring dashboard. For additional aggregation or filtering, you can view the metrics in Cloud Monitoring.

Before you begin

Before you start monitoring Cloud Data Fusion instances and pipelines, review the pricing and grant the required roles.

Pricing

Cloud Monitoring usage incurs charges. For more information, see Google Cloud Observability pricing.

Required roles

To get the permissions that you need to view metrics, ask your administrator to grant you the Monitoring Viewer (roles/monitoring.viewer) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.
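
For administrators who script access changes, the grant amounts to adding one binding to the project's IAM policy. The following is a minimal sketch, assuming the policy is held as a plain Python dictionary in the standard IAM policy JSON shape (a `bindings` list whose entries have a `role` and `members`). The helper name `add_monitoring_viewer` and the example principal are hypothetical, and the sketch doesn't call any Google Cloud API.

```python
import copy

MONITORING_VIEWER = "roles/monitoring.viewer"


def add_monitoring_viewer(policy, member):
    """Return a copy of an IAM policy dict with `member` bound to Monitoring Viewer.

    `policy` uses the IAM policy JSON shape:
        {"bindings": [{"role": "...", "members": ["..."]}]}
    `member` is a principal string such as "user:alice@example.com" (hypothetical).
    """
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        # Reuse an existing unconditional binding for the role, if there is one.
        if binding.get("role") == MONITORING_VIEWER and "condition" not in binding:
            if member not in binding.setdefault("members", []):
                binding["members"].append(member)
            return updated
    # No suitable binding yet: create one for the role.
    bindings.append({"role": MONITORING_VIEWER, "members": [member]})
    return updated
```

The function returns a new dictionary rather than modifying its input, so the original policy can still be compared against the updated one before anything is applied.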

Monitor in Cloud Data Fusion

View and filter metrics for your instances and pipelines directly in Cloud Data Fusion using the monitoring dashboard:

  1. In the Google Cloud console, go to the Cloud Data Fusion page.

  2. Click Instances, and then click the instance's name to go to the Instance details page.

    Go to Instances

  3. Click the Monitoring tab to open a dashboard with charts for your instance and its pipelines.

    The dashboard displays the following views:

    • Overview: Monitors Cloud Data Fusion system metrics.
    • Instance: Monitors all Cloud Data Fusion instances in the project.
    • Pipeline: Provides information about pipeline runs and performance.

    For detailed information about the metrics in any view, hover over the widget and click More > View in Metrics explorer.

Widgets in the monitoring dashboard

The following sections describe the widgets available in the Overview, Instance, and Pipeline views.

Overview view widgets

This table describes the widgets in the Overview view:

Widget | Description
Appfabric service health | Service availability for Appfabric.
Metric services health | Service availability for metrics.
Wrangler service health | Service availability for Wrangler.
Dataset executor health | Service availability for the dataset executor.
Pipeline studio health | Service availability for Pipeline Studio.
Runtime service health | Service availability for runtime.
Log saver service health | Service availability for log saver.
Metadata service health | Service availability for metadata.

Pipeline view widgets

This table describes the widgets in the Pipeline view:

Widget | Description
Successful pipeline runs | Cumulative count of successful pipeline runs.
Failed pipeline runs | Cumulative count of failed pipeline runs.
Killed pipeline runs | Cumulative count of killed pipeline runs.
Rejected pipeline runs | Cumulative count of rejected pipeline runs.
Successful pipeline run time | Time taken for successful pipeline runs to complete.
Pipeline start latency | Time taken for a pipeline run to reach a "Running" state.
Dataproc provisioning latency | Time taken to provision the Dataproc cluster.
Dataproc API request count | Cumulative count of API requests made to Dataproc.
Successful preview run time | Time taken for successful preview runs to complete.
Preview runs | Number of preview runs.
Pipeline bytes read | Cumulative count of bytes read by a pipeline.
Pipeline bytes written | Cumulative count of bytes written by a pipeline.
Pipeline bytes shuffled | Cumulative count of bytes shuffled in a pipeline.
Plugin records processed in | Cumulative count of records entering a plugin.
Plugin records processed out | Cumulative count of records exiting a plugin.

Instance view widgets

This table describes the widgets in the Instance view:

Widget | Description
Concurrent pipelines running | Number of pipelines running concurrently.
Concurrent pipelines launched | Number of pipelines in a provisioning or starting state.
API requests received | Cumulative count of API requests received.
API responses count | Cumulative count of API responses sent.
Authorization check count | Cumulative count of authorization checks.
Authorization check time | Latency of authorization checks.
Deployed pipeline count | Number of deployed pipelines.
Draft pipeline count | Number of draft pipelines.
Namespace count | Number of namespaces.

View metrics for advanced analysis in Cloud Monitoring

For additional filtering and aggregation options, view your metrics in Cloud Monitoring:

  1. In the Google Cloud console, go to the Cloud Monitoring Metrics explorer page.

    Go to Metrics explorer

  2. Select the Cloud Data Fusion monitored resource.

  3. Choose a metric.

  4. Select filters and aggregation.
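
The same choices (metric, filters, aggregation) can also be expressed as a request to the Cloud Monitoring API `projects.timeSeries.list` method. The following is a minimal sketch that only builds the filter string and request URL; it doesn't send the request or handle authentication. The helper names `build_filter` and `build_list_url` are hypothetical. The metric type and the `instance_id` label in the usage example are assumed for illustration, so confirm the actual names in Metrics explorer before relying on them.

```python
from urllib.parse import urlencode

API_ROOT = "https://monitoring.googleapis.com/v3"


def build_filter(metric_type, resource_type=None, labels=None):
    """Build a Cloud Monitoring filter string from a metric type and optional filters."""
    parts = [f'metric.type = "{metric_type}"']
    if resource_type:
        parts.append(f'resource.type = "{resource_type}"')
    # Resource label names depend on the monitored resource; pass only labels it defines.
    for key, value in sorted((labels or {}).items()):
        parts.append(f'resource.labels.{key} = "{value}"')
    return " AND ".join(parts)


def build_list_url(project_id, filter_text, start_time, end_time,
                   alignment_period="300s", aligner="ALIGN_SUM"):
    """Return a timeSeries.list URL with the filter, time interval, and alignment.

    Times are RFC 3339 strings. The aligner must suit the metric's kind and
    value type; ALIGN_SUM is only a placeholder default here.
    """
    params = {
        "filter": filter_text,
        "interval.startTime": start_time,
        "interval.endTime": end_time,
        "aggregation.alignmentPeriod": alignment_period,
        "aggregation.perSeriesAligner": aligner,
    }
    return f"{API_ROOT}/projects/{project_id}/timeSeries?{urlencode(params)}"


# Usage with assumed example values (project, metric type, and label are illustrative).
example_filter = build_filter(
    "datafusion.googleapis.com/pipeline/runs_completed_count",
    labels={"instance_id": "my-instance"},
)
example_url = build_list_url(
    "my-project", example_filter,
    "2024-01-01T00:00:00Z", "2024-01-01T01:00:00Z",
)
```

Keeping filter construction separate from URL construction mirrors the console flow: pick the metric and filters first, then decide on the time range and aggregation.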

What's next