Metrics overview

This page describes the metrics that you can use to monitor the health and performance of your Cloud Data Fusion instances and pipelines. You view these metrics in Cloud Monitoring, where they provide insights into pipeline runs, instance details, API requests, and authorization checks.

The metrics are categorized as either pipeline metrics or instance metrics:

  • Pipeline metrics provide data about individual pipeline runs, such as run status, duration, latency, and data throughput.
  • Instance metrics provide aggregated information about an instance and the pipelines within it, including service availability, the number of deployed pipelines, and API request counts.

Filter and aggregate Cloud Data Fusion pipeline and instance metrics in Monitoring using metric and monitored-resource labels. When you customize your metrics views, you can use one or both of these label types.
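
As a minimal sketch of combining the two label types, the following Python helper assembles a Monitoring filter string from a metric type, metric labels, and monitored-resource labels. The helper names `quote` and `build_filter`, the instance ID `my-instance`, the namespace `default`, and the `FAILED` value for `complete_state` are illustrative assumptions, not values taken from this page.

```python
def quote(value):
    """Wrap a label value in double quotes for use in a filter string."""
    # Assumption: values contain no quotes or backslashes, so no escaping is attempted.
    if '"' in value or "\\" in value:
        raise ValueError(f"unsupported character in label value: {value!r}")
    return f'"{value}"'


def build_filter(metric_type, metric_labels=None, resource_labels=None):
    """Join a metric type and optional label constraints with AND."""
    clauses = [f"metric.type = {quote(metric_type)}"]
    # Metric labels are addressed as metric.labels.<name>.
    for key, value in sorted((metric_labels or {}).items()):
        clauses.append(f"metric.labels.{key} = {quote(value)}")
    # Monitored-resource labels are addressed as resource.labels.<name>.
    for key, value in sorted((resource_labels or {}).items()):
        clauses.append(f"resource.labels.{key} = {quote(value)}")
    return " AND ".join(clauses)


example = build_filter(
    "datafusion.googleapis.com/pipeline/v2/runs_completed_count",
    metric_labels={"complete_state": "FAILED"},  # hypothetical label value
    resource_labels={"instance_id": "my-instance", "namespace": "default"},  # hypothetical IDs
)
print(example)
```

A string built this way could be used as the `filter` of a Monitoring API time-series list request. Either dictionary can be omitted to filter on only one label type.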

Cloud Data Fusion Pipeline monitored-resource labels

Filter and aggregate the metrics with the following Cloud Data Fusion Pipeline monitored-resource labels:

Label name             Description
resource_container     The ID of the customer project.
org_id                 The ID of the organization that the customer project belongs to.
location               The zone or region where the instance is hosted.
edition                The edition of the Cloud Data Fusion instance.
is_private_ip_enabled  Whether the instance uses an internal IP address.
version                The Cloud Data Fusion data plane version of the instance.
instance_id            The Cloud Data Fusion instance ID.
namespace              The namespace of the pipeline.
pipeline_id            The pipeline ID.
run_id                 The run ID for the pipeline.
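
To illustrate what aggregating by a monitored-resource label means, the sketch below sums per-run values into one total per `pipeline_id`. The record layout, the helper name `sum_by_label`, and all IDs and values are hypothetical; this is not the shape of a Monitoring API response.

```python
from collections import defaultdict


def sum_by_label(series, label):
    """Sum the values of all entries that share the same resource label value."""
    totals = defaultdict(int)
    for entry in series:
        totals[entry["resource_labels"][label]] += entry["value"]
    return dict(totals)


# Illustrative records only: one value per pipeline run.
series = [
    {"resource_labels": {"pipeline_id": "orders_etl", "run_id": "run-1"}, "value": 120},
    {"resource_labels": {"pipeline_id": "orders_etl", "run_id": "run-2"}, "value": 80},
    {"resource_labels": {"pipeline_id": "users_sync", "run_id": "run-3"}, "value": 50},
]

totals = sum_by_label(series, "pipeline_id")
print(totals)
```

Grouping by `run_id` instead would keep each run separate, while grouping by `instance_id` or `namespace` would roll up across pipelines.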

Pipeline metric labels

Filter and aggregate the metrics with the following Cloud Data Fusion metric labels in Monitoring:

Name: Pipeline run status
Metric: datafusion.googleapis.com/pipeline/v2/runs_completed_count
Description: The cumulative count of pipelines that have completed a run.
Metric labels:
  • complete_state
  • previous_state
  • program
  • provisioner
  • cluster_state
  • compute_profile_id
  • enable_rbac
  • private_service_connect_enabled

Name: Pipeline run time
Metric: datafusion.googleapis.com/pipeline/v2/pipeline_duration
Description: The time taken to complete the pipeline run.
Metric labels:
  • complete_state
  • program
  • provisioner
  • cluster_state
  • compute_profile_id
  • enable_rbac
  • private_service_connect_enabled

Name: Pipeline start latency
Metric: datafusion.googleapis.com/pipeline/v2/pipeline_start_latency
Description: The time taken for the pipeline to reach Running state.
Metric labels:
  • program
  • provisioner
  • cluster_state
  • compute_profile_id
  • complete_state
  • enable_rbac
  • private_service_connect_enabled

Name: Provisioning latency
Metric: datafusion.googleapis.com/pipeline/v2/dataproc/provisioning_latency
Description: The Dataproc cluster provisioning latency.
Metric labels:
  • provisioner
  • enable_rbac
  • private_service_connect_enabled

Name: Dataproc API requests
Metric: datafusion.googleapis.com/pipeline/v2/dataproc/api_request_count
Description: The cumulative count of Dataproc API requests.
Metric labels:
  • provisioner
  • method
  • response_code
  • region
  • launch_mode
  • image_version
  • enable_rbac
  • private_service_connect_enabled

Name: Pipeline preview run time
Metric: datafusion.googleapis.com/pipeline/v2/preview_duration
Description: The time taken to complete the preview.
Metric labels:
  • complete_state
  • enable_rbac
  • private_service_connect_enabled

Name: Pipeline bytes written
Metric: datafusion.googleapis.com/pipeline/v2/write_bytes_count
Description: The cumulative count of bytes written by a pipeline.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled

Name: Pipeline bytes read
Metric: datafusion.googleapis.com/pipeline/v2/read_bytes_count
Description: The cumulative count of bytes read by a pipeline.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled

Name: Pipeline bytes shuffled
Metric: datafusion.googleapis.com/pipeline/v2/shuffle_bytes_count
Description: The cumulative count of bytes shuffled in a pipeline.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled

Name: Plugin records processed in
Metric: datafusion.googleapis.com/pipeline/v2/plugin/incoming_records_count
Description: The cumulative count of records entering a plugin.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled
  • stage_name

Name: Plugin records processed out
Metric: datafusion.googleapis.com/pipeline/v2/plugin/outgoing_records_count
Description: The cumulative count of records exiting a plugin.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled
  • stage_name
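
Several of the pipeline metrics above are cumulative counts, so each sample is a running total rather than the amount for one interval. The sketch below shows one way to turn cumulative samples into per-interval increases. The helper name `cumulative_to_deltas`, the sample values, and the rule that a drop in value signals a counter reset are all assumptions made for illustration. Monitoring can also do this conversion for you with aligners such as `ALIGN_DELTA` or `ALIGN_RATE`.

```python
def cumulative_to_deltas(samples):
    """Convert (timestamp, cumulative_value) samples into per-interval increases.

    Assumption: a drop in the cumulative value means the counter was reset,
    so the new value itself is taken as the increase for that interval.
    """
    ordered = sorted(samples)
    deltas = []
    for (_, previous), (timestamp, current) in zip(ordered, ordered[1:]):
        increase = current - previous if current >= previous else current
        deltas.append((timestamp, increase))
    return deltas


# Illustrative values only, not real measurements: (seconds, cumulative bytes read).
read_bytes = [(0, 0), (60, 1_000), (120, 4_500), (180, 500)]

deltas = cumulative_to_deltas(read_bytes)
print(deltas)
```

The first sample produces no delta because there is no earlier value to compare it with.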

Cloud Data Fusion Instance monitored-resource labels

Filter and aggregate the metrics with the following Cloud Data Fusion Instance monitored-resource labels:

Label name             Description
resource_container     The ID of the customer project.
org_id                 The ID of the organization that the customer project belongs to.
location               The zone or region where the instance is hosted.
edition                The edition of the instance.
is_private_ip_enabled  Whether the instance uses an internal IP address.
version                The Cloud Data Fusion data plane version of the instance.
instance_id            The Cloud Data Fusion instance ID.
namespace              The namespace name.

Instance metric labels

Filter and aggregate the metrics with the following Cloud Data Fusion metric labels in Monitoring:

Name: Service status
Metric: datafusion.googleapis.com/instance/v2/service_available
Description: The availability of Cloud Data Fusion services.
Metric labels:
  • service
  • enable_rbac
  • private_service_connect_enabled

Name: Deployed pipeline count
Metric: datafusion.googleapis.com/instance/v2/pipelines
Description: The number of deployed pipelines.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled
  • maintenance_window_enabled

Name: Concurrent pipelines running count
Metric: datafusion.googleapis.com/instance/v2/concurrent_pipelines_running
Description: The number of pipelines running concurrently.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled

Name: Concurrent pipeline launches count
Metric: datafusion.googleapis.com/instance/v2/concurrent_pipelines_launched
Description: The number of pipelines in either Provisioning or Starting state.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled

Name: CDAP REST API requests received
Metric: datafusion.googleapis.com/instance/v2/api_request_count
Description: The cumulative count of REST API requests received by a service in the backend.
Metric labels:
  • service
  • handler
  • method
  • enable_rbac
  • private_service_connect_enabled

Name: CDAP REST API responses sent
Metric: datafusion.googleapis.com/instance/v2/api_response_count
Description: The cumulative count of REST API responses sent by a service in the backend.
Metric labels:
  • service
  • handler
  • method
  • response_code
  • enable_rbac
  • private_service_connect_enabled

Name: Authorization check count
Metric: datafusion.googleapis.com/instance/v2/authorization_check_count
Description: The cumulative count of authorization checks made by the access enforcer.
Metric labels:
  • enable_rbac
  • type
  • private_service_connect_enabled

Name: Authorization check time
Metric: datafusion.googleapis.com/instance/v2/authorization_check_time
Description: The latency of authorization checks made by the access enforcer.
Metric labels:
  • enable_rbac
  • type
  • private_service_connect_enabled

Name: Draft pipeline count
Metric: datafusion.googleapis.com/instance/v2/draft_pipelines
Description: The number of draft pipelines.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled

Name: Namespace count
Metric: datafusion.googleapis.com/instance/v2/namespaces
Description: The number of namespaces.
Metric labels:
  • enable_rbac
  • private_service_connect_enabled
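
As one example of using a metric label from the instance metrics, the sketch below derives an error ratio from response counts that have been grouped by the `response_code` label of the API responses metric. It assumes that `response_code` values are numeric HTTP status codes stored as strings. The helper name `error_ratio` and the counts are hypothetical.

```python
def error_ratio(counts_by_code):
    """Return the fraction of responses whose status code is 400 or higher."""
    total = sum(counts_by_code.values())
    if total == 0:
        return 0.0  # no responses in the window, so report no errors
    # Assumption: label values parse as integer HTTP status codes.
    errors = sum(
        count for code, count in counts_by_code.items() if int(code) >= 400
    )
    return errors / total


# Illustrative counts only: responses in one time window, keyed by response_code.
responses = {"200": 90, "404": 6, "500": 4}

ratio = error_ratio(responses)
print(f"{ratio:.2%}")
```

The same approach could be narrowed to a single backend service by first filtering on the `service` label.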

What's next