Use Cloud Monitoring with BigQuery Engine for Apache Flink

Cloud Monitoring provides metrics, dashboards, and alerting for Google Cloud services. The BigQuery Engine for Apache Flink integration with Monitoring lets you view deployment and job metrics in the Monitoring dashboards. You can also use Monitoring alerts to notify you of conditions such as failed jobs.

Before you begin

Use Metrics Explorer

Use Monitoring to explore BigQuery Engine for Apache Flink metrics. Follow the steps in this section to observe the standard metrics that are provided for each of your BigQuery Engine for Apache Flink deployments and jobs. For more information about using Metrics Explorer, see Create charts with Metrics Explorer.

  1. In the Google Cloud console, select Monitoring:

    Go to Monitoring

  2. In the navigation pane, select Metrics Explorer.

  3. In the Select a metric menu, enter Flink in the filter.

  4. From the list that appears, select a metric to observe for one of your deployments or jobs.
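The same search can be expressed as a Monitoring filter string, which is useful when you query metrics programmatically instead of through the console. The following is a minimal sketch, not an official recipe. It assumes that BigQuery Engine for Apache Flink metric types are published under the `managedflink.googleapis.com/` prefix; this page only confirms that name as the logging API service name, so verify the prefix in Metrics Explorer. The helper name `metric_prefix_filter` is hypothetical.

```python
def metric_prefix_filter(service: str = "managedflink.googleapis.com") -> str:
    """Build a Monitoring filter that matches every metric type under a service prefix.

    Assumption: the Flink metrics share the `managedflink.googleapis.com/` prefix.
    Check the actual metric names in Metrics Explorer before relying on this.
    """
    # starts_with() is part of the Monitoring filter language.
    return f'metric.type = starts_with("{service}/")'


if __name__ == "__main__":
    print(metric_prefix_filter())
```

You could pass the resulting string as the filter when listing metric descriptors or time series through the Monitoring API.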

When running BigQuery Engine for Apache Flink jobs, you might also want to monitor metrics for your sources and sinks. For example, you might want to monitor BigQuery Storage API metrics. For more information, see Create dashboards, charts, and alerts and the complete list of metrics for BigQuery Engine for Apache Flink.

Create alerting policies and dashboards

Monitoring provides access to metrics related to BigQuery Engine for Apache Flink. Create dashboards to chart the time series of metrics, and create alerting policies that notify you when metrics reach specified values.

Create groups of resources

To make it easier to set alerts and build dashboards, create resource groups that include multiple BigQuery Engine for Apache Flink jobs.

  1. In the Google Cloud console, select Monitoring:

    Go to Monitoring

  2. In the navigation pane, select Groups.

  3. Click Create group.

  4. Enter a name for your group.

  5. Add filter criteria that define the BigQuery Engine for Apache Flink resources included in the group. For example, one of your filter criteria can be the name prefix of your jobs.

  6. After the group is created, you can see the basic metrics related to resources in that group.

For more information, see Configure resource groups.
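If you prefer to define a group outside the console, the group created in the steps above roughly corresponds to a request body with a display name and a filter. The sketch below only builds that body as a Python dictionary. It assumes that job names are matched through `resource.metadata.name`, which is a common group filter field but is not confirmed by this page for BigQuery Engine for Apache Flink resources. The function name, group name, and `etl-` prefix are hypothetical.

```python
import json


def group_body(display_name: str, job_prefix: str) -> dict:
    """Build a Monitoring group definition that selects resources by name prefix.

    Assumption: Flink job names are exposed as `resource.metadata.name`.
    """
    if '"' in job_prefix:
        # Keep the filter well-formed; quotes would terminate the string literal.
        raise ValueError("job_prefix must not contain double quotes")
    return {
        "displayName": display_name,
        "filter": f'resource.metadata.name = starts_with("{job_prefix}")',
    }


if __name__ == "__main__":
    # Hypothetical group covering every job whose name starts with "etl-".
    print(json.dumps(group_body("flink-etl-jobs", "etl-"), indent=2))
```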

Create alerting policies for BigQuery Engine for Apache Flink metrics

Monitoring lets you create alerts and receive notifications when a metric crosses a specified threshold. For example, you can receive a notification when the CPU usage of a deployment increases above a certain threshold.

  1. In the Google Cloud console, select Monitoring:

    Go to Monitoring

  2. In the navigation pane, select Alerting.

  3. Click Create policy.

  4. For Select a metric, in the filter, enter Flink, and then select a BigQuery Engine for Apache Flink metric. Click Apply.

  5. On the Configure alert trigger page, define the alerting conditions and notification channels. When done, click Create policy. For more information about creating alerts, see Alerting overview.

    Every time an alert is triggered, an incident and a corresponding event are created. If you specified a notification mechanism in the alert, such as email or SMS, you also receive a notification.
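For teams that keep alerting policies in version control, the CPU example above can be sketched as a policy definition. The field names below follow the Cloud Monitoring `AlertPolicy` resource, but the metric type and resource type are placeholders invented for illustration, because this page doesn't list the actual names. Replace them with the values shown in Metrics Explorer. The function name `cpu_alert_policy` is also hypothetical.

```python
import json

# Hypothetical placeholders: look up the real names in Metrics Explorer.
METRIC_TYPE = "managedflink.googleapis.com/example/cpu_utilization"
RESOURCE_TYPE = "example_flink_deployment"


def cpu_alert_policy(threshold: float, duration_s: int = 300) -> dict:
    """Build an alerting policy that fires when CPU usage stays above a threshold."""
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold is a fraction, for example 0.8 for 80%")
    condition = {
        "displayName": "Deployment CPU usage too high",
        "conditionThreshold": {
            # A threshold condition filter names both the metric and resource type.
            "filter": (
                f'metric.type = "{METRIC_TYPE}" AND '
                f'resource.type = "{RESOURCE_TYPE}"'
            ),
            "comparison": "COMPARISON_GT",
            "thresholdValue": threshold,
            # The metric must exceed the threshold for this long before firing.
            "duration": f"{duration_s}s",
            "aggregations": [
                {"alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_MEAN"}
            ],
        },
    }
    return {
        "displayName": f"Flink deployment CPU above {threshold:.0%}",
        "combiner": "OR",
        "conditions": [condition],
        # Add notification channel resource names here to receive email or SMS.
        "notificationChannels": [],
    }


if __name__ == "__main__":
    print(json.dumps(cpu_alert_policy(0.8), indent=2))
```

The dictionary is only a definition. Creating the policy still requires submitting it through the console, the API, or an infrastructure-as-code tool.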

Build custom monitoring dashboards

You can build Monitoring dashboards with the most relevant charts related to BigQuery Engine for Apache Flink. To add a chart to a dashboard, follow these steps:

  1. In the Google Cloud console, select Monitoring:

    Go to Monitoring

  2. In the navigation pane, select Dashboards.

  3. Click Create dashboard.

  4. Click Add widget.

  5. In the Add widget window, for Data, select Metric.

  6. In the Select a metric menu, for Metric, enter Flink.

  7. Select a metric category and a metric, and then click Apply.

You can add as many charts to the dashboard as you like. For more information, see View and customize Google Cloud dashboards.
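A dashboard built this way can also be described as data, with one chart widget per metric. The following sketch assembles such a definition using the field names of the Cloud Monitoring `Dashboard` resource (`gridLayout`, `xyChart`, `timeSeriesFilter`). The metric types passed in are hypothetical examples, not names taken from the BigQuery Engine for Apache Flink metric list, and `dashboard_body` is an illustrative helper.

```python
import json


def dashboard_body(title: str, metric_types: list[str]) -> dict:
    """Build a dashboard definition with one line chart per metric type."""
    widgets = []
    for metric_type in metric_types:
        widgets.append({
            # Use the last path segment of the metric type as the chart title.
            "title": metric_type.rsplit("/", 1)[-1],
            "xyChart": {
                "dataSets": [{
                    "timeSeriesQuery": {
                        "timeSeriesFilter": {
                            "filter": f'metric.type = "{metric_type}"',
                            "aggregation": {
                                "alignmentPeriod": "60s",
                                "perSeriesAligner": "ALIGN_MEAN",
                            },
                        }
                    }
                }]
            },
        })
    return {"displayName": title, "gridLayout": {"columns": 2, "widgets": widgets}}


if __name__ == "__main__":
    # Hypothetical metric types; replace with names shown in Metrics Explorer.
    metrics = [
        "managedflink.googleapis.com/example/cpu_utilization",
        "managedflink.googleapis.com/example/records_in",
    ]
    print(json.dumps(dashboard_body("Flink overview", metrics), indent=2))
```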

Storage and retention

Operational logs are stored in the _Default log bucket. The logging API service name is managedflink.googleapis.com. For more information about the Google Cloud monitored resource types and services used in Cloud Logging, see Monitored resources and services.

For details about how long log entries are retained by Logging, see the retention information in Quotas and limits: Logs retention periods.

For information about viewing operational logs, see BigQuery Engine for Apache Flink logging.
