Configure alerts for snapshot schedules


You can create a custom metric to raise alerts or provide information to troubleshoot problems with scheduled snapshots.

For example, to set up an alert for scheduled snapshot failures, use the following procedure:

  1. Create a log filter to capture scheduled snapshot events.
  2. Create a metric based on the log filter that counts scheduled snapshot failures.
  3. Create an alert policy to send an alert when there is a scheduled snapshot failure.

Before you begin

  • If you haven't already, set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine as follows.

    Select the tab for how you plan to use the samples on this page:

    Console

    When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

    gcloud

    1. Install the Google Cloud CLI, then initialize it by running the following command:

      gcloud init
    2. Set a default region and zone.

    REST

    To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

      Install the Google Cloud CLI, then initialize it by running the following command:

      gcloud init

    For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Required roles and permissions

To get the permissions that you need to create a snapshot schedule, ask your administrator to grant you the following IAM roles on the project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Create a log filter

Create a log filter to capture scheduled snapshot events.

Console

  1. In the Google Cloud console, go to the Logging > Logs Explorer page.

    Go to the Logs Explorer page

  2. In the Filter by label or text search list, select Convert to advanced filter.

  3. Replace the contents of the filter field with the following text, replacing PROJECT_ID with your project ID:

    resource.type="gce_disk"
    logName="projects/PROJECT_ID/logs/cloudaudit.googleapis.com%2Fsystem_event"
    protoPayload.methodName="ScheduledSnapshots"
    severity>"INFO"
    
  4. Click Submit Filter.
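
If you reuse this filter across several projects, it can help to generate the text rather than edit PROJECT_ID by hand. The following Python sketch only assembles the filter string shown in the step above; it doesn't call any Google Cloud API. The function name scheduled_snapshot_filter and the example project ID my-project are hypothetical.

```python
def scheduled_snapshot_filter(project_id: str) -> str:
    """Return the scheduled snapshot log filter for the given project ID."""
    clauses = [
        'resource.type="gce_disk"',
        f'logName="projects/{project_id}/logs/cloudaudit.googleapis.com%2Fsystem_event"',
        'protoPayload.methodName="ScheduledSnapshots"',
        'severity>"INFO"',
    ]
    # One clause per line, in the same order as the filter in the procedure.
    return "\n".join(clauses)


if __name__ == "__main__":
    # "my-project" is a placeholder; substitute your own project ID.
    print(scheduled_snapshot_filter("my-project"))
```

You can paste the printed text into the filter field in the Logs Explorer.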

Create a metric

After you create the log filter, create a metric that counts scheduled snapshot failures.

Console

  1. On the Logs Explorer page, click Create metric.

  2. In the Metric Editor, enter the following:

    • Name: scheduled_snapshot_failure_count
    • Description: count of scheduled snapshot failures
    • Type: Counter
  3. Under Labels, click Add item and enter the following:

    • Name: status
    • Description: status of scheduled snapshot request
    • Label type: String
    • Field name: protoPayload.response.status
  4. Click Done.

  5. Click Create Metric.
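
If you prefer to define the metric through the Cloud Logging API instead of the console, the settings above map roughly onto a LogMetric request body. The following Python sketch only builds that body as a dictionary and prints it as JSON; it doesn't send a request. Treat the mapping as an assumption to verify against the API reference: it represents the Counter type as a DELTA metric with INT64 values and extracts the status label with an EXTRACT expression. The function name build_log_metric and the project ID my-project are hypothetical.

```python
import json

METRIC_NAME = "scheduled_snapshot_failure_count"


def build_log_metric(log_filter: str) -> dict:
    """Build a LogMetric-style body that mirrors the console settings above."""
    return {
        "name": METRIC_NAME,
        "description": "count of scheduled snapshot failures",
        "filter": log_filter,
        # Assumption: a Counter metric corresponds to DELTA kind with INT64 values.
        "metricDescriptor": {
            "metricKind": "DELTA",
            "valueType": "INT64",
            "labels": [
                {
                    "key": "status",
                    "valueType": "STRING",
                    "description": "status of scheduled snapshot request",
                }
            ],
        },
        # Assumption: the label value is read from the field named in the procedure.
        "labelExtractors": {"status": "EXTRACT(protoPayload.response.status)"},
    }


if __name__ == "__main__":
    # Placeholder filter for the hypothetical project "my-project".
    example_filter = "\n".join([
        'resource.type="gce_disk"',
        'logName="projects/my-project/logs/cloudaudit.googleapis.com%2Fsystem_event"',
        'protoPayload.methodName="ScheduledSnapshots"',
        'severity>"INFO"',
    ])
    print(json.dumps(build_log_metric(example_filter), indent=2))
```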

Create an alert policy

After you create the metric, create an alert policy to send an alert when there is a scheduled snapshot failure.

Console

  1. In the Google Cloud console, go to the Cloud Logging > Logs-based metrics page.

    Go to the Logs-based metrics page

  2. Under User-defined Metrics, find your new metric named user/scheduled_snapshot_failure_count.

  3. Click the More menu button in this row and select Create alert from metric. The alert policy condition creation page opens.

  4. In the Target panel, under Aggregator, select none.

  5. Under Filter:

    1. Click Add a filter.
    2. Select status from the list.
    3. In the Value field, type DONE.
    4. Click Apply.

  6. Click Show advanced options.

  7. In the Advanced aggregation pane, click the Aligner list and select sum.

  8. In the Configuration panel, select the following values:

    • Condition triggers if: Any time series violates
    • Condition: is above
    • Threshold: 1
    • For: most recent value

  9. Click Save.

  10. On the Create new alerting policy page, enter a policy name. Optionally, add notification channels and documentation for this policy.

  11. Click Save.
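
To keep the alert policy in version control, you can express the same condition as a Cloud Monitoring AlertPolicy body. The following Python sketch builds that body as a dictionary and prints it as JSON; it doesn't create the policy. Several parts are assumptions rather than values from the procedure above: the 60-second alignment period, the mapping of "most recent value" to a zero-second duration, the mapping of "Any time series violates" to a trigger count of 1, and the policy display name. The function name build_alert_policy is hypothetical.

```python
import json


def build_alert_policy(
    metric_name: str = "scheduled_snapshot_failure_count",
    policy_name: str = "Scheduled snapshot failures",  # hypothetical display name
) -> dict:
    """Build an AlertPolicy-style body that mirrors the console settings above."""
    # User-defined logs-based metrics appear under the logging.googleapis.com/user/ prefix.
    metric_filter = (
        f'metric.type="logging.googleapis.com/user/{metric_name}" '
        'AND resource.type="gce_disk" '
        'AND metric.labels.status="DONE"'
    )
    condition = {
        "displayName": f"{metric_name} is above 1",
        "conditionThreshold": {
            "filter": metric_filter,
            # Aligner "sum"; no cross-series reducer, matching Aggregator "none".
            # Assumption: a 60-second alignment period.
            "aggregations": [
                {"alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_SUM"}
            ],
            "comparison": "COMPARISON_GT",  # Condition: is above
            "thresholdValue": 1,            # Threshold: 1
            "duration": "0s",               # Assumption: "most recent value"
            "trigger": {"count": 1},        # Assumption: any single time series
        },
    }
    return {
        "displayName": policy_name,
        "combiner": "OR",
        "conditions": [condition],
    }


if __name__ == "__main__":
    print(json.dumps(build_alert_policy(), indent=2))
```

Notification channels are omitted here because the procedure treats them as optional.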

What's next