Evaluation of rules and alerts with managed collection
This document describes a configuration for rule and alert evaluation
in a Managed Service for Prometheus deployment that uses
managed collection.
The following diagram illustrates a deployment that uses multiple clusters
in two Google Cloud projects, with both rule and alert evaluation and the
optional GlobalRules resource:
To set up and use a deployment like the one in the diagram, note the
following:
The managed rule evaluator is automatically
deployed in any cluster where managed collection is running. These
evaluators are configured as follows:
Use Rules resources to run rules on data
within a namespace. Rules resources must be applied in every namespace
in which you want to execute the rule.
Use ClusterRules resources to run
rules on data across a cluster. ClusterRules resources should be applied
once per cluster.
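For reference, minimal manifests might look like the following sketch. The
resource names, namespace, and expressions are placeholders, and the field
layout follows the Rules and ClusterRules API reference linked above; verify
against that reference before applying.

```yaml
# Namespaced rules: evaluated only against data from this namespace.
apiVersion: monitoring.googleapis.com/v1
kind: Rules
metadata:
  name: example-rules          # placeholder name
  namespace: my-namespace      # apply one per namespace that needs the rule
spec:
  groups:
  - name: example
    interval: 30s
    rules:
    - record: job:up:sum
      expr: sum without(instance) (up)
    - alert: ExampleAlert
      expr: up == 0
      for: 5m
---
# Cluster-scoped rules: evaluated against data from the whole cluster.
apiVersion: monitoring.googleapis.com/v1
kind: ClusterRules
metadata:
  name: example-cluster-rules  # placeholder name; no namespace, cluster-scoped
spec:
  groups:
  - name: example
    interval: 30s
    rules:
    - record: namespace:up:count
      expr: count by (namespace) (up)
```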
All rule evaluation executes against the global datastore,
Monarch.
Rules resources automatically filter rules to the project, location,
cluster, and namespace in which they are installed.
ClusterRules resources automatically filter rules to the project,
location, and cluster in which they are installed.
All rule results are written to Monarch after evaluation.
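As a sketch of what this filtering means in practice, a rule written in a
Rules resource without any label matchers behaves as if matchers for its
installation point had been added. The label values in the comments below
are placeholders:

```yaml
apiVersion: monitoring.googleapis.com/v1
kind: Rules
metadata:
  name: filtering-example      # placeholder name
  namespace: my-namespace
spec:
  groups:
  - name: filtering
    interval: 30s
    rules:
    - record: example:up:count
      # Written as count(up), but effectively evaluated against Monarch as:
      #   count(up{project_id="example-project", location="us-central1",
      #            cluster="example-cluster", namespace="my-namespace"})
      # A ClusterRules resource would add the same matchers except namespace.
      expr: count(up)
```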
A Prometheus AlertManager instance is manually deployed in every
cluster. Managed rule evaluators are configured by editing the
OperatorConfig resource to send fired alerts
to their local AlertManager instance. Silences, acknowledgements, and
incident management workflows are typically handled in a third-party tool
such as PagerDuty.
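A sketch of that OperatorConfig edit is shown below. It assumes the
self-deployed AlertManager is reachable through a Service named alertmanager
in the monitoring namespace on port 9093; those values are placeholders, so
check the OperatorConfig reference for the exact fields.

```yaml
apiVersion: monitoring.googleapis.com/v1
kind: OperatorConfig
metadata:
  namespace: gmp-public   # the OperatorConfig used by managed collection
  name: config
rules:
  alerting:
    alertmanagers:
    - name: alertmanager      # Service fronting the in-cluster AlertManager
      namespace: monitoring   # namespace of that Service
      port: 9093              # port the AlertManager listens on
```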
You can centralize alert management across multiple clusters into a
single AlertManager by using a Kubernetes
Endpoints resource.
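One way to wire that up, sketched here with placeholder names and an assumed
reachable IP address, is a selector-less Service plus a matching Endpoints
object in each cluster that point at the shared AlertManager; the local
OperatorConfig then references that Service as in the previous example.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: central-alertmanager
  namespace: monitoring
spec:
  ports:
  - port: 9093
    targetPort: 9093
---
apiVersion: v1
kind: Endpoints
metadata:
  name: central-alertmanager   # must match the Service name
  namespace: monitoring
subsets:
- addresses:
  - ip: 10.0.0.10              # reachable address of the shared AlertManager
  ports:
  - port: 9093
```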
The preceding diagram also shows the optional
GlobalRules resource.
Use GlobalRules very sparingly, for tasks like
calculating global SLOs across projects or for evaluating rules across
clusters within a single Google Cloud project.
We strongly recommend using Rules and ClusterRules whenever possible;
these resources provide superior reliability and are better fits for
common Kubernetes deployment mechanisms and tenancy models.
If you use the GlobalRules resource, note the following from the
preceding diagram:
A single cluster running inside Google Cloud is designated as the
global rule-evaluation cluster for a metrics scope. This managed rule
evaluator is configured to use scoping_project_A, which contains
Projects 1 and 2. Rules executed against scoping_project_A automatically
fan out to Projects 1 and 2. The underlying service account must be
granted the Monitoring Viewer permissions for scoping_project_A. For
information about how to set these fields, see Multi-project and global
rule evaluation. A configuration sketch follows this list.
As in all other clusters, this rule evaluator is set up with Rules
and ClusterRules resources that evaluate rules scoped to a namespace
or cluster. These rules are automatically filtered to the local
project—Project 1, in this case. Because scoping_project_A
contains Project 1, Rules and ClusterRules-configured rules execute
only against data from the local project as expected.
This cluster also has GlobalRules resources that execute rules against
scoping_project_A. GlobalRules are not automatically filtered, and
therefore GlobalRules execute exactly as written across all projects,
locations, clusters, and namespaces in scoping_project_A.
Fired alerts are sent to the self-hosted AlertManager as expected.
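A sketch of the designated cluster's configuration follows. The scoping
project ID, resource names, and the SLO expression (which uses a
hypothetical http_requests_total metric) are placeholders, and the
queryProjectID field shown here is the OperatorConfig field documented for
multi-project evaluation; verify both against the current references.

```yaml
# Point this cluster's managed rule evaluator at the metrics scope.
apiVersion: monitoring.googleapis.com/v1
kind: OperatorConfig
metadata:
  namespace: gmp-public
  name: config
rules:
  queryProjectID: scoping-project-a   # scoping project containing Projects 1 and 2
---
# Unfiltered rules: run across every project, location, cluster, and
# namespace visible in the scoping project.
apiVersion: monitoring.googleapis.com/v1
kind: GlobalRules
metadata:
  name: example-global-slo      # placeholder name; cluster-scoped resource
spec:
  groups:
  - name: global-slo
    interval: 60s
    rules:
    - record: slo:availability:ratio
      expr: |
        sum(rate(http_requests_total{code!~"5.."}[5m]))
        /
        sum(rate(http_requests_total[5m]))
```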
Using GlobalRules might have unexpected effects, depending on whether you
preserve or aggregate away the project_id, location, cluster, and
namespace labels in your rules:
If your GlobalRules rule preserves the project_id label (by using
a by(project_id) clause), then rule results are written back to
Monarch using the original project_id value of the underlying
time series.
In this scenario, you need to ensure the underlying service account
has the Monitoring Metric Writer permissions for each
monitored project in scoping_project_A. If you add a new
monitored project to scoping_project_A, then you must also manually
add a new permission to the service account.
If your GlobalRules rule does not preserve the project_id label (by
not using a by(project_id) clause), then rule results are written back
to Monarch using the project_id value of the cluster
where the global rule evaluator is running.
In this scenario, you do not need to further modify the underlying
service account.
If your GlobalRules rule preserves the location label (by using a
by(location) clause), then rule results are written back to
Monarch using each original Google Cloud region from which
the underlying time series originated.
If your GlobalRules rule does not preserve the location label, then data
is written back to the location of the cluster where the global rule
evaluator is running.
We strongly recommend preserving the cluster and namespace labels in
rule evaluation results unless the purpose of the rule is to aggregate away
those labels. Otherwise, query performance might decline and you might
encounter cardinality limits. Removing both labels is strongly discouraged.
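Putting these recommendations together, a GlobalRules recording rule that
preserves all four labels might look like the following sketch; the metric,
rule, and group names are hypothetical:

```yaml
apiVersion: monitoring.googleapis.com/v1
kind: GlobalRules
metadata:
  name: per-project-error-ratio   # placeholder name
spec:
  groups:
  - name: error-ratio
    interval: 60s
    rules:
    - record: namespace:http_errors:ratio
      # Keeping project_id and location writes results back to each source
      # project and region; keeping cluster and namespace follows the
      # recommendation above and avoids cardinality and query-performance
      # problems.
      expr: |
        sum by (project_id, location, cluster, namespace)
          (rate(http_requests_total{code=~"5.."}[5m]))
        /
        sum by (project_id, location, cluster, namespace)
          (rate(http_requests_total[5m]))
```

Because this sketch preserves project_id, the rule evaluator's service
account needs the Monitoring Metric Writer role (roles/monitoring.metricWriter)
on each monitored project in the scoping project; dropping project_id from
the by clause removes that requirement.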
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-03 UTC."],[],[],null,["# Evaluation of rules and alerts with managed collection\n\nThis document describes a configuration for rule and alert evaluation\nin a Managed Service for Prometheus deployment that uses\n[managed collection](/stackdriver/docs/managed-prometheus/setup-managed).\n\nThe following diagram illustrates a deployment that uses multiple clusters\nin two Google Cloud projects and uses both rule and alert evaluation, as well\nas the optional GlobalRules resource:\n\nTo set up and use a deployment like the one in the diagram, note the\nfollowing:\n\n- The [managed rule evaluator](/stackdriver/docs/managed-prometheus/rules-managed) is automatically\n deployed in any cluster where managed collection is running. These\n evaluators are configured as follows:\n\n - Use [Rules](https://github.com/GoogleCloudPlatform/prometheus-engine/blob/v0.15.3/doc/api.md#rules) resources to run rules on data\n within a namespace. Rules resources must be applied in every namespace\n in which you want to execute the rule.\n\n - Use [ClusterRules](https://github.com/GoogleCloudPlatform/prometheus-engine/blob/v0.15.3/doc/api.md#clusterrules) resources to run\n rules on data across a cluster. ClusterRules resources should be applied\n once per cluster.\n\n- All rule evaluation executes against the global datastore,\n Monarch.\n\n - Rules resources automatically filter rules to the project, location, cluster, and namespace in which they are installed.\n - ClusterRules resources automatically filter rules to the project, location, and cluster in which they are installed.\n - All rule results are written to Monarch after evaluation.\n- A Prometheus AlertManager instance is manually deployed in every single\n cluster. Managed rule evaluators are configured by [editing the\n OperatorConfig resource](/stackdriver/docs/managed-prometheus/rules-managed#am-config-managed) to send fired alerting\n rules to their local AlertManager instance. Silences, acknowledgements, and\n incident management workflows are typically handled in a third-party tool\n such as PagerDuty.\n\n You can centralize alert management across multiple clusters into a\n single AlertManager by using a Kubernetes\n [Endpoints resource](/stackdriver/docs/managed-prometheus/rules-managed#am-config-managed).\n\nThe preceding diagram also shows the optional\n[GlobalRules](https://github.com/GoogleCloudPlatform/prometheus-engine/blob/v0.15.3/doc/api.md#globalrules) resource.\nUse GlobalRules very sparingly, for tasks like\ncalculating global SLOs across projects or for evaluating rules across\nclusters within a single Google Cloud project.\n**We strongly recommend using Rules and ClusterRules whenever possible**;\nthese resources provide superior reliability and are better fits for\ncommon Kubernetes deployment mechanisms and tenancy models.\n\nIf you use the GlobalRules resource, note the following from the\npreceding diagram:\n\n- One single cluster running inside Google Cloud is designated as the\n global rule-evaluation cluster for a metrics scope. 
This managed rule\n evaluator is configured to use scoping_project_A, which contains\n Projects 1 and 2. Rules executed against scoping_project_A automatically\n fan out to Projects 1 and 2.\n\n The underlying service account must be given the [Monitoring\n Viewer](/monitoring/access-control#mon_roles_desc) permissions for scoping_project_A.\n For additional information on how to set these fields, see\n [Multi-project and global rule evaluation](/stackdriver/docs/managed-prometheus/rules-managed#multi-project_and_global_rule_evaluation).\n- As in all other clusters, this rule evaluator is set up with Rules\n and ClusterRules resources that evaluate rules scoped to a namespace\n or cluster. These rules are automatically filtered to the *local*\n project---Project 1, in this case. Because scoping_project_A\n contains Project 1, Rules and ClusterRules-configured rules execute\n only against data from the local project as expected.\n\n- This cluster also has GlobalRules resources that execute rules against\n scoping_project_A. GlobalRules are not automatically filtered, and\n therefore GlobalRules execute exactly as written across all projects,\n locations, clusters, and namespaces in scoping_project_A.\n\n- Fired alerting rules will be sent to the self-hosted AlertManager as\n expected.\n\nUsing GlobalRules may have unexpected effects, depending on whether you\npreserve or aggregate the `project_id`, `location`, `cluster`, and\n`namespace` labels in your rules:\n\n- If your GlobalRules rule preserves the `project_id` label (by using\n a `by(project_id)` clause), then rule results are written back to\n Monarch using the original `project_id` value of the underlying\n time series.\n\n In this scenario, you need to ensure the underlying service account\n has the [Monitoring Metric Writer](/monitoring/access-control#mon_roles_desc) permissions for each\n monitored project in scoping_project_A. If you add a new\n monitored project to scoping_project_A, then you must also manually\n add a new permission to the service account.\n- If your GlobalRules rule does not preserve the `project_id` label (by\n not using a `by(project_id)` clause), then rule results are written back\n to Monarch using the `project_id` value of the cluster\n where the global rule evaluator is running.\n\n In this scenario, you do not need to further modify the underlying\n service account.\n- If your GlobalRules preserves the `location` label (by using a\n `by(location)` clause), then rule results are written back to\n Monarch using each original Google Cloud region from which\n the underlying time series originated.\n\n If your GlobalRules does not preserve the `location` label, then data\n is written back to the location of the cluster where the global rule\n evaluator is running.\n\nWe strongly recommend preserving the `cluster` and `namespace` labels in\nrule evaluation results unless the purpose of the rule is to aggregate away\nthose labels. Otherwise, query performance might decline and you might\nencounter cardinality limits. Removing both labels is strongly discouraged."]]