This tutorial shows you how to set up liveness probes for application microservices deployed to Google Kubernetes Engine (GKE) using open source Prometheus.
This tutorial uses open source Prometheus. However, each GKE Autopilot cluster automatically deploys Managed Service for Prometheus, Google Cloud's fully managed, multi-cloud, cross-project solution for Prometheus metrics. Managed Service for Prometheus lets you globally monitor and alert on your workloads using Prometheus, without having to manually manage and operate Prometheus at scale.
You can also use open source tools like Grafana to visualize metrics collected by Prometheus.
Objectives
- Create a cluster.
- Deploy Prometheus.
- Deploy the sample application, Bank of Anthos.
- Configure Prometheus liveness probes.
- Configure Prometheus alerts.
- Configure Alertmanager to send notifications to a Slack channel.
- Simulate an outage to test Prometheus.
Costs
In this document, you use the following billable components of Google Cloud:
To generate a cost estimate based on your projected usage, use the pricing calculator.
When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, click Create project to begin creating a new Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the GKE API.
- Install the Helm CLI.
Prepare the environment
In this tutorial, you use Cloud Shell to manage resources hosted on Google Cloud.
Set the default environment variables:
gcloud config set project PROJECT_ID
gcloud config set compute/region COMPUTE_REGION
Replace the following:
- PROJECT_ID: your Google Cloud project ID.
- COMPUTE_REGION: the Compute Engine region for the cluster. For this tutorial, the region is us-central1. Typically, you want a region that is close to you.
Clone the sample repository used in this tutorial:
git clone https://github.com/GoogleCloudPlatform/bank-of-anthos.git
cd bank-of-anthos/
Create a cluster:
gcloud container clusters create-auto CLUSTER_NAME \
--release-channel=CHANNEL_NAME \
--region=COMPUTE_REGION
Replace the following:
- CLUSTER_NAME: a name for the new cluster.
- CHANNEL_NAME: the name of a release channel.
Deploy Prometheus
Install Prometheus using the sample Helm chart:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install tutorial bitnami/kube-prometheus \
--version 8.2.2 \
--values extras/prometheus/oss/values.yaml \
--wait
This command installs Prometheus with the following components:
- Prometheus Operator: a popular way to deploy and configure open source Prometheus.
- Alertmanager: handles alerts sent by the Prometheus server and routes them to applications, such as Slack.
- Blackbox exporter: lets Prometheus probe endpoints using HTTP, HTTPS, DNS, TCP, ICMP, and gRPC.
Deploy Bank of Anthos
Deploy the Bank of Anthos sample application:
kubectl apply -f extras/jwt/jwt-secret.yaml
kubectl apply -f kubernetes-manifests
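Before you configure probes, it can help to confirm that the application is running. The following is a minimal sketch, assuming that kubectl points at the tutorial cluster and that everything was deployed to the current namespace. The TIMEOUT value is an arbitrary choice, not something the tutorial prescribes.

```shell
# Sketch: wait for the Deployments in the current namespace to become available.
# Assumes kubectl is configured for the tutorial cluster; TIMEOUT is arbitrary.
TIMEOUT="300s"

if command -v kubectl >/dev/null 2>&1; then
  # Block until every Deployment in the current namespace reports Available.
  kubectl wait --for=condition=Available deployment --all --timeout="${TIMEOUT}"
  # List the Services so you can see the endpoints that the probes will target.
  kubectl get services
else
  echo "kubectl not found; skipping the readiness check."
fi
```

If the wait times out, kubectl get pods usually shows which workload is still starting.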
Slack notifications
To set up Slack notifications, you must create a Slack application, activate Incoming Webhooks for the application, and install the application to a Slack workspace.
Create the Slack application
Join a Slack workspace, either by registering with your email or by using an invitation sent by a Workspace Admin.
Sign in to Slack using your workspace name and your Slack account credentials.
- Go to the Slack API Apps page (https://api.slack.com/apps) and click Create New App.
- In the Create an app dialog, click From scratch.
- Specify an App Name and choose your Slack workspace.
- Click Create App.
- Under Add features and functionality, click Incoming Webhooks.
- Click the Activate Incoming Webhooks toggle.
- In the Webhook URLs for Your Workspace section, click Add New Webhook to Workspace.
- On the authorization page that opens, select a channel to receive notifications.
- Click Allow.
- A webhook for your Slack application is displayed in the Webhook URLs for Your Workspace section. Save the URL for later.
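To check the webhook before wiring it into Alertmanager, you can post a test message to it. This is a minimal sketch that assumes you exported the saved URL as an environment variable named SLACK_WEBHOOK_URL (a name chosen here for illustration); the message text is also arbitrary.

```shell
# Sketch: post a test message to the Slack incoming webhook.
# SLACK_WEBHOOK_URL is assumed to hold the URL saved in the previous step.
PAYLOAD='{"text": "Test message from the Prometheus tutorial"}'

if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
  # Incoming webhooks accept a JSON body with a "text" field.
  curl -sS -X POST -H 'Content-type: application/json' \
    --data "${PAYLOAD}" "${SLACK_WEBHOOK_URL}"
else
  echo "SLACK_WEBHOOK_URL is not set; skipping the test message."
fi
```

If the webhook works, the message appears in the channel that you selected on the authorization page.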
Configure Alertmanager
Create a Kubernetes Secret to store the webhook URL, and then apply the Alertmanager configuration:
kubectl create secret generic alertmanager-slack-webhook --from-literal webhookURL=SLACK_WEBHOOK_URL
kubectl apply -f extras/prometheus/oss/alertmanagerconfig.yaml
Replace SLACK_WEBHOOK_URL with the URL of the webhook from the previous section.
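To illustrate how an Alertmanager configuration can read the webhook URL from that Secret, the following sketch writes an example AlertmanagerConfig to a temporary file for inspection. It is not the repository's alertmanagerconfig.yaml, which may differ: the resource name, receiver name, and channel are hypothetical, and only the Secret name and key come from the command above.

```shell
# Sketch only: writes an illustrative AlertmanagerConfig to a temporary file.
# The resource name, receiver name, and channel are hypothetical.
SKETCH_DIR="$(mktemp -d)"
cat > "${SKETCH_DIR}/alertmanagerconfig-sketch.yaml" <<'EOF'
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: slack-sketch            # hypothetical name
spec:
  route:
    receiver: slack             # send every matching alert to the receiver below
  receivers:
  - name: slack
    slackConfigs:
    - apiURL:                   # read the webhook URL from the Secret
        name: alertmanager-slack-webhook
        key: webhookURL
      channel: '#alerts'        # hypothetical channel
      sendResolved: true        # also notify when an alert resolves
EOF
echo "Wrote ${SKETCH_DIR}/alertmanagerconfig-sketch.yaml"
```

Referencing the Secret by name and key keeps the webhook URL out of the manifest itself.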
Configure Prometheus
Review the following manifest:
This manifest describes Prometheus liveness probes and includes the following fields:
- spec.jobName: the job name assigned to scraped metrics.
- spec.prober.url: the Service URL of the blackbox exporter. This includes the default port for the blackbox exporter, which is defined in the Helm chart.
- spec.prober.path: the metrics collection path.
- spec.targets.staticConfig.labels: the labels assigned to all metrics scraped from the targets.
- spec.targets.staticConfig.static: the list of hosts to probe.
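As an illustration of how these fields fit together, the following sketch writes an example Probe for a single target to a temporary file. It is not the repository's probes.yaml. The resource name, the blackbox exporter Service name and port, the http_2xx module, the label, and the frontend:80 target are all assumptions; check them against your cluster with kubectl get services.

```shell
# Sketch only: writes an illustrative Probe manifest to a temporary file.
SKETCH_DIR="$(mktemp -d)"
cat > "${SKETCH_DIR}/probe-sketch.yaml" <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: frontend-probe-sketch   # hypothetical name
spec:
  jobName: frontend             # becomes the job label on scraped metrics
  prober:
    # Assumed Service name and port of the blackbox exporter.
    url: tutorial-kube-prometheus-blackbox-exporter:19115
    path: /probe
  module: http_2xx              # assumed module: expects an HTTP 2xx response
  targets:
    staticConfig:
      labels:
        app: bank-of-anthos     # hypothetical label added to every metric
      static:
      - frontend:80             # assumed host:port of the Service to probe
EOF
echo "Wrote ${SKETCH_DIR}/probe-sketch.yaml"
```

Each entry under static is probed through the blackbox exporter, so adding a microservice to the probe is a matter of adding its host and port to that list.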
Apply the manifest to your cluster:
kubectl apply -f extras/prometheus/oss/probes.yaml
Review the following manifest:
This manifest describes a PrometheusRule and includes the following fields:
- spec.groups.[*].name: the name of the rule group.
- spec.groups.[*].interval: how often rules in the group are evaluated.
- spec.groups.[*].rules[*].alert: the name of the alert.
- spec.groups.[*].rules[*].expr: the PromQL expression to evaluate.
- spec.groups.[*].rules[*].for: how long the expression must keep returning results before the alert is considered firing.
- spec.groups.[*].rules[*].annotations: a list of annotations to add to each alert. This is only valid for alerting rules.
- spec.groups.[*].rules[*].labels: the labels to add or overwrite.
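The following sketch shows how these fields might combine into one alerting rule; it writes an example PrometheusRule to a temporary file. It is not the repository's rules.yaml. The resource, group, and alert names, the durations, and the severity label are hypothetical, and the expression assumes a Probe whose jobName is contacts. The probe_success metric is exported by the blackbox exporter.

```shell
# Sketch only: writes an illustrative PrometheusRule manifest to a temporary file.
SKETCH_DIR="$(mktemp -d)"
cat > "${SKETCH_DIR}/rule-sketch.yaml" <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: uptime-rule-sketch      # hypothetical name
spec:
  groups:
  - name: uptime                # hypothetical group name
    interval: 60s               # evaluate the rules in this group once a minute
    rules:
    - alert: ContactsUnavailable          # hypothetical alert name
      # probe_success is 1 when a blackbox probe succeeds and 0 when it fails.
      expr: probe_success{job="contacts"} == 0
      for: 1m                   # the condition must hold for one minute before firing
      annotations:
        summary: Contacts service is unavailable
      labels:
        severity: critical      # hypothetical label for routing or filtering
EOF
echo "Wrote ${SKETCH_DIR}/rule-sketch.yaml"
```

With an expression like this, the alert starts as pending when the probe first fails and moves to firing only after the for duration has elapsed.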
Apply the manifest to your cluster:
kubectl apply -f extras/prometheus/oss/rules.yaml
Simulate an outage
Simulate an outage by scaling the contacts Deployment to zero:
kubectl scale deployment contacts --replicas 0
You should see a notification message in your Slack workspace channel. GKE might take up to 5 minutes to scale the Deployment.
Restore the contacts Deployment:
kubectl scale deployment contacts --replicas 1
You should see an alert resolution notification message in your Slack workspace channel. GKE might take up to 5 minutes to scale the Deployment.
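While you wait for the Slack notifications, you can watch the Deployment itself. This is a minimal sketch, assuming kubectl points at the tutorial cluster; the 300-second timeout mirrors the five-minute estimate above and is otherwise arbitrary.

```shell
# Sketch: check the state of the contacts Deployment after scaling it.
DEPLOYMENT="contacts"

if command -v kubectl >/dev/null 2>&1; then
  # Wait until the rollout finishes, or give up after five minutes.
  kubectl rollout status "deployment/${DEPLOYMENT}" --timeout=300s
  # Show the desired and ready replica counts.
  kubectl get deployment "${DEPLOYMENT}"
else
  echo "kubectl not found; skipping the Deployment check."
fi
```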
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
Delete the project
Delete a Google Cloud project:
gcloud projects delete PROJECT_ID
Delete individual resources
Delete the Kubernetes resources:
kubectl delete -f kubernetes-manifests
Uninstall Prometheus:
helm uninstall tutorial
Delete the GKE cluster:
gcloud container clusters delete CLUSTER_NAME --quiet
What's next
- Learn about Google Cloud Managed Service for Prometheus, a fully managed, global metrics solution, based on Prometheus, that is deployed by default in all Autopilot clusters.
- Explore reference architectures, diagrams, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.