The health check feature regularly monitors the health of the cluster control
plane and several critical components, and helps you detect and diagnose
potential problems with your clusters.
The cluster health checker detects and alerts you to the
following issues in a cluster:
kube-scheduler health on control plane nodes: If the kube-scheduler
is unhealthy, this suggests that the cluster is having trouble assigning Pods
to nodes. To investigate further, you can examine the kube-scheduler Pod
log.
kube-controller-manager health on control plane nodes: The
kube-controller-manager runs the core Kubernetes controllers, such as the
ReplicaSet, Deployment, and Namespace controllers. If the
kube-controller-manager is unhealthy, one or more of the controllers it
runs might not be working properly. To determine the precise issue, examine
the kube-controller-manager Pod log, which might provide more information
about the malfunctioning controller or controllers.
Root volume capacity: The health checker verifies that the root volume of
each control plane node has sufficient capacity. If the available capacity
falls below 512 MB, the health checker alerts you to the risk of
running out of disk space.
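The root volume check can be pictured as a simple threshold comparison. The following is a minimal sketch in Python, not the health checker's actual implementation: the function names and the use of `shutil.disk_usage` are assumptions made for illustration, and only the 512 MB threshold comes from the description above.

```python
import shutil

# Threshold taken from the description above. Treating "MB" as MiB
# (1024 * 1024 bytes) is an assumption; the page does not say which
# unit the health checker uses.
LOW_SPACE_THRESHOLD_BYTES = 512 * 1024 * 1024


def is_low_on_space(free_bytes: int,
                    threshold: int = LOW_SPACE_THRESHOLD_BYTES) -> bool:
    """Return True when the available capacity falls below the threshold."""
    return free_bytes < threshold


def root_volume_is_low(path: str = "/") -> bool:
    """Hypothetical helper: check the volume that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return is_low_on_space(usage.free)


if __name__ == "__main__":
    print("root volume low on space:", root_volume_is_low())
```

Keeping the comparison in its own function makes the threshold logic easy to check without touching a real disk.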
View health check events
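To view alerts from the health checker for a specific cluster, run the following
command:
gcloud container aws clusters describe CLUSTER_NAME \
    --location GOOGLE_CLOUD_LOCATION
Replace the following:
CLUSTER_NAME: your cluster's name
GOOGLE_CLOUD_LOCATION: the name of the Google Cloud
location that manages the cluster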
Here's an excerpt of the kind of output you can expect:
{
  "name": "some-cluster-name",
  "description": "test-cluster",
  ...
  "errors": [
    {
      "message": "Replica (replica-name): kube-controller-manager is unhealthy"
    },
    {
      "message": "Replica (replica-name): not enough disk space on root volume, only 9 MB left"
    }
  ]
  ...
}
In this example, the error messages indicate that a kube-controller-manager
component is unhealthy and that a control plane node's root volume is nearly
full, with only 9 MB left.
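If you want to read these alerts from a script, the following Python sketch shows one possible approach. It is an illustration under stated assumptions, not an official client: the helper names are hypothetical, `--format=json` (gcloud's global output-format flag) is added so that the output is machine-readable, and the sample document is a made-up, complete JSON object with the same shape as the abbreviated excerpt above. The "only N MB left" pattern is inferred from that single example message and might not match every alert.

```python
import json
import re
import subprocess
from typing import List, Optional


def describe_cluster(cluster_name: str, location: str) -> dict:
    """Run the describe command from this page and parse its JSON output."""
    result = subprocess.run(
        [
            "gcloud", "container", "aws", "clusters", "describe", cluster_name,
            "--location", location,
            "--format=json",  # request JSON instead of the default format
        ],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def health_errors(cluster: dict) -> List[str]:
    """Return the health checker messages, or an empty list if none exist."""
    return [err.get("message", "") for err in cluster.get("errors", [])]


def free_space_mb(message: str) -> Optional[int]:
    """Extract N from '... only N MB left'; None if the text is absent."""
    match = re.search(r"only (\d+) MB left", message)
    return int(match.group(1)) if match else None


# Hypothetical sample with the same shape as the excerpt above.
SAMPLE = """
{
  "name": "some-cluster-name",
  "description": "test-cluster",
  "errors": [
    {"message": "Replica (replica-name): kube-controller-manager is unhealthy"},
    {"message": "Replica (replica-name): not enough disk space on root volume, only 9 MB left"}
  ]
}
"""

if __name__ == "__main__":
    # Swap json.loads(SAMPLE) for describe_cluster(...) to query a real cluster.
    for message in health_errors(json.loads(SAMPLE)):
        remaining = free_space_mb(message)
        suffix = f" [{remaining} MB free]" if remaining is not None else ""
        print(f"ALERT: {message}{suffix}")
```

Parsing the sample is kept separate from calling gcloud so that the message handling can be exercised without credentials or a live cluster.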
What's next
If you need additional assistance, reach out to Cloud Customer Care (/kubernetes-engine/multi-cloud/docs/aws/getting-support).
Last updated 2025-08-25 UTC.