Admission webhooks, or webhooks in Kubernetes, are a type of admission controller, which can be used in Kubernetes clusters to validate or mutate requests to the control plane prior to a request being persisted. It is common for third-party applications to use webhooks that operate on system-critical resources and namespaces. Incorrectly configured webhooks can impact control plane performance and reliability. For example, an incorrectly configured webhook created by a third-party application could prevent GKE from creating and modifying resources in the managed kube-system namespace, which could degrade the functionality of the cluster.
Google Kubernetes Engine (GKE) monitors your clusters and uses the Recommender service to deliver guidance for how you can optimize your usage of the platform. To help you ensure that your cluster remains stable and performant, see recommendations from GKE for the following scenarios:
- Webhooks that operate but have no endpoints available.
- Webhooks that are considered unsafe because they operate on system-critical resources and namespaces.
With this guidance, you can see instructions for how to check your potentially misconfigured webhooks and update them, if necessary.
To learn more about how to manage insights and recommendations from Recommenders, see Optimize your usage of GKE with insights and recommendations.
Identify misconfigured webhooks that could affect your cluster
To get insights identifying webhooks that could affect your cluster's performance and stability, follow the instructions to view insights and recommendations. You can get insights in the following ways:
- Use the Google Cloud console.
- Use the Google Cloud CLI, or the Recommender API, filtering with the subtypes K8S_ADMISSION_WEBHOOK_UNSAFE and K8S_ADMISSION_WEBHOOK_UNAVAILABLE.
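As a rough illustration of the subtype filter, the following sketch takes insights that have already been exported as JSON and keeps only the two webhook subtypes. It assumes each record carries an insightSubtype field, as in the Recommender API's Insight resource; the function name and the sample records are hypothetical.

```python
import json

# The two subtypes named in this guide.
WEBHOOK_SUBTYPES = {
    "K8S_ADMISSION_WEBHOOK_UNSAFE",
    "K8S_ADMISSION_WEBHOOK_UNAVAILABLE",
}


def filter_webhook_insights(insights):
    """Return only the insights whose subtype is one of the webhook subtypes."""
    return [i for i in insights if i.get("insightSubtype") in WEBHOOK_SUBTYPES]


if __name__ == "__main__":
    # Hypothetical sample shaped like a JSON export of insights.
    sample = json.loads("""[
      {"name": "insight-a", "insightSubtype": "K8S_ADMISSION_WEBHOOK_UNSAFE"},
      {"name": "insight-b", "insightSubtype": "SOME_OTHER_SUBTYPE"}
    ]""")
    for insight in filter_webhook_insights(sample):
        print(insight["name"], insight["insightSubtype"])
```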
After you identify the webhooks via the insights, follow the instructions to troubleshoot the detected webhooks.
When GKE detects misconfigured webhooks
GKE generates an insight and recommendation if either of the following criteria is true for a cluster:
- K8S_ADMISSION_WEBHOOK_UNAVAILABLE: The GKE cluster has one or more webhooks reporting no available endpoints. Follow the instructions to check webhooks reporting no available endpoints.
- K8S_ADMISSION_WEBHOOK_UNSAFE: The GKE cluster has one or more webhooks that are considered unsafe based on the resources they intercept. Follow the instructions to check the webhooks that are considered unsafe. The following webhooks are considered unsafe:
  - Webhooks intercepting resources, including Pods and Leases, in the kube-system namespace.
  - Webhooks intercepting Leases in the kube-node-lease namespace.
  - Webhooks intercepting cluster-scoped system resources, including Nodes, TokenReviews, SubjectAccessReviews, and CertificateSigningRequests.
Troubleshoot the detected webhooks
The following sections have instructions for you to troubleshoot the webhooks that GKE detected as potentially misconfigured.
After you implement the instructions and the webhooks are correctly configured, the recommendation is resolved within 24 hours and no longer appears in the console.
If you do not want to implement the recommendation, you can dismiss it.
Webhooks reporting no available endpoints
If a webhook is reporting that it has no available endpoints, the Service that is backing the webhook endpoint has one or more Pods which are not running. To make the webhook endpoints available, follow the instructions to find and troubleshoot the Pods of the Service that is backing this webhook endpoint:
View insights and recommendations, choosing one insight at a time to troubleshoot. GKE generates one insight per cluster, and this insight lists one or more webhooks with a broken endpoint that must be investigated. For each of these webhooks, the insight also states the Service name, what endpoint is broken, and the last time that the endpoint was called.
Find the serving Pods for the Service associated with the webhook:
Console
From the insight's sidebar panel, see the table of misconfigured webhooks. Click on the name of the Service.
kubectl
Run the following command to describe the Service:
kubectl describe svc SERVICE_NAME -n SERVICE_NAMESPACE
Replace SERVICE_NAME and SERVICE_NAMESPACE with the name and namespace of the service, respectively.
If you cannot find the Service name listed in the webhook, the unavailable endpoint might be caused by a mismatch between the name listed in the configuration and the actual name of the Service. To fix the endpoint availability, update the Service name in the webhook configuration to match the correct Service object.
Inspect the serving Pods for this Service:
Console
Under Serving Pods in the Service details, see the list of Pods backing this Service.
kubectl
Identify which Pods are not running by listing the Deployment or Pods:
kubectl get deployment -n SERVICE_NAMESPACE
Or, run this command:
kubectl get pods -n SERVICE_NAMESPACE -o wide
For any Pods that are not running, inspect the Pod logs to see why the Pod is not running. For instructions on common issues with Pods, see Troubleshoot issues with deployed workloads.
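To narrow down which Pods to inspect, the following sketch scans a Pod list that has already been parsed from JSON (for example, the output of kubectl get pods -n SERVICE_NAMESPACE -o json) and reports Pods that are not in the Running phase or do not have a Ready condition. The function name is hypothetical, and treating "not Ready" as a problem is an assumption added here for illustration; the guide itself only speaks of Pods that are not running.

```python
def pods_not_serving(pod_list):
    """Return (name, phase) for Pods that are not Running or not Ready."""
    problems = []
    for pod in pod_list.get("items") or []:
        status = pod.get("status") or {}
        phase = status.get("phase", "Unknown")
        # A Pod counts as Ready only if its "Ready" condition is "True".
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in status.get("conditions") or []
        )
        if phase != "Running" or not ready:
            problems.append((pod["metadata"]["name"], phase))
    return problems
```

One way to use it is to load the kubectl JSON output with json.load and pass the resulting dictionary to pods_not_serving, then inspect the logs of each Pod it returns.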
Webhooks that are considered unsafe
If a webhook is intercepting any resources in system-managed namespaces, or certain types of resources, GKE considers this unsafe and recommends that you update the webhooks to avoid intercepting these resources.
- Follow the instructions to view insights and recommendations, choosing one insight at a time to troubleshoot. GKE only generates one insight per cluster, and this insight lists one or more webhook configurations, each of which lists one or more webhooks. For each webhook configuration listed, the insight states the reason why the configuration was flagged.
Inspect the webhook configuration:
Console
From the insight's sidebar panel, see the table. In each row is the name of the webhook configuration, and the reason why this configuration was flagged.
To inspect each configuration, click the name to navigate to this configuration in the GKE Object Browser dashboard.
kubectl
Run the following kubectl command to get the webhook configuration, replacing CONFIGURATION_NAME with the name of the webhook configuration:
kubectl get validatingwebhookconfigurations CONFIGURATION_NAME -o yaml
If this command doesn't return anything, run the command again, replacing validatingwebhookconfigurations with mutatingwebhookconfigurations.
In the webhooks section, there are one or more webhooks listed.
Edit the configuration, depending on the reason the webhook was flagged:
Exclude kube-system and kube-node-lease namespaces
A webhook is flagged if scope is *. Or, a webhook is flagged if scope is Namespaced and either of the following conditions is true:
- The operator condition is NotIn and values omits kube-system and kube-node-lease, as in the following example:
    webhooks:
    - admissionReviewVersions:
      ...
      namespaceSelector:
        matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values:
          - blue-system
      objectSelector: {}
      rules:
      - apiGroups:
        ...
        scope: '*'
      sideEffects: None
      timeoutSeconds: 3
  Ensure that you set scope to Namespaced, not *, so that the webhook only operates in specific namespaces. Also ensure that if the operator is NotIn, you include kube-system and kube-node-lease in values (in this example, with blue-system).
- The operator condition is In and values includes kube-system and kube-node-lease, as in the following example:
    namespaceSelector:
      matchExpressions:
      - key: kubernetes.io/metadata.name
        operator: In
        values:
        - blue-system
        - kube-system
        - kube-node-lease
  Ensure that you set scope to Namespaced, not *, so that the webhook only operates in specific namespaces. Ensure that if operator is In, you don't include kube-system and kube-node-lease in values. In this example, only blue-system should be in values as the operator is In.
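The conditions above can be sketched as a small check over one entry of the webhooks list, after the YAML has been parsed into a dictionary. This is an illustrative reading of the rules in this section, not GKE's actual detection logic: the function name is hypothetical, a rule without scope is treated as '*' (the Kubernetes default), and a selector is reported when either system namespace is missing from a NotIn list or present in an In list. Cases this section does not describe, such as a webhook with no namespaceSelector at all, are not covered.

```python
SYSTEM_NAMESPACES = {"kube-system", "kube-node-lease"}
NAME_LABEL = "kubernetes.io/metadata.name"


def namespace_flags(webhook):
    """Return reasons why one webhook entry matches the conditions above."""
    reasons = []

    # Condition 1: any rule with scope '*' (also the default when omitted).
    scopes = {rule.get("scope", "*") for rule in webhook.get("rules") or []}
    if "*" in scopes:
        reasons.append("scope is '*'")

    # Condition 2: the namespaceSelector does not exclude system namespaces.
    selector = webhook.get("namespaceSelector") or {}
    for expr in selector.get("matchExpressions") or []:
        if expr.get("key") != NAME_LABEL:
            continue
        values = set(expr.get("values") or [])
        if expr.get("operator") == "NotIn":
            missing = SYSTEM_NAMESPACES - values
            if missing:
                reasons.append("NotIn omits " + ", ".join(sorted(missing)))
        elif expr.get("operator") == "In":
            present = SYSTEM_NAMESPACES & values
            if present:
                reasons.append("In includes " + ", ".join(sorted(present)))
    return reasons
```

An empty list means none of the conditions in this section matched. The selector findings are reported even when scope is '*', so that both fixes can be made in one pass.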
Exclude matched resources
A webhook is also flagged if nodes, tokenreviews, subjectaccessreviews, or certificatesigningrequests are listed under resources, as in the following example:
    - admissionReviewVersions:
      ...
      resources:
      - 'pods'
      - 'nodes'
      - 'tokenreviews'
      - 'subjectaccessreviews'
      - 'certificatesigningrequests'
      scope: '*'
      sideEffects: None
      timeoutSeconds: 3
Remove nodes, tokenreviews, subjectaccessreviews, and certificatesigningrequests from the resources section. You can keep pods in resources.
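As a minimal sketch of this cleanup, the following function removes the four flagged resource names from every rule of a webhook entry that has been parsed into a dictionary. The function name is hypothetical. Dropping a rule that ends up with no resources is a choice made for this example, and wildcard entries such as '*' or subresources such as nodes/status are not handled.

```python
import copy

FLAGGED_RESOURCES = {
    "nodes",
    "tokenreviews",
    "subjectaccessreviews",
    "certificatesigningrequests",
}


def strip_flagged_resources(webhook):
    """Return a copy of a webhook entry without the flagged resources."""
    cleaned = copy.deepcopy(webhook)  # leave the caller's object untouched
    kept_rules = []
    for rule in cleaned.get("rules") or []:
        rule["resources"] = [
            r for r in rule.get("resources") or [] if r not in FLAGGED_RESOURCES
        ]
        # A rule that only matched flagged resources has nothing left to match.
        if rule["resources"]:
            kept_rules.append(rule)
    cleaned["rules"] = kept_rules
    return cleaned
```

Applied to the example above, only pods would remain in resources. Review the result before writing it back to the cluster, because removing a rule changes which requests the webhook receives.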