Resolving sidecar proxy/webhook issues in Cloud Service Mesh
This section explains common Cloud Service Mesh problems and how to resolve them. If you need additional assistance, see Getting support.
Cloud Service Mesh contains two webhooks:
- The validating webhook ensures applied Istio configuration is valid.
- The mutating webhook sets automatic sidecar injection on new pods.
A configuration issue in one of these webhooks might cause new pods to fail to
start up, or cause `kubectl apply` to generate error messages.
Sidecar injection problems
If you have provisioned managed Cloud Service Mesh, then contact support.
Sidecar injection is not working correctly if you see any of the following:
- pods that are scheduling without sidecars
- pods that should have sidecars injected never appear when using `kubectl get pods`, but the corresponding replica set from `kubectl get replicaset` exists.
Use the following steps to troubleshoot sidecar injection problems.
Verify that your namespace or pod has the correct injection label.
If you are running single-revision Istio (the default), verify that your namespace or pod spec has the `istio-injection=enabled` label.
If you are running multiple-revision Istio (for zero-downtime migrations, multiple control planes, and so on), verify that your namespace or pod spec has the appropriate `istio.io/rev=REVISION` label, where REVISION is the Cloud Service Mesh revision number on `istiod` that corresponds with your selected Cloud Service Mesh version. For more information about revision labels, see Injecting sidecar proxies.
Verify that your Istio sidecar injection webhook is present and has a CA bundle.
The sidecar injector webhook (which is used for automatic sidecar injection) requires a CA bundle to establish secure connections with the API server and `istiod`. This CA bundle is patched into the configuration by `istiod`, but can sometimes be overwritten (for example, if you reapply the webhook configuration).
You can verify the presence of the CA bundle using the following command. The command includes `istio-sidecar-injector-asm-1227-4`, which is specific to this version of Cloud Service Mesh. Ensure you use your actual revision if it differs.
kubectl get mutatingwebhookconfigurations.admissionregistration.k8s.io istio-sidecar-injector-asm-1227-4 -o=jsonpath='{.webhooks[0].clientConfig.caBundle}'
If the output is not empty, the CA bundle is configured. If the CA bundle is missing, restart `istiod` to cause it to rescan the webhook and reinstall the CA bundle.
Check for sidecar injection failures.
If you have injection enabled, but are not seeing pods scheduling, check the status of the next higher level of abstraction. For example, if you are running a deployment but no pods are scheduling, check the status of the corresponding replica sets using the following command:
kubectl -n my-namespace describe replicaset your-deployment-name
If the replica set is present, check the events log at the bottom of the description for errors. If the error relates to sidecar injection, check the `istiod` logs for an indication of what is causing the error.
If the problem persists, the issue might be any of the following:
- Bad configuration passed to the injector
- Firewall configuration problems
- A problem in the Istio code itself
See Troubleshooting Istio for additional diagnostic steps.
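The label and CA bundle checks in the steps above can be wrapped in a couple of shell helpers. This is a minimal sketch, not part of Cloud Service Mesh: the function names are hypothetical, and the namespace and webhook name arguments are placeholders for your own values.

```shell
# Hypothetical helpers; the names and arguments are illustrative only.

# Show the injection-related labels of a namespace as extra columns.
# Argument: NAMESPACE
show_injection_labels() {
  kubectl get namespace "$1" -L istio-injection -L istio.io/rev
}

# Report whether the named mutating webhook has a non-empty CA bundle.
# Returns non-zero when the bundle is empty or the webhook cannot be read.
# Argument: WEBHOOK_NAME
check_ca_bundle() {
  bundle=$(kubectl get mutatingwebhookconfigurations.admissionregistration.k8s.io "$1" \
    -o=jsonpath='{.webhooks[0].clientConfig.caBundle}')
  if [ -n "$bundle" ]; then
    echo "CA bundle present"
  else
    echo "CA bundle missing"
    return 1
  fi
}
```

For example, `check_ca_bundle istio-sidecar-injector-asm-1227-4` checks the webhook name used in the command above; substitute your actual revision if it differs.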
Envoy proxies don't receive configuration from istiod
There are several issues that can prevent proxies from receiving configuration
from `istiod`.
`istiod` will not push configuration to the Envoy proxies if it has problems, such as an RBAC issue preventing it from reading its configuration resource.
Discovery address is incorrect ('no healthy upstream' errors)
The discovery address provided to the sidecar injector is incorrect. If you see logs that mention `gRPC config stream closed, no healthy upstream`, check that the discovery address in the mesh `ProxyConfig` is correct and points to your `istiod` service.
Invalid configuration is being pushed to the proxy. In this case, configuration is successfully pushed to the proxy, but the configuration is invalid. You will see repeating messages similar to the following:
Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
In this example, `cds` is the Cluster Discovery Service (which reports 1 update pushed from `istiod`), and `lds` is the Listener Discovery Service (which reports 1 update rejected from `istiod`). Often you will see an earlier error message that explains the reason for the rejection, which usually starts with a warning about Envoy configuration or similar.
To fix the issue, investigate the cause of the rejected configuration. One common cause is bad `EnvoyFilter` resources. If no reason is obvious, submit a bug report with a configuration dump of the proxy.
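To spot rejected configuration quickly, the readiness message can be scanned for non-zero rejection counts. The following is a minimal sketch that assumes the message format shown in the example above. `rejected_xds` and `dump_proxy_config` are hypothetical helper names, and the pod and namespace arguments are placeholders. The second helper assumes the sidecar container is named `istio-proxy`.

```shell
# Hypothetical helpers; the names and arguments are illustrative only.

# Print each discovery service type (for example cds or lds) that reports
# at least one rejected update. Reads proxy log lines on stdin.
rejected_xds() {
  grep -oE '[a-z]+ updates: [0-9]+ successful, [0-9]+ rejected' |
    awk '$(NF-1) > 0 { print $1 }' | sort -u
}

# Write the Envoy configuration dump of a sidecar to stdout, for example to
# attach to a bug report. Assumes the sidecar container is named istio-proxy.
# Arguments: POD NAMESPACE
dump_proxy_config() {
  kubectl exec "$1" -n "$2" -c istio-proxy -- pilot-agent request GET config_dump
}
```

A possible invocation is `kubectl logs POD -n NAMESPACE -c istio-proxy | rejected_xds`, followed by `dump_proxy_config POD NAMESPACE > config_dump.json` for any pod that reports rejections.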
Pod creation fails
If you observe that pods are not being created successfully, look for error messages that might give clues to the root problem, using the following command:
kubectl describe replicaset YOUR_REPLICA_SET
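If you don't know which replica set is affected, listing recent events for the namespace can help narrow it down. This is a minimal sketch that assumes failed pod creation surfaces as events with the reason `FailedCreate`, which is the reason the ReplicaSet controller typically records. The function name is hypothetical and the namespace argument is a placeholder.

```shell
# Hypothetical helper; the name and argument are illustrative only.

# List pod creation failures recorded as events in a namespace, oldest first.
# Assumes such failures are recorded with the event reason FailedCreate.
# Argument: NAMESPACE
failed_create_events() {
  kubectl get events -n "$1" --field-selector reason=FailedCreate --sort-by=.lastTimestamp
}
```

The event messages usually name the webhook that rejected the request, which you can then match against the error messages in the next section.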
Common webhook error messages
Error messages output by the `kubectl apply` command can provide a hint about
their root cause. See the following table for common error messages, their
causes, and potential resolutions.
Error message | Cause | Resolution |
---|---|---|
`net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)` | This might be a network connectivity issue. | Ensure that your firewall rules provide connectivity to `istiod` on port 15017. |
`no endpoints available for service 'istiod'` | This can occur if the `istiod` pod is not available or not ready. | Check the `istiod` pods to ensure they are running and ready. |
`Service "istiod" not found` | This can occur if the `istiod` service does not exist. | Verify that your Istio installation was successful and correct. |
`x509: certificate signed by unknown authority` | This might be a webhook certificate issue. | Check that `caBundle` is correctly set on the webhook. |
`Failed to update validatingwebhookconfiguration istio-validator-asm-[version-n]-istio-system (failurePolicy=Fail, resourceVersion=[version]): Operation cannot be fulfilled on validatingwebhookconfigurations.admissionregistration.k8s.io "istio-validator-asm-[version-n]-istio-system": the object has been modified; please apply your changes to the latest version and try again.` | A validating webhook from an old version of Istio or Cloud Service Mesh that has been uninstalled might be interfering with an upgrade or install. | Check all webhooks still in the cluster and remove any that reference versions that are no longer installed. |
`Error from server (InternalError): Internal error occurred: failed calling webhook "rev.namespace.sidecar-injector.istio.io": Post "https://istiod-asm-1122-0.istio-system.svc:443/inject?timeout=10s": context deadline exceeded` | For private clusters, port 15017 must be open. This error message indicates that port 15017 might not be open. | Ensure that your firewall rules provide connectivity to `istiod` on port 15017. For more information, see Opening a port on a private cluster. |
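Several of the resolutions above start with the same two lookups: which Istio webhooks are still in the cluster, and whether `istiod` is running. The following is a minimal sketch with hypothetical function names. It assumes the control plane runs in the `istio-system` namespace and that `istiod` pods and services carry the `app=istiod` label, which might not hold for every installation.

```shell
# Hypothetical helpers; the names are illustrative only.

# List validating and mutating webhook configurations whose names mention
# istio, so that entries for uninstalled versions can be spotted.
list_istio_webhooks() {
  kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations -o name |
    grep -i istio
}

# Show the istiod pods and services.
# Assumes the istio-system namespace and the app=istiod label.
istiod_status() {
  kubectl -n istio-system get pods,services -l app=istiod
}
```

Compare the revision suffix in each webhook name with the revisions that are actually installed before you remove anything.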