[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-08-19。"],[],[],null,["# Resolving workload startup issues in Cloud Service Mesh\n=======================================================\n\nThis document explains common Cloud Service Mesh problems and how to resolve\nthem. If you need additional assistance, see\n[Getting support](/service-mesh/v1.21/docs/getting-support).\n\nConnection Refused when reaching a Cloud Service Mesh endpoint\n--------------------------------------------------------------\n\nYou might intermittently experience connection refused (`ECONNREFUSED`) errors\nwith communication from your clusters to your endpoints, for example\nMemorystore Redis, Cloud SQL, or any external service your application\nworkload needs to reach.\n\nThis can occur when your application workload initiates faster than the\nistio-proxy (`Envoy`) container and tries to reach an external endpoint. Because\nat this stage istio-init (`initContainer`) has already executed, there are\niptables rules in place redirecting all outgoing traffic to `Envoy`. Since\nistio-proxy is not ready yet, the iptables rules will redirect traffic to a\nsidecar proxy that is not yet started and therefore, the application gets the\n`ECONNREFUSED` error.\n\nThe following steps detail how to check if this is the error you are\nexperiencing:\n\n1. Check the stackdriver logs with the following Filter to identify which pods\n had the problem.\n\n The following example shows a typical error message: \n\n Error: failed to create connection to feature-store redis, err=dial tcp 192.168.9.16:19209: connect: connection refused\n [ioredis] Unhandled error event: Error: connect ECONNREFUSED\n\n2. Search for an occurrence of the problem. If you are using legacy Stackdriver,\n then use `resource.type=\"container\"`.\n\n resource.type=\"k8s_container\"\n textPayload:\"$ERROR_MESSAGE$\"\n\n3. Expand the latest occurrence to obtain the name of the pod and then make note\n of the `pod_name` under `resource.labels`.\n\n4. Obtain the first occurrence of the issue for that pod:\n\n resource.type=\"k8s_container\"\n resource.labels.pod_name=\"$POD_NAME$\"\n\n Example output: \n\n E 2020-03-31T10:41:15.552128897Z\n post-feature-service post-feature-service-v1-67d56cdd-g7fvb failed to create\n connection to feature-store redis, err=dial tcp 192.168.9.16:19209: connect:\n connection refused post-feature-service post-feature-service-v1-67d56cdd-g7fvb\n\n5. Make note of the timestamp of the first error for this pod.\n\n6. Use the following filter to see the pod startup events.\n\n resource.type=\"k8s_container\"\n resource.labels.pod_name=\"$POD_NAME$\"\n\n Example output: \n\n I 2020-03-31T10:41:15Z spec.containers{istio-proxy} Container image \"docker.io/istio/proxyv2:1.3.3\" already present on machine spec.containers{istio-proxy}\n I 2020-03-31T10:41:15Z spec.containers{istio-proxy} Created container spec.containers{istio-proxy}\n I 2020-03-31T10:41:15Z spec.containers{istio-proxy} Started container spec.containers{istio-proxy}\n I 2020-03-31T10:41:15Z spec.containers{APP-CONTAINER-NAME} Created container spec.containers{APP-CONTAINER-NAME}\n W 2020-03-31T10:41:17Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503 spec.containers{istio-proxy}\n W 2020-03-31T10:41:26Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503 spec.containers{istio-proxy}\n W 2020-03-31T10:41:28Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503 spec.containers{istio-proxy}\n W 2020-03-31T10:41:31Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503 spec.containers{istio-proxy}\n W 2020-03-31T10:41:58Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503 spec.containers{istio-proxy}\n\n7. Use the timestamps of errors and istio-proxy startup events to confirm the\n errors are happening when `Envoy` is not ready.\n\n If the errors occur while the istio-proxy container is not ready yet, it is\n normal to obtain connection refused errors. In the preceding example, the pod\n was trying to connect to Redis as soon as `2020-03-31T10:41:15.552128897Z`\n but by `2020-03-31T10:41:58Z` istio-proxy was still failing readiness probes.\n\n Even though the istio-proxy container started first, it is possible that it\n did not become ready fast enough before the app was already trying to connect\n to the external endpoint.\n\n If this is the problem you are experiencing, then continue through the\n following troubleshooting steps.\n8. Annotate the config at the pod level. This is *only* available at the pod\n level and not at a global level.\n\n annotations:\n proxy.istio.io/config: '{ \"holdApplicationUntilProxyStarts\": true }'\n\n9. Modify the application code so that it checks if `Envoy` is ready before it\n tries to make any other requests to external services. For example, on\n application start, initiate a loop that makes requests to the istio-proxy\n health endpoint and only continues once a 200 is obtained. The istio-proxy\n health endpoint is as follows:\n\n http://localhost:15020/healthz/ready\n\nRace condition during sidecar injection between Vault and Cloud Service Mesh\n----------------------------------------------------------------------------\n\nWhen using `vault` for secrets management, sometimes `vault` injects sidecar\nbefore `istio`, causing that Pods get stuck in `Init` status. When this happens,\nthe Pods created get stuck in Init status after restarting any deployment or\ndeploying a new one. For example: \n\n E 2020-03-31T10:41:15.552128897Z\n post-feature-service post-feature-service-v1-67d56cdd-g7fvb failed to create\n connection to feature-store redis, err=dial tcp 192.168.9.16:19209: connect:\n connection refused post-feature-service post-feature-service-v1-67d56cdd-g7fvb\n\nThis issue is caused by a race condition, both Istio and `vault` inject the\nsidecar and Istio must be the last doing this, the `istio` proxy is not running\nduring init containers. The `istio` init container sets up iptables rules to\nredirect all traffic to the proxy. Since it is not running yet, those rules\nredirect to nothing, blocking all traffic. This is why the init container must\nbe last, so the proxy is up and running immediately after the iptables rules are\nset up. Unfortunately, the order is not deterministic, so if Istio is injected\nfirst it breaks.\n\nTo troubleshoot this condition, allow the IP address of `vault` so the traffic\ngoing to the Vault IP is not redirected to the Envoy Proxy which is not ready\nyet and therefore blocking the communication. To achieve this, a new annotation\nnamed `excludeOutboundIPRanges` should be added.\n\nFor managed Cloud Service Mesh, this is only possible at Deployment or Pod\nlevel under `spec.template.metadata.annotations`, for example: \n\n apiVersion: apps/v1\n kind: Deployment\n ...\n ...\n ...\n spec:\n template:\n metadata:\n annotations:\n traffic.sidecar.istio.io/excludeOutboundIPRanges:\n\nFor in-cluster Cloud Service Mesh, there is an option to set it as a global\none with an IstioOperator under `spec.values.global.proxy.excludeIPRanges`, for\nexample: \n\n apiVersion: install.istio.io/v1alpha1\n kind: IstioOperator\n spec:\n values:\n global:\n proxy:\n excludeIPRanges: \"\"\n\nAfter adding the annotation, restart your workloads."]]