Resolving scaling issues in Cloud Service Mesh
Note: This guide only supports Cloud Service Mesh with Istio APIs and does not support Google Cloud APIs. For more information, see Cloud Service Mesh overview.
This section explains common Cloud Service Mesh problems and how to resolve
them. If you need additional assistance, see
Getting support.
Scaling factors
Istiod sends configuration to each sidecar using a long-lived gRPC stream.
Several characteristics of this design affect scaling:
- The size of the configuration to generate:
  - This depends on the total number of services, pods, and Istio resources.
  - For large meshes, configure the Sidecar resource to reduce the size of the configuration sent to each proxy.
- The rate of change in the environment:
  - When a new service is created or the Istio configuration changes, full updates are sent to proxies.
  - Adding new endpoints is inexpensive, because only incremental updates are sent.
- The number of proxies for which configuration is generated:
  - This depends on the number of gateways and pods with a sidecar.
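To illustrate the first factor, the Istio Sidecar resource can restrict which services the proxies in a namespace receive configuration for. The following Python sketch builds such a manifest and prints it as JSON, which `kubectl apply -f` accepts. The namespace `my-app` and the helper `build_sidecar_manifest` are hypothetical, and the egress hosts would need to match the dependencies your workloads actually have.

```python
import json


def build_sidecar_manifest(namespace, extra_namespaces=("istio-system",)):
    """Return a Sidecar manifest that limits egress config to a few namespaces.

    `namespace` and `extra_namespaces` are illustrative inputs, not values
    taken from this guide.
    """
    hosts = ["./*"]  # services in the Sidecar's own namespace
    hosts += [f"{ns}/*" for ns in extra_namespaces]
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "Sidecar",
        # A Sidecar named "default" with no workloadSelector applies to
        # every workload in the namespace.
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"egress": [{"hosts": hosts}]},
    }


if __name__ == "__main__":
    # "my-app" is a hypothetical namespace name.
    print(json.dumps(build_sidecar_manifest("my-app"), indent=2))
```

With a manifest like this applied, proxies in the namespace would only receive configuration for services in their own namespace and in `istio-system`, rather than for every service in the mesh.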
Scaling considerations
Istiod scales well both vertically (larger resource requests) and horizontally
(more replicas). Make sure that the CPU limit is not too restrictive: if Istiod
reaches its CPU limit, it is throttled, which slows configuration
distribution. If you encounter performance issues, consider
upgrading to the latest version of Cloud Service Mesh, because each version
includes performance optimizations.
For more guidance on scaling your mesh, see the
Scalability best practices guide.
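One way to check whether a container such as Istiod is being throttled is to compare the throttled and total scheduler periods that the Linux kernel reports in the container's cgroup `cpu.stat` file (for example `/sys/fs/cgroup/cpu.stat` on cgroup v2). The sketch below parses that format. The sample text and the 5% threshold are assumptions for illustration, not values from Cloud Service Mesh.

```python
def throttled_fraction(cpu_stat_text):
    """Return nr_throttled / nr_periods from the content of a cgroup cpu.stat file."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        # Each line has the form "<key> <integer value>".
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Made-up sample in the cgroup v2 layout, for illustration only.
SAMPLE = """nr_periods 1000
nr_throttled 120
throttled_usec 4500000
"""

if __name__ == "__main__":
    fraction = throttled_fraction(SAMPLE)
    # 0.05 is an arbitrary illustrative threshold.
    if fraction > 0.05:
        print(f"throttled in {fraction:.0%} of periods; consider raising the CPU limit")
```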
Unbalanced load
Large changes in cluster size might temporarily unbalance the load across
Istiod replicas, because proxies hold long-lived connections. A 30-minute
maximum connection age mitigates this by letting the load rebalance naturally
as connections are closed and re-established. When a connection is closed for
this reason, Envoy might log messages such as gRPC config stream closed: 13.
To reduce the impact, run multiple replicas of Istiod (the default is 2
replicas), and pre-scale Istiod if you expect extreme cluster scale-ups.
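The rebalancing effect of a maximum connection age can be sketched with a toy simulation. It assumes that each proxy reconnects to a uniformly random replica when its connection expires, which simplifies how connections are really distributed. The proxy and replica counts are made up for illustration.

```python
import random


def simulate(proxies=1000, old_replicas=2, new_replicas=6,
             max_age_min=30, minutes=60, seed=0):
    """Toy model: return the busiest replica's connection count after each minute."""
    rng = random.Random(seed)
    # Before the scale-up, every proxy is connected to one of the old replicas.
    replica = [rng.randrange(old_replicas) for _ in range(proxies)]
    # Connection ages are spread out so expirations do not all coincide.
    age = [rng.randrange(max_age_min) for _ in range(proxies)]
    history = []
    for _ in range(minutes):
        for i in range(proxies):
            age[i] += 1
            if age[i] >= max_age_min:
                # The connection hit its maximum age: reconnect to any replica.
                replica[i] = rng.randrange(new_replicas)
                age[i] = 0
        counts = [replica.count(r) for r in range(new_replicas)]
        history.append(max(counts))
    return history


if __name__ == "__main__":
    h = simulate()
    print("busiest replica after 1 minute:", h[0])
    print("busiest replica after 60 minutes:", h[-1])
```

In this model the busiest replica starts with roughly half of all connections and approaches an even share once every connection has been recycled at least once, which is why the imbalance after a scale-up is temporary.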