Scaling Limits for Cloud Service Mesh on GKE

This document describes the scaling limits of the control plane for managed Cloud Service Mesh architectures on GKE so you can make informed decisions regarding your deployments.

Overview

The scalability of Cloud Service Mesh on GKE depends on the efficient operation of its two main components, the data plane and the control plane. This document focuses on scaling limits of the control plane. Refer to Scalability Best Practices for data plane scalability best practices.

Some of the documented scaling limits are enforced by quota restrictions; exceeding them requires a quota increase request. Others are not strictly enforced, but exceeding them can lead to undefined behavior and degraded performance.

To understand how Istio resources are translated to Google Cloud resources, refer to the Understanding API resources guide first.

Service scaling limits

Service scaling is limited along two dimensions: the number of services per cluster and the number of Google Cloud resources that those services generate. The following sections describe these limits.

Note that once Cloud Service Mesh is enabled for a particular membership (that is, a GKE cluster), all Kubernetes Services in the cluster are translated to Cloud Service Mesh services, including those that target workloads without a Cloud Service Mesh sidecar. Cloud Service Mesh creates zonal network endpoint groups (NEGs) for all Services in the GKE cluster. If the cluster is regional, NEGs are created in every node pool zone in the region.

Cloud Service Mesh services versus Kubernetes services

Cloud Service Mesh services are not the same as Kubernetes Services: Cloud Service Mesh creates one service per port.

For example, the following Kubernetes Service is internally translated into two Cloud Service Mesh services, one for each port:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http
  - port: 443
    targetPort: 443
    protocol: TCP
    name: https

Destination rule subsets

When you configure the Istio DestinationRule API with subsets, each subset can generate multiple new Cloud Service Mesh services.

For example, consider the following DestinationRule that targets the Kubernetes Service defined earlier:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: my-service-destinationrule
spec:
  host: my-service
  subsets:
  - name: testversion
    labels:
      version: v3
  - name: prodversion
    labels:
      version: v2

A new synthetic service is created for each port of each subset. Because the original Kubernetes Service created two Cloud Service Mesh services, this DestinationRule creates four additional Cloud Service Mesh services (two per subset), for a total of six.

Multi-project deployments

When a single mesh is deployed across workloads in different Google Cloud projects, all Cloud Service Mesh service resources are created in the fleet host project. This means they are all subject to the Cloud Service Mesh scalability limitations in the fleet host project.

Kubernetes headless services

Kubernetes headless Services have a lower limit than regular Services: Cloud Service Mesh supports only 50 headless services per cluster. See the Kubernetes networking documentation for an example.
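For reference, a headless Service is a Service with clusterIP set to None, so DNS resolves directly to the Pod IP addresses instead of a cluster IP. The following minimal sketch illustrates the pattern (the service name and labels are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: my-headless-service
spec:
  clusterIP: None        # makes the Service headless
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http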

Endpoint scaling limits

Endpoint scaling limits apply at two levels:

  • Per Cloud Service Mesh service

  • Per GKE cluster

Regular Kubernetes services

The endpoints per NEG quota limits the maximum number of endpoints that can belong to a single Kubernetes Service.

Kubernetes headless services

For Kubernetes headless Services, Cloud Service Mesh supports at most 36 endpoints per headless service. Refer to the Kubernetes networking documentation for an example.

GKE cluster limits

Cloud Service Mesh supports up to 5,000 endpoints (Pod IP addresses) per cluster.

Gateway scaling limit

When you use Istio Gateways, especially to terminate HTTPS connections by using TLS credentials stored in Kubernetes Secrets, Cloud Service Mesh supports at most the following numbers of gateway pods:

  • 1,500 gateway pods when using regional GKE clusters

  • 500 gateway pods when using zonal or Autopilot GKE clusters
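As an illustration of the configuration these limits apply to, the following sketch shows an Istio Gateway that terminates HTTPS by using a TLS certificate stored in a Kubernetes Secret. The gateway name, host, and Secret name are placeholders:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: my-gateway
spec:
  selector:
    istio: ingressgateway   # targets the ingress gateway pods
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: my-tls-secret   # Kubernetes Secret with the TLS key and certificate
    hosts:
    - "example.com"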