Anthos Service Mesh and Traffic Director are now Cloud Service Mesh. For more information, see the Cloud Service Mesh overview.

Configuration updates for modernization

This document describes configuration updates you may need to make to your managed Cloud Service Mesh before modernizing your mesh to the TRAFFIC_DIRECTOR control plane from the ISTIOD control plane.

The following is a list of configuration updates that might be necessary to prepare your cluster for modernization. See each section for update instructions:

- Multi-cluster
- Enable Workload Identity Federation for GKE
- Enable managed CNI
- Non-Standard Binary Usage in Sidecar
- Migrate to the Istio Ingress Gateway

For more information on the modernization workflow, see the Managed control plane modernization page.

Migrate from Istio secrets to multicluster_mode

Multi-cluster secrets are not supported when a cluster is using the TRAFFIC_DIRECTOR control plane. This section describes how to migrate from Istio multi-cluster secrets to multicluster_mode.

Istio secrets versus declarative API overview

Open source Istio multi-cluster endpoint discovery works by using istioctl or other tools to create a Kubernetes Secret in a cluster. This secret allows a cluster to load balance traffic to another cluster in the mesh. The ISTIOD control plane then reads this secret and begins routing traffic to that other cluster.

Cloud Service Mesh has a declarative API to control multi-cluster traffic instead of directly creating Istio secrets. This API treats Istio secrets as an implementation detail and is more reliable than creating Istio secrets manually. Future Cloud Service Mesh features will depend on the declarative API, and you won't be able to use those new features with Istio secrets directly. The declarative API is the only supported path forward.

If you are using Istio secrets, migrate to the declarative API as soon as possible. Note that the multicluster_mode setting configures each cluster to direct traffic to every other cluster in the mesh. Secrets allow a more flexible configuration: for each cluster, you can choose which other clusters in the mesh it directs traffic to. For a full list of the differences between the supported features of the declarative API and Istio secrets, see Supported features using Istio APIs.

Migrate from Istio secrets to declarative API

If you provisioned Cloud Service Mesh using automatic management with the fleet feature API, you don't need to follow these instructions. These steps only apply if you onboarded using asmcli --managed.

Note: This process changes the secrets that point to a cluster. During the process, the endpoints are removed and then re-added. Between removal and re-addition, traffic briefly reverts to routing locally instead of load balancing to other clusters. For more information, see the GitHub issue.

To move from using Istio secrets to the declarative API, follow these steps. Execute these steps at the same time or in close succession:

1. Enable the declarative API for each cluster in the fleet where you want multi-cluster endpoint discovery by setting multicluster_mode=connected. If you don't want a cluster to be discoverable, you must explicitly set multicluster_mode=disconnected.

Use the following command to opt a cluster in to multi-cluster endpoint discovery:
```
kubectl patch configmap/asm-options -n istio-system --type merge -p '{"data":{"multicluster_mode":"connected"}}'
```
Use the following command to opt a cluster out of endpoint discovery:
```
kubectl patch configmap/asm-options -n istio-system --type merge -p '{"data":{"multicluster_mode":"disconnected"}}'
```
2. Delete old secrets.

After you set multicluster_mode=connected on your clusters, each cluster has a new secret generated for every other cluster that also has multicluster_mode=connected set. Each secret is placed in the istio-system namespace, and its name has the following format:
```
istio-remote-secret-projects-PROJECT_NAME-locations-LOCATION-memberships-MEMBERSHIPS
```
Each secret will also have the label istio.io/owned-by: mesh.googleapis.com applied.

Once the new secrets are created, you can delete any secrets manually created with istioctl create-remote-secret:
```
kubectl delete secret SECRET_NAME -n istio-system
```
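Before deleting the manually created secrets, it can help to confirm that the generated secrets are present. The following is a minimal sketch that builds the expected secret name from the format shown earlier and lists secrets by the istio.io/owned-by label. The function names and the example project, location, and membership values are hypothetical and are not part of Cloud Service Mesh.

```shell
# Hypothetical helper: build the generated secret name from the documented format.
remote_secret_name() {
  # $1 = project name, $2 = membership location, $3 = membership name
  printf 'istio-remote-secret-projects-%s-locations-%s-memberships-%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: list the secrets that carry the generated-secret label.
list_generated_secrets() {
  kubectl get secrets -n istio-system \
    -l istio.io/owned-by=mesh.googleapis.com -o name
}
```

For example, `remote_secret_name my-project us-central1 cluster-b` prints the name to look for in the output of `list_generated_secrets`, run against each connected cluster.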

Once migrated, check your request metrics to make sure they're routed as expected.

Enable Workload Identity Federation for GKE

Workload Identity Federation for GKE is the recommended secure method for Google Kubernetes Engine workloads to access Google Cloud services such as Compute Engine, BigQuery, and Machine Learning APIs. Because it uses IAM policies, Workload Identity Federation doesn't require manual configuration or less secure methods like service account key files. For more details on Workload Identity Federation, see How Workload Identity Federation for GKE works.

The following sections describe how to enable Workload Identity Federation.

Enable Workload Identity Federation on clusters

Check that Workload Identity Federation is enabled for your cluster. To do that, ensure that the GKE cluster has a Workload Identity Federation pool configured, which is essential for IAM credential validation.

Use the following command to check the workload identity pool set for a cluster:
```
gcloud container clusters describe CLUSTER_NAME \
  --format="value(workloadIdentityConfig.workloadPool)"
```
Replace CLUSTER_NAME with the name of your GKE cluster. If you haven't already specified a default zone or region for gcloud, you might also need to specify a --region or --zone flag when running this command.
If the output is empty, follow the instructions in Update an existing cluster to enable Workload Identity Federation on existing GKE clusters.
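The check above can be wrapped in a small function so that it is easy to repeat for several clusters. This is a minimal sketch, assuming gcloud already has a default zone or region configured. The check_workload_pool name is a hypothetical helper, not a gcloud feature.

```shell
# Hypothetical helper: report whether a cluster has a workload pool configured.
check_workload_pool() {
  # $1 = GKE cluster name
  pool=$(gcloud container clusters describe "$1" \
    --format="value(workloadIdentityConfig.workloadPool)")
  if [ -z "$pool" ]; then
    echo "$1: Workload Identity Federation is not enabled"
    return 1
  fi
  echo "$1: workload pool $pool"
}
```

Calling `check_workload_pool CLUSTER_NAME` returns a non-zero status when the pool is empty, so the function can also be used in a loop or a script condition.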

Enable Workload Identity Federation on node pools

After Workload Identity Federation is enabled on a cluster, node pools must be configured to use the GKE metadata server.

List all the node pools of a Standard cluster. Run the gcloud container node-pools list command:
```
gcloud container node-pools list --cluster CLUSTER_NAME
```
Replace CLUSTER_NAME with the name of your GKE cluster. If you haven't already specified a default zone or region for gcloud, you might also need to specify a --region or --zone flag when running this command.
Verify that each node pool is using the GKE metadata server:
```
gcloud container node-pools describe NODEPOOL_NAME \
    --cluster=CLUSTER_NAME \
    --format="value(config.workloadMetadataConfig.mode)"
```
Replace the following:
- NODEPOOL_NAME with the name of your node pool.
- CLUSTER_NAME with the name of your GKE cluster.
If the output doesn't contain GKE_METADATA, update the node pool using the Update an existing node pool guide.
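For clusters with many node pools, the list and describe commands above can be combined into one loop. The following is a minimal sketch, assuming a default zone or region is configured for gcloud. The function name is hypothetical.

```shell
# Hypothetical helper: print every node pool whose workload metadata mode
# is not GKE_METADATA, together with the mode that was found.
list_pools_without_gke_metadata() {
  # $1 = GKE cluster name
  cluster="$1"
  for pool in $(gcloud container node-pools list --cluster "$cluster" \
      --format="value(name)"); do
    mode=$(gcloud container node-pools describe "$pool" \
      --cluster="$cluster" \
      --format="value(config.workloadMetadataConfig.mode)")
    if [ "$mode" != "GKE_METADATA" ]; then
      echo "$pool: ${mode:-unset}"
    fi
  done
}
```

An empty result from `list_pools_without_gke_metadata CLUSTER_NAME` suggests that every node pool already uses the GKE metadata server.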

Enable managed container network interface (CNI)

This section guides you through enabling managed CNI for Cloud Service Mesh on Google Kubernetes Engine.

Managed CNI overview

Managed container network interface (CNI) is a Google-managed implementation of the Istio CNI. The CNI plugin streamlines pod networking by configuring iptables rules. This enables traffic redirection between applications and Envoy proxies, eliminating the need for privileged permissions for the init-container required to manage iptables.

The Istio CNI plugin replaces the istio-init container. The istio-init container was previously responsible for setting up the pod's network environment to enable traffic interception for the Istio sidecar. The CNI plugin performs the same network redirect function, but with the added benefit of reducing the need for elevated privileges, thereby enhancing security.

Therefore, for enhanced security and reliability, and to simplify management and troubleshooting, managed CNI is required across all managed Cloud Service Mesh deployments.

Impact on init containers

Init containers are specialized containers that run before application containers for setup tasks. Setup tasks can include tasks such as downloading configuration files, communicating with external services, or performing pre-application initialization. Init containers that rely on network access might encounter issues when managed CNI is enabled in the cluster.

The pod setup process with managed CNI is as follows:

1. The CNI plugin sets up pod network interfaces, assigns pod IPs, and redirects traffic to the Istio sidecar proxy, which hasn't started yet.
2. All init containers execute and complete.
3. The Istio sidecar proxy starts alongside the application containers.

Therefore, if an init container attempts to make outbound network connections or connect to services within the mesh, the network requests from the init container may be dropped or misrouted. This is because the Istio sidecar proxy, which manages network traffic for the pod, is not running when the requests are made. For more details, refer to the Istio CNI documentation.
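To find workloads that might be affected, you can list the pods that define their own init containers. The following is a minimal sketch. The function names are hypothetical, and excluding istio-init, istio-validation, and istio-proxy is an assumption about which init containers belong to the mesh. The output only shows candidates for review; it doesn't tell you whether an init container actually uses the network.

```shell
# Hypothetical helper: read "namespace/pod<TAB>init container names" lines on
# stdin and print only pods that have init containers other than the mesh's own.
filter_custom_init_containers() {
  awk -F'\t' '{
    n = split($2, names, " ")
    custom = ""
    for (i = 1; i <= n; i++) {
      # Assumption: these names are injected by the mesh and can be ignored.
      if (names[i] != "istio-init" && names[i] != "istio-validation" && names[i] != "istio-proxy") {
        if (custom != "") custom = custom ","
        custom = custom names[i]
      }
    }
    if (custom != "") print $1 "\t" custom
  }'
}

# Hypothetical helper: collect init container names for all pods in the cluster.
list_custom_init_containers() {
  kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.initContainers[*].name}{"\n"}{end}' \
    | filter_custom_init_containers
}
```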

Enable managed CNI for your cluster

Follow the steps in this section to enable managed CNI on your cluster.

Remove network dependencies from your init container. Consider the following alternatives:
- Modify application logic or containers: You can modify your services to remove the dependency on init containers that require network requests or perform network operations within your application containers, after the sidecar proxy has started.
- Use Kubernetes ConfigMaps or secrets: Store configuration data fetched by the network request in Kubernetes ConfigMaps or secrets and mount them into your application containers. For alternative solutions, refer to the Istio documentation.
Enable managed CNI on your cluster:
1. Make the following configuration changes:
  1. Run the following command to locate the ControlPlaneRevision resource.
```
kubectl get controlplanerevision -n istio-system
```
  2. In your ControlPlaneRevision (CPR) custom resource (CR), set the label mesh.cloud.google.com/managed-cni-enabled to true.
```
kubectl label controlplanerevision CPR_NAME \
    -n istio-system mesh.cloud.google.com/managed-cni-enabled=true \
    --overwrite
```
    Replace CPR_NAME with the value under the NAME column from the output of the previous step.
  3. In the asm-options ConfigMap, set the ASM_OPTS value to CNI=on.
```
kubectl patch configmap asm-options -n istio-system \
    -p '{"data":{"ASM_OPTS":"CNI=on"}}'
```
  4. In your ControlPlaneRevision (CPR) custom resource (CR), set the label mesh.cloud.google.com/force-reprovision to true. This action triggers a control plane restart.
    
    Note: This method is not the recommended method for restarting the control plane, and should only be used for Cloud Service Mesh modernization efforts.
```
kubectl label controlplanerevision CPR_NAME \
    -n istio-system mesh.cloud.google.com/force-reprovision=true \
    --overwrite
```
2. Check the feature state. Retrieve the feature state using the following command:
```
gcloud container fleet mesh describe --project FLEET_PROJECT_ID
```
  Replace FLEET_PROJECT_ID with the ID of your Fleet Host project. Generally, the FLEET_PROJECT_ID has the same name as the project.
  - Verify that the MANAGED_CNI_NOT_ENABLED condition is removed from servicemesh.conditions.
  - Note: It may take up to 15-20 minutes for the state to update. If the condition is still present, wait a few minutes and re-run the command.
3. Once the controlPlaneManagement.state is Active in the cluster's feature state, restart the pods.
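The condition check and the pod restart can be scripted. The following is a minimal sketch with hypothetical function names. It assumes that the workloads are Deployments and restarts them one namespace at a time; adapt it for other workload types.

```shell
# Hypothetical helper: succeeds while the MANAGED_CNI_NOT_ENABLED condition
# still appears in the feature state.
cni_condition_present() {
  # $1 = fleet host project ID
  gcloud container fleet mesh describe --project "$1" \
    | grep -q "MANAGED_CNI_NOT_ENABLED"
}

# Hypothetical helper: restart all Deployments in one namespace so that
# their pods are re-created.
restart_namespace_deployments() {
  # $1 = namespace
  kubectl rollout restart deployment -n "$1"
}
```

For example, run `cni_condition_present FLEET_PROJECT_ID` until it fails, confirm that the control plane state is Active, and then run `restart_namespace_deployments NAMESPACE` for each mesh namespace.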

Move away from Non-Standard Binary Usage in Sidecar

This section suggests ways to make your deployments compatible with the distroless Envoy proxy image.

Distroless Envoy proxy sidecar images

Cloud Service Mesh uses two types of Envoy proxy sidecar images, depending on your control plane configuration: an Ubuntu-based image that contains various binaries, and a distroless image. Distroless base images are minimal container images that prioritize security and resource optimization by including only essential components. This reduces the attack surface and helps prevent vulnerabilities. For more information, refer to the documentation on Distroless proxy image.

Binary compatibility

As a best practice, you should restrict the contents of a container runtime to only the necessary packages. This approach improves security and the signal-to-noise ratio of Common Vulnerabilities and Exposures (CVE) scanners. The distroless sidecar image has a minimal set of dependencies, stripped of all non-essential executables, libraries, and debugging tools. It is therefore not possible to execute a shell command or use curl, ping, or other debug utilities inside the container, for example through kubectl exec.

Make clusters compatible with distroless images

Remove references to any unsupported binaries (like bash or curl) from your configuration, particularly inside Readiness, Startup, and Liveness probes, and Lifecycle PostStart and PreStop hooks within the istio-proxy, istio-init, or istio-validation containers.
Consider alternatives like holdApplicationUntilProxyStarts for certain use cases.
For debugging, you can use ephemeral containers to attach to a running workload Pod. You can then inspect it and run custom commands. For an example, see Collecting Cloud Service Mesh logs.
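As a starting point for finding such references, you can scan your local manifest files for the binary names. The following is a minimal sketch. The function name is hypothetical, and the list of binaries (bash, curl, and ping from this section, plus sh as an assumption) is illustrative rather than complete. The scan covers the whole file, not only the istio-proxy, istio-init, or istio-validation containers, so review each match.

```shell
# Hypothetical helper: print manifest lines that mention binaries that are
# not available in the distroless image. Matches whole words only.
find_unsupported_binaries() {
  # Arguments: one or more manifest files to scan.
  grep -HnwE 'bash|sh|curl|ping' "$@"
}
```

Each output line has the form FILE:LINE:TEXT. The function returns a non-zero status when nothing matches.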

If you can't find a solution for your specific use case, contact Google Cloud Support at Getting support.

Migrate to the Istio Ingress Gateway

This section shows you how to migrate to the Istio Ingress Gateway. There are two methods for migrating to the Istio Ingress Gateway:

Phased Migration with Traffic Splitting

This method prioritizes minimizing disruption. You incrementally send traffic to the new Istio gateway, which lets you monitor its performance on a small percentage of requests and quickly revert if necessary. Keep in mind that configuring Layer 7 traffic splitting can be challenging for some applications, and that you need to manage both gateway systems concurrently during the transition. See Phased Migration with traffic splitting for the steps.

Direct Migration

This method involves rerouting all traffic to the new Istio gateway at once, after you have thoroughly tested it. The advantage of this approach is complete separation from the old gateway's infrastructure, allowing adaptable configuration of the new gateway without the constraints of the existing setup. However, there is an increased risk of downtime in case unexpected problems arise with the new gateway during the transition. See Direct Migration for the steps.

The following migration examples assume you have an HTTP service (httpbin) running in the application namespace (default) and exposed externally using Kubernetes Gateway API. The relevant configurations are:

Gateway: k8-api-gateway (in istio-ingress namespace) - configured to listen for HTTP traffic on port 80 for any hostname ending with .example.com.
HTTPRoute: httpbin-route (in default namespace) - directs any HTTP request with the hostname httpbin.example.com and a path starting with /get to the httpbin service within the default namespace.
The httpbin application is accessible using the external IP 34.57.246.68.

Basic gateway diagram

Phased Migration with traffic splitting

Provision a new Istio Ingress Gateway

Deploy a new Ingress Gateway following the steps in the Deploy sample gateway section and customize the sample configurations to your requirements. The samples in the anthos-service-mesh repository are meant for deploying an istio-ingressgateway LoadBalancer service and the corresponding ingress-gateway pods.

Example Gateway Resource (istio-ingressgateway.yaml)

```
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: istio-api-gateway
  namespace: GATEWAY_NAMESPACE
spec:
  selector:
    istio: ingressgateway  # The selector should match the ingress-gateway pod labels.
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:   # or specific hostnames if needed
    - "httpbin.example.com"
```

Apply the Gateway configuration to manage traffic:
```
kubectl apply -f istio-ingressgateway.yaml -n GATEWAY_NAMESPACE
```
Ensure that spec.selector in your Gateway resource matches the labels of your ingress-gateway pods. For example, if the ingress-gateway pods have the label istio=ingressgateway, your Gateway configuration must also select the istio=ingressgateway label.
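To compare the selector with the running pods, you can list the pods that carry the label. This is a minimal sketch; gateway_pods_matching is a hypothetical helper name.

```shell
# Hypothetical helper: list the pods in a namespace that match a label selector.
gateway_pods_matching() {
  # $1 = gateway namespace, $2 = label selector, for example istio=ingressgateway
  kubectl get pods -n "$1" -l "$2" -o name
}
```

If `gateway_pods_matching GATEWAY_NAMESPACE istio=ingressgateway` prints nothing, the selector in the Gateway resource doesn't match any ingress-gateway pod in that namespace.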

Configure initial routing for the new Gateway

Define the initial routing rules for your application using an Istio VirtualService.

Example VirtualService (my-app-vs-new.yaml):

```
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: httpbin-vs
  namespace: MY_APP_NAMESPACE
spec:
  gateways:
  - istio-ingress/istio-api-gateway  # Replace with <gateway-namespace/gateway-name>
  hosts:
  - httpbin.example.com
  http:
  - match:
    - uri:
        prefix: /get
    route:
    - destination:
        host: httpbin
        port:
          number: 8000
```

Apply the VirtualService:

```
kubectl apply -f my-app-vs-new.yaml -n MY_APP_NAMESPACE
```

Access the backend (httpbin) service through the newly deployed Istio Ingress Gateway

Set the Ingress Host environment variable to the external IP address associated with the recently deployed istio-ingressgateway load balancer:

```
export INGRESS_HOST=$(kubectl -n GATEWAY_NAMESPACE get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
```

Verify the application (httpbin) is accessible using the new gateway:
```
curl -s -I -HHost:httpbin.example.com "http://$INGRESS_HOST/get"
```
The output is similar to:
```
HTTP/1.1 200 OK
```

Request flow with the new istio ingress gateway

Modify existing Ingress for traffic splitting

After confirming the successful setup of the new gateway (for example, istio-api-gateway), you can begin routing a portion of your traffic through it. To do this, update your current HTTPRoute to direct a small percentage of traffic to the new gateway, while the larger portion continues to use the existing gateway (k8-api-gateway).

Open the HTTPRoute for editing:
```
kubectl edit httproute httpbin-route -n MY_APP_NAMESPACE
```
Add a new backend reference pointing to the new Ingress Gateway's load balancer service with an initial weight of 10%, and set the weight of the existing backend to 90%.

```
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin-route
  namespace: MY_APP_NAMESPACE  # your application's namespace
spec:
  parentRefs:
  - name: k8-api-gateway
    namespace: istio-ingress
  hostnames: ["httpbin.example.com"]
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /get
    backendRefs:
    - name: httpbin
      port: 8000
      weight: 90
    - name: istio-ingressgateway # Newly deployed load balancer service
      namespace: GATEWAY_NAMESPACE
      port: 80
      weight: 10
```

Grant permission for cross-namespace referencing with a reference grant

To allow your HTTPRoute in the application namespace (default) to access the load balancer service in the gateway namespace (istio-ingress), you may need to create a reference grant. This resource serves as a security control, explicitly defining which cross-namespace references are permitted.

The following istio-ingress-grant.yaml describes an example reference grant:

```
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: istio-ingressgateway-grant
  namespace: istio-ingress # Namespace of the referenced resource
spec:
  from:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    namespace: MY_APP_NAMESPACE # Namespace of the referencing resource
  to:
  - group: ""               # Core Kubernetes API group for Services
    kind: Service
    name: istio-ingressgateway # Loadbalancer Service of the new ingress gateway
```

Apply the reference grant:

```
kubectl apply -f istio-ingress-grant.yaml -n GATEWAY_NAMESPACE
```

Verify that requests to the existing external IP address (for example, 34.57.246.68) are not failing. The following script, check-traffic-flow.sh, checks for request failures:

```
#!/bin/bash
# The brace expansion {1..50} below requires bash, hence the shebang.

# Update the following values based on your application setup
external_ip="34.57.246.68" # Replace with existing external IP
url="http://$external_ip/get"
host_name="httpbin.example.com"

# Counter for successful requests
success_count=0

# Loop 50 times
for i in {1..50}; do
  # Perform the curl request and capture the status code
  status_code=$(curl -s -HHost:"$host_name" -o /dev/null -w "%{http_code}" "$url")
  # Check if the request was successful (status code 200)
  if [ "$status_code" -eq 200 ]; then
    success_count=$((success_count + 1))  # Increment the success counter
  else
    echo "Request $i: Failed with status code $status_code"
  fi
done

# After the loop, check if all requests were successful
if [ "$success_count" -eq 50 ]; then
  echo "All 50 requests were successful!"
else
  echo "Some requests failed.  Successful requests: $success_count"
fi
```

Execute the script to confirm that no requests fail, regardless of the traffic route:
```
chmod +x check-traffic-flow.sh
./check-traffic-flow.sh
```

Request flow with traffic split between existing gateway and new istio ingress gateway

Slowly increase traffic percentage

If no request failures are seen for the existing external IP address (for example, 34.57.246.68), gradually shift more traffic to the new Istio Ingress Gateway by adjusting the backend weights in your HTTPRoute. Increase the weight for the istio-ingressgateway and decrease the weight for the old gateway in small increments such as 10%, 20%, and so on.

Use the following command to update your existing HTTPRoute:

```
kubectl edit httproute httpbin-route -n MY_APP_NAMESPACE
```
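As a non-interactive alternative to kubectl edit, the weights can also be set with a JSON patch. The following is a minimal sketch, assuming the backendRefs order from the example HTTPRoute (index 0 is the existing httpbin backend, index 1 is istio-ingressgateway) and a single rule. The build_weight_patch function is a hypothetical helper.

```shell
# Hypothetical helper: build a JSON patch that gives the new gateway the
# requested weight and the existing backend the remainder of 100.
build_weight_patch() {
  # $1 = weight for the new gateway, an integer from 0 to 100
  new_weight="$1"
  old_weight=$((100 - new_weight))
  printf '[{"op":"replace","path":"/spec/rules/0/backendRefs/0/weight","value":%d},{"op":"replace","path":"/spec/rules/0/backendRefs/1/weight","value":%d}]\n' \
    "$old_weight" "$new_weight"
}
```

For example, to send 20% of traffic to the new gateway:

```shell
kubectl patch httproute httpbin-route -n MY_APP_NAMESPACE \
    --type=json -p "$(build_weight_patch 20)"
```

Check the order of backendRefs in your own HTTPRoute before applying the patch, because the paths refer to list positions rather than backend names.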

Full traffic migration and removing the old gateway

When the new Istio Ingress Gateway demonstrates stable performance and successful request handling, shift all traffic to it. Update your HTTPRoute to set the old gateway's backend weight to 0 and the new gateway's to 100.
Once traffic is fully routed to the new gateway, update your external DNS records for your application's hostname (for example, httpbin.example.com) to point to the external IP address of the load balancer service created in Provision a new Istio Ingress Gateway.

Finally, delete the old gateway and its associated resources:

```
kubectl delete gateway OLD_GATEWAY -n GATEWAY_NAMESPACE
kubectl delete service OLD_GATEWAY_SERVICE -n GATEWAY_NAMESPACE
```

Direct Migration