Configuration updates for modernization
This document describes configuration updates you may need to make to your managed Cloud Service Mesh before modernizing your mesh to the TRAFFIC_DIRECTOR control plane from the ISTIOD control plane.
The following is a list of configuration updates that may be necessary to prepare your cluster for modernization. See each section for update instructions:
- Multi-cluster
- Enable Workload Identity Federation for GKE
- Enable managed CNI
- Unsupported binary usage
For more information on the modernization workflow, see the Managed control plane modernization page.
Migrate from Istio secrets to multicluster_mode
Multi-cluster secrets are not supported when a cluster is using the TRAFFIC_DIRECTOR control plane. This section describes how you can modernize from using Istio multi-cluster secrets to using multicluster_mode.
Istio secrets versus declarative API overview
Open source Istio multi-cluster endpoint discovery works by using istioctl or other tools to create a Kubernetes Secret in a cluster. This secret allows a cluster to load balance traffic to another cluster in the mesh. The ISTIOD control plane then reads this secret and begins routing traffic to that other cluster.
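For reference, the open source flow typically looks something like the following sketch, where the context names and cluster name are placeholders:
# Create a secret that lets the local cluster discover endpoints in the remote
# cluster, then apply it to the local cluster.
istioctl create-remote-secret \
    --context=REMOTE_CLUSTER_CONTEXT \
    --name=REMOTE_CLUSTER_NAME | \
  kubectl apply -f - --context=LOCAL_CLUSTER_CONTEXT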
Cloud Service Mesh has a declarative API to control multi-cluster traffic instead of directly creating Istio secrets. This API treats Istio secrets as an implementation detail and is more reliable than creating Istio secrets manually. Future Cloud Service Mesh features will depend on the declarative API, and you won't be able to use those new features with Istio secrets directly. The declarative API is the only supported path forward.
If you are using Istio secrets, migrate to the declarative API as soon as possible. Note that the multicluster_mode setting configures each cluster to send traffic to every other cluster in the mesh. Secrets allow a more flexible configuration, letting you choose, for each cluster, which other clusters in the mesh it sends traffic to.
For a full list of the differences between the supported
features of the declarative API and Istio secrets, see
Supported features using Istio APIs.
Migrate from Istio secrets to declarative API
If you provisioned Cloud Service Mesh using automatic management with the fleet feature API, you don't need to follow these instructions. These steps only apply if you onboarded using asmcli --managed.
Note: this process changes the secrets that point to a cluster. During the process, the endpoints are removed and then re-added. While the endpoints are being removed and re-added, traffic briefly reverts to routing locally instead of load balancing to other clusters. For more information, see the GitHub issue.
To move from using Istio secrets to the declarative API, follow these steps, executing them at the same time or in close succession:
Enable the declarative API for each cluster in the fleet where you want multi-cluster endpoint discovery by setting multicluster_mode=connected. Note that you need to explicitly set multicluster_mode=disconnected if you don't want a cluster to be discoverable.
Use the following command to opt a cluster in to multi-cluster endpoint discovery:
kubectl patch configmap/asm-options -n istio-system --type merge -p '{"data":{"multicluster_mode":"connected"}}'
Use the following command to opt a cluster out of endpoint discovery:
kubectl patch configmap/asm-options -n istio-system --type merge -p '{"data":{"multicluster_mode":"disconnected"}}'
Delete old secrets.
After setting multicluster_mode=connected on your clusters, each cluster will have a new secret generated for every other cluster that also has multicluster_mode=connected set. The secrets are placed in the istio-system namespace and have the following format:
istio-remote-secret-projects-PROJECT_NAME-locations-LOCATION-memberships-MEMBERSHIPS
Each secret will also have the label istio.io/owned-by: mesh.googleapis.com applied. Once the new secrets are created, you can delete any secrets manually created with istioctl create-remote-secret:
kubectl delete secret SECRET_NAME -n istio-system
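To confirm that the new managed secrets exist before you delete the manually created ones, you can list the secrets that carry the ownership label mentioned above (a quick check; the label selector simply mirrors that label):
kubectl get secrets -n istio-system -l istio.io/owned-by=mesh.googleapis.com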
Once migrated, check your request metrics to make sure requests are routed as expected.
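In addition to request metrics, one way to spot-check that endpoints from the other clusters were re-added is to inspect a sidecar's endpoint table with istioctl. This is only a sketch; POD_NAME and NAMESPACE are placeholders for any sidecar-injected pod:
# Lists the endpoints the sidecar knows about; entries with addresses from the
# other clusters indicate that cross-cluster discovery is working.
istioctl proxy-config endpoints POD_NAME -n NAMESPACE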
Enable Workload Identity Federation for GKE
Workload Identity Federation is the recommended secure method for Google Kubernetes Engine workloads to access Google Cloud services such as Compute Engine, BigQuery, and Machine Learning APIs. Workload Identity Federation doesn't require manual configuration or less secure methods like service account key files because it uses IAM policies. For more details on Workload Identity Federation, see How Workload Identity Federation for GKE works.
The following sections describe how to enable Workload Identity Federation.
Enable Workload Identity Federation on clusters
Check that Workload Identity Federation is enabled for your cluster. To do that, ensure the GKE cluster has a workload identity pool configured, which is essential for IAM credential validation.
Use the following command to check the workload identity pool set for a cluster:
gcloud container clusters describe CLUSTER_NAME \
    --format="value(workloadIdentityConfig.workloadPool)"
Replace CLUSTER_NAME with the name of your GKE cluster. If you haven't already specified a default zone or region for gcloud, you might also need to specify a --region or --zone flag when running this command.
If the output is empty, follow the instructions in Update an existing cluster to enable workload identity on existing GKE clusters.
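The core step in that guide generally looks like the following; PROJECT_ID is a placeholder for your project ID, and the usual --region or --zone flag may also be needed:
# Enable Workload Identity Federation on an existing cluster by setting its
# workload pool.
gcloud container clusters update CLUSTER_NAME \
    --workload-pool=PROJECT_ID.svc.id.goog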
Enable Workload Identity Federation on node pools
After Workload Identity Federation is enabled on a cluster, node pools must be configured to use the GKE metadata server.
List all the node pools of a Standard cluster. Run the gcloud container node-pools list command:
gcloud container node-pools list --cluster CLUSTER_NAME
Replace CLUSTER_NAME with the name of your GKE cluster. If you haven't already specified a default zone or region for gcloud, you might also need to specify a --region or --zone flag when running this command.
Verify that each node pool is using the GKE metadata server:
gcloud container node-pools describe NODEPOOL_NAME \
    --cluster=CLUSTER_NAME \
    --format="value(config.workloadMetadataConfig.mode)"
Replace the following:
- NODEPOOL_NAME with the name of your node pool.
- CLUSTER_NAME with the name of your GKE cluster.
If the output doesn't contain GKE_METADATA, update the node pool using the Update an existing node pool guide.
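The update generally amounts to switching the node pool's workload metadata mode to the GKE metadata server. A sketch, with the usual --region or --zone flag added as needed; note that node pool updates of this kind typically recreate the nodes in the pool:
gcloud container node-pools update NODEPOOL_NAME \
    --cluster=CLUSTER_NAME \
    --workload-metadata=GKE_METADATA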
Enable managed container network interface (CNI)
This section guides you through enabling managed CNI for Cloud Service Mesh on Google Kubernetes Engine.
Managed CNI overview
Managed container network interface (CNI) is a Google-managed implementation of the Istio CNI. The CNI plugin streamlines pod networking by configuring iptables rules. This enables traffic redirection between applications and Envoy proxies, eliminating the need for the privileged permissions that the init container otherwise requires to manage iptables.
The Istio CNI plugin
replaces the istio-init
container. The istio-init
container was previously
responsible for setting up the pod's network environment to enable traffic
interception for the Istio sidecar. The CNI plugin performs the same network
redirect function, but with the added benefit of reducing the need for
elevated privileges, thereby enhancing security.
Therefore, for enhanced security and reliability, and to simplify management and troubleshooting, managed CNI is required across all Managed Cloud Service Mesh deployments.
Impact on init containers
Init containers are specialized containers that run before application containers to perform setup tasks, such as downloading configuration files, communicating with external services, or performing pre-application initialization. Init containers that rely on network access might encounter issues when managed CNI is enabled in the cluster.
The pod setup process with managed CNI is as follows:
- The CNI plugin sets up pod network interfaces, assigns pod IPs, and redirects traffic to the Istio sidecar proxy, which hasn't started yet.
- All init containers execute and complete.
- The Istio sidecar proxy starts alongside the application containers.
Therefore, if an init container attempts to make outbound network connections or connect to services within the mesh, its requests may be dropped or misrouted. This is because the Istio sidecar proxy, which manages network traffic for the pod, is not running when the requests are made. For more details, refer to the Istio CNI documentation.
Enable managed CNI for your cluster
Follow the steps in this section to enable managed CNI on your cluster.
Remove network dependencies from your init container. Consider the following alternatives:
- Modify application logic or containers: You can modify your services to remove the dependency on init containers that require network requests, or perform those network operations within your application containers after the sidecar proxy has started.
- Use Kubernetes ConfigMaps or secrets: Store configuration data fetched by the network request in Kubernetes ConfigMaps or secrets and mount them into your application containers. For alternative solutions, refer to the Istio documentation.
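For example, a rough sketch of the ConfigMap approach, with the URL, file name, ConfigMap name, and namespace as placeholders; the fetch runs outside the cluster (for example, in CI or on a workstation) rather than in an init container:
# Fetch the configuration ahead of time, then store it in a ConfigMap that the
# application container mounts or reads as environment variables.
curl -o config.json https://config.example.com/app/config.json
kubectl create configmap app-config --from-file=config.json -n NAMESPACE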
Enable managed CNI on your cluster by making the following configuration changes:
Run the following command to locate the controlPlaneRevision:
kubectl get controlplanerevision -n istio-system
In your ControlPlaneRevision (CPR) custom resource (CR), set the label mesh.cloud.google.com/managed-cni-enabled to true:
kubectl label controlplanerevision CPR_NAME -n istio-system mesh.cloud.google.com/managed-cni-enabled=true --overwrite
Replace CPR_NAME with the value under the NAME column from the output of the previous step.
In the asm-options ConfigMap, set the ASM_OPTS value to CNI=on:
kubectl patch configmap asm-options -n istio-system -p '{"data":{"ASM_OPTS":"CNI=on"}}'
In your ControlPlaneRevision (CPR) custom resource (CR), set the label mesh.cloud.google.com/force-reprovision to true. This action triggers a control plane restart:
kubectl label controlplanerevision CPR_NAME -n istio-system mesh.cloud.google.com/force-reprovision=true --overwrite
Check the feature state. Retrieve the feature state using the following command:
gcloud container fleet mesh describe --project FLEET_PROJECT_ID
Replace FLEET_PROJECT_ID with the ID of your Fleet Host project. Generally, the FLEET_PROJECT_ID has the same name as the project.
- Verify that the MANAGED_CNI_NOT_ENABLED condition is removed from servicemesh.conditions.
- Note that it may take up to 15-20 minutes for the state to update. Try waiting a few minutes and re-running the command.
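For example, one way to scan the feature state for the condition, assuming a standard shell with grep available; an empty result means the condition is no longer reported:
gcloud container fleet mesh describe --project FLEET_PROJECT_ID --format=json | grep MANAGED_CNI_NOT_ENABLED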
Once the controlPlaneManagement.state is Active in the cluster's feature, restart the pods.
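For example, one way to restart the workloads in a namespace is a rolling restart of its Deployments; other workload types such as StatefulSets and DaemonSets need their own rollout restart, and NAMESPACE is a placeholder:
# Restart sidecar-injected workloads so that new pods pick up the managed CNI
# data plane.
kubectl rollout restart deployment -n NAMESPACE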
Move away from unsupported binary usage
The TRAFFIC_DIRECTOR control plane only supports the distroless image type. This section suggests ways to make your deployment compatible with this image type.
Distroless Envoy proxy sidecar images
Cloud Service Mesh uses one of two types of Envoy proxy sidecar images, depending on your control plane configuration: an Ubuntu-based image containing various binaries, and a distroless image. Distroless base images are minimal container images that prioritize security and resource optimization by including only essential components. The attack surface is reduced to help prevent vulnerabilities. For more information, refer to the documentation on Distroless proxy image.
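If you're not sure which image type a given workload is running, one way to check is to read the sidecar's image from the pod spec. This is a sketch; POD_NAME and NAMESPACE are placeholders, and distroless variants typically include "distroless" in the image name or tag:
kubectl get pod POD_NAME -n NAMESPACE \
    -o jsonpath='{.spec.containers[?(@.name=="istio-proxy")].image}'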
Modernization and binary compatibility
As part of modernization,
Cloud Service Mesh is transitioning to a distroless sidecar image. This image
has a minimal set of dependencies and is stripped of all non-essential
executables, libraries, and debugging tools. It is therefore not possible to execute a shell command or use debug utilities such as curl and ping inside the container, for example with kubectl exec.
For a cluster to be eligible for modernization, the current usage should be compatible with the distroless sidecar image.
Make clusters compatible with distroless images
- Remove references to any unsupported binaries (such as bash or curl) from your configuration, particularly inside readiness, startup, and liveness probes, and lifecycle PostStart and PreStop hooks, within the istio-proxy, istio-init, or istio-validation containers.
- Consider alternatives like holdApplicationUntilProxyStarts for certain use cases.
- For debugging, you can use ephemeral containers to attach to a running workload Pod. You can then inspect it and run custom commands. For an example, see Collecting Cloud Service Mesh logs.
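For example, the following is a rough sketch of attaching an ephemeral debug container to a running pod; the pod name, namespace, and debug image are placeholders:
# Attach an interactive ephemeral container that targets the istio-proxy
# container, then run a shell from the debug image.
kubectl debug -it POD_NAME -n NAMESPACE \
    --image=ubuntu \
    --target=istio-proxy \
    -- bash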
If you can't find a solution for your specific use case, contact Google Cloud Support at Getting support.