This page explains Workload Identity Federation for GKE, including how it works, how enabling it affects your GKE clusters, and how to grant roles to Kubernetes entities in Identity and Access Management policies. In most cases, Workload Identity Federation for GKE is the recommended way for your workloads running on GKE to access Google Cloud services in a secure and manageable way.
This page is for Security specialists and Operators who manage workloads on GKE that require access to other Google Cloud services. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE Enterprise user roles and tasks.
Before reading this page, ensure that you're familiar with the following resources:
- To learn about Workload Identity Federation in other environments, see Workload Identity Federation.
- To enable and use Workload Identity Federation for GKE, see Access Google Cloud APIs from GKE workloads.
- To provide Workload Identity Federation support for clusters in fleets, use fleet workload identity.
Terminology
This page distinguishes between Kubernetes service accounts and Identity and Access Management (IAM) service accounts.
- Kubernetes service accounts: Kubernetes resources that provide an identity for processes running in your GKE Pods.
- IAM service accounts: Google Cloud resources that allow applications to make authorized calls to Google Cloud APIs.
What is Workload Identity Federation for GKE?
Applications running on GKE might need access to Google Cloud APIs such as Compute Engine API, BigQuery Storage API, or Machine Learning APIs.
Workload Identity Federation for GKE lets you use IAM policies to grant Kubernetes workloads in your GKE cluster access to specific Google Cloud APIs without needing manual configuration or less secure methods like service account key files. Using Workload Identity Federation for GKE lets you assign distinct, fine-grained identities and authorization for each application in your cluster.
Workload Identity Federation for GKE replaces the need to use metadata concealment. The sensitive metadata that metadata concealment protects is also protected by Workload Identity Federation for GKE.
Workload Identity Federation for GKE is available through IAM Workload Identity Federation, which provides identities for workloads that run in environments inside and outside Google Cloud. You can use IAM Workload Identity Federation to securely authenticate to supported Google Cloud APIs from workloads running on, for example, AWS, Azure, and self-managed Kubernetes. In GKE, Google Cloud manages the workload identity pool and provider for you and doesn't require an external identity provider.
How Workload Identity Federation for GKE works
When you enable Workload Identity Federation for GKE on a cluster, GKE does the following:
- Creates a fixed workload identity pool for the cluster's Google Cloud project with the following format: PROJECT_ID.svc.id.goog. The workload identity pool provides a naming format that allows IAM to understand and trust Kubernetes credentials. GKE doesn't delete this workload identity pool even if you delete all of the clusters in your project.
- Registers the GKE cluster as an identity provider in the workload identity pool.
- Deploys the GKE metadata server, which intercepts credential requests from workloads, on every node.
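The pool naming format described above can be sketched in a few lines of Python. This is only an illustration of the PROJECT_ID.svc.id.goog format; the function name and the example project ID are assumptions, not part of any Google Cloud library.

```python
def workload_identity_pool(project_id: str) -> str:
    """Return the fixed workload identity pool name for a project.

    Hypothetical helper: it only formats the name that GKE creates,
    following the PROJECT_ID.svc.id.goog format.
    """
    if not project_id:
        raise ValueError("project_id must be non-empty")
    return f"{project_id}.svc.id.goog"


# "example-project" is an assumed project ID used for illustration.
print(workload_identity_pool("example-project"))
```

Because the pool name depends only on the project ID, every cluster in the same project shares the same pool.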
Create IAM allow policies on Google Cloud resources
To provide access with Workload Identity Federation for GKE, you create an IAM allow policy that grants access on a specific Google Cloud resource to a principal that corresponds to your application's identity. For example, you could give read permissions on a Cloud Storage bucket to all Pods that use the database-reader Kubernetes ServiceAccount.
For a list of resources that support allow policies, see Resource types that accept allow policies.
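As a rough sketch of what such a grant looks like, the following Python builds the JSON shape of an allow policy binding (a role plus a list of members). The helper name, the project number, the pool name, and the ServiceAccount UID are illustrative assumptions; in practice you would apply the binding with the Google Cloud CLI or a client library rather than editing dictionaries by hand.

```python
import json


def add_binding(policy: dict, role: str, member: str) -> dict:
    """Return a copy of an allow policy in which `member` is granted `role`.

    Hypothetical helper that only manipulates the policy's JSON shape.
    Conditional bindings are left untouched.
    """
    bindings = [
        dict(b, members=list(b.get("members", [])))
        for b in policy.get("bindings", [])
    ]
    for binding in bindings:
        if binding["role"] == role and "condition" not in binding:
            if member not in binding["members"]:
                binding["members"].append(member)
            break
    else:
        # No unconditional binding for this role exists yet.
        bindings.append({"role": role, "members": [member]})
    return {**policy, "bindings": bindings}


# Assumed values: project number, pool name, and ServiceAccount UID.
member = (
    "principal://iam.googleapis.com/projects/1234567890/locations/global/"
    "workloadIdentityPools/example-project.svc.id.goog/"
    "kubernetes.serviceaccount.uid/00000000-0000-0000-0000-000000000000"
)
policy = add_binding({}, "roles/storage.objectViewer", member)
print(json.dumps(policy, indent=2))
```

The example selects the ServiceAccount by UID only because that selector needs no additional naming assumptions; any supported principal identifier can take its place.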
Use conditions in IAM policies
You can also limit the scope of the access by setting conditions in your allow policies. Conditions are an extensible method of specifying when an allow policy should apply. For example, you could use conditions to grant temporary access to a workload on a specific Google Cloud resource, eliminating the need to manage that access manually.
Conditions might also be useful if you set your allow policies at the project, folder, or organization level instead of on specific resources like Secret Manager secrets or Cloud Storage buckets.
To add a condition to your allow policy, use the following resources:
- Manage conditional role bindings: Add, modify, or remove conditional role bindings.
- Configure temporary access: Use conditions to set expiring access to Google Cloud resources in allow policies.
- Tags and conditional access: Use conditions to only apply allow policies when resources have specific tags.
The following example expressions are for common scenarios in which you might use conditions. For a list of available attributes in expressions, see Attribute reference for IAM Conditions.
Example condition expressions
Scenario | Expression |
---|---|
Allow access before the specified time | request.time < timestamp('…') |
Allow access if the resource in the request has the specified tag | resource.matchTag('…') |
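The time-based scenario in the table above could be assembled as follows. This is a minimal sketch: the helper name, the condition title, and the chosen expiry date are assumptions, and the dictionary only mirrors the title and expression fields of a condition in an allow policy binding.

```python
from datetime import datetime, timezone


def expiring_condition(expires_at: datetime, title: str = "expires") -> dict:
    """Build an IAM condition that allows access only before `expires_at`.

    Hypothetical helper: it formats the timestamp in UTC and embeds it in a
    `request.time < timestamp(...)` expression.
    """
    if expires_at.tzinfo is None:
        raise ValueError("expires_at must be timezone-aware")
    stamp = expires_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "title": title,
        "expression": f'request.time < timestamp("{stamp}")',
    }


# Assumed expiry date, for illustration only.
condition = expiring_condition(datetime(2030, 1, 1, tzinfo=timezone.utc))
print(condition["expression"])
```

You would attach the resulting dictionary as the condition of a role binding so that the grant stops applying after the given time.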
Reference Kubernetes resources in IAM policies
In your IAM policy, you refer to a Kubernetes resource by using an IAM principal identifier to select the resource. This identifier has the following syntax:
PREFIX://iam.googleapis.com/projects/1234567890/locations/global/workloadIdentityPools/example-project.svc.id.goog/SELECTOR
In this example, consider the following fields:
- PREFIX: must be principal or principalSet depending on the resource that you select. principal is for a specific resource, like a single ServiceAccount. principalSet is for multiple resources that belong to the specified resource, like all Pods in a specific cluster.
- SELECTOR: a string that selects a principal type. For example, kubernetes.serviceaccount.uid/SERVICEACCOUNT_UID selects a ServiceAccount by its UID.
The following table shows the supported principal types in GKE:
Principal identifier type | Syntax |
---|---|
All Pods that use a specific Kubernetes ServiceAccount, selected by name | principal://iam.googleapis.com/projects/… |
All Pods that use a specific Kubernetes ServiceAccount, selected by UID | principal://iam.googleapis.com/projects/… |
All Pods in a namespace, regardless of service account or cluster | principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/namespace/NAMESPACE |
All Pods in a specific cluster | principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/kubernetes.cluster/https://container.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/clusters/CLUSTER_NAME |
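The identifiers whose full syntax appears on this page can be assembled with simple string formatting. The sketch below covers the namespace and cluster principal sets from the table and the UID selector from the SELECTOR description; the function names and all example values are assumptions for illustration.

```python
def _pool_path(project_number: str, project_id: str) -> str:
    """Shared path up to and including the workload identity pool."""
    return (
        f"iam.googleapis.com/projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{project_id}.svc.id.goog"
    )


def namespace_principal_set(project_number: str, project_id: str, namespace: str) -> str:
    """All Pods in a namespace, regardless of service account or cluster."""
    return f"principalSet://{_pool_path(project_number, project_id)}/namespace/{namespace}"


def cluster_principal_set(
    project_number: str, project_id: str, location: str, cluster_name: str
) -> str:
    """All Pods in a specific cluster."""
    cluster_url = (
        f"https://container.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/clusters/{cluster_name}"
    )
    return f"principalSet://{_pool_path(project_number, project_id)}/kubernetes.cluster/{cluster_url}"


def service_account_uid_principal(project_number: str, project_id: str, uid: str) -> str:
    """A single ServiceAccount selected by its UID."""
    return (
        f"principal://{_pool_path(project_number, project_id)}"
        f"/kubernetes.serviceaccount.uid/{uid}"
    )


# Assumed project number, project ID, and namespace.
print(namespace_principal_set("1234567890", "example-project", "backend"))
```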
Credential flow
When a workload sends a request to access a Google Cloud API, for example when using a Google Cloud client library, the following authentication steps occur:
- Application Default Credentials (ADC) requests a Google Cloud access token from the Compute Engine metadata server that runs on the VM.
- The GKE metadata server intercepts the token request and asks the Kubernetes API server for a Kubernetes ServiceAccount token that identifies the requesting workload. This credential is a JSON web token (JWT) that's signed by the API server.
- The GKE metadata server uses Security Token Service to exchange the JWT for a short-lived federated access token that references the identity of the Kubernetes workload.
The federated access token that's returned by Security Token Service might have limitations when trying to access some Google Cloud services, as described in Supported products and limitations. If your selected Google Cloud service has limitations, you can optionally configure service account impersonation. This method results in an access token for an IAM service account that your workload can use to access the target service. For details, see Link Kubernetes ServiceAccounts to IAM.
The workload can then access any Google Cloud APIs that the IAM principal identifier of the workload can access.
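To make the second step more concrete, the sketch below shows how a JWT carries the workload's identity in its claims. It is not what the GKE metadata server runs: it decodes the payload without verifying the signature, and the token is built locally for illustration. The only fact relied on is that a Kubernetes ServiceAccount token names its subject as system:serviceaccount:NAMESPACE:NAME; the namespace and ServiceAccount names are assumed.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature.

    For inspection only. Never trust unverified claims for authorization.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# A locally built, unsigned stand-in for a ServiceAccount token.
example_token = ".".join(
    [
        _b64url({"alg": "RS256", "typ": "JWT"}),
        _b64url({"sub": "system:serviceaccount:backend:back-ksa"}),
        "signature-not-checked",
    ]
)
print(decode_jwt_claims(example_token)["sub"])
```

In the real flow, Security Token Service validates the token's signature before it issues the federated access token.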
Identity sameness
If the metadata in your principal identifier is the same for workloads in multiple clusters that share a workload identity pool because they belong to the same Google Cloud project, IAM identifies those workloads as the same. For example, if you have the same namespace in two clusters and you grant access to that namespace in IAM, the workloads in that namespace in both clusters get that access. You can limit this access to specific clusters by using conditional IAM policies.
For example, consider the following diagram. Clusters A and B belong to the same workload identity pool. Google Cloud identifies applications that use the back-ksa ServiceAccount in the backend namespace of both Cluster A and Cluster B as the same identity. IAM doesn't distinguish between the clusters making the calls.
This identity sameness also means that you must be able to trust every cluster in a specific workload identity pool. For example, if a new cluster, Cluster C, in the previous example were owned by an untrusted team, that team could create a backend namespace and access Google Cloud APIs using the back-ksa ServiceAccount, just like Cluster A and Cluster B.
To avoid untrusted access, place your clusters in separate projects to ensure that they get different workload identity pools, or ensure that the namespace names are distinct from each other to avoid a common principal identifier.
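Identity sameness follows directly from how the identifier is built: a namespace-level principal set contains the project and the namespace but no cluster name. The sketch below demonstrates this with hypothetical clusters; the cluster names, project numbers, and project IDs are assumptions.

```python
def namespace_member(project_number: str, project_id: str, namespace: str) -> str:
    """Principal set for all Pods in a namespace. Note: no cluster component."""
    return (
        f"principalSet://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{project_id}.svc.id.goog"
        f"/namespace/{namespace}"
    )


# Hypothetical clusters: two share a project, one lives in a separate project.
clusters = [
    {"name": "cluster-a", "project_number": "1234567890", "project_id": "example-project"},
    {"name": "cluster-b", "project_number": "1234567890", "project_id": "example-project"},
    {"name": "isolated-cluster", "project_number": "9876543210", "project_id": "other-project"},
]

members = {
    c["name"]: namespace_member(c["project_number"], c["project_id"], "backend")
    for c in clusters
}

# Same project means same pool, so IAM sees one identity for both clusters.
same_identity = members["cluster-a"] == members["cluster-b"]
# A separate project gets a different pool and therefore a different identity.
isolated = members["isolated-cluster"] != members["cluster-a"]
print(same_identity, isolated)
```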
Understanding the GKE metadata server
Every node in a GKE cluster with Workload Identity Federation for GKE enabled stores its metadata on the GKE metadata server. The GKE metadata server exposes the subset of Compute Engine metadata server endpoints that Kubernetes workloads require.
The GKE metadata server runs as a DaemonSet, with one Pod on every Linux node, or as a native Windows service on every Windows node in the cluster. The metadata server intercepts HTTP requests to http://metadata.google.internal (169.254.169.254:80). For example, the GET /computeMetadata/v1/instance/service-accounts/default/token request retrieves a token for the IAM service account that the Pod is configured to impersonate. Traffic to the GKE metadata server never leaves the VM instance that hosts the Pod.
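A workload rarely calls this endpoint by hand, because client libraries do it through ADC, but the request itself is simple. The following sketch builds the token request with the Python standard library. The function names are assumptions; the Metadata-Flavor: Google header is the header that the Compute Engine metadata server requires on requests.

```python
import json
import urllib.request

METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1"


def build_token_request() -> urllib.request.Request:
    """Build (but don't send) the GET request for the default token."""
    return urllib.request.Request(
        f"{METADATA_ROOT}/instance/service-accounts/default/token",
        headers={"Metadata-Flavor": "Google"},
    )


def fetch_token(timeout: float = 5.0) -> dict:
    """Send the request and parse the JSON response.

    Only works from inside a Pod or VM that can reach the metadata server.
    """
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        return json.load(resp)


request = build_token_request()
print(request.get_method(), request.full_url)
```

Calling fetch_token() from a Pod returns a JSON document that contains the access token; outside Google Cloud the hostname doesn't resolve, so the example only prints the request that would be sent.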
The following tables describe the subset of Compute Engine metadata server endpoints available with the GKE metadata server. For a full list of endpoints available in the Compute Engine metadata server, see Default VM metadata values.
Instance metadata
Instance metadata is stored under the following directory.
http://metadata.google.internal/computeMetadata/v1/instance/
Entry | Description |
---|---|
hostname | The hostname of your node. |
id | The unique ID of your node. |
service-accounts/ | A directory of service accounts associated with the node. For each service account, the following information is available: |
zone | The Compute Engine zone of your GKE node. |
Instance attributes
Instance attributes are stored under the following directory.
http://metadata.google.internal/computeMetadata/v1/instance/attributes/
Entry | Description |
---|---|
cluster-location | The Compute Engine zone or region of your cluster. |
cluster-name | The name of your GKE cluster. |
cluster-uid | The UID of your GKE cluster. |
Project metadata
Cluster project metadata is stored under the following directory.
http://metadata.google.internal/computeMetadata/v1/project/
Entry | Description |
---|---|
project-id | Your Google Cloud project ID. |
numeric-project-id | Your Google Cloud project number. |
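The three directories and their entries can be summarized as a small lookup that builds the full URL for an entry and rejects anything outside the subset listed in the tables above. The helper and its name are assumptions made for illustration; the entries and directories are taken from this page.

```python
METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1"

# Entries listed on this page for the GKE metadata server, keyed by directory.
SUPPORTED_ENTRIES = {
    "instance": {"hostname", "id", "service-accounts/", "zone"},
    "instance/attributes": {"cluster-location", "cluster-name", "cluster-uid"},
    "project": {"project-id", "numeric-project-id"},
}


def metadata_url(directory: str, entry: str) -> str:
    """Return the URL for a listed entry, or raise KeyError if it isn't listed."""
    if entry not in SUPPORTED_ENTRIES.get(directory, set()):
        raise KeyError(f"{directory}/{entry} is not in the listed subset")
    return f"{METADATA_ROOT}/{directory}/{entry}"


print(metadata_url("instance/attributes", "cluster-name"))
```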
Restrictions of Workload Identity Federation for GKE
You can't change the name of the workload identity pool that GKE creates for your Google Cloud project.
When GKE enables the GKE metadata server on a node pool, Pods can no longer access the Compute Engine metadata server. Instead, the GKE metadata server intercepts requests made from these Pods to metadata endpoints, with the exception of Pods running on the host network.
The GKE metadata server takes a few seconds to start accepting requests on a newly created Pod. Therefore, attempts to authenticate using Workload Identity Federation for GKE within the first few seconds of a Pod's life might fail. Retrying the call will resolve the problem. See Troubleshooting for more details.
GKE built-in logging and monitoring agents continue to use the node's service account.
Workload Identity Federation for GKE requires manual setup for Knative serving to continue releasing request metrics.
Workload Identity Federation for GKE sets a limit of 200 connections to the GKE metadata server for each node to avoid memory issues. You may experience timeouts if your nodes exceed this limit.
Workload Identity Federation for GKE for Windows Server nodes is available in GKE versions 1.18.16-gke.1200, 1.19.8-gke.1300, 1.20.4-gke.1500 and later.
The GKE metadata server uses memory resources proportional to the total number of Kubernetes service accounts in your cluster. If your cluster has more than 3000 Kubernetes service accounts, the kubelet might terminate the metadata server Pods. For mitigations, refer to Troubleshooting.
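For the startup restriction above, where authentication can fail during the first few seconds of a Pod's life, a retry with exponential backoff is one common way to handle the transient failure. The following is a minimal sketch; the function name, the number of attempts, and the delays are assumptions that you would tune for your workload.

```python
import time


def call_with_retry(fn, attempts: int = 5, initial_delay: float = 0.5, sleep=time.sleep):
    """Call `fn` until it succeeds, doubling the delay after each failure.

    Re-raises the last exception when all attempts are used up.
    `sleep` is injectable so that the helper can be tested without waiting.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2


# Simulated call that fails twice before the metadata server is ready.
state = {"calls": 0}


def flaky_authenticate() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("metadata server not ready")
    return "authenticated"


recorded_delays = []
result = call_with_retry(flaky_authenticate, sleep=recorded_delays.append)
print(result, recorded_delays)
```

In a real workload, fn would wrap the first authenticated API call, and you would catch only the specific exception types that your client library raises for transient failures.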
Alternatives to Workload Identity Federation for GKE
You can use one of the following alternatives to Workload Identity Federation for GKE to access Google Cloud APIs from GKE. We recommend that you use Workload Identity Federation for GKE because these alternatives require you to make certain security compromises.
Use the Compute Engine default service account of your nodes. You can run node pools as any IAM service account in your project. If you don't specify a service account during node pool creation, GKE uses the Compute Engine default service account for the project. The Compute Engine service account is shared by all workloads deployed on that node. This can result in over-provisioning of permissions, which violates the principle of least privilege and is inappropriate for multi-tenant clusters.
Export service account keys and store them as Kubernetes Secrets that you mount to your Pods as volumes.
What's next
- Learn how to enable and configure Workload Identity Federation for GKE.
- Learn about the Compute Engine metadata server.