About Workload Identity Federation for GKE


This page explains Workload Identity Federation for GKE, including how it works, how enabling it affects your GKE clusters, and how to grant roles to Kubernetes entities in Identity and Access Management policies. In most cases, Workload Identity Federation for GKE is the recommended way for workloads running on GKE to access Google Cloud services securely and manageably.

This page is for Security specialists and Operators who manage workloads on GKE that require access to other Google Cloud services. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE Enterprise user roles and tasks.

Before reading this page, ensure that you're familiar with the following resources:

Terminology

This page distinguishes between Kubernetes service accounts and Identity and Access Management (IAM) service accounts.

Kubernetes service accounts
Kubernetes resources that provide an identity for processes running in your GKE Pods.
IAM service accounts
Google Cloud resources that allow applications to make authorized calls to Google Cloud APIs.

What is Workload Identity Federation for GKE?

Applications running on GKE might need access to Google Cloud APIs such as Compute Engine API, BigQuery Storage API, or Machine Learning APIs.

Workload Identity Federation for GKE lets you use IAM policies to grant Kubernetes workloads in your GKE cluster access to specific Google Cloud APIs without needing manual configuration or less secure methods like service account key files. Using Workload Identity Federation for GKE lets you assign distinct, fine-grained identities and authorization for each application in your cluster.

Workload Identity Federation for GKE removes the need to use metadata concealment. The sensitive metadata protected by metadata concealment is also protected by Workload Identity Federation for GKE.

Workload Identity Federation for GKE is available through IAM Workload Identity Federation, which provides identities for workloads that run in environments inside and outside Google Cloud. You can use IAM Workload Identity Federation to securely authenticate to supported Google Cloud APIs from workloads running on, for example, AWS, Azure, and self-managed Kubernetes. In GKE, Google Cloud manages the workload identity pool and provider for you and doesn't require an external identity provider.

How Workload Identity Federation for GKE works

When you enable Workload Identity Federation for GKE on a cluster, GKE does the following:

  • Creates a fixed workload identity pool for the cluster's Google Cloud project with the following format:

    PROJECT_ID.svc.id.goog
    

    The workload identity pool provides a naming format that allows IAM to understand and trust Kubernetes credentials. GKE doesn't delete this workload identity pool even if you delete all of the clusters in your project.

  • Registers the GKE cluster as an identity provider in the workload identity pool.

  • Deploys the GKE metadata server, which intercepts credential requests from workloads, on every node.

Create IAM allow policies on Google Cloud resources

To provide access with Workload Identity Federation for GKE, you create an IAM allow policy that grants access on a specific Google Cloud resource to a principal that corresponds to your application's identity. For example, you could give read permissions on a Cloud Storage bucket to all Pods that use the database-reader Kubernetes ServiceAccount.

For a list of resources that support allow policies, see Resource types that accept allow policies.
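As a rough illustration of the Cloud Storage example above, the following Python sketch builds the kind of binding that an allow policy on a bucket could contain. The role roles/storage.objectViewer, the default namespace, and the project values are assumptions chosen for illustration. The sketch only constructs data and doesn't call any Google Cloud API.

```python
# Illustrative sketch only: builds an IAM allow-policy binding as plain data.
# The role, namespace, and project values are assumptions for this example.


def ksa_principal(project_number: str, project_id: str, namespace: str, ksa: str) -> str:
    """Return the principal identifier for a Kubernetes ServiceAccount selected by name."""
    return (
        f"principal://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{project_id}.svc.id.goog"
        f"/subject/ns/{namespace}/sa/{ksa}"
    )


def read_only_binding(member: str) -> dict:
    """Return a binding that grants read access to Cloud Storage objects."""
    return {"role": "roles/storage.objectViewer", "members": [member]}


# Hypothetical values: project 1234567890 / example-project, namespace "default".
binding = read_only_binding(
    ksa_principal("1234567890", "example-project", "default", "database-reader")
)
```

A binding like this would be merged into the bucket's existing allow policy by whichever tool you use to manage IAM.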

Use conditions in IAM policies

You can also limit the scope of the access by setting conditions in your allow policies. Conditions are an extensible method of specifying when an allow policy should apply. For example, you could use conditions to grant temporary access to a workload on a specific Google Cloud resource, eliminating the need to manage that access manually.

Conditions might also be useful if you set your allow policies at the project, folder, or organization level instead of on specific resources like Secret Manager secrets or Cloud Storage buckets.

To add a condition to your allow policy, use the following resources:

The following example expressions are for common scenarios in which you might use conditions. For a list of available attributes in expressions, see Attribute reference for IAM Conditions.

Example condition expressions
Allow access before the specified time
request.time < timestamp('TIMESTAMP')

Replace TIMESTAMP with a timestamp in UTC, like 2024-08-30T00:00:00.000Z.

Allow access if the resource in the request has the specified tag
resource.matchTag('TAG_KEY', 'TAG_VALUE')

Replace the following:

  • TAG_KEY: the tag key to match, like env
  • TAG_VALUE: the value of the tag, like dev
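The two expressions above can be generated programmatically. The following Python sketch, which is only one possible approach, formats a UTC timestamp into the time-based expression and assembles a condition as plain data. The helper names and the condition titles are hypothetical. The title and expression keys follow the shape of an IAM condition.

```python
from datetime import datetime, timezone


def access_before(deadline: datetime) -> str:
    """Return an expression that allows access before the given time.

    Expects a timezone-aware datetime; the value is converted to UTC.
    """
    utc = deadline.astimezone(timezone.utc)
    stamp = utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"
    return f"request.time < timestamp('{stamp}')"


def has_tag(tag_key: str, tag_value: str) -> str:
    """Return an expression that matches resources with the given tag."""
    return f"resource.matchTag('{tag_key}', '{tag_value}')"


def condition(title: str, expression: str) -> dict:
    """Bundle an expression with a title, as a condition in a binding."""
    return {"title": title, "expression": expression}


# Hypothetical titles; the timestamp and tag mirror the examples in the text.
temporary = condition(
    "temporary-access", access_before(datetime(2024, 8, 30, tzinfo=timezone.utc))
)
dev_only = condition("dev-only", has_tag("env", "dev"))
```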

Reference Kubernetes resources in IAM policies

In your IAM policy, you refer to a Kubernetes resource by using an IAM principal identifier to select the resource. This identifier has the following syntax:

PREFIX://iam.googleapis.com/projects/1234567890/locations/global/workloadIdentityPools/example-project.svc.id.goog/SELECTOR

In this example, consider the following fields:

  • PREFIX: must be principal or principalSet depending on the resource that you select. principal is for a specific resource, like a single ServiceAccount. principalSet is for multiple resources that belong to the specified resource, like all Pods in a specific cluster.
  • SELECTOR: a string that selects a principal type. For example, kubernetes.serviceaccount.uid/SERVICEACCOUNT_UID selects a ServiceAccount by its UID.

The following table shows the supported principal types in GKE:

Principal identifier type Syntax
All Pods that use a specific Kubernetes ServiceAccount

Select the ServiceAccount by name:
principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/NAMESPACE/sa/SERVICEACCOUNT

Replace the following:

  • PROJECT_NUMBER: your numerical project number. To get the project number, see Identifying projects.
  • PROJECT_ID: your Google Cloud project ID.
  • NAMESPACE: the Kubernetes namespace.
  • SERVICEACCOUNT: the Kubernetes ServiceAccount name.

Select the ServiceAccount by UID:
principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/kubernetes.serviceaccount.uid/SERVICEACCOUNT_UID

Replace the following:

  • PROJECT_NUMBER: your numerical project number. To get the project number, see Identifying projects.
  • PROJECT_ID: your Google Cloud project ID.
  • SERVICEACCOUNT_UID: the UID of the ServiceAccount object in the API server.

All Pods in a namespace, regardless of service account or cluster

principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/namespace/NAMESPACE

Replace the following:

  • PROJECT_NUMBER: your numerical project number. To get the project number, see Identifying projects.
  • PROJECT_ID: your Google Cloud project ID.
  • NAMESPACE: the Kubernetes namespace.

All Pods in a specific cluster

principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/kubernetes.cluster/https://container.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/clusters/CLUSTER_NAME

Replace the following:

  • PROJECT_NUMBER: your numerical project number. To get the project number, see Identifying projects.
  • PROJECT_ID: your Google Cloud project ID.
  • CLUSTER_NAME: the name of your GKE cluster.
  • LOCATION: the location of your cluster.
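The four identifier forms in the preceding table share a common pool path and differ only in prefix and selector. The following Python sketch shows one way to assemble them from their placeholders. The function names are hypothetical, and the sample values in the usage lines are assumptions for illustration.

```python
def _pool_path(project_number: str, project_id: str) -> str:
    """Return the part of the identifier that names the workload identity pool."""
    return (
        f"iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{project_id}.svc.id.goog"
    )


def service_account_by_name(project_number, project_id, namespace, service_account):
    """A single ServiceAccount, selected by namespace and name."""
    pool = _pool_path(project_number, project_id)
    return f"principal://{pool}/subject/ns/{namespace}/sa/{service_account}"


def service_account_by_uid(project_number, project_id, service_account_uid):
    """A single ServiceAccount, selected by its UID in the API server."""
    pool = _pool_path(project_number, project_id)
    return f"principal://{pool}/kubernetes.serviceaccount.uid/{service_account_uid}"


def namespace_pods(project_number, project_id, namespace):
    """All Pods in a namespace, regardless of service account or cluster."""
    pool = _pool_path(project_number, project_id)
    return f"principalSet://{pool}/namespace/{namespace}"


def cluster_pods(project_number, project_id, location, cluster_name):
    """All Pods in a specific cluster."""
    pool = _pool_path(project_number, project_id)
    cluster_url = (
        f"https://container.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/clusters/{cluster_name}"
    )
    return f"principalSet://{pool}/kubernetes.cluster/{cluster_url}"


# Hypothetical sample values.
by_name = service_account_by_name("1234567890", "example-project", "backend", "back-ksa")
by_cluster = cluster_pods("1234567890", "example-project", "us-central1", "cluster-a")
```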

Credential flow

When a workload sends a request to access a Google Cloud API, for example when using a Google Cloud client library, the following authentication steps occur:

Figure 1: How a workload gets a federated access token with Workload Identity Federation for GKE.
  1. Application Default Credentials (ADC) requests a Google Cloud access token from the Compute Engine metadata server that runs on the VM.
  2. The GKE metadata server intercepts the token request and asks the Kubernetes API server for a Kubernetes ServiceAccount token that identifies the requesting workload. This credential is a JSON Web Token (JWT) that's signed by the API server.
  3. The GKE metadata server uses Security Token Service to exchange the JWT for a short-lived federated access token that references the identity of the Kubernetes workload.

The federated access token that's returned by Security Token Service might have limitations when trying to access some Google Cloud services, as described in Supported products and limitations. If your selected Google Cloud service has limitations, you can optionally configure service account impersonation. This method results in an access token for an IAM service account that your workload can use to access the target service. For details, see Link Kubernetes ServiceAccounts to IAM.

The workload can then access any Google Cloud APIs that the IAM principal identifier of the workload can access.

Identity sameness

If the metadata in your principal identifier is the same for workloads in multiple clusters that share a workload identity pool because they belong to the same Google Cloud project, IAM identifies those workloads as the same. For example, if you have the same namespace in two clusters and you grant access to that namespace in IAM, the workloads in that namespace in both clusters get that access. You can limit this access to specific clusters by using conditional IAM policies.

For example, consider the following diagram. Clusters A and B belong to the same workload identity pool. Google Cloud identifies applications that use the back-ksa ServiceAccount in the backend namespace of both Cluster A and Cluster B as the same identity. IAM doesn't distinguish between the clusters making the calls.

Figure 2: Identity sameness accessing Google Cloud APIs with Workload Identity Federation for GKE.

This identity sameness also means that you must be able to trust every cluster in a specific workload identity pool. For example, if a new cluster, Cluster C, in the previous example were owned by an untrusted team, that team could create a backend namespace and access Google Cloud APIs using the back-ksa ServiceAccount, just like Cluster A and Cluster B.

To avoid untrusted access, place your clusters in separate projects to ensure that they get different workload identity pools, or ensure that the namespace names are distinct from each other to avoid a common principal identifier.
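To see why the clusters in the example are indistinguishable, note that the name-based principal identifier has no cluster component. The following Python sketch models this with a hypothetical Workload type and sample project values. It illustrates the naming scheme and is not how IAM evaluates identities internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    cluster: str
    namespace: str
    service_account: str


def principal_for(workload: Workload, project_number: str, project_id: str) -> str:
    """Return the name-based principal identifier for a workload.

    The cluster name is deliberately unused because it isn't part of
    the identifier.
    """
    return (
        f"principal://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{project_id}.svc.id.goog"
        f"/subject/ns/{workload.namespace}/sa/{workload.service_account}"
    )


# Hypothetical project values; clusters A, B, and C share one project and pool.
in_a = principal_for(Workload("cluster-a", "backend", "back-ksa"), "1234567890", "example-project")
in_b = principal_for(Workload("cluster-b", "backend", "back-ksa"), "1234567890", "example-project")
in_c = principal_for(Workload("cluster-c", "backend", "back-ksa"), "1234567890", "example-project")
same_identity = in_a == in_b == in_c
```

Because all three values are equal, any role granted to that identifier applies to the workload in every cluster of the project, including a cluster that you don't trust.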

Understanding the GKE metadata server

Every node in a GKE cluster with Workload Identity Federation for GKE enabled stores its metadata on the GKE metadata server. The GKE metadata server exposes the subset of Compute Engine metadata server endpoints that Kubernetes workloads require.

The GKE metadata server runs as a DaemonSet, with one Pod on every Linux node or a native Windows service on every Windows node in the cluster. The metadata server intercepts HTTP requests to http://metadata.google.internal (169.254.169.254:80). For example, the GET /computeMetadata/v1/instance/service-accounts/default/token request retrieves a token for the IAM service account that the Pod is configured to impersonate. Traffic to the GKE metadata server never leaves the VM instance that hosts the Pod.
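Client libraries normally make the token request for you. As a minimal sketch of what such a request looks like, the following Python code builds the GET request by using only the standard library. It assumes that the Metadata-Flavor: Google header used by the Compute Engine metadata server is required and that the JSON response contains an access_token field. The function names are hypothetical.

```python
import json
import urllib.request

TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1"
    "/instance/service-accounts/default/token"
)


def build_token_request() -> urllib.request.Request:
    """Build the GET request for an access token."""
    # Assumption: the metadata server rejects requests without this header.
    return urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})


def fetch_access_token(timeout: float = 5.0) -> str:
    """Send the request and return the token. Only works from inside a Pod."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as response:
        # Assumption: the response body is JSON with an "access_token" field.
        return json.load(response)["access_token"]
```

Calling fetch_access_token() outside of a GKE Pod fails because metadata.google.internal resolves only on Google Cloud VMs.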

The following tables describe the subset of Compute Engine metadata server endpoints available with the GKE metadata server. For a full list of endpoints available in the Compute Engine metadata server, see Default VM metadata values.

Instance metadata

Instance metadata is stored under the following directory.

http://metadata.google.internal/computeMetadata/v1/instance/

Entry Description
hostname

The hostname of your node.

id

The unique ID of your node.

service-accounts/

A directory of service accounts associated with the node. For each service account, the following information is available:

  • aliases
  • email: the service account email address.
  • identity: a JSON Web Token (JWT) unique to the node. You must include the audience parameter in your request. For example, ?audience=http://www.example.com.
  • scopes: the access scopes assigned to the service account.
  • token: the OAuth 2.0 access token to authenticate your workloads.

zone

The Compute Engine zone of your GKE node.

Instance attributes

Instance attributes are stored under the following directory.

http://metadata.google.internal/computeMetadata/v1/instance/attributes/

Entry Description
cluster-location

The Compute Engine zone or region of your cluster.

cluster-name

The name of your GKE cluster.

cluster-uid

The UID of your GKE cluster.

Project metadata

Cluster project metadata is stored under the following directory.

http://metadata.google.internal/computeMetadata/v1/project/

Entry Description
project-id

Your Google Cloud project ID.

numeric-project-id

Your Google Cloud project number.
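The entries in the preceding tables map directly to URLs under the three directories. The following Python sketch collects them in one lookup table and shows how the audience parameter for the identity entry could be encoded. The helper names are hypothetical, and the default service account segment in the identity URL is an assumption that mirrors the token path shown earlier on this page.

```python
from urllib.parse import urlencode

METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1"

# Entry name -> path relative to METADATA_ROOT, taken from the tables above.
ENTRIES = {
    "hostname": "instance/hostname",
    "id": "instance/id",
    "service-accounts": "instance/service-accounts/",
    "zone": "instance/zone",
    "cluster-location": "instance/attributes/cluster-location",
    "cluster-name": "instance/attributes/cluster-name",
    "cluster-uid": "instance/attributes/cluster-uid",
    "project-id": "project/project-id",
    "numeric-project-id": "project/numeric-project-id",
}


def metadata_url(entry: str) -> str:
    """Return the full URL for a documented metadata entry."""
    return f"{METADATA_ROOT}/{ENTRIES[entry]}"


def identity_url(audience: str, account: str = "default") -> str:
    """Return the identity URL with the required audience parameter."""
    query = urlencode({"audience": audience})
    return f"{METADATA_ROOT}/instance/service-accounts/{account}/identity?{query}"
```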

Restrictions of Workload Identity Federation for GKE

  • You can't change the name of the workload identity pool that GKE creates for your Google Cloud project.

  • When GKE enables the GKE metadata server on a node pool, Pods can no longer access the Compute Engine metadata server. Instead, the GKE metadata server intercepts requests made from these Pods to metadata endpoints, with the exception of Pods running on the host network.

  • The GKE metadata server takes a few seconds to start accepting requests on a newly created Pod. Therefore, attempts to authenticate using Workload Identity Federation for GKE within the first few seconds of a Pod's life might fail. Retrying the call will resolve the problem. See Troubleshooting for more details.

  • GKE built-in logging and monitoring agents continue to use the node's service account.

  • Workload Identity Federation for GKE requires manual setup for Knative serving to continue releasing request metrics.

  • Workload Identity Federation for GKE sets a limit of 200 connections to the GKE metadata server for each node to avoid memory issues. You might experience timeouts if your nodes exceed this limit.

  • Workload Identity Federation for GKE for Windows Server nodes is available in GKE versions 1.18.16-gke.1200, 1.19.8-gke.1300, 1.20.4-gke.1500 and later.

  • The GKE metadata server uses memory resources proportional to the total number of Kubernetes service accounts in your cluster. If your cluster has more than 3000 Kubernetes service accounts, the kubelet might terminate the metadata server Pods. For mitigations, refer to Troubleshooting.
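One of the restrictions above notes that authentication attempts in the first few seconds of a Pod's life might fail and that retrying resolves the problem. The following Python sketch shows one possible retry helper with exponential backoff. The function name, the number of attempts, and the delays are assumptions for illustration, not recommended values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fetch: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch until it succeeds or the attempts run out.

    Retries only on OSError, which covers connection errors and timeouts
    raised by the standard library's networking code. The delay doubles
    after each failed attempt.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(delay)
            delay *= 2
    raise AssertionError("unreachable")
```

You could pass any function that requests a token from the metadata server as fetch. The sleep parameter exists so that tests can replace the real delay.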

Alternatives to Workload Identity Federation for GKE

You can use one of the following alternatives to Workload Identity Federation for GKE to access Google Cloud APIs from GKE. We recommend that you use Workload Identity Federation for GKE because these alternatives require you to make certain security compromises.

What's next