This page shows you how to allow workloads running in Google Kubernetes Engine (GKE) to access private image registries by using the public key of the certificate authority (CA) that issued the certificate for the registry.
This page is for Security specialists who manage access for their organization's workloads. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE Enterprise user roles and tasks.
Before reading this page, ensure that you're familiar with Secret Manager.
How it works
You store the public key of the CA used to issue certificates for your private registries in Secret Manager and configure which registry fully-qualified domain names (FQDNs) use that public key for certificate validation. GKE automatically fetches the key and updates the container runtime registry configuration during node bootstrapping. When you deploy a workload that uses a container image from your private registry, the following steps occur:
- The kubelet on the node tries to pull the image from the private registry.
- The registry presents a server-side TLS certificate.
- The container runtime cryptographically validates the registry certificate and checks that the FQDN matches what you specified.
- If the validation passes, GKE pulls the image and schedules your workload.
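Before you configure a cluster, you might want to confirm from a workstation that the registry's certificate chains to your CA and matches the FQDN, which approximates the two checks described in the preceding steps. The following is a minimal sketch that wraps openssl s_client. The function name, port 443, and the file path are assumptions for illustration, and this isn't how GKE performs the check internally.

```shell
# Hypothetical helper: check_registry_cert FQDN CA_PEM_FILE
# Assumes the registry serves TLS on port 443 and that openssl is installed.
check_registry_cert() {
  fqdn="$1"
  ca_file="$2"
  # -CAfile: validate the chain against the given CA certificate or bundle.
  # -verify_hostname: require the certificate to match the FQDN.
  # -verify_return_error: fail the handshake if validation fails.
  openssl s_client \
      -connect "${fqdn}:443" \
      -servername "$fqdn" \
      -CAfile "$ca_file" \
      -verify_hostname "$fqdn" \
      -verify_return_error < /dev/null
}
```

For example, check_registry_cert registry.example.com ./ca.pem returns a non-zero exit code if the handshake or the validation fails.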
Benefits
This method of accessing private registries provides benefits like the following:
- Improve reliability of container runtime configuration: Using methods like DaemonSets to set the containerd configuration adds a risk of a race condition occurring, where other DaemonSets might run before your configuration DaemonSet.
- Reduce vulnerability to privilege escalation attacks: Removes the need to run privileged DaemonSets that modify your container runtime configuration.
- Reduce management overhead: Secret Manager lets you store CA public keys in a central location; manage access to the keys using IAM; and implement version control and annotations. For more information, see the Secret Manager product overview.
- Improve auditability: Cloud Logging already collects logs, including when certificates are added to a cluster and when GKE nodes pull images.
Pricing
In this document, you use the following billable components of Google Cloud:
- GKE
- Secret Manager
- Logging: GKE generates Admin Activity audit logs and, if enabled, Data Access audit logs for this feature. For information about the different types of audit logs, see GKE audit logging.
To generate a cost estimate based on your projected usage, use the pricing calculator.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
- Enable the Secret Manager API.
You must already have a private registry and private CA certificates to access the registry. This guide doesn't cover setting up a private registry or creating certificates.
Requirements
To use private CA public keys to access private registries, you must meet the following requirements:
- Your clusters must use GKE version 1.27.3-gke.1700 or later.
- You must use a Container-Optimized OS with containerd node image, which is the default for all GKE clusters. Ubuntu node images with containerd aren't supported. Windows Server node images aren't supported.
- Your node pools must have the cloud-platform access scope for your nodes to download the certificates. For more information, see Default access scopes in GKE. This document includes instructions to set the access scope when you create a cluster or node pool.
Limitations
Consider the following limitations:
- You can't use private CA certificates in Ubuntu node images.
- You can't use private CA certificates in Windows Server nodes.
- Each cluster supports up to five private CA certificates for private registries.
- Each certificate can have up to 25 fully-qualified domain names (FQDNs).
- Each domain can only be used in a single certificate file. However, certificate bundles are supported.
- Certificates must be PEM-encoded.
- The server doesn't automatically rotate certificates. For more information, see Rotate your private CA certificates in this document.
- FQDNs have the following limitations:
- The maximum FQDN length is 255 characters, including special characters.
- FQDNs can only use letters, numbers, and dashes (-).
- Punycode isn't supported.
- Wildcard characters aren't supported.
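You can check candidate FQDNs against these limits before you add them to a configuration file. The following is a minimal sketch, not an official validator: the function name is hypothetical, and it assumes that dots separate labels and that IPv4 addresses should pass the same character check.

```shell
# Hypothetical helper: returns 0 if the value satisfies the FQDN limits above.
is_valid_registry_fqdn() {
  fqdn="$1"
  # The maximum length is 255 characters.
  [ "${#fqdn}" -le 255 ] || return 1
  # Punycode labels start with "xn--" and aren't supported.
  case "$(printf '%s' "$fqdn" | tr 'A-Z' 'a-z')" in
    xn--*|*.xn--*) return 1 ;;
  esac
  # Allow only letters, numbers, and dashes within dot-separated labels.
  # This also rejects wildcard characters such as "*".
  printf '%s\n' "$fqdn" | grep -Eq '^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*$'
}
```

For example, is_valid_registry_fqdn "registry.example.com" succeeds, and is_valid_registry_fqdn "*.example.com" fails.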
Migrate from configuration DaemonSets
In GKE Standard clusters, you can deploy privileged DaemonSets to modify your container runtime configuration. This method directly modifies the containerd configuration on each node.
If you use privileged DaemonSets to configure access to private registries, consider the following before you use this document:
- Storing private CA public keys in Secret Manager only configures access to private registries. Other registry-related configuration isn't supported.
- Enabling this feature causes your cluster to use containerd's CRI hostpath configuration model, which is incompatible with the previous configuration model. If you have any DaemonSets that modify the containerd host configuration, such as for insecure private registries, mirrors, or proxies, update the DaemonSets to use the CRI hostpath model. For the available fields in the CRI hostpath model, see Registry configuration in the containerd GitHub repository.
- When you enable this feature, GKE applies the CRI hostpath configuration model to new nodes in the cluster. Existing nodes continue to use the previous configuration model until they're recreated during events like upgrades.
Update DaemonSets to support both configuration models
To reduce the risk of your configuration DaemonSets not working on nodes that support a specific configuration model, ensure that your DaemonSets conditionally use a specific configuration model depending on the containerd configuration files on the node. For an example DaemonSet that implements this conditional logic, see the insecure-registry-config.yaml manifest in the GoogleCloudPlatform/k8s-node-tools GitHub repository.
Store your CA public keys in Secret Manager
Store the public keys of your private CAs that issue your private registry certificates as secrets in Secret Manager. For instructions, in the Secret Manager documentation, see Create a secret.
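As a rough illustration, you could create the secret with the gcloud CLI. The following sketch makes several assumptions: the function name is hypothetical, the file is assumed to contain a PEM-encoded CA certificate, and the automatic replication policy is only an example, so choose the policy that fits your organization.

```shell
# Hypothetical helper: create_ca_secret SECRET_NAME PEM_FILE
create_ca_secret() {
  secret_name="$1"
  pem_file="$2"
  # Refuse files that don't look PEM-encoded.
  grep -q -e '-----BEGIN CERTIFICATE-----' "$pem_file" || {
    echo "error: $pem_file is not a PEM-encoded certificate" >&2
    return 1
  }
  # The replication policy shown here is an assumption.
  gcloud secrets create "$secret_name" \
      --replication-policy=automatic \
      --data-file="$pem_file"
}
```

For example, create_ca_secret example-ca ./ca.pem creates the secret and its first version from the file.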
Configure access to Secret Manager from GKE
To ensure that the cluster's IAM service account has the necessary permissions to pull secrets from Secret Manager, ask your administrator to grant the cluster's IAM service account the following IAM roles on the secret:
- Access secret content: Secret Manager Secret Accessor (roles/secretmanager.secretAccessor)
- Access secret metadata: Secret Manager Viewer (roles/secretmanager.viewer)
For more information about granting roles, see Manage access to projects, folders, and organizations.
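An administrator could grant both roles on a single secret, rather than on the whole project, with the gcloud CLI. The following is a minimal sketch: the function name is hypothetical, and the service account email that you pass is whichever account your cluster or node pool uses.

```shell
# Hypothetical helper: grant_secret_access SECRET_NAME SERVICE_ACCOUNT_EMAIL
grant_secret_access() {
  secret_name="$1"
  service_account="$2"
  # Bind each role on the secret itself, not on the project.
  for role in roles/secretmanager.secretAccessor roles/secretmanager.viewer; do
    gcloud secrets add-iam-policy-binding "$secret_name" \
        --member="serviceAccount:${service_account}" \
        --role="$role" || return 1
  done
}
```

For example, grant_secret_access example-ca gke-nodes@example-project.iam.gserviceaccount.com adds two bindings to the IAM policy of the example-ca secret.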
These predefined roles contain the permissions required to pull secrets from Secret Manager. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to pull secrets from Secret Manager:
- resourcemanager.projects.get
- resourcemanager.projects.list
- secretmanager.secrets.get
- secretmanager.secrets.list
- secretmanager.versions.get
- secretmanager.versions.list
- secretmanager.versions.access
Your administrator might also be able to give the cluster's IAM service account these permissions with custom roles or other predefined roles.
If you didn't associate a custom IAM service account with your cluster or node pool, which is our recommended approach, the cluster uses the Compute Engine default service account. If possible, we recommend that you configure your clusters and node pools with a minimally privileged IAM service account. For instructions, see Use least-privileged service accounts.
Create a runtime configuration file
To enable the ability to use private CA certificates for private registries in GKE, you create a YAML file to modify the containerd configuration.
Get your Google Cloud project number:
gcloud projects describe PROJECT_ID \
    --format="value(projectNumber)"
Replace PROJECT_ID with your project ID. The output is your numerical project number.
Save the following configuration as containerd-configuration.yaml:
privateRegistryAccessConfig:
  certificateAuthorityDomainConfig:
  - gcpSecretManagerCertificateConfig:
      secretURI: "projects/PROJECT_NUMBER/secrets/SECRET_NAME/versions/SECRET_VERSION"
    fqdns:
    - "FQDN1"
    - "FQDN2"
  enabled: true
Replace the following:
- PROJECT_NUMBER: the project number that you got in the previous step.
- SECRET_NAME: the name of your secret in Secret Manager.
- SECRET_VERSION: the version number of your secret in Secret Manager. You can optionally use a version alias, but we recommend using the version number to avoid management complexity.
- FQDN1, FQDN2: the fully-qualified domain names for your private registries. You can also use an IPv4 address if a certificate was issued for that address, but we don't recommend it.
For a description of these fields, see privateRegistryAccessConfig in the Available containerd configuration options table.
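If you manage several clusters or registries, you might prefer to generate the file from values instead of editing it by hand. The following is a minimal sketch under that assumption. The function name and argument order are hypothetical, and the output mirrors the field layout described in this section.

```shell
# Hypothetical helper:
#   write_containerd_config PROJECT_NUMBER SECRET_NAME SECRET_VERSION OUT_FILE FQDN...
write_containerd_config() {
  [ "$#" -ge 5 ] || { echo "error: at least one FQDN is required" >&2; return 1; }
  project_number="$1"
  secret_name="$2"
  secret_version="$3"
  out_file="$4"
  shift 4
  {
    printf '%s\n' 'privateRegistryAccessConfig:'
    printf '%s\n' '  certificateAuthorityDomainConfig:'
    printf '%s\n' '  - gcpSecretManagerCertificateConfig:'
    printf '      secretURI: "projects/%s/secrets/%s/versions/%s"\n' \
        "$project_number" "$secret_name" "$secret_version"
    printf '%s\n' '    fqdns:'
    # printf repeats the format once per remaining argument (one per FQDN).
    printf '    - "%s"\n' "$@"
    printf '%s\n' '  enabled: true'
  } > "$out_file"
}
```

For example, write_containerd_config 123456789012 example-secret-name 1 containerd-configuration.yaml registry.example.com writes a file with a single FQDN entry.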
Apply containerd configuration to new clusters
This section shows you how to apply a containerd configuration file when you create a new GKE cluster.
Run the following command:
gcloud container clusters create-auto CLUSTER_NAME \
    --location=LOCATION \
    --scopes="cloud-platform" \
    --containerd-config-from-file="PATH_TO_CONFIG_FILE"
Replace the following:
- CLUSTER_NAME: the name of your new cluster.
- LOCATION: the Compute Engine location of your new cluster.
- PATH_TO_CONFIG_FILE: the path to the configuration file that you created, like ~/containerd-configuration.yaml.
You can enable private registry configuration on new Standard clusters by running the gcloud container clusters create command with the same options.
Apply containerd configuration to existing clusters
This section shows you how to apply a containerd configuration to existing clusters and nodes.
Check access scopes
Existing clusters must have the cloud-platform access scope to use this feature. This section shows you how to check your access scopes and update an existing cluster with a new or modified private registry configuration file.
For details about the default access scopes in new clusters, see Access scopes in GKE.
Check Autopilot access scopes
Run the following command:
gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION \
    --flatten=nodeConfig \
    --format='csv[delimiter="\\n",no-heading](oauthScopes)'
If your cluster doesn't have the https://www.googleapis.com/auth/cloud-platform access scope, create a new cluster with this access scope.
Check Standard access scopes
To check your Standard cluster access scopes, check a node pool:
gcloud container node-pools describe NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=LOCATION \
    --flatten=nodeConfig \
    --format='csv[delimiter="\\n",no-heading](oauthScopes)'
Replace NODE_POOL_NAME with the name of the node pool.
If your cluster doesn't have the https://www.googleapis.com/auth/cloud-platform access scope, create a new node pool with the cloud-platform access scope and delete your existing node pool.
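To script this check, you could pipe the output of either describe command into a small filter. The following is a minimal sketch with a hypothetical function name. It assumes that the command prints one scope URL per line.

```shell
# Hypothetical helper: reads scope URLs on standard input, one per line,
# and returns 0 only if the cloud-platform scope is present as a whole line.
has_cloud_platform_scope() {
  grep -Fxq 'https://www.googleapis.com/auth/cloud-platform'
}
```

The filter matches whole lines only, so a narrower scope whose URL merely begins with the same text doesn't count as a match.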
Update the cluster to use your configuration file
Run the following command:
gcloud container clusters update CLUSTER_NAME \
    --location=LOCATION \
    --containerd-config-from-file="PATH_TO_CONFIG_FILE"
Recreate nodes in Standard clusters
If your Standard cluster doesn't use automatic upgrades, you must manually recreate your node pools to apply the new configuration. To trigger a manual node recreation, upgrade your cluster to the same GKE version that it already uses.
gcloud container clusters upgrade CLUSTER_NAME \
    --location=LOCATION \
    --cluster-version=VERSION
Replace VERSION with the same GKE patch version that the cluster already uses.
Verify that your cluster can access the private registry
Run the following command:
gcloud container clusters describe CLUSTER_NAME \
--location=LOCATION \
--flatten="nodePoolDefaults.nodeConfigDefaults.containerdConfig"
The output is similar to the following:
containerdConfig:
  privateRegistryAccessConfig:
    certificateAuthorityDomainConfig:
    - fqdns:
      - 203.0.113.105
      gcpSecretManagerCertificateConfig:
        secretUri: projects/123456789012/secrets/example-secret-name/versions/1
    enabled: true
Deploy a workload that accesses a private image
In this section, you deploy a Pod that references an image from your private registry.
Save the following manifest as private-registry-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: private-registry-pod
spec:
  containers:
  - name: private-image
    image: IMAGE_NAME
Replace IMAGE_NAME with your private image name.
Deploy the Pod:
kubectl create -f private-registry-pod.yaml
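After you deploy the Pod, you might want to confirm that the image pull didn't fail. The following is a minimal sketch with a hypothetical function name. It assumes a single-container Pod and relies on the ErrImagePull and ImagePullBackOff waiting reasons that Kubernetes reports for failed pulls.

```shell
# Hypothetical helper: returns 0 if the Pod's first container is waiting
# because of an image pull failure, and 1 otherwise.
image_pull_failed() {
  pod="$1"
  reason=$(kubectl get pod "$pod" \
      -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}')
  case "$reason" in
    ErrImagePull|ImagePullBackOff) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, if image_pull_failed private-registry-pod succeeds, run kubectl describe pod private-registry-pod and review the events for certificate errors.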
Rotate your private CA certificates
Secret Manager and GKE can't automatically rotate private CA certificates in Secret Manager. To perform a certificate rotation, do the following steps. These steps require that you recreate existing nodes twice. We recommend that you perform certificate rotations during scheduled downtime to minimize the impact of workload disruptions.
- Create a PEM-encoded certificate bundle that contains both your old and new certificates.
- Add the bundle as a new secret version in Secret Manager.
- Update the secretURI field in your runtime configuration file with the new secret version number.
- Update your cluster to use the new secret version.
Get the timestamp of the update operation:
gcloud container operations list \
    --filter="operationType ~ UPDATE_CLUSTER AND targetLink ~ CLUSTER_NAME" \
    --sort-by="~startTime" \
    --limit=1 \
    --format='value(endTime)'
The output is similar to the following:
2024-01-31T09:27:30.864308964Z
Look for nodes that were created before the update operation ended:
kubectl get nodes -o json | jq ".items[] | select(.metadata.creationTimestamp | fromdateiso8601 < $(date -d CLUSTER_UPDATE_TIMESTAMP +%s)) | .metadata.name"
Replace CLUSTER_UPDATE_TIMESTAMP with the timestamp from the previous step.
The output is a list of node names that haven't been recreated with the updated configuration. When the output is blank, proceed to the next step.
Create a new version of your secret in Secret Manager with only the new certificate.
Repeat the previous steps to update your cluster, get the operation timestamp, and verify that your nodes use the new secret version.
Delete the old secret version from Secret Manager.
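The first two steps of the rotation, building the bundle and adding it as a new secret version, can be sketched as follows. The function name and file arguments are hypothetical, and both input files are assumed to be PEM-encoded certificates.

```shell
# Hypothetical helper: add_rotation_bundle SECRET_NAME OLD_PEM NEW_PEM
add_rotation_bundle() {
  secret_name="$1"
  old_pem="$2"
  new_pem="$3"
  bundle=$(mktemp) || return 1
  # awk 1 prints every line and adds a final newline if a file lacks one,
  # so the two PEM blocks never run together.
  awk 1 "$old_pem" "$new_pem" > "$bundle"
  gcloud secrets versions add "$secret_name" --data-file="$bundle"
  status=$?
  rm -f "$bundle"
  return "$status"
}
```

After the nodes use the bundled version, you can add a version that contains only the new certificate in the same way, and then remove the old version, for example with gcloud secrets versions destroy.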
View audit logs in Logging
This section shows you how to use Logging to check whether GKE installed your secret version on your nodes.
Go to the Logs Explorer page in the Google Cloud console:
Specify the following query:
resource.type="gce_instance"
textPayload:"Installed certificate \\\"projects/PROJECT_NUMBER/secrets/SECRET_NAME/versions/SECRET_VERSION\\\""
If your certificate installation succeeded, the output is similar to the following:
"Installed certificate "projects/PROJECT_NUMBER/secrets/SECRET_NAME/versions/SECRET_VERSION""
If your certificate installation failed, the output is similar to the following:
"Failed to install certificate "projects/PROJECT_NUMBER/secrets/SECRET_NAME/versions/SECRET_VERSION""
Best practices
We recommend that you use the following best practices when you use this feature:
- Don't use aliases for Secret Manager secret versions. Use the auto-generated version number for each secret version. An alias might point to a different certificate version over time, which might cause complexities in tracking the specific versions that your workloads use.
- Use maintenance windows and exclusions to control when GKE can recreate your nodes to apply updated containerd configurations.
- Provide access to secrets at the secret level, not at the project level.
Disable containerd configuration options
To remove your custom configuration, do the following:
- Update your configuration file to specify enabled: false in the configuration item that you want to disable and delete any other fields in the item, like in the following example:
privateRegistryAccessConfig:
  enabled: false
- Apply the updated configuration file to your cluster. For instructions, see Apply containerd configuration to existing clusters.
Troubleshoot
For troubleshooting steps, see Troubleshooting the container runtime.