Access private registries with private CA certificates


This page shows you how to allow workloads running in Google Kubernetes Engine (GKE) to access private image registries by using the public key of the certificate authority (CA) that issued the certificate for the registry.

How it works

You store the public key of the CA used to issue certificates for your private registries in Secret Manager and configure which registry fully-qualified domain names (FQDNs) use that public key for certificate validation. GKE automatically fetches the key and updates the container runtime registry configuration during node bootstrapping. When you deploy a workload that uses a container image from your private registry, the following steps occur:

  1. The kubelet on the node tries to pull the image from the private registry.
  2. The registry presents a server-side TLS certificate.
  3. The container runtime validates the registry certificate cryptographically and checks that the FQDN matches what you specified.
  4. If the validation passes, GKE pulls the image and starts your workload.
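
You can approximate the validation in step 3 by hand before you configure a cluster. The following is a minimal sketch, not the check that GKE runs internally. It assumes that OpenSSL 1.1.0 or later is installed and that the registry listens on port 443. The function name check_registry_cert and the file name ca.pem are hypothetical.

```shell
# Sketch: check that a registry's TLS certificate chains to a given CA and
# matches the expected FQDN. Assumes OpenSSL 1.1.0+ and port 443.
check_registry_cert() {
  fqdn=$1      # registry FQDN, for example registry.example.com
  ca_file=$2   # PEM file that contains the CA certificate

  # -verify_return_error aborts the handshake on a validation failure, and
  # -verify_hostname checks the certificate against the FQDN.
  if openssl s_client -connect "${fqdn}:443" -servername "$fqdn" \
      -CAfile "$ca_file" -verify_return_error -verify_hostname "$fqdn" \
      </dev/null >/dev/null 2>&1; then
    echo "OK: certificate for ${fqdn} validates against ${ca_file}"
  else
    echo "FAILED: certificate for ${fqdn} does not validate against ${ca_file}"
    return 1
  fi
}
```

For example, run check_registry_cert registry.example.com ca.pem from a machine that can reach the registry.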

Benefits

This method of accessing private registries provides benefits like the following:

  1. Improve reliability of container runtime configuration: Setting the containerd configuration with methods like DaemonSets risks a race condition, in which other DaemonSets might run before your configuration DaemonSet.
  2. Reduce vulnerability to privilege escalation attacks: Removes the need to run privileged DaemonSets that modify your container runtime configuration.
  3. Reduce management overhead: Secret Manager lets you store CA public keys in a central location, manage access to the keys by using IAM, and implement version control and annotations. For more information, see the Secret Manager product overview.
  4. Improve auditability: Cloud Logging already collects the relevant logs, including entries for when certificates are added to a cluster and when GKE nodes pull images.

Pricing

In this document, you use the following billable components of Google Cloud:

  • GKE
  • Secret Manager
  • Logging: GKE generates Admin Activity audit logs and, if enabled, Data Access audit logs for this feature. For information about the different types of audit logs, see GKE audit logging.

To generate a cost estimate based on your projected usage, use the pricing calculator.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
  • Enable the Secret Manager API.

  • You must already have a private registry and private CA certificates to access the registry. This guide doesn't cover setting up a private registry or creating certificates.

Requirements

To use private CA public keys to access private registries, you must meet the following requirements:

  • Your clusters must use GKE version 1.27.3-gke.1700 or later.
  • You must use a Container-Optimized OS with containerd node image, which is the default for all GKE clusters. Ubuntu node images with containerd aren't supported. Windows Server node images aren't supported.
  • Your node pools must have the cloud-platform access scope for your nodes to download the certificates. For more information, see Default access scopes in GKE. This document includes instructions to set the access scope when you create a cluster or node pool.

Limitations

Consider the following limitations:

  • You can't use private CA certificates in Ubuntu node images.
  • You can't use private CA certificates in Windows Server nodes.
  • Each cluster supports up to five private CA certificates for private registries.
  • Each certificate can have up to 25 fully-qualified domain names (FQDNs).
  • Each domain can only be used in a single certificate file. However, certificate bundles are supported.
  • Certificates must be PEM-encoded.
  • The server doesn't automatically rotate certificates. For more information, see Rotate your private CA certificates in this document.
  • FQDNs have the following limitations:
    • The maximum FQDN length is 255 characters, including special characters.
    • FQDNs can only use letters, numbers, and dashes (-).
    • Punycode isn't supported.
    • Wildcard characters aren't supported.
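
The FQDN limitations above can be checked locally before you write a configuration file. The following is a minimal sketch with a hypothetical function name, validate_fqdn. It assumes that dots are allowed as label separators in addition to letters, numbers, and dashes, and that Punycode labels are recognizable by their xn-- prefix.

```shell
# Sketch: check a registry FQDN against the limitations listed above.
# Assumption: dots separate labels and are allowed in addition to
# letters, numbers, and dashes.
validate_fqdn() {
  fqdn=$1

  if [ "${#fqdn}" -gt 255 ]; then
    echo "invalid: longer than 255 characters"
    return 1
  fi
  case $fqdn in
    *'*'*)
      echo "invalid: wildcard characters aren't supported"
      return 1
      ;;
  esac
  # Punycode labels start with "xn--".
  if printf '%s\n' "$fqdn" | grep -Eiq '(^|\.)xn--'; then
    echo "invalid: Punycode isn't supported"
    return 1
  fi
  if ! printf '%s\n' "$fqdn" | grep -Eq '^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*$'; then
    echo "invalid: only letters, numbers, dashes, and dots are allowed"
    return 1
  fi
  echo "ok"
}
```

The function prints ok and returns 0 for a name like registry.example.com, and prints a reason and returns 1 otherwise.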

Migrate from configuration DaemonSets

In GKE Standard clusters, you can deploy privileged DaemonSets to modify your container runtime configuration. This method directly modifies the containerd configuration on each node.

If you use privileged DaemonSets to configure access to private registries, consider the following before you use this document:

  • Storing private CA public keys in Secret Manager only configures access to private registries. Other registry-related configuration isn't supported.
  • Enabling this feature causes your cluster to use containerd's CRI hostpath configuration model, which is incompatible with the previous configuration model. If you have any DaemonSets that modify the containerd host configuration, such as for insecure private registries, mirrors, or proxies, update the DaemonSets to use the CRI hostpath model.

    For the available fields in the CRI hostpath model, see Registry configuration in the containerd GitHub repository.

When you enable this feature, GKE applies the CRI hostpath configuration model to new nodes in the cluster. Existing nodes continue to use the previous configuration model until they're recreated during events like upgrades.

Update DaemonSets to support both configuration models

After you enable this feature, a cluster can contain nodes that use either configuration model until all existing nodes are recreated. To keep your configuration DaemonSets working on both kinds of node, make each DaemonSet check the containerd configuration files on the node and apply the configuration model that the node uses. For an example DaemonSet that implements this conditional logic, see the insecure-registry-config.yaml manifest in the GoogleCloudPlatform/k8s-node-tools GitHub repository.

Store your CA public keys in Secret Manager

Store the public keys of the private CAs that issue your private registry certificates as secrets in Secret Manager. For instructions, see Create a secret in the Secret Manager documentation.
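
As a rough sketch of this step with the gcloud CLI, the following function checks that a file looks PEM-encoded and then creates a secret from it. The function name store_ca_cert, the secret name, and the file path are hypothetical. The automatic replication policy is only one option, so adjust it to your organization's requirements.

```shell
# Sketch: store a PEM-encoded CA certificate as a new secret.
# The secret name and the file path are placeholders that you choose.
store_ca_cert() {
  secret_name=$1   # for example, registry-ca
  pem_file=$2      # for example, ca.pem

  # Basic sanity check: the file must contain at least one PEM certificate.
  if ! grep -q -e '-----BEGIN CERTIFICATE-----' "$pem_file"; then
    echo "error: ${pem_file} doesn't look like a PEM-encoded certificate" >&2
    return 1
  fi

  gcloud secrets create "$secret_name" \
      --replication-policy="automatic" \
      --data-file="$pem_file"
}
```

For example, run store_ca_cert registry-ca ca.pem after you authenticate the gcloud CLI.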

Configure access to Secret Manager from GKE

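To ensure that the cluster's IAM service account has the necessary permissions to pull secrets from Secret Manager, ask your administrator to grant the cluster's IAM service account the following IAM roles on the secret:

  • Secret Manager Secret Accessor (roles/secretmanager.secretAccessor)
  • Secret Manager Viewer (roles/secretmanager.viewer)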

For more information about granting roles, see Manage access to projects, folders, and organizations.

These predefined roles contain the permissions required to pull secrets from Secret Manager. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to pull secrets from Secret Manager:

  • resourcemanager.projects.get
  • resourcemanager.projects.list
  • secretmanager.secrets.get
  • secretmanager.secrets.list
  • secretmanager.versions.get
  • secretmanager.versions.list
  • secretmanager.versions.access

Your administrator might also be able to give the cluster's IAM service account these permissions with custom roles or other predefined roles.

We recommend that you associate a minimally privileged custom IAM service account with your clusters and node pools. If you didn't associate a custom IAM service account with your cluster or node pool, the cluster uses the Compute Engine default service account. For instructions, see Use least-privileged service accounts.
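
The following sketch shows one way that an administrator might grant access on a single secret with the gcloud CLI. It assumes that the predefined roles roles/secretmanager.secretAccessor and roles/secretmanager.viewer cover the permissions listed above. The function name grant_secret_access and the example values are hypothetical.

```shell
# Sketch: grant a service account access to one secret.
# Assumption: roles/secretmanager.secretAccessor and
# roles/secretmanager.viewer cover the permissions listed above.
grant_secret_access() {
  secret_name=$1   # secret that holds the CA certificate
  sa_email=$2      # IAM service account used by the cluster's nodes

  for role in roles/secretmanager.secretAccessor roles/secretmanager.viewer; do
    gcloud secrets add-iam-policy-binding "$secret_name" \
        --member="serviceAccount:${sa_email}" \
        --role="$role" || return 1
  done
}
```

Binding the roles on the secret, instead of on the project, limits the service account to the CA certificates that the cluster needs.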

Create a runtime configuration file

To enable the ability to use private CA certificates for private registries in GKE, you create a YAML file to modify the containerd configuration.

  1. Get your Google Cloud project number:

    gcloud projects describe PROJECT_ID \
        --format="value(projectNumber)"
    

    Replace PROJECT_ID with your Google Cloud project ID. The output is your numerical project number.

  2. Save the following configuration as containerd-configuration.yaml:

    privateRegistryAccessConfig:
      certificateAuthorityDomainConfig:
      - gcpSecretManagerCertificateConfig:
          secretURI: "projects/PROJECT_NUMBER/secrets/SECRET_NAME/versions/SECRET_VERSION"
        fqdns:
          - "FQDN1"
          - "FQDN2"
      enabled: true
    

    Replace the following:

    • PROJECT_NUMBER: the project number that you got in the previous step.
    • SECRET_NAME: the name of your secret in Secret Manager.
    • SECRET_VERSION: the version number of your secret in Secret Manager. You can optionally use a version alias, but we recommend using the version number to avoid management complexity.
    • FQDN1, FQDN2: the fully-qualified domain names for your private registries. You can also use an IPv4 address if a certificate was issued for that address, but we don't recommend it.

For a description of these fields, see privateRegistryAccessConfig in the Available containerd configuration options table.
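
If you manage several clusters or registries, you might generate the configuration file instead of editing it by hand. The following is a minimal sketch that writes the same structure as the example above. The function name write_containerd_config and its argument order are hypothetical.

```shell
# Sketch: generate a containerd configuration file from arguments.
write_containerd_config() {
  project_number=$1
  secret_name=$2
  secret_version=$3
  out_file=$4
  shift 4   # the remaining arguments are the registry FQDNs

  {
    echo "privateRegistryAccessConfig:"
    echo "  certificateAuthorityDomainConfig:"
    echo "  - gcpSecretManagerCertificateConfig:"
    echo "      secretURI: \"projects/${project_number}/secrets/${secret_name}/versions/${secret_version}\""
    echo "    fqdns:"
    for fqdn in "$@"; do
      echo "      - \"${fqdn}\""
    done
    echo "  enabled: true"
  } > "$out_file"
}
```

For example, write_containerd_config 123456789012 example-secret-name 1 containerd-configuration.yaml registry.example.com writes a file with one FQDN entry.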

Apply containerd configuration to new clusters

This section shows you how to apply a containerd configuration file when you create a new GKE cluster.

Run the following command:

gcloud container clusters create-auto CLUSTER_NAME \
    --location=LOCATION \
    --scopes="cloud-platform" \
    --containerd-config-from-file="PATH_TO_CONFIG_FILE"

Replace the following:

  • CLUSTER_NAME: the name of your new cluster.
  • LOCATION: the Compute Engine location of your new cluster.
  • PATH_TO_CONFIG_FILE: the path to the configuration file that you created, like ~/containerd-configuration.yaml.

You can enable private registry configuration on new Standard clusters by running the gcloud container clusters create command with the same options.

Apply containerd configuration to existing clusters

This section shows you how to apply a containerd configuration to existing clusters and nodes.

Check access scopes

Existing clusters must have the cloud-platform access scope to use this feature. This section shows you how to check your access scopes and update an existing cluster with a new or modified private registry configuration file.

For details about the default access scopes in new clusters, see Access scopes in GKE.

Check Autopilot access scopes

Run the following command:

gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION \
    --flatten=nodeConfig \
    --format='csv[delimiter="\\n",no-heading](oauthScopes)'

If your cluster doesn't have the https://www.googleapis.com/auth/cloud-platform access scope, create a new cluster with this access scope.

Check Standard access scopes

To check your Standard cluster access scopes, check a node pool:

gcloud container node-pools describe NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=LOCATION \
    --flatten=nodeConfig \
    --format='csv[delimiter="\\n",no-heading](oauthScopes)'

Replace NODE_POOL_NAME with the name of the node pool.

If your node pool doesn't have the https://www.googleapis.com/auth/cloud-platform access scope, create a new node pool with the cloud-platform access scope and delete your existing node pool.
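
To avoid reading the scope list by eye, you might filter the output of either describe command. The following sketch reads scopes from standard input and reports whether the cloud-platform scope is present. The function name has_cloud_platform_scope is hypothetical, and the pattern deliberately rejects the narrower cloud-platform.read-only scope.

```shell
# Sketch: read a list of OAuth scopes on stdin and report whether the
# cloud-platform scope is present.
has_cloud_platform_scope() {
  # Require the scope to end right after "cloud-platform" so that
  # "cloud-platform.read-only" doesn't count as a match.
  if grep -Eq 'https://www\.googleapis\.com/auth/cloud-platform([^.A-Za-z0-9_-]|$)'; then
    echo "cloud-platform scope found"
  else
    echo "cloud-platform scope missing"
    return 1
  fi
}
```

For example, pipe the output of the gcloud container node-pools describe command shown earlier into has_cloud_platform_scope.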

Update the cluster to use your configuration file

Run the following command:

gcloud container clusters update CLUSTER_NAME \
    --location=LOCATION \
    --containerd-config-from-file="PATH_TO_CONFIG_FILE"

Recreate nodes in Standard clusters

If your Standard cluster doesn't use automatic upgrades, you must manually recreate your node pools to apply the new configuration. To trigger a manual node recreation, upgrade your cluster to the same GKE version that it already uses.

gcloud container clusters upgrade CLUSTER_NAME \
    --location=LOCATION \
    --cluster-version=VERSION

Replace VERSION with the same GKE patch version that the cluster already uses.

Verify that your cluster can access the private registry

To confirm that GKE applied the configuration to your cluster, run the following command:

gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION \
    --flatten="nodePoolDefaults.nodeConfigDefaults.containerdConfig"

The output is similar to the following:

    containerdConfig:
      privateRegistryAccessConfig:
        certificateAuthorityDomainConfig:
        - fqdns:
          - 203.0.113.105
          gcpSecretManagerCertificateConfig:
            secretUri: projects/123456789012/secrets/example-secret-name/versions/1
        enabled: true

Deploy a workload that accesses a private image

In this section, you deploy a Pod that references an image from your private registry.

  1. Save the following manifest as private-registry-pod.yaml:

    apiVersion: v1
    kind: Pod
    metadata:
      name: private-registry-pod
    spec:
      containers:
      - name: private-image
        image: IMAGE_NAME
    

    Replace IMAGE_NAME with your private image name.

  2. Deploy the Pod:

    kubectl create -f private-registry-pod.yaml
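
To confirm that the image pull worked, you might wait for the Pod to become ready and inspect its events if it doesn't. The following is a minimal sketch that assumes kubectl is configured for the cluster. The function name check_private_registry_pod and the 120-second timeout are hypothetical choices.

```shell
# Sketch: wait for the example Pod to become ready and show its events
# if it doesn't. Assumes kubectl is configured for the cluster.
check_private_registry_pod() {
  pod_name=${1:-private-registry-pod}

  if kubectl wait --for=condition=Ready "pod/${pod_name}" --timeout=120s; then
    echo "Pod ${pod_name} is ready, so the image pull succeeded"
  else
    # Image pull failures, such as certificate errors, appear as events.
    kubectl get events \
        --field-selector "involvedObject.name=${pod_name}" \
        --sort-by=.lastTimestamp
    return 1
  fi
}
```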
    

Rotate your private CA certificates

Secret Manager and GKE can't automatically rotate private CA certificates in Secret Manager. To rotate a certificate, complete the following steps. These steps require that you recreate existing nodes twice, so we recommend that you perform certificate rotations during scheduled downtime to minimize workload disruption.

  1. Create a PEM-encoded certificate bundle that contains both your old and new certificates.
  2. Add the bundle as a new secret version in Secret Manager.
  3. Update your runtime configuration file secretURI field with the new secret version number.
  4. Update your cluster to use the new secret version.
  5. Get the timestamp of the update operation:

    gcloud container operations list \
        --filter="operationType ~ UPDATE_CLUSTER AND targetLink ~ CLUSTER_NAME" \
        --sort-by="~startTime" \
        --limit=1 \
        --format='value(endTime)'
    

    The output is similar to the following:

    2024-01-31T09:27:30.864308964Z
    
  6. Look for nodes that were created before the update operation ended:

    kubectl get nodes -o json | jq ".items[] |
    select(.metadata.creationTimestamp | fromdateiso8601 < $(date -d
    CLUSTER_UPDATE_TIMESTAMP +%s)) | .metadata.name"
    

    Replace CLUSTER_UPDATE_TIMESTAMP with the timestamp from the previous step.

    The output is a list of node names that haven't been recreated with the updated configuration. When the output is blank, proceed to the next step.

  7. Create a new version of your secret in Secret Manager with only the new certificate.

  8. Repeat the previous steps to update your cluster, get the operation timestamp, and verify that your nodes use the new secret version.

  9. Delete the old secret version from Secret Manager.
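
Steps 1 and 2 of the rotation can be sketched as follows. The function concatenates the old and new PEM files into a bundle, checks that the bundle contains at least two certificates, and adds it as a new secret version. The function name add_ca_bundle_version and all file names are hypothetical.

```shell
# Sketch: build a PEM bundle from the old and new CA certificates and
# add it as a new secret version.
add_ca_bundle_version() {
  secret_name=$1
  old_pem=$2
  new_pem=$3
  bundle=$4   # output path for the combined bundle

  # Add a newline after each file so that PEM markers never share a line.
  for pem in "$old_pem" "$new_pem"; do
    cat "$pem" || return 1
    echo
  done > "$bundle"

  # Expect one BEGIN marker per certificate in the bundle.
  count=$(grep -c -e '-----BEGIN CERTIFICATE-----' "$bundle") || count=0
  if [ "$count" -lt 2 ]; then
    echo "error: expected at least 2 certificates, found ${count}" >&2
    return 1
  fi
  echo "bundle contains ${count} certificates"

  gcloud secrets versions add "$secret_name" --data-file="$bundle"
}
```

After the command succeeds, use the new version number in the secretURI field of your runtime configuration file.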

View audit logs in Logging

This section shows you how to use Logging to check whether GKE installed your secret version on your nodes.

  1. Go to the Logs Explorer page in the Google Cloud console:

    Go to Logs Explorer

  2. Specify the following query:

    resource.type="gce_instance"
    textPayload:"Installed certificate \\\"projects/PROJECT_NUMBER/secrets/SECRET_NAME/versions/SECRET_VERSION\\\""
    

    If your certificate installation succeeded, the output is similar to the following:

    "Installed certificate "projects/PROJECT_NUMBER/secrets/SECRET_NAME/versions/SECRET_VERSION""
    

    If your certificate installation failed, the output is similar to the following:

    "Failed to install certificate "projects/PROJECT_NUMBER/secrets/SECRET_NAME/versions/SECRET_VERSION""
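
If you prefer the command line to the Logs Explorer, a similar search can be sketched with the gcloud logging read command. This sketch matches on the secret URI alone, on the assumption that both the success and the failure messages contain it. The function name read_cert_install_logs and the limit of 20 entries are hypothetical choices.

```shell
# Sketch: search node logs for entries that mention one secret version.
read_cert_install_logs() {
  # projects/PROJECT_NUMBER/secrets/SECRET_NAME/versions/SECRET_VERSION
  secret_uri=$1

  # Match on the secret URI alone so that both "Installed certificate" and
  # "Failed to install certificate" entries are returned.
  filter="resource.type=\"gce_instance\" AND textPayload:\"${secret_uri}\""

  gcloud logging read "$filter" --limit=20 \
      --format="value(timestamp,textPayload)"
}
```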
    

Best practices

We recommend that you use the following best practices when you use this feature:

  • Don't use aliases for Secret Manager secret versions. Use the auto-generated version number for each secret version. An alias might point to a different certificate version over time, which might cause complexities in tracking the specific versions that your workloads use.
  • Use maintenance windows and exclusions to control when GKE can recreate your nodes to apply updated containerd configurations.
  • Provide access to secrets at the secret level, not at the project level.

Disable containerd configuration options

To remove your custom configuration, do the following:

  1. Update your configuration file to specify enabled: false in the configuration item that you want to disable and delete any other fields in the item, like in the following example:

    privateRegistryAccessConfig:
      enabled: false

  2. Apply the updated configuration file to your cluster. For instructions, see Apply containerd configuration to existing clusters.

Troubleshoot

For troubleshooting steps, see Troubleshooting the container runtime.