Isolate your workloads in dedicated node pools

This page shows you how to isolate your container workloads on dedicated node pools in Google Distributed Cloud (GDC) air-gapped to give you more control over your pods. Workload isolation provides benefits such as the following:

  • Reduced risk of privilege escalation attacks in your Kubernetes cluster.
  • More control over pods that require additional resources.

For these cases, consider isolating your container workloads for more control and optimization.

Why should I isolate my workloads?

Isolating your workloads on dedicated node pools isn't required, but it can be a prudent way to avoid potential issues. Be aware, however, that managing dedicated node pools requires more oversight and is often unnecessary.

Kubernetes clusters use privileged GDC-managed workloads to enable specific cluster capabilities and features, such as metrics gathering. These workloads are given special permissions to run correctly in the cluster.

Workloads that you deploy to your nodes might be compromised by a malicious entity. Running these workloads alongside privileged GDC-managed workloads means that an attacker who breaks out of a compromised container can use the credentials of the privileged workload on the node to escalate privileges in your cluster.

Dedicated node pools are also useful when you must schedule pods that require more resources than others, such as more memory or more local disk space.

You can use the following mechanisms to schedule your workloads on a dedicated node pool:

  • A node taint informs your Kubernetes cluster to avoid scheduling workloads without a corresponding toleration, such as GDC-managed workloads, on those nodes.
  • A node affinity rule on your own workloads tells the cluster to schedule your pods on the dedicated nodes.
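
As a minimal sketch of how these two pieces pair up, using the workloadType=untrusted example that the rest of this page builds out in full, the node pool carries the taint and label, and the pod specification carries the matching toleration:

# Node pool configuration: the taint keeps untolerated pods off the nodes,
# and the label gives your workloads a target to select.
taints: workloadType=untrusted:NoExecute
labels: workloadType=untrusted

# Pod specification: the matching toleration opts your pods back in.
tolerations:
- key: workloadType
  operator: Equal
  value: untrusted
  effect: NoExecute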

Limitations of node isolation

  • Attackers can still initiate Denial-of-Service (DoS) attacks from the compromised node.

  • Compromised nodes can still read many resources, including all pods and namespaces in the cluster.

  • Compromised nodes can access secrets and credentials used by every pod running on that node.

  • Using a separate node pool to isolate your workloads can impact your cost efficiency, autoscaling, and resource utilization.

  • Compromised nodes can still bypass egress network policies.

  • Some GDC-managed workloads must run on every node in your cluster, and are configured to tolerate all taints.

  • If you deploy DaemonSet resources that have elevated permissions and can tolerate any taint, those pods might be a pathway for privilege escalation from a compromised node.

How node isolation works

To implement node isolation for your workloads, you must do the following:

  1. Taint and label a node pool for your workloads.

  2. Update your workloads with the corresponding toleration and node affinity rule.

This guide assumes that you start with one node pool in your cluster. Using node affinity in addition to node taints isn't mandatory, but we recommend it because it gives you greater control over scheduling.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Choose a specific name for the node taint and the node label that you want to use for the dedicated node pools. For example, workloadType=untrusted.

  • If necessary, ask your Organization IAM Admin to grant you the User Cluster Admin role (user-cluster-admin), which is not bound to a namespace.

Taint and label a new node pool

When you apply a taint or a label to a new node pool, all nodes in that pool, including any nodes added later, automatically get the specified taints and labels.

To add a taint and a label to a new node pool, complete the following steps:

  1. Edit the nodePools section of the Cluster custom resource directly when creating the node pool (a filled-in example follows these steps):

    nodePools:
      ...
      - machineTypeName: n2-standard-2-gdc
        name: nodepool-1
        nodeCount: 3
        taints: TAINT_KEY=TAINT_VALUE:TAINT_EFFECT
        labels: LABEL_KEY=LABEL_VALUE
    

    Replace the following:

    • TAINT_KEY=TAINT_VALUE: a key-value pair associated with a scheduling TAINT_EFFECT. For example, workloadType=untrusted.
    • TAINT_EFFECT: one of the following effect values:
      • NoSchedule: pods that don't tolerate this taint are not scheduled on the node; existing pods are not evicted from the node.
      • PreferNoSchedule: Kubernetes avoids scheduling pods that don't tolerate this taint onto the node.
      • NoExecute: the pod is evicted from the node if it's already running on the node, and is not scheduled onto the node if it's not yet running on the node.
    • LABEL_KEY=LABEL_VALUE: the key-value pairs for the node labels, which correspond to the selectors that you specify in your workload manifests.
  2. Apply the Cluster resource to create the new node pool:

    kubectl apply -f cluster.yaml \
        --kubeconfig ORG_ADMIN_CLUSTER_KUBECONFIG
    
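
For example, the nodePools entry from step 1, filled in with the workloadType=untrusted taint and label used throughout this page, might look similar to the following sketch. The pool name, machine type, and node count are illustrative only:

nodePools:
  ...
  - machineTypeName: n2-standard-2-gdc
    name: untrusted-pool
    nodeCount: 3
    taints: workloadType=untrusted:NoExecute
    labels: workloadType=untrusted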

Add a toleration and a node affinity rule

After you taint the dedicated node pool, no workloads can schedule on it unless they have a toleration corresponding to the taint you added. Add the toleration to the specification for your workloads to let those pods schedule on your tainted node pool.

If you labeled the dedicated node pool, you can also add a node affinity rule to tell GDC to only schedule your workloads on that node pool.

To configure your container workload to run in the dedicated node pool, complete the following steps:

  1. Add the following sections to the .spec.template.spec section of your container workload:

    kind: Deployment
    apiVersion: apps/v1
    ...
    spec:
      ...
      template:
        spec:
          tolerations:
          - key: TAINT_KEY
            operator: Equal
            value: TAINT_VALUE
            effect: TAINT_EFFECT
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: LABEL_KEY
                    operator: In
                    values:
                    - "LABEL_VALUE"
          ...
    

    Replace the following:

    • TAINT_KEY: the taint key that you applied to your dedicated node pool.
    • TAINT_VALUE: the taint value that you applied to your dedicated node pool.
    • TAINT_EFFECT: one of the following effect values:
      • NoSchedule: pods that don't tolerate this taint are not scheduled on the node; existing pods are not evicted from the node.
      • PreferNoSchedule: Kubernetes avoids scheduling pods that don't tolerate this taint onto the node.
      • NoExecute: the pod is evicted from the node if it's already running on the node, and is not scheduled onto the node if it's not yet running on the node.
    • LABEL_KEY: the node label key that you applied to your dedicated node pool.
    • LABEL_VALUE: the node label value that you applied to your dedicated node pool.

    For example, the following Deployment resource adds a toleration for the workloadType=untrusted:NoExecute taint and a node affinity rule for the workloadType=untrusted node label:

    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: my-app
      namespace: default
      labels:
        app: my-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          tolerations:
          - key: workloadType
            operator: Equal
            value: untrusted
            effect: NoExecute
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: workloadType
                    operator: In
                    values:
                    - "untrusted"
          containers:
          - name: my-app
            image: harbor-1.org-1.zone1.google.gdc.test/harborproject/my-app
            ports:
            - containerPort: 80
          imagePullSecrets:
          - name: SECRET
    
  2. Update your deployment:

    kubectl apply -f deployment.yaml -n NAMESPACE \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG
    

    Replace the following variables:

    • NAMESPACE: the project namespace of your container workload.
    • KUBERNETES_CLUSTER_KUBECONFIG: the kubeconfig path for the Kubernetes cluster.

GDC recreates the affected pods. The node affinity rule forces the pods onto the dedicated node pool that you created, and the toleration allows them to run there despite the taint that keeps other pods off those nodes.
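
To wait for the updated pods to be recreated before you verify their placement, you can watch the rollout. The following command assumes the my-app example Deployment from the previous section:

kubectl rollout status deployment/my-app -n NAMESPACE \
    --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG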

Verify that the separation works

To verify that the scheduling works correctly, run the following command and check the NODE column to confirm that your workloads are running on nodes in the dedicated node pool:

kubectl get pods -o=wide -n NAMESPACE \
    --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG
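
To cross-check which nodes belong to the dedicated node pool, you can list the nodes that carry the label you applied. The following command assumes the workloadType=untrusted example label:

kubectl get nodes -l workloadType=untrusted \
    --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG

Each of your workload pods in the earlier output should be running on one of these nodes.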

Recommendations and best practices

After setting up node isolation, we recommend that you do the following:

  • When creating new node pools, prevent most GDC-managed workloads from running on those nodes by adding your own taint to those node pools.
  • Whenever you deploy new workloads to your cluster, such as when installing third-party tooling, audit the permissions that the pods require. When possible, avoid deploying workloads that use elevated permissions to shared nodes.
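
As a starting point for that audit, the following command prints every pod in the cluster along with the privileged flag of its containers, so you can spot workloads that request elevated permissions. The jsonpath expression is a sketch; the flag column is empty when privileged isn't set:

kubectl get pods -A \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.spec.containers[*].securityContext.privileged}{"\n"}{end}' \
    --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG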