Isolate your workloads in dedicated node pools

This document explains how to enhance the security and management of your Kubernetes cluster by isolating container workloads within dedicated node pools in Google Distributed Cloud (GDC) air-gapped. Isolating your workloads gives you greater control over your pods and reduces the risk of privilege escalation attacks in your Kubernetes cluster. For more information about the benefits and limitations of dedicated node pools, see Node isolation overview.

There are several workflows involved with isolating your container workloads, which include the following:

  • Taint and label a node pool: Apply a taint and a label to a node pool so that it repels pods unless they're specifically configured to run there.

  • Add a toleration and a node affinity rule: Apply tolerations and rules to your pods to force them to run on the designated node pool only.

  • Verify that the separation works: Confirm that your tainted node pools run only the pods that you configured to run there.

These workflows are intended for audiences such as IT administrators within the platform administrator group who are responsible for managing the node pools of a Kubernetes cluster, and application developers within the application operator group who are responsible for managing container workloads. For more information, see Audiences for GDC air-gapped documentation.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Choose a specific name for the node taint and the node label that you want to use for the dedicated node pools. For example, workloadType=untrusted.

  • If necessary, ask your Organization IAM Admin to grant you the User Cluster Developer role (user-cluster-developer), which is not bound to a namespace.

Taint and label a new node pool

When you apply a taint or a label to a new node pool, all nodes in the pool, including any nodes added later, automatically get the specified taints and labels.

To add a taint and a label to a new node pool, complete the following steps:

  1. Edit the nodePools section of the Cluster custom resource directly when creating the node pool:

    nodePools:
      # Several lines of code are omitted here.
      - machineTypeName: n2-standard-2-gdc
        name: nodepool-1
        nodeCount: 3
        taints: TAINT_KEY=TAINT_VALUE:TAINT_EFFECT
        labels: LABEL_KEY=LABEL_VALUE
    

    Replace the following:

    • TAINT_KEY=TAINT_VALUE: a key-value pair associated with a scheduling TAINT_EFFECT. For example, workloadType=untrusted.
    • TAINT_EFFECT: one of the following effect values:
      • NoSchedule: pods that don't tolerate this taint are not scheduled on the node; existing pods are not evicted from the node.
      • PreferNoSchedule: Kubernetes avoids scheduling pods that don't tolerate this taint onto the node.
      • NoExecute: the pod is evicted from the node if it's already running on the node, and is not scheduled onto the node if it's not yet running on the node.
    • LABEL_KEY=LABEL_VALUE: the key-value pairs for the node labels, which correspond to the selectors that you specify in your workload manifests.
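
    For example, using the workloadType=untrusted taint and label from the Before you begin section, and assuming the NoExecute effect, the node pool entry might look similar to the following:

    nodePools:
      # Several lines of code are omitted here.
      - machineTypeName: n2-standard-2-gdc
        name: nodepool-1
        nodeCount: 3
        taints: workloadType=untrusted:NoExecute
        labels: workloadType=untrusted
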
  2. Apply the Cluster resource to create the new node pool:

    kubectl apply -f cluster.yaml --kubeconfig MANAGEMENT_API_SERVER
    

    Replace MANAGEMENT_API_SERVER with the kubeconfig path of the zonal Management API server for the zone where the Kubernetes cluster is hosted. If you have not yet generated a kubeconfig file for the API server in your targeted zone, see Zonal Management API server resources for more information.
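
    After the new nodes are ready, you can optionally confirm that they received the taint and label by querying the Kubernetes cluster directly. The following commands are a minimal sketch, where KUBERNETES_CLUSTER_KUBECONFIG is the kubeconfig path for the Kubernetes cluster:

    kubectl get nodes -l LABEL_KEY=LABEL_VALUE \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG

    kubectl describe node NODE_ID \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG | grep Taints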

Taint and label an existing node pool

To apply a taint or label to an existing node pool, you must apply the changes to each existing node. You cannot dynamically update node pool configurations.

To add a taint and label to an existing node pool, complete the following steps:

  1. List the nodes in the dedicated node pool:

    kubectl get node --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG \
        -l baremetal.cluster.gke.io/node-pool=NODE_POOL_NAME
    

    Replace the following variables:

    • KUBERNETES_CLUSTER_KUBECONFIG: the kubeconfig path for the Kubernetes cluster.
    • NODE_POOL_NAME: the name of your dedicated node pool.

    From the output, note the node ID of each node in the node pool.

  2. For each node in the node pool, apply the taints:

    kubectl taint nodes NODE_ID \
        TAINT_KEY=TAINT_VALUE:TAINT_EFFECT \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG
    

    Replace the following variables:

    • NODE_ID: the ID of the worker node in the dedicated node pool.
    • TAINT_KEY=TAINT_VALUE: a key-value pair associated with a scheduling TAINT_EFFECT. For example, workloadType=untrusted.
    • TAINT_EFFECT: one of the following effect values:
      • NoSchedule: pods that don't tolerate this taint are not scheduled on the node; existing pods are not evicted from the node.
      • PreferNoSchedule: Kubernetes avoids scheduling pods that don't tolerate this taint onto the node.
      • NoExecute: the pod is evicted from the node if it's already running on the node, and is not scheduled onto the node if it's not yet running on the node.
    • KUBERNETES_CLUSTER_KUBECONFIG: the kubeconfig path for the Kubernetes cluster.
  3. For each node in the node pool, apply the labels that correspond to the selectors that you'll define in your container workloads (a loop that covers both this step and the previous one follows these steps):

    kubectl label node NODE_ID \
        LABEL_KEY=LABEL_VALUE \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG
    

    Replace the following variables:

    • NODE_ID: the ID of the worker node in the dedicated node pool.
    • LABEL_KEY=LABEL_VALUE: the key-value pairs for the node labels, which correspond to the selectors that you specify in your workload manifests.
    • KUBERNETES_CLUSTER_KUBECONFIG: the kubeconfig path for the Kubernetes cluster.
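
If the node pool contains many nodes, you can optionally run the two preceding commands for all nodes in one pass. The following Bash loop is a minimal sketch that reuses the same placeholder values as the preceding steps:

    # Taint and label every node in the dedicated node pool.
    for node in $(kubectl get node \
        -l baremetal.cluster.gke.io/node-pool=NODE_POOL_NAME \
        -o jsonpath='{.items[*].metadata.name}' \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG); do
      kubectl taint nodes "$node" TAINT_KEY=TAINT_VALUE:TAINT_EFFECT \
          --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG
      kubectl label node "$node" LABEL_KEY=LABEL_VALUE \
          --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG
    done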

Add a toleration and a node affinity rule

After you taint the dedicated node pool, no new workloads can be scheduled on it unless they have a toleration that corresponds to the taint you added. Add the toleration to the specification of your workloads to let those pods be scheduled on your tainted node pool.

If you labeled the dedicated node pool, you can also add a node affinity rule to tell GDC to schedule your workloads only on that node pool.

To configure your container workload to run in the dedicated node pool, complete the following steps:

  1. Add the following fields to the .spec.template.spec section of your container workload manifest file, such as a Deployment custom resource:

      # Several lines of code are omitted here.
        spec:
          template:
            spec:
              tolerations:
              - key: TAINT_KEY
                operator: Equal
                value: TAINT_VALUE
                effect: TAINT_EFFECT
              affinity:
                nodeAffinity:
                  requiredDuringSchedulingIgnoredDuringExecution:
                    nodeSelectorTerms:
                    - matchExpressions:
                      - key: LABEL_KEY
                        operator: In
                        values:
                        - "LABEL_VALUE"
            # Several lines of code are omitted here.
    

    Replace the following:

    • TAINT_KEY: the taint key that you applied to your dedicated node pool.
    • TAINT_VALUE: the taint value that you applied to your dedicated node pool.
    • TAINT_EFFECT: one of the following effect values:
      • NoSchedule: pods that don't tolerate this taint are not scheduled on the node; existing pods are not evicted from the node.
      • PreferNoSchedule: Kubernetes avoids scheduling pods that don't tolerate this taint onto the node.
      • NoExecute: the pod is evicted from the node if it's already running on the node, and is not scheduled onto the node if it's not yet running on the node.
    • LABEL_KEY: the node label key that you applied to your dedicated node pool.
    • LABEL_VALUE: the node label value that you applied to your dedicated node pool.

    For example, the following Deployment resource adds a toleration for the workloadType=untrusted:NoExecute taint and a node affinity rule for the workloadType=untrusted node label:

    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: my-app
      namespace: default
      labels:
        app: my-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          tolerations:
          - key: workloadType
            operator: Equal
            value: untrusted
            effect: NoExecute
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: workloadType
                    operator: In
                    values:
                    - "untrusted"
          containers:
          - name: my-app
            image: harbor-1.org-1.zone1.google.gdc.test/harborproject/my-app
            ports:
            - containerPort: 80
          imagePullSecrets:
          - name: SECRET
    
  2. Update your container workload:

    kubectl apply -f deployment.yaml -n NAMESPACE \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG
    

    Replace the following variables:

    • NAMESPACE: the project namespace of your container workload.
    • KUBERNETES_CLUSTER_KUBECONFIG: the kubeconfig path for the Kubernetes cluster.
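
    Optionally, you can wait for the updated pods to be recreated before you verify their placement. The following command is a sketch that assumes your workload is the Deployment named my-app from the preceding example:

    kubectl rollout status deployment/my-app -n NAMESPACE \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG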

GDC recreates the affected pods. The node affinity rule forces the pods onto the dedicated node pool that you created. The toleration lets those pods be placed on the tainted nodes, while the taint keeps other pods off them.

Verify that the separation works

Verify that the pods that you designated are running on the tainted and labeled node pool.

  • List the pods in the given namespace:

    kubectl get pods -o=wide -n NAMESPACE \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG
    

    Replace the following variables:

    • NAMESPACE: the project namespace of your container workload.
    • KUBERNETES_CLUSTER_KUBECONFIG: the kubeconfig path for the Kubernetes cluster.

    The output looks similar to the following, where the NODE column shows the node that each pod is scheduled on:

    NAME               READY   STATUS    RESTARTS   AGE   IP           NODE
    kube-abc-12tyuj    1/1     Running   0          5m    192.0.2.10   NODE_ID
    kube-abc-39oplef   1/1     Running   0          5m    192.0.2.11   NODE_ID
    kube-abc-95rzkap   1/1     Running   0          5m    192.0.2.12   NODE_ID


    Confirm that the NODE value of each pod matches a node in the dedicated node pool.
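
    Alternatively, as a minimal sketch, you can list only each pod name together with the node that it runs on by using custom columns:

    kubectl get pods -n NAMESPACE \
        -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName \
        --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG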

What's next