Control autoscaled node attributes with custom compute classes


This page shows you how to control the compute infrastructure and autoscaling behavior of Google Kubernetes Engine (GKE) clusters based on the specific needs of your workloads by using custom compute classes. You should already be familiar with the concept of custom compute classes. For details, see About custom compute classes.

This page is intended for platform administrators who want to declaratively define autoscaling profiles for nodes, and for cluster operators who want to run their workloads on specific compute classes.

About custom compute classes

Custom compute classes are Kubernetes Custom Resources that let you define priorities for GKE to follow when provisioning nodes to run your workloads. You can use a custom compute class to do the following:

  • Give GKE a set of priorities to sequentially follow when provisioning nodes, each with specific parameters like a Compute Engine machine series or minimum resource capacity
  • Define autoscaling thresholds and parameters for removing underutilized nodes and consolidating workloads efficiently on existing compute capacity
  • Tell GKE to automatically replace less preferred node configurations with more preferred node configurations for optimal workload performance

To understand all of the configuration options and how they interact with each other and with GKE Autopilot mode and GKE Standard mode, see About custom compute classes.

Pricing

The ComputeClass custom resource is provided at no extra cost in GKE. You pay for the compute resources, such as nodes, that GKE provisions based on your compute class definitions.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
  • Ensure that you have an existing GKE cluster running version 1.30.3-gke.1451000 or later. For more information, see Create an Autopilot cluster.
  • If you're using a Standard mode cluster, ensure that you have at least one node pool with autoscaling enabled or that your cluster uses node auto-provisioning.
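
To confirm that your cluster meets the version requirement, you can check the control plane version with the gcloud CLI. This is a sketch; CLUSTER_NAME and LOCATION are placeholders for your own values:

```shell
# Print the control plane version of an existing cluster.
gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION \
    --format="value(currentMasterVersion)"
```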

Example scenario for compute classes

This page presents an example scenario for which you define a custom compute class. In practice, you should consider the requirements of your specific workloads and organization, and define compute classes that meet those requirements. For full descriptions of all of the options for compute classes, and for special considerations, see About custom compute classes.

Consider the following example scenario:

  • Your goal is to optimize running costs for your workloads
  • Your workloads are fault-tolerant and don't require graceful shutdown or extended runtime
  • Your workloads need at least 64 vCPU to run optimally
  • You're limited to the N2 Compute Engine machine series

Based on the example scenario, you decide that you want a compute class that does the following:

  • Prioritizes N2 Spot nodes that have at least 64 vCPU
  • Lets GKE fall back to any N2 Spot node, regardless of compute capacity
  • If no N2 Spot nodes are available, lets GKE use on-demand N2 nodes
  • Tells GKE to move your workloads to Spot nodes whenever they're available again

Configure a compute class in Autopilot mode

In GKE Autopilot, you define a compute class, deploy it to the cluster, and request that compute class in your workloads. GKE performs any node configuration steps, like applying labels and taints, for you.

Save the following manifest as compute-class.yaml:

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: cost-optimized
spec:
  priorities:
  - machineFamily: n2
    spot: true
    minCores: 64
  - machineFamily: n2
    spot: true
  - machineFamily: n2
    spot: false
  activeMigration:
    optimizeRulePriority: true
  nodePoolAutoCreation:
    enabled: true

Configure a compute class in Standard mode

In GKE Standard mode clusters, you define a compute class, after which you might have to perform manual configuration to ensure that your compute class Pods schedule as expected. The required configuration depends on whether your cluster uses manually-created node pools or node auto-provisioning, as the following sections describe.

Use compute classes with manually-created node pools

This section shows you how to define a compute class in a cluster that only uses manually-created node pools.

  1. Save the following manifest as compute-class.yaml:

    apiVersion: cloud.google.com/v1
    kind: ComputeClass
    metadata:
      name: cost-optimized
    spec:
      priorities:
      - machineFamily: n2
        spot: true
        minCores: 64
      - machineFamily: n2
        spot: false
      activeMigration:
        optimizeRulePriority: true
    
  2. Create a new autoscaled node pool that uses Spot VMs and associate it with the compute class:

    gcloud container node-pools create cost-optimized-pool \
        --location=LOCATION \
        --cluster=CLUSTER_NAME \
        --machine-type=n2-standard-64 \
        --spot \
        --enable-autoscaling \
        --max-nodes=9 \
        --node-labels="cloud.google.com/compute-class=cost-optimized" \
        --node-taints="cloud.google.com/compute-class=cost-optimized:NoSchedule"
    

    Replace the following:

    • LOCATION: the location of your cluster.
    • CLUSTER_NAME: the name of your existing cluster.
  3. Create a new autoscaled node pool with on-demand VMs and associate it with the compute class:

    gcloud container node-pools create on-demand-pool \
        --location=LOCATION \
        --cluster=CLUSTER_NAME \
        --machine-type=n2-standard-64 \
        --enable-autoscaling \
        --max-nodes=9 \
        --num-nodes=0 \
        --node-labels="cloud.google.com/compute-class=cost-optimized" \
        --node-taints="cloud.google.com/compute-class=cost-optimized:NoSchedule"
    

When you deploy Pods that request this compute class and new nodes need to be created, GKE prioritizes creating nodes in the cost-optimized-pool node pool. If new nodes can't be created, GKE creates nodes in the on-demand-pool node pool.
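
To observe which node pool GKE chose, you can list the nodes together with their compute class and Spot labels. This is a sketch; it assumes the cloud.google.com/gke-spot label that GKE sets on Spot VM nodes:

```shell
# List nodes with their compute class and Spot status as extra columns.
kubectl get nodes \
    -L cloud.google.com/compute-class \
    -L cloud.google.com/gke-spot
```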

For more details about how manually-created node pools interact with custom compute classes, see Configure manually-created node pools for compute class use.

Use compute classes with auto-provisioned node pools

This section shows you how to define a compute class in a cluster that uses node auto-provisioning.

Save the following manifest as compute-class.yaml:

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: cost-optimized
spec:
  priorities:
  - machineFamily: n2
    spot: true
    minCores: 64
  - machineFamily: n2
    spot: true
  - machineFamily: n2
    spot: false
  activeMigration:
    optimizeRulePriority: true
  nodePoolAutoCreation:
    enabled: true

When you deploy Pods that request this compute class and new nodes need to be created, GKE prioritizes creating nodes in the order of the items in the priorities field. If required, GKE creates new node pools that meet the hardware requirements of the compute class.

For more details about how node auto-provisioning works with custom compute classes, see Node auto-provisioning and compute classes.
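
Node auto-provisioning must be enabled for GKE to create node pools on your behalf. As a sketch, you can check the cluster-level setting with the gcloud CLI (CLUSTER_NAME and LOCATION are placeholders):

```shell
# Prints True when node auto-provisioning is enabled for the cluster.
gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION \
    --format="value(autoscaling.enableNodeAutoprovisioning)"
```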

Customize autoscaling thresholds for node consolidation

By default, GKE removes underutilized nodes and reschedules your workloads onto other available nodes. You can further customize the thresholds and timing after which a node becomes a candidate for removal by using the autoscalingPolicy field in the compute class definition, like in the following example:

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: cost-optimized
spec:
  priorities:
  - machineFamily: n2
    spot: true
    minCores: 64
  - machineFamily: n2
    spot: true
  - machineFamily: n2
    spot: false
  activeMigration:
    optimizeRulePriority: true
  autoscalingPolicy:
    consolidationDelayMinutes: 5
    consolidationThreshold: 70

In this example, a node becomes a candidate for removal if its CPU and memory utilization stays below 70% of the node's available capacity for more than five minutes. For a list of available parameters, see Set autoscaling parameters for node consolidation.
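
To see when the cluster autoscaler removes underutilized nodes, you can inspect cluster events. This is a sketch; the ScaleDown event reason is emitted by the cluster autoscaler and may vary by GKE version:

```shell
# List recent events related to autoscaler scale-down activity,
# most recent last.
kubectl get events --all-namespaces \
    --field-selector reason=ScaleDown \
    --sort-by=.lastTimestamp
```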

Deploy a compute class in a cluster

After you define a compute class, deploy it to the cluster:

kubectl apply -f compute-class.yaml

This compute class is ready to use in the cluster. You can request the compute class in Pod specifications or, optionally, set it as the default compute class in a specific namespace.
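
To verify that the compute class was created, you can query the custom resource. A sketch, assuming the resource name from the preceding manifest:

```shell
# Confirm that the ComputeClass resource exists in the cluster.
kubectl get computeclass cost-optimized
```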

Set a default compute class for a namespace

When you set a default compute class for a namespace, GKE uses that compute class to create nodes for any Pods that you deploy in that namespace. If a Pod explicitly requests a different compute class, the Pod-level request overrides the namespace default.

To set a compute class as the default for a specific namespace, do the following:

  1. Create a namespace:

    kubectl create namespace cost-optimized-ns
    
  2. Label the namespace with the compute class:

    kubectl label namespaces cost-optimized-ns \
        cloud.google.com/default-compute-class=cost-optimized
    
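
To confirm the namespace default, you can inspect the namespace labels. A sketch:

```shell
# Verify that the namespace carries the default compute class label.
kubectl get namespace cost-optimized-ns --show-labels
```

Pods that you deploy in this namespace without a compute class node selector now use the cost-optimized compute class.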

Request a compute class in a workload

To request a compute class in a workload, add a node selector for that compute class in your manifest.

  1. Save the following manifest as cc-workload.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: custom-workload
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: custom-workload
      template:
        metadata:
          labels:
            app: custom-workload
        spec:
          nodeSelector:
            cloud.google.com/compute-class: cost-optimized
          containers:
          - name: test
            image: registry.k8s.io/pause
            resources:
              requests:
                cpu: 1.5
                memory: "4Gi"
    
  2. Deploy the workload:

    kubectl apply -f cc-workload.yaml
    

When you deploy this workload, GKE automatically adds a toleration to the Pods that corresponds to the node taint for the requested compute class. This toleration ensures that only Pods that request the compute class run on compute class nodes.
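
To inspect the toleration that GKE injected, you can print the toleration list of one of the workload's Pods. This is a sketch that reuses the app label from the preceding Deployment manifest:

```shell
# Show the tolerations on one of the workload's Pods, including the
# compute class toleration that GKE adds automatically.
kubectl get pods -l app=custom-workload \
    -o jsonpath='{.items[0].spec.tolerations}'
```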

What's next