ComputeClass


ComputeClass is a Kubernetes Custom Resource Definition (CRD) that lets you define configurations and fallback priorities for GKE node scaling decisions. To learn more, see About custom compute classes.

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: my-class
spec:
  activeMigration:
    optimizeRulePriority: false
  autoscalingPolicy:
    consolidationDelayMinutes: 1
    consolidationThreshold: 0
    gpuConsolidationThreshold: 0
  nodePoolAutoCreation:
    enabled: false
  priorityDefaults:
    nodeSystemConfig:
      linuxNodeConfig:
        sysctls:
          net.core.somaxconn: 256
      kubeletConfig:
        cpuCfsQuota: true
  priorities:
  - machineFamily: n2
    maxRunDurationSeconds: 360
    minCores: 16
    minMemoryGb: 64
    reservations:
      affinity: Specific
      specific:
      - name: n2-shared-reservation
        project: reservation-project
    spot: true
    storage:
      bootDiskKMSKey: projects/example/locations/us-central1/keyRings/example/cryptoKeys/key-1
      secondaryBootDisks:
      - diskImageName: pytorch-mnist
        project: k8s-staging-jobset
        mode: CONTAINER_IMAGE_CACHE
    nodeSystemConfig:
      linuxNodeConfig:
        sysctls:
          net.core.somaxconn: 512
  - machineType: n2-standard-32
    spot: true
    reservations:
      affinity: AnyBestEffort
    storage:
      bootDiskSize: 100
      bootDiskType: pd-balanced
      localSSDCount: 1
  - nodepools: ['example-first-nodepool-name', 'example-second-nodepool-name']
  - gpu:
      count: 1
      driverVersion: default
      type: nvidia-l4
  - tpu:
      count: 8
      topology: "2x4"
      type: tpu-v5-lite-device
  whenUnsatisfiable: ScaleUpAnyway
status:
  conditions:
  - lastTransitionTime: 2024-10-10T00:00:00Z
    message: example-message
    observedGeneration: 1
    reason: example-reason
    status: "True"
    type: example-type

ComputeClass specification

metadata:
  name: string
spec:
  activeMigration: object(activeMigration)
  autoscalingPolicy: object(autoscalingPolicy)
  nodePoolAutoCreation: object(nodePoolAutoCreation)
  priorities: [
    object(priorities)
  ]
  priorityDefaults: object(priorityDefaults)
  whenUnsatisfiable: string
Fields

metadata

required

object

A field that identifies the compute class.

metadata.name

optional

string

The name of the compute class.

spec

required

object

The compute class specification, which defines how the compute class works.

spec.activeMigration

optional

object (activeMigration)

A specification that lets you choose whether GKE automatically replaces existing nodes that are lower in a compute class priority list with new nodes that are higher in that priority list.

spec.autoscalingPolicy

optional

object (autoscalingPolicy)

A specification that lets you fine-tune the timing and thresholds that cause GKE to remove underused nodes and consolidate workloads on other nodes.

spec.nodePoolAutoCreation

optional

object(nodePoolAutoCreation)

A specification that lets you choose whether GKE can create and delete node pools in Standard mode clusters based on the compute class priority rules. Requires node auto-provisioning to be enabled on the cluster.

spec.priorities[]

required

object (priorities)

A list of priority rules that defines how GKE configures nodes during scaling operations. When a cluster needs to scale up, GKE tries to create nodes that match the first priority rule in this field. If GKE can't create those nodes, it attempts the next rule. The process repeats until GKE successfully creates nodes or exhausts all of the rules in the list.
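
The fallback order described above can be sketched as follows. This is a minimal example, not a recommended configuration: the name fallback-example is hypothetical, and the machine series and sizes are illustrative values only.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: fallback-example  # hypothetical name
spec:
  priorities:
  # Rule 1: GKE first tries Spot VMs from the n2 machine series.
  - machineFamily: n2
    spot: true
    minCores: 16
  # Rule 2: attempted only if GKE can't create nodes that match rule 1.
  - machineType: n2-standard-32
    spot: false
```

If neither rule can be met, the behavior depends on the spec.whenUnsatisfiable field.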

spec.priorityDefaults

optional

object (priorityDefaults)

Requires GKE version 1.32.1-gke.1729000 or later

A specification that sets default values for specific fields that are omitted in entries in the spec.priorities[] field. The default values that you specify in the priorityDefaults field apply only if the corresponding field isn't set in a specific priority rule.

The values in the priorityDefaults field don't apply to spec.priorities.nodepools[] fields.
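
As an illustration of how defaults and per-rule values interact, consider the following sketch. The name defaults-example is hypothetical, and the sysctl values are illustrative.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: defaults-example  # hypothetical name
spec:
  priorityDefaults:
    nodeSystemConfig:
      linuxNodeConfig:
        sysctls:
          net.core.somaxconn: 256
  priorities:
  # This rule omits nodeSystemConfig, so the default value (256) applies.
  - machineFamily: n2
  # This rule sets the field itself, so its own value (512) is used.
  - machineFamily: c3
    nodeSystemConfig:
      linuxNodeConfig:
        sysctls:
          net.core.somaxconn: 512
  # The defaults don't apply to rules that reference existing node pools.
  - nodepools: ['example-first-nodepool-name']
```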

spec.whenUnsatisfiable

optional

string

A specification that lets you define what GKE does if none of the rules in the spec.priorities[] field can be met. Supported values are as follows:

  • ScaleUpAnyway: create a new node that uses the default cluster node configuration. In GKE versions earlier than 1.33, this is the default behavior.
  • DoNotScaleUp: leave the Pod in the Pending status until GKE can create a node that meets the criteria of a priority rule. In GKE version 1.33 and later, this is the default behavior.
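
Because the default differs between GKE versions, setting the field explicitly can make the behavior predictable. The following is a minimal sketch with a hypothetical name and an illustrative priority rule.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: strict-example  # hypothetical name
spec:
  priorities:
  - machineFamily: n2
    spot: true
  # Keep Pods in the Pending status instead of falling back to the
  # default cluster node configuration.
  whenUnsatisfiable: DoNotScaleUp
```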

activeMigration

Choose whether GKE migrates workloads to higher priority nodes for the compute class as resources become available. For details, see Configure active migration to higher priority nodes.

activeMigration:
  optimizeRulePriority: boolean
Fields

optimizeRulePriority

required

boolean

Choose whether GKE migrates workloads to higher priority nodes when resources are available. If you omit this field, the default value is false.
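
A minimal sketch of a compute class that turns on active migration might look like the following. The name and the priority rules are illustrative assumptions.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: migration-example  # hypothetical name
spec:
  activeMigration:
    # Let GKE replace nodes that match the second rule with nodes that
    # match the first rule when those resources become available.
    optimizeRulePriority: true
  priorities:
  - machineFamily: n2
    spot: true
  - machineFamily: n2
    spot: false
```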

autoscalingPolicy

autoscalingPolicy:
  consolidationDelayMinutes: integer
  consolidationThreshold: integer
  gpuConsolidationThreshold: integer
Fields

consolidationDelayMinutes

optional

integer

The number of minutes after which GKE removes underutilized nodes. The value must be between 1 and 1440.

consolidationThreshold

optional

integer

The CPU and memory utilization threshold as a percentage of the total resources on the node. A node becomes eligible for removal only when the resource utilization is less than this threshold. The value must be between 0 and 100.

gpuConsolidationThreshold

optional

integer

The GPU utilization threshold as a percentage of the total GPU resources on the node. A node becomes eligible for removal only when the resource utilization is less than this threshold. The value must be between 0 and 100.

Consider setting this value to either 0 or 100 so that GKE consolidates nodes that don't use 100% of the attached GPUs.
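
The three fields can be combined as in the following sketch. The values are illustrative and stay within the documented ranges; the name is hypothetical.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: consolidation-example  # hypothetical name
spec:
  autoscalingPolicy:
    # Consolidation delay, in minutes (allowed range: 1 to 1440).
    consolidationDelayMinutes: 10
    # Nodes become eligible for removal when CPU and memory utilization
    # is less than 50% (allowed range: 0 to 100).
    consolidationThreshold: 50
    # GPU nodes become eligible for removal when GPU utilization is
    # less than 100%.
    gpuConsolidationThreshold: 100
  priorities:
  - machineFamily: n2
```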

gpu

gpu:
  count: integer
  driverVersion: string
  type: string
Fields

count

required

integer

The number of GPUs to attach to each node. The value must be at least 1.

driverVersion

optional

string

Requires GKE version 1.31.1-gke.1858000 or later

The NVIDIA driver version to install. The supported values are as follows:

  • default: install the default driver version for the node GKE version. If you omit this field, this is the default value.
  • latest: install the latest driver version for the node GKE version.

type

required

string

The GPU type to attach to each node, such as nvidia-l4.
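
For illustration, a priority rule that requests GPUs could be written as follows. The compute class name is hypothetical, and the driverVersion field assumes a cluster that meets the GKE version requirement stated above.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: gpu-example  # hypothetical name
spec:
  priorities:
  - gpu:
      type: nvidia-l4        # required
      count: 1               # required, at least 1
      driverVersion: latest  # optional; "default" is used if omitted
    spot: true
```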

hugepageConfig

hugepageConfig:
  hugepage_size1g: integer
  hugepage_size2m: integer
Fields

hugepage_size1g

optional

integer

The number of 1 GB huge pages to allocate. Huge pages are a memory management feature that can improve performance for memory-intensive applications. By using huge pages, the system can reduce the overhead associated with Translation Lookaside Buffer (TLB) misses. Allocating 1 GB huge pages is beneficial for workloads that require large, contiguous memory allocations, such as large databases or in-memory computing. If you specify this field, the value must be at least 1. For limitations and requirements, see Linux huge page configuration options.

hugepage_size2m

optional

integer

The number of 2 MB huge pages to allocate. Like 1 GB huge pages, 2 MB huge pages can improve performance by reducing TLB misses. However, 2 MB huge pages are better suited to applications with smaller memory requirements, because they balance performance improvement with memory flexibility. If you specify this field, the value must be at least 1. For limitations and requirements, see Linux huge page configuration options.
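
The following sketch shows where the hugepageConfig field sits inside a priority rule. The page counts are illustrative assumptions; whether they fit depends on the memory of the selected machine type and on the limitations in Linux huge page configuration options.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: hugepages-example  # hypothetical name
spec:
  priorities:
  - machineFamily: n2
    minMemoryGb: 64
    nodeSystemConfig:
      linuxNodeConfig:
        hugepageConfig:
          hugepage_size1g: 2     # two 1 GB pages (illustrative)
          hugepage_size2m: 1024  # 1024 x 2 MB pages (illustrative)
```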

kubeletConfig

kubeletConfig:
  cpuCfsQuota: boolean
  cpuCfsQuotaPeriod: string
  cpuManagerPolicy: string
  podPidsLimit: integer
Fields

cpuCfsQuota

optional

boolean

Enables CPU Completely Fair Scheduler (CFS) quota enforcement for containers that specify CPU limits. The following values are supported:

  • true: enable CPU CFS quota enforcement. The kubelet uses the kernel CFS quota mechanism to enforce Pod CPU limits. Pods might move to different CPU cores depending on CPU throttling and core availability.
  • false: disable CPU CFS quota enforcement. Pod CPU limits are ignored, which might be useful if Pods are sensitive to CPU limits. However, disabling CPU CFS quota enforcement increases the risk of a Pod consuming more CPU resources than intended.

The default value is true.

cpuCfsQuotaPeriod

optional

string

Sets the CFS quota period, which the kernel tracks in microseconds. This period defines how often the kernel reallocates CPU resources to each control group (cgroup). You can use this value to tune CPU throttling behavior.

Specify the value as a duration string between 1ms and 1s. The default value is 100ms.

cpuManagerPolicy

optional

string

Controls the kubelet CPU management policy. Some Pods are more sensitive to CPU limits. In these Pods, the latency of CPU reassignment during CFS quota enforcement might impact performance. The kubelet CPU Manager feature provides exclusive CPU access to specific Pods. The kubelet doesn't enforce CPU limits for those Pods even if CPU CFS quota enforcement is enabled.

The following values are supported:

  • none: disables the CPU Manager feature. The kubelet assigns CPUs to Pods based on the CFS quota settings.
  • static: provides containers that meet all of the following criteria with access to exclusive CPUs on the node:
    • The Pod has the Guaranteed Quality-of-Service (QoS) class.
    • The container has integer values in the requests.cpu field.

The default value is none.

For more information, see the Kubernetes documentation about policies for assigning CPUs to Pods.

podPidsLimit

optional

integer

Sets the maximum number of process IDs (PIDs) that each Pod can use. This setting controls the maximum number of processes and threads that can run simultaneously in a Pod. The value must be between 1024 and 4194304. The default value is 4096.
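
As a sketch, the kubelet settings above could be combined in a priority rule as follows. The values are illustrative and stay within the documented ranges; the name is hypothetical.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: kubelet-example  # hypothetical name
spec:
  priorities:
  - machineFamily: n2
    nodeSystemConfig:
      kubeletConfig:
        cpuCfsQuota: true           # enforce Pod CPU limits (default)
        cpuCfsQuotaPeriod: "100ms"  # between 1ms and 1s
        cpuManagerPolicy: static    # exclusive CPUs for eligible containers
        podPidsLimit: 8192          # between 1024 and 4194304
```

With the static policy, only containers in Guaranteed QoS Pods that request an integer number of CPUs get exclusive CPUs.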

linuxNodeConfig

linuxNodeConfig:
  hugepageConfig: object (hugepageConfig)
  sysctls: object (sysctls)
Fields

hugepageConfig

optional

object (hugepageConfig)

The huge pages configuration for the node.

sysctls

optional

object (sysctls)

The sysctls configuration for the node.

nodePoolAutoCreation

nodePoolAutoCreation:
  enabled: boolean
Fields

enabled

optional

boolean

Choose whether GKE can create and delete node pools in Standard mode clusters based on the compute class priority rules. Requires node auto-provisioning to be enabled on the cluster. If you omit this field, the default value is false.
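
A minimal sketch of enabling this setting follows. It assumes a Standard mode cluster with node auto-provisioning already enabled; the name and priority rule are illustrative.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: autocreate-example  # hypothetical name
spec:
  nodePoolAutoCreation:
    # Allow GKE to create and delete node pools for this compute class.
    enabled: true
  priorities:
  - machineFamily: n2
    minCores: 8
```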

nodeSystemConfig

nodeSystemConfig:
  kubeletConfig: object(kubeletConfig)
  linuxNodeConfig: object(linuxNodeConfig)
Fields

kubeletConfig

optional

object (kubeletConfig)

The kubelet configuration for the node.

linuxNodeConfig

optional

object (linuxNodeConfig)

The Linux kernel configuration for the node.

priorities

- gpu: object(gpu)
  spot: boolean
- machineFamily: string
  maxRunDurationSeconds: integer
  minCores: integer
  minMemoryGb: integer
  reservations: object(reservations)
  spot: boolean
  storage: object(storage)
  nodeSystemConfig: object(nodeSystemConfig)
- machineType: string
  maxRunDurationSeconds: integer
  reservations: object(reservations)
  spot: boolean
  storage: object(storage)
  nodeSystemConfig: object(nodeSystemConfig)
- nodepools: []string
- tpu: object(tpu)
  reservations: object(reservations)
  spot: boolean
  storage: object(storage)
Fields

gpu

optional

object(gpu)

The GPU configuration.

machineFamily

optional

string

The Compute Engine machine series to use, such as n2 or c3. If you don't specify a value, GKE uses the default machine series of the cluster.

machineType

optional

string

The predefined Compute Engine machine type to use, such as n2-standard-32.

maxPodsPerNode

optional

integer

The maximum number of Pods that GKE can place on each node. The value must be between 8 and 256.

maxRunDurationSeconds

optional

integer

The maximum duration, in seconds, that the nodes can exist before being shut down. If you omit this field, the nodes can exist indefinitely.

minCores

optional

integer

The minimum number of vCPU cores that each node can have. If you omit this field, the default value is 0.

minMemoryGb

optional

integer

The minimum memory capacity, in GiB, that each node can have. If you omit this field, the default value is 0.

nodepools

optional

[]string

A list of existing manually created node pools in Standard mode clusters. You must associate these node pools with the compute class by using node labels and node taints. GKE doesn't process the node pools in this list in any order.

Example: nodepools: ['example-first-nodepool-name', 'example-second-nodepool-name']

reservations

optional

object (reservations)

The Compute Engine capacity reservations to consume during node provisioning.

spot

optional

boolean

The Spot VMs configuration. If you set this field to true, GKE uses Spot VMs to create your nodes. If you omit this field, the default value is false.

storage

optional

object (storage)

The boot disk configuration of each node.

tpu

optional

object (tpu)

The TPU configuration.

nodeSystemConfig

optional

object (nodeSystemConfig)

Requires GKE version 1.32.1-gke.1729000 or later

The node system configuration.

priorityDefaults

priorityDefaults:
  nodeSystemConfig: object(nodeSystemConfig)
Fields

nodeSystemConfig

optional

object (nodeSystemConfig)

Default values for the node system configuration. These values apply to a priority rule in the spec.priorities field only if the corresponding fields are omitted from that rule.

reservations

reservations:
  affinity: string
  specific: [
    object(specific)
  ]
Fields

affinity

required

string

The type of reservation to consume when creating nodes. The following values are supported:

  • Specific: consume only specific named reservations. If the specified reservation doesn't have any capacity, GKE moves on to the next priority rule in the compute class. If you use this value, the specific[] field is required.
  • AnyBestEffort: consume any reservation that matches the requirements of the priority rule. If no matching reservation has available capacity, GKE tries to provision an on-demand node with the priority rule configuration.
  • None: prevent GKE from consuming reservations when it creates nodes for that priority rule.

specific

optional*

object(specific)

The parameters for consuming specific reservations. If you set the affinity field to Specific, this field is required. If you set the affinity field to any other value, you can't specify the specific field.
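
The relationship between the affinity and specific fields can be sketched as follows. The compute class name is hypothetical; the reservation and project names reuse the placeholder values from the example manifest at the top of this page.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: reservation-example  # hypothetical name
spec:
  priorities:
  # Rule 1: consume only the named reservation. Because affinity is
  # Specific, the specific[] field is required.
  - machineFamily: n2
    reservations:
      affinity: Specific
      specific:
      - name: n2-shared-reservation
        project: reservation-project
  # Rule 2: don't consume any reservation. The specific field must be omitted.
  - machineFamily: n2
    reservations:
      affinity: None
```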

secondaryBootDisks

secondaryBootDisks:
- diskImageName: string
  mode: string
  project: string
Fields

diskImageName

required

string

The name of the disk image.

mode

optional

string

The mode in which the secondary boot disk should be used. The following values are supported:

  • CONTAINER_IMAGE_CACHE: use the disk as a container image cache.
  • MODE_UNSPECIFIED: don't use a specific mode. If you omit this field, this is the default value.

project

optional

string

The project ID of the Google Cloud project that the disk image belongs to. If you omit this field, the default value is the project ID of the cluster project.
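
For illustration, a secondary boot disk that serves as a container image cache could be declared as follows. The compute class name is hypothetical; the image and project names reuse the placeholder values from the example manifest at the top of this page.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: image-cache-example  # hypothetical name
spec:
  priorities:
  - machineFamily: n2
    storage:
      secondaryBootDisks:
      - diskImageName: pytorch-mnist  # required
        project: k8s-staging-jobset   # optional; defaults to the cluster project
        mode: CONTAINER_IMAGE_CACHE
```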

specific

specific:
- name: string
  project: string
Fields

name

required

string

The name of the specific reservation to consume.

project

optional

string

The project ID of the Google Cloud project that contains the specific reservation. To use a shared reservation from a different project, this field is required.

storage

storage:
  bootDiskKMSKey: string
  bootDiskSize: integer
  bootDiskType: string
  localSSDCount: integer
  secondaryBootDisks: [
    object(secondaryBootDisks)
  ]
Fields

bootDiskKMSKey

optional

string

The path to the Cloud KMS key to use to encrypt the boot disk.

bootDiskSize

optional

integer

The size, in GiB, of the boot disk for each node. The minimum value is 10.

bootDiskType

optional

string

The type of disk to attach to the node. The value that you specify must be supported by the machine series or the machine type in your priority rule. The following values are supported:

  • pd-balanced: balanced Persistent Disk.
  • pd-standard: standard Persistent Disk.
  • pd-ssd: performance (SSD) Persistent Disk.
  • hyperdisk-balanced: Hyperdisk Balanced.

For details about the disk types that specific machine series support, see the Machine series comparison table. Filter the table properties for "Hyperdisk" and "PD".

localSSDCount

optional

integer

The number of Local SSDs to attach to each node. If you specify this field, the minimum value is 1.

secondaryBootDisks[]

optional

[]object(secondaryBootDisks)

Requires GKE version 1.31.2-gke.1105000 or later

The configuration of secondary boot disks that are used to preload nodes with data, such as ML models or container images.
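
A sketch of the boot disk and Local SSD fields in a priority rule follows. The values are illustrative, the name is hypothetical, and the disk type must be one that the chosen machine type supports.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: storage-example  # hypothetical name
spec:
  priorities:
  - machineType: n2-standard-32
    storage:
      bootDiskType: pd-balanced
      bootDiskSize: 100  # GiB; the minimum is 10
      localSSDCount: 1   # the minimum is 1 when specified
```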

sysctls

sysctls:
  net.core.netdev_max_backlog: integer
  net.core.rmem_max: integer
  net.core.wmem_default: integer
  net.core.wmem_max: integer
  net.core.optmem_max: integer
  net.core.somaxconn: integer
  net.ipv4.tcp_rmem: string
  net.ipv4.tcp_wmem: string
  net.ipv4.tcp_tw_reuse: integer
  net.core.busy_poll: integer
  net.core.busy_read: integer
  net.ipv6.conf.all.disable_ipv6: boolean
  net.ipv6.conf.default.disable_ipv6: boolean
  vm.max_map_count: integer
Fields

net.core.netdev_max_backlog

optional

integer

The maximum number of packets that can be queued on the input side when the interface receives packets faster than the kernel can process them. This setting is crucial for high-traffic network interfaces. Increasing this value can help to prevent packet loss under heavy load, but it also increases memory consumption. The value must be between 1 and 2147483647.

net.core.rmem_max

optional

integer

The maximum receive socket buffer size in bytes. This setting limits the amount of data that a socket can buffer when receiving data. Increasing this value can improve performance for high-bandwidth connections by allowing the socket to handle larger bursts of data. The value must be between 2304 and 2147483647.

net.core.wmem_default

optional

integer

The default setting, in bytes, of the socket send buffer. This value defines the initial size of the buffer allocated for sending data on a socket. The value must be between 4608 and 2147483647.

net.core.wmem_max

optional

integer

The maximum send socket buffer size in bytes. This setting limits the amount of data that a socket can buffer when it sends data. Increasing this value can improve performance for applications that send large amounts of data. The value must be between 4608 and 2147483647.

net.core.optmem_max

optional

integer

The maximum ancillary buffer size allowed per socket. Ancillary data is a sequence of cmsghdr structures with appended data, which are used to send and receive control information along with socket data. The value must be between 1 and 2147483647.

net.core.somaxconn

optional

integer

The maximum size of the socket listen() backlog. This setting defines the maximum number of pending connections that can be queued for a listening socket. You might need to increase this value for busy web servers or other network services that handle a high volume of concurrent connection requests. The value must be between 128 and 2147483647. The default value is 128.

net.ipv4.tcp_rmem

optional

string

The sizes, in bytes, of the receive buffer that's used by TCP sockets. The value in this field is a string of three integers, separated by spaces. These three integers represent the minimum, default, and maximum size of the receive buffer. For example, '4096 87380 6291456'. Each TCP socket can use the minimum size for receiving data, even if the total pages of TCP sockets exceed tcp_mem pressure.

net.ipv4.tcp_wmem

optional

string

The sizes, in bytes, of the send buffer that's used by TCP sockets. The value in this field is a string of three integers, separated by spaces. These three integers represent the minimum, default, and maximum size of the send buffer. For example, '4096 87380 6291456'. Each TCP socket can use the minimum size for sending data, even if the total pages of TCP sockets exceed tcp_mem pressure. These values control how TCP socket send buffers dynamically resize for sending.

net.ipv4.tcp_tw_reuse

optional

integer

Reuses TIME-WAIT sockets for new connections when the protocol considers it safe to do so. This setting can improve performance by reducing the number of sockets in the TIME-WAIT state, but it also carries a risk of potential data corruption if it isn't used carefully. Don't change this setting unless a technical expert requests the change. The following values are supported:

  • 0
  • 1
  • 2

net.core.busy_poll

optional

integer

The approximate time, in microseconds, to wait for packets on the device queue to do socket polls or selects. This setting is related to network performance tuning for low-latency applications. The value must be between 1024 and 2147483647.

net.core.busy_read

optional

integer

The approximate time, in microseconds, to wait for packets on the device queue to perform read operations. This setting is used for low-latency network tuning, specifically for read operations. The value must be between 0 and 2147483647.

net.ipv6.conf.all.disable_ipv6

optional

boolean

Globally disables IPv6 on all future and existing interfaces. Changing this value has the same effect as changing the net.ipv6.conf.default.disable_ipv6 setting and all per-interface disable_ipv6 settings to a specific value. The following values are supported:

  • true: disable IPv6.
  • false: enable IPv6.

net.ipv6.conf.default.disable_ipv6

optional

boolean

Disables IPv6 operations on all future network interfaces. The following values are supported:

  • true: disable IPv6.
  • false: enable IPv6.

vm.max_map_count

optional

integer

Limits the number of distinct memory regions that a process can map into its address space. You might need to increase this value for applications that require a large number of shared libraries or that perform extensive memory mapping.

The value must be between 65536 and 2147483647.
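
The following sketch shows several sysctls set together in a priority rule. The values are illustrative assumptions that stay within the documented ranges; they are not tuning recommendations. The compute class name is hypothetical.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: sysctl-example  # hypothetical name
spec:
  priorities:
  - machineFamily: n2
    nodeSystemConfig:
      linuxNodeConfig:
        sysctls:
          net.core.somaxconn: 1024
          net.core.netdev_max_backlog: 5000
          # Three space-separated integers: minimum, default, maximum.
          net.ipv4.tcp_rmem: '4096 87380 6291456'
          net.ipv4.tcp_wmem: '4096 87380 6291456'
          vm.max_map_count: 262144
```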

tpu

- tpu:
    count: integer
    topology: string
    type: string
Fields

count

required

integer

The number of TPUs to attach to the node.

topology

required

string

The TPU topology to use, such as "2x2x1".

type

required

string

The TPU type to use, such as tpu-v6e-slice.
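
As a sketch, a priority rule that requests TPUs could look like the following. The compute class name is hypothetical; the TPU values reuse the ones from the example manifest at the top of this page, and valid combinations of type, count, and topology depend on the TPU version.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: tpu-example  # hypothetical name
spec:
  priorities:
  - tpu:
      type: tpu-v5-lite-device  # required
      count: 8                  # required
      topology: "2x4"           # required; quoted so that it stays a string
    spot: true
```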

ComputeClass status

The status field is a list of status messages. This field is informational and is updated by the Kubernetes API server and the kubelet on each node.

status:
  conditions: [
    object(conditions)
  ]
Fields

conditions[]

object(conditions)

List of status conditions for the ComputeClass object.

conditions

conditions:
- type: string
  status: string
  reason: string
  message: string
  lastTransitionTime: string
  observedGeneration: integer
Fields

type

string

The type of condition, which helps to organize status messages.

status

string

The status of the condition. The value is one of the following:

  • True
  • False
  • Unknown

reason

string

A machine-readable reason why a specific condition type made its most recent transition.

message

string

A human-readable message that provides details about the most recent transition. This field might be empty.

lastTransitionTime

string

The timestamp of the most recent change to the condition.

observedGeneration

integer

A count of how many times the ComputeClass controller observed a change to the ComputeClass object. The controller attempts to reconcile the value in this field with the value in the metadata.generation field, which the Kubernetes API server updates whenever a change is made to the ComputeClass API object.