Optimize Autopilot Pod performance by choosing a machine series

Autopilot

This page shows you how to place workloads on a specific Compute Engine machine series for optimal workload performance in your Google Kubernetes Engine (GKE) Autopilot clusters.

Ensure that you're familiar with the following:

Compute Engine machine series and use cases
Kernel-level requirements for your applications

How machine series selection works

You can add a cloud.google.com/machine-family node selector to your Pod specification for Autopilot to allocate specific Compute Engine hardware for that Pod. For example, you can choose the C3 machine series for Pods that need more CPU power, or the N1 machine series for Pods that need more memory. To optimally run your workload, Autopilot provisions one of the predefined machine types from the machine series you selected.

In addition to optimal Pod performance, choosing a specific machine series offers the following benefits:

Efficient node utilization: By default, Autopilot optimizes node resource usage by scheduling onto each node as many Pods as possible that request the same machine series. This approach optimizes resource usage on the node, which improves the price-to-performance ratio. If your workload needs access to all of the resources on the node, you optionally can configure your workload to request one Pod for each node.
Burstable workloads: You can configure Pods to burst into unused resource capacity on the node by setting your resource limits higher than your requests. For details, see Configure Pod bursting in GKE.

Plan a dedicated node for each Pod

If you have CPU-intensive workloads that need reliable access to all of the node resources, you can optionally configure your Pod to get Autopilot to place a Pod that requests a machine series on its own node.

Dedicated nodes per Pod are recommended when you run large-scale, CPU-intensive workloads, like AI/ML training workloads or high performance computing (HPC) batch workloads.

Choose between multiple-Pod and single-Pod scheduling

Use the following guidance to choose a Pod scheduling behavior based on your requirements:

If you have Pods that can share compute resources with other Pods or you want to optimize costs while running Pods on specialized hardware, use the default scheduling behavior of multiple Pods per node.
If you have Pods that need reliable access to full node resources or you want to minimize the chance of disruptions caused by sharing compute resources, Request a dedicated node for each Pod.

Pricing

You're billed for the underlying VM and any attached hardware by Compute Engine, plus a premium for Autopilot node management and scalability. For details, see GKE pricing.

Before you begin

Before you start, make sure that you have performed the following tasks:

Enable the Google Kubernetes Engine API.

Enable Google Kubernetes Engine API

If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Note: For existing gcloud CLI installations, make sure to set the compute/region property. If you use primarily zonal clusters, set the compute/zone instead. By setting a default location, you can avoid errors in the gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location. You might need to specify the location in certain commands if the location of your cluster differs from the default that you set.

Ensure that you have an existing Autopilot cluster running version 1.30.1-gke.1396000 or later. To create a cluster, see Create an Autopilot cluster.

Select a machine series

This section shows you how to select a specific Compute Engine machine series in a Pod.

Save the following manifest as machine-series-pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: machine-series-pod
spec:
  nodeSelector:
    cloud.google.com/machine-family: MACHINE_SERIES
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 5
        memory: "25Gi"
      limits:
        cpu: 20
        memory: 100Gi

Replace MACHINE_SERIES with the Compute Engine machine series for your Pod, like c3. For supported values, see Supported machine series in this page.

Deploy the Pod:

kubectl apply -f machine-series-pod.yaml

This manifest lets Autopilot optimize node resource usage by efficiently scheduling other Pods that select the same machine series onto the same node if there's available capacity.

Use Local SSDs

Pods that select a machine series can use Local SSDs for ephemeral storage if you specify a machine series that offers Local SSD. Autopilot considers ephemeral storage requests when choosing a Compute Engine machine type for the Pod.

To learn more, see Use Local SSD-backed ephemeral storage with Autopilot clusters.

Request a dedicated node for a Pod

If your Pod has specific performance requirements like needing reliable access to all of the resources of your node, you can request a dedicated node for each Pod by specifying the cloud.google.com/compute-class: Performance node selector along with your machine series node selector. This indicates to Autopilot to place your Pod on a new node that uses the specified machine series and is dedicated for that Pod. This node selector also prevents Autopilot from scheduling other Pods on that node.

Save the following manifest as dedicated-node-pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: dedicated-node-pod
spec:
  nodeSelector:
    cloud.google.com/machine-family: MACHINE_SERIES
    cloud.google.com/compute-class: Performance
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 12
        memory: "50Gi"
        ephemeral: "200Gi"

Replace MACHINE_SERIES with a supported machine series that also supports one Pod per node scheduling. If the specified machine series doesn't support one Pod per node scheduling, the deployment fails with an error.

Deploy the Pod:

kubectl apply -f dedicated-node-pod.yaml

When you deploy this manifest, Autopilot does the following:

Ensures that the deployed Pod requests at least the minimum resources for the performance-optimized node.
Calculates the total resource requests of the deployed Pod and any DaemonSets in the cluster.
Provisions a node that's backed by the selected machine series.
Modifies the Pod manifest with a combination of node selectors and tolerations to ensure that the Pod runs on its own node.

Supported machine series

The machine-family selector supports the following machine series:

Machine series	Node selector value	Supports multiple Pods per node	Supports single Pod per node
C4D machine series (see Version requirements)	`c4d`
C4A machine series	`c4a`
C4 machine series (default)	`c4`
C3 machine series	`c3`
C3D machine series	`c3d`
C2 machine series	`c2`
C2D machine series	`c2d`
H3 machine series	`h3`
T2D machine series	`t2d`
T2A machine series	`t2a`
E2 machine series	`e2`
N4 machine series	`n4`
N2 machine series	`n2`
N2D machine series	`n2d`
N1 machine series	`n1`
Z3 machine series	`z3`

Note that c4 is the default if machine series is not specified and if c4 is available in a region.

To compare these machine series and their use cases, see Machine series comparison in the Compute Engine documentation.

Version requirements

The C4D machine series is available with the following versions and configurations:

C4D machine types without Local SSD: GKE version 1.33.0-gke.1439000 and later.
C4D machine types with Local SSD: GKE version 1.33.1-gke.1171000 and later.

Compatibility with other GKE features

Pods that select a machine series can use the GKE capabilities and features supported by that machine series, such as:

Spot Pods
Extended run time Pods (only with dedicated nodes per Pod)
Workload separation
Capacity reservations
Committed-use discounts

Spot Pods and extended run time Pods are mutually exclusive. GKE doesn't enforce higher minimum resource requests for dedicated Pods per node, even though they use workload separation.

How GKE selects a machine type

To select a machine type in the specified machine series, GKE calculates the total CPU, total memory, and total ephemeral storage requests of the Pods and any DaemonSets that will run on the new node. GKE rounds these values up to the nearest available Compute Engine machine type that supports all of these totals.

Example 1: Consider a Deployment with four replicas that selects the C3D machine series. You don't request dedicated nodes per Pod. The resource requests of each replica are as follows:
- 500m vCPU (0.5 vCPU)
- 1 GiB of memory
Autopilot places all four Pods on a node that's backed by the c3d-standard-4 machine type, which has 4 vCPUs and 16 GB of memory.
Example 2: Consider a Pod that selects the C3D machine series and Local SSDs for ephemeral storage. You request a dedicated node for the Pod. The total resource requests including DaemonSets are as follows:
- 12 vCPU
- 50 GiB of memory
- 200 GiB of ephemeral storage
Autopilot places the Pod on a node that uses the c3d-standard-16-lssd machine type, which has 16 vCPUs, 64 GiB of memory, and 365 GiB of Local SSD capacity.

What's next

For guidance about the compute options that Autopilot offers for various use cases, see Compute classes in Autopilot.
Deploy GPU-based workloads in Autopilot.