This page shows you how to place workloads on specific Compute Engine machine series for optimal workload performance in your Google Kubernetes Engine (GKE) Autopilot clusters.
Before you read this page, ensure that you're familiar with the following:
- Compute Engine machine series and use cases
- Kernel-level requirements for your applications
How machine series selection works
You can add a cloud.google.com/machine-family node selector to your Pod specification so that Autopilot allocates specific Compute Engine hardware for that Pod. For example, you can choose the C3 machine series for Pods that need more CPU power, or a memory-optimized series like M3 for Pods that need more memory. Autopilot provisions one of the predefined machine types from the selected machine series to optimally run your workload.
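For example, the following excerpt shows where the selector goes in a Pod specification (a minimal sketch; complete manifests appear later on this page):

```yaml
# Excerpt: ask Autopilot for nodes from the C3 machine series.
spec:
  nodeSelector:
    cloud.google.com/machine-family: c3
```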
In addition to optimal Pod performance, choosing a specific machine series offers the following benefits:
- Efficient node utilization: By default, Autopilot schedules as many Pods as possible that request the same machine series onto each node. This approach optimizes resource usage on the node, which improves the price-to-performance ratio. If your workload needs access to all of the resources on the node, you can instead configure your workload to request one Pod for each node.
- Burstable workloads: You can configure Pods to burst into unused resource capacity on the node by setting your resource limits higher than your requests, as the machine-series-pod.yaml example later on this page does. For details, see Configure Pod bursting in GKE.
Request a dedicated node for each Pod
If you have CPU-intensive workloads that need reliable access to all of the node resources, you can configure your Pod so that Autopilot places it on its own node that uses the requested machine series.
Dedicated nodes per Pod are recommended when you run large-scale, CPU-intensive workloads, like AI/ML training workloads or high performance computing (HPC) batch workloads.
Choose between multiple-Pod and single-Pod scheduling
Use the following guidance to choose a Pod scheduling behavior based on your requirements:
- If you have Pods that can share compute resources with other Pods or you want to optimize costs while running Pods on specialized hardware, use the default scheduling behavior of multiple Pods per node.
- If you have Pods that need reliable access to full node resources or you want to minimize the chance of disruptions caused by sharing compute resources, request a dedicated node for each Pod.
Pricing
You're billed for the underlying VM and any attached hardware by Compute Engine, plus a premium for Autopilot node management and scalability. For details, see GKE pricing.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
- Ensure that you have an existing Autopilot cluster running version 1.30.1-gke.1396000 or later. To create a cluster, see Create an Autopilot cluster.
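To check the version of an existing cluster, you can run a command like the following (a sketch; CLUSTER_NAME and LOCATION are placeholders for your cluster's name and location):

```
gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION \
    --format="value(currentMasterVersion)"
```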
Select a machine series
This section shows you how to select a specific Compute Engine machine series in a Pod.
Save the following manifest as machine-series-pod.yaml:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: machine-series-pod
spec:
  nodeSelector:
    cloud.google.com/machine-family: MACHINE_SERIES
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 5
        memory: "25Gi"
      limits:
        cpu: 20
        memory: 100Gi
```
Replace MACHINE_SERIES with the Compute Engine machine series for your Pod, like c3. For supported values, see Supported machine series on this page.

Deploy the Pod:
```
kubectl apply -f machine-series-pod.yaml
```
This manifest lets Autopilot optimize node resource usage by efficiently scheduling other Pods that select the same machine series onto the same node if there's available capacity.
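To verify the placement, you can check which node the Pod was scheduled on and list the machine-family labels of your nodes, for example:

```
kubectl get pod machine-series-pod -o wide
kubectl get nodes -L cloud.google.com/machine-family
```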
Use Local SSDs
Pods that select a machine series can use Local SSDs for ephemeral storage if you specify a machine series that offers Local SSD. Autopilot considers ephemeral storage requests when choosing a Compute Engine machine type for the Pod.
Save the following manifest as local-ssd-pod.yaml:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: local-ssd-pod
spec:
  nodeSelector:
    cloud.google.com/machine-family: MACHINE_SERIES
    cloud.google.com/gke-ephemeral-storage-local-ssd: "true"
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 6
        memory: "25Gi"
        ephemeral-storage: "100Gi"
      limits:
        cpu: 12
        memory: "50Gi"
        ephemeral-storage: "200Gi"
```
Replace MACHINE_SERIES with a supported machine series that also supports Local SSDs. If your specified machine series doesn't support Local SSDs, the deployment fails with an error.

Deploy the Pod:
```
kubectl apply -f local-ssd-pod.yaml
```
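To confirm that the node offers Local SSD-backed ephemeral storage, you can list your nodes with the label that this manifest selects, for example:

```
kubectl get nodes -L cloud.google.com/gke-ephemeral-storage-local-ssd
```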
Request a dedicated node for a Pod
If your Pod has specific performance requirements, like needing reliable access to all of the resources of its node, you can request a dedicated node for each Pod by specifying the cloud.google.com/compute-class: Performance node selector along with your machine series node selector. This tells Autopilot to place your Pod on a new node that uses the specified machine series and is dedicated to that Pod. The selector also prevents Autopilot from scheduling other Pods on that node.
Save the following manifest as dedicated-node-pod.yaml:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dedicated-node-pod
spec:
  nodeSelector:
    cloud.google.com/machine-family: MACHINE_SERIES
    cloud.google.com/compute-class: Performance
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 12
        memory: "50Gi"
        ephemeral-storage: "200Gi"
```
Replace MACHINE_SERIES with a supported machine series that also supports one Pod per node scheduling. If the specified machine series doesn't support one Pod per node scheduling, the deployment fails with an error.

Deploy the Pod:
```
kubectl apply -f dedicated-node-pod.yaml
```
When you deploy this manifest, Autopilot does the following:
- Ensures that the deployed Pod requests at least the minimum resources for the performance-optimized node.
- Calculates the total resource requests of the deployed Pod and any DaemonSets in the cluster.
- Provisions a node that's backed by the selected machine series.
- Modifies the Pod manifest with a combination of node selectors and tolerations to ensure that the Pod runs on its own node.
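To confirm this placement, you can inspect the running Pod and look at the node selectors and tolerations that Autopilot added, for example:

```
kubectl get pod dedicated-node-pod -o yaml
```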
Supported machine series
The machine-family selector supports the following machine series:
[Table of supported machine series; some series are available in Preview with an allowlist *, and for some series, Local SSDs are always bundled.]

* This feature requires you to be added to an allowlist. To receive access, contact your account team.
To compare these machine series and their use cases, see Machine series comparison in the Compute Engine documentation.
Compatibility with other GKE features
You can use Pods that select machine series with the following GKE capabilities and features:
- Spot Pods
- Extended run time Pods (only with dedicated nodes per Pod)
- Workload separation
- Capacity reservations
- Committed-use discounts
Spot Pods and extended run time Pods are mutually exclusive. GKE doesn't enforce higher minimum resource requests for Pods that request dedicated nodes, even though those Pods use workload separation.
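For example, the following excerpt is a sketch of combining a machine series selection with Spot Pods; it assumes the cloud.google.com/gke-spot node selector that Autopilot uses for Spot Pods:

```yaml
# Excerpt: run this Pod on Spot capacity from the chosen machine series.
spec:
  nodeSelector:
    cloud.google.com/machine-family: c3
    cloud.google.com/gke-spot: "true"
```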
How GKE selects a machine type
To select a machine type in the specified machine series, GKE calculates the total CPU, total memory, and total ephemeral storage requests of the Pods and any DaemonSets that will run on the new node. GKE rounds these values up to the nearest available Compute Engine machine type that supports all of these totals.
Example 1: Consider a Deployment with four replicas that selects the C3D machine series. You don't request dedicated nodes per Pod. The resource requests of each replica are as follows:
- 500m vCPU
- 1 GiB of memory
Autopilot places all of the Pods on a node that's backed by the c3d-standard-4 machine type, which has 4 vCPUs and 16 GiB of memory. The four replicas together request 2 vCPUs (4 x 500m) and 4 GiB of memory; GKE adds the DaemonSet requests and rounds up to a C3D machine type that supports the totals.

Example 2: Consider a Pod that selects the C3D machine series and Local SSDs for ephemeral storage. You request a dedicated node for the Pod. The total resource requests, including DaemonSets, are as follows:
- 12 vCPU
- 50 GiB of memory
- 200 GiB of ephemeral storage
Autopilot places the Pod on a node that uses the c3d-standard-16-lssd machine type, which has 16 vCPUs, 64 GiB of memory, and 365 GiB of Local SSD capacity.
What's next
- For guidance about the compute options that Autopilot offers for various use cases, see Compute classes in Autopilot.
- Deploy GPU-based workloads in Autopilot.