About GPUs

You can attach graphics processing units (GPUs) to your virtual machine (VM) instance to accelerate specific workloads on Compute Engine.

This document describes the features and limitations of GPUs running on Compute Engine.

GPUs and machine series

GPUs are supported on the N1 general-purpose machine series and on the accelerator-optimized machine series (A3, A2, and G2). For VMs that use N1 machine types, you attach the GPU to the VM during or after VM creation. For VMs that use A3, A2, or G2 machine types, the GPUs are automatically attached when you create the VM. GPUs can't be used with other machine series.

Accelerator-optimized machine series

Each accelerator-optimized machine type has a specific model of NVIDIA GPUs attached.

  • For A3 accelerator-optimized machine types, NVIDIA H100 80GB GPUs are attached. These are available in both H100 80GB and H100 80GB Mega options.
  • For A2 accelerator-optimized machine types, NVIDIA A100 GPUs are attached. These are available in both A100 40GB and A100 80GB options.
  • For G2 accelerator-optimized machine types, NVIDIA L4 GPUs are attached.

For more information, see Accelerator-optimized machine series.

N1 general-purpose machine series

For all other GPU types, you can use most N1 machine types, except for N1 shared-core machine types.

For this machine series, you can use either predefined or custom machine types.
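The rules above can be sketched as a small lookup. This is an illustrative sketch, not an official API: the function name gpu_support is hypothetical, and it assumes that the series is the prefix of the machine type name, that f1-micro and g1-small are the N1 shared-core machine types, and that N1 custom machine types are named custom-<vCPUs>-<memory MB>.

```python
# GPU models per accelerator-optimized series, taken from the text above.
ACCELERATOR_OPTIMIZED_GPUS = {
    "a3": "NVIDIA H100 80GB",
    "a2": "NVIDIA A100",
    "g2": "NVIDIA L4",
}

# Assumption: these two names are the N1 shared-core machine types.
N1_SHARED_CORE = {"f1-micro", "g1-small"}


def gpu_support(machine_type: str) -> str:
    """Describe how GPUs relate to a machine type such as 'n1-standard-8'."""
    name = machine_type.strip().lower()
    if name in N1_SHARED_CORE:
        return "unsupported: N1 shared-core"
    # Assumption: the series is the part of the name before the first hyphen.
    series = name.split("-", 1)[0]
    if series in ACCELERATOR_OPTIMIZED_GPUS:
        return f"automatic: {ACCELERATOR_OPTIMIZED_GPUS[series]}"
    # Assumption: N1 custom machine types are named 'custom-<vCPUs>-<memory MB>'.
    if series in ("n1", "custom"):
        return "manual: attach during or after VM creation"
    return "unsupported: machine series has no GPU support"


for example in ("a2-highgpu-1g", "n1-standard-8", "f1-micro", "e2-medium"):
    print(example, "->", gpu_support(example))
```

The sketch only encodes the series-level rule. It does not check which GPU counts or zones are valid for a given machine type.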

GPUs on preemptible VM instances

You can add GPUs to your preemptible VM instances at lower spot prices. GPUs attached to preemptible instances work like normal GPUs but persist only for the life of the instance. Preemptible instances with GPUs follow the same preemption process as all preemptible instances.

Consider requesting dedicated Preemptible GPU quota to use for GPUs on preemptible instances. For more information, see Quotas for preemptible VM instances.

During maintenance events, preemptible instances with GPUs are preempted by default and cannot be automatically restarted. If you want to recreate your instances after they have been preempted, use a managed instance group. Managed instance groups recreate your instances if the vCPU, memory, and GPU resources are available.

If you want a warning before your instance is preempted, or want to configure your instance to automatically restart after a maintenance event, use a standard instance with a GPU. For standard instances with GPUs, Google provides one hour of advance notice before preemption.

Compute Engine does not charge you for GPUs if their instances are preempted in the first minute after they start running.
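As a minimal sketch of the rule above, the following hypothetical helper decides whether GPU charges apply to a run. It assumes that "the first minute" means fewer than 60 seconds of runtime; the exact boundary is not stated in this document.

```python
def gpu_is_billed(preempted: bool, runtime_seconds: float) -> bool:
    """Return False only when the instance was preempted in its first minute."""
    # Assumption: "first minute" means strictly fewer than 60 seconds.
    return not (preempted and runtime_seconds < 60)


print(gpu_is_billed(preempted=True, runtime_seconds=45))    # preempted early
print(gpu_is_billed(preempted=True, runtime_seconds=300))   # preempted later
print(gpu_is_billed(preempted=False, runtime_seconds=45))   # stopped by the user
```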

For steps to automatically restart a standard instance, see Updating options for an instance.

To learn how to create preemptible instances with GPUs attached, read Create a VM with attached GPUs.
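To make the difference between preemptible and standard GPU instances concrete, the following sketch builds a gcloud command line without running it. The helper name build_create_command, the instance name gpu-vm-1, and the chosen zone, machine type, and GPU type are illustrative assumptions. The flags reflect the gcloud CLI as commonly documented, so check gcloud compute instances create --help before relying on them.

```python
import shlex


def build_create_command(name, zone, machine_type, gpu_type, gpu_count, preemptible):
    """Assemble (but do not run) a gcloud command for an N1 VM with GPUs."""
    args = [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        f"--accelerator=type={gpu_type},count={gpu_count}",
        # VMs with GPUs are stopped for host maintenance, not live migrated.
        "--maintenance-policy=TERMINATE",
    ]
    if preemptible:
        # Preemptible instances cannot be automatically restarted.
        args.append("--preemptible")
    else:
        # Standard instances can restart after a maintenance event.
        args.append("--restart-on-failure")
    return args


# Hypothetical values, for illustration only.
cmd = build_create_command(
    "gpu-vm-1", "us-central1-a", "n1-standard-8", "nvidia-tesla-t4", 1, True
)
print(shlex.join(cmd))
```

Returning a list of arguments instead of a single string keeps quoting explicit and lets you pass the result to subprocess.run if you decide to execute it.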

GPUs and Confidential VM

You can't attach GPUs to Confidential VM instances. For more information about Confidential VM, see Confidential VM overview.

GPUs and block storage

When you create a VM on a GPU platform, you can add durable block storage by attaching a Persistent Disk to the VM. You can also add temporary block storage by attaching Local SSD disks when you create the VM.

Persistent Disks

You can add Persistent Disk volumes to VMs with attached GPUs. The data stored on a Persistent Disk volume is independent of the lifecycle of the VM, making it suitable for storing non-transient data.

For more information about the available Persistent Disk types for machine series that support GPUs, see the N1 and accelerator-optimized machine series pages.

Local SSD disks

Local SSD disks provide fast, temporary storage for caching, data processing, or other transient data. Local SSD disks are fast storage because they are physically attached to the server hosting your VM. They are temporary because the data might be lost if the VM restarts.

You shouldn't store data with strong persistence requirements on Local SSD disks. To store non-transient data, use one of the available durable storage options instead.

If you manually stop a VM with a GPU, you can preserve the Local SSD data, with certain restrictions. See the Local SSD documentation for more details.

For regional support for Local SSD with GPU types, see Local SSD availability by GPU regions and zones.

GPUs and host maintenance

VMs with attached GPUs are always stopped when Compute Engine performs maintenance events on the VMs. If the VM has attached Local SSD disks, the Local SSD data is lost after the VM stops.

For information on handling maintenance events, see Handling GPU host maintenance events.
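Because a maintenance event stops the VM and discards Local SSD data, a workload may want to detect a pending event and save its state to durable storage first. The following is a minimal sketch that assumes the VM can reach the Compute Engine metadata server and that the maintenance-event key returns NONE when nothing is scheduled. The function names are hypothetical, and the network call only works from inside a VM.

```python
import urllib.request

# Assumption: metadata server endpoint for the instance's maintenance event.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/maintenance-event"
)


def fetch_maintenance_event(timeout: float = 2.0) -> str:
    """Query the metadata server; call this only from inside a VM."""
    request = urllib.request.Request(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def should_checkpoint(event: str) -> bool:
    """True when any host maintenance event is pending."""
    return event != "NONE"


# Example with literal values, so the sketch runs anywhere.
print(should_checkpoint("NONE"))
print(should_checkpoint("TERMINATE_ON_HOST_MAINTENANCE"))
```

In a real workload you would call fetch_maintenance_event periodically and copy any Local SSD data you need to a Persistent Disk when should_checkpoint returns True.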

GPU pricing

Most VMs with an attached GPU receive sustained use discounts similar to those for vCPUs. When you select a GPU for a virtual workstation, an NVIDIA RTX Virtual Workstation license is added to your VM.

For hourly and monthly pricing for GPUs, see the GPU pricing page.

Reserving GPUs with committed use discounts

To reserve GPU resources in a specific zone, see Reservations of Compute Engine zonal resources.

To receive committed use discounts for GPUs in a specific zone, you must purchase resource-based commitments for the GPUs and also attach reservations that specify matching GPUs to your commitments. For more information, see Attach reservations to resource-based commitments.

GPU restrictions and limitations

For VMs with attached GPUs, the following restrictions and limitations apply:

  • GPUs are supported only with general-purpose N1 machine types or accelerator-optimized (A3, A2, and G2) machine types.

  • To protect Compute Engine systems and users, new projects have a global GPU quota, which limits the total number of GPUs you can create in any supported zone. When you request a GPU quota, you must request a quota for the GPU models that you want to create in each region, and an additional global quota for the total number of GPUs of all types in all zones.

  • VMs with one or more GPUs have a maximum number of vCPUs for each GPU that you add to the VM. To see the available vCPU and memory ranges for different GPU configurations, see the GPUs list.

  • GPUs require device drivers to function properly. NVIDIA GPUs running on Compute Engine must use a minimum driver version. For more information about driver versions, see Required NVIDIA driver versions.

  • VMs with a specific attached GPU model are covered by the Compute Engine SLA only if that attached GPU model is generally available and is supported in more than one zone in the same region. The Compute Engine SLA does not cover GPU models in the following zones:

    • NVIDIA H100 80GB Mega:
      • asia-southeast1-b
      • us-east5-a
      • us-west4-a
    • NVIDIA H100 80GB:
      • asia-northeast1-b
      • europe-west1-b
      • us-east5-a
      • us-west4-a
    • NVIDIA L4:
      • europe-west3-b
      • europe-west6-b
    • NVIDIA A100 80GB:
      • asia-southeast1-c
      • us-east4-c
      • us-east5-b
    • NVIDIA A100 40GB:
      • us-east1-b
      • us-west1-b
      • us-west3-b
      • us-west4-b
    • NVIDIA T4:
      • europe-west3-b
      • southamerica-east1-c
      • us-west3-b
    • NVIDIA V100:
      • asia-east1-c
      • us-east1-c
    • NVIDIA P100:
      • australia-southeast1-c
      • europe-west4-a
  • Compute Engine supports one concurrent user per GPU.
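The list of zones excluded from the SLA, given in the restrictions above, can be encoded as a lookup table for use in deployment checks. This is a sketch under stated assumptions: the table copies the zones listed in this document at the time of writing, the function name zone_excluded_from_sla is hypothetical, and the check does not verify the separate requirement that the GPU model is generally available.

```python
# Zones where the Compute Engine SLA does not cover the GPU model,
# copied from the list in this document.
SLA_EXCLUDED_ZONES = {
    "NVIDIA H100 80GB Mega": {"asia-southeast1-b", "us-east5-a", "us-west4-a"},
    "NVIDIA H100 80GB": {
        "asia-northeast1-b", "europe-west1-b", "us-east5-a", "us-west4-a",
    },
    "NVIDIA L4": {"europe-west3-b", "europe-west6-b"},
    "NVIDIA A100 80GB": {"asia-southeast1-c", "us-east4-c", "us-east5-b"},
    "NVIDIA A100 40GB": {"us-east1-b", "us-west1-b", "us-west3-b", "us-west4-b"},
    "NVIDIA T4": {"europe-west3-b", "southamerica-east1-c", "us-west3-b"},
    "NVIDIA V100": {"asia-east1-c", "us-east1-c"},
    "NVIDIA P100": {"australia-southeast1-c", "europe-west4-a"},
}


def zone_excluded_from_sla(gpu_model: str, zone: str) -> bool:
    """True when this document lists the zone as not covered for the model."""
    return zone in SLA_EXCLUDED_ZONES.get(gpu_model, set())


print(zone_excluded_from_sla("NVIDIA T4", "us-west3-b"))
print(zone_excluded_from_sla("NVIDIA T4", "us-central1-a"))
```

A result of False means only that the zone is not on the exclusion list. It does not confirm that the GPU model is offered in that zone.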

What's next?