This document describes how to create instances with attached GPUs from the A3 Ultra or A4 machine series. To learn more about creating instances with attached GPUs, see Overview of creating an instance with attached GPUs.
Before you begin
- To review limitations and additional prerequisite steps for creating instances with attached GPUs, such as selecting an OS image and checking GPU quota, see Overview of creating an instance with attached GPUs.
- If you haven't already, then set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
- After installing the Google Cloud CLI, initialize it by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- Set a default region and zone.
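As a minimal sketch of setting the defaults, the helper below derives a region from a zone name and prints the two `gcloud config set` commands for review instead of running them. It assumes that a zone name is the region name plus a final letter suffix, as in `europe-west1-b`. The function name `zone_to_region` and the sample zone are illustrative and are not part of the gcloud CLI.

```shell
# Hypothetical helper: derive the region by dropping the final "-<letter>" suffix.
zone_to_region() {
  printf '%s\n' "${1%-*}"
}

ZONE="europe-west1-b"                 # example value; replace with your zone
REGION="$(zone_to_region "$ZONE")"

# Print the commands rather than executing them, so that you can review them first.
echo "gcloud config set compute/region $REGION"
echo "gcloud config set compute/zone $ZONE"
```

After checking the printed values, you can run the two commands as shown.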
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
After installing the Google Cloud CLI, initialize it by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
gcloud (Cloud Shell)
- In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
- Set a default region and zone.
Required roles
To get the permissions that you need to create instances, ask your administrator to grant you the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role on the project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
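For an administrator who grants the role from the command line, the following sketch prints a `gcloud projects add-iam-policy-binding` command for a single user. The helper name `grant_instance_admin_cmd` and the `PROJECT_ID` and `USER_EMAIL` values are placeholders introduced for illustration. The sketch assumes that the principal is a user account rather than a group or service account.

```shell
# Hypothetical helper: print the IAM binding command for a project and a user.
grant_instance_admin_cmd() {
  project_id="$1"
  user_email="$2"
  printf 'gcloud projects add-iam-policy-binding %s --member=user:%s --role=roles/compute.instanceAdmin.v1\n' \
    "$project_id" "$user_email"
}

# Placeholder values; replace them before running the printed command.
grant_instance_admin_cmd "PROJECT_ID" "USER_EMAIL"
```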
This predefined role contains the permissions required to create instances. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to create instances:
- compute.instances.create on the project
- To use a custom image to create the VM: compute.images.useReadOnly on the image
- To use a snapshot to create the VM: compute.snapshots.useReadOnly on the snapshot
- To use an instance template to create the VM: compute.instanceTemplates.useReadOnly on the instance template
- To assign a legacy network to the VM: compute.networks.use on the project
- To specify a static IP address for the VM: compute.addresses.use on the project
- To assign an external IP address to the VM when using a legacy network: compute.networks.useExternalIp on the project
- To specify a subnet for your VM: compute.subnetworks.use on the project or on the chosen subnet
- To assign an external IP address to the VM when using a VPC network: compute.subnetworks.useExternalIp on the project or on the chosen subnet
- To set VM instance metadata for the VM: compute.instances.setMetadata on the project
- To set tags for the VM: compute.instances.setTags on the VM
- To set labels for the VM: compute.instances.setLabels on the VM
- To set a service account for the VM to use: compute.instances.setServiceAccount on the VM
- To create a new disk for the VM: compute.disks.create on the project
- To attach an existing disk in read-only or read-write mode: compute.disks.use on the disk
- To attach an existing disk in read-only mode: compute.disks.useReadOnly on the disk
You might also be able to get these permissions with custom roles or other predefined roles.
Create an A3 Ultra or A4 instance
A3 Ultra or A4 instances are available through the following creation options, each of which has a different creation procedure, resource availability, and pricing. Identify which option you want to use based on your workload.
If you are running long-running AI and ML workloads, such as large model training and inference, that require the lowest latency, we recommend using Hypercompute Cluster (Preview). With Hypercompute Cluster, you can reserve densely allocated machines that provide topology-aware scheduling and enhanced monitoring and maintenance of this reserved capacity. To learn more about Hypercompute Cluster, see Hypercompute Cluster in the AI Hypercomputer documentation.
For instructions to create A3 Ultra or A4 instances using Hypercompute Cluster, see Overview of creating VMs and clusters in the AI Hypercomputer documentation.
If you are running lower priority AI and ML workloads that are tolerant to availability disruptions, you can get significant discounts by using Spot VMs. Although you can create and delete Spot VMs as needed, Spot VMs are finite resources that might not always be available, and Compute Engine might preempt (automatically stop or delete) Spot VMs at any time. To learn more about Spot VMs, see Spot VMs.
For instructions to create A3 Ultra or A4 instances using Spot VMs, see the following Create an A3 Ultra or A4 instance using Spot VMs section in this document.
Create an A3 Ultra or A4 instance using Spot VMs
To create an A3 Ultra or A4 instance using Spot VMs, complete the steps in the following sections:
Create VPC networks
Based on the machine type that you want to use and the number of network interfaces in the machine type, you need to create Virtual Private Cloud (VPC) networks as follows:
Machine type | Physical NIC count* | Network interfaces† | Number of VPC networks to create |
---|---|---|---|
a4-highgpu-8g | 10 | | 3 |
a3-ultragpu-8g | 10 | | 3 |
Set up the networks either manually by following the instruction guides or automatically by using the provided script.
Instruction guides
To create the networks, you can use the following instructions:
- To create the host networks, see Create and manage Virtual Private Cloud networks.
- To create the GPU networks, see Create a Virtual Private Cloud network for RDMA NICs.
Script
To create the networks, you can use the following script.
#!/bin/bash

# Create standard VPCs (network and subnets) for the gVNICs
for N in $(seq 0 1); do
  gcloud beta compute networks create GVNIC_NAME_PREFIX-net-$N \
      --subnet-mode=custom

  gcloud beta compute networks subnets create GVNIC_NAME_PREFIX-sub-$N \
      --network=GVNIC_NAME_PREFIX-net-$N \
      --region=REGION \
      --range=10.$N.0.0/16

  gcloud beta compute firewall-rules create GVNIC_NAME_PREFIX-internal-$N \
      --network=GVNIC_NAME_PREFIX-net-$N \
      --action=ALLOW \
      --rules=tcp:0-65535,udp:0-65535,icmp \
      --source-ranges=10.0.0.0/8
done

# Create SSH firewall rules
gcloud beta compute firewall-rules create GVNIC_NAME_PREFIX-ssh \
    --network=GVNIC_NAME_PREFIX-net-0 \
    --action=ALLOW \
    --rules=tcp:22 \
    --source-ranges=IP_RANGE

# Assumes that an external IP is only created for vNIC 0
gcloud beta compute firewall-rules create GVNIC_NAME_PREFIX-allow-ping-net-0 \
    --network=GVNIC_NAME_PREFIX-net-0 \
    --action=ALLOW \
    --rules=icmp \
    --source-ranges=IP_RANGE

# List and make sure network profiles exist
gcloud beta compute network-profiles list

# Create network for CX-7
gcloud beta compute networks create RDMA_NAME_PREFIX-mrdma \
    --network-profile=ZONE-vpc-roce \
    --subnet-mode custom

# Create subnets.
for N in $(seq 0 7); do
  gcloud beta compute networks subnets create RDMA_NAME_PREFIX-mrdma-sub-$N \
      --network=RDMA_NAME_PREFIX-mrdma \
      --region=REGION \
      --range=10.$((N+2)).0.0/16  # offset to avoid overlap with gVNICs
done
Replace the following:
- GVNIC_NAME_PREFIX: the name prefix to use for the standard VPC networks and subnets that use gVNIC NICs.
- RDMA_NAME_PREFIX: the name prefix to use for the VPC networks and subnets that use RDMA NICs.
- ZONE: specify a zone in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
- REGION: the region where you want to create the networks. This must correspond to the zone specified. For example, if your zone is europe-west1-b, then your region is europe-west1.
- IP_RANGE: the IP range to use for the SSH firewall rules.
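To check the naming and address plan before creating anything, the following sketch prints the network, subnet, and range for each interface, using the same loops and offsets as the script above. It makes no gcloud calls. The function name `print_subnet_plan` and the sample prefixes `gvnic` and `rdma` are assumptions made for illustration.

```shell
# Print "network subnet range" for every subnet that the networking script creates.
# The two arguments stand in for GVNIC_NAME_PREFIX and RDMA_NAME_PREFIX.
print_subnet_plan() {
  gvnic_prefix="$1"
  rdma_prefix="$2"

  # Two gVNIC networks, one subnet each: 10.0.0.0/16 and 10.1.0.0/16.
  n=0
  while [ "$n" -le 1 ]; do
    echo "${gvnic_prefix}-net-${n} ${gvnic_prefix}-sub-${n} 10.${n}.0.0/16"
    n=$((n + 1))
  done

  # One RDMA network with eight subnets, offset by 2 to avoid the gVNIC ranges.
  n=0
  while [ "$n" -le 7 ]; do
    echo "${rdma_prefix}-mrdma ${rdma_prefix}-mrdma-sub-${n} 10.$((n + 2)).0.0/16"
    n=$((n + 1))
  done
}

print_subnet_plan "gvnic" "rdma"
```

The output has ten lines, one per network interface, spread across three distinct networks, which matches the NIC count and the number of VPC networks in the table.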
Create the Spot VM
To create the Spot VM, use one of the following methods:
Console
In the Google Cloud console, go to the Create an instance page.
The Create an instance screen appears and displays the Machine configuration pane.
In the Machine configuration pane, complete the following steps:
- Specify a Name for your instance. See Resource naming convention.
- Select the Region and Zone where you want to create the instance. See the list of available GPU regions and zones.
- Click the GPUs tab, and then complete the following steps:
  - In the GPU type list, select your GPU type.
    - For A4 instances, select NVIDIA B200.
    - For A3 Ultra instances, select NVIDIA H200 141GB.
  - In the Number of GPUs list, select 8.
In the navigation menu, click OS and storage. In the OS and storage pane that appears, complete the following steps:
- Click Change. The Boot disk configuration pane opens.
- On the Public images tab, select a recommended image. For a list of recommended images, see Operating systems.
- To confirm your boot disk options, click Select.
To create a multi-NIC instance, complete the following steps. Otherwise, to create a single-NIC instance, skip these steps.
In the navigation menu, click Networking. In the Networking pane that appears, complete the following steps:
In the Network interfaces section, complete the following steps:
Delete the default network interface. To delete the interface, click Delete.
Click Add a network interface. Use this option to add the gVNIC and RDMA networks that you created in the previous section. When you add the networks, remember the following:
- Specify your host networks in the Network and Subnetwork lists, and set the Network interface card list to gVNIC.
- Specify your GPU networks in the Network and Subnetwork lists, and set the Network interface card list to MRDMA for these networks.
In the navigation menu, click Advanced. In the Advanced pane that appears, complete the following steps:
In the Provisioning model section, select Spot in the VM provisioning model list.
Optional: To specify the action to take when Compute Engine preempts the instance (stop (default) or delete), complete the following steps:
- Expand the VM provisioning model advanced settings section.
- In the On VM termination list, select an option.
To create and start the instance, click Create.
gcloud
To create the instance, use the gcloud beta compute instances create command:
gcloud beta compute instances create INSTANCE_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --provisioning-model=SPOT \
    --instance-termination-action=TERMINATION_ACTION \
    --zone=ZONE \
    --boot-disk-type=hyperdisk-balanced \
    --boot-disk-size=DISK_SIZE \
    --scopes=cloud-platform \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address
Replace the following:
- INSTANCE_NAME: the name of the instance.
- MACHINE_TYPE: the machine type to use for the instance, either a3-ultragpu-8g or a4-highgpu-8g.
- IMAGE_FAMILY: the image family of the OS image that you want to use. For options, see Operating system details.
- IMAGE_PROJECT: the project ID of the OS image.
- TERMINATION_ACTION: Optional: specify which action to take when Compute Engine preempts the instance, either STOP (default behavior) or DELETE.
- ZONE: the zone where you want to create the instance. For options, see GPU regions and zones.
- DISK_SIZE: the size of the boot disk in GB.
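Because the ten `--network-interface` flags follow a regular pattern, they can be generated rather than typed out. The following sketch prints one flag per line. It assumes that the second gVNIC interface attaches to the `-sub-1` subnet created by the networking script, and that the eight MRDMA interfaces use subnets `-mrdma-sub-0` through `-mrdma-sub-7`. The function name `build_nic_flags` is hypothetical.

```shell
# Hypothetical helper: print the --network-interface flags for one instance.
# The two arguments stand in for GVNIC_NAME_PREFIX and RDMA_NAME_PREFIX.
build_nic_flags() {
  gvnic_prefix="$1"
  rdma_prefix="$2"

  # The first gVNIC keeps its external address; the second uses no-address.
  echo "--network-interface=nic-type=GVNIC,network=${gvnic_prefix}-net-0,subnet=${gvnic_prefix}-sub-0"
  echo "--network-interface=nic-type=GVNIC,network=${gvnic_prefix}-net-1,subnet=${gvnic_prefix}-sub-1,no-address"

  # Eight MRDMA interfaces, one per RDMA subnet, all without external addresses.
  n=0
  while [ "$n" -le 7 ]; do
    echo "--network-interface=nic-type=MRDMA,network=${rdma_prefix}-mrdma,subnet=${rdma_prefix}-mrdma-sub-${n},no-address"
    n=$((n + 1))
  done
}

build_nic_flags "GVNIC_NAME_PREFIX" "RDMA_NAME_PREFIX"
```

Because the generated flags contain no spaces, one possible use is to append an unquoted `$(build_nic_flags my-gvnic my-rdma)` to the create command so that the shell splits the output into separate arguments. Review the printed flags before you rely on this.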
REST
To create the instance, make a POST request to the instances.insert method as follows:
POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances
{
  "machineType": "projects/PROJECT_ID/zones/ZONE/machineTypes/MACHINE_TYPE",
  "name": "INSTANCE_NAME",
  "disks": [
    {
      "boot": true,
      "initializeParams": {
        "diskSizeGb": "DISK_SIZE",
        "diskType": "hyperdisk-balanced",
        "sourceImage": "projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
      },
      "mode": "READ_WRITE",
      "type": "PERSISTENT"
    }
  ],
  "networkInterfaces": [
    {
      "accessConfigs": [
        {
          "name": "external-nat",
          "type": "ONE_TO_ONE_NAT"
        }
      ],
      "network": "projects/PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
      "nicType": "GVNIC",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
    },
    {
      "network": "projects/PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
      "nicType": "GVNIC",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
    },
    {
      "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
    },
    {
      "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
    },
    {
      "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
    },
    {
      "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
    },
    {
      "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
    },
    {
      "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
    },
    {
      "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
    },
    {
      "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
    }
  ],
  "scheduling": {
    "provisioningModel": "SPOT",
    "instanceTerminationAction": "TERMINATION_ACTION"
  }
}
Replace the following:
- PROJECT_ID: the project ID of the project where you want to create the instance.
- ZONE: the zone where you want to create the instance. For options, see GPU regions and zones.
- REGION: the region that contains the subnets. This must correspond to the zone specified.
- MACHINE_TYPE: the machine type to use for the instance, either a3-ultragpu-8g or a4-highgpu-8g.
- INSTANCE_NAME: the name of the instance.
- DISK_SIZE: the size of the boot disk in GB.
- IMAGE_PROJECT: the project ID of the OS image.
- IMAGE_FAMILY: the image family of the OS image that you want to use. For options, see Operating system details.
- TERMINATION_ACTION: Optional: specify which action to take when Compute Engine preempts the instance, either STOP (default behavior) or DELETE.
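One way to send the request from a shell is with curl and an access token from the gcloud CLI. The following is a sketch under stated assumptions: the request body is saved in a local file (the name `request.json` is only an example), the endpoint path for `instances.insert` ends in `/instances`, and the helper names `insert_url` and `send_insert_request` are hypothetical. The block only defines the helpers and prints the URL; it does not send anything.

```shell
# Hypothetical helper: build the instances.insert URL for a project and zone.
insert_url() {
  printf 'https://compute.googleapis.com/compute/beta/projects/%s/zones/%s/instances\n' "$1" "$2"
}

# Hypothetical helper: POST a JSON body file to the insert URL.
# Arguments: project ID, zone, path to the JSON body file.
# Defined only; call it yourself after you save and review the request body.
send_insert_request() {
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @"$3" \
    "$(insert_url "$1" "$2")"
}

# Print the URL for review.
insert_url "PROJECT_ID" "ZONE"
```

A call might look like `send_insert_request my-project europe-west1-b request.json`, where all three values are examples.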
Prepare a Spot VM with attached GPUs for use
To prepare a Spot VM with attached GPUs for use, complete the following steps:
- To allow an instance to use its attached GPUs, the instance requires GPU drivers. Unless you specified an image that already includes the required GPU drivers, follow the steps to Install GPU drivers.
- To prepare a Spot VM for use, complete the following steps:
  - To learn how to make sure a Spot VM can withstand preemption, see Manage preemption of Spot VMs.
  - Optional: Learn about the best practices for using Spot VMs.
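After the instance is running, two quick checks from inside the VM can confirm the preparation steps. The following sketch assumes that the NVIDIA driver installation places `nvidia-smi` on the PATH and that the Compute Engine metadata server exposes the `instance/preempted` value, which returns `TRUE` or `FALSE`. The helper names `driver_status` and `preempted_flag` are hypothetical, and this is not a substitute for the linked guides.

```shell
# Hypothetical helper: report whether the NVIDIA driver tools are on the PATH.
driver_status() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    echo "installed"
  else
    echo "missing"
  fi
}

# Hypothetical helper: ask the metadata server whether this VM was preempted.
# Defined only; it works only when called from inside a Compute Engine VM.
preempted_flag() {
  curl -s --max-time 2 -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/preempted"
}

echo "GPU driver tools: $(driver_status)"
```

A workload script could poll `preempted_flag` and save a checkpoint when the value changes to `TRUE`. Treat this as one possible approach and follow Manage preemption of Spot VMs for the supported options.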
What's next
- To monitor GPU performance, see Monitor GPU performance.
- To troubleshoot GPU instances, see Troubleshoot GPU VMs.
- Learn more about GPU platforms.