This document describes the prerequisites required for creating managed instance groups (MIGs) that are deployed on Hypercompute Cluster. For more information about Hypercompute Cluster, see Hypercompute Cluster.
Create a MIG if you want to manage multiple virtual machines (VMs) as a single entity. MIGs offer high availability and scalability by automatically managing the VMs in the group. To learn more about MIGs, see Managed instance groups in the Compute Engine documentation.
To learn about other ways to create VMs or clusters, see the Overview page.
Before you begin
Select the tab for how you plan to use the samples on this page:
gcloud
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
After installing the Google Cloud CLI, initialize it by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Overview
After you have requested capacity, prepare for creating your MIG by completing the following tasks:
Optional: Create a compact placement policy. Use this policy to place your VMs in a single block or in adjacent blocks.
However, if you want your VMs to be on a specific block, skip this step and provide the name of the required block when you create the instance template.
Required: Create an instance template. This instance template is used to define the VM properties that the MIG will use to create each VM in the group.
Compact placement policy
Create a compact placement policy
When you apply compact placement policies to your VMs, Compute Engine makes best-effort attempts to create VMs as close to each other as possible. If you require a minimum compactness to minimize network latency, then specify the maxDistance field when creating a placement policy. A lower maxDistance value ensures closer VM placement, but it also increases the chance that some VMs won't be created.
The following table shows the machine series and number of VMs that each maxDistance value supports:
Maximum distance value | Description | Supported machine series | Maximum number of VMs |
---|---|---|---|
Unspecified (Not recommended) | Compute Engine makes best-effort attempts to place the VMs as close to each other as possible, but with no maximum distance between VMs. | A4 and A3 Ultra | 1,500 |
3 | Compute Engine creates VMs in adjacent blocks. | A4 | 1,500 |
2 | Compute Engine creates VMs in the same block. | A4 and A3 Ultra | For A4 VMs: 150, for A3 Ultra VMs: 256 |
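The limits in the preceding table can be checked before you create a policy. The following is a minimal sketch, not part of the gcloud CLI: the function name max_vms_for and the series labels a4 and a3-ultra are hypothetical, and the numbers are copied from the table above.

```shell
# Hypothetical helper: print the maximum number of VMs that the table above
# lists for a machine series and maxDistance value.
# Usage: max_vms_for SERIES MAX_DISTANCE   (use "unspecified" for no value)
max_vms_for() {
  local series="$1" distance="$2"
  case "${series}:${distance}" in
    a4:unspecified|a3-ultra:unspecified) echo 1500 ;;
    a4:3)       echo 1500 ;;
    a4:2)       echo 150 ;;
    a3-ultra:2) echo 256 ;;
    *)
      echo "unsupported combination: ${series}, maxDistance=${distance}" >&2
      return 1
      ;;
  esac
}

# Example: check a planned group size before creating the policy.
planned_vms=200
limit="$(max_vms_for a3-ultra 2)"
if [ "$planned_vms" -le "$limit" ]; then
  echo "OK: ${planned_vms} VMs fit within the limit of ${limit}"
else
  echo "Too many VMs: ${planned_vms} exceeds the limit of ${limit}"
fi
```

A combination that the table doesn't list, such as A3 Ultra with a maxDistance of 3, makes the helper return a non-zero status.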
gcloud
To create a compact placement policy, use the gcloud beta compute resource-policies create group-placement command:
gcloud beta compute resource-policies create group-placement POLICY_NAME \
    --collocation=collocated \
    --max-distance=MAX_DISTANCE \
    --region=REGION
Replace the following:
- POLICY_NAME: the name of the compact placement policy.
- MAX_DISTANCE: the maximum distance configuration for your VMs. The value must be 3 to place VMs in adjacent blocks, or 2 to place VMs in the same block.
- REGION: the region where you want to create the placement policy. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
REST
To create a compact placement policy, make a POST request to the beta resourcePolicies.insert method. In the request body, include the collocation field set to COLLOCATED, and the maxDistance field.
POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/REGION/resourcePolicies

{
  "name": "POLICY_NAME",
  "groupPlacementPolicy": {
    "collocation": "COLLOCATED",
    "maxDistance": MAX_DISTANCE
  }
}
Replace the following:
- PROJECT_ID: your project ID.
- POLICY_NAME: the name of the compact placement policy.
- MAX_DISTANCE: the maximum distance configuration for your VMs. The value must be 3 to place VMs in adjacent blocks, or 2 to place VMs in the same block.
- REGION: the region where you want to create the placement policy. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
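One way to send the preceding request from a shell is sketched below. The function names policy_request_body and create_policy are hypothetical, and the sketch assumes that curl is installed and that the gcloud CLI is already authenticated. The body builder does no JSON escaping, so it assumes a policy name made only of lowercase letters, digits, and hyphens.

```shell
# Hypothetical helper: print the JSON request body for resourcePolicies.insert.
# Usage: policy_request_body POLICY_NAME MAX_DISTANCE
policy_request_body() {
  local name="$1" distance="$2"
  # Accept only the two values that this page documents.
  case "$distance" in
    2|3) ;;
    *)
      echo "MAX_DISTANCE must be 2 or 3, got: ${distance}" >&2
      return 1
      ;;
  esac
  printf '{"name": "%s", "groupPlacementPolicy": {"collocation": "COLLOCATED", "maxDistance": %s}}\n' \
    "$name" "$distance"
}

# Hypothetical wrapper: send the request with curl, using an access token
# from the gcloud CLI. It is defined here but not called.
# Usage: create_policy PROJECT_ID REGION POLICY_NAME MAX_DISTANCE
create_policy() {
  local project="$1" region="$2" body
  body="$(policy_request_body "$3" "$4")" || return 1
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "$body" \
    "https://compute.googleapis.com/compute/beta/projects/${project}/regions/${region}/resourcePolicies"
}

# Print the body without sending anything.
policy_request_body example-policy 2
```

Printing the body first makes it easy to review the request before calling create_policy with real values.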
Instance template
Each VM in a MIG is based on an instance template. To create an instance template, complete the following steps:
- Create VPC networks. Use these networks to provide connectivity between the VMs.
- Create an instance template. Use this instance template to specify the machine type, network, and other VM properties that you want to use to create VMs in the MIG.
Create VPC networks
Based on the machine type that you want to use and the number of network interfaces in the machine type, you need to create Virtual Private Cloud (VPC) networks as follows:
Machine type | Physical NIC count* | Network interfaces† | Number of VPC networks to create |
---|---|---|---|
a4-highgpu-8g | 10 | | 3 |
a3-ultragpu-8g | 10 | | 3 |
Set up the networks either manually by following the instruction guides or automatically by using the provided script.
Instruction guides
To create the networks, you can use the following instructions:
- To create the host networks, see Create and manage Virtual Private Cloud networks.
- To create the GPU networks, see Create a Virtual Private Cloud network for RDMA NICs.
Script
To create the networks, you can use the following script.
#!/bin/bash

# Create standard VPCs (network and subnets) for the gVNICs
for N in $(seq 0 1); do
  gcloud beta compute networks create GVNIC_NAME_PREFIX-net-$N \
      --subnet-mode=custom

  gcloud beta compute networks subnets create GVNIC_NAME_PREFIX-sub-$N \
      --network=GVNIC_NAME_PREFIX-net-$N \
      --region=REGION \
      --range=10.$N.0.0/16

  gcloud beta compute firewall-rules create GVNIC_NAME_PREFIX-internal-$N \
      --network=GVNIC_NAME_PREFIX-net-$N \
      --action=ALLOW \
      --rules=tcp:0-65535,udp:0-65535,icmp \
      --source-ranges=10.0.0.0/8
done

# Create SSH firewall rules
gcloud beta compute firewall-rules create GVNIC_NAME_PREFIX-ssh \
    --network=GVNIC_NAME_PREFIX-net-0 \
    --action=ALLOW \
    --rules=tcp:22 \
    --source-ranges=IP_RANGE

# Assumes that an external IP is only created for vNIC 0
gcloud beta compute firewall-rules create GVNIC_NAME_PREFIX-allow-ping-net-0 \
    --network=GVNIC_NAME_PREFIX-net-0 \
    --action=ALLOW \
    --rules=icmp \
    --source-ranges=IP_RANGE

# List and make sure network profiles exist
gcloud beta compute network-profiles list

# Create network for CX-7
gcloud beta compute networks create RDMA_NAME_PREFIX-mrdma \
    --network-profile=ZONE-vpc-roce \
    --subnet-mode custom

# Create subnets.
for N in $(seq 0 7); do
  gcloud beta compute networks subnets create RDMA_NAME_PREFIX-mrdma-sub-$N \
      --network=RDMA_NAME_PREFIX-mrdma \
      --region=REGION \
      --range=10.$((N+2)).0.0/16  # offset to avoid overlap with gVNICs
done
Replace the following:
- GVNIC_NAME_PREFIX: the name prefix to use for the standard VPC networks and subnets that use gVNIC NICs.
- RDMA_NAME_PREFIX: the name prefix to use for the VPC networks and subnets that use RDMA NICs.
- ZONE: specify a zone in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
- REGION: the region where you want to create the networks. This must correspond to the zone specified. For example, if your zone is europe-west1-b, then your region is europe-west1.
- IP_RANGE: the IP range to use for the SSH firewall rules.
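Two details of the script are easy to get wrong: the REGION value must match the ZONE value, and the subnet ranges must not overlap. The following sketch illustrates both. The function names region_from_zone and print_subnet_plan are hypothetical, and the first function assumes that zone names always have the form REGION-LETTER, as in the europe-west1-b example.

```shell
# Derive the region by dropping the final "-<letter>" suffix of the zone name.
# Usage: region_from_zone ZONE
region_from_zone() {
  echo "${1%-*}"
}

# Print the subnet name suffix and the CIDR range that the script assigns:
# gVNIC subnets use 10.0.0.0/16 and 10.1.0.0/16, and the eight RDMA subnets
# start at 10.2.0.0/16 because of the N+2 offset.
print_subnet_plan() {
  local n
  for n in 0 1; do
    echo "sub-${n} 10.${n}.0.0/16"
  done
  for n in 0 1 2 3 4 5 6 7; do
    echo "mrdma-sub-${n} 10.$((n + 2)).0.0/16"
  done
}

region_from_zone europe-west1-b   # prints europe-west1
print_subnet_plan
```

The plan covers ten subnets in total, from 10.0.0.0/16 through 10.9.0.0/16, all inside the 10.0.0.0/8 range that the internal firewall rules allow.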
Create an instance template
Before creating an instance template, ensure that you created the VPC networks as mentioned in the previous section.
When you create an instance template that runs on reservation blocks, consider the following:
- The provisioning model must be reservation-bound. For more information about this provisioning model, see reservation-bound.
- The reservation affinity must be specific.
gcloud
To create a regional instance template, use the gcloud beta compute instance-templates create command.
If you chose to use a compact placement policy, also add the --resource-policies=POLICY_NAME flag. Replace POLICY_NAME with the name of the compact placement policy.
gcloud beta compute instance-templates create INSTANCE_TEMPLATE_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --reservation-affinity=specific \
    --reservation=RESERVATION \
    --provisioning-model=RESERVATION_BOUND \
    --instance-termination-action=DELETE \
    --instance-template-region=REGION \
    --boot-disk-type=hyperdisk-balanced \
    --boot-disk-size=DISK_SIZE \
    --scopes=cloud-platform \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address
Replace the following:
- INSTANCE_TEMPLATE_NAME: the name of the instance template.
- MACHINE_TYPE: the machine type to use for the VMs in the MIG. You can specify either an A4 or A3 Ultra machine type. For more information, see GPU machine types.
- IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Supported operating systems.
- IMAGE_PROJECT: the project ID of the OS image.
- RESERVATION: for this value, you can either specify the reservation name or a specific block within a reservation. To get the reservation name or the available blocks, see View capacity. Choose one of the following:

Reservation value | When to use |
---|---|
RESERVATION_NAME (for example, exr-5010-01) | If you are using a placement policy. The placement policy will be applied to the reservation and the VMs are placed on a single block. Also use this value if you aren't using a placement policy and are OK with VMs placed anywhere in your reservation. |
RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME (for example, exr-5010-01/reservationBlocks/exr-5010-01-block-1) | If you aren't using a placement policy and want your VMs to be placed in a specific block. |

- REGION: the region where you want to create the instance template. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
- DISK_SIZE: the size of the boot disk in GB.
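The ten --network-interface flags in the preceding command follow a regular pattern, so they can be generated instead of typed by hand. The following is a minimal sketch: the function name network_interface_flags and the prefixes my-gvnic and my-rdma are hypothetical, and the network and subnet names assume that you created the networks with the script on this page.

```shell
# Hypothetical helper: print one --network-interface flag per line for the
# two gVNIC interfaces and the eight MRDMA interfaces.
# Usage: network_interface_flags GVNIC_NAME_PREFIX RDMA_NAME_PREFIX
network_interface_flags() {
  local gvnic="$1" rdma="$2" n
  # The first interface omits no-address; every other interface sets it.
  echo "--network-interface=nic-type=GVNIC,network=${gvnic}-net-0,subnet=${gvnic}-sub-0"
  echo "--network-interface=nic-type=GVNIC,network=${gvnic}-net-1,subnet=${gvnic}-sub-1,no-address"
  for n in 0 1 2 3 4 5 6 7; do
    echo "--network-interface=nic-type=MRDMA,network=${rdma}-mrdma,subnet=${rdma}-mrdma-sub-${n},no-address"
  done
}

# Review the generated flags before using them.
network_interface_flags my-gvnic my-rdma
```

Because the generated flags contain no spaces, they can be appended to the command through unquoted command substitution, for example by ending the gcloud command with $(network_interface_flags my-gvnic my-rdma).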
REST
To create a regional instance template, make a POST request to the regionInstanceTemplates.insert method as follows:
If you chose to use a compact placement policy, also add the placement policy parameter to the request body.
POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/REGION/instanceTemplates

{
  "name": "INSTANCE_TEMPLATE_NAME",
  "properties": {
    "disks": [
      {
        "boot": true,
        "initializeParams": {
          "diskSizeGb": "DISK_SIZE",
          "diskType": "hyperdisk-balanced",
          "sourceImage": "projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
        },
        "mode": "READ_WRITE",
        "type": "PERSISTENT"
      }
    ],
    "machineType": "MACHINE_TYPE",
    "networkInterfaces": [
      {
        "accessConfigs": [
          {
            "name": "external-nat",
            "type": "ONE_TO_ONE_NAT"
          }
        ],
        "network": "projects/PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
        "nicType": "GVNIC",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
      },
      {
        "network": "projects/PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
        "nicType": "GVNIC",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
      },
      {
        "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
      },
      {
        "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
      },
      {
        "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
      },
      {
        "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
      },
      {
        "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
      },
      {
        "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
      },
      {
        "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
      },
      {
        "network": "projects/PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
      }
    ],
    "reservationAffinity": {
      "consumeReservationType": "SPECIFIC_RESERVATION",
      "key": "compute.googleapis.com/reservation-name",
      "values": [
        "RESERVATION"
      ]
    },
    "scheduling": {
      "provisioningModel": "RESERVATION_BOUND",
      "instanceTerminationAction": "DELETE",
      "automaticRestart": true
    }
  }
}
Replace the following:
- INSTANCE_TEMPLATE_NAME: the name of the instance template.
- MACHINE_TYPE: the machine type to use for the VMs in the MIG. You can specify either an A4 or A3 Ultra machine type. For more information, see GPU machine types.
- IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Supported operating systems.
- IMAGE_PROJECT: the project ID of the OS image.
- RESERVATION: for this value, you can either specify the reservation name or a specific block within a reservation. To get the reservation name or the available blocks, see View capacity. Choose one of the following:

Reservation value | When to use |
---|---|
RESERVATION_NAME (for example, exr-5010-01) | If you are using a placement policy. The placement policy will be applied to the reservation and the VMs are placed on a single block. Also use this value if you aren't using a placement policy and are OK with VMs placed anywhere in your reservation. |
RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME (for example, exr-5010-01/reservationBlocks/exr-5010-01-block-1) | If you aren't using a placement policy and want your VMs to be placed in a specific block. |

- REGION: the region where you want to create the instance template. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
- DISK_SIZE: the size of the boot disk in GB.
If you chose to use a compact placement policy, also add the following field to the properties object in the request body:
"resourcePolicies": [
  "projects/PROJECT_ID/regions/REGION/resourcePolicies/POLICY_NAME"
],
Replace the following:
- PROJECT_ID: the project ID of the compact placement policy.
- REGION: the region of the compact placement policy.
- POLICY_NAME: the name of the compact placement policy.
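The two forms of the RESERVATION value described in the parameter lists on this page can be composed with a small helper. This is a sketch only: the function name reservation_value is hypothetical, and the sample names are the example values from the reservation tables.

```shell
# Hypothetical helper: print the value to use for RESERVATION.
# Usage: reservation_value RESERVATION_NAME [RESERVATION_BLOCK_NAME]
reservation_value() {
  local name="$1" block="${2:-}"
  if [ -n "$block" ]; then
    # Target a specific block. Use this form only without a placement policy.
    echo "${name}/reservationBlocks/${block}"
  else
    # Target the reservation as a whole.
    echo "$name"
  fi
}

reservation_value exr-5010-01
# prints exr-5010-01
reservation_value exr-5010-01 exr-5010-01-block-1
# prints exr-5010-01/reservationBlocks/exr-5010-01-block-1
```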