Create an AI-optimized MIG with A4 or A3 Ultra machine type

This document describes how to create a managed instance group (MIG) that uses an A4 or A3 Ultra machine type.

Creating a MIG lets you manage multiple virtual machines (VMs) as a single entity. Each VM in a MIG is based on an instance template. By automatically managing the VMs in the group, MIGs offer high availability and scalability. To learn more about MIGs, see Managed instance groups in the Compute Engine documentation.

To learn about VM and cluster creation options, see the VM and cluster creation overview page.

Limitations

When you create a MIG with A4 or A3 Ultra VMs, the following limitations apply:
- If the instance template to use for the MIG specifies the flex-start provisioning model (Preview), then the following limitations apply:
  - You can only add VMs to the MIG using resize requests.
  - You can't apply a workload policy to the MIG.
  - You must turn off repairs in the MIG.
- If you create a regional MIG, then the MIG can only create VMs in the zone that contains your VPC network's profile.
- You can't configure instance flexibility in the MIG.
- If you apply a workload policy to a MIG, you can't change the policy while the group has VMs in it. To change the policy in a MIG that has VMs, you must first resize the MIG to zero.
- If the instance template of your MIG specifies a compact placement policy, then you cannot apply a workload policy to the MIG.

When you create MIG resize requests, the following limitations apply:
- In a regional MIG, you can use only the ANY_SINGLE_ZONE target distribution shape (Preview). Other distribution shapes aren't supported.
- You can only set the standby pool mode of the MIG to manual (default).
- You can't set autoscaling.
- If the MIG contains accepted resize requests, then you can't do the following:
  - Add a second instance template to initiate a canary update in the MIG.
  - Change the target size of the MIG.
- You can't delete or abandon the managed instances in a CREATING status that the MIG creates for a resize request. To delete those managed instances, you must cancel the resize request.

Before you begin

Before creating a MIG, if you haven't already done so, complete the following steps:

Choose a consumption option: the option that you pick determines how you want to get and use GPU resources.

To learn more, see Choose a consumption option.

Obtain capacity: the process to obtain capacity differs for each consumption option.

To learn more, see Capacity overview.

Required roles

To get the permissions that you need to create a MIG, ask your administrator to grant you the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to create a MIG. The exact permission that is required is listed in the following section:

Required permissions

The following permission is required to create a MIG:

- compute.instanceGroupManagers.create on the project

You might also be able to get these permissions with custom roles or other predefined roles.

Overview

Creating a MIG with an A4 or A3 Ultra machine type includes the following steps:

1. Create VPC networks
2. Optional: Create a workload policy
3. Create an instance template
4. Create a MIG

Create VPC networks

Tip: If you are setting up a quick test, you can skip this step and specify a single NIC --network-interface=nic-type=GVNIC instead.

For an A4 or A3 Ultra machine type, you must create three VPC networks for the following network interfaces:

- Two VPC networks for the gVNIC network interfaces (NICs). These networks are used for host-to-host communication.
- One VPC network with the RDMA network profile for the CX7 NICs. This network must have eight subnets, one for each CX7 NIC, and is used for GPU-to-GPU communication.

For more information about NIC arrangement, see Review network bandwidth and NIC arrangement.

Set up the networks either manually by following the instruction guides or automatically by using the provided script.

Instruction guides

To create the networks, you can use the following instructions:

To create the VPC networks for the gVNICs, see Create and manage Virtual Private Cloud networks.
To create the VPC network with the RDMA network profile, see Create a Virtual Private Cloud network for RDMA NICs.

For these VPC networks, we recommend setting the maximum transmission unit (MTU) to a value larger than the default. For A4 and A3 Ultra machine types, the recommended MTU is 8896 bytes. To review the recommended MTU settings for other GPU machine types, see MTU settings for GPU machine types.

Script

To create the networks, follow these steps:

Use the following script to create the networks.

    #!/bin/bash

    # Create standard VPCs (network and subnets) for the gVNICs
    for N in $(seq 0 1); do
      gcloud compute networks create GVNIC_NAME_PREFIX-net-$N \
        --subnet-mode=custom \
        --mtu=8896

      gcloud compute networks subnets create GVNIC_NAME_PREFIX-sub-$N \
        --network=GVNIC_NAME_PREFIX-net-$N \
        --region=REGION \
        --range=10.$N.0.0/16

      gcloud compute firewall-rules create GVNIC_NAME_PREFIX-internal-$N \
        --network=GVNIC_NAME_PREFIX-net-$N \
        --action=ALLOW \
        --rules=tcp:0-65535,udp:0-65535,icmp \
        --source-ranges=10.0.0.0/8
    done

    # Create SSH firewall rules
    gcloud compute firewall-rules create GVNIC_NAME_PREFIX-ssh \
      --network=GVNIC_NAME_PREFIX-net-0 \
      --action=ALLOW \
      --rules=tcp:22 \
      --source-ranges=IP_RANGE

    # Assumes that an external IP is only created for vNIC 0
    gcloud compute firewall-rules create GVNIC_NAME_PREFIX-allow-ping-net-0 \
      --network=GVNIC_NAME_PREFIX-net-0 \
      --action=ALLOW \
      --rules=icmp \
      --source-ranges=IP_RANGE

    # List and make sure network profiles exist in the machine type's zone
    gcloud compute network-profiles list --filter "location.name=ZONE"

    # Create network for CX-7
    gcloud compute networks create RDMA_NAME_PREFIX-mrdma \
      --network-profile=ZONE-vpc-roce \
      --subnet-mode=custom \
      --mtu=8896

    # Create subnets
    for N in $(seq 0 7); do
      gcloud compute networks subnets create RDMA_NAME_PREFIX-mrdma-sub-$N \
        --network=RDMA_NAME_PREFIX-mrdma \
        --region=REGION \
        --range=10.$((N+2)).0.0/16 # offset to avoid overlap with gVNICs
    done

Replace the following:

GVNIC_NAME_PREFIX: the custom name prefix to use for the standard VPC networks and subnets for the gVNICs.
RDMA_NAME_PREFIX: the custom name prefix to use for the VPC network and subnets with the RDMA network profile for the CX7 NICs.
ZONE: a zone in which the machine type that you want to use is available, such as us-central1-a. For information about zones, see GPU availability by regions and zones.
REGION: the region where you want to create the subnets. This region must correspond to the zone specified. For example, if your zone is us-central1-a, then your region is us-central1.
IP_RANGE: the source IP range to allow in the SSH and ICMP (ping) firewall rules.
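
Because REGION must correspond to ZONE, one way to avoid a mismatch is to derive the region from the zone name. The following is a minimal sketch that assumes the zone name follows the usual REGION-LETTER pattern (for example, us-central1-a). The function name zone_to_region is hypothetical and isn't part of any Google Cloud tool.

```shell
#!/bin/bash
# Hypothetical helper: strip the trailing "-<zone letter>" suffix from a zone name.
# Assumes zone names look like REGION-LETTER, such as us-central1-a.
zone_to_region() {
  printf '%s\n' "${1%-*}"
}

ZONE="us-central1-a"
REGION="$(zone_to_region "$ZONE")"
echo "$REGION"
```

You could then substitute the resulting values for the ZONE and REGION placeholders in the script.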

Optional: To verify that the VPC network resources are created successfully, check the network settings in the Google Cloud console:
1. In the Google Cloud console, go to the VPC networks page.
2. Search the list for the networks that you created in the previous step.
3. To view the subnets, firewall rules, and other network settings, click the name of the network.
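
If you created the networks with the script, it can help to have the expected resource names at hand while you check the console. The following sketch prints the network and subnet names, with their IP ranges, that the script creates for a given pair of prefixes. The function name and the example prefixes my-gvnic and my-rdma are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list the networks and subnets that the network script
# creates, so that they can be compared with the entries shown in the console.
expected_network_resources() {
  local gvnic_prefix="$1" rdma_prefix="$2" n
  # Two gVNIC networks, each with one subnet (10.0.0.0/16 and 10.1.0.0/16).
  for n in 0 1; do
    printf 'network %s-net-%s\n' "$gvnic_prefix" "$n"
    printf 'subnet %s-sub-%s 10.%s.0.0/16\n' "$gvnic_prefix" "$n" "$n"
  done
  # One RDMA network with eight subnets (10.2.0.0/16 through 10.9.0.0/16).
  printf 'network %s-mrdma\n' "$rdma_prefix"
  for n in 0 1 2 3 4 5 6 7; do
    printf 'subnet %s-mrdma-sub-%s 10.%s.0.0/16\n' "$rdma_prefix" "$n" "$((n + 2))"
  done
}

expected_network_resources my-gvnic my-rdma
```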

Optional: Create a workload policy

For the Flex-start consumption option (Preview), skip this section and proceed to create an instance template. Due to limitations, the flex-start provisioning model doesn't support workload policies.

You can specify VM placement by creating a workload policy. If you already have a workload policy, you can reuse it. When you apply a workload policy to your MIG, Compute Engine makes best-effort attempts to create VMs that are as close to each other as possible. If your application is latency-sensitive and you want the VMs to be closer together (maximum compactness), then specify the maxTopologyDistance field when creating a workload policy.

You cannot update a workload policy after you create it. To make changes in a workload policy, you must create a new one.

To create a workload policy, select one of the following options:

gcloud

To create a workload policy, use the gcloud compute resource-policies create workload-policy command.

For a best-effort placement of VMs, specify only the --type=high-throughput flag in the command:

gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
    --type=high-throughput \
    --region=REGION

For strict colocation of VMs, specify the --max-topology-distance flag in the command:

gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
    --type=high-throughput \
    --max-topology-distance=TOPOLOGY_DISTANCE \
    --region=REGION

Replace the following:

WORKLOAD_POLICY_NAME: the name of the workload policy.
TOPOLOGY_DISTANCE: the maximum topology distance. Specify one of the following values:
- To place VMs in the same block: block
- To place VMs in the same cluster: cluster
Note: A shorter maximum distance can reduce the probability of VM availability.
REGION: the region where you want to create the workload policy. Specify the region in which you want to create the MIG and in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.

REST

To create a workload policy, make a POST request to the resourcePolicies.insert method.

For a best-effort placement of VMs, specify only the type field in the request as follows:

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/resourcePolicies
  {
    "name": "WORKLOAD_POLICY_NAME",
    "workloadPolicy": {
      "type": "HIGH_THROUGHPUT"
    }
  }

For strict colocation of VMs, specify the maxTopologyDistance field in the request as follows:

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/resourcePolicies
  {
    "name": "WORKLOAD_POLICY_NAME",
    "workloadPolicy": {
      "type": "HIGH_THROUGHPUT",
      "maxTopologyDistance": "TOPOLOGY_DISTANCE"
    }
  }

Replace the following:

PROJECT_ID: your project ID
REGION: the region where you want to create the workload policy. Specify the region in which you want to create the MIG and in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
WORKLOAD_POLICY_NAME: the name of the workload policy.
TOPOLOGY_DISTANCE: the maximum topology distance. Specify one of the following values:
- To place VMs in the same block: BLOCK
- To place VMs in the same cluster: CLUSTER
Note: A shorter maximum distance can reduce the probability of VM availability.
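
The request bodies above differ only in the optional maxTopologyDistance field. As a rough sketch, a small shell function can produce either body. The function name workload_policy_body and the policy name my-policy are hypothetical, and the curl invocation in the comments is only one possible way to send the request.

```shell
#!/bin/bash
# Hypothetical helper: print the JSON request body for a workload policy.
# The second argument (BLOCK or CLUSTER) is optional; omit it for
# best-effort placement.
workload_policy_body() {
  local name="$1" distance="${2:-}"
  if [ -n "$distance" ]; then
    printf '{"name": "%s", "workloadPolicy": {"type": "HIGH_THROUGHPUT", "maxTopologyDistance": "%s"}}\n' \
      "$name" "$distance"
  else
    printf '{"name": "%s", "workloadPolicy": {"type": "HIGH_THROUGHPUT"}}\n' "$name"
  fi
}

workload_policy_body my-policy BLOCK

# One possible way to send the request (not run here; needs curl and gcloud):
# curl -X POST \
#   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   -H "Content-Type: application/json" \
#   -d "$(workload_policy_body my-policy BLOCK)" \
#   "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/resourcePolicies"
```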

Create an instance template

Specify the VM properties for a MIG by creating an instance template.

To create an instance template, select one of the following options:

gcloud

The parameters that you need to specify depend on the consumption option that you are using for this deployment. Follow the instructions for your consumption option's provisioning model.

Flex-start

To create a regional instance template, use the gcloud beta compute instance-templates create command.

gcloud beta compute instance-templates create INSTANCE_TEMPLATE_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --instance-template-region=REGION \
    --boot-disk-type=hyperdisk-balanced \
    --boot-disk-size=DISK_SIZE \
    --scopes=cloud-platform \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address \
    --reservation-affinity=none \
    --instance-termination-action=DELETE \
    --max-run-duration=RUN_DURATION \
    --maintenance-policy=TERMINATE \
    --provisioning-model=FLEX_START

Replace the following:

INSTANCE_TEMPLATE_NAME: the name of the instance template.
MACHINE_TYPE: the machine type to use for the VM. Specify either an A4 or A3 Ultra machine type. For more information, see GPU machine types.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Supported operating systems.
IMAGE_PROJECT: the project ID of the OS image.
REGION: the region where you want to create the instance template. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
DISK_SIZE: the size of the boot disk in GB.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNIC NICs.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
RUN_DURATION: the duration you want the requested VMs to run. You must format the value as the number of days, hours, minutes, or seconds followed by d, h, m, and s respectively. For example, specify 30m for 30 minutes or 1d2h3m4s for one day, two hours, three minutes, and four seconds. The value must be between 10 minutes and seven days.
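
The ten --network-interface flags follow a regular pattern, so they can also be generated instead of typed out. The following sketch prints the same flags as the command above, one per line. The function name nic_flags and the example prefixes are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the ten --network-interface flags, one per line
# (two gVNIC interfaces, then eight MRDMA interfaces).
nic_flags() {
  local gvnic_prefix="$1" rdma_prefix="$2" n
  # Only the first gVNIC interface gets an external IP address.
  printf '%s\n' "--network-interface=nic-type=GVNIC,network=${gvnic_prefix}-net-0,subnet=${gvnic_prefix}-sub-0"
  printf '%s\n' "--network-interface=nic-type=GVNIC,network=${gvnic_prefix}-net-1,subnet=${gvnic_prefix}-sub-1,no-address"
  for n in 0 1 2 3 4 5 6 7; do
    printf '%s\n' "--network-interface=nic-type=MRDMA,network=${rdma_prefix}-mrdma,subnet=${rdma_prefix}-mrdma-sub-${n},no-address"
  done
}

nic_flags my-gvnic my-rdma
```

Because the generated flags contain no spaces, one option is to pass them to the create command through an unquoted command substitution, such as $(nic_flags GVNIC_NAME_PREFIX RDMA_NAME_PREFIX), alongside the other flags shown above.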

Reservation-bound

To create a regional instance template, use the gcloud compute instance-templates create command.

gcloud compute instance-templates create INSTANCE_TEMPLATE_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --instance-template-region=REGION \
    --boot-disk-type=hyperdisk-balanced \
    --boot-disk-size=DISK_SIZE \
    --scopes=cloud-platform \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address \
    --reservation-affinity=specific \
    --reservation=RESERVATION \
    --provisioning-model=RESERVATION_BOUND \
    --instance-termination-action=DELETE \
    --maintenance-policy=TERMINATE

Replace the following:

INSTANCE_TEMPLATE_NAME: the name of the instance template.
MACHINE_TYPE: the machine type to use for the VM. Specify either an A4 or A3 Ultra machine type. For more information, see GPU machine types.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Supported operating systems.
IMAGE_PROJECT: the project ID of the OS image.
REGION: the region where you want to create the instance template. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
DISK_SIZE: the size of the boot disk in GB.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNIC NICs.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
RESERVATION: either the reservation name or a specific block within a reservation. To get the reservation name or the available blocks, see View reserved capacity. Based on your requirement for instance placement, choose one of the following:
- To create instances across blocks or on a single block:
```
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME
```
  Additionally, for a single block, create the MIG by applying a workload policy that specifies block colocation (maxTopologyDistance=BLOCK). Compute Engine then applies the policy to the reservation and creates instances on the same block.
- To create instances on a specific block:
```
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME
```
Tip: If the reservation exists in the current project, then you can omit projects/RESERVATION_OWNER_PROJECT_ID/reservations/ from the reservation value.
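
The two forms of the RESERVATION value differ only in the optional reservationBlocks suffix. The following sketch assembles the value from its parts. The function name reservation_value and the example project, reservation, and block names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build the RESERVATION value.
# Arguments: owner project ID, reservation name, and (optionally) a block name.
reservation_value() {
  local owner_project="$1" name="$2" block="${3:-}"
  local value="projects/${owner_project}/reservations/${name}"
  if [ -n "$block" ]; then
    # Target a specific block within the reservation.
    value="${value}/reservationBlocks/${block}"
  fi
  printf '%s\n' "$value"
}

reservation_value my-project my-reservation
reservation_value my-project my-reservation my-block
```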

Spot

To create a regional instance template, use the gcloud compute instance-templates create command.

gcloud compute instance-templates create INSTANCE_TEMPLATE_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --instance-template-region=REGION \
    --boot-disk-type=hyperdisk-balanced \
    --boot-disk-size=DISK_SIZE \
    --scopes=cloud-platform \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address \
    --provisioning-model=SPOT \
    --instance-termination-action=TERMINATION_ACTION

Replace the following:

INSTANCE_TEMPLATE_NAME: the name of the instance template.
MACHINE_TYPE: the machine type to use for the VM. Specify either an A4 or A3 Ultra machine type. For more information, see GPU machine types.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Supported operating systems.
IMAGE_PROJECT: the project ID of the OS image.
REGION: the region where you want to create the instance template. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
DISK_SIZE: the size of the boot disk in GB.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNIC NICs.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
TERMINATION_ACTION: the action to take when Compute Engine preempts the instance, either STOP (default) or DELETE.

Important: Make sure your application can handle preemption. For example, we recommend that you handle preemption by specifying a shutdown script during instance creation. Learn how to handle preemption with a shutdown script.
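
As an illustration only, a shutdown script can be as small as the following sketch, which records when the instance was asked to shut down so that the workload can detect the interruption later. The CHECKPOINT_DIR location, the marker file name, and the save_state function are hypothetical; a real script would typically save application state, for example by copying checkpoints to durable storage.

```shell
#!/bin/bash
# Hypothetical shutdown script: write a marker file when the VM shuts down.
# CHECKPOINT_DIR is an assumed location; override it to suit your workload.
CHECKPOINT_DIR="${CHECKPOINT_DIR:-/tmp/checkpoints}"

save_state() {
  mkdir -p "$CHECKPOINT_DIR"
  # Record the shutdown time in UTC so the workload can see it on restart.
  date -u +%Y-%m-%dT%H:%M:%SZ > "$CHECKPOINT_DIR/preempted-at"
}

save_state
```

One way to attach such a script is to save it to a local file and add --metadata-from-file=shutdown-script=PATH_TO_FILE to the gcloud compute instance-templates create command. Keep the script short, because the time available to it during preemption is limited.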

REST

The parameters that you need to specify depend on the consumption option that you are using for this deployment. Follow the instructions for your consumption option's provisioning model.

Flex-start

To create a regional instance template, make a POST request to the beta regionInstanceTemplates.insert method.

POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/REGION/instanceTemplates
{
  "name":"INSTANCE_TEMPLATE_NAME",
  "properties":{
    "disks":[
      {
        "boot":true,
        "initializeParams":{
          "diskSizeGb":"DISK_SIZE",
          "diskType":"hyperdisk-balanced",
          "sourceImage":"projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
        },
        "mode":"READ_WRITE",
        "type":"PERSISTENT"
      }
    ],
    "machineType":"MACHINE_TYPE",
    "networkInterfaces": [
      {
        "accessConfigs": [
          {
            "name": "external-nat",
            "type": "ONE_TO_ONE_NAT"
          }
        ],
        "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
        "nicType": "GVNIC",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
        "nicType": "GVNIC",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
      }
    ],
    "reservationAffinity": {
        "consumeReservationType": "NO_RESERVATION"
      },
    "scheduling": {
        "instanceTerminationAction": "DELETE",
        "maxRunDuration": {
          "seconds": RUN_DURATION
        },
        "onHostMaintenance": "TERMINATE",
        "provisioningModel": "FLEX_START"
      }

  }
}

Replace the following:

INSTANCE_TEMPLATE_NAME: the name of the instance template.
MACHINE_TYPE: the machine type to use for the VM. Specify either an A4 or A3 Ultra machine type. For more information, see GPU machine types.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Supported operating systems.
IMAGE_PROJECT: the project ID of the OS image.
REGION: the region where you want to create the instance template. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
DISK_SIZE: the size of the boot disk in GB.
NETWORK_PROJECT_ID: the project ID of the network.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNIC NICs.
REGION: the region of the subnetwork.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
RUN_DURATION: the duration, in seconds, that you want the requested VMs to run. The value must be between 600 (10 minutes) and 604800 (seven days).
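
If you already think of the run duration in days, hours, and minutes, a small conversion step avoids arithmetic mistakes. The following sketch converts a duration such as 1d2h3m4s to seconds and rejects values outside the range stated above. The function name duration_to_seconds is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: convert a duration such as 1d2h3m4s to seconds and
# check it against the range of 600 to 604800 seconds.
duration_to_seconds() {
  local rest="$1" total=0 num unit mult
  while [ -n "$rest" ]; do
    num="${rest%%[dhms]*}"            # digits before the next unit letter
    case "$num" in
      ''|*[!0-9]*) echo "invalid duration: $1" >&2; return 1 ;;
    esac
    if [ "$num" = "$rest" ]; then     # digits with no unit letter after them
      echo "missing unit in: $1" >&2; return 1
    fi
    rest="${rest#"$num"}"             # now starts with the unit letter
    unit="$(printf '%s' "$rest" | cut -c1)"
    rest="${rest#?}"
    case "$unit" in
      d) mult=86400 ;;
      h) mult=3600 ;;
      m) mult=60 ;;
      s) mult=1 ;;
    esac
    # Strip leading zeros so that the shell doesn't parse the number as octal.
    num="$(printf '%s' "$num" | sed 's/^0*\([0-9]\)/\1/')"
    total=$((total + num * mult))
  done
  if [ "$total" -lt 600 ] || [ "$total" -gt 604800 ]; then
    echo "duration out of range: ${total}s" >&2; return 1
  fi
  echo "$total"
}

duration_to_seconds 1d2h3m4s   # prints 93784
```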

Reservation-bound

To create a regional instance template, make a POST request to the regionInstanceTemplates.insert method.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/instanceTemplates
{
  "name":"INSTANCE_TEMPLATE_NAME",
  "properties":{
    "disks":[
      {
        "boot":true,
        "initializeParams":{
          "diskSizeGb":"DISK_SIZE",
          "diskType":"hyperdisk-balanced",
          "sourceImage":"projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
        },
        "mode":"READ_WRITE",
        "type":"PERSISTENT"
      }
    ],
    "machineType":"MACHINE_TYPE",
    "networkInterfaces": [
      {
        "accessConfigs": [
          {
            "name": "external-nat",
            "type": "ONE_TO_ONE_NAT"
          }
        ],
        "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
        "nicType": "GVNIC",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
        "nicType": "GVNIC",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
      }
    ],
    "reservationAffinity":{
        "consumeReservationType":"SPECIFIC_RESERVATION",
        "key":"compute.googleapis.com/reservation-name",
        "values":[
          "RESERVATION"
        ]
      },
    "scheduling":{
        "provisioningModel":"RESERVATION_BOUND",
        "instanceTerminationAction":"DELETE",
        "onHostMaintenance": "TERMINATE",
        "automaticRestart":true
      }
  }
}

Replace the following:

INSTANCE_TEMPLATE_NAME: the name of the instance template.
MACHINE_TYPE: the machine type to use for the VM. Specify either an A4 or A3 Ultra machine type. For more information, see GPU machine types.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Supported operating systems.
IMAGE_PROJECT: the project ID of the OS image.
REGION: the region where you want to create the instance template. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
DISK_SIZE: the size of the boot disk in GB.
NETWORK_PROJECT_ID: the project ID of the network.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNIC NICs.
REGION: the region of the subnetwork.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
RESERVATION: either the reservation name or a specific block within a reservation. To get the reservation name or the available blocks, see View reserved capacity. Based on your requirement for instance placement, choose one of the following:
- To create instances across blocks or on a single block:
```
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME
```
  Additionally, for a single block, create the MIG by applying a workload policy that specifies block colocation (maxTopologyDistance=BLOCK). Compute Engine then applies the policy to the reservation and creates instances on the same block.
- To create instances on a specific block:
```
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME
```
Tip: If the reservation exists in the current project, then you can omit projects/RESERVATION_OWNER_PROJECT_ID/reservations/ from the reservation value.

Spot

To create a regional instance template, make a POST request to the regionInstanceTemplates.insert method.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/instanceTemplates
{
  "name":"INSTANCE_TEMPLATE_NAME",
  "properties":{
    "disks":[
      {
        "boot":true,
        "initializeParams":{
          "diskSizeGb":"DISK_SIZE",
          "diskType":"hyperdisk-balanced",
          "sourceImage":"projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
        },
        "mode":"READ_WRITE",
        "type":"PERSISTENT"
      }
    ],
    "machineType":"MACHINE_TYPE",
    "networkInterfaces": [
      {
        "accessConfigs": [
          {
            "name": "external-nat",
            "type": "ONE_TO_ONE_NAT"
          }
        ],
        "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
        "nicType": "GVNIC",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
        "nicType": "GVNIC",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
      },
      {
        "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
        "nicType": "MRDMA",
        "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
      }
    ],
    "scheduling":
    {
      "provisioningModel": "SPOT",
      "instanceTerminationAction": "TERMINATION_ACTION"
    }
  }
}

Replace the following:

INSTANCE_TEMPLATE_NAME: the name of the instance template.
MACHINE_TYPE: the machine type to use for the VM. Specify either an A4 or A3 Ultra machine type. For more information, see GPU machine types.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Supported operating systems.
IMAGE_PROJECT: the project ID of the OS image.
REGION: the region where you want to create the instance template and where the subnetworks are located. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU regions and zones.
DISK_SIZE: the size of the boot disk in GB.
NETWORK_PROJECT_ID: the project ID of the network.
GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNIC NICs.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
TERMINATION_ACTION: the action to take when Compute Engine preempts the instance, either STOP (default) or DELETE.
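
The networkInterfaces list in the request follows a regular pattern: two gVNIC interfaces on separate networks, followed by eight MRDMA interfaces that share one network. The following sketch generates that list instead of writing it by hand. The function name `network_interfaces` and its parameters are hypothetical, and the subnetwork paths assume the `regions/REGION` form of Compute Engine resource paths.

```python
def network_interfaces(network_project_id, region, gvnic_prefix, rdma_prefix,
                       rdma_nic_count=8):
    """Return the networkInterfaces list: 2 gVNIC NICs, then the MRDMA NICs."""
    base = f"projects/{network_project_id}"
    nics = []
    for i in range(2):
        nic = {
            "network": f"{base}/global/networks/{gvnic_prefix}-net-{i}",
            "nicType": "GVNIC",
            "subnetwork": (f"{base}/regions/{region}/subnetworks/"
                           f"{gvnic_prefix}-sub-{i}"),
        }
        if i == 0:
            # Only the first gVNIC interface carries an external NAT config,
            # mirroring the request body above.
            nic = {"accessConfigs": [{"name": "external-nat",
                                      "type": "ONE_TO_ONE_NAT"}], **nic}
        nics.append(nic)
    for i in range(rdma_nic_count):
        # All MRDMA interfaces share one network and differ only by subnet.
        nics.append({
            "network": f"{base}/global/networks/{rdma_prefix}-mrdma",
            "nicType": "MRDMA",
            "subnetwork": (f"{base}/regions/{region}/subnetworks/"
                           f"{rdma_prefix}-mrdma-sub-{i}"),
        })
    return nics
```

You could assign the returned list to `properties.networkInterfaces` when you assemble the request body, which keeps the ten entries consistent if a prefix changes.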

Important: Make sure your application can handle preemption. For example, we recommend that you handle preemption by specifying a shutdown script during instance creation. Learn how to handle preemption with a shutdown script.
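
As an illustration of preemption handling, the following sketch shows logic that a shutdown script might run. It reads the `instance/preempted` value from the Compute Engine metadata server and saves state when the VM was preempted. The `save_checkpoint` callback is hypothetical and stands in for whatever your application does to persist its progress. Treat this as a starting point under those assumptions, not a complete handler.

```python
import urllib.request

PREEMPTED_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                 "instance/preempted")


def fetch_metadata(url, timeout=2.0):
    """Read one value from the metadata server (reachable only from a VM)."""
    request = urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def on_shutdown(save_checkpoint, fetch=fetch_metadata):
    """Save state if the shutdown was caused by preemption.

    Returns True if save_checkpoint was called.
    """
    try:
        preempted = fetch(PREEMPTED_URL) == "TRUE"
    except OSError:
        # Metadata server unreachable: save anyway rather than lose work.
        preempted = True
    if preempted:
        save_checkpoint()
    return preempted
```

The `fetch` parameter exists so that you can exercise the logic off a VM, for example `on_shutdown(save, fetch=lambda url: "TRUE")`.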

After you create the instance template, you can view it to see its ID and review its instance properties.

Create a MIG

After you complete all the previous steps, create a MIG based on your scenario as follows:

| Scenario | How to create the MIG and its VMs | Example workloads |
| --- | --- | --- |
| You have multiple or parallel jobs that can start with any number of VMs. | Create a MIG and use the target size to specify the number of VMs that you want in the group. See Create a MIG with target size. | ML inference jobs |
| You have a job that requires distribution across an exact number of VMs. | Create a MIG without any VMs in it, and then create a resize request in the MIG. The resize request helps you obtain all of the VMs at once. See Create a MIG and a resize request. | Distributed ML training and fine-tuning jobs |

Create a MIG with target size

If you can start your job without creating all of the VMs at once, then create a MIG with a target size. The target size determines the number of VMs in the MIG. The MIG starts creating VMs based on current resource availability. If any resource is temporarily unavailable, the MIG continuously attempts to create VMs to meet the target size.

To create a MIG with a target size, select one of the following options:

gcloud

To create a MIG with a specified target size, use the instance-groups managed create command.

The commands to create a MIG use a workload policy to specify VM placement. If you don't want to use a workload policy, then remove the --workload-policy flag.

Create a zonal or regional MIG as follows:

To create a zonal MIG, use the following command:

gcloud compute instance-groups managed create MIG_NAME \
  --template=INSTANCE_TEMPLATE_URL \
  --size=TARGET_SIZE \
  --workload-policy=WORKLOAD_POLICY_URL \
  --zone=ZONE

To create a regional MIG, use the following command:

gcloud compute instance-groups managed create MIG_NAME \
    --template=INSTANCE_TEMPLATE_URL \
    --size=TARGET_SIZE \
    --workload-policy=WORKLOAD_POLICY_URL \
    --region=REGION

Replace the following:

MIG_NAME: the name of the MIG.
INSTANCE_TEMPLATE_URL: the URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
- For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
- For a global instance template: INSTANCE_TEMPLATE_ID
TARGET_SIZE: the number of VMs that you want in the MIG.
WORKLOAD_POLICY_URL: Optional: the URL of the workload policy. If you don't want to use a workload policy, then you can remove the --workload-policy flag.
ZONE: the zone in which you want to create the MIG. If you use a workload policy, then specify a zone within the policy's region.
REGION: the region in which you want to create the MIG. If you use a workload policy, then specify the same region as that of the policy. For a regional MIG, instead of a region, you can specify the zones in that region by using the --zones flag.
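
If you script MIG creation, the zonal and regional commands differ only in the location flag, and the --workload-policy flag is optional. The following sketch builds the argument list for either case. The function name `create_mig_command` is hypothetical, and the sketch covers only the flags shown above, not the --zones variant.

```python
import shlex


def create_mig_command(mig_name, template_url, target_size, *,
                       zone=None, region=None, workload_policy_url=None):
    """Return the gcloud argument list that creates a zonal or regional MIG."""
    if (zone is None) == (region is None):
        raise ValueError("specify exactly one of zone or region")
    if target_size < 0:
        raise ValueError("target_size must not be negative")
    args = [
        "gcloud", "compute", "instance-groups", "managed", "create", mig_name,
        f"--template={template_url}",
        f"--size={target_size}",
    ]
    # The workload policy is optional, so add the flag only when provided.
    if workload_policy_url:
        args.append(f"--workload-policy={workload_policy_url}")
    args.append(f"--zone={zone}" if zone else f"--region={region}")
    return args


def as_shell(args):
    """Render the argument list as one shell-quoted command line for review."""
    return shlex.join(args)
```

You can print `as_shell(...)` to inspect the command, or pass the list to `subprocess.run(args, check=True)` to run it.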

REST

To create a MIG with a specified target size, make a POST request as follows.

The requests to create a MIG use a workload policy to specify VM placement. If you don't want to use a workload policy, then remove the resourcePolicies.workloadPolicy field.

Create a zonal or regional MIG as follows:

To create a zonal MIG, make a POST request to the instanceGroupManagers.insert method.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers
{
  "versions": [
    {
      "instanceTemplate": "INSTANCE_TEMPLATE_URL"
    }
  ],
  "name": "MIG_NAME",
  "instanceTemplate": "INSTANCE_TEMPLATE_URL",
  "targetSize": TARGET_SIZE,
  "resourcePolicies": {
    "workloadPolicy": "WORKLOAD_POLICY_URL"
  }
}

To create a regional MIG, make a POST request to the regionInstanceGroupManagers.insert method.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/instanceGroupManagers
{
  "versions": [
    {
      "instanceTemplate": "INSTANCE_TEMPLATE_URL"
    }
  ],
  "name": "MIG_NAME",
  "instanceTemplate": "INSTANCE_TEMPLATE_URL",
  "targetSize": TARGET_SIZE,
  "resourcePolicies": {
    "workloadPolicy": "WORKLOAD_POLICY_URL"
  }
}

Replace the following:

PROJECT_ID: the project ID.
ZONE: the zone in which you want to create the MIG. If you use a workload policy, then specify a zone within the policy's region.
REGION: the region in which you want to create a MIG. If you use a workload policy, then specify the same region as that of the policy.
INSTANCE_TEMPLATE_URL: the URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
- For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
- For a global instance template: INSTANCE_TEMPLATE_ID
MIG_NAME: the name of the MIG.
TARGET_SIZE: the number of VMs that you want in the MIG.
WORKLOAD_POLICY_URL: Optional: the URL of the workload policy. If you don't want to use a workload policy, then you can remove the resourcePolicies.workloadPolicy field.
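
The zonal and regional requests share the same body and differ only in the URL scope. The following sketch builds the URL and body as Python objects so that the result serializes to valid JSON, with the target size as a number and the workload policy URL as a string. The function name `create_mig_request` is hypothetical, and the sketch doesn't send the request or handle authentication.

```python
import json

API_ROOT = "https://compute.googleapis.com/compute/v1"


def create_mig_request(project_id, mig_name, template_url, target_size, *,
                       zone=None, region=None, workload_policy_url=None):
    """Return (url, body) for creating a zonal or regional MIG."""
    if (zone is None) == (region is None):
        raise ValueError("specify exactly one of zone or region")
    scope = f"zones/{zone}" if zone else f"regions/{region}"
    url = f"{API_ROOT}/projects/{project_id}/{scope}/instanceGroupManagers"
    body = {
        "versions": [{"instanceTemplate": template_url}],
        "name": mig_name,
        "instanceTemplate": template_url,
        "targetSize": int(target_size),
    }
    # Omit resourcePolicies entirely when no workload policy is used.
    if workload_policy_url:
        body["resourcePolicies"] = {"workloadPolicy": workload_policy_url}
    return url, body


def to_json(body):
    """Serialize the body the way it would appear in the POST request."""
    return json.dumps(body, indent=2)
```

Building the body as a dictionary avoids quoting mistakes that are easy to make when you edit the JSON template by hand.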

Create a MIG and a resize request

If your job can start only when all of its VMs are available at once, then create a MIG with no VMs in it, and then create a resize request in the MIG as described in this section.

To create a resize request in a MIG, select one of the following options.

gcloud

The parameters that you need to specify depend on the consumption option that you are using for this deployment. Select the tab that corresponds to your consumption option's provisioning model.

Flex-start

Create a zonal or regional MIG and a resize request as follows:

To create a zonal MIG and a resize request in it, do the following:

Create a zonal MIG using the instance-groups managed create command as follows.

gcloud compute instance-groups managed create MIG_NAME \
    --template=INSTANCE_TEMPLATE_URL \
    --size=0 \
    --default-action-on-vm-failure=do-nothing \
    --zone=ZONE

Create a resize request in the zonal MIG using the instance-groups managed resize-requests create command as follows:

gcloud compute instance-groups managed resize-requests create MIG_NAME \
    --resize-request=RESIZE_REQUEST_NAME \
    --resize-by=COUNT \
    --zone=ZONE

To create a regional MIG and a resize request in it, do the following:

Create a regional MIG using the instance-groups managed create command as follows.

gcloud compute instance-groups managed create MIG_NAME \
    --template=INSTANCE_TEMPLATE_URL \
    --size=0 \
    --default-action-on-vm-failure=do-nothing \
    --zones=ZONE \
    --target-distribution-shape=any-single-zone \
    --instance-redistribution-type=none

Create a resize request in the regional MIG using the beta instance-groups managed resize-requests create command as follows:

gcloud beta compute instance-groups managed resize-requests create MIG_NAME \
    --resize-request=RESIZE_REQUEST_NAME \
    --resize-by=COUNT \
    --region=REGION

Reservation-bound

The commands to create a MIG use a workload policy to specify VM placement. If you don't want to use a workload policy, then remove the --workload-policy flag.

Create a zonal or regional MIG and a resize request as follows:

To create a zonal MIG and a resize request in it, do the following:

Create a zonal MIG using the instance-groups managed create command as follows.

gcloud compute instance-groups managed create MIG_NAME \
    --template=INSTANCE_TEMPLATE_URL \
    --size=0 \
    --workload-policy=WORKLOAD_POLICY_URL \
    --zone=ZONE

Create a resize request in the zonal MIG using the instance-groups managed resize-requests create command as follows:

gcloud compute instance-groups managed resize-requests create MIG_NAME \
    --resize-request=RESIZE_REQUEST_NAME \
    --resize-by=COUNT \
    --zone=ZONE

To create a regional MIG and a resize request in it, do the following:

Create a regional MIG using the instance-groups managed create command as follows.

gcloud compute instance-groups managed create MIG_NAME \
    --template=INSTANCE_TEMPLATE_URL \
    --size=0 \
    --workload-policy=WORKLOAD_POLICY_URL \
    --zones=ZONE \
    --target-distribution-shape=any-single-zone \
    --instance-redistribution-type=none

Create a resize request in the regional MIG using the beta instance-groups managed resize-requests create command as follows:

gcloud beta compute instance-groups managed resize-requests create MIG_NAME \
    --resize-request=RESIZE_REQUEST_NAME \
    --resize-by=COUNT \
    --region=REGION

Spot

The commands to create a MIG use a workload policy to specify VM placement. If you don't want to use a workload policy, then remove the --workload-policy flag.

Create a zonal or regional MIG and a resize request as follows:

To create a zonal MIG and a resize request in it, do the following:

Create a zonal MIG using the instance-groups managed create command as follows.

gcloud compute instance-groups managed create MIG_NAME \
    --template=INSTANCE_TEMPLATE_URL \
    --size=0 \
    --workload-policy=WORKLOAD_POLICY_URL \
    --zone=ZONE

Create a resize request in the zonal MIG using the instance-groups managed resize-requests create command as follows:

gcloud compute instance-groups managed resize-requests create MIG_NAME \
    --resize-request=RESIZE_REQUEST_NAME \
    --resize-by=COUNT \
    --zone=ZONE

To create a regional MIG and a resize request in it, do the following:

Create a regional MIG using the instance-groups managed create command as follows.

gcloud compute instance-groups managed create MIG_NAME \
    --template=INSTANCE_TEMPLATE_URL \
    --size=0 \
    --workload-policy=WORKLOAD_POLICY_URL \
    --zones=ZONE \
    --target-distribution-shape=any-single-zone \
    --instance-redistribution-type=none

Create a resize request in the regional MIG using the beta instance-groups managed resize-requests create command as follows:

gcloud beta compute instance-groups managed resize-requests create MIG_NAME \
    --resize-request=RESIZE_REQUEST_NAME \
    --resize-by=COUNT \
    --region=REGION

Replace the following:

MIG_NAME: the name of the MIG.
INSTANCE_TEMPLATE_URL: the URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
- For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
- For a global instance template: INSTANCE_TEMPLATE_ID
WORKLOAD_POLICY_URL: Optional: the URL of the workload policy. If you don't want to use a workload policy, then you can remove the --workload-policy flag.
ZONE: the zone in which you want to create the MIG. You must specify a zone even for a regional MIG. This zone must contain the profile for your VPC network and must be a zone where the machine type is available. For more information, see Limitations.
RESIZE_REQUEST_NAME: the name of the resize request, which must be unique within the specified MIG. Otherwise, creating the resize request fails.
COUNT: the number of VMs to add to the MIG all at once.
REGION: the region in which the MIG is located.

If your workload requires specific VM names, then you can specify a list of names of VMs to create by using the beta instance-groups managed resize-requests create command. In the command, replace the --resize-by flag with the --instances flag.
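
The command pairs in the preceding tabs vary along two axes: zonal or regional, and flex-start or not. The following sketch captures those differences in one place. The function name `resize_request_commands` and its parameters are hypothetical, and the sketch only builds the argument lists. It assumes, based on the limitations on this page, that a flex-start MIG turns off repairs and can't use a workload policy.

```python
def resize_request_commands(mig_name, template_url, request_name, count, zone,
                            *, region=None, flex_start=False,
                            workload_policy_url=None):
    """Return [create_mig_args, create_resize_request_args] for gcloud."""
    if flex_start and workload_policy_url:
        raise ValueError("a flex-start MIG can't use a workload policy")
    if count < 1:
        raise ValueError("count must be at least 1")
    base = ["instance-groups", "managed"]
    create = ["gcloud", "compute", *base, "create", mig_name,
              f"--template={template_url}", "--size=0"]
    if flex_start:
        # Flex-start requires repairs to be turned off in the MIG.
        create.append("--default-action-on-vm-failure=do-nothing")
    elif workload_policy_url:
        create.append(f"--workload-policy={workload_policy_url}")
    request_flags = [f"--resize-request={request_name}", f"--resize-by={count}"]
    if region:
        # A regional MIG still names a single zone and must not redistribute.
        create += [f"--zones={zone}",
                   "--target-distribution-shape=any-single-zone",
                   "--instance-redistribution-type=none"]
        resize = ["gcloud", "beta", "compute", *base, "resize-requests",
                  "create", mig_name, *request_flags, f"--region={region}"]
    else:
        create.append(f"--zone={zone}")
        resize = ["gcloud", "compute", *base, "resize-requests",
                  "create", mig_name, *request_flags, f"--zone={zone}"]
    return [create, resize]
```

Run the first command to completion before the second, because the resize request targets the MIG that the first command creates.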

REST

The parameters that you need to specify depend on the consumption option that you are using for this deployment. Select the tab that corresponds to your consumption option's provisioning model.

Flex-start

Create a zonal or regional MIG and a resize request as follows:

To create a zonal MIG and a resize request in it, do the following:

Create a zonal MIG by making a POST request to the instanceGroupManagers.insert method as follows.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers
{
  "versions": [
    {
      "instanceTemplate": "INSTANCE_TEMPLATE_URL"
    }
  ],
  "name": "MIG_NAME",
  "targetSize": 0,
  "instanceLifecyclePolicy": {
    "defaultActionOnFailure": "DO_NOTHING"
  }
}

Create a resize request in the zonal MIG by making a POST request to the instanceGroupManagerResizeRequests.insert method as follows:

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME/resizeRequests
{
  "name": "RESIZE_REQUEST_NAME",
  "resizeBy": COUNT
}

To create a regional MIG and a resize request in it, do the following:

Create a regional MIG by making a POST request to the regionInstanceGroupManagers.insert method as follows.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/instanceGroupManagers
{
  "versions": [
    {
      "instanceTemplate": "INSTANCE_TEMPLATE_URL"
    }
  ],
  "name": "MIG_NAME",
  "targetSize": 0,
  "distributionPolicy": {
    "targetShape": "ANY_SINGLE_ZONE",
    "zones": [
      {
        "zone": "projects/PROJECT_ID/zones/ZONE"
      }
    ]
  },
  "updatePolicy": {
    "instanceRedistributionType": "NONE"
  },
  "instanceLifecyclePolicy": {
    "defaultActionOnFailure": "DO_NOTHING"
  }
}

Create a resize request in the regional MIG by making a POST request to the beta.regionInstanceGroupManagerResizeRequests.insert method as follows:

POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/REGION/instanceGroupManagers/MIG_NAME/resizeRequests
{
  "name": "RESIZE_REQUEST_NAME",
  "resizeBy": COUNT
}

Reservation-bound

The requests to create a MIG use a workload policy to specify VM placement. If you don't want to use a workload policy, then remove the resourcePolicies.workloadPolicy field.

Create a zonal or regional MIG and a resize request as follows:

To create a zonal MIG and a resize request in it, do the following:

Create a zonal MIG by making a POST request to the instanceGroupManagers.insert method as follows.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers
{
  "versions": [
    {
      "instanceTemplate": "INSTANCE_TEMPLATE_URL"
    }
  ],
  "name": "MIG_NAME",
  "targetSize": 0,
  "resourcePolicies": {
    "workloadPolicy": "WORKLOAD_POLICY_URL"
  }
}

Create a resize request in the zonal MIG by making a POST request to the instanceGroupManagerResizeRequests.insert method as follows:

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME/resizeRequests
{
  "name": "RESIZE_REQUEST_NAME",
  "resizeBy": COUNT
}

To create a regional MIG and a resize request in it, do the following:

Create a regional MIG by making a POST request to the regionInstanceGroupManagers.insert method as follows.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/instanceGroupManagers
{
  "versions": [
    {
      "instanceTemplate": "INSTANCE_TEMPLATE_URL"
    }
  ],
  "name": "MIG_NAME",
  "targetSize": 0,
  "distributionPolicy": {
    "targetShape": "ANY_SINGLE_ZONE",
    "zones": [
      {
        "zone": "projects/PROJECT_ID/zones/ZONE"
      }
    ]
  },
  "updatePolicy": {
    "instanceRedistributionType": "NONE"
  },
  "resourcePolicies": {
    "workloadPolicy": "WORKLOAD_POLICY_URL"
  }
}

Create a resize request in the regional MIG by making a POST request to the beta.regionInstanceGroupManagerResizeRequests.insert method as follows:

POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/REGION/instanceGroupManagers/MIG_NAME/resizeRequests
{
  "name": "RESIZE_REQUEST_NAME",
  "resizeBy": COUNT
}

Spot

The requests to create a MIG use a workload policy to specify VM placement. If you don't want to use a workload policy, then remove the resourcePolicies.workloadPolicy field.

Create a zonal or regional MIG and a resize request as follows:

To create a zonal MIG and a resize request in it, do the following:

Create a zonal MIG by making a POST request to the instanceGroupManagers.insert method as follows.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers
{
  "versions": [
    {
      "instanceTemplate": "INSTANCE_TEMPLATE_URL"
    }
  ],
  "name": "MIG_NAME",
  "targetSize": 0,
  "resourcePolicies": {
    "workloadPolicy": "WORKLOAD_POLICY_URL"
  }
}

Create a resize request in the zonal MIG by making a POST request to the instanceGroupManagerResizeRequests.insert method as follows:

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME/resizeRequests
{
  "name": "RESIZE_REQUEST_NAME",
  "resizeBy": COUNT
}

To create a regional MIG and a resize request in it, do the following:

Create a regional MIG by making a POST request to the regionInstanceGroupManagers.insert method as follows.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/instanceGroupManagers
{
  "versions": [
    {
      "instanceTemplate": "INSTANCE_TEMPLATE_URL"
    }
  ],
  "name": "MIG_NAME",
  "targetSize": 0,
  "distributionPolicy": {
    "targetShape": "ANY_SINGLE_ZONE",
    "zones": [
      {
        "zone": "projects/PROJECT_ID/zones/ZONE"
      }
    ]
  },
  "updatePolicy": {
    "instanceRedistributionType": "NONE"
  },
  "resourcePolicies": {
    "workloadPolicy": "WORKLOAD_POLICY_URL"
  }
}

Create a resize request in the regional MIG by making a POST request to the beta.regionInstanceGroupManagerResizeRequests.insert method as follows:

POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/REGION/instanceGroupManagers/MIG_NAME/resizeRequests
{
  "name": "RESIZE_REQUEST_NAME",
  "resizeBy": COUNT
}

Replace the following:

PROJECT_ID: the project ID.
ZONE: the zone in which you want to create the MIG. You must specify a zone even for a regional MIG. This zone must contain the profile for your VPC network and must be a zone where the machine type is available. For more information, see Limitations.
REGION: the region in which you want to create the MIG.
INSTANCE_TEMPLATE_URL: the URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
- For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
- For a global instance template: INSTANCE_TEMPLATE_ID
MIG_NAME: the name of the MIG.
WORKLOAD_POLICY_URL: Optional: the URL of the workload policy. If you don't want to use a workload policy, then you can remove the resourcePolicies.workloadPolicy field.
RESIZE_REQUEST_NAME: the name of the resize request, which must be unique within the specified MIG. Otherwise, creating the resize request fails.
COUNT: the number of VMs to add to the MIG all at once.

If your workload requires specific VM names, then you can specify a list of names of VMs to create. To do so, send a POST request to the beta.regionInstanceGroupManagerResizeRequests.insert method for a regional MIG, or the beta.instanceGroupManagerResizeRequests.insert method for a zonal MIG. In the request body, replace the resizeBy field with the instanceNames field.
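
The regional request bodies in the preceding tabs share most of their fields. The following sketch builds the regional MIG body and the matching resize request as Python objects. The function names are hypothetical, the sketch covers only the resizeBy form of the resize request, and it doesn't send the requests or handle authentication.

```python
BETA_ROOT = "https://compute.googleapis.com/compute/beta"


def regional_mig_body(project_id, mig_name, template_url, zone, *,
                      flex_start=False, workload_policy_url=None):
    """Return the request body for a regional MIG that starts with no VMs."""
    if flex_start and workload_policy_url:
        raise ValueError("a flex-start MIG can't use a workload policy")
    body = {
        "versions": [{"instanceTemplate": template_url}],
        "name": mig_name,
        "targetSize": 0,
        # A regional MIG is still pinned to a single zone.
        "distributionPolicy": {
            "targetShape": "ANY_SINGLE_ZONE",
            "zones": [{"zone": f"projects/{project_id}/zones/{zone}"}],
        },
        "updatePolicy": {"instanceRedistributionType": "NONE"},
    }
    if flex_start:
        # Flex-start requires repairs to be turned off in the MIG.
        body["instanceLifecyclePolicy"] = {"defaultActionOnFailure": "DO_NOTHING"}
    elif workload_policy_url:
        body["resourcePolicies"] = {"workloadPolicy": workload_policy_url}
    return body


def regional_resize_request(project_id, region, mig_name, request_name, count):
    """Return (url, body) for a resize request in a regional MIG."""
    if count < 1:
        raise ValueError("count must be at least 1")
    url = (f"{BETA_ROOT}/projects/{project_id}/regions/{region}"
           f"/instanceGroupManagers/{mig_name}/resizeRequests")
    return url, {"name": request_name, "resizeBy": count}
```

Because the resize request name must be unique within the MIG, you might derive `request_name` from a job identifier or timestamp rather than reusing a fixed string.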
