Create a MIG with GPU VMs

This document describes how to create a managed instance group (MIG) with virtual machine (VM) instances that have attached GPUs. Specifically, it describes how to add GPU VMs all at once in a zonal MIG by using resize requests and the flex-start provisioning model. The VMs that you create by using the flex-start provisioning model are called Flex-start VMs. If you want to create a MIG resize request that consumes a reservation, see the following instead:

To consume a reservation for a future reservation in AI Hypercomputer, see Create a MIG and a resize request in the AI Hypercomputer documentation.
To consume a reservation for a future reservation in calendar mode, see Create a resize request in a MIG.

Use a MIG resize request with the flex-start provisioning model to increase your chances of obtaining GPU Flex-start VMs. In the request, you must specify the number of GPU Flex-start VMs that you want to create. Dynamic Workload Scheduler (DWS), the underlying scheduler mechanism, makes best-effort attempts to schedule resize requests created across Compute Engine based on requested durations and resource availability. If your requested resources become available, then the MIG creates the Flex-start VMs.

If your job finishes earlier than the requested duration, then you can delete the created Flex-start VMs. Otherwise, the MIG deletes Flex-start VMs at the end of their run duration.

You can also read about other basic scenarios for creating a MIG.

Before you begin

To make sure that you have sufficient GPU quota for the resources you're requesting, check your GPU quota.
To understand quota consumption, read GPU VMs and preemptible allocation quotas.
If you haven't already, set up authentication. Authentication verifies your identity for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:
Console

When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
1. Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
  gcloud init
  If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
  
  Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
2. Set a default region and zone.
REST

To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Limitations

Review the limitations for creating a MIG resize request.

Create a MIG and add GPU VMs all at once

To create a MIG and add GPU Flex-start VMs all at once in the group, do the following:

Create an instance template, which is required to create a MIG. The MIG creates each VM in the group based on the instance template. In the template, specify the configuration for GPU Flex-start VMs and additional configurations required to use resize requests.

For more information about instance templates, see About instance templates.
Create a MIG and a resize request to add GPU Flex-start VMs all at once.

Create an instance template

Create an instance template that specifies a supported GPU machine series for MIG resize requests, as described in this section. Then, use the template to create a MIG.

Note: If you want to run data science or machine learning workloads, consider using a Deep Learning VM image when you create an instance template. Deep Learning VM Images is a set of prepackaged VM images that comes with machine learning frameworks and essential tools. For more information about these images, see Choose an image in the Deep Learning VM Images documentation.

Permissions required for this task

To perform this task, you must have the following permissions:

All permissions required to call the instanceTemplates.insert method.

Console

Go to the Instance templates page.

Click Create instance template. The Create an instance template page opens.
In the Name field, enter a name for the instance template.
In the Machine configuration section, do the following:
1. Click the GPUs tab.
2. In the GPU type list, select the GPU type.
3. In the Number of GPUs list, select the number of GPUs.
4. In the Machine type section, select a machine type.
In the Provisioning model section, do the following:
1. In the VM provisioning model list, select Flex-start.
  
  Note: When you select the flex-start provisioning model, you can't use reservations. The Google Cloud console automatically selects the Don't use a reservation option in the Advanced options > Management > Reservations section.
2. To set a run duration for the VMs created through the instance template, in the Enter number of hours field, enter the number of hours. The value must be between one hour (1) and seven days (168).
Optional: To change the default boot disk type or image, in the Boot disk section, click Change. Then, follow the prompts to change the boot disk.
Click Create.

gcloud

Create an instance template by using the instance-templates create command:

gcloud compute instance-templates create INSTANCE_TEMPLATE_NAME \
    --image-project=IMAGE_PROJECT \
    --image-family=IMAGE_FAMILY \
    --instance-termination-action=DELETE \
    --instance-template-region=REGION \
    --machine-type=MACHINE_TYPE \
    --maintenance-policy=TERMINATE \
    --max-run-duration=RUN_DURATION \
    --provisioning-model=FLEX_START \
    --reservation-affinity=none

Replace the following:

INSTANCE_TEMPLATE_NAME: the name of the instance template.
IMAGE_PROJECT: the image project that contains the image—for example, debian-cloud. For more information about the supported image projects, see Public images.
IMAGE_FAMILY: an image family. This specifies the most recent, non-deprecated OS image. For example, if you specify debian-12, the latest version in the Debian 12 image family is used. For more information about using image families, see Image families best practices.

Note: If you want to use a specific version of the OS image, such as debian-12-bookworm-v20240701, then replace the --image-family flag with the --image flag.
REGION: the region in which to create the instance template.
MACHINE_TYPE: a GPU machine type. If you specify an N1 machine type, then include the --accelerator flag to specify the number and type of GPUs to attach to your VMs.
RUN_DURATION: the duration you want the requested VMs to run. You must format the value as the number of days, hours, minutes, or seconds followed by d, h, m, or s respectively. For example, specify 30m for 30 minutes or 1d2h3m4s for one day, two hours, three minutes, and four seconds. The value must be between 10 minutes and seven days.
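
The duration format described above maps to a plain number of seconds, which is the unit the REST API uses for the same setting. The following is a minimal Python sketch of that arithmetic, not part of the gcloud CLI. The helper name `parse_run_duration` is hypothetical, and the sketch assumes that units appear in d, h, m, s order with each unit used at most once. The limits come from the 10-minute and seven-day bounds stated above.

```python
import re

# Limits taken from the text above: 10 minutes to seven days.
MIN_SECONDS = 10 * 60
MAX_SECONDS = 7 * 24 * 60 * 60

# Assumption: units appear in d, h, m, s order, each at most once.
_DURATION_RE = re.compile(r"^(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")


def parse_run_duration(value: str) -> int:
    """Convert a string such as '1d2h3m4s' to seconds and check the range."""
    match = _DURATION_RE.match(value)
    if not value or match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    total = days * 86400 + hours * 3600 + minutes * 60 + seconds
    if not MIN_SECONDS <= total <= MAX_SECONDS:
        raise ValueError(
            f"{value!r} is {total} seconds; expected {MIN_SECONDS} to {MAX_SECONDS}"
        )
    return total


print(parse_run_duration("30m"))       # 1800
print(parse_run_duration("1d2h3m4s"))  # 93784
```

Validating the value locally, before you pass it to `--max-run-duration`, catches out-of-range durations such as `5m` or `8d` without a round trip to the API.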

REST

Create an instance template by making a POST request to the instanceTemplates.insert method:

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/instanceTemplates

{
  "name": "INSTANCE_TEMPLATE_NAME",
  "properties": {
    "disks": [
      {
        "boot": true,
        "initializeParams": {
          "sourceImage": "projects/IMAGE_PROJECT/global/images/IMAGE"
        }
      }
    ],
    "machineType": "MACHINE_TYPE",
    "networkInterfaces": [
      {
        "network": "global/networks/default"
      }
    ],
    "reservationAffinity": {
      "consumeReservationType": "NO_RESERVATION"
    },
    "scheduling": {
      "instanceTerminationAction": "DELETE",
      "maxRunDuration": {
        "seconds": RUN_DURATION
      },
      "onHostMaintenance": "TERMINATE",
      "provisioningModel": "FLEX_START"
    }
  }
}

Replace the following:

PROJECT_ID: the ID of the project in which you want to create the instance template.
REGION: the region in which to create the instance template.
INSTANCE_TEMPLATE_NAME: the name of the instance template.
IMAGE_PROJECT: the image project that contains the image—for example, debian-cloud. For more information about the supported image projects, see Public images.
IMAGE: specify one of the following:
- A specific version of the OS image—for example, debian-12-bookworm-v20240617.
- An image family, which must be formatted as family/IMAGE_FAMILY. This specifies the most recent, non-deprecated OS image. For example, if you specify family/debian-12, the latest version in the Debian 12 image family is used. For more information about using image families, see Image families best practices.
MACHINE_TYPE: a GPU machine type. If you specify an N1 machine type, then include the guestAccelerators field to specify the number and type of GPUs to attach to your VMs.
RUN_DURATION: the duration, in seconds, you want the requested VMs to run before the MIG automatically deletes them. The value must be between 600 (10 minutes) and 604800 (seven days).
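
As an illustration of how the placeholders fit together, the following Python sketch assembles the request URL and JSON body shown above. It only builds and prints the request and sends nothing. The function name `build_instance_template_request` and all example values (project, template name, machine type) are hypothetical placeholders, so confirm that the machine type you choose is supported for MIG resize requests.

```python
import json

BASE_URL = "https://compute.googleapis.com/compute/v1"


def build_instance_template_request(project_id, region, name, image_project,
                                    image, machine_type, run_duration_seconds):
    """Return (url, body) mirroring the instanceTemplates.insert request above."""
    # Range taken from the RUN_DURATION description: 600 to 604800 seconds.
    if not 600 <= run_duration_seconds <= 604800:
        raise ValueError("run duration must be between 600 and 604800 seconds")
    url = f"{BASE_URL}/projects/{project_id}/regions/{region}/instanceTemplates"
    body = {
        "name": name,
        "properties": {
            "disks": [{
                "boot": True,
                "initializeParams": {
                    # IMAGE is either a specific image or family/IMAGE_FAMILY.
                    "sourceImage": f"projects/{image_project}/global/images/{image}",
                },
            }],
            "machineType": machine_type,
            "networkInterfaces": [{"network": "global/networks/default"}],
            "reservationAffinity": {"consumeReservationType": "NO_RESERVATION"},
            "scheduling": {
                "instanceTerminationAction": "DELETE",
                "maxRunDuration": {"seconds": run_duration_seconds},
                "onHostMaintenance": "TERMINATE",
                "provisioningModel": "FLEX_START",
            },
        },
    }
    return url, body


# Placeholder values for illustration only.
url, body = build_instance_template_request(
    project_id="example-project", region="us-central1",
    name="example-flex-template", image_project="debian-cloud",
    image="family/debian-12", machine_type="a2-highgpu-1g",
    run_duration_seconds=4 * 60 * 60,
)
print("POST", url)
print(json.dumps(body, indent=2))
```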

After you create the instance template, you can view it to see its ID and review its instance properties.

Create a MIG and a resize request

Create a MIG as described in this section. To create a resize request in the MIG, you must not configure autoscaling and must turn off repairs.

Permissions required for this task

To perform this task, you must have the following permissions:

All permissions required to call the instanceGroupManagers.insert method.

Console

Go to the Instance groups page.

Click Create instance group. The Create instance group page opens.
In the Name field, enter a name for the MIG.
Before you select an instance template, you must delete the autoscaling configuration and turn off repairs as follows:
1. To delete the autoscaling configuration, do the following:
  1. In the Autoscaling section, click the Autoscaling mode list, and then click Delete autoscaling configuration.
  2. In the confirmation dialog, click Delete.
2. To turn off repairs, in the VM instance lifecycle section, click the Default action on failure list, and then select No action.
Go back to the Instance template field. In the Instance template list, select the instance template that you created in the previous section.
Do one of the following:
- To create a resize request with the MIG, do the following:
  1. In the Number of instances field, enter the number of Flex-start VMs that you want to create all at once.
  2. Select the Use resize request to create VMs all at once checkbox.
  3. Optional: To specify a different run duration for the VMs than the one set in the instance template, in the Requested run duration field and Unit lists, specify a duration. The duration must be between one hour and seven days.
- To create a resize request after you create the MIG, in the Number of instances field, enter 0.
In the Location section, specify whether you want to create a zonal or a regional MIG as follows:
1. To create a zonal MIG, select Single zone. Or, to create a regional MIG, select Multiple zones.
2. Select the Region and Zones of the MIG.
3. If you're creating a regional MIG, then do the following:
  1. In the Target distribution shape field, select Any single zone.
  2. In the dialog that appears, click Disable instance redistribution.
Click Create.

gcloud

Create a zonal MIG using the instance-groups managed create command:

gcloud compute instance-groups managed create INSTANCE_GROUP_NAME \
   --template=INSTANCE_TEMPLATE_URL \
   --size=0 \
   --zone=ZONE \
   --default-action-on-vm-failure=do_nothing

In the MIG, create a resize request by using the instance-groups managed resize-requests create command. Specify the number of GPU Flex-start VMs that you want. Because the following command doesn't set a run duration, the VMs run for the duration set in the instance template.
```
gcloud compute instance-groups managed resize-requests create INSTANCE_GROUP_NAME \
   --resize-request=RESIZE_REQUEST_NAME \
   --resize-by=COUNT \
   --zone=ZONE
```

Replace the following:

INSTANCE_GROUP_NAME: the name of the MIG.
INSTANCE_TEMPLATE_URL: the URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
- For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
- For a global instance template: INSTANCE_TEMPLATE_ID
ZONE: one of the zones available for Compute Engine.
RESIZE_REQUEST_NAME: the name of the resize request.
COUNT: the number of Flex-start VMs to add all at once in the group.
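
If you script these two steps, the order matters: the MIG must exist, with size 0 and repairs turned off, before you create the resize request. The following Python sketch assembles both commands as argument lists, using only the flags shown above. It prints the commands rather than running them. The function names and example values are hypothetical.

```python
import shlex


def mig_create_args(group, template_url, zone):
    """Arguments for creating an empty zonal MIG with repairs turned off."""
    return [
        "gcloud", "compute", "instance-groups", "managed", "create", group,
        f"--template={template_url}",
        "--size=0",
        f"--zone={zone}",
        "--default-action-on-vm-failure=do_nothing",
    ]


def resize_request_args(group, request_name, count, zone):
    """Arguments for requesting COUNT Flex-start VMs all at once."""
    if count < 1:
        raise ValueError("count must be a positive number of VMs")
    return [
        "gcloud", "compute", "instance-groups", "managed",
        "resize-requests", "create", group,
        f"--resize-request={request_name}",
        f"--resize-by={count}",
        f"--zone={zone}",
    ]


# Placeholder values for illustration only.
commands = [
    mig_create_args("example-mig", "example-flex-template", "us-central1-a"),
    resize_request_args("example-mig", "example-request", 4, "us-central1-a"),
]
for args in commands:
    # To execute a command, pass the list to subprocess.run(args, check=True).
    print(shlex.join(args))
```

Keeping each command as a list avoids shell quoting problems when names come from variables.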

REST

Create a zonal MIG by making a POST request to the instanceGroupManagers.insert method.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers

{
 "versions": [
   {
     "instanceTemplate": "INSTANCE_TEMPLATE_URL"
   }
 ],
 "name": "INSTANCE_GROUP_NAME",
 "targetSize": 0,
 "instanceLifecyclePolicy": {
   "defaultActionOnFailure": "DO_NOTHING"
 }
}

In the MIG, create a resize request by making a POST request to the instanceGroupManagerResizeRequests.insert method. In the request body, specify the number of GPU Flex-start VMs that you want to create all at once. Because the following request body doesn't set a run duration, the VMs run for the duration set in the instance template.
```
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/INSTANCE_GROUP_NAME/resizeRequests

{
 "name": "RESIZE_REQUEST_NAME",
 "resizeBy": COUNT
}
```

Replace the following:

PROJECT_ID: the ID of the project in which you want to create the MIG.
INSTANCE_GROUP_NAME: the name of the MIG.
INSTANCE_TEMPLATE_URL: the URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
- For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
- For a global instance template: INSTANCE_TEMPLATE_ID
ZONE: one of the zones available for Compute Engine.
RESIZE_REQUEST_NAME: the name of the resize request.
COUNT: the number of Flex-start VMs to add all at once in the group.
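
The two REST requests above can be assembled programmatically. The following Python sketch builds the URL and body for each call and prints them without sending anything. The function names and example values are hypothetical, and the regional template path follows the format listed above.

```python
import json

BASE_URL = "https://compute.googleapis.com/compute/v1"


def regional_template_url(project_id, region, template_id):
    """Path of a regional instance template, in the format listed above."""
    return f"projects/{project_id}/regions/{region}/instanceTemplates/{template_id}"


def build_mig_request(project_id, zone, group, template_url):
    """Return (url, body) for instanceGroupManagers.insert."""
    url = f"{BASE_URL}/projects/{project_id}/zones/{zone}/instanceGroupManagers"
    body = {
        "versions": [{"instanceTemplate": template_url}],
        "name": group,
        "targetSize": 0,  # Start empty; the resize request adds the VMs.
        "instanceLifecyclePolicy": {"defaultActionOnFailure": "DO_NOTHING"},
    }
    return url, body


def build_resize_request(project_id, zone, group, request_name, count):
    """Return (url, body) for instanceGroupManagerResizeRequests.insert."""
    url = (f"{BASE_URL}/projects/{project_id}/zones/{zone}"
           f"/instanceGroupManagers/{group}/resizeRequests")
    return url, {"name": request_name, "resizeBy": count}


# Placeholder values for illustration only.
template = regional_template_url("example-project", "us-central1",
                                 "example-flex-template")
mig_url, mig_body = build_mig_request(
    "example-project", "us-central1-a", "example-mig", template)
rr_url, rr_body = build_resize_request(
    "example-project", "us-central1-a", "example-mig", "example-request", 4)

for url, body in ((mig_url, mig_body), (rr_url, rr_body)):
    print("POST", url)
    print(json.dumps(body, indent=1))
```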

The resize request that you create stays in the ACCEPTED state until the MIG creates all the requested GPU Flex-start VMs. After all GPU Flex-start VMs are created in the group, the state of the request changes to SUCCEEDED.
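
One way to act on this state change is to poll until the request leaves the ACCEPTED state. The following Python sketch shows such a loop under stated assumptions: `get_state` is a hypothetical callable that you supply (for example, a wrapper that fetches the resize request and returns its state), and only the ACCEPTED and SUCCEEDED states named above are assumed. If your requests report other in-progress states, add them to `pending_states`.

```python
import time
from typing import Callable, FrozenSet


def wait_for_resize_request(
    get_state: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    pending_states: FrozenSet[str] = frozenset({"ACCEPTED"}),
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the state is no longer pending, then return that state."""
    deadline = clock() + timeout_seconds
    while True:
        state = get_state()
        if state not in pending_states:
            return state  # The caller checks whether this is "SUCCEEDED".
        if clock() >= deadline:
            raise TimeoutError(f"still {state} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Demonstration with canned states instead of real API calls.
fake_states = iter(["ACCEPTED", "ACCEPTED", "SUCCEEDED"])
waits = []
final_state = wait_for_resize_request(
    lambda: next(fake_states), poll_seconds=1, sleep=waits.append)
print(final_state, "after", len(waits), "waits")
```

Because requests are scheduled on a best-effort basis, choose a timeout that matches how long you are willing to wait for capacity.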

What's next

Learn how resize requests work in a MIG.
Learn how to create a regional MIG that is compatible with resize requests (Preview).
Learn how to view, cancel, or delete resize requests in a MIG.
Learn how to view info about MIGs and managed VMs.

Learn how to view the actual and forecasted usage of your VMs and GPUs.