Compute Engine instances provisioning models

This document describes the provisioning models for Compute Engine instances. To learn more about deployment options, see Choose a Compute Engine deployment strategy for your workload.

Provisioning models determine the availability, lifespan, and pricing of your instances. Understanding these models helps you choose the best option for your workload.

Available provisioning models

When you create a compute instance, you can specify one of the following provisioning models. If you don't specify a provisioning model, then Compute Engine uses the standard provisioning model by default.

Standard
Spot
Flex-start (Preview)
Reservation-bound
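
As a rough illustration of the default behavior described above, the following Python sketch builds the scheduling portion of an instance-creation request. The `provisioningModel` field name and the values `STANDARD`, `SPOT`, `FLEX_START`, and `RESERVATION_BOUND` are assumptions based on the scheduling settings of the Compute Engine API, and `build_scheduling` is a hypothetical helper, so check both against the current API reference before relying on them.

```python
from typing import Optional

# Assumed API enum values for the four provisioning models listed above.
PROVISIONING_MODELS = ("STANDARD", "SPOT", "FLEX_START", "RESERVATION_BOUND")


def build_scheduling(model: Optional[str] = None) -> dict:
    """Return a scheduling block for an instance-creation request.

    Hypothetical helper. When no model is given, the field is omitted so
    that Compute Engine applies the standard provisioning model by default.
    """
    if model is None:
        return {}
    normalized = model.upper().replace("-", "_")
    if normalized not in PROVISIONING_MODELS:
        raise ValueError(f"Unknown provisioning model: {model!r}")
    return {"provisioningModel": normalized}


if __name__ == "__main__":
    print(build_scheduling())              # no model specified
    print(build_scheduling("spot"))
    print(build_scheduling("flex-start"))
```

Omitting the field, rather than always writing `STANDARD`, keeps the request minimal and leaves the default to Compute Engine.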

The following table helps you compare the use cases and pricing for each provisioning model:

	Standard	Spot	Flex-start (Preview)	Reservation-bound
Summary	Based on resource availability, you can immediately create instances. You can control when to stop or delete instances.	Based on resource availability, you can immediately create instances. You can control when to stop or delete instances. However, you also allow Compute Engine to stop or delete instances at any time to reclaim capacity.	After you create a zonal managed instance group (MIG), you request that Compute Engine add instances that have GPUs to the MIG. Compute Engine schedules the provisioning of the instances based on resource availability. You can control when to delete instances. However, you can't stop, suspend, or recreate them. The instances run for up to seven days. Then, Compute Engine deletes them.	You can request to reserve capacity at a future date for creating instances with GPUs attached. If Google Cloud approves your request, then Compute Engine creates a reservation. At the start of the reservation period, you can consume the reservation by creating GPU instances that match the reservation. During the approved reservation period, you can stop, restart, delete, and recreate instances to consume the reservation as needed. When the reservation period ends, Compute Engine deletes the reservation, and stops or deletes any instances that consume the reservation.
Use cases	Ideal for workloads that require stability and continuous operation, such as the following workloads: web servers, databases, enterprise applications, and development and testing.	Ideal for workloads that can tolerate interruptions, such as the following workloads: batch processing, high performance computing (HPC), continuous integration and continuous deployment (CI/CD), data analytics, media encoding, and online inference.	Workloads that require stability and need to run for no more than seven days, such as the following workloads: small model pre-training, model fine-tuning, HPC simulation, and batch inference.	Ideal for workloads that require stability and a specific run time, such as the following: For workloads that last up to 90 days: model pre-training jobs, model fine-tuning jobs, HPC simulation workloads, and short-term expected increases in inference workloads. For workloads longer than 90 days: training workloads and inference workloads.
Pricing	You incur standard pricing for instances. For more information, see VM instance pricing.	Most vCPUs, GPUs, and Local SSD are available at a 60-91% discount. For more information, see Spot VMs pricing.	Based on the machine family that your instances use, you get up to a 53% discount for vCPUs and GPUs. See Dynamic Workload Scheduler (DWS) pricing.	Based on the machine family that your instances use, you get up to a 53% discount for vCPUs and GPUs. Additionally, you incur charges based on how you reserve capacity for creating instances as follows: If you reserve capacity in AI Hypercomputer, then you incur charges based on accelerator-optimized VMs pricing. If you reserve capacity by using future reservations in calendar mode, then you incur charges based on the Dynamic Workload Scheduler (DWS) pricing.
Quota	When you create an instance, standard quota is consumed.	When you create an instance, preemptible quota is consumed. If your project lacks preemptible quota, then standard quota is consumed. Google Cloud Free Tier credits don't apply to Spot VMs.	When the MIG adds instances to the group, preemptible quota is consumed. If your project lacks preemptible quota, then standard quota is consumed.	Quota doesn't apply to the reservation-bound provisioning model.
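
To make the discount figures in the pricing row concrete, here is a minimal arithmetic sketch in Python. The hourly price of 1.00 USD is a hypothetical placeholder, not a real list price, and `discounted_price` is an illustrative helper. The sketch also ignores the additional reservation charges that apply to the reservation-bound model, and actual discounts vary by resource and machine family.

```python
def discounted_price(standard_price: float, discount_percent: float) -> float:
    """Apply a percentage discount to a standard (on-demand) price."""
    if not 0 <= discount_percent <= 100:
        raise ValueError("discount_percent must be between 0 and 100")
    return round(standard_price * (100 - discount_percent) / 100, 4)


# Hypothetical standard price in USD per hour; not a real list price.
STANDARD_HOURLY = 1.00

# Spot: a 60-91% discount, so the effective price falls within a range.
spot_high = discounted_price(STANDARD_HOURLY, 60)  # smallest discount
spot_low = discounted_price(STANDARD_HOURLY, 91)   # largest discount

# Flex-start and reservation-bound: up to 53% off, so this is a lower bound.
flex_floor = discounted_price(STANDARD_HOURLY, 53)

print(f"Spot: ${spot_low:.2f} to ${spot_high:.2f} per hour")
print(f"Flex-start or reservation-bound: at least ${flex_floor:.2f} per hour")
```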

Instance availability and lifespan

The following table compares instance availability and lifespan for each provisioning model:

	Standard	Spot	Flex-start (Preview)	Reservation-bound
Creation prerequisites	No creation prerequisites.	No creation prerequisites.	No creation prerequisites.	To create instances, you must first reserve capacity using one of the following methods: To reserve capacity for long-running workloads, use future reservations in AI Hypercomputer. To reserve capacity for workloads that run for up to 90 days, use future reservations in calendar mode. At your chosen delivery date and time, Compute Engine provisions your requested capacity. Then, you can consume the capacity by creating instances.
Supported machine series	You can use any machine series, except A4X, A4, and A3 Ultra.	You can use any machine series, except the following: M2 and M3; bare metal instances.	You can only use the following machine series: accelerator-optimized machine series; N1 virtual machine (VM) instances with GPUs attached.	Based on how you reserve capacity to create VMs, you can only use the following machine series: If you reserve capacity in AI Hypercomputer, then you can only use A4X, A4, and A3 Ultra machine series. If you create a future reservation in calendar mode, then you can only use A4 and A3 Ultra machine series.
Instance availability	You can create instances at any time, as long as your requested resources are available.	You can create instances at any time, as long as your requested resources are available.	You can only create instances by creating resize requests in a MIG. Compute Engine uses DWS to schedule the provisioning of your requested capacity based on resource availability. DWS helps you obtain high-demand resources like GPUs.	You can only create instances after reserving capacity for a future date. On your requested date, Compute Engine delivers your requested capacity, which you can then use to create instances. If you reserve resources using future reservations in calendar mode, then Compute Engine uses DWS to provision your requested capacity. DWS helps you obtain high-demand resources like GPUs.
Instance lifespan	You can control when to stop or delete an instance, except in the following cases: If the machine type that the instance uses doesn't support live migration, then Compute Engine stops your instances during host maintenance events. In rare cases, the instance may stop due to a host error.	You can control when to stop or delete an instance, except in the following cases: Compute Engine might stop or delete the instance at any time to reclaim capacity. This process is called preemption. If the machine type that the instance uses doesn't support live migration, then Compute Engine stops your instances during host maintenance events. In rare cases, the instance may stop due to a host error.	The provisioned instances run for your chosen run duration, which can be up to seven days. You can't stop, suspend, or recreate instances. Compute Engine deletes instances when one of the following happens: You request to delete instances. The instances reach the end of their run duration.	You can control when to stop or delete an instance, except in the following cases: Compute Engine stops your instance during host maintenance events. The automatically created reservation to provision your requested capacity reaches the end of its committed reservation period. At that time, Compute Engine deletes the reservation, and stops or deletes any instances that consume the reservation. In rare cases, the instance may stop due to a host error.
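
The flex-start lifespan rule in the table above (a chosen run duration of up to seven days, after which Compute Engine deletes the instances) can be sketched in Python as follows. The function name and the start time are hypothetical. The sketch only illustrates the arithmetic and doesn't reflect how Compute Engine tracks run durations internally.

```python
from datetime import datetime, timedelta, timezone

MAX_RUN_DURATION = timedelta(days=7)  # upper bound stated in the table above


def flex_start_deletion_time(started_at: datetime, run_duration: timedelta) -> datetime:
    """Return when a flex-start instance reaches the end of its run duration."""
    if run_duration <= timedelta(0):
        raise ValueError("run_duration must be positive")
    if run_duration > MAX_RUN_DURATION:
        raise ValueError("run_duration can't exceed seven days")
    return started_at + run_duration


if __name__ == "__main__":
    # Hypothetical start time for illustration.
    start = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
    print(flex_start_deletion_time(start, timedelta(days=3)))
```

Because flex-start instances can't be stopped, suspended, or recreated, a workload that needs more than the computed window has to checkpoint its progress and request new instances.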

What's next

Read an overview of creating Compute Engine instances.
To create instances by using the spot provisioning model, see Spot VMs.
To create instances by using the flex-start provisioning model, see About resize requests in a MIG.
To reserve capacity to create instances by using the reservation-bound model, see one of the following options:
- About future reservation requests in calendar mode
- Reserve capacity in AI Hypercomputer
