About regional MIGs

A managed instance group (MIG) that spreads its VMs across multiple zones in a region is also known as a regional MIG. A MIG that is confined to a single zone is also known as a zonal MIG.

You can use a regional MIG to increase the resilience of your MIG-based workload. Spreading your workload across multiple zones in a region helps protect you from extreme cases where all instances in a single zone fail.

This document contains conceptual information about regional MIGs.

To learn how to create a regional MIG, see Creating a MIG in multiple zones.

Why choose regional managed instance groups?

Google recommends regional MIGs over zonal MIGs for the following reasons:

  • You can use regional MIGs to manage up to 2,000 instances, twice as many as zonal MIGs. If you need more, you can further increase the size limit of a regional MIG to 4,000 instances.
  • You can use regional MIGs to spread your application load across multiple zones, rather than confining your application to a single zone or managing multiple zonal MIGs across different zones.

Using multiple zones protects against zonal failures and unforeseen scenarios where an entire group of instances in a single zone malfunctions. If that happens, your application can continue serving traffic from instances running in another zone in the same region.

In the case of a zonal failure, or if a group of instances in a zone stops responding, a regional MIG continues supporting your instances as follows:

  • The instances that are part of the regional MIG in the remaining zones continue to serve traffic. No new instances are added and no instances are redistributed (unless you set up autoscaling).

  • After the failed zone has recovered, the MIG starts serving traffic again from that zone.

When designing for robust and scalable applications, use regional MIGs.

Additional configuration options for regional MIGs

Creating a regional MIG is similar to creating a zonal MIG, except that you have additional options.

These options are described in the following sections.

Zone selection

By default, a regional MIG distributes its managed instances evenly across three zones. For various reasons, you might want to select specific zones for your application. For example, if you require GPUs for your instances, you might only select zones that support GPUs, or you might have existing persistent disks or reservations that are only available in certain zones.

If you want to choose the number of zones or choose the specific zones the group runs in, you must do that when you first create the group. After you choose specific zones during creation, you cannot change or update the zones later.

If you want your MIG to automatically use zones that support the hardware that you specify in your MIG's configuration, you can set the MIG's target distribution shape to BALANCED, ANY, or ANY_SINGLE_ZONE and select all zones in a region. The MIG automatically checks for resource availability and schedules instances only in zones that have the resources. For more information, see Target distribution shape.

  • To select more than three zones within a region, you must explicitly specify the individual zones. For example, to select all four zones within a region, you must provide all four zones explicitly in your request. If you don't, Compute Engine selects three zones by default.

  • To select two or fewer zones in a region, you must explicitly specify the individual zones. Even if the region only contains two zones, you must still explicitly specify the zones in your request.

Google regularly expands its infrastructure by making specialized hardware available in more zones. A regional MIG periodically checks hardware availability and automatically starts scheduling instances in zones which support required machines. If for any reason you don't want to run your instances in some zones, don't select those zones when creating your group.

To learn how to create a regional MIG and select zones, see Creating a regional MIG.

Target distribution shape

By default, a regional MIG distributes its managed instances evenly across selected zones. But if you need hardware that is not available in all zones, or if you need to prioritize the use of zonal reservations, you might prefer a different distribution.

To configure how your regional MIG distributes its instances across selected zones within a region, set the MIG's target distribution shape. The following options are available:

  • EVEN (default): the group creates and deletes VMs to achieve and maintain the same number of VMs across the selected zones. In an EVEN distribution, the number of VMs does not differ by more than 1 between any two zones. Recommended for highly available serving workloads.
  • BALANCED: the group prioritizes creation of VMs in zones where resources are available, while distributing VMs as evenly as possible across selected zones to minimize the impact of zonal failure. Recommended for highly available serving or batch workloads.
  • ANY: the group picks zones for creating VM instances to fulfill the requested number of VMs within present resource constraints and to maximize utilization of unused zonal reservations. Recommended for batch workloads that do not require high availability.
  • ANY_SINGLE_ZONE: the group creates all VM instances within a single zone. The zone is chosen based on hardware support, current resource and quota availability, and matching reservations. Recommended in combination with a compact instance placement policy for workloads that require extensive communication between VMs.
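The EVEN constraint described above (no two zones differ by more than one VM) can be illustrated with a small model. This is a minimal sketch, not the Compute Engine implementation: the function name `even_distribution` and the zone names are hypothetical, and the choice of which zones receive the remainder is an assumption made only for illustration.

```python
def even_distribution(size, zones):
    """Split `size` VMs across `zones` so that counts differ by at most 1."""
    base, extra = divmod(size, len(zones))
    # Assumption for illustration: the first `extra` zones in the list
    # each receive one additional VM.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


counts = even_distribution(10, ["zone-a", "zone-b", "zone-c"])
print(counts)  # {'zone-a': 4, 'zone-b': 3, 'zone-c': 3}
```

The other shapes depend on live resource availability and reservations, so they are not modeled here.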

When you create your MIG, if you set its shape to BALANCED, ANY, or ANY_SINGLE_ZONE, you don't need to manually verify which zones support the hardware that you specify in the MIG's configuration. You can select all zones in a region and, with its shape set to BALANCED, ANY, or ANY_SINGLE_ZONE, your regional MIG checks resource availability for you and schedules instances only in zones that have the resources.

Choose an option based on your workload requirements and which MIG capabilities you need. For more information, see the comparison table and use cases.

To learn how to configure the target shape for a new or existing MIG, see Setting a policy for distributing instances across zones.

Proactive instance redistribution

By default, a regional MIG attempts to maintain an even distribution of instances across zones in the region to maximize the availability of your application in the event of a zone-level failure.

If you delete or abandon instances from your group, causing uneven distribution across zones, the group proactively redistributes instances to reestablish an even distribution.

To reestablish an even distribution across zones, the group deletes instances in zones with more instances, and adds instances to zones with fewer instances. The group automatically picks which instances to delete.

Figure: Example of proactive redistribution. Proactive redistribution reestablishes even distribution across zones.

For example, suppose you have a regional MIG with 12 instances spread across 3 zones: a, b, and c. If you delete 3 managed instances in c, the group attempts to rebalance so that the instances are again evenly distributed across the zones. In this case, the group deletes 2 instances (one from a and one from b) and creates 2 instances in zone c, so that each zone has 3 instances and even distribution is achieved. There is no way to selectively determine which instances are deleted. The group temporarily loses capacity while the new instances start up.
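The arithmetic in this example can be reproduced with a short model. This is a hedged sketch that only computes how many instances would have to be deleted and created per zone to reach an even split. The function name `rebalance_plan` is hypothetical, and the tie-breaking rule for any remainder is an assumption. As noted above, the group itself picks which instances to delete.

```python
def rebalance_plan(counts):
    """Return (deletions, creations) per zone needed to even out `counts`."""
    zones = sorted(counts)
    total = sum(counts.values())
    base, extra = divmod(total, len(zones))
    # Assumption: any remainder stays in the zones that are already largest,
    # which keeps the number of deleted and created instances small.
    by_size = sorted(zones, key=lambda z: -counts[z])
    target = {z: base + (1 if i < extra else 0) for i, z in enumerate(by_size)}
    deletions = {z: counts[z] - target[z] for z in zones if counts[z] > target[z]}
    creations = {z: target[z] - counts[z] for z in zones if counts[z] < target[z]}
    return deletions, creations


# 12 instances across a, b, c (4 each), then 3 instances deleted in zone c.
deletions, creations = rebalance_plan({"a": 4, "b": 4, "c": 1})
print(deletions)  # {'a': 1, 'b': 1}
print(creations)  # {'c': 2}
```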

To prevent automatic redistribution of your instances, you can turn off proactive instance redistribution.

Turning off proactive instance redistribution is useful when you need to:

  • Delete or abandon instances from the group without affecting other running instances. For example, you can delete a batch worker instance after job completion without affecting other workers.
  • Protect instances with stateful workloads from undesirable automatic deletion due to proactive redistribution.
  • Set the MIG's target distribution shape to BALANCED or ANY_SINGLE_ZONE.

Figure: Uneven distribution after disabling proactive redistribution. Disabling proactive redistribution can affect capacity during a zonal failure.

If you turn off proactive instance redistribution, a MIG does not proactively add or remove instances to achieve balance but still opportunistically converges toward balance during resize operations, treating each resize operation as an opportunity to balance the group. For example, when scaling in, the group automatically uses the rescaling as an opportunity to remove instances from bigger zones; when scaling out, the group uses the opportunity to add instances to smaller zones.
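The opportunistic behavior described above can be sketched as a simple model in which each added VM goes to the currently smallest zone and each removed VM comes from the currently largest zone. This is an illustrative assumption about how convergence could work, not the actual scheduling logic. The function name `resize` and the alphabetical tie-breaking are hypothetical.

```python
def resize(counts, new_size):
    """Resize one VM at a time, always moving the group toward balance."""
    counts = dict(counts)  # work on a copy
    while sum(counts.values()) < new_size:
        # Scaling out: add to the smallest zone (ties broken alphabetically).
        smallest = min(sorted(counts), key=counts.get)
        counts[smallest] += 1
    while sum(counts.values()) > new_size:
        # Scaling in: remove from the largest zone (ties broken alphabetically).
        largest = max(sorted(counts), key=counts.get)
        counts[largest] -= 1
    return counts


uneven = {"a": 4, "b": 4, "c": 1}
print(resize(uneven, 11))  # {'a': 4, 'b': 4, 'c': 3}
print(resize(uneven, 7))   # {'a': 3, 'b': 3, 'c': 1}
```

In this model the group moves toward balance only as far as the resize allows: scaling in from 9 to 7 still leaves zone c with fewer instances than the others.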

Behavior differences from zonal MIGs

The main difference between a zonal MIG and a regional MIG is that a regional MIG can use more than one zone.

Because a regional MIG's managed instances are distributed across zones within a region, the following MIG features behave a bit differently.

Autoscaling a regional MIG

Compute Engine offers autoscaling for MIGs, which allows your groups to automatically add VMs (scale out) or remove VMs (scale in) based on increases or decreases in load.

If you enable autoscaling for a regional MIG, the feature behaves as follows:

  • The autoscaler decides in which zone to create the VMs based on the largest autoscaling signal in each zone. For example, if you scale based on CPU utilization, the autoscaler creates more VMs in the zones with higher utilization.

  • If the zones have different signal values, then autoscaling might create an uneven distribution of VMs. In such cases, the autoscaler attempts to balance the load across zones by creating additional VMs in zones with fewer VMs. After the zones with additional VMs take over the load, the number of VMs across zones should balance.

  • If the signal value in a zone triggers a scale out but the overall signal value in the regional MIG does not require an additional VM or requires an additional VM in a different zone, then the autoscaler might add a VM and then immediately delete it from one of the zones.

  • When an autoscaling signal applies to a regional MIG as a whole, such as scaling schedules or some monitoring metrics, the autoscaler distributes VMs across the zones as equally as possible.

  • With the target distribution shape set to BALANCED, the autoscaler is aware of the resource availability across zones. The autoscaler proactively creates VMs only in zones with enough quota and capacity for VMs as specified in the MIG's configuration.

Updating a regional MIG

You cannot change or update the zones for a regional MIG after the group is created. But you can set the group's target distribution shape to prioritize the use of different zones—for example, if you have reserved resources or need hardware that is not available in all zones.

If you want to roll out a new template to a regional MIG, see Updating a regional MIG.

If you want to add or remove instances in a MIG, the process is similar for regional and zonal MIGs. See Add and remove VMs in a MIG.

If you're interested in configuring stateful disks or stateful metadata in a MIG, see Configuring stateful MIGs.

How to increase availability by overprovisioning

A variety of events might cause one or more instances to become unavailable, and you can help mitigate this issue by using multiple Google Cloud services:

  • Use a regional MIG with an EVEN or BALANCED target distribution shape to distribute your application across multiple zones.
  • Use application-based autohealing to recreate instances with failed applications.
  • Use load balancing to automatically direct user traffic away from unavailable instances.

However, even if you use these services, your users might still experience issues if too many of your instances are simultaneously unavailable.

To be prepared for the extreme case where one zone fails or an entire group of instances stops responding, Google strongly recommends overprovisioning your MIG. Depending on your application needs, overprovisioning your group prevents your system from failing entirely if a zone or group of instances becomes unresponsive.

Google makes recommendations for overprovisioning with the priority of keeping your application available for your users. These recommendations include provisioning and paying for more instances than your application might need on a day-to-day basis. Base your overprovisioning decisions on application needs and cost limitations.

You can set your MIG's size when creating it, and you can add or remove instances after you've created it.

You can configure an autoscaler to automatically add and remove instances in the group based on the load.

Estimating the recommended group size

We recommend that you provision enough instances so that, if all of the instances in any one zone become unavailable, your remaining instances would still meet the minimum number of instances that you require.

Use the following table to determine the minimum recommended size for your group:

Number of zones    Additional VM instances    Recommended total VM instances
2                  +100%                      200%
3                  +50%                       150%
4                  +33%                       133%
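The percentages in the table follow from requiring that the group still meets its minimum size after losing one of its zones: the total must be at least the required count multiplied by zones / (zones - 1). The following is a minimal sketch of that calculation. The function name `recommended_size` is hypothetical, and rounding up to a whole instance is an assumption for cases where the result is fractional.

```python
def recommended_size(required, zones):
    """Smallest group size that keeps `required` VMs after losing one zone."""
    if zones < 2:
        raise ValueError("a single zone cannot survive the loss of that zone")
    # Integer ceiling of required * zones / (zones - 1).
    return -(-(required * zones) // (zones - 1))


for zones in (2, 3, 4):
    print(zones, recommended_size(20, zones))
# 2 40
# 3 30
# 4 27
```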

Provisioning a regional MIG in three or more zones

When you create a regional MIG in a region with at least three zones, Google recommends overprovisioning your group by at least 50%. By default, a regional MIG creates instances in three zones. Having instances in three zones already helps you preserve at least 2/3 of your serving capacity, and if a single zone fails, the other two zones in the region can continue to serve traffic without interruption. By overprovisioning to 150%, you can ensure that if 1/3 of the capacity is lost, 100% of traffic is supported by the remaining zones.

For example, if you need 20 instances in your MIG across three zones, we recommend, at a minimum, an additional 50% of instances. In this case, 50% of 20 is 10 more instances, for a total of 30 instances in the group. If you create a regional MIG with a size of 30, the group distributes your VMs across the three zones, like so:

Zone              Number of VM instances
example-zone-1    10
example-zone-2    10
example-zone-3    10

If any single zone fails, you still have 20 instances serving traffic.
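To check a planned distribution against this requirement, you can compute how many instances remain after each possible single-zone failure. This is a small illustrative sketch. The function name `surviving_capacity` is hypothetical, and the counts are taken from the example above.

```python
def surviving_capacity(counts):
    """For each zone, the number of VMs left in the group if that zone fails."""
    total = sum(counts.values())
    return {zone: total - n for zone, n in counts.items()}


counts = {"example-zone-1": 10, "example-zone-2": 10, "example-zone-3": 10}
survivors = surviving_capacity(counts)
worst_case = min(survivors.values())
print(worst_case)  # 20
```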

Provisioning a regional MIG in two zones

To provision your instances in two zones instead of three, Google recommends doubling the number of instances. For example, if you need 20 instances for your service, distributed across two zones, we recommend that you configure a regional MIG with 40 instances, so that each zone has 20 instances. If a single zone fails, you still have 20 instances serving traffic.

Zone              Number of VM instances
example-zone-1    20
example-zone-2    20

If the number of instances in your group is not equally divisible across two zones, Compute Engine evenly divides the group of VMs and randomly puts the remaining instance in one of the zones.

Provisioning a regional MIG in one zone

You can create a regional MIG with just one zone. This is similar to creating a zonal MIG.

Creating a single-zone regional MIG is not recommended because it offers the minimum guarantee for highly available applications. If the zone fails, your entire MIG is unavailable, potentially disrupting your users.

What's next