Single-zone deployment on Compute Engine

Last reviewed 2024-02-08 UTC

This document provides a reference architecture for a multi-tier application that runs on Compute Engine VMs in a single zone in Google Cloud. You can use this reference architecture to efficiently rehost (lift and shift) on-premises applications to the cloud with minimal changes to the applications. The document also describes the design factors that you should consider when you build a zonal architecture for your cloud applications. The intended audience for this document is cloud architects.


Architecture

The following diagram shows an architecture for an application that runs in a single Google Cloud zone. This architecture is aligned with the Google Cloud zonal deployment archetype.

Figure: Single-zone architecture using Compute Engine.

The architecture is based on the infrastructure as a service (IaaS) cloud model. You provision the required infrastructure resources (compute, networking, and storage) in Google Cloud. You retain full control over the infrastructure and responsibility for the operating system, middleware, and higher layers of the application stack. To learn more about IaaS and other cloud models, see PaaS vs. IaaS vs. SaaS vs. CaaS: How are they different?

The preceding diagram includes the following components:

Component Purpose
Regional external load balancer

The regional external load balancer receives and distributes user requests to the web tier VMs.

Use an appropriate load balancer type depending on the traffic type and other requirements. For example, if the backend consists of web servers (as shown in the preceding architecture), then use an Application Load Balancer to forward HTTP(S) traffic. To load-balance TCP traffic, use a Network Load Balancer. For more information, see Choose a load balancer.

Zonal managed instance group (MIG) for the web tier

The web tier of the application is deployed on Compute Engine VMs that are part of a zonal MIG. The MIG is the backend for the regional external load balancer. Each VM in the MIG hosts an independent instance of the web tier of the application.

Regional internal load balancer

The regional internal load balancer distributes traffic from the web tier VMs to the application tier VMs.

Depending on your requirements, you can use a regional internal Application Load Balancer or Network Load Balancer. For more information, see Choose a load balancer.

Zonal MIG for the application tier

The application tier is deployed on Compute Engine VMs that are part of a zonal MIG, which is the backend for the internal load balancer. Each VM in the MIG hosts an independent instance of the application tier.

Third-party database deployed on a Compute Engine VM

The architecture in this document shows a third-party database (like PostgreSQL) that's deployed on a Compute Engine VM. You can deploy a standby database in another zone. The database replication and failover capabilities depend on the database that you use.

Installing and managing a third-party database involves additional effort and operational cost for applying updates, monitoring, and ensuring availability. You can avoid the overhead of installing and managing a third-party database and take advantage of built-in high availability (HA) features by using a fully managed database service like Cloud SQL or AlloyDB for PostgreSQL. For more information about managed database options, see Database services.

Virtual Private Cloud network and subnet

All the Google Cloud resources in the architecture use a single VPC network and subnet.

Depending on your requirements, you can choose to build an architecture that uses multiple VPC networks or multiple subnets. For more information, see Deciding whether to create multiple VPC networks in "Best practices and reference architectures for VPC design."

Cloud Storage regional bucket

Application and database backups are stored in a regional Cloud Storage bucket. Because a regional bucket is redundant across the zones in the region, the backups remain available if a zone outage occurs.

Alternatively, you can use Backup and DR Service to create, store, and manage the database backups.

Products used

This reference architecture uses the following Google Cloud products:

  • Compute Engine: A secure and customizable compute service that lets you create and run VMs on Google's infrastructure.
  • Cloud Load Balancing: A portfolio of high performance, scalable, global and regional load balancers.
  • Cloud Storage: A low-cost, no-limit object store for diverse data types. Data can be accessed from within and outside Google Cloud, and it's replicated across locations for redundancy.
  • Virtual Private Cloud: A virtual system that provides global, scalable networking functionality for your Google Cloud workloads.

Use cases

This section describes use cases for which a single-zone deployment on Compute Engine is an appropriate choice.

  • Cloud development and testing: You can use a single-zone deployment architecture to build a low-cost cloud environment for development and testing.
  • Applications that don't need HA: A single-zone architecture might be sufficient for applications that can tolerate downtime due to infrastructure outages.
  • Low-latency, low-cost networking between application components: A single-zone architecture might be well suited for applications such as batch computing that need low-latency and high-bandwidth network connections among the compute nodes. With a single-zone deployment, there's no cross-zone network traffic, and you don't incur costs for intra-zone traffic.
  • Migration of commodity workloads: The zonal deployment architecture provides a simple cloud-migration path for commodity on-premises applications for which you have no control over the code or that can't support architectures beyond a basic active-passive topology.
  • Running license-restricted software: A single-zone architecture might be well suited for license-restricted systems where running more than one instance at a time is either too expensive or isn't permitted.

Design considerations

This section provides guidance to help you use this reference architecture to develop an architecture that meets your specific requirements for system design, security and compliance, reliability, operational efficiency, cost, and performance.

System design

This section provides guidance to help you to choose Google Cloud regions and zones for your zonal deployment and to select appropriate Google Cloud services.

Region selection

When you choose a Google Cloud region and zone for your applications, consider the following factors and requirements:

  • Availability of Google Cloud services. For more information, see Products available by location.
  • Availability of Compute Engine machine types. For more information, see Regions and zones.
  • End-user latency requirements.
  • Cost of Google Cloud resources.
  • Regulatory requirements.

Some of these factors and requirements might involve trade-offs. For example, the most cost-efficient region might not have the lowest carbon footprint. For more information, see Select geographic zones and regions in the Google Cloud Architecture Framework.

Compute services

The reference architecture in this document uses Compute Engine VMs for all the tiers of the application. The design guidance in this document is specific to Compute Engine unless mentioned otherwise.

Depending on the requirements of your application, you can choose from the following other Google Cloud compute services. The design guidance for those services is outside the scope of this document.

  • You can run containerized applications in Google Kubernetes Engine (GKE) clusters. GKE is a container-orchestration engine that automates deploying, scaling, and managing containerized applications.
  • If you prefer to focus your IT efforts on your data and applications instead of setting up and operating infrastructure resources, then you can use serverless services like Cloud Run and Cloud Functions.

The decision of whether to use VMs, containers, or serverless services involves a trade-off between configuration flexibility and management effort. VMs and containers provide more configuration flexibility, but you're responsible for managing the resources. In a serverless architecture, you deploy workloads to a preconfigured platform that requires minimal management effort. For more information about choosing appropriate compute services for your workloads in Google Cloud, see Choose and manage compute in the Google Cloud Architecture Framework.

Storage services

The architecture shown in this document uses zonal Persistent Disk volumes for all the tiers. For more durable persistent storage, you can use regional Persistent Disk volumes, which provide synchronous replication of data across two zones within a region.

For low-cost storage that's redundant across the zones within a region, you can use Cloud Storage regional buckets.

To store data that's shared across multiple VMs in a region, such as across all the VMs in the web tier or application tier, you can use Filestore. The data that you store in a Filestore Enterprise instance is replicated synchronously across three zones within the region. This replication ensures high availability and robustness against zone outages. You can store shared configuration files, common tools and utilities, and centralized logs in the Filestore instance, and mount the instance on multiple VMs.

If your database is Microsoft SQL Server, we recommend using Cloud SQL for SQL Server. If Cloud SQL doesn't support your configuration requirements, or if you need access to the operating system, then you can deploy a failover cluster instance (FCI). For an FCI deployment, you can use the fully managed Google Cloud NetApp Volumes to provide continuous availability (CA) SMB storage for the database.

When you design storage for your workloads, consider the functional characteristics, resilience requirements, performance expectations, and cost goals. For more information, see Design an optimal storage strategy for your cloud workload.

Database services

The reference architecture in this document uses a third-party database, like PostgreSQL, that's deployed on Compute Engine VMs. Installing and managing a third-party database involves effort and cost for operations like applying updates, monitoring and ensuring availability, performing backups, and recovering from failures.

You can avoid the effort and cost of installing and managing a third-party database by using a fully managed database service like Cloud SQL, AlloyDB for PostgreSQL, Bigtable, Spanner, or Firestore. These Google Cloud database services provide uptime service-level agreements (SLAs), and they include default capabilities for scalability and observability. If your workloads require an Oracle database, you can use Bare Metal Solution provided by Google Cloud. For an overview of the use cases that each Google Cloud database service is suitable for, see Google Cloud databases.

Security and compliance

This section describes factors that you should consider when you use this reference architecture to design and build a zonal topology in Google Cloud that meets the security and compliance requirements of your workloads.

Protection against external threats

To protect your application against external threats like distributed denial-of-service (DDoS) attacks and cross-site scripting (XSS), you can use Google Cloud Armor security policies. The security policies are enforced at the perimeter—that is, before traffic reaches the web tier. Each policy is a set of rules that specifies certain conditions that should be evaluated and actions to take when the conditions are met. For example, a rule could specify that if the incoming traffic's source IP address matches a specific IP address or CIDR range, then the traffic must be denied. In addition, you can apply preconfigured web application firewall (WAF) rules. For more information, see Security policy overview.
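As a rough illustration of how a rule of this kind behaves, the following Python sketch matches a source IP address against CIDR ranges and returns the action of the first matching rule. It is not the Google Cloud Armor API: the Rule class, the evaluate function, the priority values, and the example address ranges are all hypothetical, and the sketch assumes that rules with lower priority numbers are evaluated first.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int  # assumption: lower numbers are evaluated first
    cidr: str      # source range to match, for example "203.0.113.0/24"
    action: str    # "allow" or "deny"


def evaluate(rules, source_ip, default_action="allow"):
    """Return the action of the first rule, in priority order, that matches."""
    address = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if address in ipaddress.ip_network(rule.cidr):
            return rule.action
    # No rule matched, so fall back to the policy's default action.
    return default_action


# Hypothetical policy: deny a documentation-only range, except one address.
policy = [
    Rule(priority=1000, cidr="203.0.113.0/24", action="deny"),
    Rule(priority=900, cidr="203.0.113.7/32", action="allow"),
]
```

In this sketch, a request from 203.0.113.50 is denied by the broader range, while a request from 203.0.113.7 is allowed because its more specific rule has a lower priority number and is therefore evaluated first.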

External access for VMs

In the reference architecture that this document describes, the VMs that host the application tier, web tier, and databases don't need inbound access from the internet. Don't assign external IP addresses to those VMs. Google Cloud resources that have only a private, internal IP address can still access certain Google APIs and services by using Private Service Connect or Private Google Access. For more information, see Private access options for services.

To enable secure outbound connections from Google Cloud resources that have only internal IP addresses, like the Compute Engine VMs in this reference architecture, you can use Cloud NAT.

VM image security

To ensure that your VMs use only approved images (that is, images with software that meets your policy or security requirements), you can define an organization policy that restricts the use of images in specific public image projects. For more information, see Setting up trusted image policies.

Service account privileges

In Google Cloud projects where the Compute Engine API is enabled, a default service account is created automatically. The default service account is granted the Editor IAM role (roles/editor) unless this behavior is disabled. By default, the default service account is attached to all VMs that you create by using the Google Cloud CLI or the Google Cloud console. The Editor role includes a broad range of permissions, so attaching the default service account to VMs creates a security risk. To avoid this risk, you can create and use dedicated service accounts for each application. To specify the resources that the service account can access, use fine-grained policies. For more information, see Limit service account privileges in "Best practices for using service accounts."

Network security

To control network traffic between the resources in the architecture, you must set up appropriate Cloud Next Generation Firewall rules. Each firewall rule lets you control traffic based on parameters like the protocol, IP address, and port. For example, you can configure a firewall rule to allow TCP traffic from the web server VMs to a specific port of the database VMs, and block all other traffic.
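The example in the preceding paragraph can be sketched as an allow list with an implicit deny. The following Python code is a conceptual model only, not a Cloud Next Generation Firewall configuration: the FirewallRule class, the tier labels, and the port number (5432, the conventional PostgreSQL port) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirewallRule:
    source_tier: str  # hypothetical label for the sending VMs
    target_tier: str  # hypothetical label for the receiving VMs
    protocol: str
    port: int


# Illustrative allow list: TCP from the web server VMs to one database port.
ALLOW_RULES = [
    FirewallRule("web", "db", "tcp", 5432),
]


def is_allowed(source_tier, target_tier, protocol, port, rules=ALLOW_RULES):
    """Allow traffic only if it matches an explicit rule; deny everything else."""
    candidate = FirewallRule(source_tier, target_tier, protocol, port)
    return candidate in rules
```

Modeling the rules as an allow list makes the default outcome a denial, which matches the intent of blocking all traffic that isn't explicitly permitted.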

More security considerations

When you build the architecture for your workload, consider the platform-level security best practices and recommendations provided in the Enterprise foundations blueprint.


Reliability

This section describes design factors that you should consider when you use this reference architecture to build and operate reliable infrastructure for your zonal deployments in Google Cloud.

Infrastructure outages

In a single-zone deployment architecture, if any component in the infrastructure stack fails, the application can continue to process requests if each tier contains at least one functioning component with adequate capacity. For example, if a web server instance fails, the load balancer forwards user requests to the other available web server instances. If a VM that hosts a web server or app server instance crashes, the MIG recreates the VM automatically. If the database crashes, you must manually activate the standby database and update the app server instances to connect to it.

A zone outage or region outage affects all the Compute Engine VMs in a single-zone deployment. A zone outage doesn't affect the load balancer in this architecture because it's a regional resource. However, the load balancer can't distribute traffic, because there are no available backends. If a zone or region outage occurs, you must wait for Google to resolve the outage, and then verify that the application works as expected.

You can reduce the downtime caused by zone or region outages by maintaining a passive (failover) replica of the infrastructure stack in another Google Cloud zone or region. If an outage occurs in the primary zone, you can activate the stack in the failover zone or region, and use DNS routing policies to route traffic to the load balancer in the failover zone or region.

For applications that require robustness against zone or region outages, consider using a regional or multi-regional architecture. See the following reference architectures:

MIG autoscaling

The autoscaling capability of stateless MIGs lets you maintain application availability and performance at predictable levels. Stateful MIGs can't be autoscaled.

To control the autoscaling behavior of your MIGs, you can specify target utilization metrics, such as average CPU utilization. You can also configure schedule-based autoscaling. For more information, see Autoscaling groups of instances.
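To build intuition for how a target utilization metric translates into a group size, the following Python sketch uses a simplified proportional model. It is not the actual Compute Engine autoscaler algorithm: the recommended_size function, its parameters, and the assumption that total load is roughly the group size multiplied by the average utilization are all hypothetical.

```python
import math


def recommended_size(current_size, observed_percent, target_percent,
                     min_size, max_size):
    """Estimate the group size that brings average utilization back to target.

    Utilization values are whole-number percentages (for example, 60 for 60%).
    """
    if target_percent <= 0:
        raise ValueError("target_percent must be positive")
    # Approximate total load as size * average utilization, then divide by
    # the target to find how many VMs would carry that load at the target.
    needed = math.ceil(current_size * observed_percent / target_percent)
    # Keep the result within the configured bounds of the group.
    return max(min_size, min(max_size, needed))
```

For example, under this model a group of 4 VMs at 90% average CPU utilization with a 60% target would grow to 6 VMs, provided that the maximum size allows it.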

MIG size limit

By default, a zonal MIG can have up to 1,000 VMs. You can increase the size limit of a MIG to 2,000 VMs.

VM autohealing

Sometimes the VMs that host your application might be running and available, but there might be issues with the application itself. It might freeze, crash, or run out of memory. To verify whether an application is responding as expected, you can configure application-based health checks as part of the autohealing policy of your MIGs. If the application on a particular VM isn't responding, the MIG autoheals (repairs) the VM. For more information about configuring autohealing, see Set up an application health check and autohealing.
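The following Python sketch shows one common way that a health checker can decide when a VM needs repair: it changes the VM's state only after several consecutive probe results contradict the current state, so that a single failed probe doesn't trigger a repair. The HealthTracker class, its method, and the default threshold values are hypothetical and are not part of any Google Cloud library.

```python
class HealthTracker:
    """Track probe results for one VM by using consecutive-result thresholds."""

    def __init__(self, unhealthy_threshold=3, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive probes that contradict the current state

    def record(self, probe_succeeded):
        """Record one probe result and return the resulting health state."""
        if probe_succeeded == self.healthy:
            # The probe agrees with the current state, so reset the streak.
            self._streak = 0
            return self.healthy
        self._streak += 1
        limit = (self.unhealthy_threshold if self.healthy
                 else self.healthy_threshold)
        if self._streak >= limit:
            # Enough consecutive contradicting probes: flip the state.
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy
```

In a design like this, a VM that becomes unhealthy is a candidate for repair, and the separate threshold for returning to healthy prevents the state from flapping on intermittent successes.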

VM placement

In the architecture that this document describes, the application tier and web tier run on Compute Engine VMs within a single zone. To improve the robustness of the architecture, you can create a spread placement policy and apply it to the MIG template. When the MIG creates VMs, it places the VMs on different physical servers (called hosts), so your VMs are robust against failures of individual hosts. For more information, see Apply spread placement policies to VMs.

VM capacity planning

To make sure that capacity for Compute Engine VMs is available when required for MIG autoscaling, you can create reservations. A reservation provides assured capacity in a specific zone for a specified number of VMs of a machine type that you choose. A reservation can be specific to a project, or shared across multiple projects. You incur charges for reserved resources even if the resources aren't provisioned or used. For more information about reservations, including billing considerations, see Reservations of Compute Engine zonal resources.

Persistent disk state

A best practice in application design is to avoid the need for stateful local disks. If your application does need them, you can configure your persistent disks to be stateful to ensure that the data is preserved when the VMs are repaired or recreated. However, we recommend that you keep the boot disks stateless, so that you can update them to the latest images with new versions and security patches. For more information, see Configuring stateful persistent disks in MIGs.

Data durability

You can use Backup and DR to create, store, and manage backups of the Compute Engine VMs. Backup and DR stores backup data in its original, application-readable format. When required, you can restore your workloads to production by directly using data from long-term backup storage without time-consuming data movement or preparation activities.

If you use a managed database service like Cloud SQL, backups are taken automatically based on the retention policy that you define. You can supplement the backup strategy with additional logical backups to meet regulatory, workflow, or business requirements.

If you use a third-party database and you need to store database backups and transaction logs, you can use regional Cloud Storage buckets. Regional Cloud Storage buckets provide low-cost backup storage that's redundant across zones.

Compute Engine provides the following options to help you to ensure the durability of data that's stored in Persistent Disk volumes:

  • You can use snapshots to capture the point-in-time state of Persistent Disk volumes. Standard snapshots are stored redundantly in multiple regions, with automatic checksums to ensure the integrity of your data. Snapshots are incremental by default, so they use less storage space and you save money. Snapshots are stored in a Cloud Storage location that you can configure. For more recommendations about using and managing snapshots, see Best practices for Compute Engine disk snapshots.
  • Regional Persistent Disk volumes let you run highly available applications that aren't affected by failures in persistent disks. When you create a regional Persistent Disk volume, Compute Engine maintains a replica of the disk in a different zone in the same region. Data is replicated synchronously to the disks in both zones. If any one of the two zones has an outage, the data remains available.
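To illustrate why incremental snapshots use less storage, the following Python sketch stores only the blocks that changed since the previous snapshot and rebuilds the disk contents by replaying the snapshots in order. It is a conceptual model, not how Compute Engine implements snapshots: the take_snapshot and restore functions and the block-dictionary representation are hypothetical, and the sketch ignores deleted blocks.

```python
def take_snapshot(disk_blocks, previous_state=None):
    """Return (delta, full_state), where delta holds only the changed blocks."""
    previous_state = previous_state or {}
    delta = {
        index: data
        for index, data in disk_blocks.items()
        if previous_state.get(index) != data
    }
    # Return a copy of the current contents to compare against next time.
    return delta, dict(disk_blocks)


def restore(deltas):
    """Rebuild the disk contents by replaying snapshot deltas, oldest first."""
    state = {}
    for delta in deltas:
        state.update(delta)
    return state
```

In this model, the first snapshot contains every block, and each later snapshot contains only the blocks that were written since the previous one.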

Database availability

If you use a managed database service like Cloud SQL in HA configuration, then in the event of a failure of the primary database, Cloud SQL fails over automatically to the standby database. You don't need to change the IP address for the database endpoint. If you use a self-managed third-party database that's deployed on a Compute Engine VM, then you must use an internal load balancer or other mechanism to ensure that the application can connect to another database if the primary database is unavailable.
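As one example of an "other mechanism" for a self-managed database, the following Python sketch tries a list of database endpoints in order and uses the first one that accepts a connection. The connect_with_failover function and the endpoint names in the usage note are hypothetical, and the connect argument stands in for whatever connection function your database driver provides. The sketch covers only the client side of failover; it doesn't promote a standby database or guard against two databases accepting writes at the same time.

```python
def connect_with_failover(endpoints, connect):
    """Try each endpoint in order and return (endpoint, connection).

    `connect` is a caller-supplied function that takes an endpoint and either
    returns a connection object or raises ConnectionError.
    """
    errors = []
    for endpoint in endpoints:
        try:
            return endpoint, connect(endpoint)
        except ConnectionError as exc:
            # Remember why this endpoint failed, then try the next one.
            errors.append(f"{endpoint}: {exc}")
    raise ConnectionError("all database endpoints failed: " + "; ".join(errors))
```

For example, an app server could call connect_with_failover(["db-primary", "db-standby"], connect) at startup and after a lost connection, where both endpoint names are placeholders.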

To implement cross-zone failover for a database deployed on a Compute Engine VM, you need a mechanism to identify failures of the primary database and a process to fail over to the standby database. The specifics of the failover mechanism depend on the database that you use. You can set up an observer instance to detect failures of the primary database and orchestrate the failover. You must configure the failover rules appropriately to avoid a