High availability for your apps

This strategy guide provides technical guidance and best practices for designing and deploying highly available (HA) workloads to a Google Distributed Cloud (GDC) air-gapped universe configured with multiple zones, or multi-zone. The guide outlines key architectural patterns, service configurations, and operational considerations necessary to minimize downtime and provide business continuity for applications running on GDC.

High availability strategies are intended for technical professionals who design, deploy, and manage applications on GDC, including the following roles:

  • Cloud architects within the platform administrator group: Designing resilient infrastructure and application architectures on GDC.

  • DevOps engineers and site reliability engineers (SREs) within the application operator group: Implementing deployment strategies, automation, monitoring, and incident response for HA workloads.

  • Application developers within the application operator group: Building applications that are fault-tolerant and integrate seamlessly with HA infrastructure patterns.

For more information, see Audiences for GDC air-gapped documentation.

Importance of high availability

In modern distributed systems, planning for high availability is critical. Downtime, whether planned or unplanned, can lead to significant business disruption, revenue loss, damage to reputation, and poor user experience. For workloads running at the edge or in private data centers using GDC, availability often correlates directly with core operational success, especially for latency-sensitive or mission-critical applications. Designing for HA from the outset is essential to build resilient and reliable services.

Hyperscale capabilities, delivered locally

GDC extends Google Cloud infrastructure and services to the edge and your data centers. GDC provides a fully managed hardware and software solution, letting you run Google Kubernetes Engine (GKE) on GDC clusters and other Google Cloud services closer to where your data is generated and consumed.

This guide focuses specifically on GDC universes configured in a multi-zone topology. With multi-zone, a single GDC universe comprises multiple, physically isolated zones within the same location, such as a data center campus or metropolitan area. These zones have independent power, cooling, and networking, providing protection against localized physical infrastructure failures. The low-latency, high-bandwidth network connectivity between zones within a GDC universe enables synchronous replication and rapid failover, forming the foundation for building highly available applications.

Scalability and load balancing

Beyond basic component redundancy, managing traffic effectively and enabling seamless scaling are crucial for maintaining high availability, especially with varying load conditions. GDC provides several mechanisms for load balancing and sophisticated traffic management.

External load balancer for north-south traffic

To expose your applications to users or systems outside a GKE on GDC cluster (north-south traffic), you use GDC's managed external load balancing capabilities. The external load balancer (ELB) service provides these capabilities and integrates seamlessly with Kubernetes.

The key characteristics of the ELB service that provide HA and scalability are the following:

  • Managed service: ELB is managed by GDC, designed for high availability and resilience.

  • External access: Provisions stable external IP addresses from GDC-managed pools, providing a consistent entry point for external clients.

  • Load balancer integration with Kubernetes: Automatically provisions and configures the load balancer when you create a Kubernetes Service of type: LoadBalancer without specific internal annotations.

  • Zone awareness: Distributes incoming traffic across healthy application pods running in all available zones within the GDC universe. The ELB relies on pod readiness probes to determine backend health.

  • Scalability: Handles distribution of external traffic as your application scales horizontally across nodes and zones.

Using an external load balancer is the standard and recommended way to achieve HA for external traffic ingress, so client requests are automatically routed away from failing zones or instances.
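As a minimal sketch, you can provision an external entry point by creating a Kubernetes Service of type: LoadBalancer with no internal annotation; the service and label names below are hypothetical placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend-elb        # hypothetical service name
spec:
  type: LoadBalancer        # no internal annotation, so an external load balancer is provisioned
  selector:
    app: frontend           # hypothetical label matching the backend pods
  ports:
  - port: 80                # port exposed on the external IP address
    targetPort: 8080        # container port the pods listen on
```

Because the ELB relies on pod readiness probes to determine backend health, define a readinessProbe on the backing pods so that traffic is routed only to healthy replicas.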

For more information, see Configure external load balancers.

Internal load balancer for east-west traffic

For communication between services running within the same GKE on GDC cluster (east-west traffic), GDC provides an internal load balancer (ILB). This is crucial for decoupling internal services and providing internal communication paths that are also highly available and scalable.

The key characteristics of the ILB service that provide HA and scalability are the following:

  • Internal access: Provisions a stable internal IP address accessible only from within the GDC network, such as cluster nodes or other internal services.

  • Load balancer integration with Kubernetes: Typically provisioned by creating a Kubernetes Service of type: LoadBalancer with a specific annotation to indicate it must be internal. For example, networking.gke.io/load-balancer-type: "Internal".

  • Zone awareness: Distributes traffic across healthy backend pods, which are identified with readiness probes, located in all available zones. This distribution prevents internal communication failures if one zone experiences issues.

  • Service discovery and decoupling: Provides a stable internal IP address and DNS name through integration with kube-dns or CoreDNS. Services can discover and communicate with each other without clients needing to know individual pod IP addresses.

  • Scalability: Facilitates scaling of internal backend services by distributing traffic across all available healthy replicas.

Using an ILB for internal service-to-service communication makes internal traffic flow resilient to zone failures and provides effective scaling, complementing the HA provided by the external ELB and underlying compute distribution. This is often used for tiered applications where frontends must communicate with backend APIs or databases within the Kubernetes cluster.
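A minimal sketch of an internal Service follows, using the networking.gke.io/load-balancer-type: "Internal" annotation described above; the service and label names are hypothetical placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-ilb         # hypothetical service name
  annotations:
    networking.gke.io/load-balancer-type: "Internal"  # requests an internal load balancer
spec:
  type: LoadBalancer
  selector:
    app: backend-api        # hypothetical label matching the internal backend pods
  ports:
  - port: 8080
    targetPort: 8080
```

Frontend pods can then reach the backend at the Service's stable DNS name, for example backend-ilb.default.svc.cluster.local, rather than tracking individual pod IP addresses.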

For more information, see Configure internal load balancers.

HA app deployment across zones with asynchronous storage

GDC lets you run infrastructure and applications closer to your data sources or end users. Achieving HA in your GDC universe is crucial for critical workloads. You can deploy HA applications across multiple zones within your GDC universe, implementing asynchronous storage replication for data persistence and disaster recovery.

Zones represent distinct failure domains within a single universe. By distributing application components and replicating data across zones, you can significantly improve resilience against localized hardware failures or maintenance events.
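To spread application replicas across zones as described above, you can use Kubernetes topology spread constraints. The sketch below assumes GDC nodes carry the standard topology.kubernetes.io/zone label; the deployment name, image, and probe endpoint are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ha-app              # hypothetical deployment name
spec:
  replicas: 6
  selector:
    matchLabels:
      app: ha-app
  template:
    metadata:
      labels:
        app: ha-app
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                                   # keep the per-zone replica count balanced
        topologyKey: topology.kubernetes.io/zone     # standard zone label; assumed present on GDC nodes
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: ha-app
      containers:
      - name: app
        image: registry.example/ha-app:1.0           # hypothetical image
        readinessProbe:                              # load balancers route only to ready pods
          httpGet:
            path: /healthz                           # hypothetical health endpoint
            port: 8080
```

With this constraint, losing one zone leaves the remaining replicas serving traffic, and the load balancers automatically stop routing to the unready pods.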

What's next

  • To deploy a service as a collection of virtual machines (VMs) distributed across zones using asynchronously replicated block storage, see Deploy an HA VM app.

  • To deploy a service as a containerized application on Kubernetes across zones using asynchronously replicated persistent volumes, see Deploy an HA container app.