Patterns for using floating IP addresses in Compute Engine

Last reviewed 2024-01-29 UTC

This document describes how to use floating IP address implementation patterns when migrating applications to Compute Engine from an on-premises network environment. This document is aimed at network engineers, system administrators, and operations engineers who are migrating applications to Google Cloud.

Also referred to as shared or virtual IP addresses, floating IP addresses are often used to make on-premises network environments highly available. Using floating IP addresses, you can pass an IP address between multiple identically configured physical or virtual servers. This practice allows for failover or for upgrading production software. However, you can't directly implement floating IP addresses in a Compute Engine environment without changing the architecture to one of the patterns described in this document.

The GitHub repository that accompanies this document includes sample deployments for each pattern that you can automatically deploy using Terraform.

Floating IP addresses in on-premises environments

Floating IP addresses are commonly used in on-premises environments to make services highly available.

There are several ways to implement floating IP addresses in an on-premises environment. Servers sharing floating IP addresses typically also share state information through a heartbeat mechanism. This mechanism lets the servers communicate their health status to each other; it also lets the secondary server take over the floating IP address after the primary server fails. This scheme is frequently implemented using the Virtual Router Redundancy Protocol, but you can also use other, similar mechanisms.

When an IP address failover is initiated, the server taking over the floating IP address adds the address to its network interface. The server announces this takeover to other devices at Layer 2 by sending a gratuitous Address Resolution Protocol (ARP) frame. Alternatively, a routing protocol such as Open Shortest Path First (OSPF) sometimes announces the IP address to the upstream Layer 3 router.

The following diagram shows a typical setup in an on-premises environment.

Typical on-premises environment.

The preceding diagram shows how a primary server and a secondary server connected to the same switch exchange responsiveness information through a heartbeat mechanism. If the primary server fails, the secondary server sends a gratuitous ARP frame to the switch to take over the floating IP address.

You use a slightly different setup with on-premises load-balancing solutions, such as Windows Network Load Balancing or a Linux load balancer that uses Direct Server Return (for example, IPVS). In these cases, the service also sends out gratuitous ARP frames, but with the MAC address of another server as the gratuitous ARP source. This action essentially spoofs the ARP frames and takes over the source IP address of another server.

This action is done to distribute the load for one IP address between different servers. However, this kind of setup is out of scope for this document. In almost all cases when floating IP addresses are used for on-premises load balancing, migrating to Cloud Load Balancing is preferred.

Challenges with migrating floating IP addresses to Compute Engine

Compute Engine uses a virtualized network stack in a Virtual Private Cloud (VPC) network, so typical implementation mechanisms don't work without changes in Google Cloud. For example, the VPC network handles ARP requests in the software-defined network, and ignores gratuitous ARP frames. In addition, it's impossible to directly modify the VPC network routing table with standard routing protocols such as OSPF or Border Gateway Protocol (BGP). The typical mechanisms for floating IP addresses rely on ARP requests being handled by switching infrastructure or they rely on networks programmable by OSPF or BGP. Therefore, IP addresses don't fail over using these mechanisms in Google Cloud. If you migrate a virtual machine (VM) image using an on-premises floating IP address, the floating IP address can't fail over without changing the application.

You could use an overlay network to create a configuration that enables full Layer 2 communication and IP takeover through ARP requests. However, setting up an overlay network is complex and makes managing Compute Engine network resources difficult. That approach is also out of scope for this document. Instead, this document describes patterns for implementing failover scenarios in a Compute Engine networking environment without creating overlay networks.

To implement highly available and reliable applications in Compute Engine, use horizontally scaling architectures. This type of architecture minimizes the effect of a single node failure.

This document describes multiple patterns to migrate an existing application using floating IP addresses from on-premises to Compute Engine, including the following:

Using alias IP addresses that move between VM instances is discouraged as a failover mechanism because it doesn't meet high availability requirements. In certain failure scenarios, such as a zonal failure, you might not be able to remove an alias IP address from an instance. Therefore, you might not be able to add it to another instance, which makes failover impossible.

Selecting a pattern for your use case

Depending on your requirements, one or more of the patterns described in this document might be useful for implementing floating IP addresses in a Compute Engine environment.

Consider the following factors when deciding which pattern best fits your use case:

  • Floating internal or floating external IP address: Most applications that require floating IP addresses use floating internal IP addresses. Few applications use floating external IP addresses, because typically traffic to external applications should be load balanced.

    The table later in this section recommends patterns you can use for floating internal IP addresses and for floating external IP addresses. For use cases that rely on floating internal IP addresses, any of these patterns might be viable for your needs. However, we recommend that you migrate use cases relying on floating external IP addresses to one of the patterns that use load balancing.

  • Application protocols: If your VM only uses TCP and UDP, you can use all of the patterns in the table. If it uses other protocols on top of IPv4 to connect, only some patterns are appropriate.

  • Active-active deployment compatibility: Some applications, while using floating IP addresses on-premises, can work in an active-active deployment mode. This capability means they don't necessarily require failover from the primary server to the secondary server. You have more choices of patterns to move these kinds of applications to Compute Engine. Applications that require only a single application server to receive traffic at any time aren't compatible with active-active deployment. You can only implement these applications with some patterns in the following table.

  • Failback behavior after primary VM recovers: When the original primary VM recovers after a failover, depending on the pattern used, traffic does one of two things. It either immediately moves back to the original primary VM or it stays on the new primary VM until failback is initiated manually or the new primary VM fails. In all cases, only newly initiated connections fail back. Existing connections stay at the new primary VM until they are closed.

  • Health check compatibility: If you can't easily check whether your application is responsive by using Compute Engine health checks, you can't use some of the patterns described in the following table.

  • Instance groups: Any pattern with health check compatibility is also compatible with instance groups. To automatically recreate failed instances, you can use a managed instance group with autohealing. If your VMs keep state, you can use a stateful managed instance group. If your VMs can't be recreated automatically or you require manual failover, use an unmanaged instance group and manually recreate the VMs during failover.

  • Existing heartbeat mechanisms: If the high availability setup for your application already uses a heartbeat mechanism to trigger failover, like Heartbeat, Pacemaker, or Keepalived, you can use some patterns described in the following table.
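
As an illustration of the instance-group option described above, autohealing in a managed instance group might be configured with Terraform along these lines. This is a minimal sketch; the resource names, zone, health-check port, and the referenced instance template are assumptions, not part of the accompanying sample deployments:

```hcl
# Hypothetical health check used by autohealing; TCP port 80 is an assumption.
resource "google_compute_health_check" "app" {
  name                = "app-hc"
  check_interval_sec  = 5
  timeout_sec         = 5
  healthy_threshold   = 2
  unhealthy_threshold = 3

  tcp_health_check {
    port = 80
  }
}

# Managed instance group that recreates a VM when its health check fails.
resource "google_compute_instance_group_manager" "app" {
  name               = "app-mig"
  zone               = "us-central1-a"
  base_instance_name = "app"
  target_size        = 1

  version {
    # Assumed to be defined elsewhere in the configuration.
    instance_template = google_compute_instance_template.app.id
  }

  auto_healing_policies {
    health_check = google_compute_health_check.app.id
    # Give the application time to start before autohealing kicks in.
    initial_delay_sec = 120
  }
}
```

For stateful workloads, the same resource could carry a stateful disk or IP configuration instead; for manual failover, you would use an unmanaged instance group without autohealing policies.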

The following table lists pattern capabilities. Each pattern is described in the following sections:

| Pattern name | IP address | Supported protocols | Deployment mode | Failback | Application health check required | Can integrate heartbeat mechanism |
| --- | --- | --- | --- | --- | --- | --- |
| Patterns using load balancing | | | | | | |
| Active-active load balancing | Internal or external | TCP/UDP only | Active-active | N/A | Yes | No |
| Load balancing with failover and application-exposed health checks | Internal or external | TCP/UDP only | Active-passive | Immediate (except existing connections) | Yes | No |
| Load balancing with failover and heartbeat-exposed health checks | Internal or external | TCP/UDP only | Active-passive | Configurable | No | Yes |
| Patterns using Google Cloud routes | | | | | | |
| Using ECMP routes | Internal | All IP protocols | Active-active | N/A | Yes | No |
| Using different priority routes | Internal | All IP protocols | Active-passive | Immediate (except existing connections) | Yes | No |
| Using a heartbeat mechanism to switch route next hop | Internal | All IP protocols | Active-passive | Configurable | No | Yes |
| Pattern using autohealing | | | | | | |
| Using an autohealing single instance | Internal | All IP protocols | N/A | N/A | Yes | No |

Deciding which pattern to use for your use case might depend on multiple factors. The following decision tree can help you narrow your choices to a suitable option.

A decision tree that helps you pick a load balancer.

The preceding diagram outlines the following steps:

  1. Does a single autohealing instance provide good enough availability for your needs?
    1. If yes, see Using an autohealing single instance later in this document. Autohealing uses a mechanism in a managed instance group to automatically replace a faulty VM instance.
    2. If not, proceed to the next decision point.
  2. Does your application need protocols on top of IPv4 other than TCP and UDP?
    1. If yes, proceed to the next decision point.
    2. If no, proceed to the next decision point.
  3. Can your application work in active-active mode?
    1. If yes and it needs protocols on top of IPv4 other than TCP and UDP, see Using equal-cost multipath (ECMP) routes later in this document. ECMP routes distribute traffic among the next hops of all route candidates.
    2. If yes and it doesn't need protocols on top of IPv4 other than TCP and UDP, see Active-active load balancing later in this document. Active-active load balancing uses your VMs as backends for an internal TCP/UDP load balancer.
    3. If no (in either case), proceed to the next decision point.
  4. Can your application expose Google Cloud health checks?
    1. If yes and it doesn't need protocols on top of IPv4 other than TCP and UDP, see Load balancing with failover and application-exposed health checks later in this document. Load balancing with failover and application-exposed health checks uses your VMs as backends for an internal TCP/UDP load balancer. It also uses the Internal TCP/UDP Load Balancing IP address as a virtual IP address.
    2. If yes and it needs protocols on top of IPv4 other than TCP and UDP, see Using different priority routes later in this document. Using different priority routes helps ensure that traffic always flows to a primary instance unless that instance fails.
    3. If no and it doesn't need protocols on top of IPv4 other than TCP and UDP, see Load balancing with failover and heartbeat-exposed health checks later in this document. In the load balancing with failover and heartbeat-exposed health checks pattern, health checks aren't exposed by the application itself but by a heartbeat mechanism running between both VMs.
    4. If no and it needs protocols on top of IPv4 other than TCP and UDP, see Using a heartbeat mechanism to switch a route's next hop later in this document. Using a heartbeat mechanism to switch a route's next hop uses a single static route with the next hop pointing to the primary VM instance.
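
Several branches above point at route-based patterns. As one hedged illustration, the different-priority-routes idea can be sketched in Terraform as two static routes for the same destination, where the route with the lower priority value is preferred. The floating IP address, instance names, zones, and network here are placeholder assumptions:

```hcl
# Route for the floating IP address pointing at the primary VM.
# Lower priority values are preferred, so this route normally wins.
resource "google_compute_route" "primary" {
  name                   = "floating-ip-primary"
  network                = "default"
  dest_range             = "10.100.0.10/32" # assumed floating IP address
  next_hop_instance      = "primary-vm"     # assumed instance name
  next_hop_instance_zone = "us-central1-a"
  priority               = 500
}

# Backup route with a higher priority value. If the primary VM is
# deleted (for example, by autohealing), its route can no longer be
# used and traffic falls back to this route.
resource "google_compute_route" "secondary" {
  name                   = "floating-ip-secondary"
  network                = "default"
  dest_range             = "10.100.0.10/32"
  next_hop_instance      = "secondary-vm"
  next_hop_instance_zone = "us-central1-b"
  priority               = 1000
}
```

With equal priority values on both routes, traffic would instead be distributed across both next hops, which is the basis of the ECMP routes pattern.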

Patterns using load balancing

Usually, you can migrate your application using floating IP addresses to an architecture in Google Cloud that uses Cloud Load Balancing. You can use an internal passthrough Network Load Balancer, as this option fits most use cases where the on-premises migrated service is only exposed internally. This load-balancing option is used for all examples in this section and in the sample deployments on GitHub. If you have clients accessing the floating IP address from other regions, select the global access option.
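
As a sketch of the forwarding-rule side of such a setup, the global access option mentioned above maps to a single flag in Terraform. The names, region, port, and the referenced backend service are illustrative assumptions:

```hcl
# Forwarding rule for an internal passthrough Network Load Balancer.
resource "google_compute_forwarding_rule" "internal" {
  name                  = "floating-ip-rule"
  region                = "us-central1"
  load_balancing_scheme = "INTERNAL"
  # Assumed to be defined elsewhere in the configuration.
  backend_service       = google_compute_region_backend_service.app.id
  ip_protocol           = "TCP"
  ports                 = ["80"]
  network               = "default"
  subnetwork            = "default"

  # Let clients in other regions reach the load-balanced IP address.
  allow_global_access = true
}
```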

If your application communicates using protocols on top of IPv4 other than TCP or UDP, you must choose a pattern that doesn't use load balancing. Those patterns are described later in this document.

If your application uses HTTP(S), you can use an internal Application Load Balancer to implement the active-active pattern.

If the service you are trying to migrate is externally available, you can implement all the patterns that are discussed in this section by using an external passthrough Network Load Balancer. For active-active deployments, you can also use an external Application Load Balancer, a TCP proxy, or an SSL proxy if your application uses protocols and ports supported by those load balancing options.

Consider the following differences between on-premises floating-IP-address-based implementations and all load-balancing-based patterns:

  • Failover time: Pairing Keepalived with gratuitous ARP in an on-premises environment can fail over an IP address in a few seconds. In the Compute Engine environment, recovery time depends on the parameters you set. If the virtual machine (VM) instance or the service on the VM instance fails, the mean time to fail over traffic depends on health check parameters such as Check Interval and Unhealthy Threshold. With these parameters set to their default values, failover usually takes 15–20 seconds. You can reduce the time by decreasing those parameter values.

    In Compute Engine, failovers within zones or between zones take the same amount of time.

  • Protocols and Ports: In an on-premises setup, the floating IP addresses accept all traffic. Choose one of the following port specifications in the internal forwarding rule for the internal passthrough Network Load Balancer:

    • Specify at least one port and up to five ports by number.
    • Specify ALL to forward traffic on all ports for either TCP or UDP.
    • Use multiple forwarding rules with the same IP address to forward a mix of TCP and UDP traffic or to use more than five ports with a single IP address:
      • Only TCP or UDP and 1–5 ports: Use one forwarding rule.
      • TCP and UDP and 1–5 ports: Use multiple forwarding rules.
      • 6 or more ports and TCP or UDP: Use multiple forwarding rules.
  • Health checking: On-premises, you can check application responsiveness on a machine in several ways, for example through a heartbeat mechanism exchanged between the servers. In Compute Engine, the load-balancing patterns rely on Compute Engine health checks to determine application responsiveness.
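
The port rules above can be sketched as two forwarding rules sharing one internal IP address, one per protocol, alongside a health check tuned more aggressively than the defaults. Everything here, including the separate hypothetical backend services for TCP and UDP, is an illustrative assumption:

```hcl
# Health check with shorter intervals than the defaults, to reduce the
# 15-20 second default failover time mentioned above.
resource "google_compute_health_check" "fast" {
  name                = "fast-hc"
  check_interval_sec  = 2
  timeout_sec         = 2
  unhealthy_threshold = 2

  tcp_health_check {
    port = 80
  }
}

# Reserve one internal IP address to share between the forwarding rules.
resource "google_compute_address" "floating" {
  name         = "floating-ip"
  region       = "us-central1"
  address_type = "INTERNAL"
  subnetwork   = "default"
}

# TCP traffic on all ports for the shared address.
resource "google_compute_forwarding_rule" "tcp" {
  name                  = "floating-ip-tcp"
  region                = "us-central1"
  load_balancing_scheme = "INTERNAL"
  backend_service       = google_compute_region_backend_service.app_tcp.id
  ip_address            = google_compute_address.floating.id
  ip_protocol           = "TCP"
  all_ports             = true
  network               = "default"
  subnetwork            = "default"
}

# UDP traffic on all ports for the same address.
resource "google_compute_forwarding_rule" "udp" {
  name                  = "floating-ip-udp"
  region                = "us-central1"
  load_balancing_scheme = "INTERNAL"
  backend_service       = google_compute_region_backend_service.app_udp.id
  ip_address            = google_compute_address.floating.id
  ip_protocol           = "UDP"
  all_ports             = true
  network               = "default"
  subnetwork            = "default"
}
```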

Active-active load balancing

In the active-active load balancing pattern, your VMs are backends for an internal passthrough Network Load Balancer. You use the internal passthrough Network Load Balancer IP address as a virtual IP address. Traffic is equally distributed between the two backend instances. Traffic belonging to the same session goes to the same backend instance as defined in the session affinity settings.

Use the active-active load balancing pattern if your application only uses protocols based on TCP and UDP and doesn't require failover between machines. Use the pattern only in scenarios where the application can answer requests based on the content of the request itself. Don't use the pattern if there is machine state that isn't constantly synchronized between the backends, for example in a primary or secondary database.

The following diagram shows an implementation of the active-active load balancing pattern:

How an internal client navigates the active-active load balancing pattern.

The preceding diagram shows how an internal client accesses a service that runs on two VMs through an internal passthrough Network Load Balancer. Both VMs are part of an instance group.

The active-active load balancing pattern requires your service to expose health checks using one of the supported health check protocols to ensure that only responsive VMs receive traffic.

For a full sample implementation of this pattern, see the example deployment with Terraform on GitHub.
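
Abstracting from that sample, the backend side of the active-active pattern might look like the following Terraform sketch. The instance group, health check, and the CLIENT_IP session affinity choice are illustrative assumptions:

```hcl
# Backend service that spreads traffic across all VMs in the group.
resource "google_compute_region_backend_service" "active_active" {
  name                  = "active-active-backend"
  region                = "us-central1"
  load_balancing_scheme = "INTERNAL"
  protocol              = "TCP"
  # Health check assumed to be defined elsewhere in the configuration.
  health_checks = [google_compute_health_check.app.id]

  # Keep traffic belonging to the same client on the same backend VM.
  session_affinity = "CLIENT_IP"

  backend {
    # Instance group containing both VMs, assumed to exist.
    group = google_compute_instance_group.app.id
  }
}
```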

Load balancing with failover and application-exposed health checks

Similar to the active-active pattern, the load balancing with failover and application-exposed health checks pattern uses your VMs as backends for an internal passthrough Network Load Balancer. It also uses the internal passthrough Network Load Balancer IP address as a virtual IP address. To ensure that only one VM receives traffic at a time, this pattern applies failover for internal passthrough Network Load Balancers.

This pattern is recommended if your application only uses TCP or UDP traffic but doesn't support an active-active deployment. When you apply this pattern, all traffic flows to either the primary VM or the failover VM.
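
In Terraform terms, the failover behavior described above corresponds to marking one backend as the failover backend on the backend service. This is a hedged sketch with hypothetical primary and secondary instance groups:

```hcl
resource "google_compute_region_backend_service" "failover" {
  name                  = "failover-backend"
  region                = "us-central1"
  load_balancing_scheme = "INTERNAL"
  protocol              = "TCP"
  # Health check assumed to be defined elsewhere in the configuration.
  health_checks = [google_compute_health_check.app.id]

  # Primary backend: receives all traffic while it is healthy.
  backend {
    group = google_compute_instance_group.primary.id
  }

  # Failover backend: only receives traffic when the primary is unhealthy.
  backend {
    group    = google_compute_instance_group.secondary.id
    failover = true
  }

  failover_policy {
    # Terminate existing connections on failover instead of draining them.
    drop_traffic_if_unhealthy            = true
    disable_connection_drain_on_failover = true
  }
}
```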

The following diagram shows an implementation of the load balancing with failover and application-exposed health checks pattern:

How an internal client navigates a service behind an internal passthrough Network Load Balancer.

The preceding diagram shows how an internal client accesses a service behind an internal passthrough Network Load Balancer. Two VMs are in separate instance groups. One instance group is set as a primary bac