Network segmentation and connectivity for distributed applications in Cross-Cloud Network

Last reviewed 2024-04-05 UTC

This document is part of a design guide series for Cross-Cloud Network.

The series consists of the following parts:

This part explores the network segmentation structure and connectivity, which is the foundation of the design. This document explains the phases in which you make the following choices:

  • The overall network segmentation and project structure.
  • Where you place your workload.
  • How your projects are connected to external on-premises and other cloud provider networks, including the design for connectivity, routing, and encryption.
  • How your VPC networks are connected internally to each other.
  • How your Google Cloud VPC subnets are connected to each other and to other networks, including how you set up service reachability and DNS.

Network segmentation and project structure

During the planning stage, you must choose one of two project structures:

  • A consolidated infrastructure host project, in which you use a single infrastructure host project to manage all networking resources for all applications
  • Segmented host projects, in which you use an infrastructure host project in combination with a different host project for each application

During the planning stage, we recommend that you also decide the administrative domains for your workload environments. Scope the permissions for your infrastructure administrators and developers based on the principle of least privilege, and scope application resources into different application projects. Because infrastructure administrators need to set up connectivity to shared resources, they can manage those shared resources in an infrastructure project. At the same time, the development team might manage their workloads in one project, and the production team might manage their workloads in a separate project. Developers would then use the infrastructure resources in the infrastructure project to create and manage resources, services, load balancing, and DNS routing policies for their workloads.

In addition, you must decide how many VPC networks you will implement initially and how they will be organized in your resource hierarchy. For details about how to choose a resource hierarchy, see Decide a resource hierarchy for your Google Cloud landing zone. For details about how to choose the number of VPC networks, see Deciding whether to create multiple VPC networks.

For the Cross-Cloud Network, we recommend using the following VPCs:

  • One or more application VPCs to host the resources for the different applications.
  • A transit VPC, where all external connectivity is handled.
  • An optional central services VPC, which can be used to consolidate the deployment of private access to published services.

The following diagram shows the recommended VPC structure. You can use this VPC structure with either a consolidated or segmented project structure, as described in subsequent sections. The diagram doesn't show connectivity between the VPC networks.

Recommended VPC structure

Consolidated infrastructure host project

You can use a consolidated infrastructure host project to manage all networking resources such as subnets, peering, and load balancers.

Multiple application Shared VPCs with their corresponding application service projects can be created in the infrastructure host project to match the organization structure. Use multiple application service projects to delegate resource administration. All networking across all application VPCs is billed to the consolidated infrastructure host project.

For this project structure, many application service projects can share a smaller number of application VPCs.

The following diagram shows the consolidated infrastructure host project and multiple application service projects. The diagram does not show connectivity among all projects.

Consolidated infrastructure host project and multiple application service projects

Segmented host projects

In this pattern, each group of applications has its own application host project and VPCs. Multiple application service projects can be attached to the host project. Billing for network services is split between the infrastructure host project and application host projects. Infrastructure charges are billed to the infrastructure host project, and network charges for applications are billed to each application host project.

The following diagram shows the multiple host projects and multiple application service projects. The diagram does not show connectivity among all projects.

Multiple host projects and multiple application service projects

Workload placement

Many connectivity choices depend upon the regional locations of your workloads. For guidance on placing workloads, see Best practices for Compute Engine regions selection. You should decide where your workloads will be before choosing connectivity locations.

External and hybrid connectivity

This section describes the requirements and recommendations for the following connectivity paths:

  • Private connections to other cloud providers
  • Private connections to on-premises data centers
  • Internet connectivity for workloads, particularly outbound connectivity

Cross-Cloud Network involves the interconnection of multiple cloud networks or on-premises networks. External networks can be owned and managed by different organizations. These networks physically connect to each other at one or more network-to-network interfaces (NNIs). The combination of NNIs must be designed, provisioned, and configured for performance, resiliency, privacy, and security.

For modularity, reusability, and the ability to insert security NVAs, place external connections and routing in a transit VPC, which then serves as a shared connectivity service for other VPCs. Routing policies for resiliency, failover, and path preference across domains can be configured once in the transit VPC and leveraged by many other VPC networks.

The design of the NNIs and the external connectivity is used later for Internal connectivity and VPC networking.

The following diagram shows the transit VPC serving as a shared connectivity service for other VPCs, which are connected using VPC Network Peering, Network Connectivity Center, or HA VPN:

Transit VPC used as a shared connectivity service for other VPCs

Private connections to other cloud providers

If you have services running in other cloud service provider (CSP) networks that you want to connect to your Google Cloud network, you can connect to them over the internet or through private connections. We recommend private connections.

When choosing options, consider throughput, privacy, cost, and operational viability.

To maximize throughput while enhancing privacy, use a direct high-speed connection between cloud networks. Direct connections remove the need for intermediate physical networking equipment. We recommend that you use Cross-Cloud Interconnect, which provides these direct connections, as well as MACsec encryption and a throughput rate of up to 100 Gbps per link.

If you can't use Cross-Cloud Interconnect, you can use Dedicated Interconnect or Partner Interconnect through a colocation facility.

Select the locations where you connect to the other CSPs based on the location's proximity to the target regions. For location selection, consider the following:

  • Check the list of locations:
    • For Cross-Cloud Interconnect, check the list of locations that are available for both Google Cloud and CSPs (availability varies by cloud provider).
    • For Dedicated Interconnect or Partner Interconnect, choose a low-latency location for the colocation facility.
  • Evaluate the latency between the given point of presence (POP) edge and the relevant region in each CSP.
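The latency evaluation described above can be sketched as a simple comparison. The following Python sketch is illustrative only: the location names and latency figures are hypothetical placeholders for your own measurements, and ranking by the sum of the two latencies is an assumption, not a prescribed method.

```python
def best_location(latencies_ms):
    """Return the location with the lowest combined latency to both target regions."""
    return min(latencies_ms, key=lambda location: sum(latencies_ms[location]))

# Hypothetical measurements, in milliseconds:
# (POP edge to the Google Cloud region, POP edge to the other CSP's region).
candidate_latencies_ms = {
    "location-a": (4.0, 9.0),
    "location-b": (6.0, 5.0),
    "location-c": (3.0, 12.0),
}

print(best_location(candidate_latencies_ms))
```

You might prefer a different ranking, such as minimizing the worse of the two latencies, depending on which direction your workloads are most sensitive to.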

To maximize the reliability of your cross-cloud connections, we recommend a configuration that supports a 99.99% uptime SLA for production workloads. For details, see Cross-Cloud Interconnect High availability, Establish 99.99% availability for Dedicated Interconnect, and Establish 99.99% availability for Partner Interconnect.

If you don't require high bandwidth between different CSPs, it's possible to use VPN tunnels. This approach can help you get started, and you can upgrade to Cross-Cloud Interconnect when your distributed applications use more bandwidth. VPN tunnels can also achieve a 99.99% SLA. For details, see HA VPN topologies.

Private connections to on-premises data centers

For connectivity to private data centers, you can use one of the following hybrid connectivity options:

  • Dedicated Interconnect
  • Partner Interconnect
  • HA VPN

The routing considerations for these connections are similar to those for Private connections to other cloud providers.

The following diagram shows connections to on-premises networks and how on-premises routers can connect to Cloud Router through a peering policy:

Connections to on-premises networks

Inter-domain routing with external networks

To increase resiliency and throughput between the networks, use multiple paths to connect the networks.

When traffic that crosses network domains is inspected by stateful security devices, flow symmetry at the boundary between the domains is required.

For networks that transfer data across multiple regions, the cost and service quality level of each network might differ significantly. You might decide to use some networks over others, based on these differences.

Set up your inter-domain routing policy to meet your requirements for inter-regional transit, traffic symmetry, throughput, and resiliency.

The configuration of the inter-domain routing policies depends on the available functions at the edge of each domain. Configuration also depends on how the neighboring domains are structured from an autonomous system and IP addressing (subnetting) perspective across different regions. To improve scalability without exceeding prefix limits on edge devices, we recommend that your IP addressing plan results in fewer aggregate prefixes for each region and domain combination.
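To see how an addressing plan affects the number of advertised prefixes, consider the following Python sketch. It uses the standard library ipaddress module to collapse subnets into the fewest covering prefixes. The subnet ranges are hypothetical and are chosen only to contrast a contiguous regional allocation with a scattered one.

```python
import ipaddress

def summarize(prefixes):
    """Collapse adjacent or overlapping CIDR blocks into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(prefix) for prefix in prefixes]
    return [str(network) for network in ipaddress.collapse_addresses(networks)]

# Hypothetical subnets allocated contiguously within one region.
contiguous_subnets = ["10.10.0.0/24", "10.10.1.0/24", "10.10.2.0/24", "10.10.3.0/24"]

# Hypothetical subnets allocated without a regional plan.
scattered_subnets = ["10.10.0.0/24", "10.20.5.0/24", "10.30.9.0/24", "10.40.1.0/24"]

print(summarize(contiguous_subnets))  # one aggregate: 10.10.0.0/22
print(summarize(scattered_subnets))   # four separate prefixes
```

With the contiguous plan, an edge device needs to carry one prefix for the region and domain combination instead of four.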

When designing inter-regional routing, consider the following:

  • Google Cloud VPC networks and Cloud Router both support global cross-region routing. Other CSPs might have regional VPCs and Border Gateway Protocol (BGP) scopes. For details, see the documentation from your other CSP.
  • Cloud Router automatically advertises routes with predetermined path preferences based on regional proximity. This routing behavior depends on the configured dynamic routing mode of the VPC. You might need to override these preferences to get the routing behavior that you want.
  • Different CSPs support different BGP and Bidirectional Forwarding Detection (BFD) functions, and Google's Cloud Router also has specific route policy capabilities as described in Establish BGP sessions.
  • Different CSPs might use different BGP tie-breaking attributes to dictate preference for routes. Consult your CSP's documentation for details.

Single region inter-domain routing

We suggest that you start with single region inter-domain routing, which you build upon to create multiple region connections with inter-domain routing.

Designs that use Cloud Interconnect are required to have a minimum of two connection locations that are in the same region but different edge availability domains.

Decide whether to configure these duplicate connections in an active/active or active/passive design:

  • Active/active uses Equal Cost Multi-Path (ECMP) routing to aggregate the bandwidth of both paths and use them simultaneously for inter-domain traffic. Cloud Interconnect also supports the use of LACP-aggregated links to achieve up to 200 Gbps of aggregate bandwidth per path.
  • Active/passive forces one link to be a ready standby, only taking on traffic if the active link is interrupted.

We recommend an active/active design for intra-regional links. However, certain on-premises networking topologies combined with the use of stateful security functions can necessitate an active/passive design.
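The difference between the two designs can be illustrated with a small Python model. This is a conceptual sketch, not how Cloud Router or any vendor's router implements path selection: the SHA-256 hash, the path names, and the example flow are all assumptions made for illustration.

```python
import hashlib

def ecmp_path(flow, paths):
    """Active/active: hash the flow 5-tuple so each flow stays on one path."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return paths[int.from_bytes(digest[:4], "big") % len(paths)]

def active_passive_path(paths, healthy):
    """Active/passive: use the first healthy path in priority order."""
    for path in paths:
        if healthy[path]:
            return path
    raise RuntimeError("no healthy path")

# Hypothetical path names and flow (source IP, destination IP, protocol, ports).
paths = ["interconnect-a", "interconnect-b"]
flow = ("10.0.0.5", "192.168.1.9", 6, 44321, 443)

print(ecmp_path(flow, paths))
print(active_passive_path(paths, {"interconnect-a": True, "interconnect-b": True}))
print(active_passive_path(paths, {"interconnect-a": False, "interconnect-b": True}))
```

In the active/active model, the hash keeps a given flow on one path, but the peer network makes its own choice for return traffic. That independence is one reason stateful security functions can push a design toward active/passive.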

Cloud Router is instantiated across multiple zones, which provides higher resiliency than a single element would provide. As a result, all resilient connections can converge at a single Cloud Router within a region. This design can support a 99.9% availability SLA within a single metropolitan area when you follow the guidelines in Establish 99.9% availability for Dedicated Interconnect.

The following diagram shows two on-premises routers connected redundantly to the managed Cloud Router service in a single region:

Resilient connections can use a single Cloud Router

Multi-region inter-domain routing

To provide backup connectivity, networks can peer at multiple geographical areas. By connecting the networks in multiple regions, the availability SLA can increase to 99.99%.
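The difference between the two availability levels is easier to weigh as a downtime budget. The following Python sketch does the arithmetic. The 30-day period is an assumption made for illustration; the applicable SLA terms define the actual measurement period and exclusions.

```python
def allowed_downtime_minutes(availability_percent, days=30):
    """Minutes of downtime permitted in a period at the given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)

for sla in (99.9, 99.99):
    print(f"{sla}% -> {allowed_downtime_minutes(sla):.2f} minutes per 30 days")
```

Over 30 days, 99.9% allows about 43.2 minutes of downtime, and 99.99% allows about 4.32 minutes.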

The following diagram shows the 99.99% SLA architectures. It shows on-premises routers in two different locations connected redundantly to the managed Cloud Router services in two different regions.

Cloud Interconnect connections in multiple regions

Beyond resiliency, the multi-regional routing design should accomplish flow symmetry. The design should also indicate the preferred network for inter-regional communications, which you can do with hot-potato and cold-potato routing. Pair cold-potato routing in one domain with hot-potato routing in the peer domain. For the cold-potato domain, we recommend using the Google Cloud network domain, which provides global VPC routing functionality.

Flow symmetry isn't always mandatory, but flow asymmetry can cause issues with stateful security functions.

The following diagram shows how you can use hot-potato and cold-potato routing to specify your preferred inter-regional transit network. In this case, traffic from prefixes X and Y stays on the originating network until it gets to the region closest to the destination (cold-potato routing). Traffic from prefixes A and B switches to the other network in the originating region, then travels across the other network to the destination (hot-potato routing).

Using hot-potato and cold-potato routing to specify your preferred inter-regional transit network
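To check why pairing cold-potato routing in one domain with hot-potato routing in the peer domain produces symmetric flows, you can model the handoff point. The following Python sketch is a conceptual model only. The region and network names are hypothetical, and real deployments express these policies through BGP attributes rather than function calls.

```python
def handoff_region(policy, source_region, destination_region):
    """Return the region where traffic crosses from the originating network to the peer."""
    if policy == "hot-potato":
        # Hand off as early as possible, in the region where the traffic originates.
        return source_region
    if policy == "cold-potato":
        # Stay on the originating network and hand off in the region nearest the destination.
        return destination_region
    raise ValueError(f"unknown policy: {policy}")

def transit_network(policy, origin_network, peer_network):
    """Return the network that carries the inter-regional leg."""
    return peer_network if policy == "hot-potato" else origin_network

# Hypothetical flow: a Google Cloud workload in region-1 talks to an
# on-premises host near region-2. Google Cloud uses cold-potato routing
# and the on-premises network uses hot-potato routing.
forward = handoff_region("cold-potato", "region-1", "region-2")
reverse = handoff_region("hot-potato", "region-2", "region-1")

print(forward, reverse)
print(transit_network("cold-potato", "google-cloud", "on-premises"))
print(transit_network("hot-potato", "on-premises", "google-cloud"))
```

In this model, both directions cross the boundary in region-2 and both use the Google Cloud network for the inter-regional leg, so a stateful device at that boundary sees both halves of the flow.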

Encryption of inter-domain traffic

Unless otherwise noted, traffic is not encrypted on Cloud Interconnect connections between different CSPs or between Google Cloud and on-premises data centers. If your organization requires encryption for this traffic, you can use the following capabilities:

  • MACsec for Cloud Interconnect: Encrypts traffic over Cloud Interconnect connections between your routers and Google's edge routers. For details, see MACsec for Cloud Interconnect overview.
  • HA VPN over Cloud Interconnect: Uses multiple HA VPN tunnels to provide the full bandwidth of the underlying Cloud Interconnect connections. The HA VPN tunnels are IPsec encrypted and are deployed over Cloud Interconnect connections that can also be MACsec encrypted. In this configuration, Cloud Interconnect connections are configured to allow only HA VPN traffic. For details, see HA VPN over Cloud Interconnect overview.

Internet connectivity for workloads

For both inbound and outbound internet connectivity, reply traffic is assumed to statefully follow the reverse path of the original request.

Generally, features that provide inbound internet connectivity are separate from outbound internet features, with the exception of external IP addresses, which provide connectivity in both directions simultaneously.

Inbound internet connectivity

Inbound internet connectivity is mainly concerned with providing public endpoints for services hosted on the cloud. Examples of this include internet connectivity to web application servers and game servers hosted on Google Cloud.

The main features providing inbound internet connectivity are Google's Cloud Load Balancing products.

All types of Cloud Load Balancing provide their own path for traffic returning to the internet source, regardless of whether you use VPC special return paths or user-defined proxy subnets. The design of the VPC is generally independent of its ability to provide inbound internet connectivity.

Outbound internet connectivity

Examples of outbound internet connectivity (where the initial request originates from the workload to an internet destination) include workloads accessing third-party APIs, downloading software packages and updates, and sending push notifications to webhook endpoints on the internet.

For outbound connectivity, you can use Google Cloud built-in options, as described in Building internet connectivity for private VMs. Alternatively, you can use central NGFW NVAs as described in Network security.

The main path to provide outbound internet connectivity is the default internet gateway destination in the VPC routing table, which is often the default route in Google VPCs. Both external IPs and Cloud NAT (Google Cloud's managed NAT service) require a route pointing at the default internet gateway of the VPC. Therefore, VPC routing designs that override the default route must provide outbound connectivity through other means. For details, see Cloud Router overview.
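The effect of overriding the default route can be illustrated with a simplified route lookup. The following Python sketch assumes a reduced selection model (longest prefix first, then lowest priority value) and does not reproduce the full VPC routing order. The route entries and next-hop names other than the default internet gateway are hypothetical.

```python
import ipaddress

def select_next_hop(routes, destination):
    """Simplified route selection: longest prefix first, then lowest priority value."""
    address = ipaddress.ip_address(destination)
    matches = [r for r in routes if address in ipaddress.ip_network(r["dest"])]
    if not matches:
        return None  # No matching route, so the packet is dropped.
    best = min(
        matches,
        key=lambda r: (-ipaddress.ip_network(r["dest"]).prefixlen, r["priority"]),
    )
    return best["next_hop"]

# Hypothetical routing table with the default route intact.
default_routes = [
    {"dest": "10.0.0.0/8", "priority": 1000, "next_hop": "peering-to-transit-vpc"},
    {"dest": "0.0.0.0/0", "priority": 1000, "next_hop": "default-internet-gateway"},
]

# Same table plus a hypothetical, more preferred default route that points at an NVA.
overridden_routes = default_routes + [
    {"dest": "0.0.0.0/0", "priority": 900, "next_hop": "nva-internal-lb"},
]

print(select_next_hop(default_routes, "203.0.113.10"))
print(select_next_hop(overridden_routes, "203.0.113.10"))
```

In the second table, internet-bound traffic no longer reaches the default internet gateway, so outbound connectivity depends on what the device behind the custom route provides.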

To secure outbound connectivity, Google Cloud offers both Cloud Next Generation Firewall enforcement and Secure Web Proxy to provide deeper filtering on HTTP and HTTPS URLs. In all cases, however, the traffic follows the default route out to the default internet gateway or through a custom default route in the VPC routing table.

Using your own IPs

You can use Google-owned IPv4 addresses for internet connectivity or you can use Bring your own IP addresses (BYOIP) to use an IPv4 space that your organization owns. Most Google Cloud products that require an internet-routable IP address support using BYOIP ranges instead.

Because you have exclusive use of your own IP space, you can also control its reputation. In addition, BYOIP helps with portability of connectivity, and can save IP address costs.

Internal connectivity and VPC networking

With the external and hybrid connectivity service configured, resources in the transit VPC can reach the external networks. The next step is to make this connectivity available to the resources that are hosted in other VPCs.

The following diagram shows the general structure of the VPC networks, regardless of how you enabled external connectivity. It shows a transit VPC that terminates external connections and hosts a Cloud Router in every region. Each Cloud Router receives routes from its external peers over the NNIs in each region. Application VPCs are connected to the transit VPC so they can share external connectivity. In addition, the transit VPC functions as a hub for the spoke VPCs. The spoke VPCs can host applications, services, or a combination of both.

General structure of Cross-Cloud Network

Configure DNS forwarding and peering in the transit VPC as well. For details, see the DNS infrastructure design section.

For better performance and built-in cloud networking services, we recommend interconnecting VPCs with Network Connectivity Center or VPC Network Peering, rather than HA VPN.

Private Service Connect endpoints and private services access frontends aren't reachable across VPC Network Peering or Network Connectivity Center. To overcome those limitations, we recommend using HA VPN for inter-VPC connectivity. Because HA VPN for inter-VPC connectivity can result in lower throughput and increased operational overhead, the centralized services design reduces the span of the HA VPN deployment.

Alternatively, you can implement inter-VPC transitive connectivity to published service endpoints by placing an internal proxy Network Load Balancer in front of the service endpoints. This approach has some limitations to consider, which are discussed in Connectivity with Network Connectivity Center spoke VPCs using load balancing.

The following sections discuss the possible designs for hybrid connectivity that support base IP connectivity as well as API endpoint deployments.

Inter-VPC connectivity for centralized services

When published service endpoints can be deployed in a central services VPC, we recommend that the application VPCs connect over VPC Network Peering to the hub (transit VPC) and that the central services VPC connects to the hub through HA VPN.

In this design, the transit VPC is the hub, and you deploy the forwarding rules for private service endpoints in a central services VPC. Make the central services VPC a Shared VPC network and let administrators of service projects create service endpoints in the shared network.

The design is a combination of two connectivity types:

  • VPC Network Peering: provides connectivity between the hub VPC and the application VPCs.
  • HA VPN inter-VPC connections: provide transitive connectivity for the central services VPC to the hub.

For detailed guidance and configuration blueprints to deploy these connectivity types, see Hub-and-spoke network architecture.

When you combine these architectures, plan for the following considerations:

  • Redistribution of VPC peer subnets into dynamic routing (to HA VPN and to hybrid)
  • Multi-regional routing considerations
  • Propagation of dynamic routes into VPC peering (from HA VPN and from hybrid)

The following diagram shows a central services VPC connected to the transit VPC with HA VPN, and the application VPCs connected to the transit VPC with VPC Network Peering:

Central services VPC connected to the transit VPC with HA VPN, and the application VPCs connected to the transit VPC with VPC Network Peering

The structure shown in the preceding diagram contains these components:

  • Customer location: A data center or remote office where you have network equipment. This example assumes that the locations are connected together using an external network.
  • Metro: A metropolitan area containing one or more Cloud Interconnect edge availability domains. Cloud Interconnect connects to other networks in such metropolitan areas.
  • Hub project: A project hosting at least one VPC network that serves as a hub to other VPC networks.
  • Transit VPC: A VPC network in the hub project that lands connections from on-premises and other CSPs, then serves as a transit path from other VPCs to on-premises and CSP networks.
  • App host projects and VPCs: Projects and VPC networks hosting various applications.
  • Services VPC: A VPC network hosting centralized access to services needed by applications in the application VPC networks.
  • Managed services VPC: Services provided and managed by other entities, but made accessible to applications running in VPC networks.

For the hub-and-spoke design, when application VPCs need to communicate with each other, you can connect the application VPCs to a Network Connectivity Center hub as spokes. This approach provides connectivity amongst all the VPCs in the Network Connectivity Center hub. Subgroups of communication can be created by using multiple Network Connectivity Center hubs. Any communication restrictions required among endpoints within a particular hub can be achieved using firewall policies.

Connectivity with Network Connectivity Center spoke VPCs using load balancing

This pattern includes all VPCs as spokes in a Network Connectivity Center hub, and can accommodate up to 250 interconnected VPCs. A Network Connectivity Center hub is a management plane construct that creates full mesh data plane connectivity amongst any VPC networks that are registered as spokes to the Network Connectivity Center hub. The pattern provides any-to-any connectivity and enables the deployment of managed services in any VPC, removing the need to decide between central or distributed services.
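To get a sense of what full mesh connectivity means at this scale, you can count the distinct VPC pairs that the mesh covers. The following Python sketch is plain combinatorics and makes no claim about how Network Connectivity Center implements the mesh internally. The spoke counts other than 250 are arbitrary examples.

```python
def full_mesh_pairs(spoke_count):
    """Number of distinct VPC pairs in a full mesh of spoke_count spokes."""
    return spoke_count * (spoke_count - 1) // 2

for spokes in (5, 50, 250):
    print(spokes, "spokes ->", full_mesh_pairs(spokes), "VPC pairs")
```

At 250 spokes, the mesh covers 31,125 VPC pairs, which is why a hub construct that you configure once per spoke is more practical than connections that you manage pair by pair.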

To overcome transitivity limitations, managed services and hybrid connections are accessed through internal proxy Network Load Balancers. Workload security for east-west connections can use the Cloud Next Generation Firewall. You can also use Inter-VPC NAT with this pattern.

Before you adopt this pattern, consider the following limitations:

  • You can't use NVAs for perimeter firewalls with this pattern. Perimeter firewalls must remain on external networks.
  • Only TCP traffic is supported to and from external networks. This limitation occurs because connections to external networks run through an internal proxy Network Load Balancer.
  • Published services will have an additional frontend in the proxy load balancer. This additional frontend proliferates additional records in DNS and requires split-DNS lookups.
  • Layer 4 services require a new internal proxy Network Load Balancer for every new service. You might need different load balancers depending on the required protocols for the connection.
  • Load Balancing quotas are limited for each VPC network. This is an important consideration because Layer 4 services require a new internal proxy Network Load Balancer for each destination service.
  • The chosen high availability and cross-region failover design option depends on your requirements.
  • Encrypted traffic across the hybrid boundary has implications on certificate management coordination.

If the preceding considerations are manageable compromises, or irrelevant to your environment, we recommend this pattern as the preferred option.

The following diagram shows a Network Connectivity Center hybrid hub as a management plane for the Cloud Interconnect connections. It also shows a Network Connectivity Center VPC hub connecting the application and services VPC spokes:

Application VPCs connected as spokes to a Network Connectivity Center hub

Proxy load balancing for transitivity

The following are not reachable across Network Connectivity Center spoke VPCs:

  • Private Service Connect service endpoints and managed service frontends.
  • Networks behind hybrid connections (Cloud Interconnect or HA VPN) because dynamic routes are not propagated over Network Connectivity Center.

These transitivity limitations can be overcome by deploying internal proxy Network Load Balancer