Designing networks for migrating enterprise workloads: Architectural approaches

Last reviewed 2023-11-13 UTC

This document introduces a series that describes networking and security architectures for enterprises that are migrating data center workloads to Google Cloud. These architectures emphasize advanced connectivity, zero-trust security principles, and manageability across a hybrid environment.

As described in an accompanying document, Architectures for Protecting Cloud Data Planes, enterprises deploy a spectrum of architectures that factor in connectivity and security needs in the cloud. We classify these architectures into three distinct patterns: lift-and-shift, hybrid services, and zero-trust distributed. The current document considers different security approaches, depending on which architecture an enterprise has chosen. It also describes how to realize those approaches using the building blocks provided by Google Cloud. You should use this security guidance in conjunction with other architectural guidance that covers reliability, availability, scale, performance, and governance.

This document is designed to help systems architects, network administrators, and security administrators who are planning to migrate on-premises workloads to the cloud. It assumes the following:

  • You are familiar with data center networking and security concepts.
  • You have existing workloads in your on-premises data center and are familiar with what they do and who their users are.
  • You have at least some workloads that you plan to migrate.
  • You are generally familiar with the concepts described in Architectures for Protecting Cloud Data Planes.

The series consists of the following documents:

This document summarizes the three primary architectural patterns and introduces the resource building blocks that you can use to create your infrastructure. Finally, it describes how to assemble the building blocks into a series of reference architectures that match the patterns. You can use these reference architectures to guide your own architecture.

This document mentions virtual machines (VMs) as examples of workload resources. The information applies to other resources that use VPC networks, like Cloud SQL instances and Google Kubernetes Engine nodes.

Overview of architectural patterns

Typically, network engineers have focused on building the physical networking infrastructure and security infrastructure in on-premises data centers.

The journey to the cloud has changed this approach because cloud networking constructs are software-defined. In the cloud, application owners have limited control of the underlying infrastructure stack. They need a model that has a secure perimeter and provides isolation for their workloads.

In this series, we consider three common architectural patterns. These patterns build on one another, and they can be seen as a spectrum rather than a strict choice.

Lift-and-shift pattern

In the lift-and-shift architectural pattern, enterprise application owners migrate their workloads to the cloud without refactoring those workloads. Network and security engineers use Layer 3 and Layer 4 controls to provide protection using a combination of network virtual appliances that mimic on-premises physical devices and cloud firewall rules in the VPC network. Workload owners deploy their services in VPC networks.

Hybrid services pattern

Workloads that are built using lift-and-shift might need access to cloud services such as BigQuery or Cloud SQL. Typically, access to such cloud services is at Layer 4 and Layer 7. In this context, isolation and security can't be achieved strictly at Layer 3. Therefore, service networking and VPC Service Controls are used to provide connectivity and security based on the identities of the service that's being accessed and the service that's requesting access. In this model, it's possible to express rich access-control policies.

Zero-trust distributed pattern

In a zero-trust architecture, enterprise applications extend security enforcement beyond perimeter controls. Inside the perimeter, workloads can communicate with other workloads only if their IAM identity has specific permission, which is denied by default. In a zero-trust distributed architecture, trust is identity-based and enforced for each application. Workloads are built as microservices that have centrally issued identities. That way, services can validate their callers and make policy-based decisions for each request about whether that access is acceptable. This architecture is often implemented using distributed proxies (a service mesh) instead of centralized gateways.

Enterprises can enforce zero-trust access from users and devices to enterprise applications by configuring Identity-Aware Proxy (IAP). IAP provides identity- and context-based controls for user traffic from the internet or intranet.

Combining patterns

Enterprises that are building or migrating their business applications to the cloud usually use a combination of all three architectural patterns.

Google Cloud offers a portfolio of products and services that serve as building blocks to implement the cloud data plane that powers the architectural patterns. These building blocks are discussed later in this document. The combination of controls that are provided in the cloud data plane, together with administrative controls to manage cloud resources, form the foundation of an end-to-end security perimeter. The perimeter that's created by this combination lets you govern, deploy, and operate your workloads in the cloud.

Resource hierarchy and administrative controls

This section presents a summary of the administrative controls that Google Cloud provides as resource containers. The controls include Google Cloud organization resources, folders, and projects that let you group and hierarchically organize cloud resources. This hierarchical organization provides you with an ownership structure and with anchor points for applying policy and controls.

A Google Cloud organization resource is the root node in the hierarchy and is the foundation for creating deployments in the cloud. An organization resource can have folders and projects as children. A folder has projects or other folders as children. All other cloud resources are the children of projects.

You use folders as a method of grouping projects. Projects form the basis for creating, enabling, and using all Google Cloud services. Projects let you manage APIs, enable billing, add and remove collaborators, and manage permissions.

Using Google Identity and Access Management (IAM), you can assign roles and define access policies and permissions at all resource hierarchy levels. IAM policies are inherited by resources lower in the hierarchy. These policies can't be altered by resource owners who are lower in the hierarchy. In some cases, the identity and access management is provided at a more granular level, for example at the scope of objects in a namespace or cluster as in Google Kubernetes Engine.
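As a hedged sketch, the hierarchy and inherited policy described above might be set up with the gcloud CLI as follows; the organization ID, folder ID, project ID, and group address are all placeholders:

```shell
# Create a folder under the organization (123456789012 is a placeholder org ID).
gcloud resource-manager folders create \
    --display-name="shared-infrastructure" \
    --organization=123456789012

# Create a project inside that folder (345678901234 is the folder ID
# returned by the previous command).
gcloud projects create example-network-host --folder=345678901234

# Grant a role at the folder level. The binding is inherited by every
# project and resource below the folder and can't be removed lower down.
gcloud resource-manager folders add-iam-policy-binding 345678901234 \
    --member="group:network-admins@example.com" \
    --role="roles/compute.networkAdmin"
```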

Design considerations for Google Virtual Private Cloud networks

When you're designing a migration strategy to the cloud, it's important to develop a strategy for how your enterprise will use VPC networks. You can think of a VPC network as a virtual version of your traditional physical network. It is a completely isolated, private network partition. By default, workloads or services that are deployed in one VPC network cannot communicate with workloads in another VPC network. VPC networks therefore enable workload isolation by forming a security boundary.

Because each VPC network in the cloud is a fully virtual network, each has its own private IP address space. You can therefore use the same IP address in multiple VPC networks without conflict. A typical on-premises deployment might consume a large portion of the RFC 1918 private IP address space. If you have workloads both on-premises and in VPC networks, you can reuse the same address ranges in different VPC networks, as long as those networks aren't connected or peered. This reuse helps you conserve IP address space.

VPC networks are global

VPC networks in Google Cloud are global, which means that resources that are deployed in different regions of a VPC network can communicate with each other directly over Google's private backbone.

As figure 1 shows, you can have a VPC network in your project that contains subnetworks in different regions that span multiple zones. The VMs in any region can communicate privately with each other using the local VPC routes.

Figure 1. Google Cloud global VPC network implementation with subnetworks configured in different regions.
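The layout in figure 1 can be approximated with the gcloud CLI; the network name, subnet names, and IP ranges below are illustrative, not prescribed:

```shell
# Create a custom-mode VPC network; the network itself is global.
gcloud compute networks create example-vpc --subnet-mode=custom

# Add subnetworks in two different regions. VMs in either subnet can
# reach each other privately over local VPC routes on Google's backbone.
gcloud compute networks subnets create subnet-us \
    --network=example-vpc --region=us-central1 --range=10.10.0.0/24
gcloud compute networks subnets create subnet-eu \
    --network=example-vpc --region=europe-west1 --range=10.20.0.0/24
```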

Sharing a network using Shared VPC

Shared VPC lets an organization resource connect multiple projects to a common VPC network so that they can communicate with each other securely using internal IP addresses from the shared network. Network administrators for that shared network apply and enforce centralized control over network resources.

When you use Shared VPC, you designate a project as a host project and attach one or more service projects to it. The VPC networks in the host project are called Shared VPC networks. Eligible resources from service projects can use subnets in the Shared VPC network.

Enterprises typically use Shared VPC networks when they need network and security administrators to centralize management of network resources such as subnets and routes. At the same time, Shared VPC networks let application and development teams create and delete VM instances and deploy workloads in designated subnets using the service projects.
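The host and service project relationship might be configured as in the following sketch; the project IDs, subnet name, and group address are placeholders:

```shell
# Designate a host project and attach a service project to it.
gcloud compute shared-vpc enable example-host-project
gcloud compute shared-vpc associated-projects add example-service-project \
    --host-project=example-host-project

# Let an application team deploy workloads into a designated shared subnet.
gcloud compute networks subnets add-iam-policy-binding shared-subnet \
    --region=us-central1 --project=example-host-project \
    --member="group:app-team@example.com" \
    --role="roles/compute.networkUser"
```

Granting roles/compute.networkUser on a single subnet (rather than the whole host project) is what lets network administrators keep central control while delegating only designated subnets to development teams.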

Isolating environments by using VPC networks

Using VPC networks to isolate environments has a number of advantages, but you need to consider a few disadvantages as well. This section addresses these tradeoffs and describes common patterns for implementing isolation.

Reasons to isolate environments

Because VPC networks represent an isolation domain, many enterprises use them to keep environments or business units in separate domains. Common reasons to create VPC-level isolation are the following:

  • An enterprise wants to establish default-deny communications between one VPC network and another, because these networks represent an organizationally meaningful distinction. For more information, see Common VPC network isolation patterns later in this document.
  • An enterprise needs to have overlapping IP address ranges because of pre-existing on-premises environments, because of acquisitions, or because of deployments to other cloud environments.
  • An enterprise wants to delegate full administrative control of a network to a portion of the enterprise.

Disadvantages of isolating environments

Creating isolated environments with VPC networks can have some disadvantages. Having multiple VPC networks can increase the administrative overhead of managing the services that span multiple networks. This document discusses techniques that you can use to manage this complexity.

Common VPC network isolation patterns

There are some common patterns for isolating VPC networks:

  • Isolate development, staging, and production environments. This pattern lets enterprises fully segregate their development, staging, and production environments from each other. In effect, this structure maintains multiple complete copies of applications, with progressive rollout between each environment. In this pattern, VPC networks are used as security boundaries. Developers have a high degree of access to development VPC networks to do their day-to-day work. When development is finished, an engineering production team or a QA team can migrate the changes to a staging environment, where the changes can be tested in an integrated fashion. When the changes are ready to be deployed, they are sent to a production environment.
  • Isolate business units. Some enterprises want to impose a high degree of isolation between business units, especially in the case of units that were acquired or ones that demand a high degree of autonomy and isolation. In this pattern, enterprises often create a VPC network for each business unit and delegate control of that VPC to the business unit's administrators. The enterprise uses techniques that are described later in this document to expose services that span the enterprise or to host user-facing applications that span multiple business units.

Recommendation for creating isolated environments

We recommend that you design your VPC networks to have the broadest domain that aligns with the administrative and security boundaries of your enterprise. You can achieve additional isolation between workloads that run in the same VPC network by using security controls such as firewalls.

For more information about designing and building an isolation strategy for your organization, see Best practices and reference architectures for VPC design and Networking in the Google Cloud enterprise foundations blueprint.

Building blocks for cloud networking

This section discusses the important building blocks for network connectivity, network security, service networking, and service security. Figure 2 shows how these building blocks relate to one another. You can use one or more of the products that are listed in a given row.

Figure 2. Building blocks in the realm of cloud network connectivity and security.

The following sections discuss each of the building blocks and which Google Cloud services you can use for each of the blocks.

Network connectivity

The network connectivity block is at the base of the hierarchy. It's responsible for connecting Google Cloud resources to on-premises data centers or other clouds. Depending on your needs, you might need only one of these products, or you might use all of them to handle different use cases.

Cloud VPN

Cloud VPN lets you connect your remote branch offices or other cloud providers to Google VPC networks through an IPsec VPN connection. Traffic traveling between the two networks is encrypted by one VPN gateway and then decrypted by the other VPN gateway, thereby helping to protect data as it traverses the internet.

Cloud VPN lets you enable connectivity between your on-premises environment and Google Cloud without the overhead of provisioning the physical cross-connects that Cloud Interconnect requires (described in the next section). You can provision an HA VPN gateway to meet an SLA of up to 99.99% availability if you deploy a conforming topology. Consider using Cloud VPN if your workloads don't require low latency or high bandwidth. For example, Cloud VPN is a good choice for non-mission-critical use cases or for extending connectivity to other cloud providers.
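A minimal HA VPN starting point might look like the following sketch; the gateway, router, network names, and ASN are placeholders, and the tunnel and BGP session configuration to the peer gateway is omitted:

```shell
# Create an HA VPN gateway and a Cloud Router in the same network and region.
gcloud compute vpn-gateways create example-ha-vpn \
    --network=example-vpc --region=us-central1

gcloud compute routers create example-router \
    --network=example-vpc --region=us-central1 --asn=65010

# Describe the gateway to get its two interface IP addresses. Configuring
# tunnels on both interfaces to a conforming peer topology is what makes
# the deployment eligible for the 99.99% availability SLA.
gcloud compute vpn-gateways describe example-ha-vpn --region=us-central1
```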

Cloud Interconnect

Cloud Interconnect provides enterprise-grade dedicated connectivity to Google Cloud that has higher throughput and more reliable network performance compared to using VPN or internet ingress. Dedicated Interconnect provides direct physical connectivity to Google's network from your routers. Partner Interconnect provides dedicated connectivity through an extensive network of partners, who might offer broader reach or other bandwidth options than Dedicated Interconnect does. Cross-Cloud Interconnect provides dedicated direct connectivity from your VPC networks to other cloud providers. Dedicated Interconnect requires that you connect at a colocation facility where Google has a presence, but Partner Interconnect does not. Cross-Cloud Interconnect lets you select locations that meet your requirements to establish the connections. Cloud Interconnect ensures that the traffic between your on-premises network or other cloud network and your VPC network doesn't traverse the public internet.

You can provision these Cloud Interconnect connections to meet an SLA requirement of up to 99.99% availability if you provision the appropriate architecture. You can consider using Cloud Interconnect to support workloads that require low latency, high bandwidth, and predictable performance while ensuring that all of your traffic stays private.
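After a Dedicated Interconnect connection is provisioned, attaching it to a VPC network might look like the following sketch; the interconnect, router, and attachment names are placeholders for resources that already exist in your project:

```shell
# Create a VLAN attachment that links an existing Dedicated Interconnect
# connection to a Cloud Router, so that routes can be exchanged over BGP
# between your on-premises network and the VPC network.
gcloud compute interconnects attachments dedicated create example-attachment \
    --region=us-central1 \
    --router=example-router \
    --interconnect=example-interconnect
```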

Network Connectivity Center for hybrid

Network Connectivity Center provides site-to-site connectivity among your on-premises and other cloud networks. It does this using Google's backbone network to deliver reliable connectivity among your sites.

Additionally, you can extend your existing SD-WAN overlay network to Google Cloud by configuring a VM or a third-party vendor router appliance as a logical spoke attachment.

You can access resources inside the VPC networks by using router appliance, VPN, or Cloud Interconnect spoke attachments. You can use Network Connectivity Center to consolidate connectivity between your on-premises sites, your deployments in other clouds, and Google Cloud, and to manage it all in a single view.

Network Connectivity Center for VPC networks

Network Connectivity Center also lets you create a mesh between many VPC networks. You can connect this mesh to on-premises or other clouds using Cloud VPN or Cloud Interconnect.
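A hub-and-spoke mesh between two VPC networks might be sketched as follows; the hub, spoke, and network names are placeholders:

```shell
# Create a Network Connectivity Center hub.
gcloud network-connectivity hubs create example-hub

# Attach two VPC networks as spokes. Spokes attached to the same hub
# can exchange traffic, forming a mesh among the member networks.
gcloud network-connectivity spokes linked-vpc-network create spoke-a \
    --hub=example-hub --vpc-network=vpc-a --global
gcloud network-connectivity spokes linked-vpc-network create spoke-b \
    --hub=example-hub --vpc-network=vpc-b --global
```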

VPC Network Peering

VPC Network Peering lets you connect Google VPC networks so that workloads in different VPC networks can communicate internally regardless of whether they belong to the same project or to the same organization resource. Traffic stays within Google's network and doesn't traverse the public internet.

VPC Network Peering requires that the networks to be peered do not have overlapping IP addresses.
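Because peering is a bilateral relationship, it must be configured from both sides before it becomes active. The project and network names in the following sketch are placeholders:

```shell
# Create the peering from network A's side...
gcloud compute networks peerings create peer-a-to-b \
    --project=project-a --network=vpc-a \
    --peer-project=project-b --peer-network=vpc-b

# ...and the matching peering from network B's side. The peering becomes
# active only after both sides exist, and the networks' subnet ranges
# must not overlap.
gcloud compute networks peerings create peer-b-to-a \
    --project=project-b --network=vpc-b \
    --peer-project=project-a --peer-network=vpc-a
```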

Network security

The network security block sits on top of the network connectivity block. It's responsible for allowing or denying access to resources based on the characteristics of IP packets.

VPC firewall rules

VPC firewall rules apply to a given network. They let you allow or deny connections to or from your VM instances based on a configuration that you specify. Enabled VPC firewall rules are always enforced, protecting your instances regardless of their configuration or operating system, and even if the VMs haven't fully booted.

Every VPC network functions as a distributed firewall. Although firewall rules are defined at the network level, connections are allowed or denied on a per-instance basis. You can think of the VPC firewall rules as existing not only between your instances and other networks, but also between individual instances within the same network.
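The per-instance enforcement described above might be expressed as in the following sketch; the network name and tags are placeholders:

```shell
# Allow HTTPS from anywhere, but only to VMs tagged web-server. Other
# ingress to those VMs is still blocked by the network's implied
# deny-ingress rule.
gcloud compute firewall-rules create allow-https \
    --network=example-vpc --direction=INGRESS \
    --allow=tcp:443 --source-ranges=0.0.0.0/0 \
    --target-tags=web-server

# Explicitly deny SSH from the internet at a higher priority
# (lower numbers are evaluated first).
gcloud compute firewall-rules create deny-external-ssh \
    --network=example-vpc --direction=INGRESS --priority=900 \
    --action=DENY --rules=tcp:22 --source-ranges=0.0.0.0/0
```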

Hierarchical firewall policies

Hierarchical firewall policies let you create and enforce a consistent firewall policy across your enterprise. These policies contain rules that can explicitly deny or allow connections. You can assign hierarchical firewall policies to the organization resource as a whole or to individual folders.
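As a hedged sketch, an organization-wide policy might be created and attached to a folder as follows; the organization ID, folder ID, policy name, and rule values are placeholders:

```shell
# Create a firewall policy at the organization level.
gcloud compute firewall-policies create \
    --short-name=example-policy --organization=123456789012

# Add a rule that allows SSH from internal ranges across the organization.
gcloud compute firewall-policies rules create 1000 \
    --firewall-policy=example-policy --organization=123456789012 \
    --action=allow --direction=INGRESS \
    --src-ip-ranges=10.0.0.0/8 --layer4-configs=tcp:22

# Associate the policy with a folder so that it applies to every
# project and VPC network below that folder.
gcloud compute firewall-policies associations create \
    --firewall-policy=example-policy --organization=123456789012 \
    --folder=345678901234
```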

Packet mirroring

Packet mirroring clones the traffic of specific instances in your VPC network and forwards it to collectors for examination. Packet mirroring captures all traffic and packet data, including payloads and headers. You can configure mirroring for both ingress and egress traffic, for only ingress traffic, or for only egress traffic. The mirroring happens on the VM instances, not on the network.
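A mirroring policy might be configured as in the following sketch; the network, subnet, and forwarding-rule names are placeholders, and the collector internal load balancer is assumed to already exist:

```shell
# Mirror both ingress and egress traffic (the default) from one subnet
# and forward the cloned packets to a collector internal load balancer.
gcloud compute packet-mirrorings create example-mirror \
    --region=us-central1 --network=example-vpc \
    --mirrored-subnets=subnet-us \
    --collector-ilb=collector-forwarding-rule
```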

Network virtual appliance

Network virtual appliances let you apply security and compliance controls to the virtual network that are consistent with controls in the on-premises environment. To do this, you deploy VM images that are available in Google Cloud Marketplace to VMs that have multiple network interfaces, each attached to a different VPC network. The appliances then perform a variety of network virtual functions.

Typical use cases for virtual appliances are as follows:

  • Next-generation firewalls (NGFWs). NGFWs consist of a centralized set of firewalls that run as VMs that deliver features that aren't available in VPC firewall rules. Typical features of NGFW products include deep packet inspection (DPI) and firewall protection at the application layer. Some NGFWs also provide TLS/SSL traffic inspection and other networking functions, as described later in this list.
  • Intrusion detection system/intrusion prevention system (IDS/IPS). A network-based IDS provides visibility into potentially malicious traffic. To prevent intrusions, IPS devices can block malicious traffic from reaching its destination.
  • Secure web gateway (SWG). A SWG blocks threats from the internet by letting enterprises apply corporate policies on traffic that's traveling to and from the internet. This is done by using URL filtering, malicious code detection, and access control.
  • Network address translation (NAT) gateway. A NAT gateway translates IP addresses and ports. For example, this translation helps avoid overlapping IP addresses. Google Cloud offers Cloud NAT as a managed service, but this service is available only for traffic that's going to the internet, not for traffic that's going to on-premises or to other VPC networks.
  • Web application firewall (WAF). A WAF is designed to block malicious HTTP(S) traffic that's going to a web application. Google Cloud offers WAF functionality through