Migrating containers to Google Cloud: Migrating to a multi-cluster GKE environment

Last reviewed 2023-05-08 UTC

This document helps you plan, design, and implement your migration from a Google Kubernetes Engine (GKE) environment to a new GKE environment. Moving apps from one environment to another is a challenging task, so you need to plan and execute your migration carefully.

This document is part of a multi-part series about migrating to Google Cloud. For an overview of the series, see Migration to Google Cloud: Choosing your migration path.

This document is part of a series that discusses migrating containers to Google Cloud.

This document is useful if you're planning to migrate from a GKE environment to another GKE environment. This document is also useful if you're evaluating the opportunity to migrate and want to explore what it might look like.

Reasons to migrate from a GKE environment to another GKE environment can include the following:

  • Enabling GKE features available only on cluster creation. GKE is constantly evolving with new features and security fixes. To benefit from most new features and fixes, you might need to upgrade your GKE clusters and node pools to a newer GKE version, either through auto-upgrade or manually.

    Some new GKE features can't be enabled on existing clusters; they require you to create new GKE clusters with those features enabled. For example, you can enable VPC-native networking, GKE Dataplane V2, or metadata concealment only when you create new clusters. You can't update the configuration of existing clusters to enable those features after creation. For an example of creating a cluster with such features enabled, see the command after this list.

  • Implementing an automated provisioning and configuration process for your infrastructure. If you currently provision and configure your infrastructure manually, you can design and implement an automated process to provision and configure your GKE clusters, instead of relying on manual, error-prone methods.
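
For example, the following command is a minimal sketch of creating a cluster with features that are available only at creation time, such as VPC-native networking and GKE Dataplane V2. The cluster name, region, and release channel are placeholders; adjust them for your environment.

    # Create a cluster with VPC-native networking and GKE Dataplane V2,
    # which can only be enabled when the cluster is created.
    gcloud container clusters create example-cluster \
        --region=us-central1 \
        --release-channel=regular \
        --enable-ip-alias \
        --enable-dataplane-v2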

When you design the architecture of your new environment, we recommend that you consider a multi-cluster GKE environment. By provisioning and configuring multiple GKE clusters in your environment, you do the following:

  • Reduce the chances of introducing a single point of failure in your architecture. For example, if a cluster suffers an outage, other clusters can take over.
  • Benefit from the greater flexibility that a multi-cluster environment provides. For example, by applying changes to a subset of your clusters, you can limit the impact of issues caused by erroneous configuration changes. You can then validate the changes before you apply them to your remaining clusters.
  • Let your workloads communicate across clusters. For example, workloads deployed in a cluster can communicate with workloads deployed in another cluster.

The guidance in this document is also applicable to a single-cluster GKE environment. When you migrate to a single-cluster GKE environment, your environment is less complex to manage compared to a multi-cluster environment. However, a single-cluster environment doesn't benefit from the increased flexibility, reliability, and resilience of a multi-cluster GKE environment.

The following diagram illustrates the path of your migration journey.

Migration path with four phases.

The framework illustrated in the preceding diagram has the following phases, which are defined in Migration to Google Cloud: Getting started:

  1. Assessing and discovering your workloads.
  2. Planning and building a foundation.
  3. Deploying your workloads.
  4. Optimizing your environment.

You follow the preceding phases during each migration step. This document also relies on concepts that are discussed in Migrating containers to Google Cloud: Migrating Kubernetes to GKE. It includes links where appropriate.

Assessing your environment

In the assessment phase, you gather information about your source environment and the workloads that you want to migrate. This assessment is crucial for your migration and for rightsizing the resources that you need for the migration and your target environment. In the assessment phase, you do the following:

  1. Build a comprehensive inventory of your apps.
  2. Catalog your apps according to their properties and dependencies.
  3. Train and educate your teams on Google Cloud.
  4. Build an experiment and proof of concept on Google Cloud.
  5. Calculate the total cost of ownership (TCO) of the target environment.
  6. Choose the workloads that you want to migrate first.

The following sections rely on Migration to Google Cloud: Assessing and discovering your workloads. However, they provide information that is specific to assessing workloads that you want to migrate to new GKE clusters.

In Migrating Kubernetes to GKE, Assessing your environment describes how to assess Kubernetes clusters and resources, such as ServiceAccounts and PersistentVolumes. That information also applies to assessing your GKE environment.
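
For example, you can use commands like the following to list those resource types in the cluster that your current kubectl context points to:

    # List ServiceAccounts across all namespaces.
    kubectl get serviceaccounts --all-namespaces

    # List PersistentVolumes, including capacity, storage class, and status.
    kubectl get persistentvolumes

    # List PersistentVolumeClaims across all namespaces.
    kubectl get persistentvolumeclaims --all-namespaces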

Build your inventories

To scope your migration, you must understand your current GKE environment. You start by gathering information about your clusters, and then you focus on your workloads deployed in those clusters and the workloads' dependencies. At the end of the assessment phase, you have two inventories: one for your clusters, and one for the workloads deployed in those clusters.

In Migrating Kubernetes to GKE, Build your inventories describes how to build the inventories of your Kubernetes clusters and workloads. It is also applicable to building the inventories of your GKE environments. Before you proceed with this document, follow that guidance to build the inventory of your Kubernetes clusters.

After you follow the Migrating Kubernetes to GKE guidance to build your inventories, you refine the inventories. To complete the inventory of your GKE clusters and node pools, consider the GKE-specific aspects and features of each cluster and node pool.
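
For example, commands like the following, with placeholder cluster and region names, capture cluster-level and node pool-level details for your inventory:

    # List all GKE clusters in the current project.
    gcloud container clusters list \
        --format="table(name,location,currentMasterVersion,currentNodeCount,status)"

    # Record cluster-level settings, such as the release channel and networking configuration.
    gcloud container clusters describe example-cluster \
        --region=us-central1 \
        --format="yaml(releaseChannel,networkConfig,ipAllocationPolicy)"

    # Inspect the node pools of a cluster.
    gcloud container node-pools list \
        --cluster=example-cluster \
        --region=us-central1 \
        --format="table(name,version,config.machineType,autoscaling.enabled)"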

When you build your inventory, you might find some GKE clusters that need to be decommissioned as part of your migration. Some Google Cloud resources aren't deleted when you delete the GKE clusters that created them. Make sure that your migration plan includes retiring those resources.
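
For example, commands like the following help you find load balancing resources and persistent disks that might remain after a cluster is deleted. The "gke-" name prefix is a common default for dynamically provisioned disks, but verify each resource before you retire it:

    # Load balancing resources that Services and Ingresses created.
    gcloud compute forwarding-rules list
    gcloud compute target-pools list
    gcloud compute backend-services list

    # Persistent disks that were dynamically provisioned for PersistentVolumes.
    gcloud compute disks list --filter="name~^gke-"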

For information about other potential GKE-specific aspects and features, review the GKE documentation.

Complete the assessment

After you build the inventories related to your GKE clusters and workloads, complete the rest of the activities of the assessment phase in Migrating containers to Google Cloud: Migrating Kubernetes to GKE.

Planning and building your foundation

In the plan phase, you provision and configure the foundation: the cloud infrastructure and services that support your workloads on Google Cloud. In the plan phase, you do the following:

  • Build a resource hierarchy.
  • Configure identity and access management.
  • Set up billing.
  • Set up network connectivity.
  • Harden your security.
  • Set up monitoring and alerting.

When you set up network connectivity, plan your IP address allocations carefully, and ensure that you have enough IP addresses in your subnets to allocate for Nodes, Pods, and Services. For example, you can configure privately used public IPs for GKE. The secondary IP address ranges that you set for Pods and Services on your clusters can't be changed after you allocate them. Take particular care if you allocate a Pod or Service range of /22 (1024 addresses) or smaller, because you might otherwise run out of IP addresses for Pods and Services as your cluster grows. For more information, see IP address range planning.
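
For example, the following commands are a minimal sketch of creating a subnet with named secondary ranges and a VPC-native cluster that uses them. The network name, subnet name, region, and IP ranges are placeholders:

    # Create a subnet with secondary ranges for Pods and Services.
    gcloud compute networks subnets create example-gke-subnet \
        --network=example-vpc \
        --region=us-central1 \
        --range=10.0.0.0/20 \
        --secondary-range=pods=10.4.0.0/14,services=10.8.0.0/20

    # Create a VPC-native cluster that uses those secondary ranges.
    gcloud container clusters create example-cluster \
        --region=us-central1 \
        --network=example-vpc \
        --subnetwork=example-gke-subnet \
        --enable-ip-alias \
        --cluster-secondary-range-name=pods \
        --services-secondary-range-name=services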

We recommend that you use a separate shared subnet for the internal load balancers that you create for your GKE environment. When you use a Kubernetes Service of type: LoadBalancer, you can specify a load balancer subnet. When you configure internal HTTP(S) load balancers, you must configure a proxy-only subnet.
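
For example, the following command creates a proxy-only subnet; the subnet name, region, network, and range are placeholders:

    gcloud compute networks subnets create example-proxy-only-subnet \
        --purpose=REGIONAL_MANAGED_PROXY \
        --role=ACTIVE \
        --region=us-central1 \
        --network=example-vpc \
        --range=10.129.0.0/23

The following manifest is a sketch of a Service that requests an internal load balancer in a specific subnet. It assumes that a Deployment labeled app: example-app already exists and that a subnet named example-lb-subnet was created for load balancers; save the manifest as a file (for example, internal-service.yaml) and apply it with kubectl apply -f internal-service.yaml:

    apiVersion: v1
    kind: Service
    metadata:
      name: example-internal-service
      annotations:
        networking.gke.io/load-balancer-type: "Internal"
        networking.gke.io/internal-load-balancer-subnet: "example-lb-subnet"
    spec:
      type: LoadBalancer
      selector:
        app: example-app
      ports:
      - protocol: TCP
        port: 80
        targetPort: 8080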

To build the foundation of your GKE environment, complete the activities of the planning and building phase in Migrating containers to Google Cloud: Migrating Kubernetes to GKE.

Deploying your workloads

In the deployment phase, you do the following:

  1. Provision and configure the target environment.
  2. Migrate data from your source environment to the target environment.
  3. Deploy your workloads in the target environment.

This section provides information that is specific to deploying workloads to GKE. It builds on the information in Migrating Kubernetes to GKE: Deploying your workloads.

Evaluate your runtime platform and environments

To have a more flexible, reliable, and maintainable infrastructure, we recommend that you design and implement a multi-cluster architecture. In a multi-cluster architecture, you have multiple production GKE clusters in your environment. For example, if you provision multiple GKE clusters in your environment, you can implement advanced cluster lifecycle strategies, such as rolling upgrades or blue-green upgrades. For more information about multi-cluster GKE architecture designs and their benefits, see Multi-cluster GKE upgrades using Multi Cluster Ingress.
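
For example, if you use fleets and Multi Cluster Ingress, commands like the following register two clusters to a fleet and enable Multi Cluster Ingress with one of them as the config cluster. The membership names and cluster locations are placeholders:

    # Register each cluster to the fleet.
    gcloud container fleet memberships register cluster-1 \
        --gke-cluster=us-central1/cluster-1 \
        --enable-workload-identity

    gcloud container fleet memberships register cluster-2 \
        --gke-cluster=us-east1/cluster-2 \
        --enable-workload-identity

    # Enable Multi Cluster Ingress, with cluster-1 as the config cluster.
    gcloud container fleet ingress enable \
        --config-membership=cluster-1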

When you run your environment across multiple clusters, there are additional challenges to consider, such as the following:

  • You need to adapt configuration management, service discovery and communication, application rollouts, and load balancing for incoming traffic.
  • You likely need to run additional software in each cluster, and you need additional automation and infrastructure.

To address these challenges, you might need continuous integration/continuous deployment (CI/CD) pipelines that update the configuration of your clusters sequentially to minimize the impact of mistakes. You might also need load balancers that distribute traffic across your clusters.
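
For example, Multi Cluster Ingress can distribute incoming traffic across your clusters. The following manifest is a minimal sketch that assumes Multi Cluster Ingress is enabled, that each member cluster runs a Deployment labeled app: example-app, and that you apply the manifest to the config cluster; save it as a file (for example, multi-cluster-lb.yaml) and apply it with kubectl apply -f multi-cluster-lb.yaml:

    # Expose the workload across the fleet.
    apiVersion: networking.gke.io/v1
    kind: MultiClusterService
    metadata:
      name: example-mcs
      namespace: example-ns
    spec:
      template:
        spec:
          selector:
            app: example-app
          ports:
          - name: web
            protocol: TCP
            port: 8080
            targetPort: 8080
    ---
    # Load balance incoming traffic across the clusters that serve the workload.
    apiVersion: networking.gke.io/v1
    kind: MultiClusterIngress
    metadata:
      name: example-mci
      namespace: example-ns
    spec:
      template:
        spec:
          backend:
            serviceName: example-mcs
            servicePort: 8080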

Manually managing your infrastructure is error prone and exposes you to issues caused by misconfiguration and by the lack of internal documentation about the current state of your infrastructure. To help mitigate these risks, we recommend that you apply the infrastructure as code pattern. When you apply this pattern, you treat the provisioning of your infrastructure the same way that you handle the source code of your workloads.

There are several architecture options for your multi-cluster GKE environment, described later in this section. Choosing one option over the others depends on several factors, and no option is inherently better than the others. Each option has its own strengths and weaknesses. To choose an architecture option, do the following:

  1. Establish a set of criteria to evaluate the types of architectures of multi-cluster GKE environments.
  2. Assess each option against the evaluation criteria.
  3. Choose the option that best suits your needs.

To establish the criteria to evaluate the architecture types of multi-clus