Choose and manage compute

Last reviewed 2023-10-03 UTC

This document in the Google Cloud Architecture Framework provides best practices to deploy your system based on compute requirements. You learn how to choose a compute platform and a migration approach, design and scale workloads, and manage operations and VM migrations.

Computation is at the core of many workloads, whether it refers to the execution of custom business logic or the application of complex computational algorithms against datasets. Most solutions use compute resources in some form, and it's critical that you select the right compute resources for your application needs.

Google Cloud provides several options for using time on a CPU. These options vary by CPU type, performance, and how your code is scheduled to run, including how usage is billed.

Google Cloud compute options include the following:

  • Virtual machines (VMs) with cloud-specific benefits like live migration.
  • Bin-packing of containers on cluster machines that can share CPUs.
  • Functions and serverless approaches, where your use of CPU time can be metered to the work performed during a single HTTP request.

Choosing compute

This section provides best practices for choosing and migrating to a compute platform.

Choose a compute platform

When you choose a compute platform for your workload, consider the technical requirements of the workload, lifecycle automation processes, regionalization, and security.

Evaluate the nature of CPU usage by your app and the entire supporting system, including how your code is packaged and deployed, distributed, and invoked. Some scenarios might be compatible with multiple platform options, and a portable workload should run well across a range of compute options.

The following table provides an overview of the recommended Google Cloud compute services for various use cases:

Compute platform: Serverless
Use cases:
  • Deploy your first app.
  • Focus on data and processing logic and on app development, rather than maintaining infrastructure operations.
Recommended products:
  • Cloud Run: Put your business logic in containers by using this fully managed serverless option. Cloud Run is designed for workloads that are compute intensive, but not always on. Scale cost effectively from 0 (no traffic) and define the CPU and RAM of your tasks and services. Deploy with a single command and Google automatically provisions the right amount of resources.
  • Cloud Functions: Separate your code into flexible pieces of business logic without the infrastructure concerns of load balancing, updates, authentication, or scaling.

Compute platform: Kubernetes
Use case: Build complex microservice architectures that need additional services like Istio to manage service mesh control.
Recommended product:
  • Google Kubernetes Engine: A managed service for Kubernetes, the open source container-orchestration engine that automates deploying, scaling, and managing containerized apps.

Compute platform: Virtual machines (VMs)
Use case: Create and run VMs from predefined and customizable VM families that support your application and workload requirements, as well as third-party software and services.
Recommended product:
  • Compute Engine: Add graphics processing units (GPUs) to your VM instances. You can use these GPUs to accelerate specific workloads on your instances, like machine learning and data processing.

To select appropriate machine types based on your requirements, see Recommendations for machine families.

For more information, see Choosing compute options.
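As a minimal sketch of the serverless path, a containerized service can be deployed to Cloud Run with a single command. The service name below is a placeholder, and the image is Google's public Cloud Run sample container:

```shell
# Deploy a container image to Cloud Run (fully managed).
# "hello-service" is a placeholder service name.
gcloud run deploy hello-service \
  --image=us-docker.pkg.dev/cloudrun/container/hello \
  --region=us-central1 \
  --allow-unauthenticated

# Cloud Run scales instances from zero based on incoming traffic;
# per-instance CPU and RAM can be set with --cpu and --memory.
```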

Choose a compute migration approach

If you're migrating your existing applications from another cloud or from on-premises, use one of the following Google Cloud products to help you optimize for performance, scale, cost, and security.

Migration goal: Lift and shift
Use case: Migrate or extend your VMware workloads to Google Cloud in minutes.
Recommended product: Google Cloud VMware Engine

Migration goal: Lift and shift
Use case: Move your VM-based applications to Compute Engine.
Recommended product: Migrate to Virtual Machines

Migration goal: Upgrade to containers
Use case: Modernize traditional applications into containers on Google Kubernetes Engine.
Recommended product: Migrate to Containers

To learn how to migrate your workloads while aligning internal teams, see VM Migration lifecycle and Building a Large Scale Migration Program with Google Cloud.

Designing workloads

This section provides best practices for designing workloads to support your system.

Evaluate serverless options for simple logic

Simple logic is a type of compute that doesn't require specialized hardware or machine types like CPU-optimized machines. Before you invest in Google Kubernetes Engine (GKE) or Compute Engine implementations to abstract operational overhead and optimize for cost and performance, evaluate serverless options for lightweight logic.

Decouple your applications to be stateless

Where possible, decouple your applications to be stateless to maximize use of serverless computing options. This approach lets you use managed compute offerings, scale applications based on demand, and optimize for cost and performance. For more information about decoupling your application to design for scale and high availability, see Design for scale and high availability.

Use caching logic when you decouple architectures

If your application is designed to be stateful, use caching logic to decouple and make your workload scalable. For more information, see Database best practices.

Use live migrations to facilitate upgrades

To facilitate Google maintenance upgrades, use live migration by setting instance availability policies. For more information, see Set VM host maintenance policy.
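As a sketch, the maintenance policy of an existing VM can be set so that it live-migrates during maintenance events instead of stopping. The VM name and zone below are placeholders:

```shell
# Configure the VM to live-migrate during Google host maintenance
# events rather than terminate and restart.
gcloud compute instances set-scheduling my-vm \
  --zone=us-central1-a \
  --maintenance-policy=MIGRATE
```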

Scaling workloads

This section provides best practices for scaling workloads to support your system.

Use startup and shutdown scripts

For stateful applications, use startup and shutdown scripts where possible to start and stop your application gracefully. A graceful startup is one where the machine is turned on by a software function and the operating system is allowed to perform its tasks of safely starting processes and opening connections.

Graceful startups and shutdowns are important because stateful applications depend on immediate access to the data that sits close to the compute, usually on local or persistent disks, or in RAM. To avoid reprocessing application data from the beginning at each startup, use a startup script to reload the last saved data and resume the process from where it stopped at shutdown. To preserve the application's in-memory state and avoid losing progress on shutdown, use a shutdown script. For example, use a shutdown script when a VM is scheduled to be shut down due to downscaling or Google maintenance events.
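The save/restore cycle described above can be sketched as a pair of shell functions. This is an illustration only; the checkpoint path, the `last_offset` variable, and the hardcoded value are all hypothetical stand-ins for whatever state your application actually persists:

```shell
#!/bin/bash
# Illustrative checkpoint file; a real app would use a persistent disk.
STATE_FILE=/tmp/app-state/checkpoint

save_state() {
  # Called from a shutdown script: persist progress before the VM stops.
  mkdir -p "$(dirname "$STATE_FILE")"
  echo "last_offset=42" > "$STATE_FILE"
}

restore_state() {
  # Called from a startup script: resume from the saved checkpoint
  # instead of reprocessing data from the beginning.
  if [ -f "$STATE_FILE" ]; then
    . "$STATE_FILE"
    echo "resuming at $last_offset"
  else
    echo "cold start"
  fi
}

# On Compute Engine, such scripts can be attached to a VM, for example:
#   gcloud compute instances create my-vm \
#     --metadata-from-file=startup-script=startup.sh,shutdown-script=shutdown.sh
```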

Use MIGs to support VM management

When you use Compute Engine VMs, managed instance groups (MIGs) support features like autohealing, load balancing, autoscaling, auto updating, and stateful workloads. You can create zonal or regional MIGs based on your availability goals. You can use MIGs for stateless serving or batch workloads and for stateful applications that need to preserve each VM's unique state.
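As a sketch of this setup, a regional MIG can be created from an instance template and given autohealing and autoscaling. The group, template, and health check names below are placeholders:

```shell
# Create a regional MIG with autohealing (health check plus a grace
# period for instances to boot before being checked).
gcloud compute instance-groups managed create web-mig \
  --region=us-central1 \
  --template=web-template \
  --size=3 \
  --health-check=web-health-check \
  --initial-delay=300

# Enable autoscaling on the group based on average CPU utilization.
gcloud compute instance-groups managed set-autoscaling web-mig \
  --region=us-central1 \
  --min-num-replicas=3 \
  --max-num-replicas=10 \
  --target-cpu-utilization=0.6
```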

Use pod autoscalers to scale your GKE workloads

Use horizontal and vertical Pod autoscalers to scale your workloads, and use node auto-provisioning to scale underlying compute resources.
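As a minimal sketch, horizontal Pod autoscaling and node auto-provisioning can be enabled as follows. The Deployment name, cluster name, and resource limits are placeholder values:

```shell
# Horizontal Pod autoscaling: scale the "web" Deployment between
# 2 and 10 replicas, targeting 60% CPU utilization.
kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=60

# Node auto-provisioning: let GKE create node pools as needed,
# within the stated cluster-wide resource limits.
gcloud container clusters update my-cluster \
  --region=us-central1 \
  --enable-autoprovisioning \
  --min-cpu=1 --max-cpu=64 \
  --min-memory=1 --max-memory=256
```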

Distribute application traffic

To scale your applications globally, use Cloud Load Balancing to distribute your application instances across more than one region or zone. Load balancers optimize packet routing from Google Cloud edge networks to the nearest zone, which increases serving traffic efficiency and minimizes serving costs. To optimize for end-user latency, use Cloud CDN to cache static content where possible.

Automate compute creation and management

Minimize human-induced errors in your production environment by automating compute creation and management.
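One simple form of this automation is to capture the VM configuration once in an instance template and create instances from it, rather than configuring each VM by hand. The template name and machine settings below are illustrative:

```shell
# Define the VM configuration declaratively in an instance template;
# instances and MIGs created from it are identical and repeatable.
gcloud compute instance-templates create web-template \
  --machine-type=e2-standard-2 \
  --image-family=debian-12 \
  --image-project=debian-cloud
```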

Managing operations

This section provides best practices for managing operations to support your system.

Use Google-supplied public images

Use public images supplied by Google Cloud. The Google Cloud public images are regularly updated. For more information, see List of public images available on Compute Engine.

You can also create your own images with specific configurations and settings. Where possible, automate and centralize image creation in a separate project that you can share with authorized users within your organization. Creating and curating a custom image in a separate project lets you update, patch, and create a VM using your own configurations. You can then share the curated VM image with relevant projects.
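As a sketch of this pattern, a curated image can be built in a central image project and shared with other teams through IAM. The project, image, disk, and group names below are all placeholders:

```shell
# Create a curated image from a source disk in a central image project.
gcloud compute images create golden-image-v1 \
  --project=image-factory \
  --source-disk=base-disk \
  --source-disk-zone=us-central1-a

# Grant another team permission to create VMs from the curated image.
gcloud compute images add-iam-policy-binding golden-image-v1 \
  --project=image-factory \
  --member=group:dev-team@example.com \
  --role=roles/compute.imageUser
```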

Use snapshots for instance backups

Snapshots let you create backups for your instances. Snapshots are especially useful for stateful applications, which aren't flexible enough to maintain state or save progress when they experience abrupt shutdowns. If you frequently use snapshots to create new instances, you can optimize your backup process by creating a base image from that snapshot.
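As a sketch, snapshotting a persistent disk and restoring it to a new disk looks like the following. The disk and snapshot names are placeholders:

```shell
# Back up a persistent disk to a snapshot.
gcloud compute disks snapshot my-disk \
  --zone=us-central1-a \
  --snapshot-names=my-disk-backup

# Later, create a new disk from that snapshot to restore the data.
gcloud compute disks create my-disk-restored \
  --zone=us-central1-a \
  --source-snapshot=my-disk-backup
```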

Use a machine image to enable VM instance creation

Although a snapshot only captures an image of the data inside a machine, a machine image captures machine configurations and settings, in addition to the data. Use a machine image to store all of the configurations, metadata, permissions, and data from one or more disks that are needed to create a VM instance.

When you create a VM from a snapshot, you must configure instance settings on each new VM, which takes time and effort. Using machine images lets you copy those known settings to new machines, reducing overhead. For more information, see When to use a machine image.
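As a sketch, capturing a machine image from an existing VM and cloning it looks like the following. The VM and image names are placeholders:

```shell
# Capture configuration, metadata, permissions, and disk data
# from an existing VM into a machine image.
gcloud compute machine-images create my-machine-image \
  --source-instance=my-vm \
  --source-instance-zone=us-central1-a

# Create a new VM with the same settings from the machine image.
gcloud compute instances create my-vm-copy \
  --zone=us-central1-a \
  --source-machine-image=my-machine-image
```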

Capacity, reservations, and isolation

This section provides best practices for managing capacity, reservations, and isolation to support your system.

Use committed-use discounts to reduce costs

You can reduce your operational expenditure (OPEX) for workloads that are always on by using committed use discounts. For more information, see the Cost optimization category.

Choose machine types to support cost and performance

Google Cloud offers machine types that let you choose compute based on cost and performance parameters. You can choose a low-performance offering to optimize for cost or choose a high-performance compute option at higher cost. For more information, see the Cost optimization category.

Use sole-tenant nodes to support compliance needs

Sole-tenant nodes are physical Compute Engine servers that are dedicated to hosting only your project's VMs. Sole-tenant nodes can help you to meet compliance requirements for physical isolation, including the following:

  • Keep your VMs physically separated from VMs in other projects.
  • Group your VMs together on the same host hardware.
  • Isolate payments processing workloads.

For more information, see Sole-tenant nodes.

Use reservations to ensure resource availability

Google Cloud lets you define reservations for your workloads to ensure those resources are always available. There is no additional charge to create reservations, but you pay for the reserved resources even if you don't use them. For more information, see Consuming and managing reservations.
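As a sketch, a zonal reservation for a block of VMs can be created as follows. The reservation name, count, and machine type are placeholder values:

```shell
# Reserve zonal capacity for 10 VMs of a given machine type.
# Note: billing for the reserved resources starts immediately,
# whether or not VMs consume the reservation.
gcloud compute reservations create my-reservation \
  --zone=us-central1-a \
  --vm-count=10 \
  --machine-type=n2-standard-4
```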

VM migration

This section provides best practices for migrating VMs to support your system.

Evaluate built-in migration tools

Evaluate built-in migration tools to move your workloads from another cloud or from on-premises. For more information, see Migration to Google Cloud. Google Cloud offers tools and services to help you migrate your workloads and optimize for cost and performance. To receive a free migration cost assessment based on your current IT landscape, see Google Cloud Rapid Assessment & Migration Program.

Use virtual disk import for customized operating systems

To import customized supported operating systems, see Importing virtual disks. Sole-tenant nodes can help you meet your hardware bring-your-own-license requirements for per-core or per-processor licenses. For more information, see Bringing your own licenses.


Recommendations

To apply the guidance in the Architecture Framework to your own environment, we recommend that you do the following:

  • Review Google Cloud Marketplace offerings to evaluate whether your application is listed under a supported vendor. Google Cloud supports running various open source systems and various third-party software.

  • Consider Migrate to Containers and GKE to extract and package your VM-based application as a containerized application running on GKE.

  • Use Compute Engine to run your applications on Google Cloud. If you have legacy dependencies running in a VM-based application, verify whether they meet your vendor requirements.

  • Evaluate using a Google Cloud internal passthrough Network Load Balancer to scale your decoupled architecture. For more information, see Internal passthrough Network Load Balancer overview.

  • Evaluate your options for switching from traditional on-premises use cases like HAProxy usage. For more information, see Best practices for floating IP addresses.

  • Use VM Manager to manage operating systems for your large VM fleets running Windows or Linux on Compute Engine, and apply consistent configuration policies.

  • Consider using GKE Autopilot and let Google SRE fully manage your clusters.

  • Use Policy Controller and Config Sync for policy and configuration management across your GKE clusters.

  • Ensure availability and scalability of machines in specific regions and zones. Google Cloud can scale to support your compute needs. However, if you need a lot of specific machine types in a specific region or zone, work with your account teams to ensure availability. For more information, see Reservations for Compute Engine.

What's next

Learn networking design principles.

Explore other categories in the Architecture Framework, such as reliability, operational excellence, and security, privacy, and compliance.