
AI Infrastructure

Scalable, high-performance, and cost-effective infrastructure for every AI workload.

  • AI accelerators for every use case, from high-performance training to low-cost inference

  • Scale faster with GPUs and TPUs on Google Kubernetes Engine or Google Compute Engine

  • Deployable solutions for Vertex AI, Google Kubernetes Engine, and the Cloud HPC Toolkit


Optimize performance and cost at scale

With Google Cloud, you can choose from GPUs, TPUs, or CPUs to support a variety of use cases, including high-performance training, low-cost inference, and large-scale data processing.

Deliver results faster with managed infrastructure

Scale faster and more efficiently with the managed infrastructure provided by Vertex AI. Set up ML environments quickly, automate orchestration, manage large clusters, and deploy low-latency applications.

Develop with software that’s purpose-built for AI

Improve AI development productivity by leveraging GKE to manage large-scale workloads. Train and serve foundation models with support for autoscaling, workload orchestration, and automatic upgrades.

Key features


Flexible and scalable hardware for any use case

There’s no one-size-fits-all solution for AI workloads. That’s why, together with industry hardware partners such as NVIDIA, Intel, AMD, and Arm, we provide customers with the widest range of AI-optimized compute options across TPUs, GPUs, and CPUs for training and serving the most data-intensive models.

Easy to use, manage, and scale

Orchestrating large-scale AI workloads with Cloud TPUs and Cloud GPUs has historically required manual effort to handle failures, logging, monitoring, and other foundational operations. Google Kubernetes Engine (GKE), the most scalable and fully managed Kubernetes service, considerably simplifies the work required to operate TPUs and GPUs. Leveraging GKE to manage large-scale AI workload orchestration on Cloud TPU and Cloud GPU improves AI development productivity.
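As a concrete illustration of the orchestration described above, a GKE Pod can request an accelerator declaratively, and GKE handles scheduling, driver installation, and node repair. This is a minimal sketch, not a production manifest; the Pod name, container image, and the choice of a T4 accelerator are illustrative assumptions:

```yaml
# Hypothetical Pod spec requesting one NVIDIA GPU on GKE.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training-sketch          # illustrative name
spec:
  nodeSelector:
    # GKE node label that selects the accelerator type (T4 assumed here).
    cloud.google.com/gke-accelerator: nvidia-tesla-t4
  containers:
  - name: trainer
    image: us-docker.pkg.dev/my-project/my-repo/trainer:latest  # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1            # ask the scheduler for one GPU
  restartPolicy: OnFailure
```

Applied with `kubectl apply -f pod.yaml`, the Pod schedules only onto a node with the requested accelerator; with node auto-provisioning enabled, GKE can create a matching GPU node pool on demand.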

And for organizations that prefer the simplicity of abstracting away the infrastructure through managed services, Vertex AI now supports training with various frameworks and libraries using Cloud TPU and Cloud GPU.

Scale your AI models exponentially

Our AI-optimized infrastructure is built to deliver the global scale and performance demanded by Google products like YouTube, Gmail, Google Maps, Google Play, and Android, which serve billions of users. Our AI infrastructure solutions are all underpinned by Google Cloud's Jupiter data center network, which supports industry-leading scale-out capability, from foundational services through to high-intensity AI workloads.

Highly flexible and open platform

For decades, we’ve contributed to critical AI projects like TensorFlow and JAX. We co-founded the PyTorch Foundation and recently announced a new industry consortium, the OpenXLA project. Additionally, Google is the leading CNCF open source contributor and has a 20+ year history of OSS contributions like TFX, MLIR, OpenXLA, Kubeflow, and Kubernetes, as well as sponsorship of OSS projects critical to the data science community, like Project Jupyter and NumFOCUS.

Furthermore, our AI infrastructure services are integrated with the most popular AI frameworks, such as TensorFlow, PyTorch, and MXNet, allowing customers to continue using whichever framework they prefer rather than being constrained to a specific framework or hardware architecture.


Customers leveraging Google Cloud's AI infrastructure

As AI opens the door for innovation across industries, companies are choosing Google Cloud to take advantage of our open, flexible, and performant infrastructure.



Google Cloud Basics
AI Infrastructure Tools on GKE

Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities.

Google Cloud Basics
Deep Learning VM Images

Deep Learning VM Images are optimized for data science and machine learning tasks. They come with key ML frameworks and tools pre-installed, and work with GPUs.

Google Cloud Basics
Deep Learning Containers

Deep Learning Containers are performance-optimized, consistent environments to help you prototype and implement workflows quickly on CPUs or GPUs.

How are Tensor Processing Units optimized for AI/ML?

Learn about the computational requirements of machine learning, and how TPUs were purpose-built to handle the task.

Google Cloud Basics
TPU System Architecture

TPUs are Google's custom-developed ASICs used to accelerate machine learning workloads. Learn about the underlying system architecture of TPUs from the ground up.



Pricing for AI Infrastructure is based on the product selected. You can get started with Google's AI infrastructure for free with Colab or Google Cloud's free tier.

For information on TPU pricing for single-device TPU types and TPU Pod types, refer to TPU pricing. For information about GPU pricing for the different GPU types and regions that are available, refer to GPU pricing.