About Ray on Google Kubernetes Engine (GKE)


This page provides an overview of the Ray Operator and relevant custom resources to deploy and manage Ray clusters and applications on Google Kubernetes Engine (GKE).

Ray is an open-source, unified compute framework for scaling AI/ML and Python applications. Ray provides a set of libraries for distributing AI/ML workloads across multiple compute nodes.

To learn how to enable the Ray operator on GKE, see Enable the Ray operator on GKE.

Why use the Ray Operator on GKE

The Ray Operator is the recommended way to deploy and manage Ray clusters on GKE. When you run the Ray Operator on GKE, you benefit from Ray's support for Python and GKE's enterprise-grade reliability, portability, and scalability.

The Ray Operator on GKE is based on KubeRay, which provides declarative Kubernetes APIs specifically designed for managing Ray clusters. This means that you can provision, scale, and manage your Ray deployments alongside other containerized workloads on GKE.

How the Ray Operator on GKE works

When you enable the Ray Operator in your GKE clusters, GKE automatically installs and hosts the KubeRay operator.

KubeRay provides Kubernetes custom resources to manage Ray deployments on Kubernetes, including the following:

RayCluster custom resource

The RayCluster custom resource lets you specify a Ray cluster that GKE deploys as Kubernetes Pods. A Ray cluster typically consists of a single head Pod and multiple worker Pods.
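As a minimal sketch, a RayCluster manifest defines one head group and one or more worker groups. The resource name, container image tag, and replica counts below are illustrative assumptions, not required values:

```yaml
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: example-raycluster        # illustrative name
spec:
  headGroupSpec:                  # defines the single head Pod
    rayStartParams: {}
    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.9.0   # example image tag; use a version that matches your Ray code
  workerGroupSpecs:               # one or more groups of worker Pods
  - groupName: workers
    replicas: 2
    minReplicas: 1
    maxReplicas: 5
    rayStartParams: {}
    template:
      spec:
        containers:
        - name: ray-worker
          image: rayproject/ray:2.9.0
```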

RayJob custom resource

The RayJob custom resource lets you execute a single Ray job. KubeRay creates a RayCluster to provide compute resources for the job, then creates a Kubernetes Job that submits the Ray job to the head Pod of the RayCluster.

For efficient resource management, you can configure KubeRay to automatically delete the RayCluster after the job finishes.
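As a sketch, a RayJob manifest embeds the cluster definition along with the job entrypoint, and the `shutdownAfterJobFinishes` field controls the automatic cleanup described above. The entrypoint script path and image tag here are placeholder assumptions:

```yaml
apiVersion: ray.io/v1
kind: RayJob
metadata:
  name: example-rayjob            # illustrative name
spec:
  entrypoint: python /home/ray/samples/my_script.py   # hypothetical script included in the image
  shutdownAfterJobFinishes: true  # delete the RayCluster after the job finishes
  rayClusterSpec:                 # the RayCluster that KubeRay creates for this job
    headGroupSpec:
      rayStartParams: {}
      template:
        spec:
          containers:
          - name: ray-head
            image: rayproject/ray:2.9.0   # example image tag
    workerGroupSpecs:
    - groupName: workers
      replicas: 1
      rayStartParams: {}
      template:
        spec:
          containers:
          - name: ray-worker
            image: rayproject/ray:2.9.0
```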

RayService custom resource

The RayService custom resource lets you configure Ray Serve applications, such as applications for model serving and inference. KubeRay creates a RayCluster to provide the compute resources and then deploys the Ray Serve application as specified by the Ray Serve configuration.
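As a sketch, a RayService manifest pairs a Ray Serve configuration (`serveConfigV2`) with the RayCluster that hosts it. The application name, import path, and image tag below are illustrative assumptions:

```yaml
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: example-rayservice        # illustrative name
spec:
  serveConfigV2: |                # Ray Serve configuration, passed through to Ray Serve
    applications:
    - name: my_app                # hypothetical Serve application
      import_path: my_module:app  # hypothetical Python import path for the Serve application
      route_prefix: /
  rayClusterConfig:               # the RayCluster that serves the application
    headGroupSpec:
      rayStartParams: {}
      template:
        spec:
          containers:
          - name: ray-head
            image: rayproject/ray:2.9.0   # example image tag
    workerGroupSpecs:
    - groupName: workers
      replicas: 1
      rayStartParams: {}
      template:
        spec:
          containers:
          - name: ray-worker
            image: rayproject/ray:2.9.0
```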

What's next