Cluster Toolkit, formerly known as Cloud HPC Toolkit,
is open-source software from Google Cloud that
simplifies deploying high performance computing (HPC),
artificial intelligence (AI), and machine learning (ML) workloads on
Google Cloud. It is designed to be highly customizable and extensible, and
aims to address the deployment needs of a broad range of use cases.
Benefits
Cluster Toolkit provides you with the following benefits:
Fast creation and deployment of turnkey HPC, AI, and ML clusters that
follow Google Cloud best practices
An open-source solution that is configurable and extensible
Seamless integration with various partners such as Intel DAOS, DDN EXAscaler, and Slurm
Monitoring and performance visibility through integration with Cloud Monitoring
Components
Cluster Toolkit has the following main components:
Cluster blueprint: a YAML file that defines which modules to use
and how to customize them.
Modules: the building blocks of a deployment folder.
Modules are composed of Terraform or Packer configuration files.
gcluster engine: a Google Open Source tool that uses the information in the
cluster blueprint to combine different modules and produce a
deployment folder.
Deployment folder: a self-contained folder that can be
used to deploy a cluster onto Google Cloud. With Cluster Toolkit, you
have the added flexibility to configure a cluster to your specifications by
editing the deployment folder before you deploy.
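To make the relationship between a cluster blueprint and its modules more concrete, here is a minimal sketch that writes a blueprint as a Python dict mirroring the YAML layout. The field names (`blueprint_name`, `vars`, `deployment_groups`, `id`, `source`, `use`, `settings`) and the module paths follow the example blueprints in the Cluster Toolkit GitHub repository as best recalled, so treat them as assumptions and check a current example. The values are placeholders, and `unresolved_uses` is a hypothetical helper for illustration, not part of gcluster.

```python
# Hypothetical blueprint, written as a Python dict that mirrors the YAML layout.
# Field names and module paths are assumptions based on the repository's
# example blueprints; verify them against a current example before use.
BLUEPRINT = {
    "blueprint_name": "demo-cluster",  # illustrative name
    "vars": {
        "project_id": "my-project",  # placeholder value
        "deployment_name": "demo-cluster",
        "region": "us-central1",
        "zone": "us-central1-a",
    },
    "deployment_groups": [
        {
            "group": "primary",
            "modules": [
                {"id": "network", "source": "modules/network/vpc"},
                {
                    "id": "homefs",
                    "source": "modules/file-system/filestore",
                    "use": ["network"],  # refers to the module id above
                    "settings": {"local_mount": "/home"},
                },
            ],
        }
    ],
}


def unresolved_uses(blueprint):
    """Return (module_id, missing_id) pairs where `use` names an unknown module.

    Toy check for illustration only; it is not how gcluster validates blueprints.
    """
    known = {
        module["id"]
        for group in blueprint["deployment_groups"]
        for module in group["modules"]
    }
    return [
        (module["id"], used)
        for group in blueprint["deployment_groups"]
        for module in group["modules"]
        for used in module.get("use", [])
        if used not in known
    ]


print(unresolved_uses(BLUEPRINT))  # prints []
```

The sketch only shows how modules reference one another by `id`. The gcluster engine is what actually combines the modules into a deployment folder.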
How it works
Figure 1. Cluster Toolkit architecture overview
You can use Cluster Toolkit to deploy clusters on Google Cloud as follows:
Set up your working environment. Your working environment is the
command line from which you run your commands. This can be a
Linux or macOS command line, or Cloud Shell. If you use a Linux or
macOS command line, you need to install a
few dependencies.
From the command line, complete the following:
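Clone the Cluster Toolkit GitHub repository.
This repository contains the source for the gcluster binary, modules, cluster
blueprint examples, and other resources needed for the configuration of your
cluster.
Build the gcluster binary.
For detailed instructions, see Configure your environment.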
Use an editor to create your cluster blueprint file.
Example blueprints
are also available in the Cluster Toolkit GitHub repository. These
blueprints can be used either directly or as a template or starting point
for your custom cluster blueprint.
From the command line, complete the following:
Run the gcluster create command and specify your cluster blueprint. When
you run this command, the gcluster engine completes the following steps:
Builds a deployment folder that is based on the specified cluster
blueprint. This deployment folder contains all the specifications
and resources needed to deploy the cluster.
Prints instructions to the command line on how to deploy the cluster.
These instructions list the commands that you must run to deploy
the cluster, which are either Terraform or Packer commands.
Run the commands provided by the gcluster engine. When you run these
commands, Terraform or Packer deploys the cluster on Google Cloud.
For detailed instructions, see Deploy a cluster.
After your cluster is deployed, you can submit jobs to your HPC
cluster. You can also use Cloud Monitoring to analyze and monitor
Google Cloud resources that are used by your cluster.
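The command-line part of the workflow above can be sketched as a list of commands. This is a minimal sketch under stated assumptions, not the official procedure: it assumes that the binary is built by running `make` inside the cloned repository, that `gcluster create` accepts a `--vars` flag for blueprint variables, and that `examples/hpc-slurm.yaml` is one of the example blueprints. The function name `workflow_commands` and the project ID are hypothetical. The script only prints the commands and does not run them. The final deployment commands are whatever gcluster prints after `create`, so they are not guessed here.

```python
import shlex

# Repository URL as given in the Cluster Toolkit documentation.
REPO_URL = "https://github.com/GoogleCloudPlatform/cluster-toolkit"


def workflow_commands(blueprint_path, project_id):
    """Return the clone, build, and create commands as argv lists.

    Nothing is executed; the caller decides whether and where to run them.
    """
    return [
        ["git", "clone", REPO_URL],
        # Assumption: run from inside the cloned cluster-toolkit directory.
        ["make"],
        # Assumption: --vars supplies blueprint variables such as project_id.
        ["./gcluster", "create", blueprint_path, "--vars", f"project_id={project_id}"],
    ]


# Print each command as a shell-safe string for review before running it.
for argv in workflow_commands("examples/hpc-slurm.yaml", "my-project"):
    print(shlex.join(argv))
```

Keeping the commands as argv lists rather than shell strings avoids quoting problems if a blueprint path or variable value contains spaces.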
Limitations
Cluster Toolkit only supports creating and deleting a cluster. If you want to
modify the hardware or software configuration of an active cluster, Google
recommends the following steps:
Delete the cluster
Update the cluster blueprint
Create the cluster deployment folder
Deploy the cluster
What's next
Try a quickstart tutorial, see Deploy an HPC cluster with Slurm.
Review the Cluster Toolkit GitHub repository (https://github.com/GoogleCloudPlatform/cluster-toolkit).
Last updated 2025-08-29 UTC.