To help you run your workloads, we have curated a set of reproducible benchmark recipes for some of the most common machine learning (ML) frameworks and models. The recipes are stored in GitHub repositories; to access them, see the AI Hypercomputer GitHub organization.
Overview
Before you get started with these recipes, ensure that you have completed the following steps:
- Choose an accelerator that best suits your workload. For more information, see Choose a deployment strategy.
- Select a consumption method based on your chosen accelerator. For more information, see Consumption options.
- Create your cluster based on the type of accelerator that you selected. For more information, see Cluster deployment guides.
Recipes
The following reproducible benchmark recipes are available for pre-training and inference on GKE clusters.
The catalog covers the following combinations of framework, model, accelerator, and workload type:

- Frameworks: NeMo, MaxText, SGLang, TensorRT-LLM, vLLM
- Models: DeepSeek R1 671B, GPT3-175B, Llama3 70B, Llama3.1 70B, Llama-3.1-405B, Mixtral 8x7B
- Accelerators: A3 Ultra, A3 Mega
- Workload types: Inference, Pre-training
Recipe name | Accelerator | Model | Framework | Workload type
---|---|---|---|---
DeepSeek R1 671B | A3 Mega | DeepSeek R1 671B | SGLang | Inference on GKE
DeepSeek R1 671B | A3 Mega | DeepSeek R1 671B | vLLM | Inference on GKE
DeepSeek R1 671B | A3 Ultra | DeepSeek R1 671B | SGLang | Inference on GKE
DeepSeek R1 671B | A3 Ultra | DeepSeek R1 671B | vLLM | Inference on GKE
GPT3-175B - A3 Mega | A3 Mega | GPT3-175B | NeMo | Pre-training on GKE
Llama-3.1-405B | A3 Ultra | Llama-3.1-405B | TensorRT-LLM | Inference on GKE
Llama3 70B - A3 Mega | A3 Mega | Llama3 70B | NeMo | Pre-training on GKE
Llama3.1 70B - A3 Ultra | A3 Ultra | Llama3.1 70B | MaxText | Pre-training on GKE
Llama3.1 70B - A3 Ultra | A3 Ultra | Llama3.1 70B | NeMo | Pre-training on GKE
Mixtral 8x7B - A3 Mega | A3 Mega | Mixtral 8x7B | NeMo | Pre-training on GKE
Mixtral 8x7B - A3 Ultra | A3 Ultra | Mixtral 8x7B | MaxText | Pre-training on GKE
Mixtral 8x7B - A3 Ultra | A3 Ultra | Mixtral 8x7B | NeMo | Pre-training on GKE
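If you want to filter the catalog above programmatically, the following Python sketch shows one way to do it. The recipe list is transcribed from the table; the `find_recipes` helper and its field names are illustrative, not an official API.

```python
# Recipes transcribed from the catalog table above.
RECIPES = [
    {"name": "DeepSeek R1 671B", "accelerator": "A3 Mega", "model": "DeepSeek R1 671B", "framework": "SGLang", "workload": "Inference"},
    {"name": "DeepSeek R1 671B", "accelerator": "A3 Mega", "model": "DeepSeek R1 671B", "framework": "vLLM", "workload": "Inference"},
    {"name": "DeepSeek R1 671B", "accelerator": "A3 Ultra", "model": "DeepSeek R1 671B", "framework": "SGLang", "workload": "Inference"},
    {"name": "DeepSeek R1 671B", "accelerator": "A3 Ultra", "model": "DeepSeek R1 671B", "framework": "vLLM", "workload": "Inference"},
    {"name": "GPT3-175B - A3 Mega", "accelerator": "A3 Mega", "model": "GPT3-175B", "framework": "NeMo", "workload": "Pre-training"},
    {"name": "Llama-3.1-405B", "accelerator": "A3 Ultra", "model": "Llama-3.1-405B", "framework": "TensorRT-LLM", "workload": "Inference"},
    {"name": "Llama3 70B - A3 Mega", "accelerator": "A3 Mega", "model": "Llama3 70B", "framework": "NeMo", "workload": "Pre-training"},
    {"name": "Llama3.1 70B - A3 Ultra", "accelerator": "A3 Ultra", "model": "Llama3.1 70B", "framework": "MaxText", "workload": "Pre-training"},
    {"name": "Llama3.1 70B - A3 Ultra", "accelerator": "A3 Ultra", "model": "Llama3.1 70B", "framework": "NeMo", "workload": "Pre-training"},
    {"name": "Mixtral 8x7B - A3 Mega", "accelerator": "A3 Mega", "model": "Mixtral 8x7B", "framework": "NeMo", "workload": "Pre-training"},
    {"name": "Mixtral 8x7B - A3 Ultra", "accelerator": "A3 Ultra", "model": "Mixtral 8x7B", "framework": "MaxText", "workload": "Pre-training"},
    {"name": "Mixtral 8x7B - A3 Ultra", "accelerator": "A3 Ultra", "model": "Mixtral 8x7B", "framework": "NeMo", "workload": "Pre-training"},
]

def find_recipes(recipes, **filters):
    """Return the recipes whose fields match every given filter value."""
    return [r for r in recipes if all(r.get(k) == v for k, v in filters.items())]

# Example: all A3 Ultra pre-training recipes that use MaxText.
for recipe in find_recipes(RECIPES, accelerator="A3 Ultra", framework="MaxText"):
    print(recipe["name"])
```

Any combination of fields can be passed as keyword arguments, so the same helper answers "which frameworks serve DeepSeek R1?" or "which recipes run on A3 Mega?" without extra code.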