Benchmarking recipes

To help you run your workloads, we have curated a set of reproducible benchmark recipes that use some of the most common machine learning (ML) frameworks and models. These recipes are stored in GitHub repositories; to access them, see the AI Hypercomputer GitHub organization.
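
For example, you can clone a recipe repository locally before running it. The following is a minimal sketch, assuming the gpu-recipes repository name; browse the organization page for the repository that matches your accelerator and framework:

```python
# Minimal sketch: clone a benchmark recipe repository from the
# AI Hypercomputer GitHub organization and list its top-level contents.
# The "gpu-recipes" repository name is an assumption; check the
# organization page for the repositories that apply to your setup.
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/AI-Hypercomputer/gpu-recipes.git"  # assumed repository
DEST = Path("gpu-recipes")

if not DEST.exists():
    subprocess.run(["git", "clone", REPO_URL, str(DEST)], check=True)

# Print the top-level entries so you can pick a recipe from the catalog below.
for entry in sorted(DEST.iterdir()):
    print(entry.name)
```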

Overview

Before you get started with these recipes, ensure that you have completed the following steps:

  1. Choose an accelerator that best suits your workload. See Choose a deployment strategy.
  2. Select a consumption method based on your accelerator of choice. See Consumption options.
  3. Create your cluster based on the type of accelerator that you selected. See Cluster deployment guides. A quick check for GPU node readiness is sketched after this list.
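
After the cluster is created, a quick sanity check before you launch a recipe is to confirm that the GPU node pools are visible to Kubernetes. The following is a minimal sketch, not part of the recipes themselves; it calls kubectl through Python, relies on the standard GKE node label cloud.google.com/gke-accelerator, and assumes kubectl is already authenticated against your cluster:

```python
# Minimal sketch: list the GPU nodes in the cluster before running a
# benchmark recipe. Assumes kubectl is installed and its current context
# points at the GKE cluster you created.
import subprocess

result = subprocess.run(
    [
        "kubectl", "get", "nodes",
        # Only nodes that carry the standard GKE GPU label.
        "-l", "cloud.google.com/gke-accelerator",
        # Also show the accelerator type (for example, nvidia-h100-mega-80gb) as a column.
        "-L", "cloud.google.com/gke-accelerator",
    ],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout or "No GPU nodes found; check your node pools.")
```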

Recipes

The following reproducible benchmark recipes are available for pre-training and inference on GKE clusters.

The catalog covers a range of frameworks, models, and accelerators; the following table lists each recipe by accelerator, model, framework, and workload type.

| Recipe name | Accelerator | Model | Framework | Workload type |
|---|---|---|---|---|
| DeepSeek R1 671B | A3 Mega | DeepSeek R1 671B | SGLang | Inference on GKE |
| DeepSeek R1 671B | A3 Mega | DeepSeek R1 671B | vLLM | Inference on GKE |
| DeepSeek R1 671B | A3 Ultra | DeepSeek R1 671B | SGLang | Inference on GKE |
| DeepSeek R1 671B | A3 Ultra | DeepSeek R1 671B | vLLM | Inference on GKE |
| GPT3-175B - A3 Mega | A3 Mega | GPT3-175B | NeMo | Pre-training on GKE |
| Llama-3.1-405B | A3 Ultra | Llama-3.1-405B | TensorRT-LLM | Inference on GKE |
| Llama3 70B, Llama3.1 70B - A3 Mega | A3 Mega | Llama3 70B, Llama3.1 70B | NeMo | Pre-training on GKE |
| Llama3.1 70B - A3 Ultra | A3 Ultra | Llama3.1 70B | MaxText | Pre-training on GKE |
| Llama3.1 70B - A3 Ultra | A3 Ultra | Llama3.1 70B | NeMo | Pre-training on GKE |
| Mixtral 8x7B - A3 Mega | A3 Mega | Mixtral 8x7B | NeMo | Pre-training on GKE |
| Mixtral 8x7B - A3 Ultra | A3 Ultra | Mixtral 8x7B | MaxText | Pre-training on GKE |
| Mixtral 8x7B - A3 Ultra | A3 Ultra | Mixtral 8x7B | NeMo | Pre-training on GKE |