gcloud alpha container ai profiles

NAME
gcloud alpha container ai profiles - quickstart engine for GKE AI workloads
SYNOPSIS
gcloud alpha container ai profiles GROUP [GCLOUD_WIDE_FLAG ...]
DESCRIPTION
(ALPHA) The GKE Inference Quickstart simplifies deploying AI inference on Google Kubernetes Engine (GKE). It provides tailored profiles based on Google's internal benchmarks. You supply inputs such as your preferred open-source model (e.g., Llama, Gemma, or Mistral) and your application's performance target; from these, the quickstart generates accelerator choices with performance metrics, plus detailed, ready-to-deploy profiles for compute, load balancing, and autoscaling. The profiles are delivered as standard Kubernetes YAML manifests, which you can deploy as-is or modify.
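The end-to-end flow described above can be sketched with the subcommand groups listed under GROUPS. The subcommands shown are from this reference, but the specific flag names and values below are illustrative assumptions and may differ across gcloud releases:

```shell
# Requires the gcloud CLI with alpha components installed and an
# authenticated project. Flag names below are assumptions.

# 1. List open-source models supported by GKE Inference Quickstart.
gcloud alpha container ai profiles models list

# 2. Compare accelerator choices for a chosen model and model server
#    (model and server values here are hypothetical examples).
gcloud alpha container ai profiles accelerators list \
    --model=meta-llama/Llama-3.1-8B-Instruct \
    --model-server=vllm

# 3. Generate ready-to-deploy Kubernetes YAML manifests for one profile.
gcloud alpha container ai profiles manifests create \
    --model=meta-llama/Llama-3.1-8B-Instruct \
    --model-server=vllm \
    --accelerator-type=nvidia-l4 > manifests.yaml
```

The generated manifests.yaml can then be applied with kubectl apply -f manifests.yaml against a GKE cluster that has the chosen accelerator available.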
GCLOUD WIDE FLAGS
These flags are available to all commands: --help.

Run $ gcloud help for details.

GROUPS
GROUP is one of the following:
accelerators
(ALPHA) Manage supported accelerators for GKE Inference Quickstart.
manifests
(ALPHA) Generate optimized Kubernetes manifests.
model-and-server-combinations
(ALPHA) Manage supported model and model server combinations for GKE Inference Quickstart.
model-server-versions
(ALPHA) Manage supported model server versions for GKE Inference Quickstart.
model-servers
(ALPHA) Manage supported model servers for GKE Inference Quickstart.
models
(ALPHA) Manage supported models for GKE Inference Quickstart.
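The groups above form a discovery chain: pick a model, then a compatible model server, then a supported server version, before generating manifests. A minimal sketch of that chain, assuming list subcommands accept filter flags like --model and --model-server (the flag names and the Gemma model ID are hypothetical examples, not taken from this reference):

```shell
# Walk the discovery chain; flag names are illustrative assumptions.
gcloud alpha container ai profiles models list
gcloud alpha container ai profiles model-servers list \
    --model=google/gemma-2-9b-it
gcloud alpha container ai profiles model-server-versions list \
    --model=google/gemma-2-9b-it \
    --model-server=vllm
```

Each step narrows the options passed to the next, ending with inputs suitable for the manifests group.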
NOTES
This command is currently in alpha and might change without notice. If this command fails with API permission errors despite specifying the correct project, you might be trying to access an API with an invitation-only early access allowlist.