NAME
    gcloud alpha container ai recommender - recommendation engine for GKE AI workloads

SYNOPSIS
    gcloud alpha container ai recommender GROUP [GCLOUD_WIDE_FLAG ...]
-
DESCRIPTION
    (ALPHA) The GKE Inference Recommender helps simplify deploying AI
    inference on Google Kubernetes Engine (GKE). It provides tailored
    recommendations based on Google's internal benchmarks. Provide inputs
    like your preferred open-source model (e.g., Llama, Gemma, or Mistral)
    and your application's performance target. Based on these inputs, the
    recommender generates accelerator choices with performance metrics, and
    detailed, ready-to-deploy recommendations for compute, load balancing,
    and autoscaling. These recommendations are provided as standard
    Kubernetes YAML manifests, which you can deploy or modify.
GCLOUD WIDE FLAGS
    These flags are available to all commands: --help.

    Run $ gcloud help for details.
GROUPS
    GROUP is one of the following:

    accelerators
        (ALPHA) Manage supported accelerators for GKE recommender.

    manifests
        (ALPHA) Generate optimized Kubernetes manifests.

    model-and-server-combinations
        (ALPHA) Manage supported model and model server combinations for GKE
        recommender.

    model-server-versions
        (ALPHA) Manage supported model server versions for GKE recommender.

    model-servers
        (ALPHA) Manage supported model servers for GKE recommender.

    models
        (ALPHA) Manage supported models for GKE recommender.
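As a sketch of how these groups might fit together in a workflow (the verbs and flag names below are illustrative assumptions, not confirmed by this page), one could first inspect what the recommender supports and then generate manifests for a chosen configuration:

```shell
# Discover what the recommender supports. A `list` verb under each group is
# an assumption based on the "Manage supported ..." descriptions above.
gcloud alpha container ai recommender models list
gcloud alpha container ai recommender accelerators list

# Generate ready-to-deploy Kubernetes manifests for a chosen model, model
# server, and accelerator. All flag names and values here are hypothetical
# placeholders; consult `gcloud alpha container ai recommender manifests --help`
# for the actual interface.
gcloud alpha container ai recommender manifests create \
    --model=meta-llama/Llama-3.1-8B-Instruct \
    --model-server=vllm \
    --accelerator-type=nvidia-l4 \
    > recommended-manifests.yaml
```

The generated YAML can then be reviewed, modified, and applied to a cluster with `kubectl apply -f recommended-manifests.yaml`, consistent with the description's note that the output is standard, editable Kubernetes manifests.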
NOTES
    This command group is currently in alpha and might change without notice.
    If this command fails with API permission errors despite specifying the
    correct project, you might be trying to access an API with an
    invitation-only early access allowlist.
Last updated 2025-04-01 UTC.