Using NVIDIA GPUs
This page describes how to use node pools with NVIDIA graphics processing unit (GPU) hardware accelerators to provide compute power to your Knative serving container instances for deep-learning tasks such as image recognition and natural language processing, as well as for other compute-intensive tasks.
Adding a node pool with GPUs to your GKE cluster
Have an administrator create a node pool with GPUs (a sketch of the commands follows this list):

1. Add a GPU-enabled node pool to your GKE cluster.
2. Install NVIDIA's device drivers on the nodes.
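The following is a non-authoritative sketch of what the administrator might run. The cluster name, zone, node pool name, machine type, and accelerator type are example values, and the driver-installer manifest path can differ for your node image and GKE version, so treat the GKE GPU documentation as the authoritative reference:

```bash
# Example: create a node pool whose nodes each expose one NVIDIA T4 GPU
# (cluster, zone, pool name, machine type, and accelerator are placeholders).
gcloud container node-pools create gpu-pool \
    --cluster my-cluster \
    --zone us-central1-c \
    --machine-type n1-standard-4 \
    --accelerator type=nvidia-tesla-t4,count=1 \
    --num-nodes 1

# Install NVIDIA's device drivers by applying Google's driver-installer
# DaemonSet (this path is for Container-Optimized OS nodes).
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml
```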
Setting up your service to consume GPUs

You can specify a
resource limit
to consume GPUs for your service by using the Google Cloud console or
the Google Cloud CLI when you deploy a new
service, update an existing service, or
deploy a revision:
Console

1. Go to Knative serving in the Google Cloud console.
2. Click Create service to display the Create service form.
3. In the Service settings section:
   1. Select the GKE cluster with the GPU-enabled node pool.
   2. Specify the name you want to give to your service.
   3. Click Next to continue to the next section.
4. In the Configure the service's first revision section:
   1. Add a container image URL.
   2. Click Advanced settings and in the GPU allocated menu, select the number of GPUs that you want to allocate to your service.
5. Click Next to continue to the next section.
6. In the Configure how this service is triggered section, select which connectivity you would like to use to invoke the service.
7. Click Create to deploy the image to Knative serving and wait for the deployment to finish.
Command line

Caution: Deploying configuration changes by using YAML files replaces the configuration of your existing services. Because a YAML file completely overwrites all configuration, avoid mixing methods to modify your services; for example, do not use YAML files together with the Google Cloud console or gcloud commands.

You can download the configuration of an existing service into a YAML file with the gcloud run services describe command by using the --format=export flag. You can then modify that YAML file and deploy those changes with the gcloud beta run services replace command. Make sure that you modify only the specified attributes.
1. Download the configuration of your service into a file named service.yaml in your local workspace:
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-29 UTC."],[[["\u003cp\u003eThis page guides users on utilizing NVIDIA GPUs with Knative serving container instances for compute-intensive tasks like image recognition and natural language processing.\u003c/p\u003e\n"],["\u003cp\u003eTo enable GPU usage, administrators must add a GPU-enabled node pool to their Google Kubernetes Engine (GKE) cluster and install NVIDIA's device drivers.\u003c/p\u003e\n"],["\u003cp\u003eUsers can specify GPU resource limits for their Knative service through the Google Cloud console or the Google Cloud CLI when deploying or updating a service.\u003c/p\u003e\n"],["\u003cp\u003eThe Google Cloud CLI method involves downloading a service's configuration to a YAML file, modifying the \u003ccode\u003envidia.com/gpu\u003c/code\u003e attribute, and then replacing the existing service with the updated configuration.\u003c/p\u003e\n"],["\u003cp\u003eThe number of GPUs allocated to a service is defined in the resource limit settings using Kubernetes GPU units.\u003c/p\u003e\n"]]],[],null,["# Using NVIDIA GPUs\n\nThis page describes how to drive deep-learning tasks such as image\nrecognition, natural language processing, as well as other compute-intensive\ntasks using [node pools](/kubernetes-engine/docs/concepts/node-pools) with\nNVIDIA graphics processing unit ([GPU](/gpu)) hardware accelerators for compute\npower with your Knative serving container instance.\n\nAdding a node pool with GPUs to your GKE cluster\n------------------------------------------------\n\nHave an administrator create a node pool with GPUs:\n\n1. [Add a GPU-enabled node pool to your GKE cluster](/kubernetes-engine/docs/how-to/gpus#create).\n\n2. [Install NVIDIA's device drivers to the nodes](/kubernetes-engine/docs/how-to/gpus#installing_drivers).\n\nSetting up your service to consume GPUs\n---------------------------------------\n\nYou can specify a\n[resource limit](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/)\nto consume GPUs for your service by using the Google Cloud console or\nthe Google Cloud CLI when you deploy a new\n[service](/anthos/run/archive/docs/deploying#service), update an existing service, or\ndeploy a [revision](/anthos/run/archive/docs/deploying#revision): \n\n### Console\n\n1. [Go to Knative serving](https://console.cloud.google.com/kubernetes/run?enableapi=true)\n2. Click **Create service** to display the *Create service* form.\n\n3. In the **Service settings** section:\n\n 1. Select the GKE cluster with the GPU-enabled node pool.\n 2. Specify the name you want to give to your service.\n 3. Click **Next** to continue to the next section.\n4. In the **Configure the service's first revision** section:\n\n 1. Add a container image URL.\n 2. Click **Advanced settings** and in the **GPU allocated** menu, select the [number of GPUs](/kubernetes-engine/docs/how-to/gpus#gpu_quota) that you want to allocate to your service.\n5. Click **Next** to continue to the next section.\n\n6. In the **Configure how this service is triggered** section,\n select which connectivity you would like to use to invoke the service.\n\n7. 
   ```bash
   gcloud run services describe SERVICE --format export > service.yaml
   ```

   Replace SERVICE with the name of your Knative serving service.

2. In your local file, update the nvidia.com/gpu attribute:

   ```yaml
   apiVersion: serving.knative.dev/v1
   kind: Service
   metadata:
     name: SERVICE_NAME
   spec:
     template:
       spec:
         containers:
         - image: IMAGE_URL
           resources:
             limits:
               nvidia.com/gpu: "GPU_UNITS"
   ```

   Replace GPU_UNITS with the desired GPU value in Kubernetes GPU units. For example, specify 1 for 1 GPU.

3. Deploy the YAML file and replace your service with the new configuration by running the following command:

   ```bash
   gcloud beta run services replace service.yaml
   ```

For more information on GPU performance and cost, see GPUs.
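As an optional follow-up, you can confirm that the new revision actually received the GPU. This is a minimal sketch, assuming you have kubectl access to the cluster, that the service's pods run in the default namespace, and that the standard Knative pod labels are present; adjust the selector and namespace for your setup:

```bash
# List the pods backing the service (label selector and namespace are assumptions).
kubectl get pods -n default -l serving.knative.dev/service=SERVICE

# Describe one pod and look for nvidia.com/gpu under the container's Limits.
kubectl describe pod POD_NAME -n default
```

The pod's Limits section should show nvidia.com/gpu with the value you configured.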