This page describes how to use node pools with NVIDIA graphics processing unit (GPU)
hardware accelerators to provide compute power for your Knative serving container
instances, so that you can run compute-intensive deep-learning tasks such as image
recognition and natural language processing.
Adding a node pool with GPUs to your GKE cluster
Have an administrator create a node pool with GPUs in your GKE cluster and install
NVIDIA's device drivers on the nodes.
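As a minimal sketch, an administrator could create such a node pool with a command
like the following. The pool name, cluster name, zone, machine type, and accelerator
type shown here are placeholders, and the NVIDIA device drivers must still be
installed on the nodes afterward:

  gcloud container node-pools create gpu-pool \
      --cluster=CLUSTER_NAME \
      --zone=COMPUTE_ZONE \
      --machine-type=n1-standard-4 \
      --accelerator=type=nvidia-tesla-t4,count=1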
You can specify a GPU resource limit for your service by using the Google Cloud
console or the Google Cloud CLI when you deploy a new service, update an existing
service, or deploy a revision. The number of GPUs that you allocate is defined in
the resource limit settings, using Kubernetes GPU units.

Console
1. Click Create service to display the Create service form.
2. In the Service settings section:
   1. Select the GKE cluster with the GPU-enabled node pool.
   2. Specify the name you want to give to your service.
   3. Click Next to continue to the next section.
3. In the Configure the service's first revision section:
   1. Add a container image URL.
   2. Click Advanced settings and, in the GPU allocated menu, select the number of
      GPUs that you want to allocate to your service.
   3. Click Next to continue to the next section.
4. In the Configure how this service is triggered section, select which connectivity
   you would like to use to invoke the service.
5. Click Create to deploy the image to Knative serving and wait for the deployment
   to finish.
Command line
You can download the configuration of an existing service into a
YAML file with the gcloud run services describe command by using the
--format=export flag.
You can then modify that YAML file and deploy
those changes with the gcloud beta run services replace command.
You must ensure that you modify only the specified attributes.
Download the configuration of your service into a file named
service.yaml in your local workspace:
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-07 UTC."],[[["\u003cp\u003eThis page guides users on utilizing NVIDIA GPUs with Knative serving container instances for compute-intensive tasks like image recognition and natural language processing.\u003c/p\u003e\n"],["\u003cp\u003eTo enable GPU usage, administrators must add a GPU-enabled node pool to their Google Kubernetes Engine (GKE) cluster and install NVIDIA's device drivers.\u003c/p\u003e\n"],["\u003cp\u003eUsers can specify GPU resource limits for their Knative service through the Google Cloud console or the Google Cloud CLI when deploying or updating a service.\u003c/p\u003e\n"],["\u003cp\u003eThe Google Cloud CLI method involves downloading a service's configuration to a YAML file, modifying the \u003ccode\u003envidia.com/gpu\u003c/code\u003e attribute, and then replacing the existing service with the updated configuration.\u003c/p\u003e\n"],["\u003cp\u003eThe number of GPUs allocated to a service is defined in the resource limit settings using Kubernetes GPU units.\u003c/p\u003e\n"]]],[],null,[]]