Manage GPU devices with dynamic resource allocation

This page describes how to configure your GPU workloads to use dynamic resource allocation in your Google Distributed Cloud bare metal clusters. Dynamic resource allocation is a Kubernetes API that lets you request and share generic resources, such as GPUs, among Pods and containers. Third-party drivers manage these resources.

With dynamic resource allocation, Kubernetes schedules Pods based on the referenced device configuration. App operators don't need to select specific nodes in their workloads and don't need to ensure that each Pod requests exactly the number of devices that are attached to those nodes. This process is similar to allocating volumes for storage.

This capability helps you run AI workloads by dynamically and precisely allocating the GPU resources within your bare metal clusters, improving resource utilization and performance for demanding workloads.
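The examples on this page use the gpu.nvidia.com DeviceClass, which the NVIDIA DRA driver is expected to provide. As a quick, optional check (a sketch that assumes a GPU DRA driver is already installed in your cluster), you can list the DeviceClass objects that are available:

    # List the DeviceClass objects that DRA drivers have published in the cluster.
    # The gpu.nvidia.com class referenced later on this page should appear here
    # if the NVIDIA GPU DRA driver is installed (assumed for this check).
    kubectl get deviceclasses --kubeconfig=CLUSTER_KUBECONFIG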

This page is for Admins and architects and for Operators who manage the lifecycle of the underlying technology infrastructure. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE Enterprise user roles and tasks.

Before you begin

Before you configure your GPU workloads to use dynamic resource allocation, verify that the following prerequisites are met:

Create GPU workloads that use dynamic resource allocation

For your GPU workloads to use dynamic resource allocation to request GPUs, they must share a namespace with a ResourceClaim that describes the request for GPU device allocation, and they must reference that ResourceClaim so that Kubernetes can assign GPU resources to them.

The following steps set up an environment in which your workloads use dynamic resource allocation to request GPU resources:

  1. To create resources related to dynamic resource allocation, create a new Namespace in your cluster:

    cat <<EOF | kubectl apply --kubeconfig=CLUSTER_KUBECONFIG -f -
    apiVersion: v1
    kind: Namespace
    metadata:
      name: NAMESPACE_NAME
    EOF
    

    Replace the following:

    • CLUSTER_KUBECONFIG: the path of the user cluster kubeconfig file.

    • NAMESPACE_NAME: the name for your dynamic resource allocation namespace.
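
    Optionally, confirm that the namespace was created before you continue. The following command is a minimal check that uses the same placeholder values as the preceding command:

    # Verify that the dynamic resource allocation namespace exists.
    kubectl get namespace NAMESPACE_NAME --kubeconfig=CLUSTER_KUBECONFIG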

  2. Create a ResourceClaim to describe the request for GPU access:

    cat <<EOF | kubectl apply --kubeconfig=CLUSTER_KUBECONFIG -f -
    apiVersion: resource.k8s.io/v1beta1
    kind: ResourceClaim
    metadata:
      namespace: NAMESPACE_NAME
      name: RESOURCE_CLAIM_NAME
    spec:
      devices:
        requests:
        - name: gpu
          deviceClassName: gpu.nvidia.com
    EOF
    

    Replace RESOURCE_CLAIM_NAME with the name of your resource claim for GPU requests.
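
    Optionally, verify that the ResourceClaim was created by listing the claims in the namespace. The claim shows as allocated only after a Pod that references it is scheduled:

    # List ResourceClaims in the dynamic resource allocation namespace.
    kubectl get resourceclaims -n NAMESPACE_NAME --kubeconfig=CLUSTER_KUBECONFIG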

  3. Create workloads that reference the ResourceClaim created in the preceding step.

    The following workload examples show how to reference a ResourceClaim named gpu-claim in the dra-test namespace. The containers in the pod1 Pod are NVIDIA Compute Unified Device Architecture (CUDA) samples that are designed to run CUDA workloads on the GPUs. When the pod1 Pod completes successfully, dynamic resource allocation is working properly and is ready to manage GPU resources in your cluster. To verify the result, see the commands after these steps.

    Ubuntu

    1. Use the following command to apply the manifest to your cluster:

      cat <<EOF | kubectl apply --kubeconfig=CLUSTER_KUBECONFIG -f -
      apiVersion: v1
      kind: Pod
      metadata:
        name: pod1
        namespace: dra-test
      spec:
        restartPolicy: OnFailure
        resourceClaims:
          - name: gpu
            resourceClaimName: gpu-claim
        containers:
          - name: ctr0
            image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0
            resources:
              claims:
                - name: gpu
          - name: ctr1
            image: nvcr.io/nvidia/k8s/cuda-sample:devicequery
            resources:
              claims:
                - name: gpu
      EOF
      

    RHEL

    1. Download and install the nvidia_container_t SELinux policy module, which is required for containers to access GPUs.

      For more information, refer to the NVIDIA dgx-selinux repository.

    2. Use the following command to apply the manifest to your cluster:

      cat <<EOF | kubectl apply --kubeconfig=CLUSTER_KUBECONFIG -f -
      apiVersion: v1
      kind: Pod
      metadata:
        name: pod1
        namespace: dra-test
      spec:
        restartPolicy: OnFailure
        securityContext:
          seLinuxOptions:
            type: nvidia_container_t
        resourceClaims:
          - name: gpu
            resourceClaimName: gpu-claim
        containers:
          - name: ctr0
            image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0
            resources:
              claims:
                - name: gpu
          - name: ctr1
            image: nvcr.io/nvidia/k8s/cuda-sample:devicequery
            resources:
              claims:
                - name: gpu
      EOF
      

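To confirm that dynamic resource allocation allocated GPUs to the workload, check the Pod status and the output of the CUDA samples. The following commands are a sketch of that verification and use the example names pod1, dra-test, ctr0, and ctr1 from the preceding manifests:

    # Check that pod1 ran to completion (STATUS shows Completed).
    kubectl get pod pod1 -n dra-test --kubeconfig=CLUSTER_KUBECONFIG

    # Inspect the CUDA sample output from each container.
    kubectl logs pod1 -c ctr0 -n dra-test --kubeconfig=CLUSTER_KUBECONFIG
    kubectl logs pod1 -c ctr1 -n dra-test --kubeconfig=CLUSTER_KUBECONFIG
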
Limitations

Consider the following limitations when you use dynamic resource allocation:

  • When you use the RHEL operating system, SELinux policies can interfere with containers that try to access GPUs. For more information, see How to use GPUs in containers on bare metal RHEL 8.

  • This feature uses the resource.k8s.io/v1beta1 API group, which differs from the open source Kubernetes API group for this feature, resource.k8s.io/v1. The open source v1 API group provides more features and better stability than the v1beta1 API group. To check which versions of this API group your cluster serves, see the command after this list.

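As an optional, illustrative check, you can query the API server to see which versions of the resource.k8s.io API group your cluster serves:

    # List the served versions of the resource.k8s.io API group.
    kubectl api-versions --kubeconfig=CLUSTER_KUBECONFIG | grep resource.k8s.io
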
What's next