Connect to an existing Managed Lustre instance from Google Kubernetes Engine

This guide describes how to connect to an existing Google Cloud Managed Lustre instance by using the GKE Managed Lustre CSI driver. This lets you access fully managed Managed Lustre instances as volumes for your stateful workloads.

Limitations

Using Managed Lustre from GKE has the following limitations:

  • The GKE node pool version must be 1.31.5 or greater.

    • For 1.31.5, the GKE patch number must be at least 1299000.
    • For versions 1.31.6 and 1.31.7, any GKE patch number is supported.
    • For 1.32.1, the GKE patch number must be at least 1673000.
    • For versions 1.32.2 or higher, any GKE patch number is supported.

    Use the following command to check the version of the node pool:

    gcloud container clusters describe \
      CLUSTER_NAME --location=LOCATION | grep currentNodeVersion
    

    Replace CLUSTER_NAME with the name of your cluster, and LOCATION with the cluster's zone (for a zonal cluster) or region (for a regional cluster). For example, us-central1-a or us-central1.

    The patch number is the seven-digit number at the end of the node version string; see the example output after this list.

  • The node image must be Container-Optimized OS with containerd (cos_containerd).

  • Only Standard clusters are supported. Autopilot clusters are not supported.

  • Secure Boot must be disabled in your node pool. Secure Boot is disabled by default in Standard GKE clusters.

  • Only manual provisioning is supported. The Managed Lustre instance must be created before you can install the CSI driver. Dynamic provisioning is not supported.
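For example, the version check command might return output similar to the following, where the patch number is 1299000:

currentNodeVersion: 1.31.5-gke.1299000

To check the node image type and Secure Boot setting of an existing node pool, you can describe it with gcloud. This is a sketch; the --format fields assume the GKE NodePool API field names, and NODE_POOL_NAME is the name of your node pool:

gcloud container node-pools describe NODE_POOL_NAME \
  --cluster=CLUSTER_NAME --location=LOCATION \
  --format="value(config.imageType, config.shieldedInstanceConfig.enableSecureBoot)"

The first value should be COS_CONTAINERD, and the second should be False or empty (Secure Boot disabled).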

Configure IAM permissions

You must have the following IAM role to create a Google Kubernetes Engine cluster:

  • Kubernetes Engine Cluster Admin (roles/container.clusterAdmin)

To grant the role, run the following command:

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="user:EMAIL_ADDRESS" \
  --role=roles/container.clusterAdmin
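
To confirm that the role was granted, you can list the roles bound to the user account; this optional check uses the same PROJECT_ID and EMAIL_ADDRESS placeholders as the preceding command:

gcloud projects get-iam-policy PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:user:EMAIL_ADDRESS" \
  --format="value(bindings.role)"

The output should include roles/container.clusterAdmin.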

Enable the API

Enable the Google Kubernetes Engine API.

gcloud

Use gcloud services enable as follows:

gcloud services enable container.googleapis.com --project=PROJECT_ID

Google Cloud console

  1. Go to the Kubernetes Engine API page in the Google Cloud console.

    Go to Kubernetes Engine API

  2. Click Enable API. The API is enabled for your project.
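
Whichever method you use, you can confirm that the API is enabled by listing the project's enabled services:

gcloud services list --enabled --project=PROJECT_ID | grep container.googleapis.com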

Create a GKE cluster

If you already have a GKE Standard cluster, you can skip this step. Otherwise, run the following command, where $ZONE is the zone in which to create the cluster:

gcloud container clusters create lustre-test \
  --cluster-version 1.32 --release-channel rapid \
  --location=$ZONE
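
The defaults for Standard clusters already satisfy the node image and Secure Boot requirements listed earlier. If you prefer to make the node image explicit, the following variation of the preceding command adds the optional --image-type flag:

gcloud container clusters create lustre-test \
  --cluster-version 1.32 --release-channel rapid \
  --location=$ZONE \
  --image-type=COS_CONTAINERD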

Run the following commands to ensure that the kubectl context is set up correctly, where ${CLUSTER_NAME} is the name of your cluster (lustre-test in the preceding example):

gcloud container clusters get-credentials ${CLUSTER_NAME} --location=$ZONE
kubectl config current-context
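
Optionally, confirm that the nodes are ready and running the required image:

kubectl get nodes -o wide

The OS-IMAGE column should report Container-Optimized OS from Google.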

Install the CSI driver

The CSI driver can be deployed using Kustomize.

  1. Download the Managed Lustre CSI driver from GitHub. To clone the repository, use git clone as follows:

    git clone https://github.com/GoogleCloudPlatform/lustre-csi-driver
    
  2. Install the jq utility:

    sudo apt-get update
    sudo apt-get install jq
    
  3. Install the CSI driver:

    cd ./lustre-csi-driver
    OVERLAY=gke-release make install
    
  4. Confirm that the CSI driver is successfully installed:

    kubectl get CSIDriver,DaemonSet,Pods -n lustre-csi-driver
    

    The output shows that the CSIDriver object is registered and that the DaemonSet and its Pods are running, similar to the following example:

    NAME                                                 ATTACHREQUIRED   PODINFOONMOUNT   STORAGECAPACITY   TOKENREQUESTS   REQUIRESREPUBLISH   MODES        AGE
    csidriver.storage.k8s.io/lustre.csi.storage.gke.io   false            false            false             <unset>         false               Persistent   27s
    
    NAME                             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
    daemonset.apps/lustre-csi-node   1         1         1       1            1           kubernetes.io/os=linux   28s
    
    NAME                          READY   STATUS    RESTARTS   AGE
    pod/lustre-csi-node-gqffs     2/2     Running   0          28s
    
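If the Pods are not yet Running, you can wait for the DaemonSet rollout to complete, using the namespace and DaemonSet name shown in the output above:

kubectl rollout status daemonset/lustre-csi-node -n lustre-csi-driver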

Create a Persistent Volume and Persistent Volume Claim

Follow these instructions to create a Persistent Volume (PV) and Persistent Volume Claim (PVC).

  1. Open ~/lustre-csi-driver/examples/pre-prov/preprov-pvc-pv.yaml. This is an example configuration file that you can update for your use:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: preprov-pv
    spec:
      storageClassName: ""
      capacity:
        storage: 18Ti # The capacity of the instance
      accessModes:
        - ReadWriteMany
      persistentVolumeReclaimPolicy: Retain
      volumeMode: Filesystem
      csi:
        driver: lustre.csi.storage.gke.io
        volumeHandle: <project-id>/<instance-location>/<instance-name> # Update these values
        volumeAttributes:
          ip: ${EXISTING_LUSTRE_IP_ADDRESS} # The IP address of the existing Lustre instance
          filesystem: ${EXISTING_LUSTRE_FSNAME} # The filesystem name of the existing Lustre instance
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: preprov-pvc
    spec:
      accessModes:
        - ReadWriteMany
      storageClassName: ""
      volumeName: preprov-pv
      resources:
        requests:
          storage: 18Ti # The capacity of the instance
    
  2. Update the example file with the correct values:

    • volumeHandle: Update with the correct project ID, zone, and Managed Lustre instance name.
    • storage: This value should match the size of the underlying Managed Lustre instance.
    • volumeAttributes:
      • ip must point to the Managed Lustre instance IP.
      • filesystem must be the Managed Lustre instance's file system name. If you need to look up these values, see the lookup example after these steps.
  3. Apply the example PV and PVC configuration:

    kubectl apply -f ~/lustre-csi-driver/examples/pre-prov/preprov-pvc-pv.yaml
    
  4. Verify that the PV and PVC are bound:

    kubectl get pvc
    

    The expected output looks like the following example:

    NAME          STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    preprov-pvc   Bound    preprov-pv   18Ti       RWX                           76s
    
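If you don't have the instance's IP address, file system name, or capacity at hand, you can read them from the Managed Lustre instance itself. This is a sketch; it assumes the gcloud lustre instances command group is available in your gcloud CLI version, and that INSTANCE_NAME and LOCATION match your instance:

gcloud lustre instances describe INSTANCE_NAME \
  --location=LOCATION --project=PROJECT_ID

Copy the instance's mount IP address, file system name, and capacity from the output into the ip, filesystem, and storage fields of the manifest in step 1.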

Use the Persistent Volume in a Pod

The Managed Lustre CSI driver files include a sample Pod configuration YAML file.

  1. Open ~/lustre-csi-driver/examples/pre-prov/preprov-pod.yaml. This is an example configuration file that you can update for your use:

    apiVersion: v1
    kind: Pod
    metadata:
      name: lustre-pod
    spec:
      containers:
      - name: nginx
        image: nginx
        volumeMounts:
          - mountPath: /lustre_volume
            name: mypvc
      volumes:
      - name: mypvc
        persistentVolumeClaim:
          claimName: preprov-pvc
    
  2. Update the volumeMounts values if needed.

  3. Deploy the Pod:

    kubectl apply -f ~/lustre-csi-driver/examples/pre-prov/preprov-pod.yaml
    
  4. Verify that the Pod is running. It can take a few minutes for the Pod to reach the Running state.

    kubectl get pods
    

    The expected output looks like the following example:

    NAME           READY   STATUS    RESTARTS   AGE
    lustre-pod     1/1     Running   0          11s
    
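To confirm that the Managed Lustre volume is mounted and writable, you can run a quick check inside the Pod against the mount path from the example manifest:

kubectl exec -it lustre-pod -- df -h /lustre_volume
kubectl exec -it lustre-pod -- sh -c "echo hello > /lustre_volume/test-file && cat /lustre_volume/test-file"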

Your GKE + Managed Lustre environment is ready to use.