This guide describes how you can connect to an existing Managed Lustre instance by using the Managed Lustre CSI driver. This lets you access existing Managed Lustre instances as volumes for your stateful workloads, in a controlled and predictable way.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Cloud Managed Lustre API and the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
- For limitations and requirements, see the CSI driver overview.
- Make sure to enable the Managed Lustre CSI driver. It is disabled by default in Standard and Autopilot clusters.
Set up environment variables
Set up the following environment variables:
export CLUSTER_NAME=CLUSTER_NAME
export PROJECT_ID=PROJECT_ID
export NETWORK_NAME=LUSTRE_NETWORK
export LOCATION=ZONE
Replace the following:
- CLUSTER_NAME: the name of the cluster.
- PROJECT_ID: your Google Cloud project ID.
- LUSTRE_NETWORK: the shared Virtual Private Cloud network where both the GKE cluster and the Managed Lustre instance reside.
- ZONE: the geographical zone of your GKE cluster; for example, us-central1-a.
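These values are shell variables only; nothing is validated at this point. As an optional sanity check, assuming the cluster already exists, you can print the VPC network the cluster uses and compare it with the value you set for LUSTRE_NETWORK:
# Optional check: print the cluster's VPC network and compare it with NETWORK_NAME.
gcloud container clusters describe "${CLUSTER_NAME}" \
    --project=${PROJECT_ID} \
    --location=${LOCATION} \
    --format="value(network)"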
Configure the Managed Lustre CSI driver
This section covers how you can enable and disable the Managed Lustre CSI driver, if needed.
Enable the Managed Lustre CSI driver on a new GKE cluster
To enable the Managed Lustre CSI driver when creating a new GKE cluster, follow these steps:
Autopilot
gcloud container clusters create-auto "${CLUSTER_NAME}" \
--location=${LOCATION} \
--network="${NETWORK_NAME}" \
--cluster-version=1.33.2-gke.1111000 \
--enable-lustre-csi-driver \
--enable-legacy-lustre-port
Standard
gcloud container clusters create "${CLUSTER_NAME}" \
--location=${LOCATION} \
--network="${NETWORK_NAME}" \
--cluster-version=1.33.2-gke.1111000 \
--addons=LustreCsiDriver \
--enable-legacy-lustre-port
Enable the Managed Lustre CSI driver on an existing GKE cluster
If you want to enable the Managed Lustre CSI driver on an existing GKE cluster, use the following command:
gcloud container clusters update ${CLUSTER_NAME} \
--location=${LOCATION} \
--enable-legacy-lustre-port
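Whether you enabled the driver at cluster creation or on an existing cluster, you can optionally confirm the add-on state by inspecting the cluster's add-on configuration. Treat this as a quick sanity check; the exact field name for the Lustre CSI driver inside addonsConfig can vary by GKE version.
# Optional check: list the add-on configuration and look for the Lustre CSI driver entry.
gcloud container clusters describe "${CLUSTER_NAME}" \
    --location=${LOCATION} \
    --format="yaml(addonsConfig)"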
After the Managed Lustre CSI driver is enabled in your cluster, you might notice that your nodes were recreated and that CPU nodes appear to be using a GPU image in the Google Cloud console or CLI output. For example:
config:
imageType: COS_CONTAINERD
nodeImageConfig:
image: gke-1330-gke1552000-cos-121-18867-90-4-c-nvda
This behavior is expected. GKE reuses the GPU image on CPU nodes to securely install the Managed Lustre kernel modules. You are not billed for GPU usage.
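If you want to view this node configuration from the CLI rather than the console, you can describe the node pool. The node pool name default-pool below is an assumption; substitute the name of your own node pool.
# Inspect the node pool configuration. "default-pool" is an assumed name;
# replace it with the name of your node pool.
gcloud container node-pools describe default-pool \
    --cluster="${CLUSTER_NAME}" \
    --location=${LOCATION} \
    --format="yaml(config)"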
Disable the Managed Lustre CSI driver
You can disable the Managed Lustre CSI driver on an existing GKE cluster by using the Google Cloud CLI.
gcloud container clusters update ${CLUSTER_NAME} \
--location=${LOCATION} \
--update-addons=LustreCsiDriver=DISABLED
Once the CSI driver is disabled, your nodes will be automatically recreated, and the Managed Lustre kernel modules will be uninstalled from your GKE nodes.
Access an existing Managed Lustre instance using the Managed Lustre CSI driver
If you already provisioned a Managed Lustre instance within the same network as your GKE cluster, you can follow these instructions to statically provision a PersistentVolume that refers to your instance.
The following sections describe the typical process for accessing an existing Managed Lustre instance by using the Managed Lustre CSI driver:
- Create a PersistentVolume that refers to the Managed Lustre instance.
- Use a PersistentVolumeClaim to access the volume.
- Create a workload that consumes the volume.
Create a PersistentVolume
To locate your Managed Lustre instance, run the following command:
gcloud lustre instances list \
    --project=${PROJECT_ID} \
    --location=${LOCATION}
The output should look similar to the following. Before you proceed to the next step, make sure to note down the name, filesystem, and mountPoint fields of the Managed Lustre instance.
capacityGib: '18000'
createTime: '2025-04-28T22:42:11.140825450Z'
filesystem: testlfs
gkeSupportEnabled: true
mountPoint: 10.90.1.4@tcp:/testlfs
name: projects/my-project/locations/us-central1-a/instances/my-lustre
network: projects/my-project/global/networks/default
perUnitStorageThroughput: '1000'
state: ACTIVE
updateTime: '2025-04-28T22:51:41.559098631Z'
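If you only want the fields used in the next step, you can narrow the output with a format expression. This is just a convenience sketch; the chosen columns are an assumption about which values you need.
# Print only the fields needed for the PersistentVolume manifest.
gcloud lustre instances list \
    --project=${PROJECT_ID} \
    --location=${LOCATION} \
    --format="table(name.basename(), filesystem, mountPoint)"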
Save the following manifest in a file named lustre-pv.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: lustre-pv
spec:
  storageClassName: "STORAGE_CLASS_NAME"
  capacity:
    storage: 18000Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
  claimRef:
    namespace: default
    name: lustre-pvc
  csi:
    driver: lustre.csi.storage.gke.io
    volumeHandle: "PROJECT_ID/LOCATION/INSTANCE_NAME"
    volumeAttributes:
      ip: IP_ADDRESS
      filesystem: FILESYSTEM
Replace the following:
- STORAGE_CLASS_NAME: the name of your StorageClass. The value can be an empty string, but it must meet the specification of your PersistentVolumeClaim.
- volumeHandle: the identifier for this volume.
  - PROJECT_ID: the Google Cloud project ID.
  - LOCATION: the zonal location of your Lustre instance. You must specify a supported zone for the Managed Lustre CSI driver.
  - INSTANCE_NAME: the name of your Lustre instance.
- IP_ADDRESS: the IP address of your Lustre instance. You obtain this from the mountPoint field in the output of the previous command.
- FILESYSTEM: the file system name of your Managed Lustre instance.
For the full list of fields that are supported in the PersistentVolume object, see the Managed Lustre CSI driver reference documentation.
Create the PersistentVolume by running this command:
kubectl apply -f lustre-pv.yaml
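You can then confirm that the PersistentVolume was registered. At this point it typically reports an Available status; it becomes Bound after you create the matching PersistentVolumeClaim in the next section.
# Check that the PersistentVolume exists and inspect its status.
kubectl get pv lustre-pv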
Use the PersistentVolumeClaim to access the volume
You can create a PersistentVolumeClaim resource that references the Managed Lustre CSI driver's StorageClass.
The following manifest file shows an example of how to create a PersistentVolumeClaim in ReadWriteMany access mode, which references the StorageClass you created earlier.
Save the following manifest in a file named lustre-pvc.yaml:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: lustre-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: "STORAGE_CLASS_NAME"
  volumeName: lustre-pv
  resources:
    requests:
      storage: STORAGE_SIZE
Replace STORAGE_SIZE with the storage size; for example, 18000Gi. It must match the specification in your PersistentVolume.
Create the PersistentVolumeClaim by running this command:
kubectl create -f lustre-pvc.yaml
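To verify that the claim bound to the pre-provisioned volume, check its status; the STATUS column should show Bound, with lustre-pv listed as the volume:
# Check that the PersistentVolumeClaim is bound to lustre-pv.
kubectl get pvc lustre-pvc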
Create a workload that consumes the volume
This section shows how to create a Pod that consumes the PersistentVolumeClaim resource you created earlier.
Multiple Pods can share the same PersistentVolumeClaim resource.
Save the following manifest in a file named my-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
      - name: lustre-volume
        mountPath: /data
  volumes:
    - name: lustre-volume
      persistentVolumeClaim:
        claimName: lustre-pvc
Run the following command to apply the manifest to the cluster:
kubectl apply -f my-pod.yaml
The Pod waits until GKE provisions the PersistentVolumeClaim before it starts running. This operation might take several minutes to complete.
Verify that the Pod is running:
kubectl get pods
It might take a few minutes for the Pod to reach the Running state.
The output is similar to the following:
NAME     READY   STATUS    RESTARTS   AGE
my-pod   1/1     Running   0          11s
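To confirm that the Lustre file system is mounted inside the Pod, you can optionally run a quick read/write check against the mount path; the file name test.txt below is an arbitrary example.
# Check the mount and perform a small write/read round trip.
kubectl exec my-pod -- df -h /data
kubectl exec my-pod -- sh -c "echo hello > /data/test.txt && cat /data/test.txt"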
Use fsGroup with Managed Lustre volumes
You can change the group ownership of the root-level directory of the mounted file system to match the fsGroup requested in the Pod's SecurityContext.
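The following is a minimal sketch, assuming the Pod and PersistentVolumeClaim from the previous sections; it sets fsGroup in the Pod's securityContext, where the group ID 1000 is an arbitrary example value.
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  securityContext:
    # Arbitrary example group ID; ownership of the volume's root directory is
    # changed to this group when the volume is mounted.
    fsGroup: 1000
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: lustre-volume
      mountPath: /data
  volumes:
  - name: lustre-volume
    persistentVolumeClaim:
      claimName: lustre-pvc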
Troubleshooting
For troubleshooting guidance, see the Troubleshooting page in the Managed Lustre documentation.
Clean up
To avoid incurring charges to your Google Cloud account, delete the storage resources you created in this guide.
Delete the Pod and PersistentVolumeClaim.
kubectl delete pod my-pod
kubectl delete pvc lustre-pvc
Check the PersistentVolume status. After deleting the Pod and PersistentVolumeClaim, the PersistentVolume should report a "Released" state:
kubectl get pv
The output is similar to the following:
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                STORAGECLASS   REASON   AGE
lustre-pv   18000Gi    RWX            Retain           Released   default/lustre-pvc                           2m28s
Reuse the PersistentVolume. To reuse it, remove the claim reference (claimRef):
kubectl patch pv lustre-pv --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'
The PersistentVolume should now report an "Available" status, indicating its readiness to be bound to a new PersistentVolumeClaim. Check the PersistentVolume status:
kubectl get pv
The output is similar to the following:
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
lustre-pv   18000Gi    RWX            Retain           Available                                   19m
Delete the PersistentVolume. If the PersistentVolume is no longer needed, delete it:
kubectl delete pv lustre-pv
Deleting the PersistentVolume does not remove the underlying Managed Lustre instance.
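If you also want to remove the Managed Lustre instance itself, you can delete it with the gcloud CLI. The command shape below is an assumption based on the gcloud lustre instances command group used earlier; confirm the available commands with gcloud lustre instances --help before running it, and note that deleting the instance permanently removes its data.
# Assumed command shape; verify with `gcloud lustre instances --help`.
gcloud lustre instances delete my-lustre \
    --project=${PROJECT_ID} \
    --location=${LOCATION}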