Filesystem in Userspace (FUSE) is an interface used to export a file system to the Linux kernel. Cloud Storage FUSE lets you mount Cloud Storage buckets as a file system so that applications can access the objects in a bucket using common file I/O operations (for example, open, read, write, and close) rather than using cloud-specific APIs.
The Cloud Storage FUSE CSI driver lets you use the Kubernetes API to consume pre-existing Cloud Storage buckets as volumes. Your applications can upload and download objects using Cloud Storage FUSE file system semantics. The Cloud Storage FUSE CSI driver provides a fully-managed experience powered by the open source Google Cloud Storage FUSE CSI driver.
The driver natively supports the following ways for you to configure your Cloud Storage-backed volumes:
CSI ephemeral volumes: You specify the Cloud Storage bucket in-line with the Pod specification. To learn more about this volume type, see the CSI ephemeral volumes overview in the open source Kubernetes documentation.
Static provisioning: You create a PersistentVolume resource that refers to the Cloud Storage bucket. Your Pod can then reference a PersistentVolumeClaim that is bound to this PersistentVolume. To learn more about this workflow, see Configure a Pod to Use a PersistentVolume for Storage.
You can use the Cloud Storage FUSE CSI driver with file caching to improve the read performance of applications that handle small files from Cloud Storage buckets. The Cloud Storage FUSE file cache feature is a client-based read cache that lets repeated file reads be served more quickly from cache storage of your choice. You can choose from a range of storage options for the read cache, including Local SSDs and Persistent Disk-based storage, based on your price-performance needs. You must opt in to enable file caching with the Cloud Storage FUSE CSI driver. To learn more about best practices for caching, refer to Cloud Storage FUSE performance.
Benefits
- Enabling the Cloud Storage FUSE CSI driver on your cluster turns on automatic deployment and management of the driver. The driver works on both Standard and Autopilot clusters.
- The Cloud Storage FUSE CSI driver does not need the privileged access that FUSE clients typically require, which provides a better security posture.
- The support of CSI ephemeral volumes simplifies volume configuration and management by eliminating the need for PersistentVolumeClaim and PersistentVolume objects.
- The Cloud Storage FUSE CSI driver supports the ReadWriteMany, ReadOnlyMany, and ReadWriteOnce access modes.
- You can use Workload Identity Federation for GKE to manage authentication while having granular control over how your Pods access Cloud Storage objects. Uniform bucket-level access is required for read-write workloads when using Workload Identity Federation.
- If you are running ML training and serving workloads with frameworks like Ray, PyTorch, Spark, and TensorFlow, the portability and simplicity provided by the Cloud Storage FUSE CSI driver allow you to run your workloads directly on your GKE clusters without additional code changes.
- You can read Cloud Storage objects with file caching enabled to boost read performance. File caching accelerates repeated reads by serving objects from local storage. To learn more about the benefits of file caching, refer to the Cloud Storage FUSE documentation.
- With Cloud Storage FUSE v2.4.0 and the file cache enabled, you can use the parallel download feature to accelerate reading large files from Cloud Storage by downloading them with multiple threads. You can use this feature to improve model load times, especially for reads over 1 GB in size (for example, up to twice as fast when loading Llama2 70B).
- You can consume Cloud Storage FUSE volumes in init containers.
- You can view metrics insights for Cloud Storage FUSE, including file system, Cloud Storage, and file cache usage.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
- Create your Cloud Storage buckets. To improve performance, set the Location type field to Region, and select a region where your GKE cluster is running.
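For example, a minimal sketch of creating a regional bucket with uniform bucket-level access (required for read-write workloads with Workload Identity Federation for GKE); the bucket name and region are placeholders:

gcloud storage buckets create gs://BUCKET_NAME \
    --location=REGION \
    --uniform-bucket-level-access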
Limitations
- The Cloud Storage FUSE file system has differences in performance, availability, access authorization, and semantics compared to a POSIX file system.
- The Cloud Storage FUSE CSI driver is not supported on GKE Sandbox.
- The Cloud Storage FUSE CSI driver does not support volume snapshots, volume cloning, or volume expansions.
- The Cloud Storage FUSE CSI driver does not support Pods running on the host network (hostNetwork: true) due to restrictions of Workload Identity Federation for GKE.
- See the known issues in the Cloud Storage FUSE CSI driver GitHub project.
- See the open issues in the Cloud Storage FUSE CSI driver GitHub project. The issues are being triaged and will be resolved in future updates.
Requirements
To use the Cloud Storage FUSE CSI driver, your clusters must meet the following requirements:
- Use Linux clusters running GKE version 1.24 or later.
- Have Workload Identity Federation for GKE enabled.
- Have GKE metadata server enabled on your node pool.
- Make sure you have installed the latest version of the Google Cloud CLI.
- To use a private image for the sidecar container, use a custom write buffer volume, or configure the sidecar container resource requests, make sure your cluster uses these GKE versions: 1.25.16-gke.1360000, 1.26.13-gke.1052000, 1.27.10-gke.1055000, 1.28.6-gke.1369000, 1.29.1-gke.1575000, or later.
- To use the file cache feature or volume attributes, make sure your cluster uses these GKE versions: 1.25.16-gke.1759000, 1.26.15-gke.1158000, 1.27.12-gke.1190000, 1.28.8-gke.1175000, 1.29.3-gke.1093000 or later.
- To consume Cloud Storage FUSE volumes in init containers, make sure your cluster uses GKE version 1.29.3-gke.1093000 or later, and all the nodes in your cluster use GKE version 1.29 or later.
- To use the parallel download feature in GKE, your clusters must run 1.29.6-gke.1254000, 1.30.2-gke.1394000, or later.
- To view the Cloud Storage FUSE metrics, your cluster must run GKE version 1.31.1-gke.1621000 or later. These metrics are enabled by default.
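As a quick check of the first two requirements, you can inspect the cluster's version and workload pool; this sketch assumes the standard fields in the gcloud cluster describe output:

gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION \
    --format="value(currentMasterVersion, workloadIdentityConfig.workloadPool)"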
Enable the Cloud Storage FUSE CSI driver
To create a Standard cluster with the Cloud Storage FUSE CSI driver enabled, you can use the gcloud CLI:
gcloud container clusters create CLUSTER_NAME \
--addons GcsFuseCsiDriver \
--cluster-version=VERSION \
--location=LOCATION \
--workload-pool=PROJECT_ID.svc.id.goog
Replace the following:
- CLUSTER_NAME: the name of your cluster.
- VERSION: the GKE version number. You must select 1.24 or later.
- LOCATION: the Compute Engine location for the cluster.
- PROJECT_ID: your project ID.
To enable the driver on an existing Standard cluster, use the gcloud container clusters update command:
gcloud container clusters update CLUSTER_NAME \
--update-addons GcsFuseCsiDriver=ENABLED \
--location=LOCATION
Replace the following:
- CLUSTER_NAME: the name of your cluster.
- LOCATION: the Compute Engine location for the cluster.
After you enable the Cloud Storage FUSE CSI driver, you can use the driver in Kubernetes volumes by specifying the driver and provisioner name: gcsfuse.csi.storage.gke.io.
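To verify that the driver is available, you can check for the corresponding CSIDriver object, assuming it is registered under that same name:

kubectl get csidriver gcsfuse.csi.storage.gke.io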
Configure access to Cloud Storage buckets using Workload Identity Federation for GKE
To make your Cloud Storage buckets accessible by your GKE cluster using Workload Identity Federation for GKE, follow these steps. See Configure applications to use Workload Identity Federation for GKE for more information.
Get credentials for your cluster:
gcloud container clusters get-credentials CLUSTER_NAME \
    --location=LOCATION
Replace the following:
- CLUSTER_NAME: the name of your cluster that has Workload Identity Federation for GKE enabled.
- LOCATION: the Compute Engine location for the cluster.
Create a namespace to use for the Kubernetes ServiceAccount. You can also use the default namespace or any existing namespace.
kubectl create namespace NAMESPACE
Replace the following:
- NAMESPACE: the name of the Kubernetes namespace for the Kubernetes ServiceAccount.
Create a Kubernetes ServiceAccount for your application to use. You can also use any existing Kubernetes ServiceAccount in any namespace, including the default Kubernetes ServiceAccount.
kubectl create serviceaccount KSA_NAME \
    --namespace NAMESPACE
Replace the following:
- KSA_NAME: the name of your new Kubernetes ServiceAccount.
- NAMESPACE: the name of the Kubernetes namespace for the Kubernetes ServiceAccount.
Grant one of the IAM roles for Cloud Storage to the Kubernetes ServiceAccount.
To grant your Kubernetes ServiceAccount access to only a specific Cloud Storage bucket, use the following command:
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
    --member "principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/NAMESPACE/sa/KSA_NAME" \
    --role "ROLE_NAME"
Replace the following:
- BUCKET_NAME: your Cloud Storage bucket name.
- PROJECT_NUMBER: the numerical project number of your GKE cluster. To find your project number, see Identifying projects.
- PROJECT_ID: the project ID of your GKE cluster.
- NAMESPACE: the name of the Kubernetes namespace for the Kubernetes ServiceAccount.
- KSA_NAME: the name of your new Kubernetes ServiceAccount.
- ROLE_NAME: the IAM role to assign to your Kubernetes ServiceAccount.
  - For read-only workloads, use the Storage Object Viewer role (roles/storage.objectViewer).
  - For read-write workloads, use the Storage Object User role (roles/storage.objectUser).
Optionally, you can grant your Kubernetes ServiceAccount access to all Cloud Storage buckets in the project by using the following command:
gcloud projects add-iam-policy-binding GCS_PROJECT \
    --member "principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/NAMESPACE/sa/KSA_NAME" \
    --role "ROLE_NAME"
Replace the following:
- GCS_PROJECT: the project ID of your Cloud Storage buckets.
- PROJECT_NUMBER: the numerical project number of your GKE cluster. To find your project number, see Identifying projects.
- PROJECT_ID: the project ID of your GKE cluster.
- NAMESPACE: the name of the Kubernetes namespace for the Kubernetes ServiceAccount.
- KSA_NAME: the name of your new Kubernetes ServiceAccount.
- ROLE_NAME: the IAM role to assign to your Kubernetes ServiceAccount.
  - For read-only workloads, use the Storage Object Viewer role (roles/storage.objectViewer).
  - For read-write workloads, use the Storage Object User role (roles/storage.objectUser).
Prepare to mount Cloud Storage FUSE buckets
This section covers how to prepare to mount Cloud Storage FUSE buckets on your clusters.
Specify Pod annotations
The CSI driver relies on Pod annotations to identify if your Pod uses Cloud Storage-backed volumes. If the driver detects the necessary annotations, it injects a sidecar container called gke-gcsfuse-sidecar into your workload Pod. The Cloud Storage FUSE instances run inside the sidecar container and mount the Cloud Storage buckets for your workload.
To enable the CSI driver to mount the Cloud Storage buckets, make sure you specify the annotation gke-gcsfuse/volumes: "true" in your Pod specification, under the metadata field. If you want your Cloud Storage-backed volumes to be consumed by other Kubernetes workload types (for instance, Job, Deployment, or StatefulSet), make sure you configure the annotations under the spec.template.metadata.annotations field.
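For example, a minimal Deployment sketch that places the annotation in the Pod template; the names, bucket, and ServiceAccount are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gcsfuse-example-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gcsfuse-example
  template:
    metadata:
      labels:
        app: gcsfuse-example
      annotations:
        # The annotation goes on the Pod template, not on the Deployment metadata.
        gke-gcsfuse/volumes: "true"
    spec:
      serviceAccountName: KSA_NAME
      containers:
      - name: main
        image: busybox
        command: ["sleep", "infinity"]
        volumeMounts:
        - name: gcs-fuse-csi-ephemeral
          mountPath: /data
      volumes:
      - name: gcs-fuse-csi-ephemeral
        csi:
          driver: gcsfuse.csi.storage.gke.io
          volumeAttributes:
            bucketName: BUCKET_NAME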
If you are using Istio or Cloud Service Mesh, add the following Pod-level annotations:
proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'
traffic.sidecar.istio.io/excludeOutboundIPRanges: 169.254.169.254/32
Configure resources for the sidecar container
By default, the sidecar container is configured with the following resource requests, with resource limits unset (for Standard clusters):
- 250m CPU
- 256 MiB memory
- 5 GiB ephemeral storage
To override these values, you can optionally specify the annotation gke-gcsfuse/[cpu-limit|memory-limit|ephemeral-storage-limit|cpu-request|memory-request|ephemeral-storage-request], as shown in the following example:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    gke-gcsfuse/volumes: "true"
    gke-gcsfuse/cpu-limit: "10"
    gke-gcsfuse/memory-limit: 10Gi
    gke-gcsfuse/ephemeral-storage-limit: 1Ti
    gke-gcsfuse/cpu-request: 500m
    gke-gcsfuse/memory-request: 1Gi
    gke-gcsfuse/ephemeral-storage-request: 50Gi
Use the following considerations when deciding the amount of resources to allocate:
- If you set only one of the resource request or limit annotations, GKE Autopilot enforces the same values for the resource request and resource limit.
- If your workload Pod consumes multiple Cloud Storage volumes, the sidecar container resources are shared by multiple Cloud Storage FUSE instances. If this applies to you, consider increasing the resource allocation for multiple Cloud Storage volumes.
- Allocate more CPU to the sidecar container if your workloads need higher throughput. Insufficient CPU will cause Cloud Storage FUSE throttling.
- If your workloads need to process a large number of files, and the Cloud Storage FUSE metadata caching is enabled, increase the sidecar container's memory allocation. Cloud Storage FUSE memory consumption for metadata caching is proportional to the number of files but not the file size. Insufficient memory will cause Cloud Storage FUSE out-of-memory errors and crash the workload application.
- For file caching, Cloud Storage FUSE by default caches the files in a local temporary directory. Estimate how much free space your workload needs for file caching, and increase your ephemeral storage limit accordingly. To learn more, see volume attributes.
- For write operations, Cloud Storage FUSE by default stages the files in a local temporary directory before the files are uploaded to the Cloud Storage bucket. Estimate how much free space your workload needs for staging when writing large files, and increase your ephemeral storage limit accordingly. To learn more, see Read/Writes semantics in the Cloud Storage FUSE GitHub documentation.
- You can use the value "0" to unset any resource limits or requests on Standard clusters. For example, the annotation gke-gcsfuse/memory-limit: "0" leaves the sidecar container memory limit empty while keeping the default memory request. This is useful when you cannot decide on the amount of resources Cloud Storage FUSE needs for your workloads and want to let Cloud Storage FUSE consume all the available resources on a node. After calculating the resource requirements for Cloud Storage FUSE based on your workload metrics, you can set appropriate limits. See the example after this list.
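For example, the following annotations (a sketch for a Standard cluster) unset all three sidecar limits while keeping the default requests:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    gke-gcsfuse/volumes: "true"
    # "0" removes the limit; the default request still applies.
    gke-gcsfuse/cpu-limit: "0"
    gke-gcsfuse/memory-limit: "0"
    gke-gcsfuse/ephemeral-storage-limit: "0"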
Configure a private image for the sidecar container
This section describes how to use the sidecar container image if you are hosting it in a private container registry. This scenario might apply if you need to use private nodes for security purposes. To configure and consume the private sidecar container image, follow these steps:
Refer to this page to look for a compatible public sidecar container image.
Pull it to your local environment and push it to your private container registry.
In the manifest, specify a container named gke-gcsfuse-sidecar with only the image field. GKE will use the specified sidecar container image to prepare for the sidecar container injection. Here is an example:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    gke-gcsfuse/volumes: "true"
spec:
  containers:
  - name: gke-gcsfuse-sidecar
    image: PRIVATE_REGISTRY/gcs-fuse-csi-driver-sidecar-mounter:PRIVATE_IMAGE_TAG
  - name: main # your main workload container.
Replace the following:
- PRIVATE_REGISTRY: your private container registry.
- PRIVATE_IMAGE_TAG: your private sidecar container image tag.
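To mirror the public sidecar image into your private registry, you might run commands like the following; the source image path and tag are placeholders that you take from the compatibility page mentioned earlier:

# Pull the compatible public sidecar image, retag it, and push it to your private registry.
docker pull SOURCE_IMAGE:SOURCE_TAG
docker tag SOURCE_IMAGE:SOURCE_TAG PRIVATE_REGISTRY/gcs-fuse-csi-driver-sidecar-mounter:PRIVATE_IMAGE_TAG
docker push PRIVATE_REGISTRY/gcs-fuse-csi-driver-sidecar-mounter:PRIVATE_IMAGE_TAG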
Configure a custom write buffer volume for the sidecar container
This section describes how to configure a custom buffer volume for Cloud Storage FUSE write buffering. This scenario might apply if you need to replace the default emptyDir volume for Cloud Storage FUSE to stage the files in write operations. You can specify any type of storage supported by GKE, such as a PersistentVolumeClaim, and GKE will use the specified volume for file write buffering. This is useful if you need to write files larger than 10 GiB on Autopilot clusters. To use the custom buffer volume, you must specify a non-zero fsGroup.
The following example shows how you can use a predefined PVC as the buffer volume:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    gke-gcsfuse/volumes: "true"
spec:
  securityContext:
    fsGroup: FS_GROUP
  containers:
    ...
  volumes:
  - name: gke-gcsfuse-buffer
    persistentVolumeClaim:
      claimName: BUFFER_VOLUME_PVC
Replace the following:
- FS_GROUP: the fsGroup ID.
- BUFFER_VOLUME_PVC: the predefined PVC name.
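A minimal sketch of a PVC you could predefine as the buffer volume; the access mode and size are assumptions to adjust for the largest files you stage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: BUFFER_VOLUME_PVC
  namespace: NAMESPACE
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi  # size this for the largest files you write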
Configure a custom read cache volume for the sidecar container
This section describes how to configure a custom cache volume for Cloud Storage FUSE read caching.
This scenario might apply if you need to replace the default emptyDir volume for Cloud Storage FUSE to cache the files in read operations. You can specify any type of storage supported by GKE, such as a PersistentVolumeClaim, and GKE will use the specified volume for file caching. This is useful if you need to cache files larger than 10 GiB on Autopilot clusters. To use the custom cache volume, you must specify a non-zero fsGroup.
The following example shows how you can use a predefined PVC as the cache volume:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    gke-gcsfuse/volumes: "true"
spec:
  securityContext:
    fsGroup: FS_GROUP
  containers:
    ...
  volumes:
  - name: gke-gcsfuse-cache
    persistentVolumeClaim:
      claimName: CACHE_VOLUME_PVC
Replace the following:
- FS_GROUP: the fsGroup ID.
- CACHE_VOLUME_PVC: the predefined PVC name.
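Similarly, a sketch of a PVC you could predefine as the read cache volume; this example assumes the GKE built-in premium-rwo StorageClass (SSD Persistent Disk), and the size is illustrative:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: CACHE_VOLUME_PVC
  namespace: NAMESPACE
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: premium-rwo  # SSD Persistent Disk
  resources:
    requests:
      storage: 100Gi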
Provision your volume as a CSI ephemeral volume
CSI ephemeral volumes backed by Cloud Storage buckets are tied to the Pod lifecycle. With this provisioning approach, you don't need to maintain the PersistentVolume and PersistentVolumeClaim objects associated with the Cloud Storage buckets after Pod termination.
Consume the CSI ephemeral storage volume in a Pod
Save the following YAML manifest:
apiVersion: v1
kind: Pod
metadata:
  name: gcs-fuse-csi-example-ephemeral
  namespace: NAMESPACE
  annotations:
    gke-gcsfuse/volumes: "true"
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - image: busybox
    name: busybox
    command: ["sleep"]
    args: ["infinity"]
    volumeMounts:
    - name: gcs-fuse-csi-ephemeral
      mountPath: /data
      readOnly: true
  serviceAccountName: KSA_NAME
  volumes:
  - name: gcs-fuse-csi-ephemeral
    csi:
      driver: gcsfuse.csi.storage.gke.io
      readOnly: true
      volumeAttributes:
        bucketName: BUCKET_NAME
        mountOptions: "implicit-dirs"
        gcsfuseLoggingSeverity: warning
The previous example shows how you can specify the Cloud Storage bucket inline in the Pod manifest. The example includes the following fields:
- metadata.annotations: the annotation gke-gcsfuse/volumes: "true" is required. See Configure resources for the sidecar container for optional annotations.
- spec.terminationGracePeriodSeconds: optional. By default, this is set to 30. If you need to write large files to the Cloud Storage bucket, increase this value to make sure that Cloud Storage FUSE has enough time to flush the data after your application exits. To learn more, see Kubernetes best practices: Terminating with grace.
- spec.serviceAccountName: use the same Kubernetes ServiceAccount as in the Configure access to Cloud Storage buckets using Workload Identity Federation for GKE step.
- spec.volumes[n].csi.driver: use gcsfuse.csi.storage.gke.io as the CSI driver name.
- spec.volumes[n].csi.volumeAttributes.bucketName: specify your Cloud Storage FUSE bucket name. You can specify an underscore (_) to mount all buckets that the Kubernetes ServiceAccount can access. To learn more, see Dynamic Mounting in the Cloud Storage FUSE documentation.
- spec.volumes[n].csi.volumeAttributes.mountOptions: optional. Pass mount options to Cloud Storage FUSE. Specify the flags in one string separated by commas, without spaces.
- spec.volumes[n].csi.volumeAttributes: optional. Pass other volume attributes to Cloud Storage FUSE.
- spec.volumes[n].csi.readOnly: optional. Specify true if all the volume mounts are read-only.
- spec.containers[n].volumeMounts[m].readOnly: optional. Specify true if only a specific volume mount is read-only.
Apply the manifest to the cluster:
kubectl apply -f FILE_PATH
Replace FILE_PATH with the path to the YAML file.
Consume the CSI ephemeral storage volume in a Job workload
Save the following YAML manifest:
apiVersion: batch/v1
kind: Job
metadata:
  name: gcs-fuse-csi-job-example
  namespace: NAMESPACE
spec:
  template:
    metadata:
      annotations:
        gke-gcsfuse/volumes: "true"
    spec:
      serviceAccountName: KSA_NAME
      containers:
      - name: writer
        image: busybox
        command:
          - "/bin/sh"
          - "-c"
          - touch /data/test && echo $(date) >> /data/test && sleep 10
        volumeMounts:
        - name: gcs-fuse-csi-ephemeral
          mountPath: /data
      - name: reader
        image: busybox
        command:
          - "/bin/sh"
          - "-c"
          - sleep 10 && cat /data/test
        volumeMounts:
        - name: gcs-fuse-csi-ephemeral
          mountPath: /data
          readOnly: true
      volumes:
      - name: gcs-fuse-csi-ephemeral
        csi:
          driver: gcsfuse.csi.storage.gke.io
          volumeAttributes:
            bucketName: BUCKET_NAME
      restartPolicy: Never
  backoffLimit: 1
Replace the following:
- NAMESPACE: the namespace of your workload.
- KSA_NAME: the Kubernetes ServiceAccount name as in the Configure access to Cloud Storage buckets using Workload Identity Federation for GKE step.
- BUCKET_NAME: your Cloud Storage bucket name.
The manifest deploys a Job that consumes a Cloud Storage FUSE bucket through a CSI ephemeral volume.
Apply the manifest to the cluster:
kubectl apply -f FILE_PATH
Replace FILE_PATH with the path to the YAML file.
If you are using the CSI driver in a Job workload, or if the Pod RestartPolicy is Never, the sidecar container will exit automatically after all the other workload containers exit.
For additional examples, see Example Applications in the GitHub project documentation.
Provision your volume using static provisioning
With static provisioning, you create one or more PersistentVolume (PV) objects containing the details of the underlying storage system. Pods in your clusters can then consume the storage through PersistentVolumeClaims (PVCs).
Create a PersistentVolume
Save the following YAML manifest:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gcs-fuse-csi-pv
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Gi
  storageClassName: example-storage-class
  mountOptions:
    - implicit-dirs
  csi:
    driver: gcsfuse.csi.storage.gke.io
    volumeHandle: BUCKET_NAME
    volumeAttributes:
      gcsfuseLoggingSeverity: warning
  claimRef:
    name: gcs-fuse-csi-static-pvc
    namespace: NAMESPACE
The example manifest shows how you can define a PersistentVolume for Cloud Storage buckets. The example includes the following fields:
- spec.csi.driver: use gcsfuse.csi.storage.gke.io as the CSI driver name.
- spec.csi.volumeHandle: specify your Cloud Storage bucket name. You can pass an underscore (_) to mount all the buckets that the Kubernetes ServiceAccount is configured to have access to. To learn more, see Dynamic Mounting in the Cloud Storage FUSE documentation.
- spec.mountOptions: optional. Pass mount options to Cloud Storage FUSE.
- spec.csi.volumeAttributes: optional. Pass volume attributes to Cloud Storage FUSE.
Apply the manifest to the cluster:
kubectl apply -f FILE_PATH
Replace FILE_PATH with the path to the YAML file.
Create a PersistentVolumeClaim
Save the following YAML manifest:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gcs-fuse-csi-static-pvc
  namespace: NAMESPACE
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: example-storage-class
The example manifest shows how you can define a PersistentVolumeClaim to bind the PersistentVolume. The example includes the following fields:
- metadata.namespace: specify the PersistentVolumeClaim namespace, which must be consistent with the namespace of your workload.
To bind a PersistentVolume to a PersistentVolumeClaim, make sure to follow these guidelines:
- The spec.storageClassName fields on the PV and PVC manifests should match. The storageClassName does not need to refer to an existing StorageClass object. To bind the claim to a volume, you can use any name you want, but it cannot be empty.
- The spec.accessModes fields on the PV and PVC manifests should match.
- spec.capacity.storage on the PersistentVolume manifest should match spec.resources.requests.storage on the PersistentVolumeClaim manifest. Since Cloud Storage buckets don't have size limits, you can put any number for capacity, but it cannot be empty.
Apply the manifest to the cluster:
kubectl apply -f FILE_PATH
Replace FILE_PATH with the path to the YAML file.
Consume the volume from a PersistentVolumeClaim
Save the following YAML manifest:
apiVersion: v1
kind: Pod
metadata:
  name: gcs-fuse-csi-example-static-pvc
  namespace: NAMESPACE
  annotations:
    gke-gcsfuse/volumes: "true"
spec:
  containers:
  - image: busybox
    name: busybox
    command: ["sleep"]
    args: ["infinity"]
    volumeMounts:
    - name: gcs-fuse-csi-static
      mountPath: /data
      readOnly: true
  serviceAccountName: KSA_NAME
  volumes:
  - name: gcs-fuse-csi-static
    persistentVolumeClaim:
      claimName: gcs-fuse-csi-static-pvc
      readOnly: true
The example shows how you can define a Pod that consumes a Cloud Storage FUSE bucket through a PersistentVolumeClaim. The example includes the following fields:
- metadata.annotations: the annotation gke-gcsfuse/volumes: "true" is required. See Configure resources for the sidecar container for optional annotations.
- spec.serviceAccountName: use the same Kubernetes ServiceAccount as in the Configure access to Cloud Storage buckets using Workload Identity Federation for GKE step.
- spec.containers[n].volumeMounts[m].readOnly: optional. Specify true if only a specific volume mount is read-only.
- spec.volumes[n].persistentVolumeClaim.readOnly: optional. Specify true if all the volume mounts are read-only.
Apply the manifest to the cluster:
kubectl apply -f FILE_PATH
Replace FILE_PATH with the path to the YAML file.
For additional examples, see Example Applications in the GitHub project documentation.
Consume your volumes with file caching enabled
By default, the file caching feature is disabled on GKE. To enable and control file caching, use the volume attribute fileCacheCapacity.
GKE uses an emptyDir volume for Cloud Storage FUSE file caching, backed by the node VM boot disk. If you enable Local SSD on the node, GKE uses the Local SSD to back the emptyDir volume.
You can configure a custom read cache volume for the sidecar container to replace the default emptyDir volume for file caching in read operations. For CPU and GPU VM families with Local SSD support, we recommend using Local SSD storage. For TPU families or Autopilot, we recommend using Balanced Persistent Disk or SSD Persistent Disk.
Consume a CSI ephemeral storage volume with file caching enabled
To deploy a Pod that consumes a Cloud Storage FUSE bucket through a CSI ephemeral volume with file caching, follow these steps:
Create a cluster or node pool with Local SSD-backed ephemeral storage by following the GKE documentation.
Save the following YAML manifest:
apiVersion: v1
kind: Pod
metadata:
  name: gcs-fuse-csi-file-cache-example
  namespace: NAMESPACE
  annotations:
    gke-gcsfuse/volumes: "true"
    gke-gcsfuse/ephemeral-storage-limit: "50Gi"
spec:
  nodeSelector:
    cloud.google.com/gke-ephemeral-storage-local-ssd: "true"
  restartPolicy: Never
  initContainers:
  - name: data-loader
    image: gcr.io/google.com/cloudsdktool/google-cloud-cli:slim
    resources:
      limits:
        cpu: 500m
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 1Gi
    command:
      - "/bin/sh"
      - "-c"
      - |
        mkdir -p /test_files
        for i in $(seq 1 1000); do dd if=/dev/zero of=/test_files/file_$i.txt bs=1024 count=64; done
        gcloud storage cp /test_files gs://BUCKET_NAME --recursive
  containers:
  - name: data-validator
    image: busybox
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 500m
        memory: 512Mi
    command:
      - "/bin/sh"
      - "-c"
      - |
        echo "first read with cache miss"
        time cat /data/test_files/file_* > /dev/null
        echo "second read from local cache"
        time cat /data/test_files/file_* > /dev/null
    volumeMounts:
    - name: gcs-fuse-csi-ephemeral
      mountPath: /data
  serviceAccountName: KSA_NAME
  volumes:
  - name: gcs-fuse-csi-ephemeral
    csi:
      driver: gcsfuse.csi.storage.gke.io
      volumeAttributes:
        bucketName: BUCKET_NAME
        mountOptions: "implicit-dirs"
        fileCacheCapacity: "10Gi"
Replace the following:
- NAMESPACE: the namespace of your workload.
- KSA_NAME: the Kubernetes ServiceAccount name you specified in the Configure access to Cloud Storage buckets using Workload Identity Federation for GKE step.
- BUCKET_NAME: your Cloud Storage bucket name.
The init container data-loader generates 1,000 files of 64 KiB each and uploads the files to a Cloud Storage bucket. The main container data-validator reads all the files from the bucket twice and logs the duration.
Apply the manifest to the cluster:
kubectl apply -f FILE_PATH
Replace FILE_PATH with the path to the YAML file.
To view the log output, run the following command:
kubectl logs -n NAMESPACE gcs-fuse-csi-file-cache-example -c data-validator
Replace NAMESPACE with the namespace of your workload.
The output is similar to the following:
first read with cache miss
real    0m 54.68s
...
second read from local cache
real    0m 0.38s
...
The output shows that the second read with local cache is much faster than the first read with cache miss.
Improve large file read performance using Cloud Storage FUSE parallel download
You can use Cloud Storage FUSE parallel download to accelerate reading large files from Cloud Storage by downloading them with multiple threads. Cloud Storage FUSE parallel download can be particularly beneficial for model serving use cases with reads over 1 GB in size.
Common examples include:
- Model serving, where you need a large prefetch buffer to accelerate model download during instance boot.
- Checkpoint restores, where you need a read-only data cache to improve one-time access of multiple large files.
Use parallel download for applications that perform single-threaded large file reads. Applications with high read-parallelism (using more than eight threads) may encounter lower performance with this feature.
To use parallel download with the Cloud Storage FUSE CSI driver, follow these steps:
Enable file cache. Create a cluster with file caching enabled, as described in Consume a CSI ephemeral storage volume with file caching enabled.
Enable parallel download. In your manifest, configure these additional settings using mount options:
- Set file-cache:enable-parallel-downloads:true.
- Adjust file-cache:parallel-downloads-per-file, file-cache:max-parallel-downloads, and file-cache:download-chunk-size-mb as needed.
(Optional) Tune volume attributes. If needed, consider tuning these volume attributes:
- fileCacheForRangeRead for random or partial reads.
- metadataTypeCacheCapacity and metadataStatCacheCapacity for training workloads.
The following examples show how you can enable parallel download, depending on whether you are using ephemeral storage volumes or static provisioning:
Ephemeral storage
apiVersion: v1
kind: Pod
metadata:
  name: gcs-fuse-csi-example-ephemeral
  namespace: NAMESPACE
  annotations:
    gke-gcsfuse/volumes: "true"
spec:
  containers:
  ...
  volumes:
  - name: gcs-fuse-csi-ephemeral
    csi:
      driver: gcsfuse.csi.storage.gke.io
      volumeAttributes:
        bucketName: BUCKET_NAME
        mountOptions: "implicit-dirs,file-cache:enable-parallel-downloads:true,file-cache:parallel-downloads-per-file:4,file-cache:max-parallel-downloads:-1,file-cache:download-chunk-size-mb:3"
        fileCacheCapacity: "-1"
Static provisioning
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gcs-fuse-csi-pv
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Gi
  storageClassName: example-storage-class
  mountOptions:
    - implicit-dirs
    - file-cache:enable-parallel-downloads:true
    - file-cache:parallel-downloads-per-file:4
    - file-cache:max-parallel-downloads:-1
    - file-cache:download-chunk-size-mb:3
  csi:
    driver: gcsfuse.csi.storage.gke.io
    volumeHandle: BUCKET_NAME
    volumeAttributes:
      fileCacheCapacity: "-1"
  claimRef:
    name: gcs-fuse-csi-static-pvc
    namespace: NAMESPACE
Configure how Cloud Storage FUSE buckets are mounted
This section describes how you can configure the Cloud Storage FUSE volumes.
Mount options
The Cloud Storage FUSE CSI driver supports mount options to configure how Cloud Storage buckets are mounted on your local file system. For the full list of supported mount options, see the gcsfuse CLI documentation.
You can specify the mount flags in the following ways:
- In the spec.mountOptions field on a PersistentVolume manifest, if you use static provisioning.
- In the spec.volumes[n].csi.volumeAttributes.mountOptions field, if you use CSI ephemeral volumes.
Volume attributes
The Cloud Storage FUSE CSI driver does not let you directly specify the Cloud Storage FUSE configuration file. You can configure some of the fields in the configuration file by using the following volume attributes. The values are translated to the configuration file fields.
gcsfuseLoggingSeverity
Description: The severity of logs you want Cloud Storage FUSE to generate, expressed as an enum. This volume attribute is translated to the configuration file field logging:severity.
Valid values (ordered from lowest severity to highest severity): trace, debug, info, warning, error.
Default value: info.
fileCacheCapacity
Description: The maximum size that the file cache can use. If a non-zero value is present, this volume attribute enables file caching in Cloud Storage FUSE. This volume attribute is translated to the configuration file field file-cache:max-size-mb.
Valid values:
- Quantity values, for example: 500Mi, 10Gi.
- "-1": use the cache volume's entire available capacity.
- "0": the file cache is disabled.
Default value: "0".
fileCacheForRangeRead
Description: Whether the full object should be downloaded asynchronously and stored in the Cloud Storage FUSE cache directory when the first read is done from a non-zero offset. This should be set to "true" if you plan on performing several random reads or partial reads. This volume attribute is translated to the configuration file field file-cache:cache-file-for-range-read.
Valid values:
- Boolean values in string format: "true", "false".
Default value: "false".
metadataStatCacheCapacity
Description: The maximum size that the stat cache can use. The stat cache is always entirely kept in memory. If you are already using the stat-cache-capacity mount option, the value will still be honored and appropriately translated to this new configuration. This volume attribute is translated to the configuration file field metadata-cache:stat-cache-max-size-mb.
Valid values:
- Quantity values, for example: 500Mi, 1Gi.
- "-1": let the stat cache use as much memory as needed.
- "0": the stat cache is disabled.
- Use the default value of 32Mi if your workload involves up to 20,000 files. If your workload is larger than 20,000 files, increase the size by 10 MiB for every additional 6,000 files, an average of ~1,500 bytes per file.
Default value: 32Mi.
metadataTypeCacheCapacity
Description: The maximum size per directory that the type cache can use. The type cache is always entirely kept in memory. This volume attribute is translated to the configuration file field metadata-cache:type-cache-max-size-mb.
Valid values:
- Quantity values, for example: 500Mi, 1Gi.
- "-1": let the type cache use as much memory as needed.
- "0": the type cache is disabled.
- Use the default value of 4Mi if the maximum number of files within a single directory of the bucket you're mounting is 20,000 or fewer. If a single directory contains more than 20,000 files, increase the size by 1 MiB for every 5,000 files, an average of ~200 bytes per file.
Default value: 4Mi.
metadataCacheTTLSeconds
Description: The time to live (TTL), in seconds, of cached metadata entries. If you are already using the stat-cache-ttl or type-cache-ttl mount options, the values will still be honored and appropriately translated to this new configuration. This volume attribute is translated to the configuration file field metadata-cache:ttl-secs.
Valid values:
- Integer values in string format, for example: "600".
- "-1": bypass a TTL expiration and serve the file from the cache whenever it's available.
- "0": ensure that the most up to date file is read. Using a value of 0 issues a Get metadata call to make sure that the object generation for the file in the cache matches what's stored in Cloud Storage.
Default value: "60".
You can specify the volume attributes in the following ways:
- In the spec.csi.volumeAttributes field on a PersistentVolume manifest, if you use static provisioning.
- In the spec.volumes[n].csi.volumeAttributes field, if you use CSI ephemeral volumes.
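For example, a CSI ephemeral volume that combines several of these attributes might look like the following sketch; the values are illustrative and should be tuned to your workload:

  # Excerpt of a Pod spec.volumes section
  volumes:
  - name: gcs-fuse-csi-ephemeral
    csi:
      driver: gcsfuse.csi.storage.gke.io
      volumeAttributes:
        bucketName: BUCKET_NAME
        gcsfuseLoggingSeverity: warning
        fileCacheCapacity: "10Gi"
        fileCacheForRangeRead: "true"
        metadataStatCacheCapacity: "64Mi"
        metadataTypeCacheCapacity: "8Mi"
        metadataCacheTTLSeconds: "600"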
Considerations
Use the following considerations when configuring mounts:
- The following flags are disallowed: app-name, temp-dir, foreground, log-file, log-format, key-file, token-url, and reuse-token-from-url.
- Cloud Storage FUSE does not make implicit directories visible by default. To make these directories visible, you can turn on the implicit-dirs mount flag. To learn more, see Files and Directories in the Cloud Storage FUSE GitHub documentation.
- If you use a Security Context for your Pod or container, or if your container image uses a non-root user or group, you must set the uid and gid mount flags. You also need to use the file-mode and dir-mode mount flags to set the file system permissions. Note that you cannot run chmod, chown, or chgrp commands against a Cloud Storage FUSE file system, so the uid, gid, file-mode, and dir-mode mount flags are necessary to provide access to a non-root user or group. See the example after this list.
- If you only want to mount a directory in the bucket instead of the entire bucket, pass the directory's relative path by using the only-dir=relative/path/to/the/bucket/root flag.
- To tune Cloud Storage FUSE caching behavior, configure volume attributes. Refer to the Cloud Storage FUSE Caching documentation for details.
- If you need to specify a maximum number of TCP connections allowed per server, you can specify this maximum by using the max-conns-per-host flag. The maximum number of TCP connections you define becomes effective when --client-protocol is set to http1. The default value is 0, which indicates no limit on TCP connections (limited by the machine specifications).
- If you need to configure the Linux kernel mount options, you can pass the options by using the o flag. For example, if you don't want to permit direct execution of any binaries on the mounted file system, set the o=noexec flag. Each option requires a separate flag, for example, o=noexec,o=noatime. Only the following options are allowed: exec, noexec, atime, noatime, sync, async, and dirsync.
- If you need to troubleshoot Cloud Storage FUSE issues, set the log-severity flag to TRACE. The gcsfuseLoggingSeverity volume attribute is then automatically set to trace.
- The Cloud Storage FUSE CSI driver does not allow you to modify the cache-dir field in the Cloud Storage FUSE configuration file. Use the fileCacheCapacity volume attribute to enable or disable file caching. To replace the default emptyDir volume for file caching, you can configure a custom cache volume for the sidecar container.
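For example (see the Security Context consideration above), a CSI ephemeral volume for a container running as user and group 1001 might pass mount flags like the following; the IDs and modes are illustrative:

  # Excerpt of a Pod spec.volumes section
  volumes:
  - name: gcs-fuse-csi-ephemeral
    csi:
      driver: gcsfuse.csi.storage.gke.io
      volumeAttributes:
        bucketName: BUCKET_NAME
        mountOptions: "implicit-dirs,uid=1001,gid=1001,file-mode=664,dir-mode=775"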
Cloud Storage FUSE metrics
The following Cloud Storage FUSE metrics are available through the GKE Monitoring API. Details about Cloud Storage FUSE metrics, such as labels, type, and unit, can be found in GKE System Metrics. These metrics are available for each Pod that uses Cloud Storage FUSE and let you configure insights per volume and bucket.
File system metrics
File system metrics track the performance and health of your file system, including the number of operations, errors, and operation speed. These metrics can help identify bottlenecks and optimize performance.
gcsfusecsi/fs_ops_count
gcsfusecsi/fs_ops_error_count
gcsfusecsi/fs_ops_latency
Cloud Storage metrics
You can monitor Cloud Storage metrics, including data volume, speed, and request activity, to understand how your applications interact with Cloud Storage buckets. This data can help you identify areas for optimization, such as improving read patterns or reducing the number of requests.
gcsfusecsi/gcs_download_bytes_count
gcsfusecsi/gcs_read_count
gcsfusecsi/gcs_read_bytes_count
gcsfusecsi/gcs_reader_count
gcsfusecsi/gcs_request_count
gcsfusecsi/gcs_request_latencies
File cache metrics
You can monitor file cache metrics, including data read volume, speed, and cache hit rate, to optimize Cloud Storage FUSE and application performance. Analyze these metrics to improve your caching strategy and maximize cache hits.
gcsfusecsi/file_cache_read_bytes_count
gcsfusecsi/file_cache_read_latencies
gcsfusecsi/file_cache_read_count
Disable the Cloud Storage FUSE CSI driver
You cannot disable the Cloud Storage FUSE CSI driver on Autopilot clusters.
You can disable the Cloud Storage FUSE CSI driver on an existing Standard cluster by using the Google Cloud CLI.
gcloud container clusters update CLUSTER_NAME \
--update-addons GcsFuseCsiDriver=DISABLED
Replace CLUSTER_NAME with the name of your cluster.
Troubleshooting
To troubleshoot issues when using the Cloud Storage FUSE CSI driver, see Troubleshooting Guide in the GitHub project documentation.