About the Google Kubernetes Engine Parallelstore CSI driver


Parallelstore is available by invitation only. If you'd like to request access to Parallelstore in your Google Cloud project, contact your sales representative.

Parallelstore is a fully managed, low-latency distributed file system designed to meet the demands of AI/ML training and high performance computing (HPC) workloads that need extremely low latency (sub-millisecond), full POSIX semantics, and high metadata operation throughput. Parallelstore scales to 1 TB/s read speeds and millions of IOPS.

To connect a Google Kubernetes Engine (GKE) cluster to a Parallelstore instance, use the Parallelstore Container Storage Interface (CSI) driver. The Parallelstore CSI driver lets you use the GKE API to consume Parallelstore instances as volumes for your stateful workloads (for example, Pods and Jobs). It's optimized for AI/ML training workloads, particularly those involving smaller file sizes and random reads.

GKE enables the CSI driver for you by default when you create a new GKE Autopilot cluster. On new and existing GKE Standard clusters, you'll need to enable the CSI driver.

Benefits

The Parallelstore CSI driver gives your workloads fast, consistent access to shared data, which can accelerate high performance computing and AI/ML training workloads. It provides the following benefits:

  • You can use fully managed parallel file systems as storage through the Kubernetes APIs.
  • The Google Kubernetes Engine Parallelstore CSI driver supports the ReadWriteMany, ReadOnlyMany, and ReadWriteOnce access modes.
  • You can use the Google Kubernetes Engine Parallelstore CSI driver to dynamically provision your PersistentVolumes.
  • You can access existing Parallelstore instances in Kubernetes workloads. You can also dynamically create Parallelstore instances and use them in Kubernetes workloads with a StatefulSet or a Deployment.
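
As a rough illustration of the points above, the following Python sketch builds a PersistentVolumeClaim manifest that requests a Parallelstore-backed volume with one of the supported access modes. The claim name `parallelstore-pvc` and the StorageClass name `parallelstore-class` are hypothetical placeholders, not names defined by GKE, and the sketch assumes that a StorageClass for the Parallelstore CSI driver already exists in your cluster.

```python
import json

# Access modes the Parallelstore CSI driver supports, per the list above.
SUPPORTED_ACCESS_MODES = {"ReadWriteMany", "ReadOnlyMany", "ReadWriteOnce"}


def build_pvc(name, storage_class, capacity_gib, access_mode="ReadWriteMany"):
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    if access_mode not in SUPPORTED_ACCESS_MODES:
        raise ValueError(f"unsupported access mode: {access_mode}")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            # Hypothetical StorageClass name; replace it with your own.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{capacity_gib}Gi"}},
        },
    }


if __name__ == "__main__":
    # Both names below are placeholders for illustration only.
    pvc = build_pvc("parallelstore-pvc", "parallelstore-class", 12000)
    print(json.dumps(pvc, indent=2))
```

Because kubectl accepts JSON manifests as well as YAML, you could save the printed output to a file and apply it with `kubectl apply -f`.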

Limitations

  • Data persistence: Parallelstore is a "scratch plus" file system. It's backed by Local SSD with 2+1 erasure coding, and the mean time to data loss is two months. Parallelstore is not long-term storage and should instead be considered an extremely fast file system for specific workloads.
  • Per-Pod limitation: GKE supports mounting only one Parallelstore instance per Pod.
  • Data transfers: Transferring data from Cloud Storage to Parallelstore is not supported by the GKE API. To perform the transfer, use the Parallelstore API.
  • Usable capacity: You can configure storage capacity from 12,000 GiB to 100,000 GiB.
  • Supported zones: Parallelstore is available only in supported zones. If your cluster and your Parallelstore instance are in different regions, I/O performance declines noticeably.
  • VPC-SC limitations for Parallelstore: If you use both Shared VPC and VPC Service Controls, the host project that provides the network and the service project that contains the Parallelstore instance must be inside the same perimeter for the Parallelstore instance to function correctly. Separating the host project and the service project with a perimeter might cause existing instances to become unavailable and might prevent new instances from being created.
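
Some of the limitations above can be checked before you provision anything. The following Python sketch is a hypothetical pre-flight helper, not part of any Google tool: it checks a planned capacity against the 12,000 GiB to 100,000 GiB range, flags a Pod that would mount more than one Parallelstore instance, and warns when the cluster region differs from the region of the instance zone. It assumes the usual Google Cloud naming in which a zone name is the region name plus a suffix (for example, `us-central1-a` is in `us-central1`).

```python
MIN_CAPACITY_GIB = 12_000
MAX_CAPACITY_GIB = 100_000
MAX_INSTANCES_PER_POD = 1


def region_of(zone):
    """Derive the region from a zone name such as 'us-central1-a'."""
    return zone.rsplit("-", 1)[0]


def check_plan(capacity_gib, cluster_region, instance_zone, instances_per_pod):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not MIN_CAPACITY_GIB <= capacity_gib <= MAX_CAPACITY_GIB:
        problems.append(
            f"capacity {capacity_gib} GiB is outside "
            f"{MIN_CAPACITY_GIB}-{MAX_CAPACITY_GIB} GiB"
        )
    if instances_per_pod > MAX_INSTANCES_PER_POD:
        problems.append("a Pod can mount only one Parallelstore instance")
    if region_of(instance_zone) != cluster_region:
        problems.append(
            "cluster and instance are in different regions; "
            "expect noticeably lower I/O performance"
        )
    return problems


if __name__ == "__main__":
    # Example values are illustrative only.
    for problem in check_plan(5_000, "us-east4", "us-central1-a", 2):
        print("WARNING:", problem)
```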

Requirements

To use the Parallelstore CSI driver, make sure that you meet the following requirements:

  • Install the latest version of the Google Cloud CLI. The minimum supported gcloud CLI version for this feature is 469.0.0.
  • Use Google Kubernetes Engine cluster version 1.29 or later.
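
The two version requirements above can be compared programmatically. The following Python sketch is a minimal, hypothetical helper that parses the leading numeric part of a version string and compares it against the minimums stated in this section. The sample GKE version string `1.29.3-gke.1093000` is an illustrative assumption about the format, not a recommendation of a specific release.

```python
import re

MIN_GCLOUD_VERSION = (469, 0, 0)
MIN_GKE_VERSION = (1, 29)


def parse_version(text):
    """Parse the leading dotted numeric part, e.g. '1.29.3-gke.1' -> (1, 29, 3)."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version, minimum):
    """Return True if the version string is at or above the minimum tuple."""
    parsed = parse_version(version)
    # Pad the shorter tuple with zeros so that (1, 29) compares like (1, 29, 0).
    width = max(len(parsed), len(minimum))
    parsed += (0,) * (width - len(parsed))
    minimum += (0,) * (width - len(minimum))
    return parsed >= minimum


if __name__ == "__main__":
    # Sample inputs are illustrative; substitute the versions you actually run.
    print(meets_minimum("469.0.0", MIN_GCLOUD_VERSION))
    print(meets_minimum("1.29.3-gke.1093000", MIN_GKE_VERSION))
```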
