Parallelstore overview

Parallelstore is available by invitation only. If you'd like to request access to Parallelstore in your Google Cloud project, contact your sales representative.

Parallelstore is a fully managed, low-latency distributed file system designed to meet the demands of high performance computing (HPC) and data-intensive applications.

Parallelstore is ideal for use cases where multiple clients need concurrent access to shared files with data integrity.

Parallelstore supports the POSIX standard, ensuring compatibility with a wide range of existing applications and tools and simplifying migration and integration.
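
Because the file system is POSIX-compliant, applications can use standard file APIs unchanged. The following minimal Python sketch assumes a Parallelstore instance is already mounted at the hypothetical path /mnt/parallelstore; it relies only on the standard library.

    import os

    # Hypothetical mount point; replace with the path where your
    # Parallelstore instance is actually mounted.
    MOUNT_POINT = "/mnt/parallelstore"

    def write_and_read_example():
        """Ordinary POSIX file I/O works unchanged on a mounted instance."""
        path = os.path.join(MOUNT_POINT, "example.txt")

        # Standard buffered writes and reads.
        with open(path, "w") as f:
            f.write("hello from a POSIX-compliant file system\n")

        with open(path) as f:
            print(f.read())

        # Regular metadata operations also behave as expected.
        print(os.stat(path).st_size)

    if __name__ == "__main__":
        write_and_read_example()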

Parallelstore instances can be mounted to Compute Engine VMs or Google Kubernetes Engine clusters. The Parallelstore CSI driver enables customers to use Kubernetes APIs to access the file system as volumes for their stateful workloads.

Batch data transfers between Parallelstore and Cloud Storage are available from the command line and through the REST API.
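
As an illustration only, the sketch below shows how a batch import from Cloud Storage might be triggered programmatically. The method name (importData) and request fields (sourceGcsBucket, destinationParallelstore) are assumptions based on the v1beta REST API and should be verified against the current API reference; the project, location, and instance names are hypothetical. The sketch assumes Application Default Credentials and the google-auth library.

    import google.auth
    from google.auth.transport.requests import AuthorizedSession

    # All names below are illustrative; check the Parallelstore REST API
    # reference for the current method and field names.
    PROJECT = "my-project"          # hypothetical project ID
    LOCATION = "us-central1-a"      # hypothetical instance location
    INSTANCE = "my-instance"        # hypothetical instance name

    def import_from_gcs(bucket_uri: str, destination_path: str = "/"):
        """Start a batch import from Cloud Storage into a Parallelstore instance."""
        credentials, _ = google.auth.default()
        session = AuthorizedSession(credentials)

        name = f"projects/{PROJECT}/locations/{LOCATION}/instances/{INSTANCE}"
        url = f"https://parallelstore.googleapis.com/v1beta/{name}:importData"

        body = {
            "sourceGcsBucket": {"uri": bucket_uri},
            "destinationParallelstore": {"path": destination_path},
        }
        response = session.post(url, json=body)
        response.raise_for_status()
        # The call returns a long-running operation that can be polled.
        return response.json()

    if __name__ == "__main__":
        print(import_from_gcs("gs://my-bucket/dataset"))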

Specifications

  • Parallelstore is a "scratch" file system: it's backed by local SSD with 2+1 erasure coding, and the mean time to data loss is two months. Parallelstore is not long-term storage; rather, it's an extremely fast file system for specific workloads.

  • Usable capacity can be configured from 12 TiB to 100 TiB.

  • Supported in multiple regions.

Performance

Expected performance from Parallelstore is shown in the following table.

Metric                                             Result
Write throughput                                   0.5 GiBps per TiB
Read throughput                                    1.15 GiBps per TiB
Read IOPS                                          30,000 IOPS per TiB
Write IOPS                                         10,000 IOPS per TiB
4K read latency                                    0.3 ms
Number of client processes supported               4,000
Transfer speed (Parallelstore <-> Cloud Storage)   Large files (> 32 MB): 20 GBps
                                                   Small files (<= 32 MB): 5,000 files per second

These numbers are measured using 256 client connections to a single instance. Latency is measured from a single client. Directory and file striping settings are optimized for each metric.
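
As a rough illustration of how the per-TiB figures scale with instance size, the short Python sketch below multiplies them by a chosen capacity. The per-TiB values come from the table above; the aggregate numbers are expectations, not guarantees.

    # Expected per-TiB performance from the table above.
    READ_GIBPS_PER_TIB = 1.15
    WRITE_GIBPS_PER_TIB = 0.5
    READ_IOPS_PER_TIB = 30_000
    WRITE_IOPS_PER_TIB = 10_000

    def expected_performance(capacity_tib: float) -> dict:
        """Scale the per-TiB figures by instance capacity (12-100 TiB)."""
        return {
            "read_throughput_gibps": capacity_tib * READ_GIBPS_PER_TIB,
            "write_throughput_gibps": capacity_tib * WRITE_GIBPS_PER_TIB,
            "read_iops": capacity_tib * READ_IOPS_PER_TIB,
            "write_iops": capacity_tib * WRITE_IOPS_PER_TIB,
        }

    # For example, a 100 TiB instance works out to roughly 115 GiBps read,
    # 50 GiBps write, 3 million read IOPS, and 1 million write IOPS.
    print(expected_performance(100))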

Use cases

  • High-performance computing: Parallelstore excels in HPC environments where multiple compute nodes need fast and consistent access to shared data for simulations, modeling, and analysis.

  • Machine learning: Parallelstore can handle the large datasets and high throughput requirements of machine learning workloads, enabling efficient training and inference.

Pricing

See the Pricing page for details.