Connect to an existing Parallelstore instance from Google Kubernetes Engine

Parallelstore is available by invitation only. If you'd like to request access to Parallelstore in your Google Cloud project, contact your sales representative.

This guide describes how to connect to an existing Parallelstore instance by using the GKE Parallelstore CSI driver with static provisioning. This lets you access existing fully managed Parallelstore instances as volumes for your stateful workloads, in a controlled and predictable way.

Before you begin

Before you start, make sure that you have performed the following tasks:

Enable the Parallelstore API and the Google Kubernetes Engine API.

If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the gcloud components update command. Earlier gcloud CLI versions might not support running the commands in this document.
Note: For existing gcloud CLI installations, make sure to set the compute/region property. If you use primarily zonal clusters, set the compute/zone instead. By setting a default location, you can avoid errors in the gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location. You might need to specify the location in certain commands if the location of your cluster differs from the default that you set.

See the CSI driver overview for limitations and requirements.
Create a Parallelstore instance if you haven't done so already.
Configure a VPC network.
If you want to use a GKE Standard cluster, make sure to enable the CSI driver.

Access an existing Parallelstore instance using the Parallelstore CSI driver

If you have already provisioned a Parallelstore instance within the same network as your GKE cluster, you can follow these instructions to statically provision a PersistentVolume that refers to your instance.

The following sections describe the typical process for accessing an existing Parallelstore instance using the Parallelstore CSI driver:

Create a PersistentVolume that refers to the Parallelstore instance.
Use a PersistentVolumeClaim to access the volume.
(Optional) Configure resources for the sidecar container.
Create a workload that consumes the volume.

Create a PersistentVolume

This section shows an example of how you can create a PersistentVolume that references an existing Parallelstore instance.

Run the following command to locate your Parallelstore instance.

gcloud beta parallelstore instances list \
    --project=PROJECT_ID \
    --location=LOCATION

Replace the following:

PROJECT_ID: the Google Cloud project ID.
LOCATION: the Compute Engine zone containing the cluster. You must specify a supported zone for the Parallelstore CSI driver.

The output should look similar to the following. Note the Parallelstore instance name and the IP access points before you proceed to the next step.

NAME                                                                                            CAPACITY  DESCRIPTION  CREATE_TIME                     UPDATE_TIME                     STATE   NETWORK  RESERVED_IP_RANGE  ACCESS_POINTS
projects/my-project/locations/us-central1-a/instances/pvc-eff1ed02-a8ed-48d2-9902-bd70a2d60563  12000                  2024-03-06T19:18:26.036463730Z  2024-03-06T19:24:44.561441556Z  ACTIVE                              10.51.110.2,10.51.110.4,10.51.110.3

Save the following manifest in a file named parallelstore-pv.yaml:
Pod mount
```
apiVersion: v1
kind: PersistentVolume
metadata:
  name: parallelstore-pv
spec:
  storageClassName: "STORAGECLASS_NAME"
  capacity:
    storage: STORAGE_SIZE
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
  csi:
    driver: parallelstore.csi.storage.gke.io
    volumeHandle: "PROJECT_ID/LOCATION/INSTANCE_NAME/default-pool/default-container"
    volumeAttributes:
      accessPoints: ACCESS_POINTS
      network: NETWORK_NAME
  claimRef:
    name: parallelstore-pvc
    namespace: default
```
Replace the following:
- PROJECT_ID: the Google Cloud project ID.
- LOCATION: the zonal location of your Parallelstore instance. You must specify a supported zone for the Parallelstore CSI driver.
- INSTANCE_NAME: the name of your Parallelstore instance. An example of a valid volumeHandle value is "my-project/us-central1-a/pvc-eff1ed02-a8ed-48d2-9902-bd70a2d60563/default-pool/default-container".
- ACCESS_POINTS: the access points of your Parallelstore instance; for example, 10.51.110.2,10.51.110.4,10.51.110.3.
- NETWORK_NAME: the VPC network where your Parallelstore instance can be accessed.
- STORAGECLASS_NAME: the name of your StorageClass. The value can be an empty string, but it must match the specification in your PersistentVolumeClaim.
- STORAGE_SIZE: the storage size; for example, 12000Gi.
For the full list of fields that are supported in the PersistentVolume object, refer to the Parallelstore CSI reference documentation.
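The NAME column in the gcloud output and the volumeHandle field carry the same three identifiers in different layouts. The following is a minimal shell sketch of that mapping, assuming the resource name has the shape shown in the sample output. The helper name volume_handle_from_resource_name is hypothetical and is not part of the gcloud CLI or the CSI driver.

```shell
# Hypothetical helper: convert a full Parallelstore resource name
# (projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_NAME)
# into the volumeHandle format used in the PersistentVolume manifest.
volume_handle_from_resource_name() {
  local resource_name="$1"
  local rest project location instance

  # Reject anything that does not follow the expected layout.
  case "$resource_name" in
    projects/*/locations/*/instances/*) ;;
    *) echo "unexpected resource name: $resource_name" >&2; return 1 ;;
  esac

  rest="${resource_name#projects/}"   # PROJECT_ID/locations/LOCATION/instances/INSTANCE_NAME
  project="${rest%%/*}"
  rest="${rest#*/locations/}"         # LOCATION/instances/INSTANCE_NAME
  location="${rest%%/*}"
  instance="${rest#*/instances/}"

  # default-pool and default-container are the fixed trailing segments
  # shown in the manifest above.
  printf '%s/%s/%s/default-pool/default-container\n' "$project" "$location" "$instance"
}
```

Passing the NAME value from the sample output produces the example volumeHandle listed under INSTANCE_NAME.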
Node mount
The Parallelstore CSI driver lets you mount volumes directly on your nodes. Node mount is supported on GKE clusters version 1.32.3 and later.

Node-level mounting allows all Pods on a node to share the same mount point. Sharing the mount point improves scalability because the number of mounts increases with the number of nodes, not the number of Pods (as with the sidecar mode).

As a result, you can run more Pods while sharing the same Parallelstore instance.

Note: To enable this feature, set the mountLocality: node volume attribute. The default is mountLocality: pod. The dfuse CPU and memory request and limit attributes only take effect with the mountLocality: node setting.

If only the request or only the limit is set, both are set to the same specified value.

You can use `'0'` as the value to unset any resource limits. For example, dfuseMemoryLimit: '0' removes the memory limit for the dfuse process.
```
apiVersion: v1
kind: PersistentVolume
metadata:
  name: parallelstore-pv
spec:
  storageClassName: "STORAGECLASS_NAME"
  capacity:
    storage: STORAGE_SIZE
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
  csi:
    driver: parallelstore.csi.storage.gke.io
    volumeHandle: "PROJECT_ID/LOCATION/INSTANCE_NAME/default-pool/default-container"
    volumeAttributes:
      accessPoints: ACCESS_POINTS
      network: NETWORK_NAME
      mountLocality: node
      dfuseCPURequest: DFUSE_CPU_REQUEST
      dfuseMemoryRequest: DFUSE_MEMORY_REQUEST
      dfuseCPULimit: DFUSE_CPU_LIMIT
      dfuseMemoryLimit: DFUSE_MEMORY_LIMIT
  claimRef:
    name: parallelstore-pvc
    namespace: default
```
Replace the following:
- PROJECT_ID: the Google Cloud project ID.
- LOCATION: the zonal location of your Parallelstore instance. You must specify a supported zone for the Parallelstore CSI driver.
- INSTANCE_NAME: the name of your Parallelstore instance. An example of a valid volumeHandle value is "my-project/us-central1-a/pvc-eff1ed02-a8ed-48d2-9902-bd70a2d60563/default-pool/default-container".
- ACCESS_POINTS: the access points of your Parallelstore instance; for example, 10.51.110.2,10.51.110.4,10.51.110.3.
- NETWORK_NAME: the VPC network where your Parallelstore instance can be accessed.
- STORAGECLASS_NAME: the name of your StorageClass. The value can be an empty string, but it must match the specification in your PersistentVolumeClaim.
- STORAGE_SIZE: the storage size; for example, 12000Gi.
- DFUSE_CPU_REQUEST: the CPU request for the dfuse process. The default is 250m.
- DFUSE_MEMORY_REQUEST: the memory request for the dfuse process. The default is 512Mi.
- DFUSE_CPU_LIMIT: the CPU limit for the dfuse process. The default is unset.
- DFUSE_MEMORY_LIMIT: the memory limit for the dfuse process. The default is 10Gi.
For the full list of fields that are supported in the PersistentVolume object, refer to the Parallelstore CSI reference documentation.
Create the PersistentVolume by running this command:
```
kubectl apply -f parallelstore-pv.yaml
```

(Optional) Mount the same Parallelstore instance with different mount options

Note: This feature is not supported with node mounts.

You can mount the same Parallelstore instance with different mount options. For example, you can mount the same Parallelstore instance with caching enabled and with caching disabled in the same Pod.

To mount the same Parallelstore instance with different mount options, you must create a PersistentVolume for each mount option. Use the following syntax for the volumeHandle field in the PersistentVolume object: "PROJECT_ID/LOCATION/INSTANCE_NAME/default-pool/default-container:RANDOM_SUFFIX", where RANDOM_SUFFIX is a random string of your choice.

For example: "my-project/us-central1-a/pvc-eff1ed02-a8ed-48d2-9902-bd70a2d60563/default-pool/default-container:xyz123"
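As a rough illustration, the suffixed handles for several PersistentVolumes can be derived from one base handle. This is only a sketch: volume_handle_with_suffix is a hypothetical helper, and the suffix is whatever string you choose.

```shell
# Hypothetical helper: append a caller-chosen suffix to a base volumeHandle
# so that each PersistentVolume for the same instance gets a distinct handle.
volume_handle_with_suffix() {
  local base_handle="$1"
  local suffix="$2"

  if [ -z "$base_handle" ] || [ -z "$suffix" ]; then
    echo "usage: volume_handle_with_suffix BASE_HANDLE SUFFIX" >&2
    return 1
  fi

  # The suffix is separated from the base handle by a colon.
  printf '%s:%s\n' "$base_handle" "$suffix"
}
```

For instance, calling the helper twice with two different suffixes yields one handle for a PersistentVolume with caching enabled and another for a PersistentVolume with caching disabled.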

Use a PersistentVolumeClaim to access the volume

You can create a PersistentVolumeClaim resource that references the Parallelstore CSI driver's StorageClass.

The following manifest file shows an example of how to create a PersistentVolumeClaim in ReadWriteMany access mode whose storageClassName matches the value you specified in your PersistentVolume.

Save the following manifest in a file named parallelstore-pvc.yaml:

  kind: PersistentVolumeClaim
  apiVersion: v1
  metadata:
    name: parallelstore-pvc
    namespace: default
  spec:
    accessModes:
      - ReadWriteMany
    storageClassName: STORAGECLASS_NAME
    resources:
      requests:
        storage: STORAGE_SIZE

Replace the following:

STORAGECLASS_NAME: the name of your StorageClass. It must match the specification in your PersistentVolume.
STORAGE_SIZE: the storage size; for example, 12000Gi. It must match the specification in your PersistentVolume.

Create the PersistentVolumeClaim by running this command:
```
  kubectl create -f parallelstore-pvc.yaml
```
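Before you create a workload, you might want to confirm that the claim is bound to the PersistentVolume. The following sketch wraps a standard kubectl query of the claim's status.phase field. The helper name pvc_is_bound is hypothetical, and the sketch assumes kubectl is already configured for your cluster.

```shell
# Hypothetical helper: succeed only when the PersistentVolumeClaim
# reports the phase "Bound".
pvc_is_bound() {
  local pvc_name="$1"
  local namespace="${2:-default}"
  local phase

  # Read the claim's phase; fail if kubectl itself fails.
  phase="$(kubectl get pvc "$pvc_name" --namespace "$namespace" \
    -o jsonpath='{.status.phase}')" || return 1

  [ "$phase" = "Bound" ]
}

# Example usage (not executed here):
#   pvc_is_bound parallelstore-pvc default && echo "claim is bound"
```

If the claim stays in the Pending phase, a mismatch between the storageClassName or storage values in the two manifests is one possible cause to check.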

(Optional) Configure resources for the sidecar container

Note: You can only configure resources for the sidecar container if you use the Pod mount locality, that is, the mountLocality volume attribute in your PersistentVolume is either unset or set to pod.

When you create a workload Pod that uses Parallelstore-backed volumes, the CSI driver determines whether your volume is based on Parallelstore instances.

If the driver detects that your volume is Parallelstore-based, or if you specify the annotation gke-parallelstore/volumes: "true", the CSI driver automatically injects a sidecar container named gke-parallelstore-sidecar into your Pod. This sidecar container mounts the Parallelstore instance to your workload.

By default, the sidecar container is configured with the following resource requests, with resource limits unset:

250m CPU
512 MiB memory
10 MiB ephemeral storage

To overwrite these values, you can optionally specify the annotation gke-parallelstore/[cpu-request|memory-request|cpu-limit|memory-limit|ephemeral-storage-request] as shown in the following example:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    gke-parallelstore/volumes: "true"
    gke-parallelstore/cpu-request: 500m
    gke-parallelstore/memory-request: 1Gi
    gke-parallelstore/ephemeral-storage-request: 500Mi
    gke-parallelstore/cpu-limit: 1000m
    gke-parallelstore/memory-limit: 2Gi
    gke-parallelstore/ephemeral-storage-limit: 1Gi

Use the following considerations when deciding the amount of resources to allocate:

If one of the request or limit values is set and the other is unset, GKE sets them both to the same specified value.
Allocate more CPU to the sidecar container if your workloads need higher throughput. Insufficient CPU will cause I/O throttling.
You can use "0" as the value to unset any resource limits on Standard clusters; for example, gke-parallelstore/memory-limit: "0" removes the memory limit for the sidecar container. This is useful when you cannot decide on the amount of resources gke-parallelstore-sidecar needs for your workloads, and want to let the sidecar consume all the available resources on a node.

Create a workload that consumes the volume

This section shows an example of how to create a Pod that consumes the PersistentVolumeClaim resource you created earlier.

Multiple Pods can share the same PersistentVolumeClaim resource.

Save the following manifest in a file named my-pod.yaml.

  apiVersion: v1
  kind: Pod
  metadata:
    name: my-pod
  spec:
    containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: parallelstore-volume
          mountPath: /data
    volumes:
    - name: parallelstore-volume
      persistentVolumeClaim:
        claimName: parallelstore-pvc

Run the following command to apply the manifest to the cluster:
```
  kubectl apply -f my-pod.yaml
```
The Pod waits until GKE provisions the PersistentVolumeClaim before it starts running. This operation might take several minutes to complete.
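To check the result, you can wait for the Pod to become ready and then inspect the mount from inside the workload container. The following is a sketch that uses the standard kubectl wait and kubectl exec commands. The helper names and the 600s default timeout are assumptions for illustration. The container name nginx and the path /data come from the manifest above.

```shell
# Hypothetical helper: block until the Pod reports the Ready condition.
wait_for_pod_ready() {
  local pod_name="$1"
  local timeout="${2:-600s}"   # assumed default; adjust for your environment
  kubectl wait --for=condition=Ready "pod/$pod_name" --timeout="$timeout"
}

# Hypothetical helper: show the file system mounted at the given path.
# The container is named explicitly because the Pod can also contain
# the injected sidecar container.
show_mount() {
  local pod_name="$1"
  local container="$2"
  local mount_path="$3"
  kubectl exec "$pod_name" -c "$container" -- df -h "$mount_path"
}

# Example usage (not executed here):
#   wait_for_pod_ready my-pod && show_mount my-pod nginx /data
```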

Manage the Parallelstore CSI driver

This section covers how you can enable and disable the Parallelstore CSI driver, if needed.

Enable the Parallelstore CSI driver on a new cluster

To enable the Parallelstore CSI driver when creating a new Standard cluster, run the following command with the Google Cloud CLI:

gcloud container clusters create CLUSTER_NAME \
    --location=LOCATION \
    --network=NETWORK_NAME \
    --addons=ParallelstoreCsiDriver \
    --cluster-version=VERSION

Replace the following:

CLUSTER_NAME: the name of your cluster.
LOCATION: the Compute Engine zone containing the cluster. You must specify a supported zone for the Parallelstore CSI driver.
NETWORK_NAME: the name of the VPC network you created in Configure a VPC network.
VERSION: the GKE version number. You must specify a supported version number to use this feature, such as GKE version 1.29 or later. Alternatively, you can use the --release-channel flag and specify a release channel.

Enable the Parallelstore CSI driver on an existing cluster

To enable the driver on an existing GKE Standard cluster, run the following command with the Google Cloud CLI:

gcloud container clusters update CLUSTER_NAME \
  --location=LOCATION \
  --update-addons=ParallelstoreCsiDriver=ENABLED

Replace the following:

CLUSTER_NAME: the name of your cluster.
LOCATION: the Compute Engine zone containing the cluster. You must specify a supported zone for the Parallelstore CSI driver.

Make sure that your GKE cluster is running in the same VPC network that you set up in Configure a VPC network. To verify the VPC network for a GKE cluster, you can check in the Google Cloud console, or run the command gcloud container clusters describe CLUSTER_NAME --format="value(networkConfig.network)" --location=LOCATION.
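The value printed by the describe command can then be compared with the network of your Parallelstore instance. The following sketch assumes each value is either a short network name or a full resource path that ends in the network name. The helper names are hypothetical.

```shell
# Hypothetical helper: reduce a network reference to its last path segment,
# so "projects/my-project/global/networks/my-vpc" becomes "my-vpc".
network_short_name() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: succeed when two network references share the same
# short name. This does not compare project IDs, so two networks with the
# same name in different projects would also match.
same_network() {
  [ "$(network_short_name "$1")" = "$(network_short_name "$2")" ]
}

# Example usage (not executed here):
#   same_network "$CLUSTER_NETWORK" "$INSTANCE_NETWORK" || echo "network mismatch"
```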

Disable the Parallelstore CSI driver

You can disable the Parallelstore CSI driver on an existing Autopilot or Standard cluster by running the following command with the Google Cloud CLI:

gcloud container clusters update CLUSTER_NAME \
    --location=LOCATION \
    --update-addons=ParallelstoreCsiDriver=DISABLED

Replace the following:

CLUSTER_NAME: the name of your cluster.
LOCATION: the Compute Engine zone containing the cluster. You must specify a supported zone for the Parallelstore CSI driver.

Use fsGroup with Parallelstore volumes

The Parallelstore CSI driver supports changing the group ownership of the root level directory of the mounted file system to match a user-requested fsGroup specified in the Pod's SecurityContext. This feature is only supported in GKE clusters version 1.29.5 or later, or version 1.30.1 or later.
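As an illustration, the following shell sketch prints a Pod manifest that sets fsGroup in the Pod-level securityContext, which is a standard Kubernetes field. The helper name render_fsgroup_pod and the group ID in the usage example are assumptions. The Pod, volume, and claim names reuse the earlier examples in this guide.

```shell
# Hypothetical helper: print a Pod manifest that requests the given fsGroup
# for the Parallelstore-backed volume.
render_fsgroup_pod() {
  local fs_group="$1"

  # fsGroup must be a numeric group ID.
  case "$fs_group" in
    ''|*[!0-9]*) echo "fsGroup must be a non-negative integer" >&2; return 1 ;;
  esac

  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  securityContext:
    fsGroup: ${fs_group}
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: parallelstore-volume
      mountPath: /data
  volumes:
  - name: parallelstore-volume
    persistentVolumeClaim:
      claimName: parallelstore-pvc
EOF
}

# Example usage (not executed here):
#   render_fsgroup_pod 2000 | kubectl apply -f -
```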

Troubleshooting

For troubleshooting guidance, refer to the Troubleshooting page in the Parallelstore documentation.
