Transfer data to or from Cloud Storage

Parallelstore can import data from, and export data to, Cloud Storage. Data transfers allow you to quickly load data into your Parallelstore instance, and to use Cloud Storage as a durable backing layer for your Parallelstore instance.

Transfers from Cloud Storage are incremental: they copy only files that don't already exist on your Parallelstore instance, or that have changed since they were last transferred.

Required permissions

The user or service account used for initiating the transfer requires the following permissions:

  • parallelstore.instances.exportData to transfer from Parallelstore to Cloud Storage.
  • parallelstore.instances.importData to transfer from Cloud Storage to Parallelstore.

Both of these permissions are granted with the roles/parallelstore.admin role. You can create a custom role to grant permissions independently.

In addition, the Parallelstore service account requires the following role:

  • roles/storage.admin on the Cloud Storage bucket.

To grant this role, run the following gcloud command:

gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-parallelstore.iam.gserviceaccount.com \
  --role=roles/storage.admin
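
The member string in the command above is easy to mistype, so it can help to build it from a variable. The following is a minimal sketch, assuming the service agent address is keyed by the project number (as is typical for Google-managed service agents) and that IAM expects the serviceAccount: prefix on the member. The helper name parallelstore_agent_member is hypothetical.

```shell
# Hypothetical helper: print the IAM member string for the Parallelstore
# service agent, given a project number (assumption: the address is keyed
# by project number, not project ID).
parallelstore_agent_member() {
  printf 'serviceAccount:service-%s@gcp-sa-parallelstore.iam.gserviceaccount.com\n' "$1"
}

# Usage (commented out because it calls Google Cloud):
# PROJECT_NUMBER="$(gcloud projects describe PROJECT_ID --format='value(projectNumber)')"
# gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
#   --member="$(parallelstore_agent_member "$PROJECT_NUMBER")" \
#   --role=roles/storage.admin
```

Keeping the lookup and the binding as separate steps makes it easier to confirm the resolved member before changing the bucket policy.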

Import data to Parallelstore

You can import data from a Cloud Storage bucket. The bucket can be in the same or a different project. The bucket can be in a different zone or region from your Parallelstore instance, but inter-region transfers might be slower than intra-region transfers.

gcloud

gcloud beta parallelstore instances import-data INSTANCE_ID \
  --location=LOCATION \
  --source-gcs-bucket-uri=gs://BUCKET_NAME \
  --destination-parallelstore-path=PS_PATH \
  --request-id=REQUEST_ID \
  --async

Where:

  • INSTANCE_ID is your Parallelstore instance name.
  • --location must be a supported zone.
  • --source-gcs-bucket-uri specifies the URI to a Cloud Storage bucket, or a path within a bucket, using the format gs://<bucket_name>/<optional_path_inside_bucket>.
  • --destination-parallelstore-path specifies the root directory path to the Parallelstore file system. Must start with /. Default is /.
  • --request-id allows you to assign a unique ID to this request. If you retry this request using the same request ID, the server will ignore the request if it has already been completed. Must be a valid UUID that is not all zeros.
  • --async returns a response immediately, without waiting for the operation to complete.
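
Because --request-id must be a valid UUID that is not all zeros, a script might check the value before sending the request. The following is a small sketch of such a check; the helper name is_valid_request_id is hypothetical, and the check only covers the textual UUID layout, not any further rules the service may apply.

```shell
# Hypothetical helper: succeed only for a UUID-shaped string that is not
# the all-zero UUID.
is_valid_request_id() {
  # Require the 8-4-4-4-12 hexadecimal layout.
  printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$' || return 1
  # Reject the all-zero UUID.
  [ "$1" != "00000000-0000-0000-0000-000000000000" ]
}

# Usage (assumes the common uuidgen utility is installed):
# REQUEST_ID="$(uuidgen)"
# is_valid_request_id "$REQUEST_ID" || echo "invalid request ID" >&2
```

Reusing the same REQUEST_ID on a retry is what lets the server recognize a request it has already completed, so store the value rather than generating a new one for each attempt.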

REST

POST https://parallelstore.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID:importData
Authorization: Bearer [YOUR_ACCESS_TOKEN]

{
  "source_gcs_bucket" : {
    "uri" : "gs://BUCKET_NAME/"
  },
  "destination_parallelstore" : {
    "path" : "/PATH"
  }
}

Where:

  • PROJECT_ID is your Google Cloud project ID.
  • LOCATION must be the supported zone in which your instance resides.
  • INSTANCE_ID is your Parallelstore instance name.
  • source_gcs_bucket contains a uri key whose value specifies the URI to a Cloud Storage bucket, or a path within a bucket, using the format gs://<bucket_name>/<optional_path_inside_bucket>.
  • destination_parallelstore contains a path key whose value specifies the root directory path to the Parallelstore file system. Must start with /. Default is /.

To use your own service account instead of the Google-managed service agent, the request supports a serviceAccount field in the JSON object:

"serviceAccount" : "projects/PROJECT_ID/serviceAccounts/SERVICE_ACCOUNT_ID"

An example cURL command looks like:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://parallelstore.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID:importData \
  -d '{"source_gcs_bucket": {"uri":"gs://BUCKET_NAME/"}, "destination_parallelstore": {"path":"/"}}'
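
When the optional serviceAccount field is needed, it can be less error-prone to assemble the request body in a function than to edit the JSON by hand. The following is a sketch only: build_import_body is a hypothetical helper, the key names are copied from the request body shown earlier in this section, and the function performs no JSON escaping, so it assumes the inputs contain no quotes or backslashes.

```shell
# Hypothetical helper: print an importData request body.
#   $1: Cloud Storage URI, for example gs://BUCKET_NAME/
#   $2: destination path on the Parallelstore file system
#   $3: optional service account resource name
# No JSON escaping is done (assumption: inputs contain no quotes or backslashes).
build_import_body() {
  if [ -n "${3:-}" ]; then
    printf '{"source_gcs_bucket": {"uri": "%s"}, "destination_parallelstore": {"path": "%s"}, "serviceAccount": "%s"}\n' "$1" "$2" "$3"
  else
    printf '{"source_gcs_bucket": {"uri": "%s"}, "destination_parallelstore": {"path": "%s"}}\n' "$1" "$2"
  fi
}

# Usage: pass the result to curl with -d "$(build_import_body gs://BUCKET_NAME/ /)"
```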

Export data

You can export data from your Parallelstore instance to a Cloud Storage bucket in the same or a different project. The bucket can be in a different zone or region from your Parallelstore instance, but inter-region transfers might be slower than intra-region transfers.

gcloud

gcloud beta parallelstore instances export-data \
  INSTANCE_ID \
  --location=LOCATION \
  --destination-gcs-bucket-uri="gs://BUCKET_NAME" \
  --source-parallelstore-path="/" \
  --request-id=REQUEST_ID \
  --async

Where:

  • INSTANCE_ID is your Parallelstore instance name.
  • --location must be a supported zone.
  • --destination-gcs-bucket-uri specifies the URI to a Cloud Storage bucket, or a path within a bucket, using the format gs://<bucket_name>/<optional_path_inside_bucket>.
  • --async returns a response immediately, without waiting for the operation to complete.
  • --source-parallelstore-path specifies the root directory path to the Parallelstore file system. Must start with /. Default is /.
  • --request-id allows you to assign a unique ID to this request. If you retry this request using the same request ID, the server will ignore the request if it has already been completed. Must be a valid UUID that is not all zeros.
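
The format rules above (a gs:// URI for the bucket and a Parallelstore path that starts with /) can be checked locally before starting a transfer. The following is a minimal sketch; check_transfer_args is a hypothetical helper, and it checks only the shape of the arguments, not whether the bucket or path exists.

```shell
# Hypothetical helper: check the shape of the transfer arguments.
#   $1: Cloud Storage URI
#   $2: Parallelstore path
check_transfer_args() {
  case "$1" in
    gs://?*) ;;  # gs:// followed by at least one character
    *) echo "bucket URI must look like gs://<bucket_name>/<optional_path_inside_bucket>" >&2
       return 1 ;;
  esac
  case "$2" in
    /*) ;;       # the path must start with /
    *) echo "Parallelstore path must start with /" >&2
       return 1 ;;
  esac
}

# Usage:
# check_transfer_args "gs://BUCKET_NAME" "/" && echo "arguments look well-formed"
```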

REST

POST https://parallelstore.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID:exportData
Authorization: Bearer [YOUR_ACCESS_TOKEN]

{
  "source_parallelstore" : {
    "path" : "/"
  },
  "destination_gcs_bucket" : {
    "uri" : "gs://BUCKET_NAME/"
  }
}

Where:

  • PROJECT_ID is your Google Cloud project ID.
  • INSTANCE_ID is your Parallelstore instance name.
  • LOCATION must be the supported zone in which your Parallelstore instance resides.
  • source_parallelstore contains a path key whose value specifies the root directory path to the Parallelstore file system. Must start with /. Default is /.
  • destination_gcs_bucket contains a uri key whose value specifies the URI to a Cloud Storage bucket, or a path within a bucket, using the format gs://<bucket_name>/<optional_path_inside_bucket>.

To use your own service account instead of the Google-managed service agent, the request supports a serviceAccount field in the JSON object:

"serviceAccount" : "projects/PROJECT_ID/serviceAccounts/SERVICE_ACCOUNT_ID"

An example cURL command looks like:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://parallelstore.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID:exportData \
  -d '{"source_parallelstore": {"path":"/"}, "destination_gcs_bucket": {"uri":"gs://BUCKET_NAME/"}}'

Get operation

To see the status of an import or export operation, you'll need the operation ID. This ID is returned by the service when you make an import or export request, and uses the following format:

  • operation-1234567890123-6127783ad26ea-88913969-02748053
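
If the response reports the operation as a full resource name rather than a bare ID, the ID is the last path segment. The following sketch assumes a name of the form projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID, which matches the REST path used for operations in this section; operation_id_from_name is a hypothetical helper.

```shell
# Hypothetical helper: print the last path segment of an operation resource
# name (assumed form: projects/.../locations/.../operations/OPERATION_ID).
# A bare operation ID is printed unchanged.
operation_id_from_name() {
  printf '%s\n' "${1##*/}"
}

# Usage:
# operation_id_from_name "projects/PROJECT_ID/locations/LOCATION/operations/operation-1234567890123-6127783ad26ea-88913969-02748053"
```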

gcloud

gcloud beta parallelstore operations describe OPERATION_ID \
  --location=LOCATION

REST

GET https://parallelstore.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID
Authorization: Bearer [YOUR_ACCESS_TOKEN]

An example cURL command looks like:

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://parallelstore.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID
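
For transfers started with --async, a script may want to wait until the operation finishes. The following is a rough polling sketch, not a definitive implementation. It assumes the operation resource exposes a done field, as long-running operations in Google APIs generally do, and that gcloud's value formatter prints True once that field is set. The function names describe_done and wait_for_operation are hypothetical.

```shell
# Hypothetical wrapper: print the operation's "done" field.
# Assumption: the output is "True" once the operation has finished and is
# empty or "False" while it is still running.
describe_done() {
  gcloud beta parallelstore operations describe "$1" \
    --location="$2" --format='value(done)'
}

# Poll until the operation reports done, or give up after a fixed number
# of attempts.
#   $1: operation ID   $2: location
#   $3: maximum number of attempts   $4: seconds to sleep between attempts
wait_for_operation() {
  attempt=0  # plain (global) variable, kept for portability across shells
  while [ "$attempt" -lt "$3" ]; do
    case "$(describe_done "$1" "$2")" in
      True|true) return 0 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$4"
  done
  return 1
}

# Usage:
# wait_for_operation OPERATION_ID LOCATION 60 10 || echo "still running" >&2
```

A finished operation is not necessarily a successful one, so after the loop returns it is worth describing the operation once more and checking whether it reports an error.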

Cancel operation

To cancel an import or export operation, you'll need the operation ID. This ID is returned by the service when you make an import or export request, and uses the following format:

  • operation-1234567890123-6127783ad26ea-88913969-02748053

gcloud

gcloud beta parallelstore operations cancel OPERATION_ID \
  --location=LOCATION

REST

POST https://parallelstore.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel
Authorization: Bearer [YOUR_ACCESS_TOKEN]

An example cURL command looks like:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://parallelstore.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel