Transfer data to or from Cloud Storage

Google Cloud Managed Lustre can import data from, and export data to, Cloud Storage. Data transfers are incremental: they copy only files that don't already exist in the destination, or that have changed since they were last transferred.

Cloud Storage buckets with hierarchical namespace enabled provide faster transfer speeds to and from Managed Lustre compared to standard buckets.

Required permissions

The user or service account used for initiating the transfer requires the following permissions:

  • lustre.instances.exportData in order to transfer from Managed Lustre to Cloud Storage.
  • lustre.instances.importData in order to transfer from Cloud Storage to Managed Lustre.

Both of these permissions are granted with the roles/lustre.admin role. You can create a custom role to grant permissions independently.

In addition, the Managed Lustre service account requires one of the following Cloud Storage roles:

  • To transfer data to and from Cloud Storage: roles/storage.objectUser on the Cloud Storage bucket.
  • To only transfer from Cloud Storage: roles/storage.objectViewer on the Cloud Storage bucket.

To grant one of these roles, run the following gcloud command, replacing ROLE with either roles/storage.objectUser or roles/storage.objectViewer:

gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-lustre.iam.gserviceaccount.com \
  --role=ROLE

Your PROJECT_NUMBER is not the same as a project ID:

  • A project ID is a unique string that can be a combination of letters, numbers, and hyphens. You specify a project ID when creating your project. For example, example-project-123.
  • A project number is an automatically generated unique identifier for your project that consists only of numbers. For example, 1234567890.

To obtain the PROJECT_NUMBER for a given project ID, use the gcloud projects describe command:

gcloud projects describe PROJECT_ID --format="value(projectNumber)"
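
The project number lookup and the role grant can be combined in a small script. The following is a minimal sketch rather than an official tool: the function names lustre_service_agent_member and grant_lustre_bucket_role are hypothetical, and the only gcloud commands used are the two shown in this section.

```shell
# Sketch: build the Managed Lustre service agent member string and grant it a
# Cloud Storage role. The function names are illustrative, not part of gcloud.

# Print the IAM member string for the service agent of a given project number.
lustre_service_agent_member() {
  printf 'serviceAccount:service-%s@gcp-sa-lustre.iam.gserviceaccount.com\n' "$1"
}

# Look up the project number for a project ID, then grant the role on the bucket.
# Arguments: PROJECT_ID BUCKET_NAME ROLE
grant_lustre_bucket_role() {
  local project_id="$1" bucket="$2" role="$3"
  local project_number
  project_number="$(gcloud projects describe "$project_id" \
    --format="value(projectNumber)")" || return 1
  gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
    --member="$(lustre_service_agent_member "$project_number")" \
    --role="$role"
}

# Print the member string for the sample project number; this does not call gcloud.
lustre_service_agent_member 1234567890
```

To grant read-only access for imports, you might call grant_lustre_bucket_role example-project-123 BUCKET_NAME roles/storage.objectViewer. Use roles/storage.objectUser instead if you also plan to export.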

Import data to Managed Lustre

You can import data from a Cloud Storage bucket. The bucket can be in the same or a different project. The bucket can be in a different zone or region from your Managed Lustre instance, but inter-region transfers might be slower than intra-region transfers.

gcloud

gcloud lustre instances import-data INSTANCE_ID \
  --location=LOCATION \
  --gcs-path-uri=gs://BUCKET_NAME/ \
  --lustre-path=LUSTRE_PATH

Where:

  • INSTANCE_ID is your Managed Lustre instance name.
  • --location is the zone of your Managed Lustre instance. For example, us-central1-a.
  • --gcs-path-uri specifies the URI to a Cloud Storage bucket, or a path within a bucket, using the format gs://<bucket_name>/<optional_path_inside_bucket>/. If a path inside the bucket is specified, it must end with a forward slash (/).
  • --lustre-path specifies the root directory path to the Managed Lustre file system. Must start with /. Default is /. If specifying a value other than the default, the directory must already exist on the file system.
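
The path rules above can be checked locally before a request is sent. The following is a sketch with hypothetical helper names. It checks only the shape of the values: it requires a trailing slash on every Cloud Storage URI, as in the documented format, and it can't confirm that the Lustre directory already exists, which needs access to the mounted file system.

```shell
# Sketch: shape checks for the import arguments described above.
# Both function names are illustrative.

# Succeeds if the URI has the form gs://<bucket>/ or gs://<bucket>/<path>/.
is_valid_gcs_path_uri() {
  case "$1" in
    gs://?*/) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the Lustre path starts with a forward slash.
is_valid_lustre_path() {
  case "$1" in
    /*) return 0 ;;
    *) return 1 ;;
  esac
}

# Try a few sample values; the last URI lacks the required trailing slash.
for uri in "gs://BUCKET_NAME/" "gs://BUCKET_NAME/data/" "gs://BUCKET_NAME/data"; do
  if is_valid_gcs_path_uri "$uri"; then
    echo "accepted: $uri"
  else
    echo "rejected: $uri"
  fi
done
```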

The following parameters are optional:

  • --request-id lets you assign a unique ID to this request. If you retry this request using the same request ID, the server will ignore the request if it has already been completed. Must be a valid UUID that is not all zeros.
  • --async returns a response immediately, without waiting for the operation to complete.

REST

POST https://lustre.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID:importData
Authorization: Bearer [YOUR_ACCESS_TOKEN]

{
  "gcsPath" : {
    "uri" : "gs://BUCKET_NAME/"
  },
  "lustrePath" : {
    "path" : "/PATH"
  }
}

Where:

  • PROJECT_ID is your Google Cloud project ID.
  • LOCATION is the zone of your Managed Lustre instance. For example, us-central1-a.
  • INSTANCE_ID is your Managed Lustre instance name.
  • gcsPath contains a uri key whose value specifies the URI to a Cloud Storage bucket, or a path within a bucket, using the format gs://<bucket_name>/<optional_path_inside_bucket>/. If a path inside the bucket is specified, it must end with a forward slash (/).
  • lustrePath contains a path key whose value specifies the root directory path to the Managed Lustre file system. Must start with /. Default is /. If specifying a value other than the default, the directory must already exist on the file system.

To use your own service account instead of the Google-managed service agent, the request supports a serviceAccount field in the JSON object:

"serviceAccount" : "projects/PROJECT_ID/serviceAccounts/SERVICE_ACCOUNT_ID"

An example curl command looks like:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://lustre.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID:importData \
  -d '{"gcsPath": {"uri":"gs://BUCKET_NAME/"}, "lustrePath": {"path":"/"}}'
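
If you sometimes pass the optional serviceAccount field, it can help to generate the request body rather than edit it by hand. The following sketch uses a hypothetical function, import_request_body, and assumes the values contain no double quotes or backslashes, because it performs no JSON escaping.

```shell
# Sketch: print an importData request body. The function name is illustrative.
# Arguments: GCS_URI LUSTRE_PATH [SERVICE_ACCOUNT]
# The serviceAccount field is added only when a third argument is given.
import_request_body() {
  local gcs_uri="$1" lustre_path="$2" service_account="${3:-}"
  if [ -n "$service_account" ]; then
    printf '{"gcsPath": {"uri": "%s"}, "lustrePath": {"path": "%s"}, "serviceAccount": "%s"}\n' \
      "$gcs_uri" "$lustre_path" "$service_account"
  else
    printf '{"gcsPath": {"uri": "%s"}, "lustrePath": {"path": "%s"}}\n' \
      "$gcs_uri" "$lustre_path"
  fi
}

# Print a body that uses the default service agent.
import_request_body "gs://BUCKET_NAME/" "/"
```

The output can be piped to curl by replacing the inline body with -d @-, which tells curl to read the request body from standard input.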

Export data

You can export data from your Managed Lustre instance to a Cloud Storage bucket in the same or a different project. The bucket can be in a different zone or region from your Managed Lustre instance, but inter-region transfers might be slower than intra-region transfers.

gcloud

gcloud lustre instances export-data \
  INSTANCE_ID \
  --location=LOCATION \
  --gcs-path-uri="gs://BUCKET_NAME/" \
  --lustre-path="/"

Where:

  • INSTANCE_ID is your Managed Lustre instance name.
  • --location is the zone of your Managed Lustre instance. For example, us-central1-a.
  • --gcs-path-uri specifies the URI to a Cloud Storage bucket, or a path within a bucket, using the format gs://<bucket_name>/<optional_path_inside_bucket>/. If a path inside the bucket is specified, it must end with a forward slash (/).
  • --lustre-path specifies the root directory path to the Managed Lustre file system. Must start with /. Default is /.

The following parameters are optional:

  • --request-id lets you assign a unique ID to this request. If you retry this request using the same request ID, the server will ignore the request if it has already been completed. Must be a valid UUID that is not all zeros.
  • --async returns a response immediately, without waiting for the operation to complete.
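
Because a retry with the same request ID is ignored once the original request has completed, a retry loop can safely reuse a single ID. The following is a sketch under that assumption. The helper names, the three attempts, and the 10-second pause are illustrative choices, not documented values.

```shell
# Sketch: validate a request ID and reuse it across retries.
# Function names, attempt count, and delay are illustrative.

# Succeeds if the argument looks like a UUID and is not the all-zero UUID.
is_valid_request_id() {
  case "$1" in
    00000000-0000-0000-0000-000000000000) return 1 ;;
  esac
  printf '%s\n' "$1" | grep -Eq \
    '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Export the whole file system, retrying with the same request ID.
# Arguments: INSTANCE_ID LOCATION GCS_URI REQUEST_ID
export_with_retry() {
  local instance="$1" location="$2" gcs_uri="$3" request_id="$4" attempt
  is_valid_request_id "$request_id" || return 2
  for attempt in 1 2 3; do
    gcloud lustre instances export-data "$instance" \
      --location="$location" \
      --gcs-path-uri="$gcs_uri" \
      --lustre-path="/" \
      --request-id="$request_id" && return 0
    sleep 10
  done
  return 1
}

# Check a sample ID; this does not call gcloud.
if is_valid_request_id "123e4567-e89b-12d3-a456-426614174000"; then
  echo "request ID accepted"
fi
```

On systems that provide the uuidgen utility, a fresh ID can be generated once with REQUEST_ID="$(uuidgen)" and then passed to every attempt.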

REST

POST https://lustre.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID:exportData
Authorization: Bearer [YOUR_ACCESS_TOKEN]

{
  "lustrePath" : {
    "path" : "/"
  },
  "gcsPath" : {
    "uri" : "gs://BUCKET_NAME/"
  }
}

Where:

  • PROJECT_ID is your Google Cloud project ID.
  • INSTANCE_ID is your Managed Lustre instance name.
  • LOCATION is the zone of your Managed Lustre instance. For example, us-central1-a.
  • lustrePath contains a path key whose value specifies the root directory path to the Managed Lustre file system. Must start with /. Default is /.
  • gcsPath contains a uri key whose value specifies the URI to a Cloud Storage bucket, or a path within a bucket, using the format gs://<bucket_name>/<optional_path_inside_bucket>/. If a path inside the bucket is specified, it must end with a forward slash (/).

To use your own service account instead of the Google-managed service agent, the request supports a serviceAccount field in the JSON object:

"serviceAccount" : "projects/PROJECT_ID/serviceAccounts/SERVICE_ACCOUNT_ID"

An example curl command looks like:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://lustre.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID:exportData \
  -d '{"lustrePath": {"path":"/"}, "gcsPath": {"uri":"gs://BUCKET_NAME/"}}'

Get operation

To see the status of an import or export operation, you'll need the operation ID. This ID is returned by the service when you make an import or export request, and uses the following format:

  • operation-1234567890123-6127783ad26ea-88913969-02748053

gcloud

gcloud lustre operations describe OPERATION_ID \
  --location=LOCATION

REST

GET https://lustre.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID
Authorization: Bearer [YOUR_ACCESS_TOKEN]

An example curl command looks like:

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://lustre.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID
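
When a transfer is started with --async, a script typically needs to wait for it to finish. The following sketch polls the describe command until the operation reports completion. It assumes the operation resource follows the usual Google Cloud long-running operation shape, with a full resource name ending in the operation ID and a boolean done field that gcloud prints as True. The function names and the 30-second default interval are hypothetical.

```shell
# Sketch: wait for an import or export operation to finish.
# Assumes a standard long-running operation with `name` and `done` fields.

# Print the trailing operation ID from a full operation resource name.
operation_id_from_name() {
  printf '%s\n' "${1##*/}"
}

# Poll until the operation reports done.
# Arguments: OPERATION_ID LOCATION [INTERVAL_SECONDS]
wait_for_operation() {
  local operation_id="$1" location="$2" interval="${3:-30}" done_flag
  while :; do
    done_flag="$(gcloud lustre operations describe "$operation_id" \
      --location="$location" --format="value(done)")" || return 1
    [ "$done_flag" = "True" ] && return 0
    sleep "$interval"
  done
}

# Extract the ID from a sample resource name; this does not call gcloud.
operation_id_from_name \
  "projects/PROJECT_ID/locations/LOCATION/operations/operation-1234567890123-6127783ad26ea-88913969-02748053"
```

A finished operation is not necessarily a successful one, so inspect the full describe output for an error before relying on the transferred data.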

Cancel operation

To cancel an import or export operation, you'll need the operation ID. This ID is returned by the service when you make an import or export request, and uses the following format:

  • operation-1234567890123-6127783ad26ea-88913969-02748053

gcloud

gcloud lustre operations cancel OPERATION_ID \
  --location=LOCATION

REST

POST https://lustre.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel
Authorization: Bearer [YOUR_ACCESS_TOKEN]

An example curl command looks like:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://lustre.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID:cancel