Create and manage batch operation jobs

This page describes how to create, view, list, cancel, and delete storage batch operations jobs. It also describes how to use Cloud Audit Logs with storage batch operations jobs.

Before you begin

To create and manage storage batch operations jobs, complete the steps in the following sections.

Configure Storage Intelligence

To create and manage storage batch operations jobs, configure Storage Intelligence on the bucket where you want to run the job.

Set up Google Cloud CLI

You must use Google Cloud CLI version 516.0.0 or later.

Set the default project

Set the project where you want to create the storage batch operations job.

gcloud config set project PROJECT_ID

where PROJECT_ID is the ID of your project.

Enable API

Enable the storage batch operations API.

gcloud services enable storagebatchoperations.googleapis.com

Create a manifest

To use a manifest for object selection, create a manifest.

Create a storage batch operations job

This section describes how to create a storage batch operations job.

Roles required

To get the required permissions for creating a storage batch operations job, ask your administrator to grant you the Storage Admin (roles/storage.admin) IAM role on the project. This predefined role contains the following permissions required to create a storage batch operations job:

storagebatchoperations.jobs.create
storage.objects.delete (Only required if running the delete objects storage batch operations job)
storage.objects.update (Only required if running the update object metadata, update object customer-managed encryption key, or update object hold storage batch operations job)

Command line

In the Google Cloud console, activate Cloud Shell.

Activate Cloud Shell

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
In your development environment, run the gcloud storage batch-operations jobs create command.
```
gcloud storage batch-operations jobs create JOB_NAME --bucket=BUCKET_NAME OBJECT_SELECTION_FLAG JOB_TYPE_FLAG
```
Where:
- JOB_NAME is the name of the storage batch operations job.
- BUCKET_NAME is the name of the bucket that contains one or more objects you want to process.
- OBJECT_SELECTION_FLAG is one of the following flags:
  - --included-object-prefixes: Specify one or more object prefixes. For example:
    - To match a single prefix, use: --included-object-prefixes='prefix1'.
    - To match multiple prefixes, use a comma-separated prefix list: --included-object-prefixes='prefix1,prefix2'.
    - To include all objects, use an empty prefix: --included-object-prefixes=''.
  - --manifest-location: Specify the manifest location. For example, gs://bucket_name/path/object_name.csv.
- JOB_TYPE_FLAG is one of the following flags, depending on the job type.
  - --delete-object: Delete one or more objects.
  - --put-metadata: Update object metadata. Object metadata is stored as key-value pairs. Specify the key-value pair for the metadata you want to modify. You can specify one or more key-value pairs as a list. You can also set object retention configurations using the --put-metadata flag. To do so, specify the retention parameters using the Retain-Until and Retention-Mode fields. For example,
```
gcloud storage batch-operations jobs create my-job --bucket=my-bucket --manifest=manifest.csv --put-metadata=Retain-Until=RETAIN_UNTIL_TIME, Retention-Mode=RETENTION_MODE
```
    Where:
    - RETAIN_UNTIL_TIME is the date and time, in RFC 3339 format, until which the object is retained. For example, 2025-10-09T10:30:00Z. To set the retention configuration on an object, you'll need to enable retention on the bucket which contains the object.
    - RETENTION_MODE is the retention mode, either Unlocked or Locked.
      
      When you send a request to update the RETENTION_MODE and RETAIN_UNTIL_TIME fields, consider the following:
      - To update the object retention configuration, you must provide non-empty values for both RETENTION_MODE and RETAIN_UNTIL_TIME fields; setting only one results in an INVALID_ARGUMENT error.
      - You can extend the RETAIN_UNTIL_TIME value for objects in both Unlocked or Locked modes.
      - The object retention must be in Unlocked mode if you want to do the following:
        
        Reduce the RETAIN_UNTIL_TIME value.
        
        Remove the retention configuration. To remove the configuration, you'll need to provide empty values for both RETENTION_MODE and RETAIN_UNTIL_TIME fields.
      - If you omit both RETENTION_MODE and RETAIN_UNTIL_TIME fields, the retention configuration remains unchanged.
    - --rewrite-object: Update the customer-managed encryption keys for one or more objects.
    - --put-object-event-based-hold: Enable event-based object holds.
    - --no-put-object-event-based-hold: Disable event-based object holds.
    - --put-object-temporary-hold: Enable temporary object holds.
    - --no-put-object-temporary-hold: Disable temporary object holds.

Client libraries

C++

For more information, see the Cloud Storage C++ API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

[](google::cloud::storagebatchoperations_v1::StorageBatchOperationsClient
       client,
   std::string const& project_id, std::string const& job_id,
   std::string const& target_bucket_name, std::string const& object_prefix) {
  auto const parent =
      std::string{"projects/"} + project_id + "/locations/global";
  namespace sbo = google::cloud::storagebatchoperations::v1;
  sbo::Job job;
  sbo::BucketList* bucket_list = job.mutable_bucket_list();
  sbo::BucketList::Bucket* bucket_config = bucket_list->add_buckets();
  bucket_config->set_bucket(target_bucket_name);
  sbo::PrefixList* prefix_list_config = bucket_config->mutable_prefix_list();
  prefix_list_config->add_included_object_prefixes(object_prefix);
  sbo::DeleteObject* delete_object_config = job.mutable_delete_object();
  delete_object_config->set_permanent_object_deletion_enabled(false);
  auto result = client.CreateJob(parent, job, job_id).get();
  if (!result) throw result.status();
  std::cout << "Created job: " << result->name() << "\n";
}

PHP

For more information, see the Cloud Storage PHP API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

use Google\Cloud\StorageBatchOperations\V1\Client\StorageBatchOperationsClient;
use Google\Cloud\StorageBatchOperations\V1\CreateJobRequest;
use Google\Cloud\StorageBatchOperations\V1\Job;
use Google\Cloud\StorageBatchOperations\V1\BucketList;
use Google\Cloud\StorageBatchOperations\V1\BucketList\Bucket;
use Google\Cloud\StorageBatchOperations\V1\PrefixList;
use Google\Cloud\StorageBatchOperations\V1\DeleteObject;

/**
 * Create a new batch job.
 *
 * @param string $projectId Your Google Cloud project ID.
 *        (e.g. 'my-project-id')
 * @param string $jobId A unique identifier for this job.
 *        (e.g. '94d60cc1-2d95-41c5-b6e3-ff66cd3532d5')
 * @param string $bucketName The name of your Cloud Storage bucket to operate on.
 *        (e.g. 'my-bucket')
 * @param string $objectPrefix The prefix of objects to include in the operation.
 *        (e.g. 'prefix1')
 */
function create_job(string $projectId, string $jobId, string $bucketName, string $objectPrefix): void
{
    // Create a client.
    $storageBatchOperationsClient = new StorageBatchOperationsClient();

    $parent = $storageBatchOperationsClient->locationName($projectId, 'global');

    $prefixListConfig = new PrefixList(['included_object_prefixes' => [$objectPrefix]]);
    $bucket = new Bucket(['bucket' => $bucketName, 'prefix_list' => $prefixListConfig]);
    $bucketList = new BucketList(['buckets' => [$bucket]]);

    $deleteObject = new DeleteObject(['permanent_object_deletion_enabled' => false]);

    $job = new Job(['bucket_list' => $bucketList, 'delete_object' => $deleteObject]);

    $request = new CreateJobRequest([
        'parent' => $parent,
        'job_id' => $jobId,
        'job' => $job,
    ]);
    $response = $storageBatchOperationsClient->createJob($request);

    printf('Created job: %s', $response->getName());
}

REST APIs

JSON API

Have gcloud CLI installed and initialized , which lets you generate an access token for the Authorization header.

Create a JSON file that contains the settings for the storage batch operations job. The following are common settings to include:
```
                                    

class="devsite-click-to-copy" translate="no" dir="ltr" is-upgraded syntax="JSON">{ "Description": "JOB_DESCRIPTION", "BucketList": { "Buckets": [ { "Bucket": "BUCKET_NAME", "Manifest": { "manifest_location": "MANIFEST_LOCATION" } "PrefixList": { "include_object_prefixes": "OBJECT_PREFIXES" } } ] }, "DeleteObject": { "permanent_object_deletion_enabled": OBJECT_DELETION_VALUE } "RewriteObject": { "kms_key":"KMS_KEY_VALUE" } "PutMetadata":{ "METADATA_KEY": "METADATA_VALUE", ..., "objectRetention": { "retainUntilTime": "RETAIN_UNTIL_TIME", "mode": "RETENTION_MODE" } } "PutObjectHold": { "temporary_hold": TEMPORARY_HOLD_VALUE, "event_based_hold": EVENT_BASED_HOLD_VALUE } } 
```
Where:
- JOB_NAME is the name of the storage batch operations job.
- JOB_DESCRIPTION is the description of the storage batch operations job.
- BUCKET_NAME is the name of the bucket that contains one or more objects you want to process.
- To specify the objects you want to process, use any one of the following attributes in the JSON file:
  - MANIFEST_LOCATION is the manifest location. For example, gs://bucket_name/path/object_name.csv.
  - OBJECT_PREFIXES is the comma-separated list containing one or more object prefixes. To match all objects, use an empty list.
- Depending on the job you want to process, specify any one of the following options:
  - Delete objects:
```
"DeleteObject":
{
"permanent_object_deletion_enabled": OBJECT_DELETION_VALUE
}
```
    Where OBJECT_DELETION_VALUE is TRUE to delete objects.
  - Update the Customer-managed encryption key for objects:
```
"RewriteObject":
{
"kms_key": KMS_KEY_VALUE
}
```
    Where KMS_KEY_VALUE is the value of the object's KMS key you want to update.
  - Update object metadata:
```
"PutMetadata": {
 "METADATA_KEY": "METADATA_VALUE",
 ...,
"objectRetention": {
   "retainUntilTime": "RETAIN_UNTIL_TIME",
   "mode": "RETENTION_MODE"
 }
}
```
    Where:
    - METADATA_KEY/VALUE is the object's metadata key-value pair. You can specify one or more pairs.
    - RETAIN_UNTIL_TIME is the date and time, in RFC 3339 format, until which the object is retained. For example, 2025-10-09T10:30:00Z. To set the retention configuration on an object, you'll need to enable retention on the bucket which contains the object.
    - RETENTION_MODE is the retention mode, either Unlocked or Locked.
      
      When you send a request to update the RETENTION_MODE and RETAIN_UNTIL_TIME fields, consider the following:
      
      To update the object retention configuration, you must provide non-empty values for both RETENTION_MODE and RETAIN_UNTIL_TIME fields; setting only one results in an INVALID_ARGUMENT error.
      
      You can extend the RETAIN_UNTIL_TIME value for objects in both Unlocked or Locked modes.
      
      The object retention must be in Unlocked mode if you want to do the following:
      
      Reduce the RETAIN_UNTIL_TIME value.
      
      Remove the retention configuration. To remove the configuration, you'll need to provide empty values for both RETENTION_MODE and RETAIN_UNTIL_TIME fields.
      
      If you omit both RETENTION_MODE and RETAIN_UNTIL_TIME fields, the retention configuration remains unchanged.
    - Update object holds:
      "PutObjectHold": { "temporary_hold": TEMPORARY_HOLD_VALUE, "event_based_hold": EVENT_BASED_HOLD_VALUE }
      Where:
      - TEMPORARY_HOLD_VALUE is used to enable or disable the temporary object hold. A value of 1 enables the hold, and a value of 2 disables the hold.
      - EVENT_BASED_HOLD_VALUE is used to enable or disable the event-based object hold. A value of 1 enables the hold, and a value of 2 disables the hold.
- Use cURL to call the JSON API with a POST storage batch operations job request:
```
curl -X POST --data-binary @JSON_FILE_NAME \
 -H "Authorization: Bearer $(gcloud auth print-access-token)" \
 -H "Content-Type: application/json" \
 "https://storagebatchoperations.googleapis.com/v1/projects/PROJECT_ID/locations/global/jobs?job_id=JOB_ID"
```
  Where:
  - JSON_FILE_NAME is the name of the JSON file.
  - PROJECT_ID is the ID or number of the project. For example, my-project.
  - JOB_ID is the name of the storage batch operations job.

Get storage batch operations job details

This section describes how to get the storage batch operations job details.

Roles required

To get the required permissions for viewing a storage batch operations job, ask your administrator to grant you Storage Admin (roles/storage.admin) IAM role on the project. This predefined role contains the following permissions required to view a storage batch operations job:

storagebatchoperations.jobs.get
storagebatchoperations.operations.get

Command line

In the Google Cloud console, activate Cloud Shell.

Activate Cloud Shell

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
In your development environment, run the gcloud storage batch-operations jobs describe command.
```
gcloud storage batch-operations jobs describe JOB_ID
```
Where:

JOB_ID is the name of the storage batch operations job.

Client libraries

C++

For more information, see the Cloud Storage C++ API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

[](google::cloud::storagebatchoperations_v1::StorageBatchOperationsClient
       client,
   std::string const& project_id, std::string const& job_id) {
  auto const parent =
      std::string{"projects/"} + project_id + "/locations/global";
  auto const name = parent + "/jobs/" + job_id;
  auto job = client.GetJob(name);
  if (!job) throw job.status();
  std::cout << "Got job: " << job->name() << "\n";
}

PHP

For more information, see the Cloud Storage PHP API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

use Google\Cloud\StorageBatchOperations\V1\Client\StorageBatchOperationsClient;
use Google\Cloud\StorageBatchOperations\V1\GetJobRequest;

/**
 * Gets a batch job.
 *
 * @param string $projectId Your Google Cloud project ID.
 *        (e.g. 'my-project-id')
 * @param string $jobId A unique identifier for this job.
 *        (e.g. '94d60cc1-2d95-41c5-b6e3-ff66cd3532d5')
 */
function get_job(string $projectId, string $jobId): void
{
    // Create a client.
    $storageBatchOperationsClient = new StorageBatchOperationsClient();

    $parent = $storageBatchOperationsClient->locationName($projectId, 'global');
    $formattedName = $parent . '/jobs/' . $jobId;

    $request = new GetJobRequest([
        'name' => $formattedName,
    ]);

    $response = $storageBatchOperationsClient->getJob($request);

    printf('Got job: %s', $response->getName());
}

REST APIs

JSON API

Have gcloud CLI installed and initialized , which lets you generate an access token for the Authorization header.

Use cURL to call the JSON API with a GET storage batch operations job request:
```
curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://storagebatchoperations.googleapis.com/v1/projects/PROJECT_ID/locations/global/jobs/JOB_ID"
```
Where:
- PROJECT_ID is the ID or number of the project. For example, my-project.
- JOB_ID is the name of the storage batch operations job.

List storage batch operations jobs

This section describes how to list the storage batch operations jobs within a project.

Roles required

To get the required permissions for listing all storage batch operations jobs, ask your administrator to grant you Storage Admin (roles/storage.admin) IAM role on the project. This predefined role contains the following permissions required to list storage batch operations jobs:

storagebatchoperations.jobs.list
storagebatchoperations.operations.list

Command line

In the Google Cloud console, activate Cloud Shell.

Activate Cloud Shell

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
In your development environment, run the gcloud storage batch-operations jobs list command.
```
gcloud storage batch-operations jobs list
```

Client libraries

C++

For more information, see the Cloud Storage C++ API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

[](google::cloud::storagebatchoperations_v1::StorageBatchOperationsClient
       client,
   std::string const& project_id) {
  auto const parent =
      std::string{"projects/"} + project_id + "/locations/global";
  for (auto const& job : client.ListJobs(parent)) {
    if (!job) throw job.status();
    std::cout << job->name() << "\n";
  }
}

PHP

For more information, see the Cloud Storage PHP API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

use Google\Cloud\StorageBatchOperations\V1\Client\StorageBatchOperationsClient;
use Google\Cloud\StorageBatchOperations\V1\ListJobsRequest;

/**
 * List Jobs in a given project.
 *
 * @param string $projectId Your Google Cloud project ID.
 *        (e.g. 'my-project-id')
 */
function list_jobs(string $projectId): void
{
    // Create a client.
    $storageBatchOperationsClient = new StorageBatchOperationsClient();

    $parent = $storageBatchOperationsClient->locationName($projectId, 'global');

    $request = new ListJobsRequest([
        'parent' => $parent,
    ]);

    $jobs = $storageBatchOperationsClient->listJobs($request);

    foreach ($jobs as $job) {
        printf('Job name: %s' . PHP_EOL, $job->getName());
    }
}

REST APIs

JSON API

Have gcloud CLI installed and initialized , which lets you generate an access token for the Authorization header.

Use cURL to call the JSON API with a LIST storage batch operations jobs request:

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://storagebatchoperations.googleapis.com/v1/projects/PROJECT_ID/locations/global/jobs"

Where:

PROJECT_ID is the ID or number of the project. For example, my-project.

Cancel a storage batch operations job

This section describes how to cancel a storage batch operations job within a project.

Roles required

To get the required permissions for canceling a storage batch operations job, ask your administrator to grant you Storage Admin (roles/storage.admin) IAM role on the project. This predefined role contains the following permissions required to cancel a storage batch operations job:

storagebatchoperations.jobs.cancel
storagebatchoperations.operations.cancel

Command line

In the Google Cloud console, activate Cloud Shell.

Activate Cloud Shell

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
In your development environment, run the gcloud storage batch-operations jobs cancel command.
```
gcloud storage batch-operations jobs cancel JOB_ID
```
Where:

JOB_ID is the name of the storage batch operations job.

Client libraries

C++

For more information, see the Cloud Storage C++ API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

[](google::cloud::storagebatchoperations_v1::StorageBatchOperationsClient
       client,
   std::string const& project_id, std::string const& job_id) {
  auto const parent =
      std::string{"projects/"} + project_id + "/locations/global";
  auto const name = parent + "/jobs/" + job_id;
  auto response = client.CancelJob(name);
  if (!response) throw response.status();
  std::cout << "Cancelled job: " << name << "\n";
}

PHP

For more information, see the Cloud Storage PHP API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

use Google\Cloud\StorageBatchOperations\V1\Client\StorageBatchOperationsClient;
use Google\Cloud\StorageBatchOperations\V1\CancelJobRequest;

/**
 * Cancel a batch job.
 *
 * @param string $projectId Your Google Cloud project ID.
 *        (e.g. 'my-project-id')
 * @param string $jobId A unique identifier for this job.
 *        (e.g. '94d60cc1-2d95-41c5-b6e3-ff66cd3532d5')
 */
function cancel_job(string $projectId, string $jobId): void
{
    // Create a client.
    $storageBatchOperationsClient = new StorageBatchOperationsClient();

    $parent = $storageBatchOperationsClient->locationName($projectId, 'global');
    $formattedName = $parent . '/jobs/' . $jobId;

    $request = new CancelJobRequest([
        'name' => $formattedName,
    ]);

    $storageBatchOperationsClient->cancelJob($request);

    printf('Cancelled job: %s', $formattedName);
}

REST APIs

JSON API

Have gcloud CLI installed and initialized , which lets you generate an access token for the Authorization header.

Use cURL to call the JSON API with a CANCEL a storage batch operations job request:
```
curl -X CANCEL \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://storagebatchoperations.googleapis.com/v1/projects/PROJECT_ID/locations/global/jobs/JOB_ID"
```
Where:
- PROJECT_ID is the ID or number of the project. For example, my-project.
- JOB_ID is the name of the storage batch operations job.

Delete a storage batch operations job

This section describes how to delete a storage batch operations job.

Roles required

To get the required permission for deleting a storage batch operations job, ask your administrator to grant you Storage Admin (roles/storage.admin) IAM role on the project. This predefined role contains the storagebatchoperations.jobs.delete permission required to delete a storage batch operations job.

Command line

In the Google Cloud console, activate Cloud Shell.

Activate Cloud Shell

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
In your development environment, run the gcloud storage batch-operations jobs delete command.
```
gcloud storage batch-operations jobs delete JOB_ID
```
Where:

JOB_ID is the name of the storage batch operations job.

Client libraries

C++

For more information, see the Cloud Storage C++ API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

[](google::cloud::storagebatchoperations_v1::StorageBatchOperationsClient
       client,
   std::string const& project_id, std::string const& job_id) {
  auto const parent =
      std::string{"projects/"} + project_id + "/locations/global";
  auto const name = parent + "/jobs/" + job_id;
  auto status = client.DeleteJob(name);
  if (!status.ok()) throw status;
  std::cout << "Deleted job: " << name << "\n";
}

PHP

For more information, see the Cloud Storage PHP API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

use Google\Cloud\StorageBatchOperations\V1\Client\StorageBatchOperationsClient;
use Google\Cloud\StorageBatchOperations\V1\DeleteJobRequest;

/**
 * Delete a batch job.
 *
 * @param string $projectId Your Google Cloud project ID.
 *        (e.g. 'my-project-id')
 * @param string $jobId A unique identifier for this job.
 *        (e.g. '94d60cc1-2d95-41c5-b6e3-ff66cd3532d5')
 */
function delete_job(string $projectId, string $jobId): void
{
    // Create a client.
    $storageBatchOperationsClient = new StorageBatchOperationsClient();

    $parent = $storageBatchOperationsClient->locationName($projectId, 'global');
    $formattedName = $parent . '/jobs/' . $jobId;

    $request = new DeleteJobRequest([
        'name' => $formattedName,
    ]);

    $storageBatchOperationsClient->deleteJob($request);

    printf('Deleted job: %s', $formattedName);
}

REST APIs

JSON API

Have gcloud CLI installed and initialized , which lets you generate an access token for the Authorization header.

Use cURL to call the JSON API with a DELETE a storage batch operations job request:
```
curl -X DELETE \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://storagebatchoperations.googleapis.com/v1/projects/PROJECT_ID/locations/global/jobs/JOB_ID"
```
Where:
- PROJECT_ID is the ID or number of the project. For example, my-project.
- JOB_ID is the name of the storage batch operations job.

Create a storage batch operations job using Storage Insights datasets

To create a storage batch operations job using Storage Insights datasets, complete the steps in the following sections.

Roles required

To get the required permissions for creating storage batch operations jobs, ask your administrator to grant you the Storage Admin (roles/storage.admin) IAM role on the project. This predefined role contains the following permissions required to create storage batch operations jobs:

storagebatchoperations.jobs.create
storage.objects.delete (Only required if running the delete objects storage batch operations job)
storage.objects.update (Only required if running the update object metadata, update object customer-managed encryption key, or update object hold storage batch operations job)

Create a manifest using Storage Insights datasets

You can create the manifest for your storage batch operations job by extracting data from BigQuery. To do so, you'll need to query the linked dataset, export the resulting data as a CSV file, and save it to a Cloud Storage bucket. The storage batch operations job can then use this CSV file as its manifest.

Running the following SQL query in BigQuery on a Storage Insights dataset view retrieves objects larger than 1 KiB that are named Temp_Training:

  EXPORT DATA OPTIONS(
   uri=`URI`,
   format=`CSV`,
   overwrite=OVERWRITE_VALUE,
   field_delimiter=',') AS
  SELECT bucket, name, generation
  FROM DATASET_VIEW_NAME
  WHERE bucket = BUCKET_NAME
  AND name LIKE (`Temp_Training%`)
  AND size > 1024 * 1024
  AND snapshotTime = SNAPSHOT_TIME

Where:

URI is the URI to the bucket that contains the manifest. For example, gs://bucket_name/path_to_csv_file/*.csv. When you use the *.csv wildcard, BigQuery exports the result to multiple CSV files.
OVERWRITE_VALUE is a boolean value. If set to true, the export operation overwrites existing files at the specified location.
DATASET_VIEW_NAME is the fully qualified name of the Storage Insights dataset view in PROJECT_ID.DATASET_ID.VIEW_NAME format. To find the name of your dataset, view the linked dataset.

Where:
- PROJECT_ID is the ID or number of the project. For example, my-project.
- DATASET_ID is the name of the dataset. For example, objects-deletion-dataset.
- VIEW_NAME is the name of the dataset view. For example, bucket_attributes_view.
BUCKET_NAME is the name of the bucket. For example, my-bucket.
SNAPSHOT_TIME is the snapshot time of the Storage Insights dataset view. For example, 2024-09-10T00:00:00Z.

Create a storage batch operations job

To create a storage batch operations job to process objects contained in the manifest, complete the following steps:

Command line

In the Google Cloud console, activate Cloud Shell.

Activate Cloud Shell

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
In your development environment, run the gcloud storage batch-operations jobs create command:
```
gcloud storage batch-operations jobs create \
JOB_ID \
--bucket=SOURCE_BUCKET_NAME \
--manifest-location=URI \
--JOB_TYPE_FLAG
```
Where:
- JOB_ID is the name of the storage batch operations job.
- SOURCE_BUCKET_NAME is the bucket that contains one or more objects you want to process. For example, my-bucket.
- URI is the URI to the bucket that contains the manifest. For example, gs://bucket_name/path_to_csv_file/*.csv. When you use the *.csv wildcard, BigQuery exports the result to multiple CSV files.
- JOB_TYPE_FLAG is one of the following flags, depending on the job type.
  - --delete-object: Delete one or more objects.
  - --put-metadata: Update object metadata. Object metadata is stored as key-value pairs. Specify the key-value pair for the metadata you want to modify. You can specify one or more key-value pairs as a list. You can also provide object retention configurations using the --put-metadata flag.
  - --rewrite-object: Update the customer-managed encryption keys for one or more objects.
  - --put-object-event-based-hold: Enable event-based object holds.
  - --no-put-object-event-based-hold: Disable event-based object holds.
  - --put-object-temporary-hold: Enable temporary object holds.
  - --no-put-object-temporary-hold: Disable temporary object holds.

Integration with VPC Service Controls

You can provide an additional layer of security for storage batch operations resources by using VPC Service Controls. When you use VPC Service Controls, you add projects to service perimeters that protect resources and services from requests that originate from outside of the perimeter. To learn more about VPC Service Controls service perimeter details for storage batch operations, see Supported products and limitations.

Use Cloud Audit Logs for storage batch operations jobs

Storage batch operations jobs record transformations on Cloud Storage objects in Cloud Storage Cloud Audit Logs. You can use Cloud Audit Logs with Cloud Storage to track the object transformations that storage batch operations jobs perform. For information about enabling audit logs, see Enabling audit logs. In the audit log entry, the callUserAgent metadata field with the value StorageBatchOperations indicates a storage batch operations transformation.

Next Steps

Learn about Storage Insights datasets