Create Cloud Storage subscriptions

This document describes how to create a Cloud Storage subscription. You can use the Google Cloud console, the Google Cloud CLI, the client library, or the Pub/Sub API to create a Cloud Storage subscription.

Before you begin

Before reading this document, ensure that you're familiar with Pub/Sub topics and subscriptions and with Cloud Storage buckets.

Required roles and permissions

The following is a list of guidelines regarding roles and permissions:

  • To create a subscription, you must configure access control at the project level.

  • You also need resource-level permissions if your subscriptions and topics are in different projects, as discussed later in this section.

  • To create a Cloud Storage subscription, the Pub/Sub service account must have permission to write to the specific Cloud Storage bucket and to read the bucket metadata. For more information about how to grant these permissions, see the next section of this document.

To get the permissions that you need to create Cloud Storage subscriptions, ask your administrator to grant you the Pub/Sub Editor (roles/pubsub.editor) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to create Cloud Storage subscriptions. The exact permissions are listed in the following Required permissions section.

Required permissions

The following permissions are required to create Cloud Storage subscriptions:

  • Create a subscription: pubsub.subscriptions.create
  • Attach a subscription to a topic: pubsub.topics.attachSubscription
  • Pull from a subscription: pubsub.subscriptions.consume
  • Get a subscription: pubsub.subscriptions.get
  • List subscriptions: pubsub.subscriptions.list
  • Update a subscription: pubsub.subscriptions.update
  • Delete a subscription: pubsub.subscriptions.delete
  • Get the IAM policy for a subscription: pubsub.subscriptions.getIamPolicy
  • Configure the IAM policy for a subscription: pubsub.subscriptions.setIamPolicy

You might also be able to get these permissions with custom roles or other predefined roles.

If you need to create Cloud Storage subscriptions in one project that are associated with a topic in another project, ask your topic administrator to also grant you the Pub/Sub Editor (roles/pubsub.editor) IAM role on the topic.

Assign Cloud Storage roles to the Pub/Sub service account

Some Google Cloud services have Google Cloud-managed service accounts that let the services access your resources. These service accounts are known as service agents. Pub/Sub creates and maintains a service account for each project in the format service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com.

To create a Cloud Storage subscription, the Pub/Sub service account must have permission to write to the specific Cloud Storage bucket and to read the bucket metadata. Choose one of the following procedures:

  • Grant permissions at the bucket level. On the specific Cloud Storage bucket, grant the Storage Object Creator (roles/storage.objectCreator) role and the Storage Legacy Bucket Reader (roles/storage.legacyBucketReader) role to the Pub/Sub service account.

  • If you must grant roles at the project level, you might instead grant the Storage Admin (roles/storage.admin) role on the project containing the Cloud Storage bucket. Grant this role to the Pub/Sub service account.
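As a rough illustration of the naming convention and the bucket-level option above, the following Python sketch derives the service account address from a project number and lists the two role bindings to grant. The function names are hypothetical, and the sketch only builds strings and dictionaries; it doesn't call any Google Cloud API.

```python
def pubsub_service_agent(project_number):
    """Return the Pub/Sub service account email for a project number."""
    if not project_number.isdigit():
        raise ValueError("expected a numeric project number, not a project ID")
    return f"service-{project_number}@gcp-sa-pubsub.iam.gserviceaccount.com"


def bucket_level_bindings(project_number):
    """Return the two bucket-level role bindings described in this section."""
    member = f"serviceAccount:{pubsub_service_agent(project_number)}"
    roles = ["roles/storage.objectCreator", "roles/storage.legacyBucketReader"]
    return [{"role": role, "members": [member]} for role in roles]


if __name__ == "__main__":
    # Hypothetical project number taken from the example in this document.
    for binding in bucket_level_bindings("112233445566"):
        print(binding["role"], "->", binding["members"][0])
```

The bindings use the same role-and-members shape as an IAM policy, so they can serve as a checklist when you grant the roles in the console.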

Bucket permissions

Perform the following steps to grant the Storage Object Creator (roles/storage.objectCreator) and Storage Legacy Bucket Reader (roles/storage.legacyBucketReader) roles at the bucket level:

  1. In the Google Cloud console, go to the Cloud Storage page.

  2. Click the Cloud Storage bucket to which you would like to write messages.

    The Bucket details page opens.

  3. In the Bucket details page, click the Permissions tab.

  4. In the Permissions > View by Principals tab, click Grant access.

    The Grant access page opens.

  5. In the Add Principals section, enter the name of your Pub/Sub service account.

    The format of the service account is service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com. For example, for a project with PROJECT_NUMBER=112233445566, the service account is service-112233445566@gcp-sa-pubsub.iam.gserviceaccount.com.

  6. In the Assign roles > Select a role drop-down, enter Creator and select the Storage Object Creator role.

  7. Click Add another role.

  8. In the Select a role drop-down, enter Bucket Reader, and select the Storage Legacy Bucket Reader role.

  9. Click Save.

Project permissions

Perform the following steps to grant the Storage Admin (roles/storage.admin) role at the project level:

  1. In the Google Cloud console, go to the IAM page.

  2. In the Permissions > View by Principals tab, click Grant access.

    The Grant access page opens.

  3. In the Add Principals section, enter the name of your Pub/Sub service account.

    The format of the service account is service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com. For example, for a project with PROJECT_NUMBER=112233445566, the service account is service-112233445566@gcp-sa-pubsub.iam.gserviceaccount.com.

  4. In the Assign roles > Select a role drop-down, enter Storage Admin and select the Storage Admin role.

  5. Click Save.

For more information about Cloud Storage IAM, see Cloud Storage Identity and Access Management.

Cloud Storage subscription properties

When you configure a Cloud Storage subscription, you must specify the properties common to all subscription types and some additional Cloud Storage subscription-specific properties.

Common subscription properties

Learn about the common subscription properties that you can set across all subscriptions.

Bucket name

A Cloud Storage bucket must already exist before you create a Cloud Storage subscription.

The messages are sent as batches and stored in the Cloud Storage bucket. A single batch or file is stored as an object in the bucket.

The Cloud Storage bucket must have Requester Pays disabled.

To create a Cloud Storage bucket, see Create buckets.

Filename prefix, suffix, and datetime

The output Cloud Storage files generated by the Cloud Storage subscription are stored as objects in the Cloud Storage bucket. The name of the object stored in the Cloud Storage bucket is of the following format: <file-prefix><UTC-date-time>_<uuid><file-suffix>.

The following list includes details of the file format and the fields that you can customize:

  • <file-prefix> is the custom filename prefix. This is an optional field.

  • <UTC-date-time> is a customizable auto-generated string based on the time the object is created.

  • <uuid> is an auto-generated random string for the object.

  • <file-suffix> is the custom filename suffix. This is an optional field. The filename suffix cannot end in "/".

  • You can change the filename prefix and suffix:

    • For example, if the value of the filename prefix is prod_ and the value of the filename suffix is _archive, a sample object name is prod_2023-09-25T04:10:00+00:00_uNiQuE_archive.

    • If you don't specify the filename prefix and suffix, the object name stored in the Cloud Storage bucket is of the format: <UTC-date-time>_<uuid>.

    • Cloud Storage object naming requirements also apply to the filename prefix and suffix. For more information, see About Cloud Storage objects.

  • You can change how the date and time are displayed in the filename:

    • Required datetime matchers, each of which you can use only once: year (YYYY or YY), month (MM), day (DD), hour (hh), minute (mm), second (ss). For example, YY-YYYY and MMM are invalid.

    • Optional matchers that you can use only once: datetime separator (T) and timezone offset (Z or +00:00).

    • Optional elements that you can use multiple times: hyphen (-), underscore (_), colon (:), and forward slash (/).

    • For example, if the value of the filename datetime format is YYYY-MM-DD/hh_mm_ssZ, a sample object name is prod_2023-09-25/04_10_00Z_uNiQuE_archive.

    • If the filename datetime format ends in a character that is not a matcher, that character replaces the separator between <UTC-date-time> and <uuid>. For example, if the value of the filename datetime format is YYYY-MM-DDThh_mm_ss-, a sample object name is prod_2023-09-25T04_10_00-uNiQuE_archive.
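The naming rules above can be sketched in Python as follows. This is an illustrative model rather than the service's implementation: the function names are hypothetical, the default datetime format is an assumption inferred from the sample name 2023-09-25T04:10:00+00:00, and the UUID is passed in as a plain string.

```python
from datetime import datetime, timezone

# (token, group, renderer). Longer tokens come first so "YYYY" wins over "YY".
_MATCHERS = [
    ("+00:00", "offset", lambda t: "+00:00"),
    ("YYYY", "year", lambda t: f"{t.year:04d}"),
    ("YY", "year", lambda t: f"{t.year % 100:02d}"),
    ("MM", "month", lambda t: f"{t.month:02d}"),
    ("DD", "day", lambda t: f"{t.day:02d}"),
    ("hh", "hour", lambda t: f"{t.hour:02d}"),
    ("mm", "minute", lambda t: f"{t.minute:02d}"),
    ("ss", "second", lambda t: f"{t.second:02d}"),
    ("T", "datetime_separator", lambda t: "T"),
    ("Z", "offset", lambda t: "Z"),
]
_REQUIRED = {"year", "month", "day", "hour", "minute", "second"}
_REPEATABLE = "-_:/"
# Assumed default, inferred from the sample object name in this section.
DEFAULT_FORMAT = "YYYY-MM-DDThh:mm:ss+00:00"


def render_datetime(fmt, t):
    """Render fmt for the UTC time t; return (text, ends_with_matcher)."""
    out, seen, i, ends_with_matcher = [], set(), 0, False
    while i < len(fmt):
        for token, group, render in _MATCHERS:
            if fmt.startswith(token, i):
                if group in seen:
                    raise ValueError(f"{group} matcher used more than once")
                seen.add(group)
                out.append(render(t))
                i += len(token)
                ends_with_matcher = True
                break
        else:
            # Not a matcher: only the repeatable elements are allowed here.
            if fmt[i] not in _REPEATABLE:
                raise ValueError(f"unsupported character {fmt[i]!r}")
            out.append(fmt[i])
            i += 1
            ends_with_matcher = False
    missing = _REQUIRED - seen
    if missing:
        raise ValueError(f"missing required matchers: {sorted(missing)}")
    return "".join(out), ends_with_matcher


def object_name(t, uuid, prefix="", suffix="", datetime_format=DEFAULT_FORMAT):
    """Build <file-prefix><UTC-date-time>_<uuid><file-suffix> for UTC time t."""
    if suffix.endswith("/"):
        raise ValueError("filename suffix cannot end in '/'")
    rendered, ends_with_matcher = render_datetime(datetime_format, t)
    # A trailing non-matcher character replaces the default "_" separator.
    separator = "_" if ends_with_matcher else ""
    return f"{prefix}{rendered}{separator}{uuid}{suffix}"


if __name__ == "__main__":
    when = datetime(2023, 9, 25, 4, 10, 0, tzinfo=timezone.utc)
    print(object_name(when, "uNiQuE", "prod_", "_archive", "YYYY-MM-DD/hh_mm_ssZ"))
```

With the time 2023-09-25 04:10:00 UTC, the sketch reproduces the three sample object names given in this section.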

File batching

Cloud Storage subscriptions let you decide when you want to create a new output file that is stored as an object in the Cloud Storage bucket. Pub/Sub writes an output file when one of the specified batching conditions is met. The following are the Cloud Storage batching conditions:

  • Storage batch max duration. This is a required setting. The Cloud Storage subscription writes a new output file if the specified value of max duration is exceeded. If you don't specify the value, a default value of 5 minutes is applied. The following are the applicable values for max duration:

    • Minimum value = 1 minute
    • Default value = 5 minutes
    • Maximum value = 10 minutes

  • Storage batch max bytes. This is an optional setting. The Cloud Storage subscription writes a new output file if the specified value of max bytes is exceeded. The following are the applicable values for max bytes:

    • Minimum value = 1 KB
    • Maximum value = 10 GiB

  • Storage batch max messages. This is an optional setting. The Cloud Storage subscription writes a new output file if the specified number of max messages is exceeded. The following are the applicable values for max messages:

    • Minimum value = 1000

For example, you can configure max duration as 6 minutes and max bytes as 2 GB. If the output file reaches a size of 2 GB at the 4th minute, Pub/Sub finalizes that file and starts writing to a new file.

A Cloud Storage subscription might write to multiple files in a Cloud Storage bucket simultaneously. If you have configured your subscription to create a new file every 6 minutes, you might observe multiple Cloud Storage files being created every 6 minutes.

In some situations, Pub/Sub might start writing to a new file earlier than the time configured by the file batching conditions. A file might also exceed the Max bytes value if the subscription receives messages larger than the Max bytes value.
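To make the interaction between the three conditions concrete, here is a minimal Python sketch that validates a set of batching values against the limits listed above and reports when any one condition has been reached. The class and function names are hypothetical. Reading "1 KB" as 1000 bytes and treating a limit as met when it is reached (rather than strictly exceeded) are both assumptions. As noted above, the service might also start a new file earlier than these conditions suggest, which the sketch doesn't model.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

MIN_DURATION = timedelta(minutes=1)
MAX_DURATION = timedelta(minutes=10)
MIN_BYTES = 1_000           # assumption: "1 KB" read as 1000 bytes
MAX_BYTES = 10 * 1024 ** 3  # 10 GiB
MIN_MESSAGES = 1000


@dataclass(frozen=True)
class BatchSettings:
    max_duration: timedelta = timedelta(minutes=5)  # documented default
    max_bytes: Optional[int] = None                 # optional setting
    max_messages: Optional[int] = None              # optional setting

    def __post_init__(self):
        if not MIN_DURATION <= self.max_duration <= MAX_DURATION:
            raise ValueError("max duration must be between 1 and 10 minutes")
        if self.max_bytes is not None and not MIN_BYTES <= self.max_bytes <= MAX_BYTES:
            raise ValueError("max bytes must be between 1 KB and 10 GiB")
        if self.max_messages is not None and self.max_messages < MIN_MESSAGES:
            raise ValueError("max messages must be at least 1000")


def should_start_new_file(settings, elapsed, size_bytes, message_count):
    """Return True when any configured batching condition has been reached."""
    if elapsed >= settings.max_duration:
        return True
    if settings.max_bytes is not None and size_bytes >= settings.max_bytes:
        return True
    if settings.max_messages is not None and message_count >= settings.max_messages:
        return True
    return False


if __name__ == "__main__":
    # The example from this section: 6 minutes and 2 GB, file full at minute 4.
    settings = BatchSettings(max_duration=timedelta(minutes=6), max_bytes=2_000_000_000)
    print(should_start_new_file(settings, timedelta(minutes=4), 2_000_000_000, 10))
```

Because the conditions are combined with a logical OR, whichever limit is hit first closes the file.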

File format

When you create a Cloud Storage subscription, you can specify the format of the output files that are to be stored in a Cloud Storage bucket as Text or Avro.

  • Text: The messages are stored as plain text. A newline character separates a message from the previous message in the file. Only message payloads are stored, not attributes or other metadata.

  • Avro: The messages are stored in Apache Avro binary format. When you select Avro, you can enable the following additional properties:

    • Write metadata: This option lets you store the message metadata along with the message. The subscription_name, message_id, publish_time, and attributes fields are written as top-level fields in the output Avro object. All other message properties except data (for example, ordering_key, if present) are added as entries in the attributes map.

      If write metadata is disabled, only the message payload is written to the output Avro object. Here is the Avro schema for the output messages with write metadata disabled:

      {
        "type": "record",
        "namespace": "com.google.pubsub",
        "name": "PubsubMessage",
        "fields": [
          { "name": "data", "type": "bytes" }
        ]
      }
      

      Here is the Avro schema for the output messages with write metadata enabled:

      {
        "type": "record",
        "namespace": "com.google.pubsub",
        "name": "PubsubMessageWithMetadata",
        "fields": [
          { "name": "subscription_name", "type": "string" },
          { "name": "message_id", "type": "string"  },
          { "name": "publish_time", "type": {
              "type": "long",
              "logicalType": "timestamp-micros"
            }
          },
          { "name": "attributes", "type": { "type": "map", "values": "string" } },
          { "name": "data", "type": "bytes" }
        ]
      }
      
    • Use topic schema: This option lets Pub/Sub use the schema of the Pub/Sub topic to which the subscription is attached when writing Avro files.

      When you use this option, remember to check the following additional requirements:

      • The topic schema must be in Apache Avro format.

      • If both use topic schema and write metadata are enabled, the topic schema must have a Record object at its root. Pub/Sub will expand the Record's list of fields to include the metadata fields. As a result, the Record cannot contain any fields with the same name as the metadata fields (subscription_name, message_id, publish_time, or attributes).
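The last requirement can be checked before you create the subscription. The following Python sketch, with a hypothetical function name, takes an Avro schema definition as JSON text and reports whether it has a record at its root and whether any field collides with a metadata field name. It inspects only the JSON structure and doesn't validate the schema as Avro.

```python
import json

METADATA_FIELDS = ("subscription_name", "message_id", "publish_time", "attributes")


def check_topic_schema_for_metadata(avro_schema_json):
    """Return a list of problems; an empty list means these checks passed."""
    schema = json.loads(avro_schema_json)
    if not isinstance(schema, dict) or schema.get("type") != "record":
        return ["the topic schema must have a record at its root"]
    names = {field["name"] for field in schema.get("fields", [])}
    return [
        f"field {name!r} collides with a metadata field"
        for name in METADATA_FIELDS
        if name in names
    ]


if __name__ == "__main__":
    # Hypothetical topic schema that reuses a metadata field name.
    topic_schema = json.dumps({
        "type": "record",
        "name": "Order",
        "fields": [
            {"name": "order_id", "type": "string"},
            {"name": "message_id", "type": "string"},
        ],
    })
    for problem in check_topic_schema_for_metadata(topic_schema):
        print(problem)
```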

Create a Cloud Storage subscription

Console

  1. In the Google Cloud console, go to the Subscriptions page.

  2. Click Create subscription.

  3. For the Subscription ID field, enter a name.

    For information about how to name a subscription, see Guidelines to name a topic or a subscription.

  4. Choose or create a topic from the drop-down menu.

    The subscription receives messages from the topic.

    For information about how to create a topic, see Create and manage topics.

  5. Select Delivery type as Write to Cloud Storage.

  6. For the Cloud Storage bucket, click Browse.

    • You can select an existing bucket from any appropriate project.

    • You can also click the create icon and follow the instructions on the screen to create a new bucket.

      After you create the bucket, select the bucket for the Cloud Storage subscription.

      For more information about how to create a bucket, see Create buckets.

    When you specify the bucket, Pub/Sub checks for the appropriate permissions on the bucket for the Pub/Sub service account. If there are permissions issues, you see a message similar to the following: Unable to verify if the Pub/Sub service agent has write permissions on this bucket. You may be lacking permissions to view or set permissions.

  7. If you get permission issues, click Set Permission and follow the on-screen instructions.

    Alternatively, follow the instructions in Assign Cloud Storage roles to the Pub/Sub service account.

  8. For File format, select Text or Avro.

    If you select Avro, you can also optionally specify if you want to store the message metadata in the output.

    For more information about the two options including the message metadata option for the Avro format, see File format.

  9. Optional: You can specify the File name prefix, suffix, and datetime for all your files that are to be written to the Cloud Storage bucket. A file is stored as an object in the bucket.

    For more information about how to set the file prefix, suffix, and datetime, see Filename prefix, suffix, and datetime.

  10. For File batching, specify a maximum time to elapse before creating a new file.

    You can also optionally set the maximum file size or maximum number of messages for the files.

    For more information about both file batching options, see File batching.

  11. We strongly recommend that you enable Dead lettering to handle message failures.

    For more information, see Dead letter topic.

  12. Keep the other settings at their defaults, and then click Create.

gcloud

  1. In the Google Cloud console, activate Cloud Shell.

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  2. To create a Cloud Storage subscription, run the gcloud pubsub subscriptions create command:

    gcloud pubsub subscriptions create SUBSCRIPTION_ID \
        --topic=TOPIC_ID \
        --cloud-storage-bucket=BUCKET_NAME \
        --cloud-storage-file-prefix=CLOUD_STORAGE_FILE_PREFIX \
        --cloud-storage-file-suffix=CLOUD_STORAGE_FILE_SUFFIX \
        --cloud-storage-file-datetime-format=CLOUD_STORAGE_FILE_DATETIME_FORMAT \
        --cloud-storage-max-duration=CLOUD_STORAGE_MAX_DURATION \
        --cloud-storage-max-bytes=CLOUD_STORAGE_MAX_BYTES \
        --cloud-storage-max-messages=CLOUD_STORAGE_MAX_MESSAGES \
        --cloud-storage-output-format=CLOUD_STORAGE_OUTPUT_FORMAT \
        --cloud-storage-write-metadata \
        --cloud-storage-use-topic-schema

    In the command, only SUBSCRIPTION_ID, the --topic flag, and the --cloud-storage-bucket flag are required. The remaining flags are optional and can be omitted.

    Replace the following:

    • SUBSCRIPTION_ID: The name or ID of your new Cloud Storage subscription.
    • TOPIC_ID: The name or ID of your topic.
    • BUCKET_NAME: Specifies the name of an existing bucket. For example, prod_bucket. The bucket name must not include the project ID. To create a bucket, see Create buckets.
    • CLOUD_STORAGE_FILE_PREFIX: Specifies the prefix for the Cloud Storage filename. For example, log_events_.
    • CLOUD_STORAGE_FILE_SUFFIX: Specifies the suffix for the Cloud Storage filename. For example, .txt.
    • CLOUD_STORAGE_FILE_DATETIME_FORMAT: Specifies the datetime format for the Cloud Storage filename. For example, YYYY-MM-DD/hh_mm_ssZ.
    • CLOUD_STORAGE_MAX_DURATION: The maximum duration that can elapse before a new Cloud Storage file is created. The value must be between 1m and 10m. For example, 5m.
    • CLOUD_STORAGE_MAX_BYTES: The maximum bytes that can be written to a Cloud Storage file before a new file is created. The value must be between 1KB and 10GB. For example, 20MB.
    • CLOUD_STORAGE_MAX_MESSAGES: The maximum number of messages that can be written to a Cloud Storage file before a new file is created. The value must be greater than or equal to 1000. For example, 100000.
    • CLOUD_STORAGE_OUTPUT_FORMAT: The output format for data written to Cloud Storage. Values are as follows:
      • text: Messages are written as raw text, separated by a newline.
      • avro: Messages are written as an Avro binary. --cloud-storage-write-metadata and --cloud-storage-use-topic-schema only affect subscriptions with output format avro.
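If you script this command, a small helper can make the split between required and optional flags explicit. The following Python sketch assembles the argument list using only the flags shown above. The helper name and its keyword arguments are hypothetical, and rejecting the two Avro-only flags when the output format isn't avro is a choice made for this sketch, not documented gcloud behavior.

```python
def build_create_command(subscription_id, topic_id, bucket, *,
                         file_prefix=None, file_suffix=None,
                         file_datetime_format=None, max_duration=None,
                         max_bytes=None, max_messages=None,
                         output_format=None, write_metadata=False,
                         use_topic_schema=False):
    """Return the gcloud argument list for creating a Cloud Storage subscription."""
    # Required: the subscription ID, the topic, and the bucket.
    args = ["gcloud", "pubsub", "subscriptions", "create", subscription_id,
            f"--topic={topic_id}", f"--cloud-storage-bucket={bucket}"]
    # Optional value flags are emitted only when a value is supplied.
    optional = {
        "--cloud-storage-file-prefix": file_prefix,
        "--cloud-storage-file-suffix": file_suffix,
        "--cloud-storage-file-datetime-format": file_datetime_format,
        "--cloud-storage-max-duration": max_duration,
        "--cloud-storage-max-bytes": max_bytes,
        "--cloud-storage-max-messages": max_messages,
        "--cloud-storage-output-format": output_format,
    }
    args += [f"{flag}={value}" for flag, value in optional.items() if value is not None]
    # Sketch-level guard: these two flags only affect avro output.
    if (write_metadata or use_topic_schema) and output_format != "avro":
        raise ValueError("write_metadata and use_topic_schema require output_format='avro'")
    if write_metadata:
        args.append("--cloud-storage-write-metadata")
    if use_topic_schema:
        args.append("--cloud-storage-use-topic-schema")
    return args


if __name__ == "__main__":
    # Hypothetical IDs; prints the command instead of running it.
    print(" ".join(build_create_command(
        "my-sub", "my-topic", "prod_bucket",
        max_duration="5m", output_format="avro", write_metadata=True)))
```

The returned list can be passed to subprocess.run on a machine where the Google Cloud CLI is installed and authenticated.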

C++

Before trying this sample, follow the C++ setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub C++ API reference documentation.

To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

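#include "google/cloud/pubsub/admin/subscription_admin_client.h"
#include "google/cloud/pubsub/subscription.h"
#include "google/cloud/pubsub/topic.h"
#include <iostream>
#include <string>
#include <utility>

namespace pubsub = ::google::cloud::pubsub;
namespace pubsub_admin = ::google::cloud::pubsub_admin;

// Creates a subscription on `topic_id` that writes messages to `bucket`.
void CreateCloudStorageSubscription(
    pubsub_admin::SubscriptionAdminClient client,
    std::string const& project_id, std::string const& topic_id,
    std::string const& subscription_id, std::string const& bucket) {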
  google::pubsub::v1::Subscription request;
  request.set_name(
      pubsub::Subscription(project_id, subscription_id).FullName());
  request.set_topic(pubsub::Topic(project_id, topic_id).FullName());
  request.mutable_cloud_storage_config()->set_bucket(bucket);
  auto sub = client.CreateSubscription(request);
  if (!sub) {
    if (sub.status().code() == google::cloud::StatusCode::kAlreadyExists) {
      std::cout << "The subscription already exists\n";
      return;
    }
    throw std::move(sub).status();
  }

  std::cout << "The subscription was successfully created: "
            << sub->DebugString() << "\n";
}

Go

Before trying this sample, follow the Go setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Go API reference documentation.

To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import (
	"context"
	"fmt"
	"io"
	"time"

	"cloud.google.com/go/pubsub"
)

// createCloudStorageSubscription creates a Pub/Sub subscription that exports messages to Cloud Storage.
func createCloudStorageSubscription(w io.Writer, projectID, subID string, topic *pubsub.Topic, bucket string) error {
	// projectID := "my-project-id"
	// subID := "my-sub"
	// topic of type https://godoc.org/cloud.google.com/go/pubsub#Topic
	// note bucket should not have the gs:// prefix
	// bucket := "my-bucket"
	ctx := context.Background()
	client, err := pubsub.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("pubsub.NewClient: %w", err)
	}
	defer client.Close()

	sub, err := client.CreateSubscription(ctx, subID, pubsub.SubscriptionConfig{
		Topic: topic,
		CloudStorageConfig: pubsub.CloudStorageConfig{
			Bucket:         bucket,
			FilenamePrefix: "log_events_",
			FilenameSuffix: ".avro",
			OutputFormat:   &pubsub.CloudStorageOutputFormatAvroConfig{WriteMetadata: true},
			MaxDuration:    1 * time.Minute,
			MaxBytes:       1e8,
		},
	})
	if err != nil {
		return fmt.Errorf("client.CreateSubscription: %w", err)
	}
	fmt.Fprintf(w, "Created Cloud Storage subscription: %v\n", sub)

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Java API reference documentation.

To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.cloud.pubsub.v1.SubscriptionAdminClient;
import com.google.protobuf.Duration;
import com.google.pubsub.v1.CloudStorageConfig;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.ProjectTopicName;
import com.google.pubsub.v1.Subscription;
import java.io.IOException;

public class CreateCloudStorageSubscriptionExample {
  public static void main(String... args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String topicId = "your-topic-id";
    String subscriptionId = "your-subscription-id";
    String bucket = "your-bucket";
    String filenamePrefix = "log_events_";
    String filenameSuffix = ".text";
    Duration maxDuration = Duration.newBuilder().setSeconds(300).build();

    createCloudStorageSubscription(
        projectId, topicId, subscriptionId, bucket, filenamePrefix, filenameSuffix, maxDuration);
  }

  public static void createCloudStorageSubscription(
      String projectId,
      String topicId,
      String subscriptionId,
      String bucket,
      String filenamePrefix,
      String filenameSuffix,
      Duration maxDuration)
      throws IOException {
    try (SubscriptionAdminClient subscriptionAdminClient = SubscriptionAdminClient.create()) {

      ProjectTopicName topicName = ProjectTopicName.of(projectId, topicId);
      ProjectSubscriptionName subscriptionName =
          ProjectSubscriptionName.of(projectId, subscriptionId);

      CloudStorageConfig cloudStorageConfig =
          CloudStorageConfig.newBuilder()
              .setBucket(bucket)
              .setFilenamePrefix(filenamePrefix)
              .setFilenameSuffix(filenameSuffix)
              .setMaxDuration(maxDuration)
              .build();

      Subscription subscription =
          subscriptionAdminClient.createSubscription(
              Subscription.newBuilder()
                  .setName(subscriptionName.toString())
                  .setTopic(topicName.toString())
                  .setCloudStorageConfig(cloudStorageConfig)
                  .build());

      System.out.println("Created a Cloud Storage subscription: " + subscription.getAllFields());
    }
  }
}

Monitor a Cloud Storage subscription

Cloud Monitoring provides a number of metrics to monitor subscriptions.

For a list of all the available metrics related to Pub/Sub and their descriptions, see the Monitoring documentation for Pub/Sub.

You can also monitor subscriptions from within Pub/Sub.
