A Managed Service for Apache Kafka cluster provides an environment for storing and processing streams of messages organized into topics.
To create a cluster you can use the Google Cloud console, the Google Cloud CLI, the client library, or the Managed Kafka API. You can't use the open source Apache Kafka API to create a cluster.
Before you begin
Ensure that you are familiar with the following:
Required roles and permissions to create a cluster
To get the permissions that you need to create a cluster, ask your administrator to grant you the Managed Kafka Cluster Editor (roles/managedkafka.clusterEditor) IAM role on your project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to create a cluster. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to create a cluster:
- Create a cluster: managedkafka.clusters.create
You might also be able to get these permissions with custom roles or other predefined roles.
The Managed Kafka Cluster Editor role does not let you create, delete, or modify topics and consumer groups on Managed Service for Apache Kafka clusters. Nor does it allow data plane access to publish or consume messages within clusters. For more information about this role, see Managed Service for Apache Kafka predefined roles.
Properties of a Managed Service for Apache Kafka cluster
When you create or update a Managed Service for Apache Kafka cluster, you must specify the following properties.
Cluster name
The name or ID of the Managed Service for Apache Kafka cluster that you are creating. For guidelines on how to name a cluster, see Guidelines to name a Managed Service for Apache Kafka resource. The name of a cluster is immutable.
Location
The location where you are creating the cluster. The location must be one of the supported Google Cloud regions. The location of a cluster cannot be changed later. For a list of available locations, see Managed Service for Apache Kafka locations.
Capacity configuration
Capacity configuration requires you to configure the number of vCPUs and the amount of memory for your Kafka setup. For more information on how to configure the capacity of a cluster, see Estimate vCPUs and memory for your Managed Service for Apache Kafka cluster.
The following are the properties for capacity configuration:
vCPUs: The number of vCPUs assigned to a cluster. The minimum value is 3 vCPUs. The number must also be a multiple of 3.
Memory: The amount of memory that is assigned to the cluster. You must provision between 1 GiB and 8 GiB per vCPU. The amount of memory can be increased or decreased within these limits after the cluster is created.
For example, if you create a cluster with 6 vCPUs, the minimum memory you can allocate to the cluster is 6 GiB (1 GiB per vCPU), and the maximum is 48 GiB (8 GiB per vCPU).
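The capacity rules above can be checked before you submit a request. The following is a minimal sketch, not part of the service or its client library; the function name `validate_capacity` is hypothetical, and it only encodes the limits stated in this section (at least 3 vCPUs, a multiple of 3, and 1 GiB to 8 GiB of memory per vCPU).

```python
def validate_capacity(vcpus: int, memory_gib: float) -> list[str]:
    """Return a list of problems with a proposed capacity configuration.

    An empty list means the values satisfy the limits described above.
    """
    problems = []
    if vcpus < 3:
        problems.append("vCPU count must be at least 3")
    if vcpus % 3 != 0:
        problems.append("vCPU count must be a multiple of 3")
    # Memory must fall between 1 GiB and 8 GiB per vCPU.
    min_memory = 1 * vcpus
    max_memory = 8 * vcpus
    if not (min_memory <= memory_gib <= max_memory):
        problems.append(
            f"memory must be between {min_memory} and {max_memory} GiB "
            f"for {vcpus} vCPUs"
        )
    return problems


# 6 vCPUs allow 6 GiB to 48 GiB, so 48 GiB passes and 50 GiB does not.
print(validate_capacity(6, 48))
print(validate_capacity(6, 50))
```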
Network configuration
Network configuration is a list of subnets in the VPCs where the cluster is accessible. The IP addresses of the broker and bootstrap server are automatically allocated in each subnet. In addition, DNS entries for these IP addresses are created in the corresponding VPCs.
The following are some guidelines for your network configuration:
A minimum of 1 subnet is required for a cluster. The maximum is 10.
Each subnet must be in the same region as the cluster. The project or VPC can be different from the cluster's.
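These guidelines can be expressed as a small pre-flight check. The sketch below is illustrative only: `validate_subnets` is a hypothetical helper, and it assumes subnets are given in the `projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_ID` path format that this page uses for subnets.

```python
import re

# Subnet path format used on this page for the subnet fields.
SUBNET_PATH = re.compile(
    r"^projects/([^/]+)/regions/([^/]+)/subnetworks/([^/]+)$"
)


def validate_subnets(cluster_region: str, subnets: list[str]) -> list[str]:
    """Return a list of problems with a proposed network configuration."""
    problems = []
    # A cluster needs at least 1 subnet and at most 10.
    if not 1 <= len(subnets) <= 10:
        problems.append(f"expected 1 to 10 subnets, got {len(subnets)}")
    for path in subnets:
        match = SUBNET_PATH.match(path)
        if match is None:
            problems.append(f"malformed subnet path: {path}")
        elif match.group(2) != cluster_region:
            # The region must match the cluster; the project may differ.
            problems.append(f"subnet {path} is not in region {cluster_region}")
    return problems


print(validate_subnets(
    "us-central1",
    ["projects/other-project/regions/us-central1/subnetworks/default"],
))
```

The project segment is deliberately not compared with anything, because only the region has to match the cluster.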
Labels
Labels are key-value pairs that help you with organization and identification.
Labels let you categorize resources, for example by environment or owner. Examples are "env:production" and "owner:data-engineering".
You can filter and search for resources based on their labels. For example,
assume you have multiple Managed Service for Apache Kafka clusters for
different departments. You can
configure and search for clusters with the label "department:marketing"
to find the relevant one quickly.
Encryption
Managed Service for Apache Kafka can encrypt messages with Google-managed encryption keys (default) or customer-managed encryption keys (CMEK). Every message is encrypted at rest and in transit. The encryption type for a cluster is immutable.
Google-managed encryption keys are used by default. These keys are created, managed, and stored entirely by Google Cloud within its infrastructure.
CMEKs are encryption keys that you manage using Cloud Key Management Service. This feature gives you greater control over the keys that are used to encrypt data at rest within supported Google Cloud services. Using CMEK incurs additional costs related to Cloud Key Management Service. To use CMEK, your key ring must be in the same location as the resources you use it with. For more information, see Configure message encryption.
Estimate vCPUs and memory for your cluster
Your goal is to pick the right capacity configuration. To do that, you must understand the throughput your cluster can handle. This section discusses how to estimate the number of vCPUs and size of memory required for your cluster.
Perform the following steps:
Calculate your write-equivalent data rate.
As a rule of thumb, read traffic is 3-4 times more efficient to process than write traffic. The write-equivalent data rate accounts for this difference and can be calculated as follows:
Write-equivalent rate = (publish rate) + (read rate / 3)
Assume a sample estimate that uses a publish rate of 50 MBps and a read rate of 100 MBps.
Write-equivalent rate = 50 + (100 / 3) = 83.33 MBps
Determine the target vCPU utilization.
Start with an average utilization target of 50% over a 30-minute time period. If you need to account for spiky traffic, decrease your target utilization to 30% or 40% and provision more CPUs. Higher utilization is cheaper but riskier: if traffic exceeds your estimate, you might encounter high latencies, producer back-offs, and potential out-of-memory issues.
Target utilization = 50% or 0.5
Calculate the number of vCPUs required.
Divide your write-equivalent data rate by 10 MBps, which is the estimated capacity for a single vCPU in a single zone.
Divide the result by the target utilization rate that you determined in the previous step.
Multiply by 3 to account for replication across availability zones.
Number of vCPUs = ceiling(83.33 / 10 / 0.5) * 3 = 17 * 3 = 51 vCPUs
Multiply your vCPU count by 4 GB to estimate the required RAM. 4 GB of RAM is recommended for each vCPU.
Amount of memory = 51 * 4 GB = 204 GB RAM
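The estimation steps can be sketched as a short function. This is an illustration rather than an official sizing tool: the name `estimate_capacity` and its parameters are hypothetical, and the defaults simply restate the rules of thumb from this section (reads weighted at one third of writes, 10 MBps per vCPU per zone, replication across 3 zones, 4 GB of RAM per vCPU). It takes the ceiling before multiplying by 3, as the formula is written, so the result is always a multiple of 3.

```python
import math


def estimate_capacity(
    publish_mbps: float,
    read_mbps: float,
    target_utilization: float = 0.5,
    mbps_per_vcpu: float = 10.0,
    read_efficiency: float = 3.0,
    replication_zones: int = 3,
    gb_per_vcpu: int = 4,
) -> dict:
    """Estimate vCPUs and memory from publish and read rates in MBps."""
    # Step 1: write-equivalent rate = publish rate + read rate / 3.
    write_equivalent = publish_mbps + read_mbps / read_efficiency
    # Steps 2 and 3: vCPUs needed in one zone at the target utilization,
    # rounded up, then multiplied by the number of zones.
    per_zone = math.ceil(write_equivalent / mbps_per_vcpu / target_utilization)
    vcpus = per_zone * replication_zones
    # Step 4: recommended memory per vCPU.
    memory_gb = vcpus * gb_per_vcpu
    return {
        "write_equivalent_mbps": write_equivalent,
        "vcpus": vcpus,
        "memory_gb": memory_gb,
    }


# Sample rates from this section: 50 MBps published, 100 MBps read.
print(estimate_capacity(publish_mbps=50, read_mbps=100))
```

Lowering `target_utilization` to 0.3 or 0.4 shows how much extra headroom spiky traffic costs in vCPUs.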
These calculations assume message sizes between 1 KB and 100 KB. Very large messages affect the calculations, and significant traffic spikes require additional resources. While you cannot decrease vCPUs below the number of brokers in your cluster, you can adjust memory size.
Here's why vCPU reduction is limited: Managed Service for Apache Kafka doesn't allow decreasing the number of brokers. Each broker requires a minimum of 1 CPU. For example, a cluster configured with 45 CPUs (with a maximum of 15 CPUs per broker) results in 3 brokers. In this scenario, you can decrease CPUs to as low as 3 (1 CPU per broker), but no further.
Test with your real workload for the most accurate sizing. Watch your cluster's resource usage and scale up if needed.
Create a cluster
Before you create a cluster, review the documentation of cluster properties.
Creating a cluster usually takes 20-30 minutes.
To create a cluster, follow these steps:
Console
- In the Google Cloud console, go to the Clusters page.
- Select Create.
The Create Kafka cluster page opens.
- For the Cluster name, enter a string.
For more information about how to name a cluster, see Guidelines to name a Managed Service for Apache Kafka resource.
- For Location, enter a supported location.
For more information about supported locations, see Supported Managed Service for Apache Kafka locations.
- For Capacity configuration, enter values for Memory and vCPUs.
For more information about how to size a Managed Service for Apache Kafka cluster, see Estimate vCPUs and memory for your Managed Service for Apache Kafka cluster.
- For Network configuration, enter the following details:
- Project: The project where the subnetwork is located. The subnet must be located in the same region as the cluster, but the project might be different.
- Network: The network to which the subnet is connected.
- Subnetwork: The name of the subnet.
- Subnet URI path: This field is automatically populated. Alternatively, you can enter the subnet path here. The name of the subnet must be in the format: projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_ID.
- Click Done.
- (Optional) Add additional subnets by clicking Add a connected subnet.
You can add subnets up to a maximum of ten.
- Retain the other default values.
- Click Create.
gcloud
- In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
- Run the gcloud managed-kafka clusters create command:
gcloud managed-kafka clusters create CLUSTER_ID \
    --location=LOCATION \
    --cpu=CPU \
    --memory=MEMORY \
    --subnets=SUBNETS \
    --auto-rebalance \
    --encryption-key=ENCRYPTION_KEY \
    --async \
    --labels=LABELS
Replace the following:
- CLUSTER_ID: The ID or name of the cluster.
For more information about how to name a cluster, see Guidelines to name a Managed Service for Apache Kafka resource.
- LOCATION: The location of the cluster.
For more information about supported locations, see Supported Managed Service for Apache Kafka locations.
- CPU: The number of vCPUs for the cluster.
For more information about how to size a Managed Service for Apache Kafka cluster, see Estimate vCPUs and memory for your Managed Service for Apache Kafka cluster.
- MEMORY: The amount of memory for the cluster. Use "MB", "MiB", "GB", "GiB", "TB", or "TiB" units. For example, "10GiB".
- SUBNETS: The list of subnets to connect to. Use commas to separate multiple subnet values.
The format of the subnet is projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_ID.
- --auto-rebalance: Enables automatic rebalancing of topic partitions among brokers when the number of CPUs in the cluster changes. This is enabled by default.
- ENCRYPTION_KEY: The ID of the customer-managed encryption key to use for the cluster.
The format is projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/CRYPTO_KEY.
- --async: Sends the create request and returns a response immediately, without waiting for the operation to complete. With the --async flag, you can continue with other tasks while the cluster creation happens in the background. If you don't use the flag, the command waits for the operation to complete before returning a response, and you have to wait until the cluster is fully created before you can continue with other tasks.
- LABELS: Labels to associate with the cluster.
For more information about the format for labels, see Labels.
You get a response similar to the following:
Create request issued for: [CLUSTER_ID] Check operation [projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID] for status.
Store the OPERATION_ID to track progress.
Terraform
You can use a Terraform resource to create a cluster.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
Monitor the cluster creation operation
The following command applies only if you used the gcloud CLI to create the cluster.
Creating a cluster usually takes 20-30 minutes. The gcloud managed-kafka clusters create command uses a long-running operation (LRO), which you can monitor with the following command:
curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://managedkafka.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"
Replace the following:
- OPERATION_ID: The operation ID from the previous section.
- LOCATION: The location from the previous section.
- PROJECT_ID: The project for your Kafka cluster.
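Because cluster creation takes a while, you might prefer to poll the operation from a script instead of rerunning curl by hand. The sketch below is one possible approach, not an official client. The function names are hypothetical. It calls the same URL as the curl command, obtains the token the same way, and assumes the response follows the standard long-running operation shape with a boolean `done` field and an optional `error` field.

```python
import json
import subprocess
import time
import urllib.request


def operation_url(project_id: str, location: str, operation_id: str) -> str:
    """Build the operation URL used by the curl command above."""
    return (
        "https://managedkafka.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/operations/{operation_id}"
    )


def fetch_operation(url: str) -> dict:
    """Fetch the operation once, authenticating with a gcloud access token."""
    token = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_operation(url, fetch=fetch_operation, poll_seconds=60, max_polls=60):
    """Poll until the operation reports done, then return it.

    Assumes the standard LRO fields: "done" and, on failure, "error".
    """
    for _ in range(max_polls):
        operation = fetch(url)
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        time.sleep(poll_seconds)
    raise TimeoutError(f"operation not done after {max_polls} polls")
```

To use it, call `wait_for_operation(operation_url(PROJECT_ID, LOCATION, OPERATION_ID))` with your own values. The `fetch` parameter exists so that the polling logic can be exercised with a stub instead of a real network call.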
Troubleshooting
The following are some errors you may encounter when creating clusters.
Service agent service-${PROJECT_NUMBER}@gcp-sa-managedkafka.iam.gserviceaccount.com
has not been granted the required role cloudkms.cryptoKeyEncrypterDecrypter to
encrypt data using the KMS key.
The Managed Service for Apache Kafka service agent is missing the required permission to access the Cloud KMS key. See the documentation for required roles for configuring CMEK.
Service does not have permission to retrieve subnet. Please grant
service-${PROJECT_NUMBER}@gcp-sa-managedkafka.iam.gserviceaccount.com the
managedkafka.serviceAgent role in the IAM policy of the project
${SUBNET_PROJECT} and ensure the Compute Engine API is enabled in project
${SUBNET_PROJECT}
The Managed Service for Apache Kafka service agent is missing the required role to configure networking in the VPC network that the Kafka clients run in. See the documentation for required permissions for configuring networking.