Quotas and limits

This document lists the quotas and system limits that apply to AlloyDB for PostgreSQL.

Quotas specify the amount of a countable, shared resource that you can use. Quotas are defined by Google Cloud services such as AlloyDB for PostgreSQL.
System limits are fixed values that cannot be changed.

Quotas

Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.

The Cloud Quotas system does the following:

Monitors your consumption of Google Cloud products and services
Restricts your consumption of those resources
Provides a way to request changes to the quota value and automate quota adjustments

In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.

Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.

Google Cloud also offers free trial quotas that provide limited access for projects to help you explore Google Cloud at no charge.

Not all projects have the same quotas. If your Google Cloud usage increases, your quotas might increase.

For more information about quotas, see the Cloud Quotas documentation.

For information specific to quotas imposed by AlloyDB, see Rate quotas and Resource quotas.

There are also limits on AlloyDB resources. Unlike quotas, system limits can't be changed.

Permissions for checking and editing quotas

To view your quotas, you must have the serviceusage.quotas.get permission.

To change your quotas, you must have the serviceusage.quotas.update permission.

These permissions are included by default in the basic IAM roles of Owner and Editor and in the predefined Quota Administrator role.

Check your quotas

By default, the quotas table in the Google Cloud console lists quotas for all services. You can check the current quotas for AlloyDB resources in your project by using the Filter list in the table.

To check the current quotas for AlloyDB resources in your project, complete the following steps:

In the Google Cloud console, go to the Quotas page.
In the quotas table, click Filter.
Select Service from the Properties list, and then select AlloyDB API from the Values list.

Increase your quotas

As your use of Google Cloud expands over time, your quotas can increase accordingly. If you expect a notable upcoming increase in usage, request a quota increase a few days in advance so that your quotas are adequately sized.

To request a quota increase, complete the following steps:

In the Quotas page, click Filter.
Select Service from the Properties list, and then select AlloyDB API from the Values list.

If you don't see AlloyDB API in the list, then the AlloyDB Admin API isn't enabled.
Select the quotas you want to change.
Click Edit quotas.
Enter your name, email, and phone number, and then click Next.
Enter your quota request, and then click Submit request.

Rate quotas

AlloyDB supports rate quotas, also known as rate limits or API quotas. Rate quotas define the number of requests that you can make to the AlloyDB Admin API.

Each rate quota corresponds to all the requests for a group of one or more AlloyDB Admin API methods. Rate quotas reset after a time interval that's specific to the service. For example, a quota might limit the number of API requests per day.

When you use the Google Cloud CLI or the Google Cloud console, you're making requests to the API and these requests count toward your rate limits. If you use service accounts to access the API, those requests also count toward your rate limit.

The rate quotas are enforced and automatically refilled over 60-second (1-minute) intervals. That means that if your project reaches a rate quota's maximum any time within 60 seconds, you must wait for that quota to refill before making more requests in that group. If your project exceeds a rate limit, you receive an HTTP 429 status code with the reason rateLimitExceeded.
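One way a client might react to this behavior is to wait for the quota to refill and then retry. The following Python sketch is illustrative only: `RateLimitExceeded`, `call_with_refill_wait`, and the injectable `sleep` parameter are hypothetical names, not part of any AlloyDB client library. The 60-second wait is taken from the refill interval described above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Refill interval stated in this document (60 seconds).
REFILL_INTERVAL_SECONDS = 60


class RateLimitExceeded(Exception):
    """Hypothetical stand-in for an HTTP 429 response with reason rateLimitExceeded."""


def call_with_refill_wait(
    request: Callable[[], T],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run request(); on a rate-limit error, wait one refill interval and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RateLimitExceeded:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            sleep(REFILL_INTERVAL_SECONDS)
    raise ValueError("max_attempts must be at least 1")
```

In a real client, `request` would wrap the API call and translate a 429 response into `RateLimitExceeded`. Passing a custom `sleep` keeps the sketch testable without actually waiting.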

The AlloyDB Admin APIs are divided into six groups based on the operation type. The rate quotas are imposed per minute, per API group, per project, per region, and per user. For each unique combination of these attributes, AlloyDB imposes a separate quota. For example, if 100 users access the Mutate APIs in a single minute for a given project and region, each of those users gets a separate default quota in the range of 180-250 requests per minute for that project and region combination.

The default quota range for each group is as follows:

Group name	Description	Default quota range in queries per minute	API methods
Connect APIs	Establish new connections.	180-2000	`projects.locations.clusters.generateClientCertificate` `projects.locations.clusters.instances.getConnectionInfo`
Get APIs	Read a single resource.	180-1000	`projects.locations.clusters.get` `projects.locations.clusters.instances.get` `projects.locations.backups.get` `projects.locations.get`
Get operation API	Get the latest state of a long-running operation.	950-1400	`projects.locations.operations.get`
List APIs	Read a group of resources of the same type.	180-1000	`projects.locations.clusters.list` `projects.locations.clusters.instances.list` `projects.locations.backups.list` `projects.locations.supportedDatabaseFlags.list` `projects.locations.list`
List operations API	List operations that match a specific filter in the request.	2200-3000	`projects.locations.operations.list`
Mutate APIs	Modify resource state.	180-250	`projects.locations.clusters.create` `projects.locations.clusters.patch` `projects.locations.clusters.delete` `projects.locations.clusters.restore` `projects.locations.clusters.instances.create` `projects.locations.clusters.instances.patch` `projects.locations.clusters.instances.delete` `projects.locations.clusters.instances.failover` `projects.locations.clusters.instances.restart` `projects.locations.backups.create` `projects.locations.backups.patch` `projects.locations.backups.delete` `projects.locations.operations.delete` `projects.locations.operations.cancel`
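To stay under these quotas, a client could keep its own conservative count for each combination of minute, API group, project, region, and user. The sketch below assumes the lower bound of each default range from the table and assumes that windows align to whole minutes of the supplied clock. The service's actual accounting may differ, and `MinuteBudget` and `try_acquire` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Lower bound of each default quota range from the table above (requests per minute).
DEFAULT_QUOTA_FLOOR = {
    "Connect APIs": 180,
    "Get APIs": 180,
    "Get operation API": 950,
    "List APIs": 180,
    "List operations API": 2200,
    "Mutate APIs": 180,
}

# (minute index, API group, project, region, user)
QuotaKey = Tuple[int, str, str, str, str]


class MinuteBudget:
    """Client-side estimate of requests used for each quota key (assumption-based)."""

    def __init__(self) -> None:
        self._used: Dict[QuotaKey, int] = defaultdict(int)

    def try_acquire(
        self, now_seconds: float, group: str, project: str, region: str, user: str
    ) -> bool:
        """Record one request and return True if the conservative budget allows it."""
        key = (int(now_seconds // 60), group, project, region, user)
        if self._used[key] >= DEFAULT_QUOTA_FLOOR[group]:
            return False
        self._used[key] += 1
        return True
```

Because each key includes the user, one user exhausting a budget doesn't affect another user's count, which mirrors the per-user quota described above. A long-running client would also need to discard counters for past minutes.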

Resource quotas

AlloyDB supports resource quotas, also known as allocation quotas. Resource quotas are the maximum amount of resources that you can create for a resource type if those resources are available. Resource quotas restrict the use of resources that don't have a rate of usage, such as the number of virtual machine (VM) instances used by your project at a given time.

Resource quotas don't reset over time. Instead, you must take action to release the unused resources, such as deleting an unneeded cluster.

Resource quotas are imposed on the number of clusters, the number of vCPUs used, and the amount of storage per cluster, as detailed in the following sections.

Resource quotas on clusters

This quota applies to the number of clusters per project per region. The default value for this quota ranges from 3 to 10 clusters per project per region depending upon the project's usage history. The maximum supported value for this quota is 15 clusters per project per region.

If you make a create or restore cluster request using the Google Cloud console, gcloud CLI, or AlloyDB Admin API, and that request results in a quota violation, then the request fails with an error message similar to the following:

Quota limit 'ClustersUsedPerProjectPerRegion' has been exceeded. Limit: 5 in region us-central1.

Resource quotas on vCPUs

This quota applies to the number of vCPUs per project per region. Each instance consumes some amount of this quota depending on how many VMs it uses. Each primary instance uses two VMs. Each read pool instance uses one VM for every node it contains. The number of vCPUs used by each VM is provided by you while creating or updating the instance.
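As a worked example of this accounting, the following Python sketch estimates vCPU quota consumption as the number of VMs multiplied by the vCPUs per VM. That multiplication is an interpretation of the paragraph above rather than a documented formula, and the `Instance` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Instance:
    kind: str            # "primary" or "read_pool"
    vcpus_per_vm: int    # value chosen when creating or updating the instance
    node_count: int = 1  # only meaningful for read pool instances


def vm_count(instance: Instance) -> int:
    """Primary instances use two VMs; read pools use one VM per node."""
    if instance.kind == "primary":
        return 2
    if instance.kind == "read_pool":
        return instance.node_count
    raise ValueError(f"unknown instance kind: {instance.kind!r}")


def vcpus_used(instances: Iterable[Instance]) -> int:
    """Estimated vCPU quota consumed by a set of instances in one region."""
    return sum(vm_count(i) * i.vcpus_per_vm for i in instances)


if __name__ == "__main__":
    fleet = [
        Instance("primary", vcpus_per_vm=8),                 # 2 VMs x 8 = 16
        Instance("read_pool", vcpus_per_vm=4, node_count=3),  # 3 VMs x 4 = 12
    ]
    print(vcpus_used(fleet))  # prints 28
```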

The default value of the quota for all customer projects is 10,000 vCPUs.

If you make a create or update instance request using the Google Cloud console, gcloud CLI, or AlloyDB Admin API, and that request results in a quota violation, then the request fails with an error message similar to the following:

Quota limit 'VCPUsUsedPerProjectPerRegion' has been exceeded. Limit: 128 in region us-central1.
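Automation that creates clusters or instances may want to recognize these failures. The sketch below parses messages shaped like the cluster and vCPU examples in this document. Because the actual messages are only described as "similar to" the examples, the pattern is an assumption and returns `None` when the text doesn't match. `QuotaViolation` and `parse_quota_violation` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class QuotaViolation(NamedTuple):
    quota: str
    limit: int
    region: str


# Pattern based on the example messages in this document; real messages may vary.
_PATTERN = re.compile(
    r"Quota limit '(?P<quota>[^']+)' has been exceeded\. "
    r"Limit: (?P<limit>\d+) in region (?P<region>[a-z0-9-]+)"
)


def parse_quota_violation(message: str) -> Optional[QuotaViolation]:
    """Extract the quota name, limit, and region, or return None if not recognized."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return QuotaViolation(match["quota"], int(match["limit"]), match["region"])
```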

Resource quotas on storage

This quota applies to the amount of data that can be stored in each cluster. The default value for this quota is 16 TiB per cluster. The maximum supported value is 128 TiB per cluster.

If you make a database write request, such as an INSERT statement, that results in a quota violation, then the request fails with the following error message:

AlloyDB instance exceeds available storage quota.

Resource availability

Resource quotas do not guarantee that resources are available at all times. If a resource is not physically available for your region, you cannot create new resources of that type, even if you have remaining quota in your project.

Limits

To request a limit increase, file a support case.

Item	Limit
Read pool nodes per cluster (across all read pool instances)	20
Maximum concurrent connections per instance	Defaults to 1,000; adjustable up to 240,000

Maximum concurrent connections

AlloyDB limits an instance's maximum concurrent connections to 1,000, unless you set its max_connections flag to a higher value.

Use the following table as a guideline to decide the max connections value based on your instance size:

vCPUs	Memory (GB)	Recommended `max_connections` value
1	8	`500`
2	16	`1000`
4	32	`2000`
8	64	`4000`
16	128	`5000`
32	256	`5000`
48	384	`5000`
64	512	`5000`
72	576	`5000`
96	768	`5000`
128	864	`5000`
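For scripts that size instances, the table above can be encoded as a simple lookup. This Python sketch copies the values from the table. The function name is hypothetical, and vCPU counts that aren't listed are rejected rather than interpolated.

```python
# Recommended max_connections by instance vCPU count, copied from the table above.
RECOMMENDED_MAX_CONNECTIONS = {
    1: 500,
    2: 1000,
    4: 2000,
    8: 4000,
    16: 5000,
    32: 5000,
    48: 5000,
    64: 5000,
    72: 5000,
    96: 5000,
    128: 5000,
}


def recommended_max_connections(vcpus: int) -> int:
    """Return the table's recommendation, or raise if the size isn't listed."""
    try:
        return RECOMMENDED_MAX_CONNECTIONS[vcpus]
    except KeyError:
        raise ValueError(f"no recommendation listed for {vcpus} vCPUs") from None
```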

Note the following considerations before setting the value:

When you set the max_connections flag on a read pool instance, the new value must match or exceed the max_connections value of its cluster's primary instance.
We recommend running a maximum of four concurrent queries per instance vCPU.
For workloads that involve short-lived connections, consider using a connection pooler such as PgBouncer or Pgpool-II.
We recommend adding an application-side connection pooler such as HikariCP or c3p0.
If you set max_connections higher than the recommended value (up to 240,000), account for the additional memory that each active connection consumes, which reduces the memory available for the shared buffer.

This memory consumption can be calculated by multiplying the number of concurrent queries by the value set for the work_mem flag. The default value for this flag is 4 MB or the number of vCPUs in the instance, whichever is higher.
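As a rough worked example of this calculation, the following Python sketch combines the recommendation of four concurrent queries per vCPU with the work_mem multiplication. The function names are hypothetical, and the work_mem value is passed explicitly rather than derived from a default.

```python
QUERIES_PER_VCPU = 4  # recommended maximum concurrent queries per vCPU, from this document


def recommended_concurrent_queries(vcpus: int) -> int:
    """Upper bound on concurrent queries suggested for an instance size."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return QUERIES_PER_VCPU * vcpus


def work_mem_footprint_mb(concurrent_queries: int, work_mem_mb: int) -> int:
    """Estimated memory in MB: concurrent queries multiplied by work_mem."""
    if concurrent_queries < 0 or work_mem_mb <= 0:
        raise ValueError("concurrent_queries must be >= 0 and work_mem_mb must be > 0")
    return concurrent_queries * work_mem_mb


if __name__ == "__main__":
    queries = recommended_concurrent_queries(8)                  # 4 x 8 = 32
    print(queries, work_mem_footprint_mb(queries, work_mem_mb=4))  # prints 32 128
```

With 8 vCPUs and work_mem set to 4 MB, the estimate is 32 concurrent queries consuming about 128 MB.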

Saved queries limits

Value	Limit
Maximum number of saved queries per project (including saved queries for other Google Cloud products)	10,000
Maximum size for each query	1 MiB