[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Translation issue","translationIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated (UTC): 2025-08-11."],[],[],null,["# Rate Limiting\n\n| **Beta**\n|\n|\n| This product or feature is subject to the \"Pre-GA Offerings Terms\" in the General Service Terms section\n| of the [Service Specific Terms](/terms/service-terms#1).\n|\n| Pre-GA products and features are available \"as is\" and might have limited support.\n|\n| For more information, see the\n| [launch stage descriptions](/products#product-launch-stages).\n\nThis page describes how to use Service Infrastructure to implement rate limiting for\n[managed services](/service-infrastructure/docs/glossary#managed) that are integrated with the\nService Management API.\n\nA managed service can serve many [service consumers](/service-infrastructure/docs/glossary#consumer). To\nprotect system capacity and ensure fair usage, a managed service often\nuses rate limiting to distribute its capacity among its service consumers.\nThe [Service Management](/service-infrastructure/docs/glossary#management) and\n[Service Control](/service-infrastructure/docs/glossary#control) APIs allow you to manage and enforce\nrate limiting.\n\nConfiguring rate limits\n-----------------------\n\nTo use the rate limiting feature, configure *quota metrics* and\n*quota limits* in the [service configuration](/service-infrastructure/docs/glossary#config) for your\n[service producer](/service-infrastructure/docs/glossary#producer) project.\n\nCurrently, the only supported rate limit is the number of requests per minute per\nservice consumer, where the service consumer is a Google Cloud project as identified\nby an API key, a project ID, or a project number. 
For rate limiting, what counts as a\nrequest is opaque. A service can count an HTTP request as a\nrequest, or a byte of payload as a request. The rate limiting feature is\nindependent of the semantics of a request.\n\n### Quota metrics\n\nA metric is a named counter for measuring a certain value over time. For\nexample, the number of HTTP requests a service receives is a metric. A quota\nmetric is a metric that is used for quota and rate limiting purposes. When an\nactivity occurs with a service, one or more quota metrics may increase. When the\nmetric value reaches the predefined quota limit, the service should reject the\nactivity with a `429` error.\n\n### Quota limits\n\nA quota limit represents an enforceable limit on a quota metric. For example,\nthe number of requests per service consumer per minute is a quota limit. At this\ntime, the only supported type of quota limit is per minute per consumer,\nspecifically, `1/min/{project}`.\n\nThe actual rate limit for a (service, consumer) pair is controlled by three\nsettings:\n\n- The default limit specified for the managed service.\n- The service producer override for the service consumer.\n- The service consumer override for the service consumer.\n\nThe effective rate limit is:\n\n- The default limit if there is no override.\n- The service producer override if there is a service producer override, but no service consumer override.\n- The minimum of the service consumer override and the default limit if there is a service consumer override, but no service producer override.\n- The minimum of the service consumer override and the service producer override if there are both service producer and service consumer overrides.\n\nEnforcing rate limiting\n-----------------------\n\nTo enforce rate limiting, each server that belongs to a managed service needs to\ncall the Service Control API\n[`services.allocateQuota`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota)\nmethod regularly. 
If the response from the\n[`services.allocateQuota`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota)\nmethod indicates that the usage is above the limit, the server should reject the\nincoming request with a `429` error. For more information, see the reference\ndocumentation for the\n[`services.allocateQuota`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota)\nmethod.\n\nEach server should use batching, caching, and predictive\nlogic to improve system performance and reliability. In general, one server\nshould call the\n[`services.allocateQuota`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota)\nmethod at most once per second for the same (service, consumer, metric) tuple.\n\nThe following example demonstrates how to call the\n[`services.allocateQuota`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota)\nmethod to check for rate limiting. The important request parameters that must be\nset correctly are the service name, the consumer ID, the metric name, and the\nmetric value. The\n[`services.allocateQuota`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota)\nmethod tries to increase the usage by the specified amount for the (service,\nconsumer, metric) tuple. If the increased usage goes above the limit, an error\nis returned. The following example uses the `gcurl` command to demonstrate the\ncall. To learn how to set this up, see\n[Getting Started with the Service Control API](/service-infrastructure/docs/service-control/getting-started).\n\n**Note:** A service consumer can be specified using a project ID, a project number, or an API key. 
\n\n```\ngcurl -d '{\n \"allocateOperation\": {\n \"operationId\": \"123e4567-e89b-12d3-a456-426655440000\",\n \"methodName\": \"google.example.hello.v1.HelloService.GetHello\",\n \"consumerId\": \"project:endpointsapis-consumer\",\n \"quotaMetrics\": [{\n \"metricName\": \"endpointsapis.appspot.com/requests\",\n \"metricValues\": [{\n \"int64Value\": 1\n }]\n }],\n \"quotaMode\": \"NORMAL\"\n }\n}' https://servicecontrol.googleapis.com/v1/services/endpointsapis.appspot.com:allocateQuota\n{\n \"operationId\": \"123e4567-e89b-12d3-a456-426655440000\",\n \"quotaMetrics\": [\n {\n \"metricName\": \"serviceruntime.googleapis.com/api/consumer/quota_used_count\",\n \"metricValues\": [\n {\n \"labels\": {\n \"/quota_name\": \"endpointsapis.appspot.com/requests\"\n },\n \"int64Value\": \"1\"\n }\n ]\n }\n ],\n \"serviceConfigId\": \"2017-09-10r0\"\n}\n```\n\nError handling\n--------------\n\nIf the HTTP response code is `200`, and the response contains a\n[`RESOURCE_EXHAUSTED`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota#Code)\n[`QuotaError`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota#QuotaError),\nyour server should reject the request with a `429` error. If the response\ndoesn't contain any quota error, your server should\ncontinue serving the incoming requests. For all other quota errors, your server\nshould reject the request with a `409` error. Because of the security risks, be\ncareful about what error information you include in the error\nmessage.\n\nFor all other HTTP response codes, your server likely has a programming\nbug. Your server should continue to serve the incoming requests while\nyou debug the problem. If the\n[`services.allocateQuota`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota)\nmethod returns any unexpected error, your service should log the error and\naccept the incoming requests. 
You can debug the error later.\n\n### Fail Open\n\nThe rate limiting feature protects your managed service from being\noverloaded and distributes your service capacity fairly among service\nconsumers. Because most service consumers should not reach their rate limits\nduring normal operations, your managed service should accept all incoming\nrequests if the rate limiting feature is unavailable, also known as *fail open*.\nThis prevents the rate limiting system from affecting your service's\navailability.\n\nIf you use the\n[`services.allocateQuota`](/service-infrastructure/docs/service-control/reference/rest/v1/services/allocateQuota)\nmethod, your service must ignore `500`, `503`, and `504` errors without any\nretry. To prevent a hard dependency on the rate limiting feature, the\nService Control API issues a limited amount of error injection on a\nregular basis."]]