Last updated (UTC): 2025-08-20.

# External Application Load Balancer performance best practices

[Cloud Load Balancing](/load-balancing/docs/load-balancing-overview) provides mechanisms to distribute user traffic across multiple instances of an application, spreading the load across those instances and delivering optimal application performance to end users. This page describes some best practices to ensure that the load balancer is optimized for your application. To ensure optimal performance, we recommend benchmarking your application's traffic patterns.

Place backends close to clients
-------------------------------

The closer your users or client applications are to your workloads (load balancer backends), the lower the network latency between them. Therefore, create your load balancer backends in the region closest to where you anticipate your users' traffic arriving at the Google frontend. In many cases, running your backends in multiple regions is necessary to minimize latency for clients in different parts of the world.

For more information, see the following topics:

- [Traffic distribution for external Application Load Balancers](/load-balancing/docs/https#load_distribution_algorithm)
- [Best practices for Compute Engine region selection](/solutions/best-practices-compute-engine-region-selection)

Enable caching with Cloud CDN
-----------------------------

Turn on Cloud CDN and caching as part of your default global external Application Load Balancer configuration.
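As a minimal sketch, you can enable Cloud CDN on an existing backend service with `gcloud`; the backend service name `web-backend-service` here is a placeholder for your own:

```shell
# Enable Cloud CDN on an existing global backend service.
# "web-backend-service" is a placeholder backend service name.
gcloud compute backend-services update web-backend-service \
    --global \
    --enable-cdn \
    --cache-mode=CACHE_ALL_STATIC
```

With the `CACHE_ALL_STATIC` cache mode, responses with static content types are cached automatically; see the cache modes documentation for the other modes.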
For more information, see [Cloud CDN](/cdn/docs/overview).

When you enable Cloud CDN, it might take a few minutes before responses begin to be cached. Cloud CDN caches only responses with [cacheable content](/cdn/docs/caching#cacheability). If responses for a URL aren't being cached, check which response headers are returned for that URL and how cacheability is [configured](/cdn/docs/using-cache-modes) for your backend. For more details, see [Cloud CDN troubleshooting](/cdn/docs/troubleshooting-steps#responses-not-cached).

Forwarding rule protocol selection
----------------------------------

- **For the global external Application Load Balancer and the classic Application Load Balancer**, we recommend HTTP/3, an internet protocol built on top of [IETF QUIC](https://datatracker.ietf.org/doc/rfc9000/). HTTP/3 is enabled by default in all major browsers, [Android Cronet](https://developer.android.com/guide/topics/connectivity/cronet), and [iOS](https://developer.apple.com/documentation/technotes/tn3102-http3-in-your-app). To use HTTP/3 for your applications, ensure that UDP traffic is not blocked or rate-limited on your network and that HTTP/3 was not previously [disabled on your global external Application Load Balancers](/load-balancing/docs/https#QUIC). Clients that don't yet support HTTP/3, such as older browsers or networking libraries, aren't affected. For more information, see [HTTP/3 QUIC](/blog/products/networking/cloud-cdn-and-load-balancing-support-http3).

- **For the regional external Application Load Balancer**, we support HTTP/1.1, HTTPS, and HTTP/2. Both HTTPS and HTTP/2 require some upfront overhead to set up TLS.

Backend service protocol selection
----------------------------------

Your choice of backend protocol (HTTP, HTTPS, or HTTP/2) impacts application latency and the network bandwidth available for your application.
For example, using HTTP/2 between the load balancer and the backend instance can require significantly more TCP connections to the instance than HTTP(S). Connection pooling, an optimization that reduces the number of these connections with HTTP(S), is not available with HTTP/2. As a result, you might see high backend latencies because backend connections are made more frequently.

The backend service protocol also impacts how the traffic is [encrypted in transit](/docs/security/encryption-in-transit#how_traffic_gets_routed). With external Application Load Balancers, all traffic going to backends that reside within Google Cloud VPC networks is automatically encrypted. This is called automatic network-level encryption. However, automatic network-level encryption is available only for communications with instance group and zonal NEG backends. For all other backend types, we recommend that you use secure protocol options such as HTTPS and HTTP/2 to encrypt communication with the backend service. For details, see [Encryption from the load balancer to the backends](/load-balancing/docs/ssl-certificates/encryption-to-the-backends#encryption-to-backends).

Recommended connection duration
-------------------------------

Network conditions change, and the set of backends might change based on load. For applications that generate a lot of traffic to a single service, a long-running connection isn't always an optimal setup. Instead of using a single connection to the backend indefinitely, we recommend that you choose a maximum connection lifetime (for example, between 10 and 20 minutes) or a maximum number of requests (for example, between 1,000 and 2,000 requests), after which a new connection is used for new requests.
The old connection is closed when all active requests using it are done.

This lets the client application benefit from changes in the set of backends, which includes the load balancer's proxies, and from any network reoptimization that's required to serve the clients.

Balancing mode selection criteria
---------------------------------

For better performance, consider choosing the backend group for each new request based on which backend is the most responsive. You can achieve this by using the `RATE` balancing mode. In this case, the backend group with the lowest average latency over recent requests, or, for HTTP/2 and HTTP/3, the backend group with the fewest outstanding requests, is chosen.

The `UTILIZATION` balancing mode applies only to instance group backends and distributes traffic based on the utilization of VM instances in an instance group.

Configure session affinity
--------------------------

In some cases, it might be beneficial for the same backend to handle requests that are from the same end user, or related to the same end user, at least for a short period of time. You can configure this behavior by using *session affinity*, a setting configured on the backend service. Session affinity controls the distribution of new connections from clients to the load balancer's backends. You can use session affinity to ensure that the same backend handles requests for the same resource, for example, requests related to the same user account or to the same document.

Session affinity is specified for the entire backend service resource, not on a per-backend basis. However, a URL map can point to multiple backend services, so you don't have to use just one session affinity type for the load balancer. Depending on your application, you can use different backend services with different session affinity settings.
For example, if part of your application serves static content to many users, it is unlikely to benefit from session affinity. Instead, use a [Cloud CDN](/cdn/docs/overview)-enabled backend service to serve cached responses.

For more information, see [session affinity](/load-balancing/docs/backend-service#session_affinity).
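As a sketch, cookie-based session affinity can be enabled per backend service with `gcloud`; the backend service name `user-api-backend-service` and the one-hour cookie TTL are placeholders:

```shell
# Route repeat requests from the same client to the same backend
# by using a load-balancer-generated cookie.
# "user-api-backend-service" is a placeholder backend service name.
gcloud compute backend-services update user-api-backend-service \
    --global \
    --session-affinity=GENERATED_COOKIE \
    --affinity-cookie-ttl=3600
```

Following the example above, a Cloud CDN-enabled backend service serving static content could keep the default `NONE` session affinity, while the backend service handling per-user state uses the cookie-based setting.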