Last updated (UTC): 2025-08-29.

- Apigee hybrid services in Kubernetes can be scaled using command-line tools or by modifying the `overrides.yaml` configuration file, depending on the specific service.
- The scaling of services like MART, Watcher, ConnectAgent, UDCA, Synchronizer, and Runtime components is managed by adjusting the `replicaCountMin` and `replicaCountMax` properties, which interact with the Horizontal Pod Autoscaler (HPA) to automatically add or remove pods.
- Advanced scaling configurations allow for environment-specific settings, such as different minimum and maximum replica counts for different environments or for components within an environment, using the `kubectl patch` command to modify the `maxReplicas` property.
- Metrics-based scaling is available for `apigee-runtime` pods, using metrics like `serverNioTaskWaitTime` and `serverMainTaskWaitTime`, with adjustable `hpaBehavior` fields for customizing scale-up and scale-down policies based on various thresholds.
- While metrics-based scaling is enabled by default, it can be effectively disabled by setting the `serverMainTaskWaitTime` and `serverNioTaskWaitTime` thresholds very high, which prevents this scaling from triggering and falls back to CPU-based scaling.

# Scale and autoscale runtime services

| You are currently viewing version 1.11 of the Apigee hybrid documentation. **This version is end of life.** You should upgrade to a newer version. For more information, see [Supported versions](/apigee/docs/hybrid/supported-platforms#supported-versions).

You can scale most services running in Kubernetes from the
command line or in a configuration override. You can set scaling
parameters for Apigee hybrid runtime services in the
[`overrides.yaml` file](/apigee/docs/hybrid/v1.11/customize-services).

Advanced configurations
-----------------------

In some scenarios, you may need to use advanced scaling options. Example scenarios include:

- Setting different scaling options for each environment. For example, where env1 has a `minReplicas` value of 5 and env2 has a `minReplicas` value of 2.
- Setting different scaling options for each component within an environment. For example, where the `udca` component has a `maxReplicas` value of 5 and the `synchronizer` component has a `maxReplicas` value of 2.

The following example shows how to use the `kubectl patch` command to change
the `maxReplicas` and `minReplicas` properties for the `runtime` component:

1. Create environment variables to use with the command:

   ```bash
   export ENV=my-environment-name
   export NAMESPACE=apigee     # the namespace where apigee is deployed
   export COMPONENT=runtime    # can be udca or synchronizer
   export MAX_REPLICAS=2
   export MIN_REPLICAS=1
   ```
2. Apply the patch. Note that this example assumes that `kubectl` is in your `PATH`:

   ```bash
   kubectl patch apigeeenvironment -n $NAMESPACE \
     $(kubectl get apigeeenvironments -n $NAMESPACE -o jsonpath='{.items[?(@.spec.name == "'$ENV'" )]..metadata.name}') \
     --patch "$(echo -e "spec:\n  components:\n    $COMPONENT:\n      autoScaler:\n        maxReplicas: $MAX_REPLICAS\n        minReplicas: $MIN_REPLICAS")" \
     --type merge
   ```
3. Verify the change:

   ```bash
   kubectl get hpa -n $NAMESPACE
   ```

Environment-based scaling
-------------------------

By default, scaling is described at the organization level. You can
override the default settings by specifying environment-specific scaling
in the `overrides.yaml` file as shown in the following example:

```yaml
envs:
  # Apigee environment name
  - name: test
    components:
      # Environment-specific scaling override
      # Otherwise, uses scaling defined at the respective root component
      runtime:
        replicaCountMin: 2
        replicaCountMax: 20
```

Metrics-based scaling
---------------------

With metrics-based scaling, the runtime can use CPU and application metrics to scale the `apigee-runtime` pods.
The Kubernetes [Horizontal Pod Autoscaler (HPA) API](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/horizontal-pod-autoscaler-v2/#HorizontalPodAutoscalerSpec)
uses the `hpaBehavior` field to configure the scale-up and scale-down behaviors of the target service.
Metrics-based scaling is not available for any other components in a hybrid deployment.

| **Note:** The behavior and meaning of each of the individual fields in the YAML examples in this section are not unique to Apigee hybrid. These concepts come directly from [Kubernetes scaling policies](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-policies).

| **Note:** An internal connection between Prometheus and the Prometheus-Adapter on port 6443 must be open in order to receive metrics data and enable scaling. For more information on required ports, see [Internal connections](/apigee/docs/hybrid/v1.11/ports#internal).

Scaling can be adjusted based on the following metrics:

- `serverMainTaskWaitTime`
- `serverNioTaskWaitTime`
- `targetCPUUtilizationPercentage`

The following example from the `runtime` stanza in the `overrides.yaml` file
illustrates the standard parameters (with permitted ranges shown as comments) for scaling `apigee-runtime` pods in a hybrid implementation:

```yaml
hpaMetrics:
  serverMainTaskWaitTime: 400M       # permitted range: 300M to 450M
  serverNioTaskWaitTime: 400M        # permitted range: 300M to 450M
targetCPUUtilizationPercentage: 75
hpaBehavior:
  scaleDown:
    percent:
      periodSeconds: 60              # permitted range: 30 to 180
      value: 20                      # permitted range: 5 to 50
    pods:
      periodSeconds: 60              # permitted range: 30 to 180
      value: 2                       # permitted range: 1 to 15
    selectPolicy: Min
    stabilizationWindowSeconds: 120  # permitted range: 60 to 300
  scaleUp:
    percent:
      periodSeconds: 60              # permitted range: 30 to 120
      value: 20                      # permitted range: 5 to 100
    pods:
      periodSeconds: 60              # permitted range: 30 to 120
      value: 4                       # permitted range: 2 to 15
    selectPolicy: Max
    stabilizationWindowSeconds: 30   # permitted range: 30 to 120
```

### Configure more aggressive scaling

Increasing the `percent` and `pods` values of the scale-up policy results in a more aggressive
scale-up policy. Similarly, increasing the `percent` and `pods` values in `scaleDown`
results in a more aggressive scale-down policy. For example:

```yaml
hpaMetrics:
  serverMainTaskWaitTime: 400M
  serverNioTaskWaitTime: 400M
targetCPUUtilizationPercentage: 75
hpaBehavior:
  scaleDown:
    percent:
      periodSeconds: 60
      value: 20
    pods:
      periodSeconds: 60
      value: 4
    selectPolicy: Min
    stabilizationWindowSeconds: 120
  scaleUp:
    percent:
      periodSeconds: 60
      value: 30
    pods:
      periodSeconds: 60
      value: 5
    selectPolicy: Max
    stabilizationWindowSeconds: 30
```

In the above example, the `scaleDown.pods.value` is increased to **4**, the `scaleUp.percent.value`
is increased to **30**, and the `scaleUp.pods.value` is increased to **5**.

| **Note:** The value of `periodSeconds` should not go below 30.

### Configure less aggressive scaling

The `hpaBehavior` configuration values can also be decreased to implement less aggressive scale-up and scale-down policies. For example:

```yaml
hpaMetrics:
  serverMainTaskWaitTime: 400M
  serverNioTaskWaitTime: 400M
targetCPUUtilizationPercentage: 75
hpaBehavior:
  scaleDown:
    percent:
      periodSeconds: 60
      value: 10
    pods:
      periodSeconds: 60
      value: 1
    selectPolicy: Min
    stabilizationWindowSeconds: 180
  scaleUp:
    percent:
      periodSeconds: 60
      value: 20
    pods:
      periodSeconds: 60
      value: 4
    selectPolicy: Max
    stabilizationWindowSeconds: 30
```

In the above example, the `scaleDown.percent.value` is decreased to **10**, the `scaleDown.pods.value`
is decreased to **1**, and the `scaleDown.stabilizationWindowSeconds` is increased to **180**.

For more information about metrics-based scaling using the `hpaBehavior` field, see [Scaling policies](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-policies).

### Disable metrics-based scaling

While metrics-based scaling is enabled by default and cannot be completely disabled, you can
set the metrics thresholds high enough that metrics-based scaling is never triggered. The resulting scaling
behavior will be the same as CPU-based scaling. For example, you can use the following configuration to prevent triggering metrics-based scaling:

```yaml
hpaMetrics:
  serverMainTaskWaitTime: 4000M
  serverNioTaskWaitTime: 4000M
targetCPUUtilizationPercentage: 75
hpaBehavior:
  scaleDown:
    percent:
      periodSeconds: 60
      value: 10
    pods:
      periodSeconds: 60
      value: 1
    selectPolicy: Min
    stabilizationWindowSeconds: 180
  scaleUp:
    percent:
      periodSeconds: 60
      value: 20
    pods:
      periodSeconds: 60
      value: 4
    selectPolicy: Max
    stabilizationWindowSeconds: 30
```

Troubleshooting
---------------

This section describes troubleshooting methods for common errors you may encounter while configuring scaling and auto-scaling.

### HPA shows `unknown` for metrics values

If metrics-based scaling does not work and the HPA shows `unknown`
for metrics values, use the following command to check the HPA output:

```bash
kubectl describe hpa HPA_NAME
```

When running the command, replace `HPA_NAME` with the name of the HPA you wish to view.

The output shows the CPU target and utilization of the service, indicating that CPU scaling will work
in the absence of metrics-based scaling. For HPA behavior using multiple
parameters, see [Scaling on multiple metrics](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-on-multiple-metrics).
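
As a rough illustration of why CPU-based scaling keeps working when the application metrics show `unknown`, the following Python sketch models how the HPA combines several metrics according to the Kubernetes documentation on scaling on multiple metrics: it computes `ceil(currentReplicas * currentValue / targetValue)` for each metric, takes the largest result, and skips scale-down while any metric is unavailable. The function names and sample numbers are hypothetical, and the sketch ignores the HPA tolerance, the `hpaBehavior` rate limits, and the minimum and maximum replica bounds.

```python
import math
from typing import Dict, Optional, Tuple

# Hypothetical helper names; a simplified model, not Apigee or Kubernetes code.


def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    """Per-metric proposal: ceil(currentReplicas * currentValue / targetValue)."""
    return math.ceil(current_replicas * current_value / target_value)


def combine_metrics(
    current_replicas: int,
    metrics: Dict[str, Tuple[Optional[float], float]],
) -> int:
    """Combine per-metric proposals; a current value of None models `unknown`."""
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics.values()
        if current is not None
    ]
    if not proposals:
        return current_replicas  # nothing usable: keep the current size
    best = max(proposals)  # the largest proposal wins
    some_unknown = len(proposals) < len(metrics)
    if some_unknown and best < current_replicas:
        return current_replicas  # scale-down is skipped while a metric is unknown
    return best


# Sample numbers are illustrative only: (current value, target value).
busy = {"cpu": (90.0, 75.0), "serverMainTaskWaitTime": (None, 400.0)}
idle = {"cpu": (30.0, 75.0), "serverMainTaskWaitTime": (None, 400.0)}
print(combine_metrics(4, busy))  # CPU alone still drives a scale-up
print(combine_metrics(4, idle))  # no scale-down while a metric is unknown
```

In this model, a deployment whose wait-time metrics are `unknown` can still grow under CPU pressure, but it will not shrink until every metric can be read again.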