[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Translation issue","translationIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-04 UTC."],[],[],null,["# Troubleshoot issues\n\nThis page explains various error scenarios, and provides guidance for resolving\nthe errors.\n\nReplication scenarios\n---------------------\n\nThis section explains replication issues that might occur with your instance.\n\n### How do you monitor replication lag?\n\nMemorystore for Valkey has the [`/instance/replication/maximum_offset_diff`](/memorystore/docs/valkey/supported-monitoring-metrics#instance-level_metrics) metric. This metric\nmonitors the maximum replication offset difference (in bytes) for a node in a\nprimary instance.\n\nKeeping the replication offset difference low lets replicas perform\nincremental sync operations, which cost less than full sync operations,\nmore often.\n\nWe recommend that you set a threshold for the `maximum_offset_diff` metric. If\nthe threshold is exceeded, then Memorystore for Valkey can notify you with an\nalert.\n\nBased on the [node type](/memorystore/docs/valkey/instance-node-specification#choose-node-type) for your instance, we\nrecommend that you set the threshold as follows:\n\n- If the node type is `shared-core-nano`, `standard-small`, or `highmem-medium`, then set the threshold to be less than 64 MB.\n- If the node type is `highmem-xlarge`, then set the threshold to be less than 1 GB.\n\n| **Caution** : We recommend that you use the `shared-core-nano` node type for development or testing purposes only. If you run Memorystore for Valkey in a production environment, then we recommend using the `standard-small`, `highmem-medium`, or `highmem-xlarge` node types. 
For more information about these node types, see [Choose a node type](/memorystore/docs/valkey/instance-node-specification#choose-node-type).\n\n### What do you do if there's a replication lag between your primary instance and its replicas?\n\nThere might be a significant replication lag if the primary instance has too\nmany write operations, and the replicas can't catch up to replicate these\noperations. To resolve this issue, we recommend that you scale the capacity of\nthe instance by [increasing the number of shards for the instance](/memorystore/docs/valkey/scale-instance-capacity#scale_the_shard_count).\n\nMemory management scenarios\n---------------------------\n\nThis section explains memory management issues that your instance might\nencounter.\n\n### Which metric can you use to determine that your instance is under memory stress?\n\nTo monitor the [memory usage](/memorystore/docs/valkey/general-best-practices#monitor-memory-usage) for a Memorystore for Valkey\ninstance, we recommend that you view the [`/instance/memory/maximum_utilization`](/memorystore/docs/valkey/supported-monitoring-metrics#instance-level_metrics)\nmetric. If the memory usage of the instance approaches 80% and you expect data\nusage to grow, then [scale up the size of the instance](/memorystore/docs/valkey/scale-instance-capacity#scale_the_node_type) to improve performance and to make\nroom for new data.\n\nMonitoring scenarios\n--------------------\n\nThis section explains monitoring issues that your instance might encounter.\n\n### How do you set up alerts for Memorystore for Valkey?\n\nYou can use [Cloud Monitoring](/monitoring) to set alerts to notify you if any\nmetrics exceed thresholds that you set for your instance. 
For more information\nabout setting alerts in Cloud Monitoring, see [Set a Monitoring alert for memory usage](/memorystore/docs/valkey/monitor-instances#create-stackdriver-alert).\n\nConnection management scenarios\n-------------------------------\n\nThis section explains connection management issues that your instance might\nencounter.\n\n### If you reach your connection limit or receive a connection timeout, then what do you do?\n\nWhen you reach your connection limit, your client fails to connect to your\nserver. This is known as a *connection rejection*.\n\nIf this happens, then do the following:\n\n- Use the [`/instance/node/stats/rejected_connections_count`](/memorystore/docs/valkey/supported-monitoring-metrics#node-level_metrics) metric to determine the number of connections that Memorystore for Valkey rejects because the instance node reaches the maximum clients limit.\n- Use the [`/instance/node/clients/connected_clients`](/memorystore/docs/valkey/supported-monitoring-metrics#node-level_metrics) metric to determine the number of clients connected to the instance node. This way, you can see if all of the nodes in the instance are under the limit.\n- Stop any leaked or undesired connections by using the [`client kill`](https://valkey.io/commands/client-kill/) command.\n- Reduce the connection count or pool size in the client application. For more information, see the documentation associated with the client application.\n- Adjust the maximum clients limit. 
For more information, see [Configure an instance](/memorystore/docs/valkey/configure-instances).\n- [Scale your instance up to a larger node type](/memorystore/docs/valkey/scale-instance-capacity#scale_the_node_type) so that your instance has a higher connection limit.\n\nTimeout scenarios\n-----------------\n\nThis section explains timeout issues that your instance might encounter.\n\n### If you receive an I/O timeout, then what do you do?\n\nAn I/O timeout occurs when a read or write operation in Memorystore for Valkey\nfails to complete within a specified time. This timeout can happen for various\nreasons. For example, one or more nodes of your instance might be overloaded.\n\nIf you receive an I/O timeout, then do the following:\n\n- Use the [`/instance/cpu/maximum_utilization`](/memorystore/docs/valkey/supported-monitoring-metrics#instance-level_metrics) metric to determine the CPU utilization for a node in your instance, from 0.0 (0%) to 1.0 (100%). We recommend that all nodes have a CPU utilization of less than 80%. For more information, see [CPU usage best practices](/memorystore/docs/valkey/general-best-practices#cpu_usage_best_practices).\n- When the client disconnects from the server because the server times out, retry with [exponential backoff](/memorystore/docs/valkey/exponential-backoff) and with [jitter](https://jitter.video/). This helps to prevent multiple clients from overloading the server simultaneously.\n\nConnectivity error scenarios\n----------------------------\n\nThis section explains connectivity issues that your instance might encounter.\n\n### Connection error caused by firewall rules\n\nFirewall rules can cause connection errors by blocking the ports used by\nMemorystore for Valkey. You should allow list all ports for both of your instance's\nPrivate Service Connect endpoints. 
For more information about the\nendpoints, see [Reserved network addresses](/memorystore/docs/valkey/networking#reserved_network_addresses).\n\n### Connection error caused by organization policies\n\nAn [organization policy](/vpc/docs/private-service-connect-security#org-policies)\nmight block Private Service Connect connections to your Memorystore for Valkey instance.\n\nIf your organization policy uses the `compute.restrictPrivateServiceConnectProducer` policy,\nthen allow list the `961333125034` folder number, which is a folder specifically for Memorystore for Valkey. For example:\n\n```\nname: organizations/Consumer-org-1/policies/compute.restrictPrivateServiceConnectProducer\nspec:\n  rules:\n  - values:\n      allowedValues:\n      - under:folders/961333125034\n```\n\nIf your organization policy uses the `compute.disablePrivateServiceConnectCreationForConsumers` policy,\nthen allow list `SERVICE_PRODUCERS`. For example:\n\n```\nname: organizations/Consumer-org-1/policies/compute.disablePrivateServiceConnectCreationForConsumers\nspec:\n  rules:\n  - values:\n      allowedValues:\n      - SERVICE_PRODUCERS\n```\n\n### Handling errors for Cluster Mode Disabled instances\n\n- If the application connects to the read endpoint of an instance that has no read replicas, then the connection closes and the `ERR no replicas found` error message appears. In this case, either connect the application to the primary endpoint or add read replicas to the instance.\n\n- In the event of a failover, the existing connections from your application close and the `ERR role change occurred` error message appears. You also see this error message if your application connects to the read endpoint of an instance and all read replicas of the instance are failing. In this case, the application should retry the connection with [exponential backoff](/memorystore/docs/valkey/exponential-backoff)."]]