[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-03。"],[[["\u003cp\u003eThis documentation pertains to Apigee hybrid version 1.4, which is now end-of-life and requires an upgrade to a newer version.\u003c/p\u003e\n"],["\u003cp\u003eCassandra pods in the Pending state during startup may be due to insufficient resources or issues with persistent volume creation, which can be resolved by modifying node pool resources or fixing StorageClass configurations.\u003c/p\u003e\n"],["\u003cp\u003eCassandra pods in the CrashLoopBackoff state can be caused by data center inconsistencies or problems with truststore directory access, which can be solved by deleting stale data or verifying the correctness of provided keys and certificates.\u003c/p\u003e\n"],["\u003cp\u003eNode failures can cause Cassandra pods to remain in the Pending state, necessitating the removal of the dead pod and its VolumeClaim, followed by the creation of a new PersistentVolume for a replacement node.\u003c/p\u003e\n"],["\u003cp\u003eThe troubleshooting process involves using kubectl commands to check pod states, describe pods and PersistentVolumeClaims, and examine logs to diagnose and resolve Cassandra pod issues within the Apigee hybrid environment.\u003c/p\u003e\n"]]],[],null,["# Cassandra troubleshooting guide\n\n| You are currently viewing version 1.4 of the Apigee hybrid documentation. **This version is end of life.** You should upgrade to a newer version. For more information, see [Supported versions](/apigee/docs/hybrid/supported-platforms#supported-versions).\n\n\nThis topic discusses steps you can take to troubleshoot and fix problems with the\n[Cassandra](/apigee/docs/hybrid/v1.4/what-is-hybrid#cassandra-datastore) datastore. Cassandra is a\npersistent datastore\nthat runs in the `cassandra` component of the\n[hybrid runtime architecture](/apigee/docs/hybrid/v1.4/what-is-hybrid#about-the-runtime-plane).\nSee also\n[Runtime service configuration overview](/apigee/docs/hybrid/v1.4/service-config).\n\nCassandra pods are stuck in the Pending state\n---------------------------------------------\n\n### Symptom\n\n\nWhen starting up, the Cassandra pods remain in the **Pending** state.\n\n### Error message\n\n\nWhen you use `kubectl` to view the pod states, you see that one or more\nCassandra pods are stuck in the `Pending` state. The\n`Pending` state indicates that Kubernetes is unable to schedule the pod\non a node: the pod cannot be created. For example: \n\n kubectl get pods -n \u003cvar translate=\"no\"\u003enamespace\u003c/var\u003e\n\n NAME READY STATUS RESTARTS AGE\n adah-resources-install-4762w 0/4 Completed 0 10m\n apigee-cassandra-default-0 0/1 Pending 0 10m\n ...\n\n### Possible causes\n\n\nA pod stuck in the Pending state can have multiple causes. For example:\n\n### Diagnosis\n\nUse `kubectl`\nto describe the pod to determine the source of the error. 
### Possible causes

A pod stuck in the Pending state can have multiple causes. For example:

- The node pool does not have sufficient CPU or memory resources to schedule the pod.
- The pod's persistent volume cannot be created.

### Diagnosis

Use `kubectl` to describe the pod to determine the source of the error. For example:

```
kubectl -n namespace describe pods pod_name
```

For example:

```
kubectl -n apigee describe pods apigee-cassandra-default-0
```

The output may show one of these possible problems:

- If the problem is insufficient resources, you will see a Warning message that indicates insufficient CPU or memory.
- If the error message indicates that the pod has unbound immediate PersistentVolumeClaims (PVC), it means the pod is not able to create its [Persistent volume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/).

### Resolution

#### Insufficient resources

Modify the Cassandra node pool so that it has sufficient CPU and memory resources. See [Resizing a node pool](https://cloud.google.com/kubernetes-engine/docs/how-to/node-pools#resizing_a_node_pool) for details.

#### Persistent volume not created

If you determine a persistent volume issue, describe the PersistentVolumeClaim (PVC) to determine why it is not being created:

1. List the PVCs in the cluster:

    ```
    kubectl -n namespace get pvc

    NAME                                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    cassandra-data-apigee-cassandra-default-0   Bound    pvc-b247faae-0a2b-11ea-867b-42010a80006e   10Gi       RWO            standard       15m
    ...
    ```
2. Describe the PVC for the pod that is failing. For example, the following command describes the PVC bound to the pod `apigee-cassandra-default-0`:

    ```
    kubectl -n apigee describe pvc cassandra-data-apigee-cassandra-default-0

    Events:
    Type     Reason              Age                 From                         Message
    ----     ------              ----                ----                         -------
    Warning  ProvisioningFailed  3m (x143 over 5h)   persistentvolume-controller  storageclass.storage.k8s.io "apigee-sc" not found
    ```

    Note that in this example, the StorageClass named `apigee-sc` does not exist. To resolve this problem, create the missing StorageClass in the cluster, as explained in [Change the default StorageClass](/apigee/docs/hybrid/v1.4/cassandra-config) (a sketch follows these steps).
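For illustration only, a minimal sketch of a StorageClass that would satisfy the missing `apigee-sc` name on GKE; the `kubernetes.io/gce-pd` provisioner and `pd-standard` disk type are assumptions, so use the configuration described in the linked page for your platform:

```
# Hypothetical StorageClass supplying the missing "apigee-sc" name.
# Provisioner and disk type are illustrative GKE values, not an
# Apigee-mandated configuration.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: apigee-sc
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
```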
See also [Debugging Pods](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-pod-replication-controller#debugging-pods).

Cassandra pods are stuck in the CrashLoopBackoff state
------------------------------------------------------

### Symptom

When starting up, the Cassandra pods remain in the **CrashLoopBackoff** state.

### Error message

When you use `kubectl` to view the pod states, you see that one or more Cassandra pods are in the `CrashLoopBackoff` state. This state indicates that Kubernetes is unable to create the pod. For example:

```
kubectl get pods -n namespace

NAME                           READY   STATUS             RESTARTS   AGE
adah-resources-install-4762w   0/4     Completed          0          10m
apigee-cassandra-default-0     0/1     CrashLoopBackoff   0          10m
...
```

### Possible causes

A pod stuck in the `CrashLoopBackoff` state can have multiple causes. For example:

- The pod's data center differs from the previous data center, for example because of stale PVCs left over from an earlier installation.
- The truststore directory cannot be found because the TLS key or certificates in the overrides file are missing or invalid.

### Diagnosis

Check the [Cassandra error log](/apigee/docs/hybrid/v1.4/cassandra-logs) to determine the cause of the problem.

1. List the pods to get the ID of the Cassandra pod that is failing:

    ```
    kubectl get pods -n namespace
    ```
2. Check the failing pod's log:

    ```
    kubectl logs pod_id -n namespace
    ```

### Resolution

Look for the following clues in the pod's log:

#### Data center differs from previous data center

If you see this log message:

```
Cannot start node if snitch's data center (us-east1) differs from previous data center
```

- Check if there are any stale or old PVCs in the cluster and delete them.
- If this is a fresh install, delete all the PVCs and retry the setup. For example:

    ```
    kubectl -n namespace get pvc
    kubectl -n namespace delete pvc cassandra-data-apigee-cassandra-default-0
    ```

#### Truststore directory not found

If you see this log message:

```
Caused by: java.io.FileNotFoundException: /apigee/cassandra/ssl/truststore.p12
(No such file or directory)
```

Verify that the key and certificate files provided in your overrides file are correct and valid. For example:

```
cassandra:
  sslRootCAPath: path-to-root-ca-file
  sslCertPath: path-to-tls-cert-file
  sslKeyPath: path-to-tls-key-file
```
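One quick validity check is to confirm that the certificate and private key actually belong together by comparing their moduli; a minimal sketch, assuming an RSA key pair and using the placeholder paths from the overrides example above:

```
# Both digests must be identical if the certificate matches the key.
openssl x509 -noout -modulus -in path-to-tls-cert-file | openssl md5
openssl rsa -noout -modulus -in path-to-tls-key-file | openssl md5
```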
Node failure
------------

### Symptom

When starting up, the Cassandra pods remain in the Pending state. This problem can indicate an underlying node failure.

### Diagnosis

1. Determine which Cassandra pods are not running:

    ```
    kubectl get pods -n your_namespace

    NAME                  READY   STATUS    RESTARTS   AGE
    cassandra-default-0   0/1     Pending   0          13s
    cassandra-default-1   1/1     Running   0          8d
    cassandra-default-2   1/1     Running   0          8d
    ```
2. Check the worker nodes. If one is in the **NotReady** state, then that is the node that has failed:

    ```
    kubectl get nodes

    NAME                          STATUS     ROLES    AGE   VERSION
    ip-10-30-1-190.ec2.internal   Ready      <none>   8d    v1.13.2
    ip-10-30-1-22.ec2.internal    Ready      master   8d    v1.13.2
    ip-10-30-1-36.ec2.internal    NotReady   <none>   8d    v1.13.2
    ip-10-30-2-214.ec2.internal   Ready      <none>   8d    v1.13.2
    ip-10-30-2-252.ec2.internal   Ready      <none>   8d    v1.13.2
    ip-10-30-2-47.ec2.internal    Ready      <none>   8d    v1.13.2
    ip-10-30-3-11.ec2.internal    Ready      <none>   8d    v1.13.2
    ip-10-30-3-152.ec2.internal   Ready      <none>   8d    v1.13.2
    ip-10-30-3-5.ec2.internal     Ready      <none>   8d    v1.13.2
    ```

### Resolution

1. Remove the dead Cassandra node from the cluster:

    ```
    kubectl exec -it apigee-cassandra-default-0 -- nodetool status
    kubectl exec -it apigee-cassandra-default-0 -- nodetool removenode deadnode_hostID
    ```
2. Remove the VolumeClaim from the dead node to prevent the Cassandra pod from attempting to come up on the dead node because of the affinity:

    ```
    kubectl get pvc -n your_namespace
    kubectl delete pvc volumeClaim_name -n your_namespace
    ```
3. Update the volume template and create a PersistentVolume for the newly added node. The following is an example volume template:

    ```
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: cassandra-data-3
    spec:
      capacity:
        storage: 100Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: local-storage
      local:
        path: /apigee/data
      nodeAffinity:
        required:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["ip-10-30-1-36.ec2.internal"]
    ```
4. Replace the values with the new hostname/IP and apply the template (see the verification sketch after these steps):

    ```
    kubectl apply -f volume-template.yaml
    ```
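Afterwards, you can confirm that the new PersistentVolume binds and the pod leaves the Pending state; a minimal sketch, reusing the names from the example above:

```
kubectl get pv cassandra-data-3
kubectl get pods -n your_namespace
```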