Achieving high availability (HA) in Kubernetes goes beyond the control plane. You must also design and deploy resilient container workloads in your Google Distributed Cloud (GDC) air-gapped universe. Kubernetes offers several mechanisms to minimize downtime and keep your services available during infrastructure failures and routine maintenance. Consider the following key strategies for HA:
Maintain availability with replicas and autoscaling: You must run enough instances of your application to ensure HA.
ReplicaSet: A ReplicaSet resource ensures that a specific number of identical pod replicas are running at any given time. If a pod fails or is terminated, the ReplicaSet controller automatically creates a new pod to replace it. See ReplicaSet Kubernetes documentation for more information.
Horizontal Pod Autoscaler (HPA): While a ReplicaSet maintains a fixed number of replicas, the HPA automatically adjusts this number based on observed metrics like CPU utilization or memory usage. This allows your application to handle load spikes. See Horizontal Pod Autoscaling Kubernetes documentation for more information.
Minimize Downtime with PodDisruptionBudget (PDB): See Specifying a Disruption Budget for your Application Kubernetes documentation for more information.
Spread Your Risk with Anti-Affinity Rules: See Affinity and anti-affinity Kubernetes documentation for more information.
Health Checks with Liveness, Readiness, and Startup Probes: See Configure Liveness, Readiness and Startup Probes Kubernetes documentation for more information.
Stable Endpoints and Load Balancing with Services: See Services Kubernetes documentation for more information.
Graceful Updates and Rollbacks with Deployments: See Rolling Back a Deployment Kubernetes documentation for more information.
Ensure Resources with Requests and Limits: See Resource Management for Pods and Containers Kubernetes documentation for more information.
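To make the Horizontal Pod Autoscaler item above more concrete, the following Python sketch approximates how the HPA derives a replica count from an observed metric. It follows the scaling formula in the upstream Kubernetes documentation: the desired count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up, with no change while that ratio stays within a tolerance (10% by default). The function name, parameter names, and default bounds are illustrative assumptions, not a Kubernetes or GDC API. The real controller also accounts for unready pods, missing metrics, and stabilization windows, which this sketch omits.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative helper only).

    current_metric and target_metric use the same unit, for example
    average CPU utilization in percent across the current pods.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Within the tolerance band the autoscaler leaves the count unchanged,
    # which avoids flapping on small metric fluctuations.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Scale proportionally to the metric ratio, rounding up.
    desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds (assumed values in this sketch).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
    # 4 replicas at 63% average CPU: inside the 10% tolerance band.
    print(desired_replicas(4, 63, 60))
```

In this sketch, a load spike that pushes average CPU to 90% against a 60% target grows the workload from 4 to 6 replicas, while a reading of 63% changes nothing because it falls inside the tolerance band.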