This document describes how a cluster behaves if vCenter Server is down.
While vCenter Server is down:
- The machines are in the `Available` state.
- The nodes are in the `Ready` state.
- The Pods are in the `Running` state.
- There are some expected errors in Pods that connect to vCenter Server; for example, the `vsphere-controller-manager` and `cluster-health-controller` Pods.
- Stateless Pods can be created and deleted.
- The creation of a stateful Pod will fail, because attaching a disk requires access to vCenter Server. These Pods will be in the `Pending` state.
- The `gkectl diagnose` command will fail with an error similar to the following:

  `Exit with error: failed to prepare diagnose parameters: failed to create vSphere client: Post "https://my-server": dial tcp 203.0.113.1:443: connect: connection timed out`
- Auto repair is not triggered, because machine and node states do not change when connections to vCenter Server fail.
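
The `connection timed out` error from `gkectl diagnose` indicates that a TCP connection to the vCenter Server address on port 443 could not be established. As a rough illustration only, the following Python sketch performs a similar reachability check. The function name `vcenter_reachable`, the default timeout, and the host name used in the usage note are assumptions made for this example; the sketch is not part of `gkectl` and does not validate TLS or vCenter credentials.

```python
import socket


def vcenter_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration: it only tests TCP reachability,
    which is the step that fails in the error message shown above.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False
```

For example, `vcenter_reachable("my-server")` would be expected to return `False` while vCenter Server is down, which is a hint that stateful Pod creation and `gkectl diagnose` will fail until connectivity returns.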
After vCenter Server comes back online (versions < 7.0U2)
- The machines go to the `Unavailable` state, and auto repair or a manual workaround is needed to restore the correct states.
- The cluster functions correctly even though the machines are in the `Unavailable` state.
After vCenter Server comes back online (versions >= 7.0U2)
- No extra steps are needed, and the cluster is healthy again.
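
The version-dependent recovery behavior described above can be sketched as a small Python helper. This is a minimal illustration under stated assumptions: the version string format (`7.0U2`, `6.7U3`, `7.0`) and the function names `parse_vcenter_version` and `recovers_without_extra_steps` are hypothetical and chosen for this example; only the 7.0U2 threshold comes from this document.

```python
import re
from typing import Tuple


def parse_vcenter_version(version: str) -> Tuple[int, int, int]:
    """Parse a string such as '7.0U2' into (major, minor, update).

    Assumed format: '<major>.<minor>' with an optional 'U<update>' suffix.
    A missing update suffix is treated as update 0.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\s*U(\d+))?", version.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, update = match.groups()
    return int(major), int(minor), int(update or 0)


def recovers_without_extra_steps(version: str) -> bool:
    """Return True for versions >= 7.0U2, where no extra steps are needed.

    For earlier versions, machines go to the Unavailable state and need
    auto repair or a manual workaround.
    """
    return parse_vcenter_version(version) >= (7, 0, 2)
```

With these assumptions, `recovers_without_extra_steps("7.0U1")` evaluates to `False` and `recovers_without_extra_steps("7.0U2")` evaluates to `True`.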