In a Google Distributed Cloud implementation, the control-plane VM for an
admin cluster has two attached disks:
- The boot disk has the operating system for the VM.
- The data disk has credentials and the etcd database, which stores the state of
  the admin cluster. That is, the data disk stores all of the Kubernetes objects
  for the admin cluster.
This page shows you how to recover when the control-plane VM is lost or the boot
disk is compromised. For example:
- The boot disk becomes read-only due to spam journal logs.
- The Docker overlay filesystem gets corrupted.
This page does not cover recovery of the data disk. For instructions on how to
recover the data disk, see
Restoring an admin cluster.
Repair the control-plane VM
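Warning: Don't run gkectl repair admin-master after a failed admin upgrade
attempt. Instead, resume the admin upgrade.

The steps to repair the admin cluster's control-plane VM differ
slightly depending on whether you have a high-availability (HA) admin cluster
or a non-HA admin cluster.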
HA
An HA admin cluster has three control-plane VMs. At least two VMs must be
running to bring up the cluster control plane. If all three VMs have failed,
repair the failed VMs one at a time. After the second VM is repaired and
running, the cluster control plane should come back up.
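
The two-of-three requirement is consistent with majority quorum, which is how etcd decides whether a cluster can accept writes. The following is a minimal sketch of that arithmetic, assuming the admin cluster's control plane follows the standard majority rule; the `quorum` function name is hypothetical and not part of gkectl.

```shell
#!/bin/sh
# Hypothetical helper (not a gkectl feature): prints the smallest number of
# healthy members that forms a majority of a cluster with N members.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

members=3
needed=$(quorum "$members")
echo "members=$members quorum=$needed tolerated_failures=$(( members - needed ))"
```

With three members the majority is two, so one failed VM can be tolerated, and the control plane can come back once two VMs are running again.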
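Run the following command:

    gkectl repair admin-master --config ADMIN_CLUSTER_CONFIG --kubeconfig ADMIN_CLUSTER_KUBECONFIG

Replace the following:
- ADMIN_CLUSTER_CONFIG with the path of your admin cluster
  configuration file.
- ADMIN_CLUSTER_KUBECONFIG with the path of your admin cluster's
  kubeconfig file.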
The output of the command is similar to the following:
Please select the control plane VM template to be used for re-creating the admin cluster's control plane VM.
[1] VM template: /atl-qual-vc07/vm/gke-admin-57f8g-fx9f4c729448z2v8-2-tmpl
GKE on-prem version: 1.16.0-gke.550
Creation time: 2023-07-25 01:52:51.815518 +0000 UTC
CPU: 4 CPU(s)
Memory: 16384 MB
Data disk: [vsanDatastore] 37a73d64-b823-47cd-2e0c-00620b9189a0/gke-admin-57f8g/default/gke-admin-57f8g-2-data.vmdk
[2] VM template: /atl-qual-vc07/vm/gke-admin-57f8g-fx9f4c729448z2v8-0-tmpl
GKE on-prem version: 1.16.0-gke.550
Creation time: 2023-07-25 01:52:54.228252 +0000 UTC
CPU: 4 CPU(s)
Memory: 16384 MB
Data disk: [vsanDatastore] 37a73d64-b823-47cd-2e0c-00620b9189a0/gke-admin-57f8g/default/gke-admin-57f8g-0-data.vmdk
[3] VM template: /atl-qual-vc07/vm/gke-admin-57f8g-fx9f4c729448z2v8-1-tmpl
GKE on-prem version: 1.16.0-gke.550
Creation time: 2023-07-25 01:52:54.210705 +0000 UTC
CPU: 4 CPU(s)
Memory: 16384 MB
Data disk: [vsanDatastore] 37a73d64-b823-47cd-2e0c-00620b9189a0/gke-admin-57f8g/default/gke-admin-57f8g-1-data.vmdk
Please enter your numeric choice:
Enter the number for the VM that you want to repair. If you don't see
the VM in the output, contact Google Cloud Support.
If you have three VMs that need to be repaired, gkectl repair
admin-master outputs an error message similar to the
following after repairing the first VM:
If you are repairing admin control plane VM for HA admin cluster,
it's possible that the API server is still down after repairing one
of the VMs. Try continue fixing other control plane VMs listed to
recover the quorum of control plane.
In this case, re-run the command to repair the second VM.
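
One way to decide whether another repair pass is needed is to check whether the API server answers after each repair. The following is a rough sketch, assuming `kubectl` is installed on the admin workstation; the function names `api_server_up` and `next_step` are hypothetical helpers, not gkectl features.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the admin cluster API server responds.
# $1 is the path of the admin cluster's kubeconfig file.
api_server_up() {
  kubectl --kubeconfig "$1" --request-timeout=10s get nodes >/dev/null 2>&1
}

# Hypothetical helper: prints a suggestion based on API server reachability.
next_step() {
  if api_server_up "$1"; then
    echo "API server is reachable; control plane quorum is restored."
  else
    echo "API server is still down; re-run gkectl repair admin-master for the next failed VM."
  fi
}
```

For example, run `next_step ADMIN_CLUSTER_KUBECONFIG` after each repair. The check only reports reachability; it does not run the repair itself.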
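Non-HA
Run the following command:

    gkectl repair admin-master \
        --config ADMIN_CLUSTER_CONFIG \
        --kubeconfig ADMIN_CLUSTER_KUBECONFIG

Replace the following:
- ADMIN_CLUSTER_CONFIG with the path of your admin cluster
  configuration file.
- ADMIN_CLUSTER_KUBECONFIG with the path of your admin cluster's
  kubeconfig file.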
The admin cluster's control-plane VM is cloned into a VM template, which has
all the information needed to re-create the VM. The gkectl repair admin-master
command uses the VM template to create a new VM. Then it attaches a new
boot disk and the existing data disk.
If your cluster nodes get their addresses from a DHCP server, the new VM might
have a different IP address from the original VM.
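
Because the address can change under DHCP, it can help to list the node addresses after the repair finishes. The following is a small sketch, assuming `kubectl` is available and the re-created control-plane VM has registered as a node; `node_addresses` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Hypothetical wrapper: lists admin cluster nodes with their IP addresses.
# $1 is the path of the admin cluster's kubeconfig file.
node_addresses() {
  kubectl --kubeconfig "$1" get nodes -o wide
}
```

Calling `node_addresses ADMIN_CLUSTER_KUBECONFIG` prints the wide node listing, which includes an INTERNAL-IP column that you can compare with the address of the original VM.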
What's next
- If you need additional assistance, reach out to Cloud Customer Care (/support-hub).
- You can also see Getting support (/kubernetes-engine/distributed-cloud/vmware/docs/getting-support)
  for more information about support resources, including the following:
  - Requirements for opening a support case.
  - Tools to help you troubleshoot, such as logs and metrics.
  - Supported components, versions, and features of Google Distributed Cloud
    for VMware (software only).