When you install a new version of bmctl, you can upgrade your existing clusters that were created with an earlier version. Upgrading a cluster to the latest Google Distributed Cloud version brings added features and fixes to your cluster. It also ensures that your cluster remains supported.
You can upgrade admin, hybrid, standalone, or user clusters with the bmctl upgrade cluster command, or you can use kubectl.
Starting with Google Distributed Cloud version 1.15.0, the default upgrade behavior for self-managed (admin, hybrid, or standalone) clusters is an in-place upgrade. In-place upgrades use lifecycle controllers, instead of a bootstrap cluster, to manage the entire upgrade process. This change simplifies the process and reduces resource requirements, which makes cluster upgrades more reliable and scalable.
To learn more about the upgrade process, see Lifecycle and stages of cluster upgrades.
Upgrade considerations
This section contains information and links to related resources that you should review before you upgrade a cluster.
Best practices
For information to help you prepare for a cluster upgrade, see Best practices for Anthos clusters on bare metal cluster upgrades.
Upgrade preflight checks
Preflight checks are run as part of the cluster upgrade to validate cluster status and node health. The cluster upgrade doesn't proceed if the preflight checks fail. For more information on preflight checks, see Understand preflight checks.
You can check whether your clusters are ready for an upgrade by running the preflight checks before you run the upgrade. For more information, see Preflight checks for upgrades.
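To run these checks ahead of time, update anthosBareMetalVersion in the cluster configuration file and then invoke the preflight check with bmctl. The following is a hedged sketch; confirm the subcommand and flags against your bmctl version:

    # Run upgrade preflight checks without starting the upgrade.
    bmctl check preflight -c CLUSTER_NAME --kubeconfig ADMIN_KUBECONFIG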
Known issues
For information about potential problems related to cluster upgrades, see Anthos clusters on bare metal known issues and select the Upgrades and updates problem category.
Upgrade admin, standalone, hybrid, or user clusters
This section contains instructions for upgrading clusters.
bmctl
When you download and install a new version of bmctl, you can upgrade your admin, hybrid, standalone, and user clusters created with an earlier version. For a given version of bmctl, a cluster can be upgraded only to that same version.
1. Download the latest bmctl as described in Google Distributed Cloud downloads.

2. Update anthosBareMetalVersion in the cluster configuration file to the upgrade target version. The upgrade target version must match the version of the downloaded bmctl file. The following cluster configuration file snippet shows the anthosBareMetalVersion field updated to the latest version:

       ---
       apiVersion: baremetal.cluster.gke.io/v1
       kind: Cluster
       metadata:
         name: cluster1
         namespace: cluster-cluster1
       spec:
         type: admin
         # Anthos cluster version.
         anthosBareMetalVersion: 1.15.11

3. Use the bmctl upgrade cluster command to complete the upgrade:

       bmctl upgrade cluster -c CLUSTER_NAME --kubeconfig ADMIN_KUBECONFIG
Replace the following:
- CLUSTER_NAME: the name of the cluster to upgrade.
- ADMIN_KUBECONFIG: the path to the admin cluster kubeconfig file.
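For example, for a cluster named cluster1, the command might look like the following. This is a hedged sketch: the kubeconfig path is an assumption based on the default bmctl workspace layout.

    # Upgrade cluster1, assuming the default bmctl-workspace directory layout.
    bmctl upgrade cluster -c cluster1 \
        --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig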
The cluster upgrade operation runs preflight checks to validate cluster status and node health. The cluster upgrade doesn't proceed if the preflight checks fail. For troubleshooting information, see Troubleshoot cluster install or upgrade issues.
When all of the cluster components have been successfully upgraded, the cluster upgrade operation performs cluster health checks. This last step verifies that the cluster is in good operating condition. If the cluster doesn't pass all health checks, they continue to run until they pass. When all health checks pass, the upgrade finishes successfully.
For more information about the sequence of events for cluster upgrades, see Lifecycle and stages of cluster upgrades.
kubectl
To upgrade a cluster with kubectl, perform the following steps:

1. Edit the cluster configuration file to set anthosBareMetalVersion to the upgrade target version.

2. To initiate the upgrade, run the following command:

       kubectl apply -f CLUSTER_CONFIG_PATH

   Replace CLUSTER_CONFIG_PATH with the path to the edited cluster configuration file.
As with the upgrade process with bmctl, preflight checks are run as part of the cluster upgrade to validate cluster status and node health. If the preflight checks fail, the cluster upgrade is halted. To troubleshoot any failures, examine the cluster and related logs, since no bootstrap cluster is created. For more information, see Troubleshoot cluster install or upgrade issues.
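While the upgrade runs, you can watch its progress by inspecting the cluster resource. The following is a minimal sketch; if the short name cluster is ambiguous on your system, use the fully qualified resource name clusters.baremetal.cluster.gke.io:

    # Inspect the cluster resource for upgrade status and events.
    kubectl describe cluster CLUSTER_NAME \
        --namespace cluster-CLUSTER_NAME \
        --kubeconfig ADMIN_KUBECONFIG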
Although you don't need the latest version of bmctl to upgrade clusters with kubectl, we recommend that you download the latest bmctl. You need bmctl to perform other tasks, such as health checks and backups, to ensure that your cluster stays in good working order.
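For example, you might verify cluster health after an upgrade with a bmctl health check (a hedged sketch; confirm the flags for your bmctl version):

    # Run health checks against the upgraded cluster.
    bmctl check cluster -c CLUSTER_NAME --kubeconfig ADMIN_KUBECONFIG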
Parallel upgrades
In a typical, default cluster upgrade, each cluster node is upgraded sequentially, one after the other. This section shows you how to configure your cluster and worker node pools so that multiple nodes upgrade in parallel when you upgrade your cluster. Upgrading nodes in parallel speeds up cluster upgrades significantly, especially for clusters that have hundreds of nodes.
There are two parallel upgrade strategies that you can use to speed up your cluster upgrade:

- Concurrent node upgrade: you can configure your worker node pools so that multiple nodes upgrade in parallel. Parallel upgrades of nodes are configured in the NodePool spec (spec.upgradeStrategy.parallelUpgrade), and only nodes in a worker node pool can be upgraded in parallel. Nodes in control plane or load balancer node pools can only be upgraded one at a time. For more information, see Node upgrade strategy.

- Concurrent node pool upgrade: you can configure your cluster so that multiple node pools upgrade in parallel. Only worker node pools can be upgraded in parallel. Control plane and load balancer node pools can only be upgraded one at a time. The ability to upgrade multiple node pools concurrently is available for Public Preview. For more information, see Node pool upgrade strategy.
Node upgrade strategy
You can configure worker node pools so that multiple nodes upgrade concurrently (concurrentNodes). You can also set a minimum threshold for the number of nodes able to run workloads throughout the upgrade process (minimumAvailableNodes). This configuration is made in the NodePool spec. For more information about these fields, see the Cluster configuration field reference.
The node upgrade strategy applies to worker node pools only. You can't specify a node upgrade strategy for control plane or load balancer node pools. During a cluster upgrade, nodes in control plane and load balancer node pools upgrade sequentially, one at a time. Control plane node pools and load balancer node pools are specified in the Cluster spec (controlPlane.nodePoolSpec.nodes and loadBalancer.nodePoolSpec.nodes).
When you configure parallel upgrades for nodes, note the following restrictions:

- The value of concurrentNodes can't exceed either 20 percent of the number of nodes in the node pool, or 10, whichever is smaller (see the sketch after this list). For example, if your node pool has 40 nodes, you can't specify a value greater than 8. If your node pool has 100 nodes, 10 is the maximum value you can specify.

- When you use concurrentNodes together with minimumAvailableNodes, the combined values can't exceed the total number of nodes in the node pool. For example, if your node pool has 20 nodes and minimumAvailableNodes is set to 18, concurrentNodes can't exceed 2. Likewise, if concurrentNodes is set to 10, minimumAvailableNodes can't exceed 10.
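The cap on concurrentNodes is therefore the smaller of 20 percent of the pool size and 10. The following shell sketch illustrates that arithmetic; it's purely illustrative and not part of any bmctl or kubectl tooling, and the integer division is an assumption, since the documentation doesn't specify rounding:

    # Maximum allowed concurrentNodes for a pool of NODE_COUNT nodes:
    # 20 percent of the pool, capped at 10.
    NODE_COUNT=40
    TWENTY_PERCENT=$(( NODE_COUNT / 5 ))
    echo $(( TWENTY_PERCENT < 10 ? TWENTY_PERCENT : 10 ))  # prints 8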
The following example shows a worker node pool np1 with 10 nodes. In an upgrade, nodes upgrade two at a time, and at least 5 nodes must remain available for the upgrade to proceed:
    apiVersion: baremetal.cluster.gke.io/v1
    kind: NodePool
    metadata:
      name: np1
      namespace: cluster-cluster1
    spec:
      clusterName: cluster1
      nodes:
      - address: 10.200.0.1
      - address: 10.200.0.2
      - address: 10.200.0.3
      - address: 10.200.0.4
      - address: 10.200.0.5
      - address: 10.200.0.6
      - address: 10.200.0.7
      - address: 10.200.0.8
      - address: 10.200.0.9
      - address: 10.200.0.10
      upgradeStrategy:
        parallelUpgrade:
          concurrentNodes: 2
          minimumAvailableNodes: 5
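If you keep the node pool in its own manifest file, you could apply the change directly with kubectl (a hedged sketch; np1.yaml is a hypothetical file name):

    # Apply the updated NodePool manifest to the cluster.
    kubectl apply -f np1.yaml --kubeconfig ADMIN_KUBECONFIG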
Node pool upgrade strategy
You can configure a cluster so that multiple worker node pools upgrade in parallel. The nodePoolUpgradeStrategy.concurrentNodePools Boolean field in the cluster spec specifies whether or not to upgrade all worker node pools for a cluster concurrently. By default (1), node pools upgrade sequentially, one after the other. When you set concurrentNodePools to 0, every worker node pool in the cluster upgrades in parallel.
Control plane and load balancer node pools are not affected by this setting. These node pools always upgrade sequentially, one at a time. Control plane node pools and load balancer node pools are specified in the Cluster spec (controlPlane.nodePoolSpec.nodes and loadBalancer.nodePoolSpec.nodes).
The capability to upgrade all worker node pools concurrently is available for Preview only. Don't use this feature on your production clusters.
The following Cluster spec snippet sets concurrentNodePools to 0 so that all worker node pools upgrade in parallel:

    apiVersion: baremetal.cluster.gke.io/v1
    kind: Cluster
    metadata:
      name: cluster1
      namespace: cluster-cluster1
    spec:
      ...
      nodePoolUpgradeStrategy:
        concurrentNodePools: 0
      ...
How to perform a parallel upgrade
This section describes how to configure a cluster and a worker node pool for parallel upgrades.
To perform a parallel upgrade of worker node pools and nodes in a worker node pool, do the following:
1. Add an upgradeStrategy section to the NodePool spec. You can apply this manifest separately or as part of the cluster configuration file when you perform a cluster update. Here's an example:

       ---
       apiVersion: baremetal.cluster.gke.io/v1
       kind: NodePool
       metadata:
         name: np1
         namespace: cluster-ci-bf8b9aa43c16c47
       spec:
         clusterName: ci-bf8b9aa43c16c47
         nodes:
         - address: 10.200.0.1
         - address: 10.200.0.2
         - address: 10.200.0.3
         ...
         - address: 10.200.0.30
         upgradeStrategy:
           parallelUpgrade:
             concurrentNodes: 5
             minimumAvailableNodes: 10
   In this example, the value of the field concurrentNodes is 5, which means that 5 nodes upgrade in parallel. The minimumAvailableNodes field is set to 10, which means that at least 10 nodes must remain available for workloads throughout the upgrade.

2. Add a nodePoolUpgradeStrategy section to the Cluster spec in the cluster configuration file. For example:

       ---
       apiVersion: v1
       kind: Namespace
       metadata:
         name: cluster-user001
       ---
       apiVersion: baremetal.cluster.gke.io/v1
       kind: Cluster
       metadata:
         name: user001
         namespace: cluster-user001
       spec:
         type: user
         profile: default
         anthosBareMetalVersion: 1.15.0
         ...
         nodePoolUpgradeStrategy:
           concurrentNodePools: 0
         ...
   In this example, the concurrentNodePools field is set to 0, which means that all worker node pools upgrade concurrently during the cluster upgrade. The upgrade strategy for the nodes in the node pools is defined in the NodePool specs.

3. Upgrade the cluster as described in the preceding Upgrade admin, standalone, hybrid, or user clusters section.
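Before you start the upgrade, you can confirm that the strategy fields were accepted by reading them back with kubectl. This is a hedged sketch: the fully qualified resource name clusters.baremetal.cluster.gke.io is inferred from the apiVersion shown in the snippets above, and the cluster name and namespace match the user001 example.

    # Print the nodePoolUpgradeStrategy section of the cluster spec.
    kubectl get clusters.baremetal.cluster.gke.io user001 \
        --namespace cluster-user001 \
        --kubeconfig ADMIN_KUBECONFIG \
        --output jsonpath='{.spec.nodePoolUpgradeStrategy}'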
How to disable parallel upgrades of nodes
Parallel upgrades are disabled by default and the fields related to parallel upgrades are mutable. At any time, you can either remove the fields or set them to their default values to disable the feature before a subsequent upgrade.
The following table lists the parallel upgrade fields and their default values:
Field | Default value | Meaning
---|---|---
nodePoolUpgradeStrategy.concurrentNodePools (Cluster spec) | 1 | Upgrade worker node pools sequentially, one after the other.
upgradeStrategy.parallelUpgrade.concurrentNodes (NodePool spec) | 1 | Upgrade nodes sequentially, one after the other.
upgradeStrategy.parallelUpgrade.minimumAvailableNodes (NodePool spec) | 0 | There is no requirement to ensure that any nodes are available during an upgrade.
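For example, to disable parallel node upgrades before a subsequent upgrade, you can reset the NodePool fields to the defaults listed in the table. A minimal sketch of the resulting spec fragment:

    # NodePool spec fragment with parallel upgrade fields reset to their
    # defaults, which restores sequential, one-node-at-a-time upgrades.
    upgradeStrategy:
      parallelUpgrade:
        concurrentNodes: 1
        minimumAvailableNodes: 0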