Non-Destructive Kubernetes Worker Node Updates

Introduction

We are excited to announce that OCI Kubernetes Engine (OKE) users can now update worker nodes through boot volume replacement. This feature lets you update properties of worker nodes in managed node pools, including the Kubernetes version, host OS image, and more, without destroying the underlying instance. It applies to worker nodes backed by both virtual machine and bare metal OCI Compute instances. Boot volume replacement is a much faster way to upgrade bare metal instances than terminating and replacing them. Because the instance itself is not terminated, it also removes the risk that there is insufficient capacity to create a new instance after the original is terminated. Cycling nodes through OKE is Kubernetes-aware and respects your Kubernetes-level availability configurations.

Figure: Cycling action selection

Background

OKE introduced worker node cycling to simplify the process of updating managed worker nodes in OKE clusters. It enables you to trigger the replacement of all existing nodes in a node pool with nodes running updated properties. You simply modify the properties of a node pool and then cycle the nodes to apply your changes.

The original version of node cycling could only be applied to worker nodes backed by virtual machine instances, the dominant worker node type, and excluded those backed by bare metal. Over the years, more and more customers adopted bare metal compute instances for their worker nodes because of the high performance and strong isolation the instance type offers. This was especially true for the large number of customers with AI/ML workloads who wanted to maximize the performance of their model training by using bare metal GPU instances. We wanted to help these customers apply updates to their nodes in a way that avoids the long time needed to relaunch new instances.

Boot Volume Replacement

In 2024, OCI Compute released Boot Volume Replacement, a feature that allows you to restore your existing instance from a boot volume without terminating the instance. It offers a simple path to upgrade existing instances in a few clicks or an API call. OKE leverages this feature to let you apply updates to nodes backed by both virtual machines and bare metal instances. Through boot volume replacement, you can update the following worker node properties without the need to terminate and replace your nodes:

Kubernetes Version
Host Image
Node Metadata
SSH Public Key
Boot Volume Size

Users who need to update other node pool properties, for example the shape of the node or the KMS key, can still do so through terminating and replacing worker nodes.
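The split between the two update paths can be sketched as a small lookup. This is a minimal illustration in Python, not OKE's API: the property keys and the returned action names are hypothetical labels chosen for this example, and only the list of supported properties comes from the text above.

```python
# Properties the list above names as updatable through boot volume replacement.
# The snake_case keys are hypothetical labels, not OKE API field names.
BOOT_VOLUME_REPLACEABLE = frozenset({
    "kubernetes_version",
    "host_image",
    "node_metadata",
    "ssh_public_key",
    "boot_volume_size",
})


def choose_cycle_action(changed_properties):
    """Return which cycling action can apply the given set of property changes."""
    unsupported = set(changed_properties) - BOOT_VOLUME_REPLACEABLE
    if unsupported:
        # For example, a shape or KMS key change still needs new instances.
        return "terminate_and_replace"
    return "boot_volume_replacement"


print(choose_cycle_action(["kubernetes_version", "host_image"]))
print(choose_cycle_action(["host_image", "shape"]))
```

A single unsupported property is enough to require the terminate-and-replace path in this sketch, which mirrors the rule stated above.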

When you replace the boot volume of a worker node, the compute instance hosting the worker node is stopped, the boot volume is replaced, and the instance is returned to the state it was in prior to the boot volume replacement. When cycling nodes through boot volume replacement, the instance itself is not terminated, and it keeps the same OCID and network address.

Ensuring Workload Availability

Replacing a boot volume requires an instance to stop, which will interrupt any workloads running on the node. OKE offers configuration options that help you avoid workload interruption. You can control the number of worker nodes that are unavailable at the same time during the update operation by specifying a maxUnavailable value. OKE also lets you specify an eviction grace period, the amount of time allowed for draining to complete. If workloads remain to be evicted when the grace period expires, you choose whether the operation times out or moves ahead with the replacement regardless. You can use pod disruption budgets as appropriate for your application to ensure that enough replica pods are running throughout the operation.
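To make the effect of these two settings concrete, here is a minimal sketch in Python. It is a simplified model, not OKE's scheduler: the function and parameter names are hypothetical, maxUnavailable is assumed to be an absolute node count, and fixed batches stand in for however OKE actually orders the work.

```python
def plan_batches(node_names, max_unavailable):
    """Group nodes so that no more than max_unavailable are down at once."""
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    return [
        node_names[i:i + max_unavailable]
        for i in range(0, len(node_names), max_unavailable)
    ]


def after_grace_period(pods_remaining, force_after_grace_period):
    """Decide what happens to a node once the eviction grace period ends."""
    if pods_remaining == 0 or force_after_grace_period:
        # Either the drain finished or the user chose to proceed regardless.
        return "replace_boot_volume"
    return "time_out"


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
for batch in plan_batches(nodes, max_unavailable=2):
    print(batch)
print(after_grace_period(pods_remaining=3, force_after_grace_period=False))
```

A smaller maxUnavailable keeps more capacity online at the cost of more batches, and therefore a longer overall update.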

When you select a node pool and specify that you want to cycle by replacing worker node boot volumes, OKE automatically cordons your nodes, so no additional workloads are scheduled onto them, and then drains workloads off of them to avoid any interruptions that might occur if a pod was still running while the boot volume was replaced. The boot volume of the instance hosting each worker node is then replaced, without terminating the instance. You also have the option to terminate worker nodes immediately, without cordoning and draining them.
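The sequence described above can be modeled in a few lines of Python. This is an in-memory sketch for illustration only: the WorkerNode class, the step names, and the example OCID and image names are hypothetical, and marking the node schedulable again at the end is an assumption about what happens once it rejoins the cluster.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerNode:
    name: str
    ocid: str
    image: str
    schedulable: bool = True
    pods: list = field(default_factory=list)


def cycle_with_boot_volume_replacement(node, new_image, cordon_and_drain=True):
    """Apply new_image to node in place and return the ordered steps taken."""
    steps = []
    if cordon_and_drain:
        node.schedulable = False   # cordon: no new pods are scheduled here
        steps.append("cordon")
        node.pods.clear()          # drain: pods are evicted to run elsewhere
        steps.append("drain")
    steps.append("stop_instance")
    node.pods.clear()              # anything still running is interrupted
    node.image = new_image         # the instance and its OCID are untouched
    steps.append("replace_boot_volume")
    steps.append("start_instance")
    node.schedulable = True        # assumption: node accepts pods again
    return steps


node = WorkerNode("node-a", "ocid1.instance.oc1..example", "image-v1", pods=["web-1"])
print(cycle_with_boot_volume_replacement(node, "image-v2"))
print(node.ocid, node.image)
```

Passing cordon_and_drain=False corresponds to the immediate option, which skips the cordon and drain steps.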

When to Update Nodes with Boot Volume Replacement?

Updating worker nodes through node cycling, whether through terminating and replacing instances, or through boot volume replacement, is a useful procedure in a number of scenarios:

A new version of Kubernetes is released and you want to upgrade your worker nodes to access new features and functionality.
A new version of your chosen worker node host OS is available and there are security patches that need to be applied.
You want to update the cloud-init script used to customize your worker node hosts.
You need to rotate SSH keys in order to adhere to organizational policies.

Replacing worker node boot volumes through node cycling is necessary:

When you want to apply updates to worker nodes backed by bare metal instances.
When you want to apply updates to worker nodes and you are facing capacity constraints for your desired shape in a particular fault domain.

Another great reason to use boot volume replacement is to correct issues caused by configuration drift on your boot volume. In this case, you perform a boot volume replacement without updating your node pool properties, which replaces existing volumes with newly provisioned ones that have the same properties as your original boot volume. For more information, take a look at Kubernetes Worker Node Repair.

Conclusion

Cycling nodes through boot volume replacement allows you to update your instances without the need to terminate and replace them. This approach offers users a way to update managed nodes backed by bare metal instances, such as those with GPU accelerators in use for AI model training, for the first time. Cycling through boot volume replacement is also a great solution to update nodes when capacity for your desired shape is constrained.

For more information, see the following resources:

Mickey Boxell

Product Management
