Have you ever wondered how Oracle Cloud keeps our cloud services updated across hundreds of services, thousands of instances, and over 51 regions in 26 countries? The answer is Oracle Linux with immutable, atomic updates. With hundreds of thousands of immutable Oracle Linux instances powering the Oracle Cloud – and more every day – Oracle likely operates one of the largest fleets of immutable instances in the world.

This blog is intended for anyone who’s interested in using immutable images to simplify their deployments and updates.

The Value of Atomic Updates

A common issue with traditional Linux updates is the risk of an update being disrupted or failing to complete. A power failure or a user-canceled job can leave packages partially applied and the system in an unknown state. RPMs are notoriously difficult to remove once installed, and “rolling back” an update on a production workload can significantly increase downtime during a maintenance event.

With our immutable images, an Oracle Linux update either completes or is reverted to the previous, known-good state. We accomplish this with OSTree. OSTree is a mechanism for indexing and delivering full OS images, and is sometimes described as “git for your filesystem”. All OSTree commits are signed, and content checksums verify the integrity of the filesystem tree during deployment. The ostree client verifies the signature and rejects the image if it does not match. This validates the authenticity of the image and prevents tampering with malicious or untrusted content, improving the overall security of the environment.
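
As a rough sketch of how that verification model looks with stock ostree tooling (the key ID, paths, and remote URL below are placeholders, not Oracle’s actual build pipeline):

# Build side: sign the commit when it is created.
$ ostree --repo=/srv/ostree/repo commit -b ol9/base \
    --tree=dir=/srv/build/rootfs --gpg-sign=<KEYID> --gpg-homedir=/srv/gnupg
# Client side: import the public key when adding the remote; pulls are
# rejected if the commit signature does not verify against that key.
$ sudo ostree --repo=/ostree/repo remote add --gpg-import=/etc/pki/ostree/pubkey.asc \
    ol-atomic https://ostree.example.com/repo
$ sudo ostree --repo=/ostree/repo pull ol-atomic ol9/base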

OSTree has been around for years, but it was the introduction of rpm-ostree that really made this technology usable for Oracle Linux. While ostree manages the filesystem, rpm-ostree (re)introduces the concept of package management into the ostree environment, providing a hybrid of the fully git-like, tree-managed filesystem and the advantages of package-based deployments. It also minimizes the changes to the Oracle Linux build process needed to produce an atomic image.

At its core, rpm-ostree provides the foundation for managing atomic and immutable systems. The rpm-ostree environment uses a content-addressed object store to uniquely identify each file and creates bootable filesystem trees that are atomic, meaning they are modified and delivered as an entire tree rather than package by package. If at any point a copy fails, the entire update is aborted, leaving the system in the state it was in before the copy began.
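
You can see the content-addressed layout for yourself on any ostree-managed host, or in the test repo built later in this post: every file is stored exactly once under objects/, named by its checksum, and filesystem trees simply reference those objects.

# Objects live in a two-character fan-out, named by content checksum
# (this assumes the default system repo location of /ostree/repo):
$ find /ostree/repo/objects -type f | head -3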

How we got here

Five years ago we faced several challenges involving updates across services, including the time required to roll out monthly updates. Service teams have limited windows to execute update operations, and Oracle needed a way to improve deployment times and reduce the overhead associated with maintenance windows. Controlling maintenance window activities means managing OS and application changes, replicated across all regions, every month. The model at the time used a utility that applied DNF updates across each host, but in the unfortunate event that an update introduced unexpected behavior or change, rollbacks were not simple. These challenges presented the opportunity to evolve the existing process and design a solution for fast, secure, and fault-tolerant updates, using atomic updates with OSTree.

Deep Dive into OSTree and rpm-ostree

ostree is a version control system for complete filesystems, enabling users to manage and deploy immutable root filesystems much like git manages source code repositories. rpm-ostree extends ostree with support for RPM package management, allowing atomic updates and rollbacks by composing ostree trees using RPM packages. While ostree handles filesystem snapshots directly, rpm-ostree adds, removes, or updates RPM packages on these snapshots, making it ideal for managing system images in RPM-based Linux distributions. At Oracle, we tend to refer to OSTree and rpm-ostree interchangeably – because we’re only using rpm-ostree. While we looked at doing a pure ostree deployment in the past, the rpm-ostree model fits much better with our build and release processes.
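
On a running rpm-ostree system you can see both layers at work with the standard tooling:

$ rpm-ostree status          # deployments: booted, staged, and rollback entries
$ sudo ostree admin status   # the same deployments, as ostree sees them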

OSTree uses a content-addressed object store to uniquely identify files and creates atomic, bootable filesystem trees delivered as complete units rather than individual packages. If a copy fails, the update aborts, and the system remains unchanged. This model makes it easy to roll back to a previous state and only requires transferring changed files, while preserving user data in updates and rollbacks.
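
In day-to-day use, that model boils down to a couple of commands; here’s a minimal sketch on a host subscribed to an ostree remote:

$ sudo rpm-ostree upgrade    # download only changed objects and stage the new tree
$ sudo systemctl reboot      # atomically switch to the new deployment
# If anything misbehaves, point the bootloader back at the previous tree:
$ sudo rpm-ostree rollback
$ sudo systemctl reboot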

Ostree-managed filesystems act like a chroot, where each branch is an isolated root. An empty /sysroot directory in the tree is bind-mounted to the real root of the physical filesystem. On boot, root (/) is mounted read-only, preventing changes to critical system settings, which enhances security. Immutable and mutable content are separated: / and /usr are read-only, while directories under /var (including /var/home and /var/opt) and /etc are mutable, and changes made there persist across updates and rollbacks. The rpm-ostree tool ensures packages needing access to /usr are supported with appropriate symlinks.
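
A quick way to see this split on any rpm-ostree-managed host:

$ sudo touch /usr/demo
touch: cannot touch '/usr/demo': Read-only file system
$ sudo touch /etc/demo && echo writable   # /etc and /var accept changes that persist
writable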

System updates with rpm-ostree replace the full filesystem tree. Changes under /var and /etc are preserved across updates, but manual modifications to files like /etc/passwd may detach them from automatic updates.
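
ostree can report exactly which files under /etc have drifted from the defaults shipped in the booted tree, which makes it easy to spot configuration that is now managed by hand:

# List /etc files that differ from the current deployment's defaults
# (M = modified, A = added, D = deleted):
$ sudo ostree admin config-diff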

With rpm-ostree, updates and installs require a reboot to commit the change, similar to committing changes in git. There is also an apply-live feature that allows additions such as RPM packages to be applied to the running system. In addition, on Oracle Linux, Ksplice is available for zero-downtime updates to the kernel and key libraries like glibc and OpenSSL, reducing the need for outages until scheduled maintenance. A future post will cover how we make Ksplice compatible with atomic updates and immutable images.
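
For example, layering a package stages a new deployment that takes effect at the next boot, and newer rpm-ostree releases can optionally apply the staged change to the running system (the package name here is just a placeholder, and the apply-live subcommand may sit under "ex" on older versions):

$ sudo rpm-ostree install strace   # layered into a new deployment; active after reboot
$ sudo rpm-ostree apply-live       # optionally apply the staged changes right away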

How it’s going

We’ve achieved some pretty impressive numbers by switching to this model for deploying Oracle Linux in the Oracle Cloud.

  • OS boot and first boot: reduced from 20 minutes to 3.5 minutes
  • Patch times: reduced from 20 minutes to 3 minutes or less
  • Rollbacks: previously not used; now automated and executed within 2 minutes
  • Deterministic updates – no more broken states from failed updates
  • CPU usage: from 1.5 cores for traditional updates down to 0.3 vCPU

The migration to Oracle Linux atomic images is an exciting new phase for our service teams in Oracle Cloud, and the results are clear – faster boots, smaller maintenance windows, and better tools for managing change. We’re gaining more control and efficiency, and in upcoming blogs we will share more details on how we use Oracle Linux to build atomic images and how we integrate with existing package repositories for customizing content.

How can you try it out?

Here are some simple steps for deploying ostree with Oracle Linux. Please note that this hand-waves away a lot of important detail when it comes to OS updates and lifecycles. For most use cases, you’ll be replacing your existing rpm operations with rpm-ostree install <pkgname>. Here’s how to get started.

  1. Install required packages: dnf install -y ostree rpm-ostree
  2. Create your ostree path: mkdir $HOME/ostree-test; cd $HOME/ostree-test
  3. Init ostree! ostree --repo=$(pwd) init
  4. Check out some packages with dnf: sudo dnf install --installroot=$(pwd) --releasever=9 oraclelinux-release rpm-ostree bash coreutils kernel-uek-core grub2-efi-x64 -y
  5. Commit your repo (sudo is needed to read the root-owned files created in step 4): sudo ostree --repo=$(pwd) commit -b my_ostree_test --tree=dir=$(pwd) --subject="My first ostree OL9 commit"
  6. Generate a summary file: sudo ostree --repo=$(pwd) summary --update

Congratulations, you’ve committed your first ostree repo!
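
Before checking anything out, you can inspect what you just committed from the same directory; ostree’s refs, log, and ls subcommands will feel familiar if you know git:

$ sudo ostree --repo=$(pwd) refs                     # lists the my_ostree_test branch
$ sudo ostree --repo=$(pwd) log my_ostree_test       # shows the commit and its subject
$ sudo ostree --repo=$(pwd) ls my_ostree_test /etc   # browse files inside the commit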

From this point, you have two options: either check out this filesystem as your root filesystem, or check out the filesystem in a sub-tree to take a look at how ostree file management works. Let’s try checking out first:

# This will be a different path than the one you created above.
# It will treat the above repo as a "remote" and checkout from it.
$ cd $HOME
$ mkdir ostree-test-remote ; cd ostree-test-remote
$ sudo ostree --repo=$(pwd) init
$ sudo ostree --repo=$(pwd) remote add ol-local file:///$HOME/ostree-test --no-gpg-verify
$ sudo ostree --repo=$(pwd) remote refs ol-local
# response: ol-local:my_ostree_test

Then to check out the files, run these commands. Note that ostree requires references to be pulled locally before they can be checked out (this behavior differs subtly from git).

$ sudo ostree --repo=$(pwd) pull ol-local my_ostree_test
$ sudo ostree --repo=$(pwd) checkout my_ostree_test newroot
$ ls newroot
afs   config  extensions  lib64  objects  refs  sbin   sys  var
bin   dev     home        media  opt      root  srv    tmp
boot  etc     lib         mnt    proc     run   state  usr

You can also use ostree to set this up as a bootable filesystem and reboot into it!

$ sudo ostree remote add ol-local <remote repo location>
$ sudo rpm-ostree rebase ol-local:my_ostree_test
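
Keep in mind that this toy tree is missing most of what a real image needs to boot, so treat the rebase as an illustration of the workflow. On a system built from a complete image, you can confirm the staged deployment before rebooting into it:

$ sudo rpm-ostree status   # the rebased tree appears as the pending deployment
$ sudo systemctl reboot    # boot into it; rpm-ostree rollback returns to the old tree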

What’s Next?

This blog describes simple steps for experimenting with ostree, and hints at the orchestration infrastructure we’ve built at Oracle for automating OS operations at scale. We’ve created a number of services for managing and orchestrating our cloud that aren’t yet available outside of our service teams, and we’re excited to bring them to our customers within the cloud. These include a dedicated build management service for immutable images that helps with image builds, commit signoff, and image testing and release. We’ve also developed simplified, automated mechanisms for patching our immutable images and for running workloads on bare metal as well as within containers. We look forward to sharing these innovations with you.