XFS – Online Filesystem Repair in UEK8

Last summer, I wrote about the online repair capabilities that I’ve been working on adding to the XFS filesystem in the Linux kernel for the past several years. I’m pleased to announce that online fsck is now available as a Technology Preview with the Unbreakable Enterprise Kernel (UEK) 8. This guide will walk you through the process of installing the software and enabling this technology preview.

Installation

The first step is to enable the Unbreakable Enterprise Kernel 8 yum repository and install both the UEK kernel and its userspace support packages. UEK is required because the technology preview is not available for other kernels. Please refer to the UEK documentation for installation directions. A preconfigured cloud image from Oracle will also suffice.

The next step is to ensure that the userspace software is installed and configured correctly. The userspace programs that drive online repair are located in the xfsprogs-xfs_scrub package, which is not installed by default. Make sure that your system is ready:

$ sudo dnf install xfsprogs xfsprogs-xfs_scrub

Now confirm that the kernel and xfsprogs are installed.

$ uname -a
Linux mattk-ol9-test 6.12.0-1.23.3.2.el9uek.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 13 17:24:00 PDT 2025 x86_64 x86_64 x86_64 GNU/Linux

$ xfs_scrub -V
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
xfs_scrub version 6.12.0

The exact versions and timestamps may be slightly different on your computer.
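
For scripted deployments, the same two checks can be wrapped in a small helper. This is a minimal sketch, not part of xfsprogs: the function names are made up for illustration, and it assumes that UEK release strings contain "uek", as in the uname output above.

```shell
# Return success if the given kernel release string looks like a UEK build.
# Assumption: UEK release strings contain "uek", as in the uname output above.
is_uek_release() {
    case "$1" in
        *uek*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Report whether this system looks ready for the tech preview.
check_scrub_prereqs() {
    release=$(uname -r)
    if ! is_uek_release "$release"; then
        echo "kernel $release does not look like UEK" >&2
        return 1
    fi
    if ! command -v xfs_scrub >/dev/null 2>&1; then
        echo "xfs_scrub not found; install xfsprogs-xfs_scrub" >&2
        return 1
    fi
    echo "ok: kernel $release, xfs_scrub present"
}
```

Calling check_scrub_prereqs on the target machine prints a one-line summary, or explains on stderr which prerequisite is missing.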

Formatting Disks

Because online repair is a technology preview, it has undergone only limited testing outside of R&D here at Oracle – five years for the metadata checking capabilities, but only one year for metadata repair. For that reason, full repair capability is not enabled by default and must be enabled explicitly at format time.

Let’s say that we have a small 200TiB data partition for which we want to optimize uptime. UEK 8 xfsprogs ships with a pre-built mkfs configuration file that enables all the features needed for online repair; pass it to mkfs.xfs when formatting the data partition:

$ sudo mkfs.xfs -c options=/usr/share/xfsprogs/mkfs/ol_autofsck_10.0.conf /dev/nvme0n1
Parameters parsed from config file /usr/share/xfsprogs/mkfs/ol_autofsck_10.0.conf successfully
meta-data=/dev/nvme0n1           isize=512    agcount=200, agsize=268435455 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=1
data     =                       bsize=4096   blocks=53687091000, imaxpct=1
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

There are three filesystem features that are critical for supporting repair work: reverse mapping (rmapbt=1), directory parent pointers (parent=1), and atomic file data exchanges (exchange=1). If the status of all three features is 1, then they are enabled and the filesystem is ready for online fsck.
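
The three-feature check can also be automated. The following is a sketch under the assumption that the geometry text uses the same name=value layout as the mkfs.xfs output above; the function name has_repair_features is hypothetical.

```shell
# Read mkfs.xfs or xfs_info geometry output on stdin and verify that the
# three features named above are all reported as enabled (=1).
has_repair_features() {
    geometry=$(cat)
    for feature in rmapbt parent exchange; do
        # Match e.g. "rmapbt=1" as a whole token, so that a longer name
        # ending in the same letters, or a value such as "10", cannot match.
        if ! printf '%s\n' "$geometry" |
            grep -Eq "(^|[^[:alnum:]_])${feature}=1([^0-9]|\$)"; then
            echo "missing: ${feature}=1" >&2
            return 1
        fi
    done
    echo "ready for online fsck"
}
```

On a mounted filesystem this could be used as `xfs_info /mnt | has_repair_features`, on the assumption that xfs_info prints the same geometry fields as mkfs.xfs.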

This filesystem is ready, so mount it and copy whatever data you like to it. We’ll deploy some container directory trees to the new filesystem:

$ sudo mount /dev/nvme0n1 /mnt
$ rsync -a -v -z -H -A -X -S /wherever/ /mnt/

On-Demand Scrubbing of Your Filesystem

Once you’ve exercised your filesystem for some time, you might want to verify explicitly that everything is still ok. To do so, run the scrubbing tool directly from the command line:

$ sudo xfs_scrub -v -n /mnt
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Phase 1: Find filesystem geometry.
/mnt: using 40 threads to scrub.
Phase 2: Check internal metadata.
Info: AG 1 superblock: Optimization is possible.
Info: AG 2 superblock: Optimization is possible.
Info: AG 3 superblock: Optimization is possible.
<snip>
Info: AG 198 superblock: Optimization is possible.
Info: AG 197 superblock: Optimization is possible.
Info: AG 199 superblock: Optimization is possible.
Phase 3: Scan all inodes.
Info: /mnt: Optimizations of inode record are possible.
Phase 5: Check directory tree.
Info: /mnt/c0/usr/share/terminfo: Unicode name "l" in directory could be confused with "1". (unicrash.c line 861)
Phase 7: Check summary counters.
16.TiB data used;  249.3M inodes used.
15.TiB data found; 249.7M inodes found.
249.7M inodes counted; 249.7M inodes checked.
Phase 8: Trim filesystem storage.

These findings are fairly routine. Scrub found some metadata optimization opportunities around the filesystem, but because we specified the -n option, no actions were taken. Scrub also found a directory containing filenames that render similarly in many file choosers, and noted that.

We’ll come back to this at the end.
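
On a filesystem with hundreds of allocation groups, it can be handy to reduce the dry-run report to a single number. The sketch below counts the informational lines that mention an optimization. It matches on the message wording shown above, which is an assumption about output stability rather than a documented interface, and the function name is made up for illustration.

```shell
# Read xfs_scrub output on stdin and print how many "Info:" lines mention
# an optimization. grep -c prints 0 when nothing matches but also returns
# a non-zero status, so "|| true" keeps the function's status clean.
count_optimization_hints() {
    grep -Eci '^Info: .*optimization' || true
}
```

A possible invocation is `sudo xfs_scrub -n /mnt 2>&1 | count_optimization_hints`.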

Background Scrubbing of Your Filesystem

Ideally, a filesystem would take care of itself, rather than the system administrator remembering to intervene. UEK 8 will scan all mounted autofsck filesystems automatically.

Systemd schedules the background fsck automatically when UEK 8’s xfsprogs-xfs_scrub package is installed:

$ systemctl list-timers xfs_scrub_all.timer
NEXT                        LEFT     LAST                        PASSED       UNIT                ACTIVATES
Sun 2025-02-16 03:10:56 PST 3 days left Wed 2025-02-12 12:55:31 PST 3h 39min ago xfs_scrub_all.timer xfs_scrub_all.service

1 timers listed.

Timers represent services that are regularly activated at a particular time; they correspond (very roughly) to cron jobs on older Linux systems. Observe that the scan was last run on February 12th, and the next scan is scheduled for Sunday morning at 3:10am. If you do not see any output, please see the next section about enabling background scrub, and then come back.

The systemd timer file confirms this (excerpted):

$ systemctl cat xfs_scrub_all.timer
[Timer]
# Run on Sunday at 3:10am, to avoid running afoul of DST changes
OnCalendar=Sun *-*-* 03:10:00
RandomizedDelaySec=60
Persistent=true

[Install]
WantedBy=timers.target

This timer indeed starts at 3:10am, and is activated by default.

But what does this timer activate? Let’s look at the excerpted service file:

$ systemctl cat xfs_scrub_all.service
[Service]
ExecStart=/usr/sbin/xfs_scrub_all --auto-media-scan-interval 1mo

As you can see, the timer runs the xfs_scrub_all program. This program finds all mounted XFS filesystems and starts the xfs_scrub@ service for each of them, being careful not to schedule simultaneous scrubs of any two filesystems that reside on the same block device.
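
To get a feel for the discovery step, here is a rough sketch that lists XFS mounts and flags devices that back more than one of them. It is not how xfs_scrub_all is implemented: it only compares device names as they appear in /proc/mounts-style input and does not resolve partitions to their underlying disks. Both function names are hypothetical.

```shell
# Print "device mountpoint" for every XFS entry in /proc/mounts-style input
# (fields: device, mountpoint, fstype, options, dump, pass).
list_xfs_mounts() {
    awk '$3 == "xfs" { print $1, $2 }'
}

# Print devices that back more than one XFS mount; scrubs of those mounts
# would have to be serialized rather than run in parallel.
shared_xfs_devices() {
    list_xfs_mounts |
        awk '{ n[$1]++ } END { for (d in n) if (n[d] > 1) print d }'
}
```

For example, `list_xfs_mounts < /proc/mounts` shows which filesystems a background scan would consider.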

Let’s look at an excerpted xfs_scrub@ service file:

$ systemctl cat xfs_scrub@.service
[Service]
ExecStart=/usr/sbin/xfs_scrub -b -o autofsck -M /tmp/scrub/ %f
IOSchedulingClass=idle
CPUSchedulingPolicy=idle
Nice=19

# Create the service underneath the scrub background service slice so that we
# can control resource usage.
Slice=system-xfs_scrub.slice

There are a few things to note here:

The -b option in combination with the systemd slice limits the process to a single thread, and all scrub instances to 60% of a single CPU.
The IO scheduler limits scrubs to idle priority, so real work is minimally impacted by the scrub.
The -o autofsck option enables a per-filesystem property to control how aggressively the scrub process treats the filesystem. This is discussed in the (Re)-Configuring Automatic Scrub section below.

In summary, with UEK 8, an intentionally configured XFS filesystem will be checked automatically every Sunday morning at approximately 3:10.

Enabling Automatic Scrub

If you do not see xfs_scrub_all.timer in the output of systemctl list-timers, then automatic background scrubbing is not enabled. Although it should be enabled by default on a fresh installation of UEK 8, this is not the case for upgraded systems.

First, check the status of the timer:

$ sudo systemctl status xfs_scrub_all.timer
○ xfs_scrub_all.timer - Periodic XFS Online Metadata Check for All Filesystems
     Loaded: loaded (/usr/lib/systemd/system/xfs_scrub_all.timer; disabled; preset: enabled)
     Active: inactive (dead)
    Trigger: n/a
   Triggers: ● xfs_scrub_all.service

Notice that the “Loaded” state is “disabled”, and the “Active:” state is “inactive”. This timer is not enabled and not running.

Enable the timer:

$ sudo systemctl enable xfs_scrub_all.timer
Created symlink '/etc/systemd/system/timers.target.wants/xfs_scrub_all.timer' → '/usr/lib/systemd/system/xfs_scrub_all.timer'.
$ sudo systemctl status xfs_scrub_all.timer
○ xfs_scrub_all.timer - Periodic XFS Online Metadata Check for All Filesystems
     Loaded: loaded (/usr/lib/systemd/system/xfs_scrub_all.timer; enabled; preset: enabled)
     Active: inactive (dead)
    Trigger: n/a
   Triggers: ● xfs_scrub_all.service

The timer’s Loaded state has moved from disabled to enabled, but the timer has not been started yet. Start it:

$ sudo systemctl start xfs_scrub_all.timer
$ sudo systemctl status xfs_scrub_all.timer
● xfs_scrub_all.timer - Periodic XFS Online Metadata Check for All Filesystems
     Loaded: loaded (/usr/lib/systemd/system/xfs_scrub_all.timer; enabled; preset: enabled)
     Active: active (waiting) since Thu 2025-02-13 01:03:48 GMT; 1s ago
 Invocation: 01c2e2bea6e047cc89558f001eb634f4
    Trigger: Thu 2025-02-13 03:10:33 GMT; 2h 6min left
   Triggers: ● xfs_scrub_all.service

Feb 13 01:03:48 ol9 systemd[1]: Started xfs_scrub_all.timer - Periodic XFS Online Metadata Check for All Filesystems.

Now the background service will run:

$ systemctl list-timers xfs_scrub_all.timer
NEXT                            LEFT LAST                        PASSED UNIT                ACTIVATES
Sun 2025-02-16 03:10:33 GMT   3 days Wed 2025-02-12 03:10:14 GMT      - xfs_scrub_all.timer xfs_scrub_all.service

1 timers listed.
Pass --all to see loaded but inactive timers, too.

Now we’re in good shape!
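
The enable and start steps can be combined with systemctl’s --now flag. The helper below is a small sketch (the function name is made up) that only touches the timer when it is not already enabled and running; it needs to run in a root shell.

```shell
# Enable and start the scrub timer only if needed.
# "systemctl enable --now" combines the separate enable and start steps.
ensure_scrub_timer() {
    unit=xfs_scrub_all.timer
    if systemctl is-enabled --quiet "$unit" &&
       systemctl is-active --quiet "$unit"; then
        echo "$unit already enabled and running"
    else
        systemctl enable --now "$unit"
    fi
}
```

This is convenient in provisioning scripts for upgraded systems, where the timer may not have been enabled automatically.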

(Re)-Configuring Automatic Scrub

As noted two sections ago, automatic online fsck can be controlled through a per-filesystem property. User-configurable per-filesystem properties are a new feature that was added in xfsprogs 6.10 and is available on UEK 8.

Let’s look at the properties of our test filesystem:

$ sudo xfs_property /mnt list
autofsck

Note the autofsck property, which we saw earlier in the xfs_scrub@ service file. Let’s examine the state of that property:

$ sudo xfs_property /mnt get autofsck
autofsck=repair

The filesystem is configured to check and repair the metadata. For the sake of demonstration, let’s change the property to allow only automatic optimization:

$ sudo xfs_property /mnt set autofsck=optimize
autofsck=optimize

Now we will automatically apply optimizations to this filesystem on Sunday mornings.

Filesystem properties can be managed without mounting the filesystem:

$ sudo xfs_property /dev/nvme0n1 get autofsck
autofsck=optimize
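
When auditing several filesystems, it may help to translate the property value into plain language. The sketch below assumes four possible values. Only optimize and repair appear in this article; none and check are taken from my reading of the xfs_scrub(8) manual page and should be verified against your installed version. Both function names are hypothetical.

```shell
# Strip the "autofsck=" prefix from `xfs_property ... get autofsck` output.
autofsck_mode() {
    sed -n 's/^autofsck=//p'
}

# Describe what a background scrub will do for a given mode.
# Assumption: "none" and "check" come from xfs_scrub(8), not this article.
describe_autofsck() {
    case "$1" in
        none)     echo "skip this filesystem" ;;
        check)    echo "check only" ;;
        optimize) echo "check and optimize" ;;
        repair)   echo "check, optimize, and repair" ;;
        *)        echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

A possible use is `describe_autofsck "$(sudo xfs_property /mnt get autofsck | autofsck_mode)"`.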

Observing Automatic Scrub

Go to sleep until next week, and let the automatic scrub run over the weekend. If you’re really impatient, you can run the background scrub now by activating the xfs_scrub_all.service unit.
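
For the impatient, activating the unit could look like the sketch below. The function name is made up; the systemctl and journalctl options used are standard ones. Run it in a root shell, since starting system units requires privileges.

```shell
# Start the scrub scheduler immediately, then show its most recent log lines.
scrub_all_now() {
    systemctl start xfs_scrub_all.service || return 1
    journalctl -u xfs_scrub_all.service -n 20 --no-pager
}
```

Depending on how the service is defined, systemctl start may not return until the scrubs have finished, so expect this to take a while on large filesystems.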

Now look at the service status:

$ systemctl status xfs_scrub_all
○ xfs_scrub_all.service - Online XFS Metadata Check for All Filesystems
     Loaded: loaded (/usr/lib/systemd/system/xfs_scrub_all.service; static)
     Active: inactive (dead) since Sun 2025-02-16 03:14:52 GMT; 9h ago
 Invocation: a7e42aee66044f729a263a37ca8d84ce
TriggeredBy: ● xfs_scrub_all.timer
       Docs: man:xfs_scrub_all(8)
    Process: 4095 ExecStart=/usr/sbin/xfs_scrub_all --auto-media-scan-interval 1mo (code=exited, status=0/SUCCESS)
   Main PID: 4095 (code=exited, status=0/SUCCESS)
   Mem peak: 11.7M
        CPU: 108ms

Feb 16 03:14:26 ol9 xfs_scrub_all[4095]: Scrubbing /mnt...
Feb 16 03:14:30 ol9 xfs_scrub_all[4095]: Scrubbing /mnt done, (err=0)
Feb 16 03:14:32 ol9 xfs_scrub_all[4095]: Scrubbing /...
Feb 16 03:14:50 ol9 xfs_scrub_all[4095]: Scrubbing / done, (err=0)
Feb 16 03:14:52 ol9 systemd[1]: xfs_scrub_all.service: Deactivated successfully.
Feb 16 03:14:52 ol9 systemd[1]: Finished xfs_scrub_all.service - Online XFS Metadata Check for All Filesystems.

We can confirm that the scrub scheduler ran:

$ journalctl -u xfs_scrub_all.service --since 03:00
Feb 16 03:14:26 ol9 systemd[1]: Starting xfs_scrub_all.service - Online XFS Metadata Check for All Filesystems...
Feb 16 03:14:26 ol9 xfs_scrub_all[4095]: Scrubbing /mnt...
Feb 16 03:14:30 ol9 xfs_scrub_all[4095]: Scrubbing /mnt done, (err=0)
Feb 16 03:14:32 ol9 xfs_scrub_all[4095]: Scrubbing /...
Feb 16 03:14:50 ol9 xfs_scrub_all[4095]: Scrubbing / done, (err=0)
Feb 16 03:14:52 ol9 systemd[1]: xfs_scrub_all.service: Deactivated successfully.
Feb 16 03:14:52 ol9 systemd[1]: Finished xfs_scrub_all.service - Online XFS Metadata Check for All Filesystems.

We can also check that the filesystem itself was really scrubbed. First let’s find out the name of the service:

$ systemd-escape --path /mnt
mnt

This means that the service will be called xfs_scrub@mnt.service. Let’s see what it did:

$ journalctl -u xfs_scrub@mnt.service --since 03:00
Feb 16 03:14:26 ol9 systemd[1]: Starting xfs_scrub@mnt.service - Online XFS Metadata Check for /mnt...
Feb 16 03:14:26 ol9 xfs_scrub@mnt[4103]: EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Feb 16 03:14:26 ol9 xfs_scrub@mnt[4103]: Info: /mnt: Optimizing per autofsck directive.
Feb 16 03:14:26 ol9 xfs_scrub@mnt[4103]: 16.TiB data used;  249.3M inodes used.
Feb 16 03:14:26 ol9 xfs_scrub@mnt[4103]: 15.TiB data found; 249.7M inodes found.
Feb 16 03:14:27 ol9 xfs_scrub@mnt[4103]: /mnt: optimizations made: 199.
Feb 16 03:14:29 ol9 systemd[1]: xfs_scrub@mnt.service: Deactivated successfully.
Feb 16 03:14:29 ol9 systemd[1]: Finished xfs_scrub@mnt.service - Online XFS Metadata Check for /mnt.

Notice the message about optimizing? That’s the effect of us setting autofsck=optimize above. Everything looks good so far. Let’s re-run scrub from the CLI:

$ sudo xfs_scrub -v -n /mnt
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Phase 1: Find filesystem geometry.
/mnt: using 40 threads to scrub.
Phase 2: Check internal metadata.
Phase 3: Scan all inodes.
Phase 5: Check directory tree.
Info: /mnt/c0/usr/share/terminfo: Unicode name "l" in directory could be confused with "1". (unicrash.c line 861)
Phase 7: Check summary counters.
16.TiB data used;  249.3M inodes used.
15.TiB data found; 249.7M inodes found.
249.7M inodes counted; 249.7M inodes checked.
Phase 8: Trim filesystem storage.

No more messages about optimizations! The filesystem has cleaned itself up without intervention.
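
The per-filesystem unit name that was derived by hand earlier in this section can also be generated in one step with systemd-escape’s --template option. This is a sketch: the function name and the ESCAPE_CMD override are hypothetical, the latter existing only so that the helper can be exercised without systemd.

```shell
# Print the per-filesystem scrub unit name for a mount point.
# ESCAPE_CMD is a hypothetical override hook for testing; by default the
# real systemd-escape is used.
scrub_unit_for() {
    "${ESCAPE_CMD:-systemd-escape}" --template=xfs_scrub@.service --path "$1"
}
```

For the filesystem in this article, `journalctl -u "$(scrub_unit_for /mnt)" --since 03:00` should show the same log as the xfs_scrub@mnt.service query above. This matters most for mount points containing slashes or other special characters, where the escaped name is harder to guess.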

Conclusion

We’re very happy to be delivering automatic filesystem checking and repair with UEK 8. This gives system administrators a new tool to proactively find errors and bitrot in the filesystem metadata, and to be able to schedule repairs without incurring downtime. Please let us know what you think of this new feature!

Darrick Wong
