
News, tips, partners, and perspectives for the Oracle Linux operating system and upstream Linux kernel work

Btrfs on the Unbreakable Enterprise Kernel 6

In this blog we delve into the new features and enhancements for Btrfs that are available in the Unbreakable Enterprise Kernel 6, as described by Oracle Linux kernel engineer Anand Jain.

Oracle's release of the Unbreakable Enterprise Kernel 6 (UEK6) is based on Linux kernel version 5.4, in which Btrfs continues to be a fully supported file-system. Let's look at some of the notable new features and enhancements in Btrfs on UEK6.

Compression level

Btrfs supports three compression types: zlib, lzo, and zstd. There are three hierarchical scopes at which you can set compression in a Btrfs file-system: the entire file-system, a specific subvolume, or an individual file or directory. For example, the mount option -o compress=<type> sets the compression type for the whole file-system, btrfs property set <subvolume> compression <type> sets it for a subvolume, and btrfs property set <file|directory-path> compression <type> sets it for a single file or directory.

The compression type applies to new writes only. Existing data may be compressed using the command btrfs filesystem defragment -r -c<type> <path>.

In UEK6, the compression types zlib and zstd expose the compression level as a tunable parameter. The level numbers correspond to the native zlib and zstd levels. You can now set the compression level using the mount option -o compress=<type>:<level>, as shown below.

$ mount -o compress=zstd:9 /dev/sda /mnt

The zlib and zstd compression level ranges and default levels are outlined in the following table:

Type    Level     Default
zlib    1 - 9     3
zstd    1 - 15    3

A level outside the accepted range does not produce an error; the level is simply set to the default. To determine the compression level that was actually applied, check the kernel log with dmesg -k. For example:

$ dmesg -k | grep compression
BTRFS info (device sda): use zstd compression, level 9
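Because an out-of-range level is accepted without an error, it can help to check the level against the ranges in the table before mounting. The following is a minimal sketch of such a check; compress_opt is a hypothetical helper name for illustration and is not part of btrfs-progs.

```shell
# Hypothetical helper: print a compress=<type>:<level> mount option, or fail
# if the level is outside the ranges listed above (zlib 1-9, zstd 1-15).
compress_opt() {
    ctype=$1
    clevel=$2
    case $ctype in
        zlib) cmax=9 ;;
        zstd) cmax=15 ;;
        *) echo "no tunable level for type: $ctype" >&2; return 1 ;;
    esac
    case $clevel in
        ''|*[!0-9]*) echo "level must be a number: $clevel" >&2; return 1 ;;
    esac
    if [ "$clevel" -lt 1 ] || [ "$clevel" -gt "$cmax" ]; then
        echo "level $clevel is outside 1-$cmax for $ctype" >&2
        return 1
    fi
    printf 'compress=%s:%s\n' "$ctype" "$clevel"
}

compress_opt zstd 9     # prints: compress=zstd:9

# Possible use (as root; device and mount point are placeholders):
#   mount -o "$(compress_opt zstd 9)" /dev/sda /mnt
```

Checking in user space keeps a typo such as zstd:91 from silently turning into a different level than intended.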

The compression speed and ratio depend on the file data. A higher level provides a better ratio at the cost of slower compression. At any level, the decompression speed and memory consumption remain almost constant. A higher compression level is expected to benefit read-mostly file-systems, or the creation of images.

Early detection of incompressible data

Until now, Btrfs used trial and error to determine whether a file would benefit from compression. An inode whose data shows no compression benefit gets the NOCOMPRESS flag, and the file is written uncompressed. While trial and error is the most accurate method, it is inefficient and wastes many CPU cycles when the data turns out to be incompressible. It wastes even more CPU cycles when the file-system is mounted with the -o compress-force option, which ignores the NOCOMPRESS flag for every new write to the file. In UEK6, trial and error is replaced by a heuristic that detects incompressible data early, using repeated pattern detection, frequency sampling, and Shannon entropy calculation to decide whether the data is compressible.
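To get a feel for the Shannon entropy part of the heuristic, the entropy of a file's bytes can be estimated from user space. This is only an illustration of the idea and not the kernel's implementation; byte_entropy is a hypothetical helper name, and it reads the whole file rather than sampling it.

```shell
# Print the Shannon entropy of a file in bits per byte.
# Values near 0 mean highly repetitive data; values near 8 mean the bytes
# look random, so compression is unlikely to help.
byte_entropy() {
    od -An -v -tu1 "$1" | awk '
        { for (i = 1; i <= NF; i++) { count[$i]++; total++ } }
        END {
            if (total == 0) { print "0.00"; exit }
            h = 0
            for (b in count) {
                p = count[b] / total
                h -= p * log(p) / log(2)
            }
            printf "%.2f\n", h
        }'
}

tmp=$(mktemp)
printf 'aaaaaaaaaaaaaaaa' > "$tmp"
byte_entropy "$tmp"                    # a single repeated byte: 0.00
head -c 4096 /dev/urandom > "$tmp"
byte_entropy "$tmp"                    # random data: close to 8
rm -f "$tmp"
```

Already compressed or encrypted files tend to score close to 8 bits per byte, which is the kind of data the heuristic is meant to skip early.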

Fallocate zero-range

Btrfs on UEK6 adds support for fallocate zero-range (FALLOC_FL_ZERO_RANGE), joining the other file-systems (ext4 and xfs) that support it. After calling fallocate(1) with the zero-range option, you can expect the blocks on the device to be zeroed.
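A quick way to try this is to zero the start of a scratch file and compare it with /dev/zero. The sketch below assumes the util-linux fallocate(1) and a cmp that supports -n; zero_head is a hypothetical helper name. On a file-system without zero-range support the call fails, and the script reports that instead.

```shell
# Zero the first $2 bytes of file $1 in place. Returns non-zero if the
# file-system (or the installed fallocate) lacks zero-range support.
zero_head() {
    fallocate --zero-range --offset 0 --length "$2" "$1"
}

f=$(mktemp)
head -c 8192 /dev/urandom > "$f"       # 8 KiB of non-zero data
if zero_head "$f" 4096 2>/dev/null; then
    # Compare only the first 4096 bytes against /dev/zero.
    cmp -s -n 4096 "$f" /dev/zero && echo "first 4 KiB read back as zeros"
else
    echo "zero-range is not supported on this file-system"
fi
rm -f "$f"
```

The file size stays the same in both cases, since the zeroed range lies inside the existing file.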

Swapfile support

Previously, Btrfs did not support swapfiles because the kernel's swap code uses bmap to map the extents of the file.

The Btrfs bmap call would return logical addresses that weren't suitable for I/O, as they change frequently when COW operations happen. The logical addresses could also be on different devices configured as a RAID, and therefore the swapfile's extent mapping would be wrong.

Now that Btrfs implements the swapfile activation hook in its address_space_operations, swapfiles are supported. Note that using a Btrfs swapfile comes with a few restrictions: the swapfile must be fully allocated as NOCOW, compression cannot be used, and it must reside on one device.
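These restrictions translate into a short setup sequence. The following is a sketch only, assuming root privileges and a single-device Btrfs mount; the function name make_btrfs_swapfile and its path and size arguments are hypothetical, and the function is defined here without being run.

```shell
# Hypothetical helper: create and enable a swapfile on a single-device Btrfs.
# Usage (as root): make_btrfs_swapfile /mnt/swapfile 2G
make_btrfs_swapfile() {
    swapfile=$1
    size=$2
    truncate -s 0 "$swapfile" || return 1          # start from an empty file
    chattr +C "$swapfile" || return 1              # set NOCOW while the file is still empty
    fallocate -l "$size" "$swapfile" || return 1   # fully allocate, leaving no holes
    chmod 600 "$swapfile" || return 1              # keep the swapfile private
    mkswap "$swapfile" || return 1
    swapon "$swapfile"
}
```

The order matters: the NOCOW attribute is set before any data is allocated, so the file is never written copy-on-write.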

rmdir(1) a subvolume

rmdir(1) can now delete an empty subvolume. The rmdir(1) call checks that the user has the necessary permission for the delete. Users without sudo privileges can now fully manage a subvolume much like a directory.

$ id -u
1000
$ btrfs subvolume create /btrfs/sv1
Create subvolume '/btrfs/sv1'
$ btrfs subvolume delete /btrfs/sv1
Delete subvolume (no-commit): '/btrfs/sv1'
ERROR: Could not destroy subvolume/snapshot: Operation not permitted
$ rmdir /btrfs/sv1

The mount option -o user_subvol_rm_allowed continues to allow a user without sudo privileges to delete a non-empty subvolume.

Forget scanned devices

You can now unregister devices previously registered by a device scan. A new ioctl, BTRFS_IOC_FORGET_DEV, frees previously scanned devices that are not mounted.

$ btrfs device scan
$ btrfs device scan --forget

Out of band deduplication

In UEK5, the deduplication limit for the fideduperange(2) ioctl is 16 MiB, and Btrfs silently limited deduplication to the first 16 MiB of the requested range. In UEK6, deduplication is no longer limited to the first 16 MiB: the requested out-of-band deduplication range is split into 16 MiB chunks.
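The splitting can be pictured with a little arithmetic. The sketch below is a user-space illustration of how a range breaks down into 16 MiB pieces, not the kernel code; split_range is a hypothetical name, and it prints one "offset length" pair per chunk.

```shell
chunk=$((16 * 1024 * 1024))    # 16 MiB per deduplication chunk

# Print "offset length" for each chunk covering $2 bytes starting at offset $1.
split_range() {
    off=$1
    len=$2
    while [ "$len" -gt 0 ]; do
        n=$chunk
        [ "$len" -lt "$chunk" ] && n=$len    # the last chunk may be shorter
        echo "$off $n"
        off=$((off + n))
        len=$((len - n))
    done
}

split_range 0 $((40 * 1024 * 1024))
# 0 16777216
# 16777216 16777216
# 33554432 8388608
```

A 40 MiB request therefore becomes two full 16 MiB chunks followed by one 8 MiB chunk.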

Change file-system UUID instantly

With Btrfs on UEK6, you can assign a new file-system UUID without overwriting all metadata blocks. The original UUID is stored as metadata_uuid in the super-block. This provides a faster way to change the file-system UUID using the btrfstune(1) command with the -M|m option. The output below shows the file-system UUID before the change:

$ btrfs filesystem show /dev/sda
Label: none uuid: 16a0e00c-cb98-4f44-8fb4-730bf0a32ab4
Total devices 1 FS bytes used 176.00KiB
devid 1 size 12.00GiB used 20.00MiB path /dev/sda

The following example shows how to change the file-system UUID:

$ time btrfstune -m /dev/sda
real 0m0.052s
user 0m0.009s
sys 0m0.009s

$ btrfs filesystem show /dev/sda
Label: none uuid: caa6a218-4b23-43d4-9b0a-08a42f0ddca5
Total devices 1 FS bytes used 176.00KiB
devid 1 size 12.00GiB used 20.00MiB path /dev/sda

Summary

In this blog, we have looked at the notable enhancements and new features in Btrfs on UEK6, which also contains a number of other Btrfs stability fixes.

Join the discussion

Comments ( 2 )
  • Hi-Angel Friday, May 22, 2020
    Amazing, thanks! Is "Shannon entropy calculation" upstreamed?
  • Anand Jain Friday, May 29, 2020
    Oh. Yes. Oracle Linux follows the upstream first policy. The related upstream commit is 19562430c621 (Btrfs: heuristic: add Shannon entropy calculation). I hope this helps.