XFS has several features that can make it behave differently from other file systems. For example, we recently observed that the df command reports higher space utilization than du after many small files are copied. Over time, the outputs of df and du converge. This happens because XFS initially reserves additional space for these files.

The feature that causes this behavior is Dynamic Speculative End of File (EOF) Preallocation. It lets files dynamically reserve more space than they currently need, which prevents fragmentation if the file grows later. This blog post explores what this feature is, how it works, and how it can be beneficial for certain use cases.

End-Of-File (EOF) Preallocation is a technique used by file systems to optimize the handling of file growth. It involves reserving space at the end of a file in advance, even if the file does not need the space yet. This preallocation of space can help improve performance and prevent fragmentation as the file grows over time. For example, imagine you have a file that will eventually grow to 1 GB in size, but you only write 100 MB initially. Without EOF preallocation, the file system may only allocate space for 100 MB and will need to perform additional allocation every time new data is written. With EOF preallocation, the system may reserve extra space for future growth, making it ready for new data without needing additional allocations. Even though the file only contains 100 MB initially, the reserved space at the end of the file enables it to expand easily and grow into the same extent.

XFS supports a dynamic preallocation size. Instead of preallocating a fixed amount of space at the end of the file, XFS speculatively allocates space based on the expected growth pattern of the file. This speculative approach is more efficient, as it tries to predict how much space the file will need in the near future, reducing the need for large allocation operations. As the file grows, XFS dynamically adjusts the speculative allocation based on the actual usage pattern. If the file grows faster than expected, more space will be allocated dynamically. If it grows slower, less space is preallocated, reducing waste. Note that XFS speculative preallocation past EOF is an unwritten allocation type; space is reserved, but blocks are not allocated.

The dynamic preallocation size increases as the file grows, up to a maximum of a single full extent (8GB). Because speculative preallocation consumes free space, and that space becomes more precious as the filesystem fills up, the maximum is reduced as free space drops, as shown below.

free space      max prealloc size
  >5%           full extent (8GB)
  4-5%          2GB   (8GB >> 2)
  3-4%          1GB   (8GB >> 3)
  2-3%          512MB (8GB >> 4)
  1-2%          256MB (8GB >> 5)
  <1%           128MB (8GB >> 6)
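
The shift arithmetic in the table can be sketched as a small shell function. This is only an illustration of the table, not the kernel's code: max_prealloc_mb is a hypothetical helper name, the input is assumed to be the whole-number part of the free-space percentage, and treating exactly 5% as the ">5%" row is an assumption of this sketch.

```shell
# Sketch: look up the maximum speculative preallocation size (in MB)
# for a whole-number percentage of free space, following the table.
# max_prealloc_mb is a hypothetical helper, not part of any XFS tool.
max_prealloc_mb() {
    free_pct=$1
    full_extent_mb=8192    # the 8GB full extent, expressed in MB

    if   [ "$free_pct" -ge 5 ]; then shift_bits=0   # >5%: full extent
    elif [ "$free_pct" -ge 4 ]; then shift_bits=2   # 4-5%
    elif [ "$free_pct" -ge 3 ]; then shift_bits=3   # 3-4%
    elif [ "$free_pct" -ge 2 ]; then shift_bits=4   # 2-3%
    elif [ "$free_pct" -ge 1 ]; then shift_bits=5   # 1-2%
    else                             shift_bits=6   # <1%
    fi

    # Each extra shift bit halves the maximum preallocation size.
    echo $(( full_extent_mb >> shift_bits ))
}
```

Calling max_prealloc_mb with 3 corresponds to the 3-4% row, and calling it with 0 corresponds to the <1% row.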

XFS automatically reclaims speculative preallocation space under certain conditions to improve space efficiency. This space is typically freed when the filesystem starts to run low on free space or when the dirty watchdog is triggered. The dirty watchdog identifies inactive files that have unused preallocated space and marks them for cleanup.

When a file is truncated, the xfs_inactive() function is responsible for releasing the speculative preallocation associated with that file. This ensures that the reserved space does not remain indefinitely allocated to a file that no longer needs it.

By default, the dirty watchdog is triggered every 5 minutes (300 seconds). This interval is configurable using the procfs interface. You can check the current setting with:

# cat /proc/sys/fs/xfs/speculative_prealloc_lifetime
300

To change the interval, for example, to make cleanup more aggressive by reducing it to 150 seconds, use:

# echo 150 > /proc/sys/fs/xfs/speculative_prealloc_lifetime
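
For scripts that need to read this setting, a small wrapper can make the missing-tunable case explicit. The sketch below is an assumption-laden convenience, not an XFS utility: prealloc_lifetime is a hypothetical function name, and the optional path argument exists only so the function can be pointed at another file for testing.

```shell
# Sketch: print the speculative preallocation scan interval in seconds.
# prealloc_lifetime is a hypothetical helper. The tunable file may be
# absent, for example when the xfs kernel module is not loaded.
prealloc_lifetime() {
    tunable=${1:-/proc/sys/fs/xfs/speculative_prealloc_lifetime}
    if [ -r "$tunable" ]; then
        cat "$tunable"
    else
        echo "unavailable"
        return 1
    fi
}

# The same value is also reachable through sysctl, because files under
# /proc/sys map to dotted sysctl keys:
#   sysctl fs.xfs.speculative_prealloc_lifetime
```

A value written with echo lasts only until the next reboot. On distributions that read /etc/sysctl.d/, placing the fs.xfs.speculative_prealloc_lifetime key in a file there is the usual way to keep it, assuming the xfs module is loaded by the time those settings are applied.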

This mechanism helps XFS strike a balance between performance (by reducing fragmentation) and space efficiency (by reclaiming unused reserved space). If you prefer more predictable, static behavior, you can disable dynamic speculative preallocation by explicitly setting the preallocation size at mount time with the allocsize option. This option enforces a fixed preallocation size for all file growth, rather than relying on the dynamic estimate made by XFS.

For example, to mount a filesystem with a fixed preallocation size of 64 KB:

# mount -o allocsize=64k test.img tmp_mnt/

This forces XFS to allocate space in 64 KB chunks whenever files are written to the mounted volume. While this removes the flexibility of dynamic preallocation, it can be beneficial in workloads that require tight control over file layout, fragmentation, or space consumption.
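
The xfs(5) manual page restricts allocsize to power-of-two values from the page size up to 1GiB. A quick sanity check before mounting could look like the sketch below. It assumes a 4KiB page size and takes the value in KiB; valid_allocsize_kib is a hypothetical helper name, not an XFS tool.

```shell
# Sketch: succeed (exit status 0) only if the given size in KiB is an
# acceptable allocsize value, assuming a 4KiB page size.
# valid_allocsize_kib is a hypothetical helper.
valid_allocsize_kib() {
    kib=$1
    # Range check: page size (4KiB) through 1GiB (1048576KiB), inclusive.
    [ "$kib" -ge 4 ] && [ "$kib" -le 1048576 ] || return 1
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    [ $(( kib & (kib - 1) )) -eq 0 ]
}

# Usage:
#   valid_allocsize_kib 64 && echo "64k is acceptable"
```

The 64k value used in the mount example passes this check, while a value such as 96k would be rejected because it is not a power of two.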

Here’s a simple way to visualize how speculative preallocation in XFS affects space reporting:

Create a large number of small files and copy them into a freshly mounted XFS filesystem. During this operation, you may observe that the df and du commands report different space usage values – especially right after the copy completes. This happens because df reports allocated space from the filesystem’s perspective, including speculative preallocation. In contrast, du only reports the actual size of file data written to disk. So immediately after file creation, df shows higher usage due to extra space reserved by XFS for future file growth. This space is released later, typically when the dirty watchdog kicks in and cleans up unused speculative allocations.

# du /tmp_mnt; df /tmp_mnt
520428  /tmp_mnt

Filesystem     1K-blocks   Used Available Use% Mounted on
/dev/loop1      10383360 804824   9578536   8% /tmp_mnt

After some time (after the dirty watchdog reclaims speculative space):

# du /tmp_mnt; df /tmp_mnt
520428  /tmp_mnt

Filesystem     1K-blocks   Used Available Use% Mounted on
/dev/loop1      10383360 630864   9752496   7% /tmp_mnt

As you can see, the du output remains constant because the actual file content hasn’t changed. However, df reflects reduced space usage after XFS frees up the preallocated blocks that were never used. This difference is a direct result of XFS’s dynamic preallocation, which can temporarily inflate space usage but improves performance and fragmentation control during file growth.
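
To track this gap over time rather than comparing the outputs by eye, the subtraction can be scripted. The sketch below is one possible approach: usage_gap_kb and report_gap are hypothetical helper names, and report_gap relies on the POSIX output format of df -Pk, where the third column of the second line is the used space in 1K blocks.

```shell
# Sketch: difference between filesystem-level usage (df) and
# file-level usage (du), both in 1K blocks. Hypothetical helper.
usage_gap_kb() {
    echo $(( $1 - $2 ))
}

# Sketch: gather both numbers for a mount point and print the gap.
# Hypothetical helper; run it repeatedly to watch the gap shrink.
report_gap() {
    mnt=$1
    df_used=$(df -Pk "$mnt" | awk 'NR == 2 { print $3 }')
    du_used=$(du -sk "$mnt" | awk '{ print $1 }')
    echo "df=${df_used}K du=${du_used}K gap=$(usage_gap_kb "$df_used" "$du_used")K"
}

# Usage:
#   report_gap /tmp_mnt
```

With the numbers from the outputs above, the gap is 804824 - 520428 = 284396 KB right after the copy and 630864 - 520428 = 110436 KB later, so 173960 KB was returned to the free pool in between.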

XFS also pre-reserves space for metadata btree expansion, for any btree that could split and thus require more space. Specifically, the free inode, reverse mapping, and reference count btrees each reserve enough space to allow for growth to the maximum possible size. As a result, the discrepancy between df and du can be as much as 2–3% of the total filesystem space on a freshly formatted filesystem. These btrees were not enabled by default before Oracle Linux 8 (OL8), which is why this space consumption now appears in the df output.
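
To check which of these btrees are enabled on a given filesystem, the feature flags printed by xfs_info can be filtered. The sketch below assumes the usual xfs_info output, where finobt, rmapbt, and reflink appear as name=value pairs (reflink being the feature that uses the reference count btree). xfs_btree_features is a hypothetical helper name.

```shell
# Sketch: read xfs_info output on stdin and print only the flags for
# the free inode btree (finobt), reverse mapping btree (rmapbt), and
# reflink (which relies on the reference count btree).
# xfs_btree_features is a hypothetical helper.
xfs_btree_features() {
    grep -oE '(finobt|rmapbt|reflink)=[0-9]+'
}

# Usage (needs a mounted XFS filesystem):
#   xfs_info /tmp_mnt | xfs_btree_features
```

A value of 1 means the feature is enabled, so its btree contributes to the reserved space; a value of 0 means it is disabled.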

Benefits

  • Improved Efficiency: By allocating space dynamically and speculatively, XFS minimizes wasted disk space. Instead of allocating too much space upfront (which may go unused) or too little (which would require frequent reallocations), XFS adapts allocation based on actual file growth. This balance ensures optimal space usage while reducing allocation overhead.
  • Higher Throughput for Sequential Workloads: For applications that generate large files or write sequentially, speculative preallocation can significantly reduce the number of allocation operations required, resulting in higher throughput.
  • Reduced Fragmentation: As previously mentioned, this method helps reduce fragmentation by allocating contiguous blocks of space based on the anticipated file growth. This is especially important in environments with a high volume of file operations, where fragmentation could degrade performance over time.
  • Lower Latency: By anticipating the need for additional space, XFS reduces the frequency of allocation requests, which minimizes latency during file write operations. This results in a smoother experience for applications that write large files or perform continuous logging or database operations.

Conclusion

Dynamic Speculative EOF Preallocation is a feature in XFS that optimizes space allocation for large and growing files, particularly in environments that deal with heavy write workloads. By speculatively allocating space based on predicted file growth patterns and adjusting dynamically as needed, XFS reduces fragmentation, improves performance, and ensures that disk space is used more efficiently.

For applications that involve large sequential writes, such as databases, video editing, or log management, XFS with dynamic speculative EOF preallocation offers significant advantages in both throughput and space management. Understanding and leveraging this feature can provide critical performance gains in high-demand environments. Other filesystems like ext4 and btrfs also support variations of EOF preallocation, though with different implementations and tunable behaviors.