• ZFS
    January 29, 2007

a small ZFS hack

Guest Author

I've been dabbling a bit in ZFS recently, and what's amazing is not just how well it solved the well-understood filesystem problem, but how its design opens the door to novel ways to manage data. Compression is a great example. An almost accidental by-product of the design is that your data can be stored compressed on disk. This is especially interesting in an era when we have CPU cycles to spare, far too few available IOPs, and disk latencies that you can measure with a stopwatch (well, not really, but you get the idea). With ZFS you can trade in some of those spare CPU cycles for IOPs by turning on compression, and the additional latency introduced by decompression is dwarfed by the time we spend twiddling our thumbs waiting for the platter to complete another revolution.

smaller and smaller

Turning on compression in ZFS (zfs set compression=on <dataset>) enables the so-called LZJB compression algorithm -- a variation on Lempel-Ziv tagged by its humble author. LZJB is fast, reasonably effective, and quite simple (compress and decompress are implemented in about a hundred lines of code). But the ZFS architecture can support many compression algorithms. Just as users can choose from several different checksum algorithms (fletcher2, fletcher4, or sha256), ZFS lets you pick your compression routine -- it's just that there's only the one so far.
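
One way to picture that pluggable design is a lookup table from a property value to a pair of routines. What follows is a minimal Python sketch of the idea, not the ZFS source: the `COMPRESSORS` table, `write_block`, and `read_block` are hypothetical names, `zlib` stands in for a second algorithm, and the fall-back-to-raw rule is an illustrative choice.

```python
import zlib

# Hypothetical registry: property value -> (compress, decompress).
# "off" stores the block unchanged; "gzip" uses zlib as a stand-in.
COMPRESSORS = {
    "off": (lambda data: data, lambda data: data),
    "gzip": (lambda data: zlib.compress(data, 6), zlib.decompress),
}


def write_block(data: bytes, algorithm: str):
    """Return (tag, payload) for one block.

    The tag records which routine produced the payload so the reader
    knows how to undo it. If compression does not shrink the block,
    store it raw instead (an assumption made for this sketch).
    """
    compress, _ = COMPRESSORS[algorithm]
    packed = compress(data)
    if len(packed) >= len(data):
        return "off", data
    return algorithm, packed


def read_block(tag: str, payload: bytes) -> bytes:
    """Recover the original block using the routine named by the tag."""
    _, decompress = COMPRESSORS[tag]
    return decompress(payload)
```

Adding another algorithm in this sketch means adding one more entry to the table; the read and write paths stay the same.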

putting the z(lib) in ZFS

I thought it might be interesting to add a gzip compression algorithm based on zlib. I was able to hack this up pretty quickly because the Solaris kernel already contains a complete copy of zlib (albeit scattered around a little) for decompressing CTF data for DTrace, and apparently for some sort of compressed PPP streams module (or whatever... I don't care). Here's what the ZFS/zlib mash-up looks like (for the curious, this is with the default compression level -- 6 on a scale from 1 to 9):

# zfs create pool/gzip
# zfs set compression=gzip pool/gzip
# cp -r /pool/lzjb/* /pool/gzip
# zfs list
pool/gzip 64.9M 33.2G 64.9M /pool/gzip
pool/lzjb 128M 33.2G 128M /pool/lzjb

That's with a 1.2G crash dump (pretty much the most compressible file imaginable). Here are the compression ratios with a pile of ELF binaries (/usr/bin and /usr/lib):

# zfs get compressratio
pool/gzip compressratio 3.27x -
pool/lzjb compressratio 1.89x -
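
For comparison, the crash-dump figures from the earlier zfs list output can be turned into approximate ratios by hand. This is back-of-the-envelope arithmetic only: it assumes "1.2G" means 1.2 * 1024 MB and that the reported dataset sizes contain nothing but the dump, neither of which the output confirms.

```python
# Sizes as reported in the post; reading "1.2G" as 1.2 * 1024 MB is an assumption.
ORIGINAL_MB = 1.2 * 1024


def ratio(original_mb: float, stored_mb: float) -> float:
    """Compression ratio: original size divided by on-disk size."""
    return original_mb / stored_mb


for name, stored_mb in (("gzip", 64.9), ("lzjb", 128.0)):
    print(f"{name}: {ratio(ORIGINAL_MB, stored_mb):.1f}x")
```

Under those assumptions the dump compresses about twice as well with gzip as with LZJB, a wider gap than the ELF binaries show.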

Pretty cool. Actually compressing these files with gzip(1) yields a slightly smaller result, but it's very close, and the convenience of getting the same compression transparently from the filesystem is awfully compelling. It's just a prototype at the moment. I have no idea how well it will perform in terms of speed, but early testing suggests that it will be lousy compared to LZJB. I'd be very interested in any feedback: Would this be a useful feature? Is there an ideal trade-off between CPU time and compression ratio? I'd like to see if this is worth integrating into OpenSolaris.
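
To get a rough feel for the CPU-versus-ratio question outside the kernel, the zlib levels can be compared in user space. This is a sketch using Python's standard `zlib` module on synthetic data; the `measure` helper and the sample data are made up for illustration, and the numbers say nothing about how the in-kernel prototype performs.

```python
import time
import zlib


def measure(data: bytes, level: int):
    """Return (compression ratio, seconds spent) for one zlib level."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(data) / len(packed), elapsed


# Synthetic, fairly repetitive sample; real files will behave differently.
sample = b"".join(b"record %d: status=ok\n" % (i % 500) for i in range(50000))

for level in (1, 6, 9):
    r, secs = measure(sample, level)
    print(f"level {level}: {r:.2f}x in {secs * 1000:.1f} ms")
```

Running the same loop over representative files (binaries, logs, crash dumps) would show where the extra CPU time stops buying a meaningfully better ratio.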

Comments ( 10 )
  • Chris Gerhard Monday, January 29, 2007

    Useful? Yes

    The trouble with any compression in the file system, and this one makes it even clearer, is that you would want to be able to get at both the compressed and the uncompressed data.

    Consider an FTP server: it would be good if it could offer compressed data without having the system uncompress it in the file system only to compress it again in the FTP server.

    Then there is NFS......

  • ivanvdb25 Monday, January 29, 2007
    Useful? Yes
    It's very interesting that a compression algorithm could easily be added to ZFS. Is this hack available as source code somewhere? :-)
    Thanks and best regards,
  • Robert Milkowski Monday, January 29, 2007
    Of course it would be useful and should be integrated! I was playing with the same idea some time ago, and adding another compression algorithm to ZFS is easy - the hard part is doing compression/decompression in the kernel. If you've got gzip, it should be integrated ASAP. There's one thing which can limit performance - there's an open bug that causes all compression/decompression in ZFS to be run by only one thread, so only one CPU is utilized. Anyway, a lot of people, especially with SATA disks, are using ZFS for long-term storage and do not necessarily need a lot of IOs. A simple way of specifying the level of compression would also be useful - maybe in the form of compression=gzip-N where N is the compression level. Without specifying -N (so only compression=gzip) the default level would be enforced. Hope to see it integrated in hours... ok, in days :))) Great job!
    If you can provide your code changes privately right now it would be great.
  • Anantha Monday, January 29, 2007
    One side effect of the compression feature is that it skews the CPU utilization. I've been using the compress feature on a 3TB filesystem with excellent results. The one issue I notice is that when I have a decent amount of I/O against the filesystem, my CPU spends most of its time in 'sys'; >40% is not abnormal on my E2900 (24 x 96GB).
  • ivanvdb25 Monday, January 29, 2007
    I was just wondering: if gzip compression has been enabled, does it cause problems when a ZFS volume is created on an x86 system and afterwards imported on a Sun SPARC?
    Best regards,
  • UX-admin Monday, January 29, 2007
    Adam, this is positively and without a doubt some really great stuff! One could choose between lzjb for day-to-day use, or bzip2 for heavily compressed, "archival" file systems (as we all know, bzip2 beats the living daylights out of gzip in terms of compression about 95-98% of the time).

    Historical tidbit:

    ZFS finally implemented per-filesystem, one could say "per-directory" compression that AmigaOS had with the XFH: pseudo drive implemented with the xpkmaster.library (http://www.dstoecker.eu/xpkmaster.html).
  • dennis Monday, January 29, 2007
    Are there any documents somewhere explaining the hooks of zfs and how to add features like this to zfs? Would be useful for developers who want to add features like filesystem-based encryption to it.
    Thanks for your great work!
  • Derek Morr Monday, January 29, 2007
    UX-admin, how do you claim that bzip2 "beats the living daylights out of gzip" ? I haven't seen it compress files significantly better than gzip, and it uses considerably more CPU time to do so.
  • Robert Milkowski Tuesday, January 30, 2007

    Dennis - just look into the source, in the .h files, which have a lot of useful comments. See also http://opensolaris.org/os/community/zfs/source/.
    ZFS is really nicely written and, as I wrote before, if you want to add another compression algorithm to ZFS the hard part is implementing the algorithm in the kernel rather than hooking it up to ZFS, which is easy.
  • Adam Leventhal Wednesday, January 31, 2007
    Thanks for all the feedback! I've posted an update that includes some more information and responses to your questions and suggestions.