gzip for ZFS update

The other day I posted about a prototype I had created that adds a gzip compression algorithm to ZFS. ZFS already allows administrators to choose to compress filesystems using the LZJB compression algorithm. This prototype introduced a more effective -- albeit more computationally expensive -- alternative based on zlib.

As an arbitrary measure, I used tar(1) to create and expand archives of an ON (Solaris kernel) source tree on ZFS filesystems compressed with the lzjb and gzip algorithms, as well as on an uncompressed ZFS filesystem for reference:

[benchmark charts omitted: elapsed time for each configuration]

Thanks for the feedback. I was curious if people would find this interesting and they do. As a result, I've decided to polish this wad up and integrate it into Solaris. I like Robert Milkowski's recommendation of options for different gzip levels, so I'll be implementing that. I'll also upgrade the kernel's version of zlib from 1.1.4 to 1.2.3 (the latest) for some compression performance improvements. I've decided (with some hand-wringing) to succumb to the requests for me to make these code modifications available. This is not production quality. If anything goes wrong it's completely your problem/fault -- don't make me regret this. Without further disclaimer: pdf patch
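Robert's suggestion amounts to exposing zlib's compression level. As a rough userland sketch (plain Python zlib, not the in-kernel code; the sample data is arbitrary filler, not the ON source tree), the level trades CPU time for compression ratio:

```python
# Illustrate the gzip level trade-off with userland zlib -- higher levels
# spend more CPU searching for matches in exchange for smaller output.
import time
import zlib

data = b"The quick brown fox jumps over the lazy dog. " * 20000

for level in (1, 6, 9):
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {len(compressed):7d} bytes in {elapsed:.4f}s")
```

On repetitive input like this, level 9 produces output no larger than level 1; the interesting question for ZFS is whether the extra CPU is worth it per workload.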

In reply to some of the comments:

UX-admin One could choose between lzjb for day-to-day use, or bzip2 for heavily compressed, "archival" file systems (as we all know, bzip2 beats the living daylights out of gzip in terms of compression about 95-98% of the time).

It may be that bzip2 is a better algorithm, but we already have (and need) zlib in the kernel, and I'm loath to add another algorithm.

ivanvdb25 Hi, I was just wondering: once gzip compression has been enabled, does it cause problems when a ZFS volume is created on an x86 system and afterwards imported on a Sun SPARC?

That isn't a problem. Data can be moved from one architecture to another (and I'll be verifying that before I putback).

dennis Are there any documents somewhere explaining the hooks of ZFS and how to add features like this to it? That would be useful for developers who want to add features like filesystem-based encryption. Thanks for your great work!

There aren't any documents exactly like that, but there's plenty of documentation in the code itself -- that's how I figured it out, and it wasn't too bad. The ZFS source tour will probably be helpful for figuring out the big picture.

Update 3/22/2007: This work was integrated into build 62 of onnv.


Thanks for doing this -- way cool. One thought on the different compression levels: if I recall correctly, the level isn't part of the compressed output, so it's not strictly necessary to burn 9 slots (gzip1...gzip9) in the compression vector array. Keeping an extra byte per dataset to remember the level would be trivial; the only awkward part is the dnode.

Although we don't currently export it in the admin model, compression is actually settable on a per-file basis, not just per-filesystem, and follows the same inheritance rules that we use for nested datasets. So the dnode also has a byte for compression flavor, and if we didn't use 9 distinct values, we'd need another byte to encode the level. We have spare bytes in the dnode, but it seems kind of wasteful. Then again, so is this comment. If you want to burn 9 slots, be my guest. ;-)

Posted by Jeff Bonwick on January 31, 2007 at 07:07 PM PST #
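As an aside on Jeff's point: the level really isn't recorded in the DEFLATE stream -- a single decompressor handles output produced at any level, which is what makes storing the level out-of-band (a byte per dataset, or 9 table slots) workable at all. A quick userland zlib check, not ZFS code:

```python
# The DEFLATE format doesn't record the compression level that produced a
# stream: one decompress path handles all of them. That's why the level
# only needs to live on the compression side's bookkeeping.
import zlib

payload = b"dnode compression flavor " * 4096
for level in range(1, 10):
    stream = zlib.compress(payload, level)
    # Identical call regardless of the level used on the compress side.
    assert zlib.decompress(stream) == payload
print("all nine levels round-trip through one decompressor")
```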

Thanks Adam for your work and your answers to our questions :-)

Posted by alexvdb25 on January 31, 2007 at 08:05 PM PST #

Adam, that's a great feature! What is the time (sec) on the left diagram -- wall or CPU time? What was the other time? What hardware was it measured on (CPUs, disks, etc.)? Is your implementation expected to scale with the number of CPUs?

Posted by Igor on January 31, 2007 at 10:15 PM PST #

Thanks for the note. I'll ping you about the trade-offs between using extra compression slots versus using a byte in the dnode.

The performance test was very ad hoc, but the numbers are the real (wall-clock) time on an otherwise idle 2-way 2GHz Opteron. As for whether it will scale, tar itself is single-threaded, and there's the problem of ZFS compression executing only on a single CPU (6460622). When these issues get sorted out and I'm closer to putting back, I or someone else will gather some more meaningful data.

Posted by Adam Leventhal on February 01, 2007 at 12:41 AM PST #

Thanks for the clarifications. It seems there's no good reason to use tar for performance measurements of ZFS.

Posted by Igor on February 01, 2007 at 05:05 AM PST #

Hey, this is really damn great stuff! I would really like to try ZFS with zlib compression, especially for performance and compression-efficiency comparisons. Unfortunately, I'm not that much into Solaris but more on Linux, so I'm a real newbie when it comes to applying patches to the Solaris kernel source and compiling/installing such a kernel. Any hints on how to do this? Or should I just sit and wait some number of days/weeks until this shows up in an update package/distro release? Regards, roland

Posted by roland on February 02, 2007 at 09:37 PM PST #

I hope to have this putback to Solaris within the next few weeks, and it will be available shortly after that. I didn't really intend anyone but the most adventurous OpenSolaris user to attempt to apply the patch, so unless that describes you, sit tight -- gzip for ZFS will be available relatively soon.

Posted by Adam Leventhal on February 09, 2007 at 12:53 AM PST #


Adam Leventhal, Fishworks engineer

