I've been dabbling a bit in ZFS recently, and what's amazing is not just how well it solved the well-understood filesystem problem, but how its design opens the door to novel ways to manage data. Compression is a great example. An almost accidental by-product of the design is that your data can be stored compressed on disk. This is especially interesting in an era when we have CPU cycles to spare, far too few available IOPS, and disk latencies that you can measure with a stop watch (well, not really, but you get the idea). With ZFS you can trade in some of those spare CPU cycles for IOPS by turning on compression, and the additional latency introduced by decompression is dwarfed by the time we spend twiddling our thumbs waiting for the platter to complete another revolution.
Turning on compression in ZFS (zfs set compression=on <dataset>) enables the so-called LZJB compression algorithm -- a variation on Lempel-Ziv tagged by its humble author. LZJB is fast, reasonably effective, and quite simple (compress and decompress are implemented in about a hundred lines of code). But the ZFS architecture can support many compression algorithms. Just as users can choose from several different checksum algorithms (fletcher2, fletcher4, or sha256), ZFS lets you pick your compression routine -- it's just that there's only the one so far.
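To make the idea of a pluggable compression routine concrete, here is a rough Python sketch of a table-driven design. It is not ZFS code: the COMPRESSORS table, the write_block and read_block helpers, and the use of Python's zlib module as the only real algorithm are all assumptions made for illustration, as is the choice to store a block raw when compression does not shrink it.

```python
import zlib

# Hypothetical registry: property value -> (compress, decompress).
# The real ZFS table lives in the kernel and is written in C; this only
# mirrors the idea of looking up a routine by name.
COMPRESSORS = {
    "off": (lambda data: data, lambda data: data),
    "gzip": (lambda data: zlib.compress(data, 6), zlib.decompress),
}


def write_block(algorithm: str, data: bytes) -> tuple:
    """Compress one block and return (algorithm actually used, stored bytes)."""
    compress, _ = COMPRESSORS[algorithm]
    packed = compress(data)
    # Assumption of this sketch: keep the block raw if compression did not help.
    if len(packed) >= len(data):
        return "off", data
    return algorithm, packed


def read_block(algorithm: str, stored: bytes) -> bytes:
    """Undo whichever routine was recorded when the block was written."""
    _, decompress = COMPRESSORS[algorithm]
    return decompress(stored)


block = b"crash dump page " * 512
tag, stored = write_block("gzip", block)
assert read_block(tag, stored) == block
print(tag, len(block), "->", len(stored))
```

Because the routine used is recorded alongside each block, adding a new algorithm in a design like this means adding one table entry; blocks written earlier with another routine stay readable.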
I thought it might be interesting to add a gzip compression algorithm based on zlib. I was able to hack this up pretty quickly because the Solaris kernel already contains a complete copy of zlib (albeit scattered around a little) for decompressing CTF data for DTrace, and apparently for some sort of compressed PPP streams module (or whatever... I don't care). Here's what the ZFS/zlib mash-up looks like (for the curious, this is with the default compression level -- 6 on a scale from 1 to 9):
# zfs create pool/gzip
# zfs set compression=gzip pool/gzip
# cp -r /pool/lzjb/* /pool/gzip
# zfs list
NAME        USED   AVAIL  REFER  MOUNTPOINT
pool/gzip   64.9M  33.2G  64.9M  /pool/gzip
pool/lzjb   128M   33.2G  128M   /pool/lzjb
That's with a 1.2G crash dump (pretty much the most compressible file imaginable). Here are the compression ratios with a pile of ELF binaries (/usr/bin and /usr/lib):
# zfs get compressratio
NAME        PROPERTY       VALUE  SOURCE
pool/gzip   compressratio  3.27x  -
pool/lzjb   compressratio  1.89x  -
Pretty cool. Actually compressing these files with gzip(1) yields a slightly smaller result, but it's very close, and the convenience of getting the same compression transparently from the filesystem is awfully compelling. It's just a prototype at the moment. I have no idea how well it will perform in terms of speed, but early testing suggests that it will be lousy compared to LZJB. I'd be very interested in any feedback: Would this be a useful feature? Is there an ideal trade-off between CPU time and compression ratio? I'd like to see if this is worth integrating into OpenSolaris.
Useful? Yes
The trouble with any compression in the file system, and this one makes it even clearer, is that you would want to be able to get at both the compressed and the uncompressed data.
Consider an FTP server: it would be good if it could offer the compressed data directly, rather than having the file system decompress it only for the FTP server to compress it again.
Then there is NFS......
It's very interesting that a compression algorithm could so easily be added to ZFS. Is this hack available as source code somewhere? :-)
Thanks and best regards,
Ivan
If you can provide your code changes privately right now, it would be great.
I was also wondering: if gzip compression has been enabled, does it cause problems when a ZFS volume is created on an x86 system and afterwards imported on a Sun SPARC?
Best regards,
Ivan
Historical tidbit:
ZFS finally implements the per-filesystem (one could say "per-directory") compression that AmigaOS had with the XFH: pseudo drive, implemented with xpkmaster.library (http://www.dstoecker.eu/xpkmaster.html).
Thanks for your great work!
Dennis - just look into the source, in the .h files, which have a lot of useful comments. See also http://opensolaris.org/os/community/zfs/source/.
ZFS is really nicely written and, as I wrote before, if you want to add another compression algorithm to ZFS, the hard part is implementing the algorithm in the kernel; hooking it up to ZFS is the easy part.