Taking ZFS deduplication for a test drive

Now that I have a working OpenSolaris build 128 system, I just had to take ZFS deduplication for a spin, to see if it was worth all of the hype.

Here is my test case: I have 2 directories of photos, totaling about 90MB each. And here's the trick: they are almost complete duplicates of each other. I downloaded all of the photos from the same camera on 2 different days. How many of you do that? Yeah, me too.

Let's see what ZFS can figure out about all of this. If it is super smart we should end up with a total of 90MB of used space. That's what I'm hoping for.
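
Before handing the photos to ZFS, it can help to confirm how much of the data really is duplicated. The sketch below is only a rough sanity check, not part of the original test: `count_duplicate_files` is a hypothetical helper name, it compares whole files rather than blocks (ZFS dedup works on blocks), and it treats a matching `cksum` CRC plus size as "same content", which is a weak fingerprint.

```shell
# Hypothetical helper: count files under the given directories whose
# contents (CRC + size, as reported by cksum) match at least one other file.
# This is a whole-file comparison, not the block-level match ZFS performs.
count_duplicate_files() {
    find "$@" -type f -exec cksum {} + |
        awk '{ key = $1 " " $2; seen[key]++ }
             END {
                 n = 0
                 for (k in seen) if (seen[k] > 1) n += seen[k]
                 print n
             }'
}
```

Calling it with the two photo directories as arguments prints the number of files that have at least one identical twin.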

The first step is to create the pool and turn on deduplication from the beginning.
# zpool create -f -O dedup=on scooby c2t2d0s2
This will use sha256 to determine whether 2 blocks are the same. Since sha256 has such a low collision probability (something like 1x10^-77), we will not turn on automatic verification. If we were using an algorithm like fletcher4, which has a higher collision rate, we should also perform a complete block compare before allowing the block removal (dedup=fletcher4,verify).
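
If you would rather have the byte-for-byte compare anyway, the dedup property can be changed after the pool exists. This is a sketch I did not run as part of this test; it assumes the `scooby` pool from above and would only affect blocks written after the change.

```shell
# Run as root. Assumes the pool "scooby" created above.
# Keep sha256 as the checksum, but also compare blocks before deduplicating.
zfs set dedup=sha256,verify scooby

# Show the current setting for the pool's top-level dataset and its children.
zfs get -r dedup scooby
```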

Now copy the first 180MB (remember, this is 2 sets of 90MB which are nearly identical sets of photos).
# zfs create scooby/doo
# cp -r /pix/Alaska* /scooby/doo
And the second set.
# zfs create scooby/snack
# cp -r /pix/Alaska* /scooby/snack
And finally the third set.
# zfs create scooby/dooby
# cp -r /pix/Alaska* /scooby/dooby
Let's make sure there are in fact three copies of the photos.
# df -k | grep scooby
scooby               74230572      25 73706399     1%    /scooby
scooby/doo           74230572  174626 73706399     1%    /scooby/doo
scooby/snack         74230572  174626 73706399     1%    /scooby/snack
scooby/dooby         74230572  174625 73706399     1%    /scooby/dooby

OK, so far so good. But I can't quite tell if the deduplication is actually doing anything. With all that free space, it's sort of hard to see. Let's look at the pool properties.
# zpool get all scooby
scooby  size           71.5G       -
scooby  capacity       0%          -
scooby  altroot        -           default
scooby  health         ONLINE      -
scooby  guid           5341682982744598523  default
scooby  version        22          default
scooby  bootfs         -           default
scooby  delegation     on          default
scooby  autoreplace    off         default
scooby  cachefile      -           default
scooby  failmode       wait        default
scooby  listsnapshots  off         default
scooby  autoexpand     off         default
scooby  dedupratio     5.98x       -
scooby  free           71.4G       -
scooby  allocated      86.8M       -
Now this is telling us something.
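
To watch just the interesting numbers without the rest of the property list, the properties can be requested by name. A small sketch, assuming the same `scooby` pool and the property names shown in the output above (`allocated` and `free` may be named differently on other builds):

```shell
# Run as root. Ask only for the properties that matter for dedup.
zpool get dedupratio,allocated,free scooby
```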

First, notice the allocated space: just shy of 90MB. There's 522MB of data (174MB x 3), but only 87MB is used out of the pool. That's a good start.

Now take a look at the dedupratio. Almost 6. And that's exactly what we would expect, if ZFS is as good as we are led to believe. 3 sets of 2 duplicate directories is 6 total copies of the same set of photos. And ZFS caught every one of them.
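
As a back-of-the-envelope cross-check, the ratio can be approximated from the numbers already on the screen: the three "used" figures from df divided by the pool's allocated space. This is only a sketch. `estimate_dedup_ratio` is a hypothetical helper, it assumes the df values are in KB and that 86.8M means 86.8 x 1024 KB, and ZFS computes dedupratio internally rather than from df, so the result should land near the reported value, not exactly on it.

```shell
# Hypothetical helper: first argument is the pool's allocated space in MB,
# the remaining arguments are df "used" values in KB.
estimate_dedup_ratio() {
    allocated_mb=$1
    shift
    printf '%s\n' "$@" |
        awk -v alloc="$allocated_mb" '
            { kb += $1 }
            END { printf "%.2f\n", kb / 1024 / alloc }'
}

# The three file systems from the df output above, against 86.8M allocated.
estimate_dedup_ratio 86.8 174626 174626 174625   # prints 5.89
```

That is within a couple of percent of the 5.98x the pool reports.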

So if you want to do this yourself, point your OpenSolaris package manager at the dev repository and wait for build 128 packages to show up. If you need instructions on using the OpenSolaris dev repository, point the browser of your choice at http://pkg.opensolaris.org/dev/en/index.shtml. And if you can't wait for the packages to show up, you can always .


Greetings, Bob. Thanks for trying this, I appreciate the early reporting on real-world results. =-). I'm curious about your thoughts on auto-ditto.

According to PSARC 2009/571, there's a default threshold to induce the creation of an additional copy (dedupditto=100) once you add so many references to a block.

Your test pool has a single vdev, so overall resilience is limited. Obviously this is just a scratch testbed, but for scenarios such as this (perhaps embedded devices) would you consider setting dedupditto=2 or 3? This would protect dedup'ed blocks a bit more, but still give you some space advantage.

I imagine that if you had mirror or raidz vdevs, then you could rely entirely upon that instead, even without auto-ditto. I'm not sure just where the benefits of dedupditto kick in, or whether that would actually help with resiliency.

Also, your data may not benefit from compression, but if you plan any more experimentation, I would be interested to see how enabling compression changes the reported values, and how easy it is to piece it all together from reading "zfs list -o space".

Thanks... -cheers, CSB

Posted by Craig S. Bell on November 23, 2009 at 04:51 AM CST #

Thanks for sharing, Bob. Since ZFS dedup works at the block level, did you ever try a single file that contains redundant data within itself? I'm curious to know whether we can get some dedup from this. My customer has one big file (VMware) and is wondering how dedup can help them lower the cost of storage.

Posted by Paisit Wongsongsarn on November 24, 2009 at 01:30 AM CST #

dear bob, could you let me know when this dedupe feature will appear in a standard production release?

Posted by mike on November 27, 2009 at 01:52 AM CST #

Hey Mike,

I've seen the question asked quite a few times at Jeff Bonwick's blog (http://blogs.sun.com/bonwick/entry/zfs_dedup) and without an answer. At this point I would not want to speculate about future Solaris 10 features because I just don't know. I would keep an eye on Jeff's blog or follow some of the discussions at zfs-discuss@opensolaris.org. I will certainly post something about this (and other Solaris 10 features) as soon as they become public.

Posted by Bob Netherton on November 27, 2009 at 05:01 AM CST #


Bob Netherton is a Principal Sales Consultant for the North American Commercial Hardware group, specializing in Solaris, Virtualization and Engineered Systems. Bob is also a contributing author of Solaris 10 Virtualization Essentials.

This blog will contain information about all three, but primarily focused on topics for Solaris system administrators.


