ZFS v. UFS performance for building Solaris

I'd been an early adopter of ZFS; it's amazingly easy to use and promises good performance.  My personal workstation always runs a relatively recent build of Solaris, and one of the most time-consuming tasks I do is a full "nightly" build.  Once I upgraded my workstation to a dual-core CPU, I noticed some oddities in ZFS performance and did some investigation.


My personal workstation is an Asus A8N-SLI Deluxe with 1GB of RAM and an Opteron 180 CPU (effectively identical to an Athlon 64 X2 4800+), upgraded last month from an Athlon 64 3200+.

Several months ago, when ZFS first integrated into Solaris Nevada, I did some experimentation with ZFS on a Dell PowerEdge equipped with 5 fast SAS disks.  I found that single-threaded access to ZFS was fast, and I could easily achieve over 200MB/sec transfer rates to a pool of 4 disks.  I saw some oddities with Bonnie numbers, and mentioned this to Tim Bray; he was kind enough to blog about this early experiment and perform some experimentation of his own.

I've been using ZFS on my personal workstation since then; when I upgraded from a single-core Athlon 64 to a dual-core Athlon 64 X2, I wondered why I didn't see as much of an improvement in full nightly build times as I expected, and compared notes with another Athlon 64 X2 user.  I discovered that my build time was about 40% longer than his, and the only difference I could find was that he was using UFS and I was using ZFS for the workspace storage.

In the last week, a wad of ZFS fixes has integrated into Solaris Nevada and, I'm sure, will roll out into the sunlight in a Solaris Express release.  The fixes sounded quite promising, so I ran an experiment this weekend.

DISCLAIMER: these are not official benchmarks of any kind.  They're just some stuff I did because it was too wet to go outside and mow the lawn.

I added another disk to my personal workstation (an 80GB Hitachi SATA 7200RPM drive, if it matters), partitioned it into two equal 40GB slices, and created a UFS in one slice and a ZFS pool in the other.  (It is possible that the location of the slices on the disk has some impact on overall performance; I may try swapping the filesystems between the slices as a sanity test.  For now, I'm going to assume that the disk performance is the same for both slices.)
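For the curious, the setup amounts to a few commands.  This is a hedged sketch, not what I literally typed: the device name (c2d0), mount point, and pool name (scratch) are assumptions, and the two slices are presumed to have already been carved out with format(1M).

```shell
# UFS in the first 40GB slice (device name assumed)
newfs /dev/rdsk/c2d0s0
mount /dev/dsk/c2d0s0 /export/ufs-ws

# ZFS pool in the second slice; pool and filesystem names assumed
zpool create scratch c2d0s1
zfs create scratch/ws
```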

Then I got to work with some simple Solaris nightly-build benchmarking.

The tests are simple:
  • Initial workspace bringover time
  • Several trials of full nightly build time
  • A "parallel" bringover, in which two workspaces were brought over at the same time
  • A "parallel" build, in which two workspaces were built at the same time.
Note that I only tested one kind of filesystem at a time, and that the system configuration was unchanged during the tests.  The system was essentially idle otherwise.  Overall, I'm assuming that the impact of all other system elements is identical.

On to the numbers!

First, the initial bringover:

UFS:
real    7m31.42s
user    0m13.58s
sys     0m19.53s

ZFS:
real    6m25.35s
user    0m13.56s
sys     0m18.63s

Well, the user times are essentially identical in both trials, which one would expect: I ran the same application twice, so the times should be about the same.  Whew! Sane results so far.  ZFS seems to enjoy about a 17% performance advantage in this trial, which is dominated by filesystem performance.
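If you want to check that 17% figure from the raw real times, it's a one-liner:

```shell
# compare the two bringover real times, converted to seconds:
# UFS 7m31.42s vs ZFS 6m25.35s
awk 'BEGIN {
    ufs = 7*60 + 31.42
    zfs = 6*60 + 25.35
    printf "ZFS advantage: %.0f%%\n", (ufs/zfs - 1) * 100
}'
# prints "ZFS advantage: 17%"
```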

Now, the full nightly build trials.  I did three; the first one does a little less work, in that the workspace is not yet populated with binaries to delete (make clobber).  These are just the real elapsed times for each build, in H:MM:SS:

UFS:
  1. real    1:39:08
  2. real    1:46:01
  3. real    1:47:56

ZFS:
  1. real    1:42:08
  2. real    1:43:47
  3. real    1:44:50

Interesting.  I'm not surprised UFS was quicker on the first trial and slower on the subsequent ones.  ZFS seems to be slightly faster overall, though the difference is just 1-2%.  This experiment suggests that ZFS and UFS performance are essentially identical when it comes to building Solaris.  Note that this test is dominated by CPU performance.

Before the recent wad of ZFS fixes, I had begun to suspect that ZFS was too aggressive in its internal locking and that multiple threads would become synchronized on a ZFS pool.  In fact, this would tend to explain the 40% penalty for using ZFS that I had seen on earlier Solaris bits.  So the next tests attempt to do two things in parallel; I launched the tests with a script that did something like

start test 1 &
sleep 10
start test 2 &
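A minimal runnable sketch of that launcher, with a trivial placeholder standing in for the real bringover or build command:

```shell
# run_trial is a placeholder for the real command (a bringover or a
# nightly build); the stagger between launches is shortened here
run_trial() {
    sleep 1
    echo "trial $1 done"
}

run_trial 1 &
sleep 1                 # the original stagger was 10 seconds
run_trial 2 &
wait                    # block until both background jobs finish
echo "all trials finished"
```

The `wait` at the end matters: without it the script exits while the trials are still running, and you lose the overall elapsed time.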

First parallel test - two bringovers in parallel:

UFS:
real    13m57.95s
user    0m27.83s
sys     0m41.19s

ZFS:
real    8m21.28s
user    0m27.30s
sys     0m36.77s

More sane numbers!  User times are the same, and twice what a single bringover took.  UFS was almost, but not quite, twice the single-bringover real time.  ZFS does very well in this test: around 67% faster than UFS.  Again, this test is dominated by filesystem performance.

OK, let's do two builds in parallel:

UFS:
real    8:09:42

ZFS:
real    3:34:15

The ZFS trial looks totally sane: 3h 34m is just about exactly twice the ~1h 47m times we saw in the single-build trials.  The UFS trial took over 8 hours, which isn't easy to explain.  I did not monitor the system page/swap statistics during either trial, but I have a hard time believing ZFS wouldn't suffer the same page/swap penalties.  If I weren't so busy with real work, I'd try this again, but I don't believe I did anything blatantly wrong in this test case.

Whew.  This is strange - but the ZFS numbers remain totally sane.

Let's try one more round of tests, this time with compression=on in the ZFS filesystem.  Since I didn't do UFS trials here, I'll just compare uncompressed to compressed:
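Turning compression on is a one-liner; the filesystem name (scratch/ws) is an assumption here.  Note that only data written after the property is set gets compressed, so the workspace needs to be repopulated afterwards:

```shell
# enable compression on the workspace filesystem (name assumed)
zfs set compression=on scratch/ws
zfs get compression scratch/ws      # confirm the property took effect
```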

ZFS (uncompressed)

Initial bringover:
real    6m25.35s
user    0m13.56s
sys     0m18.63s

Single build times:
real    1:42:08
real    1:43:47
real    1:44:50

Parallel bringover (2 ws):
real    8m21.28s
user    0m27.30s
sys     0m36.77s

Parallel build time (2 jobs):
real    3:34:15

ZFS (compressed)

Initial bringover:
real    6m32.13s
user    0m13.56s
sys     0m19.99s

Single build:
real    1:43:14

Parallel bringover (2 ws):
real    9m48.18s
user    0m27.55s
sys     0m41.02s

Parallel build time (2 jobs):
real    3:30:35

From this, I suspect that ZFS compression exacts a slight penalty on write performance but may impart a slight improvement in read/build performance.  In any case, I noted a compression ratio of over 1.7x after the parallel build tests, so it's pretty cool that I'm getting more data on the disk without paying any real performance penalty.
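The ratio comes straight from the filesystem's compressratio property; again, the filesystem name here is an assumption:

```shell
# report how well the workspace data compressed (filesystem name assumed)
zfs get compressratio scratch/ws
zfs list -o name,used,compressratio scratch/ws
```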

As soon as build 36 makes it out into Solaris Express, I'd love to get some comparable results from the community.


If you're using single-platter 80GB hd (very likely if it's a modern one), performance of 1st and 2nd slice SHOULD be different - especially for larger reads/writes. But even random access may (possibly) differ slightly due to large seek distances for inner slice.

Posted by Igor on March 12, 2006 at 11:41 PM PST #



