Monday May 24, 2010

A ZFS Home Server

This entry will be a departure from my recent focus on the 7000 series to explain how I replaced my recently-dead Sun Ultra 20 with a home-built ZFS workstation/NAS server. I hope others considering building a home server can benefit from this experience.

Of course, if you're considering building a home server, it pays to think through your goals, constraints, and priorities, and then develop a plan.

Goals: I use my desktop as a general purpose home computer, as a development workstation for working from home and on other hobby projects, and as a NAS file server. This last piece is key because I do a lot of photography work in Lightroom and Photoshop. I want my data stored redundantly on ZFS with snapshots and such, but I want to work from my Macbook. I used pieces of this ZFS CIFS workgroup sharing guide to set this up, and it's worked out pretty well. More recently I've also been storing videos on the desktop and streaming them to my Tivo using pytivo and streambaby.

Priorities: The whole point is to store my data, so data integrity is priority #1 (hence ZFS and a mirrored pool). After that, I want something unobtrusive (quiet) and inexpensive to run (low-power). Finally, it should be reasonably fast and cheap, but I'm willing to sacrifice speed and up-front cost for quiet and low power.

The Pieces

Before sinking money into components I wanted to be sure they'd work well with Solaris and ZFS, so I looked around at what other people have been doing and found Constantin Gonzalez's "reference config." Several people have successfully built ZFS home NAS systems with similar components so I started with that design and beefed it up slightly for my workstation use:

  • CPU: AMD Athlon II x3 405e. 3 cores, 2.3GHz, 45 watts, supports ECC. I had a very tough time finding the "e" line of AMD processors in stock anywhere, but finally got one from buy.com via Amazon.
  • Motherboard: Asus M4A785T-M CSM. It's uATX; has onboard video, audio, NIC, and 6 SATA ports; and supports ECC and DDR3 (which uses less power than DDR2).
  • RAM: 2x Super Talent 2GB DDR3-1333MHz with ECC. ZFS loves memory, and NAS loves cache.
  • Case: Sonata Elite. Quiet and lots of internal bays.
  • Power supply: Cooler Master Silent Pro 600W modular power supply. Quiet, energy-efficient, and plenty of power. Actually, this unit supplies much more power than I need.

I was able to salvage several components from the Ultra 20, including the graphics card, boot disk, mirrored data disks, and a DVD-RW drive. Total cost of the new components was $576.

The Build

Putting the system together went pretty smoothly. A few comments about the components:

  • The CoolerMaster power supply comes with silicone pieces that wrap around both ends, ostensibly to dampen the vibration transferred to the chassis. I've only run the system with these in place, so I can't say how effective they are, but they did make the unit more difficult to install.
  • The case has similar silicone grommets on the screws that lock the disks in place.
  • The rails for installing 5.25" external drives are actually stored on the inside of each bay's cover. I didn't realize this at first and bought (incompatible) rails separately.
  • Neither the case nor the motherboard has an onboard speaker, so you can't hear the POST beeps. That matters if something's wrong with the electrical connections when you first bring the system up. I had an extra case from another system with a speaker in it, so I wired that up, but I've since purchased a standalone internal speaker. (Contrary to common advice, I did not find one of these at any of several local RadioShacks or computer stores.)
  • I never got the case's front audio connector working properly. Whether I connected the AC97 or HDA connectors (both supplied by the case, and both supported by the motherboard), the front audio jack never worked. I don't care because the rear audio is sufficient for me and works well.

The trickiest part of this whole project was migrating the boot disk from the Ultra 20 to this new system without reinstalling. First, identifying the correct boot device in the BIOS was a matter of trial and error. For future reference, it's worth writing the serial numbers for your boot and data drives down somewhere so you can tell what's what if you need to use them in another system. Don't save this in files on the drives themselves, of course.

When I first booted from the correct disk, the system reset immediately after I selected the correct GRUB entry. Booting under kmdb revealed that the system was actually panicking early because it couldn't mount the root filesystem; with no dump device configured, the machine had been resetting before the panic message ever showed up on the screen. Following the suggestion of someone else who had hit this same problem, I burned an OpenSolaris LiveCD, booted from it, and imported and exported the root pool. After this, the system successfully booted from the pool. This experience would be made much better with 6513775/6779374.

Performance

Since several others have used similar setups for home NAS, my only real performance concern was whether the low-power processor would be enough to handle desktop needs. So far, it seems about the same as my old system. When using the onboard video, things are choppy moving windows around, but Flash videos play fine. The performance is plenty for streaming video to my Tivo and copying photos from my Macbook.

The new system is pretty quiet. While I never measured the Ultra 20's power consumption, the new system runs at just 85W idle and up to 112W with disks and CPU pegged. That's with the extra video card, which draws about 30W, and works out to about $7.50/month in power to keep it running. Using the onboard video (and removing the extra card), the system idles at just 56W. Not bad at all.
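The monthly power figure follows directly from the idle draw; here's a back-of-the-envelope sketch in Python. The $0.12/kWh electricity rate is my assumption for illustration, not a figure from the original bill:

```python
# Back-of-the-envelope power cost for an always-on home server.
# The $0.12/kWh rate is an assumed figure.

def monthly_cost(watts, rate_per_kwh=0.12, hours=24 * 30):
    """Return the monthly cost in dollars of a constant draw in watts."""
    kwh = watts * hours / 1000.0
    return kwh * rate_per_kwh

print(round(monthly_cost(85), 2))   # idle draw with the extra video card
print(round(monthly_cost(56), 2))   # idle draw using onboard video only
```

At the assumed rate, 85W around the clock lands within a few cents of the $7.50/month quoted above.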

Wednesday Mar 18, 2009

Compression followup

My previous post discussed compression in the 7000 series. I presented some Analytics data showing the effects of compression on a simple workload, but I observed something unexpected: the system never used more than 50% CPU doing the workloads, even when the workload was CPU-bound. This caused the CPU-intensive runs to take a fair bit longer than expected.

This happened because ZFS uses at most 8 threads for processing writes through the ZIO pipeline. With a 16-core system, only half the cores could ever be used for compression - hence the 50% CPU usage we observed. When I asked the ZFS team about this, they suggested that nthreads = 3/4 the number of cores might be a more reasonable value, leaving some headroom available for miscellaneous processing. So I reran my experiment with 12 ZIO threads. Here are the results of the same workload (the details of which are described in my previous post):
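The 50% and 75% ceilings fall straight out of arithmetic: with compression confined to a fixed pool of ZIO write threads, peak CPU utilization is bounded by the ratio of threads to cores. A minimal sketch:

```python
# Peak CPU utilization when compression runs on a fixed pool of ZIO
# write threads: it can never exceed nthreads / ncores.

def max_cpu_utilization(zio_threads, cores):
    return min(1.0, zio_threads / cores)

print(max_cpu_utilization(8, 16))    # the default: half the cores busy
print(max_cpu_utilization(12, 16))   # 3/4 of the cores, as in the rerun
```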

Summary: text data set

Compression    Ratio    Total    Write    Read
off            1.00x    3:29     2:06     1:23
lzjb           1.47x    3:36     2:13     1:23
gzip-2         2.35x    5:16     3:54     1:22
gzip           2.52x    8:39     7:17     1:22
gzip-9         2.52x    9:13     7:49     1:24

Summary: media data set

Compression    Ratio    Total    Write    Read
off            1.00x    3:39     2:17     1:22
lzjb           1.00x    3:38     2:16     1:22
gzip-2         1.01x    5:46     4:24     1:22
gzip           1.01x    5:57     4:34     1:23
gzip-9         1.01x    6:06     4:43     1:23

We see that read times are unaffected by the change (not surprisingly), but write times for the CPU-intensive workloads (gzip) are improved by over 20%.
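The improvement can be computed directly from the two summary tables; this sketch compares the gzip and gzip-9 write times from the original 8-thread run against the 12-thread rerun:

```python
# Percent improvement in write time between the 8-thread and 12-thread
# runs, using the m:ss figures from the two summary tables.

def seconds(t):
    """Convert an 'm:ss' duration to seconds."""
    m, s = t.split(":")
    return int(m) * 60 + int(s)

def improvement(before, after):
    return (seconds(before) - seconds(after)) / seconds(before)

print(round(improvement("9:56", "7:17"), 3))    # gzip writes
print(round(improvement("10:54", "7:49"), 3))   # gzip-9 writes
```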

From the Analytics, we can see that CPU utilization is now up to 75% (exactly what we'd expect):

CPU usage with 12 ZIO threads

Note that in order to run this experiment, I had to modify the system in a very unsupported (and unsupportable) way. Thus, the above results do not represent current performance of the 7410, but only suggest what's possible with future software updates. For these kinds of ZFS tunables (as well as those in other components of Solaris, like the networking stack), we'll continue to work with the Solaris teams to find optimal values, exposing configurables to the administrator through our web interface when necessary. Expect future software updates for the 7000 series to include tunable changes to improve performance.

Finally, it's also important to realize that if you run into this limit, you've got 8 cores (or 12, in this case) running compression full-tilt and your workload is CPU-bound. Frankly, you're using more CPU for compression than many enterprise storage servers even have today, and it may very well be the right tradeoff if your environment values disk space over absolute performance.

Update Mar 27, 2009: Updated charts to start at zero.

Monday Mar 16, 2009

Compression on the Sun Storage 7000

Built-in filesystem compression has been part of ZFS since day one, but is only now gaining some enterprise storage spotlight. Compression reduces the disk space needed to store data, not only increasing effective capacity but often improving performance as well (since fewer bytes means less I/O). Beyond that, having compression built into the filesystem (as opposed to using an external appliance between your storage and your clients to do compression, for example) simplifies the management of an already complicated storage architecture.

Compression in ZFS

Your mail client might use WinZIP to compress attachments before sending them, or you might unzip tarballs in order to open the documents inside. In these cases, you (or your program) must explicitly invoke a separate program to compress and uncompress the data before actually using it. This works fine in these limited cases, but isn't a very general solution. You couldn't easily store your entire operating system compressed on disk, for example.

With ZFS, compression is built directly into the I/O pipeline. When compression is enabled on a dataset (filesystem or LUN), data is compressed just before being sent to the spindles and decompressed as it's read back. Since this happens in the kernel, it's completely transparent to userland applications, which need not be modified at all. Besides the initial configuration (which we'll see in a moment is rather trivial), users need not do anything to take advantage of the space savings offered by compression.

A simple example

Let's take a look at how this works on the 7000 series. Like all software features, compression comes free. Enabling compression for user data is simple because it's just a share property. After creating a new share, double-click it to modify its properties, select a compression level from the drop-down box, and apply your changes:


GZIP options

After that, all new data written to the share will be compressed with the specified algorithm. Turning compression off is just as easy: select 'Off' from the same drop-down. In both cases, extant data will remain as-is - the system won't go back and rewrite everything that already existed on the share.

Note that when compression is enabled, all data written to the share is compressed, no matter where it comes from: NFS, CIFS, HTTP, and FTP clients all reap the benefits. In fact, we use compression under the hood for some of the system data (analytics data, for example), since the performance impact is negligible (as we will see below) and the space savings can be significant.

You can observe the compression ratio for a share in the sidebar on the share properties screen. This is the ratio of uncompressed data size to actual (compressed) disk space used and tells you exactly how much space you're saving.
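That ratio converts directly into space savings: a ratio of R means the data occupies 1/R of its logical size, a saving of 1 - 1/R. A minimal sketch, using ratios from the summary tables below:

```python
# Convert a compression ratio (uncompressed size / compressed size)
# into the fraction of disk space saved.

def space_saved(ratio):
    return 1.0 - 1.0 / ratio

print(round(space_saved(2.35), 3))   # e.g., gzip-2 on the text data set
print(round(space_saved(1.01), 3))   # e.g., gzip on the media data set
```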



The cost of compression

People are often concerned about the CPU overhead associated with compression, but the actual cost is difficult to calculate. On the one hand, compression does trade CPU utilization for disk space savings. And up to a point, if you're willing to trade more CPU time, you can get more space savings. But by reducing the space used, you end up doing less disk I/O, which can improve overall performance if your workload is bandwidth-limited.
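The shape of that tradeoff is easy to see with any gzip-family compressor. This sketch uses Python's zlib (which implements DEFLATE, the same algorithm behind gzip) as a stand-in for the in-kernel implementation; the sample buffer is made up for illustration:

```python
import zlib

# Sketch of the CPU-for-space tradeoff in DEFLATE: higher levels spend
# more effort searching for matches in exchange for a smaller output.
data = b"A ZFS home server stores photos, videos, and source code.\n" * 2000

for level in (1, 6, 9):
    compressed = zlib.compress(data, level)
    print(level, round(len(data) / len(compressed), 1))
```

On real workloads the curve flattens quickly, which is consistent with the gzip vs. gzip-9 results below.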

But even when reduced I/O doesn't improve overall performance (because bandwidth isn't the bottleneck), it's important to keep in mind that the 7410 has a great deal of CPU horsepower (up to 4 quad-core 2GHz Opterons), making the "luxury" of compression very affordable.

The only way to really know the impact of compression on your disk utilization and system performance is to run your workload with different levels of compression and observe the results. Analytics is the perfect vehicle for this: we can observe CPU utilization and I/O bytes per second over time on shares configured with different compression algorithms.

Analytics results

I ran some experiments to show the impact of compression on performance. Before we get to the good stuff, here's the nitty-gritty about the experiment and results:

  • These results do not demonstrate maximum performance. I intended to show the effects of compression, not the maximum throughput of our box. Brendan's already got that covered.
  • The server is a quad-core 7410 with 1 JBOD (configured with mirrored storage) and 16GB of RAM. No SSD.
  • The client machine is a quad-core 7410 with 128GB of DRAM.
  • The basic workload consists of 10 clients, each writing 3GB to its own share and then reading it back for a total of 30GB in each direction. This fits entirely in the client's DRAM, but it's about twice the size of the server's total memory. While each client has its own share, they all use the same compression level for each run, so only one level is tested at a time.
  • The experiment is run for each of the compression levels supported on the 7000 series: lzjb, gzip-2, gzip (which is gzip-6), gzip-9, and none.
  • The experiment uses two data sets: 'text' (copies of /usr/dict/words, which is fairly compressible) and 'media' (copies of the Fishworks code swarm video, which is not very compressible).
  • I saw similar results with between 3 and 30 clients (with the same total write/read throughput, so they were each handling more data).
  • I saw similar results whether each client had its own share or not.
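The gap between the two data sets can be reproduced with any DEFLATE implementation. Here's a sketch using Python's zlib with stand-in data: repetitive English text for 'text' and random bytes for 'media' (real dictionary words and video frames would behave similarly, since compressed video is effectively random to a second compressor):

```python
import os
import zlib

# Stand-ins for the two data sets: repetitive text compresses well,
# while random bytes (like already-compressed video) barely compress.
text = b"the quick brown fox jumps over the lazy dog\n" * 1000
media = os.urandom(len(text))

for name, data in (("text", text), ("media", media)):
    ratio = len(data) / len(zlib.compress(data, 6))
    print(name, round(ratio, 2))
```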

Now, below is an overview of the text (compressible) data set experiments in terms of NFS ops and network throughput. This gives a good idea of what the test does. For all graphs below, five experiments are shown, each with a different compression level in increasing order of CPU usage and space savings: off, lzjb, gzip-2, gzip, gzip-9. Within each experiment, the first half is writes and the second half reads:

NFS and network stats

Not surprisingly, from the NFS and network levels, the experiments basically appear the same, except that the writes are spread out over a longer period for higher compression levels. The read times are pretty much unchanged across all compression levels. The total NFS and network traffic should be the same for all runs. Now let's look at CPU utilization over these experiments:

CPU usage

Notice that CPU usage increases with higher compression levels, but caps out at about 50%. I need to do some digging to understand why this happens on my workload, but it may have to do with the number of threads available for compression. Anyway, since it only uses 50% of CPU, the more expensive compression runs end up taking longer.

Let's shift our focus now to disk I/O. Keep in mind that the disk throughput rate is twice that of the data we're actually reading and writing because the storage is mirrored:

Disk I/O

We expect to see an actual decrease in disk bytes written and read as the compression level increases, because the data on disk is more highly compressed.

I collected similar data for the media (uncompressible) data set. The three important differences were that with higher compression levels, each workload took less time than the corresponding text one:

Network bytes

the CPU utilization during reads was less than in the text workload:

CPU utilization

and the total disk I/O didn't decrease nearly as much with the compression level as it did in the text workloads (which is to be expected):

Disk throughput

The results can be summarized by looking at the total execution time for each workload at various levels of compression:

Summary: text data set

Compression    Ratio    Total    Write    Read
off            1.00x    3:30     2:08     1:22
lzjb           1.47x    3:26     2:04     1:22
gzip-2         2.35x    6:12     4:50     1:22
gzip           2.52x    11:18    9:56     1:22
gzip-9         2.52x    12:16    10:54    1:22

Summary: media data set

Compression    Ratio    Total    Write    Read
off            1.00x    3:29     2:07     1:22
lzjb           1.00x    3:31     2:09     1:22
gzip-2         1.01x    6:59     5:37     1:22
gzip           1.01x    7:18     5:57     1:21
gzip-9         1.01x    7:37     6:15     1:22
Space chart

Time chart

What conclusions can we draw from these results? At a high level, they confirm what we already knew: compression performance and space savings vary greatly with the compression level and the type of data. More specifically, with my workloads:

  • Read performance is generally unaffected by compression.
  • lzjb can deliver decent space savings, and performs well whether or not it's able to generate much savings.
  • Even modest gzip imposes a noticeable performance hit, whether or not it reduces I/O load.
  • gzip-9 in particular can spend a lot of extra time for marginal gain.

Moreover, the 7410 has plenty of CPU headroom to spare, even with high compression.

Summing it all up

We've seen that compression is free, built-in, and very easy to enable on the 7000 series. The performance effects vary based on the workload and compression algorithm, but powerful CPUs allow compression to be used even on top of serious loads. Moreover, the appliance provides great visibility into overall system performance and effectiveness of compression, allowing administrators to see whether compression is helping or hurting their workload.

About

On Fishworks, Sun, and software engineering
