OpenSolaris Home Server: ZFS and USB Disks

[Photo: My home server with a couple of USB disks]

A couple of weeks ago, OpenSolaris 2008.05, project Indiana, saw its first official release. I've been looking forward to this moment so I can upgrade my home server and work laptop and start benefiting from the many cool features. If you're running a server at home, why not use the best server OS on the planet for it?

This is the first in a small series of articles about using OpenSolaris for home server use. I did a similar series some time ago and got a lot of good and encouraging feedback, so this is an update, or a remake, or home server 2.0, if you will.

I'm not much of a PC builder, but Simon has posted his experience with selecting hardware for his home server. I'm sure you'll find good tips there. In my case, I'm still using my trusty old Sun Java W1100z workstation, running in my basement. And for storing data, I like to use USB disks.

USB disk advantages

This is the moment where people start giving me those "Yeah, right" or "Are you serious?" looks. But USB disk storage has some cool advantages:

  • It's cheap. About 90 Euros for half a TB of disk from a major brand. Can't complain about that.
  • It's hot-pluggable. What happens if your server breaks and you want to access your data? With USB, it's as easy as unplugging from the broken server and plugging into a laptop, and you're back in business. And there's no need to shut down or open your server if you just want to add a new disk or change the disk configuration (see the sketch after this list for finding a new disk's device name).
  • It scales. I have 7 disks running in my basement. All I needed to do to make them work with my server was buy a cheap 15 EUR 4-port USB card to expand my existing 5 USB ports. I still have 3 PCI slots left, so I could add 12 more disks at full USB 2.0 speed if I wanted.
  • It's fast enough. I measure about 10 MB/s in write performance with a typical USB disk. That's about as fast as you can get over the 100 Mbit/s LAN most people use at home: 100 Mbit/s is 12.5 MB/s in theory, and roughly 10-11 MB/s in practice. As long as the network remains the bottleneck, USB disk performance is not the problem.
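By the way, finding the device name of a freshly plugged-in disk is straightforward. Here's a minimal sketch of the commands I'd use (the output, and of course the device names, vary from system to system):

    # list removable and USB disks together with their logical device names
    rmformat
    # show the USB attachment points, i.e. what the system detected on which port
    cfgadm -al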

ZFS and USB: A Great Team

But this is not enough. The beauty of USB disk storage lies in its combination with ZFS. When adding some ZFS magic to the above, you also get:

  • Reliability. USB disks can be mirrored or used in a RAID-Z/Z2 configuration. Each disk may be unreliable individually (because they're cheap), but thanks to ZFS' data integrity and self-healing properties, the data will be safe, and FMA will issue a warning early enough that disks can be replaced before any real harm happens.
  • Flexibility. Thanks to pooled storage, there's no need to wonder which disks to use for what and how. Just build up a single pool with the disks you have, then assign filesystems to individual users, jobs, applications, etc. on an as-needed basis (see the sketch after this list).
  • Performance. Suppose you upgrade your home network to Gigabit Ethernet. No need to worry: The more disks you add to the pool, the better your performance will be. Even if the disks are cheap.

Together, USB disks and ZFS make a great team. Not enterprise class, but certainly an interesting option for a home server.
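To make this concrete, here's a minimal sketch of how such a pool could be set up. The pool name and the disk device names are just examples, yours will differ:

    # a pool made of two mirrored pairs of USB disks
    zpool create tank mirror c8t0d0 c9t0d0 mirror c10t0d0 c11t0d0
    # hand out filesystems on an as-needed basis
    zfs create tank/home
    zfs create tank/home/alice
    # and give them individual properties, e.g. a quota
    zfs set quota=100g tank/home/alice

Adding another mirrored pair later is a single zpool add command, and the extra spindles automatically benefit the whole pool.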

ZFS & USB Tips & Tricks

So here's a list of tips, tricks and hints you may want to consider when daring to use USB disks with OpenSolaris as a home server:

  • Mirroring vs. RAID-Z/Z2: RAID-Z (or its more reliable cousin RAID-Z2) is tempting: You get more space for less money. In fact, my earlier zpools at home were combinations of RAID-Z'ed leftover slices, built to squeeze as much space as possible, at some level of reliability, out of my mixed disk collection.
    But say you have a 3+1 RAID-Z and want to add some more space. Would you buy 4 disks at once? Isn't that a bit big, granularity-wise?
    That's why I decided to keep it simple and just mirror. USB disks are cheap enough, there's no need to be even cheaper. My current zpool has a pair of 1 TB USB disks and a pair of 512 GB USB disks and works fine.
    Another advantage of this approach is that you can organically modernize your pool: Wait until one of your disks starts showing some flakiness (FMA and ZFS will warn you as soon as the first broken data block has been repaired). Then replace the disk with a bigger one, then its mirror with the same, bigger size. That gives you more space without the complexity of too many disks, and keeps your disks young enough not to be a serious threat to your data. Use the replaced disks for scratch space or less important tasks.
  • Instant replacement disk: A few weeks ago, one of my mirrored disks showed its first write error. It was a pair of 320 GB disks, so I ordered a 512 GB replacement (with the plan to order the second one later). But in the meantime, my mirror was vulnerable: What if the second disk started breaking before the replacement arrived?
    That's why having a few old but functional disks around can be very valuable: In my case, I took a 200 GB and a 160 GB disk and combined them into their own zpool:
    zpool create temppool c11t0d0 c12t0d0
    Then, I created a new ZVOL sitting on the new pool:
    zfs create -sV 320g temppool/tempvol
    Here's our temporary replacement disk! I then attached it to my vulnerable mirror:
    zpool attach santiago c10t0d0 /dev/zvol/dsk/temppool/tempvol
    And voilà, my precious production pool started resilvering the new virtual disk. After the new disk arrived and was resilvered, the temporary disk could be detached and destroyed, and its space put to some other good use (see the sketch after this list).
    Storage virtualization has never been so easy!
  • Don't forget to scrub: Especially with cheap USB disks, regular scrubbing is important. Scrubbing checks each and every block of your data on disk and makes sure it's still valid. If a block is broken, scrubbing repairs it (since we're mirroring or using RAID-Z/Z2) and tells you which disk it was on, so you can decide whether that disk needs to be replaced just yet.
    How often you should scrub depends on how much you trust your hardware and how much of your data is being read anyway (any data that is read is automatically checked, so that particular portion of the data is already "scrubbed", if you will). I find scrubbing once every two weeks a useful cycle; others may prefer once a month or once a week.
    But scrubbing is a process that needs to be initiated by the administrator. It doesn't happen by itself, so it is important that you think of issuing the "zpool scrub" command regularly, or better, set up a cronjob for it to happen automatically.
    As an example, the following line:
    23 01 1,15 * * for i in `zpool list -H -o name`; do zpool scrub $i; done
    in your crontab will start a scrub for each of your zpools twice a month on the 1st and the 15th at 01:23 AM.
  • Snapshot often: Snapshots are cheap, but they can save the world if you accidentally deleted that important file. Same rule as with scrubbing: Do it. Often enough. Automatically. Tim Foster did a great job of implementing an automatic ZFS snapshot service, so why don't you just install it now and set up a few snapshot schemes for your favourite ZFS filesystems?
    The home directories on my home server are snapshotted once a month (and all snapshots are kept), once a week (keeping 52 snapshots) and once a day (keeping 31 snapshots). This gives me a time-machine with daily, weekly and monthly granularities depending on how far back in time I want to travel through my snapshots.
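To round off the disk replacement story from the first two tips, here's a sketch of the commands involved once the real replacement disk has arrived. The pool and device names follow my example above; the new disk's name c13t0d0 is made up, and I'm assuming c10t0d0 is the flaky one:

    # swap the flaky disk for the new, bigger one and let ZFS resilver
    zpool replace santiago c10t0d0 c13t0d0
    # watch the resilvering progress
    zpool status santiago
    # once resilvering is done, drop the temporary zvol from the mirror
    zpool detach santiago /dev/zvol/dsk/temppool/tempvol
    # and reclaim the old scratch disks
    zpool destroy temppool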

So, USB disks aren't bad. In fact, thanks to ZFS, USB disks can be very useful building blocks for your own little cost-effective but reliable and integrity-checked data center.

Let me know what experiences you've had using USB storage at home or with ZFS, and what tips and tricks you have found to work well for you. Just enter a comment below or send me email!

Comments:

That is a reason to convert my music library to ZFS. The hard drive is full, but I'm sick of paying money for a new one. Instead, a mobile hard drive will be converted to ZFS and added to the zpool on the existing drive. No idea how they will be able to work together, but they should. ZFS offers fantastic simplicity.

Posted by Andrius Burlega on May 27, 2008 at 04:46 PM CEST #

Thank you for the comment, Andrius!

I might add that it makes sense to have separate pools for the OS and for the data. The reason is that you can employ different protection schemes (mirror vs. RAID-Z/Z2, while ZFS boot is restricted to mirroring only) and that boot disks are typically internal, whereas USB disks have the advantage of being external and hot-pluggable.

Mixing both in one pool would mean compromising on both fronts, so having a separate pool for the data in addition to the zpool for root/boot is recommended.
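As a minimal sketch of that layout (the data pool's name and device names are just examples; rpool is what the OpenSolaris installer creates for boot):

    # the mirrored root/boot pool, as set up by the installer
    zpool status rpool
    # a separate RAID-Z data pool on the external USB disks
    zpool create tank raidz c8t0d0 c9t0d0 c10t0d0 c11t0d0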

Cheers,
Constantin

Posted by Constantin on May 28, 2008 at 02:26 AM CEST #

It is also a nice feature to use ZFS and USB disks for backup. Just attach a new disk to your mirror and, after resilvering, take it away to a safe place.

I'll add to the tips and tricks section the fact that detaching a USB disk from the pool doesn't mean you can recreate it on another system.

On a completely off-topic note, I also have a W1100z at home and I'm looking for a GPU to replace the NVIDIA Quadro FX500. Any hints for a working card with good 3D acceleration? I already tried an NVIDIA 6800GT without success.

Posted by guest on May 28, 2008 at 04:11 AM CEST #

Hi anonymous user at 193.141.92.4, thank you for your comment!

The recommended way to replicate pools across systems is to use zfs send/receive.
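For example, a simple one-filesystem replication to another machine could look like this sketch (host, pool and dataset names are made up):

    # take a snapshot as a consistent source
    zfs snapshot tank/home@backup1
    # stream it to a pool on the other server
    zfs send tank/home@backup1 | ssh otherhost zfs receive backuppool/home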

Still, if you zpool offline a disk from your mirror, then connect it to another server, you should be able to zpool import -f it. But that's not the intended way to use it, so it's completely unsupported.

The NVIDIA drivers for Solaris are available on www.nvidia.com and they're part of OpenSolaris 2008.05 as well. They are supposed to support the Quadro line of NVIDIA cards, so that's the preferred option for upgrading your W1100z. Make sure it's an AGP-bus Quadro card. GeForce is not supported, but those cards can be made to work by looking up their PCI id through prtconf, then adding that id to the list of devices bound to the nvidia driver through add_drv -i '"pci123,45"' nvidia (where pci123,45 is the id of your GeForce card). See also /etc/driver_aliases for example entries.

Hope this helps,
Constantin

Posted by Constantin on May 28, 2008 at 05:30 AM CEST #

There is a little confusion about ZFS: How do you remove a device from a zpool after it was added to the zpool?

Posted by Andrius Burlega on May 28, 2008 at 05:08 PM CEST #

Hi Andrius,

A device that has been added to a pool can be replaced by a device that has the same size or is bigger. But it can't be removed.

The ZFS team is working on support for "reshuffling" of data which would allow the removal of devices as well as RAID-level changes and stripe-size changes. But that is not ready yet, so today, zpools cannot be shrunk.

ZFS filesystems in a pool (including the pool filesystem, that is the uppermost in the hierarchy) can be recursively snapshotted (zfs snapshot -r), then sent and received recursively to migrate them over to another pool.

So, today the way to shrink a pool is to snapshot, then migrate your filesystems into another pool, destroy the old pool, then recreate the pool with the desired size, then zfs send/receive the filesystems recursively back.
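A sketch of one leg of that round trip, with made-up pool names (assuming a build that supports recursive send streams):

    # snapshot the whole pool recursively
    zfs snapshot -r oldpool@migrate
    # send everything into the new pool in one replication stream
    zfs send -R oldpool@migrate | zfs receive -dF newpool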

Hope this helps,
Constantin

Posted by Constantin Gonzalez on May 29, 2008 at 08:25 AM CEST #

On a 4 GByte USB drive I faced a problem reading a nearly 4 GB DVD .iso image; after switching to Linux, I could read the iso image from exactly the same hardware. I can try it again on the very latest OpenSolaris. I experienced no other problems reading smaller files.

Also, for most purposes FAT32 is used as the filesystem, and the file names sometimes get corrupted. Differences between UFS and FAT32 in file name capitalization and Unicode handling sometimes cause different files with Unicode names to be merged into one single FAT32 file. Especially the euro symbol and the French œ letter are known to cause FAT32 problems.

Posted by Lordbyte Whitfield on June 04, 2008 at 02:33 PM CEST #

Hi Lordbyte, thank you for your comment.

Sorry to hear about the 4GB problem. I assume it is related to the handling of the FAT32 filesystem in Solaris. If the file you used can be obtained publicly and if the bug is easily reproducible, consider filing a bug with the OpenSolaris bug tracking system at: http://bugs.opensolaris.org/

The same goes for your observations about file names.

The trouble with FAT32 is: On one hand, it is a really old and ugly file system and its use should be deprecated. On the other hand, it is the lowest common denominator of file systems, understood by all OSes, making it still "useful".

For storing data at home etc., I would never use FAT32. One of the biggest advantages of ZFS is its data integrity, robustness and self-healing capabilities, and these features are becoming more and more critical in the light of ever-increasing use of storage vs. constant hard disk I/O failure rates.

Cheers,
Constantin

Posted by Constantin Gonzalez on June 04, 2008 at 03:07 PM CEST #

Hi Constantin, great post as usual -- very informative.

Personally, I must admit that I'm not a great fan of USB-connected disks due to (1) low speed, (2) lots of wires, and (3) lots of power bricks.

Cooling is another serious, often-overlooked issue: disks need to be kept from getting too hot, and small cases often don't have good cooling/airflow. I would be interested to see the drive temperatures people get when they run this script during this summer:
http://breden.org.uk/2008/05/16/home-fileserver-drive-temps/

From the speed point of view, it saddens me to take a disk capable of 100 MBytes/sec read/write speed and to cripple it with a USB connection that limits its speed to around 10-20 MBytes/sec. And these fast, standard SATA disks are the exact same "cheap" disks used in these USB-connected cases. I prefer to get a screwdriver and open the computer case and get more speed for the same money :)

However, I can see that USB-connected drive enclosures are easy to connect and use, and that is a good thing :)

Posted by Simon Breden on June 10, 2008 at 03:13 AM CEST #

Hi Simon,

thank you for your comment!

The wires can be brought under control with some routing and cable binders, but the power bricks annoy me as well. I've been looking for some power supply consolidation opportunities but haven't done much research. Buying a big external power supply and running multiple wires to the USB disks is probably the way to go, but then you get a single point of failure. Right now, the many power bricks at least give the storage side of the picture higher availability than the server itself :).

As for speed, I agree from a server-to-disk standpoint, but ultimately the access is going to happen over Ethernet. Most people use 100BaseT in their homes, so a USB mirror is more than enough for the maximum speed you can reach over the network. Granted, I'll think differently once I upgrade my home network to GbE. I'm migrating home videos from DV tape to my server through my MacBook, so the time will come when I'll need the speed. But going for higher speeds with eSATA and the like comes at the cost of an extra enclosure, breaks laptop compatibility for disaster recovery and makes matters more complicated. GbE speeds can easily be reached with 6 disks in a mirrored config (for reading) or RAID-Z (for read/write of sequential data).

I'm going to think harder about this once I upgrade the network speed, but until then, that's the bottleneck to fix :).

Cheers,
Constantin

Posted by Constantin on June 10, 2008 at 03:43 AM CEST #

Hi Constantin,

Yes, I think when you are transferring lots of DV data you will appreciate the extra speed that GbE offers, which is several times that of 100 Mbit/sec ethernet.

With CIFS it seems you can get about 40-60 MBytes/sec using one GbE connection fairly easily, and using Link Aggregation to aggregate two GbE connections, I've had speeds of around 80-100 MBytes/sec, and I think the bottleneck there was (1) the single disk at the Mac Pro end (WD5000AAKS: quoted as having an 87 MByte/sec max sustained transfer speed), and (2) the low-power dual-core AMD 64 X2 BE-2350 processor seemed to max out when handling data flows of around 80-100 MBytes/sec.

Fixing those two bottlenecks might even enable higher throughput to be achieved :)

I got the speeds above using a 3 drive RAIDZ1 array on the ZFS fileserver.

Keep up the great posts here! Looking forward to the next 'story' :)

Posted by Simon Breden on June 10, 2008 at 08:32 AM CEST #
