ZFS on a laptop?
By erickustarz on Jun 12, 2007
Sun is known for servers, not laptops. So surely a filesystem designed by Sun would be too powerful and too "heavy" for a laptop, and the features of a "datacenter" filesystem wouldn't fit there. Right? Actually, no. As it turns out, ZFS is a great match for laptops.
One of the most important things a user needs to do on a laptop is back up their data. Copying your data to DVD or an external drive is one way. ZFS snapshots with 'zfs send' and 'zfs recv' are a better way. Due to ZFS's architecture, snapshots are very fast and only take up as much space as the data that has changed since the snapshot was taken. For a typical user, taking a snapshot every day, for example, will only take up a small amount of capacity.
So let's start off with a ZFS pool called 'swim' and two filesystems: 'Music' and 'Pictures':
fsh-mullet# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
swim            157K  9.60G    21K  /swim
swim/Music       18K  9.60G    18K  /swim/Music
swim/Pictures    19K  9.60G    19K  /swim/Pictures
fsh-mullet# ls /swim/Pictures
bday.jpg  good_times.jpg
Taking a snapshot 'today' of Pictures is this easy:
fsh-mullet# zfs snapshot swim/Pictures@today
And now we can see the contents of snapshot 'today' via the '.zfs/snapshot' directory:
fsh-mullet# ls /swim/Pictures/.zfs/snapshot/today
bday.jpg  good_times.jpg
fsh-mullet#
If you want to take a snapshot of all your filesystems, then you can do:
fsh-mullet# zfs snapshot -r swim@today
fsh-mullet# zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
swim                  100M  9.50G    21K  /swim
swim@today               0      -    21K  -
swim/Music            100M  9.50G   100M  /swim/Music
swim/Music@today         0      -   100M  -
swim/Pictures          19K  9.50G    19K  /swim/Pictures
swim/Pictures@today      0      -    19K  -
fsh-mullet#
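For the daily snapshots mentioned earlier, a fixed name like 'today' would collide on the second day. A minimal sketch of one way around that, assuming a date-stamped naming scheme (the scheme and both function names are illustrative, not anything ZFS requires). It prints the command for review instead of running it:

```shell
#!/bin/sh
# Sketch: build a date-stamped snapshot name so daily snapshots do not collide.
# The pool@YYYY-MM-DD scheme is an assumption, not a ZFS requirement.

snapshot_name() {
    # $1 = pool or filesystem, $2 = date string (defaults to today's date)
    printf '%s@%s\n' "$1" "${2:-$(date +%Y-%m-%d)}"
}

daily_snapshot_cmd() {
    # Print (rather than run) the recursive snapshot command.
    printf 'zfs snapshot -r %s\n' "$(snapshot_name "$1" "$2")"
}

daily_snapshot_cmd swim 2007-06-12
```

Leaving off the date argument uses the current date, so the same call could be dropped into a nightly job.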
Now that you have snapshots, you can use the built-in features of 'zfs send' and 'zfs recv' to back up your data - even to another machine.
fsh-mullet# zfs send swim/Pictures@today | ssh host2 zfs recv -d backupswim
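After you've sent over the first snapshot via 'zfs send', you can then do incremental 'zfs send's. An incremental send names two snapshots: the one the backup already has and a newer one. Assuming a later snapshot called 'tomorrow' (the name is just for illustration):
fsh-mullet# zfs snapshot swim/Pictures@tomorrow
fsh-mullet# zfs send -i swim/Pictures@today swim/Pictures@tomorrow | ssh host2 zfs recv -d backupswim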
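Since an incremental stream is defined by a pair of snapshots (the older one already on the backup, and the newer one to send), it can help to build the pipeline from explicit arguments. The following is only a sketch: the function name is made up, and 'tomorrow' is a hypothetical later snapshot. It prints the pipeline so it can be checked before being run:

```shell
#!/bin/sh
# Sketch: print the incremental send pipeline for review instead of running it.
# All arguments are illustrative; 'tomorrow' is a hypothetical later snapshot.

incremental_send_cmd() {
    fs=$1        # source filesystem, e.g. swim/Pictures
    from=$2      # snapshot the backup already holds
    to=$3        # newer snapshot to bring the backup up to
    host=$4      # backup host
    pool=$5      # backup pool on that host
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs recv -d %s\n' \
        "$fs" "$from" "$fs" "$to" "$host" "$pool"
}

incremental_send_cmd swim/Pictures today tomorrow host2 backupswim
```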
Now let's look at the backup ZFS pool 'backupswim' on host 'host2':
host2# zfs list
NAME                        USED  AVAIL  REFER  MOUNTPOINT
backupswim                  100M  9.50G    21K  /backupswim
backupswim/Music            100M  9.50G   100M  /backupswim/Music
backupswim/Music@today         0      -   100M  -
backupswim/Pictures          18K  9.50G    18K  /backupswim/Pictures
backupswim/Pictures@today      0      -    18K  -
What's really nice about using ZFS's snapshots is that you only need to send over (and store) the differences between snapshots. So if you're doing video editing on your laptop, and have a giant 10GB file, but only change, say, 1KB of data on this day, with ZFS you only have to send over 1KB of data - not the entire 10GB of the file. This also means you don't have to store multiple 10GB versions (one per snapshot) of the file on your backup device.
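To put rough numbers on that, here is a small back-of-the-envelope sketch. The 30-day retention period is an assumption, and the 1KB daily change is taken at face value from the example; in practice ZFS tracks changes per block, so the real delta would be somewhat larger than the bytes actually edited.

```shell
#!/bin/sh
# Sketch: backup storage for a 10GB file that changes by 1KB per day.
# Sizes are in KB; DAYS=30 is an assumed retention period.

FILE_KB=$((10 * 1024 * 1024))   # 10GB expressed in KB
DELTA_KB=1                      # data changed per day (the example's figure)
DAYS=30

# One full copy of the file per day.
full_copies_kb=$((FILE_KB * DAYS))

# One full copy on the first day, then only the daily delta.
incremental_kb=$((FILE_KB + DELTA_KB * (DAYS - 1)))

echo "full copies: ${full_copies_kb} KB"
echo "incremental: ${incremental_kb} KB"
```

Under these assumptions, a month of full copies comes to 300GB, while the snapshot-based approach stays just over 10GB.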
You can also back up to an external hard drive. Create a backup pool on the second hard drive, and just 'zfs send/recv' your nightly snapshots.
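A sketch of what the external drive workflow might look like. The device name 'c8t0d0' is hypothetical, and the function only prints the commands so nothing touches a disk by accident:

```shell
#!/bin/sh
# Sketch: print the commands for backing up to an external drive.
# 'c8t0d0' is a hypothetical device name; check your own before creating a pool.

external_backup_cmds() {
    snap=$1      # snapshot to send, e.g. swim/Pictures@today
    pool=$2      # backup pool name
    dev=$3       # external drive device (needed on the first run only)
    printf 'zpool create %s %s\n' "$pool" "$dev"
    printf 'zfs send %s | zfs recv -d %s\n' "$snap" "$pool"
    printf 'zpool export %s\n' "$pool"
}

external_backup_cmds swim/Pictures@today backupswim c8t0d0
```

The 'zpool create' step only applies the first time. On later nights the pool already exists on the drive, so you would 'zpool import' it instead, send the new snapshot, and 'zpool export' it again before unplugging.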
Since laptops (typically) only have one disk, handling disk errors is very important. Bill introduced ditto blocks to handle partial disk failures. With typical filesystems, if part of the disk is corrupted or failing and that part of the disk stores your metadata, you're screwed: there's no way to access the data associated with the inaccessible metadata short of restoring from a backup. With ditto blocks, ZFS stores multiple copies of the metadata in the pool. In the single disk case, we strategically store multiple copies of the metadata at different locations on disk (such as at the front and back of the disk). A subtle partial disk failure can make other filesystems useless, whereas ZFS can survive.
Matt took ditto blocks one step further and allowed the user to apply them to any filesystem's data. What this means is that you can make your more important data more reliable by stashing away multiple copies of it (without muddying your namespace). Here's how you store two copies of your pictures:
fsh-mullet# zfs set copies=2 swim/Pictures
fsh-mullet# zfs get copies swim/Pictures
NAME           PROPERTY  VALUE  SOURCE
swim/Pictures  copies    2      local
fsh-mullet#
Note that the copies property only affects future writes (not existing data). So I recommend you set it at filesystem creation time:
fsh-mullet# zfs create -o copies=2 swim/Music
fsh-mullet# zfs get copies swim/Music
NAME        PROPERTY  VALUE  SOURCE
swim/Music  copies    2      local
fsh-mullet#
With ZFS, compression comes built-in. The current algorithms are lzjb (based on Lempel-Ziv) and gzip. Now it's true that your jpegs and mp4s are already compressed quite nicely, but if you want to save capacity on other filesystems, all you have to do is:
fsh-mullet# zfs set compression=on swim/Documents
fsh-mullet# zfs get compression swim/Documents
NAME            PROPERTY     VALUE  SOURCE
swim/Documents  compression  on     local
fsh-mullet#
The default compression algorithm is lzjb. If you want to use gzip, then do:
fsh-mullet# zfs set compression=gzip swim/Documents
fsh-mullet# zfs get compression swim/Documents
NAME            PROPERTY     VALUE  SOURCE
swim/Documents  compression  gzip   local
fsh-mullet#
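Once compression is on, the read-only 'compressratio' property reports how much space it is actually saving. The sketch below checks a requested setting and then prints both commands. The function name is made up, and the accepted list is an assumption covering the values discussed here plus 'off' and the 'gzip-1' through 'gzip-9' levels; other ZFS releases may accept more.

```shell
#!/bin/sh
# Sketch: validate a compression setting before printing the 'zfs set' command.
# The accepted list is an assumption; other ZFS releases may support more values.

compression_cmd() {
    fs=$1
    value=$2
    case $value in
        on|off|lzjb|gzip|gzip-[1-9]) ;;
        *) echo "unsupported compression value: $value" >&2; return 1 ;;
    esac
    printf 'zfs set compression=%s %s\n' "$value" "$fs"
    # compressratio is a read-only property reporting the achieved ratio.
    printf 'zfs get compressratio %s\n' "$fs"
}

compression_cmd swim/Documents gzip
```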
That single disk stickiness
A major problem with laptops today is the single point of failure: the single disk. It makes complete sense that laptops are designed this way today, given the physical space and power constraints. But looking forward, as flash gets cheaper and more reliable, replacing the single disk in laptops becomes more and more of a possibility. With the physical space saved, you can actually fit more than one flash device in the laptop. Wouldn't it be really cool if you could then build RAID on top of the multiple devices? Introducing a hardware RAID controller doesn't make any sense, but software RAID does.
Creating a mirrored pool is easy:
diskmonster# zpool create swim mirror c7t0d0 c7t1d0
diskmonster# zpool status
  pool: swim
 state: ONLINE
 scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        swim        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
errors: No known data errors
diskmonster#
Similarly, creating a RAID-Z is also easy:
diskmonster# zpool create swim raidz c7t0d0 c7t1d0 c7t2d0 c7t5d0
diskmonster# zpool status
  pool: swim
 state: ONLINE
 scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        swim        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0
errors: No known data errors
diskmonster#
With either of these configurations, your laptop can now handle a whole device failure.
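The two layouts trade capacity for redundancy differently. A rough sketch of the arithmetic, ignoring metadata and other overhead, and assuming 32GB per device (the size and the function names are made up for illustration):

```shell
#!/bin/sh
# Sketch: rough usable capacity, ignoring metadata and other overhead.
# Device counts match the two pools above; 32GB per device is an assumed size.

mirror_usable() {
    # $1 = device count (unused: every device holds the same data)
    # $2 = size per device
    echo "$2"
}

raidz1_usable() {
    # One device's worth of space goes to parity.
    echo $(( ($1 - 1) * $2 ))
}

echo "mirror of 2 x 32GB: $(mirror_usable 2 32)GB usable"
echo "raidz1 of 4 x 32GB: $(raidz1_usable 4 32)GB usable"
```

So the two-way mirror gives you one device's worth of space, while the four-device RAID-Z gives you roughly three.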
ZFS on a laptop - a perfect fit.