Backing up a zvol

Over at Spiceworks, Michael2024 asks, "Anybody know how to get rsync to backup a ZFS zvol?"

My response is: "That's the wrong question." In fact, someone had already replied to Michael2024 saying that rsync was not the right tool, but no one suggested the best tool for backing up zvols: snapshots.

"But Mark," you say (because we're on first-name terms, and that is in fact my first name). "The snapshot is right there on the device that I'm trying to back up! How can that possibly help me?"

I'm glad you asked.

If you try to "back up" a zvol using a tool like dd, you're going to have to copy the whole volume, even the blocks that contain no data. But zvols are ZFS constructs, which means they follow the copy-on-write paradigm, which in turn means that ZFS has to know which blocks hold data and which don't.

That means a snapshot will only contain the data that has actually been written. That's right: a snapshot of a 100TB volume that has 10MB of data will only contain those 10MB of data. And therefore, any "zfs send" stream will only contain real data and not a bunch of unwritten garbage.

To demonstrate, let's create a 100MB volume and snapshot it:

-bash-4.0# zfs create -V 100m tank/vol
-bash-4.0# zfs snapshot tank/vol@snap

How big is the send stream? Easy enough to check:

-bash-4.0# zfs send tank/vol@snap | wc -c
4256

Just a smidge over 4k. Let's write some data:

-bash-4.0# dd if=/dev/random of=/dev/zvol/rdsk/tank/vol bs=1k count=10
10+0 records in
10+0 records out
-bash-4.0# zfs snapshot tank/vol@snap2
-bash-4.0# zfs send tank/vol@snap2 | wc -c
21264

OK, we wrote 10k of data, and the send stream is about 20k. With such a small amount of data, the overhead is about half the stream. But what if we write to the same blocks again?

-bash-4.0# dd if=/dev/random of=/dev/zvol/rdsk/tank/vol bs=1k count=10
10+0 records in
10+0 records out
-bash-4.0# zfs snapshot tank/vol@snap3
-bash-4.0# zfs send tank/vol@snap3 | wc -c
21264

The exact same amount! So ZFS knows exactly how much data there is on the zvol. Let's write 1MB instead:

-bash-4.0# dd if=/dev/random of=/dev/zvol/rdsk/tank/vol bs=1k count=1024
1024+0 records in
1024+0 records out
-bash-4.0# zfs snapshot tank/vol@snap4
-bash-4.0# zfs send tank/vol@snap4 | wc -c
1092768

And now the overhead is quite a bit smaller than the data, around 4%.
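
The sizes in the transcript can be checked with a little arithmetic. What follows is a sketch, not a description of the stream format: it assumes the zvol uses an 8K block size (so ten 1k writes touch two blocks, and 1MB touches 128) and that every written block carries a fixed 312-byte record header. Neither figure appears in the output above; they are assumptions, picked because they reproduce the measured byte counts. The function names are made up for this example.

```shell
#!/bin/sh
# Assumptions (not stated in the post): 8K volume block size and a
# 312-byte header per written block. EMPTY_STREAM is the measured size
# of the stream for the first, empty snapshot.
EMPTY_STREAM=4256
BLOCK_SIZE=8192
RECORD_HEADER=312

# predicted_stream NBLOCKS
# Expected size of a full send stream when NBLOCKS blocks hold data.
predicted_stream() {
    echo $(( EMPTY_STREAM + $1 * (BLOCK_SIZE + RECORD_HEADER) ))
}

# overhead_pct STREAM_BYTES DATA_BYTES
# Percentage of the stream that is not data we wrote ourselves.
overhead_pct() {
    awk -v s="$1" -v d="$2" 'BEGIN { printf "%.1f\n", (s - d) * 100 / s }'
}

# predicted_stream 2                 -> 21264    (snap2 and snap3)
# predicted_stream 128               -> 1092768  (snap4)
# overhead_pct 21264 10240           -> 51.8
# overhead_pct 1092768 1048576       -> 4.0
```

If that model is right, most of the "overhead" in the 10k case is not bookkeeping at all: the stream carries whole blocks, so 10k of writes costs 16k of payload.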

The question then is: which is more efficient? Doing a full block-by-block copy using something "higher up in the stack" (quoting from Michael2024 there), or creating another pool and doing a "zfs send | zfs recv"? On top of that, add the under-appreciated feature of incremental send streams, and you have a full backup solution that does not require any external tools.
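
As a rough sketch of what that looks like in practice, here are the two steps as shell functions: a full send to seed the copy, then incremental sends that carry only what changed between two snapshots. The pool name "backup" and the function names are made up for illustration; "zfs send -i" and "zfs recv" are the standard commands. Treat it as a starting point rather than a finished backup script.

```shell
#!/bin/sh
# Hypothetical layout: tank/vol is the zvol from the examples above and
# "backup" is a second pool, ideally on different disks.

# full_backup SRC SNAP DEST
# Seed DEST with a complete copy of SRC@SNAP.
full_backup() {
    zfs send "$1@$2" | zfs recv "$3"
}

# incremental_backup SRC FROM TO DEST
# Send only what changed between SRC@FROM and SRC@TO.
# DEST must already hold the FROM snapshot.
incremental_backup() {
    zfs send -i "$1@$2" "$1@$3" | zfs recv "$4"
}

# Example:
#   full_backup tank/vol snap backup/vol
#   incremental_backup tank/vol snap snap2 backup/vol
```

The incremental step only works while the source and the copy still share the older snapshot, so don't destroy a snapshot on either side until a newer one has been received.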

I would respond on the Spiceworks website, but alas it is both members-only and requires you to download a Windows client just to register. Lame!

Comments:

Mark, how do you address the issue that while the ZFS volume will always be consistent on disk, the zvol snapshot representation of the FS using the volume may not necessarily be consistent? This makes ZFS snapshotting less attractive unless I can coordinate some sort of quiescence when I take the snapshot.

Posted by Eric Sproul on December 08, 2009 at 01:15 PM GMT #

Eric: You're exactly right: quiescing all applications using the zvol is a must if you want the application's data to be consistent. No way around that. But you say that using snapshots is "less attractive". Less attractive than what?

Posted by Mark Musante on December 08, 2009 at 01:22 PM GMT #

Less attractive than filesystem-based backup via the FS on top of the zvol, such as with traditional enterprise backup. The requirement to quiesce makes perfect sense, but IMHO people need to be aware of that aspect when considering whether to use zvols for a particular application.

Posted by Eric Sproul on December 08, 2009 at 01:55 PM GMT #

But even a filesystem-based backup requires the applications using that filesystem to quiesce. I don't think you can get around that.

Posted by Mark Musante on December 09, 2009 at 04:14 AM GMT #
