ZFS Snapshot Replication Script

One of the OpenSolaris' ZFS filesystem's greatest features are its snapshots. You can easily create a snapshot by saying zfs create pool/filesystem@snapshot and then replicate that snapshot by saying something like zfs send pool/filesystem@snapshot | zfs receive newpool/some_other_filesystem. So far, so great.

Now let's say you have a nice pool and have been creating snapshots on a regular basis. After a few months, you decide to remodel your pool layout or migrate some of your filesystems over to a new pool for whatever reason. Then, you're facing a lot of those zfs send and receive commands. Especially, if you're like Chris and do snapshots on a per-minute basis. That's why the ZFS community is now waiting for 6421958: A zfs send -r option.

I had to migrate quite a few filesystems and many snapshots (thanks to Tim's excellent ZFS Snapshot SMF Service) lately when I set up a new pool strategy for my home server so I wrote myself a script to do the replication job. Since it may take some time for the send -r feature to be implemented, I turned it into a more generic utility.

Disclaimer: Please be advised that this script has only been tested a couple of times and it is provided to you completely on an "as-is" basis. Please have a look at the script to understand how it works and try it out on some non-risky pools and filesystems before you do real stuff with it. Run a backup before using this script and don't shoot me if something goes wrong.

Ok, what can this script do for you? First of all, check out its -h flag to see what options it provides:

# zfs-replicate -h
usage: zfs-replicate [-h] [-F ] [-n] [-s] [-v]
[-m [-c "FMRI|pattern[ FMRI|pattern]...]" ]
source dest

where source and dest is a ZFS filesystem, snapshot or volume.

-h: Print this help.
-F: Force a rollback of the destination filesystem to the most recent
snapshot before replicating a snapshot. This is equivalent to the -F
option of zfs receive.
-n: Don't actually replicate anything, just print what would be done.
This will only print the next step but nothing dependent on that step
since it won't actually be executed. For instance, -ns will print the
snapshot command but not print the subsequent send/receive command as it
depends on the snapshot actually being taken.
-s: After sending existing snapshots, make a final one and replicate it as well.
This option requires that the source be a filesystem and not a snapshot.
-m: After sending all snapshots, migrate the source to the dest filesystem by
unmounting the source filesystem and changing the new filesystem's
mountpoint to that of the source one. This option includes -s.
-c: A space delimited list of SMF services in quotes to be temporarily disabled
before unmounting the source, then re-enable after changing the mountpoint
of the destination. Requires -m.
-v: Be verbose.

Great, let's try it out. Here's a pool with some data and some snapshots as well as another, empty pool:

# zfs list -r piscina
NAME                            USED  AVAIL  REFER  MOUNTPOINT
piscina                         384M  87.6M    19K  /piscina
piscina/ficheros                384M  87.6M   384M  /piscina/ficheros
piscina/ficheros@instantanea1    17K      -   128M  -
piscina/ficheros@instantanea2    17K      -   256M  -
piscina/ficheros@instantanea3      0      -   384M  -
# zfs list -r lago
lago   105K   472M    18K  /lago

Now, let's copy the ficheros filesystem along with all its snapshots in one go:

# zfs-replicate -v piscina/ficheros lago
Sending piscina/ficheros@instantanea1 to lago.
Sending piscina/ficheros@instantanea2 to lago.
(incremental to piscina/ficheros@instantanea1.)
Sending piscina/ficheros@instantanea3 to lago.
(incremental to piscina/ficheros@instantanea2.)
# zfs list -r lago
NAME                         USED  AVAIL  REFER  MOUNTPOINT
lago                         384M  87.6M    20K  /lago
lago/ficheros                384M  87.6M   384M  /lago/ficheros
lago/ficheros@instantanea1    17K      -   128M  -
lago/ficheros@instantanea2    17K      -   256M  -
lago/ficheros@instantanea3      0      -   384M  - 

It works. And it automatically used incremental snapshots as well to save space, too!

If we now add another snapshot to our original pool piscina and then run zfs-replicate again, it will skip already replicated snapshots and just copy those that are additional:

# zfs snapshot piscina/ficheros@instantanea4 
# zfs-replicate -v piscina/ficheros lago
Sending piscina/ficheros@instantanea4 to lago.
(incremental to piscina/ficheros@instantanea3.)
# zfs list -r lago
NAME                         USED  AVAIL  REFER  MOUNTPOINT
lago                         384M  87.6M    20K  /lago
lago/ficheros                384M  87.6M   384M  /lago/ficheros
lago/ficheros@instantanea1    17K      -   128M  -
lago/ficheros@instantanea2    17K      -   256M  -
lago/ficheros@instantanea3      0      -   384M  -
lago/ficheros@instantanea4      0      -   384M  -

This is useful because you can now run this script on regularly basis to have one pool automatically backed up to another pool. In fact, the -s option will first make sure all existing snapshots are copied over, then create a new snapshot for you and copy that one over as well, all in one command.

Sometimes, the destination filesystem gets touched, or otherwise acted upon and then zfs receive to it will no longer work. In that case, you can use the -F switch which will force a rollback of the destination filesystem to its latest snapshot and then you'll be back in business.

Finally, another scenario is file system migration: You have a filesystem in one pool and want to migrate it with all it's snapshots to another pool, with minimal downtime. This can be done using the -m option: First copy all existing snapshots while the source file system is still live, then unmount it, create a final snapshot, copy that one over to the destination file system as well, set the destination filesystem's mountpoint to be the same as the source filesystem's one and remount the destination file system. (Please note that the script might appear frozen when unmounting while it's waiting for some open files to be closed, etc.).

If you're worried about some daemons depending on your filesystem's availability (like Samba), you can use the -c option to provide their names. zfs-replicate will then bring down the matching SMF services right before unmounting and restart them automatically after re-mounting the migrated filesystem. Again, you might need to wait until the SMF service is really down (Read: The last Samba connection has closed).

I hope this script is useful to you and again, I assume you know what you're doing and do some testing before using it in production. I'm sure there are still some bugs and shortcomings so please send me email to constantin (dot) gonzalez (at) sun (dot) com or leave a comment and I'll try to make the script better for you.

Many thanks to Chris Gerhard, whose backup script was an inspiration for me in hacking together this utility. Also, many thanks to Tim Foster for some code-review and initial feedback (Sorry, I haven't managed to implement some locking yet...). Let me know when you're in Munich and you'll get some well-deserved beer!


Post a Comment:
Comments are closed for this entry.

Tune in and find out useful stuff about Sun Solaris, CPU and System Technology, Web 2.0 - and have a little fun, too!


« April 2014