Sunday Apr 05, 2009

Recovering our Windows PC

I had reason to discover whether my solution for backing up the Windows PC worked. Apparently the PC had not been working properly for a while, but no one had mentioned that to me. The symptoms were:

  1. No menu bar at the bottom of the screen. It was almost like the screen was the wrong size but how it was changed is/was a mystery.

  2. It was claiming it needed to revalidate itself as the hardware had changed, which it categorically had not, and I had 2 days to sort it out. Apparently this message had been around for a few days (weeks?) but was ignored.

Now I'm sure I could have had endless fun reading forums to find out how to fix these things, but it was Saturday night and I was going cycling in the morning. So time to boot Solaris and restore the backup. First I took a backup of what was on the disk, just in case I get a desire to relive the issue. I just needed one script to restore it over ssh. The script is:

: pearson FSS 14 $; cat /usr/local/sbin/xp_restore 

exec dd of=/dev/rdsk/c0d0p1 bs=1k
: pearson FSS 15 $; 

and the command was:

$ ssh pc pfexec /usr/local/sbin/xp_restore < backup.dd

having chosen the desired snapshot. Obviously the command was added to /etc/security/exec_attr. Then just leave that running overnight. In the morning the system booted up just fine, complained about the virus definitions being out of date and various things needing updates, but all working. Alas, doing this before I went cycling made me late enough to miss the peloton, if it was there.
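The safety copy and the restore are the same dd pipe run in opposite directions. The following is a minimal sketch of that idea, not the scripts actually used: save_image, restore_image and verify_image are hypothetical helper names, and the target is passed as an argument so the whole round trip can be rehearsed on an ordinary file before it is pointed at a raw device such as /dev/rdsk/c0d0p1.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical, not the author's scripts.
# The target is a parameter so the pipe can be rehearsed on a plain file
# before it is used on a raw device.

# Write the contents of device (or file) $1 to stdout in 1k blocks.
save_image() {
    dd if="$1" bs=1k
}

# Overwrite device (or file) $1 with whatever arrives on stdin.
restore_image() {
    dd of="$1" bs=1k
}

# Succeed only if image file $1 and device (or file) $2 are byte-identical.
verify_image() {
    cmp -s "$1" "$2"
}
```

Run locally, "save_image disk > before.dd" followed by "restore_image disk < backup.dd" mirrors the two steps described above. Over ssh, each helper would have to be installed on the PC and given its own entry in /etc/security/exec_attr, in the same way as xp_restore.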

Wednesday Jun 13, 2007

Home server back to build 65

My home server is taking a bit of a battering of late. I keep tripping over bug 6566921 which I can work around by not running my zfs_backup script locally. I have an updated version which will send the snapshots over an ssh pipe to a remote system which in my case is my laptop. Obviously this just moves the panic from my server to the laptop but that is a very much better state of affairs. I'm currently building a fixed zfs module which I will test later.

However the final straw that has had me revert to build 65 is that smbd keeps core dumping. Having no reliable access to their data caused the family more distress than you would expect. This turns out to be bug 6563383 which should be fixed in build 67.

Friday Jun 01, 2007

Rolling incremental backups

Why do you take backups?

  • User error

  • Hardware failure

  • Disaster Recovery

  • Admin Error

With ZFS using redundant storage and plenty of snapshots, my data should be safe from the first two. However, that still leaves two ways all my data could be lost if I don't take some sort of backup.

Given the constraints I have my solution is to use my external USB disk containing a standalone zpool and then use zfs send and receive via this script to send all the file systems I really care about to the external drive.

To make this easier I have put all the filesystems into another “container” filesystem which has the “nomount” option set so it is hidden from the users. I can then recursively send that file system to the external drive. Also to stop the users getting confused by extra file systems appearing and disappearing I have set the mount point on the external disk to “none”.

The script only uses the snapshots that are prefixed “day” (you can change that with the -p (prefix) option), which reduces the amount of work the script does. Backing up the snapshots that happen every 10 minutes on this system does not seem worthwhile for a job I will run once a day or so.

The really cool part of this is that once I had the initial snapshot on the external drive, every backup from now on will be incremental. A rolling incremental backup. How cool is that?
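The incremental step can be sketched as follows. This is an illustration under stated assumptions, not the zfs_backup script itself: the function names day_snapshots, latest_common and backup_fs are hypothetical, it handles a single file system rather than recursing through the container, and it assumes snapshot names contain no whitespace. It finds the newest “day” snapshot that exists on both sides and sends only the difference from there to the newest snapshot on the source.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not the author's zfs_backup script.

# Print the snapshot names (the part after '@') of file system $1 that
# start with prefix $2, oldest first.
day_snapshots() {
    zfs list -H -t snapshot -o name -s creation |
        awk -F@ -v fs="$1" -v p="$2" '$1 == fs && index($2, p) == 1 { print $2 }'
}

# Given two whitespace-separated lists of snapshot names ($1 source,
# $2 destination), print the last name in $1 that also appears in $2.
latest_common() {
    last=
    for snap in $1; do
        for have in $2; do
            [ "$snap" = "$have" ] && last=$snap
        done
    done
    printf '%s\n' "$last"
}

# Send file system $1 to $2, as a full stream the first time and as an
# incremental stream afterwards. $3 is the snapshot prefix (default "day").
backup_fs() {
    src=$1 dst=$2 prefix=${3:-day}
    src_snaps=$(day_snapshots "$src" "$prefix")
    dst_snaps=$(day_snapshots "$dst" "$prefix")
    newest=$(printf '%s\n' "$src_snaps" | tail -1)
    [ -n "$newest" ] || { echo "no $prefix snapshots on $src" >&2; return 1; }
    base=$(latest_common "$src_snaps" "$dst_snaps")
    if [ -z "$base" ]; then
        # Nothing in common yet: send the whole snapshot.
        zfs send "$src@$newest" | zfs receive "$dst"
    elif [ "$base" != "$newest" ]; then
        # Send only the changes between the common snapshot and the newest.
        zfs send -i "$src@$base" "$src@$newest" | zfs receive "$dst"
    fi
}
```

Because the destination always ends up holding the newest snapshot it was sent, the next run starts from there, which is what makes the backup roll. Putting ssh and a host name in front of zfs receive turns the same pipe into a remote backup.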

# time ./zfs_backup tank/fs/safe removable/safe

real    12m10.49s
user    0m11.87s
sys     0m12.32s
# zfs list tank/fs/safe removable/safe
NAME             USED  AVAIL  REFER  MOUNTPOINT
removable/safe  78.6G  66.0G    18K  none
tank/fs/safe    81.8G  49.7G    18K  /tank/fs

The performance is slightly disappointing due to the number of transport errors that are reported by the scsa2usb layer, but the data really is on the disk so I am happy.

Currently I don't have the script clearing out the old snapshots but will get that going later. The idea of doing this over ssh to a remote site is compelling when I can find a suitable remote site.
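Clearing out old snapshots could look something like the sketch below. It is only an illustration of one possible policy (keep the newest N snapshots with a given prefix), not something the script does today; all_but_last and prune_old are hypothetical names, and the destroy is printed rather than executed so the list can be checked first.

```shell
#!/bin/sh
# Sketch only: a hypothetical pruning policy, not part of zfs_backup.

# Print every line of stdin except the last $1 lines.
all_but_last() {
    awk -v keep="$1" '
        { line[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print line[i] }'
}

# For file system $1, list the snapshots with prefix $2 oldest first and
# print a "zfs destroy" command for all but the newest $3 of them.
prune_old() {
    fs=$1 prefix=$2 keep=$3
    zfs list -H -t snapshot -o name -s creation |
        awk -F@ -v fs="$fs" -v p="$prefix" '$1 == fs && index($2, p) == 1' |
        all_but_last "$keep" |
        while IFS= read -r snap; do
            # Dry run: remove the echo to really destroy the snapshot.
            echo zfs destroy "$snap"
        done
}
```

Keeping at least the newest snapshot on the external drive matters, because the next incremental send needs a snapshot that exists on both sides.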


This is the old blog of Chris Gerhard. It has mostly moved to

