Recovering From a ZFS Disaster
By Alan_S on Dec 29, 2011
A few weeks ago I hit a rare bug in an internal Solaris 11 build that caused my laptop to panic when it rebooted. The panic output listed references to ZFS. I figured that I just needed to scrub my ZFS pool, which would find the error and repair it, and I would be up and running in no time.
I began by burning the Text Installer (available for download from OTN) onto a CD and booting from it. I selected the Shell option so I could run the OS from RAM. I then issued the command:
# zpool import -f rpool
Right away it panicked. Again. That was not good. I did have a backup that was a few days old, but I really wanted to try to salvage my most current content. After some research, I found that I could use mdb (the modular debugger) to set two kernel variables, aok and zfs_recover, which tell the kernel to try to carry on at the point where it would normally panic on a ZFS error.
I booted from the Live image and used this sequence to import the pool read-only, which made the /export/home/user filesystem available:
# mdb -kw
> aok/W 0x1
> zfs_recover/W 0x1
> $q
# zpool import -f -o readonly=on rpool
Since the Live image is networked, I was able to dump the contents of my home directory to another system on the network. I just needed to set up an empty ZFS filesystem remotely.
# zfs send rpool/export/home/user | ssh othersystem sudo zfs receive rpool/export/home/user-recovered
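One detail worth spelling out: zfs send takes a snapshot as its source, and a pool imported read-only cannot take new ones, so the snapshot has to exist already. The following is a minimal sketch of how the newest existing snapshot could be picked out of the zfs list output. The helper name latest_snapshot is made up for illustration, and the commented usage assumes the dataset and host names from this post.

```shell
#!/bin/sh
# Sketch only: choose the newest existing snapshot of a dataset so that it
# can be handed to zfs send. latest_snapshot is a hypothetical helper.

# latest_snapshot DATASET
# Reads snapshot names on stdin (one per line, oldest first) and prints the
# last one that belongs directly to DATASET, skipping child datasets.
latest_snapshot() {
    dataset=$1
    awk -v prefix="${dataset}@" '
        index($0, prefix) == 1 { last = $0 }
        END { if (last != "") print last }
    '
}

# Intended use on the rescue system (not executed here):
#   ds=rpool/export/home/user
#   snap=$(zfs list -H -t snapshot -o name -s creation -r "$ds" | latest_snapshot "$ds")
#   [ -n "$snap" ] && zfs send "$snap" | ssh othersystem sudo zfs receive "${ds}-recovered"
```

If the helper prints nothing, there is no snapshot of that dataset to send, and the files would have to be copied some other way (for example with tar or rsync over ssh).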
Once I had the contents of my user filesystem copied, I reinstalled my system. This recreated the rpool from scratch. I then ran the same zfs send in the opposite direction to populate a new home filesystem on my laptop:
# ssh othersystem sudo zfs send rpool/export/home/user-recovered | zfs receive rpool/export/home/user-recovered
Then as root I destroyed the current rpool/export/home/user and renamed rpool/export/home/user-recovered to take its place:
# zfs destroy rpool/export/home/user
# zfs rename rpool/export/home/user-recovered rpool/export/home/user
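Because zfs destroy cannot be undone, it may be worth confirming that the recovered dataset really exists before removing the freshly installed one. Here is a minimal sketch of that guard. The function name swap_in_recovered is hypothetical, and it only wraps the same two commands shown above.

```shell
#!/bin/sh
# Sketch only: destroy the original dataset and rename the recovered one
# into its place, but only if the recovered dataset is present.
# swap_in_recovered is a hypothetical helper name.

# swap_in_recovered ORIGINAL RECOVERED
swap_in_recovered() {
    original=$1
    recovered=$2

    # zfs list exits non-zero when the named dataset does not exist.
    if ! zfs list -H -o name "$recovered" >/dev/null 2>&1; then
        echo "missing dataset: $recovered" >&2
        return 1
    fi

    # Rename only if the destroy succeeded.
    zfs destroy "$original" && zfs rename "$recovered" "$original"
}

# Intended use as root (not executed here):
#   swap_in_recovered rpool/export/home/user rpool/export/home/user-recovered
```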
All my files were completely restored, without the need to go to my days-old backup copy or attempt an incremental update.
If I had felt I needed the root filesystem saved, I could have done the same zfs send on the root filesystem as well. Then I could have restored it as a new BE, at least for comparison. This would be handy for restoring configuration information like security certificates and printer settings. It could also be used to generate a list of installed packages, in case the reinstallation didn't catch everything.
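As a rough sketch of the package comparison, suppose the output of pkg list -H has been saved once from the restored BE and once from the fresh install. The function below prints the package names that appear only in the old list. The function name and the file names in the usage comment are hypothetical, and I am assuming the restored BE can be mounted (for example at /mnt) so that pkg -R can read it.

```shell
#!/bin/sh
# Sketch only: list packages present in an old package listing but absent
# from a new one. missing_packages is a hypothetical helper name.

# missing_packages OLD_LIST NEW_LIST
# Each file holds "pkg list -H" style output: package name in column one.
missing_packages() {
    old_list=$1
    new_list=$2
    old_names=$(mktemp) || return 1
    new_names=$(mktemp) || { rm -f "$old_names"; return 1; }

    # Keep just the package name from each non-empty line, sorted for comm.
    awk 'NF { print $1 }' "$old_list" | sort -u > "$old_names"
    awk 'NF { print $1 }' "$new_list" | sort -u > "$new_names"

    # comm -23 prints lines found only in the first file.
    comm -23 "$old_names" "$new_names"

    rm -f "$old_names" "$new_names"
}

# Intended use (not executed here; paths are assumptions):
#   pkg -R /mnt list -H > /tmp/old-pkgs.txt   # restored BE mounted at /mnt
#   pkg list -H         > /tmp/new-pkgs.txt   # freshly installed system
#   missing_packages /tmp/old-pkgs.txt /tmp/new-pkgs.txt
```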