Thursday Dec 29, 2011

Recovering From a ZFS Disaster

A few weeks ago I hit a rare bug in an internal Solaris 11 build that caused my laptop to panic when it rebooted. References to ZFS were listed in the panic output. I figured that I just needed to scrub my ZFS filesystem, which would find the error and remove it, and I would be up and running in no time.

I began by burning the Text Installer (available for download from OTN) onto a CD and booting from it. I selected the Shell option so I could run the OS from RAM, then issued the command:

# zpool import -f rpool

Right away it panicked. Again. That was not good. I did have a backup that was a few days old, but I really wanted to try to salvage my most current content. After some research, I found that I could use mdb (the modular debugger) to set kernel variables that tell the system not to panic at the point where it normally would on a ZFS error.

I was able to boot from the Live image and used this sequence to import the pool read-only and reach the /export/home/user filesystem:

# mdb -kw
> aok/W 0x1
> zfs_recover/W 0x1
> $q

# zpool import -f -o readonly=on rpool
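The two writes above can also be fed to mdb non-interactively, one command per invocation. The following is a minimal sketch, assuming both variables are 32-bit integers written with mdb's /W format; the helper name mdb_write_cmd is hypothetical and it only prints the command text, so nothing touches the kernel until the output is piped into mdb -kw.

```shell
#!/bin/sh
# Hypothetical helper: print the mdb command that writes a 32-bit value
# to a kernel variable. It does not run mdb itself.
mdb_write_cmd() {
    var=$1    # kernel variable name, e.g. aok
    val=$2    # integer value to write
    printf '%s/W 0x%x\n' "$var" "$val"
}

# Show the two commands used in this post.
mdb_write_cmd aok 1
mdb_write_cmd zfs_recover 1

# To apply one for real on the rescue system (as root):
#   mdb_write_cmd aok 1 | mdb -kw
```

Printing first and piping second makes it easy to check exactly what will be written before changing a live kernel.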

Since the Live image is networked, I was able to dump the contents of my home directory to another system on the network. I just needed to set up an empty ZFS filesystem remotely.

# zfs send rpool/export/home/user | 
ssh othersystem sudo zfs receive rpool/export/home/user-recovered
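In the ZFS versions I am familiar with, zfs send takes a snapshot (dataset@name) as its argument, and a pool imported read-only cannot take new snapshots, so in practice the stream would come from a snapshot that already existed. Here is a rough sketch of the same pipeline with the snapshot spelled out. The snapshot name rescue and the helper build_send_cmd are hypothetical, and the helper only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) a send/receive pipeline
# for a single snapshot. All values passed in below are placeholders.
build_send_cmd() {
    dataset=$1    # source dataset, e.g. rpool/export/home/user
    snap=$2       # existing snapshot name (assumed, not from the post)
    host=$3       # remote system reachable over ssh
    target=$4     # dataset to be created on the remote side
    printf 'zfs send %s@%s | ssh %s sudo zfs receive %s\n' \
        "$dataset" "$snap" "$host" "$target"
}

build_send_cmd rpool/export/home/user rescue othersystem \
    rpool/export/home/user-recovered
```

Running zfs list -t snapshot on the imported pool shows which snapshots are available to send.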

Once I had the contents of my user partition copied, I reinstalled my system. This recreated the rpool from scratch. I then did a reversal of the same zfs send command to populate a new home on my laptop:

# ssh othersystem sudo zfs send rpool/export/home/user-recovered |
zfs receive rpool/export/home/user-recovered

Then as root I destroyed the current rpool/export/home/user and renamed rpool/export/home/user-recovered:

# zfs destroy rpool/export/home/user
# zfs rename rpool/export/home/user-recovered rpool/export/home/user

All my files were completely restored, without having to fall back on my days-old backup copy or attempt an incremental update.

If I had needed to save the root filesystem, I could have done the same zfs send on the root dataset as well, then restored it as a new BE (boot environment), at least for comparison. This would be handy for restoring configuration information like security certificates and printer settings. It could also be used to generate a list of installed packages, in case the reinstallation didn't catch everything.
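The package comparison mentioned above could look something like the following. This is only a sketch: it assumes you already have two plain-text files with one package name per line, one from the recovered BE and one from the fresh install (for example from pkg list -H with the first column extracted). The function name missing_packages is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print packages present in the old list but absent
# from the new one. Each input file holds one package name per line.
missing_packages() {
    old_list=$1
    new_list=$2
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    # comm needs sorted input; -u also drops duplicate names.
    sort -u "$old_list" > "$old_sorted"
    sort -u "$new_list" > "$new_sorted"
    # -23 keeps only lines unique to the first file.
    comm -23 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Example (file names are placeholders):
#   missing_packages old-packages.txt new-packages.txt
```

Anything the function prints is a candidate to reinstall on the rebuilt system.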
