Sunday Sep 07, 2008

Upgrading disks

Having run out of space in the root file systems, and with the zpool close to full, the final straw was being able to get two 750GB SATA drives for less than £100. That, and knowing that snapshots no longer cause resilvering to restart, which greatly simplifies the data migration. So I'm replacing the existing drives with new ones. Since the enclosure I have can only hold three drives this was a two-stage upgrade, so that at no point was my data on fewer than two drives. The first stage was to install one drive and label it:
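The labelling was done from format's interactive partition menu; roughly (a sketch of the session, with the device name c2d0 taken from the output below):

```shell
# Select the new drive and enter the partition menu (interactive):
format -e c2d0
# format> partition
# ... set up slices 0, 4, 5, 6 and 7 to match the old layout ...
# partition> label
# partition> print
```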

partition> print
Current partition table (unnamed):
Total disk cylinders available: 45597 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm   39383 - 41992       39.99GB    (2610/0/0)    83859300
  1 unassigned    wm       0                0         (0/0/0)              0
  2     backup    wu       0 - 45596      698.58GB    (45597/0/0) 1465031610
  3 unassigned    wm       0                0         (0/0/0)              0
  4 unassigned    wm   36773 - 39382       39.99GB    (2610/0/0)    83859300
  5 unassigned    wm   45594 - 45596       47.07MB    (3/0/0)          96390
  6 unassigned    wm   36379 - 36772        6.04GB    (394/0/0)     12659220
  7 unassigned    wm       3 - 36378      557.31GB    (36376/0/0) 1168760880
  8       boot    wu       0 -     0       15.69MB    (1/0/0)          32130
  9 alternates    wm       1 -     2       31.38MB    (2/0/0)          64260

partition>

These map to the partitions from the original set-up, only bigger. I'm confident that by the time the 40GB root slices are too small I will have migrated to ZFS for root. So this looks like a good long-term solution.

pearson # dumpadm -d /dev/dsk/c2d0s6
      Dump content: kernel pages
       Dump device: /dev/dsk/c2d0s6 (dedicated)
Savecore directory: /var/crash/pearson
  Savecore enabled: yes
pearson # metadb -a -c 3 /dev/dsk/c2d0s5
pearson # egrep c2d0 /etc/lvm/md.tab
d12 1 1 /dev/dsk/c2d0s0
d42 1 1 /dev/dsk/c2d0s4
pearson # metainit d12
d12: Concat/Stripe is setup
pearson # metainit d42
d42: Concat/Stripe is setup
pearson # metattach d0 d12
d0: submirror d12 is attached
pearson # 

Now wait until the disk has completed resyncing. While you can attach the submirrors in parallel, that causes the disk heads to move more, so overall it is slower. Left to just do one partition at a time it is really quite quick:
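The statistics below came from watching the disks during the resync; something along these lines shows both the throughput and the mirror's progress (a sketch, d0 from the output further down):

```shell
# Extended device statistics every 10 seconds while the resync runs
iostat -x 10
# In another window, check how far the submirror resync has got
metastat d0 | grep -i resync
```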

                 extended device statistics                 
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
cmdk0   357.2    0.0 18321.8    0.0  2.6  1.1   10.4  52  58 
cmdk1     0.0  706.4    0.0 36147.4  1.0  0.5    2.2  23  27 
cmdk2   350.2    0.0 17929.6    0.0  0.4  0.3    2.1  12  15 
md1      70.0   71.0 35859.2 36371.5  0.0  1.0    7.1   0 100 
md3       0.0   71.0    0.0 36371.5  0.0  0.3    3.8   0  27 
md15     35.0    0.0 17929.6    0.0  0.0  0.6   16.5   0  58 
md18     35.0    0.0 17929.6    0.0  0.0  0.1    4.3   0  15 
pearson # metastat d0 
d0: Mirror
    Submirror 0: d10
      State: Okay         
    Submirror 1: d11
      State: Okay         
    Submirror 2: d12
      State: Resyncing    
    Resync in progress: 70 % done
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 20482875 blocks (9.8 GB)

d10: Submirror of d0
    State: Okay         
    Size: 20482875 blocks (9.8 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        c1d0s0          0     No            Okay   Yes 


d11: Submirror of d0
    State: Okay         
    Size: 20482875 blocks (9.8 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        c5d0s0          0     No            Okay   Yes 


d12: Submirror of d0
    State: Resyncing    
    Size: 83859300 blocks (39 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        c2d0s0          0     No            Okay   Yes 


Device Relocation Information:
Device   Reloc  Device ID
c1d0   Yes      id1,cmdk@AST3320620AS=____________3QF09GL1
c5d0   Yes      id1,cmdk@AST3320620AS=____________3QF0A1QD
c2d0   Yes      id1,cmdk@AST3750840AS=____________5QD36N5M
pearson # 

Once complete do the other root disk:

pearson # metattach d4 d42 
d4: submirror d42 is attached
pearson # 

Finally attach slice 7 to the zpool:

pearson # zpool attach -f tank c1d0s7 c2d0s7
pearson # zpool status
  pool: tank
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 0.00% done, 252h52m to go
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1d0s7  ONLINE       0     0     0
            c5d0s7  ONLINE       0     0     0
            c2d0s7  ONLINE       0     0     0

errors: No known data errors
pearson # 

The initial estimate was more pessimistic than reality, but it still took over 11 hours to complete. The next step was to shut the system down and replace one of the old drives with a new one. Once this was done the final slices in use on the old drive could be detached and, in the case of the metadevices, cleared.
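The detach and clean-up commands aren't shown above; they were along these lines (a sketch, using the device names from the earlier output — d10 was the submirror on c1d0s0, and the equivalent old submirrors of d4 and d6 would be handled the same way):

```shell
# Drop the old drive's slice from the zpool mirror
zpool detach tank c1d0s7
# Detach the submirror that lived on the old drive, then clear it;
# repeat for the old submirrors of d4 and d6
metadetach d0 d10
metaclear d10
```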

: pearson FSS 4 $; zpool status
  pool: tank
 state: ONLINE
 scrub: scrub completed after 11h8m with 0 errors on Sat Sep  6 20:58:05 2008
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5d0s7  ONLINE       0     0     0
            c2d0s7  ONLINE       0     0     0

errors: No known data errors
: pearson FSS 5 $; 
: pearson FSS 5 $; metastat
d6: Mirror
    Submirror 0: d62
      State: Okay         
    Submirror 1: d63
      State: Okay         
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 12659220 blocks (6.0 GB)

d62: Submirror of d6
    State: Okay         
    Size: 12659220 blocks (6.0 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        c5d0s6          0     No            Okay   Yes 


d63: Submirror of d6
    State: Okay         
    Size: 12659220 blocks (6.0 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        c2d0s6          0     No            Okay   Yes 


d4: Mirror
    Submirror 0: d42
      State: Okay         
    Submirror 1: d43
      State: Okay         
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 83859300 blocks (39 GB)

d42: Submirror of d4
    State: Okay         
    Size: 83859300 blocks (39 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        c5d0s4          0     No            Okay   Yes 


d43: Submirror of d4
    State: Okay         
    Size: 83859300 blocks (39 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        c2d0s4          0     No            Okay   Yes 


d0: Mirror
    Submirror 0: d12
      State: Okay         
    Submirror 1: d13
      State: Okay         
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 83859300 blocks (39 GB)

d12: Submirror of d0
    State: Okay         
    Size: 83859300 blocks (39 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        c5d0s0          0     No            Okay   Yes 


d13: Submirror of d0
    State: Okay         
    Size: 83859300 blocks (39 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        c2d0s0          0     No            Okay   Yes 


Device Relocation Information:
Device   Reloc  Device ID
c5d0   Yes      id1,cmdk@AST3750840AS=____________5QD36N5M
c2d0   Yes      id1,cmdk@AST3750840AS=____________5QD3EQEX
: pearson FSS 6 $; 

The old drive is still in the system but currently only has a metadb on it:

: pearson FSS 6 $; metadb -i
        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c1d0s5
     a    p  luo        8208            8192            /dev/dsk/c1d0s5
     a    p  luo        16400           8192            /dev/dsk/c1d0s5
     a    p  luo        16              8192            /dev/dsk/c5d0s5
     a    p  luo        8208            8192            /dev/dsk/c5d0s5
     a    p  luo        16400           8192            /dev/dsk/c5d0s5
     a       luo        16              8192            /dev/dsk/c2d0s5
     a       luo        8208            8192            /dev/dsk/c2d0s5
     a       luo        16400           8192            /dev/dsk/c2d0s5
 r - replica does not have device relocation information
 o - replica active prior to last mddb configuration change
 u - replica is up to date
 l - locator for this replica was read successfully
 c - replica's location was in /etc/lvm/mddb.cf
 p - replica's location was patched in kernel
 m - replica is master, this is replica selected as input
 t - tagged data is associated with the replica
 W - replica has device write errors
 a - replica is active, commits are occurring to this replica
 M - replica had problem with master blocks
 D - replica had problem with data blocks
 F - replica had format problems
 S - replica is too small to hold current data base
 R - replica had device read errors
 B - tagged data associated with the replica is not valid
: pearson FSS 7 $; 
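Should the old drive eventually be pulled, its three replicas would need deleting first; something like:

```shell
# Remove all state database replicas on the old drive's slice 5
metadb -d /dev/dsk/c1d0s5
# Confirm only the c5d0s5 and c2d0s5 replicas remain
metadb -i
```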

I'm tempted to leave the third disk in the system so that the Disksuite configuration will always have a quorum if a single drive fails. However, since the BIOS only seems to be able to boot from the first disk drive, this may be pointless.

I'm now keenly interested in bug 6592835 "resilver needs to go faster", since if a disk did fail once the disks fill I don't fancy waiting more than 24 hours, after I have sourced a new drive, for the data to resync. The Disksuite devices managed to drive the disk at over 40MB/sec while ZFS achieved 5MB/sec.
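For scale, a rough back-of-envelope (assuming the 557GB data slice were full and the observed rates held throughout):

```shell
# Time to resync a full 557GB slice at the two observed rates
size_mb=$((557 * 1024))           # slice size in MB
svm_h=$((size_mb / 40 / 3600))    # Disksuite at ~40MB/sec
zfs_h=$((size_mb / 5 / 3600))     # ZFS resilver at ~5MB/sec
echo "SVM: ~${svm_h}h  ZFS: ~${zfs_h}h"
```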

About

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com
