Tuesday Jun 14, 2005

Disk relocation in SVM

Blogging is probably the best place for me to share my experience with SVM and what I know about SVM internals. My name is Steve Peng. I have been with Sun since 1994, having come from Big Blue when they downsized the AIX development division. During my 11 years at Sun, I have spent most of my time working on the Solaris Volume Manager (called Solstice DiskSuite before its integration into Solaris in Solaris 9) and have been a key developer on most of its cool projects, such as 64-bit Solaris support, disk relocation support, 64-bit SVM, and import of named disksets. In this debut post, I'd like to talk a bit about disk relocation support.

So what happens if a user runs an old SDS (Solstice DiskSuite) release and moves disks around, for example by recabling them? The best he or she can do is pray that the devt and ctds names remain the same; if they do not, the configuration is dead. Why? In the old SDS releases, the driver name and minor number, along with the device name, are what is stored in the private configuration database and used to bring up the configuration at system reboot. When a disk is moved and ends up with a different devt and name, SDS simply cannot locate it and fails to bring up the existing configuration. Worse, this inability to relocate disks can be catastrophic if the wrong disk is located and configured.

When SDS was integrated into Solaris 9, disk relocation support was added by storing each disk's unique device id, such as a WWN, in the private database. Now when a configuration is brought up at boot, the stored device ids are used to locate the disks instead of the stored devt/name tuples. This cool feature boosts the flexibility of SVM and makes the upgrade story great as well. If you know the upgrade story on the old SDS releases, then you probably know what I mean when I say 'great'.

So one may wonder how the devt and ctds name of a disk device can change when the disk is moved around. When a disk is moved from one controller to another, the device instance number can change, and since the disk is now attached to the new controller, the device name changes too. One thing that does not change is the disk's unique device id, such as a WWN. So how exactly does SVM address this disk relocation issue? Let's use a simple stripe as an example to see how SVM attacks the problem internally. Say a user creates a simple stripe d1 on top of /dev/dsk/c1t2d0s0. When d1 is created, database records are created to store its configuration. A dump of the database shows the following configuration information:

RecId 0x00000003: Type:NM [0002] Type2: 1 Size = 512
sizeof(struct nm_rec)=52
sizeof(struct nm_rec_hdr)=24
sizeof(struct nm_name)=28
r_rec_hdr->r_alloc_size=512
r_rec_hdr->r_used_size=60
r_rec_hdr->r_next_recid=0x00000000
r_rec_hdr->r_next_key=0
n_side=0
n_key=0x00000001 <--- matched key
n_count=1
n_minor=0x00000010
n_drv_key=0x000003ef
n_dir_key=0x000003f0
n_namlen=9
n_name="c1t2d0s0"


RecId 0x00000007: Type:MD_STRIPE [1005] Type2: 0 Size = 208
==== ms_unit
==== mdc_unit
un_revision: 0 (MD_32BIT_META_DEV)
un_type: 1 (MD_DEVICE)
un_status: 0x00000000
un_self_id: 1(0x00000001)
un_record_id: 0x00000007
un_size: 208
un_flag: 0000
un_total_blocks: 35358848
un_actual_tb: 35358848
un_nhead: 19
un_nsect: 248
un_rpm: 7200
un_wr_reinstruct: 119
un_rd_reinstruct: 0
un_vtoc_id: 0x00000000
un_capabilities: 0x0000000b
un_parent: 0xffffffff (MD_NO_PARENT)
un_user_flags: 0x00000000
--
un_hsp_id: -1
un_nrows: 1
un_ocomp: 136
row: 0
un_icomp: 0
un_ncomp: 1
un_blocks: 35358848
un_cum_blocks: 35358848
un_interlace: 32
comp: 0
un_key: 0x1 <--- used to search for the match
un_dev: 0x2000000010
un_start_block: 0
un_mirror:
ms_flags: 0x0
ms_state: 1 (CS_OKAY)
ms_lasterrcnt: 0
ms_orig_dev: 0x0
ms_orig_blk: 0
ms_hs_key: 0x00000000
ms_hs_id: 0
ms_timestamp: Wed Jun 8 15:03:53 2005

As you can see, RecId 7 describes the stripe d1 that was just created. Stripe d1 has only one row and one component, and un_key is used to locate the underlying device. In this case the value 1 is used to find the component by scanning the "NM" record for an entry whose n_key matches 1, which here is c1t2d0s0. Once the entry is located, its stored minor number is used to construct the devt for the component, and that devt is then used to access the device.
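To make the lookup concrete, here is a minimal sketch in Python (not SVM code) of the two steps just described: scanning the name records for the entry whose n_key matches un_key, then building a devt from the stored minor number. The record values are copied from the dump above. The function names are made up for illustration, and two things are assumptions on my part: that the devt is a 64-bit value with the major number in the upper 32 bits and the minor in the lower 32, and that the driver's major number is 32 (which is what the upper bits of un_dev: 0x2000000010 suggest).

```python
# Name (NM) entries as they appear in the dump of RecId 3.
NM_RECORDS = [
    {"n_key": 0x1, "n_minor": 0x10, "n_name": "c1t2d0s0"},
]

# Assumption: the driver resolves to major number 32, read off the
# upper bits of un_dev (0x2000000010) in the stripe record.
ASSUMED_MAJOR = 32


def find_nm_entry(records, key):
    """Scan the name records for the entry whose n_key matches key."""
    for rec in records:
        if rec["n_key"] == key:
            return rec
    return None


def make_devt(major, minor):
    """Build a devt, assuming a 32-bit major / 32-bit minor split."""
    return (major << 32) | (minor & 0xFFFFFFFF)


entry = find_nm_entry(NM_RECORDS, 0x1)          # un_key from the stripe record
devt = make_devt(ASSUMED_MAJOR, entry["n_minor"])
print(entry["n_name"], hex(devt))
```

Under those assumptions the computed value agrees with the un_dev field in the stripe record, which is exactly why a changed minor number breaks the lookup.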

The above information is sufficient to bring up stripe d1 as long as the minor number has not changed. As mentioned above, the minor number can change whenever the underlying disk is moved around. So you can see that some kind of persistent information needs to be stored to solve the disk relocation problem. The approach taken by SVM is to use the disk's unique device id (for example, a WWN). Whenever an SVM device is created, the device ids of all the underlying disk components are stored in its database as well. When a metadevice is snarf'd, the stored device id is used instead of the traditional devt/name to locate all the component devices, and this guarantees that the snarf operation finds the right disks.
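The idea can be sketched in a few lines of Python. This is only an illustration of matching by device id, not how SVM implements it: the function name, the list of "currently attached disks", and the post-move name c2t2d0s0 with minor 0x18 are all hypothetical values invented for the example. Only the device id string and the original c1t2d0s0 name come from this post.

```python
# Device id stored in the SVM database when d1 was created.
STORED_DEVID = "id1,sd@SSEAGATE_ST318404LSUN18G_3BT2LPC400002201AA6K"

# Hypothetical view of the system after recabling: the same physical
# disk now shows up under a different name and minor number.
DISKS_AFTER_MOVE = [
    {"name": "c0t0d0s0", "minor": 0x00, "devid": "id1,sd@SEXAMPLE_BOOT_DISK"},
    {"name": "c2t2d0s0", "minor": 0x18, "devid": STORED_DEVID},
]


def locate_by_devid(stored_devid, attached_disks):
    """Return the attached disk whose device id matches, or None."""
    for disk in attached_disks:
        if disk["devid"] == stored_devid:
            return disk
    return None


disk = locate_by_devid(STORED_DEVID, DISKS_AFTER_MOVE)
print(disk["name"], hex(disk["minor"]))
```

A lookup by the old name c1t2d0s0 would find nothing in this list, while the lookup by device id still lands on the right disk and yields its current name and minor number.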

The stored unique device id looks like this:

RecId 0x00000006: Type:DID_SHR_NM [0008] Type2: 1 Size = 1024
sizeof(struct devid_shr_rec)=40
sizeof(struct nm_rec_hdr)=24
sizeof(struct did_shr_name)=16
did_rec_hdr->r_alloc_size=1024
did_rec_hdr->r_used_size=96
did_rec_hdr->r_next_recid=0x00000000
did_rec_hdr->r_next_key=0
did_key=0x00000001(1)
did_count=1
did_data=0x00000001
did_size=56
did_devid=696400010002002c73640000534541474154452053543331383430344c53554e31384720334254324c50433430303030323230314141364b
ascii did_devid=id1,sd@SSEAGATE_ST318404LSUN18G_3BT2LPC400002201AA6K
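For the curious, the hex did_devid and its ascii form carry the same information. The following Python sketch decodes the hex string from the dump. The field layout (a 2-byte "id" magic, then 2-byte revision, type, and length fields, a 4-byte driver hint, and the id bytes) is my reading of this particular dump, not something taken from a header file, and the mapping of type 2 to the letter "S" is likewise inferred only from the ascii line above.

```python
import struct

# did_devid exactly as printed in the dump (56 bytes, 112 hex digits).
DID_DEVID_HEX = (
    "696400010002002c73640000"
    "534541474154452053543331383430344c53554e31384720"
    "334254324c50433430303030323230314141364b"
)

# Assumption: type 2 is rendered as "S" in the ascii form (inferred
# from the dump; other types are not covered by this sketch).
TYPE_LETTER = {2: "S"}


def decode_devid(hex_string):
    """Split the binary device id into its (assumed) fields."""
    raw = bytes.fromhex(hex_string)
    magic, rev, dtype, length = struct.unpack(">2sHHH", raw[:8])
    hint = raw[8:12].rstrip(b"\x00").decode("ascii")
    ident = raw[12:12 + length].decode("ascii")
    return {"magic": magic.decode("ascii"), "rev": rev, "type": dtype,
            "len": length, "hint": hint, "id": ident, "size": len(raw)}


def to_ascii(d):
    """Rebuild the ascii form, replacing spaces in the id with '_'."""
    return "{}{},{}@{}{}".format(d["magic"], d["rev"], d["hint"],
                                 TYPE_LETTER[d["type"]],
                                 d["id"].replace(" ", "_"))


fields = decode_devid(DID_DEVID_HEX)
print(fields["size"], fields["hint"], repr(fields["id"]))
print(to_ascii(fields))
```

The total length comes out to the did_size of 56 shown in the record, and the rebuilt string matches the ascii did_devid line.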


