Converting sd minor numbers to instance numbers.

I had an email this week about a program that I wrote that would not die. The program is a disk test program that has been around in Sun for a while and with luck will be open sourced in the not too distant future, but I digress. The program was hanging: no IO was going on, and even sending it a kill -KILL would not kill it.

Generally, if a process doesn't disappear when sent the signal “KILL”, that is not the fault of the program. Since there is nothing the program can do to protect itself from KILL, you need to look elsewhere, and that elsewhere is somewhere in the kernel. So a crash dump was generated and I was pointed at it.
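To illustrate why the program itself cannot be at fault, here is a minimal Python sketch, assuming a Unix-like system. It tries to install a handler for a signal and reports whether the attempt was accepted. The helper name can_install_handler is made up for this example and is not part of the disk test program.

```python
import signal


def can_install_handler(signum):
    """Return True if this process is allowed to install a handler for signum."""
    try:
        previous = signal.signal(signum, lambda sig, frame: None)
    except (OSError, ValueError):
        # OSError is how a refusal from the kernel surfaces in Python.
        # ValueError covers an invalid signal number or a call made
        # from outside the main thread.
        return False
    # Put the earlier disposition back so the check has no lasting effect.
    if previous is not None:
        signal.signal(signum, previous)
    return True


if __name__ == "__main__":
    print("SIGTERM handler accepted:", can_install_handler(signal.SIGTERM))
    print("SIGKILL handler accepted:", can_install_handler(signal.SIGKILL))
```

The request for SIGKILL is refused, so a process that survives KILL is being held up by something on the kernel side rather than by its own signal handling.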

From the stack it was clear that the program could not die as there were async IO requests still outstanding, and looking at the aio_t confirmed this.

Walking the structures down to the first element on the aio_pollq to find a stuck IO, and then to the buf's dev_t to see where we are hung up, I do this:

> ::pgrep disko | ::print proc_t p_aio | ::print aio_t aio_pollq | ::print aio_req_t aio_req_buf.b_edev | ::devt
     MAJOR       MINOR
        27        9474
> 0t27::major2name
sd

Seeing that minor number rang alarm bells. The usual way to convert a minor number for the sd driver into an instance is to divide by 8 (the number of partitions), but 9474 divided by 8 is 1184, which would still leave over 1000 devices. Possible, but not likely. Only at that point did it dawn on me that this was an x86 box, which thanks to a long history supports a different number of slices. A short grok in the source showed that the conversion for x86 is to divide by 64.

> 0t9474%0t64=D
                148             
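
As a rough sketch of the arithmetic, the split of an sd minor number into instance and partition can be written in a few lines of Python. The divisors 8 and 64 come from the text above; the function name, the constant names, and the shift-and-mask formulation are assumptions made for illustration, not code taken from the driver.

```python
# Hypothetical names. The values follow from the divisors in the post:
# 8 partitions per disk is a shift of 3, and 64 on x86 is a shift of 6.
UNIT_SHIFT_8_PARTITIONS = 3
UNIT_SHIFT_X86 = 6


def split_sd_minor(minor, unit_shift):
    """Split a minor number into (instance, partition index)."""
    instance = minor >> unit_shift                 # same as minor // 2**unit_shift
    partition = minor & ((1 << unit_shift) - 1)    # the remainder of that division
    return instance, partition


if __name__ == "__main__":
    print(split_sd_minor(9474, UNIT_SHIFT_X86))           # (148, 2)
    print(split_sd_minor(9474, UNIT_SHIFT_8_PARTITIONS))  # (1184, 2)
```

With the x86 divisor the minor number 9474 gives instance 148, matching the mdb result, while dividing by 8 gives the implausible instance 1184.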

> *sd_state::softstate 0t148 | ::print "struct sd_lun"
{
    un_sd = 0xffffff016b72daa8
    un_rqs_bp = 0xffffff07aef1db80
    un_rqs_pktp = 0xffffff0548871080
    un_sense_isbusy = 0
    un_buf_chain_type = 0x1
    un_uscsi_chain_type = 0x8

What shocked me was how far I could get through a crash dump before taking on board the architecture of the system.

Comments:

Hi Chris,
Is the disk test program, that you hope will be open sourced, the one that is mentioned at this link?
http://research.sun.com/minds/2008-0312/

Posted by Nigel Smith on October 29, 2008 at 05:38 PM GMT #

Yes. It is slightly more than "hope" as well. We are actively working this. There just seem to be a lot of hoops to jump through.

Posted by Chris Gerhard on October 30, 2008 at 12:57 AM GMT #

About

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com
