Wednesday Oct 29, 2008

Converting sd minor numbers to instance numbers.

I had an email this week about a program that I wrote that would not die. The program is a disk test program that has been around in Sun for a while and with luck will be open sourced in the not too distant future, but I digress. The program was hanging: no IO was going on, and even sending it a kill -KILL would not kill it.

Generally, if a process does not disappear when sent the signal “KILL”, that is not the fault of the program. Since there is nothing the program can do to protect itself from KILL, you need to look elsewhere, and that elsewhere is somewhere in the kernel. So a crash dump was generated and I was pointed at it.

From the stack it was clear that the program could not die, as there were outstanding async IO requests pending, and looking at the aio_t confirmed this.

Walking the structures down to the first element on the aio_pollq to find a stuck IO, and then on to the buf's dev_t to see where we are hung up, I do this:

> ::pgrep disko | ::print proc_t p_aio | ::print aio_t aio_pollq | ::print aio_req_t aio_req_buf.b_edev | ::devt
     MAJOR       MINOR
        27        9474
> 0t27::major2name

Seeing that minor number rang alarm bells, as the usual way to convert a minor number for the sd driver into an instance is to divide by 8 (the number of partitions), but that would still leave over 1000 devices. Possible, but not likely. Only at that point did it dawn on me that this was an x86 box, which thanks to a long history supports a different number of slices. A short grok in the source showed that the conversion for x86 is to divide by 64.

> 0t9474%0t64=D
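
The same arithmetic can be sketched outside mdb. This is only an illustration: the function and table names are made up here, and the divisors are simply the two values quoted above (8 and 64). The remainder is returned as well, on the assumption that it identifies the slice within the instance.

```python
# Divisors quoted in the post: 8 minors per sd instance on SPARC, 64 on x86.
# The names below are illustrative and do not come from the sd driver source.
MINORS_PER_INSTANCE = {"sparc": 8, "x86": 64}


def sd_minor_to_instance(minor, arch):
    """Return (instance, remainder) for an sd minor number on the given arch."""
    return divmod(minor, MINORS_PER_INSTANCE[arch])


print(sd_minor_to_instance(9474, "x86"))    # (148, 2)
print(sd_minor_to_instance(9474, "sparc"))  # (1184, 2)
```

Dividing by 8 gives the implausible instance 1184, while dividing by 64 gives instance 148, which is the value passed to ::softstate.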

> *sd_state::softstate 0t148 | ::print "struct sd_lun"
    un_sd = 0xffffff016b72daa8
    un_rqs_bp = 0xffffff07aef1db80
    un_rqs_pktp = 0xffffff0548871080
    un_sense_isbusy = 0
    un_buf_chain_type = 0x1
    un_uscsi_chain_type = 0x8

What shocked me was how far I could get through a crash dump before taking on board the architecture of the system.

Friday Dec 15, 2006

How to shoot yourself in the foot with mdb

This would be funny if it were not for the poor customer who actually did this, led by the hand by an info doc that suggested you do this:

#mdb -kw
>do_tcp_fusion/W 0
That is it. No information as to what to do next.
However, those of you steeped in adb history (remember, mdb has full backward compatibility) will know that if you type another address at this point it will repeat the previous command. So it will write 0 to the address specified.
If you were unfortunate enough not to be steeped in adb history then you may not know how to exit from the mdb session. If you were to guess that the way to do this was to type “exit”, then mdb happily looks up “exit” in the symbol table, converts that to an address and writes 0 into that address:
# mdb -kw
Loading modules: [ unix krtld genunix specfs ufs ip sctp usba s1394 nca ipc nfs audiosup random sppp sd crypto ptm lofs ]
> do_tcp_fusion/W 0
do_tcp_fusion:  0x1             =       0x0
> exit
exit:           0x9de3bf50      =       0x0
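
The repeat-the-last-command behaviour in the transcript above can be mimicked with a toy model. This is a sketch only and not how mdb is implemented: the addresses in SYMBOLS are invented, and only the old contents (0x1 and 0x9de3bf50) are taken from the transcript.

```python
# Toy model: symbol addresses are made up; the old values mirror the transcript.
SYMBOLS = {"do_tcp_fusion": 0x1000, "exit": 0x2000}
memory = {0x1000: 0x1, 0x2000: 0x9DE3BF50}


def run(lines):
    """Interpret 'symbol/W value' and bare 'symbol' lines, adb style."""
    last_value = None
    log = []
    for line in lines:
        name, sep, rest = line.partition("/W")
        if sep:
            last_value = int(rest.strip(), 0)
        elif last_value is None:
            raise ValueError("no previous command to repeat")
        # A bare symbol falls through to here and repeats the previous write.
        addr = SYMBOLS[name.strip()]
        log.append((name.strip(), memory[addr], last_value))
        memory[addr] = last_value
    return log


for name, old, new in run(["do_tcp_fusion/W 0", "exit"]):
    print(f"{name}: {old:#x} = {new:#x}")
```

In this model, typing "exit" is just another symbol to resolve, so the stored write of 0 lands on whatever lives at that address.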

If you are quick, lucky, and realise what has happened, you can write the instruction back before the system crashes, but on a moderately busy system you have almost no time, as you have to do it before the next process exits. Hitting control D, or exiting mdb using any other method, now results in the system crashing:

panic[cpu0]/thread=3000183c020: BAD TRAP: type=10 rp=2a10037ba50 addr=10c8a00

mdb: illegal instruction fault:
pid=2926, pc=0x10c8a00, sp=0x2a10037b2f1, tstate=0x9900001602, context=0x115b
g1-g7: 10403ac, 58692c, 10c865c, 20, 80000305cfcc0ef8, 0, 3000183c020

000002a10037b770 unix:die+9c (10, 2a10037ba50, 10c8a00, 0, 2a10037b830, c0000000
  %l0-3: ffffffff7f402000 0000000000000010 ffffffff7e6ebec4 0000000000000000
  %l4-7: 0000000000000000 0000000000001084 0000000000001000 000000000106b800
000002a10037b850 unix:trap+12b8 (2a10037ba50, 0, 0, 1835800, 180c000, 3000183c02
  %l0-3: 0000000000000000 0000000000000010 0000030001832a98 0000000000000000
  %l4-7: 0000000000010008 0000000000010000 0000000000000001 000000000180c180
000002a10037b9a0 unix:ktl0+48 (1, 0, 100173000, 100173, 5, 5)
  %l0-3: 0000000000000003 0000000000001400 0000009900001602 0000000001013c74
  %l4-7: 0000030001832cc0 0000000000000000 0000000000000000 000002a10037ba50

syncing file systems... done
dumping to /dev/dsk/c0t0d0s1, offset 107806720, content: kernel

It is actually a better way to induce a panic than most of the ones documented in books like Panic.

I've changed the info doc in question to have the command specified as:

echo 'do_tcp_fusion/W 0' | mdb -kw

That way it does not lead any more customers down that path. And yes, I've trawled sunsolve for all the cases where we suggest mdb -kw and updated them in a similar way.

Update: I also filed bug 6505499


Wednesday Oct 04, 2006

Windows Crash Dump Analysis

For a lot of people this was the shock talk of the CEC, not because people think Windows systems don't crash, but because they wondered why Sun engineers would want to know how to diagnose those crashes. Well, we get to support systems running Windows, so it is useful to be able to at least do a first pass analysis of the crash so that you know who to call next. Is it Windows, a driver or the hardware?

Dimitri De Wild and Feri Chua did an excellent, if a bit rushed, job of presenting the topic. This was not their fault: with only forty minutes it was always going to be a rush, even for just an introduction to this topic. I think I would have changed the title of the talk to “Windows Crash Dump Analysis for Solaris Kernel Engineers” though, as it seemed to make some (reasonable) assumptions about the audience in some places.

Another really good talk. While I suspect I will not directly use the information, it is very useful to know what Windows can do if you configure it to collect the crash dump.



This is the old blog of Chris Gerhard. It has mostly moved to

