Wednesday Jan 21, 2009

Getting the right CTF

I just spent too long, way too long, working out why a system dump's CTF did not seem to match the source code or, for that matter, the assembler that had been generated.

When a Solaris release ships, all the CTF is merged into the unix file. As updates are released, any structures that change are not updated in the unix file (since the old definition may still be in use), so the new CTF definition is held in the module in which the structure is defined.

So, faced with a dump where I needed to look at the “conn_udp” element in “struct conn_s”, mdb kept saying there was no element “conn_udp”:

> ::print -at conn_t conn_udp
mdb: failed to find member conn_udp of conn_t: no such member of structure or union

This was despite the assembler making it abundantly clear that we were indeed using this element (I would show you the source, but this is Solaris 10 and the source is very different from the OpenSolaris code). The thing to recall was that the structure is really defined in the “ip” module, so to get the correct definition you need this:

> ::print -at ip`conn_t conn_udp
30 struct udp_s *conn_udp

This also affects dtrace, as that is also a consumer of CTF (note this dtrace is entirely pointless):

# dtrace -n 'fbt::udp_bind:entry / ((conn_t *)(args[0]->q_ptr))->conn_udp / { tot++ }'
dtrace: invalid probe specifier fbt::udp_bind:entry / ((conn_t *)(args[0]->q_ptr))->conn_udp / { tot++ }: in predicate: conn_udp is not a member of struct conn_s

and again, getting the definition from the original module gives the right answer:

# dtrace -n 'fbt::udp_bind:entry / ((ip`conn_t *)(args[0]->q_ptr))->conn_udp / { tot++ }'
dtrace: description 'fbt::udp_bind:entry ' matched 1 probe


Since “ip`conn_t” will always give the right answer, even when the merged CTF data in unix is the current version, it is best to understand where the object was declared and name that module explicitly.

I kind of wish that dtrace, at the very least when you have specified the module, would get this right: since it knows what module you are in, it could choose the CTF from that module.

# dtrace -n 'fbt:ip:udp_bind:entry / ((conn_t *)(args[0]->q_ptr))->conn_udp / { tot++ }'
dtrace: invalid probe specifier fbt:ip:udp_bind:entry / ((conn_t *)(args[0]->q_ptr))->conn_udp / { tot++ }: in predicate: conn_udp is not a member of struct conn_s

Should IMO work.

Wednesday Oct 29, 2008

Converting sd minor numbers to instance numbers.

I had an email this week about a program that I wrote that would not die. The program is a disk test program that has been around in Sun for a while and with luck will be open sourced in the not too distant future, but I digress. The program was hanging: no IO was going on, and even sending it a kill -KILL would not kill it.

Generally if processes don't disappear when sent the signal “KILL” that is not the fault of the program. Since there is nothing the program can do to protect itself from KILL you need to look elsewhere. That elsewhere being in the kernel somewhere. So a crash dump was generated and I was pointed at it.

From the stack it was clear that the program could not die as there were outstanding async IO requests pending and looking at the aio_t confirmed this.

Walking the structures down to the first element on the aio_pollq to find a stuck IO, and then to the buf's dev_t to see where we are hung up, I do this:

> ::pgrep disko | ::print proc_t p_aio | ::print aio_t aio_pollq | ::print aio_req_t aio_req_buf.b_edev | ::devt
     MAJOR       MINOR
        27        9474
> 0t27::major2name

Seeing that minor number rang alarm bells, as the usual way to convert a minor number for the sd driver into an instance is to divide by 8 (the number of partitions), but that would give instance 1184, implying over 1000 devices. Possible but not likely. Only at that point did it dawn on me that this was an x86 box, which thanks to a long history supports a different number of slices. A short grok in the source and the conversion for x86 is to divide by 64.

> 0t9474%0t64=D
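
The same arithmetic can be checked outside mdb. This is a minimal sketch, assuming (as described above) 64 minor numbers per sd instance on x86 and 8 on SPARC, and assuming the remainder is the partition within the instance. The function name sd_minor_split is made up for illustration.

```sh
#!/bin/sh
# Split an sd minor number into instance and partition.
# $1 = minor number, $2 = minor numbers per instance
#      (assumed: 64 on x86, 8 on SPARC, per the text above).
# sd_minor_split is a hypothetical helper name, not an existing tool.
sd_minor_split() {
    echo "instance=$(( $1 / $2 )) partition=$(( $1 % $2 ))"
}

sd_minor_split 9474 64   # x86:   instance=148 partition=2
sd_minor_split 9474 8    # SPARC: instance=1184 partition=2
```

Dividing by 64 gives instance 148, which is the value passed to ::softstate below.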

> *sd_state::softstate 0t148 | ::print "struct sd_lun"
    un_sd = 0xffffff016b72daa8
    un_rqs_bp = 0xffffff07aef1db80
    un_rqs_pktp = 0xffffff0548871080
    un_sense_isbusy = 0
    un_buf_chain_type = 0x1
    un_uscsi_chain_type = 0x8

What shocked me was how far I could get through a crash dump before taking on board the architecture of the system.

Tuesday Mar 20, 2007

mdb pipes.

Sometimes I find myself doing things that I have been doing for years and just wonder whether by now there is some tool that I have missed that makes the old way of doing something redundant. So on the off chance that I have missed something, I'll document the way I often drive mdb (or more often mdb+, a slightly improved mdb that I keep wishing would get open sourced) in the hope some bright spark will point me at a better way. Failing that, someone might find this useful.

When looking at crash dumps I often want to process some data and then pipe it back into mdb via some text processing tool. Now, given that the startup time for mdb, particularly when running under kenv, is very significant, you don't want to be doing the obvious thing of having mdb in a typical pipeline. So while I can do this:

nawk 'BEGIN { printf("::load chain\n") }
/buf addr/ {
        printf("%s::print -at buf_t av_back | ::if scsi_pkt pkt_comp == ssdintr and pkt_address.a_hba_tran->tran_tgt_init == ssfcp_scsi_tgt_init |::print -at scsi_pkt  pkt_ha_private | ::print -at ssfcp_pkt cmd_flags cmd_timeout cmd_next\n",
        $3) }'  act.0 | kenv -x explorer_dir mdb+ 0

It is a pain, as kenv processes the explorer to build the correct environment and, for large dumps, loads the dump into memory only to throw it away.

So instead I start mdb as a cooperating process in the korn shell:

kenv -x explorer_dir mdb+ |&

Then I have a shell function called “mdbc” that will submit commands into the cooperating process and read the results back. So the above becomes:

nawk 'BEGIN { printf("::load chain\n") }
/buf addr/ {
        printf("%s::print -at buf_t av_back | ::if scsi_pkt pkt_comp == ssdintr and pkt_address.a_hba_tran->tran_tgt_init == ssfcp_scsi_tgt_init |::print -at scsi_pkt  pkt_ha_private | ::print -at ssfcp_pkt cmd_flags cmd_timeout cmd_next\n",
        $3) }'  act.0 | mdbc

or I can do

mdbc lbolt::print

Just by way of an example to show why I bother, compare the times of these two equivalent commands:

: dredd TS 243 $; time echo  lbolt::print | kenv -x explorer mdb+ 0

real    0m37.03s
user    0m7.02s
sys     0m7.26s
: dredd TS 244 $; time mdbc lbolt::print > /dev/null                                                              

real    0m0.01s
user    0m0.00s
sys     0m0.01s
: dredd TS 245 $; 

and just to show that I get the right results from the mdbc command.

: dredd TS 245 $; time mdbc lbolt::print            

real    0m0.01s
user    0m0.00s
sys     0m0.01s
: dredd TS 246 $; 

However, like talk, it does have a 1980s feel to it, so I look forward to hearing the error of my ways.

If you think you might find it useful the shell function is here.
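
The linked function is not reproduced in this text. For anyone who wants the general idea, here is a minimal sketch of one way to get the same effect. It is not the original function: it swaps the korn shell coprocess (|&, print -p, read -p) for a pair of named pipes so that it runs in any POSIX shell. The names mdbc_start, mdbc_stop, MDBC_DIR and the sentinel string are all made up for illustration. It assumes the debugger flushes its output after each command when writing to a pipe, which the coprocess approach relies on too, and it uses mdb's ::echo dcmd to mark the end of each reply.

```sh
#!/bin/sh
# Sketch only: mdbc_start, mdbc, mdbc_stop, MDBC_DIR and the sentinel
# string are hypothetical names, not taken from the original function.

MDBC_DIR=${MDBC_DIR:-${TMPDIR:-/tmp}/mdbc.$$}
MDBC_DONE=__MDBC_DONE__

# Start the long-lived debugger once, for example:
#   mdbc_start kenv -x explorer_dir mdb+
mdbc_start() {
    mkdir -p "$MDBC_DIR" || return 1
    mkfifo "$MDBC_DIR/in" "$MDBC_DIR/out" || return 1
    # The debugger reads commands from one FIFO and writes to the other.
    "$@" < "$MDBC_DIR/in" > "$MDBC_DIR/out" 2>&1 &
    # Hold both FIFOs open in this shell so the debugger never sees EOF
    # between commands. They are opened in the same order as in the child.
    exec 3> "$MDBC_DIR/in" 4< "$MDBC_DIR/out"
}

# Send commands (the arguments, or stdin when there are none) and print
# the output up to, but not including, the sentinel line.
mdbc() {
    if [ $# -gt 0 ]; then
        printf '%s\n' "$*" >&3
    else
        cat >&3
    fi
    # Ask the debugger to echo the sentinel once it has finished.
    printf '::echo %s\n' "$MDBC_DONE" >&3
    while IFS= read -r line; do
        case $line in
            *"$MDBC_DONE"*) break ;;
        esac
        printf '%s\n' "$line"
    done <&4
}

# Close the FIFOs (the debugger then sees EOF and exits) and clean up.
mdbc_stop() {
    exec 3>&- 4<&-
    rm -rf "$MDBC_DIR"
}
```

After a single mdbc_start, both "mdbc lbolt::print" and "nawk ... | mdbc" reuse the same running debugger, so the startup cost is paid only once.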


Friday Dec 15, 2006

How to shoot yourself in the foot with mdb

This would be funny if it were not for the poor customer who actually did this, led by the hand by an info doc that suggested you do this:

#mdb -kw
>do_tcp_fusion/W 0
That is it. No information as to what to do next.
However, those of you steeped in adb history (remember mdb has full backward compatibility) will know that if you type another address at this point it will repeat the previous command. So it will write 0 to the address specified.
If you were unfortunate enough not to be steeped in adb history, then you may not know how to exit from the mdb session. If you were to guess that the way to do this was to type “exit”, then mdb happily looks up “exit” in the symbol table, converts that to an address and writes 0 into that address:
# mdb -kw
Loading modules: [ unix krtld genunix specfs ufs ip sctp usba s1394 nca ipc nfs audiosup random sppp sd crypto ptm lofs ]
> do_tcp_fusion/W 0
do_tcp_fusion:  0x1             =       0x0
> exit
exit:           0x9de3bf50      =       0x0

If you are quick, lucky, and realise what has happened, you can write the instruction back before the system crashes, but on a moderately busy system you have almost no time: you have to do it before the next process exits. Hitting control D, or exiting mdb using any other method, now results in the system crashing:

panic[cpu0]/thread=3000183c020: BAD TRAP: type=10 rp=2a10037ba50 addr=10c8a00

mdb: illegal instruction fault:
pid=2926, pc=0x10c8a00, sp=0x2a10037b2f1, tstate=0x9900001602, context=0x115b
g1-g7: 10403ac, 58692c, 10c865c, 20, 80000305cfcc0ef8, 0, 3000183c020

000002a10037b770 unix:die+9c (10, 2a10037ba50, 10c8a00, 0, 2a10037b830, c0000000
  %l0-3: ffffffff7f402000 0000000000000010 ffffffff7e6ebec4 0000000000000000
  %l4-7: 0000000000000000 0000000000001084 0000000000001000 000000000106b800
000002a10037b850 unix:trap+12b8 (2a10037ba50, 0, 0, 1835800, 180c000, 3000183c02
  %l0-3: 0000000000000000 0000000000000010 0000030001832a98 0000000000000000
  %l4-7: 0000000000010008 0000000000010000 0000000000000001 000000000180c180
000002a10037b9a0 unix:ktl0+48 (1, 0, 100173000, 100173, 5, 5)
  %l0-3: 0000000000000003 0000000000001400 0000009900001602 0000000001013c74
  %l4-7: 0000030001832cc0 0000000000000000 0000000000000000 000002a10037ba50

syncing file systems... done
dumping to /dev/dsk/c0t0d0s1, offset 107806720, content: kernel

It is actually a better way to induce a panic than most of the ones documented in books like Panic.

I've changed the info doc in question to have the command specified as:

echo 'do_tcp_fusion/W 0' | mdb -kw

so that it does not lead any more customers down that path. Yes, I've trawled sunsolve for all the cases where we suggest mdb -kw and updated them in a similar way.

Update: I also filed bug 6505499



This is the old blog of Chris Gerhard. It has mostly moved to

