An Oracle blog about Solaris

Debugging SPARC really (and I do mean really) early boot problems

Chris Beal
Senior Principal Software Engineer

For some work I've been doing, I've had to work out how to debug the SPARC boot process before you can get to kmdb. And yes, you can do it; it's just not that easy. So I thought I'd put it on my blog, in case I lose the notes I made in a mail to myself, and because it might be of interest to some of you.

First off, get as much diagnostic output from the OBP as possible:

{1} ok setenv fcode-debug? true

fcode-debug? =          true

{1} ok

{1} ok setenv diag-switch? true

diag-switch? =          true

{1} ok reset-all

The reset-all is important as it saves the options to the NVRAM.

Now we try to boot it up, halting before anything is loaded. Note this requires a debug kernel, but if you're playing in this space and you're on SPARC then you probably know that already.

{1} ok boot disk0 -F kernel/unix -H

You will see the boot fail like this:

Rebooting with command: boot disk0 -F kernel/unix -H                 
Boot device: /pci@1c,600000/scsi@2/disk@0,0  File and args: -F kernel/unix -H
Halted with -H flag.
Warning: Fcode sequence resulted in a net stack depth change of 1

The file just loaded does not appear to be executable.

This is expected, and it is how we get to start playing with breakpoints really early on. Note that the unix module is not yet loaded, so we now have to load it ourselves. To do this we look at the boot Forth code with see and copy what it does:

{1} ok see do-boot
: do-boot  
   parse-bootargs halt? l->n if   
      " Halted with -H flag. " type cr exit
   then  get-bootdev load-pkg mount-root zflag? nested? invert and
   l->n if   
      fs-name$ open-zfs-fs
   then  load-file setup-props exec-file

So by copying what do-boot does, but stopping short of exec-file, we can intercept the boot process:

{1} ok get-bootdev load-pkg mount-root

{1} ok load-file setup-props

Loading: /platform/SUNW,Sun-Fire-V240/kernel/unix

Loading: /platform/sun4u/kernel/unix

{1} ok 
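To make the split explicit, here is a small Python model of the do-boot word list shown by see. It is only an illustration: words_to_type and the zfs_branch flag are names made up for this sketch, and it assumes the ZFS step is the only conditional part after the -H check, which is what the see output suggests.

```python
# Python model of the do-boot word list from `see do-boot` (illustrative only).
# words_to_type and zfs_branch are hypothetical names, not OBP words.

def words_to_type(zfs_branch=False):
    """Return (setup, launch): the words to enter by hand after a -H halt."""
    setup = ["get-bootdev", "load-pkg", "mount-root"]
    if zfs_branch:
        # do-boot only runs these when `zflag? nested? invert and` is true
        setup += ["fs-name$", "open-zfs-fs"]
    setup += ["load-file", "setup-props"]
    # exec-file is held back so properties can be added under /chosen first
    launch = ["exec-file"]
    return setup, launch


setup, launch = words_to_type()
print("type now:  ", " ".join(setup))
print("type later:", " ".join(launch))
```

On the UFS root used here the ZFS branch is not needed, so the default matches what was typed at the ok prompt.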

Now we can start some more magic. A DEBUG kernel will check the stop-me property in kobj_start(). This is something we have to populate in the boot properties, which is why we've done all this messing around to get to this point.

{1} ok cd /chosen
{1} ok 00 0 " stop-me" property
{1} ok .properties
fs-package               ufs-file-system
whoami                   /platform/sun4u/kernel/unix
impl-arch-name           SUNW,Sun-Fire-V240
elfheader-length         001c55c0
elfheader-address        51000000
bootfs                   fed85a80
fstype                   ufs
bootargs                 -F kernel/unix -H
bootpath                 /pci@1c,600000/scsi@2/disk@0,0:a
mmu                      fff74080
memory                   fff74290
stdout                   fed97b90
stdin                    fed97ea8
stdout-#lines            ffffffff
name                     chosen
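The value handed to property is an empty byte array (address 0, length 0), so the only thing the kernel can test is whether stop-me exists at all. The Python model below illustrates that idea. It is a sketch, not kernel source: get_prop_len and should_enter_monitor are hypothetical names, and it assumes the kernel does a presence-only check using the Open Firmware convention that a missing property reports a length of -1.

```python
# Model of a presence-only property check (illustrative, not kernel source).
# get_prop_len and should_enter_monitor are hypothetical names.

def get_prop_len(node, name):
    """Length of a property's value in bytes, or -1 if it does not exist."""
    value = node.get(name)
    return -1 if value is None else len(value)


def should_enter_monitor(chosen):
    """True when stop-me exists, even with a zero-length value."""
    return get_prop_len(chosen, "stop-me") != -1


chosen = {"fstype": b"ufs\x00", "bootargs": b"-F kernel/unix -H\x00"}
print(should_enter_monitor(chosen))   # stop-me not created yet

chosen["stop-me"] = b""               # what `00 0 " stop-me" property` creates
print(should_enter_monitor(chosen))   # zero length still counts as present
```

This is also why stop-me does not show up with a value in the .properties listing: there is nothing to print beyond the name.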

We can now start the boot process using exec-file. It will stop immediately because of the stop-me property (ctrace gives me the stack trace).

{1} ok exec-file
Type  'go' to resume
{1} ok ctrace
PC: 0000.0000.f004.81e4
Last leaf: jmpl  0000.0000.f005.d274   from 0000.0000.0100.8aec client_handler+70 
     0 w  %o0-%o7: (f0000000 16 f0000000 6d 73 6 fedcb441 1008aec )

call 0000.0000.0106.bea8 p1275_sparc_cif_handler        from 0000.0000.0106.7de8 prom_enter_mon+24 
     1 w  %o0-%o7: (f005d274 fedcbda8 1839400 106af00 185fc00 f005d274 fedcb4f1 1067de8 )

call 0000.0000.0106.7dc4 prom_enter_mon        from 0000.0000.0101.9ed4 kobj_start+30 
     2 w  %o0-%o7: (0 10bdaf0 f002d224 1 1817700 1821dd8 fedcb5c1 1019ed4 )

call 0000.0000.0101.9ea4 kobj_start        from 0000.0000.0100.7ac8 _start+10 
     3 w  %o0-%o7: (f005d274 0 0 0 10bd800 181fc00 fedcb701 1007ac8 )
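If you capture the console to a log, a few lines of Python can boil ctrace output down to the call chain. This is a rough sketch that assumes the line layout seen in the trace above; parse_ctrace and the regular expression are made up for this example and are not part of any Solaris or OBP tool.

```python
import re

# One "call"/"jmpl" line per frame: target address, optional callee symbol,
# then the call site address and the caller as symbol+offset.
FRAME = re.compile(
    r"\b(?:call|jmpl)\s+(?P<target>[0-9a-f.]+)"  # address being called
    r"(?:\s+(?!from\b)(?P<callee>\S+))?"         # callee symbol, if resolved
    r"\s+from\s+(?P<site>[0-9a-f.]+)"            # address of the call site
    r"\s+(?P<caller>\S+)"                        # caller as symbol+offset
)


def parse_ctrace(text):
    """Return [(callee, caller), ...] innermost frame first."""
    frames = []
    for line in text.splitlines():
        m = FRAME.search(line)
        if m:
            # Fall back to the raw address when no symbol was printed
            callee = m.group("callee") or m.group("target")
            frames.append((callee, m.group("caller")))
    return frames


SAMPLE = """\
PC: 0000.0000.f004.81e4
Last leaf: jmpl  0000.0000.f005.d274   from 0000.0000.0100.8aec client_handler+70
     0 w  %o0-%o7: (f0000000 16 f0000000 6d 73 6 fedcb441 1008aec )
call 0000.0000.0106.bea8 p1275_sparc_cif_handler        from 0000.0000.0106.7de8 prom_enter_mon+24
call 0000.0000.0106.7dc4 prom_enter_mon        from 0000.0000.0101.9ed4 kobj_start+30
call 0000.0000.0101.9ea4 kobj_start        from 0000.0000.0100.7ac8 _start+10
"""

# Print outermost caller first
for callee, caller in reversed(parse_ctrace(SAMPLE)):
    print(f"{caller} -> {callee}")
```

Register window lines are skipped on purpose; they do not contain call or jmpl, so the pattern never matches them.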

From this point we have access to the unix symbols and can start setting breakpoints. For example:

{1} ok load_primary +bp
{1} ok go
0000.0000.010a.c7b0 load_primary         save        %o6, ffffffffffffff30, %o6
{1} ok ctrace
PC: 0000.0000.010a.c7b0 load_primary    
Last leaf: call 0000.0000.010a.c7b0 load_primary        from 0000.0000.010a.b46c kobj_init+d8 
     0 w  %o0-%o7: (1879400 0 fedcbe78 184f000 1879340 181ac00 fedcb111 10ab46c )

call 0000.0000.010a.b394 kobj_init        from 0000.0000.0101.9fd0 kobj_start+12c
     1 w  %o0-%o7: (f005d274 185c800 184f000 fedcbe78 184f3f8 184e400 fedcb5c1 1019fd0 )

call 0000.0000.0101.9ea4 kobj_start        from 0000.0000.0100.7ac8 _start+10 
     2 w  %o0-%o7: (f005d274 7 0 51000040 51000000 51000040 fedcb701 1007ac8 )

I'm interested in getting some more module loading debug info out, so let's set moddebug to 0xf.

{1} ok moddebug l?

(displays current value of a long)

{1} ok F moddebug l!
{1} ok moddebug l?
{1} ok

(set the long to be F then display it again)

Now let's see what additional info I get:

{1} ok go
/kernel/fs/sparcv9/specfs symbol _info multiply defined
/kernel/fs/sparcv9/specfs symbol _init multiply defined
Returned from _info, retval = 1
init_stubs: couldn't find symbol in module fs/specfs
(Can't load specfs) Program terminated

OK, that doesn't tell me much more, but you get the idea. You can access the symbols, set breakpoints, and set variables. In addition you can dump out memory with dump, single-step with step, and do loads of other things that you might want to do, but this at least will act as a memory jogger for me.

Let me know if you found this useful.

