Fast Reboot: tightening the loop on anonymous tracing

Back in the day, Bryan, Mike and Adam used to give a DTrace presentation that had a final slide entitled "The DTrace Revolution". I really liked this slide because it distilled into a single bullet point what, essentially, made DTrace so exciting for me. The bullet point was:

  • DTrace tightens the diagnosis loop:

    hypothesis -> instrumentation -> data gathering -> analysis -> hypothesis

This simple statement succinctly puts down the steps we go through when analysing system behaviour and the realisation that DTrace could make this loop a whole lot shorter made me excessively happy (illegally so at times I'd say).

I mention the above because Sherry Moore and her team recently putback (I should say pushed now I guess) a project into Solaris which makes the diagnosis loop a whole lot tighter for anonymous tracing; that project is the Fast Reboot project. This project, not surprisingly, allows us to reboot x86 based systems a whole lot quicker than we used to. For example, my little v20z test system, now reboots to the console prompt in around 40 seconds instead of 3 minutes!! Hopefully you won't need to reboot too often but, if you do, that's the kind of juice you need! It really is amazing stuff.

For those that may have forgotten, anonymous tracing gives us the ability to trace from really early on in the Solaris boot process. Some of you may never need to do this but for those of us that do the data gathering phase in the diagnosis loop can become a real pain, especially on highly configured systems. Reducing the time to reboot systems by such a large degree is a positive boon for the anonymous tracer!

Here's a quick demo (yes, I know you really need to see Fast Reboot in action but text will have to do) of the up and coming CPU Performance Counter provider in use with anonymous tracing to look at kernel L2 Data Cache misses generated during the boot process. I think this is pretty much a first so you saw it here first:

# cat l2miss-bycpu.d 
#!/usr/sbin/dtrace -s

#pragma D option quiet

cpc:::PAPI_l2_dcm-kernel-100
{
        @[func(arg0)] = count();
}

END
{
        trunc(@, 30);
        printf("%-60s %16s\\n\\n", "FUNCTION", "FREQ");
        printa("%-60a %16@d\\n", @);
}
# dtrace -A -s ./l2miss-bycpu.d 
dtrace: saved anonymous enabling in /kernel/drv/dtrace.conf
dtrace: added forceload directives to /etc/system
dtrace: run update_drv(1M) or reboot to enable changes
# reboot -f
Oct  2 14:12:51 va64-v20zn-gmp03 reboot: initiated by LOGIN on /dev/console
Oct  2 14:13:37 va64-v20zn-gmp03 syslogd: going down on signal 15
updating /platform/i86pc/amd64/boot_archive
updating /platform/i86pc/boot_archive
Fast reboot.
syncing file systems... done
SunOS Release 5.11 Version dtrace-cpc-hg 64-bit
Copyright 1983-2008 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
DEBUG enabled
NOTICE: enabling probe 0 (cpc:::PAPI_l2_dcm-kernel-100)
NOTICE: enabling probe 1 (:::END)
NOTICE: enabling probe 2 (dtrace:::ERROR)

Hostname: va64-v20zn-gmp03
NIS+ domain name is eu.cte.sun.com
Reading ZFS config: done.

va64-v20zn-gmp03 console login: root
Password: 
Oct  2 14:14:27 va64-v20zn-gmp03 login: ROOT LOGIN /dev/console
Last login: Thu Oct  2 14:12:17 on console
Sun Microsystems Inc.   SunOS 5.11      dtrace-cpc-hg   Sep. 24, 2008
SunOS Internal Development:  jhaslam 2008-09-24 [dtrace-cpc-hg]
bfu'ed from /cpc on 2008-09-24
Sun Microsystems Inc.   SunOS 5.11      snv_96  November 2008
# dtrace -ae
FUNCTION                                                                 FREQ

unix`page_try_reclaim_lock                                               2862
unix`x86_hm_exit                                                         2883
genunix`copy_pattern                                                     2920
genunix`checkpage                                                        2975
unix`page_create_va                                                      2984
unix`disp_getwork                                                        3002
unix`splr                                                                3038
unix`atomic_and_64                                                       3076
genunix`kmem_log_enter                                                   3202
unix`bcopy                                                               3237
unix`sys_syscall32                                                       3615
unix`page_get_mnode_freelist                                             3858
genunix`kmem_cache_free_debug                                            4423
unix`mutex_exit                                                          4498
unix`tsc_read                                                            4659
genunix`kmem_hash_rescale                                                4756
unix`atomic_add_32                                                       4786
unix`bzero                                                               4809
unix`atomic_add_64                                                       4924
unix`do_splx                                                             4951
unix`strcmp                                                              5156
unix`kcopy                                                               6318
genunix`verify_and_copy_pattern                                          6833
unix`page_lookup_create                                                  7916
genunix`kmem_cache_alloc_debug                                           8007
genunix`getpcstack                                                       8370
dtrace`dtrace_unregister                                                11502
unix`hwblkpagecopy                                                      28359
unix`mutex_enter                                                        55687

And there you go! No waiting ages for the various BIOS checks and device probing - just straight back to us so we can get on with some useful work. Now, why exactly does dtrace_unregister() take so many L2 cache misses...

Comments:

Post a Comment:
Comments are closed for this entry.
About

jonh

Search

Categories
Archives
« April 2014
MonTueWedThuFriSatSun
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    
       
Today