One child off sick, two sleepy dogs after some long walks at the weekend. If you live near Farnham
in Surrey, UK and you see two fluffy white dogs, that's probably us.
Explained again that the "%b" column in iostat is not a measure of how busy a storage device is. It is the
percentage of the sample time for which there is at least one I/O outstanding in the HBA driver or disk. It is a
measure of how "bursty" your I/O load is. The storage device will perform best when the I/O is spread out
evenly over the sample time; conversely, it will perform less well if all the I/O is dropped on it at the same instant.
Worst of all is when the burst of I/O exceeds the target driver's throttle and I/Os are queued on the target driver's
waitq. There they linger until an I/O comes back from the disk, and only then do they get to start; as far as the layer above
is concerned that I/O took 1.5 times as long as it should. So the warning sign is a low %b together with a wsvc_t > 0. Somehow
you need to smooth out the workload, or disperse it across more drives, or check you haven't set the target driver's
throttle too low.
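To make the "low %b but wsvc_t > 0" case concrete, here is a rough Python sketch of a toy model. It is not how iostat or the target driver is implemented: the function names, the fixed per-I/O service time and the simple FIFO waitq are all my own assumptions for illustration.

```python
import heapq


def simulate(arrivals, service_ms, throttle):
    """Toy model (assumption): each I/O takes service_ms once issued to the
    device, and at most `throttle` I/Os may be outstanding there. Extra I/Os
    wait, in arrival order, until an earlier one completes.
    Returns (device_intervals, waits)."""
    in_flight = []  # min-heap of completion times of I/Os at the device
    intervals, waits = [], []
    for t in sorted(arrivals):
        # Retire everything that has already completed by time t.
        while in_flight and in_flight[0] <= t:
            heapq.heappop(in_flight)
        if len(in_flight) < throttle:
            start = t  # room under the throttle, issue straight away
        else:
            start = heapq.heappop(in_flight)  # wait for the next completion
        end = start + service_ms
        heapq.heappush(in_flight, end)
        intervals.append((start, end))
        waits.append(start - t)  # time spent queued above the device
    return intervals, waits


def percent_busy(intervals, sample_ms):
    """Percentage of the sample with at least one I/O outstanding at the
    device, i.e. the length of the union of the intervals."""
    busy = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return 100.0 * busy / sample_ms


if __name__ == "__main__":
    sample_ms, service_ms, throttle = 1000, 10, 2
    workloads = {
        "even": [i * 100 for i in range(10)],  # one I/O every 100 ms
        "burst": [0] * 10,                     # all ten I/Os at t = 0
    }
    for name, arrivals in workloads.items():
        intervals, waits = simulate(arrivals, service_ms, throttle)
        print(name,
              "%b =", percent_busy(intervals, sample_ms),
              "mean wait ms =", sum(waits) / len(waits))
```

With these made-up numbers the evenly spread load shows %b of 10 and no waiting, while the burst shows %b of only 5 but a mean wait of 20 ms: the same ten I/Os, a "less busy" device, and a much worse time for the layer above.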
collect and analyzer and friends are the performance analysis parts of the compiler suite. I only learnt today that
you can merge profiling runs with hardware counters. It took me a while to work out how to use the hardware counters, but
it is easy: you pick the two hardware counters that you want the CPU to count, and for each you specify the number
of increments you want before a signal is sent to your process. So, say you wanted to be notified on every DTLB miss
and to find functions that use a lot of instructions, then you could use:
collect -p off -h Instr_cnt,200000,DTLB_miss,1 ls -al /tmp
Those two counter names come from /usr/sbin/cpustat -h
So what can I see from this?
er_print -pcs test.1.er
test.1.er: Experiment has warnings, see header for details
Objects sorted by metric: Exclusive Instr_cnt Events
     Excl.     Incl.     Excl.     Incl.  Name
 Instr_cnt Instr_cnt DTLB_miss DTLB_miss
    Events    Events    Events    Events
         0         0         2         2  _ndoprnt + 0x000003D0
         0         0         2         2  _ndoprnt + 0x000020EC
         0         0         0         3  _ndoprnt + 0x00002898
         0         0         0         5  _rt_boot + 0x00000088
         0         0         0         5  _setup + 0x000003AC
         0         0         0         2  _start + 0x000000AC
         0   1400067         0        53  _start + 0x00000108
         0    200017         0        35  _start + 0x00000110
         0    200008         0        17  _ti_bind_clear + 0x00000050
         0         0         0         1  atexit + 0x0000001C
         0    200017         0        33  atexit_fini + 0x00000068
         0         0         0         3  call_fini + 0x000000D4
         0         0         0         5  call_fini + 0x000000DC
         0    200017         0        25  call_fini + 0x00000104
         0         0         0         5  call_init + 0x000001B0
         0         0         0         3  collector_close_experiment + 0x00000090
         0         0         0         5  collector_init + 0x000000DC
         0         0         0         1  collector_sigprof_sigaction + 0x00000108
         0    200021         0         1  dcgettext + 0x000000A0
         0    400018         0        27  do_exit_critical + 0x000000C8
         0         0         1         1  doformat + 0x00000A18
         0         0         0         1  doformat + 0x00000A2C
         0         0         1         1  doformat + 0x000015A8
         0   1400067         0        47  main + 0x00000A04
I have a lot more work to do to really understand what it is displaying. The benefit over cputrack
is that you "should" be able to find the PC causing the most data cache misses etc., and then look at changing the
code or adding prefetches to reduce the stall and get the instruction execution rate up.
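As a first stab at "find the PC causing the most misses", something like the following Python sketch could rank the rows of the er_print -pcs listing. It assumes the row layout seen in the output above (four counts, then "name + 0xOFFSET"); the function names and dictionary keys are my own, and the sample rows are just a handful copied from that listing.

```python
import re

# One row of the per-PC listing: four counts, then "name + 0xOFFSET".
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+\+\s+(0x[0-9A-Fa-f]+)\s*$"
)


def parse_pcs(text):
    """Turn the listing into a list of dicts, skipping any line that is
    not a data row (headers, warnings, blank lines)."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        counts = [int(g) for g in m.groups()[:4]]
        rows.append({
            "excl_instr": counts[0],
            "incl_instr": counts[1],
            "excl_dtlb": counts[2],
            "incl_dtlb": counts[3],
            "name": m.group(5),
            "offset": int(m.group(6), 16),
        })
    return rows


def top_by(rows, key, n=3):
    """Return the n rows with the largest value for the given metric."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:n]


# A few rows copied from the listing above.
SAMPLE = """\
0 0 2 2 _ndoprnt + 0x000003D0
0 0 0 3 _ndoprnt + 0x00002898
0 1400067 0 53 _start + 0x00000108
0 0 1 1 doformat + 0x00000A18
0 1400067 0 47 main + 0x00000A04
"""

if __name__ == "__main__":
    for row in top_by(parse_pcs(SAMPLE), "excl_dtlb"):
        print(row["excl_dtlb"], row["name"], hex(row["offset"]))
```

Sorting on the exclusive DTLB_miss column points at the PC that took the miss itself, whereas the inclusive column mostly tells you about callers further up the stack such as _start and main.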
0 miles in the smart
30 miles on the bike. Nearly hit the back of a Lotus: the highly skilled driver overtook me 50 feet from a
junction, pulled in front and braked hard for the junction. Great! Thanks!