Will a faster cpu make my application faster?

I was recently involved in an escalation in which a customer had moved from one SPARC platform to another, and in doing so had also moved to a faster release of UltraSPARC-IV than they had previously looked at.

It turns out that they actually saw their application slow down.

This is not as silly, nor as unusual, as it may at first seem.

The actual platform migration was from a US-III workgroup server to a Starcat class machine.

Now, there are some things to watch for in this type of migration, as there are some major differences in the architecture. Most specifically, you are moving from a platform with a two tier memory architecture to one with a three tier architecture.

Now in Solaris 9 we have some "new" bits that help us immensely here. They are simply not there in Solaris 8 and are not able to be backported to it.

These are Memory Placement Optimization (MPO) and Multiple Page Size Support (MPSS). MPO is the important one as it attempts to run programs on the same board that the memory exists on.

OK, that probably accounts for the slow down. Why did we not see any improvement?

While I don't actually have the data on the previous system I have my suspicions.

On the Starcat box I am seeing large amounts of idle time. The immediate thought here is, is the application cpu bound or is something else the bottleneck?

If we already had lots of idle time on the previous box, then it's odds on that the cpu is not our bottleneck.

If, for example, we have threads that are doing a lot of I/O, and we haven't changed the I/O subsystem, then the time we spend waiting on the I/O is not going to change. If that is the limiting speed factor, then faster cpus are not going to help us.
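To put some rough numbers on that, here is a minimal sketch of the usual Amdahl's law arithmetic. The function name and the example fractions are mine, purely for illustration; they are not measurements from the customer's system. It assumes the run time splits cleanly into a part that scales with cpu speed and a part (I/O wait) that does not.

```python
def overall_speedup(cpu_fraction, cpu_speedup):
    """Overall speedup when only the cpu-bound share of run time gets faster.

    cpu_fraction: share of wall-clock time spent computing (0.0 to 1.0)
    cpu_speedup:  how many times faster the new cpu is (e.g. 1.5)
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    # The I/O wait share is unchanged; only the cpu share shrinks.
    io_fraction = 1.0 - cpu_fraction
    return 1.0 / (io_fraction + cpu_fraction / cpu_speedup)


if __name__ == "__main__":
    # Hypothetical workloads: mostly I/O bound, half and half, mostly cpu bound.
    for cpu_fraction in (0.1, 0.5, 0.9):
        result = overall_speedup(cpu_fraction, 1.5)
        print(f"{cpu_fraction:.0%} cpu bound, 1.5x cpu: {result:.2f}x overall")
```

The more of the run time that is spent waiting on I/O, the closer the overall figure stays to 1.0, no matter how fast the new cpus are.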

The suggestion coming out of this is that before upgrading your hardware in order to speed up your applications, please have a look at the application to see exactly where the bottlenecks are. You may be pleasantly surprised to find that there are cheaper options to improving your application performance.

Some good *stat commands to start with would be vmstat, iostat and mpstat.

First off, try "vmstat 5". This will give us output like

 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr cd s0 -- --   in   sy   cs us sy id
 0 0 0 692416 239220 33 129 80  1  4  0 140 37 0  0  0  469 2255  996  5  7 88
 0 0 0 626860 175244  0   9  0  0  0  0  0 26  0  0  0  427  214  278  1  3 96
 0 0 0 626860 175244  0   0  0  0  0  0  0  0  0  0  0  371  160  193  0  3 97
 0 0 0 626860 175244  0   0  0  0  0  0  0  0  0  0  0  372  195  203  1  3 97

Have a look at the 'cpu' columns, specifically the user/system/idle split. Do we have idle time? Does the system time look excessive? These kinds of things are more complex to investigate, but can be looked into.

It's also worth looking at the 'kthr' columns. These show

  • r - # threads ready to run but not yet running (a count of the threads on the dispatch queues)
  • b - # threads that are blocked waiting for resources (eg I/O, paging, ...)
  • w - # swapped out lightweight processes that are waiting for processing resources to finish

Consistently high numbers on any of these are cause for concern. Specifically, consistently having threads in the dispatch queue is a sign that we are probably cpu starved in this box.
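If you have captured vmstat output to a file, a few lines of script can pull out the columns discussed above. What follows is only a sketch: it assumes the Solaris column layout shown earlier (r, b and w first; us, sy and id last), and the function names are my own. It throws away the first sample, since that one is an average since boot.

```python
SAMPLE_VMSTAT = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr cd s0 -- --   in   sy   cs us sy id
 0 0 0 692416 239220 33 129 80  1  4  0 140 37 0  0  0  469 2255  996  5  7 88
 0 0 0 626860 175244  0   9  0  0  0  0  0 26  0  0  0  427  214  278  1  3 96
 0 0 0 626860 175244  0   0  0  0  0  0  0  0  0  0  0  371  160  193  0  3 97
 0 0 0 626860 175244  0   0  0  0  0  0  0  0  0  0  0  372  195  203  1  3 97
"""


def parse_vmstat(text):
    """Return one dict per interval sample, skipping the since-boot sample."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Header lines contain names or dashes; data lines are all digits.
        if len(fields) < 6 or not all(f.isdigit() for f in fields):
            continue
        values = [int(f) for f in fields]
        # Column names repeat in the header ('sy', '--'), so go by position.
        samples.append({
            "r": values[0], "b": values[1], "w": values[2],
            "us": values[-3], "sy": values[-2], "id": values[-1],
        })
    return samples[1:]


def summarise(samples):
    """Average the run queue, blocked count and idle percentage."""
    if not samples:
        raise ValueError("need at least one interval sample")
    n = len(samples)
    return {
        "avg_runq": sum(s["r"] for s in samples) / n,
        "avg_blocked": sum(s["b"] for s in samples) / n,
        "avg_idle": sum(s["id"] for s in samples) / n,
    }


if __name__ == "__main__":
    print(summarise(parse_vmstat(SAMPLE_VMSTAT)))
```

A high average idle figure together with an empty run queue, as on my notebook here, suggests the cpu is not what the application is waiting on.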

On to mpstat. This shows the following columns (I'm running this on a single cpu notebook, on multi-cpu machines you'd see more cpus).

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  115   6    0   460  356  920  249    0    0    0  2068    5   6   0  89
  0    2   0    0   412  310  250    2    0    0    0   160    0   3   0  96
  0    0   0    0   375  273  198    2    0    0    0   153    1   3   0  97

Again we have the user/system/idle split, but now it's on a per cpu basis. High numbers of icsw (involuntary context switches) on a particular cpu are an indication that that cpu is handling a lot of interrupts. This can have incredibly detrimental effects on applications trying to use that cpu. It may be worthwhile considering either processor sets, or using psradm to disable all but interrupts on that cpu.

High numbers in 'migr' (thread migrations) can also be detrimental as we end up having to invalidate cached data on one cpu and reload it in another. Binding processes to particular cpus might help here.
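The same approach works for mpstat output saved to a file. Again, this is a sketch under assumptions: it expects the column header shown above, the function names are mine, and the thresholds passed in are arbitrary numbers for illustration, not recommended limits. What counts as "high" depends on the workload.

```python
from collections import defaultdict

SAMPLE_MPSTAT = """\
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  115   6    0   460  356  920  249    0    0    0  2068    5   6   0  89
  0    2   0    0   412  310  250    2    0    0    0   160    0   3   0  96
  0    0   0    0   375  273  198    2    0    0    0   153    1   3   0  97
"""


def parse_mpstat(text):
    """Group rows by cpu id, dropping each cpu's first (since-boot) row."""
    header = None
    per_cpu = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "CPU":
            header = fields  # mpstat may reprint the header between samples
            continue
        if header is None or len(fields) != len(header):
            continue
        if not all(f.isdigit() for f in fields):
            continue
        row = dict(zip(header, (int(f) for f in fields)))
        per_cpu[row["CPU"]].append(row)
    return {cpu: rows[1:] for cpu, rows in per_cpu.items()}


def busy_cpus(per_cpu, icsw_limit, migr_limit):
    """List (cpu, avg icsw, avg migr) for cpus whose averages exceed a limit."""
    flagged = []
    for cpu, rows in sorted(per_cpu.items()):
        if not rows:
            continue
        avg_icsw = sum(r["icsw"] for r in rows) / len(rows)
        avg_migr = sum(r["migr"] for r in rows) / len(rows)
        if avg_icsw > icsw_limit or avg_migr > migr_limit:
            flagged.append((cpu, avg_icsw, avg_migr))
    return flagged


if __name__ == "__main__":
    # Hypothetical limits, chosen only to show the call.
    print(busy_cpus(parse_mpstat(SAMPLE_MPSTAT), icsw_limit=100, migr_limit=100))
```

On a multi-cpu machine, a cpu that keeps turning up in that list is a candidate for a closer look before reaching for processor sets or binding.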

iostat is a good way to see how the I/O subsystem is running. I generally use something like the following command:

$ iostat -xnz 5
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   20.6    8.9 1281.1   53.1  1.8  0.3   62.1   11.2   9  21 c0d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   19.2    0.0   52.0  0.0  0.0    0.1    1.0   0   2 c0d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device

The -z argument tells iostat not to print lines of zeros (hence the multiple headers with no data).

Most of the columns are self-explanatory. I'm generally interested in 'asvc_t' (active service time), which generally equates to 'time on the wire', or how long the device takes to service the request; '%w' shows the percentage of time that there were I/O requests waiting in the queue for this device (the 'wait' column gives the average number waiting), and '%b' gives an indication of how busy the device is. Note that '%b' assumes that the device can only service one request at a time. This is obviously not the case for arrays.

High '%w' numbers are an indication of a lot of I/O taking place. High 'asvc_t' numbers may indicate a problem with the storage device. You should be seeing times of the order of 1-10ms in this column in general on current hardware. The above were taken on my notebook which has a slower IDE drive in it that had had a bit of a workout before I ran the stats.
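For longer captures, something like the following can pick out the devices whose 'asvc_t' goes over a limit. It is a sketch that assumes the "iostat -xnz" layout shown above; the function names are mine, and the 10ms default is simply the upper end of the range I mentioned, not a hard rule. The first interval is skipped because it is an average since boot.

```python
SAMPLE_IOSTAT = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   20.6    8.9 1281.1   53.1  1.8  0.3   62.1   11.2   9  21 c0d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   19.2    0.0   52.0  0.0  0.0    0.1    1.0   0   2 c0d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
"""


def parse_iostat(text):
    """Return a list of intervals (each a list of device rows), minus the first."""
    intervals = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[:2] == ["extended", "device"]:
            intervals.append([])  # each banner line starts a new interval
            continue
        if fields[0] == "r/s":
            header = fields
            continue
        if header is None or not intervals or len(fields) != len(header):
            continue
        # Every column is numeric except the last, which is the device name.
        row = {name: float(value) for name, value in zip(header[:-1], fields[:-1])}
        row["device"] = fields[-1]
        intervals[-1].append(row)
    return intervals[1:]


def slow_devices(intervals, asvc_limit_ms=10.0):
    """List (device, asvc_t) for every row whose service time exceeds the limit."""
    return [
        (row["device"], row["asvc_t"])
        for interval in intervals
        for row in interval
        if row["asvc_t"] > asvc_limit_ms
    ]


if __name__ == "__main__":
    print(slow_devices(parse_iostat(SAMPLE_IOSTAT)))
```

Because of -z, an interval with no I/O shows up as an empty list, so quiet periods are still counted as intervals.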

You should also be aware that in pretty much all of the *stat commands, the first output is an average since boot. While that's good for a feel of the system average, it's not real useful when trying to get a feel for a particular time period.

Basically, a little bit of analysis can save you a considerable amount of money and then angst. I'd be failing in my role as a "trusted adviser" to recommend any other course. Yes, sure we'd like the money for customers buying more hardware, but for myself and many others, it's far more important for the customer to be a happy and returning customer rather than an unhappy one who is likely to look elsewhere for their next purchase due to such an experience.


* - Solaris and Network Domain, Technical Support Centre

Alan is a kernel and performance engineer based in Australia who tends to have the nasty calls gravitate towards him

