Vive la Revolution!

I've just come across a few thoughts by Paul Murphy on the old maxim "you get what you measure". In the article he observes that people generally maximise the performance of a system (system in the broadest sense) with respect to whatever parameter they can measure. How very, very sadly true this is of computer systems.

For years people have been hamstrung by the extremely restrictive data presented to them. Most of the literature has encouraged them to use this data in a very bottom-up manner to try to figure out what is happening on their systems. For example, they look at mpstat(1M) output, worry about numbers that look high, and automatically assume this is the source of their poor application performance. They are conditioned into what can be termed a pattern-matching approach to problem solving: they look for certain values, or sets of values, coming out of the standard tools and have a pre-defined set of problem causes that fits each pattern (a set usually populated from their own and their peers' experiences).

However, there are a few very serious and fundamental problems with the above approach.
  • Firstly, a "system" is just an N-tier application with the kernel as the bottom layers. When you use the standard tools (mpstat(1M), vmstat(1M), sar(1)) you are viewing just a small piece of the stack and, even then, a very restricted set of data for that part of the stack.
  • Tied very much to the above point, remember that applications drive system characteristics. On its own a kernel doesn't do much. If you want to understand why your system is performing the way it is, you need to move up a level and start with the applications.
  • John Donne said "No man is an island", and the same goes for your system(s). Thinking holistically is one of the keys to success when analysing the behaviour of a system. The inter- and intra-application interaction that occurs on a system is staggering, and when married with kernel interaction it becomes frightening. Software systems are complex and must be viewed as a whole to be understood.
  • Pattern matching has to be replaced with a scientific approach to problem solving: the hypotheses we make should be backed by data at all times. Make no assumptions and trust no-one. Presuppositions are everyone's worst enemy when investigating any kind of issue.
With discipline, all of the above could previously be adhered to apart from the last point, and that was the real problem. Extracting good data from the beast that is your N-tier application has historically been extraordinarily time-consuming and error-prone. Instrumenting the various layers in the stack, harvesting the data and then analysing it is what costs organisations around the world small fortunes. The iterative cycle that a problem solver follows goes like this:
hypothesis -> instrumentation -> data harvesting -> data analysis -> hypothesis
So we see that we use data to refine our hypothesis. The miracle of DTrace is that we can spend much, much more time on what we should be doing (forming hypotheses about system behaviour) and much, much less time and money on the truly mundane (instrumenting and analysing).
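By way of illustration (a sketch of mine, not from the article), a single DTrace one-liner can collapse the instrumentation, harvesting and first-cut analysis steps into one command. This hypothetical example counts system calls by application name, a natural top-down starting point for forming a hypothesis about which applications are driving the kernel:

```
# Hypothetical first probe: count system calls by executable name.
# (Requires root on a DTrace-enabled system such as Solaris.)
dtrace -n 'syscall:::entry { @calls[execname] = count(); }'
```

On interrupt (Ctrl-C), dtrace prints the aggregation: one row per executable with its syscall count, giving immediate data with which to confirm or refute the current hypothesis before refining the next probe.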

It's a brave new world indeed. Vive la Revolution!