Using Kernel Crash Dumps for Performance Analysis

Kernel Crash Dumps are a point-in-time snapshot of the Solaris kernel state. The aim is to allow post-mortem analysis of the system state at the point the crash dump was taken. For system panics and hangs, the ability to look at the system state is the primary failure analysis tool and one of the reasons Solaris is as reliable as it is.

I think of system failures as a two-dimensional problem. The interaction of data and code at the point in time of the failure can be analyzed with tools such as MDB, which are designed for this type of post-mortem analysis.

Performance adds a third dimension: time. A single snapshot cannot show how the system behaved before or after the moment it was taken.

Autopsy is not commonly used as a tool for determining the root cause of individual productivity issues. In a small subset of cases, poor individual productivity may be the result of a medical condition requiring a CAT scan (the medical version of a live Kernel Crash Dump). However, these cases are very rare and such techniques would only be used with a significant body of supporting evidence.

Kernel Crash Dumps are useful for a very small subset of performance cases. A performance problem rooted in a memory shortfall caused by a memory leak would be one example, but such cases are quite rare in the grand scheme of things, and supporting evidence would be needed to justify the Kernel Crash Dump approach.

I have come across a number of cases in the last few months where a crash dump was requested, and in only one of them was the request possibly valid.

Before collecting the CAT scan equivalent of your system (with the associated cost) in the hope that it shows up the cause of a performance problem, check the pulse, breathing and circulation first. If you do collect a live crash dump, make sure the supporting evidence and rationale are sound.

Comments:

Hi Clive, I've run into this before.

A couple of years ago it caused me to investigate the possibility of integrating a flight data recorder (FDR) kernel module into Solaris. The FDR would be a circular buffer, tunable by size or duration, into which all the kstats would be periodically stored.

This would be useful not only for helping the customers you mention in your post but would also provide supporting information for panics. If only every dead body carried an up-to-date diary!

It's still ongoing on a back burner somewhere.

Keith.

Posted by Keith Bond on June 02, 2008 at 06:42 AM BST
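
The circular buffer idea in the comment above can be sketched in a few lines. This is only a user-space illustration in Python, not the kernel module the comment describes: the class name FlightRecorder and the counter name freemem_pages are hypothetical, and a real recorder would read kstats rather than the made-up values used here.

```python
from collections import deque
import time


class FlightRecorder:
    """Fixed-capacity ring buffer of timestamped samples (illustrative only)."""

    def __init__(self, capacity):
        # A deque with maxlen drops its oldest entry once it is full,
        # which gives circular-buffer behaviour.
        self._samples = deque(maxlen=capacity)

    def record(self, stats, timestamp=None):
        """Store a copy of one sample, e.g. a dict of counter values."""
        if timestamp is None:
            timestamp = time.time()
        self._samples.append((timestamp, dict(stats)))

    def history(self):
        """Return the retained samples, oldest first."""
        return list(self._samples)


recorder = FlightRecorder(capacity=3)
for tick in range(5):
    # Hypothetical counter; a real recorder would sample kstats here.
    recorder.record({"freemem_pages": 1000 - 100 * tick}, timestamp=tick)

for timestamp, stats in recorder.history():
    print(timestamp, stats["freemem_pages"])
```

With a capacity of 3 and five samples recorded, only the last three survive (ticks 2, 3 and 4, with values 800, 700 and 600). That is the "diary" effect: the most recent history is always available, at a bounded memory cost.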
