Monday Jun 27, 2011

I have a performance problem

(copied from my wordpress blog).

So start 95% of the performance calls that I receive. They usually continue something like:

"I have gathered some *stat data for you (e.g. with the guds tool from Document 1285485.1). Can you please root cause our problem?"

So, do you think you could?

Neither can I. Based on this, my answer inevitably has to be "No".

Given this kind of problem statement, I have no idea about the expectations, the boundary conditions, or even the application. The answer may as well be "Performance problems? Consult your local Doctor for Viagra". It's really not a lot to go on.

So, what kind of problem description is going to allow me to start work on the issue that is being seen? I don't doubt that there really is an issue; it just needs to be pinned down somewhat.

What behavior exactly are you expecting to see?

Be specific and use business metrics, for example "run-time", "response-time" and "throughput".

This helps us define exit criteria.

Now, let's look at the system that is having problems.

How is what you are seeing different? Use the same type of metrics.

The answers to these two questions take us a long way towards being able to work a call.

Even more helpful are answers to questions like:

Has this system ever worked to expectation?

If so, when did it start exhibiting this behavior?

Is the problem always present, or does it sometimes work to expectation?

If it sometimes works to expectation, when are you seeing the problem? Is there any discernible pattern?

Is the impact of the problem getting better, worse, or remaining constant?

What kind of differences are there between when the system was performing to expectation and when it is not?

Are there other machines where we could expect to see the same issue (e.g. similar usage and load), but do not? Again, what are the differences?

Once we start to gather information like this, we start to build up a much clearer picture of exactly what we need to investigate, and what we need to achieve so that both you and I agree that the problem has been solved.
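For anyone who wants to keep the questions above handy when logging a call, they can be sketched as a small checklist. This is only an illustration: the class name, field names and the choice of which three fields count as the minimum are my own assumptions, not any official support form.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProblemStatement:
    """Hypothetical checklist mirroring the questions in this post."""
    metric: Optional[str] = None          # business metric, e.g. "response-time"
    expected: Optional[str] = None        # behavior you expect, in that metric
    observed: Optional[str] = None        # behavior you see, in the same metric
    ever_worked: Optional[bool] = None    # has it ever worked to expectation?
    started_when: Optional[str] = None    # when did the behavior start?
    always_present: Optional[bool] = None # constant, or only sometimes?
    pattern: Optional[str] = None         # any discernible pattern?
    trend: Optional[str] = None           # "better", "worse" or "constant"
    differences: Optional[str] = None     # good period vs. bad period
    healthy_peers: Optional[str] = None   # similar machines without the issue


# Assumption: the first two questions are the minimum needed to work a call.
REQUIRED = ("metric", "expected", "observed")


def missing(ps: ProblemStatement) -> list:
    """Names of the questions that have not been answered yet."""
    return [f.name for f in fields(ps) if getattr(ps, f.name) is None]


def workable(ps: ProblemStatement) -> bool:
    """True once expected and observed behavior are stated in one metric."""
    return all(getattr(ps, name) is not None for name in REQUIRED)
```

A statement such as ProblemStatement(metric="response-time", expected="under 2 s", observed="about 10 s") would pass workable(), while missing() would still list the follow-up questions worth answering.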

Please help get that figure of poorly defined problem statements down from its current 95% value.

Tuesday Mar 03, 2009

My CMT machine loads Oracle Databases slower than ..

This is more of an "Oh no not again" type post, ...

I am constantly amazed at the number of escalations that make it to the performance group with this as the problem.

It really is a case of unrealistic expectations and a lack of knowledge of what the machines excel at.

The most recent of these to cross my desk talks of a customer concerned that a dual-core 2.5GHz x86-based box loads data into an Oracle database much faster than his shiny new T5220.

Until such a time as Oracle makes their SQL Loader run multi-threaded (which may bring in problems all of their own) this will always be the case.

The design of the system is such that it will run single-threaded applications much more slowly than its x86 counterparts. These machines, however, come into their own once we enter production and start getting lots of parallel requests on the database. As we are running far more CPUs, the load on the database must be much higher before we start to see any significant degradation.
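To make the trade-off concrete, here is a deliberately crude toy model. The per-job times are invented purely for illustration and are not measurements of any real machine; the only grounded figures are the two cores of the x86 box mentioned above and the up to 64 hardware threads of a T5220. It also assumes equal, independent jobs with no I/O or lock contention, which no real database workload satisfies.

```python
import math


def batch_time(jobs: int, hw_threads: int, secs_per_job: float) -> float:
    """Wall-clock time for `jobs` equal, independent jobs when each
    hardware thread runs one job at a time (toy model, no contention)."""
    return math.ceil(jobs / hw_threads) * secs_per_job


# Hypothetical figures: the x86 thread is assumed 4x faster per job.
X86 = dict(hw_threads=2, secs_per_job=1.0)    # dual-core box
CMT = dict(hw_threads=64, secs_per_job=4.0)   # T5220 with 64 threads

for jobs in (1, 8, 64, 256):
    print(jobs, batch_time(jobs, **X86), batch_time(jobs, **CMT))
```

With these made-up numbers the single job (the SQL Loader case) finishes four times sooner on the x86 box, the two machines draw level at eight concurrent jobs, and beyond that the CMT machine pulls further ahead as the request count grows.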

The question that really should be asked here is, "Where do you want your performance? In the database load that you will do once, or in responding to production queries?"

About

* - Solaris and Network Domain, Technical Support Centre


Alan is a kernel and performance engineer based in Australia who tends to have the nasty calls gravitate towards him
