DProfile - The Scalability Infrastructure

We have talked about all the Perspectives, Insight and Knowledge available with DProfile. The purpose of all this technology is scalability.

As outlined in the Getting Ready for CMT section, scalability is key, and keeping all hardware structures equally utilized is the cornerstone of scalability: keeping all Virtual Processors busy, keeping all Memory Boards equally busy, and using your caches and banks uniformly.

My previous entry showed how you can profile all of these hardware components by teaching Sun Studio 11 Performance Analyzer about Perspectives.

Here, I'd like to show you some pseudo-code fragments that, when run in multiple Software Threads within a Process, cause a scalability problem:

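for (;;) {
  spin_lock(&lock);        /* every Thread serializes on this one lock */
  if (var++ > limit) {     /* var is shared by all Threads */
    spin_unlock(&lock);    /* release before leaving the loop */
    break;
  }
  spin_unlock(&lock);
}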

There is only one central lock that guards the shared variable var. I chose this obvious case because its Data Movement Profile is dominated by one cache line taking more than 95% of the time across all Threads.
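To make the fragment concrete, here is a minimal sketch of the same loop in compilable C. It assumes a C11 compiler; the spinlock_t type, the spin_lock and spin_unlock helpers, the limit value and the function name count_to_limit are all hypothetical and only stand in for whatever lock the real program uses. Thread creation is left out, so the function is simply the body each Software Thread would run.

```c
#include <stdatomic.h>

/* Hypothetical spinlock built on C11 atomic_flag (an assumption, not the
   lock used in the original example). */
typedef struct { atomic_flag held; } spinlock_t;

static void spin_lock(spinlock_t *l)
{
    /* Spin until the flag was previously clear, i.e. we acquired it. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static spinlock_t lock = { ATOMIC_FLAG_INIT };  /* the single central lock */
static long var = 0;                            /* shared by all Threads   */
static const long limit = 1000;                 /* illustrative value      */

/* Loop body run by each Thread. Returns how many increments this caller
   performed before the shared counter passed the limit. */
long count_to_limit(void)
{
    long mine = 0;
    for (;;) {
        spin_lock(&lock);
        if (var++ > limit) {
            spin_unlock(&lock);
            break;
        }
        spin_unlock(&lock);
        mine++;
    }
    return mine;
}
```

Both the lock word and var are written on every iteration by every Thread, so the cache lines holding them keep moving between processors, which is the behavior the profile makes visible.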

Here is another pseudo-code fragment, running in multiple Software Threads, that exhibits the same scalability problem:

   for (;(array[index]++ <= limit);) ;

The array is global, with a dedicated index for every Virtual Thread. This is not as obvious, but this case has the same Data Movement Profile as the previous example: one cache line holding the entire array is being passed among all Threads, and is taking more than 95% of the time. This is false sharing, the next inhibitor to scalability after lock contention.
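The layout problem can be checked with a few lines of C. The sketch below is only an illustration under stated assumptions: a 64-byte cache line (the real size depends on the hardware), a C11 compiler for _Alignas, and hypothetical names (packed, padded_slot, padded, distinct_lines). It counts how many distinct cache lines a set of per-Thread counters occupies, first packed together as in the fragment above, then with each counter aligned to its own line.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size; check your hardware */
#define NTHREADS   8

/* Packed layout: all per-Thread counters sit next to each other. The
   alignment is forced here only so the result is deterministic. */
static _Alignas(CACHE_LINE) long packed[NTHREADS];

/* Padded layout: aligning the member rounds the struct size up to one
   full cache line, so each Thread's counter gets a line of its own. */
struct padded_slot { _Alignas(CACHE_LINE) long value; };
static struct padded_slot padded[NTHREADS];

/* Count the distinct cache lines in which n elements start, given the
   base address and the distance in bytes between elements. */
size_t distinct_lines(const void *base, size_t stride, size_t n)
{
    size_t count = 0;
    uintptr_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        uintptr_t line = ((uintptr_t)base + i * stride) / CACHE_LINE;
        if (i == 0 || line != prev)   /* addresses only increase */
            count++;
        prev = line;
    }
    return count;
}
```

Calling distinct_lines(packed, sizeof packed[0], NTHREADS) and distinct_lines(padded, sizeof padded[0], NTHREADS) shows the difference: the packed counters share a line that every Thread writes, while the padded counters do not. Padding trades memory for the absence of line transfers, so it is worth applying only where a profile shows the line is hot.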

While the first example is easy to detect with lockstat and DProfile, only DProfile can identify the second example.

Any time DProfile shows that only a few hardware elements within a structure are doing most of the work, there is a scalability problem that DProfile can help resolve.

Here are some examples:

One Memory Board used more frequently than others.
One Bank used more than others.
A group of Cache Lines used more than others.
One thread being used more frequently than others. (This is termed skew in HPC and DSS circles.)

With DProfile you can select each of these objects, Filter, and then identify which Software View components are responsible for the uneven utilization.
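One simple way to put a number on the imbalance described above is to compare the busiest element against the average. The helper below is a hypothetical sketch, not something DProfile provides: it takes per-element usage counts (per Memory Board, Bank, cache line or thread, however you obtained them) and returns the ratio of the maximum count to the mean.

```c
#include <stddef.h>

/* Ratio of the busiest element's count to the mean count.
   1.0 means perfectly uniform use; larger values mean more skew.
   Returns 0.0 for an empty or all-zero input. */
double skew_ratio(const unsigned long *counts, size_t n)
{
    unsigned long max = 0, total = 0;
    for (size_t i = 0; i < n; i++) {
        if (counts[i] > max)
            max = counts[i];
        total += counts[i];
    }
    if (n == 0 || total == 0)
        return 0.0;
    return (double)max * (double)n / (double)total;
}
```

For four elements, counts of 10, 10, 10, 10 give a ratio of 1.0, while counts dominated by a single element push the ratio toward 4, the number of elements.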

