UltraSPARC T2 Features for DProfile

This week Sun released new UltraSPARC T2 based systems. UltraSPARC T2 enhances support for DProfile by adding cache miss reporting, the next step in the ongoing improvement of DProfile. On UltraSPARC T2, Performance Analyzer reports cache miss metrics as additional bar graphs, such as the turquoise bars (third from the top) shown here:

UltraSPARC T2 added precise trapping on the data and instruction cache miss performance counters for DProfile. The specific counters are described in the OpenSPARC T2™ Supplement on page 81. The cache miss selector is sl=3.

The new hardware counters include: Icache misses, Dcache misses, L2 cache instruction misses, and L2 cache load misses. These are available in the Sun Studio 12 Performance Analyzer collect command with these options:

usage:  collect  target 
       Sun Analyzer 7.7 SunOS_sparc 2007/10/09
...
Specifying HW counters on `UltraSPARC T2':
    == [+][~=]...[~=][/][,]
      <+>
         for memory-related counters, attempt to backtrack to find
         the triggering instruction and the virtual and physical
         addresses of the memory reference
...
Well-known HW counters available for profiling:
   icm[/{0|1}],100003 (`I$ Misses', alias for IC_miss; load-store events)
   itlbm[/{0|1}],100003 (`ITLB Misses', alias for ITLB_miss; load-store events)
   ecim[/{0|1}],10007 (`E$ Instr. Misses', alias for L2_imiss; load-store events)
   dcm[/{0|1}],100003 (`D$ Misses', alias for DC_miss; load-store events)
   dtlbm[/{0|1}],100003 (`DTLB Misses', alias for DTLB_miss; load-store events)
   ecdm[/{0|1}],10007 (`E$ Data Misses', alias for L2_dmiss_ld; load-store events)
...
Raw HW counters available for profiling:
...
   IC_miss[/{0|1}],1000003 (load-store events)
   DC_miss[/{0|1}],1000003 (load-store events)
   L2_imiss[/{0|1}],1000003 (load-store events)
   L2_dmiss_ld[/{0|1}],1000003 (load-store events)
...
   ITLB_miss[/{0|1}],1000003 (load-store events)
   DTLB_miss[/{0|1}],1000003 (load-store events) 
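
To make the counter syntax above concrete, here is a minimal sketch that assembles a counter definition string of the form `[+]name[/register][,interval]`, following the listing. The helper name `counter_spec` and its parameters are hypothetical, and the sketch leaves out the `~attribute=value` part of the syntax. As I recall, the resulting string is passed to collect's `-h` option, but check `collect` on your own system before relying on that.

```python
from typing import Optional


def counter_spec(name: str,
                 backtrack: bool = True,
                 register: Optional[int] = None,
                 interval: Optional[int] = None) -> str:
    """Build a HW counter definition such as '+ecdm,10007'.

    Hypothetical helper: it only mirrors the '[+]name[/reg][,interval]'
    shape shown in the collect usage listing.
    """
    if register is not None and register not in (0, 1):
        # The listing shows only registers 0 and 1 on UltraSPARC T2.
        raise ValueError("register must be 0 or 1")
    if interval is not None and interval <= 0:
        raise ValueError("interval must be a positive integer")

    spec = "+" if backtrack else ""  # '+' requests backtracking for memory counters
    spec += name
    if register is not None:
        spec += f"/{register}"
    if interval is not None:
        spec += f",{interval}"
    return spec


if __name__ == "__main__":
    # E$ data misses with backtracking, using the interval from the listing.
    print(counter_spec("ecdm", interval=10007))
    # D$ misses on register 1 without backtracking.
    print(counter_spec("dcm", backtrack=False, register=1, interval=100003))
```

The leading `+` is what asks collect to backtrack to the triggering instruction and the memory address, which is the part DProfile needs.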

Note that the sl=2 counters do not reliably support DProfile. I do not recommend enabling them.

With these performance metrics, miss rates can be measured within objects. In the graph above, we're displaying L2 read misses per second. No recompilation is necessary!
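
As a rough illustration of how a per-second miss rate follows from overflow profiling: each recorded profile event stands for roughly one overflow interval's worth of counter events, so the estimate is events times interval, divided by elapsed time. The function name and the sample numbers below are hypothetical; Performance Analyzer does this arithmetic for you, and this is only a sketch of the idea.

```python
def estimated_misses_per_second(profile_events: int,
                                interval: int,
                                seconds: float) -> float:
    """Estimate a miss rate from HW counter overflow samples.

    Assumes each profile event represents `interval` counter events
    (hypothetical helper, not part of any Sun Studio API).
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    estimated_misses = profile_events * interval
    return estimated_misses / seconds


if __name__ == "__main__":
    # Made-up example: 250 ecdm profile events attributed to one object,
    # interval 10007, over a 5 second run.
    rate = estimated_misses_per_second(250, 10007, 5.0)
    print(rate)  # 500350.0
```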
