
Everything you want and need to know about Oracle SPARC systems performance

Event Monitoring on SPARC M8

Martin Mueller
Senior Principal Software Engineer

Some time ago I published an article on monitoring DAX activity on SPARC M8 (to a certain degree, that article also applies to SPARC M7 and S7). This article should really have preceded the one on the DAX, as it targets the SPARC M8 core itself.

The goal is to give you a better understanding of the plethora of performance-relevant events that a SPARC M8 CPU can measure or count. Like all SPARC CPUs (Intel CPUs have similar event counters), SPARC M8 provides a lot of information about what is happening at any stage of code processing. Most of the counters require in-depth knowledge of how the pipeline works, but some are of general interest to anyone who analyzes the performance of a workload on SPARC M8.

Side note: there is also a large number of counters that monitor other system components such as memory or I/O controllers (the DAX, for example, is monitored in that context). None of these are covered in this article.

PAPI Events and CPC Registers

There is a class of general-purpose events called PAPI_ events, and some of them are implemented in Solaris' cpustat command. It is important to note that SPARC M8 can count or monitor up to four different events at the same time:

cpustat -c pic0=<event>,pic1=<event>,pic2=<event>,pic3=<event> \
  interval [count]

(You do not need to specify all four, and of course they can all be different.)
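As a minimal sketch of how such a command line is put together, the following Python helper assembles the argument list from up to four event names. The function name build_cpustat_args is hypothetical and only mirrors the synopsis above; it does not run cpustat and does not check whether a given event is valid on a given counter.

```python
def build_cpustat_args(events, interval, count=None):
    """Return an argv list for cpustat from one to four event names.

    Hypothetical helper: it only formats the command shown in the
    synopsis above and does not validate the event names themselves.
    """
    if not 1 <= len(events) <= 4:
        raise ValueError("specify between one and four events")
    # Assign the events to pic0..pic3 in the order given.
    spec = ",".join(f"pic{i}={name}" for i, name in enumerate(events))
    args = ["cpustat", "-c", spec, str(interval)]
    if count is not None:
        args.append(str(count))
    return args


# Count instructions and cycles once per second, ten times.
print(" ".join(build_cpustat_args(["PAPI_tot_ins", "PAPI_tot_cyc"], 1, 10)))
```

The resulting list could be handed to subprocess.run on a Solaris system, run from the root role as described further down.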

Here's my personal list of the most important events:

Event Name                    Meaning
PAPI_tot_ins, PAPI_tot_cyc    Total instructions / cycles (indicates overall load on the pipeline)
PAPI_fp_ops, PAPI_fp_ins      Total FP ops / FP instructions (amount of FP processing)
PAPI_ld_ins, PAPI_sr_ins      Total load / store instructions (indicates the amount of memory I/O caused directly by the code)
PAPI_l1_dcm, PAPI_l1_icm      L1D-cache / L1I-cache misses (if high, try to shrink the working set)
PAPI_tlb_dm, PAPI_tlb_im      TLB misses (if high, try to increase the page size)
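Raw counts become easier to compare once they are normalized. The sketch below derives two common ratios from the events listed above: instructions per cycle, and misses per thousand instructions. The function names and the sample numbers are made up for illustration and are not measurements from a SPARC M8 system.

```python
def instructions_per_cycle(tot_ins, tot_cyc):
    """PAPI_tot_ins divided by PAPI_tot_cyc; 0.0 if no cycles were counted."""
    return tot_ins / tot_cyc if tot_cyc else 0.0


def per_kilo_instructions(event_count, tot_ins):
    """Events (e.g. PAPI_l1_dcm or PAPI_tlb_dm) per 1000 instructions."""
    return 1000.0 * event_count / tot_ins if tot_ins else 0.0


# Illustrative values only, standing in for one cpustat interval.
sample = {
    "PAPI_tot_ins": 2_000_000,
    "PAPI_tot_cyc": 4_000_000,
    "PAPI_l1_dcm": 10_000,
}

print(instructions_per_cycle(sample["PAPI_tot_ins"], sample["PAPI_tot_cyc"]))
print(per_kilo_instructions(sample["PAPI_l1_dcm"], sample["PAPI_tot_ins"]))
```

Normalizing by instruction count makes it possible to compare intervals or workloads that executed very different amounts of code.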

Let's look at an example output. In general, you need to assume the root role to get access to these counters:

 

(This example also shows you how to aggregate events per core, which in general is a good idea on the highly threaded SPARC M8 core. All eight threads on a core share the core's resources.)
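The idea of per-core aggregation can be sketched in a few lines of Python: sum the per-thread counts of all threads that belong to the same core. This is not necessarily how the example above performs the aggregation. The sketch assumes that the eight threads of a core have consecutive CPU IDs (0 to 7 on core 0, 8 to 15 on core 1, and so on), which should be verified on the actual system; the function name and the sample counts are hypothetical.

```python
from collections import defaultdict

THREADS_PER_CORE = 8  # from the article: eight threads share one core


def aggregate_per_core(per_cpu_counts, threads_per_core=THREADS_PER_CORE):
    """Sum per-thread event counts into per-core totals.

    Assumption: CPU IDs are numbered consecutively within a core, so
    integer division by the thread count yields the core index.
    """
    per_core = defaultdict(int)
    for cpu_id, count in per_cpu_counts.items():
        per_core[cpu_id // threads_per_core] += count
    return dict(per_core)


# Illustrative per-thread counts keyed by CPU ID.
sample = {0: 100, 1: 150, 7: 50, 8: 400, 15: 100}
print(aggregate_per_core(sample))  # {0: 300, 1: 500}
```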

If you are only interested in the degree of saturation of the integer execution units of SPARC M8, the pgstat command is all you need.

 

The column entitled HW shows you the ratio of processed instructions to the theoretical maximum. Anything beyond 60% can be considered close to overloading this particular core.
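The arithmetic behind that rule of thumb is simple enough to sketch. The helpers below compute a utilization percentage from a processed-instruction count and a theoretical maximum, then apply the 60% threshold mentioned above. The names are hypothetical, the theoretical maximum is left as an input because the article does not state it, and this is not a description of how pgstat itself derives the HW column.

```python
def hw_utilization(processed, theoretical_max):
    """Processed instructions as a percentage of the theoretical maximum."""
    if theoretical_max <= 0:
        raise ValueError("theoretical_max must be positive")
    return 100.0 * processed / theoretical_max


def near_overload(util_percent, threshold=60.0):
    """True if utilization exceeds the article's 60% rule of thumb."""
    return util_percent > threshold


# Illustrative numbers only.
util = hw_utilization(3_000, 4_000)
print(util, near_overload(util))  # 75.0 True
```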

If you invoke cpustat -h, you will be swamped with events that SPARC M8 can count. Nearly all of them are of interest only to the developers of these CPUs (and this is not restricted to the SPARC M8 CPU).

If you want to dig deeper and use the non-generic events you might be able to figure out their meaning from their names, but there is no publicly accessible documentation of these events.

Solaris WebUI and CPU Events

Solaris WebUI, the graphical interface to Solaris' StatsStore, visualizes some of the statistics presented above. The intro diagram shows a graphical representation of the number of "integer" and floating-point instructions, and these are the only ones visualized via WebUI. The corresponding SSIDs (the unique identifiers of a statistic in Solaris' StatsStore) are //:class.cpu//:stat.integer-pipe-usage//:op.rate//:op.util and //:class.cpu//:stat.fpu-usage//:op.rate. (Please note the different aggregations used: percentage of the maximum for the integer load, and total number of instructions in the floating-point case.)
