Using Solaris Resource Management Utilities to Improve Application Performance

SPECjAppServer2004 is a complex benchmark produced by the Standard Performance Evaluation Corporation (SPEC). Servers measured with this benchmark exercise all major Java 2 Enterprise Edition (J2EE) technologies, including transaction management, database connectivity, web containers, and Enterprise JavaBeans. The benchmark heavily exercises the hardware, software, and network, as hundreds or thousands of JOPS (jAppServer Operations Per Second) are loaded onto the Systems Under Test (SUTs).

This article introduces some of the Solaris resource management utilities that are used for this benchmark. These utilities may be useful to system managers who are responsible for complex servers. The author has applied these features to improve performance when using multiple instances of J2EE application server software with the SPECjAppServer2004 benchmark.

In SPECjAppServer2004 benchmark results submitted by Sun Microsystems, you can find references to Solaris Resource Management features such as Containers, Zones, Processor sets, and Scheduling classes. The recently published results for the Sun Fire T5440 and the Sun Fire T5140 servers use many of these features.

Solaris Resource Management utilities are used to provide isolation of applications and better management of system resources. A number of publications describe these features and their benefits; the Sun Solaris Container Administration Guide and the Sun Zones Blueprint are two of many good sources of information.

Solaris Containers

In the first benchmark publication listed above, the Sun Fire T5440 server was configured with 8 Solaris Containers, where each container, or zone, was set up to host a single application server instance. By hosting an application server instance in a container, the memory and network resources used by that instance are virtually isolated from those used by other instances running in separate containers.
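As a minimal sketch, a container of this kind can be created with zonecfg(1M) and zoneadm(1M); the zone name, zonepath, network interface, and IP address below are illustrative and are not the exact configuration used in the published results:

% zonecfg -z appzone1 "create; set zonepath=/zones/appzone1"
% zonecfg -z appzone1 "add net; set physical=e1000g0; set address=192.168.1.11; end"
% zoneadm -z appzone1 install
% zoneadm -z appzone1 boot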

While running the application software in a zone does not directly increase performance, using Containers with this benchmark workload makes it easier to manage multiple J2EE instances. Combined with the techniques below, Solaris Containers provide an effective environment for improving application performance.

Note that many Solaris performance utilities can be used to monitor and report process information for the configured zones, such as prstat with the -Z option.
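For example, prstat can summarize activity per zone or report only the processes running in a single zone (the zone name and sampling interval below are illustrative):

% prstat -Z 5           # per-zone CPU and memory summary, refreshed every 5 seconds
% prstat -z ZONENAME 5  # show only the processes running in zone ZONENAME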

Processor Sets

The System Administration Guide for Solaris Containers discusses the use of Resource Pools to partition machine resources. A resource pool is a configuration mechanism that defines a processor set, optionally combined with a scheduling class, which can then be assigned to a zone. When configuring a resource pool, the administrator specifies the minimum and maximum CPU resources for the pool, and the system creates the processor set from this information. The resource pool can then be assigned to a specific zone using the zonecfg(1M) utility. However, in some scenarios the processor IDs selected for the resource pool may span multiple CPU chips, and thus may not make the most efficient use of caches or access to local memory.
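A minimal sketch of creating a resource pool backed by a 32-CPU processor set and assigning it to a zone might look like the following; the pool, processor set, and zone names are illustrative:

% pooladm -e        # enable the resource pools facility
% pooladm -s        # save the current configuration to /etc/pooladm.conf
% poolcfg -c 'create pset appsrv-pset (uint pset.min = 32; uint pset.max = 32)'
% poolcfg -c 'create pool appsrv-pool'
% poolcfg -c 'associate pool appsrv-pool (pset appsrv-pset)'
% pooladm -c        # activate the edited configuration
% zonecfg -z appzone1 "set pool=appsrv-pool"
% poolbind -p appsrv-pool -i zoneid appzone1   # bind the running zone without a reboot

The zonecfg binding takes effect the next time the zone boots; poolbind(1M) can be used, as shown, to move an already running zone into the pool.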

For the configurations in the published results, each Solaris Container was bound to a unique processor set, where each processor set was composed of 4 UltraSPARC T2 Plus cores. Since each UltraSPARC T2 Plus core consists of 8 hardware strands, each CPU chip was partitioned into two processor sets of 32 processor IDs each. The processor sets were created by specifying the 32 processor IDs as an argument to the psrset(1M) command, as shown in the following example:

% psrset -c 32-63

The command above instructs Solaris to create a processor set using virtual processor numbers 32 through 63, which correspond to 4 cores of an UltraSPARC T2 Plus CPU chip. With a total of four UltraSPARC T2 Plus CPU chips, the Sun Fire T5440 system was configured to use 7 processor sets of 4 cores each. The remaining 4 cores (virtual processor numbers 0-31) remained in the default processor set, as there must be at least 1 virtual processor ID in the default set.
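Before carving processor sets by hand, it can help to confirm which virtual processor IDs belong to which physical chip and which sets already exist; psrinfo(1M) and psrset(1M) can report this:

% psrinfo -pv   # list each physical processor and its virtual processor IDs
% psrset -i     # list the processors currently assigned to each processor set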

Looking at the Sun Fire T5440 system architecture, each UltraSPARC T2 Plus CPU chip has 4 MB of L2 cache shared by all 8 cores in the chip. Each UltraSPARC T2 Plus CPU also has direct links to 16 DIMM slots of local memory, with access to the remaining (remote) memory DIMMs going through an External Coherency Hub. References to local memory are generally slightly faster, as any data access through an External Coherency Hub incurs a small added latency, as Denis indicates. This combination of CPU hardware and physically local memory is treated by Solaris as a Locality Group. Solaris attempts to allocate physical memory pages from the Locality Group associated with the CPU executing the application process or thread. To help reduce latency for data accesses by an application, processor sets are a simple and effective means to co-locate data accesses within an L2 cache and a Locality Group boundary.
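If the lgrpinfo(1) observability tool is available on your Solaris release, it can be used to view the Locality Group hierarchy, including which CPUs and how much memory belong to each group:

% lgrpinfo      # print the lgroup hierarchy with its CPUs and memory sizes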

Using a Container with a specific processor set requires binding the processes running in the Container to that processor set. This can be done using the pgrep and psrset commands. Use pgrep -z ZONENAME to obtain the list of process IDs currently running in the specified zone, then use psrset -b PSET PID to bind each of those process IDs to the specified processor set, as shown in the following example:

% for PID in `pgrep -z ZONENAME`;  do psrset -b PSET_ID $PID;  done
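Since psrset -b accepts a list of process IDs, the loop can also be collapsed into a single invocation (ZONENAME and PSET_ID remain placeholders):

% psrset -b PSET_ID `pgrep -z ZONENAME`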

Scheduling Class

Solaris offers a number of different process scheduling classes for executing user processes, administered using the dispadmin(1M) and priocntl(1M) utilities. The default is the Time Sharing (TS) scheduling class; however, many benchmark results have made use of the Fixed Priority (FX) scheduling class. The dispadmin command can be used to list the classes supported on the system along with their associated priority and time quantum parameters. Processes normally running in the TS class can be run in the FX class using the priocntl command with either of the following methods:

% priocntl -e -c FX <COMMAND> <ARGS>

or

% priocntl -s -c FX -i pid <PID>

The first form executes a command in the FX class; the second changes the scheduling class of an already running process by its process ID.

The article FX for Databases discusses this subject for the database application space in some detail. Similar considerations apply to J2EE application software. Running the application server instances in the FX scheduling class has been shown to reduce the number of context switches and help improve overall throughput.
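As a sketch, the classes configured on a system and the FX dispatcher table can be inspected with dispadmin, and all processes already running in a zone can be moved to FX in one step using the zoneid idtype of priocntl (ZONEID is a placeholder; the numeric zone ID is shown by zoneadm list -v):

% dispadmin -l          # list the scheduling classes configured on the system
% dispadmin -c FX -g    # display the FX dispatcher parameter table
% priocntl -s -c FX -i zoneid ZONEID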

Additional Sources:

Solaris™ Internals: Solaris 10 and OpenSolaris Kernel Architecture, Second Edition, by Richard McDougall and Jim Mauro

Solaris Best Practices

Disclosure:

SPEC and SPECjAppServer are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 6/10/09.

