Java Performance: Solaris 10 x86 vs. Linux

Solaris 10 screams running Java. Competitive benchmarks do a good job of highlighting this; just take a look at the latest SPECjbb2005 and SPECjappserver2004 results. I have noticed some fundamental differences in "out of the box" tuning when comparing Solaris and Linux. When running Java server applications, Solaris 10 default tuning is general purpose and tuned for moderate thread counts, similar to a time-shared system. This in many ways is an indication of the maturity of the platform. Linux, on the other hand, is specifically tuned for high thread counts, and performance suffers when running low thread counts.

A good example of this behavior can be seen by comparing SPECjbb2005 results. Below are two results run on the exact same hardware, differing only in the OS and minor JVM tuning (the heap tuning has minimal performance impact).

  SPECjbb2005 on Sun Fire X4200 running Solaris 10 Update 1: 49,097 SPECjbb2005 bops, 49,097 SPECjbb2005 bops/JVM
  SPECjbb2005 on Sun Fire X4200 running Red Hat EL 4: 43,076 SPECjbb2005 bops, 43,076 SPECjbb2005 bops/JVM

Running SPECjbb2005 on identical hardware with optimal tuning parameters, Solaris 10 is 14% faster than Linux (49,097 / 43,076 ≈ 1.14). SPECjbb2005 on small x64 hardware runs only a moderate number of threads; in the above example the peak application thread count is 8.

What tuning can be applied when running high thread counts on Solaris 10 x86? Here are two quick tuning steps you can try with your application.

  1. If you're running many threads and performing socket I/O, try libumem.so. When launching your application from a shell script, set the following environment variable:

       LD_PRELOAD=/usr/lib/libumem.so;export LD_PRELOAD

  2. Tune the Solaris scheduler. Simple scheduler tuning can yield significant performance gains, especially with highly threaded, short-lived applications.

     Try the FX scheduling class:
       priocntl -c FX -e java class_name
     Try the IA scheduling class:
       priocntl -c IA -e java class_name

Every application is different, and true performance is always defined by each individual running their own application. If you run into problems or have questions about Java on Solaris performance, visit the java.net performance forum or feel free to send me a comment.

Fine print SPEC disclosure: SPECjbb2005 Sun Fire X4200 on Solaris 10 (2 chips, 4 cores, 4 threads) 49,097 bops, 49,097 bops/JVM; SPECjbb2005 Sun Fire X4200 on Red Hat EL 4 (2 chips, 2 cores, 2 threads) 43,076 bops, 43,076 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 17, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.
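For anyone who wants to try both tuning steps from the post together, here is a minimal sketch of a launcher function. It is only an illustration: the function name run_tuned and the DRY_RUN variable are made up for this example, and the only real commands it relies on are the ones quoted in the post (the libumem.so preload and priocntl with the FX or IA class). By default it only prints the command it would run, so it can be checked on a machine without priocntl.

```shell
#!/bin/sh
# Sketch of a launcher combining the two tuning steps from the post.
# run_tuned and DRY_RUN are hypothetical names used for illustration only.

# Usage: run_tuned CLASS USE_LIBUMEM java_args...
#   CLASS        scheduling class to request, FX or IA
#   USE_LIBUMEM  "yes" to preload /usr/lib/libumem.so, anything else to skip it
run_tuned() {
    class=$1
    umem=$2
    shift 2

    # Only the two scheduling classes mentioned in the post are accepted.
    case $class in
        FX|IA) ;;
        *) echo "unsupported scheduling class: $class" >&2
           return 1 ;;
    esac

    # Build the command line as positional parameters so arguments
    # containing spaces survive intact when the command is executed.
    if [ "$umem" = "yes" ]; then
        set -- env LD_PRELOAD=/usr/lib/libumem.so priocntl -c "$class" -e java "$@"
    else
        set -- priocntl -c "$class" -e java "$@"
    fi

    # Print the command by default; set DRY_RUN=no to actually launch it.
    if [ "${DRY_RUN:-yes}" = "yes" ]; then
        echo "$*"
    else
        exec "$@"
    fi
}
```

Calling run_tuned FX yes class_name in the default dry-run mode prints the full command line. Running it with DRY_RUN=no on a Solaris 10 system replaces the shell with the JVM under the requested scheduling class. Whether FX, IA, or libumem helps is something to measure with your own application.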
Comments:

Just to be redundant - you say the default JVM parameters on Linux and Solaris differ, but the benchmark reflects optimal JVM parameters on identical hardware? The result is interesting, but leads into the next round of questions.
  1. Is the difference (presumably not) in the JVM?
  2. Is the difference in the general OS (say in thread switching)?
  3. Is the difference in the network interface (a more efficient stack)?
  4. Is the difference in the drivers (maybe especially good for Solaris, not so good for Linux, less favorable on non-Sun hardware)?

Posted by Preston L. Bannister on February 17, 2006 at 10:17 AM EST #

Thanks for your comments.
  1. The only difference in the JVM is the tuning options. Solaris allows a larger contiguous memory space for the Java heap, allowing a setting of 3gb. Linux only allows 1500m. This is less than 1% of the score difference. No other JVM differences.
  2. You're close here. It seems to be the OS scheduling policies. On low thread counts Solaris is significantly faster, on high thread counts the gap narrows a bit.
  3. No network interface difference. The Solaris network stack is more efficient, but isn't a part of SPECjbb2005. SPECjappserver2004 on the other hand has significant network overhead.
  4. Not likely with SPECjbb2005 as the benchmark does not stress network.

Posted by dagastine on February 17, 2006 at 11:39 AM EST #

I made a few errors in my response to your comments.
  1. There is very little difference in the tuning options; the only difference is specifying large pages on Linux, which are on by default in Solaris.
  2. Confusing answer. The difference could be in the scheduler. In the SPECjbb2005 case the thread counts are low. With applications with high thread counts, such as Volano, the gap narrows and Linux performance improves.

Posted by dagastine on February 21, 2006 at 07:16 AM EST #

Just to nail things down - you say that dropping the Solaris contiguous memory space for the Java heap down to 1500m only takes a ~1% notch off performance? Interesting stuff - Thanks.

Posted by Preston L. Bannister on February 23, 2006 at 06:35 AM EST #
