Monday Oct 16, 2006

No Tuning Required: Java SE Out-of-Box Vs. Tuned Performance

In my last entry, Java SE Out-of-Box Competitive Performance, I stressed how important out-of-box performance is to customers and developers, and how it is a passionate focus for Sun HotSpot and JVM performance engineering. The following is a comparison of out-of-box and hand-tuned performance. The charts below were produced on the same system as my previous entry and are normalized to the same baseline, so the two sets of charts are directly comparable.

I have to say the numbers are quite impressive (hence the "No Tuning Required" in the title). My colleagues are going to say I'm blogging us out of a job :-).

  • On SPECjbb2005 the numbers are impressive. JDK 5.0_08 is ~22% faster when tuned compared to JDK 5.0_08 right out of the box. JDK 6 is ~11% faster when tuned versus right out of the box, and JDK 6 out of the box is only ~7% slower than a highly tuned JDK 5.0_08. Very impressive indeed!
  • On SciMark, tuning improved results only slightly when running JDK 5.0_08. With JDK 6 it is more or less a wash.
  • On Volano, except when running JDK 5.0_08 64-bit, the out-of-the-box configuration seems to work well, and doesn't require any explicit tuning.
The system under test has two dual-core Opteron 280 processors (2 CPUs, 4 cores, 2.4 GHz) and 8GB of RAM. The operating system is Red Hat EL 4.0 AS Update 4. The kernel version is unmodified from the base install, which is 2.6.9-42.ELsmp. The charts are statistical comparisons: no fewer than 10 samples were taken, and a single-tailed t-test was used to ensure confidence in the significance of the result. The data is normalized to the 32-bit Sun JDK 1.5.0_08 out-of-box result.
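For readers curious what the statistical comparison looks like in practice, below is a minimal sketch in Java of the kind of check involved: it computes a one-tailed Welch's t statistic for two sets of scores and normalizes the tuned mean against the baseline mean. This is purely illustrative; it is not the actual analysis tooling used for these charts, and the sample numbers in it are made up.

    // Illustrative sketch only: compares two sets of benchmark scores with a
    // one-tailed Welch's t statistic and reports the result normalized to the
    // baseline mean. The sample numbers below are made up, not measured data.
    public class ScoreComparison {

        static double mean(double[] xs) {
            double sum = 0.0;
            for (double x : xs) sum += x;
            return sum / xs.length;
        }

        static double variance(double[] xs) {
            double m = mean(xs), sum = 0.0;
            for (double x : xs) sum += (x - m) * (x - m);
            return sum / (xs.length - 1);   // sample variance
        }

        // Welch's t statistic; compare it against a one-tailed critical value
        // from a t table to decide whether the difference is significant.
        static double welchT(double[] a, double[] b) {
            double se = Math.sqrt(variance(a) / a.length + variance(b) / b.length);
            return (mean(a) - mean(b)) / se;
        }

        public static void main(String[] args) {
            double[] baseline = {100.1, 99.7, 100.4, 99.9, 100.2, 100.0, 99.8, 100.3, 100.1, 99.9};
            double[] tuned    = {121.8, 122.4, 121.5, 122.1, 121.9, 122.3, 121.7, 122.0, 121.6, 122.2};

            System.out.println("normalized tuned score: " + mean(tuned) / mean(baseline));
            System.out.println("Welch t statistic:      " + welchT(tuned, baseline));
        }
    }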

The following JVMs were tested:

  • Sun JDK 1.5.0_08
    • 32-bit: java version "1.5.0_08" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_08-b03) Java HotSpot(TM) Server VM (build 1.5.0_08-b03, mixed mode)
    • 64-bit: java version "1.5.0_08" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_08-b03) Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_08-b03, mixed mode)
  • Sun Java SE 6 build 99
    • 32-bit: java version "1.6.0-rc" Java(TM) SE Runtime Environment (build 1.6.0-rc-b99) Java HotSpot(TM) Server VM (build 1.6.0-rc-b99, mixed mode)
    • 64-bit: java version "1.6.0-rc" Java(TM) SE Runtime Environment (build 1.6.0-rc-b99) Java HotSpot(TM) 64-Bit Server VM (build 1.6.0-rc-b99, mixed mode)
The following command line arguments were used:
  • SPECjbb2005
    • J2SE 5.0_08 32-bit: -Xmn1g -Xms1500m -Xmx1500m -XX:+UseBiasedLocking -XX:+AggressiveOpts -XX:+UseLargePages -XX:+UseParallelOldGC -Xss128k
    • J2SE 5.0_08 64-bit: -Xmn2g -Xms3g -Xmx3g -XX:+UseBiasedLocking -XX:+AggressiveOpts -XX:+UseLargePages -XX:+UseParallelOldGC -Xss128k
    • Java SE 6 RC1 32-bit: -Xmn1g -Xms1500m -Xmx1500m -XX:+UseLargePages -XX:+UseParallelOldGC -Xss128k
    • Java SE 6 RC1 64-bit: -Xmn2g -Xms3g -Xmx3g -XX:+UseLargePages -XX:+UseParallelOldGC -Xss128k
  • SciMark2
    • J2SE 5.0_08 32-bit: -XX:+UseBiasedLocking
    • J2SE 5.0_08 64-bit: -XX:+UseBiasedLocking
    • Java SE 6 RC1 32-bit: -XX:+DoEscapeAnalysis
    • Java SE 6 RC1 64-bit: -XX:+DoEscapeAnalysis
  • Volano 2.5.0.9
    • J2SE 5.0_08 32-bit: -XX:CompileThreshold=1500
    • J2SE 5.0_08 64-bit: -XX:CompileThreshold=1500
    • Java SE 6 RC1 32-bit: -XX:CompileThreshold=1500 -XX:-UseBiasedLocking
    • Java SE 6 RC1 64-bit: -XX:CompileThreshold=1500 -XX:-UseBiasedLocking
The SPECjbb2005 numbers are impressive. JDK 5.0_08 is ~22% faster tuned compared to JDK 5.0_08 out-of-box. JDK 6 is only ~11% faster tuned vs. JDK 6 out-of-box, and JDK 6 out-of-box is only ~7% slower than highly tuned JDK 5.0_08. Nice.

Tuning improved SciMark only slightly when running 5.0_08. When running JDK 6 it's more or less a wash; the JDK 6 64-bit difference is statistically insignificant.

Tuning seems to hurt Volano, except when running 5.0_08 64-bit. Come to find out, the negative differences are statistically insignificant, so tuning is a wash with Volano as well.

In summary, meeting or exceeding tuned performance is the end game for out-of-box performance engineering, and the above results make me quite proud of our accomplishments. Yes, every application is different, and in some cases we'll find ourselves needing to tune. But chances are that if you let us know the issues you're facing, a release or two down the line you won't need to tune. Eventually it will just be us geeks who can't help it :-). The next step is a Solaris x86 vs. Linux comparison. Stay tuned.

SPEC(R) and the benchmark name SPECjbb(TM) are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect experiments performed by Sun Microsystems, Inc. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Friday Oct 06, 2006

Java SE Out of Box Competitive Performance

Out-of-box performance, that is, running with no tuning options, is in many ways our ultimate goal in HotSpot development. As a JVM performance engineer I too have spent countless hours tweaking command line arguments to squeeze out the last remaining bit of performance. In my last blog entry I asked if there was interest in an out-of-box competitive performance comparison, and the second comment I received hit it on the nose. Command line tuning, albeit fruitful at times, can also be a royal waste of time, especially when you're shooting in the dark, trying any option you can find without knowing what the flag actually does.

Your friends in HotSpot engineering don't want you spending time tuning either. That was the driving force behind Java SE 5.0 Ergonomics and why key performance features previously available via JVM options are now enabled by default in Java SE 6.
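If you are curious what the ergonomics defaults worked out to on your particular machine, a tiny program can report a few of them. This is only a way to observe the settings the JVM chose at startup; it does not reproduce HotSpot's selection heuristics.

    // Observes a few values the JVM settled on at startup. This only reports
    // the result of ergonomics; it does not reproduce HotSpot's heuristics.
    public class ErgoReport {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            System.out.println("available processors: " + rt.availableProcessors());
            System.out.println("max heap (MB):        " + rt.maxMemory() / (1024 * 1024));
            System.out.println("vm name:              " + System.getProperty("java.vm.name"));
            System.out.println("vm version:           " + System.getProperty("java.vm.version"));
        }
    }

Running this with and without an explicit -Xmx makes it easy to see what the out-of-box heap sizing picked for you.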

The intention of the data charts below is to highlight the importance of customer experience and out-of-box performance to Sun Java Engineering. These are not meant to be high performance benchmark results. Hand tuning can change the results significantly.

The following is an out-of-box performance comparison on a Sun Fire X4200. The system is configured with two dual-core Opteron 280 processors (2 CPUs, 4 cores, 2.4 GHz) and 8GB of RAM. The operating system is Red Hat EL 4.0 AS Update 4. The kernel version is unmodified from the base install, which is 2.6.9-42.ELsmp. The only variable in this configuration is the JVM.

The JVM distributions and versions tested were the latest publicly available at the time of testing. I made sure to use the BEA JRockit JVM used in recent SPECjbb2005 submissions. The IBM JVM is the latest available on the IBM developer website.

  • Sun JDK 1.5.0_08
    • 32-bit: java version "1.5.0_08" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_08-b03) Java HotSpot(TM) Server VM (build 1.5.0_08-b03, mixed mode)
    • 64-bit: java version "1.5.0_08" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_08-b03) Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_08-b03, mixed mode)
  • Sun Java SE 6 build 98
    • 32-bit: java version "1.6.0-rc" Java(TM) SE Runtime Environment (build 1.6.0-rc-b99) Java HotSpot(TM) Server VM (build 1.6.0-rc-b99, mixed mode)
    • 64-bit: java version "1.6.0-rc" Java(TM) SE Runtime Environment (build 1.6.0-rc-b99) Java HotSpot(TM) 64-Bit Server VM (build 1.6.0-rc-b99, mixed mode)
  • IBM JDK 5.0 SR2
    • 32-bit: Java(TM) 2 Runtime Environment, Standard Edition (build pxi32dev-20060511 (SR2)) IBM J9 VM (build 2.3, J2RE 1.5.0 IBM J9 2.3 Linux x86-32 j9vmxi3223-20060504 (JIT enabled) J9VM - 20060501_06428_lHdSMR JIT - 20060428_1800_r8 GC - 20060501_AA) JCL - 20060511a
    • 64-bit: Java(TM) 2 Runtime Environment, Standard Edition (build pxa64dev-20060511 (SR2)) IBM J9 VM (build 2.3, J2RE 1.5.0 IBM J9 2.3 Linux amd64-64 j9vmxa6423-20060504 (JIT enabled) J9VM - 20060501_06428_LHdSMr JIT - 20060428_1800_r8 GC - 20060501_AA) JCL - 20060511a
  • BEA JRockit 5.0_06 R26.4
    • 32-bit: java version "1.5.0_06" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05) BEA JRockit(R) (build R26.4.0-63-63688-1.5.0_06-20060626-2259-linux-ia32, )
    • 64-bit: java version "1.5.0_06" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05) BEA JRockit(R) (build P26.4.0-10-62459-1.5.0_06-20060529-2101-linux-x86_64, )
As stated above and in the title no JVM tuning options were used for these results. The results below are statistical comparisons. No less than 10 samples were performed, and a T-test (single-tailed) was used to ensure confidence in the result. The data is normalized to the 32-bit Sun JDK 1.5.0_08 result.

The first chart is SPECjbb2005. SPECjbb2005 is SPEC's benchmark for evaluating the performance of server-side Java. It evaluates server-side Java by emulating a three-tier client/server system (with emphasis on the middle tier), and it extensively stresses Java collections, BigDecimal, and XML processing. The cool thing about SPECjbb2005 is that optimizations targeted at it also show performance gains in other competitive benchmarks, such as SPECjappserver2004, and in a broad range of customer workloads. The benchmark results below are run in single-instance mode. Notice the impressive gains with Java SE 6, nearly a 15% improvement over JDK 5.0_08. Also notice there is very little difference between the 32-bit and 64-bit BEA JRockit results.
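To give a flavor of the kind of work a middle-tier emulation like this exercises, here is a tiny, made-up order-totaling loop that leans on collections and BigDecimal. It is in no way the benchmark's actual code; it is only meant to illustrate the object churn and decimal arithmetic involved.

    import java.math.BigDecimal;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.TreeMap;

    // Made-up middle-tier style work: build orders in collections and total
    // them with BigDecimal. Purely illustrative of the object churn and
    // decimal arithmetic a warehouse/order workload generates; not SPECjbb2005 code.
    public class OrderChurn {
        public static void main(String[] args) {
            TreeMap<Integer, List<BigDecimal>> orders = new TreeMap<Integer, List<BigDecimal>>();
            for (int order = 0; order < 10000; order++) {
                List<BigDecimal> lines = new ArrayList<BigDecimal>();
                for (int line = 0; line < 10; line++) {
                    lines.add(new BigDecimal("19.99").multiply(BigDecimal.valueOf(line + 1)));
                }
                orders.put(Integer.valueOf(order), lines);
            }
            BigDecimal total = BigDecimal.ZERO;
            for (List<BigDecimal> lines : orders.values()) {
                for (BigDecimal amount : lines) {
                    total = total.add(amount);
                }
            }
            System.out.println("grand total: " + total);
        }
    }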

SciMark 2.0 is a Java benchmark for scientific and numerical computing and is a benchmark where Sun's JVMs have continued to shine. It's a decent test of generated code, particularly for tight computational loops. However, it is particularly sensitive to alignment issues and can show some variance from run to run, mostly in a bimodal fashion. All in all it's a good set of microbenchmarks. Notice that 64-bit is faster than 32-bit for all of the JVMs under test; the additional registers available when running 64-bit on AMD Opteron certainly help computational performance.
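As an illustration of the tight computational loops this kind of benchmark measures, here is a simplified successive-over-relaxation (SOR) style kernel in the spirit of SciMark's subtests. It is a sketch only, not SciMark's actual source.

    // A simplified SOR-style relaxation kernel, in the spirit of SciMark's
    // tight numerical loops. Sketch only; this is not SciMark's actual code.
    public class SorSketch {
        static void relax(double[][] g, double omega, int iterations) {
            int n = g.length;
            for (int it = 0; it < iterations; it++) {
                for (int i = 1; i < n - 1; i++) {
                    for (int j = 1; j < n - 1; j++) {
                        g[i][j] = omega * 0.25 * (g[i - 1][j] + g[i + 1][j] + g[i][j - 1] + g[i][j + 1])
                                + (1.0 - omega) * g[i][j];
                    }
                }
            }
        }

        public static void main(String[] args) {
            double[][] grid = new double[100][100];
            grid[50][50] = 1.0;   // a single hot spot to relax outward
            long start = System.nanoTime();
            relax(grid, 1.25, 100);
            System.out.println("center value: " + grid[50][50]);
            System.out.println("elapsed ms:   " + (System.nanoTime() - start) / 1000000);
        }
    }

How well the JIT compiler handles loops like the inner one above (bounds checks, array addressing, instruction scheduling, register use) largely determines the score on this kind of kernel.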

Volano is a popular Java chat server. The benchmark is quick and involves both a client and a server instance. From a JVM perspective the workload is heavily dominated by classic Java socket I/O, which is a bit long in the tooth; an NIO version would be quite interesting. That being said, some customers have found this benchmark quite useful, so we continue to test it. Running Volano, the performance gaps are not as large, most likely because this benchmark has very little garbage collection overhead. BEA JRockit shows good performance here with a result that's 10% over the baseline. Sun Java SE 6 shines as well with a result that's nearly 20% over baseline.
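For context on what classic blocking socket I/O looks like (a stream and a dedicated thread per connection), here is a minimal echo-style server sketch. It only illustrates the style; it is not Volano's implementation. An NIO version would replace the thread-per-connection model with a Selector loop, which is the comparison alluded to above.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Classic blocking socket I/O: one thread per connection, stream reads and
    // writes. A minimal sketch of the style only; not Volano's implementation.
    public class BlockingEchoServer {
        public static void main(String[] args) throws Exception {
            ServerSocket server = new ServerSocket(9999);
            while (true) {
                final Socket client = server.accept();   // blocks until a client connects
                new Thread(new Runnable() {
                    public void run() {
                        try {
                            BufferedReader in = new BufferedReader(
                                    new InputStreamReader(client.getInputStream()));
                            PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                            String line;
                            while ((line = in.readLine()) != null) {
                                out.println(line);        // echo the message back
                            }
                        } catch (Exception e) {
                            // connection dropped; fall through to close
                        } finally {
                            try { client.close(); } catch (Exception ignored) { }
                        }
                    }
                }).start();
            }
        }
    }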

In summary, we in Java SE Performance and HotSpot Engineering feel that out-of-box performance is extremely important to Java developers and customers, and I hope the results above differentiate our product and highlight our ongoing work and focus. The next step is an out-of-box vs. highly tuned comparison. Stay tuned.

SPEC(R) and the benchmark name SPECjbb(TM) are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect experiments performed by Sun Microsystems, Inc. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Thursday Sep 21, 2006

Java SE Out of Box Performance: Any interest in a performance comparison?

Out of box performance, or using no JVM tuning options, has been a focus of Sun HotSpot Engineering for quite some time. Our first major steps came with J2SE 5.0 Ergonomics, and we're taking it further in JDK 6 with many of our performance features enabled by default. I find it quite cool when no tuning yields performance close to or exceeding the best I can muster with command line tuning.

With that, I'd like to publish some out-of-box competitive performance comparisons on my blog. As you can imagine this could be a bit of a touchy subject for our competitors. Before I post data I'd like to get a feel for how interesting this would be. So, I'd like to ask for a bit of feedback. Is there interest out there in a Java SE out-of-box competitive performance comparison? Are there any benchmarks that people would like to see? I was thinking the usual benchmarks I talk about, SPECjbb2005 and SciMark. Any others?

Thanks in advance for the feedback!

Thursday Aug 10, 2006

Sun JDK 5.0_08 Is Now Available!

JDK 5.0_08 is now publicly available on java.sun.com! Another fine day for Sun Java Performance. This is our highest performing and most reliable release to date. We have demonstrated winning performance across Sun's server offering, from x64 systems to CoolThreads servers, all the way up to the Sun Fire E25K.

  • Winning performance on the Sun Blade X8400, beating BEA JRockit on a comparable system! (Sun HotSpot result, BEA JRockit result)
  • Winning performance on the Sun Fire T1000 and Sun Fire T2000 (T1000 result, T2000 result)
  • Winning performance on the Sun Fire E25K (benchmark result)

SPECjbb2005 Sun Fire T1000 (1 chip, 8 cores) 60,323 SPECjbb2005 bops, 15,081 SPECjbb2005 bops/JVM; Sun Fire T2000 (1 chip, 8 cores) 74,365 SPECjbb2005 bops, 18,591 SPECjbb2005 bops/JVM; Sun Fire E25K (72-way, 72 chips, 144 cores) 1,387,437 SPECjbb2005 bops, 19,270 SPECjbb2005 bops/JVM; Sun Blade X8400 (8 cores, 4 chip, Solaris 10, Sun HotSpot 5.0_08) 121,228 SPECjbb2005 bops, 30,307 SPECjbb2005 bops/JVM; Fabric7 Q80 (8 cores, 4 chip, Microsoft Windows Server 2003, JRockit 5.0 P26.4.0) . SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 06/19/06 on www.spec.org.

Wednesday Jun 21, 2006

Sun Java and the Sun Fire E25K Raise the Bar on SPECjbb2005

The Sun Fire E25K and Sun J2SE 5.0_08 team up to demonstrate leadership on large servers running SPECjbb2005, increasing performance by 19.1% over our previous submission on the same hardware. Not bad for six months of performance work! The 72-way Sun Fire E25K score is 1,387,437 SPECjbb2005 bops, 19,270 SPECjbb2005 bops/JVM. That is 11% faster than the 128-way Fujitsu PRIMEPOWER 2500 and many times faster than IBM's fastest SPECjbb2005 result to date. The BMSeer once again beats me to the punch talking about SPECjbb2005 results; he/she (who is BMSeer anyway?) has a great piece on this result.

Required Disclosure Statement: SPECjbb2005 Sun Fire E25K (72-way, 72 chips, 144 cores) 1,387,437 SPECjbb2005 bops, 19,270 SPECjbb2005 bops/JVM; Fujitsu PRIMEPOWER 2500 (128 chips, 128 cores) 1,251,024 SPECjbb2005 bops, 39,095 SPECjbb2005 bops/JVM; IBM eServer p5 570 (8 chips, 16 cores, 16-way) 244,361 SPECjbb2005 bops, 30,545 SPECjbb2005 bops/JVM. SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 06/19/06 on www.spec.org.

Monday Jun 19, 2006

Sun Java vs. C#

Here's my latest round of platform performance comparisons using SciMark. This time I compare Java to C#, and once again Java performance is looking quite good. Thanks to Tony Zhang, another colleague of mine on the performance team, who ran the initial performance comparison a few months back and provided me the environment to re-run the tests with our latest JVMs.

The system under test was a 4-CPU Intel Xeon MP server (4 x 2.78 GHz, 8 cores, 3.87 GB memory) running Microsoft Windows 2003 Server and .NET 2.0. The CLR version under test, according to SciMark, was 2.0.50727.42. We used the SciMark 2.0 C# port found here. The HotSpot server compiler (-server) was used for both J2SE 5.0_08 and Java SE 6 b87. SciMark was run with the large data set (-large).

Also, I found the chart below in an interesting writeup showing similar performance comparisons with older versions of the JVM. I particularly like HotSpot's performance lead over JRockit.

Friday Jun 09, 2006

Sun Java is faster than C/C++ (Round 2)

I received a few comments on my previous blog entry saying the results were bogus since I used an old compiler. I quickly found another test system running SuSE SLES 9 U2 with gcc 3.3.3 and repeated the test. If I get around to installing the latest Visual Studio I'll repeat the test there as well. The JVM versions are different, as I wanted to quickly post the results. Guess what, the results are a lot better! I ran this several times and it's quite repeatable. I appreciate comments, so please let me know what your thoughts are, especially if there are issues with the choice of gcc 3.3.3.

The system under test was a 2 x 3.0 GHz Intel Xeon MP system (4-core) running SuSE SLES 9 U2 and gcc 3.3.3. The C code was compiled with full optimization, as shown by the Makefile in the SciMark source package. This time no tuning parameters were used for either 5.0_08 or 6.0 b83. Here's some output from /proc/cpuinfo:

    vendor_id  : GenuineIntel
    cpu family : 15
    model      : 4
    model name : Intel(R) Xeon(TM) MP CPU 3.00GHz

For background, here's the skinny on SciMark2. SciMark2 is a set of simple numerical kernels, and its performance is directly related to the performance and quality of the generated code. The tests are single threaded and have little to no garbage collection overhead. In short, a great set of applications for comparing statically compiled C code and dynamically compiled Java.

This time Java is 35% faster than C. Here's a breakdown of the subtests. C is only ahead on Sparse MatMult, and by a small margin. Anyone interested in seeing how the other JVM vendors look? Can JRockit or IBM beat C?

Sun Java is faster than C/C++

This is quite cool. Andy Johnson, a colleague of mine on the Java performance team, did a few performance tests comparing Java to native C. SciMark2 was used for the comparison. The system under test was a 2 GHz Pentium white box running Windows 2000 and Microsoft Visual C/C++ 6. The C code was compiled with full optimization. The server compiler was used for both J2SE 5.0_07 and Java SE 6.

SciMark2 is a set of simple numerical kernels, and its performance is directly related to the performance and quality of the generated code. The tests are single threaded and have little to no garbage collection overhead. In short, a great set of applications for comparing statically compiled C code and dynamically compiled Java.

The chart below is quite revealing. Both charts are normalized to J2SE 5.0_07. Native C is only 3% faster than 5.0_07, and Java SE 6 pulls ahead of native C by 2%. The following chart breaks the comparison down further. Remember, SciMark2 is a composite benchmark and the overall score is a simple mean of each subtest's mflops score. With that, Java is ahead in all cases except Sparse MatMult. Looks like we have something to look at for additional optimization.

Friday Jun 02, 2006

Java Performance Continues to Accelerate on Sun CoolThreads Technology

The performance of Java on Sun CoolThreads servers continues to be impressive. Our latest round of improvements has increased performance on SPECjbb2005 by 17% on the Sun Fire T1000 and T2000. If you thought the competitive positioning of these systems was impressive before, take a look at them now. The charts below represent the competitive landscape for the Sun CoolThreads servers; by no means are they meant to be a complete comparison of all systems in the classes described below. If there are particular discrepancies that are annoying, please let me know. For more detailed information on the Sun Fire T1000 and T2000 and comparisons running competitive benchmarks, check out BMSeer's blog.

The first chart shows the competitive landscape for 1 RU servers. The Sun Fire T1000 shines compared to other systems in this space. The Sun Fire X4100 (powered by AMD Opteron CPUs) looks rather good as well. The second chart shows the competitive landscape for 2 RU and 4 RU servers. The Sun Fire T2000 shows impressive performance against the competition in this space as well.

Now this is where the Sun Fire T1000 and Sun Fire T2000 truly excel. The first power-performance graph shows a comparison based on performance per watt using the SPECjbb2005 bops metric. The data presented is limited to what I've gathered using the Sun Fire CoolThreads systems and what has been gathered on http://www.sun.com/coolthreads. Here's another look at power performance using the SWaP metric. The SWaP metric is similar to performance per watt, but includes system footprint as part of the equation. The Sun Fire T1000 number is impressive; the light bulb next to my workbench in my basement uses more power than this server. For those individuals who prefer a spreadsheet to charts, here is the same information as shown above.

Finally, this chart shows the performance difference between J2SE 5.0_06 and J2SE 5.0_08 on the same hardware, demonstrating a 17% increase in performance on both the Sun Fire T1000 and Sun Fire T2000. If we can improve performance by 17% in 6 months, wait until you see what Java SE 6 ("Mustang") can do.

Required Disclosure Statement: SPECjbb2005 Sun Fire T1000 (1 chip, 8 cores) 51,528 SPECjbb2005 bops, 12,882 SPECjbb2005 bops/JVM submitted for review; SPECjbb2005 Sun Fire T2000 (1 chip, 8 cores) 74,365 SPECjbb2005 bops, 18,591 SPECjbb2005 bops/JVM submitted for review; Sun Fire X4100 (2 chips, 2 cores) 38,090 SPECjbb2005 bops, 19,045 SPECjbb2005 bops/JVM submitted for review; IBM eServer p5 550 (2 chips, 4 cores) 61,789 SPECjbb2005 bops, 61,789 SPECjbb2005 bops/JVM; IBM x346 (2 chips, 4 cores) 39,585 SPECjbb2005 bops, 39,585 SPECjbb2005 bops/JVM; IBM eServer p5 520 (1 chip, 2 cores) 32,820 SPECjbb2005 bops, 32,820 SPECjbb2005 bops/JVM; IBM eServer p5 510 (1 chip, 2 cores) 36,039 SPECjbb2005 bops, 36,039 SPECjbb2005 bops/JVM; Fujitsu Siemens RX220 (2 chips, 2 cores) 61,155 SPECjbb2005 bops, 30,578 SPECjbb2005 bops/JVM; Dell PE SC1425 (2 chips, 2 cores) 24,208 SPECjbb2005 bops, 24,208 SPECjbb2005 bops/JVM; Dell PE 850 (1 chip, 2 cores) 31,138 SPECjbb2005 bops, 31,138 SPECjbb2005 bops/JVM; Dell PE 2950 (2 chips, 4 cores) 64,288 SPECjbb2005 bops, 64,288 SPECjbb2005 bops/JVM. SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 6/02/06 on www.spec.org.
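As a worked illustration of the SWaP metric mentioned above, which divides performance by the product of rack space and power draw, here is a tiny calculation. The wattage and space figures below are placeholders chosen only to show the arithmetic; they are not measurements of any of the systems in this entry.

    // SWaP = Performance / (Space * Power), where Space is rack units (RU) and
    // Power is average watts. The inputs below are placeholder values, not
    // measurements of any system discussed above.
    public class SwapExample {
        static double swap(double bops, double rackUnits, double watts) {
            return bops / (rackUnits * watts);
        }

        public static void main(String[] args) {
            double smallBox = swap(50000.0, 1.0, 300.0);   // hypothetical 1 RU server
            double bigBox   = swap(60000.0, 4.0, 800.0);   // hypothetical 4 RU server
            System.out.println("small box SWaP: " + smallBox);
            System.out.println("big box SWaP:   " + bigBox);
            // Even with a lower raw score, the smaller, lower-power box wins on SWaP.
        }
    }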

Monday May 22, 2006

Sun Fire T2000 Blows Cold Air!!

This year at JavaOne we had a demo at the performance pod demonstrating Java SE performance and scalability. We had a Sun Fire T2000 with a 1.0 GHz UltraSPARC T1 processor and 8GB of RAM running Sun J2SE 5.0_06, J2SE 5.0_08, and Java SE 6. Brian Doherty did the setup this year (thanks a lot, Brian!). He spent the entire day on Monday fighting networking issues on the JavaOne pavilion floor but was eventually able to get the demo working (but we had to buy our own USB-to-serial kit to do it). It was also quite cold in the building, and Brian didn't bring his jacket because of the 80-degree weather that day in San Francisco. So, just like any resourceful engineer working in a lab, Brian decided to warm his hands with the fan exhaust at the back of the Sun Fire T2000. Much to his surprise, the T2000 was blowing cold air!

Over the next few days on the show floor we put the system to the test. We ran SPECjbb2005 every day for 10 hours straight with the CPU fully consumed at 100%. Guess what? It still blew cold air. This was absolutely amazing, especially since my little laptop is about to burn my legs as I type this. I find this incredible. At the risk of being a bit annoying, I asked nearly everyone who stopped by our booth to put their hands by the fans and feel the air. I wasn't the only one amazed; many people wanted to see the CPU stats to be sure the system was running full tilt. Very cool (literally).

Sun Java Performance: Here we come again

I love performance work. The sweet taste of knowing that your product is the fastest is like no other. Perhaps it is because I have a competitive personality, but beating the competition is a lot of fun. And you know what? Active competition between vendors on public Java benchmarks benefits customers. So without further ado, I'd like to announce our latest round of world record Java competitive benchmark results. Sun J2SE 5.0_08, powered by the ripping fast Sun HotSpot JVM, has set new world records running SPECjbb2005, improving our previous scores on the exact same hardware by a whopping 17%, and publishing the improved score in less than six months. See what I mean by sweet? The BMSeer has a great piece on the new results; check it out here. Be sure to check out the very popular press release here.

To top it off, performance is not Sun Java Software's highest priority. I'm sure you're well aware that performance optimization is my highest priority, but really it's not the top focus of the organization as a whole. Our primary foci are reliability and compatibility (but performance and scalability are not that far down the list). We would pass up a 20% performance gain at the drop of a hat if it imposed any reliability risk. I do mean any risk; as a performance guy I've butted heads with this ideology many times in the past. But you know what, in the end I agree, because that's what customers need. Reliability is always first. Brian Doherty, an esteemed colleague of mine, has often said, “The performance of a crashing JVM is zero”, and that's dead on. A close second is compatibility, but that's an easy one as it speaks to the core of what defines Java technology. I'm proud to say Sun has taken this to heart; we support more hardware and OS combinations than any other vendor.

Any JVM vendor can claim they are the “world's fastest JVM”. Competitive benchmarking is a lot of fun and is an opportunity to promote software and hardware performance. What's important is that your application is as fast as you need it to be, and so reliable that you don't have to think about it.

Friday Mar 10, 2006

Java Compatibility Call to Arms

Compatibility between Java implementations is critical to the success of the platform. It's the responsibility of the JRE vendor to ensure that any Java application will run. Yes, any Java application. After all, compatibility is a key ingredient of what makes Java. "Write Once, Run Anywhere," right? Apparently this isn't always the case. Here's an example of a compatibility issue identified on the java.net GlassFish project: https://glassfish.dev.java.net/servlets/ReadMsg?list=dev&msgNo=761

There are always bugs in software, and some of those bugs can break compatibility. It is of utmost importance that issues such as this are addressed in a timely manner. This is where you come in. When testing Java software, whether it be new development, a purchase evaluation, or your tried-and-true back office application, please do the following. Run your application with your JVM of choice, but also test it against other JVMs running on the same platform. That's right: if you're running Sun's JVM, also test BEA JRockit and the IBM JDK. Multiple Java implementations are available on Windows, Linux, and now Solaris SPARC. If any of the implementations show incorrect behavior, or dare I say don't run at all, I implore you to send a note to the implementation's support channels and, if possible, file a bug. None of the Java vendors out there can possibly test enough Java applications, and in many ways we're relying on the users to let us know if something's broken. In the end, any Java application should run on any Java implementation. Hands down. No excuses.

If you run into problems, have questions about Java performance, or identify compatibility issues running Sun's JVMs, in particular Java SE 6 (https://mustang.dev.java.net), please post a note on the java.net performance forum or feel free to send me a comment here. I would love to hear your compatibility successes, along with the issues seen with our competitors' JVMs :-) We're very serious about the performance, compatibility, and reliability of the Java platform. If a vendor is not doing well in this regard I would like to know about it so I can take steps to ensure the compatibility guarantees of that implementation.

Monday Feb 27, 2006

High Performance Java on Sun CoolThread Servers

Back in December when Sun's CoolThreads servers were announced, I wrote a similar blog entry comparing the Sun Fire T1000 and T2000 SPECjbb2005 scores to our competitors' SPECjbb2005 scores on 1U, 2U, and 4U systems. Below is updated data, along with space and power data using the SWaP metric. The Sun Fire T1000 scores are phenomenal! All were run with Sun J2SE 5.0_06 with HotSpot JVM technology. Interested in finding out for yourself? Go here to try a Sun Fire T2000 free for 60 days.

Take a look at the chart below. The Sun T2000 surpasses all other competition in the 2U and 4U space. How are these results comparable? It's simple: compare the raw throughput SPECjbb2005 bops score. One may ask: "How can you compare an 8-core / 32-thread box to a 4-core / 8-thread Power 5+?" It's easy. Chip and core counts are steadily becoming irrelevant. What really matters is how much work (throughput) a system can achieve and how much that system is going to cost to run, including lab space, power, and cooling costs.

Below is a system comparison using SWaP, the Space, Watts and Performance metric, defined as performance divided by the product of space (rack units) and power consumption (watts).

How about scalability? Here's a good example of how the Sun Fire T2000 and the UltraSPARC T1 processor scale from 1 to 32 threads. Each SPECjbb2005 warehouse is a new thread. Throughput steadily increases as new threads are added, peaking at 32.

Fine print SPEC disclosure: SPECjbb2005 Sun Fire T1000 (1 chip, 8 cores, 32 threads) 51,540 bops, 12,885 bops/JVM; Sun Fire T2000 (1 chip, 8 cores, 32 threads) 63,378 bops, 15,845 bops/JVM; IBM eServer p5 520 (2 chips, 2 cores, 4 threads) 32,820 bops, 32,820 bops/JVM; IBM eServer p5 510 (2 chips, 2 cores, 4 threads) 32,820 bops, 32,820 bops/JVM (referenced on IBM benchmark website); AMD Tyan white box (2 chips, 4 cores, 4 threads) 44,574 bops, 44,574 bops/JVM; IBM eServer p5 550 (4 chips, 4 cores, 4 threads) 61,789 bops, 61,789 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 27, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Thursday Feb 23, 2006

Sun Fire E25K and J2SE 5.0_06 SPECjbb2005 World Record

The Sun Fire E25K running J2SE 5.0_06 now holds the overall world record on SPECjbb2005! Hot off the presses, here's the new world record result: 1,164,995 SPECjbb2005 bops, 32,361 SPECjbb2005 bops/JVM. This result beats the recently announced result from Fujitsu for the PRIMEPOWER 2500 with SPARC64 V. Once again the combination of Sun's world-class enterprise server architecture, the UltraSPARC IV+ processor, and Sun J2SE 5.0_06 with HotSpot JVM technology teams up to prove world-class performance and scalability on the SPECjbb2005 benchmark. Very, very impressive.

As a designer and developer of this benchmark, I found it hard to envision the day when the SPECjbb2005 bops score would breach 1 million. The day is here, and much sooner than I could have ever anticipated. These are exciting times for Java performance (and there are more performance optimizations coming soon!). Stay tuned for more information on this latest world record. The BMSeer has an excellent competitive overview of this result; the price/performance of the Sun Fire E25K is quite impressive compared to our competition $$ (add an extra $ for IBM). (Hey BMSeer, next time you won't beat me to the punch announcing our latest SPECjbb2005 world record!!)

Fine print SPEC disclosure: SPECjbb2005 Sun Fire E25K (72-way, 72 chips, 144 cores) 1,164,995 SPECjbb2005 bops, 32,361 SPECjbb2005 bops/JVM submitted for review; Fujitsu PRIMEPOWER 2500 (128 chips, 128 cores) 1,157,619 SPECjbb2005 bops, 72,351 SPECjbb2005 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 23, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Wednesday Feb 22, 2006

Sun HotSpot J2SE 5.0_06 Crushes BEA JRockit Running SPECjbb2005

(The following is a resubmission of a blog entry from February 10, 2006, with a few comments and edits. Changes are noted below.)

Looks like our friends from BEA JRockit are at it again. Take a look at the following blog entry from BEA: http://dev2dev.bea.com/blog/hstahl/archive/2006/01/new_specjbb2000_1.html

First, SPECjbb2000 is a five-year-old retired benchmark. Its time has passed, and SPECjbb2005 is its replacement. BEA loves to talk about SPECjbb2000; they obviously spent a lot of time optimizing for SPECjbb2000. The problem with JRockit is that it is optimized just for SPECjbb2000. If that time had been spent on optimizations for the real world, they'd be able to maintain their competitive position with SPECjbb2005, right? The same applies for any other competitive benchmark (SPECjappserver2004, SciMark, and so on). The reality is much different: SPECjbb2000 is a special case for JRockit, and performance gains there don't pan out in the real world.

One more comment on SPECjbb2000. As I stated above, the benchmark retired at the beginning of January. Which JVM ended on top? Reading the BEA blog you'd assume it was BEA JRockit. Sun HotSpot J2SE 5.0_06 closed this benchmark as the final world record holder. Now let's move on; SPECjbb2000 is over.

BEA JRockit tried to spin their current competitive situation in the best possible light, omitting many results that did not suit their smoke-and-mirrors argument. First, BEA positioned a fully configured 32-way, 32-core, 32-thread Itanium 2 system against a partially configured 16-way, 32-core, 32-thread Sun Fire 6900 in an attempt to highlight JVM performance. These are completely different hardware platforms, and any attempt to highlight JVM performance alone using these results is inaccurate. Comparing these results does give insight into throughput and scaling capacity, but the comparison is at a system level and only demonstrates a JVM's capacity to fully utilize the underlying hardware platform. When comparing fully configured mid-sized enterprise systems, regardless of the platform, the Sun Fire 6900 (24-way, 48-core, 48-thread) beats the JRockit result hands down:

  • 342,578 SPECjbb2005 bops, 28,548 SPECjbb2005 bops/JVM (Sun Fire E6900 with Sun JVM)
  • 322,719 SPECjbb2005 bops, 40,340 SPECjbb2005 bops/JVM (Fujitsu PRIMEQUEST 480 with JRockit)

Also, please review the SPECjbb2005 results page: http://www.spec.org/jbb2005/results/jbb2005.html A quick scan will show that Sun HotSpot holds the record for single- and multi-instance results, more than doubling BEA's single-JVM result and tripling BEA's multi-instance result. Funny how BEA forgot to mention these results.

TWO (2) JVMs on a 4-core box. They even use 2 JVMs on a 2-core box. That's absolutely ridiculous. Why would anyone choose to do this? The only reason is they can't beat HotSpot running a single JVM and have difficulty scaling this benchmark on small 2- and 4-core systems. HotSpot could easily beat these multi-instance results, but chances are we won't submit multi-instance SPECjbb2005 on configurations that don't match customer deployments.

(Author's note: Since hindsight is always 20/20, the following is more specific than the above paragraph.)

Now onto the AMD-based SPECjbb2005 results referred to in the BEA blog. I'm embarrassed for BEA because they had to use these results to talk about performance. Their 2-way, 2-core result uses TWO (2) JVMs on a 4-core box. They even use 2 JVMs on a 2-core box. That's absolutely ridiculous. Why would anyone choose to do this?
The only logical reason is they can't beat HotSpot running a single JVM and have difficulty scaling SPECjbb2005 on small 2- and 4-core systems. HotSpot could easily beat these multi-instance results, but chances are we won't submit multi-instance SPECjbb2005 on configurations that don't match customer deployments. Here are the latest 2- and 4-core single-instance SPECjbb2005 submissions on a Sun Fire X4200 running Windows, Linux, and Solaris:

  • 49,097 SPECjbb2005 bops, 49,097 SPECjbb2005 bops/JVM (Sun Fire X4200 running Solaris 10 x64)
  • 47,437 SPECjbb2005 bops, 47,437 SPECjbb2005 bops/JVM (Sun Fire X4200 running Windows 2003 Server)
  • 43,076 SPECjbb2005 bops, 43,076 SPECjbb2005 bops/JVM (Sun Fire X4200 running Red Hat EL 4)

Fine print SPEC disclosure: SPECjbb2005 Sun Fire X4200 on Solaris 10 (2 chips, 4 cores, 4 threads) 49,097 bops, 49,097 bops/JVM; SPECjbb2005 Sun Fire X4200 on Windows 2003 Server (2 chips, 4 cores, 4 threads) 47,437 bops, 47,437 bops/JVM; SPECjbb2005 Sun Fire X4200 on Red Hat EL 4 (2 chips, 2 cores, 2 threads) 43,076 bops, 43,076 bops/JVM; Fujitsu Limited PRIMEQUEST 480 (32 chips, 32 cores, 32 threads) 322,719 bops, 40,340 bops/JVM; SPECjbb2005 Sun Fire E6900 on Solaris 10 (24 chips, 32 cores, 32 threads) 342,578 bops, 28,548 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 22, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Wednesday Feb 15, 2006

Java SE 6 Beta is Released!

Hey Look, Java SE 6 ("Mustang") has gone Beta! http://java.sun.com/javase/6/download.jsp Huge performance improvements, slick client improvements (love the font smoothing!), and a plethora of other features make this our best beta release to date. Give it a try and let us know what you think. As always, please let us know if you run into issues or regressions. Go to the Java SE 6 Regressions Challenge Page if you identify a regression for a chance to win a Sun Ultra 20 Workstation. For performance issues and questions visit the java.net performance forum.

Thursday Jan 19, 2006

Sun Hotspot Wins Best Java Virtual Machine

Sun J2SE has won the JDJ Readers' Choice Award for Best Java Virtual Machine. Take a look; it's category #16. Congratulations, Java Software!

Tuesday Dec 13, 2005

Sun's Hotspot JVM = Industry Leading Performance

Sun's Hotspot JVM continues to demonstrate industry-leading performance. Here are just a few examples where Hotspot shines.

SPECjbb2005:
  • Leading x64 on Opteron 2-core result: 27,004 bops, 27,004 bops/JVM; Sun Fire X4100 and Sun Fire X4200
  • Leading x64 on Xeon 2-core result: 28,314 bops, 28,314 bops/JVM; Fujitsu Siemens Computers PRIMERGY TX300 S2
  • Leading x64 on Opteron 4-core result: 45,124 bops, 45,124 bops/JVM; Sun Fire X4100 and Sun Fire X4200
  • Best of class 1U result: Sun Fire T1000, 51,540 bops, 12,885 bops/JVM; results under review
  • Best of class 2U result: 63,378 bops, 15,845 bops/JVM; Sun Fire T2000, powered by UltraSPARC T1

SPECjappserver2004:
  • SPECjappserver2004 World Record: 6 Sun Fire T2000 servers
  • SPECjappserver2004 Single J2EE Node World Record: 1 Sun Fire T2000 server

SciMark:
  • Top 3 submitted results, running Solaris, Linux, and Windows

Please post comments and questions here or on the java.net performance forum sharing your experiences running Hotspot. Yes, I'd love to hear success stories, but what is most important are those situations where performance wasn't what you expected. We are serious about Java performance here at Sun, and want to do what it takes to make every Java user satisfied with the performance of their application. We want to fix any and all performance issues you run into. We can and will continue to demonstrate industry-leading performance, but what is most important is broad and reliable JVM performance, which is defined individually by every user's application.

Fine print SPEC disclosure: SPECjbb2005 Sun Fire X4200 (2 chips, 2 cores, 2 threads) 27,004 bops, 27,004 bops/JVM; Fujitsu Siemens Computers PRIMERGY TX300 S2 (2 chips, 2 cores, 4 threads) 28,314 bops, 28,314 bops/JVM; Sun Fire X4200 (2 chips, 4 cores, 4 threads) 45,124 bops, 45,124 bops/JVM; Sun Fire T1000 (1 chip, 8 cores, 32 threads) 51,540 bops, 12,885 bops/JVM submitted for review; Sun Fire T2000 (1 chip, 8 cores, 32 threads) 63,378 bops, 15,845 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of November 30, 2005. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Monday Dec 12, 2005

Sun's Hotspot JVM = Reliable Performance

Take a look at the latest SPECjappserver2004 World Record results: BEA WebLogic running on Sun Fire T2000 servers powered by UltraSPARC T1 processors and Sun J2SE 5.0_06. That's right, BEA's "record setting WebLogic 9" set the world records running on Sun's Hotspot JVM.

  • SPECjappserver2004 World Record (Multi-Node)
  • SPECjappserver2004 World Record (2-Node)

But how can this be? Sounds like BEA WebLogic relies on the cool performance and reliability of Sun's Hotspot JVM to achieve their world record performance on SPECjappserver2004.

Tuesday Dec 06, 2005

UltraSPARC T1 Screams Running Java

Sun has announced the new Sun Fire T1000 and T2000 servers today, along with SPECjbb2005 benchmark results on these systems. What makes these results so special? They run the UltraSPARC T1 processor with 8 cores and 32 threads on a single chip. The performance of the UltraSPARC T1 systems easily surpasses performance on all other 1U, 2U, or 4U systems. These results also leverage the high performance features in the newly released J2SE 5.0_06.

Take a look at the chart below. The Sun T2000 surpasses all other competition in the 2U and 4U space. The 1U Sun Fire T1000 leads the 1U results. How are these results comparable? It's simple: compare the raw throughput SPECjbb2005 bops score. One may ask: "How can you compare an 8-core / 32-thread box to a 4-core / 8-thread Power 5+?" It's easy. Chip and core counts are steadily becoming irrelevant. What really matters is how much work (throughput) a system can achieve, how much that system is going to cost to run, and how much lab space, power, and cooling the system will require. Looking at the above results with this in mind clearly shows why Sun UltraSPARC T1 systems are separate from the pack. Sun Fire UltraSPARC T1 systems are much, much less expensive to run than their competitors. How about those CoolThreads! The details of the configurations compared above are listed in the fine print disclosure below.

How about scalability? Here's a good example of how the Sun Fire T2000 and the UltraSPARC T1 processor scale from 1 to 32 threads. Each SPECjbb2005 warehouse is a new thread. Throughput steadily increases as new threads are added, peaking at 32.

Fine print SPEC disclosure: SPECjbb2005 Sun Fire T1000 (1 chip, 8 cores, 32 threads) 51,540 bops, 12,885 bops/JVM submitted for review; Sun Fire T2000 (1 chip, 8 cores, 32 threads) 63,378 bops, 15,845 bops/JVM submitted for review; IBM eServer p5 520 (2 chips, 2 cores, 4 threads) 32,820 bops, 32,820 bops/JVM; AMD Tyan white box (2 chips, 4 cores, 4 threads) 44,574 bops, 44,574 bops/JVM; IBM eServer p5 550 (4 chips, 4 cores, 4 threads) 61,789 bops, 61,789 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of November 30, 2005. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.


Thursday Sep 29, 2005

Sun Hotspot is the World's Fastest JVM: Round 2

BEA has responded to my response to their latest press release; take a look at it here. First, I agree with BEA on their first point: valid JVM performance comparisons need to be on identical hardware. Hopefully we're sharp enough to be sure we're comparing the same benchmarks :-)

The response claims that the point of industry benchmarks is to get the highest possible score on like platforms. But this statement suggests a 32-bit JVM and a 64-bit JVM are one and the same. That unfortunately is not based in reality. The overhead of running a 64-bit application is substantial, and Sun has made targeted optimizations for the Sun Fire X4200 and AMD Opteron processors to achieve the level of performance shown in the 64-bit SPECjbb2000 benchmark results.

The reality of SPECjbb2000 is that it has been around for 5 years, and over those years several benchmark-specific performance optimizations have been identified. Very few of these optimizations have any effect on real-world customer applications. Does your application call System.gc() on a regular basis? Does System.arraycopy() account for 20% of your application's workload? Based on BEA's 32-bit SPECjbb2000 results, they've done a good job targeting optimizations for this benchmark. So BEA, why don't you submit 64-bit SPECjbb2000 results on AMD Opteron? Or even better, submit more results on SPECjbb2005, the *new* industry standard JVM server benchmark. SPECjbb2000 retires on Jan. 4th, 2006, which isn't too far away.

Sun Hotspot is the world's fastest JVM running SPECjbb2005. Hands down. SPECjbb2005 replaces SPECjbb2000 because it more closely models customer applications, and Sun targets customer performance when prioritizing performance engineering work. Because of this, I am very happy to see BEA's challenge to test your applications with BEA and Sun JVMs. Please test your applications with Sun's JVMs (links listed below). Run your performance and *reliability* tests. Run Sun's and BEA's JVMs for days, not just a few-minute comparison. I'm confident Sun's Hotspot will shine. If there is any problem whatsoever, contact us at the Performance Forum and Performance Project at java.net. Here's the latest J2SE 5.0 update, J2SE 5.0_05. Here's the latest Java SE 6 Mustang development build.

Wednesday Sep 28, 2005

Sun Hotspot is the World's Fastest JVM

Yes, it's true. Sun Hotspot is the world's fastest JVM, and it scorches BEA JRockit. I feel I need to correct BEA's latest press release on SOA and JVM performance. And yes! Not only is Sun Hotspot the fastest JVM on the planet, I also have benchmarks to prove it. See BEA's press release here. BEA's press release and performance claims are based on SPECjbb2000 and SPECjappserver2004 results.

Hotspot Performance

SPECjbb2000 is an old, outdated benchmark. SPEC plans to replace SPECjbb2000 with SPECjbb2005 in the next few months. There are many problems with SPECjbb2000, so many it merits a blog entry of its own. Stay tuned; I plan on addressing this in the next few days. The biggest problem with SPECjbb2000 in the 32-bit space is its System.gc() call before each measurement interval. This call effectively removes Full GCs from the measurement interval, which means that Full GC overhead is not part of the benchmark whatsoever. Because of this alone, optimizations targeted specifically at SPECjbb2000 (also known as benchmark specials) have no real effect on customer applications. That's why Sun only publishes 64-bit SPECjbb2000 results.

BEA's press release compares BEA JRockit 32-bit SPECjbb2000 results on a 4-way AMD Opteron 875 box to Sun Hotspot's 64-bit SPECjbb2000 results on similar hardware. They go on to say how Sun's result uses a 14GB heap and BEA's result uses a 1.8GB heap. They never specifically mention that the JRockit result is with a 32-bit JVM and Sun's result is with a 64-bit JVM; Sun is purposely running with a 14GB heap to highlight this. Now, yes, JRockit is quite fast running SPECjbb2000 on x86 hardware, and yes, Sun has chosen not to compete in this space. We believe that benchmark specials don't benefit our customers, and we are only interested in investing in real optimizations that will help every customer's application perform better.

Now, wouldn't you assume that if you claim you are the fastest JVM on the planet you'd be able to prove it with the latest benchmarks, not one which is 5 years old (very, very old for Java software!)? Here are the latest 2-way SPECjbb2005 results on 32-bit:

  • 2-CPU 3.2 GHz Xeon running BEA JRockit: 24,208 bops
  • 2-CPU Sun Fire X4200 running Sun J2SE 5.0_06: 27,004 bops

How is it that BEA's "world's fastest JVM" doesn't win running SPECjbb2005? Shouldn't BEA demonstrate winning performance on the most up-to-date benchmarks before making such a claim?

Hotspot Scalability

BEA also claimed that JRockit is the world's most scalable JVM. Again, that doesn't make much sense when looking at the benchmarks. True JVM scalability is proven on today's largest systems. Sun has recently submitted record performance numbers on a 48-core UltraSPARC IV+ E6900. The largest configuration JRockit has ever submitted SPECjbb2000 on is an 8-core Opteron box, and that was with a 32-bit JVM. True scalability challenges and bugs are hashed out on today's largest hardware, and 8 cores simply doesn't cut it. Sun has recently announced world record 16- and 24-core performance on SPECjbb2005; BEA has only been used in 2-core submissions to date. Sun Fire V890 result. Sun E4900 result. How can they possibly claim superior scalability? Where are the benchmark results to support this claim?

SPECjappserver2004 Performance

Not much direct evidence to publicly point out here, but I have a suspicion that BEA's own application server, running on Sun hardware, Solaris 10, and Sun J2SE 5.0_06, would perform better. Hold that thought for about two weeks; I'll come back to this later.
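To make the System.gc() point concrete, here is a sketch of a measurement loop that forces a full collection immediately before each timed interval, so that most full-GC cost lands outside the measured window. This illustrates the pattern being criticized; it is not SPECjbb2000's actual measurement harness.

    import java.util.ArrayList;
    import java.util.List;

    // Illustrates the pattern criticized above: forcing a full GC right before
    // a timed interval keeps most full-GC cost out of the measured window.
    // A sketch of the idea only; not SPECjbb2000's actual harness.
    public class GcBeforeInterval {
        public static void main(String[] args) {
            for (int interval = 0; interval < 3; interval++) {
                System.gc();   // full collection happens outside the timed window

                long start = System.nanoTime();
                List<byte[]> work = new ArrayList<byte[]>();
                for (int i = 0; i < 200000; i++) {
                    work.add(new byte[128]);      // allocation charged to the interval
                    if (work.size() > 10000) {
                        work.clear();             // keep the live set small
                    }
                }
                long elapsedMs = (System.nanoTime() - start) / 1000000;
                System.out.println("interval " + interval + ": " + elapsedMs + " ms");
            }
        }
    }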

Monday May 09, 2005

Dave Dagastine's Weblog

Hey, this is my first blog entry. I'm going to try to keep up with this and share my thoughts, frustrations, and ramblings, mostly around the topic of Java performance.