SPECjbb2005: A Valid Representation of Java Server Workloads
By dagastine on Feb 02, 2006
I was reading some of the other blogs at Sun and noticed some entertaining comments on BMSeer's blog, in particular the comments on the entry titled Sun head-to-head wins again: SPECjbb2005. The comments in question are from Robin (email@example.com), who apparently works for, or has a close association with, HP. Hello Robin, I hope you are reading this. Robin doesn't feel that SPECjbb2005 represents real-world Java server applications and workloads, mostly because it doesn't stress the network or I/O subsystems. I strongly disagree: SPECjbb2005 is a valid representation of Java server workloads and has already had a significant impact on JVM and Java SE performance. Here are a few quotes from Robin's comments:

"It looks like HP is the only company smart enough to stay out of this benchmark game, with no relevance in the real world." ... "JBB pretends to measure the server-side performance of Java runtime environments but it is not at all representative of a real workload. Running unrealistic workloads to measure performance is a disservice to customers."

This statement is a bit naive. SPECjbb2005 has significant features that highlight its relevance to real-world workloads.

First, garbage collection is part of the measurement interval. SPECjbb2000 called System.gc() before each measurement interval to ease the impact of GC on the score. That was somewhat necessary to make the benchmark scale back in 2000; it is not the case now. Garbage collection is fully a part of this benchmark, and large GC pauses significantly impact benchmark scores.

Second, XML DOM Level 3 is part of the benchmark, with 20% of the workload spent in DOM tree creation and manipulation. Parsing is not included, in order to avoid I/O bottlenecks.

Third, the benchmark must run with thread counts (warehouses) at 2X the number of hardware threads on the system. A 4-way must run to 8 warehouses. A 32-way must run 64 warehouses.
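To make the warehouse scaling rule concrete, here is a minimal Java sketch — not the actual SPECjbb2005 harness, and all class and method names here are made up for illustration — of running one independent worker thread per warehouse, with the peak warehouse count set at twice the hardware thread count:

```java
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch only -- not SPECjbb2005 code.
public class WarehouseSketch {

    // The scaling rule described above: peak warehouses = 2x hardware threads.
    static int requiredWarehouses(int hardwareThreads) {
        return 2 * hardwareThreads;
    }

    // Each "warehouse" is an independent worker thread; the real benchmark
    // runs in-memory transactions, here a trivial counter stands in.
    static long runWarehouses(int warehouses, int opsPerWarehouse) {
        LongAdder completed = new LongAdder();
        Thread[] workers = new Thread[warehouses];
        for (int i = 0; i < warehouses; i++) {
            workers[i] = new Thread(() -> {
                for (int op = 0; op < opsPerWarehouse; op++) {
                    completed.increment();   // stand-in for one transaction
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return completed.sum();
    }

    public static void main(String[] args) {
        int hwThreads = Runtime.getRuntime().availableProcessors();
        int peak = requiredWarehouses(hwThreads);  // e.g. 8 warehouses on a 4-way
        System.out.println(peak + " warehouses, "
                + runWarehouses(peak, 10_000) + " transactions");
    }
}
```

Even this toy version makes the point: at 64 worker threads the JVM's scheduler, synchronization, and memory system are all in play.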
When did managing 64 threads become trivial and unaffected by system performance?

Fourth, much of the optimization and performance work that started with SPECjbb2005 had a direct impact on customer and Java EE benchmark performance. Take a look at the latest SPECjappserver2004 world record: BEA WebLogic Server 9.0 on a Sun Fire T2000 cluster running Sun J2SE 5.0_06. Sun's HotSpot J2SE 5.0_06 was the JVM for this benchmark result, the same JVM that currently holds many, many major performance records on SPECjbb2005. If performance optimizations targeted at SPECjbb2005 have a direct impact on Java EE benchmarking, how again is SPECjbb2005 irrelevant?

"In my opinion HP does not want to give credit to a bad benchmark by publishing results. Why should they give you the satisfaction of jumping off the bridge after you? Clearly HP thinks the benchmark is not important."

HP was on the core development team of SPECjbb2005. Take a look at one of my first blog entries announcing SPECjbb2005. Why would HP think a benchmark was unimportant or irrelevant when they put resources into its development?

Fifth, I/O and network were purposely left out of the benchmark to concentrate on JVM, OS, and hardware performance. The benchmark heavily stresses the memory subsystem with large Java heaps and high memory allocation rates. The OS needs to manage many threads, and possibly many processes, effectively for high performance. SPECjbb2005 stresses the JVM, the OS, and memory; it is a complete system benchmark concentrating on Java server performance.

Lastly, I would like to see HP submit SPECjbb2005 numbers: competition leads to innovation and performance optimization that benefits customers. Chances are HP is plugging away at improving their HotSpot implementation, preparing for the day they will submit a result.
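As a footnote, the measurement-interval point is easy to demonstrate. Here is a minimal Java sketch — again made-up names, not SPECjbb code — contrasting the SPECjbb2000 style (force a collection before timing, pushing GC cost outside the interval) with the SPECjbb2005 style (time straight through an allocation-heavy workload, so any GC pauses count against the score):

```java
// Illustrative sketch only -- not SPECjbb code.
public class GcInterval {

    // Allocation-heavy stand-in workload: creates and drops short-lived
    // objects so the garbage collector has real work to do.
    static long allocate(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] b = new byte[1024];   // short-lived garbage
            checksum += b.length;
        }
        return checksum;
    }

    // SPECjbb2000 style: collect first, then time only the mutator work.
    static long timedWithGcExcluded(int iterations) {
        System.gc();                     // push GC cost outside the interval
        long start = System.nanoTime();
        allocate(iterations);
        return System.nanoTime() - start;
    }

    // SPECjbb2005 style: no forced collection; pauses that occur during
    // the interval are part of the measured time.
    static long timedWithGcIncluded(int iterations) {
        long start = System.nanoTime();
        allocate(iterations);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println("GC excluded, ns: " + timedWithGcExcluded(1_000_000));
        System.out.println("GC included, ns: " + timedWithGcIncluded(1_000_000));
    }
}
```

A JVM with long GC pauses looks fine under the first timer and poor under the second, which is exactly why putting GC inside the interval makes the score reflect what customers actually experience.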