Tuesday Feb 18, 2014

SPARC T5-2 Produces SPECjbb2013-MultiJVM World Record for 2-Chip Systems

The SPECjbb2013 benchmark shows modern Java application performance. Oracle's SPARC T5-2 set a two-chip world record, which is 1.8x faster than the best two-chip x86-based server. Using Oracle Solaris and Oracle Java, Oracle delivered this two-chip world record result on the MultiJVM SPECjbb2013 metric.

  • The SPARC T5-2 server achieved 114,492 SPECjbb2013-MultiJVM max-jOPS and 43,963 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark. This result is a two-chip world record.

  • The SPARC T5-2 server running SPECjbb2013 is 1.8x faster than the Cisco UCS C240 M3 server (2.7 GHz Intel Xeon E5-2697 v2) based on both the SPECjbb2013-MultiJVM max-jOPS and SPECjbb2013-MultiJVM critical-jOPS metrics.

  • The SPARC T5-2 server running SPECjbb2013 is 2x faster than the HP ProLiant ML350p Gen8 server (2.7 GHz Intel Xeon E5-2697 v2) based on SPECjbb2013-MultiJVM max-jOPS and 1.3x faster based on SPECjbb2013-MultiJVM critical-jOPS.

  • The new Oracle results were obtained using Oracle Solaris 11 along with Oracle Java SE 8 on the SPARC T5-2 server.

  • The SPARC T5-2 server running SPECjbb2013 on a per chip basis is 1.3x faster than the NEC Express5800/A040b server (2.8 GHz Intel Xeon E7-4890 v2) based on both the SPECjbb2013-MultiJVM max-jOPS and SPECjbb2013-MultiJVM critical-jOPS metrics.

  • There are no IBM POWER7 or POWER7+ based server results on the SPECjbb2013 benchmark. IBM has published results for IBM POWER7+ based servers on SPECjbb2005, which SPEC retired in 2013.

Performance Landscape

Results of SPECjbb2013 from www.spec.org as of March 6, 2014. These are the leading 2-chip SPECjbb2013 MultiJVM results.

SPECjbb2013 - 2-Chip MultiJVM Results
System | Processor | SPECjbb2013-MultiJVM max-jOPS | SPECjbb2013-MultiJVM critical-jOPS | JDK
SPARC T5-2 | 2xSPARC T5, 3.6 GHz | 114,492 | 43,963 | Oracle Java SE 8
Cisco UCS C240 M3 | 2xIntel E5-2697 v2, 2.7 GHz | 63,079 | 23,797 | Oracle Java SE 7u45
HP ProLiant ML350p Gen8 | 2xIntel E5-2697 v2, 2.7 GHz | 62,393 | 24,310 | Oracle Java SE 7u45
IBM System x3650 M4 BD | 2xIntel E5-2695 v2, 2.4 GHz | 59,124 | 22,275 | IBM SDK V7 SR6 (*)
HP ProLiant ML350p Gen8 | 2xIntel E5-2697 v2, 2.7 GHz | 57,594 | 32,103 | Oracle Java SE 7u40
HP ProLiant BL460c Gen8 | 2xIntel E5-2697 v2, 2.7 GHz | 56,367 | 30,078 | Oracle Java SE 7u40
Sun Server X4-2, DDR3-1600 | 2xIntel E5-2697 v2, 2.7 GHz | 52,664 | 20,553 | Oracle Java SE 7u40
HP ProLiant DL360e Gen8 | 2xIntel E5-2470 v2, 2.4 GHz | 48,772 | 17,915 | Oracle Java SE 7u40

* IBM SDK V7 SR6 – IBM SDK, Java Technology Edition, Version 7, Service Refresh 6
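
The ratios quoted in the bullets above can be checked directly against this table. The following is a minimal Python sketch; the dictionary layout and the `speedup` helper are illustrative names, not part of any SPEC tooling. It assumes the HP ProLiant ML350p Gen8 row with 57,594 max-jOPS is the one behind the 2x comparison.

```python
# (max-jOPS, critical-jOPS) copied from the 2-chip results table above.
results = {
    "SPARC T5-2": (114_492, 43_963),
    "Cisco UCS C240 M3": (63_079, 23_797),
    # Assumption: this is the ML350p row used for the "2x" comparison.
    "HP ProLiant ML350p Gen8 (JDK 7u40)": (57_594, 32_103),
}


def speedup(system, baseline):
    """Return (max-jOPS ratio, critical-jOPS ratio) of system over baseline."""
    s_max, s_crit = results[system]
    b_max, b_crit = results[baseline]
    return s_max / b_max, s_crit / b_crit


for name in results:
    if name != "SPARC T5-2":
        mx, cr = speedup("SPARC T5-2", name)
        print(f"SPARC T5-2 vs {name}: {mx:.2f}x max-jOPS, {cr:.2f}x critical-jOPS")
```

Keeping both metrics in one tuple makes it harder to mix a max-jOPS numerator with a critical-jOPS denominator by accident.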

The following table compares the SPARC T5 processor to the Intel E7 v2 processor.

SPECjbb2013 - Results Using JDK 8
Per Chip Comparison
System | Processor | SPECjbb2013-MultiJVM max-jOPS | SPECjbb2013-MultiJVM critical-jOPS | max-jOPS/Chip | critical-jOPS/Chip | JDK
SPARC T5-2 | 2xSPARC T5, 3.6 GHz | 114,492 | 43,963 | 57,246 | 21,981 | Oracle Java SE 8
NEC Express5800/A040b | 4xIntel E7-4890 v2, 2.8 GHz | 177,753 | 65,529 | 44,438 | 16,382 | Oracle Java SE 8

SPARC per Chip Advantage: 1.29x (max-jOPS), 1.34x (critical-jOPS)
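
The per-chip figures divide each system result by its chip count, and the advantage is the ratio of the two per-chip values. A small Python sketch of that arithmetic follows; the `systems` dictionary and `per_chip` function are names invented here for illustration.

```python
# System-level results and chip counts from the per-chip comparison table.
systems = {
    "SPARC T5-2": {"chips": 2, "max_jops": 114_492, "critical_jops": 43_963},
    "NEC Express5800/A040b": {"chips": 4, "max_jops": 177_753, "critical_jops": 65_529},
}


def per_chip(name, metric):
    """Normalize a system-level result by the number of chips."""
    entry = systems[name]
    return entry[metric] / entry["chips"]


# Ratio of SPARC per-chip throughput to NEC per-chip throughput, per metric.
advantage = {
    metric: per_chip("SPARC T5-2", metric) / per_chip("NEC Express5800/A040b", metric)
    for metric in ("max_jops", "critical_jops")
}

for metric, ratio in advantage.items():
    print(f"{metric}: {ratio:.2f}x per-chip advantage")
```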

Configuration Summary

System Under Test:

SPARC T5-2 server
2 x SPARC T5, 3.60 GHz
512 GB memory (32 x 16 GB dimms)
Oracle Solaris 11.1
Oracle Java SE 8

Benchmark Description

The SPECjbb2013 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community.

From SPEC's press release, "SPECjbb2013 replaces SPECjbb2005. The new benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is expected to be used widely by all those interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community."

SPECjbb2013 features include:

  • A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations.
  • Both a pure throughput metric and a metric that measures critical throughput under service-level agreements (SLAs) specifying response times ranging from 10ms to 500ms.
  • Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers.
  • Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security.
  • Support for virtualization and cloud environments.
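
To make the difference between the pure throughput metric and the SLA-constrained metric concrete, here is a simplified Python sketch. It is not the SPECjbb2013 scoring algorithm: the load-step data is hypothetical, the five SLA points are an assumption chosen to span the 10ms to 500ms range, and combining them with a geometric mean is likewise an assumption of this sketch.

```python
import math

# Hypothetical load steps: (injected jOPS, 99th-percentile response time in ms).
steps = [
    (10_000, 4.0),
    (20_000, 8.0),
    (30_000, 35.0),
    (40_000, 90.0),
    (50_000, 180.0),
    (60_000, 450.0),
    (70_000, 900.0),
]

# Assumed SLA points in milliseconds (illustrative only).
SLA_POINTS_MS = [10, 50, 100, 200, 500]


def throughput_under_sla(load_steps, sla_ms):
    """Highest injected rate whose p99 response time still meets the SLA."""
    passing = [rate for rate, p99 in load_steps if p99 <= sla_ms]
    return max(passing) if passing else 0


def critical_throughput(load_steps, slas):
    """Geometric mean of the per-SLA throughputs (0.0 if any SLA is never met)."""
    per_sla = [throughput_under_sla(load_steps, s) for s in slas]
    if min(per_sla) <= 0:
        return 0.0
    return math.exp(sum(math.log(v) for v in per_sla) / len(per_sla))


max_throughput = max(rate for rate, _ in steps)
critical = critical_throughput(steps, SLA_POINTS_MS)
print(f"max throughput: {max_throughput}, critical throughput: {critical:.0f}")
```

In this toy data set the critical figure lands well below the maximum, which is the same pattern seen in the published max-jOPS and critical-jOPS pairs.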

Disclosure Statement

SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of 3/6/2014, see http://www.spec.org for more information.  SPARC T5-2 114,492 SPECjbb2013-MultiJVM max-jOPS, 43,963 SPECjbb2013-MultiJVM critical-jOPS; NEC Express5800/A040b 177,753 SPECjbb2013-MultiJVM max-jOPS, 65,529 SPECjbb2013-MultiJVM critical-jOPS; Cisco UCS c240 M3 63,079 SPECjbb2013-MultiJVM max-jOPS, 23,797 SPECjbb2013-MultiJVM critical-jOPS; HP ProLiant ML350p Gen8 62,393 SPECjbb2013-MultiJVM max-jOPS, 24,310 SPECjbb2013-MultiJVM critical-jOPS; IBM System X3650 M4 BD 59,124 SPECjbb2013-MultiJVM max-jOPS, 22,275 SPECjbb2013-MultiJVM critical-jOPS; HP ProLiant ML350p Gen8 57,594 SPECjbb2013-MultiJVM max-jOPS, 32,103 SPECjbb2013-MultiJVM critical-jOPS; HP ProLiant BL460c Gen8 56,367 SPECjbb2013-MultiJVM max-jOPS, 30,078 SPECjbb2013-MultiJVM critical-jOPS; Sun Server X4-2 52,664 SPECjbb2013-MultiJVM max-jOPS, 20,553 SPECjbb2013-MultiJVM critical-jOPS; HP ProLiant DL360e Gen8 48,772 SPECjbb2013-MultiJVM max-jOPS, 17,915 SPECjbb2013-MultiJVM critical-jOPS.

Monday Sep 23, 2013

SPARC T5-2 Delivers Best 2-Chip MultiJVM SPECjbb2013 Result

SPECjbb2013 is a new benchmark designed to show modern Java server performance. Oracle's SPARC T5-2 set a world record as the fastest two-chip system, beating the just-introduced two-chip x86-based servers. Oracle, using Oracle Solaris and Oracle JDK, delivered this two-chip world record result on the MultiJVM SPECjbb2013 metric. SPECjbb2013 is the replacement for SPECjbb2005 (SPECjbb2005 will soon be retired by SPEC).

  • Oracle's SPARC T5-2 server achieved 81,084 SPECjbb2013-MultiJVM max-jOPS and 39,129 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark. This result is a two-chip world record.

  • There are no IBM POWER7 or POWER7+ based server results on the SPECjbb2013 benchmark. IBM has published results for IBM POWER7+ based servers on SPECjbb2005, which will soon be retired by SPEC.

  • The 2-chip SPARC T5-2 server running SPECjbb2013 is 30% faster than the 2-chip Cisco UCS B200 M3 server (2.7 GHz E5-2697 v2 Ivy Bridge-based) based on SPECjbb2013-MultiJVM max-jOPS.

  • The 2-chip SPARC T5-2 server running SPECjbb2013 is 66% faster than the 2-chip Cisco UCS B200 M3 server (2.7 GHz E5-2697 v2 Ivy Bridge-based) based on SPECjbb2013-MultiJVM critical-jOPS.

  • These results were obtained using Oracle Solaris 11 along with Java Platform, Standard Edition, JDK 7 Update 40 on the SPARC T5-2 server.

From SPEC's press release, "SPECjbb2013 replaces SPECjbb2005. The new benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is expected to be used widely by all those interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community."

Performance Landscape

Results of SPECjbb2013 from www.spec.org as of September 22, 2013 and this report.

SPECjbb2013
System | Processor type | # | SPECjbb2013-MultiJVM max-jOPS | SPECjbb2013-MultiJVM critical-jOPS | JDK
SPARC T5-2 | SPARC T5, 3.6 GHz | 2 | 81,084 | 39,129 | Oracle JDK 7u40
Cisco UCS B200 M3, DDR3-1866 | Intel E5-2697 v2, 2.7 GHz | 2 | 62,393 | 23,505 | Oracle JDK 7u40
Sun Server X4-2, DDR3-1600 | Intel E5-2697 v2, 2.7 GHz | 2 | 52,664 | 20,553 | Oracle JDK 7u40
Cisco UCS C220 M3 | Intel E5-2690, 2.9 GHz | 2 | 41,954 | 16,545 | Oracle JDK 7u11

The above table represents all of the published results on www.spec.org. SPEC allows for self publication of SPECjbb2013 results. See below for locations where full reports were made available.
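
The "30% faster" and "66% faster" claims follow from the first two rows of this table. A brief Python sketch of the percentage calculation is shown below; `percent_faster` is an illustrative helper name.

```python
def percent_faster(result, baseline):
    """How much larger result is than baseline, expressed as a percentage."""
    return (result / baseline - 1.0) * 100.0


# (max-jOPS, critical-jOPS) from the table above.
sparc_t5_2 = (81_084, 39_129)
cisco_b200_m3 = (62_393, 23_505)

max_gain = percent_faster(sparc_t5_2[0], cisco_b200_m3[0])
critical_gain = percent_faster(sparc_t5_2[1], cisco_b200_m3[1])
print(f"max-jOPS: {max_gain:.1f}% faster, critical-jOPS: {critical_gain:.1f}% faster")
```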

Configuration Summary

System Under Test:

SPARC T5-2 server
2 x SPARC T5, 3.60 GHz
512 GB memory (32 x 16 GB dimms)
Oracle Solaris 11.1
Oracle JDK 7 Update 40

Benchmark Description

The SPECjbb2013 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community.

SPECjbb2013 replaces SPECjbb2005. New features include:

  • A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations.
  • Both a pure throughput metric and a metric that measures critical throughput under service-level agreements (SLAs) specifying response times ranging from 10ms to 500ms.
  • Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers.
  • Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security.
  • Support for virtualization and cloud environments.

Disclosure Statement

SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of 9/23/2013, see http://www.spec.org for more information. SPARC T5-2 81,084 SPECjbb2013-MultiJVM max-jOPS, 39,129 SPECjbb2013-MultiJVM critical-jOPS, result from https://blogs.oracle.com/BestPerf/resource/jbb2013/sparct5-922.pdf Cisco UCS B200 M3 62,393 SPECjbb2013-MultiJVM max-jOPS, 23,505 SPECjbb2013-MultiJVM critical-jOPS, result from http://www.cisco.com/en/US/prod/collateral/ps10265/le_41704_pb_specjbb2013b200.pdf; Sun Server X4-2 52,664 SPECjbb2013-MultiJVM max-jOPS, 20,553 SPECjbb2013-MultiJVM critical-jOPS, result from https://blogs.oracle.com/BestPerf/entry/20130918_x4_2_specjbb2013; Cisco UCS C220 M3 41,954 SPECjbb2013-MultiJVM max-jOPS, 16,545 SPECjbb2013-MultiJVM critical-jOPS result from www.spec.org.

Wednesday Sep 18, 2013

Sun Server X4-2 Performance Running SPECjbb2013 MultiJVM Benchmark

Oracle's Sun Server X4-2 system, using Oracle Solaris and Oracle JDK, produced a SPECjbb2013 benchmark (MultiJVM metric) result. This benchmark was designed by the industry to showcase Java server performance.

  • The Sun Server X4-2 system is 24% faster than the fastest published two-socket system based on the Intel Xeon E5-2600 (Sandy Bridge), the Dell PowerEdge R720, on SPECjbb2013-MultiJVM max-jOPS.

  • The Sun Server X4-2 system is 22% faster than the same Dell PowerEdge R720 system on SPECjbb2013-MultiJVM critical-jOPS.

  • The Sun Server X4-2 runs SPECjbb2013 (MultiJVM metric) at 70% of the published T5-2 SPECjbb2013-MultiJVM max-jOPS.

  • The Sun Server X4-2 runs SPECjbb2013 (MultiJVM metric) at 88% of the published T5-2 SPECjbb2013-MultiJVM critical-jOPS.

  • The combination of Oracle Solaris 11.1 and Oracle JDK 7 update 40 delivered a result of 52,664 SPECjbb2013-MultiJVM max-jOPS and 20,553 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark.

From SPEC's press release, "SPECjbb2013 replaces SPECjbb2005. The new benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is expected to be used widely by all those interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community."

Performance Landscape

Top two-socket results of SPECjbb2013 MultiJVM as of October 8, 2013.

SPECjbb2013
System | Processor | DDR3 | SPECjbb2013-MultiJVM max-jOPS | SPECjbb2013-MultiJVM critical-jOPS | OS | JDK
SPARC T5-2 | 2 x 3.6 GHz SPARC T5 | 1600 | 75,658 | 23,334 | Solaris 11.1 | 7u17
Cisco UCS B200 M3 | 2 x 2.7 GHz Intel E5-2697 v2 | 1866 | 62,393 | 23,505 | RHEL 6.4 | 7u40
Sun Server X4-2 | 2 x 2.7 GHz Intel E5-2697 v2 | 1600 | 52,664 | 20,553 | Solaris 11.1 | 7u40
Dell PowerEdge R720 | 2 x 2.9 GHz Intel Xeon E5-2690 | 1600 | 42,431 | 16,779 | RHEL 6.4 | 7u21

The above table includes published results from www.spec.org.
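
The "70%" and "88%" figures express the Sun Server X4-2 result as a share of the SPARC T5-2 result from the table above. A short Python sketch of that calculation follows; the `percent_of` helper and the dictionary keys are names chosen here for illustration.

```python
def percent_of(value, reference):
    """Express value as a percentage of reference."""
    return value / reference * 100.0


# Results from the table above.
x4_2 = {"max_jops": 52_664, "critical_jops": 20_553}
t5_2 = {"max_jops": 75_658, "critical_jops": 23_334}

for metric in ("max_jops", "critical_jops"):
    share = percent_of(x4_2[metric], t5_2[metric])
    print(f"X4-2 delivers {share:.1f}% of the T5-2 {metric}")
```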

Configuration Summary

System Under Test:

Sun Server X4-2
2 x Intel E5-2697 v2, 2.7 GHz
Hyper-Threading enabled
Turbo Boost enabled
128 GB memory (16 x 8 GB dimms)
Oracle Solaris 11.1 (11.1.4.2.0)
Oracle JDK 7u40

Benchmark Description

The SPECjbb2013 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community.

SPECjbb2013 replaces SPECjbb2005. New features include:

  • A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations.
  • Both a pure throughput metric and a metric that measures critical throughput under service-level agreements (SLAs) specifying response times ranging from 10ms to 500ms.
  • Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers.
  • Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security.
  • Support for virtualization and cloud environments.

Disclosure Statement

SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results from http://www.spec.org as of 10/8/2013. SPARC T5-2, 75,658 SPECjbb2013-MultiJVM max-jOPS, 23,334 SPECjbb2013-MultiJVM critical-jOPS; Cisco UCS B200 M3 62,393 SPECjbb2013-MultiJVM max-jOPS, 23,505 SPECjbb2013-MultiJVM critical-jOPS; Dell PowerEdge R720 42,431 SPECjbb2013-MultiJVM max-jOPS, 16,779 SPECjbb2013-MultiJVM critical-jOPS; Sun Server X4-2 52,664 SPECjbb2013-MultiJVM max-jOPS, 20,553 SPECjbb2013-MultiJVM critical-jOPS.

Tuesday Mar 26, 2013

SPARC T5-2 Achieves SPECjbb2013 Benchmark World Record Result

Oracle, using Oracle Solaris and Oracle JDK, delivered a two socket server world record result on the SPECjbb2013 benchmark, Multi-JVM metric. This benchmark was designed by the industry to showcase Java server performance. SPECjbb2013 is the replacement for SPECjbb2005 (SPECjbb2005 will soon be retired by SPEC).

  • Oracle's SPARC T5-2 server achieved 75,658 SPECjbb2013-MultiJVM max-jOPS and 23,334 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark. This result is a two-chip world record. (Oracle has submitted this result for review by SPEC.)

  • There are no IBM POWER7 or POWER7+ based server results on the SPECjbb2013 benchmark. IBM has published results for IBM POWER7+ based servers on SPECjbb2005, which will soon be retired by SPEC.

  • The 2-chip SPARC T5-2 server running SPECjbb2013 is 1.9x faster than the 2-chip HP ProLiant ML350p server (2.9 GHz E5-2690 Sandy Bridge-based) based on SPECjbb2013-MultiJVM max-jOPS.

  • The 2-chip SPARC T5-2 server is 15% faster than the 4-chip HP ProLiant DL560p server (2.7 GHz E5-4650 Sandy Bridge-based) based on SPECjbb2013-MultiJVM max-jOPS.

  • The 2-chip SPARC T5-2 server is 6.1x faster than the 1-chip HP ProLiant ML310e Gen8 server (3.6 GHz E3-1280v2 Ivy Bridge-based) based on SPECjbb2013-MultiJVM max-jOPS.

  • The Sun Server X3-2 system running Oracle Solaris 11 is 5% faster than the HP ProLiant ML350p Gen8 server running Windows Server 2008 based on SPECjbb2013-MultiJVM max-jOPS.

  • Oracle's SPARC T4-2 server achieved 34,804 SPECjbb2013-MultiJVM max-jOPS and 10,101 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark.
    (Oracle has submitted this result for review by SPEC.)

  • Oracle's Sun Server X3-2 system achieved 41,954 SPECjbb2013-MultiJVM max-jOPS and 13,305 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark. (Oracle has submitted this result for review by SPEC.)

  • Oracle's Sun Server X2-4 system achieved 65,211 SPECjbb2013-MultiJVM max-jOPS and 22,057 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark. (Oracle has submitted this result for review by SPEC.)

  • SPECjbb2013 demonstrates better performance on Oracle hardware and software, engineered to work together, than alternatives from HP.

  • These results were obtained using Oracle Solaris 11 along with Java Platform, Standard Edition, JDK 7 Update 17 on the SPARC T5-2 server, SPARC T4-2 server, Sun Server X3-2 and Sun Server X2-4.

From SPEC's press release, "SPECjbb2013 replaces SPECjbb2005. The new benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is expected to be used widely by all those interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community."

Performance Landscape

Results of SPECjbb2013 from www.spec.org as of March 26, 2013 and this report.

SPECjbb2013
System | Processor | SPECjbb2013-MultiJVM max-jOPS | SPECjbb2013-MultiJVM critical-jOPS | OS | JDK
SPARC T5-2 | 2 x SPARC T5 | 75,658 | 23,334 | Oracle Solaris 11.1 | Oracle JDK 7u17
HP DL560p Gen8 | 4 x Intel E5-4650 | 66,007 | 16,577 | Windows 2008 R2 | Oracle JDK 7u15
Sun Server X2-4 | 4 x Intel E7-4870 | 65,211 | 22,057 | Oracle Solaris 11.1 | Oracle JDK 7u17
Sun Server X3-2 | 2 x Intel E5-2690 | 41,954 | 13,305 | Oracle Solaris 11.1 | Oracle JDK 7u17
HP ML350p Gen8 | 2 x Intel E5-2690 | 40,047 | 12,308 | Windows 2008 R2 | Oracle JDK 7u15
SPARC T4-2 | 2 x SPARC T4 | 34,804 | 10,101 | Oracle Solaris 11.1 | Oracle JDK 7u17
Supermicro X8DTN+ | 2 x Intel E5690 | 20,977 | 6,188 | RHEL 6.3 | Oracle JDK 7u11
HP ML310e Gen8 | 1 x Intel E3-1280v2 | 12,315 | 2,908 | Windows 2008 R2 | Oracle JDK 7u15
Intel R1304BT | 1 x Intel 1260L | 6,198 | 1,722 | Windows 2008 R2 | Oracle JDK 7u11

The above table represents all of the published results on www.spec.org. SPEC allows for self publication of SPECjbb2013 results.

Configuration Summary

Systems Under Test:

SPARC T5-2 server
2 x SPARC T5, 3.60 GHz
512 GB memory (32 x 16 GB dimms)
Oracle Solaris 11.1
Oracle JDK 7 Update 17

Sun Server X2-4
4 x Intel Xeon E7-4870, 2.40 GHz
Hyper-Threading enabled
Turbo Boost enabled
128 GB memory (32 x 4 GB dimms)
Oracle Solaris 11.1
Oracle JDK 7 Update 17

Sun Server X3-2
2 x Intel E5-2690, 2.90 GHz
Hyper-Threading enabled
Turbo Boost enabled
128 GB memory (32 x 4 GB dimms)
Oracle Solaris 11.1
Oracle JDK 7 Update 17

SPARC T4-2 server
2 x SPARC T4, 2.85 GHz
256 GB memory (32 x 8 GB dimms)
Oracle Solaris 11.1
Oracle JDK 7 Update 17

Benchmark Description

The SPECjbb2013 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community.

SPECjbb2013 replaces SPECjbb2005. New features include:

  • A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations.
  • Both a pure throughput metric and a metric that measures critical throughput under service-level agreements (SLAs) specifying response times ranging from 10ms to 500ms.
  • Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers.
  • Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security.
  • Support for virtualization and cloud environments.

Disclosure Statement

SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of 3/26/2013, see http://www.spec.org for more information. SPARC T5-2 75,658 SPECjbb2013-MultiJVM max-jOPS, 23,334 SPECjbb2013-MultiJVM critical-jOPS. Sun Server X2-4 65,211 SPECjbb2013-MultiJVM max-jOPS, 22,057 SPECjbb2013-MultiJVM critical-jOPS. Sun Server X3-2 41,954 SPECjbb2013-MultiJVM max-jOPS, 13,305 SPECjbb2013-MultiJVM critical-jOPS. SPARC T4-2 34,804 SPECjbb2013-MultiJVM max-jOPS, 10,101 SPECjbb2013-MultiJVM critical-jOPS. HP ProLiant DL560p Gen8 66,007 SPECjbb2013-MultiJVM max-jOPS, 16,577 SPECjbb2013-MultiJVM critical-jOPS. HP ProLiant ML350p Gen8 40,047 SPECjbb2013-MultiJVM max-jOPS, 12,308 SPECjbb2013-MultiJVM critical-jOPS. Supermicro X8DTN+ 20,977 SPECjbb2013-MultiJVM max-jOPS, 6,188 SPECjbb2013-MultiJVM critical-jOPS. HP ProLiant ML310e Gen8 12,315 SPECjbb2013-MultiJVM max-jOPS, 2,908 SPECjbb2013-MultiJVM critical-jOPS. Intel R1304BT 6,198 SPECjbb2013-MultiJVM max-jOPS, 1,722 SPECjbb2013-MultiJVM critical-jOPS.

SPARC T5 Systems Deliver SPEC CPU2006 Rate Benchmark Multiple World Records

Oracle's SPARC T5 processor based systems delivered world record performance on the SPEC CPU2006 rate benchmarks. This was accomplished with Oracle Solaris 11.1 and Oracle Solaris Studio 12.3 software.

SPARC T5-8

  • The SPARC T5-8 server delivered world record SPEC CPU2006 rate benchmark results for systems with eight processors.

  • The SPARC T5-8 server achieved scores of 3750 SPECint_rate2006, 3490 SPECint_rate_base2006, 3020 SPECfp_rate2006, and 2770 SPECfp_rate_base2006.

  • The SPARC T5-8 server beat the 8 processor IBM Power 760 with POWER7+ processors by 1.7x on the SPECint_rate2006 benchmark and 2.2x on the SPECfp_rate2006 benchmark.

  • The SPARC T5-8 server beat the 8 processor IBM Power 780 with POWER7 processors by 35% on the SPECint_rate2006 benchmark and 14% on the SPECfp_rate2006 benchmark.

  • The SPARC T5-8 server beat the 8 processor HP DL980 G7 with Intel Xeon E7-4870 processors by 1.7x on the SPECint_rate2006 benchmark and 2.1x on the SPECfp_rate2006 benchmark.

SPARC T5-1B

  • The SPARC T5-1B server module delivered world record SPEC CPU2006 rate benchmark results for systems with one processor.

  • The SPARC T5-1B server module achieved scores of 467 SPECint_rate2006, 436 SPECint_rate_base2006, 369 SPECfp_rate2006, and 350 SPECfp_rate_base2006.

  • The SPARC T5-1B server module beat the 1 processor IBM Power 710 Express with a POWER7 processor by 62% on the SPECint_rate2006 benchmark and 49% on the SPECfp_rate2006 benchmark.

  • The SPARC T5-1B server module beat the 1 processor NEC Express5800/R120d-1M with an Intel Xeon E5-2690 processor by 31% on the SPECint_rate2006 benchmark. The SPARC T5-1B server module beat the 1 processor Huawei RH2288 V2 with an Intel Xeon E5-2690 processor by 44% on the SPECfp_rate2006 benchmark.

  • The SPARC T5-1B server module beat the 1 processor Supermicro A+ 1012G-MTF with an AMD Opteron 6386 SE processor by 51% on the SPECint_rate2006 benchmark and 65% on the SPECfp_rate2006 benchmark.

Performance Landscape

Complete benchmark results are at the SPEC website, SPEC CPU2006 Results. The tables below provide the new Oracle results as well as select results from other vendors.

SPEC CPU2006 Rate Results – Eight Processors
SPECint_rate2006
System | Processor | ch/co/th * | Peak | Base
SPARC T5-8 | SPARC T5, 3.6 GHz | 8/128/1024 | 3750 | 3490
IBM Power 780 | POWER7, 3.92 GHz | 8/64/256 | 2770 | 2420
HP DL980 G7 | Xeon E7-4870, 2.4 GHz | 8/80/160 | 2180 | 2070
IBM Power 760 | POWER7+, 3.42 GHz | 8/48/192 | 2170 | 1480
Dell PowerEdge C6145 | Opteron 6180 SE, 2.5 GHz | 8/96/96 | 1670 | 1440

SPECfp_rate2006
System | Processor | ch/co/th * | Peak | Base
SPARC T5-8 | SPARC T5, 3.6 GHz | 8/128/1024 | 3020 | 2770
IBM Power 780 | POWER7, 3.92 GHz | 8/64/256 | 2640 | 2410
HP DL980 G7 | Xeon E7-4870, 2.4 GHz | 8/80/160 | 1430 | 1380
IBM Power 760 | POWER7+, 3.42 GHz | 8/48/192 | 1400 | 1130
Dell PowerEdge C6145 | Opteron 6180 SE, 2.5 GHz | 8/96/96 | 1310 | 1200

* ch/co/th — chips / cores / threads enabled

SPEC CPU2006 Rate Results – One Processor
SPECint_rate2006
System | Processor | ch/co/th * | Peak | Base
SPARC T5-1B | SPARC T5, 3.6 GHz | 1/16/128 | 467 | 436
NEC Express5800/R120d-1M | Xeon E5-2690, 2.9 GHz | 1/8/16 | 357 | 343
Supermicro A+ 1012G-MTF | Opteron 6386 SE, 2.8 GHz | 1/16/16 | 309 | 269
IBM Power 710 Express | POWER7, 3.556 GHz | 1/8/32 | 289 | 255

SPECfp_rate2006
System | Processor | ch/co/th * | Peak | Base
SPARC T5-1B | SPARC T5, 3.6 GHz | 1/16/128 | 369 | 350
Huawei RH2288 V2 | Xeon E5-2690, 2.9 GHz | 1/8/16 | 257 | 250
IBM Power 710 Express | POWER7, 3.556 GHz | 1/8/32 | 248 | 229
Supermicro A+ 1012G-MTF | Opteron 6386 SE, 2.8 GHz | 1/16/16 | 223 | 199

* ch/co/th — chips / cores / threads enabled

Configuration Summary

Systems Under Test:

SPARC T5-8
8 x 3.6 GHz SPARC T5 processors
4 TB memory (128 x 32 GB dimms)
2 TB on 8 x 600 GB 10K RPM SAS disks, arranged as 4 x 2-way mirrors
Oracle Solaris 11.1 (SRU 4.6)
Oracle Solaris Studio 12.3 1/13 PSE

SPARC T5-1B
1 x 3.6 GHz SPARC T5 processor
256 GB memory (16 x 16 GB dimms)
157 GB on 2 x 300 GB 10K RPM SAS disks (mirrored)
Oracle Solaris 11.1 (SRU 3.4)
Oracle Solaris Studio 12.3 1/13 PSE

Benchmark Description

SPEC CPU2006 is SPEC's most popular benchmark. It measures:

  • Speed — single copy performance of chip, memory, compiler
  • Rate — multiple copy (throughput)

The benchmark is also divided into integer intensive applications and floating point intensive applications:

  • integer: 12 benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
  • floating point: 17 benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

It is also divided depending upon the amount of optimization allowed:

  • base: optimization is consistent per compiled language, all benchmarks must be compiled with the same flags per language.
  • peak: specific compiler optimization is allowed per application.

The overall metrics for the benchmark which are commonly used are:

  • SPECint_rate2006, SPECint_rate_base2006: integer, rate
  • SPECfp_rate2006, SPECfp_rate_base2006: floating point, rate
  • SPECint2006, SPECint_base2006: integer, speed
  • SPECfp2006, SPECfp_base2006: floating point, speed
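
Each overall SPEC CPU2006 metric is a geometric mean of per-benchmark ratios. The Python sketch below illustrates that aggregation for a rate run. The ratio formula (copies multiplied by reference time over measured time) reflects the commonly described rate calculation and should be treated as an assumption here, and all input values are hypothetical rather than taken from any published result.

```python
from statistics import geometric_mean  # available in Python 3.8+


def rate_ratio(copies, reference_seconds, measured_seconds):
    """Per-benchmark rate ratio: copies * reference time / measured time (assumed form)."""
    return copies * reference_seconds / measured_seconds


def overall_metric(ratios):
    """Overall score as the geometric mean of the per-benchmark ratios."""
    return geometric_mean(ratios)


# Hypothetical three-benchmark suite run with 16 copies each:
# (reference time in seconds, measured time in seconds).
hypothetical_runs = [(9_000, 400.0), (8_000, 500.0), (7_000, 350.0)]
ratios = [rate_ratio(16, ref, measured) for ref, measured in hypothetical_runs]
print(f"overall rate metric: {overall_metric(ratios):.1f}")
```

A geometric mean keeps one unusually fast benchmark from dominating the score the way it would in an arithmetic mean.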

Disclosure Statement

SPEC and the benchmark names SPECfp and SPECint are registered trademarks of the Standard Performance Evaluation Corporation. Results as of March 26, 2013 from www.spec.org and this report. SPARC T5-8: 3750 SPECint_rate2006, 3490 SPECint_rate_base2006, 3020 SPECfp_rate2006, 2770 SPECfp_rate_base2006; SPARC T5-1B: 467 SPECint_rate2006, 436 SPECint_rate_base2006, 369 SPECfp_rate2006, 350 SPECfp_rate_base2006.

SPARC T5 Systems Produce Oracle TimesTen Benchmark World Record

The Oracle TimesTen In-Memory Database is optimized to run on Oracle's SPARC T5 processor platforms running Oracle Solaris 11. In this series of tests, systems with the new SPARC T5 processor were significantly faster than systems based on other processors. Two tests were run to explore TimesTen performance: a Mobile Call Processing test (based on customer workload) and Oracle's TimesTen Performance Throughput Benchmark (TPTBM). TimesTen version 11.2.2.4 was used for all tests.

  • On the TimesTen Performance Throughput Benchmark (TPTBM), the SPARC T5-8 server produced a world record 59.9 million read transactions per second.

  • On the Mobile Call Processing test, the SPARC T5 processor achieves 2.4 times more throughput than the Intel Xeon E7-4870 processor. The two-chip SPARC T5-2 server is 22% faster than an x86 server with four Intel E7-4870 2.4 GHz processors.

  • On the TimesTen Performance Throughput Benchmark (TPTBM) read-only workload, the SPARC T5 processor achieves 2.2 times higher throughput than the Intel Xeon E7-4870 processor. On the same workload, the two-chip SPARC T5-2 server produces 10% more throughput than an x86 server with four Intel E7-4870 processors and has almost twice the performance of a 2-chip Intel E5-2680 system.

  • With the TPTBM read-only workload, the SPARC T5-8 server delivers 3.8x more throughput than a SPARC T5-2 Server, showing excellent scalability.

  • The SPARC T5 processor delivers over twice the performance of the previous generation SPARC T4 processor and over 4x the performance of the SPARC T3 processor, all in the same amount of space.

  • The SPARC T5-2 server delivers 2.4x the performance of the SPARC T4-2 server in the same 3U space. This is better performance than that of the SPARC T4-4 server, which occupies 5U.

Performance Landscape

Mobile Call Processing Test Performance

Processor Tps
SPARC T5, 3.6 GHz 367,600
Intel Xeon E7-4870, 2.4 GHz 302,000
SPARC T4, 2.85 GHz 230,500

All systems measured using Oracle Solaris 11 and Oracle TimesTen In-Memory Database 11.2.2.4.1

TimesTen Performance Throughput Benchmark (TPTBM) Read-Only

System | Processor | Chips | Tps | Tps/Chip
SPARC T5-8 | SPARC T5, 3.6 GHz | 8 | 59.9M | 7.5M
SPARC T5-2 | SPARC T5, 3.6 GHz | 2 | 15.9M | 7.9M
x86 | Intel Xeon E7-4870, 2.4 GHz | 4 | 14.5M | 3.6M
SPARC T4-4 | SPARC T4, 3.0 GHz | 4 | 14.2M | 3.6M
x86* | Intel Xeon E5-2680, 2.7 GHz | 2 | 8.5M | 4.3M
SPARC T4-2 | SPARC T4, 2.85 GHz | 2 | 6.5M | 3.3M
SPARC T3-4 | SPARC T3, 1.65 GHz | 4 | 7.9M | 1.9M
T5440 | SPARC T2+, 1.4 GHz | 4 | 3.1M | 0.8M

All systems measured using Oracle Solaris 11 and Oracle TimesTen In-Memory Database 11.2.2.4.1

*Intel E5-2680 using Oracle Linux and Oracle TimesTen In-Memory Database 11.2.2.4.1
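
The per-chip column and the 3.8x scaling figure can be reproduced from the read-only results. The Python sketch below does so for three of the rows. The `scaling_efficiency` value (observed speedup divided by the increase in chip count) is a derived number introduced here for illustration and does not appear in the original results.

```python
# (chips, read-only Tps in millions) from the TPTBM read-only table.
tptbm_read = {
    "SPARC T5-8": (8, 59.9),
    "SPARC T5-2": (2, 15.9),
    "x86 E7-4870": (4, 14.5),
}


def tps_per_chip(name):
    """Throughput in millions of Tps divided by the number of chips."""
    chips, tps_millions = tptbm_read[name]
    return tps_millions / chips


# Throughput gain going from the 2-chip to the 8-chip SPARC T5 system.
scaling = tptbm_read["SPARC T5-8"][1] / tptbm_read["SPARC T5-2"][1]

# Fraction of ideal (4x) scaling achieved for a 4x increase in chips.
chip_ratio = tptbm_read["SPARC T5-8"][0] / tptbm_read["SPARC T5-2"][0]
scaling_efficiency = scaling / chip_ratio

print(f"T5-8 per chip: {tps_per_chip('SPARC T5-8'):.1f}M Tps")
print(f"T5-8 vs T5-2: {scaling:.1f}x, {scaling_efficiency:.0%} of linear scaling")
```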

TimesTen Performance Throughput Benchmark (TPTBM) Update-Only

Processor Tps
SPARC T5, 3.6 GHz 1,031.7K
Intel Xeon E7-4870, 2.4 GHz 988.1K
Intel Xeon E5-2680, 2.7 GHz * 944.3K
SPARC T4, 3.0 GHz 678.0K

All systems measured using Oracle Solaris 11 and Oracle TimesTen In-Memory Database 11.2.2.4.1

*Intel E5-2680 using Oracle Linux and Oracle TimesTen In-Memory Database 11.2.2.4.1

Configuration Summary

Hardware Configurations:

SPARC T5-8 server
8 x SPARC T5 processors, 3.6 GHz
2 TB memory
1 x 8 Gbs FC Qlogic HBA
1 x 6 Gbs SAS HBA
2 x 300 GB internal disks
Oracle Solaris 11
TimesTen 11.2.2.4.1
1 x Sun Fire X4275 server configured as COMSTAR redo head (log)

SPARC T5-2 server
2 x SPARC T5 processors, 3.6 GHz
512 GB memory
1 x 8 Gbs FC Qlogic HBA
1 x 6 Gbs SAS HBA
2 x 300 GB internal disks
Oracle Solaris 11
TimesTen 11.2.2.4.1
1 x Sun Fire X4275 server configured as COMSTAR redo head (log)

SPARC T4-4 server
4 x SPARC T4 processors, 3.0 GHz
1 TB memory
1 x 8 Gbs FC Qlogic HBA
1 x 6 Gbs SAS HBA
6 x 300 GB internal disks
Oracle Solaris 11
TimesTen 11.2.2.4.1
Sun Storage F5100 Flash Array (80 x 24 GB flash modules)
1 x Sun Fire X4275 server configured as COMSTAR redo head (log)

SPARC T4-2 server
2 x SPARC T4 processors, 2.85 GHz
256 GB memory
1 x 8 Gbs FC Qlogic HBA
1 x 6 Gbs SAS HBA
4 x 300 GB internal disks
Oracle Solaris 11
TimesTen 11.2.2.4.1
Sun Storage F5100 Flash Array (40 x 24 GB flash modules)
1 x Sun Fire X4275 server configured as COMSTAR head

SPARC T3-4 server
4 x SPARC T3 processors, 1.65 GHz
512 GB memory
1 x 8 Gbs FC Qlogic HBA
8 x 146 GB internal disks
Oracle Solaris 11
TimesTen 11.2.2.4.1
1 x Sun Fire X4275 server configured as COMSTAR head

Intel Server x86_64
2 x Intel Xeon E5-2680 processors, 2.7 GHz
256 GB memory
4 x SSD SAS disks (log)
1 x 600 GB internal disks
Oracle Linux
TimesTen 11.2.2.4.1

Sun Server X2-4
4 x Intel Xeon E7-4870 processors, 2.4 GHz
512 GB memory
1 x 8 Gbs FC Qlogic HBA
6 x 146 GB internal disks
Oracle Solaris 11
TimesTen 11.2.2.4.1
1 x Sun Fire X4275 server configured as COMSTAR redo head (log)

Benchmark Descriptions

TimesTen Performance Throughput Benchmark (TPTBM) is shipped with TimesTen and measures the total throughput of the system. The benchmark workloads can be read, insert, update, and delete operations, or a mix of them as required.

Mobile Call Processing is a customer-based workload for processing calls made by mobile phone subscribers. The workload has a mixture of read-only, update, and insert-only transactions. The peak throughput performance is measured from multiple concurrent processes executing the transactions until a peak performance is reached via saturation of the available resources.

Key Points and Best Practices

The Mobile Call Processing test used Oracle Solaris processor sets in all environments for optimum performance. This feature isolates running processes from other processes in the system. It was combined with parameters that limit memory pages to the lgroup within the processor set, and each processor set was isolated to a single processor within the system.

See Also

Disclosure Statement

Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 26 March 2013.

SPARC T5-2 Achieves ZFS File System Encryption Benchmark World Record

Oracle continues to lead in enterprise security. Oracle's SPARC T5 processors combined with the Oracle Solaris ZFS file system demonstrate faster file system encryption than equivalent x86 systems using the Intel Xeon Processor E5-2600 Sequence chips which have AES-NI security instructions.

Encryption is the process where data is encoded for privacy and a key is needed by the data owner to access the encoded data.

  • The SPARC T5-2 server is 3.4x faster than a 2 processor Intel Xeon E5-2690 server running Oracle Solaris 11.1 that uses the AES-NI GCM security instructions for creating encrypted files.

  • The SPARC T5-2 server is 2.2x faster than a 2 processor Intel Xeon E5-2690 server running Oracle Solaris 11.1 that uses the AES-NI CCM security instructions for creating encrypted files.

  • The SPARC T5-2 server consumes a significantly smaller percentage of system resources than a 2 processor Intel Xeon E5-2690 server.

Performance Landscape

Below are results running two different ciphers for ZFS encryption. Results are presented for runs without any cipher, labeled clear, and a variety of different key lengths. The results represent the maximum delivered values measured for 3 concurrent sequential write operations using 1M blocks. Performance is measured in MB/sec (bigger is better). System utilization is reported as %CPU as measured by iostat (smaller is better).

The results for the x86 server were obtained using Oracle Solaris 11.1 with performance bug fixes.

Encryption Using AES-GCM Ciphers

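GCM Encryption: 3 Concurrent Sequential Writes

| System | Clear MB/sec | Clear %CPU | AES-256-GCM MB/sec | AES-256-GCM %CPU | AES-192-GCM MB/sec | AES-192-GCM %CPU | AES-128-GCM MB/sec | AES-128-GCM %CPU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SPARC T5-2 server | 3,918 | 7 | 3,653 | 14 | 3,676 | 15 | 3,628 | 14 |
| SPARC T4-2 server | 2,912 | 11 | 2,662 | 31 | 2,663 | 30 | 2,779 | 31 |
| 2-Socket Intel Xeon E5-2690 | 3,969 | 42 | 1,062 | 58 | 1,067 | 58 | 1,076 | 57 |
| SPARC T5-2 vs x86 server | 1.0x | | 3.4x | | 3.4x | | 3.4x | |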

Encryption Using AES-CCM Ciphers

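CCM Encryption: 3 Concurrent Sequential Writes

| System | Clear MB/sec | Clear %CPU | AES-256-CCM MB/sec | AES-256-CCM %CPU | AES-192-CCM MB/sec | AES-192-CCM %CPU | AES-128-CCM MB/sec | AES-128-CCM %CPU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SPARC T5-2 server | 3,862 | 7 | 3,665 | 15 | 3,622 | 14 | 3,707 | 12 |
| SPARC T4-2 server | 2,945 | 11 | 2,471 | 26 | 2,801 | 26 | 2,442 | 25 |
| 2-Socket Intel Xeon E5-2690 | 3,868 | 42 | 1,566 | 64 | 1,632 | 63 | 1,689 | 66 |
| SPARC T5-2 vs x86 server | 1.0x | | 2.3x | | 2.2x | | 2.2x | |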
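
The ratio rows in the two tables above can be reproduced by dividing the SPARC T5-2 throughput by the x86 throughput for each cipher. The following is a minimal Python sketch; the dictionary layout and the function name are illustrative assumptions, and the numbers are the MB/sec values from the tables.

```python
# MB/sec values copied from the GCM and CCM tables above.
T5_2 = {
    "AES-256-GCM": 3653, "AES-192-GCM": 3676, "AES-128-GCM": 3628,
    "AES-256-CCM": 3665, "AES-192-CCM": 3622, "AES-128-CCM": 3707,
}
X86 = {
    "AES-256-GCM": 1062, "AES-192-GCM": 1067, "AES-128-GCM": 1076,
    "AES-256-CCM": 1566, "AES-192-CCM": 1632, "AES-128-CCM": 1689,
}

def speedup(cipher):
    """SPARC T5-2 throughput divided by x86 throughput, rounded to one decimal."""
    return round(T5_2[cipher] / X86[cipher], 1)

for cipher in T5_2:
    print(f"{cipher}: {speedup(cipher)}x")
```

These ratios cover throughput only; the %CPU columns are not factored in.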

Configuration Summary

Storage Configuration:

Sun Storage 6780 array
4 CSM2 trays, each with 16 83GB 15K RPM drives
8x 8 GB/sec Fiber Channel ports per host
R0 Write cache enabled, controller mirroring off for peak write bandwidth
8 Drive R0 512K stripe pools mirrored via ZFS to storage

Sun Storage 6580 array
9 CSM2 trays, each with 16 136GB 15K RPM drives
8x 4 GB/sec Fiber Channel ports per host
R0 Write cache enabled, controller mirroring off for peak write bandwidth
4 Drive R0 512K stripe pools mirrored via ZFS to storage

Server Configuration:

SPARC T5-2 server
2 x SPARC T5 3.6 GHz processors
512 GB memory
Oracle Solaris 11.1

SPARC T4-2 server
2 x SPARC T4 2.85 GHz processors
256 GB memory
Oracle Solaris 11.1

Sun Server X3-2L server
2 x Intel Xeon E5-2690, 2.90 GHz processors
128 GB memory
Oracle Solaris 11.1

Switch Configuration:

Brocade 5300 FC switch

Benchmark Description

This benchmark evaluates secure file system performance by measuring the rate at which encrypted data can be written. The Vdbench tool was used to generate the IO load. The test performed 3 concurrent sequential write operations using 1M blocks to 3 separate files.

Key Points and Best Practices

  • ZFS encryption is integrated with the ZFS command set. Like other ZFS operations, encryption operations such as key changes and re-key are performed online.

  • Data is encrypted using AES (Advanced Encryption Standard) with key lengths of 256, 192, and 128 in the CCM and GCM operation modes.

  • The flexibility of encrypting specific file systems is a key feature.

  • ZFS encryption is inheritable to descendent file systems. Key management can be delegated through ZFS delegated administration.

  • ZFS encryption uses the Oracle Solaris Cryptographic Framework which gives it access to SPARC T5 and Intel Xeon E5-2690 processor hardware acceleration or to optimized software implementations of the encryption algorithms automatically.

  • On modern computers with multiple threads per core, simple statistics like %utilization measured in tools like iostat and vmstat are not "hard" indications of the resources that might be available for other processing. For example, 90% idle may not mean that 10 times the work can be done. So drawing numerical conclusions must be done carefully.

See Also

Disclosure Statement

Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of March 26, 2013.

SPARC T5-2 Obtains Oracle Internet Directory Benchmark World Record Performance

Oracle's SPARC T5-2 server running Oracle Internet Directory (OID, Oracle's LDAP Directory Server) on Oracle Solaris 11 achieved a record result for LDAP searches/second with 1000 clients.

  • The SPARC T5-2 server running Oracle Internet Directory on Oracle Solaris 11 achieved a result of 944,624 LDAP searches/sec with an average latency of 1.05 ms with 1000 clients.

  • The SPARC T5-2 server running Oracle Internet Directory demonstrated 2.7x better throughput and 39% better latency than a similarly configured OID and SPARC T4 benchmark environment.

  • The SPARC T5-2 server running Oracle Internet Directory demonstrated 39% better throughput and latency for LDAP searches in a core-to-core comparison with an x86 system configured with two Intel Xeon X5675 processors.

  • Oracle Internet Directory achieved near linear scaling on the SPARC T5-2 server, from 68,399 LDAP searches/sec with 2 cores to 944,624 LDAP searches/sec with 32 cores.

  • Oracle Internet Directory and the SPARC T5-2 server achieved up to 12,453 LDAP modifies/sec with an average latency of 3.9 msec for 50 clients.

Performance Landscape

Oracle Internet Directory Tests
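| System | c/c/th | Search ops/sec | Search lat (msec) | Modify ops/sec | Modify lat (msec) | Add ops/sec | Add lat (msec) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SPARC T5-2 | 2/32/256 | 944,624 | 1.05 | 12,453 | 3.9 | 888 | 17.9 |
| SPARC T4-4 | 4/32/256 | 682,000 | 1.46 | 12,000 | 4.0 | 835 | 19.0 |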
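
As a rough consistency check on the throughput and latency columns above, Little's law relates the two: for a closed loop of clients that each issue the next request as soon as the previous one completes, the number of requests in flight is approximately throughput times latency. The sketch below is illustrative only. The function name is an assumption, and zero client think time is assumed, which the source does not state.

```python
def implied_concurrency(ops_per_sec, latency_msec):
    """Little's law: requests in flight = throughput x latency (latency in msec)."""
    return ops_per_sec * latency_msec / 1000.0

# SPARC T5-2 row from the table above: (ops/sec, latency in msec).
t5_2 = {"search": (944_624, 1.05), "modify": (12_453, 3.9), "add": (888, 17.9)}

for test, (ops, lat) in t5_2.items():
    print(f"{test}: about {implied_concurrency(ops, lat):.0f} requests in flight")
```

For the SPARC T5-2 row this gives roughly 992, 49, and 16 requests in flight, close to the 1000, 50, and 16 clients used for the search, modify, and add tests.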

To compare the SPARC T5-2 to a 12-core x86 system, only 1 processor and 12 cores were used in the SPARC T5-2.

Oracle Internet Directory Tests – Comparing Against x86
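| System | c/c/th | Search ops/sec | Search lat (msec) | Compare ops/sec | Compare lat (msec) | Authentication ops/sec | Authentication lat (msec) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SPARC T5-2 | 1/12/96 | 417,000 | 1.19 | 274,185 | 1.82 | 149,623 | 3.30 |
| x86 2 x Intel X5675 | 2/12/24 | 299,000 | 1.66 | 202,433 | 2.46 | 119,198 | 4.19 |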

Scaling runs were also made on the SPARC T5-2 server.

Scaling of Search Tests – SPARC T5-2
Cores Clients ops/sec Latency (msec)
32 1000 944,624 1.05
24 1000 823,741 1.21
16 500 560,709 0.88
8 500 270,601 1.84
4 100 145,879 0.68
2 100 68,399 1.46
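
One way to quantify the scaling in the table above is to compare per-core throughput at each core count against the 2-core run. This is a minimal sketch with illustrative names. The runs used different client counts, so the result is only an approximation of scaling efficiency.

```python
# Search throughput (ops/sec) by number of cores, copied from the table above.
SEARCH_OPS = {2: 68_399, 4: 145_879, 8: 270_601, 16: 560_709, 24: 823_741, 32: 944_624}

def scaling_efficiency(cores, baseline_cores=2):
    """Per-core throughput at `cores` relative to per-core throughput at the baseline."""
    per_core = SEARCH_OPS[cores] / cores
    baseline = SEARCH_OPS[baseline_cores] / baseline_cores
    return per_core / baseline

for cores in sorted(SEARCH_OPS):
    print(f"{cores:2d} cores: {scaling_efficiency(cores):.2f}")
```

A value of 1.00 means per-core throughput matches the 2-core run; values below 1.00 indicate some loss as cores are added.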

Configuration Summary

System Under Test:

SPARC T5-2
2 x SPARC T5 processors, 3.6 GHz
512 GB memory
4 x 300 GB internal disks
Flash Storage (used for database and log files)
1 x Sun Storage 2540-M2 (used for redo logs)
Oracle Solaris 11.1
Oracle Internet Directory 11g Release 1 PS6 (11.1.1.7.0)
Oracle Database 11g Enterprise Edition 11.2.0.3 (64-bit)

Benchmark Description

Oracle Internet Directory (OID) is Oracle's LDAPv3 Directory Server. The throughput of five key operations is measured: Search, Compare, Modify, Mix, and Add.

LDAP Search Operations Test

This test scenario involved concurrent clients binding once to OID and then performing repeated LDAP Search operations. The salient characteristics of this test scenario are as follows:

  • SLAMD SearchRate job was used.
  • BaseDN of the search is the root of the DIT, the scope is SUBTREE, the search filter is of the form UID=, and DN and UID are the required attributes.
  • Each LDAP search operation matches a single entry.
  • The total number of concurrent clients was 1000, distributed across two client nodes.
  • Each client binds to OID once and performs repeated LDAP Search operations. Each search operation looks up a unique entry, so that no client looks up the same entry twice, no two clients look up the same entry, and all entries are searched randomly.
  • In one run of the test, random entries from the 50 Million entries are looked up in as many LDAP Search operations.
  • Test job was run for 60 minutes.

LDAP Compare Operations Test

This test scenario involved concurrent clients binding once to OID and then performing repeated LDAP Compare operations on the userpassword attribute. The salient characteristics of this test scenario are as follows:

  • SLAMD CompareRate job was used.
  • Each LDAP compare operation matches the user password of a user.
  • The total number of concurrent clients was 1000, distributed across two client nodes.
  • Each client binds to OID once and performs repeated LDAP compare operations.
  • In one run of the test, random entries from the 50 Million entries are compared in as many LDAP compare operations.
  • Test job was run for 60 minutes.

LDAP Modify Operations Test

This test scenario consisted of concurrent clients binding once to OID and then performing repeated LDAP Modify operations. The salient characteristics of this test scenario are as follows:

  • SLAMD LDAP modrate job was used.
  • A total of 50 concurrent LDAP clients were used.
  • Each client updates a unique entry each time and a total of 50 Million entries are updated.
  • Test job was run for 60 minutes.
  • Value length was set to 11.
  • Attribute that is being modified is not indexed.

LDAP Mixed Load Test

The test scenario involved both the LDAP search and LDAP modify clients enumerated above.

  • The ratio involved 60% LDAP search clients, 30% LDAP bind and 10% LDAP modify clients.
  • A total of 1000 concurrent LDAP clients were used and were distributed on 2 client nodes.
  • Test job was run for 60 minutes.

LDAP Add Load Test

The test scenario involved concurrent clients adding new entries as follows.

  • SLAMD standard add rate job was used.
  • A total of 500,000 entries were added.
  • A total of 16 concurrent LDAP clients were used.
  • SLAMD adds inetorgperson objectclass entries with 21 attributes (including operational attributes).

See Also

Disclosure Statement

Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 26 March 2013.

Friday Feb 22, 2013

Oracle Produces World Record SPECjbb2013 Result with Oracle Solaris and Oracle JDK

Oracle, using Oracle Solaris and Oracle JDK, delivered a world record result on the SPECjbb2013 benchmark (Composite metric). This benchmark was designed by the industry to showcase Java server performance. SPECjbb2013 is the replacement for SPECjbb2005 (SPECjbb2005 will soon be retired by SPEC).

  • Oracle Solaris is 1.8x faster on the SPECjbb2013-Composite max-jOPS metric than the Red Hat Enterprise Linux result.

  • Oracle Solaris is 2.2x faster on the SPECjbb2013-Composite critical-jOPS metric than the Red Hat Enterprise Linux result.

  • The combination of Oracle Solaris 11.1 and Oracle JDK 7 update 15 delivered a result of 37,007 SPECjbb2013-Composite max-jOPS and 13,812 SPECjbb2013-Composite critical-jOPS on the SPECjbb2013 benchmark.
    (Oracle has submitted this result for review by SPEC and it is currently under review.)

From SPEC's press release, "SPECjbb2013 replaces SPECjbb2005. The new benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is expected to be used widely by all those interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community."

Performance Landscape

Results of SPECjbb2013 from www.spec.org as of February 22, 2013 and this report.

SPECjbb2013
| System | Processor | SPECjbb2013-Composite max-jOPS | SPECjbb2013-Composite critical-jOPS | OS | JDK |
| --- | --- | --- | --- | --- | --- |
| Sun Server X2-4 | 4 x Intel E7-4870 | 37,007 | 13,812 | Solaris 11.1 | Oracle JDK 7u15 |
| Supermicro X8DTN+ | 2 x Intel E5690 | 20,977 | 6,188 | RHEL 6.3 | Oracle JDK 7u11 |
| Intel R1304BT | 1 x Intel 1260L | 6,198 | 1,722 | Windows 2008 R2 | Oracle JDK 7u11 |

The above table represents all of the published results on www.spec.org. SPEC allows for self publication of SPECjbb2013 results. AnandTech has taken advantage of this and has some results on its website that were run on Intel Xeon E5-2660, AMD Opteron 6380, and AMD Opteron 6376 systems. This information can be viewed at www.anandtech.com. Unfortunately, AnandTech did not follow SPEC's Fair Use requirements in disclosing information about its runs, so it is not possible to include the results in the table above.

SPECjbb2013
| System | Processor | SPECjbb2013-MultiJVM max-jOPS | SPECjbb2013-MultiJVM critical-jOPS | OS | JDK |
| --- | --- | --- | --- | --- | --- |
| HP ProLiant DL560p Gen8 | 4 x Intel E5-4650 | 66,007 | 16,577 | Windows Server 2008 | Oracle JDK 7u15 |
| HP ProLiant ML350p Gen8 | 2 x Intel E5-2690 | 40,047 | 12,308 | Windows Server 2008 | Oracle JDK 7u15 |
| HP ProLiant ML310e Gen8 | 1 x Intel E3-1280v2 | 12,315 | 2,908 | Windows 2008 R2 | Oracle JDK 7u15 |

Configuration Summary

System Under Test:

Sun Server X2-4
4 x Intel Xeon E7-4870, 2.40 GHz
Hyper-Threading enabled
Turbo Boost enabled
128 GB memory (32 x 4 GB dimms)
Oracle Solaris 11.1
Oracle JDK 7 update 15

Benchmark Description

The SPECjbb2013 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community.

SPECjbb2013 replaces SPECjbb2005. New features include:

  • A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations.
  • Both a pure throughput metric and a metric that measures critical throughput under service-level agreements (SLAs) specifying response times ranging from 10ms to 500ms.
  • Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers.
  • Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security.
  • Support for virtualization and cloud environments.

See Also

Disclosure Statement

SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of 2/22/2013, see http://www.spec.org for more information. Sun Server X2-4 37007 SPECjbb2013-Composite max-jOPS, 13812 SPECjbb2013-Composite critical-jOPS.

Thursday Apr 12, 2012

Sun Fire X4270 M3 SAP Enhancement Package 4 for SAP ERP 6.0 (Unicode) Two-Tier Standard Sales and Distribution (SD) Benchmark

Oracle's Sun Fire X4270 M3 server (now known as Sun Server X3-2L) achieved 8,320 SAP SD Benchmark users running SAP enhancement package 4 for SAP ERP 6.0 with unicode software using Oracle Database 11g and Oracle Solaris 10.

  • The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat both the IBM Flex System x240 and IBM System x3650 M4 servers running DB2 9.7 and Windows Server 2008 R2 Enterprise Edition.

  • The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the HP ProLiant BL460c Gen8 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 6%.

  • The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat the Cisco UCS C240 M3 server running SQL Server 2008 and Windows Server 2008 R2 Datacenter Edition by 9%.

  • The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the Fujitsu PRIMERGY RX300 S7 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 10%.

Performance Landscape

SAP-SD 2-Tier Performance Table (in decreasing performance order).

SAP ERP 6.0 Enhancement Pack 4 (Unicode) Results
(benchmark version from January 2009 to April 2012)

| System | Processor | Memory | OS | Database | Users | SAP ERP/ECC Release | SAPS | SAPS/Proc | Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sun Fire X4270 M3 | 2xIntel Xeon E5-2690 @2.90GHz | 128 GB | Oracle Solaris 10 | Oracle Database 11g | 8,320 | 2009 6.0 EP4 (Unicode) | 45,570 | 22,785 | 10-Apr-12 |
| IBM Flex System x240 | 2xIntel Xeon E5-2690 @2.90GHz | 128 GB | Windows Server 2008 R2 EE | DB2 9.7 | 7,960 | 2009 6.0 EP4 (Unicode) | 43,520 | 21,760 | 11-Apr-12 |
| HP ProLiant BL460c Gen8 | 2xIntel Xeon E5-2690 @2.90GHz | 128 GB | Windows Server 2008 R2 EE | SQL Server 2008 | 7,865 | 2009 6.0 EP4 (Unicode) | 42,920 | 21,460 | 29-Mar-12 |
| IBM System x3650 M4 | 2xIntel Xeon E5-2690 @2.90GHz | 128 GB | Windows Server 2008 R2 EE | DB2 9.7 | 7,855 | 2009 6.0 EP4 (Unicode) | 42,880 | 21,440 | 06-Mar-12 |
| Cisco UCS C240 M3 | 2xIntel Xeon E5-2690 @2.90GHz | 128 GB | Windows Server 2008 R2 DE | SQL Server 2008 | 7,635 | 2009 6.0 EP4 (Unicode) | 41,800 | 20,900 | 06-Mar-12 |
| Fujitsu PRIMERGY RX300 S7 | 2xIntel Xeon E5-2690 @2.90GHz | 128 GB | Windows Server 2008 R2 EE | SQL Server 2008 | 7,570 | 2009 6.0 EP4 (Unicode) | 41,320 | 20,660 | 06-Mar-12 |

Complete benchmark results may be found at the SAP benchmark website http://www.sap.com/benchmark.

Configuration and Results Summary

Hardware Configuration:

Sun Fire X4270 M3
2 x 2.90 GHz Intel Xeon E5-2690 processors
128 GB memory
Sun StorageTek 6540 with 4 * 16 * 300GB 15Krpm 4Gb FC-AL

Software Configuration:

Oracle Solaris 10
Oracle Database 11g
SAP enhancement package 4 for SAP ERP 6.0 (Unicode)

Certified Results (published by SAP):

Number of benchmark users:
8,320
Average dialog response time:
0.95 seconds
Throughput:

Fully processed order line:
911,330

Dialog steps/hour:
2,734,000

SAPS:
45,570
SAP Certification:
2012014
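
The SAPS figure follows from the throughput numbers above. By SAP's definition, 100 SAPS corresponds to 2,000 fully processed order line items per hour, or 6,000 dialog steps per hour. The sketch below applies that conversion. The function names are illustrative, and rounding to the nearest 10 is an assumption made here to match the published figure.

```python
def saps_from_dialog_steps(dialog_steps_per_hour):
    """100 SAPS = 6,000 dialog steps per hour."""
    return dialog_steps_per_hour * 100 / 6000

def saps_from_order_lines(order_lines_per_hour):
    """100 SAPS = 2,000 fully processed order line items per hour."""
    return order_lines_per_hour * 100 / 2000

# Certified throughput values from the results above.
print(round(saps_from_dialog_steps(2_734_000), -1))
print(round(saps_from_order_lines(911_330), -1))
```

Both conversions land on the certified 45,570 SAPS after rounding. Dividing by the two processors gives the 22,785 SAPS/Proc shown in the performance table.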

Benchmark Description

The SAP Standard Application SD (Sales and Distribution) Benchmark is a two-tier ERP business test that is indicative of full business workloads of complete order processing and invoice processing, and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments.

SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products.

See Also

Disclosure Statement

Two-tier SAP Sales and Distribution (SD) standard SAP SD benchmark based on SAP enhancement package 4 for SAP ERP 6.0 (Unicode) application benchmark as of 04/11/12: Sun Fire X4270 M3 (2 processors, 16 cores, 32 threads) 8,320 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, Oracle 11g, Solaris 10, Cert# 2012014. IBM Flex System x240 (2 processors, 16 cores, 32 threads) 7,960 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012016. IBM System x3650 M4 (2 processors, 16 cores, 32 threads) 7,855 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012010. Cisco UCS C240 M3 (2 processors, 16 cores, 32 threads) 7,635 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 DE, Cert# 2012011. Fujitsu PRIMERGY RX300 S7 (2 processors, 16 cores, 32 threads) 7,570 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012008. HP ProLiant DL380p Gen8 (2 processors, 16 cores, 32 threads) 7,865 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012012.

SAP, R/3, reg TM of SAP AG in Germany and other countries. More info www.sap.com/benchmark

Tuesday Apr 10, 2012

SPEC CPU2006 Results on Oracle's Sun x86 Servers

Oracle's new Sun x86 servers delivered world records on the benchmarks SPECfp2006 and SPECint_rate2006 for two processor servers. This was accomplished with Oracle Solaris 11 and Oracle Solaris Studio 12.3 software.

  • The Sun Fire X4170 M3 (now known as Sun Server X3-2) server achieved a world record result for the SPECfp2006 benchmark with a score of 96.8.

  • The Sun Blade X6270 M3 server module (now known as Sun Blade X3-2B) produced best integer throughput performance for all 2-socket servers with a SPECint_rate2006 score of 705.

  • The Sun x86 servers with Intel Xeon E5-2690 2.9 GHz processors produced a cross-generational performance improvement of up to 1.8x over the previous generation Sun x86 M2 servers.

Performance Landscape

Complete benchmark results are at the SPEC website, SPEC CPU2006 Results. The tables below provide the new Oracle results, as well as select results from other vendors.

SPECint2006
System Processor c/c/t * Peak Base O/S Compiler
Fujitsu PRIMERGY BX924 S3 Intel E5-2690, 2.9 GHz 2/16/16 60.8 56.0 RHEL 6.2 Intel 12.1.2.273
Sun Fire X4170 M3 Intel E5-2690, 2.9 GHz 2/16/32 58.5 54.3 Oracle Linux 6.1 Intel 12.1.0.225
Sun Fire X4270 M2 Intel X5690, 3.47 GHz 2/12/12 46.2 43.9 Oracle Linux 5.5 Intel 12.0.1.116

SPECfp2006
System Processor c/c/t * Peak Base O/S Compiler
Sun Fire X4170 M3 Intel E5-2690, 2.9 GHz 2/16/32 96.8 86.4 Oracle Solaris 11 Studio 12.3
Sun Blade X6270 M3 Intel E5-2690, 2.9 GHz 2/16/32 96.0 85.2 Oracle Solaris 11 Studio 12.3
Sun Fire X4270 M3 Intel E5-2690, 2.9 GHz 2/16/32 95.9 85.1 Oracle Solaris 11 Studio 12.3
Fujitsu CELSIUS R920 Intel E5-2687, 2.9 GHz 2/16/16 93.8 87.6 RHEL 6.1 Intel 12.1.2.273
Sun Fire X4270 M2 Intel X5690, 3.47 GHz 2/12/24 64.2 59.2 Oracle Solaris 10 Studio 12.2

Only 2-chip server systems are listed below; workstations are excluded.

SPECint_rate2006
| System | Processor | Base Copies | c/c/t * | Peak | Base | O/S | Compiler |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sun Blade X6270 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 705 | 632 | Oracle Solaris 11 | Studio 12.3 |
| Sun Fire X4270 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 705 | 630 | Oracle Solaris 11 | Studio 12.3 |
| Sun Fire X4170 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 702 | 628 | Oracle Solaris 11 | Studio 12.3 |
| Cisco UCS C220 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 697 | 671 | RHEL 6.2 | Intel 12.1.0.225 |
| Sun Blade X6270 M2 | Intel X5690, 3.47 GHz | 24 | 2/12/24 | 410 | 386 | Oracle Linux 5.5 | Intel 12.0.1.116 |

SPECfp_rate2006
| System | Processor | Base Copies | c/c/t * | Peak | Base | O/S | Compiler |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cisco UCS C240 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 510 | 496 | RHEL 6.2 | Intel 12.1.2.273 |
| Sun Fire X4270 M3 | Intel E5-2690, 2.9 GHz | 64 | 2/16/32 | 497 | 461 | Oracle Solaris 11 | Studio 12.3 |
| Sun Blade X6270 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 497 | 460 | Oracle Solaris 11 | Studio 12.3 |
| Sun Fire X4170 M3 | Intel E5-2690, 2.9 GHz | 64 | 2/16/32 | 495 | 464 | Oracle Solaris 11 | Studio 12.3 |
| Sun Fire X4270 M2 | Intel X5690, 3.47 GHz | 24 | 2/12/24 | 273 | 265 | Oracle Linux 5.5 | Intel 12.0.1.116 |

* c/c/t — chips / cores / threads enabled

Configuration Summary and Results

Hardware Configuration:

Sun Fire X4170 M3 server
2 x 2.90 GHz Intel Xeon E5-2690 processors
128 GB memory (16 x 8 GB 2Rx4 PC3-12800R-11, ECC)

Sun Fire X4270 M3 server
2 x 2.90 GHz Intel Xeon E5-2690 processors
128 GB memory (16 x 8 GB 2Rx4 PC3-12800R-11, ECC)

Sun Blade X6270 M3 server module
2 x 2.90 GHz Intel Xeon E5-2690 processors
128 GB memory (16 x 8 GB 2Rx4 PC3-12800R-11, ECC)

Software Configuration:

Oracle Solaris 11 11/11 (SRU2)
Oracle Solaris Studio 12.3 (patch update 1 nightly build 120313)
Oracle Linux Server Release 6.1
Intel C++ Studio XE 12.1.0.225
SPEC CPU2006 V1.2

Benchmark Description

SPEC CPU2006 is SPEC's most popular benchmark. It measures:

  • Speed — single copy performance of chip, memory, compiler
  • Rate — multiple copy (throughput)

The benchmark is also divided into integer intensive applications and floating point intensive applications:

  • integer: 12 benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
  • floating point: 17 benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

It is also divided depending upon the amount of optimization allowed:

  • base: optimization is consistent per compiled language, all benchmarks must be compiled with the same flags per language.
  • peak: specific compiler optimization is allowed per application.

The overall metrics for the benchmark which are commonly used are:

  • SPECint_rate2006, SPECint_rate_base2006: integer, rate
  • SPECfp_rate2006, SPECfp_rate_base2006: floating point, rate
  • SPECint2006, SPECint_base2006: integer, speed
  • SPECfp2006, SPECfp_base2006: floating point, speed
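
Each of the metrics listed above is a geometric mean of per-benchmark ratios, where a ratio is the SPEC reference time for a benchmark divided by the measured run time. The following sketch shows the arithmetic only. The function names are illustrative and the timings are made-up values, not measured results.

```python
import math

def spec_ratio(reference_seconds, measured_seconds):
    """Ratio for one benchmark: reference time divided by measured time."""
    return reference_seconds / measured_seconds

def geometric_mean(ratios):
    """Geometric mean, computed in log space."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical (reference, measured) timings in seconds for three benchmarks.
timings = [(9770, 200), (9650, 350), (8050, 150)]
score = geometric_mean([spec_ratio(ref, run) for ref, run in timings])
print(f"overall score: {score:.1f}")
```

A geometric mean keeps one unusually fast or slow benchmark from dominating the overall score the way it would with an arithmetic mean.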

See here for additional information.

See Also

Disclosure Statement

SPEC and the benchmark names SPECfp and SPECint are registered trademarks of the Standard Performance Evaluation Corporation. Results as of 10 April 2012 from www.spec.org and this report.

SPEC CPU2006 Results on Oracle's Netra Server X3-2

Oracle's Netra Server X3-2 (formerly Sun Netra X4270 M3), equipped with the new Intel Xeon processor E5-2658, is up to 2.5x faster than the previous generation Netra systems on SPEC CPU2006 workloads.

Performance Landscape

Complete benchmark results are at the SPEC website, SPEC CPU2006 results. The tables below provide the new Oracle results and previous generation results.

SPECint2006
System Processor c/c/t * Peak Base O/S Compiler
Netra Server X3-2 Intel E5-2658, 2.1 GHz 2/16/32 38.5 36.0 Oracle Linux 6.1 Intel 12.1.0.225
Sun Netra X4270 Intel L5518, 2.13 GHz 2/8/16 27.9 25.0 Oracle Linux 5.4 Intel 11.1
Sun Netra X4250 Intel L5408, 2.13 GHz 2/8/8 20.3 17.9 SLES 10 SP1 Intel 11.0

SPECfp2006
System Processor c/c/t * Peak Base O/S Compiler
Netra Server X3-2 Intel E5-2658, 2.1 GHz 2/16/32 65.3 61.6 Oracle Linux 6.1 Intel 12.1.0.225
Sun Netra X4270 Intel L5518, 2.13 GHz 2/8/16 32.5 29.4 Oracle Linux 5.4 Intel 11.1
Sun Netra X4250 Intel L5408, 2.13 GHz 2/8/8 18.5 17.7 SLES 10 SP1 Intel 11.0

SPECint_rate2006
| System | Processor | Base Copies | c/c/t * | Peak | Base | O/S | Compiler |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Netra Server X3-2 | Intel E5-2658, 2.1 GHz | 32 | 2/16/32 | 477 | 455 | Oracle Linux 6.1 | Intel 12.1.0.225 |
| Sun Netra X4270 | Intel L5518, 2.13 GHz | 16 | 2/8/16 | 201 | 189 | Oracle Linux 5.4 | Intel 11.1 |
| Sun Netra X4250 | Intel L5408, 2.13 GHz | 8 | 2/8/8 | 103 | 82.0 | SLES 10 SP1 | Intel 11.0 |

SPECfp_rate2006
| System | Processor | Base Copies | c/c/t * | Peak | Base | O/S | Compiler |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Netra Server X3-2 | Intel E5-2658, 2.1 GHz | 32 | 2/16/32 | 392 | 383 | Oracle Linux 6.1 | Intel 12.1.0.225 |
| Sun Netra X4270 | Intel L5518, 2.13 GHz | 16 | 2/8/16 | 155 | 153 | Oracle Linux 5.4 | Intel 11.1 |
| Sun Netra X4250 | Intel L5408, 2.13 GHz | 8 | 2/8/8 | 55.9 | 52.3 | SLES 10 SP1 | Intel 11.0 |

* c/c/t — chips / cores / threads enabled
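
The "up to 2.5x" figure can be checked against the peak scores in the four tables above. The sketch below assumes that "previous generation" refers to the Sun Netra X4270, which the source does not state explicitly; the variable names are illustrative.

```python
# Peak scores copied from the four tables above.
NETRA_X3_2 = {"SPECint2006": 38.5, "SPECfp2006": 65.3,
              "SPECint_rate2006": 477, "SPECfp_rate2006": 392}
NETRA_X4270 = {"SPECint2006": 27.9, "SPECfp2006": 32.5,
               "SPECint_rate2006": 201, "SPECfp_rate2006": 155}

# Generational speedup per metric (assumed baseline: Sun Netra X4270).
speedups = {m: NETRA_X3_2[m] / NETRA_X4270[m] for m in NETRA_X3_2}
best_metric = max(speedups, key=speedups.get)

for metric, ratio in speedups.items():
    print(f"{metric}: {ratio:.1f}x")
print(f"largest gain: {best_metric}")
```

Under that assumption the largest gain is on SPECfp_rate2006, at about 2.5x.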

Configuration Summary

Hardware Configuration:

Netra Server X3-2
2 x 2.10 GHz Intel Xeon E5-2658 processors
128 GB memory (16 x 8 GB 2Rx4 PC3-12800R-11, ECC)

Software Configuration:

Oracle Linux Server Release 6.1
Intel C++ Studio XE 12.1.0.225
SPEC CPU2006 V1.2

Benchmark Description

SPEC CPU2006 is SPEC's most popular benchmark. It measures:

  • Speed — single copy performance of chip, memory, compiler
  • Rate — multiple copy (throughput)

The benchmark is also divided into integer intensive applications and floating point intensive applications:

  • integer: 12 benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
  • floating point: 17 benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

It is also divided depending upon the amount of optimization allowed:

  • base: optimization is consistent per compiled language, all benchmarks must be compiled with the same flags per language.
  • peak: specific compiler optimization is allowed per application.

The overall metrics for the benchmark which are commonly used are:

  • SPECint_rate2006, SPECint_rate_base2006: integer, rate
  • SPECfp_rate2006, SPECfp_rate_base2006: floating point, rate
  • SPECint2006, SPECint_base2006: integer, speed
  • SPECfp2006, SPECfp_base2006: floating point, speed

See here for additional information.

See Also

Disclosure Statement

SPEC and the benchmark names SPECfp and SPECint are registered trademarks of the Standard Performance Evaluation Corporation. Results as of 10 July 2012 from www.spec.org and this report.

Thursday Mar 29, 2012

Sun Server X2-8 (formerly Sun Fire X4800 M2) Delivers World Record TPC-C for x86 Systems

Oracle's Sun Server X2-8 (formerly Sun Fire X4800 M2 server) equipped with eight 2.4 GHz Intel Xeon Processor E7-8870 chips obtained a result of 5,055,888 tpmC on the TPC-C benchmark. This result is a world record for x86 servers. Oracle demonstrated this world record database performance running Oracle Database 11g Release 2 Enterprise Edition with Partitioning.

  • The Sun Server X2-8 delivered a new x86 TPC-C world record of 5,055,888 tpmC with a price performance of $0.89/tpmC using Oracle Database 11g Release 2. This configuration is available 7/10/12.

  • The Sun Server X2-8 delivers 3.0x better performance than the next 8-processor result, an IBM System p 570 equipped with POWER6 processors.

  • The Sun Server X2-8 has 3.1x better price/performance than the 8-processor 4.7GHz POWER6 IBM System p 570.

  • The Sun Server X2-8 has 1.6x better performance than the 4-processor IBM x3850 X5 system equipped with Intel Xeon processors.

  • This is the first TPC-C result on any system using eight Intel Xeon Processor E7-8800 Series chips.

  • The Sun Server X2-8 is the first x86 system to get over 5 million tpmC.

  • The Oracle solution used the Oracle Linux operating system and Oracle Database 11g Enterprise Edition Release 2 with Partitioning to produce the x86 world record TPC-C benchmark performance.

Performance Landscape

Select TPC-C results (sorted by tpmC, bigger is better)

| System | p/c/t | tpmC | Price/tpmC | Avail | Database | Memory Size |
| --- | --- | --- | --- | --- | --- | --- |
| Sun Server X2-8 | 8/80/160 | 5,055,888 | 0.89 USD | 7/10/2012 | Oracle 11g R2 | 4 TB |
| IBM x3850 X5 | 4/40/80 | 3,014,684 | 0.59 USD | 7/11/2011 | DB2 ESE 9.7 | 3 TB |
| IBM x3850 X5 | 4/32/64 | 2,308,099 | 0.60 USD | 5/20/2011 | DB2 ESE 9.7 | 1.5 TB |
| IBM System p 570 | 8/16/32 | 1,616,162 | 3.54 USD | 11/21/2007 | DB2 9.0 | 2 TB |

p/c/t - processors, cores, threads
Avail - availability date

Oracle and IBM TPC-C Response times

| System | tpmC | New Order 90th% Response Time (sec) | New Order Average Response Time (sec) |
| --- | --- | --- | --- |
| Sun Server X2-8 | 5,055,888 | 0.210 | 0.166 |
| IBM x3850 X5 | 3,014,684 | 0.500 | 0.272 |
| Ratios - Oracle Better | 1.6x | 1.4x | 1.3x |

Oracle uses average new order response time for comparison between Oracle and IBM.

Graphs of Oracle's and IBM's response times for New-Order can be found in the full disclosure reports on TPC's website TPC-C Official Result Page.

Configuration Summary and Results

Hardware Configuration:

Server
Sun Server X2-8
8 x 2.4 GHz Intel Xeon Processor E7-8870
4 TB memory
8 x 300 GB 10K RPM SAS internal disks
8 x Dual port 8 Gbs FC HBA

Data Storage
10 x Sun Fire X4270 M2 servers configured as COMSTAR heads, each with
1 x 3.06 GHz Intel Xeon X5675 processor
8 GB memory
10 x 2 TB 7.2K RPM 3.5" SAS disks
2 x Sun Storage F5100 Flash Array storage (1.92 TB each)
1 x Brocade 5300 switch

Redo Storage
2 x Sun Fire X4270 M2 servers configured as COMSTAR heads, each with
1 x 3.06 GHz Intel Xeon X5675 processor
8 GB memory
11 x 2 TB 7.2K RPM 3.5" SAS disks

Clients
8 x Sun Fire X4170 M2 servers, each with
2 x 3.06 GHz Intel Xeon X5675 processors
48 GB memory
2 x 300 GB 10K RPM SAS disks

Software Configuration:

Oracle Linux (Sun Fire X4800 M2)
Oracle Solaris 11 Express (COMSTAR for Sun Fire X4270 M2)
Oracle Solaris 10 9/10 (Sun Fire X4170 M2)
Oracle Database 11g Release 2 Enterprise Edition with Partitioning
Oracle iPlanet Web Server 7.0 U5
Tuxedo CFS-R Tier 1

Results:

System: Sun Server X2-8
tpmC: 5,055,888
Price/tpmC: 0.89 USD
Available: 7/10/2012
Database: Oracle Database 11g
Cluster: no
New Order Average Response: 0.166 seconds

Benchmark Description

TPC-C is an OLTP system benchmark. It simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.

Key Points and Best Practices

  • Oracle Database 11g Release 2 Enterprise Edition with Partitioning scales easily to this high level of performance.

  • COMSTAR (Common Multiprotocol SCSI Target) is the software framework that enables an Oracle Solaris host to serve as a SCSI target platform. COMSTAR takes a modular approach, breaking the large task of handling all the pieces of a SCSI target subsystem into independent functional modules that are glued together by the SCSI Target Mode Framework (STMF). The modules implementing functionality at the SCSI level (disk, tape, medium changer, etc.) are not required to know about the underlying transport, and the modules implementing the transport protocol (FC, iSCSI, etc.) are not aware of the SCSI-level functionality of the packets they are transporting. The framework hides the details of allocation, providing execution context, and cleanup of SCSI commands and associated resources, and it simplifies the task of writing the SCSI or transport modules.

  • Oracle iPlanet Web Server middleware is used for the client tier of the benchmark. Each web server instance supports more than a quarter-million users while satisfying the response time requirement from the TPC-C benchmark.

Disclosure Statement

TPC Benchmark C, tpmC, and TPC-C are trademarks of the Transaction Processing Performance Council (TPC). Sun Server X2-8 (8/80/160) with Oracle Database 11g Release 2 Enterprise Edition with Partitioning, 5,055,888 tpmC, $0.89 USD/tpmC, available 7/10/2012. IBM x3850 X5 (4/40/80) with DB2 ESE 9.7, 3,014,684 tpmC, $0.59 USD/tpmC, available 7/11/2011. IBM x3850 X5 (4/32/64) with DB2 ESE 9.7, 2,308,099 tpmC, $0.60 USD/tpmC, available 5/20/2011. IBM System p 570 (8/16/32) with DB2 9.0, 1,616,162 tpmC, $3.54 USD/tpmC, available 11/21/2007. Source: http://www.tpc.org/tpcc, results as of 7/15/2011.

Friday Sep 30, 2011

SPARC T4-2 Server Beats Intel (Westmere AES-NI) on ZFS Encryption Tests

Oracle continues to lead in enterprise security. Oracle's SPARC T4 processors combined with Oracle's Solaris ZFS file system demonstrate faster file system encryption than equivalent systems based on the Intel Xeon Processor 5600 Sequence chips which use AES-NI security instructions.

Encryption is the process by which data is encoded for privacy, so that the data owner needs a key to access the encoded data. The benefits of using ZFS encryption are:

  • The SPARC T4 processor is 3.5x to 5.2x faster than the Intel Xeon Processor X5670, which has the AES-NI security instructions, when creating encrypted files.

  • ZFS encryption is integrated with the ZFS command set. Like other ZFS operations, encryption operations such as key changes and re-key are performed online.

  • Data is encrypted using AES (Advanced Encryption Standard) with key lengths of 256, 192, and 128 bits in the CCM and GCM operation modes.

  • The flexibility of encrypting specific file systems is a key feature.

  • ZFS encryption is inheritable by descendant file systems. Key management can be delegated through ZFS delegated administration.

  • ZFS encryption uses the Oracle Solaris Cryptographic Framework, which automatically gives it access to hardware acceleration on the SPARC T4 processor and the Intel Xeon X5670 processor (Intel AES-NI), or to optimized software implementations of the encryption algorithms.

Performance Landscape

Below are results for the two AES modes, CCM and GCM, used for ZFS encryption. Results are presented for runs without any cipher, labeled Clear, and for three key lengths.

Encryption Using AES-CCM Ciphers

MB/sec – 5 File Create* | Clear | AES-256-CCM | AES-192-CCM | AES-128-CCM
SPARC T4-2 server | 3,803 | 3,167 | 3,335 | 3,225
SPARC T3-2 server | 2,286 | 1,554 | 1,561 | 1,594
2-Socket 2.93 GHz Xeon X5670 | 3,325 | 750 | 764 | 773

Speedup T4-2 vs X5670 | 1.1x | 4.2x | 4.4x | 4.2x
Speedup T4-2 vs T3-2 | 1.7x | 2.0x | 2.1x | 2.0x

Encryption Using AES-GCM Ciphers

MB/sec – 5 File Create* | Clear | AES-256-GCM | AES-192-GCM | AES-128-GCM
SPARC T4-2 server | 3,618 | 3,929 | 3,164 | 2,613
SPARC T3-2 server | 2,278 | 1,451 | 1,455 | 1,449
2-Socket 2.93 GHz Xeon X5670 | 3,299 | 749 | 748 | 753

Speedup T4-2 vs X5670 | 1.1x | 5.2x | 4.2x | 3.5x
Speedup T4-2 vs T3-2 | 1.6x | 2.7x | 2.2x | 1.8x

(*) Maximum Delivered values measured over 5 concurrent mkfile operations.
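
The speedup rows in the two tables above are simple ratios of the MB/sec figures. As a quick check, the short Python sketch below recomputes them from the published throughput numbers. The dictionary layout and function name are only illustrative choices for this sketch and are not part of the benchmark.

```python
# Throughput in MB/sec, copied from the AES-CCM and AES-GCM tables above.
# Column order: Clear, 256-bit key, 192-bit key, 128-bit key.
RESULTS = {
    "CCM": {
        "T4-2": [3803, 3167, 3335, 3225],
        "T3-2": [2286, 1554, 1561, 1594],
        "X5670": [3325, 750, 764, 773],
    },
    "GCM": {
        "T4-2": [3618, 3929, 3164, 2613],
        "T3-2": [2278, 1451, 1455, 1449],
        "X5670": [3299, 749, 748, 753],
    },
}


def speedups(mode, baseline):
    """Return T4-2 throughput divided by the baseline's, rounded to one decimal."""
    t4 = RESULTS[mode]["T4-2"]
    base = RESULTS[mode][baseline]
    return [round(a / b, 1) for a, b in zip(t4, base)]


for mode in RESULTS:
    for baseline in ("X5670", "T3-2"):
        print(f"AES-{mode}: T4-2 vs {baseline}: {speedups(mode, baseline)}")
```

Across the six encrypted columns, the T4-2 versus X5670 ratios range from 3.5 to 5.2, which is where the "3.5x to 5.2x" figure in the summary comes from.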

Configuration Summary

Storage Configuration:

Sun Storage 6780 array
16 x 15K RPM drives
RAID 0 pool
Write-back cache enabled
Controller cache mirroring disabled for maximum bandwidth for test
Eight 8 Gb/sec ports per host

Server Configuration:

SPARC T4-2 server
2 x SPARC T4 2.85 GHz processors
256 GB memory
Oracle Solaris 11

SPARC T3-2 server
2 x SPARC T3 1.6 GHz processors
Oracle Solaris 11 Express 2010.11

Sun Fire X4270 M2 server
2 x Intel Xeon X5670, 2.93 GHz processors
Oracle Solaris 11

Benchmark Description

The benchmark ran the UNIX command mkfile(1M). mkfile is a simple single-threaded program that creates a file of a specified size. The script ran 5 mkfile operations in the background and recorded the peak bandwidth observed during the test.
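
The actual test script is not published here. As a rough, hypothetical stand-in, the Python sketch below creates five zero-filled files concurrently and reports the aggregate creation rate. It differs from the real test in several ways that should be kept in mind: it uses threads instead of background mkfile processes, it reports the average rate over the whole run instead of the peak, and the file size, block size, and function names are arbitrary choices for this sketch.

```python
import os
import tempfile
import threading
import time

CHUNK = 1 << 20  # write in 1 MiB blocks (arbitrary choice for this sketch)


def make_file(path, size_bytes):
    """Write size_bytes of zeros to path, similar in spirit to mkfile."""
    block = bytes(CHUNK)
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(block[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # make sure the data has reached the file system


def run_concurrent_creates(directory, count=5, size_bytes=4 << 20):
    """Create `count` files at once; return (aggregate MiB/sec, list of paths)."""
    paths = [os.path.join(directory, f"file{i}") for i in range(count)]
    workers = [threading.Thread(target=make_file, args=(p, size_bytes)) for p in paths]
    start = time.perf_counter()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    elapsed = time.perf_counter() - start
    return count * size_bytes / (1 << 20) / elapsed, paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        rate, _ = run_concurrent_creates(scratch)
        print(f"aggregate create rate: {rate:.0f} MiB/sec")
```

To exercise an encrypted file system with a sketch like this, the target directory would need to sit on a ZFS dataset created with encryption enabled, and the file sizes would need to be far larger than the toy default used here.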

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of December 16, 2011.

Monday Sep 19, 2011

Halliburton ProMAX® Seismic Processing on Sun Blade X6270 M2 with Sun ZFS Storage 7320

Halliburton/Landmark's ProMAX® 3D Pre-Stack Kirchhoff Time Migration's (PSTM) single workflow scalability and multiple workflow throughput using various scheduling methods are evaluated on a cluster of Oracle's Sun Blade X6270 M2 server modules attached to Oracle's Sun ZFS Storage 7320 appliance.

Two resource scheduling methods, compact and distributed, are compared while increasing the system load with additional concurrent ProMAX® workflows.

  • Multiple concurrent 24-process ProMAX® PSTM workflow throughput is constant; 10 workflows on 10 nodes finish as fast as 1 workflow on one compute node. Additionally, processing twice the data volume yields similar traces/second throughput performance.

  • A single ProMAX® PSTM workflow scales well from 1 to 10 nodes of a Sun Blade X6270 M2 cluster, scaling 4.5X. ProMAX® scales to 4.7X on 10 nodes with one input data set and 6.3X with two consecutive input data sets (i.e., twice the data).

  • A single ProMAX® PSTM workflow has near linear scaling of 11x on a Sun Blade X6270 M2 server module when running from 1 to 12 processes.

  • The 12-process ProMAX® workflow throughput using the distributed scheduling method is equivalent to or slightly faster than the compact scheme for 1 to 6 concurrent workflows.

Performance Landscape

Multiple 24-Process Workflow Throughput Scaling

This test measures the system throughput scalability as concurrent 24-process workflows are added, one workflow per node. The per workflow throughput and the system scalability are reported.

Aggregate system throughput scales linearly. Ten concurrent workflows finish in the same time as does one workflow on a single compute node.

Halliburton ProMAX® Pre-Stack Time Migration - Multiple Workflow Scaling


Single Workflow Scaling

This test measures single workflow scalability across a 10-node cluster. Utilizing a single data set, performance exhibits near linear scaling of 11x at 12 processes, and per-node scaling of 4x at 6 nodes; performance then flattens quickly, reaching a peak of 60x at 240 processes and per-node scaling of 4.7x with 10 nodes.

Running with two consecutive input data sets in the workflow, scaling is considerably improved with peak scaling ~35% higher than obtained using a single data set. Doubling the data set size minimizes time spent in workflow initialization, data input and output.

Halliburton ProMAX® Pre-Stack Time Migration - Single Workflow Scaling

This next test measures single workflow scalability across a 10-node cluster (as above) but limits scheduling to a maximum of 12 processes per node, effectively allowing at most one process per physical core. The speedups relative to a single process and to a single node are reported.

Utilizing a single data set, performance exhibits near linear scaling of 37x at 48 processes, and per-node scaling of 4.3x at 6 nodes. Performance of 55x at 120 processes and per-node scaling of 5x with 10 nodes is reached, and scalability is trending upward more strongly than in the case above of two processes running per physical core. For equivalent total process counts, multi-node runs using only a single process per physical core appear to run between 28% and 64% more efficiently (at 96 and 24 processes respectively). With a full complement of 10 nodes (120 processes), the peak performance is only 9.5% lower than with 2 processes per physical core (240 processes).

Running with two consecutive input data sets in the workflow, scaling is considerably improved with peak scaling ~35% higher than obtained using a single data set.
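
One way to read the scaling figures quoted in this section is as parallel efficiency, meaning speedup divided by the number of processes. The Python sketch below applies that standard definition to the single-data-set numbers from the text. The labels and function name are illustrative only, and the source itself does not report efficiency in this form.

```python
def parallel_efficiency(speedup, processes):
    """Speedup divided by process count; 1.0 means perfectly linear scaling."""
    return speedup / processes


# (label, speedup, process count), single data set, figures quoted in the text
POINTS = [
    ("1 node, 12 processes", 11, 12),
    ("1 process per core, 48 processes", 37, 48),
    ("1 process per core, 120 processes", 55, 120),
    ("2 processes per core, 240 processes", 60, 240),
]

for label, speedup, processes in POINTS:
    print(f"{label}: {parallel_efficiency(speedup, processes):.0%}")
```

This prints 92%, 77%, 46%, and 25% respectively, which illustrates how the per-process return falls as the workflow is spread across more nodes and more hardware threads.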

Halliburton ProMAX® Pre-Stack Time Migration - Single Workflow Scaling

Multiple 12-Process Workflow Throughput Scaling, Compact vs. Distributed Scheduling

The fourth test compares compact and distributed scheduling of 1, 2, 4, and 6 concurrent 12-process workflows.

All things being equal, the system bisection bandwidth should improve with distributed scheduling of a fixed-size workflow: as more nodes are used for a workflow, more memory and system cache are employed, and any node memory bandwidth bottlenecks can be offset by distributing communication across the network (provided the network and inter-node communication stack do not become a bottleneck). When physical cores are not over-subscribed, compact and distributed scheduling performance is within 3%, suggesting that there may be little memory contention for this workflow on the benchmarked system configuration.

With compact scheduling of two concurrent 12-process workflows, the physical cores become over-subscribed and performance degrades 36% per workflow. With four concurrent workflows, physical cores are over-subscribed 4x and performance degrades 66% per workflow. With six concurrent workflows, compact scheduling performance degrades 77% per workflow. As multiple 12-process workflows become more and more distributed, the performance approaches the non-over-subscribed case.
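
For a sense of scale, the reported degradations can be set against a deliberately naive model in which over-subscribed cores are time-sliced evenly and nothing else changes. The Python sketch below does that. It assumes that compact scheduling packs every workflow onto the same 12-core node and that "degrades N%" means a drop in per-workflow throughput. Both are readings of the text, not statements from the benchmark report, and the names are illustrative.

```python
PROCESSES_PER_WORKFLOW = 12
CORES_PER_NODE = 12  # one process per physical core fills a node, per the text

# Concurrent workflows packed onto one node -> reported per-workflow degradation
OBSERVED = {2: 0.36, 4: 0.66, 6: 0.77}


def oversubscription(workflows):
    """Processes per physical core when all workflows share one node (assumed)."""
    return workflows * PROCESSES_PER_WORKFLOW / CORES_PER_NODE


def ideal_degradation(factor):
    """Throughput loss per workflow under even time-slicing and nothing else."""
    return 1.0 - 1.0 / factor


for workflows, seen in OBSERVED.items():
    factor = oversubscription(workflows)
    print(f"{workflows} workflows: {factor:.0f}x over-subscribed, "
          f"time-slicing model {ideal_degradation(factor):.0%}, reported {seen:.0%}")
```

The model gives 50%, 75%, and 83% against the reported 36%, 66%, and 77%, so the measured losses are smaller than plain time-slicing would predict. Hyper-Threading is enabled on these nodes, which may account for part of the gap, though the results here do not break that down.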

Halliburton ProMAX® Pre-Stack Time Migration - Multiple Workflow Scaling

141616 traces x 624 samples


Test Notes

All tests were performed with one input data set (70808 traces x 624 samples) and two consecutive input data sets (2 * (70808 traces x 624 samples)) in the workflow. All results reported are the average of at least 3 runs and performance is based on reported total wall-clock time by the application.

All tests were run with an NFS-attached Sun ZFS Storage 7320 appliance and then with an NFS-attached legacy Sun Fire X4500 server. The StorageTek Workload Analysis Tool (SWAT) was invoked to measure the I/O characteristics of the NFS-attached storage used on separate runs of all workflows.

Configuration Summary

Hardware Configuration:

10 x Sun Blade X6270 M2 server modules, each with
2 x 3.33 GHz Intel Xeon X5680 processors
48 GB DDR3-1333 memory
4 x 146 GB, Internal 10000 RPM SAS-2 HDD
10 GbE
Hyper-Threading enabled

Sun ZFS Storage 7320 Appliance
1 x Storage Controller
2 x 2.4 GHz Intel Xeon 5620 processors
48 GB memory (12 x 4 GB DDR3-1333)
2 TB Read Cache (4 x 512 GB Read Flash Accelerator)
10 GbE
1 x Disk Shelf
20.0 TB RAID-Z (20 x 1 TB SAS-2, 7200 RPM HDD)
4 x Write Flash Accelerators

Sun Fire X4500
2 x 2.8 GHz AMD 290 processors
16 GB DDR1-400 memory
34.5 TB RAID-Z (46 x 750 GB SATA-II, 7200 RPM HDD)
10 GbE

Software Configuration:

Oracle Linux 5.5
Parallel Virtual Machine 3.3.11 (bundled with ProMAX)
Intel 11.1.038 Compilers
Libraries: pthreads 2.4, Java 1.6.0_01, BLAS, Stanford Exploration Project Libraries

Benchmark Description

The ProMAX® family of seismic data processing tools is the most widely used Oil and Gas Industry seismic processing application. ProMAX® is used for multiple applications, from field processing and quality control, to interpretive project-oriented reprocessing at oil companies and production processing at service companies. ProMAX® is integrated with Halliburton's OpenWorks® Geoscience Oracle Database to index prestack seismic data and populate the database with processed seismic.

This benchmark evaluates single workflow scalability and multiple workflow throughput of the ProMAX® 3D Prestack Kirchhoff Time Migration (PSTM) while processing the Halliburton benchmark data set containing 70,808 traces with 8 msec sample interval and trace length of 4992 msec. Benchmarks were performed with both one and two consecutive input data sets.
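
The sample interval and trace length given here tie back to the "624 samples" per trace quoted in the test notes. The few lines of Python below show the arithmetic. The variable names are just for this sketch.

```python
SAMPLE_INTERVAL_MS = 8    # sample interval from the benchmark description
TRACE_LENGTH_MS = 4992    # trace length from the benchmark description
TRACES = 70_808           # traces in one input data set

samples_per_trace = TRACE_LENGTH_MS // SAMPLE_INTERVAL_MS
total_samples = TRACES * samples_per_trace

print(samples_per_trace)   # 624
print(total_samples)       # 44184192 samples in one data set
print(2 * TRACES)          # 141616 traces with two consecutive data sets
```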

Each workflow consisted of:

  • reading the previously constructed MPEG encoded processing parameter file
  • reading the compressed seismic data traces from disk
  • performing the PSTM imaging
  • writing the result to disk

Workflows using two input data sets were constructed by simply adding a second identical seismic data read task immediately after the first in the processing parameter file. This effectively doubled the data volume read, processed, and written.

This version of ProMAX® currently uses only Parallel Virtual Machine (PVM) as the parallel processing paradigm. The PVM software uses only TCP networking and has no internal facility for assigning memory affinity or processor binding. Every compute node runs a PVM daemon.

The ProMAX® processing parameters used for this benchmark:

Minimum output inline = 65
Maximum output inline = 85
Inline output sampling interval = 1
Minimum output xline = 1
Maximum output xline = 200 (fold)
Xline output sampling interval = 1
Antialias inline spacing = 15
Antialias xline spacing = 15
Stretch Mute Aperture Limit with Maximum Stretch = 15
Image Gather Type = Full Offset Image Traces
No Block Moveout
Number of Alias Bands = 10
3D Amplitude Phase Correction
No compression
Maximum Number of Cache Blocks = 500000

Primary PSTM business metrics are typically time-to-solution and accuracy of the subsurface imaging solution.

Key Points and Best Practices

  • Multiple job system throughput scales perfectly; ten concurrent workflows on 10 nodes each complete in the same time and have the same throughput as a single workflow running on one node.
  • Best single workflow scaling is 6.6x using 10 nodes.

    When tasked with processing several similar workflows, the most efficient approach, although individual time-to-solution will be longer, is to fully distribute them one workflow per node (or even across two nodes) and run them concurrently, rather than to use all nodes for each workflow and run them consecutively. For example, while the best-case configuration used here runs 6.6 times faster using all ten nodes than on a single node, ten such 10-node jobs running consecutively will take over 50% longer overall to complete than ten jobs running concurrently, one per node.

  • Throughput was seen to scale better with larger workflows. While throughput with large and small workflows is similar with only one node, the larger dataset exhibits 11% and 35% more throughput with four and 10 nodes respectively.

  • 200 processes appears to be a scalability asymptote with these workflows on the systems used.
  • Hyper-Threading marginally helps throughput. For the largest model run on 10 nodes, 240 processes deliver 11% more performance than 120 processes.

  • The workflows do not exhibit significant I/O bandwidth demands. Even with 10 concurrent 24-process jobs, the measured aggregate system I/O did not exceed 100 MB/s.

  • 10 GbE was the only network used and, though shared for all interprocess communication and network attached storage, it appears to have sufficient bandwidth for all test cases run.
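
The "over 50% longer" figure in the key points above follows directly from the 6.6x best-case scaling. The Python sketch below works through it, taking the time for one workflow on one node as the unit and relying on the reported result that concurrent one-per-node workflows each finish in the same time as a single workflow. The function names are illustrative only.

```python
import math


def consecutive_time(jobs, speedup_all_nodes):
    """Total time when each job uses every node and jobs run one after another."""
    return jobs / speedup_all_nodes


def concurrent_time(jobs, nodes):
    """Total time with one job per node, run in waves of `nodes` jobs."""
    return float(math.ceil(jobs / nodes))


# Times are in units of "one workflow on one node".
serial = consecutive_time(jobs=10, speedup_all_nodes=6.6)
parallel = concurrent_time(jobs=10, nodes=10)
print(f"consecutive: {serial:.2f}, concurrent: {parallel:.2f}, "
      f"extra time: {serial / parallel - 1:.1%}")
```

With ten jobs on ten nodes this prints an extra time of 51.5% for the consecutive approach, consistent with the statement in the key points.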

Disclosure Statement

The following are trademarks or registered trademarks of Halliburton/Landmark Graphics: ProMAX®, GeoProbe®, OpenWorks®. Results as of 9/1/2011.

About

BestPerf is the source of Oracle performance expertise. In this blog, Oracle's Strategic Applications Engineering group explores Oracle's performance results and shares best practices learned from working on Enterprise-wide Applications.
