
Everything you want and need to know about Oracle SPARC systems performance

Recent Posts

SPECjEnterprise2010: SPARC T8-1 World Record Single Chip Results

Oracle's SPARC T8-1 servers have set a world record for the SPECjEnterprise2010 benchmark for solutions using a single application server with one to four chips. The result of 34,259.69 SPECjEnterprise2010 EjOPS used two SPARC T8-1 servers, one for the application tier and the other for the database tier.

The SPARC T8-1 servers also obtained a result of 32,622.97 SPECjEnterprise2010 EjOPS using encrypted data. This secured result used Oracle Advanced Security Transparent Data Encryption (TDE) with the AES-256-CFB cipher for the application database tablespaces, and the network connection between the application server and the database server was encrypted using network data encryption for JDBC.

The SPARC T8-1 server solution delivered 77% more performance than the two-chip IBM x3650 M5 server result of 19,282.14 SPECjEnterprise2010 EjOPS, 51% more performance than the four-chip IBM Power System S824 server result of 22,543.34 SPECjEnterprise2010 EjOPS, and 23% more performance than the Oracle Server X6-2 system result of 27,803.39 SPECjEnterprise2010 EjOPS. Oracle holds the top x86 two-chip application server SPECjEnterprise2010 result.

The application server used Oracle Fusion Middleware components including Oracle WebLogic Server 12c (12.2.1.2) and Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.8.0_144. The database server was configured with Oracle Database 12c Release 2.

For the secure result, the application data was encrypted in the Oracle database using the Oracle Advanced Security Transparent Data Encryption (TDE) feature. Hardware accelerated cryptography support in the SPARC M8 processor for the AES-256-CFB cipher was used to provide data security. The benchmark performance of the secure SPARC T8-1 server configuration with encryption was only 5% less than the peak result.

This result demonstrated less than 1 second average response times for all SPECjEnterprise2010 transactions and represents Java EE 5.0 transactions generated by over 279,500 users.

Performance Landscape

Selected single application server results are shown below. Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results.
SPECjEnterprise2010 Performance Chart (as of 12/18/2017)

Oracle, 34,259.69 EjOPS*
  Java EE server: 1 x SPARC T8-1 (1 x 5.0 GHz SPARC M8), Oracle WebLogic 12c (12.2.1)
  DB server: 1 x SPARC T8-1 (1 x 5.0 GHz SPARC M8), Oracle Database 12c (12.2.0.1)

Oracle, 32,622.97 EjOPS* (secure)
  Java EE server: 1 x SPARC T8-1 (1 x 5.0 GHz SPARC M8), Oracle WebLogic 12c (12.2.1.2), Network Data Encryption for JDBC
  DB server: 1 x SPARC T8-1 (1 x 5.0 GHz SPARC M8), Oracle Database 12c (12.2.0.1), Transparent Data Encryption

Oracle, 27,803.39 EjOPS*
  Java EE server: 1 x Oracle Server X6-2 (2 x 2.2 GHz Intel Xeon E5-2699 v4), Oracle WebLogic 12c (12.2.1.2)
  DB server: 1 x Oracle Server X6-2 (2 x 2.2 GHz Intel Xeon E5-2699 v4), Oracle Database 12c (12.1.0.2)

Oracle, 25,818.85 EjOPS*
  Java EE server: 1 x SPARC T7-1 (1 x 4.13 GHz SPARC M7), Oracle WebLogic 12c (12.1.3)
  DB server: 1 x SPARC T7-1 (1 x 4.13 GHz SPARC M7), Oracle Database 12c (12.1.0.2)

Oracle, 25,093.06 EjOPS* (secure)
  Java EE server: 1 x SPARC T7-1 (1 x 4.13 GHz SPARC M7), Oracle WebLogic 12c (12.1.3), Network Data Encryption for JDBC
  DB server: 1 x SPARC T7-1 (1 x 4.13 GHz SPARC M7), Oracle Database 12c (12.1.0.2), Transparent Data Encryption

IBM, 22,543.34 EjOPS*
  Java EE server: 1 x IBM Power S824 (4 x 3.5 GHz POWER8), WebSphere Application Server V8.5
  DB server: 1 x IBM Power S824 (4 x 3.5 GHz POWER8), IBM DB2 10.5 FP3

IBM, 19,282.14 EjOPS*
  Java EE server: 1 x System x3650 M5 (2 x 2.6 GHz Intel Xeon E5-2697 v3), WebSphere Application Server V8.5
  DB server: 1 x System x3850 X6 (4 x 2.8 GHz Intel Xeon E7-4890 v2), IBM DB2 10.5 FP5

* SPECjEnterprise2010 EjOPS (bigger is better)

Configuration Summary

Application Server:
  1 x SPARC T8-1 server, with
    1 x SPARC M8 processor (5.0 GHz)
    1024 GB memory (16 x 64 GB)
    2 x 600 GB SAS HDD
    2 x 800 GB SAS SSD
    4 x Sun Dual Port 10 GbE PCIe 2.0 Networking card with Intel 82599 10 GbE Controller
  Oracle Solaris 11.3 (11.3.23.0.5)
  Oracle WebLogic Server 12c (12.2.1.2)
  Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.8.0_144

Database Server:
  1 x SPARC T8-1 server, with
    1 x SPARC M8 processor (5.0 GHz)
    512 GB memory (16 x 32 GB)
    2 x 600 GB SAS HDD
    1 x Sun Dual Port 10 GbE PCIe 2.0 Networking card with Intel 82599 10 GbE Controller
    4 x 3.2 TB Flash Accelerator F320 PCIe Card
    2 x 3.2 TB Flash Accelerator F320 SFF SSD
  Oracle Solaris 11.3 (11.3.23.0.5)
  Oracle Database 12c (12.2.0.1)

Benchmark Description

SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The SPECjEnterprise2010 benchmark has been designed and developed to cover the Java EE 5 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real-world workload that drives the application server's implementation of the Java EE specification to its maximum potential and allows maximum stressing of the underlying hardware and software systems:
  - The web zone, servlets, and web services
  - The EJB zone
  - JPA 1.0 Persistence Model
  - JMS and Message Driven Beans
  - Transaction management
  - Database connectivity

Moreover, SPECjEnterprise2010 also heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network.

The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second (SPECjEnterprise2010 EjOPS). It is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is no price/performance metric in this benchmark.
Key Points and Best Practices
  - Four Oracle WebLogic server instances on the SPARC T8-1 server were hosted in four separate processor sets.
  - The Oracle WebLogic application servers were executed in the FX scheduling class to improve performance by reducing the frequency of context switches.
  - The JVM was run using libumem.so.
  - The Oracle log writer process was run in a separate processor set containing one whole core on the database system.
  - All database foreground processes were run in the FX scheduling class.
  (A hedged sketch of these Solaris controls appears after this post's disclosure statement.)

See Also
  SPECjEnterprise2010 Results Page
  SPARC T8-1 Result Page at SPEC
  Encrypted SPARC T8-1 Result Page at SPEC

Disclosure Statement
SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 9/30/2017. SPARC T8-1, 34,259.69 SPECjEnterprise2010 EjOPS (unsecure); SPARC T8-1, 32,622.97 SPECjEnterprise2010 EjOPS (secure); SPARC T7-1, 25,818.84 SPECjEnterprise2010 EjOPS (unsecure); SPARC T7-1, 25,093.06 SPECjEnterprise2010 EjOPS (secure); Oracle Server X6-2, 27,803.39 SPECjEnterprise2010 EjOPS (unsecure); IBM Power S824, 22,543.34 SPECjEnterprise2010 EjOPS (unsecure); IBM x3650 M5, 19,282.14 SPECjEnterprise2010 EjOPS (unsecure).
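The Key Points above rely on Oracle Solaris processor sets, the FX scheduling class, and the libumem memory allocator. The following is a minimal, hedged sketch of how such controls can be applied with standard Solaris commands; the processor IDs, set ID, paths, and WebLogic start script are illustrative assumptions, not the exact configuration used for this submission.

    # Create a processor set from a group of hardware threads (IDs are examples);
    # the command prints the new set ID, assumed to be 1 below.
    psrset -c 8-63

    # Launch a WebLogic managed server bound to that set, in the FX scheduling
    # class, with 64-bit libumem preloaded for the JVM (paths are assumptions).
    psrset -e 1 priocntl -e -c FX -m 60 -p 60 \
        env LD_PRELOAD_64=/usr/lib/64/libumem.so.1 \
        $DOMAIN_HOME/bin/startManagedWebLogic.sh ManagedServer1

    # Move an already-running database foreground process into the FX class
    # (the PID is a placeholder).
    priocntl -s -c FX -i pid 12345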



SPECjbb2015: SPARC T8-1 World Record Single Chip Multi-JVM Result

Oracle's SPARC T8-1 server, using Oracle Solaris and Oracle JDK, produced a world record single-chip SPECjbb2015-MultiJVM benchmark result. This benchmark was designed by the industry to showcase Java performance in the enterprise. Performance is expressed in terms of two metrics: max-jOPS, the maximum throughput number, and critical-jOPS, the critical throughput under service level agreements (SLAs).

The SPARC T8-1 server achieved 153,532 SPECjbb2015-MultiJVM max-jOPS and 89,980 SPECjbb2015-MultiJVM critical-jOPS on the SPECjbb2015 benchmark. Across the SPARC T8-1 server's 32 cores, these rates are 4,798 SPECjbb2015-MultiJVM max-jOPS per core and 2,812 SPECjbb2015-MultiJVM critical-jOPS per core.

The SPARC T8-1 server delivered 1.5 times more SPECjbb2015-MultiJVM critical-jOPS performance per core and 1.3 times more SPECjbb2015-MultiJVM max-jOPS performance per core than the SPARC T7-1 server, which uses SPARC M7 processors. The SPARC T8-1 server delivered 2.7 times more SPECjbb2015-MultiJVM critical-jOPS performance per core and 1.5 times more SPECjbb2015-MultiJVM max-jOPS performance per core than the Cisco UCS C240 M5 server using Intel Skylake processors (Intel Xeon Platinum 8180 Processor).

From SPEC's press release: "The SPECjbb2015 benchmark is based on the usage model of a worldwide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases, and data-mining operations. It exercises Java 7 and higher features, using the latest data formats (XML), communication using compression, and secure messaging."

Performance Landscape

Single-chip SPECjbb2015-MultiJVM results from www.spec.org as of August 28, 2017 and this report (for SPARC T8-1 results), ordered by SPECjbb2015-MultiJVM max-jOPS (bigger is better).

SPECjbb2015 MultiJVM Results (single chip)
  SPARC T8-1, 1 x SPARC M8 (5.06 GHz, 1x 32core): 153,532 max / 89,980 crit; per core 4,798 / 2,812; Oracle Solaris 11.3, JDK 8u144
  SPARC T7-1, 1 x SPARC M7 (4.13 GHz, 32core): 120,603 max / 60,280 crit; per core 3,769 / 1,884; Oracle Solaris 11.3, JDK 8u66
  HP ProLiant DL380 Gen10, 1 x Intel Xeon Platinum 8180 (2.50 GHz, 1x 28core): 84,142 max / 25,431 crit; per core 3,005 / 908; SUSE Linux Enterprise Server 12 SP2, JDK 8u131
  Lenovo ThinkSystem SR650, 1 x Intel Xeon Platinum 8180 (2.50 GHz, 1x 28core): 83,909 max / 26,145 crit; per core 2,997 / 934; SUSE Linux Enterprise Server, JDK 8u131
  HP ProLiant DL380 Gen10, 1 x Intel Xeon Platinum 8180 (2.50 GHz, 1x 28core): 74,086 max / 62,592 crit; per core 2,646 / 2,235; Red Hat Enterprise Linux Server 7.3, JDK 8u131

Best SPECjbb2015-MultiJVM max-jOPS per core results (all chip counts) from www.spec.org as of August 28, 2017 and this report (for SPARC T8-1 results), ordered by Perf/Core (bigger is better).
SPECjbb2015 MultiJVM Results (best per-core results, all chip counts)
  SPARC T8-1, 1 x SPARC M8 (5.06 GHz, 1x 32core): 153,532 max / 89,980 crit; per core 4,798 / 2,812; Oracle Solaris 11.3, JDK 8u144
  Fujitsu SPARC M12-2S, 1 x SPARC64 XII (4.25 GHz, 1x 12core): 54,434 max / 34,771 crit; per core 4,536 / 2,898; Oracle Solaris 11.3, JDK 8u121
  IBM Power S812LC, 1 x POWER8 (2.92 GHz, 10core): 44,883 max / 13,032 crit; per core 4,488 / 1,303; Ubuntu 14.04.3, J9 VM
  SPARC S7-2, 2 x SPARC S7 (4.26 GHz, 2x 8core): 65,790 max / 35,812 crit; per core 4,112 / 2,238; Oracle Solaris 11.3, JDK 8u92
  SPARC T7-1, 1 x SPARC M7 (4.13 GHz, 32core): 120,603 max / 60,280 crit; per core 3,769 / 1,884; Oracle Solaris 11.3, JDK 8u66
  Cisco UCS C240 M5, 2 x Intel Xeon Platinum 8180 (2.50 GHz, 2x 28core): 179,534 max / 58,094 crit; per core 3,206 / 1,037; SUSE Linux Enterprise Server 12 SP2, JDK 8u131
  Lenovo ThinkSystem SR650, 2 x Intel Xeon Platinum 8180 (2.50 GHz, 2x 28core): 177,561 max / 54,418 crit; per core 3,170 / 972; SUSE Linux Enterprise Server, JDK 8u131
  Huawei FusionServer 2288H V5, 2 x Intel Xeon Platinum 8180 (2.50 GHz, 2x 28core): 175,588 max / 48,977 crit; per core 3,135 / 874; Red Hat Enterprise Linux, JDK 8u131
  HP ProLiant DL380 Gen10, 1 x Intel Xeon Platinum 8180 (2.50 GHz, 1x 28core): 84,142 max / 25,431 crit; per core 3,005 / 908; SUSE Linux Enterprise Server 12 SP2, JDK 8u131
  Lenovo ThinkSystem SR650, 1 x Intel Xeon Platinum 8180 (2.50 GHz, 1x 28core): 83,909 max / 26,145 crit; per core 2,997 / 934; SUSE Linux Enterprise Server, JDK 8u131
  Huawei RH2288H V3, 2 x Intel Xeon E5-2699 v4 (2.2 GHz, 2x 22core): 121,381 max / 38,595 crit; per core 2,759 / 877; Red Hat 6.7, JDK 8u92
  HP ProLiant DL360 Gen9, 2 x Intel Xeon E5-2699 v4 (2.2 GHz, 2x 22core): 120,674 max / 29,013 crit; per core 2,743 / 659; Red Hat 7.2, JDK 8u74
  Huawei RH2288H V3, 2 x Intel Xeon E5-2699 v3 (2.3 GHz, 2x 18core): 98,673 max / 28,824 crit; per core 2,741 / 801; Red Hat 6.7, JDK 8u92
  HP ProLiant DL380 Gen10, 1 x Intel Xeon Platinum 8180 (2.50 GHz, 1x 28core): 74,086 max / 62,592 crit; per core 2,646 / 2,235; Red Hat Enterprise Linux Server 7.3, JDK 8u131
  SPARC T5-2, 2 x SPARC T5 (3.6 GHz, 2x 16core): 80,889 max / 37,422 crit; per core 2,528 / 1,169; Oracle Solaris 11.2, JDK 8u66
  HP ProLiant DL380 Gen9, 2 x Intel Xeon E5-2699 v4 (2.2 GHz, 2x 22core): 105,690 max / 52,952 crit; per core 2,402 / 1,203; Red Hat 7.2, JDK 8u72
  Lenovo Flex System x240 M5, 2 x Intel Xeon E5-2699 v3 (2.3 GHz, 2x 18core): 80,889 max / 43,654 crit; per core 2,247 / 1,213; Red Hat 6.5, JDK 8u60
  Cisco UCS C220 M4, 2 x Intel Xeon E5-2699 v4 (2.2 GHz, 2x 22core): 94,667 max / 71,951 crit; per core 2,152 / 1,635; Red Hat 6.7, JDK 8u74

Note: in each entry, "max" and "crit" are the SPECjbb2015-MultiJVM max-jOPS and SPECjbb2015-MultiJVM critical-jOPS results, and the per-core values are those results divided by the system's core count. The environment listed is the operating system version, the JDK version, and any special configuration.

Configuration Summary

System Under Test:
  SPARC T8-1 Server
  1 x SPARC M8 processor (5.06 GHz)
  512 GB memory (16 x 32 GB DIMMs)
  Oracle Solaris 11.3 (11.3.23.5.0)
  Java HotSpot 64-Bit Server VM, version 1.8.0_144

Benchmark Description

The benchmark description, as found at the SPEC website:

The SPECjbb2015 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community.
Features include:
  - A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations.
  - Both a pure throughput metric and a metric that measures critical throughput under service level agreements (SLAs) specifying response times ranging from 10 ms to 100 ms.
  - Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers.
  - Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security.
  - Support for virtualization and cloud environments.

See Also
  SPECjbb2015 Results Website
  More Information on SPECjbb2015
  SPARC T8-1 Server: oracle.com, OTN, Blog
  Oracle Solaris: oracle.com, OTN, Blog
  Java: oracle.com, OTN

Disclosure Statement
SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results from http://www.spec.org as of 8/28/2017. Fujitsu SPARC M12-2S 54,434 SPECjbb2015-MultiJVM max-jOPS, 34,771 SPECjbb2015-MultiJVM critical-jOPS; Cisco UCS C240 M5 179,534 SPECjbb2015-MultiJVM max-jOPS, 58,094 SPECjbb2015-MultiJVM critical-jOPS; Lenovo ThinkSystem SR650 177,561 SPECjbb2015-MultiJVM max-jOPS, 54,418 SPECjbb2015-MultiJVM critical-jOPS; Huawei FusionServer 2288H V5 175,588 SPECjbb2015-MultiJVM max-jOPS, 48,977 SPECjbb2015-MultiJVM critical-jOPS; SPARC T8-1 153,532 SPECjbb2015-MultiJVM max-jOPS, 89,980 SPECjbb2015-MultiJVM critical-jOPS; Huawei RH2288H V3 121,381 SPECjbb2015-MultiJVM max-jOPS, 38,595 SPECjbb2015-MultiJVM critical-jOPS; HP ProLiant DL360 Gen9 120,674 SPECjbb2015-MultiJVM max-jOPS, 29,013 SPECjbb2015-MultiJVM critical-jOPS; SPARC T7-1 120,603 SPECjbb2015-MultiJVM max-jOPS, 60,280 SPECjbb2015-MultiJVM critical-jOPS; HP ProLiant DL380 Gen9 105,690 SPECjbb2015-MultiJVM max-jOPS, 52,952 SPECjbb2015-MultiJVM critical-jOPS; Huawei RH2288H V3 98,673 SPECjbb2015-MultiJVM max-jOPS, 28,824 SPECjbb2015-MultiJVM critical-jOPS; Cisco UCS C220 M4 94,667 SPECjbb2015-MultiJVM max-jOPS, 71,951 SPECjbb2015-MultiJVM critical-jOPS; HP ProLiant DL380 Gen10 84,142 SPECjbb2015-MultiJVM max-jOPS, 25,431 SPECjbb2015-MultiJVM critical-jOPS; Lenovo ThinkSystem SR650 83,909 SPECjbb2015-MultiJVM max-jOPS, 26,145 SPECjbb2015-MultiJVM critical-jOPS; Lenovo Flex System x240 M5 80,889 SPECjbb2015-MultiJVM max-jOPS, 43,654 SPECjbb2015-MultiJVM critical-jOPS; SPARC T5-2 80,889 SPECjbb2015-MultiJVM max-jOPS, 37,422 SPECjbb2015-MultiJVM critical-jOPS; HP ProLiant DL380 Gen10 74,086 SPECjbb2015-MultiJVM max-jOPS, 62,592 SPECjbb2015-MultiJVM critical-jOPS; SPARC S7-2 65,790 SPECjbb2015-MultiJVM max-jOPS, 35,812 SPECjbb2015-MultiJVM critical-jOPS; IBM Power S812LC 44,883 SPECjbb2015-MultiJVM max-jOPS, 13,032 SPECjbb2015-MultiJVM critical-jOPS.
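SPECjbb2015 in MultiJVM mode runs a controller JVM, one or more transaction injector JVMs, and one or more backend JVMs on the same host. The following is a hedged illustration only of what a multi-group launch can look like; the -m/-G/-J options are as I recall them from the SPECjbb2015 run scripts, and the heap sizes, group count, and processor-set binding are assumptions rather than the tuning used for this result.

    # Controller JVM (flags and sizes are illustrative).
    java -Xms2g -Xmx2g -jar specjbb2015.jar -m MULTICONTROLLER &

    # One transaction injector / backend pair per group, each bound to a
    # pre-created Solaris processor set (set IDs 1-4 are assumed to exist).
    for i in 1 2 3 4; do
      psrset -e $i java -Xms2g -Xmx2g -jar specjbb2015.jar \
          -m TXINJECTOR -G GRP$i -J JVMTI$i &
      psrset -e $i java -Xms24g -Xmx24g -jar specjbb2015.jar \
          -m BACKEND -G GRP$i -J JVMBE$i &
    done
    wait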



Apache Spark SQL: SPARC T8-1 Up To 2x Advantage Under Load Compared to 2-Chip x86 E5-2630 v4

The table below compares the SPARC T8-1 server and a two-chip Intel Xeon Processor E5-2630 v4 server running the same analytic queries and transactions against a Real Cardinality Database (RCDB) with a 600 million row fact table. All of the following results were run as part of this benchmark effort.

Apache Spark SQL RCDB Performance Chart (elapsed time in seconds; lower is better)

  Workload       SPARC T8-1   2-chip x86   SPARC advantage
  Q11x           5.3          7.8          1.8x
  Q12x           4.1          5.8          1.8x
  Q13x           5.7          8.0          1.8x
  Q15x           5.9          8.1          1.7x
  Qima           9.1          13.2         1.8x
  Cube Create    12.9         19.4         1.9x
  Pivot Create   12.3         19.2         2.0x

  SPARC T8-1: 16 cores enabled, 1 x SPARC M8
  2-chip x86 server: 20 total cores, 2 x Intel Xeon Processor E5-2630 v4
  SPARC advantage: SPARC core*seconds advantage of the SPARC M8 over the x86 server (see the worked example after this post)
  Q11x, Q12x, Q13x, Q15x and Qima are queries; Cube Create is cube creation; Pivot Create is pivot table creation.

Configuration Summary

SPARC Server:
  1 x SPARC T8-1 server with
  1 x SPARC M8 processor
  1 TB memory
  Oracle Solaris 11.3
  Apache Spark 2.1.0
  Java SE 8

x86 Server:
  1 x Oracle Server X6-2 system with
  2 x Intel Xeon Processor E5-2630 v4
  256 GB memory
  Oracle Linux 7.2
  Apache Spark 2.1.0
  Java SE 8

Benchmark Description

The Apache Spark SQL benchmark consists of a set of queries, table scans, cube creation and pivot table creation against a Real Cardinality Database (RCDB) with data stored on local disks. The RCDB data size is about 52 GB on disk. The test executes Apache Spark SQL operations out of memory after reading the entire dataset from the local disks.

The RCDB star schema consists of a single fact table with over 600 million rows and 4 dimension tables. There are 56 columns with cardinality varying between 5 and 2,000, with the exception of primary keys, which have much higher cardinality (e.g. the fact table primary key).

The workload consists of 7 unique business queries. For example, "In a specific year, what was the total revenue generated by orders of a specific item?" or "total sales of each shipping mode with each priority level". Cube creation involves a heavy aggregation of two column features across the fact table with 600 million rows. Pivot table creation builds a two-dimensional output involving a heavy aggregation of three column features across the join of the fact table and a dimension table.

The test output reports average elapsed times from running each query, cube creation and pivot table creation for 20 iterations.

Key Points and Best Practices
  - Multiple executor JVMs sometimes help with performance.
  - df.repartition(N), with N set to 2x system cores for x86 systems and 8x system cores for SPARC systems.
  - Spark tunings (spark-defaults.conf):
      spark.sql.tungsten.enabled true
      spark.sql.codegen true
      spark.sql.codegen.wholeStage true
      spark.sql.inMemoryColumnarStorage.batchSize 16384
      spark.sql.inMemoryColumnarStorage.compressed true
      spark.memory.offHeap.enabled false
      spark.executor.extraJavaOptions -XX:+UseG1GC
  - All SPARC T8-1 server and x86 results were run with out-of-the-box OS tuning.

See Also
  Apache Spark
  Apache Spark SQL
  SPARC T8-1 Server: oracle.com, OTN, Blog
  Oracle Server X6-2: oracle.com, OTN, Blog
  Oracle Solaris: oracle.com, OTN, Blog
  Oracle Linux: oracle.com, OTN, Blog

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of September 19, 2017. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
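The "SPARC core*seconds advantage" figures in this and several of the following posts are not defined explicitly. Reading them as elapsed time multiplied by the number of cores used reproduces the published ratios, so the following is my hedged interpretation rather than an Oracle-stated formula. Worked for the Q11x query above:

\[
\text{advantage} \;=\; \frac{T_{\mathrm{x86}} \times C_{\mathrm{x86}}}{T_{\mathrm{SPARC}} \times C_{\mathrm{SPARC}}}
\;=\; \frac{7.8 \times 20}{5.3 \times 16} \;\approx\; 1.8
\]

The same reading matches the per-algorithm ratios reported in the Spark ML, Oracle Advanced Analytics, and PGX posts below.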



Apache Spark ML: SPARC T8-1 Up To 1.8x Advantage Under Load Compared to 2-Chip x86 E5-2630 v4

Oracle's SPARC T8-1 server has shown up to a 1.8x advantage under load compared to a two-chip x86 server with Intel Xeon Processor E5-2630 v4 running Apache Spark 2.1 ML model creations.

Performance Landscape

The table below compares the SPARC T8-1 server and a two-chip Intel Xeon Processor E5-2630 v4 server running the same model creation transactions against a Real Cardinality Database (RCDB) with a 60 million row fact table and an aviation ontime dataset with 159 million rows. All of the following results were run as part of this benchmark effort.

Apache Spark ML Performance Chart (elapsed time in seconds; lower is better)

  Workload / model              SPARC T8-1   2-chip x86   SPARC advantage
  RCDB Logistic Regression      28.0         39.3         1.8x
  RCDB Naïve Bayes              12.3         17.4         1.8x
  Ontime Logistic Regression    142.1        185.2        1.6x
  Ontime K-Means                411          545          1.7x
  Ontime Decision Tree          92           122          1.7x

  SPARC T8-1: 16 cores enabled, 1 x SPARC M8
  2-chip x86 server: 20 total cores, 2 x Intel Xeon Processor E5-2630 v4
  SPARC advantage: SPARC core*seconds advantage of the SPARC M8 over the x86 server

Configuration Summary

SPARC Server:
  1 x SPARC T8-1 server with
  1 x SPARC M8 processor
  1 TB memory
  Oracle Solaris 11.3
  Apache Spark 2.1.0
  Java SE 8

x86 Server:
  1 x Oracle Server X6-2 system with
  2 x Intel Xeon Processor E5-2630 v4
  256 GB memory
  Oracle Linux 7.2
  Apache Spark 2.1.0
  Java SE 8

Benchmark Description

The Apache Spark ML benchmark consists of a Logistic Regression model creation against a Real Cardinality Database (RCDB) and a set of ML model creations against an aviation ontime dataset, with data stored on local disks. The RCDB data size is about 52 GB on disk. The ontime data size is about 10 GB on disk. The test executes Apache Spark ML operations out of memory after reading the entire dataset from the local disks.

The RCDB star schema consists of a single fact table with over 600 million rows. There are 56 columns with cardinality varying between 5 and 2,000, with the exception of primary keys, which have much higher cardinality (e.g. the fact table primary key).

The ontime dataset contains historical aviation data, ontime performance for US flights gathered between 1987 and 2014. The dataset consists of 159 million rows with 21 columns. Any missing data points in the dataset were replaced by zeros and only numerical columns were used during the model creation phase. For the Naïve Bayes test, columns containing negative numbers were not used. Between 14 and 17 of the 21 available columns were used for model creation, depending upon the test being executed.

MLlib is Apache Spark's scalable machine learning library, with APIs in Java, Scala and Python. The benchmark exercised the following functions:
  - Naïve Bayes
  - Logistic Regression
  - K-Means
  - Decision Tree

The test output reports the elapsed time of the last of three iterations of creating each ML model.

Key Points and Best Practices
  - Multiple executor JVMs sometimes help with performance.
  - df.repartition(N), with N set to 2x system cores for x86 systems and 8x system cores for SPARC systems.
  - Spark tunings (spark-defaults.conf):
      spark.sql.tungsten.enabled true
      spark.sql.codegen true
      spark.sql.codegen.wholeStage true
      spark.sql.inMemoryColumnarStorage.batchSize 16384
      spark.sql.inMemoryColumnarStorage.compressed true
      spark.memory.offHeap.enabled false
      spark.executor.extraJavaOptions -XX:+UseG1GC
  - All SPARC T8-1 server and x86 results were run with out-of-the-box OS tuning.
  (A hedged spark-submit sketch using these settings follows this post.)

See Also
  Apache Spark
  Apache Spark ML
  Public Government Data, historical aviation
  SPARC T8-1 Server: oracle.com, OTN, Blog
  Oracle Server X6-2: oracle.com, OTN, Blog
  Oracle Solaris: oracle.com, OTN, Blog
  Oracle Linux: oracle.com, OTN, Blog

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of September 19, 2017. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
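Both Spark posts list the same spark-defaults.conf settings and repartitioning heuristic. The sketch below shows one hedged way those settings could be passed to a job with spark-submit; the driver class, jar name, and master URL are placeholders, not Oracle's actual test harness.

    # The --conf settings are taken from the Key Points above; the class, jar,
    # and master are illustrative assumptions.
    spark-submit \
      --master local[16] \
      --class com.example.rcdb.RunQueries \
      --conf spark.sql.tungsten.enabled=true \
      --conf spark.sql.codegen=true \
      --conf spark.sql.codegen.wholeStage=true \
      --conf spark.sql.inMemoryColumnarStorage.batchSize=16384 \
      --conf spark.sql.inMemoryColumnarStorage.compressed=true \
      --conf spark.memory.offHeap.enabled=false \
      --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC" \
      rcdb-queries.jar

    # Inside the job, the input DataFrame would be repartitioned to 8x the core
    # count on SPARC (here 8 x 16 = 128) and 2x the core count on x86:
    #   df.repartition(128)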



Oracle Advanced Analytics: SPARC T8-2 Up To 1.5x Advantage Under Load Compared to 2-Chip x86 E5-2699 v4

Running Oracle Advanced Analytics and comparing Oracle's SPARC M8 processor to an Intel Xeon Processor E5-2699 v4, the SPARC M8 processor delivered an advantage of up to 1.5x for scoring/prediction analysis and up to 1.4x for training/learning analysis. In both cases the SPARC T8-2 server is compared to a two-chip Intel Xeon Processor E5-2699 v4 based server; the per-algorithm advantages below are the core*seconds ratios shown in the tables that follow.

For the scoring/prediction algorithms, the SPARC M8 processor shows these advantages over the Intel Xeon Processor E5-2699 v4:
  - Support Vector Machine using the Interior Point Method solver (SVM IPM): 1.4x
  - Generalized Linear Model Regression (GLM Regression): 1.4x
  - Generalized Linear Model Classification (GLM Classification): 1.4x
  - Support Vector Machine using the Stochastic Gradient Descent solver (SVM SGD): 1.4x
  - K-Means: 1.5x

For the training/learning algorithms, the SPARC M8 processor shows these advantages over the Intel Xeon Processor E5-2699 v4:
  - SVM IPM: 1.2x
  - GLM Regression: 1.2x
  - GLM Classification: 1.2x
  - SVM SGD: 1.4x
  - K-Means: 1.4x

Oracle Advanced Analytics is an option of Oracle Database. Training/learning is the part of Machine Learning (ML) and statistics that analyzes a sample of data to create a model of what is most interesting for the desired analysis. Typically this is a compute intensive operation that involves many 64-bit floating-point calculations. The output of the training/learning stage is a model that can analyze huge datasets in a stage called scoring and/or prediction. While training/learning is a very important task, typically most time will be spent in the scoring/prediction stage.

Performance Landscape

All of the following results were run as part of this benchmark effort.
Oracle Advanced Analytics Summary – Scoring/Prediction (run time in seconds; lower is better)

  Method                            Attributes   2 x E5-2699 v4 (44 cores)   SPARC M8 (64 cores)   SPARC core*seconds advantage
  Supervised: SVM IPM Solver        900          47.9                        24.2                  1.4x
  Supervised: GLM Regression        900          59.5                        28.7                  1.4x
  Supervised: GLM Classification    900          40.5                        20.6                  1.4x
  Supervised: SVM SGD Solver        9000         48.9                        24.1                  1.4x
  Cluster Model: K-Means            9000         86.1                        40.2                  1.5x

Oracle Advanced Analytics Summary – Training/Learning (creating a model from data; run time in seconds; lower is better)

  Method                            Attributes   2 x E5-2699 v4 (44 cores)   SPARC M8 (64 cores)   SPARC core*seconds advantage
  Supervised: SVM IPM Solver        900          1380                        811                   1.2x
  Supervised: GLM Classification    900          305                         180                   1.2x
  Supervised: SVM SGD Solver        9000         170                         83                    1.4x
  Supervised: GLM Regression        900          100                         59                    1.2x
  Cluster Model: K-Means            9000         343                         168                   1.4x

Configuration Summary

SPARC Configuration:
  SPARC T8-2
  2 x SPARC M8 processors (5.0 GHz, 32 cores per chip)
  1 TB memory
  Oracle Solaris 11.3
  Oracle Database 12c Enterprise Edition (12.2.0.2)

x86 Configuration:
  Oracle Server X6-2
  2 x Intel Xeon Processor E5-2699 v4 (2.2 GHz, 22 cores per chip)
  512 GB memory
  Oracle Linux 7.2
  Oracle Database 12c Enterprise Edition (12.2.0.2)

Benchmark Description

The benchmark tests various capabilities of Oracle Advanced Analytics. Statistical analysis was run on historical aviation data, ontime performance for US flights gathered between 1987 and 2014. The dataset sizes were chosen to avoid I/O. The scoring/prediction was tested on a one billion row dataset; the model was first built on a 159 million row dataset. The training/learning was tested on a 640 million row dataset. The degree of parallelism was set to get the optimal performance for each test. The actual run times of the analysis calculations are reported.

The following algorithms were tested:
  - Supervised: Support Vector Machine algorithm using the Interior Point Method solver (SVM IPM); the tables contained about 900 mining attributes
  - Supervised: Generalized Linear Model Classification algorithm (GLM Classification); the tables contained about 900 mining attributes
  - Supervised: Support Vector Machine algorithm using the Stochastic Gradient Descent solver (SVM SGD solver); the tables contained about 9000 mining attributes
  - Supervised: Generalized Linear Model Regression algorithm (GLM Regression); the tables contained about 900 mining attributes
  - Cluster Model: K-Means algorithm; the tables contained about 9000 mining attributes

See Also
  Public Government Data, historical aviation
  SPARC T8-2 Server: oracle.com, OTN, Blog
  Oracle Server X6-2: oracle.com, OTN, Blog
  Oracle Solaris: oracle.com, OTN, Blog
  Oracle Database: oracle.com, OTN, Blog
  Oracle Advanced Analytics: oracle.com, OTN, Blog

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of September 19, 2017. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.



Yahoo Cloud Serving Benchmark: SPARC T8-1 and Oracle NoSQL Advantage Over x86 E5-2699 v4 Server Per Core Under Load

Oracle's SPARC T8-1 server delivered 848 Kops/sec on 300 million records for the Yahoo Cloud Serving Benchmark (YCSB) running a 95% read, 5% update workload using Oracle NoSQL Database 4.4. NoSQL is important for big data analysis and for cloud computing.

The SPARC T8-1 server also delivered 772 Kops/sec with ZFS encryption on 300 million records for the same 95% read, 5% update YCSB workload using Oracle NoSQL Database 4.4.

The SPARC T8-1 server was 2.1 times faster per core than a two-chip x86 E5-2699 v4 server running YCSB with a 95% read, 5% update workload. With encryption enabled, the SPARC T8-1 server was 2.8 times faster per core than the two-chip x86 E5-2699 v4 server on the same workload. Moving from clear data to ZFS encryption reduced throughput on the SPARC T8-1 server by only 9%, while maintaining a low average write latency of about 3.5 ms.

Performance Landscape

The table below compares the SPARC T8-1 server and the two-chip Intel Xeon Processor E5-2699 v4 server. All of the following results were run as part of this benchmark effort.

YCSB Benchmark Performance
  SPARC T8-1, 1 x SPARC M8 (1x 32core), clear: insert 158,845 ops/sec at 2.03 ms avg write latency; mixed load (95% read, 5% update) 848,041 ops/sec at 0.61 ms avg read / 3.45 ms avg write latency; 26,501 ops/sec per core
  SPARC T8-1, 1 x SPARC M8 (1x 32core), ZFS encryption: insert 160,038 ops/sec at 2.02 ms; mixed load 771,734 ops/sec at 0.64 ms read / 3.52 ms write; 24,117 ops/sec per core
  2-chip x86 server, 2 x E5-2699 v4 (2x 22core), clear: insert 156,229 ops/sec at 1.61 ms; mixed load 552,523 ops/sec at 0.61 ms read / 2.15 ms write; 12,557 ops/sec per core
  2-chip x86 server, 2 x E5-2699 v4 (2x 22core), ZFS encryption: insert 144,230 ops/sec at 1.74 ms; mixed load 385,817 ops/sec at 0.80 ms read / 3.97 ms write; 8,769 ops/sec per core

Configuration Summary

SPARC System:
  SPARC T8-1 server
  1 x SPARC M8 processor
  512 GB memory
  4 x Oracle Flash Accelerator F320 PCIe card
  2 x 10 GbE PCIe port

x86 System:
  Oracle Server X6-2L server
  2 x Intel Xeon Processor E5-2699 v4
  512 GB memory
  2 x Oracle Flash Accelerator F320 PCIe card
  1 x Sun Dual Port 10 GbE PCIe 2.0 Low Profile Adapter

Software Configuration:
  Oracle Solaris 11.3 (11.3.19.5.0)
  Oracle NoSQL Database, Enterprise Edition 12c (R2.4.4.6)
  Java SE 8 (build 1.8.0_131)
  Logical Domains Manager v3.4 (running on the SPARC T8-1)

Benchmark Description

The Yahoo Cloud Serving Benchmark (YCSB) is a performance benchmark for cloud databases and the systems that serve them. The benchmark documentation says:

  With the many new serving databases available including Sherpa, BigTable, Azure and many more, it can be difficult to decide which system is right for your application, partially because the features differ between systems, and partially because there is not an easy way to compare the performance of one system versus another. The goal of the Yahoo Cloud Serving Benchmark (YCSB) project is to develop a framework and common set of workloads for evaluating the performance of different "key-value" and "cloud" serving stores.

Key Points and Best Practices for the SPARC T8-1
  - One Oracle VM for SPARC (LDom) domain was created on half of the processor with 240 GB memory, accessing two PCIe I/O slots using the Direct I/O feature.
  - The 300 million records were loaded into 6 shards with the replication factor set to 3.
  - Six processor sets were created to host the six Storage Nodes. The default processor set was additionally used for the OS and I/O interrupts. The processor sets were used for isolation and to ensure a balanced load.
  - The fixed priority (FX) class was assigned to the Oracle NoSQL Storage Node java processes.
  - The ZFS record size was set to 16K (default 128K), which worked best for the 95% read, 5% update workload (see the ZFS sketch after this post).
  - Two Oracle Server X6-2 systems were used as clients for generating the workload.
  - The server and client systems were connected through a 10 GbE network.
  - The default ZFS encryption algorithm (aes-128-ccm) was used for disk encryption.

See Also
  Yahoo Cloud Serving Benchmark
  YCSB Source
  SPARC T8-1 Server: oracle.com, OTN, Blog
  Oracle Server X6-2L: oracle.com, OTN, Blog
  Oracle Flash Accelerator F320 PCIe Card: oracle.com, OTN
  Oracle Solaris: oracle.com, OTN, Blog
  Oracle NoSQL Database: oracle.com, OTN, Blog

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of September 19, 2017. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
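The Key Points above mention a 16K ZFS record size and the default ZFS encryption algorithm (aes-128-ccm). A minimal, hedged sketch of creating such a dataset with Oracle Solaris 11 ZFS follows; the pool and dataset names are illustrative assumptions, and the interactive passphrase key handling shown is only one of several options.

    # Create an encrypted dataset for the NoSQL store; encryption=on selects the
    # default aes-128-ccm algorithm mentioned above (names are examples).
    zfs create -o encryption=on -o keysource=passphrase,prompt rpool/kvstore

    # Use a 16K record size (the default is 128K) to match the small-record workload.
    zfs set recordsize=16k rpool/kvstore

    # Verify the properties.
    zfs get recordsize,encryption rpool/kvstore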



Yahoo Cloud Serving Benchmark: SPARC T8-1 and Apache Cassandra 2.2x Advantage Over x86 E5-2630 v4 Server Per Core

Running the Yahoo Cloud Serving Benchmark (YCSB) against a 50 million record Cassandra database, Oracle's SPARC T8-1 server delivered 289,802 ops/sec on a 50% read, 50% write workload. The SPARC T8-1 server was 2.2 times faster per core under load than a two-chip x86 E5-2630 v4 server running YCSB with a 50% read, 50% update workload, and it delivered 5 times better average latency than that server on the same workload.

Performance Landscape

The table below compares the SPARC T8-1 server and the two-chip x86 E5-2630 v4 server. All of the following results were run as part of this benchmark effort.

YCSB Benchmark Performance (mixed load: 50% read, 50% update)
  SPARC T8-1, 1 x SPARC M8 (32 core), 16 cores used: 289,802 ops/sec; 1 ms avg read latency; 1 ms avg write latency; 18,112 ops/sec per core
  x86 E5-2630 v4 server, 2 x Intel Xeon Processor E5-2630 v4 @ 2.20 GHz (2 x 10core), 20 cores used: 166,717 ops/sec; 5 ms avg read latency; 5 ms avg write latency; 8,336 ops/sec per core
  Oracle advantage: 5x (read latency), 5x (write latency), 2.2x (throughput per core)

Note: this compares a 20-core x86 system to 16 cores (out of 32) of the SPARC T8-1 server.

Configuration Summary

SPARC System:
  SPARC T8-1 server
  1 x SPARC M8 processor (5.0 GHz, 32 cores per processor)
  512 GB memory
  2 x Oracle Flash Accelerator F320 PCIe card
  Oracle Solaris 11.3
  Apache Cassandra 2.2.8
  Java SE 9

x86 System:
  Oracle Server X6-2
  2 x Intel Xeon Processor E5-2630 v4 (2.20 GHz, 10 cores per processor)
  512 GB memory
  4 x Oracle Flash Accelerator F160 PCIe card
  Oracle Linux Server release 7.1 (3.10.0-229.el7.x86_64)
  Apache Cassandra 2.2.8
  Java SE 8 (build 1.8.0_102-b14)

Benchmark Description

The Yahoo Cloud Serving Benchmark (YCSB) is a performance benchmark for cloud databases and the systems that serve them. The benchmark documentation says:

  With the many new serving databases available including Sherpa, BigTable, Azure and many more, it can be difficult to decide which system is right for your application, partially because the features differ between systems, and partially because there is not an easy way to compare the performance of one system versus another. The goal of the Yahoo Cloud Serving Benchmark (YCSB) project is to develop a framework and common set of workloads for evaluating the performance of different "key-value" and "cloud" serving stores.

The 50 million records were loaded into a database on the servers under test using the provided YCSB load command:

    ./bin/ycsb load cassandra2-cql -p hosts="$HOST" -p host="$HOST" -p port=9042 -P workloads/workloadb.load

(A hedged example of the corresponding run command follows this post.)

Key Points and Best Practices
  - The fixed priority (FX) class was assigned to the Apache Cassandra processes when running on the SPARC T8-1 server.
  - The ZFS record size was set to 16K (default 128K), which worked best for the 50% read, 50% write workload.
  - On the SPARC T8-1 server, Cassandra's lz4-java.jar was configured so that the native lz4.c routines were built as a loadable shared library packaged within the jar file. On x86/Linux, the default Cassandra installation already includes such a native loadable shared library.
  - The SPARC T8-1 server was restricted to 16 cores by using the "psradm" command to disable the other 16 cores.

See Also
  Yahoo Cloud Serving Benchmark
  YCSB Source
  Apache Cassandra Database
  SPARC T8-1 Server: oracle.com, OTN, Blog
  Oracle Server X6-2: oracle.com, OTN, Blog
  Oracle Flash Accelerator F320 PCIe Card: oracle.com, OTN
  Oracle Solaris: oracle.com, OTN, Blog

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of September 19, 2017. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
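The post shows only the YCSB load phase. For completeness, below is a hedged sketch of a matching run phase; the thread count, record and operation counts are placeholders, and workloada is YCSB's standard 50% read / 50% update core workload, which may differ from the customized workload file used for these results.

    # Run the mixed 50/50 workload against the loaded Cassandra database.
    ./bin/ycsb run cassandra2-cql -p hosts="$HOST" -p port=9042 \
        -P workloads/workloada \
        -p recordcount=50000000 -p operationcount=200000000 \
        -threads 200 -s > ycsb_run.out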



Parallel Graph AnalytiX (PGX): SPARC T8-1 Up To 2.9x Advantage Under Load Compared to 2-Chip x86 E5-2699 v3

Parallel Graph AnalytiX (PGX) is a fast, parallel, in-memory graph analytic framework that allows users to load their graph data, run analytic algorithms, and browse or store the result. Five different PGX algorithms were executed on Oracle's SPARC T8-1 server and compared to a two-chip Intel Xeon Processor E5-2699 v3 based server.

  - For the Single Source Shortest Path (SSSP), Bellman-Ford algorithm, a SPARC M8 processor shows a 2.9x advantage compared to an Intel Xeon Processor E5-2699 v3.
  - For the Page Rank algorithm, a SPARC M8 processor shows a 1.9x advantage.
  - For the Approximate Vertex Betweenness Centrality algorithm, a SPARC M8 processor shows a 2.3x advantage.
  - For the k-Cores algorithm, a SPARC M8 processor shows a 1.8x advantage.
  - For the Triangle Counting algorithm, a SPARC M8 processor shows a 1.7x advantage.

Performance Landscape

All of the following results were run as part of this benchmark effort.

Parallel Graph AnalytiX (PGX) Summary – Scale Factor 28 Problem Size (run time in seconds; lower is better)

  Algorithm                                    2 x E5-2699 v3 (36 cores)   SPARC M8 (32 cores)   SPARC core*seconds advantage
  Page Rank                                    67.2                        38.8                  1.9x
  Triangle Counting                            1169                        761                   1.7x
  Single Source Shortest Path                  17.4                        6.76                  2.9x
  Approximate Vertex Betweenness Centrality    25.7                        12.5                  2.3x
  k-Cores                                      136                         83.5                  1.8x

Configuration Summary

SPARC Configuration:
  SPARC T8-1
  1 x SPARC M8 processor (5.0 GHz, 32 cores per chip)
  512 GB memory
  Oracle Solaris 11.3
  Java SE 8 (build 1.8.0_111-b14)
  PGX Shell 2.5.0

x86 Configuration:
  Oracle Server X5-2
  2 x Intel Xeon Processor E5-2699 v3 (2.3 GHz, 18 cores per chip)
  512 GB memory
  Oracle Linux 7.2 (3.10.0-327.el7.x86_64)
  Java SE 8 (build 1.8.0_111-b14)
  PGX Shell 2.5.0

Benchmark Description

Graphs are a core part of many analytics workloads. They are very data intensive and stress the computational capability of computer systems. Each algorithm typically traverses the entire graph multiple times, while doing certain arithmetic operations during the traversal and performing (double/single precision) floating point operations.

Five computational graph algorithms were run to obtain these results: PageRank, Triangle Counting, Single-Source Shortest Path (SSSP)/Bellman-Ford, Approximate Vertex Betweenness Centrality, and k-Cores, using Graph500/Scale Factor 28 data, which has a size of 64 GB on storage and contains 121 million vertices and 4.3 billion edges.

The mathematics of PageRank are entirely general and apply to any graph or network in any domain. Thus, PageRank is now regularly used in bibliometrics, social and information network analysis, and for link prediction and recommendation. The PageRank algorithm counts the number and quality of links to a page to determine a rough estimate of the importance of the website.

SSSP (Single Source Shortest Path), the Bellman-Ford algorithm, finds the shortest paths from a source vertex to all other vertices in the graph. It is often used to find the shortest distance between two points, e.g. in Google Maps, and is also used in operations research and "six degrees of separation".

Triangle Counting counts the number of triangles in a graph and is used in complex network analysis.

Approximate Vertex Betweenness Centrality computes an approximation of the betweenness centrality value for each node. The implementation performs a breadth first search around k randomly selected nodes to partially compute and approximate the betweenness centrality values.

The k-Cores algorithm computes the k-core decomposition of the graph. It returns the largest k-core value (and vertex ID) found in the graph, and a mapping of each vertex to its computed k-core value. This could be used when looking for the most important vertices.

See Also
  PageRank Description
  Shortest Path Description
  Bellman-Ford Description
  k-Cores Description
  More on PGX: OTN, Docs
  SPARC T8-1 Server: oracle.com, OTN, Blog
  Oracle Server X5-2: oracle.com, OTN, Blog
  Oracle Solaris: oracle.com, OTN, Blog
  Java: oracle.com, OTN

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of September 19, 2017. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
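As a reference for the PageRank description above, the standard iterative formulation (generic notation, not PGX-specific) is:

\[
PR(v) \;=\; \frac{1-d}{N} \;+\; d \sum_{u \in \mathrm{In}(v)} \frac{PR(u)}{\mathrm{outdeg}(u)}
\]

where N is the number of vertices, In(v) is the set of vertices linking to v, outdeg(u) is the number of outgoing edges of u, and d is a damping factor, commonly set to 0.85.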



AES Encryption: SPARC M8 Performance, Beats x86 Per Core Under Load

Oracle's cryptography benchmark measures security performance on important AES security modes. Oracle's SPARC M8 processor with its software-in-silicon security is faster than x86 servers that have the AES-NI instructions. In this test, the performance of on-processor encryption operations is measured (32 KB encryptions). Multiple threads are used to measure each processor's maximum throughput. Oracle's SPARC T8-2 server shows dramatically faster encryption compared to recent two-processor x86 servers.

  - SPARC M8 processors running Oracle Solaris 11.3 ran 2.9 times faster executing AES-CFB 256-bit key encryption (in cache) than the Intel Xeon Processor Platinum 8168 (with AES-NI) running Oracle Linux 7.3.
  - SPARC M8 processors running Oracle Solaris 11.3 ran 6.4 times faster executing AES-CFB 256-bit key encryption (in cache) than the Intel Xeon Processor E5-2699 v4 (with AES-NI) running Oracle Linux 7.2.
  - SPARC M8 processors running Oracle Solaris 11.3 ran 5.7 times faster executing AES-CFB 128-bit key encryption (in cache) than the Intel Xeon Processor E5-2699 v4 (with AES-NI) running Oracle Linux 7.2.
  - SPARC M8 processors running Oracle Solaris 11.3 ran 7.8 times faster executing AES-CFB 256-bit key encryption (in cache) than the Intel Xeon Processor E5-2699 v3 (with AES-NI) running Oracle Linux 6.5.
  - SPARC M8 processors running Oracle Solaris 11.3 ran 7.0 times faster executing AES-CFB 128-bit key encryption (in cache) than the Intel Xeon Processor E5-2699 v3 (with AES-NI) running Oracle Linux 6.5.

AES-CFB encryption is used by Oracle Database for Transparent Data Encryption (TDE), which provides security for database storage.

The SPARC M8 processor has improved cryptographic support. A second 3-cycle cryptographic unit has been added, which allows some ciphers like AES to double in performance. Long-latency operations like SHA are not able to use the additional unit and see a performance boost only from the improved clock speed. Oracle has also measured SHA digest performance on the SPARC M8 processor.

Performance Landscape

Presented below are results for running encryption using the AES cipher with the CFB and CBC modes for key sizes of 128, 192 and 256 bits. Decryption performance was similar and is not presented. Results are presented as MB/sec (10**6). All SPARC M8 processor results were run as part of this benchmark effort. All other results were run during previous benchmark efforts.

Encryption Performance – AES-CFB (used by Oracle Database)

Performance is presented for in-cache AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).
AES-CFB Microbenchmark Performance (MB/sec)

AES-256-CFB
  SPARC M8, 5.06 GHz, 2 chips: 250,181; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC M7, 4.13 GHz, 2 chips: 126,948; Oracle Solaris 11.3, libsoftcrypto + libumem
  Intel Platinum 8168, 2.70 GHz, 2 chips: 87,559; Oracle Linux 7.3, IPP/AES-NI
  SPARC T5, 3.60 GHz, 2 chips: 53,794; Oracle Solaris 11.2, libsoftcrypto + libumem
  Intel E5-2699 v4, 2.20 GHz, 2 chips: 39,034; Oracle Linux 7.2, IPP/AES-NI
  Intel E5-2699 v3, 2.30 GHz, 2 chips: 31,924; Oracle Linux 6.5, IPP/AES-NI
  Intel E5-2697 v2, 2.70 GHz, 2 chips: 19,964; Oracle Linux 6.5, IPP/AES-NI

AES-192-CFB
  SPARC M8, 5.06 GHz, 2 chips: 276,664; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC M7, 4.13 GHz, 2 chips: 144,299; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC T5, 3.60 GHz, 2 chips: 60,736; Oracle Solaris 11.2, libsoftcrypto + libumem
  Intel E5-2699 v4, 2.20 GHz, 2 chips: 45,351; Oracle Linux 7.2, IPP/AES-NI
  Intel E5-2699 v3, 2.30 GHz, 2 chips: 37,157; Oracle Linux 6.5, IPP/AES-NI
  Intel E5-2697 v2, 2.70 GHz, 2 chips: 23,218; Oracle Linux 6.5, IPP/AES-NI

AES-128-CFB
  SPARC M8, 5.06 GHz, 2 chips: 311,220; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC M7, 4.13 GHz, 2 chips: 166,324; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC T5, 3.60 GHz, 2 chips: 68,691; Oracle Solaris 11.2, libsoftcrypto + libumem
  Intel E5-2699 v4, 2.20 GHz, 2 chips: 54,179; Oracle Linux 7.2, IPP/AES-NI
  Intel E5-2699 v3, 2.30 GHz, 2 chips: 44,388; Oracle Linux 6.5, IPP/AES-NI
  Intel E5-2697 v2, 2.70 GHz, 2 chips: 27,755; Oracle Linux 6.5, IPP/AES-NI

Encryption Performance – AES-CBC

Performance is presented for in-cache AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).

AES-CBC Microbenchmark Performance (MB/sec)

AES-256-CBC
  SPARC M8, 5.06 GHz, 2 chips: 241,720; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC M7, 4.13 GHz, 2 chips: 134,278; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC T5, 3.60 GHz, 2 chips: 56,788; Oracle Solaris 11.2, libsoftcrypto + libumem
  Intel E5-2699 v4, 2.20 GHz, 2 chips: 38,943; Oracle Linux 7.2, IPP/AES-NI
  Intel E5-2699 v3, 2.30 GHz, 2 chips: 31,894; Oracle Linux 6.5, IPP/AES-NI
  Intel E5-2697 v2, 2.70 GHz, 2 chips: 19,961; Oracle Linux 6.5, IPP/AES-NI

AES-192-CBC
  SPARC M8, 5.06 GHz, 2 chips: 266,149; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC M7, 4.13 GHz, 2 chips: 152,961; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC T5, 3.60 GHz, 2 chips: 63,937; Oracle Solaris 11.2, libsoftcrypto + libumem
  Intel E5-2699 v4, 2.20 GHz, 2 chips: 45,285; Oracle Linux 7.2, IPP/AES-NI
  Intel E5-2699 v3, 2.30 GHz, 2 chips: 37,021; Oracle Linux 6.5, IPP/AES-NI
  Intel E5-2697 v2, 2.70 GHz, 2 chips: 23,224; Oracle Linux 6.5, IPP/AES-NI

AES-128-CBC
  SPARC M8, 5.06 GHz, 2 chips: 304,136; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC M7, 4.13 GHz, 2 chips: 175,151; Oracle Solaris 11.3, libsoftcrypto + libumem
  SPARC T5, 3.60 GHz, 2 chips: 72,870; Oracle Solaris 11.2, libsoftcrypto + libumem
  Intel E5-2699 v4, 2.20 GHz, 2 chips: 54,076; Oracle Linux 7.2, IPP/AES-NI
  Intel E5-2699 v3, 2.30 GHz, 2 chips: 44,103; Oracle Linux 6.5, IPP/AES-NI
  Intel E5-2697 v2, 2.70 GHz, 2 chips: 27,730; Oracle Linux 6.5, IPP/AES-NI

Configuration Summary

SPARC T8-2 server
  2 x SPARC M8 processors, 5.0 GHz (64 total cores)
  1 TB memory
  Oracle Solaris 11.3

SPARC T7-2 server
  2 x SPARC M7 processors, 4.13 GHz (64 total cores)
  1 TB memory
  Oracle Solaris 11.3

SPARC T5-2 server
  2 x SPARC T5 processors, 3.60 GHz (32 total cores)
  512 GB memory
  Oracle Solaris 11.2

Oracle Server X7-2L system
  2 x Intel Xeon Processor Platinum 8168, 2.70 GHz (48 total cores)
  768 GB memory
  Oracle Linux 7.3
  Intel Integrated Performance Primitives for Linux, Version 2017 (Update 2) 23 Feb 2017

Oracle Server X6-2L system
  2 x Intel Xeon Processor E5-2699 v4, 2.20 GHz (44 total cores)
  256 GB memory
  Oracle Linux 7.2
  Intel Integrated Performance Primitives for Linux, Version 9.0 (Update 2) 17 Feb 2016

Oracle Server X5-2 system
  2 x Intel Xeon Processor E5-2699 v3, 2.30 GHz (36 total cores)
  256 GB memory
  Oracle Linux 6.5
  Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014

Sun Server X4-2 system
  2 x Intel Xeon Processor E5-2697 v2, 2.70 GHz (24 total cores)
  256 GB memory
  Oracle Linux 6.5
  Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014

Benchmark Description

The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache and on-chip, using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC and AES-256-CBC.

The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance. The encryption tests were run with pseudo-random data of size 32 KB. The benchmark tests were designed to run out of cache, so memory bandwidth and latency are not the limitations.

See Also
  More about AES
  SPARC T8-2 Server: oracle.com, OTN, Blog
  SPARC T7-2 Server: oracle.com, OTN, Blog
  SPARC T5-2 Server: oracle.com, OTN
  Oracle Server X6-2L: oracle.com, OTN, Blog
  Oracle Server X5-2: oracle.com, OTN, Blog
  Oracle Solaris: oracle.com, OTN, Blog

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/19/2017. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
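The figures above come from Oracle-internal microbenchmarks using each platform's optimized libraries. As a rough, hedged way to explore similar in-cache AES and SHA behavior on your own systems with a generic tool (this is not the benchmark used for these results, and flag availability varies by OpenSSL version):

    # Multi-process, in-cache AES throughput using the EVP (accelerated) code path.
    openssl speed -evp aes-256-cfb -multi 64
    openssl speed -evp aes-256-cbc -multi 64

    # SHA-2 digest throughput (relevant to the following SHA post).
    openssl speed -evp sha512 -multi 64

    # Some OpenSSL builds accept -bytes to test a specific buffer size, e.g. 32 KB:
    # openssl speed -evp aes-256-cfb -bytes 32768 -multi 64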



SHA Digest Encryption: SPARC M8 Performance, Beats x86 Per Core Under Load

Oracle's cryptography benchmark measures security performance on important Secure Hash Algorithm (SHA) functions. Oracle's SPARC M8 processor with its software-in-silicon security is faster than recent x86 servers. In this test, the performance of on-processor digest operations is measured for three sizes of plaintext inputs (64, 1024 and 8192 bytes) using three SHA-3 digests (SHA3-512, SHA3-384 and SHA3-256), three SHA-2 digests (SHA512, SHA384, SHA256) and the older, weaker SHA-1 digest. Multiple parallel threads are used to measure each processor's maximum throughput. Oracle's SPARC T8-2 server shows dramatically faster digest computation compared to recent two-processor x86 servers.

SHA-3 digests are faster to compute on the SPARC M8 processor than their equivalent SHA-2 digests. SPARC M8 processors running Oracle Solaris 11.3, computing multiple parallel digests of 8 KB inputs (in cache), ran:
  - 1.2 times faster for SHA3-512 than the same processors running SHA512 (SHA-2)
  - 1.7 times faster for SHA3-384 than the same processors running SHA384 (SHA-2)
  - 3.4 times faster for SHA3-256 than the same processors running SHA256 (SHA-2)

SHA-2 digests are faster to compute on the SPARC M8 processor than the same digests on the Intel Xeon Processor Platinum 8168. SPARC M8 processors running Oracle Solaris 11.3 ran 9.6 times faster computing multiple parallel SHA512 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on the Intel Xeon Processor Platinum 8168 running Oracle Linux 7.3.

SHA-2 and SHA-1 digests are faster to compute on the SPARC M8 processor than the same digests on the Intel Xeon Processor E5-2699 v4. SPARC M8 processors running Oracle Solaris 11.3, computing multiple parallel digests of 8 KB inputs (in cache), ran the following factors faster than Cryptography for Intel Integrated Performance Primitives for Linux (library) on the Intel Xeon Processor E5-2699 v4 running Oracle Linux 7.2:
  - 12 times faster for SHA512
  - 11 times faster for SHA256
  - 4.2 times faster for SHA1

Likewise, compared to Cryptography for Intel Integrated Performance Primitives for Linux (library) on the Intel Xeon Processor E5-2699 v3 running Oracle Linux 6.5, SPARC M8 processors running Oracle Solaris 11.3 ran:
  - 20 times faster for SHA512
  - 16 times faster for SHA256
  - 5.5 times faster for SHA1

SHA-1, SHA-2 and SHA-3 operations are an integral part of Oracle Solaris, while on Linux SHA-1 and SHA-2 are performed using the add-on Cryptography for Intel Integrated Performance Primitives for Linux (library). SHA-3 has not been supported in recent releases of the Intel Integrated Performance Primitives for Linux library.

The SPARC M8 processor has improved cryptographic support. A second 3-cycle cryptographic unit has been added, which allows some ciphers like AES to double in performance. Long-latency operations like SHA are not able to use the additional unit and see a performance boost only from the improved clock speed. Oracle has also measured AES (CFB, CBC) cryptographic performance on the SPARC M8 processor.

Performance Landscape

Presented below are results for computing SHA1, SHA256, SHA384, SHA512 and the SHA-3 digests for input plaintext sizes of 64, 1024 and 8192 bytes. Results are presented as MB/sec (10**6). All SPARC M8 processor results were run as part of this benchmark effort. All other results were run during previous benchmark efforts.

Digest Performance – SHA-3 (SHA3-512, SHA3-384, SHA3-256)

Each digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). All results below are for 2 x SPARC M8, 5.0 GHz, in MB/sec.

  Digest      64 B input   1024 B input   8192 B input
  SHA3-512    59,544       229,476        241,743
  SHA3-384    48,135       344,078        348,655
  SHA3-256    41,211       404,355        451,565

Digest Performance – SHA512

The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Performance in MB/sec.

  Processors                                        64 B input   1024 B input   8192 B input
  2 x SPARC M8, 5.0 GHz                             61,392       188,436        208,686
  2 x SPARC M7, 4.13 GHz                            32,331       167,461        185,582
  2 x SPARC T5, 3.6 GHz                             18,717       73,810         78,997
  2 x Intel Xeon Processor Platinum 8168, 2.7 GHz   9,221        19,130         21,765
  2 x Intel Xeon Processor E5-2699 v4, 2.2 GHz      6,973        15,412         17,616
  2 x Intel Xeon Processor E5-2699 v3, 2.3 GHz      3,949        9,214          10,681
  2 x Intel Xeon Processor E5-2697 v2, 2.7 GHz      2,681        6,631          7,701

Digest Performance – SHA384

The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Performance in MB/sec.

  Processors                                        64 B input   1024 B input   8192 B input
  2 x SPARC M8, 5.0 GHz                             63,097       188,410        208,624
  2 x SPARC M7, 4.13 GHz                            33,233       167,380        185,605
  2 x SPARC T5, 3.6 GHz                             18,814       73,770         78,997
  2 x Intel Xeon Processor E5-2699 v4, 2.2 GHz      6,909        15,353         17,618
  2 x Intel Xeon Processor E5-2699 v3, 2.3 GHz      4,061        9,263          10,678
  2 x Intel Xeon Processor E5-2697 v2, 2.7 GHz      2,774        6,669          7,706

Digest Performance – SHA256

The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Performance in MB/sec.
Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M8, 5.0 GHz 61,044 127,544 134,452 2 x SPARC M7, 4.13 GHz 37,368 114,025 120,302 2 x SPARC T5, 3.6 GHz 21,140 49,483 51,114 2 x Intel Xeon Processor E5-2699 v4, 2.2 GHz 5,103 11,174 12,037 2 x Intel Xeon Processor E5-2699 v3, 2.3 GHz 3,446 7,785 8,463 2 x Intel Xeon Processor E5-2697 v2, 2.7 GHz 2,404 5,570 6,037 Digest Performance – SHA1 Performance is presented for SHA1 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M8, 5.0 GHz 56,116 105,750 111,450 2 x SPARC M7, 4.13 GHz 49,247 92,882 97,971 2 x SPARC T5, 3.6 GHz 21,052 40,107 41,584 2 x Intel Xeon Processor E5-2699 v4, 2.2 GHz 8,566 23,901 26,752 2 x Intel Xeon Processor E5-2699 v3, 2.3 GHz 6,677 18,165 20,405 2 x Intel Xeon Processor E5-2697 v2, 2.7 GHz 4,649 13,245 14,842   Configuration Summary SPARC T8-2 server 2 x SPARC M8 processors, 5.0 GHz (64 total cores) 1 TB memory Oracle Solaris 11.3   SPARC T7-2 server 2 x SPARC M7 processors, 4.13 GHz (64 total cores) 1 TB memory Oracle Solaris 11.3   SPARC T5-2 server 2 x SPARC T5 processors, 3.60 GHz (32 total cores) 512 GB memory Oracle Solaris 11.2   Oracle Server X7-2L system 2 x Intel Xeon Processor Platinum 8168, 2.70 GHz (48 total cores) 768 GB memory Oracle Linux 7.3 Intel Integrated Performance Primitives for Linux, Version 2017 (Update 2) 23 Feb 2017   Oracle Server X6-2L system 2 x Intel Xeon Processor E5-2699 v4, 2.20 GHz (44 total cores) 256 GB memory Oracle Linux 7.2 Intel Integrated Performance Primitives for Linux, Version 9.0 (Update 2) 17 Feb 2016   Oracle Server X5-2 system 2 x Intel Xeon Processor E5-2699 v3, 2.30 GHz (36 total cores) 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014   Sun Server X4-2 system 2 x Intel Xeon Processor E5-2697 v2, 2.70 GHz (24 total cores) 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014   Benchmark Description The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache and on-chip using various digests, including SHA-1 (SHA1), SHA-2 (SHA256, SHA384, SHA512) and SHA-3 (SHA3-256, SHA3-384, SHA3-512). The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various digests. They were run using optimized libraries for each platform to obtain the best possible performance. The encryption tests were run with pseudo-random data of sizes 64 bytes, 1024 bytes and 8192 bytes. The benchmark tests were designed to run out of cache, so memory bandwidth and latency are not the limitations. See Also   More about Secure Hash Algorithm (SHA) SPARC T8-2 Server oracle.com     OTN     Blog SPARC T7-2 Server oracle.com     OTN     Blog SPARC T5-2 Server oracle.com     OTN Oracle Server X6-2L oracle.com     OTN     Blog Oracle Server X5-2 oracle.com     OTN     Blog Oracle Solaris oracle.com     OTN     Blog Disclosure Statement Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/19/2017. The previous information is intended to outline our general product direction. 
It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
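For readers who want to see the general shape of such a throughput measurement, below is a minimal, illustrative Java sketch. It is not Oracle's internal test harness and it uses only the JDK's MessageDigest provider rather than the platform-optimized libraries compared above; the thread count, buffer size, and measurement window are arbitrary assumptions.

    import java.security.MessageDigest;
    import java.util.concurrent.atomic.LongAdder;

    public class DigestThroughput {
        public static void main(String[] args) throws Exception {
            final int threads = Runtime.getRuntime().availableProcessors();
            final int bufSize = 8192;          // 8 KB input, as in the benchmark
            final long runMillis = 10_000;     // measurement window (arbitrary)
            final LongAdder bytesDigested = new LongAdder();

            Thread[] workers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                workers[t] = new Thread(() -> {
                    try {
                        MessageDigest md = MessageDigest.getInstance("SHA-256");
                        byte[] input = new byte[bufSize];   // pseudo-random data in the real test
                        long end = System.currentTimeMillis() + runMillis;
                        while (System.currentTimeMillis() < end) {
                            md.update(input);
                            md.digest();                    // one complete digest per iteration
                            bytesDigested.add(bufSize);
                        }
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                });
                workers[t].start();
            }
            for (Thread w : workers) w.join();

            double mbPerSec = bytesDigested.sum() / 1.0e6 / (runMillis / 1000.0);
            System.out.printf("Aggregate SHA-256 throughput: %.0f MB/sec%n", mbPerSec);
        }
    }

Running many such worker threads in parallel, one per hardware thread, is what drives each processor to its maximum digest throughput in the results above.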


Benchmark

Memory and Bisection Bandwidth: SPARC M8 Performance

The STREAM benchmark measures delivered memory bandwidth on a variety of memory intensive tasks. Delivered memory bandwidth is key to a server delivering high performance on a wide variety of workloads. The STREAM benchmark is typically run where each chip in the system gets its memory requests satisfied from local memory. This report presents performance of Oracle's SPARC M8 processor based servers and compares their performance to x86 and IBM POWER8 servers. Bisection bandwidth on a server is a measure of the cross-chip data bandwidth between the processors of a system where no memory access is local to the processor. Systems with large cross-chip penalties show dramatically lower bisection bandwidth. Real-world ad hoc workloads tend to perform better on systems with better bisection bandwidth because their memory usage characteristics tend to be chaotic. Systems with slow bisection bandwidth can be very non-uniform in performance. The STREAM benchmark is easy to run and anyone can measure memory bandwidth on a target system (see Key Points and Best Practices section). The eight-chip SPARC M8-8 server delivers over 1.2 TB/sec on the STREAM benchmark. This is 2.9 times the triad bandwidth of an eight-chip x86 E7 v3 server. The SPARC M8-8 server delivered over 10.8 times the triad bisection bandwidth of an eight-chip x86 E7 v3 server. The SPARC T8-4 delivered 2.5 times the STREAM triad bandwidth of a four-chip x86 E7 v3 server and 2.0 times the triad bandwidth of a four-chip IBM Power System S824 server. The SPARC T8-4 server delivered over 3.3 times the triad bisection bandwidth of a four-chip x86 E7 v3 server and 2.8 times the triad bisection bandwidth of a four-chip IBM Power System S824 server. The SPARC T8-2 delivered 1.8 times the STREAM triad bandwidth of a two-chip x86 Intel Xeon Platinum 8168 Processor based server and 2.8 times the STREAM triad bandwidth of a two-chip x86 E5 v4 server. The SPARC T8-2 server delivered over 1.7 times the triad bisection bandwidth of a two-chip x86 Intel Xeon Platinum 8168 Processor based server and over 3.0 times the triad bisection bandwidth of a two-chip x86 E5 v4 server. Performance Landscape The following SPARC, x86, and IBM S824 STREAM results were run as part of this benchmark effort. The IBM S822L result is from the referenced web location. Maximum STREAM Benchmark Performance System Chips Bandwidth (MB/sec - 10^6) Copy Scale Add Triad SPARC M8-8 8 1,167,333 1,163,380 1,270,576 1,268,974 SPARC M7-8 8 995,402 995,727 1,092,742 1,086,305 x86 E7 v3 8 346,771 354,679 445,550 442,184   SPARC T8-4 4 571,945 572,456 641,787 639,944 SPARC T7-4 4 512,080 510,387 556,184 555,374 IBM S824 4 251,533 253,216 322,399 319,561 IBM S822L 4 252,743 247,314 295,556 305,955 x86 E7 v3 4 230,027 232,092 248,761 251,161   SPARC T8-2 2 284,006 293,678 359,654 359,680 SPARC T7-2 2 259,198 259,380 285,835 285,905 x86 8168 2 187,159 195,217 203,593 203,360 x86 E5-2699 v4 2 120,939 121,417 129,775 130,242 x86 E5 v3 2 105,622 105,808 113,116 112,521   SPARC T8-1 1 147,407 147,281 184,526 184,415 SPARC T7-1 1 131,323 131,308 144,956 144,706   All of the following bisection bandwidth results were run as part of this benchmark effort. 
Bisection Bandwidth Benchmark Performance (Nonlocal STREAM) System Chips Bandwidth (MB/sec - 10^6) Copy Scale Add Triad SPARC M8-8 8 404,170 401,316 473,194 473,647 SPARC M7-8 8 383,479 381,219 375,371 375,851 SPARC T5-8 8 172,195 172,354 250,620 250,858 x86 E7 v3 8 42,636 42,839 43,753 43,744   SPARC T8-4 4 170,147 170,113 171,388 171,273 SPARC T7-4 4 142,549 142,548 142,645 142,729 SPARC T5-4 4 75,926 75,947 76,975 77,061 IBM S824 4 53,940 54,107 60,746 60,939 x86 E7 v3 4 41,636 47,740 51,206 51,333   SPARC T8-2 2 150,048 150,017 152,139 152,109 SPARC T7-2 2 127,372 127,097 129,833 129,592 SPARC T5-2 2 91,530 91,597 91,761 91,984 x86 8168 2 83,192 84,674 89,001 89,035 x86 E5-2699 v4 2 50,153 50,366 50,266 50,265 x86 E5-2699 v3 2 45,211 45,331 47,414 47,251   Configuration Summary SPARC Configurations: SPARC M8-8 8 x SPARC M8 processors (5.06 GHz) 4 TB memory (128 x 32 GB dimms)   SPARC T8-4 4 x SPARC M8 processors (5.06 GHz) 2 TB memory (64 x 32 GB dimms)   SPARC T8-2 2 x SPARC M8 processors (5.06 GHz) 1 TB memory (32 x 32 GB dimms)   SPARC T8-1 1 x SPARC M8 processor (5.06 GHz) 512 GB memory (16 x 32 GB dimms) Oracle Solaris 11.3 Oracle Developer Studio 12.6   x86 Configurations: Oracle Server X5-8 8 x Intel Xeon Processor E7-8995 v3 2 TB memory (128 x 16 GB dimms) Oracle Linux 7.1 Intel Parallel Studio XE Composer Version 2016 compilers   Oracle Server X5-4 4 x Intel Xeon Processor E7-8995 v3 1 TB memory (64 x 16 GB dimms) Oracle Linux 7.1 Intel Parallel Studio XE Composer Version 2016 compilers   Oracle Server X7-2L 2 x Intel Xeon Platinum 8168 Processor 768 GB memory (24 x 32 GB dimms) Oracle Linux 7.3 Intel Parallel Studio XE Composer Version 2017 compilers   Oracle Server X6-2 2 x Intel Xeon Processor E5-2699 v4 256 GB memory (16 x 16 GB dimms) Oracle Linux 7.2 Intel Parallel Studio XE Composer Version 2016 compilers   Oracle Server X5-2 2 x Intel Xeon Processor E5-2699 v3 256 GB memory (16 x 16 GB dimms) Oracle Linux 7.1 Intel Parallel Studio XE Composer Version 2016 compilers   Benchmark Description STREAM The STREAM benchmark measures sustainable memory bandwidth (in MB/s) for simple vector compute kernels. All memory accesses are sequential, so a picture of how fast regular data may be moved through the system is portrayed. Properly run, the benchmark displays the characteristics of the memory system of the machine and not the advantages of running from the system's memory caches. STREAM counts the bytes read plus the bytes written to memory. For the simple Copy kernel, this is exactly twice the number obtained from the bcopy convention. STREAM does this because three of the four kernels (Scale, Add and Triad) do arithmetic, so it makes sense to count both the data read into the CPU and the data written back from the CPU. The Copy kernel does no arithmetic, but, for consistency, counts bytes the same way as the other three. The sequential nature of the memory references is the benchmark's biggest weakness. The benchmark does not expose limitations in a system's interconnect to move data from anywhere in the system to anywhere. Bisection Bandwidth – Easy Modification of STREAM Benchmark To test for bisection bandwidth, processes are bound to processors in sequential order. The memory is allocated in reverse order, so that the memory is placed non-local to the process. The benchmark is then run. If the system is capable of page migration, this feature must be turned off. 
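Before the platform-specific tuning notes below, here is a minimal Java sketch of what the Triad kernel computes and how STREAM's byte-counting convention turns elapsed time into MB/sec. The official benchmark is the C (or Fortran) STREAM code compiled as described below; this sketch is only an illustration, and the array length and use of Java parallel streams are assumptions, not part of the benchmark.

    import java.util.stream.IntStream;

    public class TriadSketch {
        public static void main(String[] args) {
            final int n = 50_000_000;       // array length (assumption; STREAM requires arrays much larger than cache)
            final double scalar = 3.0;      // three 50M-element double arrays need ~1.2 GB of heap (e.g. -Xmx2g)
            double[] a = new double[n], b = new double[n], c = new double[n];
            for (int j = 0; j < n; j++) { b[j] = 2.0; c[j] = 1.0; }

            long start = System.nanoTime();
            // Triad kernel: a[j] = b[j] + scalar * c[j]
            IntStream.range(0, n).parallel().forEach(j -> a[j] = b[j] + scalar * c[j]);
            double seconds = (System.nanoTime() - start) / 1e9;

            // STREAM counts bytes read plus bytes written: b and c are read, a is written,
            // so Triad moves 3 arrays x 8 bytes per element.
            double mbPerSec = 3.0 * 8.0 * n / 1e6 / seconds;
            System.out.printf("Triad: %.0f MB/sec (single unwarmed run, illustration only)%n", mbPerSec);
        }
    }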
Key Points and Best Practices

The STREAM benchmark code was compiled for the SPARC M8 processor based systems with the following flags (using cc):

-fast -m64 -W2,-Avector:aggressive -xautopar -xreduction -xpagesize=4m

The benchmark code was compiled for the x86 based systems with the following flags (Intel icc compiler):

-O3 -m64 -xCORE-AVX2 -ipo -openmp -mcmodel=medium -fno-alias -nolib-inline

On Oracle Solaris, binding is accomplished by setting either the environment variable SUNW_MP_PROCBIND or the OpenMP variables OMP_PROC_BIND and OMP_PLACES:

export OMP_NUM_THREADS=512
export SUNW_MP_PROCBIND=0-511

On Oracle Linux systems using the Intel compiler, binding is accomplished by setting the environment variable KMP_AFFINITY:

export OMP_NUM_THREADS=72
export KMP_AFFINITY='verbose,granularity=fine,proclist=[0-71],explicit'

The source code change in the file stream.c to do the reverse allocation is:

< for (j=STREAM_ARRAY_SIZE-1; j>=0; j--) { a[j] = 1.0; b[j] = 2.0; c[j] = 0.0; }
---
> for (j=0; j<STREAM_ARRAY_SIZE; j++) { a[j] = 1.0; b[j] = 2.0; c[j] = 0.0; }

See Also
STREAM Benchmark Website
IBM S822L Results
SPARC M8-8 Server oracle.com     OTN     Blog
SPARC T8-4 Server oracle.com     OTN     Blog
SPARC T8-2 Server oracle.com     OTN     Blog
SPARC T8-1 Server oracle.com     OTN     Blog
Oracle Solaris oracle.com     OTN     Blog
Oracle Developer Studio oracle.com     OTN

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 09/19/2017.


Benchmark

Oracle Spatial and Graph: SPARC M8 beats Intel E5-2699 v4

Using the Oracle Spatial and Graph option of Oracle Database, one of Oracle's SPARC M8 processors outperformed an Oracle Server X6-2 system with two Intel Xeon Processor E5-2699 v4 chips. The SPARC T8-2 server was configured with one SPARC M8 processor, using Oracle VM Server for SPARC to create a one-chip virtual machine (VM). The SPARC M8 processor showed a 1.8x advantage in core*seconds running sixty-four queries simultaneously with a modest parallelism of 4 per query, compared to the server with two Intel Xeon Processor E5-2699 v4 chips. The 32-core SPARC M8 processor completed the same number of queries as two Intel Xeon Processor E5-2699 v4 (Broadwell) chips, a total of 44 cores, in 22% less time.

Performance Landscape

Oracle Spatial and Graph
Server | Number of Cores | Number of Simultaneous Queries | Wallclock Time (seconds)
SPARC T8-2, 1 x SPARC M8 (5.0 GHz, 1 x 32 cores) | 32 | 64 | 7.00
x86 E5 v4 server, 2 x Intel E5-2699 v4 (2.2 GHz, 2 x 22 cores) | 44 | 64 | 8.98
SPARC core*seconds advantage: 1.8x

Configuration Summary

Systems Under Test:

SPARC T8-2 server with
1 x SPARC M8 processor (5.0 GHz)
512 GB memory
Oracle Solaris 11.3 (11.3.21.5.0)
Oracle Database 12c Enterprise Edition Release (12.2.0.1.0)

Oracle Server X6-2 with
2 x Intel Xeon Processor E5-2699 v4 (2.2 GHz)
512 GB memory
Oracle Linux 7.3 (3.8.13-118.19.3.el7uek.x86_64)
Oracle Database 12c Enterprise Edition Release (12.2.0.1.0)

Benchmark Description

This is a test of Oracle Spatial and Graph, part of the Oracle Database core. Oracle Spatial and Graph supports applications that need geographic data or locations. The building block of the test is a spatial query which solves the following problem: given a set of neighborhood regions, find all the zoning regions they intersect with. The test harness can issue a sequential set of spatial queries concurrently for a given number of users. In this test, 64 users were used, each issuing 1 query simultaneously. The spatial query can be issued with a selected degree of parallelism to take advantage of the Oracle Parallel Server to potentially increase performance. The sizes of the tables for this test are:

neighborhood regions: 40 MB, 17K rows
zoning regions: 72 MB, 70K rows

and the total allocated space is 6.3 GB.

See Also
SPARC T8-2 Server oracle.com     OTN     Blog
Oracle Server X6-2 oracle.com     OTN     Blog
Oracle Spatial and Graph oracle.com     OTN     Blog
Oracle Database oracle.com     OTN     Blog
Oracle Solaris oracle.com     OTN     Blog
Oracle Linux oracle.com     OTN     Blog

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of September 19, 2017. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
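To illustrate the kind of spatial query that forms the building block of this test, here is a small Java/JDBC sketch. The connection string, schema, and the table and column names (NEIGHBORHOODS, ZONING, GEOM, ID) are hypothetical; SDO_ANYINTERACT is the Oracle Spatial operator for testing whether two geometries interact, and the PARALLEL hint requests the per-query parallelism of 4 described above. The Oracle JDBC driver is assumed to be on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class SpatialIntersectQuery {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details, not the benchmark configuration.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@//dbhost:1521/orclpdb", "scott", "tiger")) {

                // For each neighborhood region, find all zoning regions it intersects.
                // SDO_ANYINTERACT returns 'TRUE' when the two geometries interact spatially.
                String sql =
                    "SELECT /*+ PARALLEL(4) */ n.id, z.id " +
                    "FROM neighborhoods n, zoning z " +
                    "WHERE SDO_ANYINTERACT(z.geom, n.geom) = 'TRUE'";

                try (PreparedStatement ps = conn.prepareStatement(sql);
                     ResultSet rs = ps.executeQuery()) {
                    int rows = 0;
                    while (rs.next()) rows++;
                    System.out.println("Intersecting pairs: " + rows);
                }
            }
        }
    }

In the benchmark, 64 users each issue such a query at the same time, so the system runs 64 concurrent spatial joins, each with a degree of parallelism of 4.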


Benchmark

SPEC CPU2017: 1-Chip VM with SPARC M7 Produces Best Per-Chip Throughput

Oracle's SPARC M7 processor delivers best throughput per chip as measured by the newly-announced benchmark suite SPEC CPU2017. The SPARC M7 processor achieved world record per-chip scores: 114 SPECrate2017_int_base, 123 SPECrate2017_int_peak, 111 SPECrate2017_fp_base, 118 SPECrate2017_fp_peak. The SPARC M7 processor beat the Fujitsu PRIMERGY RX2560 M2 with Intel Xeon Processor E5-2699A v4 per chip by: 1.4 times on SPECrate2017_int_base 1.5 times on SPECrate2017_fp_base The SPARC M7 processor beat the Dell PowerEdge R930 with Intel Xeon Processor E7-8890 v4 per chip by: 1.4 times on SPECrate2017_int_base 1.4 times on SPECrate2017_fp_base The SPARC M7 processor beat the HPE Integrity Superdome X with Intel Xeon Processor E7-8890 v4 per chip by: 1.5 times on SPECrate2017_int_base 1.6 times on SPECrate2017_int_peak 1.5 times on SPECrate2017_fp_base 1.6 times on SPECrate2017_fp_peak The SPEC CPU2017 benchmarks are derived from the compute-intensive portions of real applications, stressing chip, memory hierarchy, and compilers. The benchmarks are not intended to stress other computer components such as networking, the operating system, or the I/O system. Note that there are many other SPEC benchmarks, including benchmarks that specifically focus on Java computing, enterprise computing, and network file systems. Performance Landscape Presented are SPEC CPU2017 rate results ordered by best per chip performance. Complete benchmark results are at the SPEC website. SPEC CPU2017 Rate Results System SPECrate2017 Integer SPARC Advantage Base/Chip Peak/Chip Base Peak Base Peak 1-Chip VM with SPARC M7 1 x SPARC M7 (4.13 GHz, 32 cores) 114 123 114 123 - - Fujitsu PRIMERGY RX2560 M2 2 x Intel E5-2699A v4 (2.4 GHz, 22 cores) 82.0 - 164 - 1.4 - Dell PowerEdge R930 4 x Intel E7-8890 v4 (2.2 GHz, 24 cores) 79.5 - 318 - 1.4 - HPE Integrity Superdome X 16 x Intel E7-8890 v4 (2.2 GHz, 24 cores) 73.8 76.3 1180 1220 1.5 1.6 Inspur NF5280M4 2 x Intel E5-2898 v4 (2.2 GHz, 20 cores) 71.5 74.0 143 148 1.6 1.7 H3C R4900 G2 2 x Intel E5-2620 v4 (2.1 GHz, 8 cores) 26.8 29.7 53.5 59.4 4.3 4.1 Intel ASUS Z170M-PLUS 1 x Intel Core i7-6700K (4.0 GHz, 4 cores) 23.5 - 23.5 - 4.9 - System SPECrate2017 Floating Point SPARC Advantage Base/Chip Peak/Chip Base Peak Base Peak 1-Chip VM with SPARC M7 1 x SPARC M7 (4.13 GHz, 32 cores) 111 118 111 118 - - Dell PowerEdge R930 4 x Intel E7-8890 v4 (2.2 GHz, 24 cores) 79.0 - 316 - 1.4 - Fujitsu PRIMERGY RX2560 M2 2 x Intel E5-2699A v4 (2.4 GHz, 22 cores) 74.5 - 149 - 1.5 - HPE ProLiant DL360 Gen9 Intel E5-2699A v4 (2.4 GHz, 22 cores) 74.5 74.5 149 149 1.5 1.6 HPE ProLiant DL380 Gen9 2 x Intel E5-2699A v4 (2.4 GHz, 22 cores) 73.5 74.0 147 148 1.5 1.6 HPE Integrity Superdome X 16 x Intel E7-8890 v4 (2.2 GHz, 24 cores) 72.5 73.1 1160 1170 1.5 1.6 Inspur NF5280M4 2 x Intel E5-2898 v4 (2.2 GHz, 20 cores) 66.5 66.5 133 133 1.7 1.8 H3C R4900 G2 2 x Intel E5-2620 v4 (2.1 GHz, 8 cores) 38.0 38.9 76.0 77.7 2.9 3.0 Intel ASUS Z170M-PLUS 1 x Intel Core i7-6700K (4.0 GHz, 4 cores) 24.6 - 24.6 - 4.5 -   Configuration Summary System Under Test: SPARC M7-16, One Processor Logical Domain with 1 x SPARC M7 processor (4.13 GHz, 32 cores, 256 hardware threads) 512 GB memory (16 x 32 GB dimms) Oracle Solaris 11.3 Oracle Developer Studio 12.5   The Logical Domain is managed by Oracle VM Server for SPARC v3.3, which is included with Oracle Solaris 11.3. The domain uses one SPARC M7 chip within a SPARC M7-16 server. Benchmark Description SPEC CPU2017 is SPEC's latest update to the CPU series of benchmarks. 
The focus of CPU2017 is on compute intensive performance, and the benchmarks emphasize the performance of the processor, memory hierarchy and compilers. SPEC CPU2017 contains four suites:

SPECspeed 2017 Integer – 10 integer benchmarks
SPECspeed 2017 Floating Point – 10 floating point benchmarks
SPECrate 2017 Integer – 10 integer benchmarks
SPECrate 2017 Floating Point – 13 floating point benchmarks

Each of the suites generates two metrics, base and peak, which reflect the amount of optimization allowed. The overall metrics for the benchmark suites which are commonly used are:

Suite | Metrics
SPECspeed2017 Integer | SPECspeed2017_int_base, SPECspeed2017_int_peak
SPECspeed2017 Floating Point | SPECspeed2017_fp_base, SPECspeed2017_fp_peak
SPECrate2017 Integer | SPECrate2017_int_base, SPECrate2017_int_peak
SPECrate2017 Floating Point | SPECrate2017_fp_base, SPECrate2017_fp_peak

See Also
SPEC website
SPARC M7-16 Server oracle.com     OTN     Blog
Oracle Solaris oracle.com     OTN     Blog
Oracle Developer Studio oracle.com     OTN

Disclosure Statement
SPEC and SPECrate are registered trademarks of the Standard Performance Evaluation Corporation. Results as of June 20, 2017 from www.spec.org. SPARC M7 LDom: 114 SPECrate2017_int_base, 123 SPECrate2017_int_peak, 111 SPECrate2017_fp_base, 118 SPECrate2017_fp_peak; Fujitsu PRIMERGY RX2560 M2: 164 SPECrate2017_int_base, 149 SPECrate2017_fp_base; HPE Integrity Superdome X: 1180 SPECrate2017_int_base, 1220 SPECrate2017_int_peak, 1160 SPECrate2017_fp_base, 1170 SPECrate2017_fp_peak; HPE ProLiant DL360 Gen9: 149 SPECrate2017_fp_base, 149 SPECrate2017_fp_peak; HPE ProLiant DL380 Gen9: 147 SPECrate2017_fp_base, 148 SPECrate2017_fp_peak; Inspur NF5280M4: 143 SPECrate2017_int_base, 148 SPECrate2017_int_peak, 133 SPECrate2017_fp_base, 133 SPECrate2017_fp_peak; Dell PowerEdge R930: 318 SPECrate2017_int_base, 316 SPECrate2017_fp_base; H3C R4900 G2: 53.5 SPECrate2017_int_base, 59.4 SPECrate2017_int_peak, 76.0 SPECrate2017_fp_base, 77.7 SPECrate2017_fp_peak; Intel ASUS Z170M-PLUS: 23.5 SPECrate2017_int_base, 24.6 SPECrate2017_fp_base.
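The per-chip columns and the "SPARC Advantage" ratios in the tables above are straightforward arithmetic on the published whole-system scores; the small sketch below makes that explicit using two of the published SPECrate2017_int_base numbers (the values come directly from the table, the class itself is just an illustration).

    public class PerChipMetric {
        public static void main(String[] args) {
            // Published SPECrate2017_int_base scores from the table above.
            double sparcM7Base = 114.0;    int sparcChips = 1;       // 1-chip VM with SPARC M7
            double superdomeBase = 1180.0; int superdomeChips = 16;  // HPE Integrity Superdome X

            double sparcPerChip = sparcM7Base / sparcChips;              // 114
            double superdomePerChip = superdomeBase / superdomeChips;    // 73.75, shown as 73.8

            System.out.printf("Per-chip base: SPARC M7 %.1f, Superdome X %.1f%n",
                    sparcPerChip, superdomePerChip);
            System.out.printf("SPARC advantage: %.1fx%n", sparcPerChip / superdomePerChip); // ~1.5x
        }
    }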


Java Streams 3x to 22x Faster with Hardware Acceleration on Oracle's SPARC Servers, a Step-by-Step Guide

Thanks to Karthik Ganesan for providing the majority of this content.

The Java streams abstraction in JDK 8 provides an easy way to write and efficiently execute aggregate operations on large datasets. A key driver for this work is making parallelism more accessible to developers. In addition, Java Stream programs are often more concise, understandable and maintainable. Oracle's SPARC M7 and S7 processors can seamlessly make Java Streams programs 3x to 22x faster than x86. This is accomplished using the streamoffload Oracle Solaris package. See the blog entry "Accelerating Java Streams Performance Using SPARC with DAX" for more information. The SPARC M7 and S7 processors are designed with "Software in Silicon" functions that can accelerate the execution of key features of Java Streams. These SPARC processors include a hardware unit called the DAX (Data Analytics Accelerator) that accelerates scanning, filtering and other operations.

Minimum requirements for acceleration
Oracle Solaris 11.3.16.3.0 and above
SPARC M7, SPARC S7, or SPARC T7 server
Java 8

Installation instructions
pkg set-publisher -G '*' -g http://pkg.oracle.com/solaris/release solaris
pkg install streamoffload displays the agreement. To accept and install, use:
pkg install --accept streamoffload
Once this package has been installed, three new files should appear in the /usr/lib/streamoffload directory: comOracleStream.jar, libstreamoffload.so and a README.

Usage
Add import com.oracle.stream.* to the Java program.
Use the DaxIntStream.of() function to create the stream instead of the standard IntStream interface. Users can continue to use an IntStream handle for the stream.
Compile with the library:
javac -cp /usr/lib/streamoffload/comOracleStream.jar Program.java
Include the following in the Java run command:
-cp .:/usr/lib/streamoffload/comOracleStream.jar -DaccelLibPath=/usr/lib/streamoffload -d64

Supported Java Streams Operations
filter()
map(ternary) (only if the output space is confined to the values 1 and 0)
toArray()
anyMatch()
allMatch()
noneMatch()
filter().count()

Example Java code:

    import com.oracle.stream.*;
    import java.util.stream.IntStream;

    public class Test1 {
        static int myGlobal = 0;

        public static void main(String[] args) {
            int[] intArray = new int[1_000_000];                        // Init array
            IntStream myStream = DaxIntStream.of(intArray).parallel();  // DAX-backed stream
            boolean matched = myStream.allMatch(x -> (x > myGlobal) && (x < 15));
        }
    }

Compile using:
javac -cp .:/usr/lib/streamoffload/comOracleStream.jar Test1.java

Run using:
java -XX:LargePageSizeInBytes=256m -XX:+UseLargePages -d64 \
  -cp .:/usr/lib/streamoffload/comOracleStream.jar \
  -DaccelLibPath=/usr/lib/streamoffload Test1

To measure performance without hardware acceleration, one can use the flag -DDisable_Offload=true

Best Practices
Only combinations of the supported operations listed above in a given pipeline will be accelerated, so move unsupported operations into a separate pipeline.
The underlying source data of the pipeline should be an integer array.
It is essential that the stream be marked as parallel so the streamoffload library knows it is acceptable to do operations in parallel.
For best performance when using the streamoffload library, it is often preferable to use large pages for larger input sizes and smaller pages for smaller input sizes. One can do this using the Java flags -XX:LargePageSizeInBytes and -XX:+UseLargePages. A good rule of thumb is to use the smallest page size within which the entire input data will fit.
Streamoffload supports lambda expressions under certain conditions.
The following are supported in the lambda expression: comparison operators (<, >, <=, >=, ==, !=), the || and && logical operators, global variables, constants, final static local variables, and the scalar argument to the lambda. The following are not accelerated in lambda expressions: arithmetic operations inside predicates, and instance or non-static local variables. This can be worked around by assigning instance variables and local variables to a final static variable before usage in the lambda. A sketch of such a pipeline appears after this post.

See Also
Accelerating Java Streams with the SPARC Data Analytics Accelerator
Lambdas and Streams in Java 8 Libraries by Brian Goetz, March 25, 2014
Additional SPARC DAX resource (including open API for other languages): SWiSdev.oracle.com/DAX
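Building on the Test1 example and the lambda constraints above, the following sketch shows a supported filter().count() pipeline in which a runtime value is staged into a final static field before it is used in the lambda, as the workaround describes. The bounds, the system property name, and the sample data are arbitrary assumptions; the DaxIntStream usage follows the streamoffload package documentation quoted above, and the program must be compiled and run with the classpath and -DaccelLibPath flags shown earlier.

    import com.oracle.stream.*;              // provided by the streamoffload package
    import java.util.stream.IntStream;

    public class Test2 {
        // Instance and non-static local variables are not accelerated, so the
        // lower bound is copied into a final static field before the lambda uses it.
        static final int LOWER;
        static {
            LOWER = Integer.getInteger("lower.bound", 10);  // e.g. taken from a system property
        }
        static final int UPPER = 100;                        // a plain constant is also supported

        public static void main(String[] args) {
            int[] data = new int[1_000_000];
            for (int i = 0; i < data.length; i++) data[i] = i % 128;   // fill with sample values

            // Supported pipeline: filter() followed by count(), on a parallel DAX stream.
            IntStream s = DaxIntStream.of(data).parallel();
            long matches = s.filter(x -> (x > LOWER) && (x < UPPER)).count();

            System.out.println("Values in range: " + matches);
        }
    }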


Benchmark

SPC-2: Oracle ZFS Storage ZS5-2 (SSD)

Running the SPC-2 benchmark, the Oracle ZFS Storage ZS5-2 appliance using SSDs delivered SPC-2 Price-Performance of $8.89 and an overall score of 24,397.12 SPC-2 MBPS. The Oracle ZFS Storage ZS5-2 appliance delivered 1.5x improved performance and 1.4x improved price performance over the previous generation Oracle ZFS Storage ZS3-2 appliance as shown by the SPC-2 benchmark. Oracle holds four of the top ten price-performance results on the SPC-2 benchmark, more than any other company. Oracle holds four of the top ten performance results on the SPC-2 benchmark, more than any other company. Oracle's new DE3 SAS3 disk enclosuer provides exceptional bandwidth for the latest generation high performance SSDs. The Oracle ZFS Storage ZS5-2 appliance for this result provided 610 MB/sec/Drive. If you exclude the 4 spare drives in the configuration, it provided 677 MB/sec/Drive. The Oracle ZFS Storage ZS5-2 appliance has a 2.4x performance advantage over the Infortrend EonStor DS 4024B as measured by the SPC-2 benchmark. The Oracle ZFS Storage ZS5-2 appliance has a 2.1x performance advantage over the NetApp EF560 All-Flash Array as measured by the SPC-2 benchmark. The Oracle ZFS Storage ZS5-2 appliance has a 1.6x price-performance advantage over the NetApp E-Series E5660 Storage Array and 3.0x performance advantage as shown by the SPC-2 benchmark. The Oracle ZFS Storage ZS5-2 appliance has a 1.7x price-performance advantage over the Fujitsu Eternus DX200 S3 and 2.8x performance advantage as shown by the SPC-2 benchmark. The Oracle ZFS Storage ZS5-2 appliance has a 3.2x price-performance advantage over the HP XP7 disk array as shown by the SPC-2 benchmark (HP even discounted their hardware 63%). Performance Landscape SPC-2 Price-Performance Below is a table of the best SPC-2 Price-Performance results, presented in increasing price-performance order (as of 04/03/2017). These results may be found at SPC-2 top 10 Price-Performance list. System SPC-2 MBPS $/SPC-2 MBPS Results Identifier Infortrend EonStor DS 4024B 10,030.77 $6.80 B00080 NetApp EF560 All-Flash Array 11,352.17 $8.12 B00078 Oracle ZFS Storage ZS5-2 (SSD) 24,397.12 $8.89 B12001 Oracle ZFS Storage ZS3-2 16,212.66 $12.08 BE00002 Oracle ZFS Storage ZS5-2 (HDD) 19,610.36 $12.93 B12002 NetApp E-Series E5660 Storage Array 8,236.16 $14.74 B00074 Fujitsu Eternus DX200 S3 6,266.50 $15.42 B00071 SGI InfiniteStorage 5600 8,855.70 $15.97 B00065 Huawei OceanStor 6800 V3 42,801.98 $16.89 B00076 Oracle ZFS Storage ZS4-4 31,486.23 $17.09 B00072 SPC-2 MBPS = the Performance Metric $/SPC-2 MBPS = the Price-Performance Metric Results Identifier = A unique identification of the result   SPC-2 Performance The following table lists the top SPC-2 Performance results, presented in decreasing performance order (as of 04/03/2017). These results may be found at the SPC-2 top 10 Performance list. 
System SPC-2 MBPS $/SPC-2 MBPS TSC Price Results Identifier Fujitsu Eternus DX8900 S3 70,120.92 $24.37 $1,708,835 B00079 HPE 3PAR StoreServe 20850/20840 62,844.45 $19.93 $1,252,724 B00075 EMC VMAX 400K 55,643.78 $33.58 $1,866,568 B00073 HP XP7 storage 43,012.52 $28.30 $1,217,462 B00070 Huawei OceanStor 6800 V3 42,801.98 $16.89 $722,776 B00076 Kaminario K2 33,477.03 $29.79 $997,348 B00068 Oracle ZFS Storage ZS4-4 31,486.23 $17.09 $538,050 B00072 Oracle ZFS Storage ZS5-2 (SSD) 24,397.12 $8.89 $217,012 B12001 Oracle ZFS Storage ZS5-2 (HDD) 19,610.36 $12.93 $253,557 B12002 Oracle ZFS Storage ZS3-4 17,244.22 $22.53 $388,472 B00067 SPC-2 MBPS = the Performance Metric $/SPC-2 MBPS = the Price-Performance Metric TSC Price = Total Cost of Ownership Metric Results Identifier = A unique identification of the result Metric   Complete SPC-2 benchmark results may be found at http://www.storageperformance.org/results/benchmark_results_spc2. Configuration Summary Storage Configuration: Oracle ZFS Storage ZS5-2 storage system in clustered configuration 2 x Oracle ZFS Storage ZS5-2 controllers each with 2 x Intel Xeon processors 384 GB memory (24 * 16 GB) 2 x Oracle Storage Drive Enclosure DE3-24P, each with 20 x 800 GB SSD Benchmark Description SPC Benchmark 2 (SPC-2): Consists of three distinct workloads designed to demonstrate the performance of a storage subsystem during the execution of business critical applications that require the large-scale, sequential movement of data. Those applications are characterized predominately by large I/Os organized into one or more concurrent sequential patterns. A description of each of the three SPC-2 workloads is listed below as well as examples of applications characterized by each workload. Large File Processing: Applications in a wide range of fields, which require simple sequential process of one or more large files such as scientific computing and large-scale financial processing. Large Database Queries: Applications that involve scans or joins of large relational tables, such as those performed for data mining or business intelligence. Video on Demand: Applications that provide individualized video entertainment to a community of subscribers by drawing from a digital film library. SPC-2 is built to: Provide a level playing field for test sponsors. Produce results that are powerful and yet simple to use. Provide value for engineers as well as IT consumers and solution integrators. Is easy to run, easy to audit/verify, and easy to use to report official results. See Also Oracle ZFS Storage ZS5-2 SPC-2 Executive Summary storageperformance.org Complete Oracle ZFS Storage ZS5-2 SPC-2 Full Disclosure Report storageperformance.org Storage Performance Council (SPC) Home Page Oracle ZFS Storage ZS5-2 oracle.com     OTN     Disclosure Statement SPC-2 and SPC-2 MBPS are registered trademarks of Storage Performance Council (SPC). Results as of April 3, 2017, for more information see www.storageperformance.org. 
Oracle ZFS Storage ZS5-2 (SSD) - B12001, Oracle ZFS Storage ZS5-2 (HDD) - B12002, Oracle ZFS Storage ZS4-4 - B00072, Oracle ZFS Storage ZS3-4 - B00067, Oracle ZFS Storage ZS3-2 - BE00002, EMC VMAX 400K - B00073, Fujitsu ETERNUS DX8900 S3 - B00079, Fujitsu ETERNUS DX200 S3 - B00071, HPE 3PAR StoreServe 20850 - B00075, HPE 3PAR StoreServe 20840 - B00077, HP XP7 Storage Array - B00070, Huawei OceanStor 6800 V3 - B00076, Infortrend EonStor DS 4024B - B00080, Kaminario K2 - B00068, NetApp EF560 All-Flash Array - B00078, NetApp E-Series E5660 Storage Array - B00074, SGI InfiniteStorage 5600 - B00065.


Benchmark

SPC-2: Oracle ZFS Storage ZS5-2 (HDD)

Running the SPC-2 benchmark, the Oracle ZFS Storage ZS5-2 appliance with 12 trays of HDDs delivered SPC-2 Price-Performance of $12.93 and an overall score of 19,610.36 SPC-2 MBPS.

Oracle holds four of the top ten price-performance results on the SPC-2 benchmark, more than any other company. Oracle holds four of the top ten performance results on the SPC-2 benchmark, more than any other company.

The Oracle ZFS Storage ZS5-2 appliance has nearly a 2.0x performance advantage over the Infortrend EonStor DS 4024B as measured by the SPC-2 benchmark.
The Oracle ZFS Storage ZS5-2 appliance has a 1.7x performance advantage over the NetApp EF560 All-Flash Array as measured by the SPC-2 benchmark.
The Oracle ZFS Storage ZS5-2 appliance has a 1.1x price-performance advantage over the NetApp E-Series E5660 Storage Array and a 2.4x performance advantage as shown by the SPC-2 benchmark.
The Oracle ZFS Storage ZS5-2 appliance has a 1.2x price-performance advantage over the Fujitsu Eternus DX200 S3 and a 3.1x performance advantage as shown by the SPC-2 benchmark.

Performance Landscape

SPC-2 Price-Performance
Below is a table of the best SPC-2 Price-Performance results, presented in increasing price-performance order (as of 04/03/2017). These results may be found at the SPC-2 top 10 Price-Performance list.

System | SPC-2 MBPS | $/SPC-2 MBPS | Results Identifier
Infortrend EonStor DS 4024B | 10,030.77 | $6.80 | B00080
NetApp EF560 All-Flash Array | 11,352.17 | $8.12 | B00078
Oracle ZFS Storage ZS5-2 (SSD) | 24,397.12 | $8.89 | B12001
Oracle ZFS Storage ZS3-2 | 16,212.66 | $12.08 | BE00002
Oracle ZFS Storage ZS5-2 (HDD) | 19,610.36 | $12.93 | B12002
NetApp E-Series E5660 Storage Array | 8,236.16 | $14.74 | B00074
Fujitsu Eternus DX200 S3 | 6,266.50 | $15.42 | B00071
SGI InfiniteStorage 5600 | 8,855.70 | $15.97 | B00065
Huawei OceanStor 6800 V3 | 42,801.98 | $16.89 | B00076
Oracle ZFS Storage ZS4-4 | 31,486.23 | $17.09 | B00072

SPC-2 MBPS = the Performance Metric
$/SPC-2 MBPS = the Price-Performance Metric
Results Identifier = A unique identification of the result

SPC-2 Performance
The following table lists the top SPC-2 Performance results, presented in decreasing performance order (as of 04/03/2017). These results may be found at the SPC-2 top 10 Performance list.

System | SPC-2 MBPS | $/SPC-2 MBPS | TSC Price | Results Identifier
Fujitsu Eternus DX8900 S3 | 70,120.92 | $24.37 | $1,708,835 | B00079
HPE 3PAR StoreServe 20850/20840 | 62,844.45 | $19.93 | $1,252,724 | B00075
EMC VMAX 400K | 55,643.78 | $33.58 | $1,866,568 | B00073
HP XP7 storage | 43,012.52 | $28.30 | $1,217,462 | B00070
Huawei OceanStor 6800 V3 | 42,801.98 | $16.89 | $722,776 | B00076
Kaminario K2 | 33,477.03 | $29.79 | $997,348 | B00068
Oracle ZFS Storage ZS4-4 | 31,486.23 | $17.09 | $538,050 | B00072
Oracle ZFS Storage ZS5-2 (SSD) | 24,397.12 | $8.89 | $217,012 | B12001
Oracle ZFS Storage ZS5-2 (HDD) | 19,610.36 | $12.93 | $253,557 | B12002
Oracle ZFS Storage ZS3-4 | 17,244.22 | $22.53 | $388,472 | B00067

SPC-2 MBPS = the Performance Metric
$/SPC-2 MBPS = the Price-Performance Metric
TSC Price = Total Cost of Ownership Metric
Results Identifier = A unique identification of the result

Complete SPC-2 benchmark results may be found at http://www.storageperformance.org/results/benchmark_results_spc2.
Configuration Summary Storage Configuration: Oracle ZFS Storage ZS5-2 storage system in clustered configuration 2 x Oracle ZFS Storage ZS5-2 controllers each with 2 x Intel Xeon processors 384 GB memory (24 * 16 GB) 12 x Oracle Storage Drive Enclosure DE3-24P, each with 24 x 600 GB 10000 RPM SAS-3 HDD   Benchmark Description SPC Benchmark 2 (SPC-2): Consists of three distinct workloads designed to demonstrate the performance of a storage subsystem during the execution of business critical applications that require the large-scale, sequential movement of data. Those applications are characterized predominately by large I/Os organized into one or more concurrent sequential patterns. A description of each of the three SPC-2 workloads is listed below as well as examples of applications characterized by each workload. Large File Processing: Applications in a wide range of fields, which require simple sequential process of one or more large files such as scientific computing and large-scale financial processing. Large Database Queries: Applications that involve scans or joins of large relational tables, such as those performed for data mining or business intelligence. Video on Demand: Applications that provide individualized video entertainment to a community of subscribers by drawing from a digital film library. SPC-2 is built to: Provide a level playing field for test sponsors. Produce results that are powerful and yet simple to use. Provide value for engineers as well as IT consumers and solution integrators. Is easy to run, easy to audit/verify, and easy to use to report official results. See Also Oracle ZFS Storage ZS5-2 SPC-2 Executive Summary storageperformance.org Complete Oracle ZFS Storage ZS5-2 SPC-2 Full Disclosure Report storageperformance.org Storage Performance Council (SPC) Home Page Oracle ZFS Storage ZS5-2 oracle.com     OTN     Disclosure Statement SPC-2 and SPC-2 MBPS are registered trademarks of Storage Performance Council (SPC). Results as of April 3, 2017, for more information see www.storageperformance.org. Oracle ZFS Storage ZS5-2 (SSD) - B12001, Oracle ZFS Storage ZS5-2 (HDD) - B12002, Oracle ZFS Storage ZS4-4 - B00072, Oracle ZFS Storage ZS3-4 - B00067, Oracle ZFS Storage ZS3-2 - BE00002, EMC VMAX 400K - B00073, Fujitsu ETERNUS DX8900 S3 - B00079, Fujitsu ETERNUS DX200 S3 - B00071, HPE 3PAR StoreServe 20850 - B00075, HPE 3PAR StoreServe 20840 - B00077, HP XP7 Storage Array - B00070, Huawei OceanStor 6800 V3 - B00076, Infortrend EonStor DS 4024B - B00080, Kaminario K2 - B00068, NetApp EF560 All-Flash Array - B00078, NetApp E-Series E5660 Storage Array - B00074, SGI InfiniteStorage 5600 - B00065.


General Information

Improving Algorithms in Spark ML, Open call to community

One of the higher level goals of Spark MLlib should be to improve the efficiency of the ML algorithms that already exist. Currently ML has a reasonable coverage of the important core algorithms. The work to get to feature parity for the DataFrame-based API and model persistence is important. Apache Spark needs to use higher-level BLAS3 and LAPACK routines instead of BLAS1 and BLAS2. For a long time we've used the concept of compute intensity (compute_intensity = FP_operations/Word) to help look at the performance of the underlying compute kernels (see the papers referenced below). It has been proven in many implementations that performance, scalability, and a huge reduction in memory pressure can be achieved by using higher-level BLAS3 or LAPACK routines in both single node as well as distributed computations. We've performed a survey of some of Apache Spark's 2.1.0 ML algorithms (DataFrame). Unfortunately most of the ML algorithms are implemented with BLAS1 or BLAS2 routines, which have very low compute intensity. BLAS2 and BLAS1 routines require a lot more memory bandwidth and will not achieve peak performance on x86, GPUs, IBM Power or Oracle's SPARC processors.

Apache Spark 2.1.0 ML Routines (DataFrame) & BLAS Routines
ALS (Alternating Least Squares matrix factorization): BLAS2: _SPR, _TPSV; BLAS1: _AXPY, _DOT, _SCAL, _NRM2
Logistic regression classification: BLAS2: _GEMV; BLAS1: _DOT, _SCAL
Generalized linear regression: BLAS1: _DOT
Gradient-boosted tree regression: BLAS1: _DOT
GraphX SVD++: BLAS1: _AXPY, _DOT, _SCAL
Neural Net Multi-layer Perceptron: BLAS3: _GEMM; BLAS2: _GEMV

Only the Neural Net Multi-layer Perceptron uses a BLAS3 matrix multiply (DGEMM). By the way, the underscores are replaced by S, D, C, Z for 32-bit real, 64-bit double, 32-bit complex, and 64-bit complex operations, respectively. Refactoring the algorithms to use BLAS3 routines or higher level LAPACK routines will require coding changes to use sub-block algorithms, but the performance benefits can be great.

GPUs & Compute Intensity
Without higher level routines it will be harder for GPUs to offload and accelerate these routines, due to the detached connection to the host and the interconnect bottleneck (even with Nvidia's NVLink). Granted, the block sizes can be made larger, which would provide even more performance for the detached GPU, but for BLAS3 only.

See Also
1993: Brad Carlile. Parallelism, compute intensity, and data vectorization. SuperComputing'93, November 1993.
1995: John McCalpin. Memory Bandwidth and Machine Balance in Current High Performance Computers.
Accelerating Apache Spark SQL 16x faster than Whole-Stage CodeGen

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
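To make the compute-intensity argument concrete, here is a small illustrative calculation comparing a BLAS1 dot product with a BLAS3 matrix multiply, using the usual operation and word counts for these kernels; the problem size n is an arbitrary assumption.

    public class ComputeIntensity {
        // compute_intensity = floating-point operations / words of memory traffic
        static double intensity(double flops, double words) {
            return flops / words;
        }

        public static void main(String[] args) {
            long n = 10_000;

            // BLAS1 dot product x.y: 2n flops, 2n words read (x and y)
            double dotFlops = 2.0 * n;
            double dotWords = 2.0 * n;

            // BLAS3 matrix multiply C = A*B (n x n): 2n^3 flops and, with blocking,
            // roughly 3n^2 words of traffic (A and B read, C written)
            double gemmFlops = 2.0 * n * n * n;
            double gemmWords = 3.0 * n * n;

            System.out.printf("DOT  intensity: %.1f flops/word%n", intensity(dotFlops, dotWords));   // ~1
            System.out.printf("GEMM intensity: %.1f flops/word%n", intensity(gemmFlops, gemmWords)); // ~2n/3
        }
    }

The dot product never rises above about one flop per word, so it is memory-bandwidth bound on every platform listed above, while GEMM's intensity grows with the matrix size, which is why BLAS3-based refactoring relieves memory pressure.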


General Information

Accelerating Spark SQL Using SPARC with DAX, 16x faster per core over and above Spark 2.1.0 with Tungsten

The explosive growth of data and the opportunity to discover insights from that data have never been greater, but the performance challenges of these massive calculations can be daunting. Apache Spark SQL provides a powerful way for data scientists to easily process lots of data quickly. Apache Spark is rapidly evolving. Spark SQL and the new abstractions of Datasets/DataFrames provide a more expressive and powerful way to write code than previously possible with Spark's lower-level RDDs. In addition, Apache Spark's Catalyst engine automatically optimizes Apache Spark SQL code to improve execution speed. The community has been doing a lot of work under the auspices of Project Tungsten to improve the execution of Apache Spark SQL in version 2.1.0 with what is called Whole-stage CodeGen. Dramatic improvements of 10x have been demonstrated by Databricks' engineers. Please see "Apache Spark 2.0 presented by Databricks' Spark Chief Architect Reynold Xin".

Oracle libdax Further Accelerates Spark SQL

Even with all of these 10x improvements, processing tens of terabytes or more of data can still take a long time. Oracle engineers have prototyped accelerating Spark SQL using some of Oracle's new hardware and software innovations. These innovations have the ability to further increase performance by 10-20x over Whole-stage CodeGen (which is 100x to 200x over Apache Spark's Volcano interpreter in earlier versions). Oracle's hardware innovations were created and inspired by the challenges of optimizing SQL processing for in-memory analytics in Oracle Database. The hardware that Oracle designed into current SPARC processors is called the Data Analytics Accelerator (DAX). It is an offload co-processor that uses the same memory as the processor's cores. In addition, Oracle has created an open API (libdax) so the DAX hardware can easily be used by Apache Spark and other applications. See Open DAX APIs for more. To show the advantages of libdax in Apache Spark SQL, Oracle engineers have created a proof-of-concept implementation to show what is possible with libdax integrated into Apache Spark. We have demonstrated that Oracle's SPARC S7 processor with DAX can provide a 15.6x per-core improvement over an x86 (E5 v4) processor in SQL performance on a 2-predicate "Between" query. Both the SPARC S7 and SPARC M7 processors contain the DAX offload co-processor. The DAX shares the same memory controller as the normal cores, so the bottleneck of GPU-like data transfers is completely avoided. This work was debuted at the Apache Spark New York City meetup in December 2016. A video of this meetup can be seen on YouTube.

Even without the DAX, the SPARC processors can show better performance than x86 processors. This is because Apache Spark is written in Scala, which runs on the JVM. The SPARC M7 and SPARC S7 processors have demonstrated a 1.5x per-core performance advantage over x86 cores on JVM performance as shown by the benchmark results "SPECjbb2015: SPARC T7-1 World Record for 1 Chip Result" and "SPECjbb2015: SPARC S7-2 Multi-JVM and Distributed Results". Oracle achieved these results by focusing on Java (JVM) performance as a goal during processor design.

Data Analytics Accelerator

With the SPARC M7 processor, Oracle added a number of Software in Silicon (SWiS) features by building higher-level software functions into the processor.
One of the new features introduced in the SPARC M7 processor is called the Data Analytics Accelerator, or DAX, and the multiple DAX units on the processor deliver unprecedented analytics efficiency. DAX is an integrated co-processor which provides a specialized set of instructions that can run very selective functionality – Scan, Extract, Select, Filter, and Translate – at fast speeds. The multiple DAX units share the same memory interface with the processor's cores, so the DAX can take full advantage of the 140-160 GB/sec memory bandwidth of the SPARC M7 processor. The DAX co-processors were originally intended to speed up Oracle Database and have been used in production to run an SQL query in parallel at query rates of 170 billion rows/second. Other programming methods have also been shown to be able to take advantage of DAX. As an example, consider how the SQL statement that is accelerated in the Oracle Database can be written in other programming styles. These additional programming styles can also be accelerated using DAX.

SQL (Oracle Database):
SELECT count(*) from citizen WHERE citizen.age > 18

Apache Spark SQL (DSL - Domain Specific Language, or Language Integrated Query):
val nvoters : Long = citizen.filter($"citizen.age" > 18).count()

Java Streams API:
long nvoters = arrayList.parallelStream().filter(citizen -> citizen.age > 18).count();

Eclipse Collections (formerly Goldman Sachs Collections):
int nvoters = fastList.count(citizen -> citizen.olderThan(18));

Java 8 Stream API Accelerated by Oracle libdax

The Stream API was introduced in Java 8. See "Processing Data with Java SE 8 Streams, Part 1" and "Part 2: Processing Data with Java SE 8 Streams" for more on what can be done using Java Streams. Java Streams can condense verbose collection processing code into simple and readable stream programs. These programs provide the ability to process data in ways that can be accelerated on modern processors. You can write abstract, query-like code leveraging the Stream API without going into the details of iterating over the collection entities. Eclipse Collections offers an API that can also take advantage of accelerations due to Java Stream performance.

Other Applications Accelerated Using libdax

The DAX has also been used to accelerate other software applications such as ActivePivot from ActiveViam, and their results of 6x to 8x faster with DAX were reported at the 2016 Oracle Open World. More information on ActivePivot performance can be found at "Accelerating In-Memory Computing with ActivePivot on Oracle".

See Also
SPECjbb2015: SPARC T7-1 World Record for 1 Chip Result
SPECjbb2015: SPARC S7-2 Multi-JVM and Distributed Results
Accelerating Java Streams with the SPARC Data Analytics Accelerator
Processing Data with Java SE 8 Streams, Part 1
Part 2: Processing Data with Java SE 8 Streams
Eclipse Collections
Accelerating In-Memory Computing with ActivePivot on Oracle
Open DAX APIs

Disclosure Statement
Copyright 2017, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions.
The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
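As an illustration of the kind of 2-predicate "Between" filter used in the proof of concept, here is a small sketch using the stock Spark Java Dataset API rather than the libdax prototype, which is not publicly available. The input path, table contents, and column name ("age") are hypothetical, and local mode is used only so the example is self-contained.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.col;

    public class BetweenQuery {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("BetweenQuery")
                    .master("local[*]")            // local mode for illustration only
                    .getOrCreate();

            // Hypothetical table of citizens with an integer "age" column.
            Dataset<Row> citizens = spark.read().parquet("/data/citizen");

            // Two-predicate "between" filter (18 <= age <= 65) followed by a count,
            // the same shape of query described in the text.
            long n = citizens.filter(col("age").between(18, 65)).count();

            System.out.println("Rows in range: " + n);
            spark.stop();
        }
    }

On a DAX-enabled system, the point of the prototype described above is that the scan and filter portion of such a pipeline is handed to the DAX units instead of being executed entirely by generated JVM code.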


Benchmark

Yahoo Cloud Serving Benchmark: SPARC S7-2 with Cassandra Shows 2x Advantage Over x86 E5-2699 v3 Server Per Core Under Load

Running the Yahoo Cloud Serving Benchmark (YCSB) against a 100 million record Cassandra database, Oracle's SPARC S7-2L server delivered 69,858 ops/sec running a 50% read, 50% write workload. Cassandra is used for Big Data analysis and for cloud computing. The SPARC S7-2L server was 2.0 times faster per core than a two-chip x86 E5-2699 v3 server running YCSB with a 50% read, 50% update workload.

Performance Landscape

The table below compares the SPARC S7-2L server and the two-chip x86 E5-2699 v3 server. All of the following results were run as part of this benchmark effort.

YCSB Benchmark Performance, Mixed Load (50% Read, 50% Update)
System | Total Cores | Throughput (ops/sec) | Avg Read Latency (msec) | Avg Write Latency (msec) | Throughput per Core
SPARC S7-2L, 2 x SPARC S7 (2 x 8 cores) | 16 | 69,858 | 8 | 7 | 4,366
x86 E5-2699 v3 server, 2 x Intel Xeon Processor E5-2699 v3 (2 x 18 cores) | 36 | 78,617 | 1 | 1 | 2,184

Note: This compares a 36-core x86 system to a 16-core SPARC S7-2L server; both are two-chip servers, but the number of cores per chip is very different.

Configuration Summary

SPARC System:
SPARC S7-2L server
2 x SPARC S7 processors (4.27 GHz, 8 cores per processor)
512 GB memory
2 x Oracle Flash Accelerator F320 PCIe card
Oracle Solaris 11.3 (0.5.11-0.175.3.9.0.4.0)
Apache Cassandra 2.2.8
Java(TM) SE Runtime Environment (build 1.8.0_92-b14)

x86 System:
Oracle Server X5-2L
2 x Intel Xeon Processor E5-2699 v3 (2.3 GHz, 18 cores per processor)
256 GB memory
4 x Oracle Flash Accelerator F160 PCIe card
Oracle Linux Server release 7.1 (3.10.0-229.el7.x86_64)
Apache Cassandra 2.2.8
Java(TM) SE Runtime Environment (build 1.8.0_102-b14)

Benchmark Description

The Yahoo Cloud Serving Benchmark (YCSB) is a performance benchmark for cloud databases and their systems. The benchmark documentation says:

With the many new serving databases available including Sherpa, BigTable, Azure and many more, it can be difficult to decide which system is right for your application, partially because the features differ between systems, and partially because there is not an easy way to compare the performance of one system versus another. The goal of the Yahoo Cloud Serving Benchmark (YCSB) project is to develop a framework and common set of workloads for evaluating the performance of different "key-value" and "cloud" serving stores.

Key Points and Best Practices

The 100 million records were loaded into a database on the server under test.
Fixed priority class was assigned to the Apache Cassandra processes when running on the SPARC S7-2L system.
The ZFS record size was set to 16K (default 128K), and this worked best for the 50% read, 50% write workload.

See Also
Yahoo Cloud Serving Benchmark
YCSB Source
Apache Cassandra Database
SPARC S7-2L Server oracle.com     OTN     Blog
Oracle Server X5-2L oracle.com     OTN     Blog
Oracle Flash Accelerator F320 PCIe Card oracle.com     OTN
Oracle Solaris oracle.com     OTN     Blog

Disclosure Statement
Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of December 6, 2016. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions.
The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.


General Information

Accelerating Java Streams Performance Using SPARC with DAX

Processing lots of data with Java can require significant computing power. Oracle engineers have taken a two-pronged approach to improving Java performance. First, on general Java performance, Oracle's SPARC M7 and SPARC S7 processors provide up to a 1.5x performance per core advantage over x86 cores, as shown by these benchmark results: "SPECjbb2015: SPARC T7-1 World Record for 1 Chip Result" and "SPECjbb2015: SPARC S7-2 Multi-JVM and Distributed Results". This was achieved by making Java (JVM) performance a goal in processor design. Second, the Stream API in Java 8 provides a high-level, concise programming model that can be utilized to improve performance. The SPARC M7 and SPARC S7 processors contain the Data Analytics Accelerator (DAX) offload coprocessor, which can provide a 3x to 22x improvement for Java Streams processing. For more information on Java Streams and the performance benefits of using DAX, see "Accelerating Java Streams with the SPARC Data Analytics Accelerator".

Java 8 Stream API

The Stream API was introduced in Java 8. See "Processing Data with Java SE 8 Streams, Part 1" and "Part 2: Processing Data with Java SE 8 Streams" for more on what can be done using Java Streams. Java Streams can condense verbose collection processing code into simple and readable stream programs. These programs process data in a way that can be accelerated on modern processors. You can write abstract, query-like code leveraging the Stream API without going into the details of iterating over the collection entities. Eclipse Collections offers an API that can also take advantage of accelerations due to Java Stream performance.

Data Analytics Accelerator

With the SPARC M7 processor, Oracle added a number of Software in Silicon (SWiS) features by building higher-level software functions into the processor. One of the new features introduced in the SPARC M7 processor is called the Data Analytics Accelerator, or DAX, and the multiple DAX units on the processor deliver unprecedented analytics efficiency. DAX is an integrated coprocessor which provides a specialized set of instructions that can run very selective functionality – Scan, Extract, Select, Filter, and Translate – at fast speeds. The multiple DAX units share the same memory interface with the processor's cores, so the DAX can take full advantage of the 140-160 GB/sec memory bandwidth of the SPARC M7 processor. The DAX coprocessors were originally intended to speed up Oracle Database and have been used in production to run an SQL query in parallel at query rates of 170 billion rows/second. Other programming methods have also been shown to be able to take advantage of DAX. As an example, consider how the SQL statement that is accelerated in the Oracle Database can be written in other programming styles. These additional programming styles can also be accelerated using DAX.

SQL (Oracle Database):
SELECT count(*) FROM citizen WHERE citizen.age > 18

Apache Spark SQL (DSL - Domain Specific Language, or Language Integrated Query):
val nvoters: Long = citizen.filter($"citizen.age" > 18).count()

Java Streams API:
long nvoters = arrayList.parallelStream().filter(citizen -> citizen.olderThan(18)).count();

Eclipse Collections (formerly Goldman Sachs Collections):
int nvoters = fastList.count(citizen -> citizen.olderThan(18));

(A self-contained, runnable sketch of the Java Streams pattern above appears at the end of this post.)

More Acceleration Using DAX

The DAX has also been used to accelerate other software applications, such as ActivePivot from ActiveViam; their results of 6x to 8x faster with DAX were reported at the 2016 Oracle Open World.
More information on ActivePivot performance can be found at "Accelerating In-Memory Computing with ActivePivot on Oracle".

Summary

Combining the Java 8 Stream API with DAX offload on the SPARC M7 and SPARC S7 processors can provide a 3x to 22x improvement for Java Streams processing, in addition to the general per-core Java performance advantage of these processors.

See Also

SPECjbb2015: SPARC T7-1 World Record for 1 Chip Result
SPECjbb2015: SPARC S7-2 Multi-JVM and Distributed Results
Accelerating Java Streams with the SPARC Data Analytics Accelerator
Processing Data with Java SE 8 Streams, Part 1
Part 2: Processing Data with Java SE 8 Streams
Eclipse Collections
Accelerating In-Memory Computing with ActivePivot on Oracle
Open DAX APIs

Disclosure Statement

Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
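To make the Java Streams pattern above concrete, here is a minimal, self-contained sketch that compiles with any Java 8 or later JDK. The Citizen class, its olderThan helper and the randomly generated data are illustrative assumptions and are not part of the benchmark code; any DAX acceleration discussed above comes from the platform, not from anything special in this plain-JDK source, which runs anywhere.

import java.util.List;
import java.util.ArrayList;
import java.util.concurrent.ThreadLocalRandom;

// Minimal illustration of the filter/count Streams pattern discussed above.
// The Citizen type and the sample data are hypothetical, not from the benchmark.
public class VoterCount {
    static final class Citizen {
        final int age;
        Citizen(int age) { this.age = age; }
        boolean olderThan(int n) { return age > n; }
    }

    public static void main(String[] args) {
        // Build an illustrative in-memory collection of citizens with random ages.
        List<Citizen> citizens = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            citizens.add(new Citizen(ThreadLocalRandom.current().nextInt(0, 100)));
        }

        // Equivalent of: SELECT count(*) FROM citizen WHERE citizen.age > 18
        long nvoters = citizens.parallelStream()
                               .filter(c -> c.olderThan(18))
                               .count();

        System.out.println("Citizens older than 18: " + nvoters);
    }
}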


Benchmark

Oracle Berkeley DB: SPARC S7-2 Faster Per Core Under Load Than Intel E5-2630 v4 System

Oracle's SPARC S7-2 server shows higher throughput performance, 1.6x per core, running a mixed transaction workload using Oracle Berkeley DB compared to results on a two-chip Intel Xeon Processor E5-2699 v3 server. Each instance contains a total of 50 million rows of customer, account and orders data, and each instance is nearly 20 gigabytes in size. The SPARC S7-2 server delivered a rate of 365,195 transactions per second on the throughput test. On a per-core basis it delivered a rate of 22,825 transactions per second. Oracle Berkeley DB is available for download from Oracle and may be customized to use POSIX pthreads locking.

Performance Landscape

All of the following results were run as part of this benchmark effort.

Mixed Workload – Oracle Berkeley DB
System | Total Cores | Performance (tps) | Performance/Core
x86 with two Intel E5-2699 v3 | 36 | 526,931 | 14,637
x86 with two Intel E5-2630 v4 | 20 | 385,189 | 19,259
x86 with two Intel E5-2630 v3 | 16 | 291,708 | 18,232
SPARC S7-2 with two SPARC S7 | 16 | 365,195 | 22,825

SPARC per core advantage over E5-2699 v3: 1.6x
SPARC per core advantage over E5-2630 v4: 1.2x
SPARC per core advantage over E5-2630 v3: 1.3x

Configuration Summary

Systems Under Test:

1 x SPARC S7-2 server with
2 x SPARC S7 processors, 4.27 GHz, 8 cores per processor
512 GB memory
Oracle Solaris 11.3
Oracle Berkeley DB 6.2

1 x Oracle Server X6-2L
2 x Intel Xeon Processor E5-2630 v4, 2.2 GHz, 10 cores per processor
256 GB memory
Oracle Linux 7.2 (3.10.0-327.el7.x86_64)
Oracle Berkeley DB 6.2.11

1 x Oracle Server X5-2L
2 x Intel Xeon Processor E5-2699 v3, 2.2 GHz, 18 cores per processor
256 GB memory
Oracle Linux 6.5 (3.8.13-16.2.1.el6uek.x86_64)
Oracle Berkeley DB 6.2

1 x Oracle Server X5-2
2 x Intel Xeon Processor E5-2630 v3, 2.4 GHz, 8 cores per processor
256 GB memory
Oracle Linux 7.2 (3.8.13-118.13.2.el7uek.x86_64)
Oracle Berkeley DB 6.2

Benchmark Description

The benchmark consists of a workload running against a schema of 6 tables: 4 tables that get updated (account, branch, teller, history) and 2 read-only tables (customer and orders). The workload has a set of 4 transactions: account-update: update account, branch and teller balances; get-order-customer: random read on order to get the customer key, then locate and read the customer records; search-order: get a range of orders; search-customer: get a random customer record. Transaction mix: account update is 5%; the other three (read-only) transactions are 95%. The benchmark sampling time was 5 minutes and the total throughput was calculated. Each instance contains a total of 50 million rows of customer, account and orders data, and each instance is nearly 20 gigabytes in size.

Key Points and Best Practices

The default mechanism for implementing the Oracle Berkeley DB cache is memory mapped files. Improved performance is obtained using shared memory; for this demonstration, changes were made in the test programs. Changing from shared memory to ISM requires a simple change to the provided Oracle Berkeley DB source code. To add ISM support, the routine os_map.c was modified as follows:

original: if((infop->addr = shmat(id,NULL,0)) == (void *)-1)
ISM:      if((infop->addr = shmat(id,NULL,SHM_SHARE_MMU)) == (void *)-1)

The POSIX pthreads locking implementation was substituted in place of the default "hybrid" locking for better performance on both systems.
This is an Oracle Berkeley DB library build option specified during the configure step:

./dist/configure \
    --enable-posixmutexes \
    --with-mutex=POSIX/pthreads

See Also

SPARC S7-2 Server oracle.com    OTN    Blog
Oracle Solaris oracle.com    OTN    Blog
Oracle Berkeley DB oracle.com    OTN    Blog

Disclosure Statement

Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 11/09/2016. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.


Benchmark

SPEC SFS2014_swbuild: World Record Oracle ZFS Storage ZS3-2 Appliance Results

The Oracle ZFS Storage ZS3-2 appliance delivered world record performance on the SPEC SFS2014 Software Build benchmark, beating results published on the IBM Elastic Storage Server GL6. The Oracle ZFS Storage ZS3-2 appliance delivered 240 SPEC SFS2014_swbuild Builds with an overall response time (ORT) of 1.71 msec. The Oracle ZFS Storage ZS3-2 appliance delivered a 1.5 times greater business metric (Builds) score than IBM's Spectrum Scale 4.2 with Elastic Storage Server GL6 on the SPEC SFS2014_swbuild benchmark. The Oracle ZFS Storage ZS3-2 appliance has 1.5 times better Builds Ops/Sec than the IBM Spectrum Scale 4.2 with Elastic Storage Server GL6 on the SPEC SFS2014_swbuild benchmark. The Oracle ZFS Storage ZS3-2 appliance was configured with 136 HDDs and 8 SSDs, while the IBM Elastic Storage Server GL6 had 348 HDDs.

Performance Landscape

The top SPEC SFS2014_swbuild results in decreasing business metric (Builds) order as of October 11, 2016. Complete SPEC SFS2014 benchmark results may be found at www.spec.org/sfs2014/results/sfs2014.html.

System | Builds | ORT (msec) | Throughput (Ops/Sec)
Oracle ZFS Storage ZS3-2 | 240 | 1.71 | 116,020
IBM Spectrum Scale 4.2 with Elastic Storage Server GL6 | 160 | 1.21 | 80,003

Builds – SPEC SFS2014_swbuild business metric
ORT – Overall Response Time
Throughput – Builds Ops/Sec

Configuration Summary

Storage Configuration:
Oracle ZFS Storage ZS3-2 storage system in clustered configuration
2 x Oracle ZFS Storage ZS3-2 controllers, each with 2 x 2.1 GHz Intel Xeon E5-2658 processors
256 GB memory
6 x Oracle Storage Drive Enclosure DE2-24P
4 each with 24 x 300 GB 10K RPM SAS-2 drives
2 each with 20 x 300 GB 10K RPM SAS-2 drives, 4 x 73 GB SAS-2 flash-enabled write-cache

Benchmark Description

SPEC SFS2014 is a benchmark suite measuring file server throughput and response time, providing a standardized method for comparing performance across different vendor platforms. SPEC SFS2014 results summarize the server's capabilities with respect to the number of operations that can be handled per second, as well as the overall latency of the operations. The suite is a follow-on to the SPEC SFS2008 benchmark.

The software build workload is a classic metadata-intensive build workload. This workload was derived from analysis of software builds and traces collected on systems in the software build arena. Conceptually, these tests are similar to running the software tool make against tens of thousands of files. The file attributes are checked (metadata operations) and, if necessary, the file is read, compiled, then data is written back out to storage.

See Also

Standard Performance Evaluation Corporation (SPEC) Home Page
Oracle ZFS Storage ZS3-2 oracle.com    OTN

Disclosure Statement

SPEC and SPEC SFS are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of October 11, 2016, for more information see www.spec.org. Oracle ZFS Storage ZS3-2 Appliance, 240 SPEC SFS2014_swbuild Builds with an overall response time of 1.71 msec; IBM Spectrum Scale 4.2 w/Elastic Storage Server GL6, 160 SPEC SFS2014_swbuild Builds with an overall response time of 1.21 msec.


Benchmark

SPECjbb2015: SPARC S7-2 Multi-JVM and Distributed Results

Oracle's SPARC S7-2 server, using Oracle Solaris and Oracle JDK, produced two-chip SPECjbb2015-Distributed and SPECjbb2015-MultiJVM benchmark results.  This benchmark was designed by the industry to showcase Java performance in the Enterprise.  Performance is expressed in terms of two metrics, max-jOPS which is the maximum throughput number and critical-jOPS which is critical throughput under service level agreements (SLAs). Summary for SPECjbb2015-Distributed The SPARC S7-2 server achieved 66,612 SPECjbb2015-Distributed max-jOPS and 36,922 SPECjbb2015-Distributed critical-jOPS on the SPECjbb2015 benchmark.  On the SPARC S7-2 server's 16 cores, these rates are 4,163 SPECjbb2015-Distributed max-jOPS per core and 2,308 SPECjbb2015-Distributed critical-jOPS per core. The two-chip SPARC S7-2 server delivered 1.5 times more SPECjbb2015-Distributed max-jOPS performance per core than the HP ProLiant DL380 Gen9 server using Intel v4 processors.  The SPARC S7-2 server also produced 2.6 times more SPECjbb2015-Distributed  critical-jOPS performance per core compared to the HP ProLiant DL380 Gen9. Summary for SPECjbb2015-MultiJVM The SPARC S7-2 server achieved 65,790 SPECjbb2015-MultiJVM max-jOPS and 35,812 SPECjbb2015-MultiJVM critical-jOPS on the SPECjbb2015 benchmark.  On the SPARC S7-2 server's 16 cores, these rates are 4,112 SPECjbb2015-MultiJVM max-jOPS per core and 2,238 SPECjbb2015-MultiJVM critical-jOPS per core. The SPARC S7-2 server delivered 1.5 times more SPECjbb2015-MultiJVM max-jOPS  performance per core than the Huawei RH2288H V3 using Intel v4 processors.  The SPARC S7-2 server also produced 1.4 times more SPECjbb2015-MultiJVM critical-jOPS performance per core  compared to the Cisco UCS C220 M4 using Intel v4 processors. From SPEC's press release: "The SPECjbb2015 benchmark is based on the usage model of a worldwide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases, and data-mining operations.  It exercises Java 7 and higher features, using the latest data formats (XML), communication using compression, and secure messaging." Performance Landscape Results of SPECjbb2015 Distributed from www.spec.org as of July 21, 2016.   SPECjbb2015 Distributed Results System Performance Perf/Core Environment max crit max crit SPARC S7-2 2 x SPARC S7 (4.26 GHz, 2x 8core) 66,612 36,922 4,163 2,308 Oracle Solaris 11.3 JDK 8u92 HP ProLiant DL380 Gen9 2 x Intel E5-2699 v4 (2.2 GHz, 2x 22core) 120,674 39,615 2,743 900 Red Hat 7.2 JDK 8u74 HP ProLiant DL360 Gen9 2 x Intel E5-2699 v4 (2.2 GHz, 2x 22core) 106,337 55,858 2,417 1,270 Red Hat 7.2 JDK 8u91 HP ProLiant DL580 Gen9 4 x Intel E7-8890 v4 (2.2 GHz, 4x 24core) 219,406 72,271 2,285 753 SUSE 12 SP1 JDK 8u92 Lenovo System x3850 X6 4 x Intel E7-8890 v4 (2.2 GHz, 4x 24core) 194,068 132,111 2,022 1,376 Red Hat 7.2 JDK 8u91 Note: under Performance, the max column contains SPECjbb2015-Distributed max-jOPS results, and the crit column contains SPECjbb2015-Distributed critical-jOPS results.  Under Perf/Core, the max column contains SPECjbb2015-Distributed max-jOPS results divided by their respective core count, and the crit column contains SPECjbb2015-Distributed critical-jOPS results divided by their respective core count.  The Environment column contains the operating system version, the JDK version, and any special configuration. Results of SPECjbb2015 MultiJVM from www.spec.org as of July 21, 2016. 
SPECjbb2015 MultiJVM Results System Performance Perf/Core Environment max crit max crit SPARC S7-2 2 x SPARC S7 (4.26 GHz, 2x 8core) 65,790 35,812 4,112 2,238 Oracle Solaris 11.3 JDK 8u92 IBM Power S812LC 1 x POWER8 (2.92 GHz, 10core) 44,883 13,032 4,488 1,303 Ubuntu 14.04.3 J9 VM SPARC T7-1 1 x SPARC M7 (4.13 GHz, 32core) 120,603 60,280 3,769 1,884 Oracle Solaris 11.3 JDK 8u66 Huawei RH2288H V3 2 x Intel Xeon E5-2699 v4 (2.2 GHz, 2x 22core) 121,381 38,595 2,759 877 Red Hat 6.7 JDK 8u92 HP ProLiant DL360 Gen9 2 x Intel Xeon E5-2699 v4 (2.2 GHz, 2x 22core) 120,674 29,013 2,743 659 Red Hat 7.2 JDK 8u74 HP ProLiant DL380 Gen9 2 x Intel Xeon E5-2699 v4 (2.2 GHz, 2x 22core) 105,690 52,952 2,402 1,203 Red Hat 7.2 JDK 8u72 Cisco UCS C220 M4 2 x Intel Xeon E5-2699 v4 (2.2 GHz, 2x 22core) 94,667 71,951 2,152 1,635 Red Hat 6.7 JDK 8u74 Huawei RH2288H V3 2 x Intel Xeon E5-2699 v3 (2.3 GHz, 2x 18core) 98,673 28,824 2,741 801 Red Hat 6.7 JDK 8u92 Lenovo Flex System x240 M5 2 x Intel Xeon E5-2699 v3 (2.3 GHz, 2x 18core) 80,889 43,654 2,247 1,213 Red Hat 6.5 JDK 8u60 SPARC T5-2 2 x SPARC T5 (3.6 GHz, 2x 16core) 80,889 37,422 2,528 1,169 Oracle Solaris 11.2 JDK 8u66 Note: under Performance, the max column contains SPECjbb2015-MultiJVM max-jOPS results, and the crit column contains SPECjbb2015-MultiJVM critical-jOPS results.  Under Perf/Core, the max column contains SPECjbb2015-MultiJVM max-jOPS results divided by their respective core count, and the crit column contains SPECjbb2015-MultiJVM critical-jOPS results divided by their respective core count.  The Environment column contains the operating system version, the JDK version, and any special configuration. Configuration Summary System Under Test: SPARC S7-2 Server 2 x SPARC S7 processor (4.26 GHz) 1 TB memory (16 x 64 GB dimms) Oracle Solaris 11.3 (11.3.9.3.0) Java HotSpot 64-Bit Server VM, version 1.8.0_92   Driver System (Distributed result): Sun Server X4-2L 2 x Intel Xeon E5-2697 v2 processor (2.70 GHz) 128 GB memory (16 x 8 GB dimms) Oracle Solaris 11.3 (11.3.8.2.0) Java HotSpot 64-Bit Server VM, version 1.8.0_92   Benchmark Description The benchmark description, as found at the SPEC website. The SPECjbb2015 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community. Features include: A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations. Both a pure throughput metric and a metric that measures critical throughput under service level agreements (SLAs) specifying response times ranging from 10ms to 100ms. Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers. Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security. Support for virtualization and cloud environments. 
See Also SPECjbb2015 Results Website More Information on SPECjbb2015 SPARC S7-2 Server oracle.com    OTN    Blog Oracle Solaris oracle.com    OTN    Blog Java oracle.com    OTN   Disclosure Statement SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC).  Results from http://www.spec.org as of 7/21/2016. HP ProLiant DL580 Gen9 219,406 SPECjbb2015-Distributed max-jOPS, 72,271 SPECjbb2015-Distributed critical-jOPS; Lenovo System x3850 X6 194,068 SPECjbb2015-Distributed max-jOPS, 132,111 SPECjbb2015-Distributed critical-jOPS; HP ProLiant DL380 Gen9 120,674 SPECjbb2015-Distributed max-jOPS, 39,615 SPECjbb2015-Distributed critical-jOPS; HP ProLiant DL360 Gen9 106,337 SPECjbb2015-Distributed max-jOPS, 55,858 SPECjbb2015-Distributed critical-jOPS; SPARC S7-2 66,612 SPECjbb2015-Distributed max-jOPS, 36,922 SPECjbb2015-Distributed critical-jOPS; Oracle S7-2 65,790 SPECjbb2015-MultiJVM max-jOPS, 35,812 SPECjbb2015-MultiJVM critical-jOPS; IBM Power S812LC 44,883 SPECjbb2015-MultiJVM max-jOPS, 13,032 SPECjbb2015-MultiJVM critical-jOPS;  SPARC T7-1 120,603 SPECjbb2015-MultiJVM max-jOPS, 60,280 SPECjbb2015-MultiJVM critical-jOPS;  Huawei RH2288H V3 121,381 SPECjbb2015-MultiJVM max-jOPS, 38,595 SPECjbb2015-MultiJVM critical-jOPS;  HP ProLiant DL360 Gen9 120,674 SPECjbb2015-MultiJVM max-jOPS, 29,013 SPECjbb2015-MultiJVM critical-jOPS; HP ProLiant DL380 Gen9 105,690 SPECjbb2015-MultiJVM max-jOPS, 52,952 SPECjbb2015-MultiJVM critical-jOPS; Cisco UCS C220 M4 94,667 SPECjbb2015-MultiJVM max-jOPS, 71,951 SPECjbb2015-MultiJVM critical-jOPS; Huawei RH2288H V3 98,673 SPECjbb2015-MultiJVM max-jOPS, 28,824 SPECjbb2015-MultiJVM critical-jOPS; Lenovo Flex System x240 M5 80,889 SPECjbb2015-MultiJVM max-jOPS, 43,654 SPECjbb2015-MultiJVM critical-jOPS; SPARC T5-2 80,889 SPECjbb2015-MultiJVM max-jOPS, 37,422 SPECjbb2015-MultiJVM critical-jOPS.


Benchmark

SPECjEnterprise2010: SPARC S7-2 Secure and Unsecure Results

Oracle's SPARC S7-2 servers produced a SPECjEnterprise2010 benchmark result of 14,400.78 SPECjEnterprise2010 EjOPS using one SPARC S7-2 server for the application tier and one SPARC S7-2 server for the database server. The SPARC S7-2 servers also obtained a result of 14,121.47 SPECjEnterprise2010 EjOPS using encrypted data. This secured result used Oracle Advanced Security Transparent Data Encryption (TDE) for the application database tablespaces with the AES-128 cipher. The network connection between the application server and the database server was also secured using Oracle's Network Data Encryption with the JDBC driver and RC4-128 cipher. The SPARC S7-2 server, equipped with two SPARC S7 processors, demonstrated 43% better SPECjEnterprise2010 EjOPS/app-server-cores performance compared to the Oracle Server X6-2 system result. The SPARC S7-2 server, equipped with two SPARC S7 processors, demonstrated 50% better SPECjEnterprise2010 EjOPS/app-server-cores performance compared to the Oracle Server X5-2 system result. The SPARC S7-2 server, equipped with two SPARC S7 processors, demonstrated 31% better SPECjEnterprise2010 EjOPS/core performance compared to the 2-chip IBM x3650 server result. The application server used Oracle Fusion Middleware components including the Oracle WebLogic 12.2 application server and Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.8.0_92. The database server was configured with Oracle Database 12c Release 1. The benchmark performance using the secure SPARC S7-2 server configuration with encryption ran within 2% of the performance of the non-secure SPARC S7-2 server result. Performance Landscape Select single application server results.  Complete benchmark results are at the SPEC website,  SPECjEnterprise2010 Results.   
SPECjEnterprise2010 Performance Chart 7/6/2016 Java EE Server DB Server EjOPS app cores EjOPS /appcore 1 x SPARC S7-2 2 x 4.27 GHz SPARC S7 Oracle WebLogic 12c (12.2.1) 1 x SPARC S7-2 2 x 4.27 GHz SPARC S7 Oracle Database 12c (12.1.0.2) 14,400.78 16 900 1 x SPARC S7-2 2 x 4.27 GHz SPARC S7 Oracle WebLogic 12c (12.2.1) Network Data Encryption for JDBC 1 x SPARC S7-2 2 x 4.27 GHz SPARC S7 Oracle Database 12c (12.1.0.2) Transparent Data Encryption 14,121.47 16 882 1 x Oracle Server X6-2 2 x 2.2 GHz Intel Xeon E5-2699 v4 Oracle WebLogic 12c (12.2.1) 1 x Oracle Server X6-2 2 x 2.2 GHz Intel Xeon E5-2699 v4 Oracle Database 12c (12.1.0.2) 27,803.39 44 631 1 x Oracle Server X5-2 2 x 2.3 GHz Intel Xeon E5-2699 v3 Oracle WebLogic 12c (12.1.3) 1 x Oracle Server X5-2 2 x 2.3 GHz Intel Xeon E5-2699 v3 Oracle Database 12c (12.1.0.2) 21,504.30 36 597 1 x IBM System x3650 M5 2 x 2.6 GHz Intel Xeon E5-2697 v3 WebSphere Application Server V8.5 1 x IBM System x3850 X6 4 x 2.8 GHz Intel Xeon E7-4890 v2 IBM DB2 10.5 FP5 19,282.14 28 688 1 x IBM Power S824 4 x 3.5 GHz POWER 8 WebSphere Application Server V8.5 1 x IBM Power S824 4 x 3.5 GHz POWER 8 IBM DB2 10.5 FP3 22,543.34 24 939   EjOPS – SPECjEnterprise2010 EjOPS (bigger is better) app cores – application server cores used EjOPS/appcore – SPECjEnterprise2010 EjOPS divided by total application server cores (bigger is better) Configuration Summary Application Server: 1 x SPARC S7-2 server, with 2 x SPARC S7 processor (4.27 GHz) 512 GB memory (16 x 32 GB) 2 x 600 GB SAS-3 HDD 2 x 400 GB SAS-3 SSD 2 x Sun Dual Port 10 GBase-T Network Adapter Oracle Solaris 11.3 Oracle WebLogic Server 12c (12.2.1) Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.8.0_92   Database Server: 1 x SPARC S7-2 server, with 2 x SPARC S7 processor (4.27 GHz) 512 GB memory (16 x 32 GB) 2 x 600 GB SAS-3 HDD 1 x Sun Dual Port 10 GBase-T Network Adapter 2 x Sun Storage 16 Gbit Fibre Channel Universal HBA Oracle Solaris 11.3 Oracle Database 12c (12.1.0.2)   Storage Servers: 1 x Oracle Server X6-2L (24 Slot Disk-Cage), with 2 x Intel Xeon Processor E5-2643 v4 (3.4 GHz) 256 GB memory 1 x Sun Storage 16 Gbit Fibre Channel PCI-E HBA dual port 2 x 3.2 TB NVMe SSD 2 x 600 GB SAS-3 HDD Oracle Solaris 11.3 (11.3.8.0.4)   1 x Oracle Server X6-2L (24 Slot Disk Cage), with 2 x Intel Xeon Processor E5-2643 v4 (3.4 GHz) 256 GB memory 1 x Sun Storage 16 Gb Fibre Channel PCI-E HBA dual port 14 x 600 GB 10k RPM SAS-3 HDD Oracle Solaris 11.3 (11.3.8.0.4)   1 x Brocade 6510 16 Gb FC switch   Benchmark Description SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The SPECjEnterprise2010 benchmark has been designed and developed to cover the Java EE 5 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems, The web zone, servlets, and web services The EJB zone JPA 1.0 Persistence Model JMS and Message Driven Beans Transaction management Database connectivity Moreover, SPECjEnterprise2010 also heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network. 
The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second (SPECjEnterprise2010 EjOPS). The primary metric for the SPECjEnterprise2010 benchmark is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is NO price/performance metric in this benchmark. See Also SPECjEnterprise2010 Results Page SPARC S7-2 Result Page at SPEC Encrypted SPARC S7-2 Result Page at SPEC SPARC S7-2 Server oracle.com    OTN    Blog Oracle Server X6-2L oracle.com    OTN    Blog Oracle Solaris oracle.com    OTN    Blog Oracle Database oracle.com    OTN    Blog Oracle Fusion Middleware oracle.com    OTN    Blog Oracle WebLogic Suite oracle.com    OTN    Blog Oracle Database – Transparent Data Encryption oracle.com    OTN    Blog   Disclosure Statement SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance  Evaluation Corporation. Results from www.spec.org as of 7/6/2016.  SPARC S7-2, 14,400.78 SPECjEnterprise2010 EjOPS (unsecure); SPARC S7-2, 14,121.47  SPECjEnterprise2010 EjOPS (secure); Oracle Server X6-2, 27,803.39 SPECjEnterprise2010 EjOPS  (unsecure); Oracle Server X5-2, 21,504.30 SPECjEnterprise2010 EjOPS (unsecure); IBM Power S824,  22,543.34 SPECjEnterprise2010 EjOPS (unsecure); IBM System x3650 M5, 19,282.14  SPECjEnterprise2010 EjOPS (unsecure).


Benchmark

SPECjEnterprise2010: Oracle Server X6-2 Top x86 Result

Two Oracle Server X6-2 systems, using the Intel Xeon E5-2699 v4 processor, produced a world record for x86 systems SPECjEnterprise2010 benchmark result of 27,803.39 SPECjEnterprise2010 EjOPS.  One Oracle Server X6-2 system ran the application tier and the second Oracle Server X6-2 system ran the database tier. The Oracle Server X6-2 system demonstrated 44% better performance when compared to the IBM X3650 M5 server result of 19,282.14 SPECjEnterprise2010 EjOPS. The Oracle Server X6-2 system demonstrated 29% better performance when compared to the previous generation Oracle Server X5-2 system result of 21,504.30 SPECjEnterprise2010 EjOPS. This result used Oracle WebLogic Server 12c, Java HotSpot 64-Bit Server 1.8.0_91, Oracle Database 12c, and Oracle Linux.   Performance Landscape Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results. The table below shows the top two-chip x86 server results.   SPECjEnterprise2010 Performance Chart as of 7/6/2016 Submitter EjOPS* Application Server Database Server Oracle 27,803.39 1 x Oracle Server X6-2 2 x 2.2 GHz Intel Xeon E5-2699 v4 Oracle WebLogic 12c (12.2.1) 1 x Oracle Server X6-2 2 x 2.2 GHz Intel Xeon E5-2699 v4 Oracle Database 12c (12.1.0.2) Oracle 21,504.30 1 x Oracle Server X5-2 2 x 2.3 GHz Intel Xeon E5-2699 v3 Oracle WebLogic 12c (12.1.3) 1 x Oracle Server X5-2 2 x 2.3 GHz Intel Xeon E5-2699 v3 Oracle Database 12c (12.1.0.2) IBM 19,282.14 1 x IBM X3650 M5 2 x 2.6 GHz Intel Xeon E5-2697 v3 WebSphere Application Server V8.5 1 x IBM X3850 X6 4 x 2.8 GHz Intel Xeon E7-4890 v2 IBM DB2 10.5   * EjOPS – SPECjEnterprise2010 EjOPS, bigger is better.   Configuration Summary Application Server: 1 x Oracle Server X6-2 2 x Intel Xeon Processor E5-2699 v4 (2.2 GHz) 256 GB memory 3 x 10 GbE NIC Oracle Linux 6.7 (kernel-4.1.12-37.2.2.el6uek.x86_64) Oracle WebLogic Server 12c (12.2.1) Java HotSpot(TM) 64-Bit Server VM on Linux, version 1.8.0_91 (Java SE 8 Update 91)   Database Server: 1 x Oracle Server X6-2 2 x Intel Xeon Processor E5-2699 v4 (2.2 GHz) 512 GB memory 2 x 10 GbE NIC 1 x 16 Gb FC HBA 2 x Oracle Server X5-2L Storage Oracle Linux 6.7 (kernel-4.1.12-37.2.2.el6uek.x86_64) Oracle Database 12c Enterprise Edition Release 12.1.0.2   Benchmark Description SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The SPECjEnterprise2010 benchmark has been designed and developed to cover the Java EE 5 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems. The workload consists of an end to end web based order processing domain, an RMI and Web Services driven manufacturing domain and a supply chain model utilizing document based Web Services. The application is a collection of Java classes, Java Servlets, Java Server Pages, Enterprise Java Beans, Java Persistence Entities (pojo's) and Message Driven Beans. The SPECjEnterprise2010 benchmark heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network. The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second ("SPECjEnterprise2010 EjOPS"). 
This metric is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is no price/performance metric in this benchmark. Key Points and Best Practices Four Oracle WebLogic server instances were started using numactl binding 2 instances per chip. Four Oracle database listener processes were started, 2 processes bound per processor. Additional tuning information is in the report at spec.org. COD (Cluster on Die) is enabled in the BIOS on the application server. See Also SPECjEnterprise2010 Results Page Oracle Server X6-2 oracle.com    OTN    Blog Oracle Linux oracle.com    OTN    Blog Oracle Database oracle.com    OTN    Blog Oracle WebLogic Suite oracle.com    OTN    Blog   Disclosure Statement SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance Evaluation Corporation. Oracle Server X6-2, 27,803.39 SPECjEnterprise2010 EjOPS; Oracle Server X5-2, 21,504.30 SPECjEnterprise2010 EjOPS; IBM System x3650 M5, 19,282.14 SPECjEnterprise2010 EjOPS. Results from www.spec.org as of 7/6/2016.


Benchmark

Simultaneous OLTP & In-memory Analytics: SPARC S7-2 Advantage Per Core Under Load Compared to 2-Chip x86 E5-2699 v3

A goal of the modern business is real-time enterprise where analytics are run simultaneously with transaction processing on the same system to provide the most effective decision making. Oracle Database 12c Enterprise Edition utilizing the Oracle In-Memory option is designed for the same database to be able to perform transactions at the highest performance and to perform analytical calculations that once took days or hours to complete orders of magnitude faster. Oracle's SPARC S7 processor has deep innovations to take the real-time enterprise to the next level of performance. In this test both OLTP transactions and analytical queries were run in a single database instance using all of the same features of Oracle Database 12c Enterprise Edition including the Oracle In-Memory option in order to compare the advantages of the SPARC S7 processor to the Intel Xeon Processor E5-2699 v3. The SPARC S7 processor uses the Data Analytics Accelerator (DAX). DAX is not a SIMD instruction, but rather an actual co-processor that offloads in-memory queries which frees the cores up for other processing.  The DAX has direct access to the memory bus and can execute scans at near full memory bandwidth. Oracle makes the DAX API available to other applications, so this kind of acceleration is not just available to Oracle Database. Oracle's SPARC S7-2 server ran the in-memory analytics RCDB based queries 2.3x faster per chip under load than a two-chip x86 Intel Xeon Processor E5-2699 v3 server on the 24 stream test. Furthermore, the SPARC S7-2 server ran the in-memory analytics RCDB based queries 5.1x faster per core under load than the same x86 server. The SPARC S7-2 server and the two-chip Intel Xeon Processor E5-2699 v3 server both ran OLTP transactions and the in-memory analytics on the same database instance using Oracle Database 12c Enterprise Edition utilizing the Oracle In-Memory option. Performance Landscape The table below compares the SPARC S7-2 server and 2-chip x86 Intel Xeon Processor E5-2699 v3 server while running OLTP and in-memory analytics against tables in the same database instance. The same set of transactions and queries were executed on each system.  All of the following results were run as part of this benchmark effort.   Real-Time Enterprise Performance Chart 24 RCDB DSS Streams, 112 OLTP users Simultaneous Mixed Workload SPARC S7-2 2-Chip x86 E5 v3 SPARC Per Chip Advantage SPARC Per Core Advantage OLTP Transactions (Transactions per Second) 195,790 216,302 0.9x 2.0x Analytic Queries (Queries per Minute) 107 47 2.3x 5.1x Configuration Summary SPARC Server: 1 X SPARC S7-2 server 2 X SPARC S7 processor 512 GB Memory Oracle Solaris 11.3 Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 x86 Server: 1 X Oracle Server X5-2L 2 X Intel Xeon Processor E5-2699 v3 256 GB Memory Oracle Linux 6.5 (3.8.13-16.2.1.el6uek.x86_64) Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 Benchmark Description The Real-Time Enterprise benchmark simulates the demands of customers who simultaneously run both their OLTP database and the related historical warehouse DSS data that would be based on that OLTP data. It answers the question of how a system will perform when doing data analysis, while at the same time executing real-time on-line transactions. The OLTP workload simulates an Order Inventory System that exercises both reads and writes with a potentially large number of users that stresses the lock management and connectivity, as well as, database access. 
The number of customers, orders and users is fully parameterized. This benchmark is based on a 100 GB dataset, 15 million customers, 600 million orders and up to 580 users. The workload consists of a number of transaction types including show-expenses, part-cost, supplier-phone, low-inv, high-inv, update-price, update-phone, update-cost, and new-order.

The real cardinality database (RCDB) schema was created to showcase the potential speedup one may see moving from an on-disk, row-format data warehouse/star schema to utilizing Oracle Database 12c's In-Memory feature for analytical queries. The DSS workload consists of as many as 2,304 unique queries asking questions such as "In 2014, what was the total revenue of single item orders?" or "In August 2013, how many orders exceeded a total price of $50?" Questions like these can help a company see where to focus for further revenue growth or identify weaknesses in their offerings. RCDB scale factor 1050 represents a 1.05 TB data warehouse. It is transformed into a star schema of 1.0 TB, and then becomes 110 GB in size when loaded in memory. It consists of 1 fact table and 4 dimension tables with over 10.5 billion rows. There are 56 columns, with most cardinalities varying between 5 and 2,000, a primary key being an example of something outside this range.

The results were obtained running a set of OLTP transactions and analytic queries simultaneously against two schemas: a real-time online orders system and a related historical orders schema configured as a real cardinality database (RCDB) star schema. The in-memory analytics RCDB queries are executed using the Oracle Database 12c In-Memory columnar feature. Two reports are generated: one for the OLTP-Perf workload and one for the RCDB DSS workload. For the analytical DSS workload, queries per minute and average query elapsed times are reported. For the OLTP-Perf workload, both transactions-per-second in thousands and OLTP average response times in milliseconds are reported.

Key Points and Best Practices

This benchmark utilized the SPARC S7 processor's DAX coprocessor for query acceleration. All SPARC S7-2 server results were run with out-of-the-box tuning for Oracle Solaris. All Oracle Server X5-2L system results were run with out-of-the-box tunings for Oracle Linux, except for the following setting in /etc/sysctl.conf to get large pages for the Oracle Database:

vm.nr_hugepages=98304

To create an in-memory area, the following was added to the init.ora:

inmemory_size = 120g

An example of how to set a table to be in memory is below:

ALTER TABLE CUSTOMER INMEMORY MEMCOMPRESS FOR QUERY HIGH

(A minimal JDBC sketch of querying such an in-memory table appears at the end of this post.)

See Also

SPARC S7-2 Server oracle.com    OTN    Blog
Oracle Server X5-2L oracle.com    OTN    Blog
Oracle Solaris oracle.com    OTN    Blog
Oracle Database oracle.com    OTN    Blog
Oracle Database In-Memory oracle.com    OTN    Blog

Disclosure Statement

Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of June 29, 2016. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
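As a rough illustration of the analytic half of this mixed workload, the sketch below issues one DSS-style aggregate query over JDBC; while it runs, the same database instance can continue to serve OLTP sessions. The connection URL, credentials, and the ORDERS table with its ORDER_DATE and TOTAL_PRICE columns are hypothetical placeholders (the benchmark's RCDB schema is not published here), and the Oracle JDBC driver must be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Minimal sketch: run one analytic (DSS-style) query against an Oracle Database
// instance whose tables are populated into the In-Memory column store.
// Schema, URL and credentials below are illustrative placeholders only.
public class InMemoryAnalyticsQuery {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@//dbhost:1521/orclpdb";   // placeholder
        try (Connection conn = DriverManager.getConnection(url, "bench", "bench")) {
            // Example question from the benchmark description:
            // "In August 2013, how many orders exceeded a total price of $50?"
            String sql = "SELECT COUNT(*) FROM orders "
                       + "WHERE order_date BETWEEN DATE '2013-08-01' AND DATE '2013-08-31' "
                       + "AND total_price > 50";
            try (PreparedStatement ps = conn.prepareStatement(sql);
                 ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    System.out.println("Qualifying orders: " + rs.getLong(1));
                }
            }
        }
    }
}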


Benchmark

Yahoo Cloud Serving Benchmark: SPARC S7-2 and Oracle NoSQL Advantage Over x86 E5-2699 v4 Server Per Core Under Load

Oracle's SPARC S7-2 server delivered 341 Kops/sec on 300 million records for the Yahoo Cloud Serving Benchmark (YCSB) running a 95% read, 5% update workload using Oracle NoSQL Database 4.0. NoSQL is important for Big Data Analysis and for Cloud Computing. The SPARC S7-2 server was 1.9 times faster per core than a two-chip x86 E5-2699 v4 server running YCSB with a 95% read, 5% update workload.

Performance Landscape

The table below compares the SPARC S7-2 server and the 2-chip x86 E5-2699 v4 server. All of the following results were run as part of this benchmark effort.

YCSB Benchmark Performance
System | Insert Throughput (ops/sec) | Insert Ave Write Latency (msec) | Mixed Load (95% Read, 5% Update) Throughput (ops/sec) | Mixed Ave Read Latency (msec) | Mixed Ave Write Latency (msec) | Throughput per Core
SPARC S7-2, 2 x SPARC S7 (2x 8core) | 81,777 | 2.64 | 340,766 | 0.80 | 2.81 | 21,298
x86 E5 v4 server, 2 x E5-2699 v4 (2x 22core) | 155,498 | 1.38 | 502,273 | 0.46 | 1.16 | 11,415

Configuration Summary

SPARC System:
SPARC S7-2 server
2 x SPARC S7 processors
512 GB memory
2 x Oracle Flash Accelerator F320 PCIe card
1 x Built-in 10 GbE PCIe port

x86 System:
Oracle Server X6-2L server
2 x Intel Xeon Processor E5-2699 v4
512 GB memory
2 x Oracle Flash Accelerator F320 PCIe card
1 x Sun Dual Port 10 GbE PCIe 2.0 Low Profile Adapter

Software Configuration (for both systems):
Oracle Solaris 11.3 (11.3.8.7.0)
Oracle NoSQL Database, Enterprise Edition 12c R1.4.0.9
Java(TM) SE Runtime Environment (build 1.8.0_92-b14)

Benchmark Description

The Yahoo Cloud Serving Benchmark (YCSB) is a performance benchmark for cloud databases and their systems. The benchmark documentation says: With the many new serving databases available including Sherpa, BigTable, Azure and many more, it can be difficult to decide which system is right for your application, partially because the features differ between systems, and partially because there is not an easy way to compare the performance of one system versus another. The goal of the Yahoo Cloud Serving Benchmark (YCSB) project is to develop a framework and common set of workloads for evaluating the performance of different "key-value" and "cloud" serving stores.

Key Points and Best Practices

The 300 million records were loaded into 4 shards with the replication factor set to 3. Four processor sets were created to host the 4 Storage Nodes. The default processor set was additionally used for OS and IO interrupts. The processor sets were used for isolation and to ensure a balanced load. A fixed priority class was assigned to the Oracle NoSQL Storage Node java processes. The ZFS record size was set to 16K (default 128K), which worked best for the 95% read, 5% update workload. A Sun Server X4-2L system was used as the client for generating the workload. The server and client systems were connected through a 10 GbE network. (A minimal sketch of the key/value access pattern appears at the end of this post.)

See Also

Yahoo Cloud Serving Benchmark
YCSB Source
SPARC S7-2 Server oracle.com    OTN    Blog
Oracle Server X6-2L oracle.com    OTN    Blog
Oracle Flash Accelerator F320 PCIe Card oracle.com    OTN
Oracle Solaris oracle.com    OTN    Blog
Oracle NoSQL Database oracle.com    OTN    Blog

Disclosure Statement

Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of June 29, 2016.
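For readers unfamiliar with the store being benchmarked, the sketch below shows the basic key/value read and update pattern that a YCSB-style workload drives against Oracle NoSQL Database. It is a minimal illustration based on the classic oracle.kv key/value API; the store name, helper host:port, and key layout are placeholders, and exact class and method signatures should be checked against the Oracle NoSQL Database 4.0 documentation.

import java.util.Arrays;
import oracle.kv.KVStore;
import oracle.kv.KVStoreConfig;
import oracle.kv.KVStoreFactory;
import oracle.kv.Key;
import oracle.kv.Value;
import oracle.kv.ValueVersion;

// Minimal sketch of the read/update pattern a YCSB-style client drives against
// Oracle NoSQL Database. Store name, helper host, and key layout are placeholders.
public class KvReadUpdateSketch {
    public static void main(String[] args) {
        KVStore store = KVStoreFactory.getStore(
                new KVStoreConfig("kvstore", "nosqlhost:5000"));   // placeholders

        Key key = Key.createKey(Arrays.asList("usertable", "user1000"));

        // Update path (5% of operations in the mixed workload): write a new value.
        store.put(key, Value.createValue("field0=value0".getBytes()));

        // Read path (95% of operations): fetch the current value for the key.
        ValueVersion vv = store.get(key);
        if (vv != null) {
            System.out.println(new String(vv.getValue().getValue()));
        }

        store.close();
    }
}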


Benchmark

Computational Graph Algorithms: SPARC T7-4 Performance on PageRank and Single-Source Shortest Path

Computational graph algorithms are used in many big data and analytics workloads. The algorithm kernels inherently have a large degree of parallelism and a very large number of random accesses to memory. Oracle's SPARC M7 processor based systems provide better performance than an x86 server with the Intel Xeon Processor E7-8895 v3. Oracle's SPARC T7-4 server with four SPARC M7 processors evaluating PageRank on large graphs was able to deliver 1.2x to 1.5x better per-core performance than a four-chip x86 server with Intel Xeon Processors E7-8895 v3. The SPARC T7-4 server with four SPARC M7 processors computing Single-Source Shortest Path (SSSP) using the Bellman-Ford algorithm on large graphs was able to deliver 1.4x to 1.5x better per-core performance than a four-chip x86 server with Intel Xeon Processor E7-8895 v3.

Fifty PageRank iterations were run. SSSP repeated the computation from 3 different source vertices. The Graph 500 Scale 32 graph has 1,650 million vertices, 68.7 billion edges and requires 4 TB of memory to analyze. The Graph 500 Scale 30 graph has 448 million vertices, 17.2 billion edges and requires a computer with 1 TB of memory to analyze. The Graph 500 Scale 29 graph has 233 million vertices, 8.6 billion edges and requires 500 GB of memory to analyze.

Performance Landscape

The graphs are identified by "Scale".

Graph Algorithm | Graph 500 Scale | x86 4-chip (sec) | SPARC T7-4 4-chip (sec) | SPARC Per Chip Advantage | SPARC Per Core Advantage
PageRank | 30 | 136.7 | 62.6 | 2.1x | 1.2x
PageRank | 29 | 72.1 | 27.6 | 2.6x | 1.5x
SSSP Bellman-Ford | 30 | 39.2 | 14.7 | 2.7x | 1.5x
SSSP Bellman-Ford | 29 | 21.3 | 8.5 | 2.5x | 1.4x

Both graph algorithms were also run on a 12-processor SPARC M7-16 server with Graph 500 Scale 32.

Graph Algorithm | Graph 500 Scale | SPARC M7-16 using 1 chip (sec) | SPARC M7-16 using 12 chips (sec)
PageRank | 32 | 1,600.9 | 98.4
SSSP Bellman-Ford | 32 | 405.9 | 31.5

Configuration Summary

Systems Under Test:

SPARC T7-4 server with
4 x SPARC M7 processors (4.13 GHz)
2 TB memory
Oracle Solaris 11.3
Oracle Developer Studio 12.5

Oracle Server X5-4 system with
4 x Intel Xeon E7-8895 v3 processors (2.6 GHz), hyperthreading enabled
1 TB memory
Oracle Linux Server release 7.1
gcc version 6.1.0 (CDS 27-Apr-2016)

SPARC M7-16 server with
12 x SPARC M7 processors (4.13 GHz), configurable from 4 to 16 processors
6 TB memory
Oracle Solaris 11.3
Oracle Developer Studio 12.5

Benchmark Description

Computational graphs are a core part of many analytics workloads; they are very data intensive and place heavy stress on a computer's memory system. Each algorithm typically traverses the entire graph multiple times, while doing certain arithmetic operations during the traversal. These benchmarks generate a combination of sequential memory access patterns (streaming through the structure of the graph) and random accesses to the targets of graph edges. Two computational graph algorithms were run, PageRank and the Bellman-Ford algorithm for Single-Source Shortest Path (SSSP), using the Graph 500 graphs at scale 29, 30 and 32.

The mathematics of PageRank are entirely general and apply to any graph or network in any domain. Thus, PageRank is now regularly used in bibliometrics, social and information network analysis, and for link prediction and recommendation. The PageRank algorithm counts the number and quality of links to a page to determine a rough estimate of the page's importance. Graph vertices and edges are represented as 64-bit integers. Double precision floating point arithmetic is used for PageRank values. The benchmark runs 50 PageRank iterations.
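For readers who want to see what one PageRank iteration involves, here is a compact, unoptimized Java sketch of the algorithm described above. It is a teaching illustration, not the tuned benchmark implementation; the tiny example graph, the damping factor of 0.85 and the iteration count are arbitrary choices.

import java.util.Arrays;

// Unoptimized PageRank sketch over an adjacency-list graph (push formulation).
// Damping factor, iteration count, and the tiny example graph are illustrative.
public class PageRankSketch {
    public static double[] pageRank(int[][] outEdges, int iterations, double d) {
        int n = outEdges.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);                 // start from a uniform distribution

        for (int iter = 0; iter < iterations; iter++) {
            double[] next = new double[n];
            double danglingMass = 0.0;              // rank held by vertices with no out-edges

            for (int v = 0; v < n; v++) {
                if (outEdges[v].length == 0) {
                    danglingMass += rank[v];
                } else {
                    double share = rank[v] / outEdges[v].length;
                    for (int target : outEdges[v]) {
                        next[target] += share;      // push this vertex's rank to its targets
                    }
                }
            }
            // Combine teleportation, dangling mass, and propagated rank.
            for (int v = 0; v < n; v++) {
                next[v] = (1.0 - d) / n + d * (next[v] + danglingMass / n);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // 4-vertex example graph: vertex -> list of out-neighbors.
        int[][] graph = { {1, 2}, {2}, {0}, {2} };
        System.out.println(Arrays.toString(pageRank(graph, 50, 0.85)));
    }
}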
The Bellman-Ford algorithm for SSSP, Single-Source Shortest Path, finds the shortest paths from a source vertex to all other vertices in the graph. It is often used to find the shortest distance between two points, as in Google Maps, and it is also used in operations research and "six degrees of separation" analysis. Graph vertices and edges are represented as 64-bit integers. Double precision floating point distances are held on each graph edge. The result provides the shortest distance from a root vertex to every other vertex, and each vertex's predecessor along this shortest path. The benchmark repeats this computation from 3 different source vertices.

Characteristics of the Graphs

Graph 500 Scale | Connected Vertices | Edges | Runtime Memory Required
32 | 1,650 M | 68.7 B | 4 TB
30 | 448 M | 17.2 B | 1 TB
29 | 233 M | 8.6 B | 500 GB

Note: M is 10**6, B is 10**9.

Key Points and Best Practices

Computational graph algorithms benefit from the faster memory bandwidth of the SPARC T7-4 server. The memory bandwidth as measured by STREAM is 258,868 MB/sec on the Oracle Server X5-4 and 555,374 MB/sec on the SPARC T7-4 server. Each algorithm was highly tuned for the memory hierarchy of each processor's architecture and its server.

See Also

PageRank
SSSP, Bellman-Ford algorithm
SPARC T7-4 Server oracle.com    OTN    Blog
Oracle Server X5-4 oracle.com    OTN    Blog
SPARC M7-16 Server oracle.com    OTN    Blog
Oracle Solaris oracle.com    OTN    Blog
Oracle Developer Studio oracle.com    OTN

Disclosure Statement

Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of June 29, 2016.


Benchmark

Oracle Advanced Analytics: SPARC T7-4 Beats 4-Chip x86 E7 v3

Oracle's SPARC T7-4 server can deliver up to 8.6x better performance than a four-chip x86 Intel Xeon Processor E7-8895 v3 server running Oracle Advanced Analytics data mining features for scoring/prediction analysis. The SPARC T7-4 server can deliver up to 3.5x faster performance than a four-chip x86 Intel Xeon Processor E7-8895 v3 server running Oracle Advanced Analytics data mining features for training/learning analysis.

For these scoring/prediction algorithms the SPARC T7-4 server is compared to a four-chip Intel Xeon Processor E7-8895 v3 based server on both a system and per core basis. For the Support Vector Machine algorithm using the Interior Point Method solver (SVM IPM), the SPARC server is 8.6x faster than the x86 server and has a 4.8x advantage per core under load. For the Generalized Linear Model Regression algorithm (GLM Regression), the SPARC server is 6.6x faster than the x86 server and has a 3.7x advantage per core under load. For the Generalized Linear Model Classification algorithm (GLM Classification), the SPARC server is 6.2x faster than the x86 server and has a 3.5x advantage per core under load. For the Support Vector Machine algorithm using the Stochastic Gradient Descent solver (SVM SGD solver), the SPARC server is 5.5x faster than the x86 server and has a 3.1x advantage per core under load. For the K-Means algorithm, the SPARC server is 6.3x faster than the x86 server and has a 3.6x advantage per core under load. For the Expectation Maximization algorithm, the SPARC server is 6.1x faster than the x86 server and has a 3.4x advantage per core under load.

For these training/learning algorithms the SPARC T7-4 server is compared to a four-chip Intel Xeon Processor E7-8895 v3 based server on both a system and per core basis. For the Support Vector Machine algorithm using the Interior Point Method solver (SVM IPM), the SPARC server is 3.6x faster than the x86 server and has a 2.0x advantage per core under load. For the Generalized Linear Model Regression algorithm (GLM Regression), the SPARC server is 1.4x faster than the x86 server. For the Generalized Linear Model Classification algorithm (GLM Classification), the SPARC server is 2.1x faster than the x86 server and has a 1.2x advantage per core under load. For the Support Vector Machine algorithm using the Stochastic Gradient Descent solver (SVM SGD solver), the SPARC server is 1.9x faster than the x86 server and has a 1.1x advantage per core under load. For the K-Means algorithm, the SPARC server is 1.4x faster than the x86 server. For the Expectation Maximization algorithm, the SPARC server is 1.7x faster than the x86 server.

Oracle Advanced Analytics is an option of Oracle Database. Training/learning is the part of Machine Learning (ML) and statistics that analyzes a sample of data to create a model of what is most interesting for the desired analysis. Typically, this is a compute-intensive operation that involves many 64-bit floating-point calculations. The output of the training/learning stage is a model that can analyze huge datasets in a stage called scoring and/or prediction. While training/learning is a very important task, typically most time will be spent in the scoring/prediction stage.

Performance Landscape

All of the following results were run as part of this benchmark effort.
Oracle Advanced Analytics Summary Scoring/Prediction Method Attributes Run Time (sec) SPARC per chip Advantage SPARC per core Advantage x86 E7 v3 72 cores total SPARC T7-4 128 cores total Supervised SVM IPM Solver 900 206 24 8.6x 4.8x GLM Regression 900 166 25 6.6x 3.7x GLM Classification 900 156 25 6.2x 3.5x SVM SGD Solver
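For clarity on how the per-core figures in the table above relate to the raw run times: the whole-system advantage is simply the ratio of run times, while the per-core advantage additionally accounts for the different core counts given in the table header (72 cores for the 4-chip x86 E7 v3 server versus 128 cores for the SPARC T7-4). Taking the SVM IPM scoring row as a worked example: 206 sec / 24 sec ≈ 8.6x per system (and per chip, since both are 4-chip servers), and 8.6 x (72 / 128) ≈ 4.8x per core.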


Benchmark

AES Encryption: SPARC S7 Performance, Beats Intel E5-2699 v4 Per Core Under Load

Oracle's cryptography benchmark measures security performance on important AES security modes. Oracle's SPARC S7 processor with its software in silicon security is faster per core than x86 servers that have the AES-NI instructions. In this test, the performance of on-processor encryption operations is measured (32 KB encryptions). Multiple threads are used to measure each processor's maximum throughput. SPARC S7 processors ran 2.3 times faster per core executing AES-CFB 256-bit key encryption (in cache) than the Intel Xeon Processor E5-2699 v4 (with AES-NI). SPARC S7 processors ran 2.2 times faster per core executing AES-CFB 128-bit key encryption (in cache) than the Intel Xeon Processor E5-2699 v4 (with AES-NI). SPARC S7 processors ran 2.3 times faster per core executing AES-CFB 256-bit key encryption (in cache) than Intel Xeon Processor E5-2699 v3 (with AES-NI). SPARC S7 processors ran 2.2 times faster per core executing AES-CFB 128-bit key encryption (in cache) than Intel Xeon Processor E5-2699 v3 (with AES-NI). AES-CFB encryption is used by Oracle Database for Transparent Data Encryption (TDE) which provides security for database storage. Oracle has also measured SHA digest performance on the SPARC S7 processor. Performance Landscape Presented below are results for running encryption using the AES cipher with the CFB, CBC, GCM and CCM modes for key sizes of 128, 192 and 256. Decryption performance was similar and is not presented. Results are presented as MB/sec (10**6). All SPARC S7 processor results were run as part of this benchmark effort. All other results were run during previous benchmark efforts. Encryption Performance – AES-CFB (used by Oracle Database) Performance is presented for in-cache AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performance on 32 KB of pseudo-random data (same data for each run). AES-CFB Two-chip Microbenchmark Performance (MB/sec) Processor GHz Cores Perf Perf/Core Software Environment AES-256-CFB SPARC M7 4.13 64 126,948 1,984 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 53,794 1,681 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 44 39,034 887 Oracle Linux 7.2, IPP/AES-NI SPARC S7 4.26 16 32,791 2,049 Oracle Solaris 11.3, libsoftcrypto + libumem Intel E5-2699 v3 2.30 36 31,924 887 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 24 19,964 832 Oracle Linux 6.5, IPP/AES-NI AES-192-CFB SPARC M7 4.13 64 144,299 2,255 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 60,736 1,898 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 44 45,351 1,031 Oracle Linux 7.2, IPP/AES-NI Intel E5-2699 v3 2.30 36 37,157 1,032 Oracle Linux 6.5, IPP/AES-NI SPARC S7 4.26 16 37,295 2,331 Oracle Solaris 11.3, libsoftcrypto + libumem Intel E5-2697 v2 2.70 24 23,218 967 Oracle Linux 6.5, IPP/AES-NI AES-128-CFB SPARC M7 4.13 64 166,324 2,599 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 68,691 2,147 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 44 54,179 1,231 Oracle Linux 7.2, IPP/AES-NI Intel E5-2699 v3 2.30 36 44,388 1,233 Oracle Linux 6.5, IPP/AES-NI SPARC S7 4.26 16 43,145 2,697 Oracle Solaris 11.3, libsoftcrypto + libumem Intel E5-2697 v2 2.70 24 27,755 1,156 Oracle Linux 6.5, IPP/AES-NI Encryption Performance – AES-CBC Performance is presented for in-cache AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. 
The encryption was performed on 32 KB of pseudo-random data (same data for each run). AES-CBC Two-chip Microbenchmark Performance (MB/sec) Processor GHz Cores Perf Perf/Core Software Environment AES-256-CBC SPARC M7 4.13 64 134,278 2,098 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 56,788 1,775 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 44 38,943 885 Oracle Linux 7.2, IPP/AES-NI SPARC S7 4.26 16 34,733 2,171 Oracle Solaris 11.3, libsoftcrypto + libumem Intel E5-2699 v3 2.30 36 31,894 886 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 24 19,961 832 Oracle Linux 6.5, IPP/AES-NI AES-192-CBC SPARC M7 4.13 64 152,961 2,390 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 63,937 1,998 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 44 45,285 1,029 Oracle Linux 7.2, IPP/AES-NI SPARC S7 4.26 16 39,654 2,478 Oracle Solaris 11.3, libsoftcrypto + libumem Intel E5-2699 v3 2.30 36 37,021 1,028 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 24 23,224 968 Oracle Linux 6.5, IPP/AES-NI AES-128-CBC SPARC M7 4.13 64 175,151 2,737 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 72,870 2,277 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 44 54,076 1,229 Oracle Linux 7.2, IPP/AES-NI SPARC S7 4.26 16 46,788 2,924 Oracle Solaris 11.3, libsoftcrypto + libumem Intel E5-2699 v3 2.30 36 44,103 1,225 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 24 27,730 1,155 Oracle Linux 6.5, IPP/AES-NI Encryption Performance – AES-GCM (used by ZFS Filesystem) Performance is presented for in-cache AES-GCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run). AES-GCM Two-chip Microbenchmark Performance (MB/sec) Processor GHz Cores Perf Perf/Core Software Environment AES-256-GCM SPARC M7 4.13 64 74,221 1,160 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 34,022 1,063 Oracle Solaris 11.2, libsoftcrypto + libumem SPARC S7 4.26 16 20,559 1,285 Oracle Solaris 11.3, libsoftcrypto + libumem Intel E5-2697 v2 2.70 24 15,338 639 Oracle Solaris 11.1, libsoftcrypto + libumem AES-192-GCM SPARC M7 4.13 64 81,448 1,273 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 36,820 1,151 Oracle Solaris 11.2, libsoftcrypto + libumem SPARC S7 4.26 16 22,326 1,395 Oracle Solaris 11.3, libsoftcrypto + libumem Intel E5-2697 v2 2.70 24 15,768 637 Oracle Solaris 11.1, libsoftcrypto + libumem AES-128-GCM SPARC M7 4.13 64 86,223 1,347 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 38,845 1,214 Oracle Solaris 11.2, libsoftcrypto + libumem SPARC S7 4.26 16 23,931 1,496 Oracle Solaris 11.3, libsoftcrypto + libumem Intel E5-2697 v2 2.70 24 16,405 684 Oracle Solaris 11.1, libsoftcrypto + libumem Encryption Performance – AES-CCM (alternative used by ZFS Filesystem) Performance is presented for in-cache AES-CCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run). 
AES-CCM Two-chip Microbenchmark Performance (MB/sec) Processor GHz Cores Perf Perf/Core Software Environment AES-256-CCM SPARC M7 4.13 64 67,669 1,057 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 28,909 903 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 24 19,447 810 Oracle Linux 6.5, IPP/AES-NI SPARC S7 4.26 16 17,504 1,094 Oracle Solaris 11.3, libsoftcrypto + libumem AES-192-CCM SPARC M7 4.13 64 77,711 1,214 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 33,116 1,035 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 24 22,634 943 Oracle Linux 6.5, IPP/AES-NI SPARC S7 4.26 16 20,085 1,255 Oracle Solaris 11.3, libsoftcrypto + libumem AES-128-CCM SPARC M7 4.13 64 90,729 1,418 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 32 38,529 1,204 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 24 26,951 1,123 Oracle Linux 6.5, IPP/AES-NI SPARC S7 4.26 16 23,552 1,472 Oracle Solaris 11.3, libsoftcrypto + libumem Configuration Summary SPARC S7-2 server 2 x SPARC S7 processor, 4.26 GHz 1 TB memory Oracle Solaris 11.3 SPARC T7-2 server 2 x SPARC M7 processor, 4.13 GHz 1 TB memory Oracle Solaris 11.3 SPARC T5-2 server 2 x SPARC T5 processor, 3.60 GHz 512 GB memory Oracle Solaris 11.2 Oracle Server X6-2L system 2 x Intel Xeon Processor E5-2699 v4, 2.20 GHz 256 GB memory Oracle Linux 7.2 Intel Integrated Performance Primitives for Linux, Version 9.0 (Update 2) 17 Feb 2016 Oracle Server X5-2 system 2 x Intel Xeon Processor E5-2699 v3, 2.30 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014 Sun Server X4-2 system 2 x Intel Xeon Processor E5-2697 v2, 2.70 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014 Benchmark Description The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache and on-chip using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CCM, AES-192-CCM, AES-256-CCM, AES-128-GCM, AES-192-GCM and AES-256-GCM. The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance. The encryption tests were run with pseudo-random data of size 32 KB. The benchmark tests were designed to run out of cache, so memory bandwidth and latency are not the limitations. See Also More about AES SPARC S7-2 Server oracle.com     OTN     Blog SPARC T7-2 Server oracle.com     OTN     Blog SPARC T5-2 Server oracle.com     OTN Oracle Server X6-2L oracle.com     OTN     Blog Oracle Server X5-2 oracle.com     OTN     Blog Oracle Solaris oracle.com     OTN     Blog Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 6/29/2016.
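For readers who want to reproduce the general shape of this measurement, the sketch below illustrates the methodology described above: each thread repeatedly encrypts the same 32 KB pseudo-random buffer in cache, and the aggregate rate is reported in MB/sec. It is not Oracle's internal test harness; it uses OpenSSL's EVP interface as a stand-in for the platform-optimized libraries named in the tables (libsoftcrypto on Oracle Solaris, IPP/AES-NI on Oracle Linux), and the thread count and iteration count are illustrative.

/*
 * Illustrative sketch only -- NOT Oracle's internal test harness.
 * Measures in-cache AES-256-CFB encryption throughput over a 32 KB
 * buffer with one thread per core, using OpenSSL's EVP interface.
 */
#include <openssl/evp.h>
#include <openssl/rand.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BUF_SIZE   (32 * 1024)   /* 32 KB plaintext, as in the benchmark */
#define ITERATIONS 20000L        /* encryptions per thread (illustrative) */

static unsigned char key[32], iv[16];

static void *worker(void *arg)
{
    unsigned char in[BUF_SIZE], out[BUF_SIZE + EVP_MAX_BLOCK_LENGTH];
    int outl;
    (void)arg;
    RAND_bytes(in, BUF_SIZE);                 /* pseudo-random plaintext */

    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    EVP_EncryptInit_ex(ctx, EVP_aes_256_cfb128(), NULL, key, iv);
    for (long i = 0; i < ITERATIONS; i++)
        EVP_EncryptUpdate(ctx, out, &outl, in, BUF_SIZE);
    EVP_CIPHER_CTX_free(ctx);
    return NULL;
}

int main(int argc, char **argv)
{
    int nthreads = (argc > 1) ? atoi(argv[1]) : 16;   /* e.g. one per core */
    pthread_t *tids = malloc(nthreads * sizeof(pthread_t));
    RAND_bytes(key, sizeof key);
    RAND_bytes(iv, sizeof iv);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, worker, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double mb  = (double)nthreads * ITERATIONS * BUF_SIZE / 1e6;  /* MB = 10**6 */
    printf("AES-256-CFB: %.0f MB/sec across %d threads\n", mb / sec, nthreads);
    free(tids);
    return 0;
}

A build along the lines of cc -O2 -o aescfb aescfb.c -lcrypto -lpthread, run with one thread per hardware core, approximates the saturation measurement described above.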


SHA Digest Encryption: SPARC S7 Performance, Beats Intel E5-2699 v4 Per Core Under Load

Oracle's cryptography benchmark measures security performance on important Secure Hash Algorithm (SHA) functions. Oracle's SPARC S7 processor with its security software in silicon is faster than current and recent x86 servers. In this test, the performance of on-processor digest operations is measured for three sizes of plaintext inputs (64, 1024 and 8192 bytes) using three SHA2 digests (SHA512, SHA384, SHA256) and the older, weaker SHA1 digest. Multiple parallel threads are used to measure each processor's maximum throughput. Oracle's SPARC S7-2 server shows dramatically faster digest computation compared to current x86 two processor servers. SPARC S7 processors ran 7.5 times faster per core computing multiple parallel SHA512 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon Processor E5-2699 v4. SPARC S7 processors ran 7.1 times faster per core computing multiple parallel SHA256 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon Processor E5-2699 v4. SPARC S7 processors ran 2.6 times faster per core computing multiple parallel SHA1 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon Processor E5-2699 v4. SHA1 and SHA2 operations are an integral part of Oracle Solaris, while on Linux they are performed using the add-on Cryptography for Intel Integrated Performance Primitives for Linux (library). Oracle has also measured AES (CFB, GCM, CCM, CBC) cryptographic performance on the SPARC S7 processor. Performance Landscape Presented below are results for computing SHA1, SHA256, SHA384 and SHA512 digests for input plaintext sizes of 64, 1024 and 8192 bytes. Results are presented as MB/sec (10**6). All SPARC S7 processor results were run as part of this benchmark effort. All other results were run during previous benchmark efforts. Digest Performance – SHA512 Performance is presented for SHA512 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors TotalCores Performance (MB/sec) Perf/Core (MB/sec/core) 64B 1024B 8192B 64B 1024B 8192B 2 x SPARC M7, 4.13 GHz 64 39,201 167,072 184,944 613 2,611 2,890 2 x SPARC T5, 3.6 GHz 32 18,717 73,810 78,997 585 2,307 2,469 2 x SPARC S7, 4.26 GHz 16 10,231 43,099 47,820 639 2,694 2,989 2 x Intel E5-2699 v4, 2.2 GHz 44 6,973 15,412 17,616 158 350 400 2 x Intel E5-2699 v3, 2.3 GHz 36 3,949 9,214 10,681 110 256 297 2 x Intel E5-2697 v2, 2.7 GHz 24 2,681 6,631 7,701 112 276 321 Digest Performance – SHA384 Performance is presented for SHA384 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors TotalCores Performance (MB/sec) Perf/Core (MB/sec/core) 64B 1024B 8192B 64B 1024B 8192B 2 x SPARC M7, 4.13 GHz 64 39,697 166,898 185,194 620 2,608 2,894 2 x SPARC T5, 3.6 GHz 32 18,814 73,770 78,997 588 2,305 2,469 2 x SPARC S7, 4.26 GHz 16 10,315 43,158 47,763 645 2,697 2,985 2 x Intel E5-2699 v4, 2.2 GHz 44 6,909 15,353 17,618 157 349 400 2 x Intel E5-2699 v3, 2.3 GHz 36 4,061 9,263 10,678 113 257 297 2 x Intel E5-2697 v2, 2.7 GHz 24 2,774 6,669 7,706 116 278 321 Digest Performance – SHA256 Performance is presented for SHA256 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). 
Processors TotalCores Performance (MB/sec) Perf/Core (MB/sec/core) 64B 1024B 8192B 64B 1024B 8192B 2 x SPARC M7, 4.13 GHz 64 45,148 113,648 119,929 705 1,776 1,874 2 x SPARC T5, 3.6 GHz 32 21,140 49,483 51,114 661 1,546 1,597 2 x SPARC S7, 4.26 GHz 16 11,872 29,371 30,961 742 1,836 1,935 2 x Intel E5-2699 v4, 2.2 GHz 44 5,103 11,174 12,037 116 254 274 2 x Intel E5-2699 v3, 2.3 GHz 36 3,446 7,785 8,463 96 216 235 2 x Intel E5-2697 v2, 2.7 GHz 24 2,404 5,570 6,037 100 232 252 Digest Performance – SHA1 Performance is presented for SHA1 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors TotalCores Performance (MB/sec) Perf/Core (MB/sec/core) 64B 1024B 8192B 64B 1024B 8192B 2 x SPARC M7, 4.13 GHz 64 47,640 92,515 97,545 744 1,446 1,524 2 x SPARC T5, 3.6 GHz 32 21,052 40,107 41,584 658 1,253 1,300 2 x SPARC S7, 4.26 GHz 16 12,665 23,899 25,209 792 1,494 1,576 2 x Intel E5-2699 v4, 2.2 GHz 44 8,566 23,901 26,752 195 543 608 2 x Intel E5-2699 v3, 2.3 GHz 36 6,677 18,165 20,405 185 505 567 2 x Intel E5-2697 v2, 2.7 GHz 24 4,649 13,245 14,842 194 552 618 Configuration Summary SPARC S7-2 server 2 x SPARC S7 processor, 4.26 GHz 1 TB memory Oracle Solaris 11.3 SPARC T7-2 server 2 x SPARC M7 processor, 4.13 GHz 1 TB memory Oracle Solaris 11.3 SPARC T5-2 server 2 x SPARC T5 processor, 3.60 GHz 512 GB memory Oracle Solaris 11.2 Oracle Server X6-2L system 2 x Intel Xeon Processor E5-2699 v4, 2.20 GHz 256 GB memory Oracle Linux 7.2 Intel Integrated Performance Primitives for Linux, Version 9.0 (Update 2) 17 Feb 2016 Oracle Server X5-2 system 2 x Intel Xeon Processor E5-2699 v3, 2.30 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014 Sun Server X4-2 system 2 x Intel Xeon Processor E5-2697 v2, 2.70 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014 Benchmark Description The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache and on-chip using various digests, including SHA1 and SHA2 (SHA256, SHA384, SHA512). The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various digests. They were run using optimized libraries for each platform to obtain the best possible performance. The encryption tests were run with pseudo-random data of sizes 64 bytes, 1024 bytes and 8192 bytes. The benchmark tests were designed to run out of cache, so memory bandwidth and latency are not the limitations. See Also More about Secure Hash Algorithm (SHA) SPARC S7-2 Server oracle.com     OTN     Blog Oracle Server X6-2L oracle.com     OTN     Blog Oracle Solaris oracle.com     OTN     Blog Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 6/29/2016.
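The digest measurement follows the same pattern as the AES test above: many parallel threads repeatedly hash the same small buffer so the work stays in cache. The sketch below is a minimal stand-in, not Oracle's harness; it uses OpenSSL's EVP digest interface, whereas the results above were produced with libsoftcrypto (Solaris) and the Intel IPP cryptography library (Linux). Input size and thread count are illustrative parameters.

/*
 * Illustrative sketch only -- NOT Oracle's internal test harness.
 * Measures multi-threaded SHA-256 digest throughput for a fixed
 * input size (64, 1024 or 8192 bytes).
 */
#include <openssl/evp.h>
#include <openssl/rand.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static size_t msg_len = 8192;        /* plaintext size under test */
#define ITERATIONS 200000L           /* digests per thread (illustrative) */

static void *worker(void *arg)
{
    unsigned char *msg = malloc(msg_len);
    unsigned char md[EVP_MAX_MD_SIZE];
    unsigned int mdlen;
    (void)arg;
    RAND_bytes(msg, msg_len);        /* pseudo-random input, reused each pass */

    EVP_MD_CTX *ctx = EVP_MD_CTX_new();
    for (long i = 0; i < ITERATIONS; i++) {
        EVP_DigestInit_ex(ctx, EVP_sha256(), NULL);
        EVP_DigestUpdate(ctx, msg, msg_len);
        EVP_DigestFinal_ex(ctx, md, &mdlen);
    }
    EVP_MD_CTX_free(ctx);
    free(msg);
    return NULL;
}

int main(int argc, char **argv)
{
    int nthreads = (argc > 1) ? atoi(argv[1]) : 16;
    if (argc > 2) msg_len = (size_t)atol(argv[2]);
    if (nthreads > 256) nthreads = 256;
    pthread_t tids[256];

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, worker, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double mb  = (double)nthreads * ITERATIONS * msg_len / 1e6;   /* MB = 10**6 */
    printf("SHA-256, %zu-byte inputs: %.0f MB/sec\n", msg_len, mb / sec);
    return 0;
}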


In-Memory Database: SPARC S7-2 Performance

Fast analytics on large databases are critical to transforming key business processes. Oracle's SPARC S7 processors are specifically designed to accelerate in-memory analytics using Oracle Database 12c Enterprise Edition and its In-Memory option. The SPARC S7 processor outperforms an x86 E5-2699 v4 chip by up to 2.8x on analytics queries where all queries were run in-memory. In order to test real world deep analysis on the SPARC S7 processor, a scenario with over 2,300 analytical queries was run against a real cardinality database (RCDB) star schema. The SPARC S7 processor does this by using its Data Analytics Accelerator (DAX). DAX is not a SIMD instruction, but rather an actual co-processor that offloads in-memory queries to free up the cores for other processing. The DAX has direct access to the memory bus and can execute scans at near full memory bandwidth. This kind of acceleration is not just for the Oracle database. Oracle makes the DAX API available to other applications. Oracle's SPARC S7-2 server delivers up to a 2.8x Queries Per Minute speedup over a 2-chip x86 E5-2699 v4 server when executing analytical queries using the In-Memory option of Oracle Database 12c. The SPARC S7-2 server scanned over 36 billion rows per second through the database. Oracle Database In-Memory compresses the on-disk RCDB star schema by about 6x when using the Memcompress For Query High setting (more information following below) and by nearly 10x compared to a standard data warehouse row format version of the same database. This result shows the potential licensing cost advantage the SPARC S7 processor has because of DAX when compared to x86 based solutions. Because the Oracle Database core multipliers are the same, the licensing cost of the database gives the SPARC S7-2 server a 7.7x advantage. Performance Landscape RCDB Performance All of the following results were run as part of this benchmark effort. RCDB Performance Chart 2,304 Queries Comparison Point 2-Chip x86 E5-2699 v4 SPARC S7-2 SPARC Advantage Total Cores 44 16 2.75x Elapsed Time 1885 sec 675 sec 2.8x Throughput 73 qpm 205 qpm 2.8x Required Database Licenses 22 8 2.75x   SPARC S7-2 Server Licensing Advantage 7.7x Total Cores – Total processor cores in the system Elapsed Time – Run time of test Throughput – Number of queries per minute processed Required Database License – Number of database licenses, using 0.5 multiplier, see OLSA page for more Licensing Advantage – To show how DAX can help, this shows the relative throughput per licenses between the presented systems Compression This performance test was run on a Scale Factor 1750 database, which represents a 1.75 TB row format data warehouse. The database is then transformed into a star schema which ends up around 1.1 TB in size. The star schema is then loaded in memory with a setting of "MEMCOMPRESS FOR QUERY HIGH", which focuses on performance with somewhat more aggressive compression. This memory area is a separate part of the System Global Area (SGA) which is defined by the database initialization parameter "inmemory_size". See below for an example. The LINEORDER fact table, which comprises nearly the entire database size, is listed below in memory with its compression ratio.   
Column Name Original Size (Bytes) In Memory Size (Bytes) Compression Ratio LINEORDER 1,101,950,197,760 178,583,568,384 6.2x Configuration Summary SPARC Server: 1 x SPARC S7-2 server with 2 x SPARC S7 processors, 4.27 GHz 512 GB memory Oracle Solaris 11.3 Oracle Database 12c Enterprise Edition Release 12.1.0.2 Oracle Database In-Memory x86 Server: 1 x Oracle Server X6-2L system with 2 x Intel Xeon Processor E5-2699 v4, 2.2 GHz 512 GB memory Oracle Linux 7.2 (3.8.13-98.7.1.el7uek.x86_64) Oracle Database 12c Enterprise Edition Release 12.1.0.2 Oracle Database In-Memory Benchmark Description The real cardinality database (RCDB) benchmark was created to showcase the potential speedup one may see moving from on disk, row format data warehouse/Star Schema, to utilizing the Oracle Database 12c In-Memory feature for analytical queries.  All tests presented are run in-memory. The workload consists of 2,304 unique queries asking questions such as "In 2014, what was the total revenue of single item orders?" or "In August 2013, how many orders exceeded a total price of $50?" Questions like these can help a company see where to focus for further revenue growth or identify weaknesses in their offerings. RCDB scale factor 1750 represents a 1.75 TB data warehouse. It is transformed into a star schema of 1.1 TB and then becomes 179 GB in size when loaded in memory. It consists of 1 fact table and 4 dimension tables with over 10.5 billion rows. There are 56 columns with most cardinalities varying between 5 and 2,000. A primary key is an example of something outside this range. One problem with many industry standard generated databases is that as they have grown in size, the cardinalities for the generated columns have become exceedingly unrealistic. For instance, one industry standard benchmark uses a schema where at scale factor 1 TB, it calls for the number of parts to be SF * 800,000. A 1 TB database that calls for 800 million unique parts is not very realistic. Therefore RCDB attempts to take some of these unrealistic cardinalities and size them to be more representative of, at least, a section of customer data. Obviously, one cannot encompass every database in one schema. This is just an example. We carefully scaled each system so that the optimal number of users was run on each system under test so that we did not create artificial bottlenecks. Each user ran an equal portion of the total batch job (2304 queries) and the same queries were run on each system, allowing for a fair comparison of the results. Key Points and Best Practices This benchmark utilized the SPARC S7 processor's DAX for query acceleration. The batch accesses many columns that have been Run Length Encoded (RLE). The decompression of these columns can be done at a much faster rate when offloaded to DAX. Some columns are also OZIP compressed. DAX can scan and return results directly on an OZIP'd column, resulting in reduced computation and bandwidth. The batch scans large chunks of data for every query. These scans are offloaded to DAX, resulting in reduced computation and freeing up cores for other work. All SPARC S7-2 server results were run with out-of-the-box tuning for Oracle Solaris. 
All Oracle Server X6-2L system results were run with out-of-the-box tunings for Oracle Linux, except for the following setting in /etc/sysctl.conf to get large pages for the Oracle Database:
vm.nr_hugepages=64520
To create an in-memory area, the following was added to the init.ora:
inmemory_size = 200g
An example of how to set a table to be in-memory is below:
ALTER TABLE CUSTOMER INMEMORY MEMCOMPRESS FOR QUERY HIGH
See Also SPARC S7-2 Server oracle.com    OTN    Blog Oracle Server X6-2L oracle.com    OTN    Blog Oracle Solaris oracle.com    OTN    Blog Oracle Database oracle.com    OTN    Blog Oracle Database In-Memory oracle.com    OTN    Blog   Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of June 29, 2016. The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
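The 7.7x licensing advantage quoted above follows directly from the throughput and license counts in the RCDB Performance Chart, given the same 0.5 core multiplier on both platforms:

SPARC S7-2: 205 qpm / 8 licenses ≈ 25.6 qpm per license
2-chip x86 E5-2699 v4: 73 qpm / 22 licenses ≈ 3.3 qpm per license
Licensing advantage ≈ 25.6 / 3.3 ≈ 7.7x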


Memory and Bisection Bandwidth: SPARC S7 Performance

The STREAM benchmark measures delivered memory bandwidth on a variety of memory intensive tasks.  Delivered memory bandwidth is key to a server delivering high performance on a wide variety of workloads.  The STREAM benchmark is typically run where each chip in the system gets its memory requests satisfied from local memory.  This report presents performance of Oracle's SPARC S7 processor based servers and compares their performance to 2-chip x86 servers. Bisection bandwidth on a server is a measure of the cross-chip data bandwidth between the processors of a system where no memory access is local to the processor.  Systems with large cross-chip penalties show dramatically lower bisection bandwidth.  Real-world ad hoc workloads tend to perform better on systems with better bisection bandwidth because their memory usage characteristics tend to be chaotic. The STREAM benchmark is easy to run and anyone can measure memory bandwidth on a target system (see Key Points and Best Practices section). The SPARC S7-2L server delivers nearly 100 GB/sec on the STREAM benchmark.   Performance Landscape The following SPARC and x86 STREAM results were run as part of this benchmark effort.  The SPARC S7 processor based servers deliver nearly the same STREAM benchmark performance as the x86 E5-2699 v4 and v3 based servers but with significantly fewer cores (more performance available per core). Maximum STREAM Benchmark Performance System (2-Chips) Total Cores Bandwidth (MB/sec - 10^6) Copy Scale Add Triad SPARC S7-2L (16 x 32 GB) 16 98,581 93,274 96,431 96,315 SPARC S7-2 (16 x 64 GB) 16 90,285 90,163 87,178 87,051 x86 E5-2699 v4 44 120,939 121,417 129,775 130,242 x86 E5-2699 v3 (COD) 36 103,927 105,262 117,688 117,680 x86 E5-2699 v3 36 105,622 105,808 113,116 112,521 All of the following bisection bandwidth results were run as part of this benchmark effort.  The SPARC S7 processor based servers deliver more STREAM benchmark performance compared to the x86 E5-2699 v4 and v3 based servers and with significantly fewer cores  (more performance available per core). Bisection Bandwidth Benchmark Performance (Nonlocal STREAM) System (2-Chips) Total Cores Bandwidth (MB/sec - 10^6) Copy Scale Add Triad SPARC S7-2L 16 57,443 57,020 56,815 56,562 x86 E5-2699 v4 44 50,153 50,366 50,266 50,265 x86 E5-2699 v3 36 45,211 45,331 47,414 47,251 Configuration Summary SPARC Configurations: SPARC S7-2L 2 x SPARC S7 processors (4.267 GHz) 512 GB memory (16 x 32 GB dimms) SPARC S7-2 2 x SPARC S7 processors (4.267 GHz) 1 TB memory (16 x 64 GB dimms) Oracle Solaris 11.3 Oracle Developer Studio 12.5 x86 Configurations: Oracle Server X6-2 2 x Intel Xeon Processor E5-2699 v4 256 GB memory (16 x 16 GB dimms) Oracle Server X5-2 2 x Intel Xeon Processor E5-2699 v3 256 GB memory (16 x 16 GB dimms) Oracle Linux 7.1 Intel Parallel Studio XE Composer Version 2016 compilers Benchmark Description STREAM The STREAM benchmark measures sustainable memory bandwidth (in MB/s) for simple vector compute kernels. All memory accesses are sequential, so a picture of how fast regular data may be moved through the system is portrayed. Properly run, the benchmark displays the characteristics of the memory system of the machine and not the advantages of running from the system's memory caches. STREAM counts the bytes read plus the bytes written to memory. For the simple Copy kernel, this is exactly twice the number obtained from the bcopy convention. 
STREAM does this because three of the four kernels (Scale, Add and Triad) do arithmetic, so it makes sense to count both the data read into the CPU and the data written back from the CPU. The Copy kernel does no arithmetic, but, for consistency, counts bytes the same way as the other three. The sequential nature of the memory references is the benchmark's biggest weakness. The benchmark does not expose limitations in a system's interconnect to move data from anywhere in the system to anywhere. Bisection Bandwidth – Easy Modification of STREAM Benchmark To test for bisection bandwidth, processes are bound to processors in sequential order. The memory is allocated in reverse order, so that the memory is placed non-local to the process. The benchmark is then run. If the system is capable of page migration, this feature must be turned off. Key Points and Best Practices The STREAM benchmark code was compiled for the SPARC S7 processor based systems with the following flags, using the Oracle Developer Studio 12.5 C compiler:
-fast -m64 -W2,-Avector:aggressive -xautopar -xreduction -xpagesize=4m
The benchmark code was compiled for the x86 based systems with the following flags (Intel icc compiler):
-O3 -m64 -xCORE-AVX2 -ipo -openmp -mcmodel=medium -fno-alias -nolib-inline
On Oracle Solaris, binding is accomplished by setting either the environment variable SUNW_MP_PROCBIND or the OpenMP variables OMP_PROC_BIND and OMP_PLACES.
export OMP_NUM_THREADS=128
export SUNW_MP_PROCBIND=0-127
On Oracle Linux systems using the Intel compiler, binding is accomplished by setting the environment variable KMP_AFFINITY.
export OMP_NUM_THREADS=72
export KMP_AFFINITY='verbose,granularity=fine,proclist=[0-71],explicit'
The source code change in the file stream.c to do the reverse allocation is:
< for (j=STREAM_ARRAY_SIZE-1; j>=0; j--) { a[j] = 1.0; b[j] = 2.0; c[j] = 0.0; }
---
> for (j=0; j<STREAM_ARRAY_SIZE; j++) { a[j] = 1.0; b[j] = 2.0; c[j] = 0.0; }
See Also STREAM Benchmark Website SPARC S7-2 Server oracle.com    OTN    Blog SPARC S7-2L Server oracle.com    OTN    Blog Oracle Solaris oracle.com    OTN    Blog Oracle Developer Studio oracle.com    OTN Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 6/29/2016.
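To make the byte-counting convention concrete, here is a minimal OpenMP Triad kernel. It is a sketch under stated assumptions, not the official stream.c: the Triad loop reads two arrays and writes one, so the moved data is counted as 3 x 8 x N bytes for N double-precision elements. The array size is illustrative, and the parallel first-touch initialization is what places pages locally when threads are bound as described above.

/*
 * Minimal OpenMP Triad kernel illustrating how STREAM counts bytes.
 * Sketch only; the official benchmark is stream.c from the STREAM website.
 */
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N 80000000L              /* ~1.9 GB across three arrays; defeats caches */

int main(void)
{
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    double q = 3.0;

    #pragma omp parallel for          /* first touch places pages locally */
    for (long j = 0; j < N; j++) { a[j] = 1.0; b[j] = 2.0; c[j] = 0.5; }

    double t0 = omp_get_wtime();
    #pragma omp parallel for
    for (long j = 0; j < N; j++)
        a[j] = b[j] + q * c[j];       /* Triad */
    double t1 = omp_get_wtime();

    /* 2 arrays read + 1 array written = 3 * 8 * N bytes */
    double mb = 3.0 * sizeof(double) * N / 1e6;
    printf("Triad: %.0f MB/sec\n", mb / (t1 - t0));
    free(a); free(b); free(c);
    return 0;
}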


Oracle Communications ASAP – Telco Subscriber Activation: SPARC S7-2L Fastest Recorded Result

Oracle's SPARC S7-2L server delivered the fastest recorded result on Oracle Communications ASAP. The SPARC S7-2L server ran Oracle Solaris 11.3 with Oracle Database 12c and Oracle Communications ASAP version 7.3, with Oracle Database 11g Release 2 Client. Running Oracle Communications ASAP, the SPARC S7-2L server delivered the fastest recorded result of 3,292 ASDLs/sec (atomic network activation actions). The SPARC S7-2L server running a single instance of the Oracle Communications ASAP application, with both the application and database tiers consolidated onto a single machine, easily supported a service activation volume of 3,292 ASDLs/sec, which is representative of a typical mobile operator with more than 100 million subscribers. Performance Landscape All of the following results were run as part of this benchmark effort. ASAP 7.3 Test Results – 16 NEP System ASDLs/sec CPU Usage SPARC S7-2L 3,292 63% Configuration Summary Hardware Configuration: SPARC S7-2L server 2 x SPARC S7 processors (4.27 GHz) 512 GB memory Flash Storage Software Configuration: Oracle Communications ASAP 7.3 Version B122 Oracle Solaris 11.3 Oracle Database 12c Release 12.1.0.2.0 Oracle WebLogic Server 12.1.3.0.0 Oracle JDK 7 update 95 Benchmark Description Oracle Communications ASAP provides a convergent service activation platform that automatically activates customer services in a heterogeneous network and IT environment. It supports the activation of consumer and business services in fixed and mobile domains against network and IT applications. ASAP enables rapid service design and network technology introduction by means of its metadata-driven architecture, design-time configuration environment, and catalog of pre-built activation cartridges to reduce deployment time, cost, and risk. The application has been deployed for mobile (3G, 4G and M2M) services and fixed multi-play (broadband, voice, video, and IT) services in telecommunications, cable and satellite environments as well as for business voice, data, and IT cloud services. It may be deployed in a fully integrated manner as part of the Oracle Communications Service Fulfillment solution or directly integrated with third-party upstream systems. Market-proven for high-volume performance and scalability, Oracle Communications ASAP is deployed by more than 75 service providers worldwide and activates services for approximately 250 million subscribers globally. The throughput of ASAP is measured in atomic actions per second (or ASDLs/sec). An atomic action is a single command or operation that can be executed on a network element. Atomic actions are grouped together to form a common service action, where each service action typically relates to an orderable item, such as "GSM voice" or "voice mail" or "GSM data". One or more service actions are invoked by an order management system via an activation work order request. The workload resembles a typical mobile order to activate a GSM subscriber. A single service action to add a subscriber consists of seven atomic actions where each atomic action executes a command on a network element. Each network element was serviced by a dedicated Network Element Processor (NEP). The ASAP benchmark can vary the number of NEPs, which correlates to the complexity of a Telco operator's environment. 
See Also SPARC S7-2L Server oracle.com    OTN    Blog Oracle Communications ASAP oracle.com Oracle Solaris oracle.com    OTN    Blog Oracle Database oracle.com    OTN    Blog Oracle Fusion Middleware oracle.com    OTN    Blog   Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of June 29, 2016.
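As a rough translation of the headline number, using the seven atomic actions per "add subscriber" service action described above:

3,292 ASDLs/sec ÷ 7 atomic actions per service action ≈ 470 subscriber activations/sec, or on the order of 1.7 million activations per hour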


Oracle Berkeley DB: SPARC S7-2 Performance

Oracle's SPARC S7-2 server shows higher throughput performance per core running a mixed transaction workload using Oracle Berkeley DB on 4 simultaneous 20 GB database instances compared to results on a single processor domain of Oracle's SPARC M7-16 server. Each instance contains a sum of 50 million rows of customer, account and orders data. The SPARC S7-2 server delivered a rate of 142,481 transactions per second on the throughput test. Performance Landscape All of the following results were run as part of this benchmark effort. Mixed Workload - Berkeley DB Processor Total Cores Perf (tpS) Perf/Core SPARC M7 32 266,450 8,327 SPARC S7 16 142,481 8,905 Configuration Summary Systems Under Test: 1 x SPARC S7-2 server with 2 x SPARC S7 processors, 4.26 GHz, 8 cores per processor 512 GB memory Oracle Solaris 11.3 Oracle Berkeley DB 6.2 1 x SPARC M7-16 server using a single processor domain with 1 x SPARC M7 processor, 4.13 GHz, 32 cores per processor 512 GB memory Oracle Solaris 11.3 Oracle Berkeley DB 6.2 Benchmark Description The benchmark consists of a workload running against a schema of 6 tables: 4 tables that get updated (account, branch, teller, history) and 2 that are read-only (customer and orders). The workload has a set of 4 transactions: account update: update account, branch, teller balances. get-order-customer: random read on order to get customer key, then locate and read customer records. search-order: get a range of orders. search-customer: get a random customer record. Transaction mix: account update is 5%; the other three (read-only) 95%. The benchmark sampling time was 5 minutes and the total throughput was calculated. Key Points and Best Practices The default mechanism for implementing the Oracle Berkeley DB cache is memory-mapped files. Improved performance is obtained using shared memory. For this demonstration, changes are made in the test programs. Changing from shared memory to ISM requires a simple change to the provided Oracle Berkeley DB source code. To add ISM support, the routine os_map.c was modified as follows.
Original: if((infop->addr = shmat(id,NULL,0)) == (void *)-1)
ISM:      if((infop->addr = shmat(id,NULL,SHM_SHARE_MMU)) == (void *)-1)
See Also SPARC S7-2 Server oracle.com    OTN    Blog Oracle Solaris oracle.com    OTN    Blog Oracle Berkeley DB oracle.com    OTN    Blog   Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 06/29/2016.
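The ISM change above applies to Berkeley DB's System V shared-memory code path. For context, the sketch below shows how an application asks Berkeley DB to place its environment regions in System V shared memory (DB_SYSTEM_MEM) rather than memory-mapped files; the home directory, IPC key and cache size are illustrative values, not the benchmark's actual settings.

/*
 * Sketch (illustrative values): open a Berkeley DB environment whose
 * shared regions live in System V shared memory rather than
 * memory-mapped files.  With DB_SYSTEM_MEM the library allocates its
 * regions via shmget()/shmat(), which is the call site patched in
 * os_map.c above.
 */
#include <db.h>
#include <stdio.h>

int main(void)
{
    DB_ENV *env;
    int ret;

    if ((ret = db_env_create(&env, 0)) != 0) {
        fprintf(stderr, "db_env_create: %s\n", db_strerror(ret));
        return 1;
    }

    env->set_cachesize(env, 4, 0, 1);      /* 4 GB cache in one region (illustrative) */
    env->set_shm_key(env, 2001);           /* base System V IPC key, required for DB_SYSTEM_MEM */

    ret = env->open(env, "/bdb/env",       /* illustrative home directory */
                    DB_CREATE | DB_INIT_MPOOL | DB_INIT_LOCK |
                    DB_INIT_LOG | DB_INIT_TXN | DB_SYSTEM_MEM,
                    0);
    if (ret != 0) {
        fprintf(stderr, "env->open: %s\n", db_strerror(ret));
        env->close(env, 0);
        return 1;
    }

    printf("Environment opened with System V shared memory regions\n");
    env->close(env, 0);
    return 0;
}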


SAP Two-Tier Standard Sales and Distribution SD Benchmark: SPARC M7-8 World Record 8 Processors

Oracle's SPARC M7-8 server produced a world record result for 8 processors on the SAP two-tier Sales and Distribution (SD) Standard Application Benchmark using SAP Enhancement Package 5 for SAP ERP 6.0 (8 chips / 256 cores / 2048 threads). The SPARC M7-8 server achieved 130,000 SAP SD benchmark users running the two-tier SAP Sales and Distribution (SD) Standard Application Benchmark using SAP Enhancement Package 5 for SAP ERP 6.0. The SPARC M7-8 server is 1.5x faster per core than the x86-based HPE Integrity Superdome X running the two-tier SAP Sales and Distribution (SD) Standard Application Benchmark using SAP Enhancement Package 5 for SAP ERP 6.0. The SPARC M7-8 server result was run with Oracle Solaris 11 and used Oracle Database 12c. Previously the SPARC T7-2 server set the 2-chip server world record, achieving 30,800 SAP SD benchmark users running the two-tier SAP Sales and Distribution (SD) Standard Application Benchmark using SAP Enhancement Package 5 for SAP ERP 6.0. Performance Landscape SAP-SD 2-tier performance table in decreasing performance order with SAP ERP 6.0 Enhancement Package 5 for SAP ERP 6.0 results (current version of the benchmark as of May 2012). SAP SD Two-Tier Benchmark System Processor OS Database Users Resp Time (sec) Users/ core Cert# SPARC M7-8 8 x SPARC M7 (8x 32core) Oracle Solaris 11 Oracle Database 12c 130,000 0.93 508 2016020 HPE Integrity Superdome X 16 x Intel E7-8890 v3 (16x 18core) Windows Server 2012 R2 Datacenter Edition SQL Server 2014 100,000 0.99 347 2016002 The number of cores shown is per chip; to get system totals, multiply by the number of chips. Complete benchmark results may be found at the SAP benchmark website http://www.sap.com/benchmark. Configuration Summary and Results Database/Application Server: 1 x SPARC M7-8 server with 8 x SPARC M7 processors (4.13 GHz, total of 8 processors / 256 cores / 2048 threads) 4 TB memory Oracle Solaris 11.3 Oracle Database 12c Database Storage: 7 x Sun Server X3-2L each with 2 x Intel Xeon Processors E5-2609 (2.4 GHz) 16 GB memory 4 x Sun Flash Accelerator F40 PCIe Card 12 x 3 TB SAS disks Oracle Solaris 11 REDO log Storage: 1 x Pillar FS-1 Flash Storage System, with 2 x FS1-2 Controller (Netra X3-2) 2 x FS1-2 Pilot (X4-2) 4 x DE2-24P Disk enclosure 96 x 300 GB 10000 RPM SAS Disk Drive Assembly Certified Results (published by SAP) Number of SAP SD benchmark users: 130,000 Average dialog response time: 0.93 seconds Throughput: Fully processed order line items per hour: 14,269,670 Dialog steps per hour: 42,809,000 SAPS: 713,480 Average database request time (dialog/update): 0.018 sec / 0.039 sec SAP Certification: 2016020 Benchmark Description The SAP Standard Application SD (Sales and Distribution) Benchmark is an ERP business test that is indicative of full business workloads of complete order processing and invoice processing, and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments. SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products. 
See Also SPARC M7-8 Benchmark Details   2016020 SAP Benchmark Website SPARC M7-8 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Disclosure Statement Two-tier SAP Sales and Distribution (SD) standard application benchmarks, SAP Enhancement Package 5 for SAP ERP 6.0 as of 5/14/16: SPARC M7-8 (8 processors, 256 cores, 2048 threads) 130,000 SAP SD users, 8 x 4.13 GHz SPARC M7, 4 TB memory, Oracle Database 12c, Oracle Solaris 11, Cert# 2016020 SPARC T7-2 (2 processors, 64 cores, 512 threads) 30,800 SAP SD users, 2 x 4.13 GHz SPARC M7, 1 TB memory, Oracle Database 12c, Oracle Solaris 11, Cert# 2015050 HPE Integrity Superdome X (16 processors, 288 cores, 576 threads) 100,000 SAP SD users, 16 x 2.5 GHz Intel Xeon Processor E7-8890 v3 4096 GB memory, SQL Server 2014, Windows Server 2012 R2 Datacenter Edition, Cert# 2016002 SAP, R/3, reg TM of SAP AG in Germany and other countries. More info www.sap.com/benchmark
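The certified throughput figures above are internally consistent with SAP's definition of the SAPS unit (100 SAPS = 2,000 fully business-processed order line items per hour):

14,269,670 order line items/hour ÷ 2,000 × 100 ≈ 713,483, in line with the certified 713,480 SAPS
130,000 users ÷ 256 cores ≈ 508 users per core, matching the Users/core column above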


SHA Digest Encryption: SPARC T7-2 Beats x86 E5 v4

Oracle's cryptography benchmark measures security performance on important Secure Hash Algorithm (SHA) functions. Oracle's SPARC M7 processor with its security software in silicon is faster than current and recent x86 servers. In this test, the performance of on-processor digest operations is measured for three sizes of plaintext inputs (64, 1024 and 8192 bytes) using three SHA2 digests (SHA512, SHA384, SHA256) and the older, weaker SHA1 digest. Multiple parallel threads are used to measure each processor's maximum throughput. Oracle's SPARC T7-2 server shows dramatically faster digest computation compared to current x86 two processor servers. SPARC M7 processors running Oracle Solaris 11.3 ran 10 times faster computing multiple parallel SHA512 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon Processor E5-2699 v4 running Oracle Linux 7.2. SPARC M7 processors running Oracle Solaris 11.3 ran 10 times faster computing multiple parallel SHA256 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon Processor E5-2699 v4 running Oracle Linux 7.2. SPARC M7 processors running Oracle Solaris 11.3 ran 3.6 times faster computing multiple parallel SHA1 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon Processor E5-2699 v4 running Oracle Linux 7.2. SPARC M7 processors running Oracle Solaris 11.3 ran 17 times faster computing multiple parallel SHA512 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon Processor E5-2699 v3 running Oracle Linux 6.5. SPARC M7 processors running Oracle Solaris 11.3 ran 14 times faster computing multiple parallel SHA256 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon Processor E5-2699 v3 running Oracle Linux 6.5. SPARC M7 processors running Oracle Solaris 11.3 ran 4.8 times faster computing multiple parallel SHA1 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon Processor E5-2699 v3 running Oracle Linux 6.5. SHA1 and SHA2 operations are an integral part of Oracle Solaris, while on Linux they are performed using the add-on Cryptography for Intel Integrated Performance Primitives for Linux (library). Oracle has also measured AES (CFB, GCM, CCM, CBC) cryptographic performance on the SPARC M7 processor. Performance Landscape Presented below are results for computing SHA1, SHA256, SHA384 and SHA512 digests for input plaintext sizes of 64, 1024 and 8192 bytes. Results are presented as MB/sec (10**6). All SPARC M7 processor results were run as part of this benchmark effort. All other results were run during previous benchmark efforts. Digest Performance – SHA512 Performance is presented for SHA512 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M7, 4.13 GHz 39,201 167,072 184,944 2 x SPARC T5, 3.6 GHz 18,717 73,810 78,997 2 x Intel Xeon E5-2699 v4, 2.2 GHz 6,973 15,412 17,616 2 x Intel Xeon E5-2699 v3, 2.3 GHz 3,949 9,214 10,681 2 x Intel Xeon E5-2697 v2, 2.7 GHz 2,681 6,631 7,701 Digest Performance – SHA384 Performance is presented for SHA384 digest. 
The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M7, 4.13 GHz 39,697 166,898 185,194 2 x SPARC T5, 3.6 GHz 18,814 73,770 78,997 2 x Intel Xeon E5-2699 v4, 2.2 GHz 6,909 15,353 17,618 2 x Intel Xeon E5-2699 v3, 2.3 GHz 4,061 9,263 10,678 2 x Intel Xeon E5-2697 v2, 2.7 GHz 2,774 6,669 7,706 Digest Performance – SHA256 Performance is presented for SHA256 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M7, 4.13 GHz 45,148 113,648 119,929 2 x SPARC T5, 3.6 GHz 21,140 49,483 51,114 2 x Intel Xeon E5-2699 v4, 2.2 GHz 5,103 11,174 12,037 2 x Intel Xeon E5-2699 v3, 2.3 GHz 3,446 7,785 8,463 2 x Intel Xeon E5-2697 v2, 2.7 GHz 2,404 5,570 6,037 Digest Performance – SHA1 Performance is presented for SHA1 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M7, 4.13 GHz 47,640 92,515 97,545 2 x SPARC T5, 3.6 GHz 21,052 40,107 41,584 2 x Intel Xeon E5-2699 v4, 2.2 GHz 8,566 23,901 26,752 2 x Intel Xeon E5-2699 v3, 2.3 GHz 6,677 18,165 20,405 2 x Intel Xeon E5-2697 v2, 2.7 GHz 4,649 13,245 14,842   Configuration Summary SPARC T7-2 server 2 x SPARC M7 processor, 4.13 GHz 1 TB memory Oracle Solaris 11.3   SPARC T5-2 server 2 x SPARC T5 processor, 3.60 GHz 512 GB memory Oracle Solaris 11.2   Oracle Server X6-2L system 2 x Intel Xeon Processor E5-2699 v4, 2.20 GHz 256 GB memory Oracle Linux 7.2 Intel Integrated Performance Primitives for Linux, Version 9.0 (Update 2) 17 Feb 2016   Oracle Server X5-2 system 2 x Intel Xeon Processor E5-2699 v3, 2.30 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014   Sun Server X4-2 system 2 x Intel Xeon Processor E5-2697 v2, 2.70 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014   Benchmark Description The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache and on-chip using various digests, including SHA1 and SHA2 (SHA256, SHA384, SHA512). The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various digests. They were run using optimized libraries for each platform to obtain the best possible performance. The encryption tests were run with pseudo-random data of sizes 64 bytes, 1024 bytes and 8192 bytes. The benchmark tests were designed to run out of cache, so memory bandwidth and latency are not the limitations. See Also   More about Secure Hash Algorithm (SHA) SPARC T7-2 Server oracle.com     OTN     Blog SPARC T5-2 Server oracle.com     OTN Oracle Server X6-2L oracle.com     OTN     Blog Oracle Server X5-2 oracle.com     OTN     Blog Oracle Solaris oracle.com     OTN     Blog Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 4/13/2016.
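The headline ratios above can be read directly from the 8192-byte column of the tables (two-chip totals in both cases):

SHA512: 184,944 ÷ 17,616 ≈ 10.5x (reported as "10 times faster" vs E5-2699 v4)
SHA256: 119,929 ÷ 12,037 ≈ 10.0x
SHA1: 97,545 ÷ 26,752 ≈ 3.6x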


AES Encryption: SPARC T7-2 Beats x86 E5 v4

Oracle's cryptography benchmark measures security performance on important AES security modes. Oracle's SPARC M7 processor with its software in silicon security is faster than x86 servers that have the AES-NI instructions. In this test, the performance of on-processor encryption operations is measured (32 KB encryptions). Multiple threads are used to measure each processor's maximum throughput. Oracle's SPARC T7-2 server shows dramatically faster encryption compared to current x86 two processor servers. SPARC M7 processors running Oracle Solaris 11.3 ran 3.3 times faster executing AES-CFB 256-bit key encryption (in cache) than the Intel Xeon Processor E5-2699 v4 (with AES-NI) running Oracle Linux 7.2. SPARC M7 processors running Oracle Solaris 11.3 ran 3.1 times faster executing AES-CFB 128-bit key encryption (in cache) than the Intel Xeon Processor E5-2699 v4 (with AES-NI) running Oracle Linux 7.2. SPARC M7 processors running Oracle Solaris 11.3 ran 4.0 times faster executing AES-CFB 256-bit key encryption (in cache) than Intel Xeon Processor E5-2699 v3 (with AES-NI) running Oracle Linux 6.5. SPARC M7 processors running Oracle Solaris 11.3 ran 3.7 times faster executing AES-CFB 128-bit key encryption (in cache) than Intel Xeon Processor E5-2699 v3 (with AES-NI) running Oracle Linux 6.5. AES-CFB encryption is used by Oracle Database for Transparent Data Encryption (TDE) which provides security for database storage. Oracle has also measured SHA digest performance on the SPARC M7 processor. Performance Landscape Presented below are results for running encryption using the AES cipher with the CFB, CBC, GCM and CCM modes for key sizes of 128, 192 and 256. Decryption performance was similar and is not presented. Results are presented as MB/sec (10**6). All SPARC M7 processor results were run as part of this benchmark effort. All other results were run during previous benchmark efforts. Encryption Performance – AES-CFB (used by Oracle Database) Performance is presented for in-cache AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run). AES-CFB Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CFB SPARC M7 4.13 2 126,948 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 53,794 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 2 39,034 Oracle Linux 7.2, IPP/AES-NI Intel E5-2699 v3 2.30 2 31,924 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 19,964 Oracle Linux 6.5, IPP/AES-NI AES-192-CFB SPARC M7 4.13 2 144,299 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 60,736 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 2 45,351 Oracle Linux 7.2, IPP/AES-NI Intel E5-2699 v3 2.30 2 37,157 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 23,218 Oracle Linux 6.5, IPP/AES-NI AES-128-CFB SPARC M7 4.13 2 166,324 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 68,691 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 2 54,179 Oracle Linux 7.2, IPP/AES-NI Intel E5-2699 v3 2.30 2 44,388 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 27,755 Oracle Linux 6.5, IPP/AES-NI Encryption Performance – AES-CBC Performance is presented for in-cache AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run). 
AES-CBC Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CBC SPARC M7 4.13 2 134,278 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 56,788 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 2 38,943 Oracle Linux 7.2, IPP/AES-NI Intel E5-2699 v3 2.30 2 31,894 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 19,961 Oracle Linux 6.5, IPP/AES-NI AES-192-CBC SPARC M7 4.13 2 152,961 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 63,937 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 2 45,285 Oracle Linux 7.2, IPP/AES-NI Intel E5-2699 v3 2.30 2 37,021 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 23,224 Oracle Linux 6.5, IPP/AES-NI AES-128-CBC SPARC M7 4.13 2 175,151 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 72,870 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v4 2.20 2 54,076 Oracle Linux 7.2, IPP/AES-NI Intel E5-2699 v3 2.30 2 44,103 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 27,730 Oracle Linux 6.5, IPP/AES-NI Encryption Performance – AES-GCM (used by ZFS Filesystem) Performance is presented for in-cache AES-GCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run). AES-GCM Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-GCM SPARC M7 4.13 2 74,221 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 34,022 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 15,338 Oracle Solaris 11.1, libsoftcrypto + libumem AES-192-GCM SPARC M7 4.13 2 81,448 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 36,820 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 15,768 Oracle Solaris 11.1, libsoftcrypto + libumem AES-128-GCM SPARC M7 4.13 2 86,223 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 38,845 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 16,405 Oracle Solaris 11.1, libsoftcrypto + libumem Encryption Performance – AES-CCM (alternative used by ZFS Filesystem) Performance is presented for in-cache AES-CCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run). 
AES-CCM Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CCM SPARC M7 4.13 2 67,669 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 28,909 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 19,447 Oracle Linux 6.5, IPP/AES-NI AES-192-CCM SPARC M7 4.13 2 77,711 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 33,116 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 22,634 Oracle Linux 6.5, IPP/AES-NI AES-128-CCM SPARC M7 4.13 2 90,729 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 38,529 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 26,951 Oracle Linux 6.5, IPP/AES-NI Configuration Summary SPARC T7-2 server 2 x SPARC M7 processor, 4.13 GHz 1 TB memory Oracle Solaris 11.3 SPARC T5-2 server 2 x SPARC T5 processor, 3.60 GHz 512 GB memory Oracle Solaris 11.2 Oracle Server X6-2L system 2 x Intel Xeon Processor E5-2699 v4, 2.20 GHz 256 GB memory Oracle Linux 7.2 Intel Integrated Performance Primitives for Linux, Version 9.0 (Update 2) 17 Feb 2016 Oracle Server X5-2 system 2 x Intel Xeon Processor E5-2699 v3, 2.30 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014 Sun Server X4-2 system 2 x Intel Xeon Processor E5-2697 v2, 2.70 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014 Benchmark Description The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache and on-chip using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CCM, AES-192-CCM, AES-256-CCM, AES-128-GCM, AES-192-GCM and AES-256-GCM. The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance. The encryption tests were run with pseudo-random data of size 32 KB. The benchmark tests were designed to run out of cache, so memory bandwidth and latency are not the limitations. See Also More about AES SPARC T7-2 Server oracle.com     OTN     Blog SPARC T5-2 Server oracle.com     OTN Oracle Server X6-2L oracle.com     OTN     Blog Oracle Server X5-2 oracle.com     OTN     Blog Oracle Solaris oracle.com     OTN     Blog Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 4/13/2016.
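The headline ratios above come straight from the AES-CFB table (two-chip totals):

AES-256-CFB: 126,948 ÷ 39,034 ≈ 3.3x vs E5-2699 v4; 126,948 ÷ 31,924 ≈ 4.0x vs E5-2699 v3
AES-128-CFB: 166,324 ÷ 54,179 ≈ 3.1x vs E5-2699 v4; 166,324 ÷ 44,388 ≈ 3.7x vs E5-2699 v3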


PeopleSoft Human Capital Management 9.1 FP2: SPARC M7-8 Results Using Oracle Advanced Security Transparent Data Encryption

Using Oracle Advanced Security Transparent Data Encryption (TDE), Oracle's SPARC M7-8 server using Oracle's SPARC M7 processor's software in silicon cryptography instructions produced results on Oracle's PeopleSoft Human Capital Management 9.1 FP2 Benchmark that were nearly identical to results run without TDE (clear-text runs). The benchmark consists of three different components, PeopleSoft HR Self-Service Online, PeopleSoft Payroll Batch, and the combined PeopleSoft HR Self-Service Online and PeopleSoft Payroll Batch. The benchmarks were run on a virtualized two-chip, 1 TB LDom of the SPARC M7-8 server. Using TDE enforces data-at-rest encryption in the database layer. Applications and users authenticated to the database continue to have access to application data transparently (no application code or configuration changes are required), while attacks from OS users attempting to read sensitive data from tablespace files and attacks from thieves attempting to read information from acquired disks or backups are denied access to the clear-text data. The PeopleSoft HR online-only and the PeopleSoft HR online combined with PeopleSoft Payroll batch showed similar Search/Save average response times using TDE compared to the corresponding clear-text runs. The PeopleSoft Payroll batch-only run showed only around 4% degradation in batch throughput using TDE compared to the clear-text run. The PeopleSoft HR online combined with PeopleSoft Payroll batch run showed less than 5% degradation in batch throughput (payments per hour) using TDE compared to the clear-text result. On the combined benchmark, the virtualized two-chip LDom of the SPARC M7-8 server with TDE demonstrated around 5 times better Search and around 8 times better Save average response times running nearly double the number of online users for the online component compared to the ten-chip x86 clear-text database solution from Cisco. On the PeopleSoft Payroll batch run and using only a single chip in the virtualized two-chip LDom on the SPARC M7-8 server, the TDE solution demonstrated 1.7 times better batch throughput compared to a four-chip Cisco UCSB460 M4 server with clear-text database. On the PeopleSoft Payroll batch run and using only a single chip in the virtualized two-chip LDom on the SPARC M7-8 server, the TDE solution demonstrated around 2.3 times better batch throughput compared to a nine-chip IBM zEnterprise z196 server (EC 2817-709, 9-way, 8943 MIPS) with clear-text database. On the combined benchmark, the two SPARC M7 processor LDom (in SPARC M7-8) can run the same number of online users with TDE as a dynamic domain (PDom) of eight SPARC M6 processors (in SPARC M6-32) with clear-text database with better online response times, batch elapsed times and batch throughput. Performance Landscape All results presented may be found at Oracle's PeopleSoft benchmark white papers. The first table presents the combined results, running both the PeopleSoft HR Self-Service Online and Payroll Batch tests concurrently. 
PeopleSoft HR Self-Service Online And Payroll Batch Using Oracle Database 11g System Processors Chips Used Users Search/Save Batch Elapsed Time Batch Pay/Hr SPARC M7-8 (Secure with TDE) SPARC M7 2 35,000 0.55 sec/0.34 sec 23.72 min 1,265,969 SPARC M7-8 (Unsecure) SPARC M7 2 35,000 0.67 sec/0.42 sec 22.71 min 1,322,272 SPARC M6-32 (Unsecure) SPARC M6 8 35,000 1.80 sec/1.12 sec 29.2 min 1,029,440 Cisco 1 x B460 M4, 3 x B200 M3 (Unsecure) Intel E7-4890 v2, Intel E5-2697 v2 10 18,000 2.70 sec/2.60 sec 21.70 min 1,383,816   The following results are running only the Peoplesoft HR Self-Service Online test. PeopleSoft HR Self-Service Online Using Oracle Database 11g System Processors Chips Used Users Search/Save Avg Response Times SPARC M7-8 (Secure with TDE) SPARC M7 2 40,000 0.52 sec/0.31 sec SPARC M7-8 (Unsecure) SPARC M7 2 40,000 0.55 sec/0.33 sec SPARC M6-32 (Unsecure) SPARC M6 8 40,000 2.73 sec/1.33 sec Cisco 1 x B460 M4, 3 x B200 M3 (Unsecure) Intel E7-4890 v2, Intel E5-2697 v2 10 20,000 0.35 sec/0.17 sec   The following results are running only the Peoplesoft Payroll Batch test. For the SPARC M7-8 server results, only one of the processors was used per LDom. This was accomplished using processor sets to further restrict the test to a single SPARC M7 processor. PeopleSoft Payroll Batch Using Oracle Database 11g System Processors Chips Used Batch Elapsed Time Batch Pay/Hr SPARC M7-8 (Secure with TDE) SPARC M7 1 13.34 min 2,251,034 SPARC M7-8 (Unsecure) SPARC M7 1 12.85 min 2,336,872 SPARC M6-32 (Unsecure) SPARC M6 2 18.27 min 1,643,612 Cisco UCS B460 M4 (Unsecure) Intel E7-4890 v2 4 23.02 min 1,304,655 IBM z196 (Unsecure) zEnterprise (5.2 GHz, 8943 MIPS) 9 30.50 min 984,551   Configuration Summary System Under Test: SPARC M7-8 server with 8 x SPARC M7 processor (4.13 GHz) 4 TB memory Virtualized as an Oracle VM Server for SPARC (LDom) with 2 x SPARC M7 processor (4.13 GHz) 1 TB memory   Storage Configuration: 2 x Oracle ZFS Storage ZS3-2 appliance (DB Data) each with 40 x 300 GB 10K RPM SAS-2 HDD, 8 x Write Flash Accelerator SSD and 2 x Read Flash Accelerator SSD 1.6TB SAS 2 x Oracle Server X5-2L as COMSTAR nodes (DB redo logs & App object cache) each with 2 x Intel Xeon Processor E5-2630 v3 32 GB memory 4 x 1.6 TB NVMe SSD   Software Configuration: Oracle Solaris 11.3 Oracle Database 11g Release 2 (11.2.0.3.0) PeopleSoft Human Capital Management 9.1 FP2 PeopleSoft PeopleTools 8.52.03 Oracle Java SE 6u32 Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 043 Oracle WebLogic Server 11g (10.3.5)   Benchmark Description The PeopleSoft Human Capital Management benchmark simulates thousands of online employees, managers and Human Resource administrators executing transactions typical of a Human Resources Self Service application for the Enterprise. Typical transactions are: viewing paychecks, promoting and hiring employees, updating employee profiles, etc. The database tier uses a database instance of about 500 GB in size, containing information for 500,480 employees. The application tier for this test includes web and application server instances, specifically Oracle WebLogic Server 11g, PeopleSoft Human Capital Management 9.1 FP2 and Oracle Java SE 6u32. Key Points and Best Practices In the HR online along with Payroll batch run, the LDom had one Oracle Solaris Zone of 7 cores containing the Web tier, two Oracle Solaris Zones of 16 cores each containing the Application tier and one Oracle Solaris Zone of 23 cores containing the Database tier. 
Two cores were dedicated to network and disk interrupt handling. In the HR online only run, the LDom had one Oracle Solaris Zone of 12 cores containing the Web tier, two Oracle Solaris Zones of 18 cores each containing the Application tier and one Oracle Solaris Zone of 14 cores containing the Database tier. 2 cores were dedicated to network and disk interrupt handling. In the Payroll batch only run, the LDom had one Oracle Solaris Zone of 31 cores containing the Database tier. 1 core was dedicated to disk interrupt handling. All database data files, recovery files and Oracle Clusterware files for the PeopleSoft test were created with the Oracle Automatic Storage Management (Oracle ASM) volume manager for the added benefit of the ease of management provided by Oracle ASM integrated storage management solution. In the application tier on the LDom, 5 PeopleSoft application domains with 350 application servers (70 per domain) were hosted in two separate Oracle Solaris Zones for a total of 10 domains with 700 application server processes. All PeopleSoft Application processes and the 32 Web Server JVM instances were executed in the Oracle Solaris FX scheduler class. See Also     Oracle Applications Benchmarks Oracle's PeopleSoft Benchmark White Papers Oracle's PeopleSoft HRMS 9.1 FP2 Self-service and Payroll using Oracle DB for Oracle Solaris (Unicode) on an Oracle's SPARC M7-8 with TDE Oracle's PeopleSoft HRMS 9.1 FP2 Self-service using Oracle DB for Oracle Solaris (Unicode) on an Oracle's SPARC M7-8 Server with TDE Oracle's PeopleSoft HRMS 9.1 FP2 Payroll using Oracle DB for Oracle Solaris (Unicode) on an Oracle's SPARC M7-8 with TDE SPARC M7-8 HR Self-Service Online and Payroll Batch Result (Clear Text) SPARC M7-8 HR Self-Service Online Result (Clear Text) SPARC M7-8 Payroll Batch Result (Clear Text) Cisco HR Self-Service Online and Payroll Batch Result Cisco HR Self-Service Online Result Cisco HR Payroll Batch Result IBM z196 Payroll Batch Result, Mainframe MIPS SPARC M7-8 Server oracle.com     OTN     Blog PeopleSoft Enterprise Human Capital Management oracle.com     OTN PeopleSoft Enterprise Human Capital Management (Payroll) oracle.com     OTN Oracle Database oracle.com     OTN     Blog Oracle Database – Transparent Data Encryption oracle.com     OTN     Blog Oracle Solaris oracle.com     OTN     Blog Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of March 24, 2016.
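For reference, tablespace-level TDE of the kind used in these secure runs is enabled entirely in the database layer. A minimal sketch for Oracle Database 11g Release 2 (the release used here) follows; the wallet directory, password, tablespace name and cipher are illustrative placeholders, not the benchmark's actual configuration:

    -- sqlnet.ora must point at a wallet location, for example:
    -- ENCRYPTION_WALLET_LOCATION=(SOURCE=(METHOD=FILE)(METHOD_DATA=(DIRECTORY=/u01/app/oracle/wallet)))

    -- Create/open the wallet and set the TDE master key (11g syntax):
    ALTER SYSTEM SET ENCRYPTION KEY IDENTIFIED BY "wallet_password";

    -- Create an encrypted tablespace; application tables placed in it are
    -- encrypted at rest with no application code or configuration changes:
    CREATE BIGFILE TABLESPACE psft_app_enc
      DATAFILE '+DATA' SIZE 100G
      ENCRYPTION USING 'AES256'
      DEFAULT STORAGE (ENCRYPT);

On the SPARC M7 processor the AES work generated by such a tablespace is handled by the software in silicon cryptography instructions, which is why the TDE and clear-text results above are so close.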


Yahoo Cloud Serving Benchmark: SPARC T7-4 with Flash Storage and Oracle NoSQL Beats x86 E5 v3 Per Chip

Oracle's SPARC T7-4 server delivered 1.8 million ops/sec on 1.2 billion records for the Yahoo Cloud Serving Benchmark (YCSB) 95% read/5% update workload.  Oracle NoSQL Database was used in these tests. NoSQL is important for Big Data Analysis and for Cloud Computing. In the run comparing the performance of a single SPARC M7 processor to one Intel Xeon Processor E5-2699 v3 for the YCSB 95% read/5% update workload, the SPARC M7 processor was 2.6 times better per chip than the x86 processor and on a per core basis, 1.4 times better than the x86 processor. The SPARC T7-4 server showed low average latency of 0.86 msec on read and 5.37 msec on write while achieving 1.8 million ops/sec. The SPARC T7-4 server delivered 313K inserts/sec on 1.2 billion records with a low average latency of 2.75 msec. One processor performance on the SPARC T7-4 server was over half a million (519K  ops/sec) on 300 million records for the YCSB 95% read/5% update workload. The SPARC T7-4 server scaling from 1 to 4 processors was 3.5x while maintaining low latency. These results show the SPARC T7-4 server can handle a large database while achieving high throughput with low latency for cloud computing. Performance Landscape This table presents single chip results comparing the SPARC M7 processor (in a SPARC T7-4 server) to the Intel Xeon Processor E5-2699 v3 (in a 2-socket x86 server).  All of the following results were run as part of this benchmark effort. Comparing Single Chip Performance on YCSB Benchmark Processor Insert Mixed Load (95% Read/5% Update) Throughput ops/sec Average Latency Throughput ops/sec Average Latency Write msec Read msec Write msec SPARC M7 89,177 2.42 519,352 0.82 3.61 E5-2699 v3 55,636 1.18 202,701 0.71 2.30   The following table shows the performance of the Yahoo Clouds Serving Benchmark on multiple processor counts on the SPARC T7-4 server. SPARC T7-4 server running YCSB benchmark CPUs Shards Insert Mixed Load (95% Read/5% Update) Throughput ops/sec Average Latency Throughput ops/sec Average Latency Write msec Read msec Write msec 4 16 313,044 2.75 1,814,911 0.86 5.37 3 12 245,145 2.63 1,412,424 0.82 5.49 2 8 169,720 2.54 974,243 0.82 4.76 1 4 89,177 2.42 519,352 0.82 3.61   Configuration Summary SPARC System: SPARC T7-4 server 4 x SPARC M7 processors (4.13 GHz) 2 TB memory (64 x 32 GB) 8 x Oracle Flash Accelerator F160 PCIe card 8 x Sun Dual Port 10 GbE PCIe 2.0 Low Profile Adapter, Base-T   x86 System: Oracle Server X5-2L server 2 x Intel Xeon Processor E5-2699 v3 (2.3 GHz) 384 GB memory 1 x Sun Storage 16 Gb Fibre Channel PCIe Universal FC HBA, Emulex 1 x Sun Dual 10GbE SFP+ PCIe 2.0 Low Profile Adapter External Storage: COMSTAR (Common Multiprotocol SCSI TARget) 2 x Sun Server X3-2L servers configured as COMSTAR nodes, each with 2 x Intel Xeon Processor E5-2609 (2.4 GHz) 4 x Sun Flash Accelerator F40 PCIe Cards, 400 GB each 1 x 8 Gb dual port HBA   Software Configuration: Oracle Solaris 11.3 (11.3.1.2.0) Logical Domains Manager v3.3.0.0.17 (running on the SPARC T7-4) Oracle NoSQL Database, Enterprise Edition 12c R1.3.3.4 Java(TM) SE Runtime Environment (build 1.8.0_60-b27)   Benchmark Description The Yahoo Cloud Serving Benchmark (YCSB) is a performance benchmark for cloud database and their systems.  
The benchmark documentation says: With the many new serving databases available including Sherpa, BigTable, Azure and many more, it can be difficult to decide which system is right for your application, partially because the features differ between systems, and partially because there is not an easy way to compare the performance of one system versus another. The goal of the Yahoo Cloud Serving Benchmark (YCSB) project is to develop a framework and common set of workloads for evaluating the performance of different "key-value" and "cloud" serving stores.   Key Points and Best Practices The SPARC T7-4 server showed 3.5x scaling from 1 to 4 sockets while maintaining low latency.   Two Oracle VM for SPARC (LDom) servers were created per processor, for a total of seven LDoms plus a primary domain.  Each domain was configured with 240 GB memory accessing two PCIe IO slots using the Direct IO feature. The Oracle Flash Accelerator F160 PCIe cards demonstrated excellent IO capability and performed 812K read IOPS using eight Oracle Flash Accelerator F160 PCIe cards (over 100K IOPS per card) during the 1.8 million ops/sec benchmark run. Balanced memory bandwidth was delivered across all four processors achieving an average total of 254 GB/sec during the 1.8 million ops/sec run. The 1.2 billion records were loaded into 16 Shards with the replication factor set to 3. Each LDom hosted two Storage Nodes so two processor sets were created for each Storage Node.  The default processor set was additionally used for OS and IO interrupts.  The processor sets were used for isolation and to ensure a balanced load. Fixed priority class was assigned to Oracle NoSQL Storage Node java processes. The ZFS record size was set to 16K (default 128K) and this worked best for the 95% read/5% update workload. A total of eight Sun Server X4-2 and Sun Server X4-2L systems were used as clients for generating the workload. The LDoms and client systems were connected through a 10 GbE network. Oracle Server X5-2L system configuration was as follows: 1 chip (the other chip disabled by psradm) 2 x Sun Server X3-2L system COMSTAR nodes (total 8 x Sun Flash Accelerator F40 PCIe Cards) 1 x Sun Server X4-2 system as client connected through a 10 GbE network A processor set for NoSQL processes and the default processor set for OS and IO  interrupts Fixed priority class for NoSQL Storage Node java processes ZFS 16K record size 1 Shard (100M records)   See Also Yahoo Cloud Serving Benchmark YCSB Source SPARC T7-4 Server oracle.com    OTN Oracle Server X5-2L oracle.com    OTN Oracle Flash Accelerator F160 PCIe Card oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle NoSQL Database oracle.com    OTN   Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of March 24, 2016.


Siebel PSPP: SPARC T7-2 World Record Result, Beats IBM

Oracle set a new world record for the Siebel Platform Sizing and Performance Program (PSPP) benchmark using Oracle's SPARC T7-2 server for the application server with Oracle's Siebel CRM 8.1.1.4 Industry Applications and Oracle Database 12c running on Oracle Solaris. The SPARC T7-2 server running the application tier achieved 55,000 users with sub-second response time and with throughput of 457,909 business transactions per hour on the Siebel PSPP benchmark. The SPARC T7-2 server in the application tier delivered 3.3 times more users on a per chip basis compared to published IBM POWER8 based server results. For the new Oracle results, eight cores of a SPARC T7-2 server were used for the database tier running at 32% utilization (as measured by mpstat). The IBM result used 6 cores at about 75% utilization for the database/gateway tier. The SPARC T7-2 server in the application tier delivered nearly the same number of users on a per core basis compared to published IBM POWER8 based server results. The SPARC T7-2 server in the application tier delivered nearly 2.8 times more users on a per chip basis compared to earlier published SPARC T5-2 server results. The SPARC T7-2 server in the application tier delivered nearly 1.4 times more users on a per core basis compared to earlier published SPARC T5-2 server results. The SPARC T7-2 server used Oracle Solaris Zones which provide flexible, scalable and manageable virtualization to scale the application within and across multiple nodes. The Siebel 8.1.1.4 PSPP workload includes Siebel Call Center and Order Management System. Performance Landscape   Application Server TPH Users Users/Chip Users/Core Response Times Call Center Order Mgmt 1 x SPARC T7-2 (2 x SPARC M7 @4.13 GHz) 457,909 55,000 27,500 859 0.045 sec 0.257 sec 3 x IBM S824 (each 2 x 8 active core LPARs, POWER8 @4.1 GHz) 418,976 50,000 8,333 1041 0.031 sec 0.175 sec 2 x SPARC T5-2 (each with 2 x SPARC T5 @3.6 GHz) 333,339 40,000 10,000 625 0.110 sec 0.608 sec TPH – Business transactions throughput per hour Configuration Summary Application Server: 1 x SPARC T7-2 server with 2 x SPARC M7 processors, 4.13 GHz 1 TB memory 6 x 300 GB SAS internal disks Oracle Solaris 11.3 Siebel CRM 8.1.1.4 SIA Web/Database/Gateway Server: 1 x SPARC T7-2 server with 2 x SPARC M7 processors, 4.13 GHz (20 active cores: 8 cores for DB, 12 for Web/Gateway) 512 GB memory 6 x 300 GB SAS internal disks 2 x 1.6 TB NVMe SSD Oracle Solaris 11.3 Siebel CRM 8.1.1.4 SIA iPlanet Web Server 7 Oracle Database 12c (12.1.0.2) Benchmark Description The Siebel PSPP benchmark includes Call Center and Order Management: Siebel Financial Services Call Center – Provides the most complete solution for sales and service, allowing customer service and telesales representatives to provide superior customer support, improve customer loyalty, and increase revenues through cross-selling and up-selling. High-level description of the use cases tested: Incoming Call Creates Opportunity, Quote and Order and Incoming Call Creates Service Request. Three complex business transactions are executed simultaneously for a specific number of concurrent users. The ratios of these 3 scenarios were 30%, 40%, and 30% respectively, which together accounted for 70% of all transactions simulated in this benchmark. Between each user operation and the next one, the think time averaged approximately 10, 13, and 35 seconds respectively. 
Siebel Order Management – Oracle's Siebel Order Management allows employees such as salespeople and call center agents to create and manage quotes and orders through their entire life cycle. Siebel Order Management can be tightly integrated with back-office applications allowing users to perform tasks such as checking credit, confirming availability, and monitoring the fulfillment process. High-level description of the use cases tested: Order & Order Items Creation and Order Updates. Two complex Order Management transactions were executed simultaneously for a specific number of concurrent users, concurrently with the three Call Center scenarios described above. The ratio of these 2 scenarios was 50% each, and together they accounted for 30% of all transactions simulated in this benchmark. Between each user operation and the next one, the think time averaged approximately 20 and 67 seconds respectively. See Also Siebel White Papers for Published Results SPARC T7-2 Server oracle.com    OTN Siebel CRM oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN   Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.  Results as of March 22, 2016.


OLTPbenchmark Workload, Open-Source Benchmark: SPARC T7-1 Performance Beats IBM S824, Beats x86 E5-2699 v3

OLTPbenchmark is an open-source database benchmarking tool that includes an On-Line Transaction Processing (OLTP) transactional workload derived from the industry standard TPC-C workload. Oracle's SPARC T7-1 server demonstrated OLTP performance that is 2.76 times faster per chip than the Intel Xeon Processor E5-2699 v3 and 5.47 times faster per chip than an IBM POWER8 (3.5 GHz) processor.  This means that a SPARC T7-1 is 1.38 times faster than a 2-chip x86 E5 v3 based server.  The SPARC T7-1 server is also 1.37 times faster than an IBM Power System S824 (POWER8) server.  On per-core performance, the SPARC M7 processor used in the SPARC T7-1 server outperformed the IBM POWER8 processor.  All of these tests used Oracle Database 12c Release 1 (12.1.0.2) Enterprise Edition for the database. Comparing the SPARC T7-1 server to the 2-chip x86 E5 v3 server equipped with two 2.3 GHz Intel Xeon Processor E5-2699 v3, we see the following advantages for the SPARC T7-1 server. On a per chip basis, the SPARC T7-1 server demonstrated 2.76 times better performance compared to the 2-chip x86 E5 v3 server. At the system level, the SPARC T7-1 server demonstrated 1.38 times better performance compared to the 2-chip x86 E5 v3 server. Comparing the SPARC T7-1 server to an IBM Power System S824 server equipped with four 3.5 GHz POWER8 processors (6 cores each), we see the following advantages for the SPARC T7-1 server. On a per chip basis, the SPARC T7-1 server demonstrated nearly 5.47 times better performance compared to an IBM Power System S824 server. On a per core basis, the SPARC T7-1 server demonstrated nearly 3% better performance compared to an IBM Power System S824 server. At the system level, the SPARC T7-1 server demonstrated nearly 1.37 times better performance compared to the IBM Power System S824 server. The OLTPbenchmark transactional workload is based upon the TPC-C benchmark specification. Details of the configuration and parameters used are available in the reports referenced in the See Also section. Performance Landscape All OLTPbenchmark server results were run as part of this benchmark effort (except as noted).  All results are run with Oracle Database 12c Release 1 Enterprise Edition.  Results are ordered by TPM/core, highest to lowest. OLTPbenchmark Transactional Workload Relative Performance to x86 System System TPM TPM/chip TPM/core SPARC T7-1 1 x SPARC M7 (32 cores/chip, 32 total) 1.38x 2.76x 1.55x IBM Power System S824 4 x POWER8 (6 cores/chip, 24 total) 1.01x 0.50x 1.51x Oracle Server X5-2 2 x Intel E5-2699 v3 (18 cores/chip, 36 total) 1.00x 1.00x 1.00x TPM – OLTPbenchmark transactions per minute Results on the IBM Power System S824 were run by Oracle engineers using Oracle Database 12c.
Configuration Summary Systems Under Test: SPARC T7-1 server with 1 x SPARC M7 processor (4.13 GHz) 512 GB memory 2 x 600 GB 10K RPM SAS2 HDD 1 x Sun Dual Port 10 GbE PCIe 2.0 Networking card with Intel 82599 10 GbE Controller 1 x Sun Storage 16 Gb Fibre Channel Universal HBA Oracle Solaris 11.3 Oracle Database 12c Release 1 (12.1.0.2) Enterprise Edition Oracle Grid Infrastructure 12c Release 1 (12.1.0.2)   Oracle Server X5-2 with 2 x Intel Xeon processor E5-2699 v3 (2.3 GHz) 512 GB memory 2 x 600 GB 10K RPM SAS2 HDD 1 x Sun Dual Port 10 GbE PCIe 2.0 Networking card with Intel 82599 10 GbE Controller 1 x Sun Storage 16 Gb Fibre Channel Universal HBA Oracle Linux 6.5 Oracle Database 12c Release 1 (12.1.0.2) Enterprise Edition Oracle Grid Infrastructure 12c Release 1 (12.1.0.2)   IBM Power System S824 with 4 x POWER8 (3.5 GHz) 512 GB memory 4 x 300 GB 15K RPM SAS HDD 1 x 10 GbE Network Interface 1 x 16 Gb Fibre Channel HBA AIX 7.1 TL3 SP3 Oracle Database 12c Release 1 (12.1.0.2) Enterprise Edition Oracle Grid Infrastructure 12c Release 1 (12.1.0.2)   Storage Servers: 1 x Oracle Server X5-2L with 2 x Intel Xeon Processor E5-2630 v3 (2.4 GHz) 32 GB memory 1 x Sun Storage 16 Gb Fibre Channel Universal HBA 4 x 1.6 TB NVMe SSD 2 x 600 GB SAS HDD Oracle Solaris 11.3   1 x Oracle Server X5-2L with 2 x Intel Xeon Processor E5-2630 v3 (2.4 GHz) 32 GB memory 1 x Sun Storage 16 Gb Fibre Channel Universal HBA 14 x 600 GB SAS HDD Oracle Solaris 11.3   Benchmark Description The OLTPbenchmark workload as described from the OLTPbenchmark website: This is a database performance testing tool that allows you to conduct database workload replay, industry-standard benchmark testing, and scalability testing under various loads, such as scaling a population of users who executes order-entry transactions against a wholesale supplier database.   OLTPbenchmark supports many databases including Oracle, SQL Server, DB2, TimesTen, MySQL, MariaDB, PostgreSQL, Greenplum, Postgres Plus Advanced Server, Redis and Trafodion SQL on Hadoop. Key Points and Best Practices For these tests, an 800 warehouse database was created to compare directly with results posted by Intel. To improve the scalability, the OrderLine table was partitioned and loaded into a separate tablespace using the OLTPbenchmark GUI. The default blocksize was 8K and the OrderLine tablespace blocksize was 16K. To reduce latency of Oracle "cache chains buffers" wait events, the OLTPbenchmark kit was modified by adding partitioning to the NEW_ORDER table as well as the ORDERS_I1 and ORDERS_I2 indexes. To reduce latency of Oracle "library cache: mutex X" wait events, added recommended workarounds from the following Intel blog Refer to the detailed configuration documents in the See Also section below for the list of Oracle parameters. See Also Details of IBM Power System S824 Testing Details of SPARC T7-1 Testing Details of Oracle Server X5-2 Testing SPARC T7-1 Server oracle.com    OTN Oracle Database oracle.com    OTN Oracle Solaris oracle.com    OTN   Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of March 17, 2016.
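The tablespace and partitioning changes described in the Key Points above are ordinary Oracle DDL. A minimal sketch follows; it assumes TPC-C-style column names (NO_W_ID, O_W_ID, and so on) plus illustrative sizes and partition counts, so the OLTPbenchmark kit's actual schema and DDL may differ:

    -- A 16K blocksize tablespace for the OrderLine data needs a matching buffer cache component:
    ALTER SYSTEM SET db_16k_cache_size = 2G;
    CREATE BIGFILE TABLESPACE orderline_16k DATAFILE '+DATA' SIZE 200G BLOCKSIZE 16K;

    -- Hash partitioning spreads hot blocks across partitions, reducing
    -- "cache buffers chains" contention on the NEW_ORDER table and the ORDERS indexes:
    CREATE TABLE new_order (
      no_o_id NUMBER, no_d_id NUMBER, no_w_id NUMBER
    ) PARTITION BY HASH (no_w_id) PARTITIONS 32;

    CREATE INDEX orders_i1 ON orders (o_w_id, o_d_id, o_id)
      GLOBAL PARTITION BY HASH (o_w_id) PARTITIONS 32;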


Oracle Advanced Security – Transparent Data Encryption: Secure Database on SPARC M7 Processor Performance Nearly the Same as Clear

Oracle's SPARC T7-1 server is faster and more efficient than a two-processor x86 server (Intel Xeon Processor E5-2699 v3) in processing I/O intensive database queries when running the Oracle Advanced Security Transparent Data Encryption (TDE) feature of Oracle Database 12c. The single-processor SPARC T7-1 server is up to 1.4 times faster than the two-processor x86 system for all queries tested, with TDE enabled and without. On a per chip basis, Oracle's SPARC M7 processor is over twice the performance of the Intel Xeon Processor E5-2699 v3 (Haswell). The SPARC T7-1 server is more efficient than the two-processor x86 system for all queries tested, with TDE enabled and without, as measured by CPU utilization.  For example, on Query A the CPU utilization nearly doubled on the x86 server (41% on clear to 79% with TDE) while on the same Query A the SPARC T7-1 server CPU utilization 30% on clear to 38% with TDE. In a head-to-head comparison of system performance using Oracle's Transparent Data Encryption, the SPARC T7-1 single processor system with one SPARC M7 (4.13 GHz) processor outperforms a two-processor x86 server with Intel Xeon Processor E5-2699 v3 (2.3 GHz) processors. The two systems were configured with the same storage environment, 256 GB of memory, the same version of Oracle Database 12c, and with the same high-level of tunings.  All tests run with TDE security used the hardware instructions available on the processors (SPARC or x86). Performance Landscape In the first table below, results are presented for three different queries and a full table scan.  The results labeled "clear" were executed in clear text or without Transparent Data Encryption.  The results labeled "TDE" are with AES-128 encryption enabled for all of the data tables used in the tablespace with the default parameter of db_block_checking=false. Query Times (seconds – smaller is better) System Security Query A Query B Query C Full Table Scan SPARC T7-1 clear 64.0 61.0 54.8 52.7 TDE 65.3 62.8 56.3 53.4 TDE to Clear ratio   Two x86 E5 v3 clear 41% 40% 38% 41% TDE 79% 73% 80% 86% TDE to Clear ratio 1.9x 1.8x 2.1x 2.1x Comparing SPARC and x86 on Utilization SPARC advantage – clear 1.37x 1.25x 1.41x 1.95x SPARC advantage – TDE 2.08x 1.83x 2.22x 2.77x Configuration Summary SPARC Configuration: SPARC T7-1 server with 1 x SPARC M7 processor (4.13 GHz, 32 cores) 256 GB memory Flash storage Oracle Solaris 11.3 Oracle Enterprise Database 12c x86 Configuration: Oracle Server X5-2L system with 2 x Intel Xeon Processor E5-2699 v3 (2.3 GHz, 36 total cores) 256 GB memory Flash storage Oracle Solaris 11.3 Oracle Enterprise Database 12c Note that the two systems were configured with the same storage environment, the same version of Oracle Database 12c, and with the same high-level of tunings. Benchmark Description The benchmark executes a set of queries on a table of approximately 1 TB in size.  The database contains two copies of the table, one that was built using security and one that does not.  The tablespaces used the same layout on the storage and DBMS parameters. Each query is executed individually after a restart of the database and the average of 5 executions of the query is used as the average execution time and the gathering of other system statistics. Description of the queries: Query A: Determines how the market share of a given nation within a region has changed over two years for a given part type. Query B: Identifies customers who might have a problem with parts shipped to them. 
Query C: Determines how much average yearly revenue would be lost if orders were no longer filled for small quantities of certain parts. Full Table Scan: Full table scan of the largest table, over 700 GB of data Key Points and Best Practices For each system, the 1 TB of data is spread evenly across the flash storage in 1 MB stripes.  This was determined to be the most efficient stripe size for a data warehouse environment with large sequential read operations. With each system having the same amount of memory and database software, the same tuning parameters were used on each system to ensure a fair comparison and that each query induced roughly the same amount of I/O throughput per query. Efficiency was verified by looking at not only the average processor utilization (as measured by Oracle Solaris tool pgstat(1M)), but also by measuring the average processor core utilization at the hardware level. See Also SPARC T7-1 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle Database – Transparent Data Encryption oracle.com    OTN   Disclosure Statement Copyright 2016, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of March 14, 2016.
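For context, the AES-128 tablespace encryption used in the TDE runs above is declared when the tablespace is created. A minimal Oracle Database 12c sketch follows; the keystore path, password and tablespace name are illustrative placeholders rather than the tested configuration:

    -- One-time keystore (wallet) setup; the keystore location is configured in sqlnet.ora:
    ADMINISTER KEY MANAGEMENT CREATE KEYSTORE '/u01/app/oracle/wallet' IDENTIFIED BY "ks_password";
    ADMINISTER KEY MANAGEMENT SET KEYSTORE OPEN IDENTIFIED BY "ks_password";
    ADMINISTER KEY MANAGEMENT SET KEY IDENTIFIED BY "ks_password" WITH BACKUP;

    -- Encrypted tablespace using AES-128, as in the secure runs above; scans of tables
    -- in this tablespace exercise the AES hardware instructions on either platform:
    CREATE BIGFILE TABLESPACE dw_enc
      DATAFILE '+DATA' SIZE 1T
      ENCRYPTION USING 'AES128'
      DEFAULT STORAGE (ENCRYPT);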


SPECvirt_2013: SPARC T7-2 World Record Performance for Two- and Four-Chip Systems

Oracle's SPARC T7-2 server delivered a world record SPECvirt_sc2013 result for systems with two to four chips. The SPARC T7-2 server produced a result of 3198 @ 179 VMs SPECvirt_sc2013. The two-chip SPARC T7-2 server beat the best four-chip x86 Intel E7-8890 v3 server (HP ProLiant DL580 Gen9), demonstrating that the SPARC M7 processor is 2.1 times faster than the Intel Xeon Processor E7-8890 v3 (chip-to-chip comparison). The two-chip SPARC T7-2 server beat the best two-chip x86 Intel E5-2699 v3 server results by nearly 2 times (Huawei FusionServer RH2288H V3, HP ProLiant DL360 Gen9). The two-chip SPARC T7-2 server delivered nearly 2.2 times the performance of the four-chip IBM Power System S824 server solution which used 3.5 GHz POWER8 six core chips. The SPARC T7-2 server running Oracle Solaris 11.3 operating system, utilizes embedded virtualization products as the Oracle Solaris 11 zones, which in turn provide a low overhead, flexible, scalable and manageable virtualization environment. The SPARC T7-2 server result used Oracle VM  Server for SPARC 3.3 and Oracle Solaris Zones providing a flexible, scalable and manageable virtualization environment. Performance Landscape Complete benchmark results are at the SPEC website, SPECvirt_sc2013 Results.  The following table highlights the leading two-, and four-chip results for the benchmark, bigger is better.   SPECvirt_sc2013 Leading Two to Four-Chip Results System Processor Chips Result @ VMs Virtualization Software SPARC T7-2 SPARC M7 (4.13 GHz, 32core) 2 3198 @ 179 Oracle VM Server for SPARC 3.3 Oracle Solaris Zones HP ProLiant DL580 Gen9 Intel E7-8890 v3 (2.5 GHz, 18core) 4 3020 @ 168 Red Hat Enterprise Linux 7.1 KVM Lenovo System x3850 X6 Intel E7-8890 v3 (2.5 GHz, 18core) 4 2655 @ 147 Red Hat Enterprise Linux 6.6 KVM Huawei FusionServer RH2288H V3 Intel E5-2699 v3 (2.3 GHz, 18core) 2 1616 @ 95 Huawei FusionSphere V1R5C10 HP ProLiant DL360 Gen9 Intel E5-2699 v3 (2.3 GHz, 18core) 2 1614 @ 95 Red Hat Enterprise Linux 7.1 KVM IBM Power S824 POWER8 (3.5 GHz, 6core) 4 1370 @ 79 PowerVM Enterprise Edition 2.2.3   Configuration Summary System Under Test Highlights: Hardware: 1 x SPARC T7-2 server, with 2 x 4.13 GHz SPARC M7 1 TB memory 2 Sun Dual Port 10GBase-T Adapter 2 Sun Storage Dual 16 Gb Fibre Channel PCIe Universal HBA Software: Oracle Solaris 11.3 Oracle VM Server for SPARC 3.3 (LDom) Oracle Solaris Zones Oracle iPlanet Web Server 7.0.20 Oracle PHP 5.3.29 Dovecot v2.2.18 Oracle WebLogic Server Standard Edition Release 10.3.6 Oracle Database 12c Enterprise Edition (12.1.0.2.0) Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.7.0_85-b15 Storage: 3 x Oracle Server X5-2L, with 2 x Intel Xeon Processor E5-2630 v3 8-core 2.4 GHz 32 GB memory 4 x Oracle Flash Accelerator F160 PCIe Card Oracle Solaris 11.3 1 x Oracle Server X5-2L, with 2 x Intel Xeon Processor E5-2630 v3 8-core 2.4 GHz 32 GB memory 4 x Oracle Flash Accelerator F160 PCIe Card 4x 400 GB SSDs Oracle Solaris 11.3   Benchmark Description SPECvirt_sc2013 is SPEC's updated benchmark addressing performance evaluation of datacenter servers used in virtualized server consolidation. SPECvirt_sc2013 measures the end-to-end performance of all system components including the hardware, virtualization platform, and the virtualized guest operating system and application software. It utilizes several SPEC workloads representing applications that are common targets of virtualization and server consolidation. 
The workloads were made to match a typical server consolidation scenario of CPU resource requirements, memory, disk I/O, and network utilization for each workload. These workloads are modified versions of SPECweb2005, SPECjAppServer2004, SPECmail2008, and SPEC CPU2006. The client-side SPECvirt_sc2013 harness controls the workloads. Scaling is achieved by running additional sets of virtual machines, called "tiles", until overall throughput reaches a peak. Key Points and Best Practices The SPARC T7-2 server running the Oracle Solaris 11.3, utilizes embedded virtualization products as the Oracle VM Server for SPARC and Oracle Solaris Zones, which provide a low overhead, flexible, scalable and manageable virtualization environment. In order to provide a high level of data integrity and availability, all the benchmark data sets are stored on mirrored (RAID1) storage Using Oracle VM Server for SPARC to bind the SPARC M7 processor with its local memory optimized the memory use in this virtual environment. This improved result used a fractional tile to fully saturate the system. See Also SPECvirt_sc2013 Results Page SPARC T7-2 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle WebLogic Suite oracle.com    OTN Oracle Fusion Middleware oracle.com    OTN Java oracle.com    OTN   Disclosure Statement SPEC and the benchmark name SPECvirt_sc are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 11/19/2015. SPARC T7-2, SPECvirt_sc2013 3198@179 VMs; HP ProLiant DL580 Gen9, SPECvirt_sc2013 3020@168 VMs; Lenovo x3850 X6; SPECvirt_sc2013 2655@147 VMs; Huawei FusionServer RH2288H V3, SPECvirt_sc2013 1616@95 VMs; HP ProLiant DL360 Gen9, SPECvirt_sc2013 1614@95 VMs; IBM Power S824, SPECvirt_sc2013 1370@79 VMs.


SPECjbb2015: SPARC T7-1 World Record for 1 Chip Result

Updated November 30, 2015 to point to published results and add latest, best x86 two-chip result. Oracle's SPARC T7-1 server, using Oracle Solaris and Oracle JDK, produced world record one-chip SPECjbb2015 benchmark (MultiJVM metric) results beating all previous one- and two-chip results in the process.  This benchmark was designed by the industry to showcase Java performance in the Enterprise. Performance is expressed in terms of two metrics, max-jOPS which is the maximum throughput number, and critical-jOPS which is critical throughput under service level agreements (SLAs). The SPARC T7-1 server achieved 120,603 SPECjbb2015-MultiJVM max-jOPS and 60,280 SPECjbb2015-MultiJVM critical-jOPS on the SPECjbb2015 benchmark. The one-chip SPARC T7-1 server delivered 2.5 times more max-jOPS performance per chip than the best two-chip result which was run on the Cisco UCS C220 M4 server using Intel v3 processors.  The SPARC T7-1 server also produced 4.3 times more critical-jOPS performance per chip compared to the Cisco UCS C220 M4.  The Cisco result enabled the COD BIOS option. The SPARC T7-1 server delivered 2.7 times more max-jOPS performance per chip than the IBM Power S812LC using POWER8 chips.  The SPARC T7-1 server also produced 4.6 times more critical-jOPS performance per chip compared to the IBM server. The SPARC M7 processor also delivered 1.45 times more critical-jOPS performance per core than IBM POWER8 processor. The one-chip SPARC T7-1 server delivered 3 times more max-jOPS performance per chip than the two-chip result on the Lenovo Flex System x240 M5 using Intel v3 processors.  The SPARC T7-1 server also produced 2.8 times more critical-jOPS performance per chip compared to the Lenovo.  The Lenovo result did not enable the COD BIOS option. The SPARC T5-2 server achieved 80,889 SPECjbb2015-MultiJVM max-jOPS and 37,422 SPECjbb2015-MultiJVM critical-jOPS on the SPECjbb2015 benchmark. The one-chip SPARC T7-1 server demonstrated a 3 times max-jOPS performance improvement per chip compared to the previous generation two-chip SPARC T5-2 server. From SPEC's press release: "The SPECjbb2015 benchmark is based on the usage model of a worldwide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases, and data-mining operations.  It exercises Java 7 and higher features, using the latest data formats (XML), communication using compression, and secure messaging." The Cluster on Die (COD) mode is a BIOS setting that effectively splits the chip in half, making the operating system think it has twice as many chips as it does (in this case, four, 9 core chips).  Intel has said that COD is appropriate only for highly NUMA optimized workloads.  Dell has shown that there is a 3.7x slower bandwidth to the other half of the chip split by COD. Performance Landscape One- and two-chip results of SPECjbb2015 MultiJVM from www.spec.org as of November 30, 2015. 
SPECjbb2015 One- and Two-Chip Results System SPECjbb2015-MultiJVM OS JDK Notes max-jOPS critical-jOPS SPARC T7-1 1 x SPARC M7 (4.13 GHz, 1x 32core) 120,603 60,280 Oracle Solaris 11.3 8u66 - Cisco UCS C220 M4 2 x Intel E5-2699 v3 (2.3 GHz, 2x 18core) 97,551 28,318 Red Hat 6.5 8u60 COD Dell PowerEdge R730 2 x Intel E5-2699 v3 (2.3 GHz, 2x 18core) 94,903 29,033 SUSE 12 8u60 COD Cisco UCS C220 M4 2 x Intel E5-2699 v3 (2.3 GHz, 2x 18core) 92,463 31,654 Red Hat 6.5 8u60 COD Lenovo Flex System x240 M5 2 x Intel E5-2699 v3 (2.3 GHz, 2x 18core) 80,889 43,654 Red Hat 6.5 8u60 - SPARC T5-2 2 x SPARC T5 (3.6 GHz, 2x 16core) 80,889 37,422 Oracle Solaris 11.2 8u66 - Oracle Server X5-2L 2 x Intel E5-2699 v3 (2.3 GHz, 2x 18core) 76,773 26,458 Oracle Solaris 11.2 8u60 - Sun Server X4-2 2 x Intel E5-2697 v2 (2.7 GHz, 2x 12core) 52,482 19,614 Oracle Solaris 11.1 8u60 - HP ProLiant DL120 Gen9 1 x Intel Xeon E5-2699 v3 (2.3 GHz, 18core) 47,334 9,876 Red Hat 7.1 8u51 - IBM Power S812LC 1 x POWER8 (2.92 GHz, 10core) 44,883 13,032 Ubuntu 14.04.3 J9 VM - * Note COD: result uses non-default BIOS setting of Cluster on Die (COD) which splits the chip in two. This requires specific NUMA optimization, in that memory traffic to the other half of the chip can see a 3.7x decrease in bandwidth Configuration Summary Systems Under Test: SPARC T7-1 1 x SPARC M7 processor (4.13 GHz) 512 GB memory (16 x 32 GB dimms) Oracle Solaris 11.3 (11.3.1.5.0) Java HotSpot 64-Bit Server VM, version 1.8.0_66   SPARC T5-2 2 x SPARC T5 processors (3.6 GHz) 512 GB memory (32 x 16 GB dimms) Oracle Solaris 11.2 Java HotSpot 64-Bit Server VM, version 1.8.0_66   Benchmark Description The benchmark description, as found at the SPEC website. The SPECjbb2015 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community. Features include: A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations. Both a pure throughput metric and a metric that measures critical throughput under service level agreements (SLAs) specifying response times ranging from 10ms to 100ms. Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers. Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security. Support for virtualization and cloud environments. Key Points and Best Practices For the SPARC T5-2 server results, processor sets were use to isolate the different JVMs used during the test. See Also SPECjbb2015 Results Website More Information on SPECjbb2015 SPARC T7-1 Server oracle.com    OTN SPARC T5-2 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Java oracle.com    OTN   Disclosure Statement SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results from http://www.spec.org as of 11/30/2015. 
SPARC T7-1 120,603 SPECjbb2015-MultiJVM max-jOPS, 60,280 SPECjbb2015-MultiJVM critical-jOPS; Cisco UCS C220 M4 97,551 SPECjbb2015-MultiJVM max-jOPS, 28,318 SPECjbb2015-MultiJVM critical-jOPS; Dell PowerEdge R730 94,903 SPECjbb2015-MultiJVM max-jOPS, 29,033 SPECjbb2015-MultiJVM critical-jOPS; Cisco UCS C220 M4 92,463 SPECjbb2015-MultiJVM max-jOPS, 31,654 SPECjbb2015-MultiJVM critical-jOPS; Lenovo Flex System x240 M5 80,889 SPECjbb2015-MultiJVM max-jOPS, 43,654 SPECjbb2015-MultiJVM critical-jOPS; SPARC T5-2 80,889 SPECjbb2015-MultiJVM max-jOPS, 37,422 SPECjbb2015-MultiJVM critical-jOPS; Oracle Server X5-2L 76,773 SPECjbb2015-MultiJVM max-jOPS, 26,458 SPECjbb2015-MultiJVM critical-jOPS; Sun Server X4-2 52,482 SPECjbb2015-MultiJVM max-jOPS, 19,614 SPECjbb2015-MultiJVM critical-jOPS; HP ProLiant DL120 Gen9 47,334 SPECjbb2015-MultiJVM max-jOPS, 9,876 SPECjbb2015-MultiJVM critical-jOPS; IBM Power S812LC 44,883 SPECjbb2015-MultiJVM max-jOPS, 13,032 SPECjbb2015-MultiJVM critical-jOPS.


Simultaneous OLTP & In-memory Analytics: SPARC T7-1 Faster Than x86 E5 v3

A goal of the modern business is real-time enterprise where analytics are run simultaneously with transaction processing on the same system to provide the most effective decision making. Oracle Database 12c Enterprise Edition utilizing the In-Memory option is designed to have the same database able to perform transactions at the highest performance and to transform analytical calculations that once took days or hours to complete orders of magnitude faster. Oracle's SPARC M7 processor has deep innovations to take the real-time enterprise to the next level of performance. In this test both OLTP transactions and analytical queries were run in a single database instance using all of the same features of Oracle Database 12c Enterprise Edition utilizing the In-Memory option in order to compare the advantages of the SPARC M7 processor compared to a generic x86 processor. On both systems the OLTP and analytical queries both took about half of the processing load of the server. In this test Oracle's SPARC T7-1 server is compared to a two-chip x86 E5 v3 based server. On analytical queries the SPARC M7 processor is 8.2x faster than the x86 E5 v3 processor. Simultaneously on OLTP transactions the SPARC M7 processor is 2.9x faster than the x86 E5 v3 processor. In addition, the SPARC T7-1 server had better OLTP transactional response time than the x86 E5 v3 server. The SPARC M7 processor does this by using the Data Accelerator co-processor (DAX). DAX is not a SIMD instruction set, but rather an actual co-processor that offloads in-memory queries which frees the cores up for other processing. The DAX has direct access to the memory bus and can execute scans at near full memory bandwidth. Oracle makes the DAX API available to other applications, so this kind of acceleration is not just to the Oracle database, it is open. The results below were obtained running a set of OLTP transactions and analytic queries simultaneously against two schema: a real time online orders system and a related historical orders schema configured as a real cardinality database (RCDB) star schema. The in-memory analytics RCDB queries are executed using the Oracle Database 12c In-Memory columnar feature. The SPARC T7-1 server and the x86 E5 v3 server both ran OLTP transactions and the in-memory analytics on the same database instance using Oracle Database 12c Enterprise Edition utilizing the In-Memory option. The SPARC T7-1 server ran the in-memory analytics RCDB based queries 8.2x faster per chip than a two-chip x86 E5 v3 server on the 48 stream test. The SPARC T7-1 server delivers 2.9x higher OLTP transaction throughput results per chip than a two-chip x86 E5 v3 server on the 48 stream test. Performance Landscape The table below compares the SPARC T7-1 server and 2-chip x86 E5 v3 server while running OLTP and in-memory analytics against tables in the same database instance. The same set of transactions and queries were executed on each system.   Real-Time Enterprise Performance Chart 48 RCDB DSS Streams, 224 OLTP users System OLTP Transactions Analytic Queries Trans Per Second Per Chip Advantage Average Response Time Queries Per Minute Per Chip Advantage SPARC T7-1 1 x SPARC M7 (32core) 338 K 2.9x 11 (msec) 267 8.2x x86 E5 v3 server 2 x Intel E5-2699 v3 (2x 18core) 236 K 1.0 12 (msec) 65 1.0   The number of cores listed is per chip. 
The Per Chip Advantage is computed by normalizing to a single chip's performance. Configuration Summary SPARC Server: 1 X SPARC T7-1 server 1 X SPARC M7 processor 256 GB Memory Oracle Solaris 11.3 Oracle Database 12c Enterprise Edition Release 12.1.0.2.10   x86 Server: 1 X Oracle Server X5-2L 2 X Intel Xeon Processor E5-2699 v3 256 GB Memory Oracle Linux 6 Update 5 (3.8.13-16.2.1.el6uek.x86_64) Oracle Database 12c Enterprise Edition Release 12.1.0.2.10   Benchmark Description The Real-Time Enterprise benchmark simulates the demands of customers who want to simultaneously run both their OLTP database and the related historical warehouse DSS data that would be based on that OLTP data. It answers the question of how a system will perform when doing data analysis while at the same time executing real-time on-line transactions. The OLTP workload simulates an Order Inventory System that exercises both reads and writes with a potentially large number of users, stressing lock management and connectivity as well as database access. The number of customers, orders and users is fully parameterized. This benchmark is based on a 100 GB dataset, 15 million customers, 600 million orders and up to 580 users. The workload consists of a number of transaction types including show-expenses, part-cost, supplier-phone, low-inv, high-inv, update-price, update-phone, update-cost, and new-order. The real cardinality database (RCDB) schema was created to showcase the potential speedup one may see moving from an on-disk, row-format data warehouse/star schema to utilizing Oracle Database 12c's In-Memory feature for analytical queries. The workload consists of as many as 2,304 unique queries asking questions such as "In 2014, what was the total revenue of single item orders", or "In August 2013, how many orders exceeded a total price of $50". Questions like these can help a company see where to focus for further revenue growth or identify weaknesses in their offerings. RCDB scale factor 1050 represents a 1.05 TB data warehouse. It is transformed into a star schema of 1.0 TB, and then becomes 110 GB in size when loaded in memory. It consists of 1 fact table and 4 dimension tables with over 10.5 billion rows. There are 56 columns with most cardinalities varying between 5 and 2,000, a primary key being an example of something outside this range. Two reports are generated: one for the OLTP-Perf workload and one for the RCDB DSS workload. For the analytical DSS workload, queries per minute and average query elapsed times are reported. For the OLTP-Perf workload, both transactions per second in thousands and OLTP average response times in milliseconds are reported. Key Points and Best Practices This benchmark utilized the SPARC M7 processor's co-processor DAX for query acceleration. All SPARC T7-1 server results were run with out-of-the-box tuning for Oracle Solaris. 
All Oracle Server X5-2L system results were run with out of the box tunings for Oracle Linux except for the setting in /etc/sysctl.conf to get large pages for the Oracle Database: vm.nr_hugepages=98304 To create an in memory area, the following was added to the init.ora: inmemory_size = 120g An example of how to set a table to be in memory is below: ALTER TABLE CUSTOMER INMEMORY MEMCOMPRESS FOR QUERY HIGH   See Also SPARC T7-1 Server oracle.com    OTN Oracle Server X5-2L oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle Database – In-Memory oracle.com    OTN   Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 25 October 2015.
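As a side note to the discussion of DAX-accelerated in-memory scans above, a quick way to confirm that a session's analytic queries are actually using the in-memory columnar path (rather than row-format buffer cache scans) is to inspect the "IM scan" session statistics after running a query. A minimal sketch, assuming the Oracle Database 12c In-Memory statistic names; the exact set of statistics can vary by release:

    SELECT n.name, s.value
      FROM v$statname n
      JOIN v$mystat s ON s.statistic# = n.statistic#
     WHERE n.name LIKE 'IM scan%'
       AND s.value > 0
     ORDER BY s.value DESC;

Non-zero values for statistics such as "IM scan rows" indicate that rows were processed by in-memory column store scans, which on the SPARC M7 processor are the scans eligible for DAX offload.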


In-Memory Aggregation: SPARC T7-2 Beats 4-Chip x86 E7 v2

Oracle's SPARC T7-2 server demonstrates better performance both in throughput and number of users compared to a four-chip x86 E7 v2 server. The workload consists of a realistic set of business intelligence (BI) queries in a multi-user environment against a 500 million row fact table using Oracle Database 12c Enterprise Edition utilizing the In-Memory option. The SPARC M7 chip delivers 2.3 times more query throughput per hour compared to an x86 E7 v2 chip. The two-chip SPARC T7-2 server delivered 13% more query throughput per hour compared to a four-chip x86 E7 v2 server. The two-chip SPARC T7-2 server supported over 10% more users than a four-chip x86 E7 v2 server. Both the SPARC server and x86 server ran with just under 5-second average response times.   Performance Landscape The results below were run as part of this benchmark. All results use 500,000,000 fact table rows and had average CPU utilization of 100%.   In-Memory Aggregation 500 Million Row Fact Table System Users Queries per Hour Queries per Hour per Chip Average Response Time SPARC T7-2 2 x SPARC M7 (32core) 190 127,540 63,770 4.99 (sec) x86 E7 v2 4 x E7-8895 v2 (4x 15core) 170 112,470 28,118 4.92 (sec)   The number of cores listed is per chip. Configuration Summary SPARC Configuration: SPARC T7-2 2 x 4.13 GHz SPARC M7 processors 1 TB memory (32 x 32 GB) Oracle Solaris 11.3 Oracle Database 12c Enterprise Edition (12.1.0.2.0)   x86 Configuration: Sun Server X4-4 4 x Intel Xeon Processor E7-8895 v2 1 TB memory (64 x 16 GB) Oracle Linux Server 6.5 (kernel 2.6.32-431.el6.x86_64) Oracle Database 12c Enterprise Edition (12.1.0.2.0)   Benchmark Description The benchmark is designed to highlight the efficacy of the Oracle Database 12c In-Memory Aggregation facility (join and aggregation optimizations) together with the fast scan and filtering capability of Oracle's in-memory column store facility. The benchmark runs analytic queries such as those seen in typical customer business intelligence (BI) applications. These are done in the context of a star schema database.  The key metrics are query throughput, number of users and average response times. The implementation of the workload used to achieve the results is based on a schema consisting of 9 dimension tables together with a 500 million row fact table. The query workload consists of randomly generated star-style queries simulating a collection of ad-hoc business intelligence users. Up to 300 concurrent users have been run, with each user running approximately 500 queries. The implementation includes a relatively small materialized view, which contains some precomputed data. The creation of the materialized view takes only a few minutes.   
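Each of the randomly generated star-style queries is a join of the fact table to a few dimension tables with filters and a GROUP BY. A minimal illustrative example against the same schema used by the materialized view shown below (the filter literal is hypothetical):

    SELECT d1.calendar_year_name,
           d2.department_name,
           SUM(f.sales) AS sales,
           SUM(f.units) AS units
      FROM time_dim d1, product_dim d2, units_fact_500M_10 f
     WHERE d1.day_id = f.day_id
       AND d2.item_id = f.item_id
       AND d1.calendar_year_name = 'CY2014'
     GROUP BY d1.calendar_year_name, d2.department_name;

With the fact table populated in the in-memory column store and In-Memory Aggregation (the vector transformation) in play, queries of this shape are intended to be answered from fast columnar scans plus the precomputed data in the materialized view.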
Key Points and Best Practices The reported results were obtained by using the following settings on both systems except where otherwise noted: starting with a completely cold shared pool without making use of the result cache without using dynamic sampling or adaptive query optimization running all queries in parallel, where parallel_max_servers = 1600 (on the SPARC T7-2) or parallel_max_servers = 240 (on the Sun Server X4-4) each query hinted with PARALLEL(4) parallel_degree_policy = limited having appropriate queries rewritten to the materialized view, MV3, defined as SELECT /*+ append vector_transform */ d1.calendar_year_name, d1.calendar_quarter_name, d2.all_products_name, d2.department_name, d2.category_name, d2.type_name, d3.all_customers_name, d3.region_name, d3.country_name, d3.state_province_name, d4.all_channels_name, d4.class_name, d4.channel_name, d5.all_ages_name, d5.age_name, d6.all_sizes_name, d6.household_size_name, d7.all_years_name, d7.years_customer_name, d8.all_incomes_name, d8.income_name, d9.all_status_name, d9.marital_status_name, SUM(f.sales) AS sales, SUM(f.units) AS units, SUM(f.measure_3) AS measure_3, SUM(f.measure_4) AS measure_4, SUM(f.measure_5) AS measure_5, SUM(f.measure_6) AS measure_6, SUM(f.measure_7) AS measure_7, SUM(f.measure_8) AS measure_8, SUM(f.measure_9) AS measure_9, SUM(f.measure_10) AS measure_10 FROM time_dim d1, product_dim d2, customer_dim_500M_10 d3, channel_dim d4, age_dim d5, household_size_dim d6, years_customer_dim d7, income_dim d8, marital_status_dim d9, units_fact_500M_10 f WHERE d1.day_id = f.day_id AND d2.item_id = f.item_id AND d3.customer_id = f.customer_id AND d4.channel_id = f.channel_id AND d5.age_id = f.age_id AND d6.household_size_id = f.household_size_id AND d7.years_customer_id = f.years_customer_id AND d8.income_id = f.income_id AND d9.marital_status_id = f.marital_status_id GROUP BY d1.calendar_year_name, d1.calendar_quarter_name, d2.all_products_name, d2.department_name, d2.category_name, d2.type_name, d3.all_customers_name, d3.region_name, d3.country_name, d3.state_province_name, d4.all_channels_name, d4.class_name, d4.channel_name, d5.all_ages_name, d5.age_name, d6.all_sizes_name, d6.household_size_name, d7.all_years_name, d7.years   See Also SPARC T7-2 Server oracle.com    OTN Sun Server X4-4 oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle Database – In-Memory oracle.com    OTN   Disclosure Statement Copyright 2015, Oracle and/or its affiliates.  All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of October 25, 2015.


In-Memory Database: SPARC T7-1 Faster Than x86 E5 v3

Fast analytics on large databases are critical to transforming key business processes. Oracle's SPARC M7 processors are specifically designed to accelerate in-memory analytics using Oracle Database 12c Enterprise Edition utilizing the In-Memory option. The SPARC M7 processor outperforms an x86 E5 v3 chip by up to 10.8x on analytics queries. In order to test real world deep analysis on the SPARC M7 processor a scenario with over 2,300 analytical queries was run against a real cardinality database (RCDB) star schema.  This benchmark was audited by Enterprise Strategy Group (ESG). ESG is an IT research, analyst, strategy, and validation firm focused on the global IT community. The SPARC M7 processor does this by using Data Accelerator co-processor (DAX). DAX is not a SIMD instruction but rather an actual co-processor that offloads in-memory queries which frees the cores up for other processing. The DAX has direct access to the memory bus and can execute scans at near full memory bandwidth. Oracle makes the DAX API available to other applications, so this kind of acceleration not just for the Oracle database, it is open. The SPARC M7 processor delivers up to a 10.8x Query Per Minute speedup per chip over the Intel Xeon Processor E5-2699 v3 when executing analytical queries using the In-Memory option of Oracle Database 12c. Oracle's SPARC T7-1 server delivers up to a 5.4x Query Per Minute speedup over the 2-chip x86 E5 v3 server when executing analytical queries using the In-Memory option of Oracle Database 12c. The SPARC T7-1 server delivers over 143 GB/sec of memory bandwidth which is up to 7x more than the 2-chip x86 E5 v3 server when the Oracle Database 12c is executing the same analytical queries against the RCDB. The SPARC T7-1 server scanned over 48 billion rows per second through the database. The SPARC T7-1 server compresses the on-disk RCDB star schema by around 6x when using the Memcompress For Query High setting (more information following below) and by nearly 10x compared to a standard data warehouse row format version of the same database. Performance Landscape The table below compares the SPARC T7-1 server and 2-chip x86 E5 v3 server.  The x86 E5 v3 server single chip compares are from actual measurements against a single chip configuration. The number of cores is per chip, multiply by number of chips to get system total. RCDB Performance Chart 2,304 Queries System Elapsed Seconds Queries Per Minute System Adv Chip Adv DB Memory Bandwidth SPARC T7-1 1 x SPARC M7 (32core) 381 363 5.4x 10.8x 143 GB/sec x86 E5 v3 server 2 x Intel E5-2699 v3 (2x 18core) 2059 67 1.0x 2.0x 20 GB/sec x86 E5 v3 server 1 x Intel E5-2699 v3 (18core) 4096 34 0.5x 1.0x 10 GB/sec   Fused Decompress + Scan The In-Memory feature of Oracle Database 12c puts tables in columnar format. There are different levels of compression that can be applied. One of these is Oracle Zip (OZIP) which is used with the "MEMCOMPRESS FOR QUERY HIGH" setting. Typically when compression is applied to data, in order to operate on it, the data must be: (1) Decompressed (2) Written back to memory in uncompressed form (3) Scanned and the results returned. When OZIP is applied to the data inside of an In-Memory Columnar Unit (or IMCU, an N sized chunk of rows), the DAX is able to take this data in its compressed format and operate (scan) directly upon it, returning results in a single step. 
This not only saves on compute power by not having the CPU do the decompression step, but also on memory bandwidth as the uncompressed data is not put back into memory. Only the results are returned. To illustrate this, a microbenchmark was used which measured the amount of rows that could be scanned per second. Compression This performance test was run on a Scale Factor 1750 database, which represents a 1.75 TB row format data warehouse. The database is then transformed into a star schema which ends up around 1.1 TB in size. The star schema is then loaded in memory with a setting of "MEMCOMPRESS FOR QUERY HIGH", which focuses on performance with somewhat more aggressive compression. This memory area is a separate part of the System Global Area (SGA) which is defined by the database initialization parameter "inmemory_size". See below for an example. Here is a breakdown of each table in memory with compression ratios.   Column Name Original Size (Bytes) In Memory Size (Bytes) Compression Ratio LINEORDER 1,103,524,528,128 178,586,451,968 6.2x DATE 11,534,336 1,179,648 9.8x PART 11,534,336 1,179,648 9.8x SUPPLIER 11,534,336 1,179,648 9.8x CUSTOMER 11,534,336 1,179,648 9.8x   Configuration Summary SPARC Server: 1 X SPARC T7-1 server 1 X SPARC M7 processor 512 GB memory Oracle Solaris 11.3 Oracle Database 12c Enterprise Edition Release 12.1.0.2.13   x86 Server: 1 X Oracle Server X5-2L 2 X Intel Xeon Processor E5-2699 v3 512 GB memory Oracle Linux 6 Update 5 (3.8.13-16.2.1.el6uek.x86_64) Oracle Database 12c Enterprise Edition Release 12.1.0.2.13   Benchmark Description The real cardinality database (RCDB) benchmark was created to showcase the potential speedup one may see moving from on disk, row format data warehouse/Star Schema, to utilizing Oracle Database 12c's In-Memory feature for analytical queries. The workload consists of 2,304 unique queries asking questions such as "In 2014, what was the total revenue of single item orders", or "In August 2013, how many orders exceeded a total price of $50". Questions like these can help a company see where to focus for further revenue growth or identify weaknesses in their offerings. RCDB scale factor 1750 represents a 1.75 TB data warehouse. It is transformed into a star schema of 1.1 TB, and then becomes 179 GB in size when loaded in memory. It consists of 1 fact table, and 4 dimension tables with over 10.5 billion rows. There are 56 columns with most cardinalities varying between 5 and 2,000, a primary key being an example of something outside this range. One problem with many industry standard generated databases is that as they have grown in size the cardinalities for the generated columns have become exceedingly unrealistic. For instance one industry standard benchmark uses a schema where at scale factor 1 TB it calls for the number of parts to be SF * 800,000. A 1 TB database that calls for 800 million unique parts is not very realistic. Therefore RCDB attempts to take some of these unrealistic cardinalities and size them to be more representative of at least a section of customer data. Obviously one cannot encompass every database in one schema, this is just an example. We carefully scaled each system so that the optimal number of users was run on each system under test so that we did not create artificial bottlenecks. Each user ran an equal number of queries and the same queries were run on each system, allowing for a fair comparison of the results. 
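To make the query style concrete, the sketch below issues one such analytical question over the star schema through JDBC. This is only an illustration under assumptions: the LINEORDER table name comes from the compression breakdown above, but the column names (lo_revenue, lo_quantity, lo_orderyear), the connection string, and the credentials are hypothetical and are not part of the published RCDB schema; a real star-schema query would typically join to the DATE dimension instead of filtering on a year column in the fact table. The Oracle JDBC driver is assumed to be on the classpath, and the query is served from the in-memory column store once the table has been populated with INMEMORY.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class RcdbQueryExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical host/service and credentials; replace with real values.
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/rcdb", "rcdb_user", "rcdb_pass")) {
            // "In 2014, what was the total revenue of single item orders?"
            // Column names are illustrative only.
            String sql = "SELECT SUM(lo_revenue) FROM lineorder "
                       + "WHERE lo_orderyear = 2014 AND lo_quantity = 1";
            try (PreparedStatement ps = con.prepareStatement(sql);
                 ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    System.out.println("Total revenue: " + rs.getLong(1));
                }
            }
        }
    }
}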
Key Points and Best Practices This benchmark utilized the SPARC M7 processor's co-processor DAX for query acceleration. All SPARC T7-1 server results were run with out of the box tuning for Oracle Solaris. All Oracle Server X5-2L system results were run with out of the box tunings for Oracle Linux except for the setting in /etc/sysctl.conf to get large pages for the Oracle Database: vm.nr_hugepages=64520 To create an in memory area, the following was added to the init.ora: inmemory_size = 200g An example of how to set a table to be in memory is below: ALTER TABLE CUSTOMER INMEMORY MEMCOMPRESS FOR QUERY HIGH See Also Audit report by Enterprise Strategy Group SPARC T7-1 Server oracle.com    OTN Oracle Server X5-2L oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle Database – In-Memory oracle.com    OTN   Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 10/25/2015.


Benchmark

Hadoop TeraSort: SPARC T7-4 Top Per-Chip Performance

Oracle's SPARC T7-4 server using virtualization delivered an outstanding single server result running the Hadoop TeraSort benchmark.  The SPARC T7-4 server was run with and without security. Even the secure runs on the SPARC M7 processor based server performed much faster per chip compared to competitive unsecure results. The SPARC T7-4 server on a per chip basis is 4.7x faster than an IBM POWER8 based cluster on the 10 TB Hadoop TeraSort benchmark. The SPARC T7-4 server running with ZFS encryption enabled on the 10 TB Hadoop TeraSort benchmark is 4.6x faster than an unsecure x86 v2 cluster on a per chip basis. The SPARC T7-4 server running with ZFS encryption (AES-256-GCM) enabled on the 10 TB Hadoop TeraSort benchmark is 4.3x faster than an unsecure (plain-text) IBM POWER8 cluster on a per chip basis. The SPARC T7-4 server ran the 10 TB Hadoop TeraSort benchmark in 4,259 seconds. Performance Landscape The following table presents results for the 10 TB Hadoop TeraSort benchmark. The rate results are determined by taking the dataset size (10**13) and dividing by the time (in minutes). These rates are further normalized by the number of systems or chips used in obtaining the results.   10 TB Hadoop TeraSort Performance Landscape System Security Nodes Total Chips Time (sec) Sort Rate (GB/min) Per Node Per Chip SPARC T7-4 SPARC M7 (4.13 GHz) unsecure 1 4 4,259 140.9 35.2 SPARC T7-4 SPARC M7 (4.13 GHz) AES-256-GCM 1 4 4,657 128.8 32.2 IBM Power System S822L POWER8 (3.0 GHz) unsecure 8 32 2,490 30.1 7.5 Dell R720xd/VMware Intel Xeon E5-2680 v2 (2.8 GHz) unsecure 32 64 1,054 17.8 8.9 Cisco UCS CPA C240 M3 Intel Xeon E5-2665 (2.4 GHz) unsecure 16 32 3,112 12.0 6.0   Configuration Summary Server: SPARC T7-4 4 x SPARC M7 processors (4.13 GHz) 2 TB memory (64 x 32 GB) 6 x 600 GB 10K RPM SAS-2 HDD 10 GbE Oracle Solaris 11.3 (11.3.0.29) Oracle Solaris Studio 12.4 Java SE Runtime Environment (build 1.7.0_85-b33) Hadoop 1.2.1   External Storage (Common Multiprotocol SCSI TARget, or COMSTAR enables system to be seen as a SCSI target device): 16 x Sun Server X3-2L 2 x Intel Xeon E5-2609 (2.4 GHz) 16 GB memory (2 x 8 GB) 2 x 600 GB SAS-2 HDD 12 x 3 TB SAS-1 HDD 4 x Sun Flash Accelerator F40 PCIe Card Oracle Solaris 11.1 (11.1.16.5.0) Please note: These devices are only used as storage. No Hadoop is run on these COMSTAR storage nodes. There was no compression or encryption done on these COMSTAR storage nodes.   Benchmark Description The Hadoop TeraSort benchmark sorts 100-byte records by a contained 10-byte random key.  Hadoop TeraSort is characterized by high I/O bandwidth between each compute/data node of a Hadoop cluster and the disk drives that are attached to that node. Note: benchmark size is measured by power-of-ten not power-of-two bytes; 1 TB sort is sorting 10^12 Bytes = 10 billion 100-byte rows using an embedded 10-Byte key field of random characters, 100 GB sort is sorting 10^11 Bytes = 1 billion 100-byte rows, etc. Key Points and Best Practices The SPARC T7-4 server was configured with 15 Oracle Solaris Zones.  Each Zone was running one Hadoop data-node with HDFS layered on an Oracle Solaris ZFS volume. Hadoop uses a distributed, shared nothing, batch processing framework employing divide-conquer serial Map and Reduce JVM tasks with performance coming from scale-out concurrency (e.g. more tasks) rather than parallelism. 
Only one job scheduler and task manager can be configured per data/compute-node and both (job scheduler and task manager) have inherent scaling limitations (the hadoop design target being small compute-nodes and hundreds or even thousands of them). Multiple data-nodes significantly help improve overall system utilization – HDFS becomes more distributed with more processes servicing file system operations, and more task-trackers are managing all the MapReduce work. On large node systems virtualization is required to improve utilization by increasing the number of independent data/compute nodes each running their own hadoop processes. I/O bandwidth to the local disk drives and network communication bandwidth are the primary determinants of Hadoop performance.  Typically, Hadoop reads input data files from HDFS during the Map phase of computation, and stores intermediate file back to HDFS. Then during the subsequent Reduce phase of computation, Hadoop reads the intermediate files, and outputs the final result. The Map and Reduce phases are executed concurrently by multiple Map tasks and Reduce tasks. Tasks are purpose-built stand-alone serial applications often written in Java (but can be written in any programming language or script). See Also Hadoop Hadoop TeraSort package SPARC T7-4 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Java oracle.com    OTN Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.  Results as of 25 October 2015. Competitive results found at: Dell R720xd/VMware, IBM S822L, Cisco C240 M3
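For reference, the sort-rate normalization described in the Performance Landscape above (dataset size divided by elapsed time in minutes, then divided by nodes or chips) can be reproduced in a few lines. This is a minimal sketch using the published SPARC T7-4 unsecure numbers; it is only arithmetic, not part of the benchmark itself.

public class TeraSortRate {
    public static void main(String[] args) {
        double datasetBytes = 1e13;   // 10 TB TeraSort input (power-of-ten bytes)
        double elapsedSec = 4259;     // SPARC T7-4 unsecure run from the table above
        int nodes = 1, chips = 4;

        double minutes = elapsedSec / 60.0;
        double gbPerMin = datasetBytes / minutes / 1e9;   // ~140.9 GB/min
        System.out.printf("Sort rate: %.1f GB/min, per node: %.1f, per chip: %.1f%n",
                gbPerMin, gbPerMin / nodes, gbPerMin / chips);
    }
}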


Benchmark

Memory and Bisection Bandwidth: SPARC T7 and M7 Servers Faster Than x86 and POWER8

The STREAM benchmark measures delivered memory bandwidth on a variety of memory intensive tasks.  Delivered memory bandwidth is key to a server delivering high performance on a wide variety of workloads.  The STREAM benchmark is typically run where each chip in the system gets its memory requests satisfied from local memory.  This report presents performance of Oracle's SPARC M7 processor based servers and compares their performance to x86 and IBM POWER8 servers. Bisection bandwidth on a server is a measure of the cross-chip data bandwidth between the processors of a system where no memory access is local to the processor.  Systems with large cross-chip penalties show dramatically lower bisection bandwidth.  Real-world ad hoc workloads tend to perform better on systems with better bisection bandwidth because their memory usage characteristics tend to be chaotic. IBM says the sustained or delivered bandwidth of the IBM POWER8 12-core chip is 230 GB/s. This number is a peak bandwidth calculation: 230.4 GB/sec = 9.6 GHz * 3 (r+w) * 8 byte. A similar calculation is used by IBM for the POWER8 dual-chip-module (two 6-core chips) to show a sustained or delivered bandwidth of 192 GB/sec (192.0 GB/sec = 8.0 GHz * 3 (r+w) * 8 byte).  Peaks are the theoretical limits used for marketing hype, but true measured delivered bandwidth is the only useful comparison to help one understand delivered performance of real applications. The STREAM benchmark is easy to run and anyone can measure memory bandwidth on a target system (see Key Points and Best Practices section). The SPARC M7-8 server delivers over 1 TB/sec on the STREAM benchmark.  This is over 2.4 times the triad bandwidth of an eight-chip x86 E7 v3 server. The SPARC T7-4 delivered 2.2 times the STREAM triad bandwidth of a four-chip x86 E7 v3 server and 1.7 times the triad bandwidth of a four-chip IBM Power System S824 server. The SPARC T7-2 delivered 2.5 times the STREAM triad bandwidth of a two-chip x86 E5 v3 server. The SPARC M7-8 server delivered over 8.5 times the triad bisection bandwidth of an eight-chip x86 E7 v3 server. The SPARC T7-4 server delivered over 2.7 times the triad bisection bandwidth of a four-chip x86 E7 v3 server and 2.3 times the triad bisection bandwidth of a four-chip IBM Power System S824 server. The SPARC T7-2 server delivered over 2.7 times the triad bisection bandwidth of a two-chip x86 E5 v3 server.   Performance Landscape The following SPARC, x86, and IBM S824 STREAM results were run as part of this benchmark effort. The IBM S822L result is from the referenced web location.  The following SPARC results were all run using 32 GB dimms. Maximum STREAM Benchmark Performance System
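The triad kernel behind the numbers discussed above is tiny. The following is a minimal Java sketch of the measured operation, not the official STREAM source: the array size, timing, and single-threaded loop are illustrative only, and a real run uses many threads with memory placed local to each chip. It does show why the bandwidth accounting is 3 x 8 bytes per element (two reads and one write of 8-byte doubles), matching the peak-bandwidth arithmetic quoted above.

public class TriadSketch {
    public static void main(String[] args) {
        int n = 50_000_000;                       // ~1.2 GB of arrays; run with about -Xmx2g
        double[] a = new double[n], b = new double[n], c = new double[n];
        double q = 3.0;
        java.util.Arrays.fill(b, 1.0);
        java.util.Arrays.fill(c, 2.0);

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            a[i] = b[i] + q * c[i];               // STREAM triad: 2 reads + 1 write per element
        }
        double sec = (System.nanoTime() - t0) / 1e9;
        double bytes = 3.0 * 8 * n;               // 24 bytes moved per element
        System.out.printf("Triad bandwidth: %.1f GB/sec (single thread)%n", bytes / sec / 1e9);
    }
}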


Benchmark

Graph PageRank: SPARC M7-8 Beats x86 E5 v3 Per Chip

Graph algorithms are used in many big data and analytics workloads.  The report presents performance using the PageRank algorithm.  Oracle's SPARC M7 processor based systems provide better performance than an x86 E5 v3 based system. Oracle's SPARC M7-8 server was able to deliver 3.2 times faster per chip performance than a two-chip x86 E5 v3 server running a PageRank algorithm implemented using Parallel Graph AnalytiX (PGX) from Oracle Labs on a medium sized graph. Performance Landscape The graph used for these results has 41,652,230 nodes and 1,468,365,182 edges using 22 GB of memory.  All of the following results were run as part of this benchmark effort. Performance is a measure of processing rate, bigger is better. PageRank Algorithm Server Performance SPARC Advantage SPARC M7-8 8 x SPARC M7 (4.13 GHz, 8x 32core) 281.1 3.2x faster per chip x86 E5 v3 server 2 x Intel E5-2699 v3 (2.3 GHz, 2x 18core) 22.2 1.0 The number of cores are per processor. Configuration Summary Systems Under Test: SPARC M7-8 server with 4 x SPARC M7 processors (4.13 GHz) 4 TB memory Oracle Solaris 11.3 Oracle Solaris Studio 12.4   Oracle Server X5-2 with 2 x Intel Xeon Processor E5-2699 v3 (2.3 GHz) 384 GB memory Oracle Linux gcc 4.7.4   Benchmark Description Graphs are a core part of many analytics workloads. They are very data intensive and stress computers.  Each algorithm typically traverses the entire graph multiple times, while doing certain arithmetic operations during the traversal, it can perform (double/single precision) floating point operations. The mathematics of PageRank are entirely general and apply to any graph or network in any domain. Thus, PageRank is now regularly used in bibliometrics, social and information network analysis, and for link prediction and recommendation. The PageRank algorithm counts the number and quality of links to a page to determine a rough estimate of the importance of the website. Key Points and Best Practices This algorithm is implemented using PGX (Parallel Graph AnalytiX) from Oracle Labs, a fast, parallel, in-memory graph analytic framework. The graph used for these results is based on real world data from Twitter and has 41,652,230 nodes and 1,468,365,182 edges using 22 GB of memory. See Also PageRank Description More on PGX OTN    Docs SPARC M7-8 Server oracle.com    OTN Oracle Server X5-2 oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Solaris Studio oracle.com    OTN Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of October 25, 2015.
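PGX performs this computation in a parallel, in-memory framework; purely to illustrate the algorithm the benchmark exercises, here is a minimal, serial sketch of PageRank power iteration over an edge list. The toy graph, damping factor, and iteration count are illustrative, and this is not the PGX API.

public class PageRankSketch {
    // edges[i] = {source, destination}; n = number of nodes.
    // Assumes every node has at least one out-edge (no dangling-node handling).
    static double[] pageRank(int n, int[][] edges, double d, int iters) {
        int[] outDeg = new int[n];
        for (int[] e : edges) outDeg[e[0]]++;
        double[] rank = new double[n];
        java.util.Arrays.fill(rank, 1.0 / n);
        for (int it = 0; it < iters; it++) {
            double[] next = new double[n];
            java.util.Arrays.fill(next, (1.0 - d) / n);
            for (int[] e : edges) {
                next[e[1]] += d * rank[e[0]] / outDeg[e[0]];  // spread rank along out-edges
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {2, 0}, {2, 1}};
        double[] r = pageRank(3, edges, 0.85, 20);
        for (double v : r) System.out.printf("%.4f%n", v);
    }
}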


Benchmark

Yahoo Cloud Serving Benchmark: SPARC T7-4 With Oracle NoSQL Beats x86 E5 v3 Per Chip

Oracle's SPARC T7-4 server delivered 1.9 million ops/sec on 1.6 billion records for the Yahoo Cloud Serving Benchmark (YCSB) 95% read/5% update workload.  Oracle NoSQL Database was used in these tests. NoSQL is important for Big Data Analysis and for Cloud Computing. One processor performance on the SPARC T7-4 server was 2.5 times better than one chip Intel Xeon E5-2699 v3 for the YCSB 95% read/5% update workload. The SPARC T7-4 server showed low average latency of 1.12 msec on read and 4.90 msec on write while achieving nearly 1.9 million ops/sec. The SPARC T7-4 server delivered 325K inserts/sec on 1.6 billion records with a low average latency of 2.65 msec. One processor performance on the SPARC T7-4 server was over half a million (511K ops/sec) on 400 million records for the YCSB 95% read/5% update workload. Near-linear scaling from 1 to 4 processors was 3.7x while maintaining low latency. These results show the SPARC T7-4 server can handle a large database while achieving high throughput with low latency for cloud computing. Performance Landscape This table presents single chip results comparing the SPARC M7 processor (in a SPARC T7-4 server) to the Intel Xeon Processor E5-2699 v3 (in a 2-socket x86 server).  All of the following results were run as part of this benchmark effort. Comparing Single Chip Performance on YCSB Benchmark Processor Insert Mixed Load (95% Read/5% Update) Throughput ops/sec Average Latency Throughput ops/sec Average Latency Write msec Read msec Write msec SPARC M7 89,617 2.40 510,824 1.07 3.80 E5-2699 v3 55,636 1.18 202,701 0.71 2.30   The following table shows the performance of the Yahoo Clouds Serving Benchmark on multiple processor counts on the SPARC T7-4 server. SPARC T7-4 server running YCSB benchmark CPUs Shards Insert Mixed Load (95% Read/5% Update) Throughput ops/sec Average Latency Throughput ops/sec Average Latency Write msec Read msec Write msec 4 16 325,167 2.65 1,890,394 1.12 4.90 3 12 251,051 2.57 1,428,813 1.12 4.68 2 8 170,963 2.52 968,146 1.11 4.37 1 4 89,617 2.40 510,824 1.07 3.80   Configuration Summary System Under Test: SPARC T7-4 server 4 x SPARC M7 processors (4.13 GHz) 2 TB memory (64 x 32 GB) 8 x Sun Storage 16 Gb Fibre Channel PCIe Universal FC HBA, Emulex 8 x Sun Dual Port 10 GbE PCIe 2.0 Low Profile Adapter, Base-T   Oracle Server X5-2L server 2 x Intel Xeon E5-2699 v3 processors (2.3 GHz) 384 GB memory 1 x Sun Storage 16 Gb Fibre Channel PCIe Universal FC HBA, Emulex 1 x Sun Dual 10GbE SFP+ PCIe 2.0 Low Profile Adapter   External Storage (Common Multiprotocol SCSI TARget, or COMSTAR enables system to be seen as a SCSI target device): 16 x Sun Server X3-2L servers configured as COMSTAR nodes, each with 2 x Intel Xeon E5-2609 processors (2.4 GHz) 4 x Sun Flash Accelerator F40 PCIe Cards, 400 GB each 1 x 8 Gb dual port HBA Please note: These devices are only used as storage. No NoSQL is run on these COMSTAR storage nodes. There is no query acceleration done on these COMSTAR storage nodes.   Software Configuration: Oracle Solaris 11.3 (11.3.1.2.0) Logical Domains Manager v3.3.0.0.17 (running on the SPARC T7-4) Oracle NoSQL Database, Enterprise Edition 12c R1.3.2.5 Java(TM) SE Runtime Environment (build 1.8.0_60-b27)   Benchmark Description The Yahoo Cloud Serving Benchmark (YCSB) is a performance benchmark for cloud database and their systems.  
The benchmark documentation says: With the many new serving databases available including Sherpa, BigTable, Azure and many more, it can be difficult to decide which system is right for your application, partially because the features differ between systems, and partially because there is not an easy way to compare the performance of one system versus another.  The goal of the Yahoo Cloud Serving Benchmark (YCSB) project is to develop a framework and common set of workloads for evaluating the performance of different "key-value" and "cloud" serving stores. Key Points and Best Practices The SPARC T7-4 server showed 3.7x scaling from 1 to 4 sockets while maintaining low latency. Four Oracle VM for SPARC (LDom) servers were created per processor, for a total of sixteen LDoms.  Each LDom was configured with 120 GB memory accessing two PCIe IO slots under SR-IOV (Single Root IO Virtualization). The Sun Flash Accelerator F40 PCIe Card demonstrated excellent IO capability and performed 841K read IOPS (3.5K IOPS per disk) during the 1.9 million ops/sec benchmark run. There was no performance loss from Fibre Channel SR-IOV (Single Root IO Virtualization) compared to native. Balanced memory bandwidth was delivered across all four processors achieving an average total of 304 GB/sec during 1.9 million ops/sec run. The 1.6 billion records were loaded into 16 Shards with the replication factor set to 3. Each LDom is associated with a processor set (16 total).  The default processor set was additionally used for OS and IO interrupts.  The processors sets were used to ensure a balanced load. Fixed priority class was assigned to Oracle NoSQL Storage Node java processes. The ZFS record size was set to 16K (default 128K) and this worked the best for 95% read/5% update workload. A total of eight Sun Server X4-2 and Sun Server X4-2L systems were used as clients for generating the workload. The LDoms and client systems were connected through a 10 GbE network. See Also Yahoo Cloud Serving Benchmark YCSB Source SPARC T7-4 Server oracle.com    OTN SPARC T5-8 Server oracle.com    OTN Oracle Server X5-2L oracle.com    OTN Sun Flash Accelerator F40 PCIe Card oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle NoSQL Database oracle.com    OTN   Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of Oct 25, 2015.
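For readers who want to reproduce the workload shape, the 95% read/5% update mix corresponds to a YCSB CoreWorkload parameter file along the following lines. The property names are standard YCSB CoreWorkload settings; the operation count and request distribution shown here are illustrative assumptions, not the exact files used for this result (the record count matches the 1.6 billion records cited above).

# ycsb-95read-5update.properties (illustrative values)
workload=com.yahoo.ycsb.workloads.CoreWorkload
recordcount=1600000000
operationcount=100000000
readproportion=0.95
updateproportion=0.05
requestdistribution=zipfian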


Benchmark

Graph Breadth First Search Algorithm: SPARC T7-4 Beats 4-Chip x86 E7 v2

Graph algorithms are used in many big data and analytics workloads.  Oracle's SPARC T7 processor based systems provide better performance than x86 systems with the Intel Xeon E7 v2 family of processors. The SPARC T7-4 server was able to deliver 3.1x better performance than a four-chip x86 server running a breadth-first search (BFS) on a large graph. Performance Landscape The problem is identified by "Scale" and the approximate amount of memory used. Results are listed as edge traversals in billions (ETB) per second (bigger is better).  The SPARC M7 processor results were run as part of this benchmark effort.  The x86 results were taken from a previous benchmark effort. Breadth-First Search (BFS) Scale Dataset (GB) ETB/sec Speedup T7-4/x86 SPARC T7-4 x86 (4xE7 v2) 30 580 1.68 0.54 3.1x 29 282 1.76 0.62 2.8x 28 140 1.70 0.99 1.7x 27 70 1.56 1.07 1.5x 26 35 1.67 1.19 1.4x   Configuration Summary Systems Under Test: SPARC T7-4 server with 4 x SPARC M7 processors (4.13 GHz) 1 TB memory Oracle Solaris 11.3 Oracle Solaris Studio 12.4   Sun Server X4-4 system with 4 x Intel Xeon E7-8895 v2 processors (2.8 GHz) 1 TB memory Oracle Solaris 11.2 Oracle Solaris Studio 12.4   Benchmark Description Graphs are a core part of many analytics workloads.  They are very data intensive and stress computers.  This benchmark does a breadth-first search on a randomly generated graph. It reports the number of graph edges traversed (in billions) per second (ETB/sec).  To generate the graph, the data generator from the graph500 benchmark was used. A description of what breadth-first search is, taken from Introduction to Algorithms, page 594: Given a graph G = (V, E) and a distinguished source vertex s, breadth-first search systematically explores the edges of G to "discover" every vertex that is reachable from s. It computes the distance (smallest number of edges) from s to each reachable vertex. It also produces a "breadth-first tree" with root s that contains all reachable vertices. For any vertex v reachable from s, the simple path in the breadth-first tree from s to v corresponds to a "shortest path" from s to v in G, that is, a path containing the smallest number of edges. The algorithm works on both directed and undirected graphs. Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., Stein, Clifford (2009) [1990].  Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4. See Also Similar test – graph500 SPARC T7-4 Server oracle.com    OTN Sun Server X4-4 oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Solaris Studio oracle.com    OTN Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of October 25, 2015.
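As a concrete reference for the algorithm described in the quoted passage, here is a minimal, serial Java sketch of breadth-first search over an adjacency list. The benchmark's own implementation is parallel and runs against graph500-generated data; this toy example is only an illustration of the distance and discovery behavior described above.

import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class BfsSketch {
    // Returns dist[v] = smallest number of edges from source s to v (-1 if unreachable).
    static int[] bfs(List<int[]> adjacency, int s) {
        int n = adjacency.size();
        int[] dist = new int[n];
        Arrays.fill(dist, -1);
        Queue<Integer> q = new ArrayDeque<>();
        dist[s] = 0;
        q.add(s);
        while (!q.isEmpty()) {
            int u = q.remove();
            for (int v : adjacency.get(u)) {
                if (dist[v] == -1) {          // discover v the first time it is reached
                    dist[v] = dist[u] + 1;
                    q.add(v);
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        List<int[]> adj = List.of(new int[]{1, 2}, new int[]{2}, new int[]{0, 3}, new int[]{});
        System.out.println(Arrays.toString(bfs(adj, 0)));   // prints [0, 1, 1, 2]
    }
}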


Benchmark

Neural Network Models Using Oracle R Enterprise: SPARC T7-4 Beats 4-Chip x86 E7 v3

Oracle's SPARC T7-4 server executing neural network algorithms using Oracle R Enterprise (ORE) is up to two times faster than a four-chip x86 E7 v3 server. For a neural network with two hidden layers, 10-neuron with 5-neuron hyperbolic tangent, the SPARC T7-4 server is 1.5 times faster than a four-chip x86 E7 v3 server on calculation time. For a neural network with two hidden layers, 20-neuron with 10-neuron hyperbolic tangent, the SPARC T7-4 server is 2.0 times faster than a four-chip x86 E7 v3 server on calculation time. Performance Landscape Oracle Enterprise R Statistics in Oracle Database (250 million rows) Neural Network with Two Hidden Layers Elapsed Calculation Time SPARC Advantage 4-chip x86 E7 v3 SPARC T7-4 10-neuron + 5-neuron hyperbolic tangent 520.1 (sec) 337.3 (sec) 1.5x 20-neuron + 10-neuron hyperbolic tangent 1128.4 (sec) 578.1 (sec) 2.0x Configuration Summary SPARC Configuration: SPARC T7-4 4 x SPARC M7 processors (4.13 GHz) 2 TB memory (64 x 32 GB dimms) Oracle Solaris 11.3 Oracle Database 12c Enterprise Edition Oracle R Enterprise 1.5 Oracle Solaris Studio 12.4 with 4/15 patch set x86 Configuration: Oracle Server X5-4 4 x Intel Xeon Processor E7-8895 v3 (2.6 GHz) 512 GB memory Oracle Linux 6.4 Oracle Database 12c Enterprise Edition Oracle R Enterprise 1.5 Storage Configuration: Oracle Server X5-2L 2 x Intel Xeon Processor E5-2699 v3 512 GB memory 4 x 1.6 TB 2.5-inch NVMe PCIe 3.0 SSD 2 x Sun Storage Dual 16Gb FC PCIe HBA Oracle Solaris 11.3 Benchmark Description The benchmark is designed to run various statistical analyses using Oracle R Enterprise (ORE) with historical aviation data.  The size of the benchmark data is about 35 GB, a single table holding 250 million rows. One of the most popular algorithms, neural network, has been used against the dataset to generate comparable results. The neural network algorithms support various features. In this workload, the following two neural network configurations have been used: a neural net with two hidden layers, 10-neuron with 5-neuron hyperbolic tangent, and a neural net with two hidden layers, 20-neuron with 10-neuron hyperbolic tangent. See Also Public Government Data Oracle R Technologies Blog SPARC T7-4 Server oracle.com    OTN Oracle Server X5-4 oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle R Enterprise oracle.com    OTN Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 25 October 2015.
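Oracle R Enterprise runs this computation inside the database, so there is no portable code in the report itself. Purely to illustrate what a "two hidden layers, 10-neuron with 5-neuron hyperbolic tangent" network computes for one input row, here is a generic forward-pass sketch: the weights, input width, and single tanh output are all illustrative assumptions and do not reflect the ORE implementation or its API.

public class TanhNetSketch {
    // One dense layer: out[j] = tanh(bias[j] + sum_i in[i] * w[i][j])
    static double[] layer(double[] in, double[][] w, double[] bias) {
        double[] out = new double[bias.length];
        for (int j = 0; j < out.length; j++) {
            double z = bias[j];
            for (int i = 0; i < in.length; i++) z += in[i] * w[i][j];
            out[j] = Math.tanh(z);
        }
        return out;
    }

    static double[][] rand(java.util.Random r, int rows, int cols) {
        double[][] m = new double[rows][cols];
        for (double[] row : m) for (int j = 0; j < cols; j++) row[j] = r.nextGaussian() * 0.1;
        return m;
    }

    public static void main(String[] args) {
        java.util.Random r = new java.util.Random(42);
        int inputs = 8, h1 = 10, h2 = 5;                 // 10-neuron and 5-neuron hidden layers
        double[][] w1 = rand(r, inputs, h1), w2 = rand(r, h1, h2), w3 = rand(r, h2, 1);
        double[] x = rand(r, 1, inputs)[0];              // one illustrative input row
        double[] out = layer(layer(layer(x, w1, new double[h1]), w2, new double[h2]),
                             w3, new double[1]);
        System.out.println("prediction = " + out[0]);
    }
}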


Benchmark

AES Encryption: SPARC T7-2 Beats x86 E5 v3

Oracle's cryptography benchmark measures security performance on important AES security modes. Oracle's SPARC M7 processor with its software in silicon security is faster than x86 servers that have the AES-NI instructions. In this test, the performance of on-processor encryption operations is measured (32 KB encryptions). Multiple threads are used to measure each processor's maximum throughput. Oracle's SPARC T7-2 server shows dramatically faster encryption compared to current x86 two processor servers. SPARC M7 processors running Oracle Solaris 11.3 ran 4.0 times faster executing AES-CFB 256-bit key encryption (in cache) than Intel Xeon E5-2699 v3 processors (with AES-NI) running Oracle Linux 6.5. SPARC M7 processors running Oracle Solaris 11.3 ran 3.7 times faster executing AES-CFB 128-bit key encryption (in cache) than Intel Xeon E5-2699 v3 processors (with AES-NI) running Oracle Linux 6.5. SPARC M7 processors running Oracle Solaris 11.3 ran 6.4 times faster executing AES-CFB 256-bit key encryption (in cache) than the Intel Xeon E5-2697 v2 processors (with AES-NI) running Oracle Linux 6.5. SPARC M7 processors running Oracle Solaris 11.3 ran 6.0 times faster executing AES-CFB 128-bit key encryption (in cache) than the Intel Xeon E5-2697 v2 processors (with AES-NI) running Oracle Linux 6.5. AES-CFB encryption is used by Oracle Database for Transparent Data Encryption (TDE) which provides security for database storage. Oracle has also measured SHA digest performance on the SPARC M7 processor. Performance Landscape Presented below are results for running encryption using the AES cipher with the CFB, CBC, GCM and CCM modes for key sizes of 128, 192 and 256. Decryption performance was similar and is not presented. Results are presented as MB/sec (10**6). All SPARC M7 processor results were run as part of this benchmark effort. All other results were run during previous benchmark efforts. Encryption Performance – AES-CFB (used by Oracle Database) Performance is presented for in-cache AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performance on 32 KB of pseudo-random data (same data for each run). AES-CFB Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CFB SPARC M7 4.13 2 126,948 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 53,794 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v3 2.30 2 31,924 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 19,964 Oracle Linux 6.5, IPP/AES-NI AES-192-CFB SPARC M7 4.13 2 144,299 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 60,736 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v3 2.30 2 37,157 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 23,218 Oracle Linux 6.5, IPP/AES-NI AES-128-CFB SPARC M7 4.13 2 166,324 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 68,691 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v3 2.30 2 44,388 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 27,755 Oracle Linux 6.5, IPP/AES-NI Encryption Performance – AES-CBC Performance is presented for in-cache AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performance on 32 KB of pseudo-random data (same data for each run). 
AES-CBC Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CBC SPARC M7 4.13 2 134,278 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 56,788 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v3 2.30 2 31,894 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 19,961 Oracle Linux 6.5, IPP/AES-NI AES-192-CBC SPARC M7 4.13 2 152,961 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 63,937 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v3 2.30 2 37,021 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 23,224 Oracle Linux 6.5, IPP/AES-NI AES-128-CBC SPARC M7 4.13 2 175,151 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 72,870 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2699 v3 2.30 2 44,103 Oracle Linux 6.5, IPP/AES-NI Intel E5-2697 v2 2.70 2 27,730 Oracle Linux 6.5, IPP/AES-NI Encryption Performance – AES-GCM (used by ZFS Filesystem) Performance is presented for in-cache AES-GCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performance on 32 KB of pseudo-random data (same data for each run). AES-GCM Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-GCM SPARC M7 4.13 2 74,221 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 34,022 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 15,338 Oracle Solaris 11.1, libsoftcrypto + libumem AES-192-GCM SPARC M7 4.13 2 81,448 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 36,820 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 15,768 Oracle Solaris 11.1, libsoftcrypto + libumem AES-128-GCM SPARC M7 4.13 2 86,223 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 38,845 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 16,405 Oracle Solaris 11.1, libsoftcrypto + libumem Encryption Performance – AES-CCM (alternative used by ZFS Filesystem) Performance is presented for in-cache AES-CCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performance on 32 KB of pseudo-random data (same data for each run). 
AES-CCM Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CCM SPARC M7 4.13 2 67,669 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 28,909 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 19,447 Oracle Linux 6.5, IPP/AES-NI AES-192-CCM SPARC M7 4.13 2 77,711 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 33,116 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 22,634 Oracle Linux 6.5, IPP/AES-NI AES-128-CCM SPARC M7 4.13 2 90,729 Oracle Solaris 11.3, libsoftcrypto + libumem SPARC T5 3.60 2 38,529 Oracle Solaris 11.2, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 26,951 Oracle Linux 6.5, IPP/AES-NI Configuration Summary SPARC T7-2 server 2 x SPARC M7 processor, 4.13 GHz 1 TB memory Oracle Solaris 11.3 SPARC T5-2 server 2 x SPARC T5 processor, 3.60 GHz 512 GB memory Oracle Solaris 11.2 Oracle Server X5-2 system 2 x Intel Xeon E5-2699 v3 processors, 2.30 GHz 256 GB memory Oracle Linux 6.5 Sun Server X4-2 system 2 x Intel Xeon E5-2697 v2 processors, 2.70 GHz 256 GB memory Oracle Linux 6.5 Benchmark Description The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache and on-chip using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CCM, AES-192-CCM, AES-256-CCM, AES-128-GCM, AES-192-GCM and AES-256-GCM. The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance. The encryption tests were run with pseudo-random data of size 32 KB. The benchmark tests were designed to run out of cache, so memory bandwidth and latency are not the limitations. See Also More about AES SPARC T7-2 Server oracle.com     OTN     Blog SPARC T5-2 Server oracle.com     OTN Oracle Server X5-2 oracle.com     OTN     Blog Sun Server X4-2L oracle.com     OTN Sun Server X3-2 oracle.com     OTN Oracle Solaris oracle.com     OTN     Blog Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 10/25/2015.
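The measurement style described above (repeated in-cache encryption of a 32 KB buffer, run across many threads) can be approximated with the standard Java crypto API. This is a minimal single-threaded sketch, not the Oracle-internal test; iteration count is arbitrary, and on Solaris the provider would be expected to route AES to the platform's accelerated libraries.

import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCfbThroughput {
    public static void main(String[] args) throws Exception {
        byte[] key = new byte[32], iv = new byte[16], buf = new byte[32 * 1024];
        SecureRandom rnd = new SecureRandom();
        rnd.nextBytes(key);
        rnd.nextBytes(iv);
        rnd.nextBytes(buf);                                   // 32 KB of pseudo-random data

        Cipher c = Cipher.getInstance("AES/CFB/NoPadding");   // AES-256-CFB with a 32-byte key
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));

        int iterations = 20_000;
        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            c.update(buf);                                    // encrypt the same 32 KB block in cache
        }
        double sec = (System.nanoTime() - t0) / 1e9;
        double mb = iterations * (double) buf.length / 1e6;
        System.out.printf("AES-256-CFB: %.0f MB/sec (single thread)%n", mb / sec);
    }
}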


Benchmark

SPECjEnterprise2010: SPARC T7-1 World Record with Single Application Server Using 1 to 4 Chips

Oracle's SPARC T7-1 servers have set a world record for the SPECjEnterprise2010 benchmark for solutions using a single application server with one to four chips.  The result of 25,818.85 SPECjEnterprise2010 EjOPS used two SPARC T7-1 servers, one server for the application tier and the other server for the database tier. The SPARC T7-1 servers obtained a result of 25,093.06 SPECjEnterprise2010 EjOPS using encrypted data. This secured result used Oracle Advanced Security Transparent Data Encryption (TDE) for the application database tablespaces with the AES-256-CFB cipher. The network connection between the application server and the database server was also encrypted using the secure JDBC. The SPARC T7-1 server solution delivered 34% more performance compared to the two-chip IBM x3650 M5 server result of 19,282.14 SPECjEnterprise2010 EjOPS.  The SPARC T7-1 server solution delivered 14% more performance compared to the four-chip IBM Power System S824 server result of 22,543.34 SPECjEnterprise2010 EjOPS. The SPARC T7-1 server based results demonstrated 20% more performance compared to the Oracle Server X5-2 system result of 21,504.30 SPECjEnterprise2010 EjOPS. Oracle holds the top x86 two-chip application server SPECjEnterprise2010 result. The application server used Oracle Fusion Middleware components including the Oracle WebLogic 12.1 application server and Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.8.0_60. The database server was configured with Oracle Database 12c Release 1. For the secure result, the application data was encrypted in the Oracle database using the Oracle Advanced Security Transparent Data Encryption (TDE) feature. Hardware accelerated cryptography support in the SPARC M7 processor for the AES-256-CFB cipher was used to provide data security. The benchmark performance using the secure SPARC T7-1 server configuration with encryption was less than 3% lower when compared to the peak result. This result demonstrated less than 1 second average response times for all SPECjEnterprise2010 transactions and represents Java EE 5.0 transactions generated by over 210,000 users. Performance Landscape Select single application server results. Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results. SPECjEnterprise2010 Performance Chart 10/25/2015 Submitter EjOPS* Java EE Server DB Server Notes Oracle 25,818.85 1 x SPARC T7-1 1 x 4.13 GHz SPARC M7 Oracle WebLogic 12c (12.1.3) 1 x SPARC T7-1 1 x 4.13 GHz SPARC M7 Oracle Database 12c (12.1.0.2) - Oracle 25,093.06 1 x SPARC T7-1 1 x 4.13 GHz SPARC M7 Oracle WebLogic 12c (12.1.3) Network Data Encryption for JDBC 1 x SPARC T7-1 1 x 4.13 GHz SPARC M7 Oracle Database 12c (12.1.0.2) Transparent Data Encryption Secure IBM 22,543.34 1 x IBM Power S824 4 x 3.5 GHz POWER 8 WebSphere Application Server V8.5 1 x IBM Power S824 4 x 3.5 GHz POWER 8 IBM DB2 10.5 FP3 - Oracle 21,504.30 1 x Oracle Server X5-2 2 x 2.3 GHz Intel Xeon E5-2699 v3 Oracle WebLogic 12c (12.1.3) 1 x Oracle Server X5-2 2 x 2.3 GHz Intel Xeon E5-2699 v3 Oracle Database 12c (12.1.0.2) COD IBM 19,282.14 1 x System x3650 M5 2 x 2.6 GHz Intel Xeon E5-2697 v3 WebSphere Application Server V8.5 1 x System x3850 X6 4 x 2.8 GHz Intel Xeon E7-4890 v2 IBM DB2 10.5 FP5 - * SPECjEnterprise2010 EjOPS (bigger is better) The Cluster on Die (COD) mode is a BIOS setting that effectively splits the chip in half, making the operating system think it has twice as many chips as it does (in this case, four 9-core chips).
Intel has stated that COD is appropriate only for highly NUMA optimized workloads.  Dell has shown that there is a 3.7x slower bandwidth to the other half of the chip split by COD. Configuration Summary Application Server: 1 x SPARC T7-1 server, with 1 x SPARC M7 processor (4.13 GHz) 256 GB memory (16 x 16 GB) 2 x 600 GB SAS HDD 2 x 400 GB SAS SSD 3 x Sun Dual Port 10 GbE PCIe 2.0 Networking card with Intel 82599 10 GbE Controller Oracle Solaris 11.3 (11.3.0.0.30) Oracle WebLogic Server 12c (12.1.3) Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.8.0_60 Database Server: 1 x SPARC T7-1 server, with 1 x SPARC M7 processor (4.13 GHz) 512 GB memory (16 x 32 GB) 2 x 600 GB SAS HDD 1 x Sun Storage 16 Gb Fibre Channel Universal HBA Oracle Solaris 11.3 (11.3.0.0.30) Oracle Database 12c (12.1.0.2) Storage Servers: 1 x Oracle Server X5-2L (8-Drive), with 2 x Intel Xeon Processor E5-2699 v3 (2.3 GHz) 32 GB memory 1 x Sun Storage 16 Gb Fibre Channel Universal HBA 4 x 1.6 TB NVMe SSD 2 x 600 GB SAS HDD Oracle Solaris 11.3 (11.3.0.0.30) 1 x Oracle Server X5-2L (24-Drive), with 2 x Intel Xeon Processor E5-2699 v3 (2.3 GHz) 32 GB memory 1 x Sun Storage 16 Gb Fibre Channel Universal HBA 14 x 600 GB SAS HDD Oracle Solaris 11.3 (11.3.0.0.30) 1 x Brocade 6510 16 Gb FC switch Benchmark Description SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The SPECjEnterprise2010 benchmark has been designed and developed to cover the Java EE 5 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems, The web zone, servlets, and web services The EJB zone JPA 1.0 Persistence Model JMS and Message Driven Beans Transaction management Database connectivity Moreover, SPECjEnterprise2010 also heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network. The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second (SPECjEnterprise2010 EjOPS). The primary metric for the SPECjEnterprise2010 benchmark is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is NO price/performance metric in this benchmark. Key Points and Best Practices Four Oracle WebLogic server instances on the SPARC T7-1 server were hosted in 4 separate Oracle Solaris Zones. The Oracle WebLogic application servers were executed in the FX scheduling class to improve performance by reducing the frequency of context switches. The Oracle log writer process was run in the RT scheduling class. 
See Also SPECjEnterprise2010 Results Page SPARC T7-1 Result Page at SPEC Encrypted SPARC T7-1 Result Page at SPEC SPARC T7-1 Server oracle.com    OTN Oracle Server X5-2L oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle Fusion Middleware oracle.com    OTN Oracle WebLogic Suite oracle.com    OTN Oracle Database – Transparent Data Encryption oracle.com    OTN Disclosure Statement SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 10/25/2015.  SPARC T7-1, 25,818.85 SPECjEnterprise2010 EjOPS (unsecure); SPARC T7-1, 25,093.06 SPECjEnterprise2010 EjOPS (secure); Oracle Server X5-2, 21,504.30 SPECjEnterprise2010 EjOPS (unsecure); IBM Power S824, 22,543.34 SPECjEnterprise2010 EjOPS (unsecure); IBM x3650 M5, 19,282.14 SPECjEnterprise2010 EjOPS (unsecure).


Benchmark

SPECvirt_sc2013: SPARC T7-2 World Record for 2 and 4 Chip Systems

Oracle has had a new result accepted by SPEC as of November 19, 2015. This new result may be found here. Oracle's SPARC T7-2 server delivered a world record SPECvirt_sc2013 result for systems with two to four chips. The SPARC T7-2 server produced a result of 3026 @ 168 VMs SPECvirt_sc2013. The two-chip SPARC T7-2 server beat the best two-chip x86 Intel E5-2699 v3 server results by nearly 1.9 times (Huawei FusionServer RH2288H V3, HP ProLiant DL360 Gen9). The two-chip SPARC T7-2 server delivered nearly 2.2 times the performance of the four-chip IBM Power System S824 server solution which used 3.5 GHz POWER8 six core chips. The SPARC T7-2 server running Oracle Solaris 11.3 operating system, utilizes embedded virtualization products as the Oracle Solaris 11 zones, which in turn provide a low overhead, flexible, scalable and manageable virtualization environment. The SPARC T7-2 server result used Oracle VM Server for SPARC 3.3 and Oracle Solaris Zones providing a flexible, scalable and manageable virtualization environment. Performance Landscape Complete benchmark results are at the SPEC website, SPECvirt_sc2013 Results. The following table highlights the leading two-, and four-chip results for the benchmark, bigger is better. SPECvirt_sc2013 Leading Two to Four-Chip Results System Processor Chips Result @ VMs Virtualization Software SPARC T7-2 SPARC M7 (4.13 GHz, 32core) 2 3026 @ 168 Oracle VM Server for SPARC 3.3 Oracle Solaris Zones HP DL580 Gen9 Intel E7-8890 v3 (2.5 GHz, 18core) 4 3020 @ 168 Red Hat Enterprise Linux 7.1 KVM Lenovo System x3850 X6 Intel E7-8890 v3 (2.5 GHz, 18core) 4 2655 @ 147 Red Hat Enterprise Linux 6.6 KVM Huawei FusionServer RH2288H V3 Intel E5-2699 v3 (2.3 GHz, 18core) 2 1616 @ 95 Huawei FusionSphere V1R5C10 HP DL360 Gen9 Intel E5-2699 v3 (2.3 GHz, 18core) 2 1614 @ 95 Red Hat Enterprise Linux 7.1 KVM IBM Power S824 POWER8 (3.5 GHz, 6core) 4 1370 @ 79 PowerVM Enterprise Edition 2.2.3 Configuration Summary System Under Test Highlights: Hardware: 1 x SPARC T7-2 server, with 2 x 4.13 GHz SPARC M7 1 TB memory 2 Sun Dual Port 10GBase-T Adapter 2 Sun Storage Dual 16 Gb Fibre Channel PCIe Universal HBA Software: Oracle Solaris 11.3 Oracle VM Server for SPARC 3.3 (LDom) Oracle Solaris Zones Oracle iPlanet Web Server 7.0.20 Oracle PHP 5.3.29 Dovecot v2.2.18 Oracle WebLogic Server Standard Edition Release 10.3.6 Oracle Database 12c Enterprise Edition (12.1.0.2.0) Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.7.0_85-b15 Storage: 3 x Oracle Server X5-2L, with 2 x Intel Xeon Processor E5-2630 v3 8-core 2.4 GHz 32 GB memory 4 x Oracle Flash Accelerator F160 PCIe Card Oracle Solaris 11.3 1 x Oracle Server X5-2L, with 2 x Intel Xeon Processor E5-2630 v3 8-core 2.4 GHz 32 GB memory 4 x Oracle Flash Accelerator F160 PCIe Card 4x 400 GB SSDs Oracle Solaris 11.3 Benchmark Description SPECvirt_sc2013 is SPEC's updated benchmark addressing performance evaluation of datacenter servers used in virtualized server consolidation. SPECvirt_sc2013 measures the end-to-end performance of all system components including the hardware, virtualization platform, and the virtualized guest operating system and application software. It utilizes several SPEC workloads representing applications that are common targets of virtualization and server consolidation. The workloads were made to match a typical server consolidation scenario of CPU resource requirements, memory, disk I/O, and network utilization for each workload. 
These workloads are modified versions of SPECweb2005, SPECjAppServer2004, SPECmail2008, and SPEC CPU2006. The client-side SPECvirt_sc2013 harness controls the workloads. Scaling is achieved by running additional sets of virtual machines, called "tiles", until overall throughput reaches a peak. Key Points and Best Practices The SPARC T7-2 server running the Oracle Solaris 11.3, utilizes embedded virtualization products as the Oracle VM Server for SPARC and Oracle Solaris Zones, which provide a low overhead, flexible, scalable and manageable virtualization environment. In order to provide a high level of data integrity and availability, all the benchmark data sets are stored on mirrored (RAID1) storage Using Oracle VM Server for SPARC to bind the SPARC M7 processor with its local memory optimized system memory use in this virtual environment. See Also SPECvirt_sc2013 Results Page SPARC T7-2 Server oracle.com     OTN     Blog Oracle Solaris oracle.com     OTN     Blog Oracle Database oracle.com     OTN     Blog Oracle WebLogic Suite oracle.com     OTN     Blog Oracle Fusion Middleware oracle.com     OTN     Blog Java oracle.com     OTN Disclosure Statement SPEC and the benchmark name SPECvirt_sc are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 10/25/2015. SPARC T7-2, SPECvirt_sc2013 3026@168 VMs; HP DL580 Gen9, SPECvirt_sc2013 3020@168 VMs; Lenovo x3850 X6; SPECvirt_sc2013 2655@147 VMs; Huawei FusionServer RH2288H V3, SPECvirt_sc2013 1616@95 VMs; HP ProLiant DL360 Gen9, SPECvirt_sc2013 1614@95 VMs; IBM Power S824, SPECvirt_sc2013 1370@79 VMs.


Benchmark

SHA Digest Encryption: SPARC T7-2 Beats x86 E5 v3

Oracle's cryptography benchmark measures security performance on important Secure Hash Algorithm (SHA) functions. Oracle's SPARC M7 processor with its security software in silicon is faster than current and recent x86 servers. In this test, the performance of on-processor digest operations is measured for three sizes of plaintext inputs (64, 1024 and 8192 bytes) using three SHA2 digests (SHA512, SHA384, SHA256) and the older, weaker SHA1 digest. Multiple parallel threads are used to measure each processor's maximum throughput. Oracle's SPARC T7-2 server shows dramatically faster digest computation compared to current x86 two processor servers. SPARC M7 processors running Oracle Solaris 11.3 ran 17 times faster computing multiple parallel SHA512 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon E5-2699 v3 processors running Oracle Linux 6.5. SPARC M7 processors running Oracle Solaris 11.3 ran 14 times faster computing multiple parallel SHA256 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon E5-2699 v3 processors running Oracle Linux 6.5. SPARC M7 processors running Oracle Solaris 11.3 ran 4.8 times faster computing multiple parallel SHA1 digests of 8 KB inputs (in cache) than Cryptography for Intel Integrated Performance Primitives for Linux (library) on Intel Xeon E5-2699 v3 processors running Oracle Linux 6.5. SHA1 and SHA2 operations are an integral part of Oracle Solaris, while on Linux they are performed using the add-on Cryptography for Intel Integrated Performance Primitives for Linux (library). Oracle has also measured AES (CFB, GCM, CCM, CBC) cryptographic performance on the SPARC M7 processor. Performance Landscape Presented below are results for computing SHA1, SHA256, SHA384 and SHA512 digests for input plaintext sizes of 64, 1024 and 8192 bytes. Results are presented as MB/sec (10**6). All SPARC M7 processor results were run as part of this benchmark effort. All other results were run during previous benchmark efforts. Digest Performance – SHA512 Performance is presented for SHA512 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M7, 4.13 GHz 39,201 167,072 184,944 2 x SPARC T5, 3.6 GHz 18,717 73,810 78,997 2 x Intel Xeon E5-2699 v3, 2.3 GHz 3,949 9,214 10,681 2 x Intel Xeon E5-2697 v2, 2.7 GHz 2,681 6,631 7,701 Digest Performance – SHA384 Performance is presented for SHA384 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M7, 4.13 GHz 39,697 166,898 185,194 2 x SPARC T5, 3.6 GHz 18,814 73,770 78,997 2 x Intel Xeon E5-2699 v3, 2.3 GHz 4,061 9,263 10,678 2 x Intel Xeon E5-2697 v2, 2.7 GHz 2,774 6,669 7,706 Digest Performance – SHA256 Performance is presented for SHA256 digest. The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M7, 4.13 GHz 45,148 113,648 119,929 2 x SPARC T5, 3.6 GHz 21,140 49,483 51,114 2 x Intel Xeon E5-2699 v3, 2.3 GHz 3,446 7,785 8,463 2 x Intel Xeon E5-2697 v2, 2.7 GHz 2,404 5,570 6,037 Digest Performance – SHA1 Performance is presented for SHA1 digest. 
The digest was computed for 64, 1024 and 8192 bytes of pseudo-random input data (same data for each run). Processors Performance (MB/sec) 64B input 1024B input 8192B input 2 x SPARC M7, 4.13 GHz 47,640 92,515 97,545 2 x SPARC T5, 3.6 GHz 21,052 40,107 41,584 2 x Intel Xeon E5-2699 v3, 2.3 GHz 6,677 18,165 20,405 2 x Intel Xeon E5-2697 v2, 2.7 GHz 4,649 13,245 14,842 Configuration Summary SPARC T7-2 server 2 x SPARC M7 processor, 4.13 GHz 1 TB memory Oracle Solaris 11.3 SPARC T5-2 server 2 x SPARC T5 processor, 3.60 GHz 512 GB memory Oracle Solaris 11.2 Oracle Server X5-2 system 2 x Intel Xeon E5-2699 v3 processors, 2.30 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014 Sun Server X4-2 system 2 x Intel Xeon E5-2697 v2 processors, 2.70 GHz 256 GB memory Oracle Linux 6.5 Intel Integrated Performance Primitives for Linux, Version 8.2 (Update 1) 07 Nov 2014 Benchmark Description The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache and on-chip using various digests, including SHA1 and SHA2 (SHA256, SHA384, SHA512). The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various digests. They were run using optimized libraries for each platform to obtain the best possible performance. The encryption tests were run with pseudo-random data of sizes 64 bytes, 1024 bytes and 8192 bytes. The benchmark tests were designed to run out of cache, so memory bandwidth and latency are not the limitations. See Also More about Secure Hash Algorithm (SHA) SPARC T7-2 Server oracle.com     OTN     Blog SPARC T5-2 Server oracle.com     OTN Oracle Server X5-2 oracle.com     OTN     Blog Sun Server X4-2 oracle.com     OTN Oracle Solaris oracle.com     OTN     Blog Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 10/25/2015.
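The digest measurement can be approximated in the same way with the standard Java API. Below is a minimal single-threaded sketch; the published results use Oracle-written tests over optimized native libraries with many parallel threads, so absolute numbers will differ, and the iteration count here is arbitrary.

import java.security.MessageDigest;
import java.security.SecureRandom;

public class ShaThroughput {
    public static void main(String[] args) throws Exception {
        byte[] buf = new byte[8192];                 // 8 KB input, as in the tables above
        new SecureRandom().nextBytes(buf);
        MessageDigest md = MessageDigest.getInstance("SHA-512");

        int iterations = 200_000;
        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            md.update(buf);
            md.digest();                             // complete one digest per 8 KB block
        }
        double sec = (System.nanoTime() - t0) / 1e9;
        System.out.printf("SHA-512: %.0f MB/sec (single thread)%n",
                iterations * (double) buf.length / 1e6 / sec);
    }
}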


Benchmark

ZFS Encryption: SPARC T7-1 Performance

Oracle's SPARC T7-1 server can encrypt/decrypt at near clear text throughput.  The SPARC T7-1 server can encrypt/decrypt on the fly and have CPU cycles left over for the application. The SPARC T7-1 server performed 475,123 clear text 8K read IOPS. With AES-256-CCM enabled on the file system, 8K read IOPS only drop 3.2% to 461,038. The SPARC T7-1 server performed 461,038 AES-256-CCM 8K read IOPS and a two-chip x86 E5-2660 v3 server performed 224,360 AES-256-CCM 8K read IOPS.  The SPARC M7 processor result is 4.1 times faster per chip. The SPARC T7-1 server performed 460,600 AES-192-CCM 8K read IOPS and a two-chip x86 E5-2660 v3 server performed 228,654 AES-192-CCM 8K read IOPS.  The SPARC M7 processor result is 4.0 times faster per chip. The SPARC T7-1 server performed 465,114 AES-128-CCM 8K read IOPS and a two-chip x86 E5-2660 v3 server performed 231,911 AES-128-CCM 8K read IOPS.  The SPARC M7 processor result is 4.0 times faster per chip. The SPARC T7-1 server performed 475,123 clear text 8K read IOPS and a two-chip x86 E5-2660 v3 server performed 438,483 clear text 8K read IOPS. The SPARC M7 processor result is 2.2 times faster per chip. Performance Landscape Results presented below are for random read performance for 8K size.  All of the following results were run as part of this benchmark effort.   Read Performance – 8K Encryption SPARC T7-1 2 x E5-2660 v3 IOPS Resp Time % Busy IOPS Resp Time % Busy Clear 475,123 0.8 msec 43% 438,483 0.8 msec 95% AES-256-CCM 461,038 0.83 msec 56% 224,360 1.6 msec 97% AES-192-CCM 460,600 0.83 msec 56% 228,654 1.5 msec 97% AES-128-CCM 465,114 0.82 msec 57% 231,911 1.5 msec 96% IOPS – IO operations per second Resp Time – response time % Busy – percent cpu usage Configuration Summary SPARC T7-1 server 1 x SPARC M7 processor (4.13 GHz) 256 GB memory (16 x 16 GB) Oracle Solaris 11.3 4 x StorageTek 8 Gb Fibre Channel PCIe HBA   Oracle Server X5-2L system 2 x Intel Xeon Processor E5-2660 v3 (2.60 GHz) 256 GB memory Oracle Solaris 11.3 4 x StorageTek 8 Gb Fibre Channel PCIe HBA   Storage SAN 2 x Brocade 300 FC switches 2 x Sun Storage 6780 array with 64 disk drives / 16 GB Cache Benchmark Description The benchmark tests the performance of running an encrypted ZFS file system compared to the non-encrypted (clear text) ZFS file system.  The tests were executed with Oracle's Vdbench tool Version 5.04.03.  Three different encryption methods are tested, AES-256-CCM, AES-192-CCM and AES-128-CCM. Key Points and Best Practices The ZFS file system was configured with data cache disabled, meta cache enabled, 4 pools, 128 luns, and 192 file systems with 8K record size. Data cache was disabled to ensure data would be decrypted as it was read from storage.  This is not a recommended setting for normal customer operations. The tests were executed with Oracle's Vdbench tool against 192 file systems. Each file system was run with a queue depth of 2. The script used for testing is listed below. hd=default,jvms=16 sd=sd001,lun=/dev/zvol/rdsk/p1/vol001,size=5g,hitarea=100m sd=sd002,lun=/dev/zvol/rdsk/p1/vol002,size=5g,hitarea=100m # # sd003 through sd191 statements here # sd=sd192,lun=/dev/zvol/rdsk/p4/vol192,size=5g,hitarea=100m # VDBENCH work load definitions for run # Sequential write to fill storage. wd=swrite1,sd=sd*,readpct=0,seekpct=eof # Random Read work load. wd=rread,sd=sd*,readpct=100,seekpct=random,rhpct=100 # VDBENCH Run Definitions for actual execution of load.
The tests were executed with Oracle's Vdbench tool against 192 file systems. Each file system was run with a queue depth of 2. The script used for testing is listed below.

hd=default,jvms=16
sd=sd001,lun=/dev/zvol/rdsk/p1/vol001,size=5g,hitarea=100m
sd=sd002,lun=/dev/zvol/rdsk/p1/vol002,size=5g,hitarea=100m
#
# sd003 through sd191 statements here
#
sd=sd192,lun=/dev/zvol/rdsk/p4/vol192,size=5g,hitarea=100m
# VDBENCH work load definitions for run
# Sequential write to fill storage.
wd=swrite1,sd=sd*,readpct=0,seekpct=eof
# Random Read work load.
wd=rread,sd=sd*,readpct=100,seekpct=random,rhpct=100
# VDBENCH Run Definitions for actual execution of load.
rd=default,iorate=max,elapsed=3h,interval=10
rd=seqwritewarmup,wd=swrite1,forxfersize=(1024k),forthreads=(16)
rd=default,iorate=max,elapsed=10m,interval=10
rd=rread8k-50,wd=rread,forxfersize=(8k),iorate=curve, \
curve=(95,90,80,70,60,50),forthreads=(2)

See Also
Vdbench OTN
SPARC T7-1 Server oracle.com    OTN
Oracle Server X5-2L oracle.com    OTN
Oracle Solaris oracle.com    OTN

Disclosure Statement
Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 10/25/2015.


Benchmark

Live Migration: SPARC T7-2 Oracle VM Server for SPARC Performance

One of the features that Oracle VM Server for SPARC offers is Live Migration, which is the process of securely moving an active logical domain (LDom, Virtual Machine) between different physical machines while maintaining application services to users. Memory, storage, and network connectivity of the logical domain are transferred from the original logical domain's machine to the destination target machine with all data compressed and encrypted.

Oracle's Live Migration is secure by default, using SSL (AES256_GCM_SHA384) to encrypt migration network traffic to protect sensitive data from exploitation and to eliminate the requirement for additional hardware and dedicated networks. Additional authentication schemes can be set up to increase security for the source and target machines. VMware vMotion and IBM PowerVM do not support Secure Live Migration by default (see below).

An enterprise Java workload with a 74 GB footprint in a 128 GB VM running on Oracle's SPARC T7-2 server migrated to another SPARC T7-2 server in just 95 seconds, with 30 seconds of suspension time visible to the user.

Performance Landscape

Results from moving an active workload as well as two different idle workloads. The LDom was allocated 128 GB of memory.

Mission-Critical LDom Live Migration Benchmark
Test                              Total Migration Time (sec)   Data Moved (GB)   Network Bandwidth (MB/sec)
Enterprise Java Workload/Active   95                           74.3              835.3
After Active Workload/Idle        13                           1.9               236.1
Out of the Box/Idle               13                           1.1               135.4

Enterprise Java Workload Performance
Test Conditions         Average Operations per Second
During Live Migration   347,370
No Migration            596,914

Configuration Summary

2 x SPARC T7-2, each with
2 x SPARC M7 processors (4.13 GHz)
512 GB memory (32 x 16 GB DDR4-2133 DIMMs)
6 x 600 GB 10K RPM SAS-2 HDD
10 GbE (built-in network device)
Oracle Solaris 11.3 (11.3.0.26.0)
Oracle VM Server for SPARC (LDoms v 3.3.0.0 Integration 17)

The configuration of the LDoms on the source machine is:

Source Machine Configuration
LDom                 vcpus            Memory
Primary/control      128 (16 cores)   128 GB
Guest0               128 (16 cores)   110 GB
Guest1 (Migration)   128 (16 cores)   128 GB
Guest2               128 (16 cores)   110 GB

The configuration of the LDoms on the target machine is:

Target Machine Configuration
LDom                 vcpus            Memory
Primary/control      128 (16 cores)   128 GB

Benchmark Description

By running a Java workload on a logical domain and then starting a Live Migration of that logical domain to a target machine, the major performance metrics of live migration can be measured:
Total Migration Time (the total time it takes to migrate a logical domain).
Effect on Application Performance (how much an application's performance degrades because of being migrated).

The number of logical domains on the source machine is three (Guest0, Guest1, Guest2) in order to represent a more realistic environment where all the source machine resources (vcpus and memory) are in use, by running the same Java workload on each LDom. Three different experiments were run:

Enterprise Java Workload/Active: starting the same Java workload at the same time on three logical domains (Guest0, Guest1, and Guest2), the Live Migration of Guest1 is executed after an arbitrary amount of time.
After Active Workload/Idle: after running a Java workload on three logical domains (Guest0, Guest1, and Guest2), so the memory of each has been touched, and with no workload running on any of them, the Live Migration of Guest1 is executed.
Out of the Box/Idle: as soon as the three logical domains (Guest0, Guest1, and Guest2) are installed or rebooted with Oracle Solaris and no workload is running on any of them, the Live Migration of Guest1 is executed.

Key Points and Best Practices

The network interconnection between the primaries on the source and target machines is the 10 GbE built-in network device, configured to use Jumbo Frames (MTU=9000) in order to get higher bandwidth during the live migration.
The Enterprise Java Workload performance on the non-migrated logical domains (Guest0, Guest2) was not affected before, during, or after the live migration of Guest1.
IBM PowerVM does not support Secure Live Migration by default; IBM's technology is named Live Partition Mobility, see Cloud Security Guidelines for IBM Power Systems, January 2015, p. 89, section 4.10.1 "Live Partition Mobility".
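To make the workflow concrete, here is a minimal sketch of the commands an administrator might use for the jumbo-frame setup and the migration itself; the domain, link, and host names are illustrative, not the exact ones used in this test.

# On both the source and target control domains: enable jumbo frames on the migration link (link name is illustrative).
dladm set-linkprop -p mtu=9000 net0
# From the source control domain: dry run first to verify the target can accept the domain.
ldm migrate-domain -n Guest1 root@target-host
# Then securely live migrate the guest; the command prompts for the target machine's credentials.
ldm migrate-domain Guest1 root@target-host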
See Also
SPARC T7-2 Server oracle.com    OTN
Oracle Solaris oracle.com    OTN
Oracle and Virtualization oracle.com    OTN
VMware vMotion does not support Secure Live Migration by default
IBM PowerVM does not support Secure Live Migration by default

Disclosure Statement
Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 25 October 2015.


Benchmark

Virtualized Storage: SPARC T7-1 Performance

Oracle's SPARC T7-1 server using SR-IOV enabled HBAs can achieve near-native throughput. The SPARC T7-1 server, with its dramatically improved compute engine, can also achieve near-native throughput with Virtual Disk (VDISK).

The SPARC T7-1 server produced 604,219 8K read IOPS with native Oracle Solaris 11.3 using 8 Gb FC HBAs. The SPARC T7-1 server using Oracle VM Server for SPARC 3.1 with 4 LDom VDISK produced near-native performance of 603,766 8K read IOPS. With SR-IOV enabled using 2 LDoms, the SPARC T7-1 server produced 604,966 8K read IOPS.
The SPARC T7-1 server running Oracle VM Server for SPARC 3.1 delivered 2.8 times the virtualized IO throughput of a Sun Server X3-2L system (two Intel Xeon E5-2690 processors, running a popular virtualization product). The virtualized x86 system produced 209,166 8K virtualized read IOPS. The native performance of the x86 system was 338,458 8K read IOPS.
The SPARC T7-1 server produced 891,025 4K read IOPS with native Oracle Solaris 11.3 using 8 Gb FC HBAs. The SPARC T7-1 server using Oracle VM Server for SPARC 3.1 with 4 LDom VDISK produced near-native performance of 849,493 4K read IOPS. With SR-IOV enabled using 2 LDoms, the SPARC T7-1 server produced 891,338 4K read IOPS.
The SPARC T7-1 server running Oracle VM Server for SPARC 3.1 delivered 3.8 times the virtualized IO throughput of a Sun Server X3-2L system (Intel Xeon E5-2690, running a popular virtualization product). The virtualized x86 system produced 219,830 4K virtualized read IOPS. The native performance of the x86 system was 346,868 4K read IOPS.
The SPARC T7-1 server running Oracle VM Server for SPARC 3.1 delivered 1.3 times more throughput with 16 Gb FC HBAs compared to 8 Gb FC HBAs. This is quite impressive considering it was still attached to 8 Gb switches and storage.

Performance Landscape

Results presented below are for read performance, first for 8K and then for 4K sizes. All of the following results were run as part of this benchmark effort.

Read Performance — 8K (read IOPS)
System                      Native      Virtual Disk   SR-IOV
SPARC T7-1 (16 Gb FC)       796,849     N/A            797,221
SPARC T7-1 (8 Gb FC)        604,219     603,766        604,966
Sun Server X3-2 (8 Gb FC)   338,458     209,166        N/A

Read Performance — 4K (read IOPS)
System                      Native      Virtual Disk   SR-IOV
SPARC T7-1 (16 Gb FC)       1,185,392   N/A            1,231,808
SPARC T7-1 (8 Gb FC)        891,025     849,493        891,338
Sun Server X3-2 (8 Gb FC)   346,868     219,830        N/A

Configuration Summary

SPARC T7-1 server
1 x SPARC M7 processor (4.13 GHz)
256 GB memory (16 x 16 GB)
Oracle Solaris 11.3
Oracle VM Server for SPARC 3.1
4 x Sun Storage 16 Gb Fibre Channel PCIe Universal FC HBA, Qlogic
4 x StorageTek 8 Gb Fibre Channel PCIe HBA

Sun Server X3-2 system
2 x Intel Xeon Processor E5-2690 (2.90 GHz)
128 GB memory
Oracle Solaris 11.2
Popular Virtualization Software
4 x StorageTek 8 Gb Fibre Channel PCIe HBA

Storage SAN
Brocade 5300 Switch
2 x Sun Storage 6780 array with 64 disk drives / 16 GB Cache
2 x Sun Storage 2540-M2 arrays with 36 disk drives / 1.5 GB Cache

Benchmark Description

The benchmark tests operating system IO efficiency of native and virtual machine environments. The test accesses storage devices raw, with no operating system buffering. The storage space accessed fit within the cache controller on the storage arrays for low latency and highest throughput. All accesses were random 4K or 8K reads. Tests were executed with Oracle's Vdbench Version 5.04.03 tool against 32 LUNs. Each LUN was run with a queue depth of 32.
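As a rough illustration of the two virtual I/O paths compared above, the following Oracle VM Server for SPARC commands sketch how an FC virtual function and a virtual disk might be assigned to a guest domain; the device paths, volume names, and domain names are placeholders, not the exact configuration used in this benchmark.

# List physical functions, then create an SR-IOV virtual function on an FC HBA and hand it to the guest.
ldm list-io
ldm create-vf /SYS/MB/PCIE1/IOVFC.PF0
ldm add-io /SYS/MB/PCIE1/IOVFC.PF0.VF0 ldg1
# Alternatively, export a LUN from the service domain as a virtual disk (VDISK path).
ldm add-vdsdev /dev/dsk/c0t600A0B80002FD8AAd0s2 vol1@primary-vds0
ldm add-vdisk vdisk1 vol1@primary-vds0 ldg1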
See Also Vdbench Version 5.04.03 OTN     SPARC T7-1 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Disclosure Statement Copyright 2015, Oracle and/or its affiliates.  All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates.  Other names may be trademarks of their respective owners.  Results as of 10/25/2015.


Benchmark

Oracle Internet Directory: SPARC T7-2 World Record

Oracle's SPARC T7-2 server running Oracle Internet Directory (OID, Oracle's LDAP Directory Server) on Oracle Solaris 11 on a virtualized processor configuration achieved a record result on the Oracle Internet Directory benchmark.

The SPARC T7-2 server, virtualized to use a single processor, achieved world record performance running the Oracle Internet Directory benchmark with 50M users.
The SPARC T7-2 server and Oracle Internet Directory using Oracle Database 12c running on Oracle Solaris 11 achieved a record result of 1.18M LDAP searches/sec with an average latency of 0.85 msec with 1000 clients.
The SPARC T7 server demonstrated 25% better throughput and 23% better latency for LDAP searches/sec over a similarly configured SPARC T5 server benchmark environment.
Oracle Internet Directory achieved near-linear scalability on the virtualized single-processor domain on the SPARC T7-2 server, from 79K LDAP searches/sec with 2 cores to 1.18M LDAP searches/sec with 32 cores.
Oracle Internet Directory and the virtualized single-processor domain on the SPARC T7-2 server achieved up to 22,408 LDAP modify/sec with an average latency of 2.23 msec for 50 clients.

Performance Landscape

A virtualized single SPARC M7 processor in a SPARC T7-2 server was used for the test results presented below. The SPARC T7-2 server and SPARC T5-2 server results were run as part of this benchmark effort. The remaining results were part of a previous benchmark effort.

Oracle Internet Directory Tests
System       chips/cores   Search ops/sec   Search lat (msec)   Modify ops/sec   Modify lat (msec)   Add ops/sec   Add lat (msec)
SPARC T7-2   1/32          1,177,947        0.85                22,400           2.2                 1,436         11.1
SPARC T5-2   2/32          944,624          1.05                16,700           2.9                 1,000         15.95
SPARC T4-4   4/32          682,000          1.46                12,000           4.0                 835           19.0

Scaling runs were also made on the virtualized single-processor domain on the SPARC T7-2 server.

Scaling of Search Tests – SPARC T7-2, One Processor
Cores   Clients   ops/sec     Latency (msec)
32      1000      1,177,947   0.85
24      1000      863,343     1.15
16      500       615,563     0.81
8       500       280,029     1.78
4       100       156,114     0.64
2       100       79,300      1.26

Configuration Summary

System Under Test: SPARC T7-2
2 x SPARC M7 processors, 4.13 GHz
512 GB memory
6 x 600 GB internal disks
1 x Sun Storage ZS3-2 (used for database and log files)
Flash storage (used for redo logs)
Oracle Solaris 11.3
Oracle Internet Directory 11g Release 1 PS7 (11.1.1.7.0)
Oracle Database 12c Enterprise Edition 12.1.0.2 (64-bit)

Benchmark Description

Oracle Internet Directory (OID) is Oracle's LDAPv3 Directory Server. The throughput for five key operations is measured — Search, Compare, Modify, Mix and Add.

LDAP Search Operations Test
This test scenario involved concurrent clients binding once to OID and then performing repeated LDAP Search operations. The salient characteristics of this test scenario are as follows:
The SLAMD SearchRate job was used.
The BaseDN of the search is the root of the DIT, the scope is SUBTREE, the search filter is an equality filter on UID, and DN and UID are the requested attributes. Each LDAP search operation matches a single entry.
The total number of concurrent clients was 1000, distributed across two client nodes.
Each client binds to OID once and performs repeated LDAP Search operations, each search operation resulting in the lookup of a unique entry, in such a way that no client looks up the same entry twice, no two clients look up the same entry, and all entries are searched randomly.
In one run of the test, random entries from the 50 million entries are looked up in as many LDAP Search operations.
The test job was run for 60 minutes.
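As a rough illustration of the search operation described above, a single lookup of the kind the SLAMD SearchRate job issues could be reproduced from the command line roughly as follows; the host, port, bind DN, base DN, and UID value are placeholders rather than the benchmark's actual values.

# One subtree search on an indexed UID, returning only dn and uid (all values are illustrative).
ldapsearch -h oid-host -p 3060 -D "cn=orcladmin" -w password \
  -b "dc=example,dc=com" -s sub "(uid=user0012345)" dn uid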
LDAP Compare Operations Test
This test scenario involved concurrent clients binding once to OID and then performing repeated LDAP Compare operations on the userpassword attribute. The salient characteristics of this test scenario are as follows:
The SLAMD CompareRate job was used.
Each LDAP compare operation matches the user password of a user.
The total number of concurrent clients was 1000, distributed across two client nodes.
Each client binds to OID once and performs repeated LDAP compare operations.
In one run of the test, random entries from the 50 million entries are compared in as many LDAP compare operations.
The test job was run for 60 minutes.

LDAP Modify Operations Test
This test scenario consisted of concurrent clients binding once to OID and then performing repeated LDAP Modify operations. The salient characteristics of this test scenario are as follows:
The SLAMD ModRate job was used.
A total of 50 concurrent LDAP clients were used.
Each client updates a unique entry each time, and a total of 50 million entries are updated.
The test job was run for 60 minutes.
The value length was set to 11.
The attribute being modified is not indexed.

LDAP Mixed Load Test
The test scenario involved both the LDAP search and LDAP modify clients enumerated above. The ratio was 60% LDAP search clients, 30% LDAP bind clients and 10% LDAP modify clients. A total of 1000 concurrent LDAP clients were used, distributed on 2 client nodes. The test job was run for 60 minutes.

LDAP Add Load Test
The test scenario involved concurrent clients adding new entries as follows:
The SLAMD standard add rate job was used.
A total of 500,000 entries were added.
A total of 16 concurrent LDAP clients were used.
SLAMD adds entries of the inetorgperson objectclass with 21 attributes (including operational attributes).

See Also
SPARC T7-2 Server oracle.com    OTN
Oracle Internet Directory oracle.com    OTN
Oracle Identity Management oracle.com    OTN
Oracle Fusion Middleware oracle.com    OTN
Oracle Solaris oracle.com    OTN
Oracle Database oracle.com    OTN

Disclosure Statement
Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 25 October 2015.


Benchmark

Oracle FLEXCUBE Universal Banking: SPARC T7-1 World Record

Oracle's SPARC T7-1 servers running Oracle FLEXCUBE Universal Banking Release 12 along with Oracle Database 12c Enterprise Edition with Oracle Real Application Clusters on Oracle Solaris 11 produced record results for two-processor solutions.

Two SPARC T7-1 servers, each running Oracle FLEXCUBE Universal Banking Release 12 (v 12.0.1) and Oracle Real Application Clusters 12c database on Oracle Solaris 11, achieved record End of Year batch processing of 25 million accounts with 200 branches in 4 hours 34 minutes (a total of two processors).
A single SPARC T7-1 server running Oracle FLEXCUBE Universal Banking Release 12 processing 100 branches was able to complete the workload in a similar time as the two-node, 200-branch End of Year workload, demonstrating good scaling of the application.
The customer-representative workload covered all 25 million accounts, including savings accounts, current accounts, loans and TD accounts, created on the basis of 25 million Customer IDs with 200 branches.
Oracle's SPARC M7 and T7 servers running Oracle Solaris 11 with built-in Silicon Secured Memory and Oracle Database 12c can benefit global retail and corporate financial institutions that are running Oracle FLEXCUBE Universal Banking Release 12. The co-engineered Oracle software and hardware unlock agile capabilities demanded by modern business environments. The SPARC T7-1 system and Oracle Solaris provide a combination of essential characteristics that resonate with core values for a modern financial services institution. The SPARC M7 processor based systems are capable of delivering higher performance and lower total cost of ownership (TCO) than older SPARC infrastructure, without introducing the unseen tax and risk of migrating applications away from older SPARC systems.

Performance Landscape

Oracle FLEXCUBE Universal Banking Release 12 End of Year Batch Processing
System           Branches   Time (minutes)
2 x SPARC T7-1   200        274
1 x SPARC T7-1   100        268

Configuration Summary

Systems Under Test: 2 x SPARC T7-1, each with
1 x SPARC M7 processor, 4.13 GHz
256 GB memory
Oracle Solaris 11.3 (11.3.0.27.0)
Oracle Database 12c (RAC/ASM 12.1.0.2 BP7)
Oracle FLEXCUBE Universal Banking Release 12

Storage Configuration:
Oracle ZFS Storage ZS4-4 appliance

Benchmark Description

The Oracle FLEXCUBE Universal Banking Release 12 benchmark models an actual customer bank with End of Cycle transaction batch jobs, which typically execute during non-banking hours. This benchmark includes accrual for savings and term deposit accounts, interest capitalization for savings accounts, interest payout for term deposit accounts, and consumer loan processing. This benchmark helps banks refine their infrastructure requirements for the volumes and scale of operations for business expansion. The end of cycle can be year, month or day, with year having the most processing, followed by month and then day.

See Also
SPARC T7-1 Server oracle.com    OTN
Oracle FLEXCUBE oracle.com
Oracle Solaris oracle.com    OTN
Oracle Database oracle.com    OTN

Disclosure Statement
Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 25 October 2015.


Benchmark

Oracle Stream Explorer DDOS Attack: SPARC T7-4 World Record

A single processor of Oracle's SPARC T7-4 server achieved a world record result running an Oracle Stream Explorer platform benchmark.  The Oracle Stream Explorer platform is used to process multiple event streams to detect patterns and trends in real time.  The benchmark detects malicious IP addresses that cause a distributed denial of service (DDOS) attack. A single SPARC M7 processor of a SPARC T7-4 server running Oracle Stream Explorer achieved a throughput result of 1.505 million ops/sec. The SPARC M7 processor achieved 2.9 times the throughput of an x86 Intel Xeon Processor E7-8895 v3 based server. Performance Landscape All of the following results were run as part of this benchmark effort. Oracle Stream Explorer Throughput Test One Processor Performance System Throughput SPARC T7-4 1.505 M ops/sec Oracle Server X5-4 0.522 M ops/sec Configuration Summary SPARC Server: SPARC T7-4 4 x SPARC M7 processors 1 TB memory Oracle Solaris 11.3 Oracle Stream Explorer 11.1.1.7 (PS6) Oracle JDK 6 x86 Server: Oracle Server X5-4 4 x Intel Xeon Processor E7-8895 v3 1 TB memory Oracle Solaris 11.3 Oracle Stream Explorer 11.1.1.7 (PS6) Oracle JDK 6 Benchmark Description The benchmark detects malicious IP addresses that cause a distributed denial of service (DDOS) attack on a system.  The benchmark determines which IP address sent the most packets.  The benchmark has a dedicated load generator program for each Oracle Stream Explorer platform instance. The Oracle Stream Explorer platform instance is always in a listening mode. When it receives data on its network socket, it starts incrementing the packet counter. Different Oracle Stream Explorer platform instances are deployed on different network sockets.  The packet counter is printed out in regular intervals as the throughput for benchmarking purposes. Key Points and Best Practices The load generator was run on the system under test.  One processor was used for the event processing, the other processors were used for the load generation. On the SPARC T7-4 server, three SPARC M7 processors were assigned the task of running the 200 load generators.  This was accomplished using the "psrset" command. On the Oracle Server X5-4 system, three Intel Xeon Processor E7-8895 v3 were assigned the task of running the 36 load generators. Only 25 cores of the SPARC M7 processor were required to satisfy the workload.  The 200 Oracle Stream Explorer applications were bound eight per core. All 18 cores of the Intel Xeon Processor E7-8895 v3 were required to satisfy the workload.  The 36 Oracle Stream Explorer applications were bound two per core. See Also SPARC T7-4 Server oracle.com    OTN Oracle Server X5-4 oracle.com    OTN Oracle Stream Explorer Platform oracle.com    OTN Oracle SOA Suite oracle.com    OTN Oracle Fusion Middleware oracle.com    OTN Oracle Solaris oracle.com    OTN Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved.  Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 25 October 2015.
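Referring back to the processor binding described in the Key Points above, here is a minimal sketch of how CPUs can be fenced off for load generation with the Solaris psrset command; the CPU ID range, process ID, and program name are illustrative only.

# Create a processor set from the CPUs of three of the four sockets (IDs are illustrative).
psrset -c 256-1023
# Suppose the command reports "created processor set 1"; run a load generator inside that set.
psrset -e 1 ./load_generator
# Or bind an already-running process (PID illustrative) to the set.
psrset -b 1 12345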


Benchmark

PeopleSoft Human Capital Management 9.1 FP2: SPARC M7-8 World Record

This result demonstrates how Oracle's SPARC M7-8 server using Oracle VM Server for SPARC (LDoms) provides mission-critical enterprise virtualization.

The virtualized two-chip, 1 TB LDom of the SPARC M7-8 server set a world record two-chip PeopleSoft Human Capital Management (HCM) 9.1 FP2 benchmark result, supporting 35,000 HR Self-Service online users with response times under one second, while simultaneously running the Payroll batch workload.
The virtualized two-chip LDom of the SPARC M7-8 server demonstrated 4 times better Search and 6 times better Save average response times while running nearly double the number of online users along with payroll batch, compared to the ten-chip x86 solution from Cisco.
Using only a single chip in the virtualized two-chip LDom on the SPARC M7-8 server, the batch-only run demonstrated 1.8 times better throughput (payments/hour) compared to a four-chip Cisco UCS B460 M4 server.
Using only a single chip in the virtualized two-chip LDom on the SPARC M7-8 server, the batch-only run demonstrated 2.3 times better throughput (payments/hour) compared to a nine-chip IBM zEnterprise z196 server (EC 2817-709, 9-way, 8943 MIPS).
This record result demonstrates that a two SPARC M7 processor LDom (in a SPARC M7-8) can run the same number of online users as a dynamic domain (PDom) of eight SPARC M6 processors (in a SPARC M6-32), with better online response times, batch elapsed times and batch throughput (payments/hour).
The SPARC M7-8 server provides enterprise applications high availability and security, where each application is executed in its own environment, independent of the others.

Performance Landscape

The first table presents the combined results, running both the PeopleSoft HR Self-Service Online and Payroll Batch tests concurrently.

PeopleSoft HR Self-Service Online And Payroll Batch Using Oracle Database 11g
System (Processors)                                                    Chips Used   Users    Search/Save           Batch Elapsed Time   Batch Pay/Hr
SPARC M7-8 (SPARC M7), LDom1                                           2            35,000   0.67 sec / 0.42 sec   22.71 min            1,322,272
SPARC M7-8 (SPARC M7), LDom2                                           2            35,000   0.85 sec / 0.50 sec   22.96 min            1,307,875
SPARC M6-32 (SPARC M6)                                                 8            35,000   1.80 sec / 1.12 sec   29.2 min             1,029,440
Cisco 1 x B460 M4, 3 x B200 M3 (Intel E7-4890 v2, Intel E5-2697 v2)    10           18,000   2.70 sec / 2.60 sec   21.70 min            1,383,816

The following results are from running only the PeopleSoft HR Self-Service Online test.

PeopleSoft HR Self-Service Online Using Oracle Database 11g
System (Processors)                                                    Chips Used   Users    Search/Save Avg Response Times
SPARC M7-8 (SPARC M7), LDom1                                           2            40,000   0.55 sec / 0.33 sec
SPARC M7-8 (SPARC M7), LDom2                                           2            40,000   0.56 sec / 0.32 sec
SPARC M6-32 (SPARC M6)                                                 8            40,000   2.73 sec / 1.33 sec
Cisco 1 x B460 M4, 3 x B200 M3 (Intel E7-4890 v2, Intel E5-2697 v2)    10           20,000   0.35 sec / 0.17 sec

The following results are from running only the PeopleSoft Payroll Batch test. For the SPARC M7-8 server results, only one of the processors was used per LDom. This was accomplished using processor sets to further restrict the test to a single SPARC M7 processor.
PeopleSoft Payroll Batch Using Oracle Database 11g
System (Processors)                          Chips Used   Batch Elapsed Time   Batch Pay/Hr
SPARC M7-8 (SPARC M7), LDom1                 1            13.06 min            2,299,296
SPARC M7-8 (SPARC M7), LDom2                 1            12.85 min            2,336,872
SPARC M6-32 (SPARC M6)                       2            18.27 min            1,643,612
Cisco UCS B460 M4 (Intel E7-4890 v2)         4            23.02 min            1,304,655
IBM z196 zEnterprise (5.2 GHz, 8943 MIPS)    9            30.50 min            984,551

Configuration Summary

System Under Test:
SPARC M7-8 server with
8 x SPARC M7 processor (4.13 GHz)
4 TB memory
Virtualized as two Oracle VM Server for SPARC (LDom), each with
2 x SPARC M7 processor (4.13 GHz)
1 TB memory

Storage Configuration:
2 x Oracle ZFS Storage ZS3-2 appliance (DB Data), each with
40 x 300 GB 10K RPM SAS-2 HDD, 8 x Write Flash Accelerator SSD and 2 x Read Flash Accelerator SSD 1.6TB SAS
2 x Oracle Server X5-2L (DB redo logs & App object cache), each with
2 x Intel Xeon Processor E5-2630 v3
32 GB memory
4 x 1.6 TB NVMe SSD

Software Configuration:
Oracle Solaris 11.3
Oracle Database 11g Release 2 (11.2.0.3.0)
PeopleSoft Human Capital Management 9.1 FP2
PeopleSoft PeopleTools 8.52.03
Oracle Java SE 6u32
Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 043
Oracle WebLogic Server 11g (10.3.5)

Benchmark Description

The PeopleSoft Human Capital Management benchmark simulates thousands of online employees, managers and Human Resource administrators executing transactions typical of a Human Resources Self Service application for the Enterprise. Typical transactions are: viewing paychecks, promoting and hiring employees, updating employee profiles, etc. The database tier uses a database instance of about 500 GB in size, containing information for 500,480 employees. The application tier for this test includes web and application server instances, specifically Oracle WebLogic Server 11g, PeopleSoft Human Capital Management 9.1 FP2 and Oracle Java SE 6u32.

Key Points and Best Practices

In the HR online along with Payroll batch run, each LDom had one Oracle Solaris Zone of 7 cores containing the Web tier, two Oracle Solaris Zones of 16 cores each containing the Application tier, and one Oracle Solaris Zone of 23 cores containing the Database tier. Two cores were dedicated to network and disk interrupt handling.
In the HR online only run, each LDom had one Oracle Solaris Zone of 12 cores containing the Web tier, two Oracle Solaris Zones of 18 cores each containing the Application tier, and one Oracle Solaris Zone of 14 cores containing the Database tier. Two cores were dedicated to network and disk interrupt handling.
In the Payroll batch only run, each LDom had one Oracle Solaris Zone of 31 cores containing the Database tier. One core was dedicated to disk interrupt handling.
All database data files, recovery files and Oracle Clusterware files for the PeopleSoft test were created with the Oracle Automatic Storage Management (Oracle ASM) volume manager for the added benefit of the ease of management provided by the Oracle ASM integrated storage management solution.
In the application tier on each LDom, 5 PeopleSoft application domains with 350 application servers (70 per domain) were hosted in two separate Oracle Solaris Zones, for a total of 10 domains with 700 application server processes.
All PeopleSoft Application processes and the 32 Web Server JVM instances were executed in the Oracle Solaris FX scheduler class.
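To give a concrete flavor of the zone layout described in the Key Points above, here is a minimal sketch of how a database-tier zone with its own CPU allocation might be defined on Oracle Solaris; the zone name, path, and CPU count are illustrative, and the sketch uses the simpler dedicated-cpu resource rather than the exact pool and processor-set layout of the benchmark.

# Define and install a database-tier zone with dedicated CPUs (names and counts are illustrative).
zonecfg -z dbzone "create; set zonepath=/zones/dbzone; add dedicated-cpu; set ncpus=184; end"
zoneadm -z dbzone install
zoneadm -z dbzone boot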
See Also Oracle's PeopleSoft HRMS 9.1 FP2 Self-service and Payroll using Oracle DB for Oracle Solaris (Unicode) on an Oracle's SPARC M7-8 Server Oracle's PeopleSoft HRMS 9.1 FP2 Self-service using Oracle DB for Oracle Solaris (Unicode) on an Oracle's SPARC M7-8 Server Oracle's PeopleSoft HRMS 9.1 FP2 Payroll using Oracle DB for Oracle Solaris (Unicode) on an Oracle's SPARC M7-8 Oracle Applications Benchmarks Oracle's PeopleSoft Benchmark White Papers Cisco HR Self-Serve Online and Payroll Batch Result Cisco HR Self-Service Online Result Cisco HR Payroll Batch Result IBM z196 Payroll Batch Result, Mainframe MIPS SPARC M7-8 Server oracle.com    OTN PeopleSoft Enterprise Human Capital Management oracle.com     OTN PeopleSoft Enterprise Human Capital Management (Payroll) oracle.com     OTN Oracle Database oracle.com     OTN Oracle Solaris oracle.com    OTN Disclosure Statement Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.  Results as of 10/25/2015.


Benchmark

Oracle E-Business Payroll Batch Extra-Large: SPARC T7-1 World Record

Oracle's SPARC T7-1 server set a world record running the Oracle E-Business Suite 12.1.3 Standard Extra-Large (250,000 Employees) Payroll (Batch) workload.

The SPARC T7-1 server produced a world record result of 1,527,494 employee records processed per hour (9.82 min elapsed time) on the Oracle E-Business Suite R12 (12.1.3) Extra-Large Payroll (Batch) benchmark.
The SPARC T7-1 server, equipped with one 4.13 GHz SPARC M7 processor, demonstrated 36% better hourly employee throughput compared to a two-chip Cisco UCS B200 M4 (Intel Xeon E5-2697 v3).
The SPARC T7-1 server, equipped with one 4.13 GHz SPARC M7 processor, demonstrated 40% better hourly employee throughput compared to a two-chip IBM S824 (POWER8, using 12 cores total).

Performance Landscape

This is the world record result for the Payroll Extra-Large model using the Oracle E-Business 12.1.3 workload.

Batch Workload: Payroll Extra-Large Model
System              Processor                              Employees/Hr   Elapsed Time
SPARC T7-1          1 x SPARC M7 (4.13 GHz)                1,527,494      9.82 minutes
Cisco UCS B200 M4   2 x Intel Xeon Processor E5-2697 v3    1,125,281      13.33 minutes
IBM S824            2 x POWER8 (3.52 GHz)                  1,090,909      13.75 minutes
Cisco UCS B200 M3   2 x Intel Xeon Processor E5-2697 v2    1,017,639      14.74 minutes
Cisco UCS B200 M3   2 x Intel Xeon Processor E5-2690       839,865        17.86 minutes
Sun Server X3-2L    2 x Intel Xeon Processor E5-2690       789,473        19.00 minutes

Configuration Summary

Hardware Configuration:
SPARC T7-1 server
1 x SPARC M7 processor (4.13 GHz)
256 GB memory (16 x 16 GB)
Oracle ZFS Storage ZS3-2 appliance (DB Data storage) with 40 x 900 GB 10K RPM SAS-2 HDD, 8 x Write Flash Accelerator SSD and 2 x Read Flash Accelerator SSD 1.6 TB SAS
Oracle Flash Accelerator F160 PCIe Card (1.6 TB NVMe for DB Log storage)

Software Configuration:
Oracle Solaris 11.3
Oracle E-Business Suite R12 (12.1.3)
Oracle Database 11g (11.2.0.3.0)

Benchmark Description

The Oracle E-Business Suite Standard R12 Benchmark combines online transaction execution by simulated users with concurrent batch processing to model a typical scenario for a global enterprise. This benchmark ran one Batch component, Payroll, in the Extra-Large size. Results can be published in four sizes, using one or more online/batch modules:

X-large: maximum online users running all business flows between 10,000 and 20,000; 750,000 order to cash lines per hour and 250,000 payroll checks per hour.
  Order to Cash Online — 2,400 users. The percentages across the 5 transactions in the Order Management module are: Insert Manual Invoice — 16.66%, Insert Order — 32.33%, Order Pick Release — 16.66%, Ship Confirm — 16.66%, Order Summary Report — 16.66%.
  HR Self-Service — 4,000 users
  Customer Support Flow — 8,000 users
  Procure to Pay — 2,000 users
Large: 10,000 online users; 100,000 order to cash lines per hour and 100,000 payroll checks per hour.
Medium: up to 3,000 online users; 50,000 order to cash lines per hour and 10,000 payroll checks per hour.
Small: up to 1,000 online users; 10,000 order to cash lines per hour and 5,000 payroll checks per hour.

Key Points and Best Practices

All system optimizations are in the published report, which is referenced in the See Also section below.
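As a quick back-of-the-envelope check of the headline number, the hourly employee throughput follows directly from the 250,000-employee run completing in 9.82 minutes; this is just the arithmetic, not part of the published report.

# 250,000 employees processed in 9.82 minutes, scaled to one hour (integer result).
echo "scale=0; 250000 * 60 / 9.82" | bc
# prints 1527494, matching the reported employees/hour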
See Also E-Business Suite Applications R2 (R12.1.3) Extra-Large Payroll (Batch) Benchmark - Using Oracle 11g on Oracle's SPARC T7-1   oracle.com Oracle E-Business Suite Applications R12 Benchmark Results Oracle E-Business Suite Standard R12 Benchmark Overview Oracle E-Business R12 Benchmark Description SPARC T7-1 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle E-Business Suite oracle.com    OTN Disclosure Statement Oracle E-Business X-Large Payroll Batch workload, SPARC T7-1, 4.13 GHz, 1 chip, 32 cores, 256 threads, 256 GB memory, elapsed time 9.82 minutes, 1,527,494 hourly employee throughput, Oracle Solaris 11.3, Oracle E-Business Suite 12.1.3, Oracle Database 11g Release 2, Results as of 10/25/2015.


Benchmark

Oracle Communications ASAP – Telco Subscriber Activation: SPARC T7-2 World Record

Oracle's SPARC T7-2 server delivered world record results on Oracle Communications ASAP. The SPARC T7-2 server ran Oracle Solaris 11 with Oracle Database 11g Release 2, Oracle WebLogic Server 12c and Oracle Communications ASAP version 7.2.

Running Oracle Communications ASAP, the SPARC T7-2 server delivered a world record result of 3,018 ASDLs/sec (atomic network activation actions).
Oracle's SPARC M7 processor delivered over 2.5 times better throughput per unit of CPU cost per ASDL compared to the previous generation SPARC T5 processor.
The SPARC T7-2 server running a single instance of the Oracle Communications ASAP application, with both the application and database tiers consolidated onto a single machine, easily supported service activation volumes of 3,018 ASDLs/sec, which is representative of a typical mobile operator with more than 100 million subscribers.
Oracle Communications ASAP v7.2.0.4 delivered 35% higher throughput on the SPARC T7-2 server when compared to the SPARC T5-4 server.

Performance Landscape

All of the following results were run as part of this benchmark effort.

ASAP 7.2.0.4 Test Results – 16 NEP
Both tests used 1 CPU for the application tier and 1 CPU for the database tier.

System       ASDLs/sec   CPU Usage   CPU Cost per ASDL   Cost Improvement Ratio
SPARC T7-2   3,018.56    11.4%       1.10                2.6
SPARC T5-4   2,238.97    29.6%       2.15                -


Benchmark

Oracle E-Business Suite Applications R12.1.3 (OLTP X-Large): SPARC M7-8 World Record

Oracle's SPARC M7-8 server, using a four-chip Oracle VM Server for SPARC (LDom) virtualized server, produced a world record 20,000 users running the Oracle E-Business OLTP X-Large benchmark. The benchmark runs five Oracle E-Business online workloads concurrently: Customer Service, iProcurement, Order Management, Human Resources Self-Service, and Financials.

The virtualized four-chip LDom on the SPARC M7-8 was able to handle more users than the previous best result, which used eight processors of Oracle's SPARC M6-32 server.
The SPARC M7-8 server using Oracle VM Server for SPARC provides enterprise applications high availability, where each application is executed in its own environment, insulated and independent of the others.

Performance Landscape

Oracle E-Business (3-tier) OLTP X-Large Benchmark
System        Chips   Total Online Users   Weighted Average Response Time (sec)   90th Percentile Response Time (sec)
SPARC M7-8    4       20,000               0.70                                    1.13
SPARC M6-32   8       18,500               0.61                                    1.16

Breakdown of the total number of users by component:

Users per Component
Component            SPARC M7-8     SPARC M6-32
Total Online Users   20,000 users   18,500 users
HR Self-Service      5,000 users    4,000 users
Order-to-Cash        2,500 users    2,300 users
iProcurement         2,700 users    2,400 users
Customer Service     7,000 users    7,000 users
Financial            2,800 users    2,800 users

Configuration Summary

System Under Test:
SPARC M7-8 server with
8 x SPARC M7 processors (4.13 GHz)
4 TB memory
2 x 600 GB SAS-2 HDD
using a Logical Domain with
4 x SPARC M7 processors (4.13 GHz)
2 TB memory
2 x Sun Storage Dual 16Gb Fibre Channel PCIe Universal HBA
2 x Sun Dual Port 10GBase-T Adapter
Oracle Solaris 11.3
Oracle E-Business Suite 12.1.3
Oracle Database 11g Release 2

Storage Configuration:
4 x Oracle ZFS Storage ZS3-2 appliances, each with
2 x Read Flash Accelerator SSD
1 x Storage Drive Enclosure DE2-24P containing: 20 x 900 GB 10K RPM SAS-2 HDD, 4 x Write Flash Accelerator SSD
1 x Sun Storage Dual 8Gb FC PCIe HBA
Used for database files, zones OS, EBS mid-tier application software stack and db-tier Oracle server.
2 x Sun Server X4-2L servers, each with
2 x Intel Xeon Processor E5-2650 v2
128 GB memory
1 x Sun Storage 6Gb SAS PCIe RAID HBA
4 x 400 GB SSD
14 x 600 GB HDD
Used for redo log files and db backup storage.

Benchmark Description

The Oracle E-Business OLTP X-Large benchmark simulates thousands of online users executing transactions typical of an internal Enterprise Resource Processing system, simultaneously executing five application modules: Customer Service, Human Resources Self Service, iProcurement, Order Management and Financial. Each database tier uses a database instance of about 600 GB in size, supporting thousands of application users, accessing hundreds of objects (tables, indexes, SQL stored procedures, etc.).

Key Points and Best Practices

This test demonstrates virtualization technologies concurrently running various Oracle multi-tier business critical applications and databases on four SPARC M7 processors contained in a single SPARC M7-8 server, supporting thousands of users executing a high volume of complex transactions with a constrained (<1 sec) weighted average response time.
The Oracle E-Business LDom is further configured using Oracle Solaris Zones.
This result of 20,000 users was achieved by load balancing the Oracle E-Business Suite Applications 12.1.3 five online workloads across two Oracle Solaris processor sets and redirecting all network interrupts to a dedicated third processor set.
Each applications processor set (set-1 and set-2) was running concurrently two Oracle E-Business Suite Application servers and two database servers instances, each within its own Oracle Solaris Zone (4 x Zones per set). Each application server network interface (to a client) was configured to map with the locality group associated to the CPUs processing the related workload, to guarantee memory locality of network structures and application servers hardware resources. All external storage was connected with at least two paths to the host multipath-capable fibre channel controller ports and Oracle Solaris I/O multipathing feature was enabled. See Also Oracle E-Business SPARC M7-8 Report oracle.com    Oracle E-Business Suite Standard R12 Benchmark Results Oracle E-Business Suite Standard R12 Benchmark Overview Oracle E-Business R12 Benchmark Description SPARC M7-8 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle E-Business Suite oracle.com    OTN Oracle and Virtualization oracle.com    OTN Disclosure Statement Oracle E-Business Suite R12 extra-large multiple-online module benchmark, SPARC M7-8, SPARC M7, 4.13 GHz, 4 chips, 128 cores, 1024 threads, 2 TB memory, 20,000 online users, average response time 0.70 sec, 90th percentile response time 1.13 sec, Oracle Solaris 11.3, Oracle Solaris Zones, Oracle VM Server for SPARC, Oracle E-Business Suite 12.1.3, Oracle Database 11g Release 2, Results as of 10/25/2015.
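Referring back to the interrupt fencing and multipathing notes in the Key Points above, the sketch below shows the general shape of such tuning on Oracle Solaris; the link name and CPU list are illustrative, and the exact bindings used in the benchmark are not reproduced here.

# Keep a data link's interrupt and poll processing on a chosen set of CPUs (link and CPU IDs are illustrative).
dladm set-linkprop -p cpus=0-15 net0
# Enable Oracle Solaris I/O multipathing (MPxIO) on supported controller ports; a reboot follows.
stmsboot -e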


Benchmark

SPARC T7-1 Delivers 1-Chip World Records for SPEC CPU2006 Rate Benchmarks

This page has been updated on November 19, 2015. The SPARC T7-1 server results have been published at www.spec.org.

Oracle's SPARC T7-1 server delivered world record SPEC CPU2006 rate benchmark results for systems with one chip. This was accomplished with Oracle Solaris 11.3 and Oracle Solaris Studio 12.4 software.

The SPARC T7-1 server achieved world record scores of 1200 SPECint_rate2006, 1120 SPECint_rate_base2006, 832 SPECfp_rate2006, and 801 SPECfp_rate_base2006.
The SPARC T7-1 server beat the one-chip Fujitsu CELSIUS C740 with an Intel Xeon Processor E5-2699 v3 by 1.7x on the SPECint_rate2006 benchmark. The SPARC T7-1 server beat the one-chip NEC Express5800/R120f-1M with an Intel Xeon Processor E5-2699 v3 by 1.8x on the SPECfp_rate2006 benchmark.
The SPARC T7-1 server beat the one-chip IBM Power S812LC server with a POWER8 processor by 1.9 times on the SPECint_rate2006 benchmark and by 1.8 times on the SPECfp_rate2006 benchmark.
The SPARC T7-1 server beat the one-chip Fujitsu SPARC M10-4S with a SPARC64 X+ processor by 2.2x on the SPECint_rate2006 benchmark and by 1.6x on the SPECfp_rate2006 benchmark.
The SPARC T7-1 server improved upon the previous generation SPARC platform, which used the SPARC T5 processor, by 2.5 times on the SPECint_rate2006 benchmark and by 2.3 times on the SPECfp_rate2006 benchmark.

The SPEC CPU2006 benchmarks are derived from the compute-intensive portions of real applications, stressing chip, memory hierarchy, and compilers. The benchmarks are not intended to stress other computer components such as networking, the operating system, or the I/O system. Note that there are many other SPEC benchmarks, including benchmarks that specifically focus on Java computing, enterprise computing, and network file systems.

Performance Landscape

Complete benchmark results are at the SPEC website. The tables below provide the new Oracle results, as well as select results from other vendors. Presented are single-chip SPEC CPU2006 rate results. Only the best results published at www.spec.org per chip type are presented (best Intel, IBM, Fujitsu, and Oracle chips).

SPEC CPU2006 Rate Results – One Chip

SPECint_rate2006
System                     Chip                                   Peak   Base
SPARC T7-1                 SPARC M7 (4.13 GHz, 32 cores)          1200   1120
Fujitsu CELSIUS C740       Intel E5-2699 v3 (2.3 GHz, 18 cores)   715    693
IBM Power S812LC           POWER8 (2.92 GHz, 10 cores)            642    482
Fujitsu SPARC M10-4S       SPARC64 X+ (3.7 GHz, 16 cores)         546    479
SPARC T5-1B                SPARC T5 (3.6 GHz, 16 cores)           489    441
IBM Power 710 Express      POWER7 (3.55 GHz, 8 cores)             289    255

SPECfp_rate2006
System                     Chip                                   Peak   Base
SPARC T7-1                 SPARC M7 (4.13 GHz, 32 cores)          832    801
NEC Express5800/R120f-1M   Intel E5-2699 v3 (2.3 GHz, 18 cores)   474    460
IBM Power S812LC           POWER8 (2.92 GHz, 10 cores)            468    394
Fujitsu SPARC M10-4S       SPARC64 X+ (3.7 GHz, 16 cores)         462    418
SPARC T5-1B                SPARC T5 (3.6 GHz, 16 cores)           369    350
IBM Power 710 Express      POWER7 (3.55 GHz, 8 cores)             248    229

The following table compares the single-chip SPARC M7 processor based server to the best published two-chip POWER8 processor based server.
SPEC CPU2006 Rate Results Comparing One SPARC M7 Chip to Two POWER8 Chips System Chip Peak Base   SPECint_rate2006 SPARC T7-1 1 x SPARC M7 (4.13 GHz, 32core) 1200 1120 IBM Power S822LC 2 x POWER8 (2.92 GHz, 2x 10core) 1100 853   SPECfp_rate2006 SPARC T7-1 1 x SPARC M7 (4.13 GHz, 32 cores) 832 801 IBM Power S822LC 2 x POWER8 (2.92 GHz, 2x 10core) 888 745 Configuration Summary System Under Test: SPARC T7-1 1 x SPARC M7 processor (4.13 GHz) 512 GB memory (16 x 32 GB dimms) 800 GB on 4 x 400 GB SAS SSD (mirrored) Oracle Solaris Studio 12.4 with 4/15 Patch Set Benchmark Description SPEC CPU2006 is SPEC's most popular benchmark. It measures: Speed — single copy performance of chip, memory, compiler Rate — multiple copy (throughput) The benchmark is also divided into integer intensive applications and floating point intensive applications: integer: 12 benchmarks derived from applications such as artificial intelligence chess playing, artificial intelligence go playing, quantum computer simulation, perl, gcc, XML processing, and pathfinding floating point: 17 benchmarks derived from applications, including chemistry, physics, genetics, and weather. It is also divided depending upon the amount of optimization allowed: base: optimization is consistent per compiled language, all benchmarks must be compiled with the same flags per language. peak: specific compiler optimization is allowed per application. The overall metrics for the benchmark which are commonly used are: SPECint_rate2006, SPECint_rate_base2006: integer, rate SPECfp_rate2006, SPECfp_rate_base2006: floating point, rate SPECint2006, SPECint_base2006: integer, speed SPECfp2006, SPECfp_base2006: floating point, speed Key Points and Best Practices Jobs were bound using the "pbind" command. See Also SPEC website SPARC T7-1 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Solaris Studio oracle.com    OTN Disclosure Statement SPEC and the benchmark names SPECfp and SPECint are registered trademarks of the Standard Performance Evaluation Corporation.  Results as of November 19, 2015 from www.spec.org. SPARC T7-1: 1200 SPECint_rate2006, 1120 SPECint_rate_base2006, 832 SPECfp_rate2006, 801 SPECfp_rate_base2006; SPARC T5-1B: 489 SPECint_rate2006, 440 SPECint_rate_base2006, 369 SPECfp_rate2006, 350 SPECfp_rate_base2006; Fujitsu SPARC M10-4S: 546 SPECint_rate2006, 479 SPECint_rate_base2006, 462 SPECfp_rate2006, 418 SPECfp_rate_base2006. IBM Power 710 Express: 289 SPECint_rate2006, 255 SPECint_rate_base2006, 248 SPECfp_rate2006, 229 SPECfp_rate_base2006; Fujitsu CELSIUS C740: 715 SPECint_rate2006, 693 SPECint_rate_base2006; NEC Express5800/R120f-1M: 474 SPECfp_rate2006, 460 SPECfp_rate_base2006; IBM Power S822LC: 1100 SPECint_rate2006, 853 SPECint_rate_base2006, 888 SPECfp_rate2006, 745 SPECfp_rate_base2006; IBM Power S812LC: 642 SPECint_rate2006, 482 SPECint_rate_base2006, 468 SPECfp_rate2006, 394 SPECfp_rate_base2006.
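Referring back to the Key Points above, which note that benchmark copies were bound with the "pbind" command, here is a minimal illustration of binding processes to processors on Oracle Solaris; the processor and process IDs are made up for the example.

# Bind an already-running benchmark copy (PID 12345) to virtual processor 7.
pbind -b 7 12345
# Query the binding, then remove it when the run is complete.
pbind -q 12345
pbind -u 12345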


Benchmark

Virtualized Network Performance: SPARC T7-1

Oracle's SPARC T7-1 server using Oracle VM Server for SPARC exhibits lower network latency under virtualization. The network latency and bandwidth were measured using the Netperf benchmark.

TCP network latency between two Oracle VM Server for SPARC guests running on separate SPARC T7-1 servers, each using SR-IOV, is similar to that of two SPARC T7-1 servers without virtualization (native/bare metal).
TCP and UDP network latencies between two Oracle VM Server for SPARC guests running on separate SPARC T7-1 servers, each using assigned I/O, were significantly less than the other two I/O configurations (SR-IOV and paravirtual I/O).
TCP and UDP network latencies between two Oracle VM Server for SPARC guests running on separate SPARC T7-1 servers, each using SR-IOV, were significantly less than when using paravirtual I/O.

Terminology notes:
VM – virtual machine
guest – encapsulated operating system instance, typically running in a VM
assigned I/O – network hardware driven directly and exclusively by guests
paravirtual I/O – network hardware driven by hosts, and indirectly by guests via paravirtualized drivers
SR-IOV – single root I/O virtualization; virtualized network interfaces provided by the network hardware, driven directly by guests
LDom – logical domain (previous name for Oracle VM Server for SPARC)

Performance Landscape

The following tables show the results for TCP and UDP Netperf latency and bandwidth tests (single stream). Netperf latency, often called the round-trip time, is measured in microseconds (usec); smaller is better.

TCP
Networking Method   Netperf Latency (usec)        Bandwidth (Mb/sec)
                    MTU=1500     MTU=9000         MTU=1500   MTU=9000
Native/Bare Metal   58           58               9100       9900
assigned I/O        51           51               9400       9900
SR-IOV              58           59               9400       9900
paravirtual I/O     91           91               4800       9800

UDP
Networking Method   Netperf Latency (usec)        Bandwidth (Mb/sec)
                    MTU=1500     MTU=9000         MTU=1500   MTU=9000
Native/Bare Metal   57           57               9100       9900
assigned I/O        51           51               9400       9900
SR-IOV              66           63               9400       9900
paravirtual I/O     98           97               4800       9800

Specifically, the Netperf benchmark latency:
is the average request/response time, computed as the inverse of the throughput reported by the program,
is measured within the program from 20 sample runs of 30 seconds each,
uses single-in-flight (i.e. non-burst) 1-byte messages,
is measured between separate servers connected by 10 GbE,
and, for each test, uses servers connected back-to-back (no network switch) and configured identically: native or guest VM.

Configuration Summary

System Under Test:
2 x SPARC T7-1 servers, each with
1 x SPARC M7 processor (4.13 GHz)
256 GB memory (16 x 16 GB)
2 x 600 GB 10K RPM SAS-2 HDD
10 GbE (on-board and PCIe network devices)
Oracle Solaris 11.3
Oracle VM Server for SPARC 3.2

Benchmark Description

The Netperf 2.6.0 benchmark was used to evaluate native and virtualized (LDoms) network performance. Netperf is a client/server benchmark measuring network performance, providing a number of independent tests, including the omni Request/Response (aka ping-pong) test with TCP or UDP protocols, used here to obtain the Netperf latency measurements, and TCP stream for bandwidth. Netperf was run between separate servers connected back-to-back (no network switch) by a 10 GbE network interconnection. To measure the cost of virtualization, for each test the servers were configured identically: native (without virtualization) or guest VM. When in a virtual environment, some representative methods were configured, in identical fashion on each server, to connect the environment to the network hardware (e.g.
assigned I/O, paravirtualization, SR-IOV). Key Points and Best Practices Oracle VM Server for SPARC requires explicit partitioning of guests into Logical Domains of bound CPUs and memory, typically chosen to be local, and does not provide dynamic load balancing between guests on a host. Oracle VM Server for SPARC guests (LDoms) were assigned 32 virtual CPUs (4 complete processor cores) and 64 GB of memory. The control domain served as the I/O domain (for paravirtualized I/O) and was assigned 4 cores and 64 GB of memory. Each latency average reported was computed from the inverse of the reported throughput (similar to the transaction rate) of a Netperf Request/Response test run using 20 samples (aka iterations) of 30 second measurements of non-concurrent 1 byte messages. To obtain a meaningful average latency from a Netperf Request/Response test, it is important that the transactions consist of single messages, which is Netperf's default. If, for instance, Netperf options for "burst" and "TCP_NODELAY" are turned on, multiple messages can overlap in the transactions and the reported transaction rate or throughput cannot be used to compute the latency. All results were obtained with interrupt coalescence (aka interrupt throttling, interrupt blanking) turned on in the physical NIC, and if applicable, for the attachment driver in the guest. Also, interrupt coalescence turned on is the default for all the platforms used here. All the results were obtained with large receive offload (LRO) turned off in the physical NIC, and, if applicable, for the attachment driver in the guest, in order to reduce the network latency between the two guests. The netperf bandwidth test used send and receive 1MB (1048576 Bytes) messages. The paravirtual variation of the measurements refers to the use of a paravirtualized network driver in the guest instance. IP traffic consequently is routed across the guest, the virtualization subsystem in the host, a virtual network switch or bridge (depending upon the platform), and the network interface card. The assigned I/O variation of the measurements refers to the use of the card's driver in the guest instance itself. This use is possible by exclusively assigning the device to the guest. Device assignment results in less (software) routing for IP traffic and consequently less overhead than using paravirtualized drivers, but virtualization still can impose significant overhead. Note also NICs used in this way cannot be shared amongst guests, and may obviate the use of certain other VM features like migration. The T7-1 system has four on-board 10 GbE devices, but all of them are connected to the same PCIe branch, making it impossible to configure them as assigned I/O devices. Using a PCIe 10 GbE NIC allows configuring it as an assigned I/O device. In the context of Oracle VM Server for SPARC and these tests, assigned I/O refers to PCI endpoint device assignment, while paravirtualized I/O refers to virtual I/O using a virtual network device (vnet) in the guest connected to a virtual switch (vsw) through the I/O domain to the physical network device (NIC). See Also Netperf official site SPARC T7-1 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle and Virtualization oracle.com    OTN Disclosure Statement Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 25 October 2015.
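Referring back to the latency methodology in the Key Points above, the following netperf invocations sketch how such request/response and bandwidth numbers are typically collected; the peer address is illustrative and the options shown are generic netperf options rather than the exact command lines used for this result.

# Round-trip latency proxy: TCP request/response with single-in-flight 1-byte messages,
# repeated 30-second measurements for confidence (peer address is illustrative).
netperf -H 192.168.10.2 -t TCP_RR -l 30 -i 20,20 -- -r 1,1
# Bandwidth: single TCP stream with 1 MB messages.
netperf -H 192.168.10.2 -t TCP_STREAM -l 30 -- -m 1048576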


Benchmark

PeopleSoft Enterprise Financials 9.2: SPARC T7-2 World Record

Oracle's SPARC T7-2 server achieved world record performance as the first to publish on Oracle's PeopleSoft Enterprise Financials 9.2 benchmark. This result was obtained using one Oracle VM Server for SPARC (LDom) virtualized system configured with a single SPARC M7 processor.

The single-processor LDom on the SPARC T7-2 server achieved world record performance, executing 200 million Journal Lines in 18.60 minutes.
The single-processor LDom on the SPARC T7-2 server was able to process General Ledger Journal Edit and Post batch jobs at 10,752,688 journal lines/min, which reflects a large customer environment that utilizes a back-end database of nearly 1.0 TB, performing highly competitive journal processing for Ledger.

Performance Landscape

Results are presented for the PeopleSoft Financials Benchmark 9.2. Results obtained with PeopleSoft Financials Benchmark 9.2 are not comparable to the previous version of the benchmark, PeopleSoft Financials Benchmark 9.1, due to a significant change in the data model; version 9.2 supports only batch.

PeopleSoft Financials Benchmark, Version 9.2
Solution Under Test                          Batch       Journal lines/min
SPARC T7-2 (using 1 x SPARC M7, 4.13 GHz)    18.60 min   10,752,688

Configuration Summary

System: SPARC T7-2 server with
2 x SPARC M7 processors
1 TB memory
4 x Oracle Flash Accelerator F160 PCIe Card (DB redo, DB undo and DB data)
4 x 600 GB internal disks
Oracle Solaris 11.3
Oracle Database 11g (11.2.0.4)
PeopleSoft Financials (9.20.348)
PeopleSoft PeopleTools (8.53.09)
Java HotSpot 64-Bit Server VM (build 1.7.0_45-b18)
Oracle Tuxedo, Version 11.1.1.3.0, 64-bit
Oracle WebLogic Server 11g (10.3.6)

LDom Under Test: Oracle VM Server for SPARC (LDom) virtualized server (App & DB tier)
1 x SPARC M7 processor
512 GB memory

Benchmark Description

The PeopleSoft Enterprise Financials 9.2 benchmark emulates a large enterprise that processes and validates a large number of financial journal transactions before posting the journal entry to the ledger. The validation process certifies that the journal entries are accurate, ensuring that ChartFields values are valid, debits and credits equal out, and inter/intra-units are balanced. Once validated, the entries are processed, ensuring that each journal line posts to the correct target ledger, and then the journal status is changed to posted. In this benchmark, Journal Edit & Post is set up to edit and post both Inter-Unit and Regular multi-currency journals. The benchmark processes 200 million journal lines using AppEngine for edit and Cobol for post processes.

Key Points and Best Practices

The PeopleSoft Enterprise Financials 9.2 batch benchmark ran on a one-chip LDom consisting of 32 cores, each core having 8 threads, for a total of 256 virtual processors. The LDom contained two Oracle Solaris Zones: a database tier zone and an application tier zone. The application tier zone consisted of 1 core with 8 virtual processors. The database tier zone consisted of 244 virtual processors from 31 cores. The remaining four virtual processors were dedicated to network and disk interrupt handling. Inside the database tier zone, the database log writer ran under 4 virtual processors and eight virtual processors were dedicated to four database writers. There were 160 PeopleSoft Application instance processes running 320 streams of the PeopleSoft Financials workload in the Oracle Solaris Fixed Priority (FX) scheduler class.
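As a rough sketch of the scheduler-class note in the Key Points above, processes can be placed in the Solaris Fixed Priority (FX) class with the priocntl command; the script name and PID below are illustrative only.

# Start an application server process directly in the FX scheduling class (command is illustrative).
priocntl -e -c FX ./psappsrv_start.sh
# Or move an already-running process (PID illustrative) into the FX class.
priocntl -s -c FX -i pid 12345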
See Also Oracle's PeopleSoft General Ledger 9.2 (with Combo Editing) using Oracle Database 11g for Oracle Solaris (Unicode) on Oracle's SPARC T7-2 Server oracle.com PeopleSoft Benchmark White Papers oracle.com   SPARC T7-2 Server oracle.com    OTN Oracle Solaris oracle.com    OTN PeopleSoft Financial Management oracle.com    OTN Oracle Database oracle.com    OTN
Disclosure Statement
Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 25 October 2015.


Benchmark

SAP Two-Tier Standard Sales and Distribution SD Benchmark: SPARC T7-2 World Record 2 Processors

Oracle's SPARC T7-2 server produced a world record result for two-processor systems on the SAP two-tier Sales and Distribution (SD) Standard Application Benchmark using SAP Enhancement Package 5 for SAP ERP 6.0 (2 chips / 64 cores / 512 threads). The SPARC T7-2 server achieved 30,800 SAP SD benchmark users running the two-tier SAP Sales and Distribution (SD) Standard Application Benchmark using SAP Enhancement Package 5 for SAP ERP 6.0. The SPARC T7-2 server achieved 1.9 times more users than the Dell PowerEdge R730 server result. The SPARC T7-2 server achieved 1.5 times more users than the IBM Power System S824 server result. The SPARC T7-2 server achieved 1.9 times more users than the HP ProLiant DL380 Gen9 server result. The SPARC T7-2 server result was run with Oracle Solaris 11 and used Oracle Database 12c.
Performance Landscape
SAP-SD two-tier performance table, in decreasing performance order, for leading two-processor systems and the four-processor IBM Power System S824 server, with SAP Enhancement Package 5 for SAP ERP 6.0 results (current version of the benchmark as of May, 2012).
SAP SD Two-Tier Benchmark
System Processor OS Database Users Resp Time (sec) Version Cert#
SPARC T7-2 2 x SPARC M7 (2x 32core) Oracle Solaris 11 Oracle Database 12c 30,800 0.96 EHP5 2015050
IBM Power S824 4 x POWER8 (4x 6core) AIX 7 DB2 10.5 21,212 0.98 EHP5 2014016
Dell PowerEdge R730 2 x Intel E5-2699 v3 (2x 18core) Red Hat Enterprise Linux 7 SAP ASE 16 16,500 0.99 EHP5 2014033
HP ProLiant DL380 Gen9 2 x Intel E5-2699 v3 (2x 18core) Red Hat Enterprise Linux 6.5 SAP ASE 16 16,101 0.99 EHP5 2014032
Version – Version of SAP; EHP5 refers to SAP Enhancement Package 5 for SAP ERP 6.0. The number of cores shown is per chip; to get system totals, multiply by the number of chips. Complete benchmark results may be found at the SAP benchmark website http://www.sap.com/benchmark.
Configuration Summary and Results
Database/Application Server: 1 x SPARC T7-2 server with 2 x SPARC M7 processors (4.13 GHz, total of 2 processors / 64 cores / 512 threads), 1 TB memory, Oracle Solaris 11.3, Oracle Database 12c
Database Storage: 3 x Sun Server X3-2L, each with 2 x Intel Xeon Processors E5-2609 (2.4 GHz), 16 GB memory, 4 x Sun Flash Accelerator F40 PCIe Card, 12 x 3 TB SAS disks, Oracle Solaris 11
REDO Log Storage: 1 x Pillar FS-1 Flash Storage System, with 2 x FS1-2 Controller (Netra X3-2), 2 x FS1-2 Pilot (X4-2), 4 x DE2-24P Disk enclosure, 96 x 300 GB 10000 RPM SAS Disk Drive Assembly
Certified Results (published by SAP): Number of SAP SD benchmark users: 30,800; Average dialog response time: 0.96 seconds; Throughput: Fully processed order line items per hour: 3,372,000; Dialog steps per hour: 10,116,000; SAPS: 168,600; Average database request time (dialog/update): 0.022 sec / 0.047 sec; SAP Certification: 2015050
Benchmark Description
The SAP Standard Application SD (Sales and Distribution) Benchmark is an ERP business test that is indicative of full business workloads of complete order processing and invoice processing, and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments. SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products.
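The certified throughput figures are related by SAP's definition of the SAPS unit (100 SAPS = 2,000 fully processed order line items per hour = 6,000 dialog steps per hour). A small Python cross-check against the certified numbers above:

    # Cross-check the certified SPARC T7-2 throughput figures using the SAPS
    # definition: 100 SAPS = 2,000 fully processed order line items/hour
    #                      = 6,000 dialog steps/hour.
    order_line_items_per_hour = 3_372_000
    dialog_steps_per_hour = 10_116_000

    saps_from_items = order_line_items_per_hour / 20
    saps_from_steps = dialog_steps_per_hour / 60
    print(saps_from_items, saps_from_steps)   # both 168,600, matching the certified SAPS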
See Also SPARC T7-2 Benchmark Details Certification Form SAP Benchmark Website SPARC T7-2 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Disclosure Statement Two-tier SAP Sales and Distribution (SD) standard application benchmarks, SAP Enhancement Package 5 for SAP ERP 6.0 as of 10/23/15: SPARC T7-2 (2 processors, 64 cores, 512 threads) 30,800 SAP SD users, 2 x 4.13 GHz SPARC M7, 1 TB memory, Oracle Database 12c, Oracle Solaris 11, Cert# 2015050. IBM Power System S824 (4 processors, 24 cores, 192 threads) 21,212 SAP SD users, 4 x 3.52 GHz POWER8, 512 GB memory, DB2 10.5, AIX 7, Cert#2014016. Dell PowerEdge R730 (2 processors, 36 cores, 72 threads) 16,500 SAP SD users, 2 x 2.3 GHz Intel Xeon Processor E5-2699 v3 256 GB memory, SAP ASE 16, RHEL 7, Cert#2014033. HP ProLiant DL380 Gen9 (2 processors, 36 cores, 72 threads) 16,101 SAP SD users, 2 x 2.3 GHz Intel Xeon Processor E5-2699  v3 256 GB memory, SAP ASE 16, RHEL 6.5, Cert#2014032. SAP, R/3, reg TM of SAP AG in Germany and other countries.  More info www.sap.com/benchmark


Benchmark

Oracle E-Business Order-To-Cash Batch Large: SPARC T7-1 World Record

Oracle's SPARC T7-1 server set a world record running the Oracle E-Business Suite 12.1.3 Standard Large (100,000 Order/Inventory Lines) Order-To-Cash (Batch) workload. The SPARC T7-1 server produced a world record hourly order line throughput of 273,973 (21.90 min elapsed time) on the Oracle E-Business Suite R12 (12.1.3) Large Order-To-Cash (Batch) benchmark using a SPARC T7-1 server for the database and application tiers running Oracle Database 11g on Oracle Solaris 11. The SPARC T7-1 server demonstrated 12% better hourly order line throughput compared to a two-chip Cisco UCS B200 M4 (Intel Xeon Processor E5-2697 v3).
Performance Landscape
Results for the Oracle E-Business 12.1.3 Order-To-Cash Batch Large model workload.
Batch Workload: Order-To-Cash Large Model
System CPU Order Lines/Hr Elapsed Time (min)
SPARC T7-1 1 x SPARC M7 processor 273,973 21.90
Cisco UCS B200 M4 2 x Intel Xeon Processor E5-2697 v3 243,803 24.61
Cisco UCS B200 M3 2 x Intel Xeon Processor E5-2690 232,739 25.78
Configuration Summary
Hardware Configuration: SPARC T7-1 server with 1 x SPARC M7 processor (4.13 GHz), 256 GB memory (16 x 16 GB), Oracle ZFS Storage ZS3-2 appliance (DB Data storage) with 40 x 900 GB 10K RPM SAS-2 HDD, 8 x Write Flash Accelerator SSD and 2 x Read Flash Accelerator SSD, 1.6TB SAS Oracle Flash Accelerator F160 PCIe Card (1.6 TB NVMe for DB Log storage)
Software Configuration: Oracle Solaris 11.3, Oracle E-Business Suite R12 (12.1.3), Oracle Database 11g (11.2.0.3.0)
Benchmark Description
The Oracle E-Business Suite Standard R12 Benchmark combines online transaction execution by simulated users with concurrent batch processing to model a typical scenario for a global enterprise. This benchmark ran one batch component, Order-To-Cash, in the Large size. Results can be published in four sizes and use one or more online/batch modules:
X-large: Maximum online users running all business flows between 10,000 and 20,000; 750,000 order to cash lines per hour and 250,000 payroll checks per hour. Order to Cash Online — 2400 users; the percentages across the 5 transactions in the Order Management module are: Insert Manual Invoice — 16.66%, Insert Order — 32.33%, Order Pick Release — 16.66%, Ship Confirm — 16.66%, Order Summary Report — 16.66%. HR Self-Service — 4000 users, Customer Support Flow — 8000 users, Procure to Pay — 2000 users.
Large: 10,000 online users; 100,000 order to cash lines per hour and 100,000 payroll checks per hour.
Medium: up to 3000 online users; 50,000 order to cash lines per hour and 10,000 payroll checks per hour.
Small: up to 1000 online users; 10,000 order to cash lines per hour and 5,000 payroll checks per hour.
Key Points and Best Practices
All system optimizations are in the published report; see the link in the See Also section below.
See Also
E-Business Suite Applications R2 (R12.1.3) Order-To-Cash (Batch) Benchmark - Using Oracle Database 11g on Oracle's SPARC T7-1 oracle.com Oracle E-Business Suite Standard R12 Benchmark Results Oracle E-Business Suite Standard R12 Benchmark Overview Oracle E-Business R12 Benchmark Description SPARC T7-1 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle E-Business Suite oracle.com    OTN
Disclosure Statement
Oracle E-Business Large Order-To-Cash Batch workload, SPARC T7-1, 4.13 GHz, 1 chip, 32 cores, 256 threads, 256 GB memory, elapsed time 21.90 minutes, 273,973 hourly order line throughput, Oracle Solaris 11.3, Oracle E-Business Suite 12.1.3, Oracle Database 11g Release 2, Results as of 10/25/2015.
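The hourly throughput figure is simply the Large-model batch volume scaled by the elapsed time; a minimal Python check using the published numbers:

    # Convert the Large-model batch volume and elapsed time into the
    # hourly order line throughput quoted above (illustrative check only).
    order_lines = 100_000        # order/inventory lines in the Large model
    elapsed_minutes = 21.90      # SPARC T7-1 elapsed time

    lines_per_hour = order_lines * 60 / elapsed_minutes
    print(f"{lines_per_hour:,.0f} order lines/hour")   # ~273,973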


Benchmark

SPARC T7-4 Delivers 4-Chip World Record for SPEC OMP2012

Oracle's SPARC T7-4 server delivered world record performance on the SPEC OMP2012 benchmark for systems with four chips. This was accomplished with Oracle Solaris 11.3 and Oracle Solaris Studio 12.4 software. The SPARC T7-4 server delivered world record results for systems with four chips: 27.9 SPECompG_peak2012 and 26.4 SPECompG_base2012. The SPARC T7-4 server beat the four-chip HP ProLiant DL580 Gen9 with Intel Xeon Processor E7-8890 v3 by 29% on the SPECompG_peak2012 metric. This SPEC OMP2012 benchmark result demonstrates that the SPARC M7 processor performs well on floating-point intensive technical computing and modeling workloads.
Performance Landscape
Complete benchmark results are at the SPEC website, SPEC OMP2012 Results. The table below provides the new Oracle result as well as the previous best four-chip results.
SPEC OMP2012 Results, Four Chip Results
System Processor Peak Base
SPARC T7-4 SPARC M7, 4.13 GHz 27.9 26.4
HP ProLiant DL580 Gen9 Intel Xeon E7-8890 v3, 2.5 GHz 21.5 20.4
Cisco UCS C460 M4 Intel Xeon E7-8890 v3, 2.5 GHz -- 20.8
Configuration Summary
System Under Test: SPARC T7-4 with 4 x 4.13 GHz SPARC M7 processors, 2 TB memory (64 x 32 GB dimms), 4 x 600 GB SAS 10,000 RPM HDD (mirrored), Oracle Solaris 11.3 (11.3.0.30.0), Oracle Solaris Studio 12.4 with 4/15 Patch Set
Benchmark Description
The following was taken from the SPEC website. SPEC OMP2012 focuses on compute intensive performance, which means these benchmarks emphasize the performance of: the computer processor (CPU), the memory architecture, the parallel support libraries, and the compilers. It is important to remember the contribution of the latter three components. SPEC OMP performance intentionally depends on more than just the processor. SPEC OMP2012 contains a suite that focuses on parallel computing performance using the OpenMP parallelism standard. The suite can be used to measure along the following vector: Compilation method: Consistent compiler options across all programs of a given language (the base metrics) and, optionally, compiler options tuned to each program (the peak metrics). SPEC OMP2012 is not intended to stress other computer components such as networking, the operating system, graphics, or the I/O system. Note that there are many other SPEC benchmarks, including benchmarks that specifically focus on graphics, distributed Java computing, webservers, and network file systems.
Key Points and Best Practices
Jobs were bound using the OpenMP environment variable OMP_PLACES.
See Also
SPEC OMP2012 website SPEC website SPARC T7-4 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Solaris Studio oracle.com    OTN
Disclosure Statement
SPEC and the benchmark name SPEC OMP are registered trademarks of the Standard Performance Evaluation Corporation. Results as of November 11, 2015 from www.spec.org. SPARC T7-4 (4 chips, 128 cores, 1024 threads): 27.9 SPECompG_peak2012, 26.4 SPECompG_base2012; HP ProLiant DL580 Gen9 (4 chips, 72 cores, 144 threads): 21.5 SPECompG_peak2012, 20.4 SPECompG_base2012; Cisco UCS C460 M4 (4 chips, 72 cores, 144 threads): 20.8 SPECompG_base2012.
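The quoted peak advantage can be reproduced from the published scores; a minimal Python check (the rounded scores shown in the table give roughly 30%, close to the 29% quoted above):

    # Back out the SPARC T7-4 peak advantage over the HP ProLiant DL580 Gen9
    # from the published SPECompG_peak2012 scores (illustrative check only).
    sparc_t7_4_peak = 27.9
    hp_dl580_peak = 21.5

    advantage = sparc_t7_4_peak / hp_dl580_peak - 1
    print(f"{advantage:.1%}")   # ~29.8% with the rounded scores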


Benchmark

Oracle Server X5-2 Produces World Record 2-Chip Single Application Server SPECjEnterprise2010 Result

Two Oracle Server X5-2 systems, using the Intel Xeon E5-2699 v3 processor, produced a World Record x86 two-chip single application server SPECjEnterprise2010 benchmark result of 21,504.30 SPECjEnterprise2010 EjOPS.  One Oracle Server X5-2 ran the application tier and the second Oracle Server X5-2 was used for the database tier. The Oracle Server X5-2 system demonstrated 11% better performance when compared to the IBM X3650 M5 server result of 19,282.14 SPECjEnterprise2010 EjOPS. The Oracle Server X5-2 system demonstrated 1.9x better performance when compared to the previous generation Sun Server X4-2 server result of 11,259.88 SPECjEnterprise2010 EjOPS. This result used Oracle WebLogic Server 12c, Java HotSpot(TM) 64-Bit Server 1.8.0_40 Oracle Database 12c, and Oracle Linux. Performance Landscape Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results.  The table below shows the top single application server, two-chip x86 results. SPECjEnterprise2010 Performance Chart as of 4/1/2015 Submitter EjOPS* Application Server Database Server Oracle 21,504.30 1x Oracle Server X5-2 2x 2.3 GHz Intel Xeon E5-2699 v3 Oracle WebLogic 12c (12.1.3) 1x Oracle Server X5-2 2x 2.3 GHz Intel Xeon E5-2699 v3 Oracle Database 12c (12.1.0.2) IBM 19,282.14 1x IBM X3650 M5 2x 2.6 GHz Intel Xeon E5-2697 v3 WebSphere Application Server V8.5 1x IBM X3850 X6 4x 2.8 GHz Intel Xeon E7-4890 v2 IBM DB2 10.5 Oracle 11,259.88 1x Sun Server X4-2 2x 2.7 GHz Intel Xeon E5-2697 v2 Oracle WebLogic 12c (12.1.2) 1x Sun Server X4-2L 2x 2.7 GHz Intel Xeon E5-2697 v2 Oracle Database 12c (12.1.0.1) * SPECjEnterprise2010 EjOPS, bigger is better. Configuration Summary Application Server: 1 x Oracle Server X5-2 2 x 2.3 GHz Intel Xeon E5-2699 v3 processors 256 GB memory 3 x 10 GbE NIC Oracle Linux 6 Update 5 (kernel-2.6.39-400.243.1.el6uek.x86_64) Oracle WebLogic Server 12c (12.1.3) Java HotSpot(TM) 64-Bit Server VM on Linux, version 1.8.0_40 (Java SE 8 Update 40) BIOS SW 1.2 Database Server: 1 x Oracle Server X5-2 2 x 2.3 GHz Intel Xeon E5-2699 v3 processors 512 GB memory 2 x 10 GbE NIC 1 x 16 Gb FC HBA 2 x Oracle Server X5-2L Storage Oracle Linux 6 Update 5 (kernel-3.8.13-16.2.1.el6uek.x86_64) Oracle Database 12c Enterprise Edition Release 12.1.0.2 Benchmark Description SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The SPECjEnterprise2010 benchmark has been designed and developed to cover the Java EE 5 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems. The workload consists of an end to end web based order processing domain, an RMI and Web Services driven manufacturing domain and a supply chain model utilizing document based Web Services. The application is a collection of Java classes, Java Servlets, Java Server Pages, Enterprise Java Beans, Java Persistence Entities (pojo's) and Message Driven Beans. The SPECjEnterprise2010 benchmark heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network. 
The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second ("SPECjEnterprise2010 EjOPS"). This metric is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is no price/performance metric in this benchmark. Key Points and Best Practices Four Oracle WebLogic server instances were started using "numactl" binding 2 instances per chip. Four Oracle database listener processes were started, 2 processes bound per processor. Additional tuning information is in the report at http://spec.org. COD (Cluster on Die) is enabled in the BIOS on the application server. See Also SPECjEnterprise2010 Results Page Oracle Server X5-2 oracle.com    OTN Oracle Linux oracle.com    OTN Oracle Database oracle.com    OTN Oracle WebLogic Suite oracle.com    OTN Disclosure Statement SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance Evaluation Corporation. Oracle Server X5-2, 21,504.30 SPECjEnterprise2010 EjOPS; IBM System X3650 M5, 19,282.14 SPECjEnterprise2010 EjOPS; Sun Server X4-2, 11,259.88 SPECjEnterprise2010 EjOPS. Results from www.spec.org as of 4/1/2015.
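The comparisons quoted at the top of this result follow directly from the published EjOPS scores; a minimal Python check:

    # Reproduce the comparison claims from the published SPECjEnterprise2010 EjOPS scores.
    oracle_x5_2 = 21504.30
    ibm_x3650_m5 = 19282.14
    sun_x4_2 = 11259.88

    print(f"vs IBM x3650 M5: {oracle_x5_2 / ibm_x3650_m5 - 1:.1%}")   # ~11.5%, quoted as 11%
    print(f"vs Sun Server X4-2: {oracle_x5_2 / sun_x4_2:.1f}x")       # ~1.9x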


Benchmark

Oracle ZFS Storage ZS4-4 Shows 1.8x Generational Performance Improvement on SPC-2 Benchmark

The Oracle ZFS Storage ZS4-4 appliance delivered 1.8x improved performance and 1.3x improved price-performance over the previous generation Oracle ZFS Storage ZS3-4 appliance as shown by the SPC-2 benchmark. Running the SPC-2 benchmark, the Oracle ZFS Storage ZS4-4 appliance delivered SPC-2 Price-Performance of $17.09 and an overall score of 31,486.23 SPC-2 MBPS. The Oracle ZFS Storage line continues its strong price-performance by occupying three of the top five SPC-2 price-performance results. Oracle holds three of the top four performance results on the SPC-2 benchmark for HDD based systems. The Oracle ZFS Storage ZS4-4 appliance has a 7.6x price-performance advantage over the IBM DS8870 and a 2x performance advantage as measured by the SPC-2 benchmark. The Oracle ZFS Storage ZS4-4 appliance has a 5.0x performance advantage over the new Fujitsu DX200 S3 as measured by the SPC-2 benchmark. The Oracle ZFS Storage ZS4-4 appliance has a 4.6x price-performance advantage over the Fujitsu ET8700 S2 and a 1.9x performance advantage as shown by the SPC-2 benchmark. The Oracle ZFS Storage ZS4-4 appliance has a 4.6x price-performance advantage over the Hitachi Virtual Storage Platform (VSP) and a 1.96x performance advantage as measured by the SPC-2 benchmark. The Oracle ZFS Storage ZS4-4 appliance has a 1.6x price-performance advantage over the HP XP7 disk array as shown by the SPC-2 benchmark (HP even discounted their hardware 63%).
Performance Landscape
SPC-2 Price-Performance
Below is a table of the top SPC-2 Price-Performance results for HDD storage based systems, presented in increasing price-performance order (as of 03/17/2015). The complete set of results may be found at the SPC2 top 10 Price-Performance list.
System SPC-2 MBPS $/SPC-2 MBPS Results Identifier
Oracle ZFS Storage ZS3-2 16,212.66 $12.08 BE00002
Fujitsu Eternus DX200 S3 6,266.50 $15.42 B00071
SGI InfiniteStorage 5600 8,855.70 $15.97 B00065
Oracle ZFS Storage ZS4-4 31,486.23 $17.09 B00072
Oracle ZFS Storage ZS3-4 17,244.22 $22.53 B00067
NEC Storage M700 14,408.89 $25.10 B00066
Sun StorageTek 2530 663.51 $26.48 B00026
HP XP7 storage 43,012.53 $28.30 B00070
Fujitsu ETERNUS DX80 S2 2,685.50 $28.48 B00055
SGI InfiniteStorage 5500-SP 4,064.49 $28.57 B00059
Hitachi Unified Storage VM 11,274.83 $32.64 B00069
SPC-2 MBPS = the Performance Metric; $/SPC-2 MBPS = the Price-Performance Metric; Results Identifier = A unique identification of the result
SPC-2 Performance
The following table lists the top SPC-2 Performance results for HDD storage based systems, presented in decreasing performance order (as of 03/17/2015). The complete set of results may be found at the SPC2 top 10 Performance list.
HDD Based Systems SPC-2 MBPS $/SPC-2 MBPS TSC Price Results Identifier HP XP7 storage 43,012.52 $28.30 $1,217,462 B00070 Oracle ZFS Storage ZS4-4 31,486.23 $17.09 $538,050 B00072 Oracle ZFS Storage ZS3-4 17,244.22 $22.53 $388,472 B00067 Oracle ZFS Storage ZS3-2 16,212.66 $12.08 $195,915 BE00002 Fujitsu ETERNUS DX8870 S2 16,038.74 $79.51 $1,275,163 B00063 IBM System Storage DS8870 15,423.66 $131.21 $2,023,742 B00062 IBM SAN VC v6.4 14,581.03 $129.14 $1,883,037 B00061 Hitachi Virtual Storage Platform (VSP) 13,147.87 $95.38 $1,254,093 B00060 HP StorageWorks P9500 XP Storage Array 13,147.87 $88.34 $1,161,504 B00056 SPC-2 MBPS = the Performance Metric $/SPC-2 MBPS = the Price-Performance Metric TSC Price = Total Cost of Ownership Metric Results Identifier = A unique identification of the result Metric Complete SPC-2 benchmark results may be found at http://www.storageperformance.org/results/benchmark_results_spc2. Configuration Summary Storage Configuration: Oracle ZFS Storage ZS4-4 storage system in clustered configuration 2 x Oracle ZFS Storage ZS4-4 controllers with 8 x Intel Xeon processors 3 TB memory 24 x Oracle Storage Drive Enclosure DE2-24P, each with 24 x 300 GB 10K RPM SAS-2 drives Benchmark Description SPC Benchmark 2 (SPC-2):  Consists of three distinct workloads designed to demonstrate the performance of a storage subsystem during the execution of business critical applications that require the large-scale, sequential movement of data. Those applications are characterized predominately by large I/Os organized into one or more concurrent sequential patterns. A description of each of the three SPC-2 workloads is listed below as well as examples of applications characterized by each workload. Large File Processing: Applications in a wide range of fields, which require simple sequential process of one or more large files such as scientific computing and large-scale financial processing. Large Database Queries: Applications that involve scans or joins of large relational tables, such as those performed for data mining or business intelligence. Video on Demand: Applications that provide individualized video entertainment to a community of subscribers by drawing from a digital film library. SPC-2 is built to: Provide a level playing field for test sponsors. Produce results that are powerful and yet simple to use. Provide value for engineers as well as IT consumers and solution integrators. Is easy to run, easy to audit/verify, and easy to use to report official results. See Also Oracle ZFS Storage ZS4-4 SPC-2 Executive Summary storageperformance.org Complete Oracle ZFS Storage ZS4-4 SPC-2 Full Disclosure Report storageperformance.org Storage Performance Council (SPC) Home Page Oracle ZFS Storage ZS4-4 oracle.com    OTN Disclosure Statement SPC-2 and SPC-2 MBPS are registered trademarks of Storage Performance Council (SPC).  Results as of March 17, 2015, for more information see www.storageperformance.org. 
Oracle ZFS Storage ZS4-4 - B00072, Oracle ZFS Storage ZS3-2 - BE00002, Oracle ZFS Storage ZS3-4 - B00067, Fujitsu ETERNUS DX80 S2, B00055, Fujitsu ETERNUS DX8870 S2 - B00063, Fujitsu ETERNUS DX200 S3 - B00071, HP StorageWorks P9500 XP Storage Array - B00056, HP XP7 Storage Array - B00070, Hitachi Unified Storage VM - B00069, Hitachi Virtual Storage Platform (VSP) - B00060, IBM SAN VC v6.4 - B00061, IBM System Storage DS8870 - B00062, IBM XIV Storage System Gen3 - BE00001, NEC Storage M700 - B00066, SGI InfiniteStorage 5500-SP - B00059, SGI InfiniteStorage 5600 - B00065, Sun StorageTek 2530 - B00026.
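The price-performance column in the tables above is the total system price divided by the SPC-2 MBPS score; a minimal Python check against the Oracle ZFS Storage ZS4-4 row:

    # SPC-2 price-performance = total system price / SPC-2 MBPS.
    # Values taken from the Oracle ZFS Storage ZS4-4 row above.
    tsc_price = 538_050        # total system price in USD
    spc2_mbps = 31_486.23      # SPC-2 MBPS score

    price_performance = tsc_price / spc2_mbps
    print(f"${price_performance:.2f} per SPC-2 MBPS")   # ~$17.09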


Benchmark

Oracle ZFS Storage ZS3-2 Delivers World Record Price-Performance on SPC-2/E

The Oracle ZFS Storage ZS3-2 appliance delivered a world record Price-Performance result, a world record energy result and excellent overall performance on the SPC-2/E benchmark. The Oracle ZFS Storage ZS3-2 appliance delivered the top SPC-2 Price-Performance of $12.08 and an overall score of 16,212.66 SPC-2 MBPS for the SPC-2/E benchmark. The Oracle ZFS Storage ZS3-2 appliance produced the top Performance-Energy SPC-2/E benchmark result of 3.67 SPC2 MBPS / watt. Oracle holds the top two performance results on the SPC-2 benchmark for HDD based systems. The Oracle ZFS Storage ZS3-2 appliance has an 11x price-performance advantage over the IBM DS8870. The Oracle ZFS Storage ZS3-2 appliance has an 8x price-performance advantage over the Hitachi Virtual Storage Platform (VSP). The Oracle ZFS Storage ZS3-2 appliance has a 7.3x price-performance advantage over the HP P9500 XP disk array.
Performance Landscape
SPC-2 Price-Performance
Below is a table of the top SPC-2 Price-Performance results for HDD storage based systems, presented in increasing price-performance order (as of 06/25/2014). The complete set of results may be found at the SPC2 top 10 Price-Performance list.
System SPC-2 MBPS $/SPC-2 MBPS Results Identifier
Oracle ZFS Storage ZS3-2 16,212.66 $12.08 BE00002
SGI InfiniteStorage 5600 8,855.70 $15.97 B00065
Oracle ZFS Storage ZS3-4 17,244.22 $22.53 B00067
NEC Storage M700 14,408.89 $25.10 B00066
Sun StorageTek 2530 663.51 $26.48 B00026
Fujitsu ETERNUS DX80 S2 2,685.50 $28.48 B00055
SGI InfiniteStorage 5500-SP 4,064.49 $28.57 B00059
Hitachi Unified Storage VM 11,274.83 $32.64 B00069
SPC-2 MBPS = the Performance Metric; $/SPC-2 MBPS = the Price-Performance Metric; Results Identifier = A unique identification of the result
SPC-2/E Results
The table below lists all SPC-2/E results. The SPC-2/E benchmark extends the SPC-2 benchmark by additionally measuring power consumption during the SPC-2 benchmark run.
System SPC-2 MBPS $/SPC-2 MBPS TSC Price SPC2 MBPS / watt Results Identifier
Oracle ZFS Storage ZS3-2 16,212.66 $12.08 $195,915 3.67 BE00002
IBM XIV Storage System Gen3 7,467.99 $152.34 $1,137,641 0.81 BE00001
SPC-2 MBPS = the Performance Metric; $/SPC-2 MBPS = the Price-Performance Metric; TSC Price = Total Cost of Ownership Metric; SPC2 MBPS / watt = Number of SPC2 MB/second produced per watt consumed, higher is better; Results Identifier = A unique identification of the result
SPC-2 Performance
The following table lists the top SPC-2 Performance results for HDD storage based systems, presented in decreasing performance order (as of 06/25/2014). The complete set of results may be found at the SPC2 top 10 Performance list.
System SPC-2 MBPS $/SPC-2 MBPS TSC Price Results Identifier
Oracle ZFS Storage ZS3-4 17,244.22 $22.53 $388,472 B00067
Oracle ZFS Storage ZS3-2 16,212.66 $12.08 $195,915 BE00002
Fujitsu ETERNUS DX8870 S2 16,038.74 $79.51 $1,275,163 B00063
IBM System Storage DS8870 15,423.66 $131.21 $2,023,742 B00062
IBM SAN VC v6.4 14,581.03 $129.14 $1,883,037 B00061
Hitachi Virtual Storage Platform (VSP) 13,147.87 $95.38 $1,254,093 B00060
HP StorageWorks P9500 XP Storage Array 13,147.87 $88.34 $1,161,504 B00056
SPC-2 MBPS = the Performance Metric; $/SPC-2 MBPS = the Price-Performance Metric; TSC Price = Total Cost of Ownership Metric; Results Identifier = A unique identification of the result
Complete SPC-2 benchmark results may be found at http://www.storageperformance.org/results/benchmark_results_spc2.
Configuration Summary Storage Configuration: Oracle ZFS Storage ZS3-2 storage system in clustered configuration 2 x Oracle ZFS Storage ZS3-2 controllers, each with 4 x 2.1 GHz 8-core Intel Xeon processors 512 GB memory 12 x Sun Disk shelves, each with 24 x 300 GB 10K RPM SAS-2 drives Benchmark Description SPC Benchmark 2 (SPC-2):  Consists of three distinct workloads designed to demonstrate the performance of a storage subsystem during the execution of business critical applications that require the large-scale, sequential movement of data. Those applications are characterized predominately by large I/Os organized into one or more concurrent sequential patterns. A description of each of the three SPC-2 workloads is listed below as well as examples of applications characterized by each workload. Large File Processing: Applications in a wide range of fields, which require simple sequential process of one or more large files such as scientific computing and large-scale financial processing. Large Database Queries: Applications that involve scans or joins of large relational tables, such as those performed for data mining or business intelligence. Video on Demand: Applications that provide individualized video entertainment to a community of subscribers by drawing from a digital film library. SPC-2 is built to: Provide a level playing field for test sponsors. Produce results that are powerful and yet simple to use. Provide value for engineers as well as IT consumers and solution integrators. Is easy to run, easy to audit/verify, and easy to use to report official results. SPC Benchmark 2/Energy (SPC-2/E): consists of the complete set of SPC-2 performance measurement and reporting plus the measurement and reporting of energy use. This benchmark extension provides measurement and reporting to complete storage configurations, complementing SPC-2C/E, which focuses on storage component configurations. See Also Oracle ZFS Storage ZS3-2 SPC-2/E Executive Summary storageperformance.org Complete Oracle ZFS Storage ZS3-2 SPC-2/E Full Disclosure Report storageperformance.org Storage Performance Council (SPC) Home Page Oracle ZFS Storage ZS3-2 oracle.com    OTN Disclosure Statement SPC-2 and SPC-2 MBPS are registered trademarks of Storage Performance Council (SPC).  Results as of June 25, 2014, for more information see www.storageperformance.org. Fujitsu ETERNUS DX80 S2, B00055, Fujitsu ETERNUS DX8870 S2 - B00063, HP StorageWorks P9500 XP Storage Array - B00056, Hitachi Unified Storage VM - B00069, Hitachi Virtual Storage Platform (VSP) - B00060, IBM SAN VC v6.4 - B00061, IBM System Storage DS8870 - B00062, IBM XIV Storage System Gen3 - BE00001, NEC Storage M700 - B00066, Oracle ZFS Storage ZS3-2 - BE00002, Oracle ZFS Storage ZS3-4 - B00067, SGI InfiniteStorage 5500-SP - B00059, SGI InfiniteStorage 5600 - B00065, Sun StorageTek 2530 - B00026.
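Since the SPC-2/E energy metric is throughput per watt, the two published results above can be compared directly; a small illustrative Python calculation:

    # Compare the two published SPC-2/E energy results (SPC-2 MBPS per watt).
    zs3_2_mbps_per_watt = 3.67
    ibm_xiv_mbps_per_watt = 0.81

    ratio = zs3_2_mbps_per_watt / ibm_xiv_mbps_per_watt
    print(f"{ratio:.1f}x more SPC-2 MBPS per watt")   # ~4.5x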


Benchmark

SPARC M6-32 Produces SAP SD Two-Tier Benchmark World Record for 32-Processor Systems

Oracle's SPARC M6-32 server produced a world record result for 32-processor systems on the SAP two-tier Sales and Distribution (SD) Standard Application Benchmark using SAP Enhancement Package 5 for SAP ERP 6.0 (32 chips / 384 cores / 3072 threads). The SPARC M6-32 server achieved 140,000 SAP SD benchmark users with a low average dialog response time of 0.58 seconds running the SAP two-tier Sales and Distribution (SD) Standard Application Benchmark using SAP Enhancement Package 5 for SAP ERP 6.0. The SPARC M6-32 delivered 2.5 times more users than the IBM Power 780 result using SAP Enhancement Package 5 for SAP ERP 6.0. The IBM result also had a 1.7 times worse average dialog response time compared to the SPARC M6-32 server result. The SPARC M6-32 delivered 3.0 times more users than the Fujitsu PRIMEQUEST 2800E (with Intel Xeon E7-8890 v2 processors) result. The Fujitsu result also had a 1.7 times worse average dialog response time compared to the SPARC M6-32 server result. The SPARC M6-32 server solution was run with Oracle Solaris 11 and used Oracle Database 11g.
Performance Landscape
SAP-SD two-tier performance table (in decreasing performance order), with SAP Enhancement Package 4 for SAP ERP 6.0 results (old version of the benchmark, obsolete at the end of April, 2012) and SAP Enhancement Package 5 for SAP ERP 6.0 results (current version of the benchmark as of May, 2012).
System Processor Ch / Co / Th — Memory OS Database Users RespTime (sec) Version Cert#
Fujitsu SPARC M10-4S SPARC64 X @3.0 GHz 40 / 640 / 1280 — 10 TB Solaris 11 Oracle 11g 153,000 0.87 EHP5 2013014
SPARC M6-32 Server SPARC M6 @3.6 GHz 32 / 384 / 3072 — 16 TB Solaris 11 Oracle 11g 140,000 0.58 EHP5 2014008
IBM Power 795 POWER7 @4 GHz 32 / 256 / 1024 — 4 TB AIX 7.1 DB2 9.7 126,063 0.98 EHP4 2010046
IBM Power 780 POWER7+ @3.72 GHz 12 / 96 / 384 — 1536 GB AIX 7.1 DB2 10 57,024 0.98 EHP5 2012033
Fujitsu PRIMEQUEST 2800E Intel E7-8890 v2 @2.8 GHz 8 / 120 / 240 — 1024 GB Windows Server 2012 SE SQL Server 2012 47,500 0.97 EHP5 2014003
IBM Power 760 POWER7+ @3.41 GHz 8 / 48 / 192 — 1024 GB AIX 7.1 DB2 10 25,488 0.99 EHP5 2013004
Version – Version of SAP; EHP5 refers to SAP Enhancement Package 5 for SAP ERP 6.0 and EHP4 refers to SAP Enhancement Package 4 for SAP ERP 6.0. Ch / Co / Th – Total chips, cores and threads. Complete benchmark results may be found at the SAP benchmark website http://www.sap.com/benchmark.
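The Ch / Co / Th column for the SPARC M6-32 row follows from the SPARC M6 chip layout (12 cores per chip, 8 hardware threads per core); a minimal Python check:

    # Check the chips / cores / threads figures for the SPARC M6-32 row
    # (SPARC M6: 12 cores per chip, 8 threads per core).
    chips = 32
    cores_per_chip = 12
    threads_per_core = 8

    cores = chips * cores_per_chip        # 384
    threads = cores * threads_per_core    # 3072
    print(f"{chips} / {cores} / {threads}")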
Configuration Summary and Results
Hardware Configuration: 1 x SPARC M6-32 server with 32 x 3.6 GHz SPARC M6 processors (total of 32 processors / 384 cores / 3072 threads), 16 TB memory; 6 x Sun Server X3-2L, each with 2 x Intel Xeon E5-2609 2.4 GHz processors, 16 GB memory, 4 x Flash Accelerator F40, 12 x 3 TB SAS disks; 2 x Sun Server X3-2L, each with 2 x Intel Xeon E5-2609 2.4 GHz processors, 16 GB memory, 1 x 8-Port 6Gbps SAS-2 RAID PCI Express HBA, 12 x 3 TB SAS disks
Software Configuration: Oracle Solaris 11, SAP Enhancement Package 5 for SAP ERP 6.0, Oracle Database 11g Release 2
Certified Results (published by SAP): Number of SAP SD benchmark users: 140,000; Average dialog response time: 0.58 seconds; Throughput: Fully processed order line items per hour: 15,878,670; Dialog steps per hour: 47,636,000; SAPS: 793,930; Average database request time (dialog/update): 0.020 sec / 0.041 sec; SAP Certification: 2014008
Benchmark Description
The SAP Standard Application SD (Sales and Distribution) Benchmark is an ERP business test that is indicative of full business workloads of complete order processing and invoice processing, and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments. SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products.
See Also
SAP Benchmark Website SPARC M6-32 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN
Disclosure Statement
Two-tier SAP Sales and Distribution (SD) standard application benchmarks, SAP Enhancement Package 5 for SAP ERP 6.0 as of 3/26/14: SPARC M6-32 (32 processors, 384 cores, 3072 threads) 140,000 SAP SD users, 32 x 3.6 GHz SPARC M6, 16 TB memory, Oracle Database 11g, Oracle Solaris 11, Cert# 2014008. Fujitsu SPARC M10-4S (40 processors, 640 cores, 1280 threads) 153,000 SAP SD users, 40 x 3.0 GHz SPARC64 X, 10 TB memory, Oracle Database 11g, Oracle Solaris 11, Cert# 2013014. IBM Power 780 (12 processors, 96 cores, 384 threads) 57,024 SAP SD users, 12 x 3.72 GHz IBM POWER7+, 1536 GB memory, DB2 10, AIX 7.1, Cert#2012033. Fujitsu PRIMEQUEST 2800E (8 processors, 120 cores, 240 threads) 47,500 SAP SD users, 8 x 2.8 GHz Intel Xeon Processor E7-8890 v2, 1024 GB memory, SQL Server 2012, Windows Server 2012 Standard Edition, Cert# 2014003. IBM Power 760 (8 processors, 48 cores, 192 threads) 25,488 SAP SD users, 8 x 3.41 GHz IBM POWER7+, 1024 GB memory, DB2 10, AIX 7.1, Cert#2013004. Two-tier SAP Sales and Distribution (SD) standard application benchmarks, SAP Enhancement Package 4 for SAP ERP 6.0 as of 3/26/14: IBM Power 795 (32 processors, 256 cores, 1024 threads) 126,063 SAP SD users, 32 x 4 GHz IBM POWER7, 4 TB memory, DB2 9.7, AIX 7.1, Cert#2010046. SAP, R/3, reg TM of SAP AG in Germany and other countries. More info www.sap.com/benchmark


Benchmark

SPARC T5-2 Delivers World Record 2-Socket SPECvirt_sc2010 Benchmark

Oracle's SPARC T5-2 server delivered a world record two-chip SPECvirt_sc2010 result of 4270 @ 264 VMs, establishing the performance superiority in virtualized environments of the SPARC T5 processors with Oracle Solaris 11, which includes Oracle VM Server for SPARC and Oracle Solaris Zones as standard virtualization products. The SPARC T5-2 server has 2.3x better performance than an HP BL620c G7 blade server (with two Westmere EX processors) which used VMware ESX 4.1 U1 virtualization software (the best SPECvirt_sc2010 result on two-chip servers using VMware software). The SPARC T5-2 server has 1.6x better performance than an IBM Flex System x240 server (with two Sandy Bridge processors) which used Kernel-based Virtual Machines (KVM). This is the first SPECvirt_sc2010 result using Oracle production level software: Oracle Solaris 11.1, Oracle WebLogic Server 10.3.6, Oracle Database 11g Enterprise Edition, Oracle iPlanet Web Server 7 and Oracle Java Development Kit 7 (JDK). The only exception was the Dovecot mail server.
Performance Landscape
Complete benchmark results are at the SPEC website, SPECvirt_sc2010 Results. The following table highlights the leading two-chip results for the benchmark, bigger is better.
SPECvirt_sc2010 Leading Two-Chip Results
System Processor Result @ VMs Virtualization Software
SPARC T5-2 2 x SPARC T5 3.6 GHz 4270 @ 264 Oracle VM Server for SPARC 3.0, Oracle Solaris Zones
IBM Flex System x240 2 x Intel E5-2690 2.9 GHz 2741 @ 168 Red Hat Enterprise Linux 6.4 KVM
HP ProLiant BL620c G7 2 x Intel E7-2870 2.4 GHz 1878 @ 120 VMware ESX 4.1 U1
Configuration Summary
System Under Test Highlights: 1 x SPARC T5-2 server, with 2 x 3.6 GHz SPARC T5 processors, 1 TB memory, Oracle Solaris 11.1, Oracle VM Server for SPARC 3.0, Oracle iPlanet Web Server 7.0.15, Oracle PHP 5.3.14, Dovecot 2.1.17, Oracle WebLogic Server 11g (10.3.6), Oracle Database 11g (11.2.0.3), Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.7.0_51
Benchmark Description
The SPECvirt_sc2010 benchmark is SPEC's first benchmark addressing performance of virtualized systems. It measures the end-to-end performance of all system components that make up a virtualized environment. The benchmark utilizes several previous SPEC benchmarks which represent tasks commonly run in virtualized environments. The workloads included are derived from SPECweb2005, SPECjAppServer2004 and SPECmail2008. Scaling of the benchmark is achieved by running additional sets of virtual machines until overall throughput reaches a peak. The benchmark includes quality of service criteria that must be met for a successful run.
Key Points and Best Practices
The SPARC T5-2 server running Oracle Solaris 11.1 utilizes the embedded virtualization products Oracle VM Server for SPARC and Oracle Solaris Zones, which provide a low overhead, flexible, scalable and manageable virtualization environment. In order to provide a high level of data integrity and availability, all the benchmark data sets are stored on mirrored (RAID1) storage.
See Also
SPECvirt_sc2010 Results Page SPARC T5-2 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle WebLogic Suite oracle.com    OTN Oracle Fusion Middleware oracle.com    OTN Java oracle.com    OTN
Disclosure Statement
SPEC and the benchmark name SPECvirt_sc are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 3/5/2014.
SPARC T5-2, SPECvirt_sc2010 4270 @ 264 VMs; IBM Flex System x240, SPECvirt_sc2010 2741 @ 168 VMs; HP Proliant BL620c G7, SPECvirt_sc2010 1878 @ 120 VMs.


Benchmark

SPARC T5-2 Produces SPECjbb2013-MultiJVM World Record for 2-Chip Systems

From www.spec.org Defects Identified in SPECjbb®2013 December 9, 2014 - SPEC has identified a defect in its SPECjbb®2013 benchmark suite. SPEC has suspended sales of the benchmark software and is no longer accepting new submissions of SPECjbb®2013 results for publication on SPEC's website. Current SPECjbb®2013 licensees will receive a free copy of the new version of the benchmark when it becomes available. SPEC is advising SPECjbb®2013 licensees and users of the SPECjbb®2013 metrics that the recently discovered defect impacts the comparability of results. This defect can significantly impact the amount of work done during the measurement period, resulting in an inflated SPECjbb®2013 metric. SPEC recommends that users not utilize these results for system comparisons without a full understanding of the impact of these defects on each benchmark result. Additional information is available here. The SPECjbb2013 benchmark shows modern Java application performance. Oracle's SPARC T5-2 set a two-chip world record, which is 1.8x faster than the best two-chip x86-based server. Using Oracle Solaris and Oracle Java, Oracle delivered this two-chip world record result on the MultiJVM SPECjbb2013 metric. The SPARC T5-2 server achieved 114,492 SPECjbb2013-MultiJVM max-jOPS and 43,963 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark. This result is a two-chip world record. The SPARC T5-2 server running SPECjbb2013 is 1.8x faster than the Cisco UCS C240 M3 server (2.7 GHz Intel Xeon E5-2697 v2) based on both the SPECjbb2013-MultiJVM max-jOPS and SPECjbb2013-MultiJVM critical-jOPS metrics. The SPARC T5-2 server running SPECjbb2013 is 2x faster than the HP ProLiant ML350p Gen8 server (2.7 GHz Intel Xeon E5-2697 v2) based on SPECjbb2013-MultiJVM max-jOPS and 1.3x faster based on SPECjbb2013-MultiJVM critical-jOPS. The new Oracle results were obtained using Oracle Solaris 11 along with Oracle Java SE 8 on the SPARC T5-2 server. The SPARC T5-2 server running SPECjbb2013 on a per chip basis is 1.3x faster than the NEC Express5800/A040b server (2.8 GHz Intel Xeon E7-4890 v2) based on both the SPECjbb2013-MultiJVM max-jOPS and SPECjbb2013-MultiJVM critical-jOPS metrics. There are no IBM POWER7 or POWER7+ based server results on the SPECjbb2013 benchmark. IBM has published IBM POWER7+ based servers on the SPECjbb2005 which was retired by SPEC in 2013. Performance Landscape Results of SPECjbb2013 from www.spec.org as of March 6, 2014. These are the leading 2-chip SPECjbb2013 MultiJVM results. SPECjbb2013 - 2-Chip MultiJVM Results System Processor SPECjbb2013-MultiJVM JDK max-jOPS critical-jOPS SPARC T5-2 2xSPARC T5, 3.6 GHz 114,492 43,963 Oracle Java SE 8 Cisco UCS C240 M3 2xIntel E5-2697 v2, 2.7 GHz 63,079 23,797 Oracle Java SE 7u45 HP ProLiant ML350p Gen8 2xIntel E5-2697 v2, 2.7 GHz 62,393 24,310 Oracle Java SE 7u45 IBM System x3650 M4 BD 2xIntel E5-2695 v2, 2.4 GHz 59,124 22,275 IBM SDK V7 SR6 (*) HP ProLiant ML350p Gen8 2xIntel E5-2697 v2, 2.7 GHz 57,594 32,103 Oracle Java SE 7u40 HP ProLiant BL460c Gen8 2xIntel E5-2697 v2, 2.7 GHz 56,367 30,078 Oracle Java SE 7u40 Sun Server X4-2, DDR3-1600 2xIntel E5-2697 v2, 2.7 GHz 52,664 20,553 Oracle Java SE 7u40 HP ProLiant DL360e Gen8 2xIntel E5-2470 v2, 2.4 GHz 48,772 17,915 Oracle Java SE 7u40 * IBM SDK V7 SR6 – IBM SDK, Java Technology Edition, Version 7, Service Refresh 6 The following table compares the SPARC T5 processor to the Intel E7 v2 processor. 
SPECjbb2013 - Results Using JDK 8 Per Chip Comparison System SPECjbb2013-MultiJVM SPECjbb2013-MultiJVM/Chip JDK max-jOPS critical-jOPS max-jOPS critical-jOPS SPARC T5-2 2xSPARC T5, 3.6 GHz 114,492 43,963 57,246 21,981 Oracle Java SE 8 NEC Express5800/A040b 4xIntel E7-4890 v2, 2.8 GHz 177,753 65,529 44,438 16,382 Oracle Java SE 8   SPARC per Chip Advantage 1.29x 1.34x   Configuration Summary System Under Test: SPARC T5-2 server 2 x SPARC T5, 3.60 GHz 512 GB memory (32 x 16 GB dimms) Oracle Solaris 11.1 Oracle Java SE 8 Benchmark Description The SPECjbb2013 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community. From SPEC's press release, "SPECjbb2013 replaces SPECjbb2005. The new benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is expected to be used widely by all those interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community." SPECjbb2013 features include: A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations. Both a pure throughput metric and a metric that measures critical throughput under service-level agreements (SLAs) specifying response times ranging from 10ms to 500ms. Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers. Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security. Support for virtualization and cloud environments. See Also SPEC website SPARC T5-2 result SPARC T5-2 Server oracle.com    OTN Sun Server X4-2 oracle.com   OTN Oracle Solaris oracle.com    OTN Java oracle.com    OTN Disclosure Statement SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of 3/6/2014, see http://www.spec.org for more information.  SPARC T5-2 114,492 SPECjbb2013-MultiJVM max-jOPS, 43,963 SPECjbb2013-MultiJVM critical-jOPS; NEC Express5800/A040b 177,753 SPECjbb2013-MultiJVM max-jOPS, 65,529 SPECjbb2013-MultiJVM critical-jOPS; Cisco UCS c240 M3 63,079 SPECjbb2013-MultiJVM max-jOPS, 23,797 SPECjbb2013-MultiJVM critical-jOPS; HP ProLiant ML350p Gen8 62,393 SPECjbb2013-MultiJVM max-jOPS, 24,310 SPECjbb2013-MultiJVM critical-jOPS; IBM System X3650 M4 BD 59,124 SPECjbb2013-MultiJVM max-jOPS, 22,275 SPECjbb2013-MultiJVM critical-jOPS; HP ProLiant ML350p Gen8 57,594 SPECjbb2013-MultiJVM max-jOPS, 32,103 SPECjbb2013-MultiJVM critical-jOPS; HP ProLiant BL460c Gen8 56,367 SPECjbb2013-MultiJVM max-jOPS, 30,078 SPECjbb2013-MultiJVM critical-jOPS; Sun Server X4-2 52,664 SPECjbb2013-MultiJVM max-jOPS, 20,553 SPECjbb2013-MultiJVM critical-jOPS; HP ProLiant DL360e Gen8 48,772 SPECjbb2013-MultiJVM max-jOPS, 17,915 SPECjbb2013-MultiJVM critical-jOPS.
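The per-chip columns in the JDK 8 comparison table above are simply the published scores divided by the chip count; a minimal Python check:

    # Reproduce the per-chip max-jOPS comparison from the table above.
    t5_2_max_jops, t5_2_chips = 114_492, 2
    nec_max_jops, nec_chips = 177_753, 4

    t5_per_chip = t5_2_max_jops / t5_2_chips      # 57,246
    nec_per_chip = nec_max_jops / nec_chips       # ~44,438
    print(f"{t5_per_chip / nec_per_chip:.2f}x per-chip advantage")   # ~1.29x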


Benchmark

SPARC M6-32 Delivers Oracle E-Business and PeopleSoft World Record Benchmarks, Linear Data Warehouse Scaling in a Virtualized Configuration

This result demonstrates how the combination of Oracle virtualization technologies for SPARC and Oracle's SPARC M6-32 server allows the deployment and concurrent high performance execution of multiple Oracle applications and databases sized for the Enterprise. In an 8-chip Dynamic Domain (also known as PDom), the SPARC M6-32 server set an Oracle E-Business Suite 12.1.3 X-Large world record with 14,660 online users running five simultaneous E-Business modules. In a second 8-chip Dynamic Domain, the SPARC M6-32 server set a PeopleSoft HCM 9.1 HR Self-Service online world record, supporting 35,000 users while simultaneously running a batch workload in 29.17 minutes. This was done with a database of 600,480 employees. Two other separate tests were run: one supporting 40,000 online users only, and another running a batch-only workload in 18.27 minutes. In a third Dynamic Domain with 16 chips on the SPARC M6-32 server, a data warehouse test was run that showed near-linear scaling.
On the SPARC M6-32 server, several critical application instances were virtualized: an Oracle E-Business application and database, an Oracle PeopleSoft application and database, and a Decision Support database instance using Oracle Database 12c. In this Enterprise Virtualization benchmark, a SPARC M6-32 server utilized all levels of Oracle Virtualization features available for SPARC servers. The 32-chip SPARC M6 based server was divided into three separate Dynamic Domains (also known as PDoms), available only on the SPARC Enterprise M-Series systems, which are completely electrically isolated and independent hardware partitions. Each PDom was subsequently split into multiple hypervisor-based Oracle VM for SPARC partitions (also known as LDoms), each one running its own Oracle Solaris kernel and managing its own CPUs and I/O resources. The hardware resources allocated to each Oracle VM for SPARC partition were then organized in various Oracle Solaris Zones, to further refine application tier isolation and resource management. The three PDoms were dedicated to the enterprise applications as follows:
Oracle E-Business PDom: Oracle E-Business Suite 12.1.3 World Record Extra-Large benchmark, exercising five Online Modules: Customer Service, Human Resources Self Service, iProcurement, Order Management and Financial, with 14,660 users and an average user response time under 2 seconds.
PeopleSoft PDom: PeopleSoft Human Capital Management (HCM) 9.1 FP2 World Record Benchmark, using PeopleTools 8.52 and an Oracle Database 11g Release 2, with 35,000 users, at an average user Search Time of 1.46 seconds and Save Time of 0.93 seconds. An online run with 40,000 users had an average user Search Time of 2.17 seconds and Save Time of 1.39 seconds, and a Payroll batch run completed in 29.17 minutes elapsed time for more than 500,000 employees.
Decision Support PDom: An Oracle Database 12c instance executing a Decision Support workload on about 30 billion rows of data and achieving linear scalability, i.e., on the 16 chips comprising the PDom, the workload ran 16x faster than on a single chip. Specifically, the 16-chip PDom processed about 320M rows/sec whereas a single chip could process about 20M rows/sec.
The SPARC M6-32 server is ideally suited for large-memory utilization. In this virtualized environment, three critical applications made use of 16 TB of physical memory. Each of the Oracle VM Server for SPARC environments utilized from 4 to 8 TB of memory, more than the limits of other virtualization solutions.
SPARC M6-32 Server Virtualization Layout Highlights The Oracle E-Business application instances were run in a dedicated Dynamic Domain consisting of 8 SPARC M6 processors and 4 TB of memory. The PDom was split into four symmetric Oracle VM Server for SPARC (LDoms) environments of 2 chips and 1 TB of memory each, two dedicated to the Application Server tier and the other two to the Database Server tier. Each Logical Domain was subsequently divided into two Oracle Solaris Zones, for a total of eight, one for each E-Business Application server and one for each Oracle Database 11g instance. The PeopleSoft application was run in a dedicated Dynamic Domain (PDom) consisting of 8 SPARC M6 processors and 4 TB of memory. The PDom was split into two Oracle VM Server for SPARC (LDoms) environments one of 6 chips and 3 TB of memory, reserved for the Web and Application Server tiers, and a second one of 2 chips and 1 TB of memory, reserved for the Database tier. Two PeopleSoft Application Servers, a Web Server instance, and a single Oracle Database 11g instance were each executed in their respective and exclusive Oracle Solaris Zone. The Oracle Database 12c Decision Support workload was run in a Dynamic Domain consisting of 16 SPARC M6 processors and 8 TB of memory. All the Oracle Applications and Database instances were running at high level of performance and concurrently in a virtualized environment. Running three Enterprise level application environments on a single SPARC M6-32 server offers centralized administration, simplified physical layout, high availability and security features (as each PDom and LDom runs its own Oracle Solaris operating system copy physically and logically isolated from the other environments), enabling the coexistence of multiple versions Oracle Solaris and application software on a single physical server. Dynamic Domains and Oracle VM Server for SPARC guests were configured with independent direct I/O domains, allowing for fast and isolated I/O paths, providing secure and high performance I/O access. 
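As a quick sanity check, the per-PDom allocations described above add up to the full SPARC M6-32 configuration; a minimal Python tally using only the figures given in this post:

    # Tally the Dynamic Domain (PDom) allocations against the full
    # SPARC M6-32 configuration (32 chips, 16 TB memory).
    pdoms = {
        "Oracle E-Business": {"chips": 8,  "memory_tb": 4},
        "PeopleSoft":        {"chips": 8,  "memory_tb": 4},
        "Decision Support":  {"chips": 16, "memory_tb": 8},
    }

    total_chips = sum(p["chips"] for p in pdoms.values())          # 32
    total_memory_tb = sum(p["memory_tb"] for p in pdoms.values())  # 16
    print(total_chips, total_memory_tb)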
Performance Landscape Oracle E-Business Test using Oracle Database 11g SPARC M6-32 PDom, 8 SPARC M6 Processors, 4 TB Memory Total Online Users Weighted Average Response Time (sec) 90th Percentile Response Time (s) 14,660 0.81 0.88 Multiple Online Modules X-Large Configuration (HR Self-Service, Order Management, iProcurement, Customer Service, Financial) PeopleSoft HR Self-Service Online Plus Payroll Batch using Oracle Database 11g SPARC M6-32 PDom, 8 SPARC M6 Processors, 4 TB Memory HR Self-Service Payroll Batch Elapsed (min) Online Users Average User Search / Save Time (sec) Transactions per Second 35,000 1.46 / 0.93 116 29.17   HR Self-Service Only Payroll Batch Only Elapsed (min) 40,000 2.17 / 1.39 132 18.27 Oracle Database 12c Decision Support Query Test SPARC M6-32 PDom, 16 SPARC M6 Processors, 8 TB Memory Parallelism Chips Used Rows Processing Rate (rows/s) Scaling Normalized to 1 Chip 16 319,981,734 15.9 8 162,545,303 8.1 4 80,943,271 4.0 2 40,458,329 2.0 1 20,086,829 1.0 Configuration Summary System Under Test: SPARC M6-32 server with 32 x SPARC M6 processors (3.6 GHz) 16 TB memory Storage Configuration: 6 x Sun Storage 2540-M2 each with 8 x Expansion Trays (each tray equipped with 12 x 300 GB SAS drives) 7 x Sun Server X3-2L each with 2 x Intel Xeon E5-2609 2.4 GHz Processors 16 GB Memory 4 x Sun Flash Accelerator F40 PCIe 400 GB cards Oracle Solaris 11.1 (COMSTAR) 1 x Sun Server X3-2L with 2 x Intel Xeon E5-2609 2.4 GHz Processors 16 GB Memory 12 x 3 TB SAS disks Oracle Solaris 11.1 (COMSTAR) Software Configuration: Oracle Solaris 11.1 (11.1.10.5.0), Oracle E-Business Oracle Solaris 11.1 (11.1.10.5.0), PeopleSoft Oracle Solaris 11.1 (11.1.9.5.0), Decision Support Oracle Database 11g Release 2, Oracle E-Business and PeopleSoft Oracle Database 12c Release 1, Decision Support Oracle E-Business Suite 12.1.3 PeopleSoft Human Capital Management 9.1 FP2 PeopleSoft PeopleTools 8.52.03 Oracle Java SE 6u32 Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 043 Oracle WebLogic Server 11g (10.3.4) Oracle Dynamic Domains (PDoms) resources:   Oracle E-Business PeopleSoft Oracle DSS Processors 8 8 16 Memory 4 TB 4 TB 8 TB Oracle Solaris 11.1 (11.1.10.5.0) 11.1 (11.1.10.5.0) 11.1 (11.1.9.5.0) Oracle Database 11g 11g 12c Oracle VM for SPARC / Oracle Solaris Zones 4 LDom / 8 Zones 2 LDom / 4 Zones None Storage 7 x Sun Server X3-2L 1 x Sun Server X3-2L (12 x 3 TB SAS ) 2 x Sun Storage 2540-M2 / 2501 pairs 4 x Sun Storage 2540-M2/2501 pairs Benchmark Description This benchmark consists of three different applications running concurrently. It shows that large, enterprise workloads can be run on a single system and without performance impact between application environments. The three workloads are: Oracle E-Business Suite Online This test simulates thousands of online users executing transactions typical of an internal Enterprise Resource Processing, including 5 application modules:  Customer Service, Human Resources Self Service, Procurement, Order Management and Financial. Each database tier uses a database instance of about 600 GB in size, and supporting thousands of application users, accessing hundreds of objects (tables, indexes, SQL stored procedures, etc.). The application tier includes multiple web and application server instances, specifically Apache Web Server, Oracle Application Server 10g and Oracle Java SE 6u32. 
PeopleSoft Human Capital Management This test simulates thousands of online employees, managers and Human Resource administrators executing transactions typical of a Human Resources Self Service application for the Enterprise.  Typical transactions are: viewing paychecks, promoting and hiring employees, updating employee profiles, etc. The database tier uses a database instance of about 500 GB in size, containing information for 500,480 employees. The application tier for this test includes web and application server instances, specifically Oracle WebLogic Server 11g, PeopleSoft Human Capital Management 9.1 and Oracle Java SE 6u32. Decision Support Workload using the Oracle Database. The query processes 30 billion rows stored in the Oracle Database, making heavy use of Oracle parallel query processing features. It performs multiple aggregations and summaries by reading and processing all the rows of the database. Key Points and Best Practices Oracle E-Business Environment The Oracle E-Business Suite setup consisted of 4 Oracle E-Business environments running 5 online Oracle E-Business modules simultaneously. The Oracle E-Business environments were deployed on 4 Oracle VM for SPARC logical domains: 2 for the Application tier and 2 for the Database tier. Each LDom included 2 SPARC M6 processor chips. The Application LDom was further split into 2 Oracle Solaris Zones, each one containing one Oracle E-Business Application instance. Similarly, on the Database tier, each LDom was further divided into 2 Oracle Solaris Zones, each containing an Oracle Database instance. Applications on the same LDom shared a 10 GbE network link to connect to the Database tier LDom.  Each Application in a Zone was connected to its own dedicated Database Zone. The communication between the two Zones was implemented via the Oracle Solaris 11 virtual network, which provides high performance, low latency transfers at memory speed using large frames (9000 bytes vs typical 1500 byte frames). The Oracle E-Business setup made use of the Oracle Database Shared Server feature in order to limit memory utilization, as well as the number of database server processes. The Oracle Database configuration and optimization was substantially out-of-the-box, except for properly sizing the Oracle Database memory areas (System Global Area and Program Global Area). In the Oracle E-Business Application LDom handling the Customer Service and HR Self Service modules, 28 Forms servers and 8 OC4J application servers were hosted in the two separate Oracle Solaris Zones, for a total of 56 Forms servers and 16 application servers. All the Oracle Database server processes and the listener processes were executed in the Oracle Solaris FX scheduler class. PeopleSoft Environment The PeopleSoft Application Oracle VM for SPARC had one Oracle Solaris Zone of 12 cores containing the web tier and two Oracle Solaris Zones of 57 cores total containing the Application tier.  The Database tier was contained in an Oracle VM for SPARC consisting of one Oracle Solaris Zone of 24 cores.  One core, in the Application Oracle VM, was dedicated to network and disk interrupt handling. All database data files, recovery files and Oracle Clusterware files for the PeopleSoft test were created with the Oracle Automatic Storage Management (Oracle ASM) volume manager for the added benefit of the ease of management provided by the Oracle ASM integrated storage management solution. 
In the application tier, 5 PeopleSoft domains with 350 application servers (70 per domain) were hosted in the two separate Oracle Solaris Zones for a total of 10 domains with 700 application server processes. All PeopleSoft Application processes and Web Server JVM instances were executed in the Oracle Solaris FX scheduler class. Oracle Decision Support Environment The decision support workload showed how the combination of a large memory (8 TB) and a large number of processors (16 chips comprising 1536 virtual CPUs), together with the Oracle parallel query facility, can linearly increase the performance of certain decision support queries as the number of CPUs increases. The large memory was used to cache the entire 30 billion row Oracle table in memory. There are a number of ways to accomplish this. The method deployed in this test was to allocate sufficient memory for Oracle's "keep cache" and direct the table to the "keep cache." To demonstrate scalability, it was necessary to ensure that the number of Oracle parallel servers was always equal to the number of available virtual CPUs. This was accomplished by the combination of providing a degree of parallelism hint to the query and setting both "parallel_max_servers" and "parallel_min_servers" to the number of virtual CPUs. The number of virtual CPUs for each stage of the scalability test was adjusted using the "psradm" command available in Oracle Solaris. See Also Oracle E-Business SPARC M6-32 Report oracle.com    AWR Report 1    AWR Report 2    AWR Report 3    AWR Report 4 Oracle Applications Benchmarks Oracle R12 E-Business Standard Benchmark Overview Oracle E-Business Suite Standard Benchmark Results Oracle's PeopleSoft Benchmark White Papers SPARC M6-32 Server oracle.com    OTN Sun Flash Accelerator F40 PCIe Card oracle.com    OTN Oracle E-Business Suite oracle.com    OTN PeopleSoft Enterprise Human Capital Management oracle.com    OTN PeopleSoft Enterprise Human Capital Management (Payroll) oracle.com    OTN Oracle Database oracle.com    OTN Oracle Solaris oracle.com    OTN Disclosure Statement Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.  PeopleSoft results as of 02/14/2014. Other results as of 09/22/2013. Oracle E-Business Suite R12 extra-large multiple-online module benchmark, SPARC M6-32, SPARC M6, 3.6 GHz, 8 chips, 96 cores, 768 threads, 4 TB memory, 14,660 online users, average response time 0.81 sec, 90th percentile response time 0.88 sec, Oracle Solaris 11.1, Oracle Solaris Zones, Oracle VM for SPARC, Oracle E-Business Suite 12.1.3, Oracle Database 11g Release 2, Results as of 9/22/2013.
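As a closing sanity check on the near-linear Decision Support scaling reported above, the short sketch below (illustrative arithmetic only) recomputes the scaling factors from the published rows-per-second figures and shows the nominal parallel-server count per configuration, given the 1536 virtual CPUs spread across 16 chips.

# Rows/s from the Decision Support scaling table, keyed by chips used.
rates = {16: 319_981_734, 8: 162_545_303, 4: 80_943_271, 2: 40_458_329, 1: 20_086_829}

VCPUS_PER_CHIP = 1536 // 16   # parallel servers were kept equal to the online vCPU count

base = rates[1]
for chips in sorted(rates, reverse=True):
    print(f"{chips:2d} chips ({chips * VCPUS_PER_CHIP:4d} vCPUs): "
          f"{rates[chips]:>12,d} rows/s  scaling {rates[chips] / base:4.1f}x")
# Reproduces the table above: 15.9x, 8.1x, 4.0x, 2.0x and 1.0x.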

This result demonstrates how the combination of Oracle virtualization technologies for SPARC and Oracle's SPARC M6-32 server allow the deployment and concurrent high performance execution of multiple...

Benchmark

SPARC T5-2 Delivers World Record 2-Socket Application Server for SPECjEnterprise2010 Benchmark

Oracle's SPARC T5-2 servers have set the world record for the SPECjEnterprise2010 benchmark using two-socket application servers with a result of 17,033.54 SPECjEnterprise2010 EjOPS.  The result used two SPARC T5-2 servers, one server for the application tier and the other server for the database tier. The SPARC T5-2 server delivered 29% more performance compared to the 2-socket IBM PowerLinux server result of 13,161.07 SPECjEnterprise2010 EjOPS. The two SPARC T5-2 servers have 1.2x better price performance than the two IBM PowerLinux 7R2 POWER7+ processor-based servers (based on hardware plus software configuration costs for both tiers). The price performance of the SPARC T5-2 server is $35.99 compared to the IBM PowerLinux 7R2 at $44.75. The SPARC T5-2 server demonstrated 1.5x more performance compared to Oracle's x86-based 2-socket Sun Server X4-2 system (Ivy Bridge) result of 11,259.88 SPECjEnterprise2010 EjOPS. Oracle holds the top x86 2-socket application server SPECjEnterprise2010 result. This SPARC T5-2 server result represents the best performance per socket for a single system in the application tier of 8,516.77 SPECjEnterprise2010 EjOPS per socket. The application server used Oracle Fusion Middleware components including the Oracle WebLogic 12.1 application server and Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.7.0_45. The database server was configured with Oracle Database 12c Release 1. This result demonstrated less than 1 second average response times for all SPECjEnterprise2010 transactions and represents Jave EE 5.0 transactions generated by 139,000 users. Performance Landscape Select 2-socket single application server results.  Complete benchmark results are at the SPEC website,SPECjEnterprise2010 Results. SPECjEnterprise2010 Performance Chart 1/22/2014 Submitter EjOPS* Java EE Server DB Server Oracle 17,033.54 1 x SPARC T5-2 2 x 3.6 GHz SPARC T5 Oracle WebLogic 12c (12.1.2) 1 x SPARC T5-2 2 x 3.6 GHz SPARC T5 Oracle Database 12c (12.1.0.1) IBM 13,161.07 1x IBM PowerLinux 7R2 2 x 4.2 GHz POWER 7+ WebSphere Application Server V8.5 1x IBM PowerLinux 7R2 2 x 4.2 GHz POWER 7+ IBM DB2 10.1 FP2 Oracle 11,259.88 1x Sun Server X4-2 2 x 2.7 GHz Intel Xeon E5-2697 v2 Oracle WebLogic 12c (12.1.2) 1x Sun Server X4-2L 2 x 2.7 GHz Intel Xeon E5-2697 v2 Oracle Database 12c (12.1.0.1) * SPECjEnterprise2010 EjOPS (bigger is better) Configuration Summary Application Server: 1 x SPARC T5-2 server, with 2 x 3.6 GHz SPARC T5 processors 512 GB memory 2 x 10 GbE dual-port NIC Oracle Solaris 11.1 (11.1.13.6.0) Oracle WebLogic Server 12c (12.1.2) Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.7.0_45 Database Server: 1 x SPARC T5-2 server, with 2 x 3.6 GHz SPARC T5 processors 512 GB memory 1 x 10 GbE dual-port NIC 2 x 8 Gb FC HBA Oracle Solaris 11.1 (11.1.13.6.0) Oracle Database 12c (12.1.0.1) Storage Servers: 2 x Sun Server X4-2L (24-Drive), with 2 x 2.6 GHz Intel Xeon 64 GB memory 1 x 8 Gb FC HBA 4 x Sun Flash Accelerator F80 PCI-E Cards Oracle Solaris 11.1 Benchmark Description SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The new SPECjEnterprise2010 benchmark has been re-designed and developed to cover the Java EE 5 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. 
This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems, The web zone, servlets, and web services The EJB zone JPA 1.0 Persistence Model JMS and Message Driven Beans Transaction management Database connectivity Moreover, SPECjEnterprise2010 also heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network. The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second (SPECjEnterprise2010 EjOPS). The primary metric for the SPECjEnterprise2010 benchmark is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is NO price/performance metric in this benchmark. Key Points and Best Practices Two Oracle WebLogic server instances on the SPARC T5-2 server were hosted in 2 separate Oracle Solaris Zones. The Oracle WebLogic application servers were executed in the FX scheduling class to improve performance by  reducing the frequency of context switches. The Oracle log writer process was run in the RT scheduling class. See Also SPECjEnterprise2010 Results Page SPARC T5-2 Result Page at SPEC SPARC T5-2 Server oracle.com    OTN Sun Server X4-2L oracle.com   OTN Sun Flash Accelerator F80 PCIe Card oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle Fusion Middleware oracle.com    OTN Oracle WebLogic Suite oracle.com    OTN Disclosure Statement SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 1/22/2014.  SPARC T5-2, 17,033.54 SPECjEnterprise2010 EjOPS; IBM PowerLinux 7R2, 13,161.07 SPECjEnterprise2010 EjOPS; Sun Server X4-2, 11,259.88 SPECjEnterprise2010 EjOPS. The SPARC T5-2 configuration cost is the total application and database server hardware plus software. List price is $613,052 from http://www.oracle.com as of 1/22/2014. The IBM PowerLinux 7R2 configuration total hardware plus software list price is $588,970 based on public pricing from http://www.ibm.com as of 1/22/2014. Pricing does not include database storage hardware for IBM or Oracle.
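The price/performance figures quoted for this result are simply total configuration list price divided by the EjOPS score. A quick check against the prices in the disclosure statement (sketch below, arithmetic only) reproduces the $35.99 and $44.75 values.

# Price per SPECjEnterprise2010 EjOPS = total HW+SW list price / EjOPS.
configs = {
    "SPARC T5-2 (app + db tiers)":         (613_052, 17_033.54),
    "IBM PowerLinux 7R2 (app + db tiers)": (588_970, 13_161.07),
}
for name, (price_usd, ejops) in configs.items():
    print(f"{name:38s} ${price_usd / ejops:6.2f} per EjOPS")
# -> about $35.99 for the SPARC T5-2 and $44.75 for the IBM PowerLinux 7R2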

Oracle's SPARC T5-2 servers have set the world record for the SPECjEnterprise2010 benchmark using two-socket application servers with a result of 17,033.54 SPECjEnterprise2010 EjOPS.  The result used...

Benchmark

World Record Single System TPC-H @10000GB Benchmark on SPARC T5-4

Oracle's SPARC T5-4 server delivered world record single server performance of 377,594 QphH@10000GB with price/performance of $4.65/QphH@10000GB USD on the TPC-H @10000GB benchmark. This result shows that the 4-chip SPARC T5-4 server is significantly faster than the 8-chip server results from HP (Intel x86 based). The SPARC T5-4 server with four SPARC T5 processors is 2.4 times faster than the HP ProLiant DL980 G7 server with eight x86 processors. The SPARC T5-4 server delivered 4.8 times better performance per chip and 3.0 times better performance per core than the HP ProLiant DL980 G7 server. The SPARC T5-4 server has 28% better price/performance than the HP ProLiant DL980 G7 server (for the price/QphH metric). The SPARC T5-4 server with 2 TB memory is 2.4 times faster than the HP ProLiant DL980 G7 server with 4 TB memory (for the composite metric). The SPARC T5-4 server took 9 hours, 37 minutes, 54 seconds for data loading while the HP ProLiant DL980 G7 server took 8.3 times longer. The SPARC T5-4 server accomplished the refresh function in around a minute, the HP ProLiant DL980 G7 server took up to 7.1 times longer to do the same function. This result demonstrates a complete data warehouse solution that shows the performance both of individual and concurrent query processing streams, faster loading, and refresh of the data during business operations. The SPARC T5-4 server delivers superior performance and cost efficiency when compared to the HP result. Performance Landscape The table lists the leading TPC-H @10000GB results for non-clustered systems. TPC-H @10000GB, Non-Clustered Systems System Processor P/C/T – Memory Composite (QphH) $/perf ($/QphH) Power (QppH) Throughput (QthH) Database Available SPARC T5-4 3.6 GHz SPARC T5 4/64/512 – 2048 GB 377,594.3 $4.65 342,714.1 416,024.4 Oracle 11g R2 11/25/13 HP ProLiant DL980 G7 2.4 GHz Intel E7-4870 8/80/160 – 4096 GB 158,108.3 $6.49 185,473.6 134,780.5 SQL Server 2012 04/15/13 P/C/T = Processors, Cores, Threads QphH = the Composite Metric (bigger is better) $/QphH = the Price/Performance metric in USD (smaller is better) QppH = the Power Numerical Quantity (bigger is better) QthH = the Throughput Numerical Quantity (bigger is better) The following table lists data load times and average refresh function times. TPC-H @10000GB, Non-Clustered Systems Database Load & Database Refresh System Processor Data Loading (h:m:s) T5 Advan RF1 (sec) T5 Advan RF2 (sec) T5 Advan SPARC T5-4 3.6 GHz SPARC T5 09:37:54 8.3x 58.8 7.1x 62.1 6.4x HP ProLiant DL980 G7 2.4 GHz Intel Xeon E7-4870 79:28:23 1.0x 416.4 1.0x 394.9 1.0x Data Loading = database load time RF1 = throughput average first refresh transaction RF2 = throughput average second refresh transaction T5 Advan = the ratio of time to the SPARC T5-4 server time Complete benchmark results found at the TPC benchmark website http://www.tpc.org. 
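The TPC-H composite metric in the table above is the geometric mean of the Power and Throughput metrics, and the price/performance metric is the 3-year total system cost divided by that composite. The small sketch below recomputes both from the published numbers (the HP 3-year cost is not listed in this post, so only its composite is rechecked).

import math

# QphH@Size = sqrt(QppH * QthH); $/QphH = 3-year total cost / QphH.
results = {
    "SPARC T5-4":           {"qpph": 342_714.1, "qthh": 416_024.4, "cost_usd": 1_755_709},
    "HP ProLiant DL980 G7": {"qpph": 185_473.6, "qthh": 134_780.5, "cost_usd": None},
}
for system, r in results.items():
    qphh = math.sqrt(r["qpph"] * r["qthh"])
    line = f"{system:22s} QphH ~ {qphh:10,.0f}"
    if r["cost_usd"] is not None:
        line += f"   price/perf ~ ${r['cost_usd'] / qphh:.2f}/QphH"
    print(line)
# SPARC T5-4 comes out at ~377,594 QphH and ~$4.65/QphH, matching the audited results.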
Configuration Summary and Results Server Under Test: SPARC T5-4 server 4 x SPARC T5 processors (3.6 GHz total of 64 cores, 512 threads) 2 TB memory 2 x internal SAS (2 x 300 GB) disk drives 12 x 16 Gb FC HBA External Storage: 24 x Sun Server X4-2L servers configured as COMSTAR nodes, each with 2 x 2.5 GHz Intel Xeon E5-2609 v2 processors 4 x Sun Flash Accelerator F80 PCIe Cards, 800 GB each 6 x 4 TB 7.2K RPM 3.5" SAS disks 1 x 8 Gb dual port HBA 2 x 48 port Brocade 6510 Fibre Channel Switches Software Configuration: Oracle Solaris 11.1 Oracle Database 11g Release 2 Enterprise Edition Audited Results: Database Size: 10000 GB (Scale Factor 10000) TPC-H Composite: 377,594.3 QphH@10000GB Price/performance: $4.65/QphH@10000GB USD Available: 11/25/2013 Total 3 year Cost: $1,755,709 USD TPC-H Power: 342,714.1 TPC-H Throughput: 416,024.4 Database Load Time: 9:37:54 Benchmark Description The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB, 10000GB, 30000GB and 100000GB) are not allowed by the TPC. TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system. The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multiple user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor. Key Points and Best Practices COMSTAR (Common Multiprotocol SCSI Target) is the software framework that enables an Oracle Solaris host to serve as a SCSI Target platform.  COMSTAR uses a modular approach to break the huge task of handling all the different pieces in a SCSI target subsystem into independent functional modules which are glued together by the SCSI Target Mode Framework (STMF). The modules implementing functionality at the SCSI level (disk, tape, medium changer, etc.) are not required to know about the underlying transport, and the modules implementing the transport protocol (FC, iSCSI, etc.) are not aware of the SCSI-level functionality of the packets they are transporting. The framework hides the details of allocation, provides execution context and cleanup of SCSI commands and associated resources, and simplifies the task of writing the SCSI or transport modules. The SPARC T5-4 server achieved a peak IO rate of 37 GB/sec from the Oracle database configured with this storage. Twelve COMSTAR nodes were mirrored to another twelve COMSTAR nodes on which all of the Oracle database files were placed. 
IO performance was high and balanced across all the nodes. Oracle Solaris 11.1 required very little system tuning. Some vendors try to make the point that storage ratios are of customer concern. However, storage ratio size has more to do with disk layout and the increasing capacities of disks – so this is not an important metric when comparing systems. The SPARC T5-4 server and Oracle Solaris efficiently managed the system load of nearly two thousand Oracle Database parallel processes. See Also SPARC T5-4 Server TPC-H Executive Summary tpc.org SPARC T5-4 Server TPC-H Full Disclosure Report tpc.org Transaction Processing Performance Council (TPC) Home Page SPARC T5-4 Server oracle.com    OTN Sun Server X4-2L oracle.com   OTN Sun Flash Accelerator F80 PCIe Card oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Disclosure Statement TPC Benchmark, TPC-H, QphH, QthH, QppH are trademarks of the Transaction Processing Performance Council (TPC). Results as of 11/25/13, prices are in USD.  SPARC T5-4 www.tpc.org/3293; HP ProLiant DL980 G7 www.tpc.org/3285.

Oracle's SPARC T5-4 server delivered world record single server performance of 377,594 QphH@10000GB with price/performance of $4.65/QphH@10000GB USD on the TPC-H @10000GB benchmark. This result shows...

Benchmark

SPARC M6-32 Delivers Oracle E-Business and PeopleSoft World Record Benchmarks, Linear Data Warehouse Scaling in a Virtualized Configuration

This result has been superseded.  Please see the latest result. This result demonstrates how the combination of Oracle virtualization technologies for SPARC and Oracle's SPARC M6-32 server allows the deployment and concurrent high performance execution of multiple Oracle applications and databases sized for the Enterprise. In an 8-chip Dynamic Domain (also known as a PDom), the SPARC M6-32 server set an E-Business 12.1.3 X-Large world record with 14,660 online users running five simultaneous E-Business modules. In a second 8-chip Dynamic Domain, the SPARC M6-32 server set a PeopleSoft HCM 9.1 HR Self-Service online world record, supporting 34,000 users while simultaneously running a batch workload in 29.7 minutes.  This was done with a database of 600,480 employees.  In a separate test, a batch-only workload was run in 21.2 minutes. In a third Dynamic Domain with 16 chips on the SPARC M6-32 server, a data warehouse test was run that showed near-linear scaling. On the SPARC M6-32 server, several critical application instances were virtualized:  an Oracle E-Business application and database, a PeopleSoft application and database, and a Decision Support database instance using Oracle Database 12c. In this Enterprise Virtualization benchmark, a SPARC M6-32 server utilized all levels of Oracle Virtualization features available for SPARC servers. The 32-chip SPARC M6 based server was divided into three separate Dynamic Domains (also known as PDoms), available only on the SPARC Enterprise M-Series systems, which are completely electrically isolated and independent hardware partitions. Each PDom was subsequently split into multiple hypervisor-based Oracle VM for SPARC partitions (also known as LDoms), each one running its own Oracle Solaris kernel and managing its own CPUs and I/O resources. The hardware resources allocated to each Oracle VM for SPARC partition were then organized in various Oracle Solaris Zones, to further refine application tier isolation and resource management. The three PDoms were dedicated to the enterprise applications as follows: Oracle E-Business PDom: Oracle E-Business 12.1.3 Suite World Record Extra-Large benchmark, exercising five Online Modules: Customer Service, Human Resources Self Service, iProcurement, Order Management and Financial, with 14,660 users and an average user response time under 2 seconds. PeopleSoft PDom: PeopleSoft Human Capital Management (HCM) 9.1 FP2 World Record Benchmark, using PeopleTools 8.52 and an Oracle Database 11g Release 2, with 34,000 users, at an average user Search Time of 1.11 seconds and Save Time of 0.77 seconds, and a Payroll batch run completed in 29.7 minutes elapsed time for more than 500,000 employees. Decision Support PDom: An Oracle Database 12c instance executing a Decision Support workload on about 30 billion rows of data and achieving linear scalability, i.e. on the 16 chips comprising the PDom, the workload ran 16x faster than on a single chip. Specifically, the 16-chip PDom processed about 320M rows/sec whereas a single chip could process about 20M rows/sec. The SPARC M6-32 server is ideally suited for large-memory utilization. In this virtualized environment, three critical applications made use of 16 TB of physical memory. Each of the Oracle VM Server for SPARC environments utilized from 4 to 8 TB of memory, more than the limits of other virtualization solutions. 
SPARC M6-32 Server Virtualization Layout Highlights The Oracle E-Business application instances were run in a dedicated Dynamic Domain consisting of 8 SPARC M6 processors and 4 TB of memory. The PDom was split into four symmetric Oracle VM Server for SPARC (LDoms) environments of 2 chips and 1 TB of memory each, two dedicated to the Application Server tier and the other two to the Database Server tier. Each Logical Domain was subsequently divided into two Oracle Solaris Zones, for a total of eight, one for each E-Business Application server and one for each Oracle Database 11g instance. The PeopleSoft application was run in a dedicated Dynamic Domain (PDom) consisting of 8 SPARC M6 processors and 4 TB of memory. The PDom was split into two Oracle VM Server for SPARC (LDoms) environments: one of 6 chips and 3 TB of memory, reserved for the Web and Application Server tiers, and a second one of 2 chips and 1 TB of memory, reserved for the Database tier. Two PeopleSoft Application Servers, a Web Server instance, and a single Oracle Database 11g instance were each executed in their respective and exclusive Oracle Solaris Zone. The Oracle Database 12c Decision Support workload was run in a Dynamic Domain consisting of 16 SPARC M6 processors and 8 TB of memory. All the Oracle Applications and Database instances were running concurrently at a high level of performance in a virtualized environment. Running three enterprise-level application environments on a single SPARC M6-32 server offers centralized administration, simplified physical layout, high availability and security features (as each PDom and LDom runs its own Oracle Solaris operating system copy physically and logically isolated from the other environments), enabling the coexistence of multiple versions of Oracle Solaris and application software on a single physical server. Dynamic Domains and Oracle VM Server for SPARC guests were configured with independent direct I/O domains, allowing for fast and isolated I/O paths, providing secure and high performance I/O access. 
Performance Landscape Oracle E-Business Test using Oracle Database 11g SPARC M6-32 PDom, 8 SPARC M6 Processors, 4 TB Memory Total Online Users Weighted Average Response Time (sec) 90th Percentile Response Time (s) 14,660 0.81 0.88 Multiple Online Modules X-Large Configuration (HR Self-Service, Order Management, iProcurement, Customer Service, Financial)   PeopleSoft HR Self-Service Online Plus Payroll Batch using Oracle Database 11g SPARC M6-32 PDom, 8 SPARC M6 Processors, 4 TB Memory HR Self-Service Payroll Batch Elapsed (min) Online Users Average User Search / Save Time (sec) Transactions per Second 34,000 1.11 / 0.77 113 29.7   Payroll Batch Only Elapsed (min) 21.17   Oracle Database 12c Decision Support Query Test SPARC M6-32 PDom, 16 SPARC M6 Processors, 8 TB Memory Parallelism Chips Used Rows Processing Rate (rows/s) Scaling Normalized to 1 Chip 16 319,981,734 15.9 8 162,545,303 8.1 4 80,943,271 4.0 2 40,458,329 2.0 1 20,086,829 1.0 Configuration Summary System Under Test: SPARC M6-32 server with 32 x SPARC M6 processors (3.6 GHz) 16 TB memory Storage Configuration: 6 x Sun Storage 2540-M2 each with 8 x Expansion Trays (each tray equipped with 12 x 300 GB SAS drives) 7 x Sun Server X3-2L each with 2 x Intel Xeon E5-2609 2.4 GHz Processors 16 GB Memory 4 x Sun Flash Accelerator F40 PCIe 400 GB cards Oracle Solaris 11.1 (COMSTAR) 1 x Sun Server X3-2L with 2 x Intel Xeon E5-2609 2.4 GHz Processors 16 GB Memory 12 x 3 TB SAS disks Oracle Solaris 11.1 (COMSTAR) Software Configuration: Oracle Solaris 11.1 (11.1.10.5.0), Oracle E-Business Oracle Solaris 11.1 (11.1.10.5.0), PeopleSoft Oracle Solaris 11.1 (11.1.9.5.0), Decision Support Oracle Database 11g Release 2, Oracle E-Business and PeopleSoft Oracle Database 12c Release 1, Decision Support Oracle E-Business Suite 12.1.3 PeopleSoft Human Capital Management 9.1 FP2 PeopleSoft PeopleTools 8.52.03 Oracle Java SE 6u32 Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 043 Oracle WebLogic Server 11g (10.3.4) Oracle Dynamic Domains (PDoms) resources:   Oracle E-Business PeopleSoft Oracle DSS Processors 8 8 16 Memory 4 TB 4 TB 8 TB Oracle Solaris 11.1 (11.1.10.5.0) 11.1 (11.1.10.5.0) 11.1 (11.1.9.5.0) Oracle Database 11g 11g 12c Oracle VM for SPARC / Oracle Solaris Zones 4 LDom / 8 Zones 2 LDom / 4 Zones None Storage 7 x Sun Server X3-2L 1 x Sun Server X3-2L (12 x 3 TB SAS ) 2 x Sun Storage 2540-M2 / 2501 pairs 4 x Sun Storage 2540-M2/2501 pairs Benchmark Description This benchmark consists of three different applications running concurrently. It shows that large, enterprise workloads can be run on a single system and without performance impact between application environments. The three workloads are: Oracle E-Business Suite Online This test simulates thousands of online users executing transactions typical of an internal Enterprise Resource Processing, including 5 application modules:  Customer Service, Human Resources Self Service, Procurement, Order Management and Financial. Each database tier uses a database instance of about 600 GB in size, and supporting thousands of application users, accessing hundreds of objects (tables, indexes, SQL stored procedures, etc.). The application tier includes multiple web and application server instances, specifically Apache Web Server, Oracle Application Server 10g and Oracle Java SE 6u32. 
PeopleSoft Human Capital Management This test simulates thousands of online employees, managers and Human Resource administrators executing transactions typical of a Human Resources Self Service application for the Enterprise.  Typical transactions are: viewing paychecks, promoting and hiring employees, updating employee profiles, etc. The database tier uses a database instance of about 500 GB in size, containing information for 500,480 employees. The application tier for this test includes web and application server instances, specifically Oracle WebLogic Server 11g, PeopleSoft Human Capital Management 9.1 and Oracle Java SE 6u32. Decision Support Workload using the Oracle Database. The query processes 30 billion rows stored in the Oracle Database, making heavy use of Oracle parallel query processing features. It performs multiple aggregations and summaries by reading and processing all the rows of the database. Key Points and Best Practices Oracle E-Business Environment The Oracle E-Business Suite setup consisted of 4 Oracle E-Business environments running 5 online Oracle E-Business modules simultaneously.  The Oracle E-Business environments were deployed on 4 Oracle VM for SPARC logical domains: 2 for the Application tier and 2 for the Database tier. Each LDom included 2 SPARC M6 processor chips. The Application LDom was further split into 2 Oracle Solaris Zones, each one containing one Oracle E-Business Application instance. Similarly, on the Database tier, each LDom was further divided into 2 Oracle Solaris Zones, each containing an Oracle Database instance. Applications on the same LDom shared a 10 GbE network link to connect to the Database tier LDom.  Each Application in a Zone was connected to its own dedicated Database Zone. The communication between the two Zones was implemented via the Oracle Solaris 11 virtual network, which provides high performance, low latency transfers at memory speed using large frames (9000 bytes vs typical 1500 byte frames). The Oracle E-Business setup made use of the Oracle Database Shared Server feature in order to limit memory utilization, as well as the number of database server processes. The Oracle Database configuration and optimization was substantially out-of-the-box, except for properly sizing the Oracle Database memory areas (System Global Area and Program Global Area). In the Oracle E-Business Application LDom handling the Customer Service and HR Self Service modules, 28 Forms servers and 8 OC4J application servers were hosted in the two separate Oracle Solaris Zones, for a total of 56 Forms servers and 16 application servers. All the Oracle Database server processes and the listener processes were executed in the Oracle Solaris FX scheduler class. PeopleSoft Environment The PeopleSoft Application Oracle VM for SPARC had one Oracle Solaris Zone of 12 cores containing the web tier and two Oracle Solaris Zones of 28 cores each containing the Application tier.  The Database tier was contained in an Oracle VM for SPARC consisting of one Oracle Solaris Zone of 24 cores.  One and a half cores, in the Application Oracle VM, were dedicated to network and disk interrupt handling. All database data files, recovery files and Oracle Clusterware files for the PeopleSoft test were created with the Oracle Automatic Storage Management (Oracle ASM) volume manager for the added benefit of the ease of management provided by the Oracle ASM integrated storage management solution. 
In the application tier, 5 PeopleSoft domains with 350 application servers (70 per domain) were hosted in the two separate Oracle Solaris Zones for a total of 10 domains with 700 application server processes. All PeopleSoft Application processes and Web Server JVM instances were executed in the Oracle Solaris FX scheduler class. Oracle Decision Support Environment The decision support workload showed how the combination of a large memory (8 TB) and a large number of processors (16 chips comprising 1536 virtual CPUs), together with the Oracle parallel query facility, can linearly increase the performance of certain decision support queries as the number of CPUs increases. The large memory was used to cache the entire 30 billion row Oracle table in memory. There are a number of ways to accomplish this. The method deployed in this test was to allocate sufficient memory for Oracle's "keep cache" and direct the table to the "keep cache." To demonstrate scalability, it was necessary to ensure that the number of Oracle parallel servers was always equal to the number of available virtual CPUs. This was accomplished by the combination of providing a degree of parallelism hint to the query and setting both "parallel_max_servers" and "parallel_min_servers" to the number of virtual CPUs. The number of virtual CPUs for each stage of the scalability test was adjusted using the "psradm" command available in Oracle Solaris. See Also Oracle E-Business SPARC M6-32 Report oracle.com    AWR Report 1    AWR Report 2    AWR Report 3    AWR Report 4 Oracle Applications Benchmarks Oracle R12 E-Business Standard Benchmark Overview Oracle E-Business Suite Standard Benchmark Results Oracle's PeopleSoft Benchmark White Papers SPARC M6-32 Server oracle.com    OTN Sun ZFS Storage 7420 Appliance oracle.com    OTN Sun Flash Accelerator F40 PCIe Card oracle.com    OTN Oracle E-Business Suite oracle.com    OTN PeopleSoft Enterprise Human Capital Management oracle.com    OTN PeopleSoft Enterprise Human Capital Management (Payroll) oracle.com    OTN Oracle Database oracle.com    OTN Oracle Solaris oracle.com    OTN Disclosure Statement Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 09/22/2013. Oracle E-Business Suite R12 extra-large multiple-online module benchmark, SPARC M6-32, SPARC M6, 3.6 GHz, 8 chips, 96 cores, 768 threads, 4 TB memory, 14,660 online users, average response time 0.81 sec, 90th percentile response time 0.88 sec, Oracle Solaris 11.1, Oracle Solaris Zones, Oracle VM for SPARC, Oracle E-Business Suite 12.1.3, Oracle Database 11g Release 2, Results as of 9/20/2013.

This result has been superseded.  Please see the latest result. This result demonstrates how the combination of Oracle virtualization technologies for SPARC and Oracle's SPARC M6-32 server allows the...

Benchmark

SPARC T5-8 Delivers World Record Single Server SPECjEnterprise2010 Benchmark, Utilizes Virtualized Environment

Oracle produced a world record single-server SPECjEnterprise2010 benchmark result of 36,571.36 SPECjEnterprise2010 EjOPS using one of Oracle's SPARC T5-8 servers for both the application and the database tier.  Oracle VM Server for SPARC was used to virtualize the system to achieve this result. The 8-chip SPARC T5 processor based server is 3.3x faster than the 8-chip IBM Power 780 server (POWER7+ processor based). The SPARC T5-8 has 4.4x better price performance than the IBM Power 780, a POWER7+ processor based server (based on hardware plus software configuration costs).  The price performance of the SPARC T5-8 server is $40.68 compared to the IBM Power 780 at $177.41.  The IBM Power 780, POWER7+ based system has 1.2x better performance per core, but this did not reduce the total software and hardware cost to the customer. As shown by this comparison, performance-per-core is a poor predictor of characteristics relevant to customers.  The SPARC T5-8 virtualized price performance was also less than the low-end IBM PowerLinux 7R2 at $62.26. The SPARC T5-8 server ran the Oracle Solaris 11.1 operating system and used Oracle VM Server for SPARC to consolidate ten Oracle WebLogic application server instances and one database server instance to achieve this result. This result demonstrated sub-second average response times for all SPECjEnterprise2010 transactions and represents JEE 5.0 transactions generated by 299,000 users. The SPARC T5-8 server requires only 8 rack units, the same as the space of the IBM Power 780. In this configuration IBM has a hardware core density of 4 cores per rack unit which contrasts with the 16 cores per rack unit for the SPARC T5-8 server. This again demonstrates why performance-per-core is a poor predictor of characteristics relevant to customers. The application server used Oracle Fusion Middleware components including the Oracle WebLogic 12.1 application server and Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.7.0_25.  The database server was configured with Oracle Database 12c Release 1. The SPARC T5-8 server is 2.8x faster than a non-virtualized IBM POWER7+ based server result (one server for application and one server for database), the IBM PowerLinux 7R2 achieved 13,161.07 SPECjEnterprise2010 EjOPS. Performance Landscape SPECjEnterprise2010 Performance Chart Only Three Virtualized Results (App+DB on 1 Server) as of 9/23/2013 Submitter EjOPS* Chips per Server Java EE Server & DB Server App DB Oracle 36,571.36 5 3 1 x SPARC T5-8 8 chips, 128 cores, 3.6 GHz SPARC T5 Oracle WebLogic 12c (12.1.2) Oracle Database 12c (12.1.0.1) Oracle 27,843.57 4 4 1 x SPARC T5-8 8 chips, 128 cores, 3.6 GHz SPARC T5 Oracle WebLogic 12c (12.1.1) Oracle Database 11g (11.2.0.3) IBM 10,902.30 4 4 1 x IBM Power 780 8 chips, 32 cores, 4.42 GHz POWER7+ WebSphere Application Server V8.5 IBM DB2 Universal Database 10.1 * SPECjEnterprise2010 EjOPS (bigger is better) Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results. 
Configuration Summary Oracle Summary Application and Database Server: 1 x SPARC T5-8 server, with 8 x 3.6 GHz SPARC T5 processors 2 TB memory 9 x 10 GbE dual-port NIC 6 x 8 Gb dual-port HBA Oracle Solaris 11.1 SRU 10.5 Oracle VM Server for SPARC Oracle WebLogic Server 12c (12.1.2) Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.7.0_25 Oracle Database 12c (12.1.0.1) Storage Servers: 6 x Sun Server X3-2L (12-Drive), with 2 x 2.4 GHz Intel Xeon 16 GB memory 1 x 8 Gb FC HBA 4 x Sun Flash Accelerator F40 PCI-E Card Oracle Solaris 11.1 2 x Sun Storage 2540-M2 Array 12 x 600 GB 15K RPM SAS HDD Switch Hardware: 1 x Sun Network 10 GbE 72-port Top of Rack (ToR) Switch IBM Summary Application and Database Server: 1 x IBM Power 780 server, with 8 x 4.42 GHz POWER7+ processors 786 GB memory 6 x 10 GbE dual-port NIC 3 x 8 Gb four-port HBA IBM AIX V7.1 TL2 IBM WebSphere Application Server V8.5 IBM J9 VM (build 2.6, JRE 1.7.0 IBM J9 AIX ppc-32) IBM DB2 10.1 IBM InfoSphere Optim pureQuery Runtime v3.1.1 Storage: 2 x DS5324 Disk System with 48 x 146 GB 15K E-DDM Disks 1 x v7000 Disk Controller with 16 x 400 GB SSD Disks Benchmark Description SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The new SPECjEnterprise2010 benchmark has been re-designed and developed to cover the Java EE 5 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems, The web zone, servlets, and web services The EJB zone JPA 1.0 Persistence Model JMS and Message Driven Beans Transaction management Database connectivity Moreover, SPECjEnterprise2010 also heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network. The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second (SPECjEnterprise2010 EjOPS). The primary metric for the SPECjEnterprise2010 benchmark is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is NO price/performance metric in this benchmark. Key Points and Best Practices Ten Oracle WebLogic server instances on the SPARC T5-8 server were hosted in 10 separate Oracle Solaris Zones within a separate guest domain on 80 cores (5 cpu chips). The database ran in a separate guest domain consisting of 47 cores (3 cpu chips).  One core was reserved for the primary domain. The Oracle WebLogic application servers were executed in the FX scheduling class to improve performance by reducing the frequency of context switches. The Oracle log writer process was run in the FX scheduling class at processor priority 60 to use the Critical Thread feature. 
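The guest-domain split described in the Key Points can be cross-checked against the server's 8 chips and 128 cores (16 cores per SPARC T5 chip, per the result table above). The sketch below is illustrative arithmetic only and also reproduces the cores-per-rack-unit comparison made earlier in the post.

CORES_PER_CHIP = 16                      # SPARC T5: 8 chips / 128 cores in this result

domains = {
    "WebLogic guest domain (10 zones, 5 chips)": 80,
    "Database guest domain (3 chips)":           47,
    "Primary (control) domain":                   1,
}
assert sum(domains.values()) == 8 * CORES_PER_CHIP   # 80 + 47 + 1 = 128 cores

# Both servers occupy 8 rack units, so core density differs by 4x.
print("SPARC T5-8:   ", 128 // 8, "cores per rack unit")   # 16
print("IBM Power 780:", 32 // 8, "cores per rack unit")    # 4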
See Also SPECjEnterprise2010 Results Page SPARC T5-8 Result Page at SPEC SPARC T5-8 Server oracle.com    OTN Sun Flash Accelerator F40 PCIe Card oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Oracle Fusion Middleware oracle.com    OTN Oracle WebLogic Suite oracle.com    OTN Disclosure Statement SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance Evaluation Corporation.  Results from www.spec.org as of 9/23/2013.  SPARC T5-8, 36,571.36 SPECjEnterprise2010 EjOPS (using Oracle VM for SPARC and 5+3 split); SPARC T5-8, 27,843.57 SPECjEnterprise2010 EjOPS (using Oracle Zones and 4+4 split); IBM Power 780, 10,902.30 SPECjEnterprise2010 EjOPS; IBM PowerLinux 7R2, 13,161.07 SPECjEnterprise2010 EjOPS. SPARC T5-8 server total hardware plus software list price is $1,487,792 from http://www.oracle.com as of 9/20/2013.  IBM Power 780 server total hardware plus software cost of $1,934,162 based on public pricing from http://www.ibm.com as of 5/22/2013. IBM PowerLinux 7R2 server total hardware plus software cost of $819,451 based on whywebsphere.com/2013/04/29/weblogic-12c-on-oracle-sparc-t5-8-delivers-half-the-transactions-per-core-at-double-the-cost-of-the-websphere-on-ibm-power7/  retrieved 9/20/2013.

Oracle produced a world record single-server SPECjEnterprise2010 benchmark result of 36,571.36 SPECjEnterprise2010 EjOPS using one of Oracle's SPARC T5-8 servers for both the application and...

Benchmark

SPARC T5-2 Server Beats x86 Server on Oracle Database Transparent Data Encryption

Database security is becoming increasingly important.  Oracle Database Advanced Security Transparent Data Encryption (TDE) stops would-be attackers from bypassing the database and reading sensitive information from storage by enforcing data-at-rest encryption in the database layer.  Oracle's SPARC T5-2 server outperformed x86 systems when running Oracle Database 12c with Transparent Data Encryption. The SPARC T5-2 server sustained more than 8.0 GB/sec of read bandwidth while decrypting using Transparent Data Encryption (TDE) in Oracle Database 12c. This was the bandwidth available on the system and matched the rate for querying the non-encrypted data. The SPARC T5-2 server achieves about 1.5x higher decryption rate per socket using Oracle Database 12c with TDE than a Sun Server X4-2 system. The SPARC T5-2 server achieves more than double the decryption rate per socket using Oracle Database 12c with TDE than a Sun Server X3-2 system. Performance Landscape Table of Size 250 GB Encrypted with AES-128-CFB Full Table Scan with Degree of Parallelism 128 System Chips Table Data Format SPARC T5-2 Advantage Clear Encrypted SPARC T5-2 2 8.4 GB/sec 8.3 GB/sec 1.0 Sun Server X4-2L 2 8.2 GB/sec 5.6 GB/sec 1.5   SPARC T5-2 1 8.4 GB/sec 4.2 GB/sec 1.0 Sun Server X4-2L 1 8.2 GB/sec 2.8 GB/sec 1.5 Sun Server X3-2L 1 8.2 GB/sec 2.0 GB/sec 2.1   Configuration Summary Systems Under Test: SPARC T5-2 2 x SPARC T5 processors, 3.6 GHz 256 GB memory Oracle Solaris 11.1 Oracle Database 12c Sun Server X3-2L 2 x Intel Xeon E5-2690 processor, 2.90 GHz 64 GB memory Oracle Solaris 11.1 Oracle Database 12c Sun Server X4-2L 2 x Intel Xeon E5-2697 v2 processor, 2.70 GHz 256 GB memory Oracle Solaris 11.1 Oracle Database 12c   Storage: Flash Storage   Benchmark Description The purpose of the benchmark is to show the query performance of a database using data encryption to keep the data secure. The benchmark creates a 250 GB table. It is loaded both into a clear text (no encryption) tablespace and an AES-128 encrypted tablespace. Full table scans of the tables were timed. Key Points and Best Practices The Oracle Database feature, Transparent Data Encryption (TDE), simplifies the encryption of data within datafiles, preventing unauthorized access to it from the operating system. Transparent Data Encryption allows encryption of the entire contents of a tablespace. With hardware acceleration of the encryption routines, the SPARC T5-2 server can achieve nearly the same query rate whether the table is encrypted or not up to a limit of about 4 GB/sec per chip. See Also SPARC T5-2 Server oracle.com    OTN Sun Server X4-2 oracle.com   OTN Oracle Solaris oracle.com    OTN Oracle Database – Transparent Data Encryption oracle.com    OTN Oracle Database oracle.com    OTN Disclosure Statement Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 23 September 2013.
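The per-socket comparisons in this post follow directly from the table: divide each encrypted scan rate by the number of chips used and compare against the SPARC T5 single-chip rate. A minimal sketch of that arithmetic (illustrative only):

# Encrypted full-table-scan rates (GB/sec) from the table above, keyed by (system, chips).
encrypted = {
    ("SPARC T5-2", 1):       4.2,
    ("Sun Server X4-2L", 1): 2.8,
    ("Sun Server X3-2L", 1): 2.0,
}

t5_per_socket = encrypted[("SPARC T5-2", 1)] / 1
for (system, chips), rate in encrypted.items():
    if system == "SPARC T5-2":
        continue
    print(f"SPARC T5-2 vs {system}: {t5_per_socket / (rate / chips):.1f}x per socket")
# -> about 1.5x vs the Sun Server X4-2L and 2.1x vs the Sun Server X3-2L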

Database security is becoming increasingly important.  Oracle Database Advanced Security Transparent Data Encryption (TDE) stops would-be attackers from bypassing the database and reading sensitive...

Benchmark

SPARC T5-8 Delivers World Record Oracle OLAP Perf Version 3 Benchmark Result on Oracle Database 12c

Oracle's SPARC T5-8 server delivered world record query performance for systems running Oracle Database 12c for the Oracle OLAP Perf Version 3 benchmark. The query throughput on the SPARC T5-8 server is 1.7x higher than that of an 8-chip Intel Xeon E7-8870 server. Both systems had sub-second average response times. The SPARC T5-8 server with the Oracle Database demonstrated the ability to support at least 700 concurrent users querying OLAP cubes (with no think time), processing 2.33 million analytic queries per hour with an average response time of less than 1 second per query. This performance was enabled by keeping the entire cube in memory, utilizing the 4 TB of memory on the SPARC T5-8 server. Assuming a 60 second think time between query requests, the SPARC T5-8 server can support approximately 39,450 concurrent users with the same sub-second response time. The workload uses a set of realistic Business Intelligence (BI) queries that run against an OLAP cube based on a 4 billion row fact table of sales data. The 4 billion rows are partitioned by month spanning 10 years. The combination of Oracle Database 12c with the Oracle OLAP option running on a SPARC T5-8 server supports live data updates occurring concurrently with minimally impacted user query executions. Performance Landscape Oracle OLAP Perf Version 3 Benchmark Oracle cube based on 4 billion fact table rows 10 years of data partitioned by month System Queries/ hour Users Average Response Time (sec) 0 sec think time 60 sec think time SPARC T5-8 2,329,000 700 39,450 <1 sec 8-chip Intel Xeon E7-8870 1,354,000 120 22,675 <1 sec Configuration Summary SPARC T5-8: 1 x SPARC T5-8 server with 8 x SPARC T5 processors, 3.6 GHz 4 TB memory Data Storage and Redo Storage Flash Storage Oracle Solaris 11.1 (11.1.8.2.0) Oracle Database 12c Release 1 (12.1.0.1) with Oracle OLAP option Sun Server X2-8: 1 x Sun Server X2-8 with 8 x Intel Xeon E7-8870 processors, 2.4 GHz 1 TB memory Data Storage and Redo Storage Flash Storage Oracle Solaris 10 10/12 Oracle Database 12c Release 1 (12.1.0.1) with Oracle OLAP option Benchmark Description The Oracle OLAP Perf Version 3 benchmark is a workload designed to demonstrate and stress the ability of the OLAP Option to deliver fast query, near real-time updates and rich calculations using a multi-dimensional model in the context of Oracle data warehousing. The bulk of the benchmark entails running a number of concurrent users, each issuing typical multidimensional queries against an Oracle cube. The cube has four dimensions: time, product, customer, and channel. Each query user issues approximately 150 different queries. One query chain may ask for total sales in a particular region (e.g. South America) for a particular time period (e.g. Q4 of 2010) followed by additional queries which drill down into sales for individual countries (e.g. Chile, Peru, etc.) with further queries drilling down into individual stores, etc. Another query chain may ask for yearly comparisons of total sales for some product category (e.g. major household appliances) and then issue further queries drilling down into particular products (e.g. refrigerators, stoves, etc.), particular regions, particular customers, etc. While the core of every OLAP Perf benchmark is real world query performance, the benchmark itself offers numerous execution options such as varying data set sizes, number of users, numbers of queries for any given user and cube update frequency. 
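The concurrent-user estimate for a 60-second think time follows from the measured query throughput via Little's Law (concurrent users ≈ throughput × (response time + think time)). The sketch below is an approximation that assumes an average response time of about one second, as reported above, and lands close to the ~39,450 users quoted.

# Little's Law: N ~ X * (R + Z).
queries_per_hour = 2_329_000          # measured SPARC T5-8 throughput from the table
X = queries_per_hour / 3600.0         # ~647 queries/sec
R = 1.0                               # assumed average response time in seconds ("< 1 sec")
Z = 60.0                              # think time between requests in seconds

print(f"Estimated concurrent users with 60 s think time: {X * (R + Z):,.0f}")
# roughly 39,460, consistent with the approximately 39,450 users stated above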
Version 3 of the benchmark is executed with a much larger number of query streams than previous versions and used a cube designed for near real-time updates. The results produced by version 3 of the benchmark are not directly comparable to results produced by previous versions of the benchmark. The near real-time update capability is implemented along the following lines. A large Oracle cube, H, is built from a 4 billion row star schema, containing data up until the end of last business day. A second small cube, D, is then created which will contain all of today's new data coming in from outside the world. It will be updated every L minutes with the data coming in within the last L minutes. A third cube, R, joins cubes H and D for reporting purposes much like a view might join data from two tables. Calculations are installed into cube R. The use of a reporting cube which draws data from different storage cubes is a common practice. Query users are never locked out of query operations while new data is added to the update cube. The point of the demonstration is to show that an Oracle OLAP system can be designed which results in data being no more than L minutes out of date, where L may be as low as just a few minutes. This is what is meant by near real-time analytics. Key Points and Best Practices Building and querying cubes with the Oracle OLAP option requires a large temporary tablespace. Normally temporary tablespaces would reside on disk storage. However, because the SPARC T5-8 server used in this benchmark had 4 TB of main memory, it was possible to use main memory for the OLAP temporary tablespace. This was accomplished by using a temporary, memory-based file system (TMPFS) for the temporary tablespace datafiles. Since typical business intelligence users are often likely to issue similar queries, either with the same or different constants in the where clauses, setting the init.ora parameter "cursor_sharing" to "force" provides for additional query throughput and a larger number of potential users. Assuming the normal Oracle Database initialization parameters (e.g. SGA, PGA, processes etc.) are appropriately set, out of the box performance for the Oracle OLAP workload should be close to what is reported here. Additional performance resulted from using memory for the OLAP temporary tablespace setting "cursor_sharing" to force. Oracle OLAP Cube update performance was optimized by running update processes in the FX class with a priority greater than 0. The maximum lag time between updates to the source fact table and data availability to query users (what was referred to as L in the benchmark description) was less than 3 minutes for the benchmark environment on the SPARC T5-8 server. See Also SPARC T5-8 Server oracle.com    OTN Oracle Database – Oracle OLAP oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database oracle.com    OTN Disclosure Statement Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 09/22/2013.

Oracle's SPARC T5-8 server delivered world record query performance for systems running Oracle Database 12c for the Oracle OLAP Perf Version 3 benchmark. The query throughput on the SPARC T5-8 server...

Benchmark

SPARC T5 Encryption Performance Tops Intel E5-2600 v2 Processor

The cryptography benchmark suite was developed by Oracle to measure security performance on important AES security modes. Oracle's SPARC T5 processor with its security software in silicon is faster than x86 servers that have the AES-NI instructions. In this test, the performance of on-processor encryption operations is measured (32 KB encryptions). Multiple threads are used to measure each processor's maximum throughput. The SPARC T5 processor shows dramatically faster encryption. A SPARC T5 processor running Oracle Solaris 11.1 is 2.7 times faster executing AES-CFB 256-bit key encryption (in cache) than the Intel E5-2697 v2 processor (with AES-NI) running Oracle Linux 6.3. AES-CFB encryption is used by Oracle Database for Transparent Data Encryption (TDE) which provides security for database storage. On AES-CFB 128-bit key encryption, the SPARC T5 processor is 2.5 times faster than the Intel E5-2697 v2 processor (with AES-NI) running Oracle Linux 6.3 for in-cache encryption. AES-CFB mode is used by Oracle Database for Transparent Data Encryption (TDE) which provides security for database storage. The IBM POWER7+ has three hardware security units for 8-core processors, but IBM has not publicly shown any measured performance results on AES-CFB or other encryption modes. Performance Landscape Presented below are results for running encryption using the AES cipher with the CFB, CBC, CCM and GCM modes for key sizes of 128, 192 and 256. Decryption performance was similar and is not presented. Results are presented as MB/sec (10**6). Encryption Performance – AES-CFB Performance is presented for in-cache AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run). AES-CFB Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CFB SPARC T5 3.60 2 54,396 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 19,960 Oracle Linux 6.3, IPP/AES-NI Intel E5-2690 2.90 2 12,823 Oracle Linux 6.3, IPP/AES-NI AES-192-CFB SPARC T5 3.60 2 61,000 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 23,217 Oracle Linux 6.3, IPP/AES-NI Intel E5-2690 2.90 2 14,928 Oracle Linux 6.3, IPP/AES-NI AES-128-CFB SPARC T5 3.60 2 68,695 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 27,740 Oracle Linux 6.3, IPP/AES-NI Intel E5-2690 2.90 2 17,824 Oracle Linux 6.3, IPP/AES-NI Encryption Performance – AES-GCM Performance is presented for in-cache AES-GCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run). 
AES-GCM Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-GCM SPARC T5 3.60 2 34,101 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 15,338 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 13,520 Oracle Linux 6.3, IPP/AES-NI AES-192-GCM SPARC T5 3.60 2 36,852 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 15,768 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 14,159 Oracle Linux 6.3, IPP/AES-NI AES-128-GCM SPARC T5 3.60 2 39,003 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 16,405 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 14,877 Oracle Linux 6.3, IPP/AES-NI Encryption Performance – AES-CCM Performance is presented for in-cache AES-CCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performance on 32 KB of pseudo-random data (same data for each run). AES-CCM Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CCM SPARC T5 3.60 2 29,431 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 19,447 Oracle Linux 6.3, IPP/AES-NI Intel E5-2690 2.90 2 12,493 Oracle Linux 6.3, IPP/AES-NI AES-192-CCM SPARC T5 3.60 2 33,715 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 22,634 Oracle Linux 6.3, IPP/AES-NI Intel E5-2690 2.90 2 14,507 Oracle Linux 6.3, IPP/AES-NI AES-128-CCM SPARC T5 3.60 2 39,188 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 26,951 Oracle Linux 6.3, IPP/AES-NI Intel E5-2690 2.90 2 17,256 Oracle Linux 6.3, IPP/AES-NI Encryption Performance – AES-CBC Performance is presented for in-cache AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performance on 32 KB of pseudo-random data (same data for each run). AES-CBC Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CBC SPARC T5 3.60 2 56,933 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 19,962 Oracle Linux 6.3, IPP/AES-NI Intel E5-2690 2.90 2 12,822 Oracle Linux 6.3, IPP/AES-NI AES-192-CBC SPARC T5 3.60 2 63,767 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 23,224 Oracle Linux 6.3, IPP/AES-NI Intel E5-2690 2.90 2 14,915 Oracle Linux 6.3, IPP/AES-NI AES-128-CBC SPARC T5 3.60 2 72,508 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2697 v2 2.70 2 27,733 Oracle Linux 6.3, IPP/AES-NI Intel E5-2690 2.90 2 17,823 Oracle Linux 6.3, IPP/AES-NI Configuration Summary SPARC T5-2 server 2 x SPARC T5 processor, 3.6 GHz 512 GB memory Oracle Solaris 11.1 SRU 4.2 Sun Server X4-2L server 2 x E5-2697 v2 processors, 2.70 GHz 256 GB memory Oracle Linux 6.3 Sun Server X3-2 server 2 x E5-2690 processors, 2.90 GHz 128 GB memory Oracle Linux 6.3 Benchmark Description The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache (32 KB encryptions) and on-chip using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CCM, AES-192-CCM, AES-256-CCM, AES-128-GCM, AES-192-GCM and AES-256-GCM. The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance. 
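For readers who want a feel for this kind of measurement, the sketch below times in-memory AES-CFB encryption of a repeated 32 KB buffer using Python's cryptography package. It is only a rough, single-threaded illustration and is not the suite used for the results above, which relied on each platform's optimized native libraries (libsoftcrypto on Oracle Solaris, IPP/AES-NI on Oracle Linux) and many concurrent threads.

import os
import time
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def aes_cfb_throughput(key_bits, buf_size=32 * 1024, duration=5.0):
    """Rough single-threaded AES-CFB encryption rate in MB/sec (decimal MB)."""
    key = os.urandom(key_bits // 8)
    iv = os.urandom(16)                      # AES block size is 16 bytes
    data = os.urandom(buf_size)              # 32 KB of pseudo-random data, reused each pass
    enc = Cipher(algorithms.AES(key), modes.CFB(iv), backend=default_backend()).encryptor()
    done = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration:
        enc.update(data)
        done += buf_size
    return done / (time.perf_counter() - start) / 1e6

if __name__ == "__main__":
    for bits in (128, 192, 256):
        print(f"AES-{bits}-CFB: {aes_cfb_throughput(bits):,.0f} MB/sec (one thread)")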
See Also More about AES SPARC T5-2 Server oracle.com    OTN Sun Server X4-2L oracle.com   OTN Sun Server X3-2 oracle.com   OTN Oracle Solaris oracle.com    OTN Disclosure Statement Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/23/2013.


Sun Server X4-2 Delivers Single App Server, 2-Chip x86 World Record SPECjEnterprise2010

Oracle's Sun Server X4-2 and Sun Server X4-2L servers, using the Intel Xeon E5-2697 v2 processor, produced a world record x86 two-chip single application server SPECjEnterprise2010 benchmark result of 11,259.88 SPECjEnterprise2010 EjOPS. The Sun Server X4-2 ran the application tier and the Sun Server X4-2L was used for the database tier. The 2-socket Sun Server X4-2 demonstrated 16% better performance when compared to the 2-socket IBM X3650 M4 server result of 9,696.43 SPECjEnterprise2010 EjOPS. This result used Oracle WebLogic Server 12c, Java HotSpot(TM) 64-Bit Server 1.7.0_40, Oracle Database 12c, and Oracle Linux. Performance Landscape Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results. The table below shows the top single application server, two-chip x86 results. SPECjEnterprise2010 Performance Chart as of 9/22/2013 Submitter EjOPS* Application Server Database Server Oracle 11,259.88 1x Sun Server X4-2 2x 2.7 GHz Intel Xeon E5-2697 v2 Oracle WebLogic 12c (12.1.2) 1x Sun Server X4-2L 2x 2.7 GHz Intel Xeon E5-2697 v2 Oracle Database 12c (12.1.0.1) IBM 9,696.43 1x IBM X3650 M4 2x 2.9 GHz Intel Xeon E5-2690 WebSphere Application Server V8.5 1x IBM X3650 M4 2x 2.9 GHz Intel Xeon E5-2690 IBM DB2 10.1 Oracle 8,310.19 1x Sun Server X3-2 2x 2.9 GHz Intel Xeon E5-2690 Oracle WebLogic 11g (10.3.6) 1x Sun Server X3-2L 2x 2.9 GHz Intel Xeon E5-2690 Oracle Database 11g (11.2.0.3) * SPECjEnterprise2010 EjOPS, bigger is better. Configuration Summary Application Server: 1 x Sun Server X4-2


SPARC T5-2 Delivers Best 2-Chip MultiJVM SPECjbb2013 Result

From www.spec.org Defects Identified in SPECjbb®2013 December 9, 2014 - SPEC has identified a defect in its SPECjbb®2013 benchmark suite. SPEC has suspended sales of the benchmark software and is no longer accepting new submissions of SPECjbb®2013 results for publication on SPEC's website. Current SPECjbb®2013 licensees will receive a free copy of the new version of the benchmark when it becomes available. SPEC is advising SPECjbb®2013 licensees and users of the SPECjbb®2013 metrics that the recently discovered defect impacts the comparability of results. This defect can significantly impact the amount of work done during the measurement period, resulting in an inflated SPECjbb®2013 metric. SPEC recommends that users not utilize these results for system comparisons without a full understanding of the impact of these defects on each benchmark result. Additional information is available here. SPECjbb2013 is a new benchmark designed to show modern Java server performance. Oracle's SPARC T5-2 set a world record as the fastest two-chip system, beating just-introduced two-chip x86-based servers. Oracle, using Oracle Solaris and Oracle JDK, delivered this two-chip world record result on the MultiJVM SPECjbb2013 metric. SPECjbb2013 is the replacement for SPECjbb2005 (SPECjbb2005 will soon be retired by SPEC). Oracle's SPARC T5-2 server achieved 81,084 SPECjbb2013-MultiJVM max-jOPS and 39,129 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark. This result is a two-chip world record. There are no IBM POWER7 or POWER7+ based server results on the SPECjbb2013 benchmark. IBM has published results for its POWER7+ based servers on SPECjbb2005, which will soon be retired by SPEC. The 2-chip SPARC T5-2 server running SPECjbb2013 is 30% faster than the 2-chip Cisco UCS B200 M3 server (2.7 GHz E5-2697 v2 Ivy Bridge-based) based on SPECjbb2013-MultiJVM max-jOPS. The 2-chip SPARC T5-2 server running SPECjbb2013 is 66% faster than the 2-chip Cisco UCS B200 M3 server (2.7 GHz E5-2697 v2 Ivy Bridge-based) based on SPECjbb2013-MultiJVM critical-jOPS. These results were obtained using Oracle Solaris 11 along with Java Platform, Standard Edition, JDK 7 Update 40 on the SPARC T5-2 server. From SPEC's press release, "SPECjbb2013 replaces SPECjbb2005. The new benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is expected to be used widely by all those interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community." Performance Landscape Results of SPECjbb2013 from www.spec.org as of September 22, 2013 and this report. SPECjbb2013 System Processor SPECjbb2013-MultiJVM JDK type # max-jOPS critical-jOPS SPARC T5-2 SPARC T5, 3.6 GHz 2 81,084 39,129 Oracle JDK 7u40 Cisco UCS B200 M3, DDR3-1866 Intel E5-2697 v2, 2.7 GHz 2 62,393 23,505 Oracle JDK 7u40 Sun Server X4-2, DDR3-1600 Intel E5-2697 v2, 2.7 GHz 2 52,664 20,553 Oracle JDK 7u40 Cisco UCS C220 M3 Intel E5-2690, 2.9 GHz 2 41,954 16,545 Oracle JDK 7u11 The above table represents all of the published results on www.spec.org. SPEC allows for self-publication of SPECjbb2013 results. See below for locations where full reports were made available. 
Configuration Summary System Under Test: SPARC T5-2 server 2 x SPARC T5, 3.60 GHz 512 GB memory (32 x 16 GB dimms) Oracle Solaris 11.1 Oracle JDK 7 Update 40 Benchmark Description The SPECjbb2013 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community. SPECjbb2013 replaces SPECjbb2005. New features include: A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations. Both a pure throughput metric and a metric that measures critical throughput under service-level agreements (SLAs) specifying response times ranging from 10ms to 500ms. Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers. Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security. Support for virtualization and cloud environments. See Also SPEC website SPARC T5-2 Server oracle.com    OTN Sun Server X4-2 oracle.com   OTN Oracle Solaris oracle.com    OTN Java oracle.com    OTN Disclosure Statement SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of 9/23/2013, see http://www.spec.org for more information. SPARC T5-2 81,084 SPECjbb2013-MultiJVM max-jOPS, 39,129 SPECjbb2013-MultiJVM critical-jOPS, result from https://blogs.oracle.com/BestPerf/resource/jbb2013/sparct5-922.pdf Cisco UCS B200 M3 62,393 SPECjbb2013-MultiJVM max-jOPS, 23,505 SPECjbb2013-MultiJVM critical-jOPS, result from http://www.cisco.com/en/US/prod/collateral/ps10265/le_41704_pb_specjbb2013b200.pdf; Sun Server X4-2 52,664 SPECjbb2013-MultiJVM max-jOPS, 20,553 SPECjbb2013-MultiJVM critical-jOPS, result from https://blogs.oracle.com/BestPerf/entry/20130918_x4_2_specjbb2013; Cisco UCS C220 M3 41,954 SPECjbb2013-MultiJVM max-jOPS, 16,545 SPECjbb2013-MultiJVM critical-jOPS result from www.spec.org.
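To make the relationship between the two SPECjbb2013 metrics more concrete: max-jOPS is the highest injection rate the system sustains at all, while critical-jOPS reflects throughput under the response-time SLAs mentioned above (the benchmark specifies SLAs between 10 ms and 500 ms). The sketch below is illustrative only: the per-SLA throughputs are hypothetical numbers, and the geometric-mean aggregation is a simplifying assumption rather than the exact SPEC run rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative-only sketch of how a raw throughput metric (max-jOPS) and an
// SLA-constrained metric (critical-jOPS) can diverge on the same system.
// SLA points and geometric-mean aggregation are simplifying assumptions.
public class Jbb2013Metrics {
    public static void main(String[] args) {
        // Hypothetical measurements: highest jOPS sustained while meeting the
        // given response-time SLA (milliseconds -> jOPS).
        Map<Integer, Double> jopsUnderSla = new LinkedHashMap<>();
        jopsUnderSla.put(10, 30_000.0);
        jopsUnderSla.put(50, 45_000.0);
        jopsUnderSla.put(100, 60_000.0);
        jopsUnderSla.put(200, 72_000.0);
        jopsUnderSla.put(500, 80_000.0);

        double maxJops = 81_000.0;   // highest rate sustained with no SLA at all

        // Aggregate the SLA-constrained points with a geometric mean.
        double logSum = 0.0;
        for (double v : jopsUnderSla.values()) logSum += Math.log(v);
        double criticalJops = Math.exp(logSum / jopsUnderSla.size());

        System.out.printf("max-jOPS      = %.0f%n", maxJops);
        System.out.printf("critical-jOPS = %.0f%n", criticalJops);
    }
}
```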


Sun Server X4-2 Performance Running SPECjbb2013 MultiJVM Benchmark

From www.spec.org Defects Identified in SPECjbb®2013 December 9, 2014 - SPEC has identified a defect in its SPECjbb®2013 benchmark suite. SPEC has suspended sales of the benchmark software and is no longer accepting new submissions of SPECjbb®2013 results for publication on SPEC's website. Current SPECjbb®2013 licensees will receive a free copy of the new version of the benchmark when it becomes available. SPEC is advising SPECjbb®2013 licensees and users of the SPECjbb®2013 metrics that the recently discovered defect impacts the comparability of results. This defect can significantly impact the amount of work done during the measurement period, resulting in an inflated SPECjbb®2013 metric. SPEC recommends that users not utilize these results for system comparisons without a full understanding of the impact of these defects on each benchmark result. Additional information is available here. Oracle's Sun Server X4-2 system, using Oracle Solaris and Oracle JDK, produced a SPECjbb2013 benchmark (MultiJVM metric) result. This benchmark was designed by the industry to showcase Java server performance. The Sun Server X4-2 system is 24% faster than the fastest published Intel Xeon E5-2600 (Sandy Bridge) based two socket system's (Dell PowerEdge R720's) SPECjbb2013-MultiJVM max-jOPS. The Sun Server X4-2 is 22% faster than the fastest published Intel Xeon E5-2600 (Sandy Bridge) based two socket system's (Dell PowerEdge R720's) SPECjbb2013-MultiJVM critical-jOPS. The Sun Server X4-2 runs SPECjbb2013 (MultiJVM metric) at 70% of the published T5-2 SPECjbb2013-MultiJVM max-jOPS. The Sun Server X4-2 runs SPECjbb2013 (MultiJVM metric) at 88% of the published T5-2 SPECjbb2013-MultiJVM critical-jOPS. The combination of Oracle Solaris 11.1 and Oracle JDK 7 update 40 delivered a result of 52,664 SPECjbb2013-MultiJVM max-jOPS and 20,553 SPECjbb2013-MultiJVM critical-jOPS on the SPECjbb2013 benchmark. From SPEC's press release, "SPECjbb2013 replaces SPECjbb2005. The new benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is expected to be used widely by all those interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community." Performance Landscape Top two-socket results of SPECjbb2013 MultiJVM as of October 8, 2013. SPECjbb2013 System Processor DDR3 SPECjbb2013-MultiJVM OS JDK max-jOPS critical-jOPS SPARC T5-2 2 x 3.6 GHz SPARC T5 1600 75,658 23,334 Solaris 11.1 7u17 Cisco UCS B200 M3 2 x 2.7 GHz Intel E5-2697 v2 1866 62,393 23,505 RHEL 6.4 7u40 Sun Server X4-2 2 x 2.7 GHz Intel E5-2697 v2 1600 52,664 20,553 Solaris 11.1 7u40 Dell PowerEdge R720 2 x 2.9 GHz Intel Xeon E5-2690 1600 42,431 16,779 RHEL 6.4 7u21 The above table includes published results from www.spec.org. Configuration Summary System Under Test: Sun Server X4-2 2 x Intel E5-2697 v2, 2.7 GHz Hyper-Threading enabled Turbo Boost enabled 128 GB memory (16 x 8 GB dimms) Oracle Solaris 11.1 (11.1.4.2.0) Oracle JDK 7u40 Benchmark Description The SPECjbb2013 benchmark has been developed from the ground up to measure performance based on the latest Java application features. It is relevant to all audiences who are interested in Java server performance, including JVM vendors, hardware developers, Java application developers, researchers and members of the academic community. SPECjbb2013 replaces SPECjbb2005. 
New features include: A usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases and data-mining operations. Both a pure throughput metric and a metric that measures critical throughput under service-level agreements (SLAs) specifying response times ranging from 10ms to 500ms. Support for multiple run configurations, enabling users to analyze and overcome bottlenecks at multiple layers of the system stack, including hardware, OS, JVM and application layers. Exercising new Java 7 features and other important performance elements, including the latest data formats (XML), communication using compression, and messaging with security. Support for virtualization and cloud environments. See Also SPEC website Sun Server X4-2 oracle.com   OTN SPARC T5-2 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Java oracle.com    OTN Disclosure Statement SPEC and the benchmark name SPECjbb are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results from http://www.spec.org as of 10/8/2013. SPARC T5-2, 75,658 SPECjbb2013-MultiJVM max-jOPS, 23,334 SPECjbb2013-MultiJVM critical-jOPS; Cisco UCS B200 M3 62,393 SPECjbb2013-MultiJVM max-jOPS, 23,505 SPECjbb2013-MultiJVM critical-jOPS; Dell PowerEdge R720 42,431 SPECjbb2013-MultiJVM max-jOPS, 16,779 SPECjbb2013-MultiJVM critical-jOPS; Sun Server X4-2 52,664 SPECjbb2013-MultiJVM max-jOPS, 20,553 SPECjbb2013-MultiJVM critical-jOPS.


Oracle ZFS Storage ZS3-4 Delivers World Record SPC-2 Performance

The Oracle Storage ZS3-4 storage system delivered a world record performance result for the SPC-2 benchmark along with excellent price-performance. The Oracle Storage ZS3-4 storage system delivered an overall score of 17,244.22 SPC-2 MBPS™ and a SPC-2 price-performance of $22.53 on the SPC-2 benchmark. This is over a 1.6X generational improvement in performance and over a 1.5X generational improvement in price-performance over Oracle's previous Sun ZFS Storage 7420 SPC-2 benchmark result. The Oracle ZFS Storage ZS3-4 storage system has 6.8X better overall throughput and nearly 1.2X better price-performance than the IBM DS3524 Express Turbo, which is IBM's best overall price-performance score on the SPC-2 benchmark. The Oracle ZFS Storage ZS3-4 storage system has over 1.1X better overall throughput and 5.8X better price-performance than the IBM DS8870, which is IBM's best overall performance score on the SPC-2 benchmark. The Oracle ZFS Storage ZS3-4 storage system has over 1.3X better overall throughput and 3.9X better price-performance than the HP StorageWorks P9500 XP Disk Array on the SPC-2 benchmark. Performance Landscape SPC-2 Performance Chart (in decreasing performance order) System SPC-2 MB/s $/SPC-2 MB/s ASU Capacity (GB) TSC Price Data Protection Level Date Results Identifier Oracle ZFS Storage ZS3-4 17,244.22 $22.53 31,611 $388,472 Mirroring 09/10/13 B00067 Fujitsu DX8700 S2 16,039 $79.51 71,404 $1,275,163 Mirroring 12/03/12 B00063 IBM DS8870 15,424 $131.21 30,924 $2,023,742 RAID-5 10/03/12 B00062 IBM SAN VC v6.4 14,581 $129.14 74,492 $1,883,037 RAID-5 08/01/12 B00061 NEC Storage M700 14,409 $25.13 53,550 $361,613 Mirroring 08/19/12 B00066 Hitachi VSP 13,148 $95.38 129,112 $1,254,093 RAID-5 07/27/12 B00060 HP StorageWorks P9500 13,148 $88.34 129,112 $1,161,504 RAID-5 03/07/12 B00056 Sun ZFS Storage 7420 10,704 $35.24 31,884 $377,225 Mirroring 04/12/12 B00058 IBM DS8800 9,706 $270.38 71,537 $2,624,257 RAID-5 12/01/10 B00051 HP XP24000 8,725 $187.45 18,401 $1,635,434 Mirroring 09/08/08 B00035 SPC-2 MB/s = the Performance Metric $/SPC-2 MB/s = the Price-Performance Metric ASU Capacity = the Capacity Metric Data Protection = Data Protection Metric TSC Price = Total Cost of Ownership Metric Results Identifier = A unique identification of the result Metric SPC-2 Price-Performance Chart (in increasing price-performance order) System SPC-2 MB/s $/SPC-2 MB/s ASU Capacity (GB) TSC Price Data Protection Level Date Results Identifier SGI InfiniteStorage 5600 8,855.70 $15.97 28,748 $141,393 RAID6 03/06/13 B00065 Oracle ZFS Storage ZS3-4 17,244.22 $22.53 31,611 $388,472 Mirroring 09/10/13 B00067 Sun Storage J4200 548.80 $22.92 11,995 $12,580 Unprotected 07/10/08 B00033 NEC Storage M700 14,409 $25.13 53,550 $361,613 Mirroring 08/19/12 B00066 Sun Storage J4400 887.44 $25.63 23,965 $22,742 Unprotected 08/15/08 B00034 Sun StorageTek 2530 672.05 $26.15 1,451 $17,572 RAID5 08/16/07 B00026 Sun StorageTek 2530 663.51 $26.48 854 $17,572 Mirroring 08/16/07 B00025 Fujitsu ETERNUS DX80 1,357.55 $26.70 4,681 $36,247 Mirroring 03/15/10 B00050 IBM DS3524 Express Turbo 2,510 $26.76 14,374 $67,185 RAID-5 12/31/10 B00053 Fujitsu ETERNUS DX80 S2 2,685.50 $28.48 17,231 $76,475 Mirroring 08/19/11 B00055 SPC-2 MB/s = the Performance Metric $/SPC-2 MB/s = the Price-Performance Metric ASU Capacity = the Capacity Metric Data Protection = Data Protection Metric TSC Price = Total Cost of Ownership Metric Results Identifier = A unique identification of the result Metric Complete SPC-2 benchmark results may be found at 
http://www.storageperformance.org/results/benchmark_results_spc2. Configuration Summary Storage Configuration: Oracle ZFS Storage ZS3-4 storage system in clustered configuration 2 x Oracle ZFS Storage ZS3-4 controllers, each with 4 x 2.4 GHz 10-core Intel Xeon processors 1024 GB memory 16 x Sun Disk shelves, each with 24 x 300 GB 15K RPM SAS-2 drives Benchmark Description SPC Benchmark-2 (SPC-2): Consists of three distinct workloads designed to demonstrate the performance of a storage subsystem during the execution of business critical applications that require the large-scale, sequential movement of data. Those applications are characterized predominately by large I/Os organized into one or more concurrent sequential patterns. A description of each of the three SPC-2 workloads is listed below as well as examples of applications characterized by each workload. Large File Processing: Applications in a wide range of fields, which require simple sequential process of one or more large files such as scientific computing and large-scale financial processing. Large Database Queries: Applications that involve scans or joins of large relational tables, such as those performed for data mining or business intelligence. Video on Demand: Applications that provide individualized video entertainment to a community of subscribers by drawing from a digital film library. SPC-2 is built to: Provide a level playing field for test sponsors. Produce results that are powerful and yet simple to use. Provide value for engineers as well as IT consumers and solution integrators. Is easy to run, easy to audit/verify, and easy to use to report official results. See Also Oracle ZFS Storage ZS3-4 SPC-2 Executive Summary storageperformance.org Complete Oracle ZFS Storage ZS3-4 SPC-2 Full Disclosure Report storageperformance.org Storage Performance Council (SPC) Home Page Oracle ZFS Storage ZS3-4 oracle.com    OTN Disclosure Statement SPC-2 and SPC-2 MBPS are registered trademarks of Storage Performance Council (SPC). Results as of September 10, 2013, for more information see www.storageperformance.org. Oracle ZFS Storage ZS3-4 B00067, Fujitsu ET 8700 S2 B00063, IBM DS8870 B00062, IBM S.V.C 6.4 B00061, NEC Storage M700 B00066, Hitachi VSP B00060, HP P9500 XP Disk Array B00056, IBM DS8800 B00051.
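For reference, the price-performance column in the charts above is simply the total system price divided by the throughput metric. A minimal sketch checking the Oracle ZFS Storage ZS3-4 row, using the values from the table above:

```java
// Quick check of the SPC-2 price-performance metric for the ZS3-4 row above:
// $/SPC-2 MB/s = TSC Price / SPC-2 MB/s.
public class Spc2PricePerf {
    public static void main(String[] args) {
        double tscPrice = 388_472.0;     // TSC Price (USD) from the chart
        double spc2Mbps = 17_244.22;     // SPC-2 MBPS result from the chart
        System.out.printf("$/SPC-2 MB/s = %.2f%n", tscPrice / spc2Mbps);
        // prints 22.53, matching the published price-performance
    }
}
```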


Oracle ZFS Storage ZS3-4 Produces Best 2-Node Performance on SPECsfs2008 NFSv3

The Oracle ZFS Storage ZS3-4 storage system delivered world record two-node performance on the SPECsfs2008 NFSv3 benchmark, beating results published on NetApp's dual-controller and four-node high-end FAS6240 storage systems. The Oracle ZFS Storage ZS3-4 storage system delivered a world record two-node result of 450,702 SPECsfs2008_nfs.v3 Ops/sec with an Overall Response Time (ORT) of 0.70 msec on the SPECsfs2008 NFSv3 benchmark. The Oracle ZFS Storage ZS3-4 storage system delivered 2.4x higher throughput than the dual-controller NetApp FAS6240 and 4.5x higher throughput than the dual-controller NetApp FAS3270 on the SPECsfs2008_nfs.v3 benchmark at less than half the list price of either result. The Oracle ZFS Storage ZS3-4 storage system had 42 percent higher throughput than the four-node NetApp FAS6240 on the SPECsfs2008 NFSv3 benchmark. The Oracle ZFS Storage ZS3-4 storage system has 54 percent better Overall Response Time than the four-node NetApp FAS6240 on the SPECsfs2008 NFSv3 benchmark. Performance Landscape Two-node results for SPECsfs2008_nfs.v3 are presented (in decreasing SPECsfs2008_nfs.v3 Ops/sec order) along with other select results. Sponsor System Nodes Disks Throughput (Ops/sec) Overall Response Time (msec) Oracle ZS3-4 2 464 450,702 0.70 IBM SONAS 1.2 2 1975 403,326 3.23 NetApp FAS6240 4 288 260,388 1.53 NetApp FAS6240 2 288 190,675 1.17 EMC VG8   312 135,521 1.92 Oracle 7320 2 136 134,140 1.51 EMC NS-G8   100 110,621 2.32 NetApp FAS3270 2 360 101,183 1.66 Throughput SPECsfs2008_nfs.v3 Ops/sec — the Performance Metric Overall Response Time — the corresponding Response Time Metric Nodes — Nodes and Controllers are being used interchangeably Complete SPECsfs2008 benchmark results may be found at http://www.spec.org/sfs2008/results/sfs2008.html. Configuration Summary Storage Configuration: Oracle ZFS Storage ZS3-4 storage system in clustered configuration 2 x Oracle ZFS Storage ZS3-4 controllers, each with 8 x 2.4 GHz Intel Xeon E7-4870 processors 2 TB memory 2 x 10GbE NICs 20 x Sun Disk shelves 18 x shelves with 24 x 300 GB 15K RPM SAS-2 drives 2 x shelves with 20 x 300 GB 15K RPM SAS-2 drives and 8 x 73 GB SAS-2 flash-enabled write-cache Benchmark Description SPECsfs2008 is the latest version of the Standard Performance Evaluation Corporation (SPEC) benchmark suite measuring file server throughput and response time, providing a standardized method for comparing performance across different vendor platforms. SPECsfs2008 results summarize the server's capabilities with respect to the number of operations that can be handled per second, as well as the overall latency of the operations. The suite is a follow-on to the SFS97_R1 benchmark, adding a CIFS workload, an updated NFSv3 workload, support for additional client platforms, and a new test harness and reporting/submission framework. See Also Standard Performance Evaluation Corporation (SPEC) Home Page Oracle ZFS Storage ZS3-4 oracle.com    OTN Disclosure Statement SPEC and SPECsfs are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of September 10, 2013, for more information see www.spec.org. Oracle ZFS Storage ZS3-4 Appliance 450,702 SPECsfs2008_nfs.v3 Ops/sec, 0.70 msec ORT, NetApp Data ONTAP 8.1 Cluster-Mode (4-node FAS6240) 260,388 SPECsfs2008_nfs.v3 Ops/Sec, 1.53 msec ORT, NetApp FAS6240 190,675 SPECsfs2008_nfs.v3 Ops/Sec, 1.17 msec ORT. NetApp FAS3270 101,183 SPECsfs2008_nfs.v3 Ops/Sec, 1.66 msec ORT. 
Nodes refer to the item in the SPECsfs2008 disclosed Configuration Bill of Materials that has the Processing Elements that perform the NFS Processing Function. These are the first item listed in each of the disclosed Configuration Bills of Materials, except for EMC, where it is both the first and third items listed, and HP, where it is the second item, listed as Blade Servers. The number of nodes is from the QTY disclosed in the Configuration Bill of Materials as described above. The Configuration Bill of Materials list price for the Oracle result is US$ 423,644. The Configuration Bill of Materials list price for the NetApp FAS3270 result is US$ 1,215,290. The Configuration Bill of Materials list price for the NetApp FAS6240 result is US$ 1,028,118. Oracle pricing from https://shop.oracle.com/pls/ostore/f?p=dstore:home:0, traverse to "Storage and Tape" and then to "NAS Storage". NetApp's pricing from http://www.netapp.com/us/media/na-list-usd-netapp-custom-state-new-discounts.html.


Oracle ZFS Storage ZS3-2 Beats Comparable NetApp on SPECsfs2008 NFSv3

Oracle ZFS Storage ZS3-2 storage system delivered outstanding performance on the SPECsfs2008 NFSv3 benchmark, beating results published on NetApp's fastest midrange platform, the NetApp FAS3270, the NetApp FAS6240 and the EMC Gateway NS-G8 Server Failover Cluster. The Oracle ZFS Storage ZS3-2 storage system delivered 210,535 SPECsfs2008_nfs.v3 Ops/sec with an Overall Response Time (ORT) of 1.12 msec on the SPECsfs2008 NFSv3 benchmark. The Oracle ZFS Storage ZS3-2 storage system delivered 10% higher throughput than the NetApp FAS6240 on the SPECsfs2008 NFSv3 benchmark. The Oracle ZFS Storage ZS3-2 storage system has 52% higher throughput than the NetApp FAS3270 on the SPECsfs2008 NFSv3 benchmark. The Oracle ZFS Storage ZS3-2 storage system has 5% better Overall Response Time than the NetApp FAS6240 on the SPECsfs2008 NFSv3 benchmark. The Oracle ZFS Storage ZS3-2 storage system has 33% better Overall Response Time than the NetApp FAS3270 on the SPECsfs2008 NFSv3 benchmark. Performance Landscape Results for SPECsfs2008 NFSv3 (in decreasing SPECsfs2008_nfs.v3 Ops/sec order) for competitive systems. Sponsor System Throughput (Ops/sec) Overall Response Time (msec) Oracle ZS3-2 210,535 1.12 NetApp FAS6240 190,675 1.17 EMC VG8 135,521 1.92 EMC NS-G8 110,621 2.32 NetApp FAS3270 101,183 1.66 NetApp FAS3250 100,922 1.76 Throughput SPECsfs2008_nfs.v3 Ops/sec = the Performance Metric Overall Response Time = the corresponding Response Time Metric Complete SPECsfs2008 benchmark results may be found at http://www.spec.org/sfs2008/results/sfs2008.html. Configuration Summary Storage Configuration: Oracle ZFS Storage ZS3-2 storage system in clustered configuration 2 x Oracle ZFS Storage ZS3-2 controllers, each with 4 x 2.1 GHz Intel Xeon E5-2658 processors 512 GB memory 8 x Sun Disk shelves 3 x shelves with 24 x 900 GB 10K RPM SAS-2 drives 3 x shelves with 20 x 900 GB 10K RPM SAS-2 drives 2 x shelves with 20 x 900 GB 10K RPM SAS-2 drives and 4 x 73 GB SAS-2 flash-enabled write-cache Benchmark Description SPECsfs2008 is the latest version of the Standard Performance Evaluation Corporation (SPEC) benchmark suite measuring file server throughput and response time, providing a standardized method for comparing performance across different vendor platforms. SPECsfs2008 results summarize the server's capabilities with respect to the number of operations that can be handled per second, as well as the overall latency of the operations. The suite is a follow-on to the SFS97_R1 benchmark, adding a CIFS workload, an updated NFSv3 workload, support for additional client platforms, and a new test harness and reporting/submission framework. See Also Standard Performance Evaluation Corporation (SPEC) Home Page Oracle ZFS Storage ZS3-2 oracle.com    OTN Disclosure Statement SPEC and SPECsfs are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of September 10, 2013, for more information see www.spec.org. Oracle ZFS Storage ZS3-2 Appliance 210,535 SPECsfs2008_nfs.v3 Ops/sec, 1.12 msec ORT, NetApp FAS6240 190,675 SPECsfs2008_nfs.v3 Ops/Sec, 1.17 msec ORT, EMC Celerra VG8 Server Failover Cluster, 2 Data Movers (1 stdby) / Symmetrix VMAX 135,521 SPECsfs2008_nfs.v3 Ops/Sec,  1.92 msec ORT, EMC Celerra Gateway NS-G8 Server Failover Cluster, 3 Datamovers (1 stdby) / Symmetrix V-Max 110,621 SPECsfs2008_nfs.v3 Ops/Sec, 2.32 msec ORT. NetApp FAS3270 101,183 SPECsfs2008_nfs.v3 Ops/Sec, 1.66 msec ORT. NetApp FAS3250 100,922 SPECsfs2008_nfs.v3 Ops/Sec, 1.76 msec ORT.


SPARC T5-4 Produces World Record Single Server TPC-H @3000GB Benchmark Result

Oracle's SPARC T5-4 server delivered world record single server performance of 409,721 QphH@3000GB with price/performance of $3.94/QphH@3000GB on the TPC-H @3000GB benchmark. This result shows that the 4-chip SPARC T5-4 server is significantly faster than the 8-chip server results from IBM (POWER7 based) and HP (Intel x86 based). This result demonstrates a complete data warehouse solution that shows the performance both of individual and concurrent query processing streams, faster loading, and refresh of the data during business operations. The SPARC T5-4 server delivers superior  performance and cost efficiency when compared to the IBM POWER7 result. The SPARC T5-4 server with four SPARC T5 processors is 2.1 times faster than the IBM Power 780 server with eight POWER7 processors and 2.5 times faster than the HP ProLiant DL980 G7 server with eight x86 processors on the TPC-H @3000GB benchmark. The SPARC T5-4 server also delivered better performance per core than these eight processor systems from IBM and HP. The SPARC T5-4 server with four SPARC T5 processors is 2.1 times faster than the IBM Power 780 server with eight POWER7 processors on the TPC-H @3000GB benchmark. The SPARC T5-4 server costs 38% less per $/QphH@3000GB compared to the IBM Power 780 server with the TPC-H @3000GB benchmark. The SPARC T5-4 server took 2 hours, 6 minutes, 4 seconds for data loading while the IBM Power 780 server took 2.8 times longer. The SPARC T5-4 server executed the first refresh function (RF1) in 19.4 seconds, the IBM Power 780 server took 7.6 times longer. The SPARC T5-4 server with four SPARC T5 processors is 2.5 times faster than the HP ProLiant DL980 G7 server with the same number of cores on the TPC-H @3000GB benchmark. The SPARC T5-4 server took 2 hours, 6 minutes, 4 seconds for data loading while the HP ProLiant DL980 G7 server took 4.1 times longer. The SPARC T5-4 server executed the first refresh function (RF1) in 19.4 seconds, the HP ProLiant DL980 G7 server took 8.9 times longer. The SPARC T5-4 server delivered 6% better performance than the SPARC Enterprise M9000-64 server and 2.1 times better than the SPARC Enterprise M9000-32 server on the TPC-H @3000GB benchmark. Performance Landscape The table lists the leading TPC-H @3000GB results for non-clustered systems. TPC-H @3000GB, Non-Clustered Systems System Processor P/C/T – Memory Composite (QphH) $/perf ($/QphH) Power (QppH) Throughput (QthH) Database Available SPARC T5-4 3.6 GHz SPARC T5 4/64/512 – 2048 GB 409,721.8 $3.94 345,762.7 485,512.1 Oracle 11g R2 09/24/13 SPARC Enterprise M9000 3.0 GHz SPARC64 VII+ 64/256/256 – 1024 GB 386,478.3 $18.19 316,835.8 471,428.6 Oracle 11g R2 09/22/11 SPARC T4-4 3.0 GHz SPARC T4 4/32/256 – 1024 GB 205,792.0 $4.10 190,325.1 222,515.9 Oracle 11g R2 05/31/12 SPARC Enterprise M9000 2.88 GHz SPARC64 VII 32/128/256 – 512 GB 198,907.5 $15.27 182,350.7 216,967.7 Oracle 11g R2 12/09/10 IBM Power 780 4.1 GHz POWER7 8/32/128 – 1024 GB 192,001.1 $6.37 210,368.4 175,237.4 Sybase 15.4 11/30/11 HP ProLiant DL980 G7 2.27 GHz Intel Xeon X7560 8/64/128 – 512 GB 162,601.7 $2.68 185,297.7 142,685.6 SQL Server 2008 10/13/10 P/C/T = Processors, Cores, Threads QphH = the Composite Metric (bigger is better) $/QphH = the Price/Performance metric in USD (smaller is better) QppH = the Power Numerical Quantity QthH = the Throughput Numerical Quantity The following table lists data load times and refresh function times during the power run. 
TPC-H @3000GB, Non-Clustered Systems Database Load & Database Refresh System Processor Data Loading (h:m:s) T5 Advan RF1 (sec) T5 Advan RF2 (sec) T5 Advan SPARC T5-4 3.6 GHz SPARC T5 02:06:04 1.0x 19.4 1.0x 22.4 1.0x IBM Power 780 4.1 GHz POWER7 05:51:50 2.8x 147.3 7.6x 133.2 5.9x HP ProLiant DL980 G7 2.27 GHz Intel Xeon X7560 08:35:17 4.1x 173.0 8.9x 126.3 5.6x Data Loading = database load time RF1 = power test first refresh transaction RF2 = power test second refresh transaction T5 Advan = the ratio of time to T5 time Complete benchmark results found at the TPC benchmark website http://www.tpc.org. Configuration Summary and Results Hardware Configuration: SPARC T5-4 server 4 x SPARC T5 processors (3.6 GHz total of 64 cores, 512 threads) 2 TB memory 2 x internal SAS (2 x 300 GB) disk drives External Storage: 12 x Sun Storage 2540-M2 array with Sun Storage 2501-M2 expansion trays, each with 24 x 15K RPM 300 GB drives, 2 controllers, 2 GB cache 2 x Brocade 6510 Fibre Channel Switches (48 x 16 Gbs port each) Software Configuration: Oracle Solaris 11.1 Oracle Database 11g Release 2 Enterprise Edition Audited Results: Database Size: 3000 GB (Scale Factor 3000) TPC-H Composite: 409,721.8 QphH@3000GB Price/performance: $3.94/QphH@3000GB Available: 09/24/2013 Total 3 year Cost: $1,610,564 TPC-H Power: 345,762.7 TPC-H Throughput: 485,512.1 Database Load Time: 2:06:04 Benchmark Description The TPC-H benchmark is a performance benchmark established by the Transaction Processing Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB, 10000GB, 30000GB and 100000GB) are not allowed by the TPC. TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system. The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multiple user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor. Key Points and Best Practices Twelve of Oracle's Sun Storage 2540-M2 arrays with Sun Storage 2501-M2 expansion trays were used for the benchmark. Each contains 24 15K RPM drives and is connected to a single dual port 16Gb FC HBA using 2 ports through a Brocade 6510 Fibre Channel switch. The SPARC T5-4 server achieved a peak IO rate of 33 GB/sec from the Oracle database configured with this storage. Oracle Solaris 11.1 required very little system tuning. Some vendors try to make the point that storage ratios are of customer concern. 
However, storage ratio size has more to do with disk layout and the increasing capacities of disks – so this is not an important metric when comparing systems. The SPARC T5-4 server and Oracle Solaris efficiently managed the system load of two thousand Oracle Database parallel processes. Six Sun Storage 2540-M2/2501-M2 arrays were mirrored to another six Sun Storage 2540-M2/2501-M2 arrays on which all of the Oracle database files were placed. IO performance was high and balanced across all the arrays. The TPC-H Refresh Function (RF) simulates the periodic refresh portion of a data warehouse by adding new sales data and deleting old sales data. Parallel DML (parallel insert and delete in this case) and database log performance are key for this function, and the SPARC T5-4 server outperformed both the IBM POWER7 server and HP ProLiant DL980 G7 server. (See the RF columns above.) See Also SPARC T5-4 Server TPC-H Executive Summary tpc.org Complete SPARC T5-4 Server TPC-H Full Disclosure Report tpc.org Transaction Processing Performance Council (TPC) Home Page Ideas International Benchmark Page SPARC T5-4 Server oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database 11g Release 2 oracle.com    OTN Sun Storage 2540-M2 Array oracle.com    OTN Disclosure Statement TPC-H, QphH, $/QphH are trademarks of Transaction Processing Performance Council (TPC). For more information, see www.tpc.org, results as of 6/7/13. Prices are in USD. SPARC T5-4 www.tpc.org/3288; SPARC T4-4 www.tpc.org/3278; SPARC Enterprise M9000 www.tpc.org/3262; SPARC Enterprise M9000 www.tpc.org/3258; IBM Power 780 www.tpc.org/3277; HP ProLiant DL980 www.tpc.org/3285.
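The composite QphH metric described in the benchmark section above is the geometric mean of the power (QppH) and throughput (QthH) figures. A minimal sketch checking the published SPARC T5-4 numbers from the result summary above:

```java
// Quick check of the TPC-H composite metric for the SPARC T5-4 result above:
// QphH@Size is the geometric mean of the power and throughput metrics.
public class TpchComposite {
    public static void main(String[] args) {
        double power = 345_762.7;        // TPC-H Power (QppH) from the audited results
        double throughput = 485_512.1;   // TPC-H Throughput (QthH) from the audited results
        double qphh = Math.sqrt(power * throughput);
        System.out.printf("QphH@3000GB = %.1f%n", qphh);   // ~409,721.8, matching the published composite
    }
}
```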


SPARC T5-8 Delivers Best Single System SPECjEnterprise2010 Benchmark, Beats IBM

Oracle produced a world record single-server SPECjEnterprise2010 benchmark result of 27,843.57 SPECjEnterprise2010 EjOPS using one of Oracle's SPARC T5-8 servers for both the application and the database tier. This result directly compares the 8-chip SPARC T5-8 server (8 SPARC T5 processors) to the 8-chip IBM Power 780 server (8 POWER7+ processors). The 8-chip SPARC T5 processor based server is 2.6x faster than the 8-chip IBM POWER7+ processor based server. Both Oracle and IBM used virtualization to provide 4 chips for the application and 4 chips for the database. The server cost/performance for the SPARC T5 processor based server was 6.9x better than the server cost/performance of the IBM POWER7+ processor based server. The cost/performance of the SPARC T5-8 server is $10.72 compared to the IBM Power 780 at $73.83. The total configuration cost/performance (hardware+software) for the SPARC T5 processor based server was 3.6x better than the IBM POWER7+ processor based server. The cost/performance of the SPARC T5-8 server is $56.21 compared to the IBM Power 780 at $199.42. The IBM system had 1.6x better performance per core, but this did not reduce the total software and hardware cost to the customer. As shown by this comparison, performance-per-core is a poor predictor of characteristics relevant to customers. The total IBM hardware plus software cost was $2,174,152 versus the total Oracle hardware plus software cost of $1,565,092. At this price IBM could only provide 768 GB of memory while Oracle was able to deliver 2 TB in the SPARC T5-8 server. The SPARC T5-8 server requires only 8 rack units, the same as the space of the IBM Power 780. In this configuration IBM has a hardware core density of 4 cores per rack unit which contrasts with the 16 cores per rack unit for the SPARC T5-8 server. This again demonstrates why performance-per-core is a poor predictor of characteristics relevant to customers. The virtualized SPARC T5 processor based server ran the application tier servers on 4 chips using Oracle Solaris Zones and the database tier in a 4-chip Oracle Solaris Zone. The virtualized IBM POWER7+ processor based server ran the application in a 4-chip LPAR and the database in a 4-chip LPAR. The SPARC T5-8 server ran the Oracle Solaris 11.1 operating system and used Oracle Solaris Zones to consolidate eight Oracle WebLogic application server instances and one database server instance to achieve this result. The IBM system used LPARs and AIX V7.1. This result demonstrated less than 1 second average response times for all SPECjEnterprise2010 transactions and represents JEE 5.0 transactions generated by 227,500 users. The application server used Oracle Fusion Middleware components including the Oracle WebLogic 12.1 application server and Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.7.0_15. The database server was configured with Oracle Database 11g Release 2. IBM has a non-virtualized result (one server for application and one server for database). The IBM PowerLinux 7R2 achieved 13,161.07 SPECjEnterprise2010 EjOPS which means it was 2.1x slower than the SPARC T5-8 server. The total configuration cost/performance (hardware+software) for the SPARC T5 processor based server was 11% better than the IBM POWER7+ processor based server. The cost/performance of the SPARC T5-8 server is $56.21 compared to the IBM PowerLinux 7R2 at $62.26. As shown by this comparison, performance-per-core is a poor predictor of characteristics relevant to customers. 
Performance Landscape Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results. SPECjEnterprise2010 Performance Chart Only Two Virtualized Results (App+DB on 1 Server) as of 5/1/2013 Submitter EjOPS* Java EE Server & DB Server Oracle 27,843.57 1 x SPARC T5-8 8 chips, 128 cores, 3.6 GHz SPARC T5 Oracle WebLogic 12c (12.1.1) Oracle Database 11g (11.2.0.3) IBM 10,902.30 1 x IBM Power 780 8 chips, 32 cores, 4.42 GHz POWER7+ WebSphere Application Server V8.5 IBM DB2 Universal Database 10.1 * SPECjEnterprise2010 EjOPS (bigger is better) Configuration Summary Oracle Summary Application and Database Server: 1 x SPARC T5-8 server, with 8 x 3.6 GHz SPARC T5 processors 2 TB memory 5 x 10 GbE dual-port NIC 6 x 8 Gb dual-port HBA Oracle Solaris 11.1 SRU 4.5 Oracle WebLogic Server 12c (12.1.1) Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.7.0_15 Oracle Database 11g (11.2.0.3) Storage Servers: 6 x Sun Server X3-2L (12-Drive), with 2 x 2.4 GHz Intel Xeon 16 GB memory 1 x 8 Gb FC HBA 4 x Sun Flash Accelerator F40 PCI-E Card Oracle Solaris 11.1 2 x Sun Storage 2540-M2 Array 12 x 600 GB 15K RPM SAS HDD Switch Hardware: 1 x Sun Network 10 GbE 72-port Top of Rack (ToR) Switch IBM Summary Application and Database Server: 1 x IBM Power 780 server, with 8 x 4.42 GHz POWER7+ processors 786 GB memory 6 x 10 GbE dual-port NIC 3 x 8 Gb four-port HBA IBM AIX V7.1 TL2 IBM WebSphere Application Server V8.5 IBM J9 VM (build 2.6, JRE 1.7.0 IBM J9 AIX ppc-32) IBM DB2 10.1 IBM InfoSphere Optim pureQuery Runtime v3.1.1 Storage: 2 x DS5324 Disk System with 48 x 146GB 15K E-DDM Disks 1 x v7000 Disk Controller with 16 x 400GB SSD Disks Benchmark Description SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The new SPECjEnterprise2010 benchmark has been re-designed and developed to cover the Java EE 5 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems, The web zone, servlets, and web services The EJB zone JPA 1.0 Persistence Model JMS and Message Driven Beans Transaction management Database connectivity Moreover, SPECjEnterprise2010 also heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network. The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second (SPECjEnterprise2010 EjOPS). The primary metric for the SPECjEnterprise2010 benchmark is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is NO price/performance metric in this benchmark. Key Points and Best Practices Eight Oracle WebLogic server instances on the SPARC T5-8 server were hosted in 8 separate Oracle Solaris Zones to demonstrate consolidation of multiple application servers. The 8 zones were bound to 4 resource pools using 64 cores (4 cpu chips). The database ran in a separate Oracle Solaris Zone bound to a resource pool consisting of 64 cores (4 cpu chips). 
The database shadow processes were run in the FX scheduling class and bound to one of four cpu chips using the "plgrp" command. The Oracle WebLogic application servers were executed in the FX scheduling class to improve performance by reducing the frequency of context switches. The Oracle log writer process was run in the FX scheduling class at processor priority 60 to use the Critical Thread feature. See Also SPECjEnterprise2010 Results Page SPARC T5-8 Result Page at SPEC SPARC T5-8 Server oracle.com    OTN Sun Flash Accelerator F40 PCIe Card oracle.com    OTN Oracle Solaris oracle.com    OTN Oracle Database 11g Release 2 oracle.com    OTN Oracle Fusion Middleware oracle.com    OTN Oracle WebLogic Suite oracle.com    OTN Disclosure Statement SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 5/1/2013. SPARC T5-8, 27,843.57 SPECjEnterprise2010 EjOPS; IBM Power 780, 10,902.30 SPECjEnterprise2010 EjOPS; IBM PowerLinux 7R2, 13,161.07 SPECjEnterprise2010 EjOPS. Oracle server only hardware list price is $298,494 and total hardware plus software list price is $1,565,092 from www.oracle.com as of 5/22/2013. IBM server only hardware list price is $804,931 and total hardware plus software cost of $2,174,152 based on public pricing from http://www.ibm.com as of 5/22/2013. IBM PowerLinux 7R2 server total hardware plus software cost of $819,451 based on public pricing from http://www.ibm.com as of 5/22/2013.
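The cost/performance figures quoted in this entry follow directly from the list prices in the disclosure statement divided by each configuration's EjOPS. A minimal sketch reproducing them:

```java
// Quick check of the cost/performance claims using the list prices and
// EjOPS figures quoted in this entry (prices in USD).
public class CostPerEjops {
    public static void main(String[] args) {
        double t5Ejops = 27_843.57, p780Ejops = 10_902.30;

        // Server-only hardware cost per EjOPS
        System.out.printf("SPARC T5-8 server:   $%.2f/EjOPS%n", 298_494 / t5Ejops);      // ~$10.72
        System.out.printf("IBM Power 780:       $%.2f/EjOPS%n", 804_931 / p780Ejops);    // ~$73.83

        // Total hardware + software cost per EjOPS
        System.out.printf("SPARC T5-8 total:    $%.2f/EjOPS%n", 1_565_092 / t5Ejops);    // ~$56.21
        System.out.printf("IBM Power 780 total: $%.2f/EjOPS%n", 2_174_152 / p780Ejops);  // ~$199.42
    }
}
```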


SPARC T5 System Performance for Encryption Microbenchmark

The cryptography benchmark suite was internally developed by Oracle to measure the maximum throughput of in-memory, on-chip encryption operations that a system can perform. Multiple threads are used to achieve the maximum throughput. Systems powered by Oracle's SPARC T5 processor show outstanding performance on the tested encryption operations, beating Intel processor based systems. A SPARC T5 processor running Oracle Solaris 11.1 is 2.4x to 4.4x faster executing AES 256-bit key encryption than the Intel E5-2690 processor, running in-memory encryption of 32 KB blocks using the CFB128, CBC, CCM and GCM modes, fully hardware subscribed. AES CFB mode is used by Oracle Database 11g for Transparent Data Encryption (TDE), which provides security for database storage. Performance Landscape Presented below are results for running encryption using the AES cipher with the CFB, CBC, CCM and GCM modes for key sizes of 128, 192 and 256. Decryption performance was similar and is not presented. Results are presented as MB/sec (10**6). Encryption Performance – AES-CFB Performance is presented for in-memory AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run). AES-CFB Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CFB SPARC T5 3.60 2 54,396 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 12,823 IPP/AES-NI AES-192-CFB SPARC T5 3.60 2 61,000 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 14,928 IPP/AES-NI AES-128-CFB SPARC T5 3.60 2 68,695 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 17,824 IPP/AES-NI Encryption Performance – AES-CBC Performance is presented for in-memory AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run). AES-CBC Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CBC SPARC T5 3.60 2 56,933 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 12,822 IPP/AES-NI AES-192-CBC SPARC T5 3.60 2 63,767 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 14,915 IPP/AES-NI AES-128-CBC SPARC T5 3.60 2 72,508 Oracle Solaris 11.1, libsoftcrypto + libumem SPARC T4 2.85 2 31,085 Oracle Solaris 11.1, libsoftcrypto + libumem Intel X5690 3.47 2 20,721 IPP/AES-NI Intel E5-2690 2.90 2 17,823 IPP/AES-NI Encryption Performance – AES-CCM Performance is presented for in-memory AES-CCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run). AES-CCM Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-CCM SPARC T5 3.60 2 29,431 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 12,493 IPP/AES-NI AES-192-CCM SPARC T5 3.60 2 33,715 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 14,507 IPP/AES-NI AES-128-CCM SPARC T5 3.60 2 39,188 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 17,256 IPP/AES-NI Encryption Performance – AES-GCM Performance is presented for in-memory AES-GCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. 
The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run). AES-GCM Microbenchmark Performance (MB/sec) Processor GHz Chips Performance Software Environment AES-256-GCM SPARC T5 3.60 2 34,101 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 13,520 IPP/AES-NI AES-192-GCM SPARC T5 3.60 2 36,852 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 14,159 IPP/AES-NI AES-128-GCM SPARC T5 3.60 2 39,003 Oracle Solaris 11.1, libsoftcrypto + libumem Intel E5-2690 2.90 2 14,877 IPP/AES-NI Configuration Summary SPARC T5-2 server 2 x SPARC T5 processor, 3.6 GHz 512 GB memory Oracle Solaris 11.1 SRU 4.2 Sun Server X3-2 server 2 x E5-2690 processors, 2.90 GHz 128 GB memory Benchmark Description The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-memory and on-chip using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CCM, AES-192-CCM, AES-256-CCM, AES-128-GCM, AES-192-GCM and AES-256-GCM. The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance. See Also More about AES SPARC T5-2 Server oracle.com    OTN Sun Server X3-2 oracle.com   OTN Oracle Solaris oracle.com    OTN Disclosure Statement Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 3/26/2013.
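The CCM and GCM rows above measure authenticated encryption, where the cipher produces an authentication tag in addition to the ciphertext. As a rough illustration of that operation (not Oracle's internal test code), here is a minimal Java sketch using the standard javax.crypto API, assuming a JCE provider with AES/GCM support (standard in JDK 8 and later):

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

// Minimal sketch of a single AES-256-GCM authenticated encryption of a 32 KB
// buffer, the operation type measured in the GCM rows above. Illustration
// only; the benchmark uses platform-optimized libraries and many threads.
public class AesGcmExample {
    public static void main(String[] args) throws Exception {
        SecureRandom rnd = new SecureRandom();
        byte[] key = new byte[32];          // 256-bit key
        byte[] iv = new byte[12];           // 96-bit nonce, typical for GCM
        byte[] data = new byte[32 * 1024];  // 32 KB of pseudo-random data
        rnd.nextBytes(key); rnd.nextBytes(iv); rnd.nextBytes(data);

        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
               new GCMParameterSpec(128, iv));        // 128-bit authentication tag

        long start = System.nanoTime();
        byte[] out = c.doFinal(data);                  // ciphertext with appended tag
        double secs = (System.nanoTime() - start) / 1e9;

        System.out.printf("Encrypted %d bytes (+%d-byte tag) in %.3f ms%n",
                          data.length, out.length - data.length, secs * 1000);
    }
}
```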
