Tuesday May 01, 2012

SPARC T4 Servers Running Oracle Solaris 11 and Oracle RAC Deliver World Record on PeopleSoft HRMS 9.1

Oracle's SPARC T4-4 server running Oracle's PeopleSoft HRMS Self-Service 9.1 benchmark achieved world record performance with 18,000 interactive users. This was accomplished in a high availability configuration, with Oracle Real Application Clusters (RAC) 11g Release 2 software for the database tier running on Oracle Solaris 11. The benchmark configuration included the SPARC T4-4 server for the application tier, a SPARC T4-2 server for the web tier and two SPARC T4-2 servers for the database tier.

  • The SPARC T4 servers running the PeopleSoft HRMS Self-Service 9.1 benchmark supported 4.5x the number of users of an IBM pSeries 570 running PeopleSoft HRMS Self-Service 8.9, with an average response time 40 percent better than IBM's.

  • This result was obtained with two SPARC T4-2 servers running the database service using Oracle Real Application Clusters 11g Release 2 software in a high availability configuration.

  • The two SPARC T4-2 servers in the database tier used Oracle Solaris 11, and Oracle RAC 11g Release 2 software with database shared disk storage managed by Oracle Automatic Storage Management (ASM).

  • The average CPU utilization on one SPARC T4-4 server in the application tier handling 18,000 users is 54 percent, showing significant headroom for growth.

  • The SPARC T4 server for the application tier used Oracle Solaris Containers on Oracle Solaris 10, which provide a flexible, scalable and manageable virtualized environment.

  • The PeopleSoft HRMS Self-Service benchmark demonstrates better performance on Oracle hardware and software, engineered to work together, than Oracle software on IBM.

Performance Landscape

PeopleSoft HRMS Self-Service 9.1 Benchmark

Systems | Processors | Users | Ave Response - Search (sec) | Ave Response - Save (sec)
SPARC T4-2 (web); SPARC T4-4 (app); 2 x SPARC T4-2 (db) | 2 x SPARC T4, 2.85 GHz; 4 x SPARC T4, 3.0 GHz; 2 x (2 x SPARC T4, 2.85 GHz) | 18,000 | 1.048 | 0.742
SPARC T4-2 (web); SPARC T4-4 (app); SPARC T4-4 (db) | 2 x SPARC T4, 2.85 GHz; 4 x SPARC T4, 3.0 GHz; 4 x SPARC T4, 3.0 GHz | 15,000 | 1.01 | 0.63

PeopleSoft HRMS Self-Service 8.9 Benchmark

Systems | Processors | Users | Ave Response - Search (sec) | Ave Response - Save (sec)
IBM Power 570 (web/app); IBM Power 570 (db) | 12 x POWER5, 1.9 GHz; 4 x POWER5, 1.9 GHz | 4,000 | 1.74 | 1.25
IBM p690 (web); IBM p690 (app); IBM p690 (db) | 4 x POWER4, 1.9 GHz; 12 x POWER4, 1.9 GHz; 6 x 4392 MIPS/Gen1 | 4,000 | 1.35 | 1.01
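
The headline ratios quoted for this result can be re-derived from the table figures. The following is a minimal Python sketch of that arithmetic, comparing the 18,000-user SPARC T4 result with the IBM Power 570 row; the dictionary and function names are illustrative only and are not part of the benchmark kit.

```python
# Figures copied from the Performance Landscape table.
T4 = {"users": 18_000, "search_s": 1.048, "save_s": 0.742}   # SPARC T4, HRMS 9.1
P570 = {"users": 4_000, "search_s": 1.74, "save_s": 1.25}    # IBM Power 570, HRMS 8.9


def user_ratio(new, old):
    """How many times more users the new result supports."""
    return new["users"] / old["users"]


def pct_faster(new_s, old_s):
    """Percent reduction in response time relative to the old result."""
    return (old_s - new_s) / old_s * 100


users_x = user_ratio(T4, P570)
search_gain = pct_faster(T4["search_s"], P570["search_s"])
save_gain = pct_faster(T4["save_s"], P570["save_s"])

print(f"users: {users_x:.1f}x")
print(f"search response: {search_gain:.1f}% lower")
print(f"save response: {save_gain:.1f}% lower")
```

Both response-time reductions come out at roughly 40 percent, in line with the claim in the summary bullets. Note that the two rows come from different versions of the benchmark, so the comparison is indicative rather than like-for-like.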

The main differences between version 9.1 and version 8.9 of the benchmark are:

  • the database expanded from 100K employees and 20K managers to 500K employees and 100K managers,
  • the manager data was expanded,
  • a new transaction, "Employee Add Profile," was added; fewer than 2% of users execute it, and it has a heavier footprint,
  • version 9.1 has a different benchmark metric (Average Response Search/Save time for x number of users) versus single user search/save time,
  • newer versions of the PeopleSoft application and PeopleTools software are used.

Configuration Summary

Application Server:

1 x SPARC T4-4 server
4 x SPARC T4 processors 3.0 GHz
512 GB main memory
5 x 300 GB SAS internal disks
2 x 100 GB internal SSDs
1 x 300 GB internal SSD
Oracle Solaris 10 8/11
PeopleSoft PeopleTools 8.51.02
PeopleSoft HCM 9.1
Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 031
Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.6.0_20

Web Server:

1 x SPARC T4-2 server
2 x SPARC T4 processors 2.85 GHz
256 GB main memory
2 x 300 GB SAS internal disks
1 x 100 GB internal SSD
Oracle Solaris 10 8/11
PeopleSoft PeopleTools 8.51.02
Oracle WebLogic Server 11g (10.3.3)
Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.6.0_20

Database Server:

2 x SPARC T4-2 servers, each with
2 x SPARC T4 processors 2.85 GHz
128 GB main memory
3 x 300 GB SAS internal disks
Oracle Solaris 11 11/11
Oracle Database 11g Release 2
Oracle Real Application Clusters

Database Storage:

Data
1 x Sun Storage F5100 Flash Array (80 flash modules)
1 x COMSTAR Sun Fire X4470 M2 server
4 x Intel Xeon X7550 processors 2.0 GHz
128 GB main memory
Oracle Solaris 11 11/11
Redo
2 x COMSTAR Sun Fire X4275 servers, each with
1 x Intel Xeon E5540 processor 2.53 GHz
6 GB main memory
12 x 2 TB SAS disks
Oracle Solaris 11 Express 2010.11

Connectivity:

1 x 8-port 10GbE switch
1 x 24-port 1GbE switch
1 x 32-port Brocade FC switch

Benchmark Description

The purpose of the PeopleSoft HRMS Self-Service 9.1 benchmark is to measure comparative online performance of the selected processes in PeopleSoft Enterprise HCM 9.1 with Oracle Database 11g. The benchmark kit is an Oracle standard benchmark kit run by all platform vendors to measure the performance. It is an OLTP benchmark with no dependency on remote COBOL calls and no batch workload, and its database SQL statements are moderately complex. The results are certified by Oracle and a white paper is published.

PeopleSoft defines a business transaction as a series of HTML pages that guide a user through a particular scenario. Users are defined as corporate Employees, Managers and HR administrators. The benchmark consists of 14 scenarios which emulate users performing typical HCM transactions such as viewing paychecks, promoting and hiring employees, updating employee profiles and other typical HCM application transactions.

All of these transactions are well defined in the PeopleSoft HR Self-Service 9.1 benchmark kit. This benchmark metric is the Weighted Average Response search/save time for all users.

Key Points and Best Practices

  • The combined processing power of two SPARC T4-2 servers running the highly available Oracle RAC database can provide greater throughput and Oracle RAC scalability than is available from a single server.

  • All database data files, recovery files and Oracle Clusterware files were created with the Oracle Automatic Storage Management (Oracle ASM) volume manager and file system. This delivered performance equivalent to that of conventional volume managers, file systems and raw devices, with the added benefit of the ease of management provided by the Oracle ASM integrated storage management solution.

  • On the SPARC T4-4 server, each of two separate Oracle Solaris Containers hosted five Oracle PeopleSoft Domains with 200 application server processes (40 per Domain), for a total of 10 Domains and 400 application server processes. This demonstrates consolidation of multiple application servers, ease of administration and load balancing.

  • Each Oracle Solaris Container was bound to a separate processor set, each containing 124 virtual processors. The default set (4 virtual processors from each of the first and third processor sockets, 8 virtual processors in total) was used for network and disk interrupt handling. This improved performance in two ways: it reduced memory access latency by using the physical memory closest to the processors, and it offloaded I/O interrupt handling to the default set, freeing processing resources for the application server virtual processors.
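
The processor-set and domain counts in the last two points can be checked with simple arithmetic. The sketch below is illustrative only; it assumes that each SPARC T4 chip has 8 cores with 8 hardware threads per core, a detail that is not stated in this post, and all constant names are made up for the example.

```python
# Assumption (not stated in the post): each SPARC T4 chip has 8 cores
# with 8 hardware threads per core.
SOCKETS = 4
CORES_PER_SOCKET = 8
THREADS_PER_CORE = 8

total_vcpus = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE

# From the post: two containers, each bound to a 124-vCPU processor set,
# plus a default set of 4 vCPUs taken from each of two sockets.
CONTAINERS = 2
VCPUS_PER_CONTAINER_SET = 124
default_set = 4 * 2

assigned = CONTAINERS * VCPUS_PER_CONTAINER_SET + default_set

# From the post: 5 domains per container, 40 application servers per domain.
DOMAINS_PER_CONTAINER = 5
APP_SERVERS_PER_DOMAIN = 40
domains = CONTAINERS * DOMAINS_PER_CONTAINER
app_servers = domains * APP_SERVERS_PER_DOMAIN

print(total_vcpus, assigned, domains, app_servers)
```

Under that assumption the two container sets plus the default set account for every virtual processor on the SPARC T4-4 server, and the domain totals match the 10 Domains and 400 application server processes quoted above.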

Disclosure Statement

Oracle's PeopleSoft HRMS 9.1 benchmark, www.oracle.com/us/solutions/benchmark/apps-benchmark/peoplesoft-167486.html, results 5/1/2012.

Thursday Apr 19, 2012

Sun ZFS Storage 7420 Appliance Delivers 2-Node World Record SPECsfs2008 NFS Benchmark

Oracle's Sun ZFS Storage 7420 appliance delivered world record two-node performance on the SPECsfs2008 NFS benchmark, beating results published on NetApp's dual-controller and 4-node high-end FAS6240 storage systems.

  • The Sun ZFS Storage 7420 appliance delivered a world record two-node result of 267,928 SPECsfs2008_nfs.v3 Ops/sec with an Overall Response Time (ORT) of 1.31 msec on the SPECsfs2008 NFS benchmark.

  • The Sun ZFS Storage 7420 appliance delivered 1.4x higher throughput than the dual-controller NetApp FAS6240 and 2.6x higher throughput than the dual-controller NetApp FAS3270 on the SPECsfs2008_nfs.v3 benchmark at less than half the list price of either result.

  • The Sun ZFS Storage 7420 appliance required 10 percent less rack space than the dual-controller NetApp FAS6240.

  • The Sun ZFS Storage 7420 appliance had 3 percent higher throughput than the 4-node NetApp FAS6240 on the SPECsfs2008_nfs.v3 benchmark.

  • The Sun ZFS Storage 7420 appliance required 25 percent less rack space than the 4-node NetApp FAS6240.

  • The Sun ZFS Storage 7420 appliance has 14 percent better Overall Response Time than the 4-node NetApp FAS6240 on the SPECsfs2008_nfs.v3 benchmark.

Performance Landscape

SPECsfs2008_nfs.v3 Performance Chart (in decreasing SPECsfs2008_nfs.v3 Ops/sec order)

Sponsor | System | Throughput (Ops/sec) | Overall Response Time (msec) | Nodes | Memory (GB) Including Flash | Disks | Rack Units (Controllers + Disks)
Oracle | 7420 | 267,928 | 1.31 | 2 | 6,728 | 280 | 54
NetApp | FAS6240 | 260,388 | 1.53 | 4 | 2,256 | 288 | 72
NetApp | FAS6240 | 190,675 | 1.17 | 2 | 1,128 | 288 | 60
EMC | VG8 | 135,521 | 1.92 | | 280 | 312 |
Oracle | 7320 | 134,140 | 1.51 | 2 | 4,968 | 136 | 26
EMC | NS-G8 | 110,621 | 2.32 | | 264 | 100 |
NetApp | FAS3270 | 101,183 | 1.66 | 2 | 40 | 360 | 66

Throughput SPECsfs2008_nfs.v3 Ops/sec — the Performance Metric
Overall Response Time — the corresponding Response Time Metric
Nodes — Nodes and Controllers are being used interchangeably

Complete SPECsfs2008 benchmark results may be found at http://www.spec.org/sfs2008/results/sfs2008.html.
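
The comparisons in the summary bullets are ratios of the chart values. As a rough check, the following Python sketch recomputes them relative to the Sun ZFS Storage 7420 result; the labels and the `compare` helper are illustrative names, not part of the SPECsfs2008 tooling.

```python
# (throughput in ops/sec, ORT in msec, rack units) from the chart.
results = {
    "Oracle 7420 (2-node)": (267_928, 1.31, 54),
    "NetApp FAS6240 (4-node)": (260_388, 1.53, 72),
    "NetApp FAS6240 (2-node)": (190_675, 1.17, 60),
    "NetApp FAS3270 (2-node)": (101_183, 1.66, 66),
}

base_ops, base_ort, base_ru = results["Oracle 7420 (2-node)"]


def compare(name):
    """Compare the Oracle 7420 result against the named result."""
    ops, ort, ru = results[name]
    return {
        "throughput_x": base_ops / ops,
        "rack_space_saved_pct": (ru - base_ru) / ru * 100,
        "ort_better_pct": (ort - base_ort) / ort * 100,
    }


for name in results:
    if name.startswith("NetApp"):
        c = compare(name)
        print(f"{name}: {c['throughput_x']:.2f}x throughput, "
              f"{c['rack_space_saved_pct']:.0f}% less rack space, "
              f"{c['ort_better_pct']:.0f}% better ORT")
```

A negative ORT percentage means the other system had the lower response time, which is the case for the dual-controller FAS6240 in this chart.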

Configuration Summary

Storage Configuration:

Sun ZFS Storage 7420 appliance in clustered configuration
2 x Sun ZFS Storage 7420 controllers, each with
4 x 2.4 GHz Intel Xeon E7-4870 processors
1 TB memory
4 x 512 GB SSD flash-enabled read-cache
2 x 10GbE NICs
12 x Sun Disk shelves
10 x shelves with 24 x 300 GB 15K RPM SAS-2 drives
2 x shelves with 20 x 300 GB 15K RPM SAS-2 drives and 4 x 73 GB SAS-2 flash-enabled write-cache

Server Configuration:

4 x Sun Fire X4270 M2 servers, each with
2 x 3.3 GHz Intel Xeon X5680 processors
144 GB memory
1 x 10 GbE NIC
Oracle Solaris 10 9/10

Switches:

1 x 24-port 10Gb Ethernet Switch

Benchmark Description

SPECsfs2008 is the latest version of the Standard Performance Evaluation Corporation (SPEC) benchmark suite measuring file server throughput and response time, providing a standardized method for comparing performance across different vendor platforms. SPECsfs2008 results summarize the server's capabilities with respect to the number of operations that can be handled per second, as well as the overall latency of the operations. The suite is a follow-on to the SFS97_R1 benchmark, adding a CIFS workload, an updated NFSv3 workload, support for additional client platforms, and a new test harness and reporting/submission framework.

Disclosure Statement

SPEC and SPECsfs are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of April 18, 2012, for more information see www.spec.org. Sun ZFS Storage 7420 Appliance 267,928 SPECsfs2008_nfs.v3 Ops/sec, 1.31 msec ORT, NetApp Data ONTAP 8.1 Cluster-Mode (4-node FAS6240) 260,388 SPECsfs2008_nfs.v3 Ops/Sec, 1.53 msec ORT, NetApp FAS6240 190,675 SPECsfs2008_nfs.v3 Ops/Sec, 1.17 msec ORT. NetApp FAS3270 101,183 SPECsfs2008_nfs.v3 Ops/Sec, 1.66 msec ORT.

Nodes refer to the item in the SPECsfs2008 disclosed Configuration Bill of Materials that have the Processing Elements that perform the NFS Processing Function. These are the first item listed in each of disclosed Configuration Bill of Materials except for EMC where it is both the first and third items listed, and HP, where it is the second item listed as Blade Servers. The number of nodes is from the QTY disclosed in the Configuration Bill of Materials as described above. Configuration Bill of Materials list price for Oracle result of US$ 423,644. Configuration Bill of Materials list price for NetApp FAS3270 result of US$ 1,215,290. Configuration Bill of Materials list price for NetApp FAS6240 result of US$ 1,028,118. Oracle pricing from https://shop.oracle.com/pls/ostore/f?p=dstore:home:0, traverse to "Storage and Tape" and then to "NAS Storage". NetApp's pricing from http://www.netapp.com/us/media/na-list-usd-netapp-custom-state-new-discounts.html.

Sunday Apr 15, 2012

Sun ZFS Storage 7420 Appliance Delivers Top High-End Price/Performance Result for SPC-2 Benchmark

Oracle's Sun ZFS Storage 7420 appliance delivered leading high-end price/performance on the SPC Benchmark 2 (SPC-2).

  • The Sun ZFS Storage 7420 appliance delivered a result of 10,704 SPC-2 MB/s at $35.24 $/SPC-2 MB/s on the SPC-2 benchmark.

  • The Sun ZFS Storage 7420 appliance beats the IBM DS8800 result by over 10% on SPC-2 MB/s and has 7.7x better $/SPC-2 MB/s.

  • The Sun ZFS Storage 7420 appliance achieved the best price/performance for the top 18 posted unique performance results on the SPC-2 benchmark.

Performance Landscape

SPC-2 Performance Chart (in decreasing performance order)

System | SPC-2 MB/s | $/SPC-2 MB/s | ASU Capacity (GB) | TSC Price | Data Protection Level | Date | Results Identifier
HP StorageWorks P9500 | 13,148 | $88.34 | 129,112 | $1,161,504 | RAID-5 | 03/07/12 | B00056
Sun ZFS Storage 7420 | 10,704 | $35.24 | 31,884 | $377,225 | Mirroring | 04/12/12 | B00058
IBM DS8800 | 9,706 | $270.38 | 71,537 | $2,624,257 | RAID-5 | 12/01/10 | B00051
HP XP24000 | 8,725 | $187.45 | 18,401 | $1,635,434 | Mirroring | 09/08/08 | B00035
Hitachi Storage Platform V | 8,725 | $187.49 | 18,401 | $1,635,770 | Mirroring | 09/08/08 | B00036
TMS RamSan-630 | 8,323 | $49.37 | 8,117 | $410,927 | RAID-5 | 05/10/11 | B00054
IBM XIV | 7,468 | $152.34 | 154,619 | $1,137,641 | RAID-1 | 10/19/11 | BE00001
IBM DS8700 | 7,247 | $277.22 | 32,642 | $2,009,007 | RAID-5 | 11/30/09 | B00049
IBM SAN Vol Ctlr 4.2 | 7,084 | $463.66 | 101,155 | $3,284,767 | RAID-5 | 07/12/07 | B00024
Fujitsu ETERNUS DX440 S2 | 5,768 | $66.50 | 42,133 | $383,576 | Mirroring | 04/12/12 | B00057
IBM DS5300 | 5,634 | $74.13 | 16,383 | $417,648 | RAID-5 | 10/21/09 | B00045
Sun Storage 6780 | 5,634 | $47.03 | 16,383 | $264,999 | RAID-5 | 10/28/09 | B00047
IBM DS5300 | 5,544 | $75.33 | 14,043 | $417,648 | RAID-6 | 10/21/09 | B00046
Sun Storage 6780 | 5,544 | $47.80 | 14,043 | $264,999 | RAID-6 | 10/28/09 | B00048
IBM DS5300 | 4,818 | $93.80 | 16,383 | $451,986 | RAID-5 | 09/25/08 | B00037
Sun Storage 6780 | 4,818 | $53.61 | 16,383 | $258,329 | RAID-5 | 02/02/09 | B00039
IBM DS5300 | 4,676 | $96.67 | 14,043 | $451,986 | RAID-6 | 09/25/08 | B00038
Sun Storage 6780 | 4,676 | $55.25 | 14,043 | $258,329 | RAID-6 | 02/03/09 | B00040
IBM SAN Vol Ctlr 4.1 | 4,544 | $400.78 | 51,265 | $1,821,301 | RAID-5 | 09/12/06 | B00011
IBM SAN Vol Ctlr 3.1 | 3,518 | $563.93 | 20,616 | $1,983,785 | Mirroring | 12/14/05 | B00001
Fujitsu ETERNUS8000 1100 | 3,481 | $238.93 | 4,570 | $831,649 | Mirroring | 03/08/07 | B00019
IBM DS8300 | 3,218 | $539.38 | 15,393 | $1,735,473 | Mirroring | 12/14/05 | B00006
IBM Storwize V7000 | 3,133 | $71.32 | 29,914 | $223,422 | RAID-5 | 12/13/10 | B00052

SPC-2 MB/s = the Performance Metric
$/SPC-2 MB/s = the Price/Performance Metric
ASU Capacity = the Capacity Metric
Data Protection = Data Protection Metric
TSC Price = Total Cost of Ownership Metric
Results Identifier = A unique identification of the result Metric

Complete SPC-2 benchmark results may be found at http://www.storageperformance.org.
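
The price/performance metric in the chart is the TSC price divided by the SPC-2 MB/s result. The sketch below, with illustrative function and variable names, recomputes it for the Sun ZFS Storage 7420 and the IBM DS8800 from the chart values. Because the chart shows throughput rounded to whole MB/s, a recomputed value can differ from the published one in the last digit.

```python
def price_performance(tsc_price_usd, mb_per_s):
    """Dollars per SPC-2 MB/s: total price divided by throughput."""
    return tsc_price_usd / mb_per_s


# TSC price and SPC-2 MB/s copied from the chart.
zfs_7420 = price_performance(377_225, 10_704)
ibm_ds8800 = price_performance(2_624_257, 9_706)

# How many times lower the 7420's cost per MB/s is.
price_ratio = ibm_ds8800 / zfs_7420

# Throughput advantage of the 7420 over the DS8800, in percent.
throughput_gain_pct = (10_704 / 9_706 - 1) * 100

print(f"7420:   ${zfs_7420:.2f} per SPC-2 MB/s")
print(f"DS8800: ${ibm_ds8800:.2f} per SPC-2 MB/s")
print(f"price/performance ratio: {price_ratio:.1f}x")
print(f"throughput gain: {throughput_gain_pct:.1f}%")
```

The ratio and the throughput gain agree with the 7.7x and "over 10%" figures quoted in the summary bullets.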

Configuration Summary

Storage Configuration:

Sun ZFS Storage 7420 appliance in clustered configuration
2 x Sun ZFS Storage 7420 controllers, each with
4 x 2.0 GHz Intel Xeon X7550 processors
512 GB memory, 64 x 8 GB 1066 MHz DDR3 DIMMs
16 x Sun Disk shelves, each with
24 x 300 GB 15K RPM SAS-2 drives

Server Configuration:

1 x Sun Fire X4470 server, with
4 x 2.4 GHz Intel Xeon E7-4870 processors
512 GB memory
8 x 8 Gb FC connections to the Sun ZFS Storage 7420 appliance
Oracle Solaris 11 11/11

2 x Sun Fire X4470 servers, each with
4 x 2.4 GHz Intel Xeon E7-4870 processors
256 GB memory
8 x 8 Gb FC connections to the Sun ZFS Storage 7420 appliance
Oracle Solaris 11 11/11

Benchmark Description

SPC Benchmark-2 (SPC-2): Consists of three distinct workloads designed to demonstrate the performance of a storage subsystem during the execution of business critical applications that require the large-scale, sequential movement of data. Those applications are characterized predominately by large I/Os organized into one or more concurrent sequential patterns. A description of each of the three SPC-2 workloads is listed below as well as examples of applications characterized by each workload.

  • Large File Processing: Applications in a wide range of fields, which require simple sequential processing of one or more large files such as scientific computing and large-scale financial processing.
  • Large Database Queries: Applications that involve scans or joins of large relational tables, such as those performed for data mining or business intelligence.
  • Video on Demand: Applications that provide individualized video entertainment to a community of subscribers by drawing from a digital film library.

SPC-2 is built to:

  • Provide a level playing field for test sponsors.
  • Produce results that are powerful and yet simple to use.
  • Provide value for engineers as well as IT consumers and solution integrators.
  • Be easy to run, easy to audit/verify, and easy to use to report official results.

Disclosure Statement

SPC-2, SPC-2 MB/s, $/SPC-2 MB/s are registered trademarks of Storage Performance Council (SPC). Results as of April 12, 2012, for more information see www.storageperformance.org. Sun ZFS Storage 7420 Appliance http://www.storageperformance.org/results/benchmark_results_spc2#b00058; IBM DS8800 http://www.storageperformance.org/results/benchmark_results_spc2#b00051.

Thursday Apr 12, 2012

Sun Fire X4270 M3 SAP Enhancement Package 4 for SAP ERP 6.0 (Unicode) Two-Tier Standard Sales and Distribution (SD) Benchmark

Oracle's Sun Fire X4270 M3 server (now known as Sun Server X3-2L) achieved 8,320 SAP SD Benchmark users running SAP enhancement package 4 for SAP ERP 6.0 (Unicode) using Oracle Database 11g and Oracle Solaris 10.

  • The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat both the IBM Flex System x240 and the IBM System x3650 M4 servers running DB2 9.7 and Windows Server 2008 R2 Enterprise Edition.

  • The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the HP ProLiant BL460c Gen8 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 6%.

  • The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat the Cisco UCS C240 M3 server running SQL Server 2008 and Windows Server 2008 R2 Datacenter Edition by 9%.

  • The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the Fujitsu PRIMERGY RX300 S7 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 10%.

Performance Landscape

SAP-SD 2-Tier Performance Table (in decreasing performance order).

SAP ERP 6.0 Enhancement Pack 4 (Unicode) Results
(benchmark version from January 2009 to April 2012)

System | OS; Database | Users | SAP ERP/ECC Release | SAPS | SAPS/Proc | Date
Sun Fire X4270 M3; 2xIntel Xeon E5-2690 @2.90GHz; 128 GB | Oracle Solaris 10; Oracle Database 11g | 8,320 | 2009 6.0 EP4 (Unicode) | 45,570 | 22,785 | 10-Apr-12
IBM Flex System x240; 2xIntel Xeon E5-2690 @2.90GHz; 128 GB | Windows Server 2008 R2 EE; DB2 9.7 | 7,960 | 2009 6.0 EP4 (Unicode) | 43,520 | 21,760 | 11-Apr-12
HP ProLiant BL460c Gen8; 2xIntel Xeon E5-2690 @2.90GHz; 128 GB | Windows Server 2008 R2 EE; SQL Server 2008 | 7,865 | 2009 6.0 EP4 (Unicode) | 42,920 | 21,460 | 29-Mar-12
IBM System x3650 M4; 2xIntel Xeon E5-2690 @2.90GHz; 128 GB | Windows Server 2008 R2 EE; DB2 9.7 | 7,855 | 2009 6.0 EP4 (Unicode) | 42,880 | 21,440 | 06-Mar-12
Cisco UCS C240 M3; 2xIntel Xeon E5-2690 @2.90GHz; 128 GB | Windows Server 2008 R2 DE; SQL Server 2008 | 7,635 | 2009 6.0 EP4 (Unicode) | 41,800 | 20,900 | 06-Mar-12
Fujitsu PRIMERGY RX300 S7; 2xIntel Xeon E5-2690 @2.90GHz; 128 GB | Windows Server 2008 R2 EE; SQL Server 2008 | 7,570 | 2009 6.0 EP4 (Unicode) | 41,320 | 20,660 | 06-Mar-12

Complete benchmark results may be found at the SAP benchmark website http://www.sap.com/benchmark.

Configuration and Results Summary

Hardware Configuration:

Sun Fire X4270 M3
2 x 2.90 GHz Intel Xeon E5-2690 processors
128 GB memory
Sun StorageTek 6540 with 4 * 16 * 300GB 15Krpm 4Gb FC-AL

Software Configuration:

Oracle Solaris 10
Oracle Database 11g
SAP enhancement package 4 for SAP ERP 6.0 (Unicode)

Certified Results (published by SAP):

Number of benchmark users: 8,320
Average dialog response time: 0.95 seconds
Throughput:
  Fully processed order line: 911,330
  Dialog steps/hour: 2,734,000
  SAPS: 45,570
SAP Certification: 2012014
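
The certified throughput figures are related to one another. The sketch below relies on SAP's published definition of the SAPS unit, which is not given in this post: 100 SAPS corresponds to 2,000 fully processed order line items per hour, or 6,000 dialog steps per hour. It also assumes that the order line figure above is a per-hour rate. The constant names are illustrative.

```python
# Certified figures from the results summary.
DIALOG_STEPS_PER_HOUR = 2_734_000
ORDER_LINES_PER_HOUR = 911_330   # assumed to be a per-hour rate
CERTIFIED_SAPS = 45_570
PROCESSORS = 2

# Assumption: SAP defines 100 SAPS as 6,000 dialog steps per hour,
# or equivalently 2,000 fully processed order line items per hour.
saps_from_dialog_steps = DIALOG_STEPS_PER_HOUR / 6_000 * 100
saps_from_order_lines = ORDER_LINES_PER_HOUR / 2_000 * 100

# SAPS/Proc column of the performance table.
saps_per_processor = CERTIFIED_SAPS / PROCESSORS

print(round(saps_from_dialog_steps), round(saps_from_order_lines))
print(saps_per_processor)
```

Both derived values land within a few units of the certified 45,570 SAPS, which suggests the certified figure is rounded. Dividing by the two processors reproduces the 22,785 SAPS/Proc shown in the performance table.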

Benchmark Description

The SAP Standard Application SD (Sales and Distribution) Benchmark is a two-tier ERP business test that is indicative of full business workloads of complete order processing and invoice processing, and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments.

SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products.

Disclosure Statement

Two-tier SAP Sales and Distribution (SD) standard SAP SD benchmark based on SAP enhancement package 4 for SAP ERP 6.0 (Unicode) application benchmark as of 04/11/12: Sun Fire X4270 M3 (2 processors, 16 cores, 32 threads) 8,320 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, Oracle 11g, Solaris 10, Cert# 2012014. IBM Flex System x240 (2 processors, 16 cores, 32 threads) 7,960 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012016. IBM System x3650 M4 (2 processors, 16 cores, 32 threads) 7,855 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012010. Cisco UCS C240 M3 (2 processors, 16 cores, 32 threads) 7,635 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 DE, Cert# 2012011. Fujitsu PRIMERGY RX300 S7 (2 processors, 16 cores, 32 threads) 7,570 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012008. HP ProLiant DL380p Gen8 (2 processors, 16 cores, 32 threads) 7,865 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012012.

SAP, R/3, reg TM of SAP AG in Germany and other countries. More info www.sap.com/benchmark

Tuesday Apr 10, 2012

World Record Oracle E-Business Suite 12.1.3 Standard Extra-Large Payroll (Batch) Benchmark on Sun Server X3-2L

Oracle's Sun Server X3-2L (formerly Sun Fire X4270 M3) set a world record running the Oracle E-Business Suite 12.1.3 Standard Extra-Large Payroll (Batch) benchmark.

  • This is the first published result using Oracle E-Business 12.1.3.

  • The Sun Server X3-2L result ran the Extra-Large Payroll workload in 19 minutes.

Performance Landscape

This is the first published result for the Payroll Extra-Large model using Oracle E-Business 12.1.3 benchmark.

Batch Workload: Payroll Extra-Large Model
System Employees/Hr Elapsed Time
Sun Server X3-2L 789,515 19 minutes
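
The two figures in this table together imply the size of the payroll run. The following sketch shows the arithmetic; the variable names are illustrative, and since the elapsed time is reported rounded to whole minutes, the result is only approximate.

```python
# Figures from the Payroll Extra-Large table.
EMPLOYEES_PER_HOUR = 789_515
ELAPSED_MINUTES = 19.0   # reported value, rounded

# Employees processed = hourly rate x elapsed hours.
implied_employees = EMPLOYEES_PER_HOUR * ELAPSED_MINUTES / 60

print(f"about {implied_employees:,.0f} employees processed")
```

The result is close to 250,000 employees for the batch run.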

Configuration Summary

Hardware Configuration:

Sun Server X3-2L
2 x Intel Xeon E5-2690, 2.9 GHz
128 GB memory
8 x 100 GB SSD for data
1 x 300 GB SSD for log

Software Configuration:

Oracle Linux 5.7
Oracle E-Business Suite R12 (12.1.3)
Oracle Database 11g (11.2.0.3)

Benchmark Description

The Oracle E-Business Suite Standard R12 Benchmark combines online transaction execution by simulated users with concurrent batch processing to model a typical scenario for a global enterprise. This benchmark ran one Batch component, Payroll, in the Extra-Large size. The goal of the benchmark proposal is to execute and achieve the best batch-payroll performance using the X-Large configuration.

Results can be published in four sizes and use one or more online/batch modules:

  • X-large: Maximum online users running all business flows between 10,000 and 20,000; 750,000 order to cash lines per hour and 250,000 payroll checks per hour.
    • Order to Cash Online -- 2400 users
      • The percentage across the 5 transactions in Order Management module is:
        • Insert Manual Invoice -- 16.66%
        • Insert Order -- 32.33%
        • Order Pick Release -- 16.66%
        • Ship Confirm -- 16.66%
        • Order Summary Report -- 16.66%
    • HR Self-Service -- 4000 users
    • Customer Support Flow -- 8000 users
    • Procure to Pay -- 2000 users
  • Large: 10,000 online users; 100,000 order to cash lines per hour and 100,000 payroll checks per hour.
  • Medium: up to 3000 online users; 50,000 order to cash lines per hour and 10,000 payroll checks per hour.
  • Small: up to 1000 online users; 10,000 order to cash lines per hour and 5,000 payroll checks per hour.

Disclosure Statement

Oracle E-Business X-Large Batch-Payroll benchmark, Sun Server X3-2L, 2.90 GHz, 2 chips, 16 cores, 32 threads, 128 GB memory, elapsed time 19.0 minutes, 789,515 Employees/HR, Oracle Linux 5.7, Oracle E-Business Suite 12.1.3, Oracle Database 11g Release 2, Results as of 7/10/2012.

SPEC CPU2006 Results on Oracle's Sun x86 Servers

Oracle's new Sun x86 servers delivered world records on the SPECfp2006 and SPECint_rate2006 benchmarks for two-processor servers. This was accomplished with Oracle Solaris 11 and Oracle Solaris Studio 12.3 software.

  • The Sun Fire X4170 M3 (now known as Sun Server X3-2) server achieved a world record result for the SPECfp2006 benchmark with a score of 96.8.

  • The Sun Blade X6270 M3 server module (now known as Sun Blade X3-2B) produced best integer throughput performance for all 2-socket servers with a SPECint_rate2006 score of 705.

  • The Sun x86 servers with Intel Xeon E5-2690 2.9 GHz processors produced a cross-generational performance improvement up to 1.8x over the previous generation, Sun x86 M2 servers.

Performance Landscape

Complete benchmark results are at the SPEC website, SPEC CPU2006 Results. The tables below provide the new Oracle results as well as select results from other vendors.

SPECint2006
System | Processor | c/c/t * | Peak | Base | O/S | Compiler
Fujitsu PRIMERGY BX924 S3 | Intel E5-2690, 2.9 GHz | 2/16/16 | 60.8 | 56.0 | RHEL 6.2 | Intel 12.1.2.273
Sun Fire X4170 M3 | Intel E5-2690, 2.9 GHz | 2/16/32 | 58.5 | 54.3 | Oracle Linux 6.1 | Intel 12.1.0.225
Sun Fire X4270 M2 | Intel X5690, 3.47 GHz | 2/12/12 | 46.2 | 43.9 | Oracle Linux 5.5 | Intel 12.0.1.116

SPECfp2006
System | Processor | c/c/t * | Peak | Base | O/S | Compiler
Sun Fire X4170 M3 | Intel E5-2690, 2.9 GHz | 2/16/32 | 96.8 | 86.4 | Oracle Solaris 11 | Studio 12.3
Sun Blade X6270 M3 | Intel E5-2690, 2.9 GHz | 2/16/32 | 96.0 | 85.2 | Oracle Solaris 11 | Studio 12.3
Sun Fire X4270 M3 | Intel E5-2690, 2.9 GHz | 2/16/32 | 95.9 | 85.1 | Oracle Solaris 11 | Studio 12.3
Fujitsu CELSIUS R920 | Intel E5-2687, 2.9 GHz | 2/16/16 | 93.8 | 87.6 | RHEL 6.1 | Intel 12.1.2.273
Sun Fire X4270 M2 | Intel X5690, 3.47 GHz | 2/12/24 | 64.2 | 59.2 | Oracle Solaris 10 | Studio 12.2

Only 2-chip server systems listed below, excludes workstations.

SPECint_rate2006
System | Processor | Base Copies | c/c/t * | Peak | Base | O/S | Compiler
Sun Blade X6270 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 705 | 632 | Oracle Solaris 11 | Studio 12.3
Sun Fire X4270 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 705 | 630 | Oracle Solaris 11 | Studio 12.3
Sun Fire X4170 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 702 | 628 | Oracle Solaris 11 | Studio 12.3
Cisco UCS C220 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 697 | 671 | RHEL 6.2 | Intel 12.1.0.225
Sun Blade X6270 M2 | Intel X5690, 3.47 GHz | 24 | 2/12/24 | 410 | 386 | Oracle Linux 5.5 | Intel 12.0.1.116

SPECfp_rate2006
System | Processor | Base Copies | c/c/t * | Peak | Base | O/S | Compiler
Cisco UCS C240 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 510 | 496 | RHEL 6.2 | Intel 12.1.2.273
Sun Fire X4270 M3 | Intel E5-2690, 2.9 GHz | 64 | 2/16/32 | 497 | 461 | Oracle Solaris 11 | Studio 12.3
Sun Blade X6270 M3 | Intel E5-2690, 2.9 GHz | 32 | 2/16/32 | 497 | 460 | Oracle Solaris 11 | Studio 12.3
Sun Fire X4170 M3 | Intel E5-2690, 2.9 GHz | 64 | 2/16/32 | 495 | 464 | Oracle Solaris 11 | Studio 12.3
Sun Fire X4270 M2 | Intel X5690, 3.47 GHz | 24 | 2/12/24 | 273 | 265 | Oracle Linux 5.5 | Intel 12.0.1.116

* c/c/t — chips / cores / threads enabled
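
The "up to 1.8x" cross-generational claim can be checked against the peak scores in the four tables. The sketch below pairs the best Oracle M3 peak result in each table with the M2 result listed in the same table; the dictionary layout and names are illustrative only.

```python
# (best Oracle M3 peak, M2 peak) taken from the tables.
peak_scores = {
    "SPECint2006": (58.5, 46.2),
    "SPECfp2006": (96.8, 64.2),
    "SPECint_rate2006": (705, 410),
    "SPECfp_rate2006": (497, 273),
}

# Speedup of the new generation over the previous one, per metric.
speedups = {metric: new / old for metric, (new, old) in peak_scores.items()}
best_metric = max(speedups, key=speedups.get)

for metric, factor in speedups.items():
    print(f"{metric}: {factor:.2f}x")
print(f"largest gain: {best_metric} at {speedups[best_metric]:.1f}x")
```

The largest gain appears on the floating point rate metric, which is where the 1.8x figure comes from.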

Configuration Summary and Results

Hardware Configuration:

Sun Fire X4170 M3 server
2 x 2.90 GHz Intel Xeon E5-2690 processors
128 GB memory (16 x 8 GB 2Rx4 PC3-12800R-11, ECC)

Sun Fire X4270 M3 server
2 x 2.90 GHz Intel Xeon E5-2690 processors
128 GB memory (16 x 8 GB 2Rx4 PC3-12800R-11, ECC)

Sun Blade X6270 M3 server module
2 x 2.90 GHz Intel Xeon E5-2690 processors
128 GB memory (16 x 8 GB 2Rx4 PC3-12800R-11, ECC)

Software Configuration:

Oracle Solaris 11 11/11 (SRU2)
Oracle Solaris Studio 12.3 (patch update 1 nightly build 120313)
Oracle Linux Server Release 6.1
Intel C++ Studio XE 12.1.0.225
SPEC CPU2006 V1.2

Benchmark Description

SPEC CPU2006 is SPEC's most popular benchmark. It measures:

  • Speed — single copy performance of chip, memory, compiler
  • Rate — multiple copy (throughput)

The benchmark is also divided into integer intensive applications and floating point intensive applications:

  • integer: 12 benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
  • floating point: 17 benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

It is also divided depending upon the amount of optimization allowed:

  • base: optimization is consistent per compiled language, all benchmarks must be compiled with the same flags per language.
  • peak: specific compiler optimization is allowed per application.

The overall metrics for the benchmark which are commonly used are:

  • SPECint_rate2006, SPECint_rate_base2006: integer, rate
  • SPECfp_rate2006, SPECfp_rate_base2006: floating point, rate
  • SPECint2006, SPECint_base2006: integer, speed
  • SPECfp2006, SPECfp_base2006: floating point, speed
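
The metric names encode the three distinctions described above: integer or floating point, speed or rate, and base or peak. As an illustration, the small helper below decodes a metric name into those three attributes. It is a hypothetical sketch, not a SPEC tool, and it assumes that a name without "_base" denotes the peak metric, as the Peak and Base table columns suggest.

```python
import re

# SPEC<int|fp>[_rate][_base]2006
_METRIC = re.compile(r"^SPEC(int|fp)(_rate)?(_base)?2006$")


def classify(metric):
    """Decode a SPEC CPU2006 metric name into its three attributes."""
    m = _METRIC.match(metric)
    if m is None:
        raise ValueError(f"not a SPEC CPU2006 metric name: {metric!r}")
    kind, rate, base = m.groups()
    return {
        "workload": "integer" if kind == "int" else "floating point",
        "mode": "rate" if rate else "speed",
        # Assumption: names without "_base" refer to the peak metric.
        "tuning": "base" if base else "peak",
    }


for name in ("SPECint_rate2006", "SPECfp_rate_base2006",
             "SPECint_base2006", "SPECfp2006"):
    print(name, classify(name))
```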

See the SPEC website (www.spec.org) for additional information.

Disclosure Statement

SPEC and the benchmark names SPECfp and SPECint are registered trademarks of the Standard Performance Evaluation Corporation. Results as of 10 April 2012 from www.spec.org and this report.

SPEC CPU2006 Results on Oracle's Netra Server X3-2

Oracle's Netra Server X3-2 (formerly Sun Netra X4270 M3), equipped with the new Intel Xeon processor E5-2658, is up to 2.5x faster than the previous generation Netra systems on SPEC CPU2006 workloads.

Performance Landscape

Complete benchmark results are at the SPEC website, SPEC CPU2006 results. The tables below provide the new Oracle results and previous generation results.

SPECint2006
System | Processor | c/c/t * | Peak | Base | O/S | Compiler
Netra Server X3-2 | Intel E5-2658, 2.1 GHz | 2/16/32 | 38.5 | 36.0 | Oracle Linux 6.1 | Intel 12.1.0.225
Sun Netra X4270 | Intel L5518, 2.13 GHz | 2/8/16 | 27.9 | 25.0 | Oracle Linux 5.4 | Intel 11.1
Sun Netra X4250 | Intel L5408, 2.13 GHz | 2/8/8 | 20.3 | 17.9 | SLES 10 SP1 | Intel 11.0

SPECfp2006
System | Processor | c/c/t * | Peak | Base | O/S | Compiler
Netra Server X3-2 | Intel E5-2658, 2.1 GHz | 2/16/32 | 65.3 | 61.6 | Oracle Linux 6.1 | Intel 12.1.0.225
Sun Netra X4270 | Intel L5518, 2.13 GHz | 2/8/16 | 32.5 | 29.4 | Oracle Linux 5.4 | Intel 11.1
Sun Netra X4250 | Intel L5408, 2.13 GHz | 2/8/8 | 18.5 | 17.7 | SLES 10 SP1 | Intel 11.0

SPECint_rate2006
System | Processor | Base Copies | c/c/t * | Peak | Base | O/S | Compiler
Netra Server X3-2 | Intel E5-2658, 2.1 GHz | 32 | 2/16/32 | 477 | 455 | Oracle Linux 6.1 | Intel 12.1.0.225
Sun Netra X4270 | Intel L5518, 2.13 GHz | 16 | 2/8/16 | 201 | 189 | Oracle Linux 5.4 | Intel 11.1
Sun Netra X4250 | Intel L5408, 2.13 GHz | 8 | 2/8/8 | 103 | 82.0 | SLES 10 SP1 | Intel 11.0

SPECfp_rate2006
System | Processor | Base Copies | c/c/t * | Peak | Base | O/S | Compiler
Netra Server X3-2 | Intel E5-2658, 2.1 GHz | 32 | 2/16/32 | 392 | 383 | Oracle Linux 6.1 | Intel 12.1.0.225
Sun Netra X4270 | Intel L5518, 2.13 GHz | 16 | 2/8/16 | 155 | 153 | Oracle Linux 5.4 | Intel 11.1
Sun Netra X4250 | Intel L5408, 2.13 GHz | 8 | 2/8/8 | 55.9 | 52.3 | SLES 10 SP1 | Intel 11.0

* c/c/t — chips / cores / threads enabled

Configuration Summary

Hardware Configuration:

Netra Server X3-2
2 x 2.10 GHz Intel Xeon E5-2658 processors
128 GB memory (16 x 8 GB 2Rx4 PC3-12800R-11, ECC)

Software Configuration:

Oracle Linux Server Release 6.1
Intel C++ Studio XE 12.1.0.225
SPEC CPU2006 V1.2

Benchmark Description

SPEC CPU2006 is SPEC's most popular benchmark. It measures:

  • Speed — single copy performance of chip, memory, compiler
  • Rate — multiple copy (throughput)

The benchmark is also divided into integer intensive applications and floating point intensive applications:

  • integer: 12 benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
  • floating point: 17 benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

It is also divided depending upon the amount of optimization allowed:

  • base: optimization is consistent per compiled language, all benchmarks must be compiled with the same flags per language.
  • peak: specific compiler optimization is allowed per application.

The overall metrics for the benchmark which are commonly used are:

  • SPECint_rate2006, SPECint_rate_base2006: integer, rate
  • SPECfp_rate2006, SPECfp_rate_base2006: floating point, rate
  • SPECint2006, SPECint_base2006: integer, speed
  • SPECfp2006, SPECfp_base2006: floating point, speed

See here for additional information.

Disclosure Statement

SPEC and the benchmark names SPECfp and SPECint are registered trademarks of the Standard Performance Evaluation Corporation. Results as of 10 July 2012 from www.spec.org and this report.

Thursday Mar 29, 2012

Sun Server X2-8 (formerly Sun Fire X4800 M2) Delivers World Record TPC-C for x86 Systems

Oracle's Sun Server X2-8 (formerly Sun Fire X4800 M2 server) equipped with eight 2.4 GHz Intel Xeon Processor E7-8870 chips obtained a result of 5,055,888 tpmC on the TPC-C benchmark. This result is a world record for x86 servers. Oracle demonstrated this world record database performance running Oracle Database 11g Release 2 Enterprise Edition with Partitioning.

  • The Sun Server X2-8 delivered a new x86 TPC-C world record of 5,055,888 tpmC with a price performance of $0.89/tpmC using Oracle Database 11g Release 2. This configuration is available 7/10/12.

  • The Sun Server X2-8 delivers 3.0x better performance than the next 8-processor result, an IBM System p 570 equipped with POWER6 processors.

  • The Sun Server X2-8 has 3.1x better price/performance than the 8-processor 4.7 GHz POWER6 IBM System p 570.

  • The Sun Server X2-8 has 1.6x better performance than the 4-processor IBM x3850 X5 system equipped with Intel Xeon processors.

  • This is the first TPC-C result on any system using eight Intel Xeon Processor E7-8800 Series chips.

  • The Sun Server X2-8 is the first x86 system to get over 5 million tpmC.

  • The Oracle solution utilized Oracle Linux operating system and Oracle Database 11g Enterprise Edition Release 2 with Partitioning to produce the x86 world record TPC-C benchmark performance.

Performance Landscape

Select TPC-C results (sorted by tpmC, bigger is better)

System p/c/t tpmC Price/tpmC Avail Database Memory Size
Sun Server X2-8 8/80/160 5,055,888 0.89 USD 7/10/2012 Oracle 11g R2 4 TB
IBM x3850 X5 4/40/80 3,014,684 0.59 USD 7/11/2011 DB2 ESE 9.7 3 TB
IBM x3850 X5 4/32/64 2,308,099 0.60 USD 5/20/2011 DB2 ESE 9.7 1.5 TB
IBM System p 570 8/16/32 1,616,162 3.54 USD 11/21/2007 DB2 9.0 2 TB

p/c/t - processors, cores, threads
Avail - availability date
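
The throughput comparisons in the bullets can be recomputed from the table. This is a small Python sketch using only the tpmC and p/c/t values listed above; the function names are hypothetical, and per-core throughput is shown purely as derived arithmetic, not as a TPC metric.

```python
# (tpmC, processors, cores) copied from the table above.
RESULTS = {
    "Sun Server X2-8": (5_055_888, 8, 80),
    "IBM x3850 X5 (4/40/80)": (3_014_684, 4, 40),
    "IBM x3850 X5 (4/32/64)": (2_308_099, 4, 32),
    "IBM System p 570": (1_616_162, 8, 16),
}


def throughput_ratio(system, baseline):
    """How many times the tpmC of `baseline` the given system delivers."""
    return RESULTS[system][0] / RESULTS[baseline][0]


def tpmc_per_core(system):
    """tpmC divided by the number of cores in the system."""
    tpmc, _, cores = RESULTS[system]
    return tpmc / cores


for name in RESULTS:
    ratio = throughput_ratio("Sun Server X2-8", name)
    print(f"X2-8 vs {name}: {ratio:.2f}x tpmC; "
          f"{name} delivers {tpmc_per_core(name):,.0f} tpmC per core")
```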

Oracle and IBM TPC-C Response times

System tpmC Response Time (sec) New Order 90th% Response Time (sec) New Order Average
Sun Server X2-8 5,055,888 0.210 0.166
IBM x3850 X5 3,014,684 0.500 0.272
Ratios - Oracle Better 1.6x 1.4x 1.3x

Oracle uses average new order response time for comparison between Oracle and IBM.

Graphs of Oracle's and IBM's response times for New-Order can be found in the full disclosure reports on TPC's website TPC-C Official Result Page.

Configuration Summary and Results

Hardware Configuration:

Server
Sun Server X2-8
8 x 2.4 GHz Intel Xeon Processor E7-8870
4 TB memory
8 x 300 GB 10K RPM SAS internal disks
8 x Dual port 8 Gb/s FC HBA

Data Storage
10 x Sun Fire X4270 M2 servers configured as COMSTAR heads, each with
1 x 3.06 GHz Intel Xeon X5675 processor
8 GB memory
10 x 2 TB 7.2K RPM 3.5" SAS disks
2 x Sun Storage F5100 Flash Array storage (1.92 TB each)
1 x Brocade 5300 switch

Redo Storage
2 x Sun Fire X4270 M2 servers configured as COMSTAR heads, each with
1 x 3.06 GHz Intel Xeon X5675 processor
8 GB memory
11 x 2 TB 7.2K RPM 3.5" SAS disks

Clients
8 x Sun Fire X4170 M2 servers, each with
2 x 3.06 GHz Intel Xeon X5675 processors
48 GB memory
2 x 300 GB 10K RPM SAS disks

Software Configuration:

Oracle Linux (Sun Fire X4800 M2)
Oracle Solaris 11 Express (COMSTAR for Sun Fire X4270 M2)
Oracle Solaris 10 9/10 (Sun Fire X4170 M2)
Oracle Database 11g Release 2 Enterprise Edition with Partitioning
Oracle iPlanet Web Server 7.0 U5
Tuxedo CFS-R Tier 1

Results:

System: Sun Server X2-8
tpmC: 5,055,888
Price/tpmC: 0.89 USD
Available: 7/10/2012
Database: Oracle Database 11g
Cluster: no
New Order Average Response: 0.166 seconds

Benchmark Description

TPC-C is an OLTP system benchmark. It simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.

Key Points and Best Practices

  • Oracle Database 11g Release 2 Enterprise Edition with Partitioning scales easily to this high level of performance.

  • COMSTAR (Common Multiprotocol SCSI Target) is the software framework that enables an Oracle Solaris host to serve as a SCSI Target platform. COMSTAR uses a modular approach to break the huge task of handling all the different pieces in a SCSI target subsystem into independent functional modules which are glued together by the SCSI Target Mode Framework (STMF). The modules implementing functionality at the SCSI level (disk, tape, medium changer, etc.) are not required to know about the underlying transport, and the modules implementing the transport protocol (FC, iSCSI, etc.) are not aware of the SCSI-level functionality of the packets they are transporting. The framework hides the details of allocation, providing execution context, and cleanup of SCSI commands and associated resources, which simplifies the task of writing the SCSI or transport modules.

  • Oracle iPlanet Web Server middleware is used for the client tier of the benchmark. Each web server instance supports more than a quarter-million users while satisfying the response time requirement from the TPC-C benchmark.
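
The separation of concerns described in the COMSTAR bullet can be illustrated with a toy model. This is only a conceptual Python sketch: every class and method name here (ScsiCommand, DiskLogicalUnit, Framework, Port, handle, dispatch, receive) is invented for illustration and is not the COMSTAR or STMF programming interface.

```python
from dataclasses import dataclass


@dataclass
class ScsiCommand:
    """A transport-neutral command passed between modules (illustrative)."""
    opcode: str
    lun: int


class DiskLogicalUnit:
    """SCSI-level module: acts on commands without knowing how they arrived."""

    def handle(self, cmd: ScsiCommand) -> str:
        return f"disk lun {cmd.lun}: {cmd.opcode} done"


class Framework:
    """Stand-in for the glue layer that routes commands between modules."""

    def __init__(self):
        self._units = {}

    def register_unit(self, lun: int, unit) -> None:
        self._units[lun] = unit

    def dispatch(self, cmd: ScsiCommand) -> str:
        # In a real framework, allocation, execution context and cleanup
        # would be handled here, hidden from both kinds of module.
        return self._units[cmd.lun].handle(cmd)


class Port:
    """Transport module: delivers commands without interpreting them."""

    def __init__(self, name: str, framework: Framework):
        self.name = name
        self.framework = framework

    def receive(self, opcode: str, lun: int) -> str:
        return self.framework.dispatch(ScsiCommand(opcode, lun))


framework = Framework()
framework.register_unit(0, DiskLogicalUnit())
fc = Port("fc", framework)
iscsi = Port("iscsi", framework)
print(fc.receive("READ", 0))
print(iscsi.receive("READ", 0))
```

Both ports reach the same logical unit through the framework, so adding a transport does not require touching the SCSI-level module, which is the point the bullet makes.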

Disclosure Statement

TPC Benchmark C, tpmC, and TPC-C are trademarks of the Transaction Processing Performance Council (TPC). Sun Server X2-8 (8/80/160) with Oracle Database 11g Release 2 Enterprise Edition with Partitioning, 5,055,888 tpmC, $0.89 USD/tpmC, available 7/10/2012. IBM x3850 X5 (4/40/80) with DB2 ESE 9.7, 3,014,684 tpmC, $0.59 USD/tpmC, available 7/11/2011. IBM x3850 X5 (4/32/64) with DB2 ESE 9.7, 2,308,099 tpmC, $0.60 USD/tpmC, available 5/20/2011. IBM System p 570 (8/16/32) with DB2 9.0, 1,616,162 tpmC, $3.54 USD/tpmC, available 11/21/2007. Source: http://www.tpc.org/tpcc, results as of 7/15/2011.

Monday Feb 27, 2012

Sun ZFS Storage 7320 Appliance 33% Faster Than NetApp FAS3270 on SPECsfs2008

Oracle's Sun ZFS Storage 7320 appliance delivered outstanding performance on the SPECsfs2008 NFS benchmark, beating results published on NetApp's fastest midrange platform, the NetApp FAS3270, and the EMC Gateway NS-G8 Server Failover Cluster.

  • The Sun ZFS Storage 7320 appliance delivered 134,140 SPECsfs2008_nfs.v3 Ops/sec with an Overall Response Time (ORT) of 1.51 msec on the SPECsfs2008 NFS benchmark.

  • The Sun ZFS Storage 7320 appliance has 33% higher throughput than the NetApp FAS3270 on the SPECsfs2008 NFS benchmark.

  • The Sun ZFS Storage 7320 appliance required less than half the rack space of the NetApp FAS3270.

  • The Sun ZFS Storage 7320 appliance has 9% better Overall Response Time than the NetApp FAS3270 on the SPECsfs2008 NFS benchmark.

Performance Landscape

SPECsfs2008_nfs.v3 Performance Chart (in decreasing SPECsfs2008_nfs.v3 Ops/sec order)

Sponsor System Throughput (Ops/sec) Overall Response Time (msec) Memory (GB) Disks Exported Capacity (TB) Rack Units Controllers+Disks
EMC VG8 135,521 1.92 280 312 19.2
Oracle 7320 134,140 1.51 288 136 37.0 26
EMC NS-G8 110,621 2.32 264 100 17.6
NetApp FAS3270 101,183 1.66 40 360 110.1 66

Throughput SPECsfs2008_nfs.v3 Ops/sec = the Performance Metric
Overall Response Time = the corresponding Response Time Metric
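
The percentage claims in the bullets follow directly from the table. Below is a minimal Python sketch of that arithmetic using the Oracle 7320 and NetApp FAS3270 rows; the helper names are illustrative, and ops/sec per rack unit is a derived figure rather than a SPECsfs2008 metric.

```python
# (ops/sec, overall response time in msec, rack units) from the table above.
ORACLE_7320 = (134_140, 1.51, 26)
NETAPP_FAS3270 = (101_183, 1.66, 66)


def percent_higher(a, b):
    """How much larger a is than b, in percent."""
    return (a / b - 1.0) * 100.0


def percent_lower(a, b):
    """How much smaller a is than b, in percent."""
    return (1.0 - a / b) * 100.0


throughput_gain = percent_higher(ORACLE_7320[0], NETAPP_FAS3270[0])
response_gain = percent_lower(ORACLE_7320[1], NETAPP_FAS3270[1])
rack_fraction = ORACLE_7320[2] / NETAPP_FAS3270[2]

print(f"throughput: {throughput_gain:.1f}% higher")
print(f"overall response time: {response_gain:.1f}% lower")
print(f"rack space: {rack_fraction:.0%} of the FAS3270 configuration")
print(f"ops/sec per rack unit: {ORACLE_7320[0] / ORACLE_7320[2]:,.0f} "
      f"vs {NETAPP_FAS3270[0] / NETAPP_FAS3270[2]:,.0f}")
```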

Complete SPECsfs2008 benchmark results may be found at http://www.spec.org/sfs2008/results/sfs2008.html.

Configuration Summary

Storage Configuration:

Sun ZFS Storage 7320 appliance in clustered configuration
2 x Sun ZFS Storage 7320 controllers, each with
2 x 2.4 GHz Intel Xeon E5620 processors
144 GB memory
4 x 512 GB SSD flash-enabled read-cache
6 x Sun Disk shelves
4 x shelves with 24 x 300 GB 15K RPM SAS-2 drives
2 x shelves with 20 x 300 GB 15K RPM SAS-2 drives and 4 x 73 GB SAS-2 flash-enabled write-cache

Server Configuration:

3 x Sun Fire X4270 M2 servers, each with
2 x 2.4 GHz Intel Xeon E5620 processors
12 GB memory
1 x 10 GbE connection to the Sun ZFS Storage 7320 appliance
Oracle Solaris 10 8/11

Benchmark Description

SPECsfs2008 is the latest version of the Standard Performance Evaluation Corporation (SPEC) benchmark suite measuring file server throughput and response time, providing a standardized method for comparing performance across different vendor platforms. SPECsfs2008 results summarize the server's capabilities with respect to the number of operations that can be handled per second, as well as the overall latency of the operations. The suite is a follow-on to the SFS97_R1 benchmark, adding a CIFS workload, an updated NFSv3 workload, support for additional client platforms, and a new test harness and reporting/submission framework.

Disclosure Statement

SPEC and SPECsfs are registered trademarks of Standard Performance Evaluation Corporation (SPEC). Results as of February 22, 2012, for more information see www.spec.org. Sun ZFS Storage 7320 Appliance 134,140 SPECsfs2008_nfs.v3 Ops/sec, 1.51 msec ORT, NetApp FAS3270 101,183 SPECsfs2008_nfs.v3 Ops/Sec, 1.66 msec ORT, EMC Celerra Gateway NS-G8 Server Failover Cluster, 3 Datamovers (1 stdby) / Symmetrix V-Max 110,621 SPECsfs2008_nfs.v3 Ops/Sec, 2.32 msec ORT.

Thursday Jan 12, 2012

Netra SPARC T4-2 SPECjvm2008 World Record Performance

Oracle's Netra SPARC T4-2 server equipped with two SPARC T4 processors running at 2.85 GHz delivered a World Record result of 454.52 SPECjvm2008 Peak ops/m on the SPECjvm2008 benchmark. This result just eclipsed the previous record which was run on a similar product, Oracle's SPARC T4-2 server, which is also a two SPARC T4 processor based system.

  • The Netra SPARC T4-2 server demonstrates 41% better performance than the SPARC T3-2 server and similar performance to Oracle's SPARC T4-2 server.

  • The Netra SPARC T4-2 server running the SPECjvm2008 benchmark achieved a score of 454.52 SPECjvm2008 Peak ops/m while the Sun Blade X6270 server module achieved 317.13 SPECjvm2008 Base ops/m.

  • Hardware cryptography acceleration on the Netra SPARC T4-2 server greatly increases performance on the subtests that use AES and RSA encryption ciphers.

  • This result was produced using Oracle Solaris 11 and Oracle JDK 7 Update 2.

  • There are no SPECjvm2008 results published by IBM on POWER7 based systems.

  • The Netra SPARC T4-2 server demonstrates Oracle's position of leadership in Java-based computing by publishing world record results for the SPECjvm2008 benchmark.

Performance Landscape

Complete benchmark results are at the SPECjvm2008 website.

SPECjvm2008 Performance Chart
(ordered by performance)
System Processors Performance (base) Performance (peak)
Netra SPARC T4-2 2 x 2.85 GHz SPARC T4 - 454.52
SPARC T4-2 2 x 2.85 GHz SPARC T4 - 454.25
SPARC T3-2 2 x 1.65 GHz SPARC T3 - 320.52
Sun Blade X6270 2 x 2.93 GHz Intel X5570 317.13 -

base: SPECjvm2008 Base ops/m (bigger is better)
peak: SPECjvm2008 Peak ops/m (bigger is better)

SPEC allows base and peak to be submitted separately. The base metric does not allow any optimization of the JVM; the peak metric allows optimization.

Configuration Summary

Hardware Configuration:

Netra SPARC T4-2 server
2 x 2.85 GHz SPARC T4 processors
256 GB memory

Software Configuration:

Oracle Solaris 11 11/11
Java Platform, Standard Edition, JDK 7 Update 2

Benchmark Description

SPECjvm2008 (Java Virtual Machine Benchmark) is a benchmark suite for measuring the performance of a Java Runtime Environment (JRE), containing several real life applications and benchmarks focusing on core Java functionality. The suite focuses on the performance of the JRE executing a single application; it reflects the performance of the hardware processor and memory subsystem, but has low dependence on file I/O and includes no network I/O across machines.

The SPECjvm2008 workload mimics a variety of common general purpose application computations. These characteristics reflect the intent that this benchmark will be applicable to measuring basic Java performance on a wide variety of both client and server systems.

SPECjvm2008 benchmark highlights:

  • Leverages real life applications (like derby, sunflow, and javac) and area-focused benchmarks (like xml, serialization, crypto, and scimark).
  • Also measures the performance of the operating system and hardware in the context of executing the JRE.

The current rules for the benchmark allow either base or peak to be run. The base run is done without any tuning of the JVM to improve the out-of-the-box performance. The peak run allows tuning of the JVM.

Key Points and Best Practices

  • Enhancements to the JVM had a major impact on performance, especially for the security tests.

Disclosure Statement

SPEC and SPECjvm are registered trademarks of Standard Performance Evaluation Corporation. Results from www.spec.org and this report as of 1/9/2012. Netra SPARC T4-2 454.52 SPECjvm2008 Peak ops/m submitted for review, SPARC T4-2 454.25 SPECjvm2008 Peak ops/m, SPARC T3-2 320.52 SPECjvm2008 Peak ops/m, Sun Blade X6270 317.13 SPECjvm2008 Base ops/m.

Wednesday Nov 30, 2011

SPARC T4-4 Beats 8-CPU IBM POWER7 on TPC-H @3000GB Benchmark

Oracle's SPARC T4-4 server delivered a world record TPC-H @3000GB benchmark result for systems with four processors. This result beats eight processor results from IBM (POWER7) and HP (x86). The SPARC T4-4 server also delivered better performance per core than these eight processor systems from IBM and HP. Comparisons below are based upon system to system comparisons, highlighting Oracle's complete software and hardware solution.

This database world record result used Oracle's Sun Storage 2540-M2 arrays (rotating disk) connected to a SPARC T4-4 server running Oracle Solaris 11 and Oracle Database 11g Release 2 demonstrating the power of Oracle's integrated hardware and software solution.

  • The SPARC T4-4 server based configuration achieved a TPC-H scale factor 3000 world record for four processor systems of 205,792 QphH@3000GB with price/performance of $4.10/QphH@3000GB.

  • The SPARC T4-4 server with four SPARC T4 processors (total of 32 cores) is 7% faster than the IBM Power 780 server with eight POWER7 processors (total of 32 cores) on the TPC-H @3000GB benchmark.

  • The SPARC T4-4 server is 36% better in price performance compared to the IBM Power 780 server on the TPC-H @3000GB Benchmark.

  • The SPARC T4-4 server is 29% faster than the IBM Power 780 for data loading.

  • The SPARC T4-4 server is up to 3.4 times faster than the IBM Power 780 server for the Refresh Function.

  • The SPARC T4-4 server with four SPARC T4 processors is 27% faster than the HP ProLiant DL980 G7 server with eight x86 processors on the TPC-H @3000GB benchmark.

  • The SPARC T4-4 server is 52% faster than the HP ProLiant DL980 G7 server for data loading.

  • The SPARC T4-4 server is up to 3.2 times faster than the HP ProLiant DL980 G7 for the Refresh Function.

  • The SPARC T4-4 server achieved a peak IO rate from the Oracle database of 17 GB/sec. This rate was independent of the storage used, as demonstrated by the TPC-H @3000GB benchmark which used twelve Sun Storage 2540-M2 arrays (rotating disk) and the TPC-H @1000GB benchmark which used four Sun Storage F5100 Flash Array devices (flash storage). [*]

  • The SPARC T4-4 server showed linear scaling from TPC-H @1000GB to TPC-H @3000GB. This demonstrates that the SPARC T4-4 server can handle the increasingly larger databases required of DSS systems. [*]

  • The SPARC T4-4 server benchmark results demonstrate a complete solution of building Decision Support Systems including data loading, business questions and refreshing data. Each phase usually has a time constraint and the SPARC T4-4 server shows superior performance during each phase.

[*] The TPC believes that comparisons of results published with different scale factors are misleading and discourages such comparisons.

Performance Landscape

The table lists the leading TPC-H @3000GB results for non-clustered systems.

TPC-H @3000GB, Non-Clustered Systems
System Processor P/C/T – Memory Composite (QphH) $/perf ($/QphH) Power (QppH) Throughput (QthH) Database Available
SPARC Enterprise M9000 3.0 GHz SPARC64 VII+ 64/256/256 – 1024 GB 386,478.3 $18.19 316,835.8 471,428.6 Oracle 11g R2 09/22/11
SPARC T4-4 3.0 GHz SPARC T4 4/32/256 – 1024 GB 205,792.0 $4.10 190,325.1 222,515.9 Oracle 11g R2 05/31/12
SPARC Enterprise M9000 2.88 GHz SPARC64 VII 32/128/256 – 512 GB 198,907.5 $15.27 182,350.7 216,967.7 Oracle 11g R2 12/09/10
IBM Power 780 4.1 GHz POWER7 8/32/128 – 1024 GB 192,001.1 $6.37 210,368.4 175,237.4 Sybase 15.4 11/30/11
HP ProLiant DL980 G7 2.27 GHz Intel Xeon X7560 8/64/128 – 512 GB 162,601.7 $2.68 185,297.7 142,685.6 SQL Server 2008 10/13/10

P/C/T = Processors, Cores, Threads
QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric in USD (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity

The following table lists data load times and refresh function times during the power run.

TPC-H @3000GB, Non-Clustered Systems
Database Load & Database Refresh
System Processor Data Loading (h:m:s) T4 Advan RF1 (sec) T4 Advan RF2 (sec) T4 Advan
SPARC T4-4 3.0 GHz SPARC T4 04:08:29 1.0x 67.1 1.0x 39.5 1.0x
IBM Power 780 4.1 GHz POWER7 05:51:50 1.5x 147.3 2.2x 133.2 3.4x
HP ProLiant DL980 G7 2.27 GHz Intel Xeon X7560 08:35:17 2.1x 173.0 2.6x 126.3 3.2x

Data Loading = database load time
RF1 = power test first refresh transaction
RF2 = power test second refresh transaction
T4 Advan = the ratio of time to T4 time
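
The "T4 Advan" columns are simple time ratios. As a rough Python sketch of that definition, the load times and RF2 times from the table can be converted and divided as follows; the function names are illustrative, and ratios recomputed this way may differ slightly from the published, rounded column.

```python
def to_seconds(hms):
    """Convert an 'h:m:s' string such as '04:08:29' to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the table above.
LOAD_SECONDS = {
    "SPARC T4-4": to_seconds("04:08:29"),
    "IBM Power 780": to_seconds("05:51:50"),
    "HP ProLiant DL980 G7": to_seconds("08:35:17"),
}
RF2_SECONDS = {
    "SPARC T4-4": 39.5,
    "IBM Power 780": 133.2,
    "HP ProLiant DL980 G7": 126.3,
}


def t4_advantage(times, system, baseline="SPARC T4-4"):
    """Ratio of a system's elapsed time to the SPARC T4-4 elapsed time."""
    return times[system] / times[baseline]


for system in LOAD_SECONDS:
    print(f"{system}: load {t4_advantage(LOAD_SECONDS, system):.2f}x, "
          f"RF2 {t4_advantage(RF2_SECONDS, system):.2f}x")
```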

Complete benchmark results found at the TPC benchmark website http://www.tpc.org.

Configuration Summary and Results

Hardware Configuration:

SPARC T4-4 server
4 x SPARC T4 3.0 GHz processors (total of 32 cores, 256 threads)
1024 GB memory
8 x internal SAS (8 x 300 GB) disk drives

External Storage:

12 x Sun Storage 2540-M2 array storage, each with
12 x 15K RPM 300 GB drives, 2 controllers, 2 GB cache

Software Configuration:

Oracle Solaris 11 11/11
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 3000 GB (Scale Factor 3000)
TPC-H Composite: 205,792.0 QphH@3000GB
Price/performance: $4.10/QphH@3000GB
Available: 05/31/2012
Total 3 year Cost: $843,656
TPC-H Power: 190,325.1
TPC-H Throughput: 222,515.9
Database Load Time: 4:08:29

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB, 10000GB, 30000GB and 100000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multiple user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor.
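
The relationship between the audited numbers can be sketched in a few lines of Python. The TPC-H specification defines the composite metric as the geometric mean of the Power and Throughput figures, and price/performance as total cost divided by QphH; the function names below are illustrative, and the inputs are the audited results from this report.

```python
import math


def composite_qphh(power, throughput):
    """Geometric mean of the TPC-H Power and Throughput figures."""
    return math.sqrt(power * throughput)


def price_performance(total_cost_usd, qphh):
    """Total 3-year cost in USD divided by the composite metric."""
    return total_cost_usd / qphh


# Audited SPARC T4-4 results at scale factor 3000, from this report.
qphh = composite_qphh(190_325.1, 222_515.9)
dollars_per_qphh = price_performance(843_656, 205_792.0)

print(f"composite: {qphh:,.1f} QphH@3000GB")
print(f"price/performance: ${dollars_per_qphh:.2f}/QphH@3000GB")
```

Because the composite is a geometric mean, a system cannot reach a high QphH on single-stream power or multi-stream throughput alone.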

Key Points and Best Practices

  • Twelve Sun Storage 2540-M2 arrays were used for the benchmark. Each Sun Storage 2540-M2 array contains 12 15K RPM drives and is connected to a single dual port 8Gb FC HBA using 2 ports. Each Sun Storage 2540-M2 array showed 1.5 GB/sec for sequential read operations and showed linear scaling, achieving 18 GB/sec with twelve Sun Storage 2540-M2 arrays. These were stand alone IO tests.

  • The peak IO rate measured from the Oracle database was 17 GB/sec.

  • Oracle Solaris 11 11/11 required very little system tuning.

  • Some vendors try to make the point that storage ratios are of customer concern. However, storage ratio size has more to do with disk layout and the increasing capacities of disks – so this is not an important metric in which to compare systems.

  • The SPARC T4-4 server and Oracle Solaris efficiently managed the system load of over one thousand Oracle Database parallel processes.

  • Six Sun Storage 2540-M2 arrays were mirrored to another six Sun Storage 2540-M2 arrays on which all of the Oracle database files were placed. IO performance was high and balanced across all the arrays.

  • The TPC-H Refresh Function (RF) simulates the periodic refresh portion of a data warehouse by adding new sales data and deleting old sales data. Parallel DML (parallel insert and delete in this case) and database log performance are key to this function, and the SPARC T4-4 server outperformed both the IBM POWER7 server and the HP ProLiant DL980 G7 server. (See the RF columns above.)
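
The storage figures quoted in these key points can be cross-checked with simple arithmetic. This is a back-of-the-envelope Python sketch using only numbers stated above; the capacity figures are raw drive capacity before any formatting overhead, which is an assumption of this sketch rather than a reported value.

```python
ARRAYS = 12                 # Sun Storage 2540-M2 arrays
DRIVES_PER_ARRAY = 12       # 15K RPM drives per array
DRIVE_GB = 300              # capacity per drive
PER_ARRAY_READ_GB_S = 1.5   # stand-alone sequential read rate per array
DATABASE_PEAK_GB_S = 17.0   # peak IO rate measured from the database

# Aggregate stand-alone read bandwidth if scaling is linear.
aggregate_read_gb_s = ARRAYS * PER_ARRAY_READ_GB_S

# Share of the stand-alone bandwidth the database actually drove.
database_fraction = DATABASE_PEAK_GB_S / aggregate_read_gb_s

# Raw capacity, and the half that remains when six arrays mirror six.
raw_gb = ARRAYS * DRIVES_PER_ARRAY * DRIVE_GB
mirrored_gb = raw_gb // 2

print(f"aggregate read: {aggregate_read_gb_s:.0f} GB/sec")
print(f"database peak: {database_fraction:.0%} of stand-alone bandwidth")
print(f"raw capacity: {raw_gb:,} GB, after mirroring: {mirrored_gb:,} GB")
```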

Disclosure Statement

TPC-H, QphH, $/QphH are trademarks of Transaction Processing Performance Council (TPC). For more information, see www.tpc.org. SPARC T4-4 205,792.0 QphH@3000GB, $4.10/QphH@3000GB, available 5/31/12, 4 processors, 32 cores, 256 threads; IBM Power 780 192,001.1 QphH@3000GB, $6.37/QphH@3000GB, available 11/30/11, 8 processors, 32 cores, 128 threads; HP ProLiant DL980 G7 162,601.7 QphH@3000GB, $2.68/QphH@3000GB, available 10/13/10, 8 processors, 64 cores, 128 threads.

Wednesday Nov 09, 2011

SPARC T4-2 Delivers World Record SPECjvm2008 Result with Oracle Solaris 11

Oracle's SPARC T4-2 server equipped with two SPARC T4 processors running at 2.85 GHz delivered a World Record result of 454.25 SPECjvm2008 Peak ops/m on the SPECjvm2008 benchmark.

  • The SPARC T4-2 server demonstrates 41% better performance than the SPARC T3-2 server.

  • Hardware cryptography acceleration on the SPARC T4-2 server greatly increases performance on the subtests that use AES and RSA encryption ciphers.

  • This result was produced using Oracle Solaris 11 and Oracle JDK 7 Update 2.

  • There are no SPECjvm2008 results published by IBM on POWER7 based systems.

  • The SPARC T4-2 server demonstrates Oracle's position of leadership in Java-based computing by publishing world record results for the SPECjvm2008 benchmark.

Performance Landscape

Complete benchmark results are at the SPECjvm2008 website.

SPECjvm2008 Performance Chart
(ordered by performance)
System Processors Performance (base) Performance (peak)
SPARC T4-2 2 x 2.85 GHz SPARC T4 - 454.25
SPARC T3-2 2 x 1.65 GHz SPARC T3 - 320.52
Sun Blade X6270 2 x 2.93 GHz Intel X5570 317.13 -

base: SPECjvm2008 Base ops/m (bigger is better)
peak: SPECjvm2008 Peak ops/m (bigger is better)

SPEC allows base and peak to be submitted separately. The base metric does not allow any optimization of the JVM; the peak metric allows optimization.

Configuration Summary

Hardware Configuration:

SPARC T4-2 server
2 x 2.85 GHz SPARC T4 processors
256 GB memory

Software Configuration:

Oracle Solaris 11 11/11
Java Platform, Standard Edition, JDK 7 Update 2

Benchmark Description

SPECjvm2008 (Java Virtual Machine Benchmark) is a benchmark suite for measuring the performance of a Java Runtime Environment (JRE), containing several real life applications and benchmarks focusing on core Java functionality. The suite focuses on the performance of the JRE executing a single application; it reflects the performance of the hardware processor and memory subsystem, but has low dependence on file I/O and includes no network I/O across machines.

The SPECjvm2008 workload mimics a variety of common general purpose application computations. These characteristics reflect the intent that this benchmark will be applicable to measuring basic Java performance on a wide variety of both client and server systems.

SPECjvm2008 benchmark highlights:

  • Leverages real life applications (like derby, sunflow, and javac) and area-focused benchmarks (like xml, serialization, crypto, and scimark).
  • Also measures the performance of the operating system and hardware in the context of executing the JRE.

The current rules for the benchmark allow either base or peak to be run. The base run is done without any tuning of the JVM to improve the out-of-the-box performance. The peak run allows tuning of the JVM.

Key Points and Best Practices

  • Enhancements to the JVM had a major impact on performance, especially for the security tests.

Disclosure Statement

SPEC and SPECjvm are registered trademarks of Standard Performance Evaluation Corporation. Results from www.spec.org and this report as of 11/9/2011. SPARC T4-2 454.25 SPECjvm2008 Peak ops/m submitted for review, SPARC T3-2 320.52 SPECjvm2008 Peak ops/m, Sun Blade X6270 317.13 SPECjvm2008 Base ops/m.

Monday Oct 03, 2011

SPARC T4-4 Beats IBM POWER7 and HP Itanium on TPC-H @1000GB Benchmark

Oracle's SPARC T4-4 server configured with SPARC T4 processors, Oracle's Sun Storage F5100 Flash Array storage, Oracle Solaris, and Oracle Database 11g Release 2 achieved a TPC-H benchmark performance result of 201,487 QphH@1000GB with price/performance of $4.60/QphH@1000GB.

  • The SPARC T4-4 server benchmark results demonstrate a complete solution of building Decision Support Systems including data loading, business questions and refreshing data. Each phase usually has a time constraint and the SPARC T4-4 server shows superior performance during each phase.

  • The SPARC T4-4 server is 22% faster than the 8-socket IBM POWER7 server with the same number of cores. The SPARC T4-4 server has over twice the performance per socket compared to the IBM POWER7 server.

  • The SPARC T4-4 server achieves 33% better price/performance than the IBM POWER7 server.

  • The SPARC T4-4 server is up to 4 times faster than the IBM POWER7 server for the Refresh Function.

  • The SPARC T4-4 server is 44% faster than the HP Superdome 2 server. The SPARC T4-4 server has 5.7x the performance per socket of the HP Superdome 2 server.

  • The SPARC T4-4 server is 62% better on price/performance than the HP Itanium server.

  • The SPARC T4-4 server is up to 3.7 times faster than the HP Itanium server for the Refresh Function.

  • The SPARC T4-4 server delivers nearly the same performance as Oracle's SPARC Enterprise M8000 server, but with 52% better price/performance on the TPC-H @1000GB benchmark.

  • Oracle used Storage Redundancy Level 3 as defined by the TPC-H 2.14.2 specification which is the strictest level.

  • This TPC-H result demonstrates that the SPARC T4-4 server can deliver the performance while running the increasingly larger databases required of DSS systems. The server measured more than 16 GB/sec of IO throughput through Oracle Database 11g Release 2 software while maintaining a high CPU load.

Performance Landscape

The table below lists published non-cluster results from comparable enterprise class systems from Oracle, IBM and HP. Each system was configured with 512 GB of memory.

TPC-H @1000GB

System CPU type Proc/Core/Thread Composite (QphH) $/perf ($/QphH) Power (QppH) Throughput (QthH) Database Available
SPARC Enterprise M8000 3 GHz SPARC64 VII+ 16 / 64 / 128 209,533.6 $9.53 177,845.9 246,867.2 Oracle 11g 09/22/11
SPARC T4-4 3 GHz SPARC T4 4 / 32 / 256 201,487.0 $4.60 181,760.6 223,354.2 Oracle 11g 10/30/11
IBM Power 780 4.14 GHz POWER7 8 / 32 / 128 164,747.2 $6.85 170,206.4 159,463.1 Sybase 03/31/11
HP Superdome 2 1.73 GHz Intel Itanium 9350 16 / 64 / 64 140,181.1 $12.15 139,181.0 141,188.3 Oracle 11g 10/20/10

QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity
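
The per-socket comparisons in the bullets can be derived from the composite scores and processor counts in the table. A short Python sketch follows; the names are illustrative, and performance per socket is derived arithmetic rather than a TPC-defined metric.

```python
# (QphH@1000GB, processors) copied from the table above.
SYSTEMS = {
    "SPARC Enterprise M8000": (209_533.6, 16),
    "SPARC T4-4": (201_487.0, 4),
    "IBM Power 780": (164_747.2, 8),
    "HP Superdome 2": (140_181.1, 16),
}


def qphh_per_socket(system):
    """Composite score divided by the number of processors."""
    qphh, sockets = SYSTEMS[system]
    return qphh / sockets


def per_socket_ratio(system, baseline):
    """Per-socket performance of `system` relative to `baseline`."""
    return qphh_per_socket(system) / qphh_per_socket(baseline)


for other in ("IBM Power 780", "HP Superdome 2"):
    overall = SYSTEMS["SPARC T4-4"][0] / SYSTEMS[other][0]
    print(f"SPARC T4-4 vs {other}: {overall:.2f}x overall, "
          f"{per_socket_ratio('SPARC T4-4', other):.2f}x per socket")
```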

Complete benchmark results found at the TPC benchmark website http://www.tpc.org.

Configuration Summary and Results

Hardware Configuration:

SPARC T4-4 server
4 x SPARC T4 3.0 GHz processors (total of 32 cores, 256 threads)
512 GB memory
8 x internal SAS (8 x 300 GB) disk drives

External Storage:

4 x Sun Storage F5100 Flash Array storage, each with
80 x 24 GB Flash Modules

Software Configuration:

Oracle Solaris 10 8/11
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 1000 GB (Scale Factor 1000)
TPC-H Composite: 201,487 QphH@1000GB
Price/performance: $4.60/QphH@1000GB
Available: 10/30/2011
Total 3 Year Cost: $925,525
TPC-H Power: 181,760.6
TPC-H Throughput: 223,354.2
Database Load Time: 1:22:39

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multiple user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH.

Key Points and Best Practices

  • Four Sun Storage F5100 Flash Array devices were used for the benchmark. Each F5100 device contains 80 flash modules (FMODs). Twenty (20) FMODs from each F5100 device were connected to a single SAS 6 Gb HBA. A single F5100 device showed 4.16 GB/sec for sequential read and demonstrated linear scaling of 16.62 GB/sec with 4 x F5100 devices.

  • The IO rate from the Oracle database was over 16 GB/sec.

  • Oracle Solaris 10 8/11 required very little system tuning.

  • The SPARC T4-4 server and Oracle Solaris efficiently managed the system load of over one thousand Oracle parallel processes.

  • The Oracle database files for tables and indexes were managed by Oracle Automatic Storage Management (ASM) with a 4M stripe. Two F5100 devices were mirrored to another 2 F5100 devices under ASM. IO performance was high and balanced across all the FMODs.
  • The Oracle redo log files were mirrored across the F5100 devices using Oracle Solaris Volume Manager with 128K stripe.
  • Parallel degree on tables and indexes was set to 128. This setting worked the best for performance.
  • The TPC-H Refresh Function simulates the periodic refresh of a data warehouse by adding new sales data and deleting old sales data. Parallel DML (parallel insert and delete in this case) and database log performance are key for this function, and the SPARC T4-4 server outperformed both the HP Superdome 2 and IBM POWER7 servers.

See Also

Disclosure Statement

TPC-H, QphH, $/QphH are trademarks of Transaction Processing Performance Council (TPC). For more information, see www.tpc.org. SPARC T4-4 201,487 QphH@1000GB, $4.60/QphH@1000GB, avail 10/30/2011, 4 processors, 32 cores, 256 threads; SPARC Enterprise M8000 209,533.6 QphH@1000GB, $9.53/QphH@1000GB, avail 09/22/11, 16 processors, 64 cores, 128 threads; IBM Power 780 164,747.2 QphH@1000GB, $6.85/QphH@1000GB, avail 03/31/11, 8 processors, 32 cores, 128 threads; HP Integrity Superdome 2 140,181.1 QphH@1000GB, $12.15/QphH@1000GB, avail 10/20/10, 16 processors, 64 cores, 64 threads.

Sun ZFS Storage 7420 Appliance Doubles NetApp FAS3270A on SPC-1 Benchmark

Oracle's Sun ZFS Storage 7420 appliance delivered outstanding performance and price/performance on the SPC Benchmark 1, beating results published on the NetApp FAS3270A.

  • The Sun ZFS Storage 7420 appliance delivered 137,066.20 SPC-1 IOPS at $2.99/SPC-1 IOPS on the SPC-1 benchmark.

  • The Sun ZFS Storage 7420 appliance outperformed the NetApp FAS3270A by 2x on the SPC-1 benchmark.

  • The Sun ZFS Storage 7420 appliance outperformed the NetApp FAS3270A by 2.5x on price/performance on the SPC-1 benchmark.

Performance Landscape

SPC-1 Performance Chart (in decreasing performance order)

System | SPC-1 IOPS | $/SPC-1 IOPS | ASU Capacity (GB) | TSC Price | Data Protection Level | Date | Results Identifier
Huawei Symantec S6800T | 150,061.17 | $3.08 | 43,937.515 | $461,471.75 | Mirroring | 08/31/11 | A00107
Sun ZFS Storage 7420 | 137,066.20 | $2.99 | 23,703.035 | $409,933 | Mirroring | 10/03/11 | A00108
Huawei Symantec S5600T | 102,471.66 | $2.73 | 35,945.185 | $279,914.53 | Mirroring | 08/25/11 | A00106
Pillar Axiom 600 | 70,102.27 | $7.32 | 32,000.000 | $513,112 | Mirroring | 04/19/11 | A00104
NetApp FAS3270A | 68,034.63 | $7.48 | 21,659.386 | $509,200.79 | RAID-DP | 11/09/10 | AE00004
Sun Storage 6780 | 62,261.80 | $6.89 | 13,742.218 | $429,294 | Mirroring | 06/01/10 | A00094
NetApp FAS3170 | 60,515.34 | $10.01 | 19,628.500 | $605,492 | RAID-DP | 06/10/08 | A00066
IBM V7000 | 56,510.85 | $7.24 | 14,422.309 | $409,410.86 | Mirroring | 10/22/10 | A00097
IBM V7000 | 53,014.29 | $7.52 | 24,433.592 | $389,425.11 | Mirroring | 03/14/11 | A00103

SPC-1 IOPS = the Performance Metric
$/SPC-1 IOPS = the Price/Performance Metric
ASU Capacity = the Capacity Metric
Data Protection = Data Protection Metric
TSC Price = Total Cost of Ownership Metric
Results Identifier = A unique identification of the result Metric
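
The price/performance metric can be recomputed from the other two columns. The short sketch below assumes that $/SPC-1 IOPS is simply the TSC price divided by SPC-1 IOPS, which matches the published figures for the two systems compared in this post; the dictionary layout and function name are illustrative.

```python
# (SPC-1 IOPS, TSC price in USD) taken from the performance chart above.
RESULTS = {
    "Sun ZFS Storage 7420": (137_066.20, 409_933.00),
    "NetApp FAS3270A": (68_034.63, 509_200.79),
}


def dollars_per_iops(iops, tsc_price):
    """Price/performance: total system cost divided by SPC-1 IOPS."""
    return tsc_price / iops


sun_iops, sun_price = RESULTS["Sun ZFS Storage 7420"]
ntap_iops, ntap_price = RESULTS["NetApp FAS3270A"]

sun_ppi = dollars_per_iops(sun_iops, sun_price)
ntap_ppi = dollars_per_iops(ntap_iops, ntap_price)

iops_ratio = sun_iops / ntap_iops   # higher is better
price_perf_ratio = ntap_ppi / sun_ppi  # lower $/IOPS is better, so invert

print(f"Sun: ${sun_ppi:.2f}/IOPS, NetApp: ${ntap_ppi:.2f}/IOPS")
print(f"IOPS advantage: {iops_ratio:.1f}x, price/performance advantage: {price_perf_ratio:.1f}x")
```

These ratios line up with the 2x performance and 2.5x price/performance claims made at the top of the post.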

Complete SPC-1 benchmark results may be found at http://www.storageperformance.org.

Configuration Summary

Storage Configuration:

Sun ZFS Storage 7420 appliance in clustered configuration
2 x Sun ZFS Storage 7420 controllers, each with
4 x 2.0 GHz Intel Xeon X7550 processors
512 GB memory, 64 x 8 GB 1066 MHz DDR3 DIMMs
4 x 512 GB SSD flash-enabled read-cache
12 x Sun Disk shelves
10 x shelves with 24 x 300 GB 15K RPM SAS-2 drives
2 x shelves with 20 x 300 GB 15K RPM SAS-2 drives and 4 x 73 GB SAS-2 flash-enabled write-cache

Server Configuration:

1 x SPARC T3-2 server
2 x 1.65 GHz SPARC T3 processors
128 GB memory
6 x 8 Gb FC connections to the Sun ZFS Storage 7420 appliance
Oracle Solaris 10 9/10

Benchmark Description

SPC Benchmark-1 (SPC-1) is the first industry standard storage benchmark and is the most comprehensive performance analysis environment ever constructed for storage subsystems. The I/O workload in SPC-1 is characterized by predominantly random I/O operations as typified by multi-user OLTP, database, and email server environments. SPC-1 uses a highly efficient multi-threaded workload generator to thoroughly analyze direct attach or network storage subsystems. The SPC-1 benchmark enables companies to rapidly produce valid performance and price/performance results using a variety of host platforms and storage network topologies.

SPC-1 is built to:

  • Provide a level playing field for test sponsors.
  • Produce results that are powerful and yet simple to use.
  • Provide value for engineers as well as IT consumers and solution integrators.
  • Be easy to run, easy to audit/verify, and easy to use to report official results.

See Also

Disclosure Statement

SPC-1, SPC-1 IOPS, $/SPC-1 IOPS are registered trademarks of Storage Performance Council (SPC). Results as of October 2, 2011, for more information see www.storageperformance.org. Sun ZFS Storage 7420 Appliance http://www.storageperformance.org/results/benchmark_results_spc1#a00108; NetApp FAS3270A http://www.storageperformance.org/results/benchmark_results_spc1#ae00004.

SPARC T4-4 Produces World Record Oracle OLAP Capacity

Oracle's SPARC T4-4 server delivered world record capacity on the Oracle OLAP Perf workload.

  • The SPARC T4-4 server was able to operate on a cube with a 3 billion row fact table of sales data containing 4 dimensions which represents as many as 70 quintillion aggregate rows (70 followed by 18 zeros).

  • The SPARC T4-4 server supported 3,500 cube-queries/minute against the Oracle OLAP cube with an average response time of 1.5 seconds and the median response time of 0.15 seconds.

Performance Landscape

Oracle OLAP Perf Benchmark
System | Fact Table Num of Rows | Cube-Queries/minute | Median Response (seconds) | Average Response (seconds)
SPARC T4-4 | 3 Billion | 3,500 | 0.15 | 1.5

Configuration Summary and Results

Hardware Configuration:

SPARC T4-4 server with
4 x SPARC T4 processors, 3.0 GHz
1 TB main memory
2 x Sun Storage F5100 Flash Array

Software Configuration:

Oracle Solaris 10 8/11
Oracle Database 11g Enterprise Edition with Oracle OLAP option

Benchmark Description

OLAP Perf is a workload designed to demonstrate and stress the Oracle OLAP product's core functionalities of fast query, fast update, and rich calculations on a dimensional model to support Enhanced Data Warehousing. The workload uses a set of realistic business intelligence (BI) queries that run against an OLAP cube.

Key Points and Best Practices

  • The SPARC T4-4 server is estimated to support 2,400 interactive users with this fast response time assuming only 5 seconds between query requests.

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 10/3/2011.

Friday Sep 30, 2011

SPARC T4-2 Server Beats Intel (Westmere AES-NI) on ZFS Encryption Tests

Oracle continues to lead in enterprise security. Oracle's SPARC T4 processors combined with Oracle's Solaris ZFS file system demonstrate faster file system encryption than equivalent systems based on the Intel Xeon Processor 5600 Sequence chips which use AES-NI security instructions.

Encryption is the process where data is encoded for privacy and a key is needed by the data owner to access the encoded data. The benefits of using ZFS encryption are:

  • The SPARC T4 processor is 3.5x to 5.2x faster than the Intel Xeon Processor X5670 that has the AES-NI security instructions in creating encrypted files.

  • ZFS encryption is integrated with the ZFS command set. Like other ZFS operations, encryption operations such as key changes and re-key are performed online.

  • Data is encrypted using AES (Advanced Encryption Standard) with key lengths of 256, 192, and 128 bits in the CCM and GCM operation modes.

  • The flexibility of encrypting specific file systems is a key feature.

  • ZFS encryption is inheritable to descendent file systems. Key management can be delegated through ZFS delegated administration.

  • ZFS encryption uses the Oracle Solaris Cryptographic Framework which gives it access to SPARC T4 processor and Intel Xeon X5670 processor (Intel AES-NI) hardware acceleration or to optimized software implementations of the encryption algorithms automatically.

Performance Landscape

Below are results running two different ciphers for ZFS encryption. Results are presented for runs without any cipher, labeled clear, and a variety of different key lengths.

Encryption Using AES-CCM Ciphers

MB/sec – 5 File Create*
System | Clear | AES-256-CCM | AES-192-CCM | AES-128-CCM
SPARC T4-2 server | 3,803 | 3,167 | 3,335 | 3,225
SPARC T3-2 server | 2,286 | 1,554 | 1,561 | 1,594
2-Socket 2.93 GHz Xeon X5670 | 3,325 | 750 | 764 | 773

Speedup T4-2 vs X5670 | 1.1x | 4.2x | 4.4x | 4.2x
Speedup T4-2 vs T3-2 | 1.7x | 2.0x | 2.1x | 2.0x

Encryption Using AES-GCM Ciphers

MB/sec – 5 File Create*
System | Clear | AES-256-GCM | AES-192-GCM | AES-128-GCM
SPARC T4-2 server | 3,618 | 3,929 | 3,164 | 2,613
SPARC T3-2 server | 2,278 | 1,451 | 1,455 | 1,449
2-Socket 2.93 GHz Xeon X5670 | 3,299 | 749 | 748 | 753

Speedup T4-2 vs X5670 | 1.1x | 5.2x | 4.2x | 3.5x
Speedup T4-2 vs T3-2 | 1.6x | 2.7x | 2.2x | 1.8x

(*) Maximum Delivered values measured over 5 concurrent mkfile operations.
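
The speedup rows in both tables are plain throughput ratios rounded to one decimal place. A minimal sketch that recomputes the T4-2 vs X5670 rows from the MB/sec figures above (the data layout and helper name are illustrative):

```python
# MB/sec in column order: Clear, AES-256, AES-192, AES-128 (from the tables above).
CCM = {
    "T4-2": [3803, 3167, 3335, 3225],
    "X5670": [3325, 750, 764, 773],
}
GCM = {
    "T4-2": [3618, 3929, 3164, 2613],
    "X5670": [3299, 749, 748, 753],
}


def speedups(table, fast="T4-2", slow="X5670"):
    """Ratio of throughputs per column, rounded as in the published rows."""
    return [round(f / s, 1) for f, s in zip(table[fast], table[slow])]


ccm_speedup = speedups(CCM)
gcm_speedup = speedups(GCM)
print("CCM:", ccm_speedup)
print("GCM:", gcm_speedup)
```

The encrypted columns span 3.5x to 5.2x, which is the range quoted in the summary bullets.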

Configuration Summary

Storage Configuration:

Sun Storage 6780 array
16 x 15K RPM drives
Raid 0 pool
Write back cache enabled
Controller cache mirroring disabled for maximum bandwidth for test
Eight 8 Gb/sec ports per host

Server Configuration:

SPARC T4-2 server
2 x SPARC T4 2.85 GHz processors
256 GB memory
Oracle Solaris 11

SPARC T3-2 server
2 x SPARC T3 1.6 GHz processors
Oracle Solaris 11 Express 2010.11

Sun Fire X4270 M2 server
2 x Intel Xeon X5670, 2.93 GHz processors
Oracle Solaris 11

Benchmark Description

The benchmark ran the UNIX command mkfile(1M). mkfile is a simple single-threaded program that creates a file of a specified size. The script ran 5 mkfile operations in the background and recorded the peak bandwidth observed during the test.
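
The actual test script is not shown in this post. The sketch below is only a stand-in that illustrates the idea of several concurrent file creations. It uses Python threads that write zero-filled files instead of invoking mkfile, and it reports the average aggregate rate rather than the peak bandwidth the benchmark recorded. All names and the default file size are assumptions.

```python
import os
import tempfile
import threading
import time


def create_file(path, size_bytes, chunk=1 << 20):
    """Write size_bytes of zeros to path, similar in spirit to mkfile."""
    block = bytes(chunk)
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(block[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # make sure the data reaches the file system


def concurrent_create(directory, n_files=5, size_bytes=8 << 20):
    """Create n_files at once; return (paths, average aggregate MB/sec)."""
    paths = [os.path.join(directory, f"file{i}.dat") for i in range(n_files)]
    threads = [threading.Thread(target=create_file, args=(p, size_bytes))
               for p in paths]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return paths, (n_files * size_bytes / 1e6) / elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        _, rate = concurrent_create(d)
        print(f"5 concurrent creates: {rate:,.0f} MB/sec (average)")
```

To approximate the published methodology, the target directory would be on the encrypted ZFS file system under test and the file sizes would need to be large enough to exceed any caching.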

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of December 16, 2011.

SPARC T4 Processor Beats Intel (Westmere AES-NI) on AES Encryption Tests

The cryptography benchmark suite was internally developed by Oracle to measure the maximum throughput of in-memory, on-chip encryption operations that a system can perform. Multiple threads are used to achieve the maximum throughput.

  • Oracle's SPARC T4 processor running Oracle Solaris 11 is 1.5x faster on AES 256-bit key CFB mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 1.7x faster on AES 256-bit key CBC mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 3.6x faster on AES 256-bit key CCM mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption with authentication of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 1.4x faster on AES 256-bit key GCM mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption with authentication of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 9% faster on single-threaded AES 256-bit key CFB mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 1.8x faster on AES 256-bit key CFB mode encryption than the SPARC T3 running Solaris 11 Express.

  • AES CFB mode is used by the Oracle Database 11g for Transparent Data Encryption (TDE) which provides security to database storage.

Performance Landscape

Encryption Performance – AES-CFB

Performance is presented for in-memory AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).

AES-256-CFB
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 10,963 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 7,526 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,023 Oracle Solaris 11 Express, libpkcs11
Intel X5690 3.47 12 2,894 Oracle Solaris 11, libsoftcrypto
SPARC T4 2.85 1 712 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 653 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 425 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 331 Oracle Solaris 11 Express, libpkcs11

AES-192-CFB
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 12,451 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 8,677 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,175 Oracle Solaris 11 Express, libpkcs11
Intel X5690 3.47 12 2,976 Oracle Solaris 11, libsoftcrypto
SPARC T4 2.85 1 816 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 752 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 461 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 371 Oracle Solaris 11 Express, libpkcs11

AES-128-CFB
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 14,388 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 10,214 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,390 Oracle Solaris 11 Express, libpkcs11
Intel X5690 3.47 12 3,115 Oracle Solaris 11, libsoftcrypto
SPARC T4 2.85 1 953 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 886 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 509 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 395 Oracle Solaris 11 Express, libpkcs11

Encryption Performance – AES-CBC

Performance is presented for in-memory AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).

AES-256-CBC
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 11,588 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 7,171 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 6,704 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 5,980 Oracle Solaris 11 Express, libpkcs11
SPARC T4 2.85 1 748 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 592 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 569 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 336 Oracle Solaris 11 Express, libpkcs11

AES-192-CBC
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 13,216 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 8,211 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 7,588 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,333 Oracle Solaris 11 Express, libpkcs11
SPARC T4 2.85 1 862 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 672 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 643 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 358 Oracle Solaris 11 Express, libpkcs11

AES-128-CBC
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 15,323 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 9,785 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 8,746 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,347 Oracle Solaris 11 Express, libpkcs11
SPARC T4 2.85 1 1,017 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 781 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 739 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 434 Oracle Solaris 11 Express, libpkcs11

Encryption Performance – AES-CCM

Performance is presented for in-memory AES-CCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run).

AES-256-CCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 5,850 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,860 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,613 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 480 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 258 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 190 Oracle Linux 6.1, IPP/AES-NI

AES-192-CCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 6,709 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,930 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,715 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 565 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 293 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 206 Oracle Linux 6.1, IPP/AES-NI

AES-128-CCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 7,856 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 2,031 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,838 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 664 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 321 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 225 Oracle Linux 6.1, IPP/AES-NI

Encryption Performance – AES-GCM

Performance is presented for in-memory AES-GCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run).

AES-256-GCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 6,871 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 4,794 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 12 1,685 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 691 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 571 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 253 Oracle Solaris 11, libsoftcrypto

AES-192-GCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 7,450 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 5,054 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 12 1,724 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 727 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 618 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 268 Oracle Solaris 11, libsoftcrypto

AES-128-GCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 7,987 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 5,315 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 12 1,781 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 765 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 655 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 281 Oracle Solaris 11, libsoftcrypto

Configuration Summary

SPARC T4-1 server
1 x SPARC T4 processor, 2.85 GHz
128 GB memory
Oracle Solaris 11

SPARC T3-1 server
1 x SPARC T3 processor, 1.65 GHz
128 GB memory
Oracle Solaris 11 Express

Sun Fire X4270 M2 server
2 x Intel Xeon X5690, 3.47 GHz
Hyper-Threading enabled
Turbo Boost enabled
24 GB memory
Oracle Linux 6.1

Sun Fire X4270 M2 server
2 x Intel Xeon X5690, 3.47 GHz
Hyper-Threading enabled
Turbo Boost enabled
24 GB memory
Oracle Solaris 11 Express

Benchmark Description

The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-memory and on-chip using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CCM, AES-192-CCM, AES-256-CCM, AES-128-GCM, AES-192-GCM and AES-256-GCM.

The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance.

See Also

Disclosure Statement

Copyright 2012, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 1/13/2012.

Thursday Sep 29, 2011

SPARC T4 Processor Outperforms IBM POWER7 and Intel (Westmere AES-NI) on OpenSSL AES Encryption Test

Oracle's SPARC T4 processor is faster than the Intel Xeon X5690 (with AES-NI) and the IBM POWER7.

  • On single-thread OpenSSL encryption, the 2.85 GHz SPARC T4 processor is 4.3 times faster than the 3.5 GHz IBM POWER7 processor.

  • On single-thread OpenSSL encryption, the 2.85 GHz SPARC T4 processor is 17% faster than the 3.46 GHz Intel Xeon X5690 processor.

The SPARC T4 processor has Encryption Instruction Accelerators for encryption and decryption for AES and many other ciphers. The Intel Xeon X5690 processor has AES-NI instructions which accelerate only AES ciphers. The IBM POWER7 does not have cryptographic instructions, but cryptographic coprocessors are available.

Performance Landscape

The table below shows results when running the OpenSSL speed command with the AES-256-CBC cipher. The reported results are for a message size of 8192 bytes. Results are reported for a single thread and for running on all available hardware threads (no oversubscription).

OpenSSL Performance with AES-256-CBC Encryption (MB/sec)
Processor | 1 Thread | Maximum Throughput (at number of threads)
SPARC T4, 2.85 GHz | 769 | 11,967 (64)
Intel Xeon X5690, 3.46 GHz | 660 | 7,362 (12)
IBM POWER7, 3.5 GHz | 179 | 2,860 (est*)

(est*) The performance of the IBM POWER7 is estimated at 16 times the rate of the single thread performance. The estimate is considered an upper bound on expected performance for this processor.

Configuration Summary

SPARC Configuration:

SPARC T4-1 server
1 x SPARC T4 processor, 2.85 GHz
64 GB memory
Oracle Solaris 11

Intel Configuration:

Sun Fire X4270 M2 server
1 x Intel Xeon X5690 processor, 3.46 GHz
24 GB memory
Oracle Solaris 11

Software Configuration:

OpenSSL 1.0.0.d
gcc 3.4.3

Benchmark Description

The in-memory SSL performance was measured with the openssl command. openssl has an option for measuring the speed of various ciphers and message sizes. The actual command used to measure the speed of AES-256-CBC was:

openssl speed -multi {number of threads} -evp aes-256-cbc

openssl runs for several minutes and measures the speed, in units of MB/sec, of the specified cipher for messages of sizes 16 bytes to 8192 bytes.
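
The result rows printed by openssl speed give one figure per message size, expressed in thousands of bytes per second with a trailing "k". A small sketch for converting such a row to MB/sec follows. The sample row is invented for illustration (only its last value is chosen to match the single-thread SPARC T4 figure above), and the function name is an assumption.

```python
def parse_speed_line(line):
    """Split one `openssl speed` result row into (cipher, [MB/sec, ...])."""
    fields = line.split()
    cipher, values = fields[0], fields[1:]
    # Each value looks like '769000.00k': thousands of bytes per second.
    return cipher, [float(v.rstrip("k")) / 1000.0 for v in values]


# Hypothetical row; columns are the 16, 64, 256, 1024 and 8192 byte sizes.
SAMPLE = "aes-256-cbc   120000.00k  450000.00k  700000.00k  760000.00k  769000.00k"

cipher, mb_per_sec = parse_speed_line(SAMPLE)
print(cipher, "at 8192 bytes:", mb_per_sec[-1], "MB/sec")
```

The last column corresponds to the 8192-byte message size used for the results in the table above.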

Key Points and Best Practices

  • The Encryption Instruction Accelerators are accessed through a platform independent API for cryptographic engines.
  • The OpenSSL libraries use the API. The default is to not use the Encryption Instruction Accelerators.
  • Cryptography is compute intensive. Using all available hardware threads, both the SPARC T4 processor and the Intel Xeon processor were able to saturate the memory bandwidth of the respective systems.

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/26/2011.

SPARC T4-2 Server Beats Intel (Westmere AES-NI) on SSL Network Tests

Oracle's SPARC T4 processor is faster and more efficient than the Intel Xeon X5690 processor (with AES-NI) when running network SSL throughput tests.

  • The SPARC T4 processor at 2.85 GHz is 20% faster than the 3.46 GHz Intel Xeon X5690 processor on single stream network SSL encryption.

  • The SPARC T4 processor requires fewer streams to attain near-linespeed of a 10 GbE secure network and does this with 5 times less CPU resources compared to the Intel Xeon X5690 processor.

  • Oracle's SPARC T4-2 server using 8 threads achieves line speed over a 10 GbE network with only 9% CPU utilization.

  • Oracle's Sun Fire X4270 M2 with two Intel Xeon X5690 processors needs 12 threads to achieve line speed, and does so at 45% CPU utilization.

The SPARC T4 processor has hardware support via Encryption Instruction Accelerators for encryption and decryption for AES and many other ciphers. The Intel Xeon X5690 processor has AES-NI instructions which accelerate only AES ciphers.

Performance Landscape

The following table shows single stream results running encrypted (SSL Read) and unencrypted (Clear Text) messages of 1 MB in size. These tests were run with the uperf benchmark and used the AES-256-CBC cipher. They were run across a 10 GbE connection. Write messages saw similar performance.

Single Stream Network Communication with Uperf, Performance (Mb/sec)
Processor | Clear Text | SSL Read
SPARC T4, 2.85 GHz | 4,194 | 1,678
Intel Xeon X5690, 3.46 GHz | 5,591 | 1,398

The next table shows how many streams it takes to achieve 90% of the 10 GbE network bandwidth (9000 Mb/sec) for encrypted read messages of 1 MB in size. These tests were run with the uperf benchmark and used the AES-256-CBC cipher. Write messages saw similar performance.

Uperf SSL Read with AES-256-CBC
Processor | Number of Streams for 90% Network Utilization | CPU Utilization
SPARC T4, 2.85 GHz | 8 | 9%
Intel Xeon X5690, 3.46 GHz | 12 | 45%
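
For context, the single stream SSL Read rates give a lower bound on the number of streams needed to reach 9000 Mb/sec. The sketch below assumes perfectly linear scaling per stream, which is an idealization that the measured stream counts do not achieve; the names used are illustrative.

```python
import math

TARGET_MBPS = 9000  # 90% of 10 GbE, as stated in the text

# Single stream SSL Read (Mb/sec) and measured stream counts from the tables above.
SINGLE_STREAM = {"SPARC T4": 1678, "Intel Xeon X5690": 1398}
MEASURED_STREAMS = {"SPARC T4": 8, "Intel Xeon X5690": 12}


def ideal_streams(per_stream_mbps, target=TARGET_MBPS):
    """Fewest streams that could reach the target if scaling were linear."""
    return math.ceil(target / per_stream_mbps)


ideal = {cpu: ideal_streams(rate) for cpu, rate in SINGLE_STREAM.items()}
for cpu in SINGLE_STREAM:
    print(f"{cpu}: ideal {ideal[cpu]} streams, measured {MEASURED_STREAMS[cpu]}")
```

Both processors need more streams than the ideal bound, and the gap is larger for the Intel Xeon X5690.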

Configuration Summary

SPARC T4 Configuration:

2 x SPARC T4-2 servers each with
2 x SPARC T4 processors, 2.85 GHz
128 GB memory
1 x 10-Gigabit Ethernet XAUI Adapter
Oracle Solaris 11
Back-to-back 10 GbE connection

Intel Configuration:

2 x Sun Fire X4270 M2 servers each with
2 x Intel Xeon X5690 processors, 3.46 GHz
48 GB memory
1 x Sun Dual Port 10GbE PCIe 2.0 Networking Card with Intel 82599 10GbE Controller
Oracle Solaris 11
Back-to-back 10 GbE connection

Software Configuration:

OpenSSL 1.0.0.d
uperf 1.0.3
gcc 3.4.3

Benchmark Description

Uperf is an open source benchmark program for simulating and measuring network performance. Uperf is able to measure the performance of various protocols, including TCP, UDP, SCTP and SSL. The uperf benchmark uses an input-defined workload to test network performance. This input workload can be used to model complex situations or to isolate simple tasks. The workload used for these tests was simple network reads and simple network writes.

Key Points and Best Practices

  • The Encryption Instruction Accelerators are accessed through a platform independent API for cryptographic engines.
  • The OpenSSL libraries use the API. The default is to not use the Encryption Instruction Accelerators.
  • Cryptography is compute intensive. Using 8 streams, the SPARC T4 processor was able to match the bandwidth of the 10 GbE network.

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/26/2011.

Wednesday Sep 28, 2011

SPARC T4 Servers Set World Record on Oracle E-Business Suite R12 X-Large Order to Cash

With Oracle's SPARC T4-2 server running the application and SPARC T4-4 server running the database, Oracle set a world record result for the Oracle E-Business Suite Standard X-Large Order to Cash (OLTP) benchmark.

  • The combination of a SPARC T4-2 server running the Oracle E-Business Suite R12.1.2 application and a SPARC T4-4 server running the Oracle Database 11g Release 2 database enabled 2400 Order to Cash users of the X-Large Benchmark to simultaneously execute a large volume of medium to heavy transactions with an average response time of 2.4 seconds.

  • The SPARC T4-2 server in the application tier and the SPARC T4-4 server in the database tier are only about half utilized providing significant headroom for additional Oracle E-Business Suite R12.1.2 processing modules and future growth.

Performance Landscape

This is the first published result for the X-large benchmark using Oracle E-Business Order Management module.

OLTP Workload: Order to Cash
X-Large Configuration
System | Users | Average Response Time | 90th Percentile Response Time
SPARC T4-2 | 2400 | 2.413 sec. | 3.114 sec.

Configuration Summary

Application Tier Configuration:

1 x SPARC T4-2 server
2 x SPARC T4 processors, 2.85 GHz
256 GB memory
Oracle Solaris 10 8/11
Oracle E-Business Suite 12.1.2

Database Tier Configuration:

1 x SPARC T4-4 server
4 x SPARC T4 processors, 3.0 GHz
256 GB memory
Oracle Solaris 10 8/11
Oracle Database 11g Release 2

Storage Configuration:

1 x Sun Storage F5100 Flash Array

Benchmark Description

The Oracle R12 E-Business Suite Standard Benchmark combines online transaction execution by simulated users with concurrent batch processing to model a typical scenario for a global enterprise. This benchmark ran one OLTP component, Order to Cash, in the Extra-Large size. The goal is to obtain reference response times.

Results can be published in four sizes and can use different combinations of business flows:

  • X-large: Maximum online users running all business flows between 10,000 and 20,000; 750,000 order to cash lines per hour and 250,000 payroll checks per hour.
    • Order to Cash Online -- 2400 users
      • The percentage across the 5 transactions in Order Management module is:
        • Insert Manual Invoice -- 16.66%
        • Insert Order -- 33.33%
        • Order Pick Release -- 16.66%
        • Ship Confirm -- 16.66%
        • Order Summary Report -- 16.66%
    • HR Self-Service -- 4000 users
    • Customer Support Flow -- 8000 users
    • Procure to Pay -- 2000 users
  • Large: 10,000 online users; 100,000 order to cash lines per hour and 100,000 payroll checks per hour.
  • Medium: up to 3000 online users; 50,000 order to cash lines per hour and 10,000 payroll checks per hour.
  • Small: up to 1000 online users; 10,000 order to cash lines per hour and 5,000 payroll checks per hour.

See Also

Disclosure Statement

Oracle E-Business X-Large Order to Cash benchmark, SPARC T4-2, SPARC T4, 2.85 GHz, 2 chips, 16 cores, 128 threads, 256 GB memory, SPARC T4-4, SPARC T4, 3.0 GHz, 4 chips, 32 cores, 256 threads, 256 GB memory, average response time 2.413 sec, 90th percentile response time 3.114 sec, Oracle Solaris 10 8/11, Oracle E-Business Suite 12.1.2, Oracle Database 11g Release 2, Results as of 9/26/2011.

SPARC T4-2 Server Beats Intel (Westmere AES-NI) on Oracle Database Tablespace Encryption Queries

Oracle's SPARC T4 processor with Encryption Instruction Accelerators greatly improves performance over software implementations. This will greatly expand the use of TDE for many customers.

  • Oracle's SPARC T4-2 server is over 42% faster than Oracle's Sun Fire X4270 M2 (Intel AES-NI) when running DSS-style queries referencing an encrypted tablespace.

Oracle's Transparent Data Encryption (TDE) feature of the Oracle Database simplifies the encryption of data within datafiles preventing unauthorized access to it from the operating system. Tablespace encryption allows encryption of the entire contents of a tablespace.

TDE tablespace encryption has been certified with Siebel, PeopleSoft, and Oracle E-Business Suite applications.

Performance Landscape

Total Query Time (time in seconds)
System | GHz | AES-128 | AES-192 | AES-256
SPARC T4-2 server | 2.85 | 588 | 588 | 588
Sun Fire X4270 M2 (Intel X5690) | 3.46 | 836 | 841 | 842
SPARC T4-2 Advantage | | 42% | 43% | 43%
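
Because these are elapsed times, lower is better, and the advantage row is the slower time divided by the faster time, minus one. A brief sketch of that arithmetic using the figures above (the helper name is illustrative):

```python
# Total query time in seconds, from the table above.
T4_2 = {"AES-128": 588, "AES-192": 588, "AES-256": 588}
X4270_M2 = {"AES-128": 836, "AES-192": 841, "AES-256": 842}


def percent_faster(slow_seconds, fast_seconds):
    """How much faster the quicker system is, as a percentage."""
    return (slow_seconds / fast_seconds - 1) * 100


advantage = {key: round(percent_faster(X4270_M2[key], T4_2[key]))
             for key in T4_2}
print(advantage)
```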

Configuration Summary

SPARC Configuration:

SPARC T4-2 server
2 x SPARC T4 processors, 2.85 GHz
256 GB memory
2 x Sun Storage F5100 Flash Array
Oracle Solaris 11
Oracle Database 11g Release 2

Intel Configuration:

Sun Fire X4270 M2 server
2 x Intel Xeon X5690 processors, 3.46 GHz
48 GB memory
2 x Sun Storage F5100 Flash Array
Oracle Linux 5.7
Oracle Database 11g Release 2

Benchmark Description

To test the performance of TDE, a 1 TB database was created. To demonstrate secure transactions, four 25 GB tables emulating customer private data were created: clear text, encrypted AES-128, encrypted AES-192, and encrypted AES-256. Eight queries of varying complexity that join on the customer table were executed.

The time spent scanning the customer table during each query was measured and query plans analyzed to ensure a fair comparison, e.g. no broken queries. The total query time for all queries is reported.

Key Points and Best Practices

  • Oracle Database 11g Release 2 is required for SPARC T4 processor Encryption Instruction Accelerators support with TDE tablespaces.

  • TDE tablespaces support the SPARC T4 processor Encryption Instruction Accelerators for Advanced Encryption Standard (AES) only.

  • AES-CFB is the mode used in the Oracle database with TDE.

  • Prior to using TDE tablespaces you must create a wallet and set up an encryption key. Here is one method to do that:

  • Create a wallet entry in $ORACLE_HOME/network/admin/sqlnet.ora.
    ENCRYPTION_WALLET_LOCATION=
    (SOURCE=(METHOD=FILE)(METHOD_DATA=
    (DIRECTORY=/oracle/app/oracle/product/11.2.0/dbhome_1/encryption_wallet)))
    
    Set an encryption key. This also opens the wallet.
    $ sqlplus / as sysdba
    SQL> ALTER SYSTEM SET ENCRYPTION KEY IDENTIFIED BY "tDeDem0";
    
    On subsequent instance startup open the wallet.
    $ sqlplus / as sysdba
    SQL> STARTUP;
    SQL> ALTER SYSTEM SET ENCRYPTION WALLET OPEN IDENTIFIED BY "tDeDem0";
    
  • TDE tablespace encryption and decryption occur on physical writes and reads of database blocks, respectively.

  • For parallel query using direct path reads, decryption overhead varies inversely with the complexity of the query.

    For a simple full table scan query, overhead can be reduced and performance improved by reducing the degree of parallelism (DOP) of the query.

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/26/2011.

SPARC T4 Servers Set World Record on PeopleSoft HRMS 9.1

Oracle's SPARC T4-4 servers running Oracle's PeopleSoft HRMS Self-Service 9.1 benchmark and Oracle Database 11g Release 2 achieved World Record performance on Oracle Solaris 10.

  • Using two SPARC T4-4 servers to run the application and database tiers and one SPARC T4-2 server to run the webserver tier, Oracle demonstrated world record performance of 15,000 concurrent users running the PeopleSoft HRMS Self-Service 9.1 benchmark.

  • The combination of the SPARC T4 servers running the PeopleSoft HRMS 9.1 benchmark supports 3.8x more online users with faster response time compared to the best published result from IBM on the previous PeopleSoft HRMS 8.9 benchmark.

  • The average CPU utilization on the SPARC T4-4 server in the application tier handling 15,000 users was less than 50%, leaving significant room for application growth.

  • The SPARC T4-4 server on the application tier used Oracle Solaris Containers which provide a flexible, scalable and manageable virtualization environment.

Performance Landscape

PeopleSoft HRMS Self-Service 9.1 Benchmark
Systems                   Processors                 Users    Ave Response -   Ave Response -
                                                              Search (sec)     Save (sec)
SPARC T4-2 (web)          2 x SPARC T4, 2.85 GHz     15,000   1.01             0.63
SPARC T4-4 (app)          4 x SPARC T4, 3.0 GHz
SPARC T4-4 (db)           4 x SPARC T4, 3.0 GHz

PeopleSoft HRMS Self-Service 8.9 Benchmark
IBM Power 570 (web/app)   12 x POWER5, 1.9 GHz       4,000    1.74             1.25
IBM Power 570 (db)        4 x POWER5, 1.9 GHz
IBM p690 (web)            4 x POWER4, 1.9 GHz        4,000    1.35             1.01
IBM p690 (app)            12 x POWER4, 1.9 GHz
IBM p690 (db)             6 x 4392 MIPS/Gen1
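
The user-count and response-time comparisons against the IBM results can be checked with simple ratios. This is only a sketch: the dictionary layout and function name are illustrative, and the comparison crosses benchmark versions (9.1 versus 8.9), so the ratios are indicative rather than like-for-like.

```python
# Figures from the Performance Landscape table.
t4_result = {"users": 15000, "search_sec": 1.01, "save_sec": 0.63}
ibm_results = {
    "IBM Power 570": {"users": 4000, "search_sec": 1.74, "save_sec": 1.25},
    "IBM p690": {"users": 4000, "search_sec": 1.35, "save_sec": 1.01},
}


def compare(ours, theirs):
    """Return the user ratio and the relative response-time reductions."""
    return {
        "user_ratio": ours["users"] / theirs["users"],
        "search_reduction_pct": (1 - ours["search_sec"] / theirs["search_sec"]) * 100,
        "save_reduction_pct": (1 - ours["save_sec"] / theirs["save_sec"]) * 100,
    }


comparisons = {name: compare(t4_result, ibm) for name, ibm in ibm_results.items()}

for name, c in comparisons.items():
    print(f"{name}: {c['user_ratio']:.2f}x users, "
          f"search {c['search_reduction_pct']:.0f}% lower, "
          f"save {c['save_reduction_pct']:.0f}% lower")
```

The user ratio of 3.75 against either IBM result is consistent with the rounded 3.8x figure quoted in the summary.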

The main differences between version 9.1 and version 8.9 of the benchmark are:

  • the database expanded from 100K employees and 20K managers to 500K employees and 100K managers,
  • the manager data was expanded,
  • a new transaction, "Employee Add Profile," was added; less than 2% of users execute it, and the transaction has a heavier footprint,
  • version 9.1 has a different benchmark metric (Average Response search/save time for x number of users) versus single user search/save time,
  • newer versions of the PeopleSoft application and PeopleTools software are used.

Configuration Summary

Application Server:

1 x SPARC T4-4 server
4 x SPARC T4 processors 3.0 GHz
512 GB main memory
5 x 300 GB SAS internal disks
2 x 100 GB internal SSDs
1 x 300 GB internal SSD
Oracle Solaris 10 8/11
PeopleSoft PeopleTools 8.51.02
PeopleSoft HCM 9.1
Oracle Tuxedo, Version 10.3.0.0, 64-bit, Patch Level 031
Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.6.0_20

Web Server:

1 x SPARC T4-2 server
2 x SPARC T4 processors 2.85 GHz
256 GB main memory
1 x 300 GB SAS internal disk
1 x 300 GB internal SSD
Oracle Solaris 10 8/11
PeopleSoft PeopleTools 8.51.02
Oracle WebLogic Server 11g (10.3.3)
Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.6.0_20

Database Server:

1 x SPARC T4-4 server
4 x SPARC T4 processors 3.0 GHz
256 GB main memory
3 x 300 GB SAS internal disks
1 x Sun Storage F5100 Flash Array (80 flash modules)
Oracle Solaris 10 8/11
Oracle Database 11g Release 2

Benchmark Description

The purpose of the PeopleSoft HRMS Self-Service 9.1 benchmark is to measure comparative online performance of the selected processes in PeopleSoft Enterprise HCM 9.1 with Oracle Database 11g. The benchmark kit is an Oracle standard benchmark kit run by all platform vendors to measure the performance. It is an OLTP benchmark with no dependency on remote COBOL calls and no batch workload, and the database SQL statements are moderately complex. The results are certified by Oracle and a white paper is published.

PeopleSoft defines a business transaction as a series of HTML pages that guide a user through a particular scenario. Users are defined as corporate Employees, Managers and HR administrators. The benchmark consists of 14 scenarios which emulate users performing typical HCM transactions such as viewing paychecks, promoting and hiring employees, updating employee profiles and other typical HCM application transactions.

All these transactions are well-defined in the PeopleSoft HR Self-Service 9.1 benchmark kit. The benchmark metric is the Average Response Time for search and save for 15,000 users.

Key Points and Best Practices

  • The application tier was configured with two PeopleSoft application server instances on the SPARC T4-4 server hosted in two separate Oracle Solaris Containers to demonstrate consolidation of multiple applications, ease of administration, and load balancing.

  • Each PeopleSoft Application Server instance running in an Oracle Solaris Container was configured to run 5 application server Domains with 30 application server instances to be able to effectively handle the 15,000-user workload with zero application server queuing and minimal use of resources.

  • The web tier was configured with 20 WebLogic instances and with 4 GB JVM heap size to load balance transactions across 10 PeopleSoft Domains. That enables equitable distribution of transactions and scaling to a high number of users.

  • Internal SSDs were configured in the application tier to host PeopleSoft Application Servers object CACHE file systems and in the web tier for WebLogic servers' logging, providing near zero millisecond service time and faster server response time.
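
The tier sizing described in these key points can be cross-checked with a few lines of arithmetic. This is a rough sketch only: it assumes users are spread evenly across instances, which the benchmark report does not state, and the constant names are illustrative.

```python
# Values stated in the key points.
CONTAINERS = 2               # Oracle Solaris Containers on the application tier
DOMAINS_PER_CONTAINER = 5    # application server Domains per container
WEBLOGIC_INSTANCES = 20      # WebLogic instances on the web tier
USERS = 15000                # concurrent users in the benchmark

# Two containers with five Domains each give the ten PeopleSoft Domains
# that the web tier load balances across.
total_domains = CONTAINERS * DOMAINS_PER_CONTAINER
weblogic_per_domain = WEBLOGIC_INSTANCES / total_domains

# Assumption: an even spread of users (not stated in the source).
users_per_weblogic = USERS / WEBLOGIC_INSTANCES
users_per_domain = USERS / total_domains

print(f"{total_domains} Domains, {weblogic_per_domain:.0f} WebLogic instances per Domain")
print(f"{users_per_weblogic:.0f} users per WebLogic instance, "
      f"{users_per_domain:.0f} users per Domain")
```

Under the even-spread assumption, each WebLogic instance carries about 750 users and each PeopleSoft Domain about 1,500.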

Disclosure Statement

Oracle's PeopleSoft HRMS 9.1 benchmark, www.oracle.com/us/solutions/benchmark/apps-benchmark/peoplesoft-167486.html, results 9/26/2011.

Tuesday Sep 27, 2011

SPARC T4-2 Servers Set World Record on JD Edwards EnterpriseOne Day in the Life Benchmark with Batch, Outperforms IBM POWER7

Using Oracle's SPARC T4-2 server for the application tier and a SPARC T4-1 server for the database tier, a world record result was produced running Oracle's JD Edwards EnterpriseOne application Day in the Life (DIL) benchmark concurrently with a batch workload.

  • The SPARC T4-2 server running online and batch with JD Edwards EnterpriseOne 9.0.2 is 1.7x faster and has better response time than the IBM Power 750 system which only ran the online component of JD Edwards EnterpriseOne 9.0 Day in the Life test.

  • The combination of SPARC T4 servers delivered a Day in the Life benchmark result of 10,000 online users with 0.35 seconds of average transaction response time running concurrently with 112 Universal Batch Engine (UBE) processes at 67 UBEs/minute.

  • This is the first JD Edwards EnterpriseOne benchmark for 10,000 users and payroll batch on a SPARC T4-2 server for the application tier and the database tier with Oracle Database 11g Release 2. All servers ran with the Oracle Solaris 10 operating system.

  • The single-thread performance of the SPARC T4 processor produced sub-second response for the online components and provided dramatic performance for the batch jobs.

  • The SPARC T4 servers, JD Edwards EnterpriseOne 9.0.2, and Oracle WebLogic Server 11g Release 1 support 17% more users per JAS (Java Application Server) than the SPARC T3-1 server for this benchmark.

  • The SPARC T4-2 server provided a 6.7x better batch processing rate than the previous SPARC T3-1 server record result and had 2.5x faster response time.

  • The SPARC T4-2 server used Oracle Solaris Containers, which provide flexible, scalable and manageable virtualization.

  • JD Edwards EnterpriseOne uses Oracle Fusion Middleware WebLogic Server 11g R1 and Oracle Fusion Middleware Cluster Web Tier Utilities 11g HTTP server.

  • The combination of the SPARC T4-2 server and Oracle JD Edwards EnterpriseOne in the application tier with a SPARC T4-1 server in the database tier measured low CPU utilization providing headroom for growth.

Performance Landscape

JD Edwards EnterpriseOne Day in the Life Benchmark
Online with Batch Workload

System                        Online   Resp         Batch         Batch      Version
                              Users    Time (sec)   Concur        Rate
                                                    (# of UBEs)   (UBEs/m)
2xSPARC T4-2 (app+web)        10000    0.35         112           67         9.0.2
SPARC T4-1 (db)
SPARC T3-1 (app+web)          5000     0.88         19            10         9.0.1
SPARC Enterprise M3000 (db)

Resp Time (sec) — Response time of online jobs reported in seconds
Batch Concur (# of UBEs) — Batch concurrency presented in the number of UBEs
Batch Rate (UBEs/m) — Batch transaction rate in UBEs per minute

JD Edwards EnterpriseOne Day in the Life Benchmark
Online Workload Only

System                                                  Online   Response     Version
                                                        Users    Time (sec)
SPARC T3-1, 1 x SPARC T3 (1.65 GHz), Solaris 10 (app)   5000     0.52         9.0.1
M3000, 1 x SPARC64 VII (2.75 GHz), Solaris 10 (db)
IBM Power 750, POWER7 (3.55 GHz) (app+db)               4000     0.61         9.0

IBM result from http://www-03.ibm.com/systems/i/advantages/oracle/, IBM used WebSphere
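
The headline ratios can be recomputed from the two tables. The sketch below uses illustrative variable names. The IBM run covered the online workload only and used a different software version, so those ratios are indicative. The source does not say which ratio its 1.7x claim refers to; the response-time ratio happens to come out near that value.

```python
# Figures from the Performance Landscape tables.
t4 = {"users": 10000, "resp_sec": 0.35, "ubes_per_min": 67}
t3 = {"users": 5000, "resp_sec": 0.88, "ubes_per_min": 10}
ibm_power_750 = {"users": 4000, "resp_sec": 0.61}  # online workload only

# SPARC T4 versus the previous SPARC T3-1 record (both online with batch).
batch_rate_ratio = t4["ubes_per_min"] / t3["ubes_per_min"]
resp_ratio_vs_t3 = t3["resp_sec"] / t4["resp_sec"]

# SPARC T4 (online with batch) versus IBM Power 750 (online only).
resp_ratio_vs_ibm = ibm_power_750["resp_sec"] / t4["resp_sec"]
user_ratio_vs_ibm = t4["users"] / ibm_power_750["users"]

print(f"Batch rate vs T3-1:    {batch_rate_ratio:.1f}x")
print(f"Response time vs T3-1: {resp_ratio_vs_t3:.1f}x faster")
print(f"Response time vs IBM:  {resp_ratio_vs_ibm:.1f}x faster")
print(f"Online users vs IBM:   {user_ratio_vs_ibm:.1f}x")
```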

Configuration Summary

Application Tier Configuration:

1 x SPARC T4-2 server with
2 x 2.85 GHz SPARC T4 processors
128 GB main memory
6 x 300 GB 10K RPM SAS internal HDD
Oracle Solaris 10 9/10
JD Edwards EnterpriseOne 9.0.2 with Tools 8.98.3.3

Web Tier Configuration:

1 x SPARC T4-2 server with
2 x 2.85 GHz SPARC T4 processors
256 GB main memory
2 x 300 GB SSD
4 x 300 GB 10K RPM SAS internal HDD
Oracle Solaris 10 9/10
Oracle WebLogic Server 11g Release 1

Database Tier Configuration:

1 x SPARC T4-1 server with
1 x 2.85 GHz SPARC T4 processor
128 GB main memory
6 x 300 GB 10K RPM SAS internal HDD
2 x Sun Storage F5100 Flash Array
Oracle Solaris 10 9/10
Oracle Database 11g Release 2

Benchmark Description

JD Edwards EnterpriseOne is an integrated applications suite of Enterprise Resource Planning (ERP) software. Oracle offers 70 JD Edwards EnterpriseOne application modules to support a diverse set of business operations.

Oracle's Day in the Life (DIL) kit is a suite of scripts that exercises most common transactions of JD Edwards EnterpriseOne applications, including business processes such as payroll, sales order, purchase order, work order, and manufacturing processes, such as ship confirmation. These are labeled by industry acronyms such as SCM, CRM, HCM, SRM and FMS. The kit's scripts execute transactions typical of a mid-sized manufacturing company.

  • The workload consists of online transactions and the Universal Batch Engine (UBE) workload of 42 short, 8 medium and 4 long UBEs.

  • LoadRunner runs the DIL workload, collects the users' transaction response times and reports the key metric of Combined Weighted Average Transaction Response time.

  • The UBE processes workload runs from the JD Edwards EnterpriseOne application server.

    • Oracle's UBE processes come in three flavors:
      • Short UBEs < 1 minute engage in Business Report and Summary Analysis,
      • Mid UBEs > 1 minute create a large report of Account, Balance, and Full Address,
      • Long UBEs > 2 minutes simulate Payroll, Sales Order, night only jobs.
    • The UBE workload generates large numbers of PDF report files and log files.
    • The UBE queues are categorized as QBATCHD, a single-threaded queue for large and medium UBEs, and QPROCESS, a queue for short UBEs that run concurrently.

Oracle’s UBE process performance metric is Number of Maximum Concurrent UBE processes at transaction rate, UBEs/minute.

Key Points and Best Practices

One JD Edwards EnterpriseOne Application Server and two Oracle WebLogic Servers 11g R1 coupled with two Oracle Fusion Middleware 11g Web Tier HTTP Server instances on the SPARC T4-2 servers were hosted in three separate Oracle Solaris Containers to demonstrate consolidation of multiple application and web servers.

  • Interrupt fencing was configured on all Oracle Solaris Containers to channel the interrupts to processors other than the processor sets used for the JD Edwards Application server and WebLogic servers.

  • Processor 0 was left alone for clock interrupts.

  • The applications were executed in the FX scheduling class to improve performance by reducing the frequency of context switches.

  • A WebLogic vertical cluster was configured on each WebServer Container with twelve managed instances each to load balance users' requests and to provide the infrastructure that enables scaling to a high number of users with ease of deployment and high availability.

  • The database server was run in an Oracle Solaris Container hosted on the SPARC T4-2 server.

  • The database log writer was run in the real time RT class and bound to a processor set.

  • The database redo logs were configured on the raw disk partitions.

  • The private network between the SPARC T4-2 servers was configured with a 10 GbE interface.

  • The Oracle Solaris Container on the Enterprise Application server ran 42 Short UBEs, 8 Medium UBEs and 4 Long UBEs concurrently as the mixed size batch workload.

  • The mixed size UBEs ran concurrently from the application server with the 10000 online users driven by LoadRunner.

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/26/2011.

SPARC T4 Servers Set World Record on Siebel Loyalty Batch

Oracle's SPARC T4-2 and SPARC T4-4 servers running Oracle's Siebel Loyalty Batch engine delivered a world record result for batch processing.

  • The SPARC T4-2 and SPARC T4-4 servers running Siebel Loyalty Batch engine, part of Siebel Loyalty Solution, with Oracle Database 11g Release 2 running on Oracle Solaris 10 achieved 7.65M TPH on Accrual (Reward) processing using three Siebel Servers.

  • The world record result was achieved with 24M members and 50M records in the base transaction table.

  • Siebel Loyalty Application was configured with 50 Active Promotions with three Assign Points and four Update Attributes.

  • Oracle's Siebel Server scaled near linearly on SPARC T4 systems, from 2.72M TPH on a single Siebel Server to 7.65M TPH with three Siebel Servers.

  • The average CPU utilization on the database tier server was 25% and on the application tier server was 65%, leaving significant room for application growth.
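
The "near linear" scaling claim can be quantified as a scaling efficiency. This is a minimal sketch with an illustrative function name; it uses only the TPH figures quoted in the bullets above.

```python
def scaling_efficiency(base_tph, scaled_tph, servers):
    """Return (speedup, efficiency) relative to a single server.

    Efficiency is the observed speedup divided by the ideal speedup,
    which equals the number of servers.
    """
    speedup = scaled_tph / base_tph
    return speedup, speedup / servers


# 2.72M TPH on one Siebel Server, 7.65M TPH on three.
speedup, efficiency = scaling_efficiency(2.72e6, 7.65e6, 3)

print(f"Speedup with three Siebel Servers: {speedup:.2f}x")
print(f"Scaling efficiency: {efficiency:.1%}")
```

Three servers deliver roughly 2.8x the single-server rate, or about 94% of ideal linear scaling.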

Performance Landscape

System                 Processor                   TPH     Version
3 x SPARC T4-2 (app)   SPARC T4, 2.85 GHz          7.65M   8.1.1.1FP
1 x SPARC T4-4 (db)    SPARC T4, 3.0 GHz
2 x SPARC T3-2 (app)   SPARC T3, 1.65 GHz          3.9M    8.1.1.1FP
1 x SPARC T3-1 (app)   SPARC T3, 1.65 GHz
1 x SPARC M5000 (db)   SPARC64 VII, 2.52 GHz
Customer (app)         4 x Intel E5540, 2.53 GHz   1.5M    8.1.x
Customer (db)          1 x Itanium, 1.6 GHz

Configuration Summary

Hardware Configuration:

3 x SPARC T4-2 servers, each with
2 x SPARC T4 processors, 2.85 GHz
128 GB main memory
1 x SPARC T4-4 server with
4 x SPARC T4 processors, 3.0 GHz
256 GB main memory
1 x Sun Storage 6180 array
16 disk drives
CSM200 with 16 disk drives

Software Configuration:

Oracle Solaris 10
Siebel Server 8.1.1.1FP
Oracle Database 11g Release 2 Enterprise Edition 11.2.0.1

Benchmark Description

Siebel Loyalty enables companies to simulate and process loyalty rewards for their activities across channels and process very high volume accrual and tier assessment transactions via batch process.

The benchmark simulates a workload of Accrual Batch Transactions Processing which imports data through Enterprise Integration Manager (EIM), evaluates eligible promotions and calculates rewards. The key performance metric is transactions per hour (TPH). Key aspects of the workload simulation include:

  • Batch Engine evaluating all accrual promotions and applying all actions in one go,
  • Users do not have control over the sequence in which promotions are applied,
  • Promotion actions (assign/redeem points) are rolled back in case of failure.

The number of active promotions and, in particular, the Assign Point action have a very significant impact on performance. The load simulated 50 Active promotions with 3 Assign Points and 7 Update Attribute actions configured.

The number of members and the number of queued transactions in the backend database have a significant impact on performance. The benchmark had 24 million members and 52 million records in the base transaction table. The simplified process flow of the benchmark is:

  • calculate accruals based on promotions,
  • credit points to members,
  • initiate any other actions specified in promotions.

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/26/2011.

SPARC T4-4 Server Sets World Record on PeopleSoft Payroll (N.A.) 9.1, Outperforms IBM Mainframe, HP Itanium

Oracle's SPARC T4-4 server achieved world record performance on the Unicode version of Oracle's PeopleSoft Enterprise Payroll (N.A) 9.1 extra-large volume model benchmark using Oracle Database 11g Release 2 running on Oracle Solaris 10.

  • The SPARC T4-4 server was able to process 1,460,544 payments/hour using PeopleSoft Payroll N.A 9.1.

  • The SPARC T4-4 server UNICODE result of 30.84 minutes on Payroll 9.1 is 2.8x faster than IBM z10 EC 2097 Payroll 9.0 (UNICODE version) result of 87.4 minutes. The IBM mainframe is rated at 6,512 MIPS.

  • The SPARC T4-4 server UNICODE result of 30.84 minutes on Payroll 9.1 is 3.1x faster than HP rx7640 Itanium2 non-UNICODE result of 96.17 minutes, on Payroll 9.0.

  • The average CPU utilization on the SPARC T4-4 server was only 30%, leaving significant room for business growth.

  • The SPARC T4-4 server processed payroll for 500,000 employees, 750,000 payments, in 30.84 minutes compared to the earlier world record result of 46.76 minutes on Oracle's SPARC Enterprise M5000 server.

  • The SPARC Enterprise M5000 server configured with eight 2.66 GHz SPARC64 VII+ processors has a result of 46.76 minutes on Payroll 9.1. That is 7% better than the result of 50.11 minutes on the SPARC Enterprise M5000 server configured with eight 2.53 GHz SPARC64 VII processors on Payroll 9.0. The difference in clock speed between the two processors is ~5%, which is close to the difference between the two results. This suggests that the Payroll 9.1 workload has about the same impact on the overall result as Payroll 9.0.

Performance Landscape

PeopleSoft Payroll (N.A.) 9.1 – 500K Employees (7 Million SQL PayCalc, Unicode)

System                                   OS/Database          Payroll Processing   Run 1       Num of
                                                              Result (minutes)     (minutes)   Streams
SPARC T4-4, 4 x 3.0 GHz SPARC T4         Solaris/Oracle 11g   30.84                43.76       96
SPARC M5000, 8 x 2.66 GHz SPARC64 VII+   Solaris/Oracle 11g   46.76                66.28       32

PeopleSoft Payroll (N.A.) 9.0 – 500K Employees (3 Million SQL PayCalc, Non-Unicode)

System                                OS/Database          Time in Minutes                           Num of
                                                           Payroll      Run 1    Run 2    Run 3      Streams
                                                           Processing
                                                           Result
Sun M5000, 8 x 2.53 GHz SPARC64 VII   Solaris/Oracle 11g   50.11        73.88    534.20   1267.06    32
IBM z10 EC 2097, 9 x 4.4 GHz Gen1     Z/OS /DB2            58.96        80.5     250.68   462.6      8
IBM z10 EC 2097, 9 x 4.4 GHz Gen1     Z/OS /DB2            87.4 **      107.6    -        -          8
HP rx7640, 8 x 1.6 GHz Itanium2       HP-UX/Oracle 11g     96.17        133.63   712.72   1665.01    32

** This result was run with Unicode. The IBM z10 EC 2097 UNICODE result of 87.4 minutes is 48% slower than IBM z10 EC 2097 non-UNICODE result of 58.96 minutes, both on Payroll 9.0, each configured with nine 4.4GHz Gen1 processors.
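
The ratios quoted in this section follow from the elapsed times in the tables. The sketch below recomputes them; the helper names are illustrative. The payments-per-hour line uses the rounded figures of 750,000 payments and 30.84 minutes, so it lands slightly below the published 1,460,544, which presumably reflects unrounded counts and times.

```python
# Payroll processing results in minutes, from the tables in this section.
T4_4_91_UNICODE = 30.84
M5000_266_91 = 46.76
M5000_253_90 = 50.11
IBM_Z10_90_UNICODE = 87.4
IBM_Z10_90_NON_UNICODE = 58.96
HP_RX7640_90 = 96.17


def times_faster(slower_min, faster_min):
    """Ratio of elapsed times: how many times faster the quicker run is."""
    return slower_min / faster_min


def percent_change(new, old):
    """Relative change of new versus old, in percent."""
    return (new - old) / old * 100.0


vs_ibm = times_faster(IBM_Z10_90_UNICODE, T4_4_91_UNICODE)   # about 2.8x
vs_hp = times_faster(HP_RX7640_90, T4_4_91_UNICODE)          # about 3.1x
m5000_gain = -percent_change(M5000_266_91, M5000_253_90)     # about 7%
clock_gain = percent_change(2.66, 2.53)                      # about 5%
unicode_penalty = percent_change(IBM_Z10_90_UNICODE,
                                 IBM_Z10_90_NON_UNICODE)     # about 48%

# Assumption: rounded inputs, so this only approximates the published rate.
payments_per_hour = 750_000 / T4_4_91_UNICODE * 60

print(f"T4-4 vs IBM z10 (Unicode): {vs_ibm:.1f}x faster")
print(f"T4-4 vs HP rx7640:         {vs_hp:.1f}x faster")
print(f"M5000 9.1 vs 9.0:          {m5000_gain:.1f}% better")
print(f"M5000 clock difference:    {clock_gain:.1f}%")
print(f"IBM Unicode penalty:       {unicode_penalty:.0f}% slower")
print(f"Payments per hour:         {payments_per_hour:,.0f}")
```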

Payroll 9.1 Compared to Payroll 9.0

Please note that Payroll 9.1 is Unicode based and Payroll 9.0 had non-Unicode and Unicode versions of the workload. There are 7 million executions of an SQL statement for the PayCalc batch process in Payroll 9.1 and 3 million executions of the same SQL statement for the PayCalc batch process in Payroll 9.0. This is reflected in the elapsed time (27.33 min for 9.1 and 23.78 min for 9.0). The elapsed times of all other batch processes are lower (better) on 9.1.

Configuration Summary

Hardware Configuration:

SPARC T4-4 server
4 x 3.0 GHz SPARC T4 processors
256 GB memory
Sun Storage F5100 Flash Array
80 x 24 GB FMODs

Software Configuration:

Oracle Solaris 10 8/11
PeopleSoft HRMS and Campus Solutions 9.10.303
PeopleSoft Enterprise (PeopleTools) 8.51.035
Oracle Database 11g Release 2 11.2.0.1 (64-bit)
Micro Focus COBOLServer Express 5.1 (64-bit)

Benchmark Description

The PeopleSoft 9.1 Payroll (North America) benchmark is a performance benchmark established by PeopleSoft to demonstrate system performance for a range of processing volumes in a specific configuration. This information may be used to determine the software, hardware, and network configurations necessary to support processing volumes. This workload represents large batch runs typical of OLTP workloads during a mass update.

The benchmark measures five application business process run times for a database representing a large organization. The five processes are:

  • Paysheet Creation: Generates payroll data worksheets consisting of standard payroll information for each employee for a given pay cycle.

  • Payroll Calculation: Looks at paysheets and calculates checks for those employees.

  • Payroll Confirmation: Takes information generated by Payroll Calculation and updates the employees' balances with the calculated amounts.

  • Print Advice forms: The process takes the information generated by Payroll Calculations and Confirmation and produces an Advice for each employee to report Earnings, Taxes, Deduction, etc.

  • Create Direct Deposit File: The process takes information generated by the above processes and produces an electronic transmittal file that is used to transfer payroll funds directly into an employee's bank account.

Key Points and Best Practices

  • The SPARC T4-4 server with the Sun Storage F5100 Flash Array device had an average read throughput of up to 103 MB/sec and an average write throughput of up to 124 MB/sec while consuming 30% CPU on average.

  • The Sun Storage F5100 Flash Array device is a solid-state device that provides a read latency of only 0.5 msec. That is about 10 times faster than the normal disk latencies of 5 msec measured on this benchmark.

See Also

  • Oracle PeopleSoft Benchmark White Papers (oracle.com)
  • PeopleSoft Enterprise Human Capital Management (Payroll) (oracle.com)
  • PeopleSoft Enterprise Payroll 9.1 Using Oracle for Solaris (Unicode) on Oracle's SPARC T4-4 – White Paper (oracle.com)
  • SPARC T4-4 Server (oracle.com)
  • Oracle Solaris (oracle.com)
  • Oracle Database 11g Release 2 Enterprise Edition (oracle.com)
  • Sun Storage F5100 Flash Array (oracle.com)

Disclosure Statement

Oracle's PeopleSoft Payroll 9.1 benchmark, SPARC T4-4 30.84 min,
http://www.oracle.com/us/solutions/benchmark/apps-benchmark/peoplesoft-167486.html, results 9/26/2011.

Monday Sep 19, 2011

Halliburton ProMAX® Seismic Processing on Sun Blade X6270 M2 with Sun ZFS Storage 7320

The single workflow scalability and multiple workflow throughput of Halliburton/Landmark's ProMAX® 3D Pre-Stack Kirchhoff Time Migration (PSTM) are evaluated, using various scheduling methods, on a cluster of Oracle's Sun Blade X6270 M2 server modules attached to Oracle's Sun ZFS Storage 7320 appliance.

Two resource scheduling methods, compact and distributed, are compared while increasing the system load with additional concurrent ProMAX® workflows.

  • Multiple concurrent 24-process ProMAX® PSTM workflow throughput is constant; 10 workflows on 10 nodes finish as fast as 1 workflow on one compute node. Additionally, processing twice the data volume yields similar traces/second throughput performance.

  • A single ProMAX® PSTM workflow has good scaling from 1 to 10 nodes of a Sun Blade X6270 M2 cluster, scaling 4.5X. ProMAX® scales to 4.7X on 10 nodes with one input data set and 6.3X with two consecutive input data sets (i.e. twice the data).

  • A single ProMAX® PSTM workflow has near linear scaling of 11x on a Sun Blade X6270 M2 server module when running from 1 to 12 processes.

  • The 12-process ProMAX® workflow throughput using the distributed scheduling method is equivalent to or slightly faster than the compact scheme for 1 to 6 concurrent workflows.

Performance Landscape

Multiple 24-Process Workflow Throughput Scaling

This test measures the system throughput scalability as concurrent 24-process workflows are added, one workflow per node. The per workflow throughput and the system scalability are reported.

Aggregate system throughput scales linearly. Ten concurrent workflows finish in the same time as does one workflow on a single compute node.

Halliburton ProMAX® Pre-Stack Time Migration - Multiple Workflow Scaling


Single Workflow Scaling

This test measures single workflow scalability across a 10-node cluster. Utilizing a single data set, performance exhibits near linear scaling of 11x at 12 processes and per-node scaling of 4x at 6 nodes. Performance then flattens quickly, reaching a peak of 60x at 240 processes and per-node scaling of 4.7x with 10 nodes.

Running with two consecutive input data sets in the workflow, scaling is considerably improved with peak scaling ~35% higher than obtained using a single data set. Doubling the data set size minimizes time spent in workflow initialization, data input and output.

Halliburton ProMAX® Pre-Stack Time Migration - Single Workflow Scaling

This next test measures single workflow scalability across a 10-node cluster (as above) but limits scheduling to a maximum of 12 processes per node, effectively restricting each node to a maximum of one process per physical core. The speedups relative to a single process and to a single node are reported.

Utilizing a single data set, performance exhibits near linear scaling of 37x at 48 processes, and per-node scaling of 4.3x at 6 nodes. Performance of 55x at 120 processes and per-node scaling of 5x with 10 nodes is reached, and scalability is trending higher more strongly than in the case of two processes running per physical core above. For equivalent total process counts, multi-node runs using only a single process per physical core appear to run between 28% and 64% more efficiently (at 96 and 24 processes, respectively). With a full complement of 10 nodes (120 processes), the peak performance is only 9.5% lower than with 2 processes per physical core (240 processes).

Running with two consecutive input data sets in the workflow, scaling is considerably improved with peak scaling ~35% higher than obtained using a single data set.
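
The scaling figures quoted for the two single-workflow tests can be turned into parallel efficiencies (speedup divided by process count). This is a sketch using the rounded speedups from the text; the names are illustrative, and because the inputs are rounded, the computed gap between the two peaks (about 8%) differs slightly from the 9.5% reported from the measured data.

```python
# (total processes, reported speedup versus a single process).
first_test = [(12, 11.0), (240, 60.0)]     # up to two processes per physical core
second_test = [(48, 37.0), (120, 55.0)]    # at most one process per physical core


def parallel_efficiency(processes, speedup):
    """Fraction of ideal linear speedup that was achieved."""
    return speedup / processes


efficiencies = {
    processes: parallel_efficiency(processes, speedup)
    for processes, speedup in first_test + second_test
}

for processes in sorted(efficiencies):
    print(f"{processes:>3} processes: {efficiencies[processes]:.0%} efficient")

# Gap between the two peak speedups, computed from the rounded figures.
peak_gap_pct = (60.0 - 55.0) / 60.0 * 100.0
print(f"Peak gap from rounded speedups: {peak_gap_pct:.1f}%")
```

Efficiency drops from about 92% at 12 processes to 25% at 240 processes, which is the flattening described in the text.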

Halliburton ProMAX® Pre-Stack Time Migration - Single Workflow Scaling

Multiple 12-Process Workflow Throughput Scaling, Compact vs. Distributed Scheduling

The fourth test compares compact and distributed scheduling of 1, 2, 4, and 6 concurrent 12-process workflows.

All things being equal, the system bi-section bandwidth should improve with distributed scheduling of a fixed-size workflow. As more nodes are used for a workflow, more memory and system cache are employed, and any node memory bandwidth bottlenecks can be offset by distributing communication across the network (provided the network and inter-node communication stack do not become a bottleneck). When physical cores are not over-subscribed, compact and distributed scheduling performance is within 3%, suggesting that there may be little memory contention for this workflow on the benchmarked system configuration.

With compact scheduling of two concurrent 12-process workflows, the physical cores become over-subscribed and performance degrades 36% per workflow. With four concurrent workflows, physical cores are over-subscribed 4x and performance degrades 66% per workflow. With six concurrent workflows, over-subscribed compact scheduling performance degrades 77% per workflow. As multiple 12-process workflows become more and more distributed, the performance approaches the non-over-subscribed case.
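
One way to read the degradation figures is in terms of aggregate throughput. The sketch below assumes that "degrades N% per workflow" means each workflow's throughput falls to (100 - N)% of the non-over-subscribed rate; the source does not define the metric precisely, so this interpretation and the names used are assumptions.

```python
# Per-workflow degradation under compact scheduling, keyed by the number
# of concurrent 12-process workflows (figures from the text).
degradation_pct = {2: 36, 4: 66, 6: 77}


def aggregate_throughput(workflows, degradation):
    """Combined throughput relative to one non-over-subscribed workflow.

    Assumption: each workflow runs at (1 - degradation/100) of its
    non-over-subscribed throughput.
    """
    per_workflow = 1.0 - degradation / 100.0
    return workflows * per_workflow


aggregate = {
    n: aggregate_throughput(n, pct) for n, pct in degradation_pct.items()
}

for n, total in aggregate.items():
    print(f"{n} workflows: {total:.2f}x the throughput of a single workflow")
```

Under that reading, over-subscribing with compact scheduling yields roughly 1.3x to 1.4x the throughput of one workflow, whereas distributing the same workflows approaches one full unit of throughput per workflow.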

Halliburton ProMAX® Pre-Stack Time Migration - Multiple Workflow Scaling (141616 traces x 624 samples)

Test Notes

All tests were performed with one input data set (70808 traces x 624 samples) and two consecutive input data sets (2 * (70808 traces x 624 samples)) in the workflow. All results reported are the average of at least 3 runs and performance is based on reported total wall-clock time by the application.

All tests were run with NFS attached Sun ZFS Storage 7320 appliance and then with NFS attached legacy Sun Fire X4500 server. The StorageTek Workload Analysis Tool (SWAT) was invoked to measure the I/O characteristics of the NFS attached storage used on separate runs of all workflows.

Configuration Summary

Hardware Configuration:

10 x Sun Blade X6270 M2 server modules, each with
2 x 3.33 GHz Intel Xeon X5680 processors
48 GB DDR3-1333 memory
4 x 146 GB, Internal 10000 RPM SAS-2 HDD
10 GbE
Hyper-Threading enabled

Sun ZFS Storage 7320 Appliance
1 x Storage Controller
2 x 2.4 GHz Intel Xeon 5620 processors
48 GB memory (12 x 4 GB DDR3-1333)
2 TB Read Cache (4 x 512 GB Read Flash Accelerator)
10 GbE
1 x Disk Shelf
20.0 TB RAID-Z (20 x 1 TB SAS-2, 7200 RPM HDD)
4 x Write Flash Accelerators

Sun Fire X4500
2 x 2.8 GHz AMD Opteron 290 processors
16 GB DDR1-400 memory
34.5 TB RAID-Z (46 x 750 GB SATA-II, 7200 RPM HDD)
10 GbE

Software Configuration:

Oracle Linux 5.5
Parallel Virtual Machine 3.3.11 (bundled with ProMAX)
Intel 11.1.038 Compilers
Libraries: pthreads 2.4, Java 1.6.0_01, BLAS, Stanford Exploration Project Libraries

Benchmark Description

The ProMAX® family of seismic data processing tools is the most widely used Oil and Gas Industry seismic processing application. ProMAX® is used for multiple applications, from field processing and quality control, to interpretive project-oriented reprocessing at oil companies and production processing at service companies. ProMAX® is integrated with Halliburton's OpenWorks® Geoscience Oracle Database to index prestack seismic data and populate the database with processed seismic.

This benchmark evaluates single workflow scalability and multiple workflow throughput of the ProMAX® 3D Prestack Kirchhoff Time Migration (PSTM) while processing the Halliburton benchmark data set containing 70,808 traces with 8 msec sample interval and trace length of 4992 msec. Benchmarks were performed with both one and two consecutive input data sets.
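The data set dimensions quoted in the test notes follow from the figures in this description. A minimal sketch of that arithmetic, assuming the sample count is simply trace length divided by sample interval (the function and constant names are illustrative, not part of ProMAX):

```python
# Figures quoted in the benchmark description.
TRACE_LENGTH_MS = 4992        # trace length in milliseconds
SAMPLE_INTERVAL_MS = 8        # sample interval in milliseconds
TRACES_PER_DATA_SET = 70_808  # traces in the Halliburton benchmark data set


def samples_per_trace(trace_length_ms: int, sample_interval_ms: int) -> int:
    """Samples in one trace, assuming length / interval with no extra endpoint sample."""
    return trace_length_ms // sample_interval_ms


def total_samples(data_sets: int) -> int:
    """Total samples read by a workflow with the given number of input data sets."""
    per_trace = samples_per_trace(TRACE_LENGTH_MS, SAMPLE_INTERVAL_MS)
    return data_sets * TRACES_PER_DATA_SET * per_trace


if __name__ == "__main__":
    print("samples per trace:", samples_per_trace(TRACE_LENGTH_MS, SAMPLE_INTERVAL_MS))
    for n in (1, 2):
        print(n, "data set(s):", n * TRACES_PER_DATA_SET, "traces,", total_samples(n), "samples")
```

With two consecutive input data sets the trace count doubles to 141616, which is the larger workflow size used in the scaling tables.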

Each workflow consisted of:

  • reading the previously constructed MPEG encoded processing parameter file
  • reading the compressed seismic data traces from disk
  • performing the PSTM imaging
  • writing the result to disk

Workflows using two input data sets were constructed by simply adding a second identical seismic data read task immediately after the first in the processing parameter file. This effectively doubled the data volume read, processed, and written.

This version of ProMAX® uses only Parallel Virtual Machine (PVM) as the parallel processing paradigm. The PVM software uses only TCP networking and has no internal facility for assigning memory affinity and processor binding. Every compute node runs a PVM daemon.

The ProMAX® processing parameters used for this benchmark:

Minimum output inline = 65
Maximum output inline = 85
Inline output sampling interval = 1
Minimum output xline = 1
Maximum output xline = 200 (fold)
Xline output sampling interval = 1
Antialias inline spacing = 15
Antialias xline spacing = 15
Stretch Mute Aperture Limit with Maximum Stretch = 15
Image Gather Type = Full Offset Image Traces
No Block Moveout
Number of Alias Bands = 10
3D Amplitude Phase Correction
No compression
Maximum Number of Cache Blocks = 500000

Primary PSTM business metrics are typically time-to-solution and accuracy of the subsurface imaging solution.

Key Points and Best Practices

  • Multiple job system throughput scales perfectly; ten concurrent workflows on 10 nodes each complete in the same time and have the same throughput as a single workflow running on one node.
  • Best single workflow scaling is 6.6x using 10 nodes.

    When several similar workflows must be processed, the most efficient approach is to distribute them one workflow per node (or even across two nodes) and run them concurrently, rather than using all nodes for each workflow and running the workflows consecutively, even though each individual time-to-solution will be longer. For example, while the best-case configuration used here runs 6.6 times faster on all ten nodes than on a single node, ten such 10-node jobs run consecutively take over 50% longer overall to complete than ten single-node jobs run concurrently.
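    The "over 50% longer" figure can be checked with a short calculation. This is a sketch under the assumptions stated in the bullets above: a 10-node job runs 6.6 times faster than a single-node job, and ten concurrent single-node jobs each finish in the single-node time. The function names are illustrative.

```python
def consecutive_makespan(jobs: int, single_node_time: float, speedup: float) -> float:
    """Total time when each job uses all nodes and the jobs run one after another."""
    return jobs * single_node_time / speedup


def concurrent_makespan(single_node_time: float) -> float:
    """Total time when every job gets its own node and all jobs run at once.

    Assumes perfect multi-job scaling, as reported in the first key point.
    """
    return single_node_time


T = 1.0  # single-node time-to-solution, normalized to 1
ratio = consecutive_makespan(10, T, 6.6) / concurrent_makespan(T)

if __name__ == "__main__":
    print(f"consecutive / concurrent = {ratio:.3f}")
```

    The ratio is 10 / 6.6, or roughly 1.5, so the consecutive schedule takes a little over 50% longer even though each individual job finishes sooner.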

  • Throughput was seen to scale better with larger workflows. While throughput with large and small workflows is similar on one node, the larger dataset exhibits 11% and 35% more throughput with four and 10 nodes respectively.

  • 200 processes appears to be a scalability asymptote with these workflows on the systems used.
  • Hyper-Threading marginally helps throughput. For the largest model run on 10 nodes, 240 processes deliver 11% more performance than 120 processes.

  • The workflows do not exhibit significant I/O bandwidth demands. Even with 10 concurrent 24-process jobs, the measured aggregate system I/O did not exceed 100 MB/s.

  • 10 GbE was the only network used and, though shared for all interprocess communication and network attached storage, it appears to have sufficient bandwidth for all test cases run.
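The process counts quoted in these key points line up with the hardware configuration. A small sketch of the mapping follows; the six-cores-per-socket value is not stated in this post and is taken here as an assumption from Intel's published Xeon X5680 specification, and the names are illustrative.

```python
NODES = 10
SOCKETS_PER_NODE = 2   # "2 x 3.33 GHz Intel Xeon X5680 processors" per server module
CORES_PER_SOCKET = 6   # assumption: Xeon X5680 core count, not stated in the post
THREADS_PER_CORE = 2   # Hyper-Threading enabled


def process_slots(nodes: int, hyper_threading: bool) -> int:
    """One process per core, or per hardware thread when Hyper-Threading is used."""
    per_core = THREADS_PER_CORE if hyper_threading else 1
    return nodes * SOCKETS_PER_NODE * CORES_PER_SOCKET * per_core


if __name__ == "__main__":
    print("10 nodes, cores only:      ", process_slots(NODES, False))
    print("10 nodes, hardware threads:", process_slots(NODES, True))
    print("1 node, hardware threads:  ", process_slots(1, True))
```

Under that assumption, 120 processes corresponds to one process per core across ten nodes, 240 to one per hardware thread, and 24 to a fully threaded single node.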

See Also

Disclosure Statement

The following are trademarks or registered trademarks of Halliburton/Landmark Graphics: ProMAX®, GeoProbe®, OpenWorks®. Results as of 9/1/2011.

Thursday Sep 15, 2011

Sun Fire X4800 M2 Servers (now known as Sun Server X2-8) Produce World Record on SAP SD-Parallel Benchmark

Oracle delivered an SAP enhancement package 4 for SAP ERP 6.0 (Unicode) Sales and Distribution - Parallel (SD Parallel) Benchmark world record result using eight of Oracle's Sun Fire X4800 M2 servers (now known as Sun Server X2-8), Oracle Solaris 10 and Oracle Database 11g Real Application Clusters (RAC) software that achieved 180,000 users as of 10/03/2011.

  • The eight Sun Fire X4800 M2 servers delivered a world record result of 180,000 users on the SAP SD Parallel Benchmark.

  • The eight Sun Fire X4800 M2 server SD Parallel result of 180,000 users delivered 43% more performance compared to the IBM Power 795 server SD two-tier result of 126,063 users.

Performance Landscape

Selected SAP Sales and Distribution (SD) benchmark results are presented in decreasing order of performance. All benchmarks were using SAP enhancement package 4 for SAP ERP 6.0 (Unicode).

System | OS / Database | Users | SAPS | Type | Cert #
Eight Sun Fire X4800 M2, 8 x Intel Xeon E7-8870 @2.4 GHz, 512 GB | Oracle Solaris 10 / Oracle 11g RAC | 180,000 | 1,016,380 | Parallel | 2011037
Six Sun Fire X4800 M2, 8 x Intel Xeon E7-8870 @2.4 GHz, 512 GB | Oracle Solaris 10 / Oracle 11g RAC | 137,904 | 765,470 | Parallel | 2011038
IBM Power 795, 32 x POWER7 @4.0 GHz, 4096 GB | AIX 7.1 / DB2 9.7 | 126,063 | 688,630 | Two-Tier | 2010046
Four Sun Fire X4800 M2, 8 x Intel Xeon E7-8870 @2.4 GHz, 512 GB | Oracle Solaris 10 / Oracle 11g RAC | 94,736 | 546,050 | Parallel | 2011039
Two Sun Fire X4800 M2, 8 x Intel Xeon E7-8870 @2.4 GHz, 512 GB | Oracle Solaris 10 / Oracle 11g RAC | 49,860 | 274,080 | Parallel | 2011040
Four Sun Fire X4470, 4 x Intel Xeon X7560 @2.26 GHz, 256 GB | Solaris 10 / Oracle 11g RAC | 40,000 | 221,020 | Parallel | 2010039
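
The 43% comparison and the way the Sun Fire X4800 M2 results scale with node count can be derived from the table above. The sketch below is illustrative only: "scaling efficiency" is a derived figure defined here as users per node relative to the two-node result, not an SAP benchmark metric.

```python
# Certified SD Parallel results from the table: X4800 M2 node count -> SD users.
SD_PARALLEL_USERS = {2: 49_860, 4: 94_736, 6: 137_904, 8: 180_000}
IBM_POWER_795_USERS = 126_063  # SD two-tier result, certification 2010046


def percent_more(a: float, b: float) -> float:
    """How many percent larger a is than b."""
    return (a / b - 1.0) * 100.0


def scaling_efficiency(results: dict, base_nodes: int = 2) -> dict:
    """Users per node relative to users per node at the base node count."""
    base_per_node = results[base_nodes] / base_nodes
    return {n: (users / n) / base_per_node for n, users in sorted(results.items())}


if __name__ == "__main__":
    gain = percent_more(SD_PARALLEL_USERS[8], IBM_POWER_795_USERS)
    print(f"8-node result vs IBM Power 795: {gain:.1f}% more users")
    for nodes, eff in scaling_efficiency(SD_PARALLEL_USERS).items():
        print(f"{nodes} nodes: {eff:.3f} of two-node per-node throughput")
```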

Complete benchmark results and descriptions can be found at the SAP standard applications benchmark website.
For SD benchmark results website: Two-Tier or Three-Tier. For SD Parallel benchmark results website: SD Parallel.

Configuration and Results Summary

Hardware Configuration:

8 x Sun Fire X4800 M2 servers, each with
8 x Intel Xeon E7-8870 @ 2.4 GHz (8 processors, 80 cores, 160 threads)
512 GB memory

Software Configuration:

SAP enhancement package 4 for SAP ERP 6.0
Oracle Database 11g Real Application Clusters (RAC)
Oracle Solaris 10

Results Summary:

Number of SAP SD benchmark users: 180,000
Average dialog response time: 0.63 seconds
Throughput:
  Fully processed order line items per hour: 20,327,670
  Dialog steps/hour: 60,983,000
  SAPS: 1,016,380
Average database request time (dialog/update): 0.010 sec / 0.055 sec
SAP Certification: 2011037
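
The three throughput figures are related through SAP's definition of the SAPS unit, under which 100 SAPS corresponds to 2,000 fully business processed order line items per hour, or 6,000 dialog steps per hour. That definition is not quoted in this post, so the sketch below treats it as an external assumption; the function names are illustrative.

```python
# SAPS definition (assumption, from SAP's benchmark documentation):
# 100 SAPS = 2,000 order line items/hour = 6,000 dialog steps/hour.
LINE_ITEMS_PER_HOUR_PER_SAPS = 2_000 / 100
DIALOG_STEPS_PER_HOUR_PER_SAPS = 6_000 / 100


def saps_from_line_items(line_items_per_hour: float) -> float:
    """SAPS implied by the order line item throughput."""
    return line_items_per_hour / LINE_ITEMS_PER_HOUR_PER_SAPS


def saps_from_dialog_steps(dialog_steps_per_hour: float) -> float:
    """SAPS implied by the dialog step throughput."""
    return dialog_steps_per_hour / DIALOG_STEPS_PER_HOUR_PER_SAPS


if __name__ == "__main__":
    print(saps_from_line_items(20_327_670))
    print(saps_from_dialog_steps(60_983_000))
```

Both conversions land within a few units of the certified 1,016,380 SAPS; the small difference is consistent with rounding in the published figures.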

Benchmark Description

The SAP Standard Application Sales and Distribution - Parallel (SD Parallel) Benchmark is a two-tier ERP business test that is indicative of full business workloads of complete order processing and invoice processing and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments.

The SD Parallel Benchmark consists of the same transactions and user interaction steps as the two-tier and three-tier SD Benchmark. This means that the SD Parallel Benchmark runs the same business processes as the SD Benchmark. The difference between the benchmarks is the technical data distribution. Additionally, the benchmark requires equal distribution of the benchmark users across all database nodes for the used benchmark clients (round-robin method). Following this rule, all database nodes work on data of all clients. This avoids unrealistic configurations such as having only one client per database node.

The SAP Benchmark Council agreed to give the parallel benchmark a different name so that the difference can be easily recognized by any interested parties - customers, prospects, and analysts. The naming convention is SD Parallel for Sales & Distribution - Parallel.

SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products.

See Also

Disclosure Statement

SAP enhancement package 4 for SAP ERP 6.0 (Unicode) Sales and Distribution Benchmark, results as of 10/03/2011.

SD Parallel, 8 x Sun Fire X4800 M2 (each 8 processors, 80 cores, 160 threads) 180,000 SAP SD Users, Oracle Solaris 10, Oracle 11g Real Application Clusters (RAC), Certification Number 2011037.
SD Parallel, 6 x Sun Fire X4800 M2 (each 8 processors, 80 cores, 160 threads) 137,904 SAP SD Users, Oracle Solaris 10, Oracle 11g Real Application Clusters (RAC), Certification Number 2011038.
SD Parallel, 4 x Sun Fire X4470 (each 4 processors, 32 cores, 64 threads) 40,000 SAP SD Users, Oracle Solaris 10, Oracle 11g Real Application Clusters (RAC), Certification Number 2010039.
SD Two-Tier, IBM Power 795 (32 processors, 256 cores, 1024 threads) 126,063 SAP SD Users, AIX 7.1, DB2 9.7, Certification Number 2010046.

SAP, R/3 are registered trademarks of SAP AG in Germany and other countries. More information may be found at www.sap.com/benchmark.

Monday Sep 12, 2011

SPARC Enterprise M9000 Produces World Record SAP ATO Benchmark

Oracle delivered an SAP enhancement package 4 for SAP ERP 6.0 Assemble-to-Order (ATO) benchmark world record result using Oracle's SPARC Enterprise M9000 server running Oracle Solaris 10 and Oracle Database 11g along with SAP Enhancement Package 4 for SAP ERP 6.0 (Unicode). The SAP ATO benchmark integrates process chains across SAP Business Suite components, including Financials, Logistics, Human Resources, Basis and Cross Application.

  • The SPARC Enterprise M9000 server containing 64 SPARC64 VII+ 3.0 GHz processors, running Oracle Solaris 10 and Oracle Database 11g along with SAP Enhancement Package 4 for SAP ERP 6.0 (Unicode), delivered a world record 206,360 fully processed assembly orders per hour on the SAP enhancement package 4 for SAP ERP 6.0 ATO benchmark.

  • The SPARC Enterprise M9000 server result shows it can more than consolidate the work of the three-tier HP solution which used 80 different servers.

  • Oracle produced the first SAP ATO benchmark result using Unicode encoding.

  • The SAP ATO benchmark uses multiple components of the SAP Business Suite. See more detail at the SAP ATO benchmark webpage.

Performance Landscape

SAP ATO 2-Tier Performance Table (select results in decreasing performance order)

System | OS / Database | Assembly Orders per hour(*) | SAP ERP/ECC Release | Cert Num
SPARC Enterprise M9000, 64 x SPARC64 VII+ @3.0 GHz, 2048 GB | Oracle Solaris 10 / Oracle 11g | 206,360 | SAP ERP6.0* (Unicode) | 2011033
Fujitsu Siemens Primepower 2000, 128 x SPARC64 @560 MHz, 128 GB | Solaris 8 / Oracle 8.1.7 | 34,260 | 4.6B (non-Unicode) | 2001018
HP 9000 Superdome, 64 x PA-RISC 8600 @552 MHz, 128 GB | HP-UX 11.11 / Oracle 8.1.6 | 18,870 | 4.6B (non-Unicode) | 2001014
Fujitsu Siemens Primepower 900, 16 x SPARC64 V @1.35 GHz, 64 GB | Solaris 8 / Oracle 9i | 12,170 | 4.6C (non-Unicode) | 2003012
HP rx5670, 4 x Itanium II @1.0 GHz, 24 GB | HP-UX 11i / Oracle 9i | 3,090 | 4.6C (non-Unicode) | 2002069

(*) SAP enhancement package 4 for SAP ERP6.0 (Unicode)

SAP ATO 3-Tier Performance Table (top results in decreasing performance order)

System | OS / Database | Assembly Orders per hour(*) | SAP ERP/ECC Release | Cert Num
HP 9000 Superdome Enterprise Server, 64 x PA-RISC 8700 @750 MHz, 128 GB | HP-UX 11i / Oracle 9i | 144,090 | 4.6C (non-Unicode) | 2002003
HP 9000 Superdome Enterprise Server, 64 x PA-RISC 8700 @750 MHz, 128 GB | HP-UX 11i / Oracle 9i | 130,570 | 4.6C (non-Unicode) | 2001047

(*) Assembly Order: Request to assemble pre-manufactured parts and assemblies to finished products according to an existing sales order.

Complete benchmark results may be found at the SAP benchmark website: http://www.sap.com/benchmark.

Configuration Summary and Results

Hardware Configuration:

SPARC Enterprise M9000
64 x 3.0 GHz SPARC64 VII+ processors
2048 GB memory

Software Configuration:

Oracle Solaris 10
SAP enhancement package 4 for SAP ERP 6.0 (Unicode)
Oracle Database 11g

Certified Result:

Fully business processed Assembly Orders/hour: 206,360
SAP Certification Number: 2011033

Benchmark Description

The SAP ATO benchmark integrates process chains across SAP Business Suite components. The ATO scenario is characterized by high volume sales, short production times (from hours to one day), and individual assembly for such products as PCs, pumps, and cars. In general, each benchmark user has its own master data, such as material, vendor, or customer master data to avoid data locking situations. However, the ATO Benchmark has been designed to handle and overcome data locking situations - the ATO benchmark users access common master data, such as material, vendor, or customer master data. (source: http://www12.sap.com/solutions/benchmark/ato.epx).

SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products.

See Also

Disclosure Statement

SAP, R/3 are registered trademarks of SAP AG in Germany and other countries. More information may be found at www.sap.com/benchmark

Two-tier SAP ATO standard SAP ERP 6.0 2005/EP4 (Unicode) application benchmarks as of 09/04/11:
Oracle's SPARC Enterprise M9000 (64 processors, 256 cores, 512 threads) 206,360 Assembly Orders/hour, 64 x 3.0 GHz SPARC64 VII+, 2048 GB memory, Oracle 11g, Oracle Solaris 10, Certification Number 2011033.

Two-tier SAP ATO standard 4.6 C application benchmarks as of 09/04/11:
Fujitsu Siemens Primepower 900 (16-way SMP) 12,170 Assembly Orders/hour, 16 x 1.35 GHz SPARC64 V, 64 GB memory, Oracle 9i, Solaris 8, Certification Number 2003012.
HP rx5670 (4 processors SMP) 3,090 Assembly Orders/hour, 4 x 1.0 GHz Itanium II, 24 GB memory, Oracle 9i, HP-UX 11i, Certification Number 2002069.

Two-tier SAP ATO standard 4.6 B application benchmarks as of 09/04/11:
HP 9000 Superdome (64-way SMP) 18,870 Assembly Orders/hour, 64 x 552 MHz PA-RISC 8600, 128 GB memory, Oracle 8.1.6, HP-UX 11.11, Certification Number 2001014.
Fujitsu Siemens Primepower 2000 (128 processors SMP) 34,260 Assembly Orders/hour, 128 x 560 MHz SPARC64, 128 GB memory, Oracle 8.1.7, Solaris 8, Certification Number 2001018.

Three-tier SAP ATO standard 4.6 C application benchmarks as of 09/04/11:
HP 9000 Superdome Enterprise Server (64 processors SMP) 144,090 Assembly Orders/hour, 64 x 750 MHz PA-RISC 8700, 128 GB memory, Oracle 9i, HP-UX 11i, Certification Number 2002003
HP 9000 Superdome Enterprise Server (64 processors SMP) 130,570 Assembly Orders/hour, 64 x 750 MHz PA-RISC 8700, 128 GB memory, Oracle 9i, HP-UX 11i, Certification Number 2001047

Friday Jul 01, 2011

SPARC T3-1 Record Results Running JD Edwards EnterpriseOne Day in the Life Benchmark with Added Batch Component

Using Oracle's SPARC T3-1 server for the application tier and Oracle's SPARC Enterprise M3000 server for the database tier, a world record result was produced running Oracle's JD Edwards EnterpriseOne applications Day in the Life benchmark concurrently with a batch workload.

  • The SPARC T3-1 server based result has 25% better performance than the IBM Power 750 POWER7 server even though the IBM result did not include running a batch component.

  • The SPARC T3-1 server based result has 25% better space/performance than the IBM Power 750 POWER7 server as measured by the online component.

  • The SPARC T3-1 server based result is 5x faster than the x86-based IBM x3650 M2 server system when executing the online component of the JD Edwards EnterpriseOne 9.0.1 Day in the Life benchmark. The IBM result did not include a batch component.

  • The SPARC T3-1 server based result has 2.5x better space/performance than the x86-based IBM x3650 M2 server as measured by the online component.

  • The combination of SPARC T3-1 and SPARC Enterprise M3000 servers delivered a Day in the Life benchmark result of 5000 online users with 0.875 seconds of average transaction response time running concurrently with 19 Universal Batch Engine (UBE) processes at 10 UBEs/minute. The solution exercises various JD Edwards EnterpriseOne applications while running Oracle WebLogic Server 11g Release 1 and Oracle Web Tier Utilities 11g HTTP server in Oracle Solaris Containers, together with the Oracle Database 11g Release 2.

  • The SPARC T3-1 server showed that it could handle the additional workload of batch processing while maintaining the same number of online users for the JD Edwards EnterpriseOne Day in the Life benchmark. This was accomplished with minimal loss in response time.

  • JD Edwards EnterpriseOne 9.0.1 takes advantage of the large number of compute threads available in the SPARC T3-1 server at the application tier and achieves excellent response times.

  • The SPARC T3-1 server consolidates the application/web tier of the JD Edwards EnterpriseOne 9.0.1 application using Oracle Solaris Containers. Containers provide flexibility, easier maintenance and better CPU utilization of the server leaving processing capacity for additional growth.

  • A number of Oracle advanced technology and features were used to obtain this result: Oracle Solaris 10, Oracle Solaris Containers, Oracle Java Hotspot Server VM, Oracle WebLogic Server 11g Release 1, Oracle Web Tier Utilities 11g, Oracle Database 11g Release 2, the SPARC T3 and SPARC64 VII+ based servers.

  • This is the first published result running both online and batch workload concurrently on the JD Enterprise Application server. No published results are available from IBM running the online component together with a batch workload.

  • The 9.0.1 version of the benchmark saw some minor performance improvements relative to 9.0. When comparing between 9.0.1 and 9.0 results, the reader should take this into account when the difference between results is small.

Performance Landscape

JD Edwards EnterpriseOne Day in the Life Benchmark
Online with Batch Workload

This is the first publication on the Day in the Life benchmark run concurrently with batch jobs. The batch workload was provided by Oracle's Universal Batch Engine.

System | Rack Units | Online Users | Resp Time (sec) | Batch Concur (# of UBEs) | Batch Rate (UBEs/m) | Version
SPARC T3-1, 1xSPARC T3 (1.65 GHz), Solaris 10; M3000, 1xSPARC64 VII+ (2.86 GHz), Solaris 10 | 4 | 5000 | 0.88 | 19 | 10 | 9.0.1

Resp Time (sec) — Response time of online jobs reported in seconds
Batch Concur (# of UBEs) — Batch concurrency presented in the number of UBEs
Batch Rate (UBEs/m) — Batch transaction rate in UBEs/minute.

JD Edwards EnterpriseOne Day in the Life Benchmark
Online Workload Only

These results are for the Day in the Life benchmark. They are run without any batch workload.

System | Rack Units | Online Users | Response Time (sec) | Version
SPARC T3-1, 1xSPARC T3 (1.65 GHz), Solaris 10; M3000, 1xSPARC64 VII (2.75 GHz), Solaris 10 | 4 | 5000 | 0.52 | 9.0.1
IBM Power 750, 1xPOWER7 (3.55 GHz), IBM i7.1 | 4 | 4000 | 0.61 | 9.0
IBM x3650M2, 2xIntel X5570 (2.93 GHz), OVM | 2 | 1000 | 0.29 | 9.0

IBM result from http://www-03.ibm.com/systems/i/advantages/oracle/, IBM used WebSphere
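
The performance and space/performance ratios quoted in the highlights can be reproduced from the online-only table. A minimal sketch, assuming "performance" means online users and "space/performance" means online users per rack unit (the post does not define either term explicitly, and the dictionary keys are illustrative labels):

```python
# (rack units, online users) from the online-only results table.
RESULTS = {
    "SPARC T3-1 + M3000": (4, 5000),
    "IBM Power 750": (4, 4000),
    "IBM x3650 M2": (2, 1000),
}


def users_per_rack_unit(system: str) -> float:
    """Online users supported per rack unit of the listed configuration."""
    rack_units, users = RESULTS[system]
    return users / rack_units


def ratio(metric, a: str, b: str) -> float:
    """How many times larger the metric is for system a than for system b."""
    return metric(a) / metric(b)


def users(system: str) -> float:
    """Online users supported by the listed configuration."""
    return RESULTS[system][1]


if __name__ == "__main__":
    sparc = "SPARC T3-1 + M3000"
    for other in ("IBM Power 750", "IBM x3650 M2"):
        print(other,
              "users ratio:", ratio(users, sparc, other),
              "users/RU ratio:", ratio(users_per_rack_unit, sparc, other))
```

Under these assumptions the SPARC result is 1.25x (25% better) than the Power 750 on both measures, and 5x in users and 2.5x in users per rack unit against the x3650 M2.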

Configuration Summary

Hardware Configuration:

1 x SPARC T3-1 server
1 x 1.65 GHz SPARC T3
128 GB memory
16 x 300 GB 10000 RPM SAS
1 x Sun Flash Accelerator F20 PCIe Card, 96 GB
1 x 10 GbE NIC
1 x SPARC Enterprise M3000 server
1 x 2.86 GHz SPARC64 VII+
64 GB memory
1 x 10 GbE NIC
2 x StorageTek 2540 + 2501

Software Configuration:

JD Edwards EnterpriseOne 9.0.1 with Tools 8.98.3.3
Oracle Database 11g Release 2
Oracle WebLogic Server 11g Release 1 version 10.3.2
Oracle Web Tier Utilities 11g
Oracle Solaris 10 9/10
Mercury LoadRunner 9.10 with Oracle Day in the Life kit for JD Edwards EnterpriseOne 9.0.1
Oracle’s Universal Batch Engine - Short UBEs and Long UBEs

Benchmark Description

JD Edwards EnterpriseOne is an integrated applications suite of Enterprise Resource Planning (ERP) software. Oracle offers 70 JD Edwards EnterpriseOne application modules to support a diverse set of business operations.

Oracle's Day in the Life (DIL) kit is a suite of scripts that exercises most common transactions of JD Edwards EnterpriseOne applications, including business processes such as payroll, sales order, purchase order, work order, and other manufacturing processes, such as ship confirmation. These are labeled by industry acronyms such as SCM, CRM, HCM, SRM and FMS. The kit's scripts execute transactions typical of a mid-sized manufacturing company.

  • The workload consists of online transactions and the UBE workload of 15 short and 4 long UBEs.

  • LoadRunner runs the DIL workload, collects the users' transaction response times and reports the key metric of Combined Weighted Average Transaction Response time.

  • The UBE workload runs from the JD Enterprise Application server.

    • Oracle's UBE processes come in three flavors:

      • Short UBEs < 1 minute engage in Business Report and Summary Analysis,
      • Mid UBEs > 1 minute create a large report of Account, Balance, and Full Address,
      • Long UBEs > 2 minutes simulate Payroll, Sales Order, night only jobs.
    • The UBE workload generates large numbers of PDF report files and log files.

    • The UBE queues are QBATCHD, a single-threaded queue for large UBEs, and QPROCESS, a queue in which short UBEs run concurrently.

  • One of the Oracle Solaris Containers ran 4 Long UBEs, while another Container ran 15 short UBEs concurrently.

  • The mixed-size UBEs ran concurrently on the SPARC T3-1 server with the 5000 online users driven by LoadRunner.

  • Oracle’s UBE process performance metric is Number of Maximum Concurrent UBE processes at transaction rate, UBEs/minute.

Key Points and Best Practices

Two JD Edwards EnterpriseOne Application Servers and two Oracle Fusion Middleware WebLogic Servers 11g R1 coupled with two Oracle Fusion Middleware 11g Web Tier HTTP Server instances on the SPARC T3-1 server were hosted in four separate Oracle Solaris Containers to demonstrate consolidation of multiple application and web servers.

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 6/27/2011.

Friday Jun 10, 2011

SPARC Enterprise M5000 Delivers First PeopleSoft Payroll 9.1 Benchmark

Oracle's M-series server sets a world record on Oracle's PeopleSoft Enterprise Payroll (N.A) 9.1 with extra large volume model benchmark (Unicode). Oracle's SPARC Enterprise M5000 server was able to run faster than the previous generation system result even though the PeopleSoft Payroll 9.1 benchmark is more computationally demanding.

Oracle's SPARC Enterprise M5000 server configured with eight 2.66 GHz SPARC64 VII+ processors together with Oracle's Sun Storage F5100 Flash Array storage achieved world record performance on the Unicode version of Oracle's PeopleSoft Enterprise Payroll (N.A) 9.1 with extra large volume model benchmark using Oracle Database 11g Release 2 running on Oracle Solaris 10.

  • The SPARC Enterprise M5000 server processed payroll payments for the 500K employees PeopleSoft Payroll 9.1 (Unicode) benchmark in 46.76 minutes compared to a previous result of 50.11 minutes for the PeopleSoft Payroll 9.0 (non-Unicode) benchmark configured with 2.53 GHz SPARC64 VII processors resulting in 7% better performance.

  • Note that the IBM z10 Gen1 mainframe running the PeopleSoft Payroll 9.0 (Unicode) benchmark was 48% slower than the 9.0 non-Unicode version. The IBM z10 mainframe with nine 4.4 GHz Gen1 processors has a list price over $6M and is rated at 6,512 MIPS.

  • The SPARC Enterprise M5000 server with the Sun Storage F5100 Flash Array system processed payroll for 500K employees, completing the end-to-end run in 66.28 mins, 11% faster than the earlier published result of 73.88 mins with Payroll 9.0 configured with 2.53 GHz SPARC64 VII processors.

  • The Sun Storage F5100 Flash Array device is a high performance, high-density solid-state flash array which provides a read latency of only 0.5 msec which is about 10 times faster than the normal disk latencies of 5 msec measured on this benchmark.

Performance Landscape

PeopleSoft Payroll (N.A.) 9.1 – 500K Employees (7 Million SQL PayCalc, Unicode)

System | Processor | OS/Database | Payroll Processing Result (minutes) | Run 1 (minutes) | Num of Streams
SPARC M5000 | 8x 2.66GHz SPARC64 VII+ | Solaris/Oracle 11g | 46.76 | 66.28 | 32

PeopleSoft Payroll (N.A.) 9.0 – 500K Employees (3 Million SQL PayCalc, Non-Unicode)

Times are in minutes.

System | Processor | OS/Database | Payroll Processing Result | Run 1 | Run 2 | Run 3 | Num of Streams
Sun M5000 | 8x 2.53GHz SPARC64 VII | Solaris/Oracle 11g | 50.11 | 73.88 | 534.20 | 1267.06 | 32
IBM z10 | 9x 4.4GHz Gen1 | Z/OS /DB2 | 58.96 | 80.5 | 250.68 | 462.6 | 8
IBM z10 | 9x 4.4GHz Gen1 | Z/OS /DB2 | 87.4 ** | 107.6 | - | - | 8
HP rx7640 | 8x 1.6GHz Itanium2 | HP-UX/Oracle 11g | 96.17 | 133.63 | 712.72 | 1665.01 | 32

** This result was run with Unicode
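
The percentages quoted in the highlights can be recomputed from the two tables. The sketch below assumes that each percentage is the ratio of the slower elapsed time to the faster one, minus one; the post does not state its formula, so this is one plausible reading, and the function name is illustrative.

```python
def percent_longer(slower_minutes: float, faster_minutes: float) -> float:
    """How many percent longer the slower run took than the faster run."""
    return (slower_minutes / faster_minutes - 1.0) * 100.0


# Payroll processing result: Payroll 9.0 on SPARC64 VII vs Payroll 9.1 on SPARC64 VII+.
payroll_gain = percent_longer(50.11, 46.76)

# End-to-end Run 1: Payroll 9.0 vs Payroll 9.1 on the M5000.
run1_gain = percent_longer(73.88, 66.28)

# IBM z10: Payroll 9.0 Unicode result vs non-Unicode result.
ibm_unicode_penalty = percent_longer(87.4, 58.96)

# Flash read latency vs measured disk latency, both in milliseconds.
latency_ratio = 5.0 / 0.5

if __name__ == "__main__":
    print(f"payroll processing: {payroll_gain:.1f}%")
    print(f"run 1 end-to-end:   {run1_gain:.1f}%")
    print(f"IBM Unicode:        {ibm_unicode_penalty:.1f}%")
    print(f"latency ratio:      {latency_ratio:.0f}x")
```

Rounded to whole percentages, this reading gives the 7%, 11% and 48% figures and the 10x latency ratio stated in the highlights.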

Payroll 9.1 Compared to Payroll 9.0

Please note that Payroll 9.1 is Unicode based and Payroll 9.0 is non-Unicode. There are 7 million executions of an SQL statement for the PayCalc batch process in Payroll 9.1 and 3 million executions of the same SQL statement for the PayCalc batch process in Payroll 9.0. This is reflected in the PayCalc elapsed time (27.33 min for 9.1 and 23.78 min for 9.0). The elapsed times of all other batch processes are lower (better) on 9.1.

Configuration Summary

Hardware Configuration:

SPARC Enterprise M5000 server
8 x 2.66 GHz SPARC64 VII+ processors
128 GB memory
2 x SAS HBA (SG-XPCIE8SAS-E-Z - PCIe HBA for Rack Servers)
Sun Storage F5100 Flash Array
40 x 24 GB FMODs
1 x StorageTek 2501 array with
12 x 146 GB SAS 15K RPM disks
1 x StorageTek 2540 array with
12 x 146 GB SAS 15K RPM disks

Software Configuration:

Oracle Solaris 10 09/10
PeopleSoft HRMS and Campus Solutions 9.10.303
PeopleSoft Enterprise (PeopleTools) 8.51.035
Oracle Database 11g Release 2 11.2.0.1 (64-bit)
Micro Focus COBOLServer Express 5.1 (64-bit)

Benchmark Description

The PeopleSoft 9.1 Payroll (North America) benchmark is a performance benchmark established by PeopleSoft to demonstrate system performance for a range of processing volumes in a specific configuration. This information may be used to determine the software, hardware, and network configurations necessary to support processing volumes. This workload represents large batch runs typical of OLTP workloads during a mass update.

The benchmark measures the run times of five application business processes for a database representing a large organization. The five processes are:

  • Paysheet Creation: Generates payroll data worksheets consisting of standard payroll information for each employee for a given pay cycle.

  • Payroll Calculation: Looks at paysheets and calculates checks for those employees.

  • Payroll Confirmation: Takes information generated by Payroll Calculation and updates the employees' balances with the calculated amounts.

  • Print Advice forms: The process takes the information generated by Payroll Calculations and Confirmation and produces an Advice for each employee to report Earnings, Taxes, Deduction, etc.

  • Create Direct Deposit File: The process takes information generated by the above processes and produces an electronic transmittal file that is used to transfer payroll funds directly into an employee's bank account.

For the benchmark, we collected at least three data points with different numbers of job streams (parallel jobs). This batch benchmark allows a maximum of thirty-two job streams to be configured to run in parallel.

See Also

Disclosure Statement

Oracle's PeopleSoft Payroll 9.1 benchmark, SPARC Enterprise M5000 46.76 min, www.oracle.com/apps_benchmark/html/white-papers-peoplesoft.html, results 6/10/2011.

About

BestPerf is the source of Oracle performance expertise. In this blog, Oracle's Strategic Applications Engineering group explores Oracle's performance results and shares best practices learned from working on Enterprise-wide Applications.
