Tuesday Sep 27, 2011

SPARC T4-2 Servers Set World Record on JD Edwards EnterpriseOne Day in the Life Benchmark with Batch, Outperforms IBM POWER7

Using Oracle's SPARC T4-2 server for the application tier and a SPARC T4-1 server for the database tier, a world record result was produced running Oracle's JD Edwards EnterpriseOne application Day in the Life (DIL) benchmark concurrently with a batch workload.

  • The SPARC T4-2 server running online and batch with JD Edwards EnterpriseOne 9.0.2 is 1.7x faster and has better response time than the IBM Power 750 system, which ran only the online component of the JD Edwards EnterpriseOne 9.0 Day in the Life test.

  • The combination of SPARC T4 servers delivered a Day in the Life benchmark result of 10,000 online users with 0.35 seconds of average transaction response time running concurrently with 112 Universal Batch Engine (UBE) processes at 67 UBEs/minute.

  • This is the first JD Edwards EnterpriseOne benchmark for 10,000 users and payroll batch on a SPARC T4-2 server for the application tier and the database tier with Oracle Database 11g Release 2. All servers ran with the Oracle Solaris 10 operating system.

  • The single-thread performance of the SPARC T4 processor produced sub-second response for the online components and provided dramatic performance for the batch jobs.

  • The SPARC T4 servers, JD Edwards EnterpriseOne 9.0.2, and Oracle WebLogic Server 11g Release 1 support 17% more users per JAS (Java Application Server) than the SPARC T3-1 server for this benchmark.

  • The SPARC T4-2 server provided a 6.7x better batch processing rate than the previous SPARC T3-1 server record result and had 2.5x faster response time.

  • The SPARC T4-2 server used Oracle Solaris Containers, which provide flexible, scalable and manageable virtualization.

  • JD Edwards EnterpriseOne uses Oracle Fusion Middleware WebLogic Server 11g R1 and Oracle Fusion Middleware Cluster Web Tier Utilities 11g HTTP server.

  • The combination of the SPARC T4-2 server and Oracle JD Edwards EnterpriseOne in the application tier with a SPARC T4-1 server in the database tier measured low CPU utilization providing headroom for growth.

Performance Landscape

JD Edwards EnterpriseOne Day in the Life Benchmark
Online with Batch Workload

| System | Online Users | Resp Time (sec) | Batch Concur (# of UBEs) | Batch Rate (UBEs/m) | Version |
|---|---|---|---|---|---|
| 2 x SPARC T4-2 (app+web), SPARC T4-1 (db) | 10000 | 0.35 | 112 | 67 | 9.0.2 |
| SPARC T3-1 (app+web), SPARC Enterprise M3000 (db) | 5000 | 0.88 | 19 | 10 | 9.0.1 |

Resp Time (sec) — Response time of online jobs reported in seconds
Batch Concur (# of UBEs) — Batch concurrency presented in the number of UBEs
Batch Rate (UBEs/m) — Batch transaction rate in UBEs per minute

JD Edwards EnterpriseOne Day in the Life Benchmark
Online Workload Only

| System | Online Users | Response Time (sec) | Version |
|---|---|---|---|
| SPARC T3-1, 1 x SPARC T3 (1.65 GHz), Solaris 10 (app); M3000, 1 x SPARC64 VII (2.75 GHz), Solaris 10 (db) | 5000 | 0.52 | 9.0.1 |
| IBM Power 750, POWER7 (3.55 GHz) (app+db) | 4000 | 0.61 | 9.0 |

IBM result from http://www-03.ibm.com/systems/i/advantages/oracle/, IBM used WebSphere
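
The ratios quoted in the highlights can be rechecked from the two tables above. The snippet below is only a sanity-check sketch: the dictionary and function names are illustrative and not part of any benchmark kit, and the figures are copied from the tables.

```python
# Figures copied from the Performance Landscape tables in this post.
t4 = {"users": 10000, "resp_sec": 0.35, "ubes_per_min": 67}
t3 = {"users": 5000, "resp_sec": 0.88, "ubes_per_min": 10}
ibm_power750 = {"users": 4000, "resp_sec": 0.61}  # online workload only


def ratio(a, b):
    """Return a / b rounded to one decimal place."""
    return round(a / b, 1)


# Batch rate: higher is better, so divide the new result by the old one.
batch_rate_gain = ratio(t4["ubes_per_min"], t3["ubes_per_min"])
# Response time: lower is better, so divide the old result by the new one.
resp_gain_vs_t3 = ratio(t3["resp_sec"], t4["resp_sec"])
resp_gain_vs_ibm = ratio(ibm_power750["resp_sec"], t4["resp_sec"])

print(batch_rate_gain, resp_gain_vs_t3, resp_gain_vs_ibm)
```

The three values correspond to the 6.7x batch rate, 2.5x response time and 1.7x (versus IBM Power 750) claims in the highlights.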

Configuration Summary

Application Tier Configuration:

1 x SPARC T4-2 server with
2 x 2.85 GHz SPARC T4 processors
128 GB main memory
6 x 300 GB 10K RPM SAS internal HDD
Oracle Solaris 10 9/10
JD Edwards EnterpriseOne 9.0.2 with Tools 8.98.3.3

Web Tier Configuration:

1 x SPARC T4-2 server with
2 x 2.85 GHz SPARC T4 processors
256 GB main memory
2 x 300 GB SSD
4 x 300 GB 10K RPM SAS internal HDD
Oracle Solaris 10 9/10
Oracle WebLogic Server 11g Release 1

Database Tier Configuration:

1 x SPARC T4-1 server with
1 x 2.85 GHz SPARC T4 processor
128 GB main memory
6 x 300 GB 10K RPM SAS internal HDD
2 x Sun Storage F5100 Flash Array
Oracle Solaris 10 9/10
Oracle Database 11g Release 2

Benchmark Description

JD Edwards EnterpriseOne is an integrated applications suite of Enterprise Resource Planning (ERP) software. Oracle offers 70 JD Edwards EnterpriseOne application modules to support a diverse set of business operations.

Oracle's Day in the Life (DIL) kit is a suite of scripts that exercises most common transactions of JD Edwards EnterpriseOne applications, including business processes such as payroll, sales order, purchase order, work order, and manufacturing processes, such as ship confirmation. These are labeled by industry acronyms such as SCM, CRM, HCM, SRM and FMS. The kit's scripts execute transactions typical of a mid-sized manufacturing company.

  • The workload consists of online transactions and the Universal Batch Engine (UBE) workload of 42 short, 8 medium and 4 long UBEs.

  • LoadRunner runs the DIL workload, collects the users' transaction response times and reports the key metric of Combined Weighted Average Transaction Response time.

  • The UBE process workload runs from the JD Edwards EnterpriseOne application server.

    • Oracle's UBE processes come in three flavors:
      • Short UBEs (< 1 minute) perform Business Report and Summary Analysis,
      • Mid UBEs (> 1 minute) create a large report of Account, Balance, and Full Address,
      • Long UBEs (> 2 minutes) simulate Payroll, Sales Order, and night-only jobs.
    • The UBE workload generates large numbers of PDF report files and log files.
    • The UBE queues are QBATCHD, a single-threaded queue for large and medium UBEs, and QPROCESS, a queue for short UBEs that run concurrently.

Oracle's UBE performance metric is the maximum number of concurrent UBE processes, reported together with the transaction rate in UBEs/minute.
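
The UBE categories and queues described above can be sketched as a small classifier. This is an illustrative sketch only: the function names and sample runtimes are hypothetical, and treating runtimes of one to two minutes (inclusive) as medium is an assumption, since the post gives only the "< 1 minute", "> 1 minute" and "> 2 minutes" thresholds.

```python
from collections import Counter


def classify_ube(runtime_sec):
    """Classify a UBE by runtime using the thresholds quoted in the post."""
    if runtime_sec < 60:
        return "short"
    if runtime_sec <= 120:  # assumption: 1 to 2 minutes inclusive is medium
        return "medium"
    return "long"


def queue_for(category):
    """Map a UBE category to the queue named in the post."""
    return "QPROCESS" if category == "short" else "QBATCHD"


# Hypothetical runtimes (seconds) chosen to reproduce the 42/8/4 mix.
runtimes = [30] * 42 + [90] * 8 + [300] * 4

mix = Counter(classify_ube(r) for r in runtimes)
queues = Counter(queue_for(classify_ube(r)) for r in runtimes)

print(dict(mix))
print(dict(queues))
```

With this mix, the 42 short UBEs land on QPROCESS and the 12 medium and long UBEs land on QBATCHD.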

Key Points and Best Practices

One JD Edwards EnterpriseOne Application Server and two Oracle WebLogic Servers 11g R1 coupled with two Oracle Fusion Middleware 11g Web Tier HTTP Server instances on the SPARC T4-2 servers were hosted in three separate Oracle Solaris Containers to demonstrate consolidation of multiple application and web servers.

  • Interrupt fencing was configured on all Oracle Solaris Containers to channel the interrupts to processors other than the processor sets used for the JD Edwards Application server and WebLogic servers.

  • Processor 0 was left alone for clock interrupts.

  • The applications were executed in the FX scheduling class to improve performance by reducing the frequency of context switches.

  • A WebLogic vertical cluster was configured on each WebServer Container with twelve managed instances each to load balance users' requests and to provide the infrastructure that enables scaling to a high number of users with ease of deployment and high availability.

  • The database server was run in an Oracle Solaris Container hosted on the SPARC T4-2 server.

  • The database log writer was run in the real time RT class and bound to a processor set.

  • The database redo logs were configured on the raw disk partitions.

  • The private network between the SPARC T4-2 servers was configured with a 10 GbE interface.

  • The Oracle Solaris Container on the Enterprise Application server ran 42 Short UBEs, 8 Medium UBEs and 4 Long UBEs concurrently as the mixed size batch workload.

  • The mixed size UBEs ran concurrently from the application server with the 10000 online users driven by LoadRunner.

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/26/2011.

SPARC T4 Servers Set World Record on Siebel Loyalty Batch

Oracle's SPARC T4-2 and SPARC T4-4 servers running Oracle's Siebel Loyalty Batch engine delivered a world record result for batch processing.

  • The SPARC T4-2 and SPARC T4-4 servers running Siebel Loyalty Batch engine, part of Siebel Loyalty Solution, with Oracle Database 11g Release 2 running on Oracle Solaris 10 achieved 7.65M TPH on Accrual (Reward) processing using three Siebel Servers.

  • The world record result was achieved with 24M members and 50M records in the base transaction table.

  • Siebel Loyalty Application was configured with 50 Active Promotions with three Assign Points and four Update Attributes.

  • Oracle's Siebel Server scaled near linearly on SPARC T4 systems, from 2.72M TPH on a single Siebel Server to 7.65M TPH with three Siebel Servers.

  • The average CPU utilization on the database tier server was 25% and on the application tier server was 65%, leaving significant room for application growth.
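
The "near linear" scaling claim can be quantified from the two TPH figures quoted above. The helper below is a minimal sketch; `scaling_efficiency` is an illustrative name, not part of any Siebel tooling.

```python
def scaling_efficiency(base_tph, scaled_tph, servers):
    """Return (speedup, efficiency) relative to ideal linear scaling."""
    speedup = scaled_tph / base_tph
    return speedup, speedup / servers


# TPH figures from the highlights: one Siebel Server vs. three.
speedup, efficiency = scaling_efficiency(2.72e6, 7.65e6, 3)
print(f"speedup {speedup:.2f}x, efficiency {efficiency:.1%}")
```

Three servers deliver roughly 2.8x the single-server rate, or about 94% of ideal linear scaling.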

Performance Landscape

| System | Processor | TPH | Version |
|---|---|---|---|
| 3 x SPARC T4-2 (app); 1 x SPARC T4-4 (db) | SPARC T4, 2.85 GHz; SPARC T4, 3.0 GHz | 7.65M | 8.1.1.1FP |
| 2 x SPARC T3-2 (app); 1 x SPARC T3-1 (app); 1 x SPARC M5000 (db) | SPARC T3, 1.65 GHz; SPARC T3, 1.65 GHz; SPARC64 VII, 2.52 GHz | 3.9M | 8.1.1.1FP |
| Customer (app); Customer (db) | 4 x Intel E5540, 2.53 GHz; 1 x Itanium, 1.6 GHz | 1.5M | 8.1.x |

Configuration Summary

Hardware Configuration:

3 x SPARC T4-2 servers, each with
2 x SPARC T4 processors, 2.85 GHz
128 GB main memory
1 x SPARC T4-4 server with
4 x SPARC T4 processors, 3.0 GHz
256 GB main memory
1 x Sun Storage 6180 array
16 disk drives
CSM200 with 16 disk drives

Software Configuration:

Oracle Solaris 10
Siebel Server 8.1.1.1FP
Oracle Database 11g Release 2 Enterprise Edition 11.2.0.1

Benchmark Description

Siebel Loyalty enables companies to simulate and process loyalty rewards for their activities across channels and process very high volume accrual and tier assessment transactions via batch process.

The benchmark simulates a workload of Accrual Batch Transactions Processing, which imports data through Enterprise Integration Manager (EIM), evaluates eligible promotions and calculates rewards. The key performance metric is transactions per hour (TPH). Key aspects of the workload simulation include:

  • The Batch Engine evaluates all accrual promotions and applies all actions in one pass.
  • Users do not control the sequence in which promotions are applied.
  • Promotion actions (assign/redeem points) are rolled back in case of failure.

The number of active promotions and, in particular, the Assign Point action have a very significant impact on performance. The load simulated 50 active promotions with 3 Assign Points and 7 Update Attribute actions configured.

The number of members and the number of queued transactions in the backend database have a significant impact on performance. The benchmark had 24 million members and 52 million records in the base transaction table. The simplified process flow of the benchmark is:

  • calculate accruals based on promotions,
  • credit points to members,
  • initiate any other actions specified in promotions.

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/26/2011.

SPARC T4-4 Server Sets World Record on PeopleSoft Payroll (N.A.) 9.1, Outperforms IBM Mainframe, HP Itanium

Oracle's SPARC T4-4 server achieved world record performance on the Unicode version of Oracle's PeopleSoft Enterprise Payroll (N.A) 9.1 extra-large volume model benchmark using Oracle Database 11g Release 2 running on Oracle Solaris 10.

  • The SPARC T4-4 server was able to process 1,460,544 payments/hour using PeopleSoft Payroll N.A 9.1.

  • The SPARC T4-4 server UNICODE result of 30.84 minutes on Payroll 9.1 is 2.8x faster than IBM z10 EC 2097 Payroll 9.0 (UNICODE version) result of 87.4 minutes. The IBM mainframe is rated at 6,512 MIPS.

  • The SPARC T4-4 server UNICODE result of 30.84 minutes on Payroll 9.1 is 3.1x faster than HP rx7640 Itanium2 non-UNICODE result of 96.17 minutes, on Payroll 9.0.

  • The average CPU utilization on the SPARC T4-4 server was only 30%, leaving significant room for business growth.

  • The SPARC T4-4 server processed payroll for 500,000 employees, 750,000 payments, in 30.84 minutes compared to the earlier world record result of 46.76 minutes on Oracle's SPARC Enterprise M5000 server.

  • The SPARC Enterprise M5000 server configured with eight 2.66 GHz SPARC64 VII processors has a result of 46.76 minutes on Payroll 9.1. That is 7% better than the result of 50.11 minutes on the SPARC Enterprise M5000 server configured with eight 2.53 GHz SPARC64 VII processors on Payroll 9.0. The difference in clock speed between the two processors is ~5%. That is close to the difference in the two results, thereby showing that the impact of the Payroll 9.1 benchmark on the overall result is about the same as that of Payroll 9.0.
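
The speedups and the clock-speed argument in the bullets above reduce to a few divisions. The sketch below only rechecks that arithmetic; the variable names are illustrative and the elapsed times are copied from the bullets.

```python
# Elapsed times in minutes, copied from the bullets above.
t4_4 = 30.84
ibm_z10_unicode = 87.4
hp_rx7640 = 96.17
m5000_266ghz = 46.76  # Payroll 9.1, 2.66 GHz SPARC64 VII
m5000_253ghz = 50.11  # Payroll 9.0, 2.53 GHz SPARC64 VII


def speedup(slower_min, faster_min):
    """How many times faster the shorter elapsed time is."""
    return slower_min / faster_min


vs_ibm = speedup(ibm_z10_unicode, t4_4)
vs_hp = speedup(hp_rx7640, t4_4)

# M5000 comparison: improvement in elapsed time vs. increase in clock speed.
m5000_gain = speedup(m5000_253ghz, m5000_266ghz) - 1
clock_gain = 2.66 / 2.53 - 1

print(f"vs IBM z10: {vs_ibm:.1f}x, vs HP rx7640: {vs_hp:.1f}x")
print(f"M5000 gain: {m5000_gain:.1%}, clock gain: {clock_gain:.1%}")
```

The M5000 improvement (about 7%) is only slightly larger than the clock-speed increase (about 5%), which is the basis for treating the Payroll 9.1 and 9.0 results as roughly comparable.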

Performance Landscape

PeopleSoft Payroll (N.A.) 9.1 – 500K Employees (7 Million SQL PayCalc, Unicode)

| System | OS/Database | Payroll Processing Result (minutes) | Run 1 (minutes) | Num of Streams |
|---|---|---|---|---|
| SPARC T4-4, 4 x 3.0 GHz SPARC T4 | Solaris/Oracle 11g | 30.84 | 43.76 | 96 |
| SPARC M5000, 8 x 2.66 GHz SPARC64 VII+ | Solaris/Oracle 11g | 46.76 | 66.28 | 32 |

PeopleSoft Payroll (N.A.) 9.0 – 500K Employees (3 Million SQL PayCalc, Non-Unicode)

Times are in minutes.

| System | OS/Database | Payroll Processing Result | Run 1 | Run 2 | Run 3 | Num of Streams |
|---|---|---|---|---|---|---|
| Sun M5000, 8 x 2.53 GHz SPARC64 VII | Solaris/Oracle 11g | 50.11 | 73.88 | 534.20 | 1267.06 | 32 |
| IBM z10 EC 2097, 9 x 4.4 GHz Gen1 | z/OS/DB2 | 58.96 | 80.5 | 250.68 | 462.6 | 8 |
| IBM z10 EC 2097, 9 x 4.4 GHz Gen1 | z/OS/DB2 | 87.4 ** | 107.6 | - | - | 8 |
| HP rx7640, 8 x 1.6 GHz Itanium2 | HP-UX/Oracle 11g | 96.17 | 133.63 | 712.72 | 1665.01 | 32 |

** This result was run with Unicode. The IBM z10 EC 2097 UNICODE result of 87.4 minutes is 48% slower than IBM z10 EC 2097 non-UNICODE result of 58.96 minutes, both on Payroll 9.0, each configured with nine 4.4GHz Gen1 processors.

Payroll 9.1 Compared to Payroll 9.0

Please note that Payroll 9.1 is Unicode based, whereas Payroll 9.0 had non-Unicode and Unicode versions of the workload. The PayCalc batch process executes one SQL statement 7 million times in Payroll 9.1 and 3 million times in Payroll 9.0. This is reflected in the elapsed time (27.33 min for 9.1 and 23.78 min for 9.0). The elapsed times of all other batch processes are lower (better) on 9.1.

Configuration Summary

Hardware Configuration:

SPARC T4-4 server
4 x 3.0 GHz SPARC T4 processors
256 GB memory
Sun Storage F5100 Flash Array
80 x 24 GB FMODs

Software Configuration:

Oracle Solaris 10 8/11
PeopleSoft HRMS and Campus Solutions 9.10.303
PeopleSoft Enterprise (PeopleTools) 8.51.035
Oracle Database 11g Release 2 11.2.0.1 (64-bit)
Micro Focus COBOLServer Express 5.1 (64-bit)

Benchmark Description

The PeopleSoft 9.1 Payroll (North America) benchmark is a performance benchmark established by PeopleSoft to demonstrate system performance for a range of processing volumes in a specific configuration. This information may be used to determine the software, hardware, and network configurations necessary to support processing volumes. This workload represents large batch runs typical of OLTP workloads during a mass update.

The benchmark measures five application business process run times for a database representing a large organization. The five processes are:

  • Paysheet Creation: Generates payroll data worksheets consisting of standard payroll information for each employee for a given pay cycle.

  • Payroll Calculation: Looks at paysheets and calculates checks for those employees.

  • Payroll Confirmation: Takes information generated by Payroll Calculation and updates the employees' balances with the calculated amounts.

  • Print Advice forms: The process takes the information generated by Payroll Calculations and Confirmation and produces an Advice for each employee to report Earnings, Taxes, Deduction, etc.

  • Create Direct Deposit File: The process takes information generated by the above processes and produces an electronic transmittal file that is used to transfer payroll funds directly into an employee's bank account.

Key Points and Best Practices

  • The SPARC T4-4 server with the Sun Storage F5100 Flash Array device had an average read throughput of up to 103 MB/sec and an average write throughput of up to 124 MB/sec while consuming 30% CPU on average.

  • The Sun Storage F5100 Flash Array device is a solid-state device that provides a read latency of only 0.5 msec. That is about 10 times faster than the normal disk latencies of 5 msec measured on this benchmark.

See Also

  • Oracle PeopleSoft Benchmark White Papers
  • PeopleSoft Enterprise Human Capital Management (Payroll)
  • PeopleSoft Enterprise Payroll 9.1 Using Oracle for Solaris (Unicode) on Oracle's SPARC T4-4 – White Paper
  • SPARC T4-4 Server
  • Oracle Solaris
  • Oracle Database 11g Release 2 Enterprise Edition
  • Sun Storage F5100 Flash Array

Disclosure Statement

Oracle's PeopleSoft Payroll 9.1 benchmark, SPARC T4-4 30.84 min,
http://www.oracle.com/us/solutions/benchmark/apps-benchmark/peoplesoft-167486.html, results 9/26/2011.

Saturday Sep 24, 2011

BestPerf Index 3 October 2011

This is an occasionally-generated index of previous entries in the BestPerf blog.

Oct 03, 2011 SPARC T4-4 Servers Set World Record on SPECjEnterprise2010, Beats IBM POWER7, Cisco x86
Oct 03, 2011 SPARC T4-4 Beats IBM POWER7 and HP Itanium on TPC-H @1000GB Benchmark
Oct 03, 2011 Sun ZFS Storage 7420 Appliance Doubles NetApp FAS3270A on SPC-1 Benchmark
Oct 03, 2011 SPARC T4-4 Produces World Record Oracle OLAP Capacity
Sep 30, 2011 SPARC T4-2 Server Beats Intel (Westmere AES-NI) on ZFS Encryption Tests
Sep 30, 2011 SPARC T4 Processor Beats Intel (Westmere AES-NI) on AES Encryption Tests
Sep 29, 2011 SPARC T4 Processor Outperforms IBM POWER7 and Intel (Westmere AES-NI) on OpenSSL AES Encryption Test
Sep 29, 2011 SPARC T4-1 Server Outperforms Intel (Westmere AES-NI) on IPsec Encryption Tests
Sep 29, 2011 SPARC T4-2 Server Beats Intel (Westmere AES-NI) on SSL Network Tests
Sep 28, 2011 SPARC T4 Servers Set World Record on Oracle E-Business Suite R12 X-Large Order to Cash
Sep 28, 2011 SPARC T4-2 Server Beats Intel (Westmere AES-NI) on Oracle Database Tablespace Encryption Queries
Sep 28, 2011 SPARC T4 Servers Set World Record on PeopleSoft HRMS 9.1
Sep 27, 2011 SPARC T4-2 Servers Set World Record on JD Edwards EnterpriseOne Day in the Life Benchmark with Batch, Outperforms IBM POWER7
Sep 27, 2011 SPARC T4 Servers Set World Record on Siebel Loyalty Batch
Sep 27, 2011 SPARC T4-4 Server Sets World Record on PeopleSoft Payroll (N.A.) 9.1, Outperforms IBM Mainframe, HP Itanium
Sep 19, 2011 Halliburton ProMAX® Seismic Processing on Sun Blade X6270 M2 with Sun ZFS Storage 7320
Sep 15, 2011 Sun Fire X4800 M2 Servers Produce World Record on SAP SD-Parallel Benchmark
Sep 12, 2011 SPARC Enterprise M9000 Produces World Record SAP ATO Benchmark
Aug 12, 2011 Sun Blade X6270 M2 with Oracle WebLogic World Record 2 Processor SPECjEnterprise 2010 Benchmark
Jul 01, 2011 SPARC T3-1 Record Results Running JD Edwards EnterpriseOne Day in the Life Benchmark with Added Batch Component
Jun 10, 2011 SPARC Enterprise M5000 Delivers First PeopleSoft Payroll 9.1 Benchmark
Jun 03, 2011 SPARC Enterprise M8000 with Oracle 11g Beats IBM POWER7 on TPC-H @1000GB Benchmark
Mar 25, 2011 SPARC Enterprise M9000 with Oracle Database 11g Delivers World Record Single Server TPC-H @3000GB Result
Mar 23, 2011 SPARC T3-1B Doubles Performance on Oracle Fusion Middleware WebLogic Avitek Medical Records Sample Application
Mar 23, 2011 Netra SPARC T3-1 22% Faster Than IBM Running Oracle Communications ASAP
Feb 17, 2011 SPARC T3-1 takes JD Edwards "Day In the Life" benchmark lead, beats IBM Power7 by 25%
Dec 08, 2010 Sun Blade X6275 M2 Cluster with Sun Storage 7410 Performance Running Seismic Processing Reverse Time Migration
Dec 08, 2010 Sun Blade X6275 M2 Delivers Best Fluent (MCAE Application) Performance on Tested Configurations
Dec 08, 2010 Sun Blade X6275 M2 Server Module with Intel X5670 Processors SPEC CPU2006 Results
Dec 02, 2010 World Record TPC-C Result on Oracle's SPARC Supercluster with T3-4 Servers
Dec 02, 2010 World Record SPECweb2005 Result on SPARC T3-2 with Oracle iPlanet Web Server
Dec 02, 2010 World Record Performance on PeopleSoft Enterprise Financials Benchmark run on Sun SPARC Enterprise M4000 and M5000
Oct 26, 2010 3D VTI Reverse Time Migration Scalability On Sun Fire X2270-M2 Cluster with Sun Storage 7210
Oct 11, 2010 Sun SPARC Enterprise M9000 Server Delivers World Record Non-Clustered TPC-H @3000GB Performance
Sep 30, 2010 Consolidation of 30 x86 Servers onto One SPARC T3-2
Sep 29, 2010 SPARC T3-1 Delivers Record Number of Online Users on JD Edwards EnterpriseOne 9.0.1 Day in the Life Test
Sep 28, 2010 SPARC T3 Cryptography Performance Over 1.9x Increase in Throughput over UltraSPARC T2 Plus
Sep 28, 2010 SPARC T3-2 Delivers First Oracle E-Business X-Large Benchmark Self-Service (OLTP) Result
Sep 27, 2010 Sun Fire X2270 M2 Super-Linear Scaling of Hadoop Terasort and CloudBurst Benchmarks
Sep 27, 2010 SPARC T3-1 Shows Capabilities Running Online Auction Benchmark with Oracle Fusion Middleware
Sep 24, 2010 SPARC T3-2 sets World Record on SPECjvm2008 Benchmark
Sep 24, 2010 SPARC T3 Provides High Performance Security for Oracle Weblogic Applications
Sep 23, 2010 Sun Storage F5100 Flash Array with PCI-Express SAS-2 HBAs Achieves Over 17 GB/sec Read
Sep 23, 2010 SPARC T3-1 Performance on PeopleSoft Enterprise Financials 9.0 Benchmark
Sep 22, 2010 Oracle Solaris 10 9/10 ZFS OLTP Performance Improvements
Sep 22, 2010 SPARC T3-1 Supports 13,000 Users on Financial Services and Enterprise Application Integration Running Siebel CRM 8.1.1
Sep 21, 2010 ProMAX Performance and Throughput on Sun Fire X2270 and Sun Storage 7410
Sep 21, 2010 Sun Flash Accelerator F20 PCIe Cards Outperform IBM on SPC-1C
Sep 21, 2010 SPARC T3 Servers Deliver Top Performance on Oracle Communications Order and Service Management
Sep 20, 2010 Schlumberger's ECLIPSE 300 Performance Throughput On Sun Fire X2270 Cluster with Sun Storage 7410
Sep 20, 2010 Sun Fire X4470 4 Node Cluster Delivers World Record SAP SD-Parallel Benchmark Result
Sep 20, 2010 SPARC T3-4 Sets World Record Single Server Result on SPECjEnterprise2010 Benchmark
Aug 25, 2010 Transparent Failover with Solaris MPxIO and Oracle ASM
Aug 23, 2010 Repriced: SPC-1 Sun Storage 6180 Array (8Gb) 1.9x Better Than IBM DS5020 in Price-Performance
Aug 23, 2010 Repriced: SPC-2 (RAID 5 & 6 Results) Sun Storage 6180 Array (8Gb) Outperforms IBM DS5020 by up to 64% in Price-Performance
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ANSYS FLUENT Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Demonstrates Outstanding Single Node Performance on MSC.Nastran Benchmarks
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ABAQUS Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Sets World Record on SPEC OMP2001 Benchmark
Jun 29, 2010 Sun Fire X4170 M2 Sets World Record on SPEC CPU2006 Benchmark
Jun 29, 2010 Sun Blade X6270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4470 Sets World Records on SPEC OMP2001 Benchmarks
Jun 28, 2010 Sun Fire X4470 Sets World Record on SPEC CPU2006 Rate Benchmark
Jun 28, 2010 Sun Fire X4470 2-Node Configuration Sets World Record for SAP SD-Parallel Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Records on SPEC CPU2006 Rate Benchmarks
Jun 10, 2010 Hyperion Essbase ASO World Record on Sun SPARC Enterprise M5000
Jun 09, 2010 PeopleSoft Payroll 500K Employees on Sun SPARC Enterprise M5000 World Record
Jun 03, 2010 Sun SPARC Enterprise T5440 World Record SPECjAppServer2004
May 11, 2010 Per-core Performance Myth Busting
Apr 14, 2010 Oracle Sun Storage F5100 Flash Array Delivers World Record SPC-1C Performance
Apr 13, 2010 Oracle Sun Flash Accelerator F20 PCIe Card Accelerates Web Caching Performance
Apr 06, 2010 WRF Benchmark: X6275 Beats Power6
Mar 29, 2010 Sun Blade X6275/QDR IB/ Reverse Time Migration
Feb 23, 2010 IBM POWER7 SPECfp_rate2006: Poor Scaling? Or Configuration Confusion?
Jan 25, 2010 Sun/Solaris Leadership in SAP SD Benchmarks and HP claims
Jan 21, 2010 SPARC Enterprise M4000 PeopleSoft NA Payroll 240K Employees Performance (16 Streams)
Dec 16, 2009 Sun Fire X4640 Delivers World Record x86 Result on SPEC OMPL2001
Nov 24, 2009 Sun M9000 Fastest SAP 2-tier SD Benchmark on current SAP EP4 for SAP ERP 6.0 (Unicode)
Nov 20, 2009 Sun Blade X6275 cluster delivers leading results for Fluent truck_111m benchmark
Nov 20, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Nov 19, 2009 SPECmail2009: New World record on T5240 1.6GHz Sun 7310 and ZFS
Nov 18, 2009 Sun Flash Accelerator F20 PCIe Card Achieves 100K 4K IOPS and 1.1 GB/sec
Nov 05, 2009 New TPC-C World Record Sun/Oracle
Nov 02, 2009 Sun Blade X6275 Cluster Beats SGI Running Fluent Benchmarks
Nov 02, 2009 Sun Ultra 27 Delivers Leading Single Frame Buffer SPECviewperf 10 Results
Oct 28, 2009 SPC-2 Sun Storage 6780 Array RAID 5 & RAID 6 51% better $/performance than IBM DS5300
Oct 25, 2009 Sun C48 & Lustre fast for Seismic Reverse Time Migration using Sun X6275
Oct 25, 2009 Sun F5100 and Seismic Reverse Time Migration with faster Optimal Checkpointing
Oct 23, 2009 Wiki on performance best practices
Oct 20, 2009 Exadata V2 Information
Oct 15, 2009 Oracle Flash Cache - SGA Caching on Sun Storage F5100
Oct 13, 2009 Oracle Hyperion Sun M5000 and Sun Storage 7410
Oct 13, 2009 Sun T5440 Oracle BI EE Sun SPARC Enterprise T5440 World Record
Oct 13, 2009 SPECweb2005 on Sun SPARC Enterprise T5440 World Record using Solaris Containers and Sun Storage F5100 Flash
Oct 13, 2009 Oracle PeopleSoft Payroll (NA) Sun SPARC Enterprise M4000 and Sun Storage F5100 World Record Performance
Oct 13, 2009 SAP 2-tier SD Benchmark on Sun SPARC Enterprise M9000/32 SPARC64 VII
Oct 13, 2009 CP2K Life Sciences, Ab-initio Dynamics - Sun Blade 6048 Chassis with Sun Blade X6275 - Scalability and Throughput with Quad Data Rate InfiniBand
Oct 13, 2009 SAP 2-tier SD-Parallel on Sun Blade X6270 1-node, 2-node and 4-node
Oct 13, 2009 Halliburton ProMAX Oil & Gas Application Fast on Sun 6048/X6275 Cluster
Oct 13, 2009 SPECcpu2006 Results On MSeries Servers With Updated SPARC64 VII Processors
Oct 12, 2009 MCAE ABAQUS faster on Sun F5100 and Sun X4270 - Single Node World Record
Oct 12, 2009 MCAE ANSYS faster on Sun F5100 and Sun X4270
Oct 12, 2009 MCAE MCS/NASTRAN faster on Sun F5100 and Fire X4270
Oct 12, 2009 SPC-2 Sun Storage 6180 Array RAID 5 & RAID 6 Over 70% Better Price Performance than IBM
Oct 12, 2009 SPC-1 Sun Storage 6180 Array Over 70% Better Price Performance than IBM
Oct 12, 2009 Why Sun Storage F5100 is a good option for Peoplesoft NA Payroll Application
Oct 12, 2009 1.6 Million 4K IOPS in 1RU on Sun Storage F5100 Flash Array
Oct 11, 2009 TPC-C World Record Sun - Oracle
Oct 09, 2009 X6275 Cluster Demonstrates Performance and Scalability on WRF 2.5km CONUS Dataset
Oct 02, 2009 Sun X4270 VMware VMmark benchmark achieves excellent result
Sep 22, 2009 Sun X4270 Virtualized for Two-tier SAP ERP 6.0 Enhancement Pack 4 (Unicode) Standard Sales and Distribution (SD) Benchmark
Sep 01, 2009 String Searching - Sun T5240 & T5440 Outperform IBM Cell Broadband Engine
Aug 28, 2009 Sun X4270 World Record SAP-SD 2-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Aug 27, 2009 Sun SPARC Enterprise T5240 with 1.6GHz UltraSPARC T2 Plus Beats 4-Chip IBM Power 570 POWER6 System on SPECjbb2005
Aug 26, 2009 Sun SPARC Enterprise T5220 with 1.6GHz UltraSPARC T2 Sets Single Chip World Record on SPECjbb2005
Aug 12, 2009 SPECmail2009 on Sun SPARC Enterprise T5240 and Sun Java System Messaging Server 6.3
Jul 23, 2009 World Record Performance of Sun CMT Servers
Jul 22, 2009 Why does 1.6 beat 4.7?
Jul 21, 2009 Zeus ZXTM Traffic Manager World Record on Sun T5240
Jul 21, 2009 Sun T5440 Oracle BI EE World Record Performance
Jul 21, 2009 Sun T5440 World Record SAP-SD 4-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Jul 21, 2009 1.6 GHz SPEC CPU2006 - Rate Benchmarks
Jul 21, 2009 Sun Blade T6320 World Record SPECjbb2005 performance
Jul 21, 2009 New SPECjAppServer2004 Performance on the Sun SPARC Enterprise T5440
Jul 20, 2009 Sun T5440 SPECjbb2005 Beats IBM POWER6 Chip-to-Chip
Jul 20, 2009 New CMT results coming soon....
Jul 14, 2009 Vdbench: Sun StorageTek Vdbench, a storage I/O workload generator.
Jul 14, 2009 Storage performance and workload analysis using Swat.
Jul 10, 2009 World Record TPC-H@300GB Price-Performance for Windows on Sun Fire X4600 M2
Jul 06, 2009 Sun Blade 6048 Chassis with Sun Blade X6275: RADIOSS Benchmark Results
Jul 03, 2009 SPECmail2009 on Sun Fire X4275+Sun Storage 7110: Mail Server System Solution
Jun 30, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Jun 26, 2009 Sun Fire X2270 Cluster Fluent Benchmark Results
Jun 25, 2009 Sun SSD Server Platform Bandwidth and IOPS (Speeds & Feeds)
Jun 24, 2009 I/O analysis using DTrace
Jun 23, 2009 New CPU2006 Records: 3x better integer throughput, 9x better fp throughput
Jun 23, 2009 Sun Blade X6275 results capture Top Places in CPU2006 SPEED Metrics
Jun 19, 2009 Pointers to Java Performance Tuning resources
Jun 19, 2009 SSDs in HPC: Reducing the I/O Bottleneck BluePrint Best Practices
Jun 17, 2009 The Performance Technology group wiki is alive!
Jun 17, 2009 Performance of Sun 7410 and 7310 Unified Storage Array Line
Jun 16, 2009 Sun Fire X2270 MSC/Nastran Vendor_2008 Benchmarks
Jun 15, 2009 Sun Fire X4600 M2 Server Two-tier SAP ERP 6.0 (Unicode) Standard Sales and Distribution (SD) Benchmark
Jun 12, 2009 Correctly comparing SAP-SD Benchmark results
Jun 12, 2009 OpenSolaris Beats Linux on memcached Sun Fire X2270
Jun 11, 2009 SAS Grid Computing 9.2 utilizing the Sun Storage 7410 Unified Storage System
Jun 10, 2009 Using Solaris Resource Management Utilities to Improve Application Performance
Jun 09, 2009 Free Compiler Wins Nehalem Race by 2x
Jun 08, 2009 Variety of benchmark results to be posted on BestPerf
Jun 05, 2009 Interpreting Sun's SPECpower_ssj2008 Publications
Jun 03, 2009 Wide Variety of Topics to be discussed on BestPerf
Jun 03, 2009 Welcome to BestPerf group blog!

Monday Sep 19, 2011

Halliburton ProMAX® Seismic Processing on Sun Blade X6270 M2 with Sun ZFS Storage 7320

Halliburton/Landmark's ProMAX® 3D Pre-Stack Kirchhoff Time Migration's (PSTM) single workflow scalability and multiple workflow throughput using various scheduling methods are evaluated on a cluster of Oracle's Sun Blade X6270 M2 server modules attached to Oracle's Sun ZFS Storage 7320 appliance.

Two resource scheduling methods, compact and distributed, are compared while increasing the system load with additional concurrent ProMAX® workflows.

  • Multiple concurrent 24-process ProMAX® PSTM workflow throughput is constant; 10 workflows on 10 nodes finish as fast as 1 workflow on one compute node. Additionally, processing twice the data volume yields similar traces/second throughput performance.

  • A single ProMAX® PSTM workflow has good scaling from 1 to 10 nodes of a Sun Blade X6270 M2 cluster scaling 4.5X. ProMAX® scales to 4.7X on 10 nodes with one input data set and 6.3X with two consecutive input data sets (i.e. twice the data).

  • A single ProMAX® PSTM workflow has near linear scaling of 11x on a Sun Blade X6270 M2 server module when running from 1 to 12 processes.

  • The 12-thread ProMAX® workflow throughput using the distributed scheduling method is equivalent to or slightly faster than the compact scheme for 1 to 6 concurrent workflows.
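
The post does not define the compact and distributed scheduling methods. A common reading, assumed here, is that compact placement fills one node before using the next, while distributed placement spreads processes round-robin across nodes. The sketch below illustrates only that assumed difference; `place_processes` and the slot counts are hypothetical and not taken from ProMAX® or any scheduler.

```python
def place_processes(num_procs, num_nodes, slots_per_node, method):
    """Return the number of processes assigned to each node.

    'compact' fills one node before using the next; 'distributed' deals
    processes out round-robin. Both definitions are assumptions.
    """
    if num_procs > num_nodes * slots_per_node:
        raise ValueError("not enough slots for the requested processes")
    counts = [0] * num_nodes
    if method == "compact":
        remaining = num_procs
        for node in range(num_nodes):
            counts[node] = min(slots_per_node, remaining)
            remaining -= counts[node]
    elif method == "distributed":
        for proc in range(num_procs):
            counts[proc % num_nodes] += 1
    else:
        raise ValueError(f"unknown method: {method}")
    return counts


# Two hypothetical 12-process workflows (24 processes) on 4 nodes,
# assuming 24 schedulable slots per node.
compact = place_processes(24, 4, 24, "compact")
distributed = place_processes(24, 4, 24, "distributed")
print(compact, distributed)
```

Under these assumptions, compact placement puts all 24 processes on the first node, while distributed placement runs 6 on each of the 4 nodes.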

Performance Landscape

Multiple 24-Process Workflow Throughput Scaling

This test measures the system throughput scalability as concurrent 24-process workflows are added, one workflow per node. The per workflow throughput and the system scalability are reported.

Aggregate system throughput scales linearly. Ten concurrent workflows finish in the same time as does one workflow on a single compute node.

Halliburton ProMAX® Pre-Stack Time Migration - Multiple Workflow Scaling


Single Workflow Scaling

This test measures single workflow scalability across a 10-node cluster. Utilizing a single data set, performance exhibits near linear scaling of 11x at 12 processes, and per-node scaling of 4x at 6 nodes; performance then flattens quickly, reaching a peak of 60x at 240 processes and per-node scaling of 4.7x with 10 nodes.

Running with two consecutive input data sets in the workflow, scaling is considerably improved with peak scaling ~35% higher than obtained using a single data set. Doubling the data set size minimizes time spent in workflow initialization, data input and output.

Halliburton ProMAX® Pre-Stack Time Migration - Single Workflow Scaling

This next test measures single workflow scalability across a 10-node cluster (as above) but limits scheduling to a maximum of 12 processes per node, effectively restricting each node to at most one process per physical core. The speedups relative to a single process and to a single node are reported.

Utilizing a single data set, performance exhibits near linear scaling of 37x at 48 processes, and per-node scaling of 4.3x at 6 nodes. Performance of 55x at 120 processes and per-node scaling of 5x with 10 nodes is reached, and scalability is trending higher more strongly than in the case of two processes running per physical core above. For equivalent total process counts, multi-node runs using only a single process per physical core appear to run between 28% and 64% more efficiently (96 and 24 processes respectively). With a full complement of 10 nodes (120 processes), the peak performance is only 9.5% lower than with two processes per physical core (240 processes).

Running with two consecutive input data sets in the workflow, scaling is considerably improved with peak scaling ~35% higher than obtained using a single data set.

Halliburton ProMAX® Pre-Stack Time Migration - Single Workflow Scaling
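The speedups quoted for the two single-workflow tests can be turned into parallel efficiency (speedup divided by process count) to make the flattening easier to see. The following is a minimal sketch using only the figures stated in the text; the function and variable names are illustrative and not part of ProMAX®.

```python
# Speedups relative to a single process, as quoted in the text.
# Keys are total process counts, values are the reported speedups.
reported_speedup = {
    "up to 24 processes per node": {12: 11.0, 240: 60.0},
    "at most 12 processes per node": {48: 37.0, 120: 55.0},
}


def parallel_efficiency(speedup: float, processes: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processes


for label, points in reported_speedup.items():
    for processes, speedup in sorted(points.items()):
        eff = parallel_efficiency(speedup, processes)
        print(f"{label}: {processes:>3} processes, "
              f"{speedup:.0f}x speedup, {eff:.0%} efficiency")
```

Efficiency computed this way is only a summary of the quoted points; it says nothing about intermediate process counts that the text does not report.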

Multiple 12-Process Workflow Throughput Scaling, Compact vs. Distributed Scheduling

The fourth test compares compact and distributed scheduling of 1, 2, 4, and 6 concurrent 12-process workflows.

All things being equal, the system bi-section bandwidth should improve with distributed scheduling of a fixed-size workflow; as more nodes are used for a workflow, more memory and system cache is employed and any node memory bandwidth bottlenecks can be offset by distributing communication across the network (provided the network and inter-node communication stack do not become a bottleneck). When physical cores are not over-subscribed, compact and distributed scheduling performance is within 3% suggesting that there may be little memory contention for this workflow on the benchmarked system configuration.

With compact scheduling of two concurrent 12-process workflows, the physical cores become over-subscribed and performance degrades 36% per workflow. With four concurrent workflows, physical cores are over-subscribed 4x and performance is seen to degrade 66% per workflow. With six concurrent workflows under over-subscribed compact scheduling, performance degrades 77% per workflow. As multiple 12-process workflows become more and more distributed, the performance approaches that of the non-over-subscribed case.
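One way to read the compact-scheduling figures is to convert per-workflow degradation into aggregate throughput on the over-subscribed node. The sketch below assumes that "degrades X% per workflow" means each workflow runs at (1 - X) of the throughput of a single, non-over-subscribed workflow; that interpretation is an assumption, not something the benchmark report states.

```python
# Per-workflow degradation quoted in the text for compact scheduling,
# keyed by the number of concurrent 12-process workflows.
# The entry for one workflow is the baseline (no degradation).
degradation = {1: 0.0, 2: 0.36, 4: 0.66, 6: 0.77}


def aggregate_throughput(workflows: int) -> float:
    """Combined throughput, in units of one non-over-subscribed workflow.

    Assumes each workflow runs at (1 - degradation) of baseline throughput.
    """
    return workflows * (1.0 - degradation[workflows])


for n in sorted(degradation):
    print(f"{n} workflow(s): {aggregate_throughput(n):.2f}x baseline throughput")
```

Under that assumption, aggregate throughput rises only slightly as more workflows are packed onto the same cores, which is consistent with the text's point that distributing the workflows recovers the non-over-subscribed performance.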

Halliburton ProMAX® Pre-Stack Time Migration - Multiple Workflow Scaling

141616 traces x 624 samples


Test Notes

All tests were performed with one input data set (70808 traces x 624 samples) and two consecutive input data sets (2 * (70808 traces x 624 samples)) in the workflow. All results reported are the average of at least 3 runs and performance is based on reported total wall-clock time by the application.

All tests were run with NFS attached Sun ZFS Storage 7320 appliance and then with NFS attached legacy Sun Fire X4500 server. The StorageTek Workload Analysis Tool (SWAT) was invoked to measure the I/O characteristics of the NFS attached storage used on separate runs of all workflows.

Configuration Summary

Hardware Configuration:

10 x Sun Blade X6270 M2 server modules, each with
2 x 3.33 GHz Intel Xeon X5680 processors
48 GB DDR3-1333 memory
4 x 146 GB, Internal 10000 RPM SAS-2 HDD
10 GbE
Hyper-Threading enabled

Sun ZFS Storage 7320 Appliance
1 x Storage Controller
2 x 2.4 GHz Intel Xeon 5620 processors
48 GB memory (12 x 4 GB DDR3-1333)
2 TB Read Cache (4 x 512 GB Read Flash Accelerator)
10 GbE
1 x Disk Shelf
20.0 TB RAID-Z (20 x 1 TB SAS-2, 7200 RPM HDD)
4 x Write Flash Accelerators

Sun Fire X4500
2 x 2.8 GHz AMD 290 processors
16 GB DDR1-400 memory
34.5 TB RAID-Z (46 x 750 GB SATA-II, 7200 RPM HDD)
10 GbE

Software Configuration:

Oracle Linux 5.5
Parallel Virtual Machine 3.3.11 (bundled with ProMAX)
Intel 11.1.038 Compilers
Libraries: pthreads 2.4, Java 1.6.0_01, BLAS, Stanford Exploration Project Libraries

Benchmark Description

The ProMAX® family of seismic data processing tools is the most widely used Oil and Gas Industry seismic processing application. ProMAX® is used for multiple applications, from field processing and quality control, to interpretive project-oriented reprocessing at oil companies and production processing at service companies. ProMAX® is integrated with Halliburton's OpenWorks® Geoscience Oracle Database to index prestack seismic data and populate the database with processed seismic.

This benchmark evaluates single workflow scalability and multiple workflow throughput of the ProMAX® 3D Prestack Kirchhoff Time Migration (PSTM) while processing the Halliburton benchmark data set containing 70,808 traces with 8 msec sample interval and trace length of 4992 msec. Benchmarks were performed with both one and two consecutive input data sets.
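The data set dimensions quoted throughout (70808 traces x 624 samples, and 141616 traces for the doubled case) follow directly from the trace length and sample interval. A small sketch of that arithmetic, with illustrative constant names:

```python
TRACE_LENGTH_MS = 4992       # trace length stated in the benchmark description
SAMPLE_INTERVAL_MS = 8       # sample interval stated in the benchmark description
TRACES_PER_DATA_SET = 70_808

# Number of samples in each trace.
samples_per_trace = TRACE_LENGTH_MS // SAMPLE_INTERVAL_MS


def total_traces(data_sets: int) -> int:
    """Traces read by a workflow with the given number of input data sets."""
    return data_sets * TRACES_PER_DATA_SET


def total_samples(data_sets: int) -> int:
    """Samples read by a workflow with the given number of input data sets."""
    return total_traces(data_sets) * samples_per_trace


print(samples_per_trace)   # samples per trace
print(total_traces(2))     # traces in the two-data-set workflow
print(total_samples(1))    # samples in one data set
```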

Each workflow consisted of:

  • reading the previously constructed MPEG encoded processing parameter file
  • reading the compressed seismic data traces from disk
  • performing the PSTM imaging
  • writing the result to disk

Workflows using two input data sets were constructed by simply adding a second identical seismic data read task immediately after the first in the processing parameter file. This effectively doubled the data volume read, processed, and written.

This version of ProMAX® currently uses only Parallel Virtual Machine (PVM) as the parallel processing paradigm. The PVM software uses only TCP networking and has no internal facility for assigning memory affinity and processor binding. Every compute node runs a PVM daemon.

The ProMAX® processing parameters used for this benchmark:

Minimum output inline = 65
Maximum output inline = 85
Inline output sampling interval = 1
Minimum output xline = 1
Maximum output xline = 200 (fold)
Xline output sampling interval = 1
Antialias inline spacing = 15
Antialias xline spacing = 15
Stretch Mute Aperature Limit with Maximum Stretch = 15
Image Gather Type = Full Offset Image Traces
No Block Moveout
Number of Alias Bands = 10
3D Amplitude Phase Correction
No compression
Maximum Number of Cache Blocks = 500000

Primary PSTM business metrics are typically time-to-solution and accuracy of the subsurface imaging solution.

Key Points and Best Practices

  • Multiple job system throughput scales perfectly; ten concurrent workflows on 10 nodes each complete in the same time and have the same throughput as a single workflow running on one node.
  • Best single workflow scaling is 6.6x using 10 nodes.

    When tasked with processing several similar workflows, the most efficient approach is to distribute them fully, one workflow per node (or even across two nodes), and run them concurrently rather than to use all nodes for each workflow and run the workflows consecutively, even though individual time-to-solution will be longer. For example, while the best-case configuration used here runs 6.6 times faster using all ten nodes than on a single node, ten such 10-node jobs running consecutively will take over 50% longer overall to complete than ten jobs running concurrently, one per node.

  • Throughput was seen to scale better with larger workflows. While throughput with large and small workflows is similar with only one node, the larger dataset exhibits 11% and 35% more throughput with four and 10 nodes respectively.

  • 200 processes appears to be a scalability asymptote with these workflows on the systems used.
  • Hyper-Threading marginally helps throughput. For the largest model run on 10 nodes, 240 processes deliver 11% more performance than 120 processes.

  • The workflows do not exhibit significant I/O bandwidth demands. Even with 10 concurrent 24-process jobs, the measured aggregate system I/O did not exceed 100 MB/s.

  • 10 GbE was the only network used and, though shared for all interprocess communication and network attached storage, it appears to have sufficient bandwidth for all test cases run.
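The "over 50% longer" figure for running ten 10-node jobs consecutively follows from the 6.6x best-case scaling and the observation that ten concurrent one-node workflows finish in the same time as one. A minimal sketch of that arithmetic, with times normalised to one workflow on one node:

```python
JOBS = 10
BEST_SPEEDUP_ON_ALL_NODES = 6.6   # best single-workflow scaling on 10 nodes

# Normalised wall-clock time of one workflow on one node.
single_node_time = 1.0

# Option A: one workflow per node, all ten running at once.
# Throughput scaling is reported as perfect, so all finish in one unit of time.
concurrent_total = single_node_time

# Option B: each workflow uses all ten nodes, and the ten run back to back.
consecutive_total = JOBS * single_node_time / BEST_SPEEDUP_ON_ALL_NODES

extra_pct = (consecutive_total / concurrent_total - 1.0) * 100.0
print(f"Consecutive 10-node jobs take {extra_pct:.1f}% longer overall")
```

The trade-off is that each individual workflow finishes later under Option A, which is why the text notes that individual time-to-solution will be longer.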

See Also

Disclosure Statement

The following are trademarks or registered trademarks of Halliburton/Landmark Graphics: ProMAX®, GeoProbe®, OpenWorks®. Results as of 9/1/2011.

Thursday Sep 15, 2011

Sun Fire X4800 M2 Servers (now known as Sun Server X2-8) Produce World Record on SAP SD-Parallel Benchmark

Oracle delivered an SAP enhancement package 4 for SAP ERP 6.0 (Unicode) Sales and Distribution - Parallel (SD Parallel) Benchmark world record result using eight of Oracle's Sun Fire X4800 M2 servers (now known as Sun Server X2-8), Oracle Solaris 10 and Oracle Database 11g Real Application Clusters (RAC) software that achieved 180,000 users as of 10/03/2011.

  • The eight Sun Fire X4800 M2 servers delivered a world record result of 180,000 users on the SAP SD Parallel Benchmark.

  • The eight Sun Fire X4800 M2 server SD Parallel result of 180,000 users delivered 43% more performance compared to the IBM Power 795 server SD two-tier result of 126,063 users.

Performance Landscape

Selected SAP Sales and Distribution (SD) benchmark results are presented in decreasing order of performance. All benchmarks were using SAP enhancement package 4 for SAP ERP 6.0 (Unicode).

| System | OS / Database | Users | SAPS | Type | Cert # |
|---|---|---|---|---|---|
| Eight Sun Fire X4800 M2, 8 x Intel Xeon E7-8870 @2.4 GHz, 512 GB | Oracle Solaris 10, Oracle 11g RAC | 180,000 | 1,016,380 | Parallel | 2011037 |
| Six Sun Fire X4800 M2, 8 x Intel Xeon E7-8870 @2.4 GHz, 512 GB | Oracle Solaris 10, Oracle 11g RAC | 137,904 | 765,470 | Parallel | 2011038 |
| IBM Power 795, 32 x POWER7 @4.0 GHz, 4096 GB | AIX 7.1, DB2 9.7 | 126,063 | 688,630 | Two-Tier | 2010046 |
| Four Sun Fire X4800 M2, 8 x Intel Xeon E7-8870 @2.4 GHz, 512 GB | Oracle Solaris 10, Oracle 11g RAC | 94,736 | 546,050 | Parallel | 2011039 |
| Two Sun Fire X4800 M2, 8 x Intel Xeon E7-8870 @2.4 GHz, 512 GB | Oracle Solaris 10, Oracle 11g RAC | 49,860 | 274,080 | Parallel | 2011040 |
| Four Sun Fire X4470, 4 x Intel Xeon X7560 @2.26 GHz, 256 GB | Solaris 10, Oracle 11g RAC | 40,000 | 221,020 | Parallel | 2010039 |

Complete benchmark results and descriptions can be found at the SAP standard applications benchmark website.
For SD benchmark results website: Two-Tier or Three-Tier. For SD Parallel benchmark results website: SD Parallel.

Configuration and Results Summary

Hardware Configuration:

8 x Sun Fire X4800 M2 servers, each with
8 x Intel Xeon E7-8870 @ 2.4 GHz (8 processors, 80 cores, 160 threads)
512 GB memory

Software Configuration:

SAP enhancement package 4 for SAP ERP 6.0
Oracle Database 11g Real Application Clusters (RAC)
Oracle Solaris 10

Results Summary:

Number of SAP SD benchmark users: 180,000
Average dialog response time: 0.63 seconds
Throughput:
    Fully processed order line items per hour: 20,327,670
    Dialog steps/hour: 60,983,000
    SAPS: 1,016,380
Average database request time (dialog/update): 0.010 sec / 0.055 sec
SAP Certification: 2011037
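The three throughput figures in the results summary are related through SAP's definition of SAPS, under which 100 SAPS corresponds to 2,000 fully processed order line items per hour, or 6,000 dialog steps per hour. The sketch below checks the reported values against that definition; the function names are illustrative, and the small differences are assumed to come from rounding in the certified figures.

```python
# SAPS definition: 100 SAPS = 2,000 order line items/hour = 6,000 dialog steps/hour.
LINE_ITEMS_PER_HOUR_PER_100_SAPS = 2_000
DIALOG_STEPS_PER_HOUR_PER_100_SAPS = 6_000


def saps_from_line_items(line_items_per_hour: float) -> float:
    return line_items_per_hour * 100 / LINE_ITEMS_PER_HOUR_PER_100_SAPS


def saps_from_dialog_steps(dialog_steps_per_hour: float) -> float:
    return dialog_steps_per_hour * 100 / DIALOG_STEPS_PER_HOUR_PER_100_SAPS


# Figures from the results summary above.
reported_saps = 1_016_380
print(saps_from_line_items(20_327_670))     # close to the reported SAPS
print(saps_from_dialog_steps(60_983_000))   # close to the reported SAPS
```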

Benchmark Description

The SAP Standard Application Sales and Distribution - Parallel (SD Parallel) Benchmark is a two-tier ERP business test that is indicative of full business workloads of complete order processing and invoice processing and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments.

The SD Parallel Benchmark consists of the same transactions and user interaction steps as the two-tier and three-tier SD Benchmark. This means that the SD Parallel Benchmark runs the same business processes as the SD Benchmark. The difference between the benchmarks is the technical data distribution. Additionally, the benchmark requires equal distribution of the benchmark users across all database nodes for the used benchmark clients (round-robin method). Following this rule, all database nodes work on data of all clients. This avoids unrealistic configurations such as having only one client per database node.

The SAP Benchmark Council agreed to give the parallel benchmark a different name so that the difference can be easily recognized by any interested parties - customers, prospects, and analysts. The naming convention is SD Parallel for Sales & Distribution - Parallel.

SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products.

See Also

Disclosure Statement

SAP enhancement package 4 for SAP ERP 6.0 (Unicode) Sales and Distribution Benchmark, results as of 10/03/2011.

SD Parallel, 8 x Sun Fire X4800 M2 (each 8 processors, 80 cores, 160 threads) 180,000 SAP SD Users, Oracle Solaris 10, Oracle 11g Real Application Clusters (RAC), Certification Number 2011037.
SD Parallel, 6 x Sun Fire X4800 M2 (each 8 processors, 80 cores, 160 threads) 137,904 SAP SD Users, Oracle Solaris 10, Oracle 11g Real Application Clusters (RAC), Certification Number 2011038.
SD Parallel, 4 x Sun Fire X4470 (each 4 processors, 32 cores, 64 threads) 40,000 SAP SD Users, Oracle Solaris 10, Oracle 11g Real Application Clusters (RAC), Certification Number 2010039.
SD Two-Tier, IBM Power 795 (32 processors, 256 cores, 1024 threads) 126,063 SAP SD Users, AIX 7.1, DB2 9.7, Certification Number 2010046.

SAP, R/3 are registered trademarks of SAP AG in Germany and other countries. More information may be found at www.sap.com/benchmark.

Monday Sep 12, 2011

SPARC Enterprise M9000 Produces World Record SAP ATO Benchmark

Oracle delivered an SAP enhancement package 4 for SAP ERP 6.0 Assemble-to-Order (ATO) benchmark world record result using Oracle's SPARC Enterprise M9000 server running Oracle Solaris 10 and Oracle Database 11g along with SAP Enhancement Package 4 for SAP ERP 6.0 (Unicode). The SAP ATO benchmark integrates process chains across SAP Business Suite components, including Financials, Logistics, Human Resources, Basis and Cross Application.

  • The SPARC Enterprise M9000 server containing 64 SPARC64 VII+ 3.0 GHz processors, running Oracle Solaris 10 and Oracle Database 11g along with SAP Enhancement Package 4 for SAP ERP 6.0 (Unicode), delivered a world record 206,360 fully processed assembly orders per hour on the SAP enhancement package 4 for SAP ERP 6.0 ATO benchmark.

  • The SPARC Enterprise M9000 server result shows it can more than consolidate the work of the three-tier HP solution which used 80 different servers.

  • Oracle produced the first SAP ATO benchmark result using Unicode encoding.

  • The SAP ATO benchmark uses multiple components of the SAP Business Suite. See more detail at the SAP ATO benchmark webpage.

Performance Landscape

SAP ATO 2-Tier Performance Table (select results in decreasing performance order)

| System | OS / Database | Assembly Orders per hour(*) | SAP ERP/ECC Release | Cert Num |
|---|---|---|---|---|
| SPARC Enterprise M9000, 64 x SPARC64 VII+ @3.0 GHz, 2048 GB | Oracle Solaris 10, Oracle 11g | 206,360 | SAP ERP6.0* (Unicode) | 2011033 |
| Fujitsu Siemens Primepower 2000, 128 x SPARC64 @560 MHz, 128 GB | Solaris 8, Oracle 8.1.7 | 34,260 | 4.6B (non-Unicode) | 2001018 |
| HP 9000 Superdome, 64 x PA-RISC 8600 @552 MHz, 128 GB | HP-UX 11.11, Oracle 8.1.6 | 18,870 | 4.6B (non-Unicode) | 2001014 |
| Fujitsu Siemens Primepower 900, 16 x SPARC64 V @1.35 GHz, 64 GB | Solaris 8, Oracle 9i | 12,170 | 4.6C (non-Unicode) | 2003012 |
| HP rx5670, 4 x Itanium II @1.0 GHz, 24 GB | HP-UX 11i, Oracle 9i | 3,090 | 4.6C (non-Unicode) | 2002069 |

(*) SAP enhancement package 4 for SAP ERP6.0 (Unicode)

SAP ATO 3-Tier Performance Table (top results in decreasing performance order)

| System | OS / Database | Assembly Orders per hour(*) | SAP ERP/ECC Release | Cert Num |
|---|---|---|---|---|
| HP 9000 Superdome Enterprise Server, 64 x PA-RISC 8700 @750 MHz, 128 GB | HP-UX 11i, Oracle 9i | 144,090 | 4.6 C (non-Unicode) | 2002003 |
| HP 9000 Superdome Enterprise Server, 64 x PA-RISC 8700 @750 MHz, 128 GB | HP-UX 11i, Oracle 9i | 130,570 | 4.6 C (non-Unicode) | 2001047 |

(*) Assembly Order: Request to assemble pre-manufactured parts and assemblies to finished products according to an existing sales order.

Complete benchmark results may be found at the SAP benchmark website: http://www.sap.com/benchmark.

Configuration Summary and Results

Hardware Configuration:

SPARC Enterprise M9000
64 x 3.0 GHz SPARC64 VII+ processors
2048 GB memory

Software Configuration:

Oracle Solaris 10
SAP enhancement package 4 for SAP ERP 6.0 (Unicode)
Oracle Database 11g

Certified Result:

Fully business processed Assembly Orders/hour: 206,360
SAP Certification Number: 2011033

Benchmark Description

The SAP ATO benchmark integrates process chains across SAP Business Suite components. The ATO scenario is characterized by high volume sales, short production times (from hours to one day), and individual assembly for such products as PCs, pumps, and cars. In general, each benchmark user has its own master data, such as material, vendor, or customer master data to avoid data locking situations. However, the ATO Benchmark has been designed to handle and overcome data locking situations - the ATO benchmark users access common master data, such as material, vendor, or customer master data. (source: http://www12.sap.com/solutions/benchmark/ato.epx).

SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products.

See Also

Disclosure Statement

SAP, R/3 are registered trademarks of SAP AG in Germany and other countries. More information may be found at www.sap.com/benchmark

Two-tier SAP ATO standard SAP ERP 6.0 2005/EP4 (Unicode) application benchmarks as of 09/04/11:
Oracle's SPARC Enterprise M9000 (64 processors, 256 cores, 512 threads) 206,360 Assembly Orders/hour, 64 x 3.0 GHz SPARC64 VII+, 2048 GB memory, Oracle 11g, Oracle Solaris 10, Certification Number 2011033.

Two-tier SAP ATO standard 4.6 C application benchmarks as of 09/04/11:
Fujitsu Siemens Primepower 900 (16-way SMP) 12,170 Assembly Orders/hour, 16 x 1.35 GHz SPARC64 V, 64 GB memory, Oracle 9i, Solaris 8, Certification Number 2003012.
HP rx5670 (4 processors SMP) 3,090 Assembly Orders/hour, 4 x 1.0 GHz Itanium II, 24 GB memory, Oracle 9i, HP-UX 11i, Certification Number 2002069.

Two-tier SAP ATO standard 4.6 B application benchmarks as of 09/04/11:
HP 9000 Superdome (64-way SMP) 18,870 Assembly Orders/hour, 64 x 552 MHz PA-RISC 8600, 128 GB memory, Oracle 8.1.6, HP-UX 11.11, Certification Number 2001014.
Fujitsu Siemens Primepower 2000 (128 processors SMP) 34,260 Assembly Orders/hour, 128 x 560 MHz SPARC64, 128 GB memory, Oracle 8.1.7, Solaris 8, Certification Number 2001018.

Three-tier SAP ATO standard 4.6 C application benchmarks as of 09/04/11:
HP 9000 Superdome Enterprise Server (64 processors SMP) 144,090 Assembly Orders/hour, 64 x 750 MHz PA-RISC 8700, 128 GB memory, Oracle 9i, HP-UX 11i, Certification Number 2002003
HP 9000 Superdome Enterprise Server (64 processors SMP) 130,570 Assembly Orders/hour, 64 x 750 MHz PA-RISC 8700, 128 GB memory, Oracle 9i, HP-UX 11i, Certification Number 2001047

Friday Aug 12, 2011

Sun Blade X6270 M2 with Oracle WebLogic World Record 2 Processor SPECjEnterprise 2010 Benchmark

Oracle produced a world record result for a single application server using two chips on the SPECjEnterprise2010 benchmark: 5,427.42 SPECjEnterprise2010 EjOPS, using one of Oracle's Sun Blade X6270 M2 server modules for the application tier and one Sun Blade X6270 M2 server module for the database.

  • The Sun Blade X6270 M2 server module equipped with two Intel Xeon X5690 processors running at 3.46 GHz, demonstrated 47% better performance compared to the 2-chip IBM System HS22 server result of 3,694.35 SPECjEnterprise2010 EjOPS using the same model of Intel Xeon X5690 processor.

  • The Sun Blade X6270 M2 server module running the application tier demonstrated 33% better performance compared to the 2-chip IBM Power 730 Express server result of 4,062.38 SPECjEnterprise2010 EjOPS.

  • The Sun Blade X6270 M2 server modules used Oracle WebLogic Server 11g Release 1 (10.3.5) application, Java SE 6 Update 26, and Oracle Database 11g Release 2 to produce this result.

Performance Landscape

Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results.

SPECjEnterprise2010 Performance Chart
as of 8/11/2011
| Submitter | EjOPS* | Application Server | Database Server |
|---|---|---|---|
| Oracle | 5,427.42 | 1x Sun Blade X6270 M2, 2x 3.46 GHz Intel Xeon X5690, Oracle WebLogic 11g (10.3.5) | 1x Sun Blade X6270 M2, 2x 3.46 GHz Intel Xeon X5690, Oracle 11g DB 11.2.0.2 |
| IBM | 4,062.38 | 1x IBM Power 730 Express, 2x 3.5 GHz POWER 7, WebSphere Application Server V7 | 1x IBM BladeCenter PS701, 1x 3.0 GHz POWER 7, IBM DB2 9.7 Workgroup Server Edition FP3a |
| IBM | 3,694.35 | 1x IBM HS22, 2x 3.46 GHz Intel Xeon X5690, WebSphere Application Server V8 | 1x IBM x3850 X5, 2x 2.4 GHz Intel Xeon E7-4870, IBM DB2 9.7 FP3a |

* SPECjEnterprise2010 EjOPS, bigger is better.

Configuration Summary

Application Server:
    1 x Sun Blade X6270 M2
      2 x 3.46 GHz Intel Xeon X5690
      48 GB memory
      4 x 10 GbE NIC
      Oracle Linux 5 Update 6
      Oracle WebLogic Server 11g Release 1 (10.3.5)
      Java HotSpot(TM) 64-Bit Server VM on Linux, version 1.6.0_26 (Java SE 6 Update 26)

Database Server:

    1 x Sun Blade X6270 M2
      2 x 3.46 GHz Intel Xeon X5690
      144 GB memory
      2 x 10 GbE NIC
      2 x Sun Storage 6180
      Oracle Linux 5 Update 6
      Oracle Database 11g Enterprise Edition Release 11.2.0.2

Benchmark Description

SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The SPECjEnterprise2010 benchmark has been designed and developed to cover the Java EE 5.0 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems.

The workload consists of an end-to-end web-based order processing domain, an RMI and Web Services driven manufacturing domain, and a supply chain model utilizing document-based Web Services. The application is a collection of Java classes, Java Servlets, Java Server Pages, Enterprise Java Beans, Java Persistence Entities (POJOs) and Message Driven Beans.

The SPECjEnterprise2010 benchmark heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network.

The primary metric of the SPECjEnterprise2010 benchmark is jEnterprise Operations Per Second ("SPECjEnterprise2010 EjOPS"). The primary metric for the SPECjEnterprise2010 benchmark is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is no price/performance metric in this benchmark.

Key Points and Best Practices

  • Two Oracle WebLogic server instances were started using numactl binding 1 instance per chip.
  • Two Oracle database listener processes were started and each was bound to a separate chip.
  • Additional tuning information is in the report at http://spec.org.
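The per-chip binding described above can be expressed as a numactl command line per instance. The following sketch only builds and prints the commands; the start script name and its argument are hypothetical placeholders, and binding memory as well as CPUs with `--membind` is an assumption, since the text does not say which numactl options were used. The actual settings are in the SPEC report.

```python
import shlex
from typing import List


def numactl_command(node: int, command: List[str]) -> List[str]:
    """Prefix a command so it runs on the CPUs and memory of one NUMA node."""
    return ["numactl", f"--cpunodebind={node}", f"--membind={node}", *command]


# Hypothetical start script; one server instance per chip (NUMA node).
commands = [
    numactl_command(node, ["./start_server_instance.sh", f"instance{node}"])
    for node in (0, 1)
]

for cmd in commands:
    print(shlex.join(cmd))
```

Keeping the commands as argument lists makes them easy to hand to `subprocess.Popen` unchanged if the sketch is adapted to launch real processes.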

See Also

Disclosure Statement

SPEC and the benchmark name SPECjEnterprise are registered trademarks of the Standard Performance Evaluation Corporation. Sun Blade X6270 M2, 5,427.42 SPECjEnterprise2010 EjOPS; IBM Power 730 Express, 4,062.38 SPECjEnterprise2010 EjOPS; IBM System HS22, 3,694.35 SPECjEnterprise2010 EjOPS. Results from www.spec.org as of 8/11/2011.

Friday Jul 01, 2011

SPARC T3-1 Record Results Running JD Edwards EnterpriseOne Day in the Life Benchmark with Added Batch Component

Using Oracle's SPARC T3-1 server for the application tier and Oracle's SPARC Enterprise M3000 server for the database tier, a world record result was produced running Oracle's JD Edwards EnterpriseOne applications Day in the Life benchmark concurrently with a batch workload.

  • The SPARC T3-1 server based result has 25% better performance than the IBM Power 750 POWER7 server even though the IBM result did not include running a batch component.

  • The SPARC T3-1 server based result has 25% better space/performance than the IBM Power 750 POWER7 server as measured by the online component.

  • The SPARC T3-1 server based result is 5x faster than the x86-based IBM x3650 M2 server system when executing the online component of the JD Edwards EnterpriseOne 9.0.1 Day in the Life benchmark. The IBM result did not include a batch component.

  • The SPARC T3-1 server based result has 2.5x better space/performance than the x86-based IBM x3650 M2 server as measured by the online component.

  • The combination of SPARC T3-1 and SPARC Enterprise M3000 servers delivered a Day in the Life benchmark result of 5000 online users with 0.875 seconds of average transaction response time running concurrently with 19 Universal Batch Engine (UBE) processes at 10 UBEs/minute. The solution exercises various JD Edwards EnterpriseOne applications while running Oracle WebLogic Server 11g Release 1 and Oracle Web Tier Utilities 11g HTTP server in Oracle Solaris Containers, together with the Oracle Database 11g Release 2.

  • The SPARC T3-1 server showed that it could handle the additional workload of batch processing while maintaining the same number of online users for the JD Edwards EnterpriseOne Day in the Life benchmark. This was accomplished with minimal loss in response time.

  • JD Edwards EnterpriseOne 9.0.1 takes advantage of the large number of compute threads available in the SPARC T3-1 server at the application tier and achieves excellent response times.

  • The SPARC T3-1 server consolidates the application/web tier of the JD Edwards EnterpriseOne 9.0.1 application using Oracle Solaris Containers. Containers provide flexibility, easier maintenance and better CPU utilization of the server leaving processing capacity for additional growth.

  • A number of Oracle advanced technology and features were used to obtain this result: Oracle Solaris 10, Oracle Solaris Containers, Oracle Java Hotspot Server VM, Oracle WebLogic Server 11g Release 1, Oracle Web Tier Utilities 11g, Oracle Database 11g Release 2, the SPARC T3 and SPARC64 VII+ based servers.

  • This is the first published result running both online and batch workloads concurrently on the JD Edwards EnterpriseOne application server. No published results are available from IBM running the online component together with a batch workload.

  • The 9.0.1 version of the benchmark saw some minor performance improvements relative to 9.0. When comparing between 9.0.1 and 9.0 results, the reader should take this into account when the difference between results is small.

Performance Landscape

JD Edwards EnterpriseOne Day in the Life Benchmark
Online with Batch Workload

This is the first publication on the Day in the Life benchmark run concurrently with batch jobs. The batch workload was provided by Oracle's Universal Batch Engine.

| System | Rack Units | Online Users | Resp Time (sec) | Batch Concur (# of UBEs) | Batch Rate (UBEs/m) | Version |
|---|---|---|---|---|---|---|
| SPARC T3-1, 1xSPARC T3 (1.65 GHz), Solaris 10; M3000, 1xSPARC64 VII+ (2.86 GHz), Solaris 10 | 4 | 5000 | 0.88 | 19 | 10 | 9.0.1 |

Resp Time (sec) — Response time of online jobs reported in seconds
Batch Concur (# of UBEs) — Batch concurrency presented in the number of UBEs
Batch Rate (UBEs/m) — Batch transaction rate in UBEs/minute.

JD Edwards EnterpriseOne Day in the Life Benchmark
Online Workload Only

These results are for the Day in the Life benchmark. They are run without any batch workload.

| System | Rack Units | Online Users | Response Time (sec) | Version |
|---|---|---|---|---|
| SPARC T3-1, 1xSPARC T3 (1.65 GHz), Solaris 10; M3000, 1xSPARC64 VII (2.75 GHz), Solaris 10 | 4 | 5000 | 0.52 | 9.0.1 |
| IBM Power 750, 1xPOWER7 (3.55 GHz), IBM i7.1 | 4 | 4000 | 0.61 | 9.0 |
| IBM x3650M2, 2xIntel X5570 (2.93 GHz), OVM | 2 | 1000 | 0.29 | 9.0 |

IBM result from http://www-03.ibm.com/systems/i/advantages/oracle/, IBM used WebSphere
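The performance and space/performance ratios quoted in the highlights can be reproduced from the online-only results, taking space/performance to mean online users per rack unit. That reading of the metric is an assumption based on the columns reported; the dictionary keys below are illustrative labels.

```python
# Online-only Day in the Life results from the table above.
results = {
    "SPARC T3-1 + M3000": {"rack_units": 4, "users": 5000},
    "IBM Power 750": {"rack_units": 4, "users": 4000},
    "IBM x3650 M2": {"rack_units": 2, "users": 1000},
}


def users_per_rack_unit(system: str) -> float:
    """Online users supported per rack unit of space."""
    entry = results[system]
    return entry["users"] / entry["rack_units"]


def ratio(metric, system: str, baseline: str) -> float:
    """How many times better `system` is than `baseline` on a metric."""
    return metric(system) / metric(baseline)


def users(system: str) -> float:
    return results[system]["users"]


sparc = "SPARC T3-1 + M3000"
for other in ("IBM Power 750", "IBM x3650 M2"):
    print(f"vs {other}: {ratio(users, sparc, other):.2f}x users, "
          f"{ratio(users_per_rack_unit, sparc, other):.2f}x users per rack unit")
```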

Configuration Summary

Hardware Configuration:

1 x SPARC T3-1 server
1 x 1.65 GHz SPARC T3
128 GB memory
16 x 300 GB 10000 RPM SAS
1 x Sun Flash Accelerator F20 PCIe Card, 96 GB
1 x 10 GbE NIC
1 x SPARC Enterprise M3000 server
1 x 2.86 GHz SPARC64 VII+
64 GB memory
1 x 10 GbE NIC
2 x StorageTek 2540 + 2501

Software Configuration:

JD Edwards EnterpriseOne 9.0.1 with Tools 8.98.3.3
Oracle Database 11g Release 2
Oracle WebLogic Server 11g Release 1 version 10.3.2
Oracle Web Tier Utilities 11g
Oracle Solaris 10 9/10
Mercury LoadRunner 9.10 with Oracle Day in the Life kit for JD Edwards EnterpriseOne 9.0.1
Oracle’s Universal Batch Engine - Short UBEs and Long UBEs

Benchmark Description

JD Edwards EnterpriseOne is an integrated applications suite of Enterprise Resource Planning (ERP) software. Oracle offers 70 JD Edwards EnterpriseOne application modules to support a diverse set of business operations.

Oracle's Day in the Life (DIL) kit is a suite of scripts that exercises most common transactions of JD Edwards EnterpriseOne applications, including business processes such as payroll, sales order, purchase order, work order, and other manufacturing processes, such as ship confirmation. These are labeled by industry acronyms such as SCM, CRM, HCM, SRM and FMS. The kit's scripts execute transactions typical of a mid-sized manufacturing company.

  • The workload consists of online transactions and the UBE workload of 15 short and 4 long UBEs.

  • LoadRunner runs the DIL workload, collects the users' transaction response times and reports the key metric of Combined Weighted Average Transaction Response time.

  • The UBE process workload runs from the JD Edwards EnterpriseOne application server.

    • Oracle's UBE processes come in three flavors:

      • Short UBEs < 1 minute engage in Business Report and Summary Analysis,
      • Mid UBEs > 1 minute create a large report of Account, Balance, and Full Address,
      • Long UBEs > 2 minutes simulate Payroll, Sales Order, night only jobs.
    • The UBE workload generates large numbers of PDF report files and log files.

    • The UBE Queues are categorized as the QBATCHD, a single threaded queue for large UBEs, and the QPROCESS queue for short UBEs run concurrently.

  • One of the Oracle Solaris Containers ran 4 Long UBEs, while another Container ran 15 short UBEs concurrently.

  • The mixed-size UBEs ran concurrently from the SPARC T3-1 server with the 5000 online users driven by LoadRunner.

  • Oracle’s UBE process performance metric is Number of Maximum Concurrent UBE processes at transaction rate, UBEs/minute.

Key Points and Best Practices

Two JD Edwards EnterpriseOne Application Servers and two Oracle Fusion Middleware WebLogic Servers 11g R1 coupled with two Oracle Fusion Middleware 11g Web Tier HTTP Server instances on the SPARC T3-1 server were hosted in four separate Oracle Solaris Containers to demonstrate consolidation of multiple application and web servers.

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 6/27/2011.

Friday Jun 10, 2011

SPARC Enterprise M5000 Delivers First PeopleSoft Payroll 9.1 Benchmark

Oracle's M-series server sets a world record on Oracle's PeopleSoft Enterprise Payroll (N.A) 9.1 with extra large volume model benchmark (Unicode). Oracle's SPARC Enterprise M5000 server was able to run faster than the previous generation system result even though the PeopleSoft Payroll 9.1 benchmark is more computationally demanding.

Oracle's SPARC Enterprise M5000 server configured with eight 2.66 GHz SPARC64 VII+ processors together with Oracle's Sun Storage F5100 Flash Array storage achieved world record performance on the Unicode version of Oracle's PeopleSoft Enterprise Payroll (N.A) 9.1 with extra large volume model benchmark using Oracle Database 11g Release 2 running on Oracle Solaris 10.

  • The SPARC Enterprise M5000 server processed payroll payments for the 500K employees PeopleSoft Payroll 9.1 (Unicode) benchmark in 46.76 minutes compared to a previous result of 50.11 minutes for the PeopleSoft Payroll 9.0 (non-Unicode) benchmark configured with 2.53 GHz SPARC64 VII processors resulting in 7% better performance.

  • Note that the IBM z10 Gen1 mainframe running the PeopleSoft Payroll 9.0 (Unicode) benchmark was 48% slower than the 9.0 non-Unicode version. The IBM z10 mainframe with nine 4.4 GHz Gen1 processors has a list price over $6M and is rated at 6,512 MIPS.

  • The SPARC Enterprise M5000 server with the Sun Storage F5100 Flash Array system processed payroll for 500K employees completing the end-to-end run in 66.28 mins, 11% faster than earlier published result of 73.88 mins with Payroll 9.0 configured with 2.53 GHz SPARC64 VII processors.

  • The Sun Storage F5100 Flash Array device is a high performance, high-density solid-state flash array which provides a read latency of only 0.5 msec which is about 10 times faster than the normal disk latencies of 5 msec measured on this benchmark.
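The percentages quoted in the bullets above follow from the ratio of the old and new elapsed times. The sketch below reproduces that arithmetic; the helper name `percent_faster` is hypothetical, and the inputs are the figures quoted in this post.

```python
def percent_faster(old_minutes: float, new_minutes: float) -> float:
    """Percent improvement in rate when elapsed time drops from old to new."""
    return (old_minutes / new_minutes - 1.0) * 100.0


# Payroll processing result: Payroll 9.0 (50.11 min) vs Payroll 9.1 (46.76 min).
payroll_gain = percent_faster(50.11, 46.76)

# End-to-end run: 73.88 min vs 66.28 min.
end_to_end_gain = percent_faster(73.88, 66.28)

# Read latency: about 5 msec for disk vs 0.5 msec for the flash array.
latency_ratio = 5.0 / 0.5

print(f"payroll processing: {payroll_gain:.1f}% faster")
print(f"end-to-end run:     {end_to_end_gain:.1f}% faster")
print(f"read latency:       {latency_ratio:.0f}x lower")
```

The results round to the 7%, 11% and 10x figures stated above.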

Performance Landscape

PeopleSoft Payroll (N.A.) 9.1 – 500K Employees (7 Million SQL PayCalc, Unicode)

System | Processor | OS/Database | Payroll Processing Result (minutes) | Run 1 (minutes) | Num of Streams
SPARC M5000 | 8x 2.66GHz SPARC64 VII+ | Solaris/Oracle 11g | 46.76 | 66.28 | 32

PeopleSoft Payroll (N.A.) 9.0 – 500K Employees (3 Million SQL PayCalc, Non-Unicode)

System | Processor | OS/Database | Payroll Processing Result (min) | Run 1 (min) | Run 2 (min) | Run 3 (min) | Num of Streams
Sun M5000 | 8x 2.53GHz SPARC64 VII | Solaris/Oracle 11g | 50.11 | 73.88 | 534.20 | 1267.06 | 32
IBM z10 | 9x 4.4GHz Gen1 | z/OS/DB2 | 58.96 | 80.5 | 250.68 | 462.6 | 8
IBM z10 | 9x 4.4GHz Gen1 | z/OS/DB2 | 87.4 ** | 107.6 | - | - | 8
HP rx7640 | 8x 1.6GHz Itanium2 | HP-UX/Oracle 11g | 96.17 | 133.63 | 712.72 | 1665.01 | 32

** This result was run with Unicode

Payroll 9.1 Compared to Payroll 9.0

Please note that Payroll 9.1 is Unicode based and Payroll 9.0 is non-Unicode. There are 7 million executions of an SQL statement for the PayCalc batch process in Payroll 9.1 and 3 million executions of the same SQL statement for the PayCalc batch process in Payroll 9.0. This is reflected in the elapsed time (27.33 min for 9.1 and 23.78 min for 9.0). The elapsed times of all other batch processes are lower (better) on 9.1.
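To put the two workloads on a common footing, one can compare how many of these SQL executions complete per minute. The sketch below assumes the 27.33 and 23.78 minute figures are the PayCalc elapsed times, which is how the paragraph above reads. The variable names are illustrative only.

```python
# Figures quoted above; the pairing with PayCalc elapsed time is an assumption.
executions_91 = 7_000_000   # SQL executions in Payroll 9.1 PayCalc
executions_90 = 3_000_000   # SQL executions in Payroll 9.0 PayCalc
elapsed_91_min = 27.33
elapsed_90_min = 23.78

rate_91 = executions_91 / elapsed_91_min   # executions per minute on 9.1
rate_90 = executions_90 / elapsed_90_min   # executions per minute on 9.0
rate_ratio = rate_91 / rate_90

print(f"9.1: {rate_91:,.0f} executions/min")
print(f"9.0: {rate_90:,.0f} executions/min")
print(f"ratio: {rate_ratio:.2f}x")
```

Under that assumption, the 9.1 run sustains roughly twice the execution rate of the 9.0 run while taking only a few minutes longer.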

Configuration Summary

Hardware Configuration:

SPARC Enterprise M5000 server
8 x 2.66 GHz SPARC64 VII+ processors
128 GB memory
2 x SAS HBA (SG-XPCIE8SAS-E-Z - PCIe HBA for Rack Servers)
Sun Storage F5100 Flash Array
40 x 24 GB FMODs
1 x StorageTek 2501 array with
12 x 146 GB SAS 15K RPM disks
1 x StorageTek 2540 array with
12 x 146 GB SAS 15K RPM disks

Software Configuration:

Oracle Solaris 10 09/10
PeopleSoft HRMS and Campus Solutions 9.10.303
PeopleSoft Enterprise (PeopleTools) 8.51.035
Oracle Database 11g Release 2 11.2.0.1 (64-bit)
Micro Focus COBOLServer Express 5.1 (64-bit)

Benchmark Description

The PeopleSoft 9.1 Payroll (North America) benchmark is a performance benchmark established by PeopleSoft to demonstrate system performance for a range of processing volumes in a specific configuration. This information may be used to determine the software, hardware, and network configurations necessary to support processing volumes. This workload represents large batch runs typical of OLTP workloads during a mass update.

The benchmark measures five application business process run times for a database representing a large organization. The five processes are:

  • Paysheet Creation: Generates payroll data worksheets consisting of standard payroll information for each employee for a given pay cycle.

  • Payroll Calculation: Looks at paysheets and calculates checks for those employees.

  • Payroll Confirmation: Takes information generated by Payroll Calculation and updates the employees' balances with the calculated amounts.

  • Print Advice forms: The process takes the information generated by Payroll Calculations and Confirmation and produces an Advice for each employee to report Earnings, Taxes, Deduction, etc.

  • Create Direct Deposit File: The process takes information generated by the above processes and produces an electronic transmittal file that is used to transfer payroll funds directly into an employee's bank account.

For the benchmark, we collected at least three data points with different numbers of job streams (parallel jobs). This batch benchmark allows a maximum of thirty-two job streams to be configured to run in parallel.

See Also

Disclosure Statement

Oracle's PeopleSoft Payroll 9.1 benchmark, SPARC Enterprise M5000 46.76 min, www.oracle.com/apps_benchmark/html/white-papers-peoplesoft.html, results 6/10/2011.

Friday Jun 03, 2011

SPARC Enterprise M8000 with Oracle 11g Beats IBM POWER7 on TPC-H @1000GB Benchmark

Oracle's SPARC Enterprise M8000 server configured with SPARC64 VII+ processors, Oracle's Sun Storage F5100 Flash Array storage, Oracle Solaris, and Oracle Database 11g Release 2 achieved a TPC-H performance result of 209,533 QphH@1000GB with price/performance of $9.53/QphH@1000GB.

Oracle's SPARC server surpasses the performance of the IBM POWER7 server on the 1 TB TPC-H decision support benchmark.

Oracle focuses on the performance of the complete hardware and software stack. Implementation details such as the number of cores or the number of threads obscure the important metric of delivered system performance. The SPARC Enterprise M8000 server delivers higher performance than the IBM Power 780 even though the SPARC64 VII+ processor-core is 1.6x slower than the POWER7 processor-core.

  • The SPARC Enterprise M8000 server is 27% faster than the IBM Power 780. IBM's reputed single-thread performance leadership does not provide benefit for throughput.

  • Oracle beats IBM Power with better performance. This shows that Oracle's focus on integrated system design provides more customer value than IBM's focus on per core performance.

  • The SPARC Enterprise M8000 server is up to 3.8 times faster than the IBM Power 780 for Refresh Function. Again, IBM's reputed single-thread performance leadership does not provide benefit for this important function.

  • The SPARC Enterprise M8000 server is 49% faster than the HP Superdome 2 (1.73 GHz Itanium 9350).

  • The SPARC Enterprise M8000 server has 22% better price/performance than the HP Superdome 2 (1.73 GHz Itanium 9350).

  • The SPARC Enterprise M8000 server is 2 times faster than the HP Superdome 2 (1.73 GHz Itanium 9350) for Refresh Function.

  • Oracle used Storage Redundancy Level 3 as defined by the TPC-H 2.14.0 specification which is the highest level.

  • One should focus on the performance of the complete hardware and software stack since server implementation details such as the number of cores or the number of threads obscure the important metric of delivered system performance.

  • This TPC-H result demonstrates that the SPARC Enterprise M8000 server can handle the increasingly large databases required of DSS systems. The server delivered more than 16 GB/sec of IO throughput through Oracle Database 11g Release 2 software while maintaining high CPU load.

Performance Landscape

The table below lists published results from comparable enterprise class systems from Oracle, HP and IBM. Each system was configured with 512 GB of memory.

TPC-H @1000GB

System | CPU type | Proc/Core/Thread | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available
SPARC Enterprise M8000 | 3 GHz SPARC64 VII+ | 16 / 64 / 128 | 209,533.6 | $9.53 | 177,845.9 | 246,867.2 | Oracle 11g | 09/22/11
IBM Power 780 | 4.14 GHz POWER7 | 8 / 32 / 128 | 164,747.2 | $6.85 | 170,206.4 | 159,463.1 | Sybase | 03/31/11
HP SuperDome 2 | 1.73 GHz Intel Itanium 9350 | 16 / 64 / 64 | 140,181.1 | $12.15 | 139,181.0 | 141,188.3 | Oracle 11g | 10/20/10

QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity

Complete benchmark results found at the TPC benchmark website http://www.tpc.org.
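The relative-performance claims in this post can be recomputed from the published composite and price/performance figures. The sketch below shows that arithmetic; the dictionary layout and function names are hypothetical, and only the numbers come from the table above.

```python
# Published TPC-H @1000GB figures copied from the table above.
RESULTS = {
    "SPARC Enterprise M8000": {"qphh": 209_533.6, "price_perf": 9.53},
    "IBM Power 780": {"qphh": 164_747.2, "price_perf": 6.85},
    "HP Superdome 2": {"qphh": 140_181.1, "price_perf": 12.15},
}


def percent_faster(system: str, baseline: str) -> float:
    """Percent by which system's composite QphH exceeds the baseline's."""
    return (RESULTS[system]["qphh"] / RESULTS[baseline]["qphh"] - 1.0) * 100.0


def percent_better_price_perf(system: str, baseline: str) -> float:
    """Percent reduction in $/QphH relative to the baseline (lower is better)."""
    base = RESULTS[baseline]["price_perf"]
    return (base - RESULTS[system]["price_perf"]) / base * 100.0


m8000 = "SPARC Enterprise M8000"
print(f"vs IBM Power 780:  {percent_faster(m8000, 'IBM Power 780'):.1f}% faster")
print(f"vs HP Superdome 2: {percent_faster(m8000, 'HP Superdome 2'):.1f}% faster")
print(f"vs HP Superdome 2: {percent_better_price_perf(m8000, 'HP Superdome 2'):.1f}% better $/QphH")
```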

Configuration Summary and Results

Server:

SPARC Enterprise M8000 server
16 x SPARC64 VII+ 3.0 GHz processors (total of 64 cores, 128 threads)
512 GB memory
12 x internal SAS (12 x 300 GB) disk drives

External Storage:

4 x Sun Storage F5100 Flash Array device, each with
80 x 24 GB Flash Modules

Software:

Oracle Solaris 10 8/11
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 1000 GB (Scale Factor 1000)
TPC-H Composite: 209,533.6 QphH@1000GB
Price/performance: $9.53/QphH@1000GB
Available: 09/22/2011
Total 3 year Cost: $1,995,715
TPC-H Power: 177,845.9
TPC-H Throughput: 246,867.2
Database Load Time: 1:27:12

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate the performance of Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multi user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years of maintenance to QphH.
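As a worked example, the composite and price/performance figures can be recomputed from the audited results listed earlier in this post. The sketch assumes, following the TPC-H specification as I understand it, that the composite is the geometric mean of the Power and Throughput quantities. The variable names are illustrative, and the audited price/performance may be rounded differently than a plain division.

```python
import math

# Audited figures from this post (SPARC Enterprise M8000, 1000 GB).
power_qpph = 177_845.9        # TPC-H Power
throughput_qthh = 246_867.2   # TPC-H Throughput
total_3yr_cost = 1_995_715    # total 3 year cost in USD

# Composite metric: geometric mean of Power and Throughput.
composite_qphh = math.sqrt(power_qpph * throughput_qthh)

# Price/performance: total cost divided by the composite metric.
price_perf = total_3yr_cost / composite_qphh

print(f"composite: {composite_qphh:,.1f} QphH@1000GB")
print(f"price/performance: ${price_perf:.4f}/QphH@1000GB")
```

The computed composite agrees with the reported 209,533.6 QphH, and the price/performance lands within a cent of the reported $9.53.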

Key Points and Best Practices

  • Four Sun Storage F5100 Flash Array devices were used for the benchmark. Each F5100 device contains 80 Flash Modules (FMODs). Twenty (20) FMODs from each F5100 device were connected to a single SAS 6 Gb HBA. A single F5100 device showed 4.16 GB/sec for sequential read and demonstrated linear scaling of 16.62 GB/sec with 4 x F5100 devices.
  • The IO rate from the Oracle database was over 16 GB/sec.
  • Oracle Solaris 10 8/11 required very little system tuning.
  • The SPARC Enterprise M8000 server and Oracle Solaris efficiently managed the system load of over one thousand Oracle parallel processes.
  • The Oracle database files were mirrored under Solaris Volume Manager (SVM). Two F5100 arrays were mirrored to another two F5100 arrays. IO performance was good and balanced across all the FMODs. Because of the SVM mirror, one of the durability tests, the disk/controller failure test, was transparent to the Oracle database.
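The storage figures in the list above can be checked with a little arithmetic. The sketch below assumes each group of 20 FMODs has its own HBA, which is an inference from the wording above rather than a stated count. The variable names are illustrative.

```python
# Figures quoted in the list above.
arrays = 4
fmods_per_array = 80
fmods_per_hba = 20
single_array_gb_per_sec = 4.16
measured_total_gb_per_sec = 16.62

# Assumption: every group of 20 FMODs is attached to a separate HBA.
hbas_per_array = fmods_per_array // fmods_per_hba
total_hbas = hbas_per_array * arrays

# Ideal linear scaling versus the measured aggregate sequential read rate.
ideal_total_gb_per_sec = arrays * single_array_gb_per_sec
scaling_efficiency = measured_total_gb_per_sec / ideal_total_gb_per_sec

print(f"HBAs: {hbas_per_array} per array, {total_hbas} in total")
print(f"ideal {ideal_total_gb_per_sec:.2f} GB/sec, measured {measured_total_gb_per_sec:.2f} GB/sec")
print(f"scaling efficiency: {scaling_efficiency:.1%}")
```

A scaling efficiency this close to 100% is what the post means by linear scaling across the four arrays.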

See Also

Disclosure Statement

SPARC Enterprise M8000 209,533.6 QphH@1000GB, $9.53/QphH@1000GB, avail 09/22/11, IBM Power 780 QphH@1000GB, 164,747.2 QphH@1000GB, $6.85/QphH@1000GB, avail 03/31/11, HP Integrity Superdome 2 140,181.1 QphH@1000GB, $12.15/QphH@1000GB avail 10/20/10, TPC-H, QphH, $/QphH tm of Transaction Processing Performance Council (TPC). More info www.tpc.org.

Friday Mar 25, 2011

SPARC Enterprise M9000 with Oracle Database 11g Delivers World Record Single Server TPC-H @3000GB Result

Oracle's SPARC Enterprise M9000 server delivers single-system TPC-H @3000GB world record performance. The SPARC Enterprise M9000 server along with Oracle's Sun Storage 6180 arrays and running Oracle Database 11g Release 2 on the Oracle Solaris operating system proves the power of Oracle's integrated solution.

  • The SPARC Enterprise M9000 server configured with SPARC64 VII+ processors, Sun Storage 6180 arrays and running Oracle Solaris 10 combined with Oracle Database 11g Release 2 achieved World Record TPC-H performance of 386,478.3 QphH@3000GB for non-clustered systems.

  • The SPARC Enterprise M9000 server running the Oracle Database 11g Release 2 software is 2.5 times faster than the IBM p595 (POWER6) server which ran with Sybase IQ v.15.1 database software.

  • The SPARC Enterprise M9000 server is 3.4 times faster than the IBM p595 server for data loading.

  • The SPARC Enterprise M9000 server is 3.5 times faster than the IBM p595 server for Refresh Function.

  • The SPARC Enterprise M9000 server configured with Sun Storage 6180 arrays shows linear scaling up to the maximum delivered IO performance of 48.3 GB/sec as measured by vdbench.

  • The SPARC Enterprise M9000 server running the Oracle Database 11g Release 2 software is 2.4 times faster than the HP ProLiant DL980 server which used Microsoft SQL Server 2008 R2 Enterprise Edition software.

  • The SPARC Enterprise M9000 server is 2.9 times faster than the HP ProLiant DL980 server for data loading.

  • The SPARC Enterprise M9000 server is 4 times faster than the HP ProLiant DL980 server for Refresh Function.

  • A 1.94x improvement was delivered by the SPARC Enterprise M9000 server result using 64 SPARC64 VII+ processors compared to the previous Sun SPARC Enterprise M9000 server result which used 32 SPARC64 VII processors.

  • Oracle's TPC-H result shows that the SPARC Enterprise M9000 server can handle the increasingly large databases required of DSS systems. The IO rate as measured by the Oracle database is over 40 GB/sec.

  • Oracle used Storage Redundancy Level 3 as defined by the TPC-H 2.14.0 specification which is the highest level.

Performance Landscape

TPC-H @3000GB, Non-Clustered Systems

System | CPU type | Memory | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available
SPARC Enterprise M9000 | 3 GHz SPARC64 VII+ | 1024 GB | 386,478.3 | $18.19 | 316,835.8 | 471,428.6 | Oracle 11g | 09/22/11
Sun SPARC Enterprise M9000 | 2.88 GHz SPARC64 VII | 512 GB | 198,907.5 | $15.27 | 182,350.7 | 216,967.7 | Oracle 11g | 12/09/10
HP ProLiant DL980 G7 | 2.27 GHz Intel Xeon X7560 | 512 GB | 162,601.7 | $2.68 | 185,297.7 | 142,601.7 | SQL Server | 10/13/10
IBM Power 595 | 5.0 GHz POWER6 | 512 GB | 156,537.3 | $20.60 | 142,790.7 | 171,607.4 | Sybase | 11/24/09

QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity

Complete benchmark results found at the TPC benchmark website http://www.tpc.org.

Configuration Summary and Results

Server:

SPARC Enterprise M9000
64 x SPARC64 VII+ 3.0 GHz processors
1024 GB memory
4 x internal SAS (4 x 146 GB)

External Storage:

32 x Sun Storage 6180 arrays (each with 16 x 600 GB)

Software:

Oracle Solaris 10 9/10
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 3000 GB (Scale Factor 3000)
TPC-H Composite: 386,478.3 QphH@3000GB
Price/performance: $18.19/QphH@3000GB
Available: 09/22/2011
Total 3 year Cost: $7,030,009
TPC-H Power: 316,835.8
TPC-H Throughput: 471,428.6
Database Load Time: 2:59:01

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate the performance of Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multi user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years of maintenance to QphH.

Key Points and Best Practices

  • The Sun Storage 6180 array showed linear scalability, reaching 48.3 GB/sec Sequential Read with thirty-two Sun Storage 6180 arrays. Scaling could continue if more arrays were available.
  • Oracle Solaris 10 9/10 required very little system tuning.
  • The optimal Sun Storage 6180 array configuration for the benchmark was to set up 1 disk per volume instead of multiple disks per volume and let Oracle Automatic Storage Management (ASM) mirror. Presenting as many volumes as possible to the Oracle database gave the highest scan rate.

  • The storage was managed by ASM with 4 MB stripe size. 1 MB is the default stripe size but 4 MB works better for large databases.

  • All the Oracle database files, except TEMP tablespace, were mirrored under ASM. 16 x Sun Storage 6180 arrays (256 disks) were mirrored to another 16 x Sun Storage 6180 arrays using ASM. IO performance was good and balanced across all the disks. With the ASM mirror the benchmark passed the ACID (Atomicity, Consistency, Isolation and Durability) test.

  • Oracle database tables were 256-way partitioned. The parallel degree for each table was set to 256 to match the number of available cores. This setting worked the best for performance.

  • Oracle Database 11g Release 2 feature Automatic Parallel Degree Policy was set to AUTO for the benchmark. This enabled automatic degree of parallelism, statement queuing and in-memory parallel execution.
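The layout numbers in the list above are internally consistent, as the following sketch shows. It assumes four cores per SPARC64 VII+ processor, which matches the 16 processor / 64 core figure quoted for the M8000 elsewhere on this page. The variable names are illustrative.

```python
# Figures quoted in this post.
arrays = 32
disks_per_array = 16
processors = 64
cores_per_processor = 4          # assumption: quad-core SPARC64 VII+
peak_read_gb_per_sec = 48.3      # vdbench sequential read across all arrays

# ASM mirrors one half of the arrays onto the other half.
disks_per_mirror_side = (arrays // 2) * disks_per_array

# The table partition count and parallel degree were matched to the core count.
total_cores = processors * cores_per_processor

# Average sequential read contribution of each array under linear scaling.
per_array_gb_per_sec = peak_read_gb_per_sec / arrays

print(f"disks per mirror side: {disks_per_mirror_side}")
print(f"cores available: {total_cores}")
print(f"per-array read rate: {per_array_gb_per_sec:.2f} GB/sec")
```

Both the mirror-side disk count and the core count come out to 256, in line with the 256 disks per mirror side and the 256-way partitioning described above.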

See Also

Disclosure Statement

SPARC Enterprise M9000 386,478.3 QphH@3000GB, $18.19/QphH@3000GB, avail 09/22/11, IBM Power 595 QphH@3000GB, 156,537.3 QphH@3000GB, $20.60/QphH@3000GB, avail 11/24/09, HP ProLiant DL980 G7 162,601.7 QphH@3000GB, $2.68/QphH@3000GB avail 10/13/10, TPC-H, QphH, $/QphH tm of Transaction Processing Performance Council (TPC). More info www.tpc.org.

Wednesday Mar 23, 2011

SPARC T3-1B Doubles Performance on Oracle Fusion Middleware WebLogic Avitek Medical Records Sample Application

The Oracle WebLogic Server 11g software was used to demonstrate the performance of the Avitek Medical Records sample application. A configuration using SPARC T3-1B and SPARC Enterprise M5000 servers from Oracle was used and showed excellent scaling of different configurations as well as doubling previous generation SPARC blade performance.

  • A SPARC T3-1B server, running a typical real-world J2EE application on Oracle WebLogic Server 11g, together with a SPARC Enterprise M5000 server running the Oracle database, had 2.1 times the transactional throughput of the previous generation UltraSPARC T2 processor based Sun Blade T6320 server module.

  • The SPARC T3-1B server shows linear scaling as the number of cores used in the SPARC T3 processor of the SPARC T3-1B server module is doubled.

  • The Avitek Medical Records application instances were deployed in Oracle Solaris zones on the SPARC T3-1B server, allowing for flexible, scalable and lightweight architecture of the application tier.

Performance Landscape

Performance for the application tier is presented. Results are the maximum transactions per second (TPS).

Server | Processor | Memory | Maximum TPS
SPARC T3-1B | 1 x SPARC T3, 1.65 GHz, 16 cores | 128 GB | 28,156
SPARC T3-1B | 1 x SPARC T3, 1.65 GHz, 8 cores | 128 GB | 14,030
Sun Blade T6320 | 1 x UltraSPARC T2, 1.4 GHz, 8 cores | 64 GB | 13,386

The same SPARC Enterprise M5000 server from Oracle was used in each case as the database server. Internal disk storage was used.
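The scaling and generation-over-generation claims can be recomputed from the three results in the table above. This is a minimal sketch with hypothetical variable names; only the TPS and core counts are taken from this post.

```python
# Maximum TPS and core counts from the table above.
t3_16_core_tps = 28_156
t3_8_core_tps = 14_030
t2_8_core_tps = 13_386

# Scaling when the number of SPARC T3 cores is doubled from 8 to 16.
core_scaling = t3_16_core_tps / t3_8_core_tps

# Full SPARC T3-1B versus the previous generation Sun Blade T6320.
generation_gain = t3_16_core_tps / t2_8_core_tps

# Throughput per core for each configuration.
tps_per_core = {
    "SPARC T3, 16 cores": t3_16_core_tps / 16,
    "SPARC T3, 8 cores": t3_8_core_tps / 8,
    "UltraSPARC T2, 8 cores": t2_8_core_tps / 8,
}

print(f"8 -> 16 core scaling: {core_scaling:.3f}x")
print(f"T3-1B vs T6320:       {generation_gain:.2f}x")
for config, tps in tps_per_core.items():
    print(f"{config}: {tps:.2f} TPS per core")
```

A scaling factor of almost exactly 2 for a doubling of cores is what the post describes as linear scaling.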

Configuration Summary

Hardware Configuration:

1 x SPARC T3-1B
1 x 1.65 GHz SPARC T3
128 GB memory

1 x Sun Blade T6320
1 x 1.4 GHz UltraSPARC T2
64 GB memory

1 x SPARC Enterprise M5000
8 x 2.53 GHz SPARC64 VII
128 GB memory

Software Configuration:

Avitek Medical Records
Oracle Database 10g Release 2
Oracle WebLogic Server 11g R1 version 10.3.3 (Oracle Fusion Middleware)
Oracle Solaris 10 9/10
HP Mercury LoadRunner 9.5

Benchmark Description

Avitek Medical Records (or MedRec) is an Oracle WebLogic Server 11g sample application suite that demonstrates all aspects of the J2EE platform. MedRec showcases the use of each J2EE component, and illustrates best practice design patterns for component interaction and client development. Oracle WebLogic server 11g is a key component of Oracle Fusion Middleware 11g.

The MedRec application provides a framework for patients, doctors, and administrators to manage patient data using a variety of different clients. Patient data includes:

  • Patient profile information: A patient's name, address, social security number, and log-in information.

  • Patient medical records: Details about a patient's visit with a physician, such as the patient's vital signs and symptoms as well as the physician's diagnosis and prescriptions.

MedRec comprises two main Java EE applications supporting different user scenarios:

medrecEar – Patients log in to the web application (patientWebApp) to register or edit their profile. Patients can also view medical records of their prior visits. Administrators use the web application (adminWebApp) to approve or deny new patient profile requests. medrecEar also provides all of the controller and business logic used by the MedRec application suite, as well as the Web Service used by different clients.

physicianEar – Physicians and nurses log in to the web application (physicianWebApp) to search and access patient profiles, create and review medical records, and prescribe medicine to patients. The physician application is designed to communicate using the Web Service provided in the medrecEar.

The medrecEar and physicianEar applications are deployed to an Oracle WebLogic Server 11g instance called MedRecServer. The physicianEar application communicates with the controller components of medrecEar using Web Services.

The workload injected into the MedRec applications measures the average transactions per second for the following sequence:

  1. A client opens page http://{host}:7011/Start.jsp (MedRec)
  2. Patient completes Registration process
  3. Administrator logs in, approves the patient profile, and logs out
  4. Physician connects to the online system and logs in
  5. Physician performs search for a patient and looks up patient's visit information
  6. Physician logs out
  7. Patient logs in and reviews the profile
  8. Patient makes changes to the profile and updates the information
  9. Patient logs out

Each of the above steps constitutes a single transaction.

Key Points and Best Practices

Please see the Oracle documentation on the Oracle Technology Network for tuning your Oracle WebLogic Server 11g deployment.

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 3/22/2011.

Tuesday Mar 22, 2011

Netra SPARC T3-1 22% Faster Than IBM Running Oracle Communications ASAP

Oracle's Netra SPARC T3-1 server delivered better performance than the IBM Power 570 server running the Oracle Communications ASAP application. Oracle Communications ASAP is used by the world's leading communication providers to enable voice, data, video and content services across wireless, wireline and satellite networks.

  • A Netra SPARC T3-1 server is 22% faster than the IBM Power 570 server delivering higher order volume throughput. This was achieved by consolidating Oracle Database 11g Release 2 and Oracle Communications ASAP 7.0.2 software onto a single Netra SPARC T3-1 server.

  • Oracle's Netra servers are NEBS level 3 certified, unlike the competition. NEBS is a set of safety, physical, and environmental design guidelines for telecommunications equipment in the United States.

  • A single Netra SPARC T3-1 server takes one-eighth the rack space of an IBM Power 570 system.

  • The single processor Netra SPARC T3-1 server beat an eight processor IBM Power 570 server.

  • The ASAP result which was run on the Netra SPARC T3-1 server is the highest single-system throughput ever measured for this benchmark.

Performance Landscape

Results of Oracle Communications ASAP run with Oracle Database 11g.

System | Processor | Memory | OS | Orders/hour | Version
Netra SPARC T3-1 | 1 x 1.65 GHz SPARC T3 | 128 GB | Solaris 10 | 570,000 | 7.0.2
IBM Power 570 | 8 x 5 GHz POWER6 | 128 GB | AIX 6.1.2 | 463,500 | 7.0

In both cases, server utilization ranged between 60 and 75%.

Configuration Summary

Hardware Configuration:

Netra SPARC T3-1
1 x 1.65 GHz T3 processor
128 GB memory
Sun Storage 7410 Unified Storage System with one Sun Storage J4400 array

Software Configuration:

Oracle Solaris 10 9/10
Oracle Database 11g Release 2 (11.2.0.1.0)
Java Platform, Standard Edition 6 Update 18
Oracle Communications ASAP 7.0.2
Oracle WebLogic Server 10.3.3.0

Benchmark Description

Oracle Communications Service Activation orchestrates the activation of complex services in a flow-through manner across multiple technology domains for both wireline and wireless service providers. This Activation product has two engines: ASAP (Automatic Service Activation Program) and IPSA (IP Service Activator). ASAP covers multiple technologies and vendors, while IPSA focuses on IP-based services.

ASAP converts order activation requests (also referred to as CSDLs) into specific atomic actions for network elements (ASDLs). ASAP performance is measured in throughput and can be expressed either as number of input business orders processed (orders/hour or CSDLs/hour) or as number of actions on network elements (ASDLs/sec). The ratio of CSDL to ASDL depends on the specific telco operator. This workload uses a 1:7 ratio (commonly used by wireless providers), which means that every order translates into actions for 7 network elements. For this benchmark, ASAP was configured to use one NEP (Network Element Processor) per network element.
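The two ways of expressing ASAP throughput are related by the CSDL to ASDL ratio. The sketch below converts orders/hour into ASDLs/sec for the 1:7 ratio used in this workload; the function name is hypothetical and the input is the Netra SPARC T3-1 result reported in this post.

```python
def asdls_per_second(orders_per_hour: float, asdls_per_order: int) -> float:
    """Convert business orders (CSDLs) per hour into network actions (ASDLs) per second."""
    return orders_per_hour * asdls_per_order / 3600.0


# Netra SPARC T3-1 result with the 1:7 CSDL to ASDL ratio.
netra_asdl_rate = asdls_per_second(570_000, 7)
print(f"{netra_asdl_rate:.2f} ASDLs/sec")
```

With a different operator-specific ratio, the same order rate would translate into a proportionally different ASDL rate.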

Key Points and Best Practices

The application and database tiers were hosted on same Netra SPARC T3-1 server.

ASAP has three main components: WebLogic, SARM, and NEP. WebLogic is used to receive and translate orders coming in as JMS messages. SARM and NEP, both native applications, perform the core activation functions.

A single ASAP instance delivered slightly under 300k orders/hour, with 27% system utilization. To take better advantage of the SPARC T3 processor's threads, two more instances of ASAP were deployed, reaching 570k orders/hour. The observed ratio between ASAP and Oracle database processor load was 1 to 1.

The Sun Storage 7410 data volumes were mounted via NFS and accessed through the onboard GbE NIC.

A second test was conducted with a more complex configuration of 24 NEPs instead of 7. This simulates the requirements of one of the largest ASAP customers. For this scenario, a single ASAP instance delivered 200k orders/hour.

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 3/22/2011.

Thursday Feb 17, 2011

SPARC T3-1 takes JD Edwards "Day In the Life" benchmark lead, beats IBM Power7 by 25%

Oracle's SPARC T3-1 server, running the application, together with Oracle's SPARC Enterprise M3000 server running the database, have achieved a record result of 5000 users, with 0.523 seconds of average transaction response time, for the online component of the "Day in the Life" JD Edwards EnterpriseOne benchmark.

  • The "Day in the Life" benchmark tests the Oracle JD Edwards EnterpriseOne applications, running Oracle Fusion Middleware WebLogic Server 11g R1, Oracle Fusion Middleware Web Tier Utilities 11g HTTP server and JD Edwards EnterpriseOne 9.0.1 in Oracle Solaris Containers, together with the Oracle Database 11g Release 2.

  • The SPARC T3-1 server is 25% faster and has better response time than the IBM P750 POWER7 system, when executing the JD Edwards EnterpriseOne 9.0.1 Day in the Life test, online component.

  • The SPARC T3-1 server had 2.5x better space/performance (users per rack unit) than the IBM P750 POWER7 server.

  • The SPARC T3-1 server is 5x faster than the x86-based IBM x3650 M2 server system, when executing the JD Edwards EnterpriseOne 9.0.1 Day in the Life test, online component.

  • The SPARC T3-1 server had 5x better space/performance (users per rack unit) than the x86-based IBM x3650 M2 server.

  • The SPARC T3-1 server consolidated the application/web tier of the JD Edwards EnterpriseOne 9.0.1 application using Oracle Solaris Containers. Containers provide flexibility, easier maintenance and better CPU utilization of the server leaving processing capacity for additional growth.

  • The SPARC Enterprise M3000 server provides enterprise class RAS features for customers deploying the Oracle 11g Release 2 database software.

  • To obtain this leading result, a number of advanced Oracle technologies and features were used: Oracle Solaris 10, Oracle Solaris Containers, Oracle Java HotSpot Server VM, Oracle Fusion Middleware WebLogic Server 11g R1, Oracle Fusion Middleware Web Tier Utilities 11g, Oracle Database 11g Release 2, and the SPARC T3 and SPARC64 VII based servers.

Performance Landscape

JD Edwards EnterpriseOne DIL Online Component Performance Chart

System | Memory (GB) | OS | #user | JD Edwards Version | Rack Units | Response Time (sec)
SPARC T3-1, 1x1.65 GHz SPARC T3 | 128 | Solaris 10 | 5000 | 9.0.1 | 2U | 0.523
*IBM Power 750, 1x3.55 GHz POWER7 | 120 | IBM i7.1 | 4000 | 9.0 | 4U | 0.61
IBM Power 570, 4x4.2 GHz POWER6 | 128 | IBM i6.1 | 2400 | 8.12 | 4U | 1.129
IBM x3650M2, 2x2.93 GHz X5570 | 64 | OVM | 1000 | 9.0 | 2U | 0.29

* from http://www-03.ibm.com/systems/i/advantages/oracle/, IBM used Websphere
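One way to read the space/performance comparison is as online users supported per rack unit. That definition is an assumption, since the post does not spell it out. The sketch below applies it to the rows of the chart above, using hypothetical names.

```python
# (users, rack units) per system, copied from the chart above.
SYSTEMS = {
    "SPARC T3-1": (5000, 2),
    "IBM Power 750": (4000, 4),
    "IBM Power 570": (2400, 4),
    "IBM x3650M2": (1000, 2),
}


def users_per_rack_unit(system: str) -> float:
    """Assumed space/performance measure: online users divided by rack units."""
    users, rack_units = SYSTEMS[system]
    return users / rack_units


baseline = users_per_rack_unit("SPARC T3-1")
for name in SYSTEMS:
    density = users_per_rack_unit(name)
    print(f"{name}: {density:.0f} users/U, SPARC T3-1 advantage {baseline / density:.1f}x")
```

The user-count advantage alone is a separate figure: 5000 / 4000 users gives the 25% quoted against the IBM Power 750.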

Configuration Summary

Hardware Configuration:

1 x SPARC T3-1 server
1 x 1.65 GHz SPARC T3
128 GB memory
16 x 300 GB 10000 RPM SAS
1 x 1 GbE NIC
1 x SPARC Enterprise M3000
1 x 2.75 GHz SPARC64 VII
64 GB memory
1 x 1 GbE NIC
2 x StorageTek 2540/2501

Software Configuration:

JD Edwards EnterpriseOne 9.0.1 with Tools 8.98.3.3
Oracle Database 11g Release 2
Oracle Fusion Middleware 11g WebLogic server 11g R1 version 10.3.2
Oracle Fusion Middleware Web Tier Utilities 11g
Oracle Solaris 10 9/10
Mercury LoadRunner 9.10 with Oracle DIL kit for JD Edwards EnterpriseOne 9.0 update 1

Benchmark Description

Oracle's JD Edwards EnterpriseOne is an integrated applications suite of Enterprise Resource Planning software.

  • Oracle offers 70 JD Edwards EnterpriseOne application modules to support a diverse set of business operations.
  • Oracle's Day in the Life (DIL) kit is a suite of scripts that exercises the most common transactions of JD Edwards EnterpriseOne applications, including business processes such as payroll, sales order, purchase order, work order, and other manufacturing processes, such as ship confirmation. These are labeled by industry acronyms such as SCM, CRM, HCM, SRM and FMS.
  • Oracle's DIL kit's scripts execute transactions typical of a mid-sized manufacturing company.
  • The workload consists of online transactions. It does not include the batch processing job components.
  • LoadRunner is used to run the workload and collect the users' transaction response times as the number of users increases from 500 to 5000.
  • The key metric used to evaluate performance is the transaction response time, which is reported by LoadRunner.

Key Points and Best Practices

Two JD Edwards EnterpriseOne servers and two Oracle Fusion Middleware WebLogic Server 11g R1 instances, coupled with two Oracle Fusion Middleware 11g Web Tier HTTP Server instances, were hosted on the SPARC T3-1 server in four separate Oracle Solaris Containers to demonstrate consolidation of multiple application and web servers.

  • Each Oracle Solaris Container was bound to a separate processor set, with 40 virtual processors allocated to each EnterpriseOne Server, 16 virtual processors allocated to each WebServer container, and 16 to the default set. This was done to improve performance by using the physical memory closest to the processors, thereby reducing memory access latency and processor cross calls. The default processor set was used for network and disk interrupt handling.

  • The applications were executed in the FX scheduling class to improve performance by reducing the frequency of context switches.

  • A WebLogic vertical cluster with seven managed instances was configured on each WebServer container to load balance users' requests and to provide the infrastructure that enables scaling to a high number of users with ease of deployment and high availability.

  • The database server was run in an Oracle Solaris Container hosted on Oracle's SPARC Enterprise M3000 server.
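The processor-set split described above can be checked with a few lines of arithmetic. The following is only a sketch: the set names are made up for illustration, the contiguous ID ranges are a simplification rather than the actual binding used in the benchmark, and the total of 128 virtual processors assumes a SPARC T3 with 16 cores and 8 hardware threads per core.

```python
# Virtual-processor budget taken from the text; the set names are illustrative only.
PROCESSOR_SETS = [
    ("enterpriseone-1", 40),
    ("enterpriseone-2", 40),
    ("webserver-1", 16),
    ("webserver-2", 16),
    ("default", 16),  # handles network and disk interrupts
]

# Assumption: one SPARC T3 exposes 16 cores x 8 hardware threads.
TOTAL_VIRTUAL_PROCESSORS = 16 * 8


def assign_processor_ranges(sets, total):
    """Give each set a contiguous, non-overlapping range of processor IDs."""
    ranges = {}
    next_id = 0
    for name, count in sets:
        ranges[name] = range(next_id, next_id + count)
        next_id += count
    if next_id != total:
        raise ValueError(f"sets use {next_id} processors, expected {total}")
    return ranges


ranges = assign_processor_ranges(PROCESSOR_SETS, TOTAL_VIRTUAL_PROCESSORS)
for name, ids in ranges.items():
    print(f"{name:16s} {ids.start:3d}-{ids.stop - 1:3d} ({len(ids)} virtual processors)")
```

Under these assumptions the five sets account for every virtual processor exactly once, which is the property a real processor-set layout would also need.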

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 2/16/2011.

Saturday Jan 01, 2011

BestPerf Index 1 January 2011

This is an occasionally-generated index of previous entries in the BestPerf blog.

Dec 08, 2010 Sun Blade X6275 M2 Cluster with Sun Storage 7410 Performance Running Seismic Processing Reverse Time Migration
Dec 08, 2010 Sun Blade X6275 M2 Delivers Best Fluent (MCAE Application) Performance on Tested Configurations
Dec 08, 2010 Sun Blade X6275 M2 Server Module with Intel X5670 Processors SPEC CPU2006 Results
Dec 02, 2010 World Record TPC-C Result on Oracle's SPARC Supercluster with T3-4 Servers
Dec 02, 2010 World Record SPECweb2005 Result on SPARC T3-2 with Oracle iPlanet Web Server
Dec 02, 2010 World Record Performance on PeopleSoft Enterprise Financials Benchmark run on Sun SPARC Enterprise M4000 and M5000
Oct 26, 2010 3D VTI Reverse Time Migration Scalability On Sun Fire X2270-M2 Cluster with Sun Storage 7210
Oct 11, 2010 Sun SPARC Enterprise M9000 Server Delivers World Record Non-Clustered TPC-H @3000GB Performance
Sep 30, 2010 Consolidation of 30 x86 Servers onto One SPARC T3-2
Sep 29, 2010 SPARC T3-1 Delivers Record Number of Online Users on JD Edwards EnterpriseOne 9.0.1 Day in the Life Test
Sep 28, 2010 SPARC T3 Cryptography Performance Over 1.9x Increase in Throughput over UltraSPARC T2 Plus
Sep 28, 2010 SPARC T3-2 Delivers First Oracle E-Business X-Large Benchmark Self-Service (OLTP) Result
Sep 27, 2010 Sun Fire X2270 M2 Super-Linear Scaling of Hadoop Terasort and CloudBurst Benchmarks
Sep 27, 2010 SPARC T3-1 Shows Capabilities Running Online Auction Benchmark with Oracle Fusion Middleware
Sep 24, 2010 SPARC T3-2 sets World Record on SPECjvm2008 Benchmark
Sep 24, 2010 SPARC T3 Provides High Performance Security for Oracle Weblogic Applications
Sep 23, 2010 Sun Storage F5100 Flash Array with PCI-Express SAS-2 HBAs Achieves Over 17 GB/sec Read
Sep 23, 2010 SPARC T3-1 Performance on PeopleSoft Enterprise Financials 9.0 Benchmark
Sep 22, 2010 Oracle Solaris 10 9/10 ZFS OLTP Performance Improvements
Sep 22, 2010 SPARC T3-1 Supports 13,000 Users on Financial Services and Enterprise Application Integration Running Siebel CRM 8.1.1
Sep 21, 2010 ProMAX Performance and Throughput on Sun Fire X2270 and Sun Storage 7410
Sep 21, 2010 Sun Flash Accelerator F20 PCIe Cards Outperform IBM on SPC-1C
Sep 21, 2010 SPARC T3 Servers Deliver Top Performance on Oracle Communications Order and Service Management
Sep 20, 2010 Schlumberger's ECLIPSE 300 Performance Throughput On Sun Fire X2270 Cluster with Sun Storage 7410
Sep 20, 2010 Sun Fire X4470 4 Node Cluster Delivers World Record SAP SD-Parallel Benchmark Result
Sep 20, 2010 SPARC T3-4 Sets World Record Single Server Result on SPECjEnterprise2010 Benchmark
Aug 25, 2010 Transparent Failover with Solaris MPxIO and Oracle ASM
Aug 23, 2010 Repriced: SPC-1 Sun Storage 6180 Array (8Gb) 1.9x Better Than IBM DS5020 in Price-Performance
Aug 23, 2010 Repriced: SPC-2 (RAID 5 & 6 Results) Sun Storage 6180 Array (8Gb) Outperforms IBM DS5020 by up to 64% in Price-Performance
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ANSYS FLUENT Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Demonstrates Outstanding Single Node Performance on MSC.Nastran Benchmarks
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ABAQUS Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Sets World Record on SPEC OMP2001 Benchmark
Jun 29, 2010 Sun Fire X4170 M2 Sets World Record on SPEC CPU2006 Benchmark
Jun 29, 2010 Sun Blade X6270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4470 Sets World Records on SPEC OMP2001 Benchmarks
Jun 28, 2010 Sun Fire X4470 Sets World Record on SPEC CPU2006 Rate Benchmark
Jun 28, 2010 Sun Fire X4470 2-Node Configuration Sets World Record for SAP SD-Parallel Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Records on SPEC CPU2006 Rate Benchmarks
Jun 10, 2010 Hyperion Essbase ASO World Record on Sun SPARC Enterprise M5000
Jun 09, 2010 PeopleSoft Payroll 500K Employees on Sun SPARC Enterprise M5000 World Record
Jun 03, 2010 Sun SPARC Enterprise T5440 World Record SPECjAppServer2004
May 11, 2010 Per-core Performance Myth Busting
Apr 14, 2010 Oracle Sun Storage F5100 Flash Array Delivers World Record SPC-1C Performance
Apr 13, 2010 Oracle Sun Flash Accelerator F20 PCIe Card Accelerates Web Caching Performance
Apr 06, 2010 WRF Benchmark: X6275 Beats Power6
Mar 29, 2010 Sun Blade X6275/QDR IB/ Reverse Time Migration
Feb 23, 2010 IBM POWER7 SPECfp_rate2006: Poor Scaling? Or Configuration Confusion?
Jan 25, 2010 Sun/Solaris Leadership in SAP SD Benchmarks and HP claims
Jan 21, 2010 SPARC Enterprise M4000 PeopleSoft NA Payroll 240K Employees Performance (16 Streams)
Dec 16, 2009 Sun Fire X4640 Delivers World Record x86 Result on SPEC OMPL2001
Nov 24, 2009 Sun M9000 Fastest SAP 2-tier SD Benchmark on current SAP EP4 for SAP ERP 6.0 (Unicode)
Nov 20, 2009 Sun Blade X6275 cluster delivers leading results for Fluent truck_111m benchmark
Nov 20, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Nov 19, 2009 SPECmail2009: New World record on T5240 1.6GHz Sun 7310 and ZFS
Nov 18, 2009 Sun Flash Accelerator F20 PCIe Card Achieves 100K 4K IOPS and 1.1 GB/sec
Nov 05, 2009 New TPC-C World Record Sun/Oracle
Nov 02, 2009 Sun Blade X6275 Cluster Beats SGI Running Fluent Benchmarks
Nov 02, 2009 Sun Ultra 27 Delivers Leading Single Frame Buffer SPECviewperf 10 Results
Oct 28, 2009 SPC-2 Sun Storage 6780 Array RAID 5 & RAID 6 51% better $/performance than IBM DS5300
Oct 25, 2009 Sun C48 & Lustre fast for Seismic Reverse Time Migration using Sun X6275
Oct 25, 2009 Sun F5100 and Seismic Reverse Time Migration with faster Optimal Checkpointing
Oct 23, 2009 Wiki on performance best practices
Oct 20, 2009 Exadata V2 Information
Oct 15, 2009 Oracle Flash Cache - SGA Caching on Sun Storage F5100
Oct 13, 2009 Oracle Hyperion Sun M5000 and Sun Storage 7410
Oct 13, 2009 Sun T5440 Oracle BI EE Sun SPARC Enterprise T5440 World Record
Oct 13, 2009 SPECweb2005 on Sun SPARC Enterprise T5440 World Record using Solaris Containers and Sun Storage F5100 Flash
Oct 13, 2009 Oracle PeopleSoft Payroll (NA) Sun SPARC Enterprise M4000 and Sun Storage F5100 World Record Performance
Oct 13, 2009 SAP 2-tier SD Benchmark on Sun SPARC Enterprise M9000/32 SPARC64 VII
Oct 13, 2009 CP2K Life Sciences, Ab-initio Dynamics - Sun Blade 6048 Chassis with Sun Blade X6275 - Scalability and Throughput with Quad Data Rate InfiniBand
Oct 13, 2009 SAP 2-tier SD-Parallel on Sun Blade X6270 1-node, 2-node and 4-node
Oct 13, 2009 Halliburton ProMAX Oil & Gas Application Fast on Sun 6048/X6275 Cluster
Oct 13, 2009 SPECcpu2006 Results On MSeries Servers With Updated SPARC64 VII Processors
Oct 13, 2009 MCAE ABAQUS faster on Sun F5100 and Sun X4270 - Single Node World Record
Oct 12, 2009 MCAE ANSYS faster on Sun F5100 and Sun X4270
Oct 12, 2009 MCAE MCS/NASTRAN faster on Sun F5100 and Fire X4270
Oct 12, 2009 SPC-2 Sun Storage 6180 Array RAID 5 & RAID 6 Over 70% Better Price Performance than IBM
Oct 12, 2009 SPC-1 Sun Storage 6180 Array Over 70% Better Price Performance than IBM
Oct 12, 2009 Why Sun Storage F5100 is a good option for Peoplesoft NA Payroll Application
Oct 12, 2009 1.6 Million 4K IOPS in 1RU on Sun Storage F5100 Flash Array
Oct 11, 2009 TPC-C World Record Sun - Oracle
Oct 09, 2009 X6275 Cluster Demonstrates Performance and Scalability on WRF 2.5km CONUS Dataset
Oct 02, 2009 Sun X4270 VMware VMmark benchmark achieves excellent result
Sep 22, 2009 Sun X4270 Virtualized for Two-tier SAP ERP 6.0 Enhancement Pack 4 (Unicode) Standard Sales and Distribution (SD) Benchmark
Sep 01, 2009 String Searching - Sun T5240 & T5440 Outperform IBM Cell Broadband Engine
Aug 28, 2009 Sun X4270 World Record SAP-SD 2-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Aug 27, 2009 Sun SPARC Enterprise T5240 with 1.6GHz UltraSPARC T2 Plus Beats 4-Chip IBM Power 570 POWER6 System on SPECjbb2005
Aug 26, 2009 Sun SPARC Enterprise T5220 with 1.6GHz UltraSPARC T2 Sets Single Chip World Record on SPECjbb2005
Aug 12, 2009 SPECmail2009 on Sun SPARC Enterprise T5240 and Sun Java System Messaging Server 6.3
Jul 23, 2009 World Record Performance of Sun CMT Servers
Jul 22, 2009 Why does 1.6 beat 4.7?
Jul 21, 2009 Zeus ZXTM Traffic Manager World Record on Sun T5240
Jul 21, 2009 Sun T5440 Oracle BI EE World Record Performance
Jul 21, 2009 Sun T5440 World Record SAP-SD 4-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Jul 21, 2009 1.6 GHz SPEC CPU2006 - Rate Benchmarks
Jul 21, 2009 Sun Blade T6320 World Record SPECjbb2005 performance
Jul 21, 2009 New SPECjAppServer2004 Performance on the Sun SPARC Enterprise T5440
Jul 21, 2009 Sun T5440 SPECjbb2005 Beats IBM POWER6 Chip-to-Chip
Jul 21, 2009 New CMT results coming soon....
Jul 14, 2009 Vdbench: Sun StorageTek Vdbench, a storage I/O workload generator.
Jul 14, 2009 Storage performance and workload analysis using Swat.
Jul 10, 2009 World Record TPC-H@300GB Price-Performance for Windows on Sun Fire X4600 M2
Jul 06, 2009 Sun Blade 6048 Chassis with Sun Blade X6275: RADIOSS Benchmark Results
Jul 03, 2009 SPECmail2009 on Sun Fire X4275+Sun Storage 7110: Mail Server System Solution
Jun 30, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Jun 26, 2009 Sun Fire X2270 Cluster Fluent Benchmark Results
Jun 25, 2009 Sun SSD Server Platform Bandwidth and IOPS (Speeds & Feeds)
Jun 24, 2009 I/O analysis using DTrace
Jun 23, 2009 New CPU2006 Records: 3x better integer throughput, 9x better fp throughput
Jun 23, 2009 Sun Blade X6275 results capture Top Places in CPU2006 SPEED Metrics
Jun 19, 2009 Pointers to Java Performance Tuning resources
Jun 19, 2009 SSDs in HPC: Reducing the I/O Bottleneck BluePrint Best Practices
Jun 17, 2009 The Performance Technology group wiki is alive!
Jun 17, 2009 Performance of Sun 7410 and 7310 Unified Storage Array Line
Jun 16, 2009 Sun Fire X2270 MSC/Nastran Vendor_2008 Benchmarks
Jun 15, 2009 Sun Fire X4600 M2 Server Two-tier SAP ERP 6.0 (Unicode) Standard Sales and Distribution (SD) Benchmark
Jun 12, 2009 Correctly comparing SAP-SD Benchmark results
Jun 12, 2009 OpenSolaris Beats Linux on memcached Sun Fire X2270
Jun 11, 2009 SAS Grid Computing 9.2 utilizing the Sun Storage 7410 Unified Storage System
Jun 10, 2009 Using Solaris Resource Management Utilities to Improve Application Performance
Jun 09, 2009 Free Compiler Wins Nehalem Race by 2x
Jun 08, 2009 Variety of benchmark results to be posted on BestPerf
Jun 05, 2009 Interpreting Sun's SPECpower_ssj2008 Publications
Jun 03, 2009 Wide Variety of Topics to be discussed on BestPerf
Jun 03, 2009 Welcome to BestPerf group blog!

Wednesday Dec 08, 2010

Sun Blade X6275 M2 Cluster with Sun Storage 7410 Performance Running Seismic Processing Reverse Time Migration

This Oil & Gas benchmark highlights both the computational performance improvements of the Sun Blade X6275 M2 server module over the previous generation server and the linear scalability achievable for the total application throughput using a Sun Storage 7410 system to deliver almost 2 GB/sec effective I/O write performance.

Oracle's Sun Storage 7410 system attached via 10 Gigabit Ethernet to a cluster of Oracle's Sun Blade X6275 M2 server modules was used to demonstrate the performance of a 3D VTI Reverse Time Migration application, a heavily used geophysical imaging and modeling application for Oil & Gas Exploration. The total application throughput scaling and computational kernel performance improvements are presented for imaging two production sized grids using 800 input samples.

  • The Sun Blade X6275 M2 server module showed up to a 40% performance improvement over the previous generation server module with super-linear scalability to 16 nodes for the 9-Point Stencil used in this Reverse Time Migration computational kernel.

  • The balanced combination of Oracle's Sun Storage 7410 system over 10 GbE to the Sun Blade X6275 M2 server module cluster showed linear scalability for the total application throughput, including the I/O and MPI communication, to produce a final 3-D seismic depth imaged cube for interpretation.

  • The final image write time from the Sun Blade X6275 M2 server module nodes to Oracle's Sun Storage 7410 system achieved 10GbE line speed of 1.25 GBytes/second or better write performance. The effects of I/O buffer caching on the Sun Blade X6275 M2 server module nodes and 34 GByte write optimized cache on the Sun Storage 7410 system gave up to 1.8 GBytes/second effective write performance.

Performance Landscape

Server Generational Performance Improvements

Performance improvements for the Reverse Time Migration computational kernel using a Sun Blade X6275 M2 cluster are compared to the previous generation Sun Blade X6275 cluster. Hyper-threading was enabled for both configurations allowing 24 OpenMP threads for the Sun Blade X6275 M2 server module nodes and 16 for the Sun Blade X6275 server module nodes.

Sun Blade X6275 M2 Performance Improvements
         Grid Size - 1243 x 1151 x 1231              Grid Size - 2486 x 1151 x 1231
Number   X6275 Kernel  X6275 M2 Kernel  X6275 M2     X6275 Kernel  X6275 M2 Kernel  X6275 M2
Nodes    Time (sec)    Time (sec)       Speedup      Time (sec)    Time (sec)       Speedup
16       306           242              1.3          728           576              1.3
14       355           271              1.3          814           679              1.2
12       435           346              1.3          945           797              1.2
10       541           390              1.4          1156          890              1.3
8        726           555              1.3          1511          1193             1.3

Application Scaling

Performance and scaling results of the total application, including I/O, for the reverse time migration demonstration application are presented. Results were obtained using a Sun Blade X6275 M2 server cluster with a Sun Storage 7410 system for the file server. The servers were running with hyperthreading enabled, allowing for 24 OpenMP threads per server node.

Application Scaling Across Multiple Nodes
         Grid Size - 1243 x 1151 x 1231               Grid Size - 2486 x 1151 x 1231
Number   Total Time  Kernel Time  Total    Kernel     Total Time  Kernel Time  Total    Kernel
Nodes    (sec)       (sec)        Speedup  Speedup    (sec)       (sec)        Speedup  Speedup
16       501         242          2.1\*    2.3\*      1060        576          2.0      2.1\*
14       583         271          1.8      2.0        1219        679          1.7      1.8
12       681         346          1.6      1.6        1420        797          1.5      1.5
10       807         390          1.3      1.4        1688        890          1.2      1.3
8        1058        555          1.0      1.0        2085        1193         1.0      1.0

\* Super-linear scaling due to the compute kernel fitting better into available cache for larger node counts
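The speedup columns can be recomputed from the raw times, taking the 8-node run as the baseline. The sketch below uses the 1243 x 1151 x 1231 grid values from the table; the function name and the "efficiency" column (measured speedup divided by the ideal node ratio) are additions for illustration. An efficiency above 1.0 corresponds to super-linear scaling relative to the baseline.

```python
# (nodes, total seconds, kernel seconds) for the 1243 x 1151 x 1231 grid.
RESULTS = [
    (8, 1058, 555),
    (10, 807, 390),
    (12, 681, 346),
    (14, 583, 271),
    (16, 501, 242),
]


def scaling(results):
    """Return (nodes, total speedup, kernel speedup, kernel efficiency) per row.

    Speedups are relative to the first row; efficiency is the kernel speedup
    divided by the ideal speedup (nodes / baseline nodes).
    """
    base_nodes, base_total, base_kernel = results[0]
    rows = []
    for nodes, total, kernel in results:
        ideal = nodes / base_nodes
        total_speedup = base_total / total
        kernel_speedup = base_kernel / kernel
        rows.append((nodes, total_speedup, kernel_speedup, kernel_speedup / ideal))
    return rows


for nodes, total_speedup, kernel_speedup, efficiency in scaling(RESULTS):
    print(f"{nodes:2d} nodes: total {total_speedup:.1f}x, "
          f"kernel {kernel_speedup:.1f}x, kernel efficiency {efficiency:.2f}")
```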

Image File Effective Write Performance

The performance for writing the final 3D image from a Sun Blade X6275 M2 server cluster over 10 Gigabit Ethernet to a Sun Storage 7410 system is presented. Each server allocated one core per node for MPI I/O, thus allowing 22 OpenMP compute threads per node with hyperthreading enabled. Captured performance analytics from the Sun Storage 7410 system indicate effective use of its 34 Gigabyte write optimized cache.

Image File Effective Write Performance
         Grid Size - 1243 x 1151 x 1231    Grid Size - 2486 x 1151 x 1231
Number   Write Time  Write Performance     Write Time  Write Performance
Nodes    (sec)       (GB/sec)              (sec)       (GB/sec)
16       4.8         1.5                   10.2        1.4
14       5.0         1.4                   10.2        1.4
12       4.0         1.8                   11.3        1.3
10       4.3         1.6                   9.1         1.6
8        4.6         1.5                   9.7         1.5

Note: Performance results better than 1.3GB/sec related to I/O buffer caching on server nodes.
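The relationship between write time and effective write performance can be sketched as image size divided by elapsed time. The text does not state the size of one image sample, so the 4 bytes per grid point below is an assumption (single-precision values). With that assumption the sketch reproduces the write-performance column for the 1243 x 1151 x 1231 grid to one decimal place.

```python
BYTES_PER_SAMPLE = 4  # assumption: one single-precision value per grid point


def effective_write_gb_per_sec(grid, write_seconds, bytes_per_sample=BYTES_PER_SAMPLE):
    """Effective write rate in GB/sec (1 GB = 1e9 bytes) for one 3D image."""
    nx, ny, nz = grid
    image_bytes = nx * ny * nz * bytes_per_sample
    return image_bytes / write_seconds / 1e9


GRID = (1243, 1151, 1231)
for nodes, seconds in [(16, 4.8), (14, 5.0), (12, 4.0), (10, 4.3), (8, 4.6)]:
    rate = effective_write_gb_per_sec(GRID, seconds)
    print(f"{nodes:2d} nodes: {seconds} s -> {rate:.1f} GB/sec")
```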

Configuration Summary

Hardware Configuration:

8 x dual-node Sun Blade X6275 M2 server modules, each node with
2 x 2.93 GHz Intel Xeon X5670 processors
48 GB memory (12 x 4 GB at 1333 MHz)
1 x QDR InfiniBand Host Channel Adapter

Sun Datacenter InfiniBand Switch IB-36
Sun Network 10 GbE Switch 72p

Sun Storage 7410 system connected via 10 Gigabit Ethernet
4 x 17 GB STEC ZeusIOPs SSD mirrored - 34 GB
40 x 750 GB 7500 RPM Seagate SATA disks mirrored - 14.4 TB
No L2ARC Readzilla Cache

Software Configuration:

Oracle Enterprise Linux Server release 5.5
Oracle Message Passing Toolkit 8.2.1c (for MPI)
Oracle Solaris Studio 12.2 C++, Fortran, OpenMP

Benchmark Description

This Vertical Transverse Isotropy (VTI) Anisotropic Reverse Time Depth Migration (RTM) application measures the total time it takes to image 800 samples of various production size grids and write the final image to disk for the next work flow step involving 3-D seismic volume interpretation. In doing so, it reports the compute, interprocessor communication, and I/O performance of the individual functions that comprise the total solution. Unlike most references for Reverse Time Migration, which focus solely on the performance of the 3D stencil compute kernel, this demonstration code additionally reports the total throughput involved in processing large data sets with a full 3D Anisotropic RTM application. It provides valuable insight into configuration and sizing for specific seismic processing requirements. The performance effects of new processors, interconnects, I/O subsystems, and software technologies can be evaluated while solving a real Exploration business problem.

This benchmark study uses the "in-core" implementation of this demonstration code, where each node reads in only the trace, velocity, and conditioning data to be processed by that node plus a 4 element array pad (based on spatial order 8) shared with its neighbors to the left and right during the initialization phase. It maintains previous, current, and next wavefield state information for each of the source, receiver, and anisotropic wavefields in memory. The second two grid dimensions used in this benchmark are specifically chosen to be prime numbers to exaggerate the effects of data alignment. Algorithm adaptations for processing higher orders in space and alternative "out-of-core" solutions using SSDs for wave state checkpointing are implemented in this demonstration application to better understand the effects of problem size scaling. Care is taken to appropriately handle absorption boundary conditioning and a variety of imaging conditions.

RTM Application Structure:

Read Processing Parameter File, Determine Domain Decomposition, Initialize Data Structures, and Allocate Memory.

Read Velocity, Epsilon, and Delta Data Based on Domain Decomposition and create source, receiver, & anisotropic previous, current, and next wave states.

First Loop over Time Steps

    Compute 3D Stencil for Source Wavefield (a,s) - 8th order in space, 2nd order in time
    Propagate over Time to Create s(t,z,y,x) & a(t,z,y,x)
    Inject Estimated Source Wavelet
    Apply Absorption Boundary Conditioning (a)
    Update Wavefield States and Pointers
    Write Snapshot of Wavefield (out-of-core) or Push Wavefield onto Stack (in-core)
    Communicate Boundary Information

Second Loop over Time Steps

    Compute 3D Stencil for Receiver Wavefield (a,r) - 8th order in space, 2nd order in time
    Propagate over Time to Create r(t,z,y,x) & a(t,z,y,x)
    Read Receiver Trace and Inject Receiver Wavelet
    Apply Absorption Boundary Conditioning (a)
    Update Wavefield States and Pointers
    Communicate Boundary Information
    Read in Source Wavefield Snapshot (out-of-core) or Pop Off of Stack (in-core)
    Cross-correlate Source and Receiver Wavefields
    Update image using image conditioning parameters

Write 3D Depth Image i(z,x,y) = Sum over time steps s(t,z,x,y) \* r(t,z,x,y) or other imaging conditions.
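The in-core push/pop pattern and the cross-correlation imaging condition can be illustrated with a toy sketch. It is not the demonstration application: wave propagation, boundary conditioning, and MPI communication are left out, each wavefield is flattened to a short list, and the function name is made up. It also assumes that the second loop produces the receiver wavefield in reverse time order, which is what makes a stack (last in, first out) line up the matching time steps.

```python
def cross_correlation_image(source_snapshots, receiver_snapshots_reversed):
    """Accumulate image[i] = sum over t of s(t, i) * r(t, i).

    source_snapshots: source wavefield at t = 0, 1, ..., T-1 (first loop).
    receiver_snapshots_reversed: receiver wavefield at t = T-1, ..., 0
        (assumed order of the second loop).
    """
    if len(source_snapshots) != len(receiver_snapshots_reversed):
        raise ValueError("both wavefields need the same number of time steps")

    # First loop: push each source wavefield state onto the in-core stack.
    stack = []
    for snapshot in source_snapshots:
        stack.append(list(snapshot))

    image = [0.0] * len(stack[0]) if stack else []

    # Second loop: pop the source state for the same time step and correlate.
    for receiver in receiver_snapshots_reversed:
        source = stack.pop()
        for i, (s, r) in enumerate(zip(source, receiver)):
            image[i] += s * r
    return image


# Two time steps on a two-point "grid".
source = [[1.0, 2.0], [3.0, 4.0]]               # t = 0, t = 1
receiver_rev = [[10.0, 20.0], [100.0, 200.0]]   # t = 1, t = 0
print(cross_correlation_image(source, receiver_rev))
```

The out-of-core variant would replace the stack with snapshot files written in the first loop and read back in the second.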

Key Points and Best Practices

This demonstration application represents a full Reverse Time Migration solution. Many references to the RTM application tend to focus on the compute kernel and ignore the complexity that the input, communication, and output bring to the task.

Image File MPI Write Performance Tuning

Changing the Image File Write from MPI non-blocking to MPI blocking and setting Oracle Message Passing Toolkit MPI environment variables revealed an 18x improvement in write performance to the Sun Storage 7410 system going from:

    86.8 to 4.8 seconds for the 1243 x 1151 x 1231 grid size
    183.1 to 10.2 seconds for the 2486 x 1151 x 1231 grid size
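As a quick check of the quoted factor, the improvement is the ratio of the write time before tuning to the write time after. A minimal sketch using the two measurements above (the names are illustrative):

```python
# (seconds before, seconds after) for each grid size, from the text.
WRITE_TIMES = {
    "1243 x 1151 x 1231": (86.8, 4.8),
    "2486 x 1151 x 1231": (183.1, 10.2),
}


def improvement_factor(before_seconds, after_seconds):
    """How many times faster the tuned write is."""
    return before_seconds / after_seconds


for grid, (before, after) in WRITE_TIMES.items():
    print(f"{grid}: {improvement_factor(before, after):.1f}x")
```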

The Swat Sun Storage 7410 analytics data capture indicated an initial write performance of about 100 MB/sec with the MPI non-blocking implementation. After modifying to MPI blocking writes, Swat showed between 1.3 and 1.8 GB/sec with up to 13000 write ops/sec to write the final output image. The Swat results are consistent with the actual measured performance and provide valuable insight into the Reverse Time Migration application I/O performance.

The reason for this vast improvement has to do with whether the MPI file mode is sequential or not (MPI_MODE_SEQUENTIAL, O_SYNC, O_DSYNC). The MPI non-blocking routines, MPI_File_iwrite_at and MPI_Wait, typically used for overlapping I/O and computation, do not support sequential file access mode. Therefore, the application could not take full advantage of the Sun Storage 7410 system write optimized cache. In contrast, the MPI blocking routine, MPI_File_write_at, defaults to MPI sequential mode and the performance advantages of the write optimized cache are realized. Since writing the final image is at the end of RTM execution, there is no need to overlap the I/O with computation.

Additional MPI parameters used:

    setenv SUNW_MP_PROCBIND true
    setenv MPI_SPIN 1
    setenv MPI_PROC_BIND 1

Adjusting the Level of Multithreading for Performance

The level of multithreading (8, 10, 12, 22, or 24) for various components of the RTM should be adjustable based on the type of computation taking place. It is best to use the OpenMP num_threads clause to adjust the level of multithreading for each particular work task. Use numactl to specify how the threads are allocated to cores in accordance with the OpenMP parallelism level.

See Also

Disclosure Statement

Copyright 2010, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 12/07/2010.

Sun Blade X6275 M2 Delivers Best Fluent (MCAE Application) Performance on Tested Configurations

This Manufacturing Engineering benchmark highlights the performance advantage the Sun Blade X6275 M2 server module offers over IBM, Cray, and SGI solutions as shown by the ANSYS FLUENT fluid dynamics application.

A cluster of eight of Oracle's Sun Blade X6275 M2 server modules delivered outstanding performance running the FLUENT 12 benchmark test suite.

  • The Sun Blade X6275 M2 server module cluster delivered the best results in all 36 of the test configurations run, outperforming the best posted results by as much as 42%.
  • The Sun Blade X6275 M2 server module demonstrated up to 76% performance improvement over the previous generation Sun Blade X6275 server module.

Performance Landscape

In the following tables, results are "Ratings" (bigger is better).
Rating = No. of sequential runs of test case possible in 1 day: 86,400/(Total Elapsed Run Time in Seconds)
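The rating formula is easy to apply in both directions. The sketch below (function names are illustrative) converts an elapsed time into a rating and a published rating back into the elapsed time it implies.

```python
SECONDS_PER_DAY = 86_400


def rating(elapsed_seconds):
    """Number of sequential runs of a test case that fit into one day."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return SECONDS_PER_DAY / elapsed_seconds


def elapsed_seconds(rating_value):
    """Total elapsed run time implied by a rating."""
    if rating_value <= 0:
        raise ValueError("rating must be positive")
    return SECONDS_PER_DAY / rating_value


# Example: the 96-core eddy 417k rating of 9340.5 from the tables below.
print(f"{elapsed_seconds(9340.5):.2f} seconds per run")
```

Because the rating is inversely proportional to run time, a ratio of two ratings is also the ratio of the corresponding run times, which is how the speedup rows in the tables can be read.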

The following table compares results on the basis of core count, irrespective of processor generation. This means that in some cases, i.e., for the 32-core and 64-core configurations, systems with the Intel Xeon X5670 six-core processors did not utilize quite all of the cores available for the specified processor count.


FLUENT 12 Benchmark Test Suite

Competitive Comparisons

System              Processors  Cores  Benchmark Test Case Ratings
                                       eddy      turbo     aircraft  sedan     truck     truck_poly
                                       417k      500k      2m        4m        14m       14m

Sun Blade X6275 M2  16          96     9340.5    39272.7   8307.7    8533.3    903.8     786.9
Best Posted         24          96     -         -         7562.4    -         797.0     712.9
Best Posted         16          96     7337.6    33553.4   6533.1    5989.6    739.1     683.5

Sun Blade X6275 M2  11          64     6306.6    27212.6   5592.2    5158.2    568.8     518.9
Best Posted         16          64     5556.3    26381.7   5494.4    4902.1    566.6     518.6

Sun Blade X6275 M2  8           48     4620.3    19093.9   4080.3    3251.2    376.0     359.4
Best Posted         8           48     4494.1    18989.0   3990.8    3185.3    372.7     354.5

Sun Blade X6275 M2  6           32     4061.1    15091.7   3275.8    3013.1    299.5     267.8
Best Posted         8           32     3404.9    14832.6   3211.9    2630.1    286.7     266.7

Sun Blade X6275 M2  4           24     2751.6    10441.1   2161.4    1907.3    188.2     182.5
Best Posted         6           24     1458.2    9626.7    1820.9    1747.2    185.1     180.8
Best Posted         4           24     2565.7    10164.7   2109.9    1608.2    187.1     180.8

Sun Blade X6275 M2  2           12     1429.9    5358.1    1097.5    813.2     95.9      95.9
Best Posted         2           12     1338.0    5308.8    1073.3    808.6     92.9      94.4



The following table compares results on the basis of processor count showing inter-generational processor performance improvement.


FLUENT 12 Benchmark Test Suite

Intergenerational Comparisons

System              Processors  Cores  Benchmark Test Case Ratings
                                       eddy      turbo     aircraft  sedan     truck     truck_poly
                                       417k      500k      2m        4m        14m       14m

Sun Blade X6275 M2  16          96     9340.5    39272.7   8307.7    8533.3    903.8     786.9
Sun Blade X6275     16          64     5308.8    26790.7   5574.2    5074.9    547.2     525.2
X6275 M2 : X6275    16          -      1.76      1.47      1.49      1.68      1.65      1.50

Sun Blade X6275 M2  8           48     4620.3    19093.9   4080.3    3251.2    376.0     359.4
Sun Blade X6275     8           32     3066.5    13768.9   3066.5    2602.4    289.0     270.3
X6275 M2 : X6275    8           -      1.51      1.39      1.33      1.25      1.30      1.33

Sun Blade X6275 M2  4           24     2751.6    10441.1   2161.4    1907.3    188.2     182.5
Sun Blade X6275     4           16     1714.3    7545.9    1519.1    1345.8    144.4     141.8
X6275 M2 : X6275    4           -      1.61      1.38      1.42      1.42      1.30      1.29

Sun Blade X6275 M2  2           12     1429.9    5358.1    1097.5    813.2     95.9      95.9
Sun Blade X6275     2           8      931.8     4061.1    827.2     681.5     73.0      73.8
X6275 M2 : X6275    2           -      1.53      1.32      1.33      1.19      1.31      1.30
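The "X6275 M2 : X6275" rows are simply the newer system's rating divided by the older system's rating for the same processor count. A short sketch for the 16-processor rows (variable and function names are illustrative):

```python
TEST_CASES = ["eddy 417k", "turbo 500k", "aircraft 2m", "sedan 4m", "truck 14m", "truck_poly 14m"]

# Ratings for the 16-processor configurations, from the table above.
X6275_M2 = [9340.5, 39272.7, 8307.7, 8533.3, 903.8, 786.9]
X6275 = [5308.8, 26790.7, 5574.2, 5074.9, 547.2, 525.2]


def rating_ratios(new_ratings, old_ratings):
    """Per-test-case ratio of new to old ratings, rounded to two decimals."""
    return [round(new / old, 2) for new, old in zip(new_ratings, old_ratings)]


ratios = rating_ratios(X6275_M2, X6275)
for name, ratio in zip(TEST_CASES, ratios):
    print(f"{name:15s} {ratio:.2f}")
print(f"largest improvement: {round((max(ratios) - 1) * 100)}%")
```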

Configuration Summary

Hardware Configuration:

8 x Sun Blade X6275 M2 server modules, each with
4 Intel Xeon X5670 2.93 GHz processors, turbo enabled
96 GB memory 1333 MHz
2 x 24 GB SATA-based Sun Flash Modules
2 x QDR InfiniBand Host Channel Adapter
Sun Datacenter InfiniBand Switch IB-36

Software Configuration:

Oracle Enterprise Linux Server 5.5
ANSYS FLUENT V12.1.2
ANSYS FLUENT Benchmark Test Suite

Benchmark Description

The following description is from the ANSYS FLUENT website:

The FLUENT benchmarks suite comprises of a set of test cases covering a large range of mesh sizes, physical models and solvers representing typical industry usage. The cases range in size from a few 100 thousand cells to more than 100 million cells. Both the segregated and coupled implicit solvers are included, as well as hexahedral, mixed and polyhedral cell cases. This broad coverage is expected to demonstrate the breadth of FLUENT performance on a variety of hardware platforms and test cases.

The performance of a CFD code will depend on several factors, including size and topology of the mesh, physical models, numerics and parallelization, compilers and optimization, in addition to performance characteristics of the hardware where the simulation is performed. The principal objective of this benchmark suite is to provide comprehensive and fair comparative information of the performance of FLUENT on available hardware platforms.

About the ANSYS FLUENT 12 Benchmark Test Suite

    CFD models tend to be very large where grid refinement is required to accurately capture conditions in the boundary layer region adjacent to the body over which flow is occurring. Fine grids are also required to determine accurate turbulence conditions. As such, these models can run for many hours or even days, as well as use a large number of processors.

Key Points and Best Practices

  • ANSYS FLUENT has not yet been certified by the vendor on Oracle Enterprise Linux (OEL). However, the ANSYS FLUENT benchmark tests have been run successfully on Oracle hardware running OEL as is (i.e. with NO changes or modifications).
  • The performance improvement of the Sun Blade X6275 M2 server module over the previous generation Sun Blade X6275 server module was due to two main factors: the increased core count per processor (6 vs. 4), and the more optimal, iterative dataset partitioning scheme used for the Sun Blade X6275 M2 server module.

See Also

Disclosure Statement

All information on the FLUENT website (http://www.fluent.com) is Copyrighted 1995-2010 by ANSYS Inc. Results as of December 06, 2010.

Tuesday Dec 07, 2010

Sun Blade X6275 M2 Server Module with Intel X5670 Processors SPEC CPU2006 Results

Results are presented for Oracle's Sun Blade X6275 M2 server module running the SPEC CPU2006 benchmark suite.

  • The dual-node Sun Blade X6275 M2 server module, equipped with two Intel Xeon X5670 2.93 GHz processors per node and running the Oracle Enterprise Linux 5.5 operating system, delivered the best SPECint_rate2006 and SPECfp_rate2006 benchmark results for all systems based on the Intel Xeon processor 5000 sequence.

  • With a SPECint_rate2006 benchmark result of 679, the Sun Blade X6275 M2 server module, with two compute nodes per blade, delivers maximum performance for space constrained environments.

  • Comparing Oracle's dual-node blade to HP's dual-node blade server, based on their single node performance, the Sun Blade X6275 M2 server module SPECfp_rate2006 score of 241 outperforms the best published HP ProLiant BL2X220c G5 server score by 3.2x.

  • A single node of a Sun Blade X6275 M2 server module using 2.93 GHz Intel Xeon X5670 processors delivered a 37% improvement in SPECint_rate2006 benchmark results and a 22% improvement in SPECfp_rate2006 benchmark results compared to the previous generation Sun Blade X6275 server module.

  • Both nodes of a Sun Blade X6275 M2 server module using 2.93 GHz Intel Xeon X5670 processors delivered a 59% improvement on the SPECint_rate2006 benchmark and a 40% improvement on the SPECfp_rate2006 benchmark compared to the previous generation Sun Blade X6275 server module, based on the base metrics (651 vs. 410 and 465 vs. 332).

Performance Landscape

SPEC CPU2006 results comparing blade systems using Intel Xeon processor 5000 sequence based CPUs.

System                             SPECint_rate2006  SPECfp_rate2006
                                   base    peak      base    peak
Sun Blade X6275 M2 2-nodes, X5670  651     679       465     474
Sun Blade X6275 M2 1-node, X5670   326     348       234     241
IBM BladeCenter HS22V, X5680       352     377       246     254
Dell PowerEdge M710, X5680         355     380       247     256
HP BL460c G7, X5670                324     347       233     241
HP BL2X220c G5 1-node, E5450       106     132       67.4    74.8

SPEC CPU2006 results generational comparison between the Sun Blade X6275 M2 server module and the Sun Blade X6275 server module.

System                              SPECint_rate2006   SPECfp_rate2006
                                    base     peak      base     peak
Sun Blade X6275 M2 2-nodes, X5670   651      679       465      474
Sun Blade X6275 2-nodes, X5570      410      478       332      355
Sun Blade X6275 M2 1-node, X5670    326      348       234      241
Sun Blade X6275 1-node, X5570       238      253       191      197

Results in the above tables are from www.spec.org and this report as of 12/6/2010.
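
The improvement percentages quoted in the bullets can be re-derived from the two tables above. The following is a minimal sketch, not part of the original report: the dictionary layout and function names are illustrative, and only the scores are taken from the tables. With these values, the two-node figures (59% and 40%) appear to correspond to the base columns; the peak columns give smaller gains.

```python
# Scores copied from the tables above as (base, peak) pairs.
# The system labels are shortened for illustration.
SCORES = {
    "X6275 M2 2-nodes": {"int": (651, 679), "fp": (465, 474)},
    "X6275 2-nodes": {"int": (410, 478), "fp": (332, 355)},
    "X6275 M2 1-node": {"int": (326, 348), "fp": (234, 241)},
    "X6275 1-node": {"int": (238, 253), "fp": (191, 197)},
    "HP BL2X220c G5 1-node": {"int": (106, 132), "fp": (67.4, 74.8)},
}


def improvement_pct(new, old):
    """Percentage by which the score `new` exceeds the score `old`."""
    return (new / old - 1.0) * 100.0


def compare(new_system, old_system):
    """Return {metric: (base_pct, peak_pct)} for new_system vs. old_system."""
    result = {}
    for metric in ("int", "fp"):
        new_base, new_peak = SCORES[new_system][metric]
        old_base, old_peak = SCORES[old_system][metric]
        result[metric] = (
            improvement_pct(new_base, old_base),
            improvement_pct(new_peak, old_peak),
        )
    return result


def fp_peak_ratio(system, reference):
    """Ratio of SPECfp_rate2006 peak scores, system over reference."""
    return SCORES[system]["fp"][1] / SCORES[reference]["fp"][1]


if __name__ == "__main__":
    pairs = (
        ("X6275 M2 1-node", "X6275 1-node"),
        ("X6275 M2 2-nodes", "X6275 2-nodes"),
    )
    for new, old in pairs:
        for metric, (base, peak) in compare(new, old).items():
            print(f"{new} vs {old}, {metric}: base +{base:.1f}%, peak +{peak:.1f}%")
    ratio = fp_peak_ratio("X6275 M2 1-node", "HP BL2X220c G5 1-node")
    print(f"SPECfp_rate2006 peak vs HP BL2X220c G5 1-node: {ratio:.1f}x")
```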

Configuration Summary and Results

Hardware Configuration:

Sun Blade X6275 M2 server module, 2 nodes and each node has
2 x 2.93 GHz Intel Xeon X5670 processors, turbo enabled
96 GB, (12 x 8 GB DDR3-1333 DIMM)
Sun Storage 7410 System via NFS

Software Configuration:

Oracle Enterprise Linux Server release 5.5, kernel 2.6.18-194.el5
Intel 11.1
MicroQuill SmartHeap Library V8.1
SPEC CPU2006 V1.1

Results Summary:

Sun Blade X6275 M2, both nodes 651 SPECint_rate_base2006 679 SPECint_rate2006
Sun Blade X6275 M2, both nodes 465 SPECfp_rate_base2006 474 SPECfp_rate2006
Sun Blade X6275 M2, one node 326 SPECint_rate_base2006 348 SPECint_rate2006
Sun Blade X6275 M2, one node 234 SPECfp_rate_base2006 241 SPECfp_rate2006
Sun Blade X6275 M2, one node 36.2 SPECint_base2006 39.0 SPECint2006

Benchmark Description

SPEC CPU2006 is SPEC's most popular benchmark, with over 14000 results published in the years since it was introduced. It measures:

  • "Speed" - single copy performance of chip, memory, compiler
  • "Rate" - multiple copy (throughput)

The rate metrics are used for the throughput-oriented systems described on this page. These metrics include:

  • SPECint_rate2006: throughput for 12 integer benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
  • SPECfp_rate2006: throughput for 17 floating point benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

There are base variants of both the above metrics that require more conservative compilation. In particular, all benchmarks of a particular programming language must use the same compilation flags.
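
As a rough illustration of how a rate metric is assembled: SPEC CPU2006 computes a per-benchmark rate ratio as the number of copies multiplied by the reference time divided by the elapsed time, and reports the geometric mean of those ratios as the overall score. The sketch below follows that definition; the function names and the timing numbers are made up for illustration and are not taken from any published result.

```python
import math


def rate_ratio(copies, reference_seconds, elapsed_seconds):
    """Per-benchmark rate ratio: copies * reference time / elapsed time."""
    return copies * reference_seconds / elapsed_seconds


def geometric_mean(values):
    """Geometric mean of a non-empty sequence of positive numbers."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def overall_rate(runs):
    """Overall rate score for an iterable of (copies, reference, elapsed)."""
    return geometric_mean(rate_ratio(*run) for run in runs)


if __name__ == "__main__":
    # Hypothetical timings: 24 copies of three benchmarks.
    runs = [
        (24, 9000.0, 700.0),
        (24, 8000.0, 650.0),
        (24, 10000.0, 800.0),
    ]
    print(f"illustrative overall rate: {overall_rate(runs):.1f}")
```

Because the overall score is a geometric mean, a large gain on a single benchmark moves the result less than it would under an arithmetic mean.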

Disclosure Statement

SPEC and the benchmark names SPECint and SPECfp are registered trademarks of the Standard Performance Evaluation Corporation. Results are from the report and www.spec.org as of December 6, 2010.

Sunday Dec 05, 2010

BestPerf Index 6 December 2010

This is an occasionally-generated index of previous entries in the BestPerf blog.

Dec 02, 2010 World Record TPC-C Result on Oracle's SPARC Supercluster with T3-4 Servers
Dec 02, 2010 World Record SPECweb2005 Result on SPARC T3-2 with Oracle iPlanet Web Server
Dec 02, 2010 World Record Performance on PeopleSoft Enterprise Financials Benchmark run on Sun SPARC Enterprise M4000 and M5000
Oct 26, 2010 3D VTI Reverse Time Migration Scalability On Sun Fire X2270-M2 Cluster with Sun Storage 7210
Oct 11, 2010 Sun SPARC Enterprise M9000 Server Delivers World Record Non-Clustered TPC-H @3000GB Performance
Sep 30, 2010 Consolidation of 30 x86 Servers onto One SPARC T3-2
Sep 29, 2010 SPARC T3-1 Delivers Record Number of Online Users on JD Edwards EnterpriseOne 9.0.1 Day in the Life Test
Sep 28, 2010 SPARC T3 Cryptography Performance Over 1.9x Increase in Throughput over UltraSPARC T2 Plus
Sep 28, 2010 SPARC T3-2 Delivers First Oracle E-Business X-Large Benchmark Self-Service (OLTP) Result
Sep 27, 2010 Sun Fire X2270 M2 Super-Linear Scaling of Hadoop Terasort and CloudBurst Benchmarks
Sep 27, 2010 SPARC T3-1 Shows Capabilities Running Online Auction Benchmark with Oracle Fusion Middleware
Sep 24, 2010 SPARC T3-2 sets World Record on SPECjvm2008 Benchmark
Sep 24, 2010 SPARC T3 Provides High Performance Security for Oracle Weblogic Applications
Sep 23, 2010 Sun Storage F5100 Flash Array with PCI-Express SAS-2 HBAs Achieves Over 17 GB/sec Read
Sep 23, 2010 SPARC T3-1 Performance on PeopleSoft Enterprise Financials 9.0 Benchmark
Sep 22, 2010 Oracle Solaris 10 9/10 ZFS OLTP Performance Improvements
Sep 22, 2010 SPARC T3-1 Supports 13,000 Users on Financial Services and Enterprise Application Integration Running Siebel CRM 8.1.1
Sep 21, 2010 ProMAX Performance and Throughput on Sun Fire X2270 and Sun Storage 7410
Sep 21, 2010 Sun Flash Accelerator F20 PCIe Cards Outperform IBM on SPC-1C
Sep 21, 2010 SPARC T3 Servers Deliver Top Performance on Oracle Communications Order and Service Management
Sep 20, 2010 Schlumberger's ECLIPSE 300 Performance Throughput On Sun Fire X2270 Cluster with Sun Storage 7410
Sep 20, 2010 Sun Fire X4470 4 Node Cluster Delivers World Record SAP SD-Parallel Benchmark Result
Sep 20, 2010 SPARC T3-4 Sets World Record Single Server Result on SPECjEnterprise2010 Benchmark
Aug 25, 2010 Transparent Failover with Solaris MPxIO and Oracle ASM
Aug 23, 2010 Repriced: SPC-1 Sun Storage 6180 Array (8Gb) 1.9x Better Than IBM DS5020 in Price-Performance
Aug 23, 2010 Repriced: SPC-2 (RAID 5 & 6 Results) Sun Storage 6180 Array (8Gb) Outperforms IBM DS5020 by up to 64% in Price-Performance
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ANSYS FLUENT Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Demonstrates Outstanding Single Node Performance on MSC.Nastran Benchmarks
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ABAQUS Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Sets World Record on SPEC OMP2001 Benchmark
Jun 29, 2010 Sun Fire X4170 M2 Sets World Record on SPEC CPU2006 Benchmark
Jun 29, 2010 Sun Blade X6270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4470 Sets World Records on SPEC OMP2001 Benchmarks
Jun 28, 2010 Sun Fire X4470 Sets World Record on SPEC CPU2006 Rate Benchmark
Jun 28, 2010 Sun Fire X4470 2-Node Configuration Sets World Record for SAP SD-Parallel Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Records on SPEC CPU2006 Rate Benchmarks
Jun 10, 2010 Hyperion Essbase ASO World Record on Sun SPARC Enterprise M5000
Jun 09, 2010 PeopleSoft Payroll 500K Employees on Sun SPARC Enterprise M5000 World Record
Jun 03, 2010 Sun SPARC Enterprise T5440 World Record SPECjAppServer2004
May 11, 2010 Per-core Performance Myth Busting
Apr 14, 2010 Oracle Sun Storage F5100 Flash Array Delivers World Record SPC-1C Performance
Apr 13, 2010 Oracle Sun Flash Accelerator F20 PCIe Card Accelerates Web Caching Performance
Apr 06, 2010 WRF Benchmark: X6275 Beats Power6
Mar 29, 2010 Sun Blade X6275/QDR IB/ Reverse Time Migration
Feb 23, 2010 IBM POWER7 SPECfp_rate2006: Poor Scaling? Or Configuration Confusion?
Jan 25, 2010 Sun/Solaris Leadership in SAP SD Benchmarks and HP claims
Jan 21, 2010 SPARC Enterprise M4000 PeopleSoft NA Payroll 240K Employees Performance (16 Streams)
Dec 16, 2009 Sun Fire X4640 Delivers World Record x86 Result on SPEC OMPL2001
Nov 24, 2009 Sun M9000 Fastest SAP 2-tier SD Benchmark on current SAP EP4 for SAP ERP 6.0 (Unicode)
Nov 20, 2009 Sun Blade X6275 cluster delivers leading results for Fluent truck_111m benchmark
Nov 20, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Nov 19, 2009 SPECmail2009: New World record on T5240 1.6GHz Sun 7310 and ZFS
Nov 18, 2009 Sun Flash Accelerator F20 PCIe Card Achieves 100K 4K IOPS and 1.1 GB/sec
Nov 05, 2009 New TPC-C World Record Sun/Oracle
Nov 02, 2009 Sun Blade X6275 Cluster Beats SGI Running Fluent Benchmarks
Nov 02, 2009 Sun Ultra 27 Delivers Leading Single Frame Buffer SPECviewperf 10 Results
Oct 28, 2009 SPC-2 Sun Storage 6780 Array RAID 5 & RAID 6 51% better $/performance than IBM DS5300
Oct 25, 2009 Sun C48 & Lustre fast for Seismic Reverse Time Migration using Sun X6275
Oct 25, 2009 Sun F5100 and Seismic Reverse Time Migration with faster Optimal Checkpointing
Oct 23, 2009 Wiki on performance best practices
Oct 20, 2009 Exadata V2 Information
Oct 15, 2009 Oracle Flash Cache - SGA Caching on Sun Storage F5100
Oct 13, 2009 Oracle Hyperion Sun M5000 and Sun Storage 7410
Oct 13, 2009 Sun T5440 Oracle BI EE Sun SPARC Enterprise T5440 World Record
Oct 13, 2009 SPECweb2005 on Sun SPARC Enterprise T5440 World Record using Solaris Containers and Sun Storage F5100 Flash
Oct 13, 2009 Oracle PeopleSoft Payroll (NA) Sun SPARC Enterprise M4000 and Sun Storage F5100 World Record Performance
Oct 13, 2009 SAP 2-tier SD Benchmark on Sun SPARC Enterprise M9000/32 SPARC64 VII
Oct 13, 2009 CP2K Life Sciences, Ab-initio Dynamics - Sun Blade 6048 Chassis with Sun Blade X6275 - Scalability and Throughput with Quad Data Rate InfiniBand
Oct 13, 2009 SAP 2-tier SD-Parallel on Sun Blade X6270 1-node, 2-node and 4-node
Oct 13, 2009 Halliburton ProMAX Oil & Gas Application Fast on Sun 6048/X6275 Cluster
Oct 13, 2009 SPECcpu2006 Results On MSeries Servers With Updated SPARC64 VII Processors
Oct 13, 2009 MCAE ABAQUS faster on Sun F5100 and Sun X4270 - Single Node World Record
Oct 12, 2009 MCAE ANSYS faster on Sun F5100 and Sun X4270
Oct 12, 2009 MCAE MCS/NASTRAN faster on Sun F5100 and Fire X4270
Oct 12, 2009 SPC-2 Sun Storage 6180 Array RAID 5 & RAID 6 Over 70% Better Price Performance than IBM
Oct 12, 2009 SPC-1 Sun Storage 6180 Array Over 70% Better Price Performance than IBM
Oct 12, 2009 Why Sun Storage F5100 is a good option for Peoplesoft NA Payroll Application
Oct 12, 2009 1.6 Million 4K IOPS in 1RU on Sun Storage F5100 Flash Array
Oct 11, 2009 TPC-C World Record Sun - Oracle
Oct 09, 2009 X6275 Cluster Demonstrates Performance and Scalability on WRF 2.5km CONUS Dataset
Oct 02, 2009 Sun X4270 VMware VMmark benchmark achieves excellent result
Sep 22, 2009 Sun X4270 Virtualized for Two-tier SAP ERP 6.0 Enhancement Pack 4 (Unicode) Standard Sales and Distribution (SD) Benchmark
Sep 01, 2009 String Searching - Sun T5240 & T5440 Outperform IBM Cell Broadband Engine
Aug 28, 2009 Sun X4270 World Record SAP-SD 2-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Aug 27, 2009 Sun SPARC Enterprise T5240 with 1.6GHz UltraSPARC T2 Plus Beats 4-Chip IBM Power 570 POWER6 System on SPECjbb2005
Aug 26, 2009 Sun SPARC Enterprise T5220 with 1.6GHz UltraSPARC T2 Sets Single Chip World Record on SPECjbb2005
Aug 12, 2009 SPECmail2009 on Sun SPARC Enterprise T5240 and Sun Java System Messaging Server 6.3
Jul 23, 2009 World Record Performance of Sun CMT Servers
Jul 22, 2009 Why does 1.6 beat 4.7?
Jul 21, 2009 Zeus ZXTM Traffic Manager World Record on Sun T5240
Jul 21, 2009 Sun T5440 Oracle BI EE World Record Performance
Jul 21, 2009 Sun T5440 World Record SAP-SD 4-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Jul 21, 2009 1.6 GHz SPEC CPU2006 - Rate Benchmarks
Jul 21, 2009 Sun Blade T6320 World Record SPECjbb2005 performance
Jul 21, 2009 New SPECjAppServer2004 Performance on the Sun SPARC Enterprise T5440
Jul 21, 2009 Sun T5440 SPECjbb2005 Beats IBM POWER6 Chip-to-Chip
Jul 21, 2009 New CMT results coming soon....
Jul 14, 2009 Vdbench: Sun StorageTek Vdbench, a storage I/O workload generator.
Jul 14, 2009 Storage performance and workload analysis using Swat.
Jul 10, 2009 World Record TPC-H@300GB Price-Performance for Windows on Sun Fire X4600 M2
Jul 06, 2009 Sun Blade 6048 Chassis with Sun Blade X6275: RADIOSS Benchmark Results
Jul 03, 2009 SPECmail2009 on Sun Fire X4275+Sun Storage 7110: Mail Server System Solution
Jun 30, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Jun 26, 2009 Sun Fire X2270 Cluster Fluent Benchmark Results
Jun 25, 2009 Sun SSD Server Platform Bandwidth and IOPS (Speeds & Feeds)
Jun 24, 2009 I/O analysis using DTrace
Jun 23, 2009 New CPU2006 Records: 3x better integer throughput, 9x better fp throughput
Jun 23, 2009 Sun Blade X6275 results capture Top Places in CPU2006 SPEED Metrics
Jun 19, 2009 Pointers to Java Performance Tuning resources
Jun 19, 2009 SSDs in HPC: Reducing the I/O Bottleneck BluePrint Best Practices
Jun 17, 2009 The Performance Technology group wiki is alive!
Jun 17, 2009 Performance of Sun 7410 and 7310 Unified Storage Array Line
Jun 16, 2009 Sun Fire X2270 MSC/Nastran Vendor_2008 Benchmarks
Jun 15, 2009 Sun Fire X4600 M2 Server Two-tier SAP ERP 6.0 (Unicode) Standard Sales and Distribution (SD) Benchmark
Jun 12, 2009 Correctly comparing SAP-SD Benchmark results
Jun 12, 2009 OpenSolaris Beats Linux on memcached Sun Fire X2270
Jun 11, 2009 SAS Grid Computing 9.2 utilizing the Sun Storage 7410 Unified Storage System
Jun 10, 2009 Using Solaris Resource Management Utilities to Improve Application Performance
Jun 09, 2009 Free Compiler Wins Nehalem Race by 2x
Jun 08, 2009 Variety of benchmark results to be posted on BestPerf
Jun 05, 2009 Interpreting Sun's SPECpower_ssj2008 Publications
Jun 03, 2009 Wide Variety of Topics to be discussed on BestPerf
Jun 03, 2009 Welcome to BestPerf group blog!

Thursday Dec 02, 2010

World Record TPC-C Result on Oracle's SPARC Supercluster with T3-4 Servers

Oracle demonstrated the world's fastest database performance: 27 of Oracle's SPARC T3-4 servers, 138 Sun Storage F5100 Flash Array storage systems, and Oracle Database 11g Release 2 Enterprise Edition with Real Application Clusters (RAC) and Partitioning delivered a world-record TPC-C benchmark result.

  • The SPARC T3-4 server cluster delivered a world record TPC-C benchmark result of 30,249,688 tpmC at $1.01/tpmC (USD) using Oracle Database 11g Release 2 on a configuration available 6/1/2011.

  • The SPARC T3-4 server cluster is 2.9x faster than the performance of the IBM Power 780 (POWER7 3.86 GHz) cluster with IBM DB2 9.7 database and has 27% better price/performance on the TPC-C benchmark. Almost identical price discount levels were applied by Oracle and IBM.

  • The Oracle solution has three times better performance than the IBM configuration and only used twice the power during the run of the TPC-C benchmark.  (Based upon IBM's own claims of energy usage from their August 17, 2010 press release.)

  • The Oracle solution delivered 2.9x the performance in only 71% of the space compared to the IBM TPC-C benchmark result.

  • The SPARC T3-4 server with the Sun Storage F5100 Flash Array storage solution demonstrates 3.2x faster response time than the IBM Power 780 (POWER7 3.86 GHz) result on the TPC-C benchmark.

  • Oracle used a single-image database, whereas IBM used 96 separate database partitions on their 3-node cluster. It is interesting to note that IBM used 32 database images per server instead of running each server as a simple SMP.

  • IBM did not use DB2 Enterprise Database, but instead IBM used "DB2 InfoSphere Warehouse 9.7" which is a data warehouse and data management product and not their flagship OLTP product.

  • The multi-node SPARC T3-4 server cluster is 7.4x faster than the HP Superdome (1.6 GHz Itanium2) solution and has 66% better price/performance on the TPC-C benchmark.

  • The Oracle solution utilized Oracle's Sun FlashFire technology to deliver this result. The Sun Storage F5100 Flash Array storage system was used for database storage.

  • Oracle Database 11g Enterprise Edition Release 2 with Real Application Clusters and Partitioning scales and effectively uses all of the nodes in this configuration to produce the world record TPC-C benchmark performance.

  • This result showed Oracle's integrated hardware and software stacks provide industry leading performance.

Performance Landscape

TPC-C results (sorted by tpmC, bigger is better)

System                   tpmC         Price/tpmC   Avail      Database         Cluster   Racks
27 x SPARC T3-4          30,249,688   1.01 USD     6/1/2011   Oracle 11g RAC   Y         15
3 x IBM Power 780        10,366,254   1.38 USD     10/13/10   DB2 9.7          Y         10
HP Integrity Superdome   4,092,799    2.93 USD     08/06/07   Oracle 10g R2    N         46

Avail - Availability date
Racks - Clients, servers, storage, infrastructure

Oracle and IBM TPC-C Response times

System                                tpmC         New Order 90th%       New Order Average
                                                   Response Time (sec)   Response Time (sec)
27 x SPARC T3-4                       30,249,688   0.750                 0.352
3 x IBM Power 780                     10,366,254   2.1                   1.137
Response Time Ratio - Oracle Better   2.9x         2.8x                  3.2x

Oracle uses Average New Order Response time for comparison between Oracle and IBM.

Graphs of Oracle's and IBM's response times for New-Order can be found in the full disclosure reports on TPC's website TPC-C Official Result Page.
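
The ratios quoted in the bullets and in the last row of the response-time table can be checked against the table values. This is a minimal sketch under one assumption: "better price/performance" is taken to mean the percentage reduction in $/tpmC, which reproduces the quoted 27% and 66%. The dictionary keys and function names are illustrative.

```python
# Values copied from the two tables above; key names are illustrative.
RESULTS = {
    "27 x SPARC T3-4": {
        "tpmC": 30_249_688, "usd_per_tpmC": 1.01,
        "p90_new_order_s": 0.750, "avg_new_order_s": 0.352,
    },
    "3 x IBM Power 780": {
        "tpmC": 10_366_254, "usd_per_tpmC": 1.38,
        "p90_new_order_s": 2.1, "avg_new_order_s": 1.137,
    },
    "HP Integrity Superdome": {"tpmC": 4_092_799, "usd_per_tpmC": 2.93},
}


def throughput_ratio(system, reference):
    """How many times more tpmC `system` delivers than `reference`."""
    return RESULTS[system]["tpmC"] / RESULTS[reference]["tpmC"]


def price_perf_gain_pct(system, reference):
    """Percentage reduction in $/tpmC of `system` relative to `reference`."""
    ratio = RESULTS[system]["usd_per_tpmC"] / RESULTS[reference]["usd_per_tpmC"]
    return (1.0 - ratio) * 100.0


def response_time_ratio(system, reference, key):
    """How many times shorter the response time of `system` is."""
    return RESULTS[reference][key] / RESULTS[system][key]


if __name__ == "__main__":
    oracle, ibm, hp = RESULTS  # insertion order matches the table
    print(f"vs IBM: {throughput_ratio(oracle, ibm):.1f}x tpmC, "
          f"{price_perf_gain_pct(oracle, ibm):.0f}% lower $/tpmC")
    print(f"vs HP:  {throughput_ratio(oracle, hp):.1f}x tpmC, "
          f"{price_perf_gain_pct(oracle, hp):.0f}% lower $/tpmC")
    for key in ("p90_new_order_s", "avg_new_order_s"):
        print(f"{key}: {response_time_ratio(oracle, ibm, key):.1f}x")
```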

Configuration Summary and Results

Hardware Configuration:

15 racks used to hold

Servers
27 x SPARC T3-4 servers, each with
4 x 1.65 GHz SPARC T3 processors
512 GB memory
3 x 300 GB 10K RPM 2.5" SAS disks

Data Storage
69 x Sun Fire X4270 M2 servers configured as COMSTAR heads, each with
1 x 2.93 GHz Intel Xeon X5670 processor
8 GB memory
9 x 2 TB 7.2K RPM 3.5" SAS disks
2 x Sun Storage F5100 Flash Array storage (1.92 TB each)
1 x Brocade DCX switch

Redo Storage
28 x Sun Fire X4270 M2 servers configured as COMSTAR heads, each with
1 x 2.93 GHz Intel Xeon X5670 processor
8 GB memory
11 x 2 TB 7.2K RPM 3.5" SAS disks
2 x Brocade 5300 switches

Clients
81 x Sun Fire X4170 M2 servers, each with
2 x 2.93 GHz Intel Xeon X5670 processors
48 GB memory
2 x 146 GB 10K RPM 2.5" SAS disks

Software Configuration:

Oracle Solaris 10 9/10 (for SPARC T3-4 and Sun Fire X4170 M2)
Oracle Solaris 11 Express (COMSTAR for Sun Fire X4270 M2)
Oracle Database 11g Release 2 Enterprise Edition with Real Application Clusters and Partitioning
Oracle iPlanet Web Server 7.0 U5
Tuxedo CFS-R Tier 1

Results:

System 27 x SPARC T3-4
tpmC 30,249,688
Price/tpmC 1.01 USD
Avail 6/1/2011
Database Oracle Database 11g RAC
Cluster yes
Racks 15
New Order Ave Response 0.352 seconds

Benchmark Description

TPC-C is an OLTP system benchmark. It simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.

Key Points and Best Practices

  • Oracle Database 11g Release 2 Enterprise Edition with Real Application Clusters and Partitioning scales easily to this high level of performance.

  • Sun Storage F5100 Flash Array storage provides high performance, very low latency, and very high storage density.

  • COMSTAR (Common Multiprotocol SCSI Target), new in Oracle Solaris 11 Express, is the software framework that enables a Solaris host to serve as a SCSI target platform. COMSTAR uses a modular approach to break the huge task of handling all the different pieces in a SCSI target subsystem into independent functional modules, which are glued together by the SCSI Target Mode Framework (STMF). The modules implementing functionality at the SCSI level (disk, tape, medium changer, etc.) are not required to know about the underlying transport, and the modules implementing the transport protocol (FC, iSCSI, etc.) are not aware of the SCSI-level functionality of the packets they are transporting. The framework hides the details of allocation, providing execution context, and cleanup of SCSI commands and associated resources, and simplifies the task of writing the SCSI or transport modules.

  • Oracle iPlanet Web Server 7.0 U5 is used in the user tier of the benchmark, with each web server instance supporting more than a quarter-million users while satisfying the stringent response time requirement of the TPC-C benchmark.
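
The separation of concerns described in the COMSTAR bullet above can be pictured with a toy model. This is only a conceptual sketch: every class and method name here is hypothetical, and none of it is the COMSTAR or STMF programming interface. It shows a SCSI-level module that knows nothing about transports, a transport module that never inspects the commands it carries, and a framework that joins the two and owns per-command setup and cleanup.

```python
class TargetFramework:
    """Toy glue layer (the role STMF plays); hypothetical, not the real API."""

    def __init__(self):
        self._logical_units = {}

    def register_logical_unit(self, lun, unit):
        """Attach a SCSI-level module under a logical unit number."""
        self._logical_units[lun] = unit

    def dispatch(self, lun, command):
        """Allocate per-command context, run the command, then clean up."""
        context = {"lun": lun, "command": command}
        try:
            return self._logical_units[lun].execute(context["command"])
        finally:
            context.clear()  # neither module has to manage this


class DiskUnit:
    """SCSI-level module: understands commands, knows nothing of transports."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def execute(self, command):
        operation, block = command
        if operation == "READ":
            return self._blocks[block]
        raise ValueError(f"unsupported operation: {operation}")


class LoopbackTransport:
    """Transport module: forwards opaque commands without inspecting them."""

    def __init__(self, framework):
        self._framework = framework

    def receive(self, lun, command):
        return self._framework.dispatch(lun, command)


if __name__ == "__main__":
    framework = TargetFramework()
    framework.register_logical_unit(0, DiskUnit({7: b"payload"}))
    transport = LoopbackTransport(framework)
    print(transport.receive(0, ("READ", 7)))
```

Swapping `LoopbackTransport` for another transport class, or `DiskUnit` for another device class, would require no change to the other side, which is the point of the modular design.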

Disclosure Statement

TPC Benchmark C, tpmC, and TPC-C are trademarks of the Transaction Processing Performance Council (TPC). 27-node SPARC T3-4 Cluster (4 x 1.65 GHz SPARC T3 processors) with Oracle Database 11g Release 2 Enterprise Edition with Real Application Clusters and Partitioning, 30,249,688 tpmC, $1.01/tpmC, Available 6/1/2011. IBM Power 780 Cluster (3 nodes using 3.86 GHz POWER7 processors) with IBM DB2 InfoSphere Warehouse Ent. Base Ed. 9.7, 10,366,254 tpmC, $1.38 USD/tpmC, available 10/13/2010. HP Integrity Superdome (1.6 GHz Itanium2, 64 processors, 128 cores, 256 threads) with Oracle 10g Enterprise Edition, 4,092,799 tpmC, $2.93/tpmC, available 8/06/07. Energy claims based upon IBM calculations and internal measurements. Source: http://www.tpc.org/tpcc, results as of 11/22/2010.

World Record SPECweb2005 Result on SPARC T3-2 with Oracle iPlanet Web Server

Oracle's SPARC T3-2 server running Oracle iPlanet Web Server middleware delivered a world record SPECweb2005 benchmark result of 113,857. Oracle's 2-socket SPARC is 9% faster than the fastest 2-socket x86-based competitive server and even 8% faster than the 4-socket HP x86-based server.

  • The SPARC T3-2 server with dual 1.65 GHz SPARC T3 processors using Oracle iPlanet Web Server 7.0.9 middleware delivered a world record result of 113,857 on the SPECweb2005 benchmark.

  • This result demonstrates that the SPARC T3-2 running Oracle Solaris and Oracle iPlanet Web Server can support thousands of concurrent web server sessions and is an industry leader in web serving with a high performance and enterprise quality solution.

  • Oracle is the only SPECweb2005 benchmark sponsor who can demonstrate top performance using a commercially viable and production quality web serving solution with the Oracle iPlanet Web Server and the Oracle Solaris 10 operating system.

  • On the SPECweb2005 benchmark, the SPARC T3-2 server with two 1.65 GHz SPARC T3 processors is 8% faster than the latest Hewlett-Packard result that was just published on the HP ProLiant DL585 G7 with four 2.0 GHz AMD 6128HE processors.

  • On the SPECweb2005 benchmark, the SPARC T3-2 server with two 1.65 GHz SPARC T3 processors is 9% faster than the Fujitsu PRIMERGY TX300 S6 with two 3.3 GHz Intel X5680 processors.

  • On the SPECweb2005 benchmark, the SPARC T3-2 server with two 1.65 GHz SPARC T3 processors is 37% faster than the HP ProLiant DL370 G6 with two 3.2 GHz Intel W5580 processors.

  • On the Support workload of SPECweb2005, the SPARC T3-2 server with two 1.65 GHz SPARC T3 processors obtained a 41% higher score than the Fujitsu PRIMERGY TX300 S6 with two 3.3 GHz Intel X5680 processors.

  • The SPARC T3-2 server obtained 14.4 times the result of the 4-core IBM System p5 550 1.9 GHz POWER5+ system on the SPECweb2005 benchmark. There are no IBM POWER7 or POWER6 based system results published on the SPECweb2005 benchmark.

Performance Landscape

SPECweb2005 select results as of 8 December 2010. See the SPEC website for more. Information ordered by Result, bigger is better.

Server             Processor         OS             SPECweb2005 Performance (*)         Web Server
                                                    Result   Bank     Ecom     Supp
SPARC T3-2         2 x 1.65 T3       Solaris        113857   165024   160056   123840   iPlanet
HP DL585 G7        4 x 2.0 6128HE    RedHat Linux   105586   168192   175104   88576    Rock
Fujitsu TX300 S6   2 x 3.33 X5680    RedHat Linux   104422   162000   177000   88000    Rock
Sun T5440          4 x 1.6 T2 Plus   Solaris        100209   176500   133000   95000    Sun
Fujitsu TX300 S5   2 x 2.93 X5570    RedHat Linux   83198    106000   140000   86000    Rock
HP ML370 G6        2 x 3.2 W5580     RedHat Linux   83073    117120   142080   76352    Rock
HP DL370 G6        2 x 3.2 W5580     RedHat Linux   83073    117120   142080   76352    Rock
HP DL585 G5        4 x 3.1 Opt8393   RedHat Linux   71629    117504   123072   56320    Rock
IBM p5 550         2 x 1.9 POWER5+   SuSE Linux     7881     12240    11820    7500     Zeus

(*) Metrics are:
Result - SPECweb2005, overall metric
Bank - SPECweb2005_banking, Banking component metric
Ecom - SPECweb2005_ecommerce, ECommerce component metric
Supp - SPECweb2005_support, Support component metric
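
The percentage comparisons in the bullets follow directly from the table. The sketch below recomputes them; the tuple layout and function names are illustrative, and only the scores come from the table above.

```python
# (Result, Bank, Ecom, Supp) copied from the table above.
RESULTS = {
    "SPARC T3-2": (113857, 165024, 160056, 123840),
    "HP DL585 G7": (105586, 168192, 175104, 88576),
    "Fujitsu TX300 S6": (104422, 162000, 177000, 88000),
    "HP DL370 G6": (83073, 117120, 142080, 76352),
    "IBM p5 550": (7881, 12240, 11820, 7500),
}
COLUMNS = {"Result": 0, "Bank": 1, "Ecom": 2, "Supp": 3}


def score_ratio(system, reference, column="Result"):
    """Score of `system` divided by the score of `reference`."""
    index = COLUMNS[column]
    return RESULTS[system][index] / RESULTS[reference][index]


def pct_higher(system, reference, column="Result"):
    """Percentage by which `system` outscores `reference`."""
    return (score_ratio(system, reference, column) - 1.0) * 100.0


if __name__ == "__main__":
    t3 = "SPARC T3-2"
    for other in ("HP DL585 G7", "Fujitsu TX300 S6", "HP DL370 G6"):
        print(f"{t3} vs {other}: +{pct_higher(t3, other):.0f}% overall")
    supp = pct_higher(t3, "Fujitsu TX300 S6", "Supp")
    print(f"{t3} vs Fujitsu TX300 S6, Support: +{supp:.0f}%")
    print(f"{t3} vs IBM p5 550: {score_ratio(t3, 'IBM p5 550'):.1f}x")
```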

Configuration Summary

Hardware Configuration:

1 SPARC T3-2 with
2 x 1.65 GHz SPARC T3 processors
256 GB memory
2 x Sun Storage F5100 Flash Array
4 x Dual 10 GbE SFP+ PCIe LP
4 x 6 Gb SAS PCIe HBA

Software Configuration:

Oracle Solaris 10 9/10
Oracle iPlanet Web Server 7.0.9
Java Platform, Standard Edition version 1.6.0_21-b06
Java Hotspot Server VM version 17.0-b16, mixed mode

Benchmark Description

SPECweb2005, successor to SPECweb99 and SPECweb99_SSL, is an industry standard benchmark for evaluating Web Server performance developed by SPEC. The benchmark simulates multiple user sessions accessing a Web Server and generating static and dynamic HTTP requests. The major features of SPECweb2005 are:

  • Measures simultaneous user sessions
  • Dynamic content: currently PHP and JSP implementations
  • Page images requested using 2 parallel HTTP connections
  • Multiple, standardized workloads: Banking (HTTPS), E-commerce (HTTP and HTTPS), and Support (HTTP)
  • Simulates browser caching effects
  • File accesses more accurately simulate today's disk access patterns

SPEC requires the server under test to support SSL Protocol V3 (SSLv3).

Of the various ciphers supported in SSLv3, cipher SSL_RSA_WITH_RC4_128_MD5 is currently required for all workload components that use SSL. It was selected as one of the most commonly used SSLv3 ciphers and allows results to be directly compared to each other. SSL_RSA_WITH_RC4_128_MD5 consists of:

  • RSA public key (asymmetric) encryption with a 1024-bit key
  • RC4 symmetric encryption with a 128-bit key for bulk data encryption
  • MD5 digest algorithm with 128-bit output for the Message Authentication Code (MAC)

A compliant result must use the cipher suite listed above, and must employ a 1024-bit key for RSA public key encryption, a 128-bit key for RC4 bulk data encryption, and a 128-bit output for the Message Authentication Code.

All Banking workload requests to the server under test use SSL, whereas the Ecommerce workload requests are a mix of SSL and non-SSL. None of the Support workload requests to the server under test use SSL.
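
The cipher requirements above amount to a small checklist. As a minimal sketch (the `CipherConfig` type and the helper functions are hypothetical and not part of the SPECweb2005 kit), a configuration can be compared field by field against the required values:

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CipherConfig:
    """Hypothetical record of the SSL parameters a test run uses."""
    suite: str
    rsa_key_bits: int
    rc4_key_bits: int
    mac_output_bits: int


# Required values as stated in the text above.
REQUIRED = CipherConfig(
    suite="SSL_RSA_WITH_RC4_128_MD5",
    rsa_key_bits=1024,
    rc4_key_bits=128,
    mac_output_bits=128,
)

# SSL usage per workload, as described in the text above.
WORKLOAD_SSL_USAGE = {"Banking": "all", "Ecommerce": "mixed", "Support": "none"}


def mismatches(config):
    """Names of the fields that differ from the required values."""
    required = asdict(REQUIRED)
    return [name for name, value in asdict(config).items()
            if value != required[name]]


def is_compliant(config):
    """True when every parameter matches the required cipher suite."""
    return not mismatches(config)


if __name__ == "__main__":
    candidate = CipherConfig("SSL_RSA_WITH_RC4_128_MD5", 2048, 128, 128)
    print(is_compliant(candidate), mismatches(candidate))
```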

Key Points and Best Practices

  • When multiple 10 GbE Dual Port NICs are used, it is best practice to divide these NICs equally among the available PCI root nodes.

  • Two web server instances were used. One web server instance was bound to a processor set with CPUs in the first processor chip. The other web server instance was bound to a processor set with CPUs in the second processor chip. The web server instance bound to CPUs in the first processor chip listened on the NIC IP addresses attached to that chip's PCI root node. The same was done with the web server instance bound to CPUs in the second processor chip. This was done to improve the locality of the processing.

  • Each web server is executed in the FX scheduling class to improve performance by reducing the frequency of context switches.
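
The placement described in the bullets above can be sketched as a small planning helper. This is an illustration under stated assumptions, not the configuration used for the benchmark: it assumes 128 virtual CPUs per SPARC T3 chip (16 cores with 8 threads each) numbered contiguously per chip, the IP addresses are documentation placeholders, and the server start command is hypothetical. The generated argument lists use the Solaris `psrset` utility (`-c` to create a processor set, `-e` to execute a command in one) and `priocntl -e -c FX` to start a command in the FX scheduling class; check the man pages on the target release before using them.

```python
def plan_instances(nic_ips_by_chip, chips=2, cpus_per_chip=128):
    """Return one placement plan per processor chip.

    Assumes virtual CPU ids are contiguous per chip (chip 0 -> 0..127,
    chip 1 -> 128..255); verify the numbering on the real system.
    """
    plans = []
    for chip in range(chips):
        first = chip * cpus_per_chip
        plans.append({
            "instance": f"web{chip + 1}",
            "cpu_ids": list(range(first, first + cpus_per_chip)),
            "listen_ips": list(nic_ips_by_chip.get(chip, [])),
        })
    return plans


def create_set_command(cpu_ids):
    """Argument list that creates a processor set from the given CPU ids."""
    return ["psrset", "-c"] + [str(cpu) for cpu in cpu_ids]


def launch_command(set_id, server_command):
    """Argument list that runs server_command in a set, in the FX class."""
    return ["psrset", "-e", str(set_id),
            "priocntl", "-e", "-c", "FX"] + list(server_command)


if __name__ == "__main__":
    # Placeholder addresses: NICs grouped by the chip whose PCI root
    # node they hang off.
    nics = {0: ["192.0.2.10", "192.0.2.11"], 1: ["192.0.2.20", "192.0.2.21"]}
    for set_id, plan in enumerate(plan_instances(nics), start=1):
        print(plan["instance"], "listens on", plan["listen_ips"])
        print(" ", " ".join(create_set_command(plan["cpu_ids"][:4])), "...")
        # "./start-web" is a hypothetical start script name.
        print(" ", " ".join(launch_command(set_id, ["./start-web"])))
```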

Disclosure Statement

SPEC and the benchmark name SPECweb are registered trademarks of Standard Performance Evaluation Corporation. Results are from www.spec.org as of December 8, 2010 and this report. Oracle, SPARC T3-2, 113,857 SPECweb2005. HP ProLiant DL585 G7, 105,586 SPECweb2005. Fujitsu PRIMERGY TX300 S6, 104,422 SPECweb2005. Sun SPARC Enterprise T5440, 100,209 SPECweb2005. Fujitsu PRIMERGY TX300 S5, 83,198 SPECweb2005. HP ProLiant ML370 G6, 83,073 SPECweb2005. HP ProLiant DL370 G6, 83,073 SPECweb2005. HP ProLiant DL585 G5, 71,629 SPECweb2005. IBM System p5 550, 7,881 SPECweb2005.

World Record Performance on PeopleSoft Enterprise Financials Benchmark run on Sun SPARC Enterprise M4000 and M5000

Oracle's Sun SPARC Enterprise M4000 and M5000 servers have combined to produce a world record result on Oracle's PeopleSoft Enterprise Financial Management 9.0 benchmark.

  • The Sun SPARC Enterprise M4000 and M5000 servers configured with SPARC64 VII+ processors along with Oracle's Sun Storage F5100 Flash Array system achieved a world record result using PeopleSoft Enterprise Financial Management and Oracle Database 11g Release 2 software running on the Oracle Solaris 10 operating system.

  • The PeopleSoft Enterprise Financial Management solution processed online business transactions to support 1000 concurrent users using 32 application server threads with compliant response times while simultaneously completing complex batch jobs in record time.

  • The Sun Storage F5100 Flash Array system is a high-performance, high-density solid-state flash array that provides a read latency of only 0.5 msec, about 10 times faster than the normal disk latencies of 5 msec measured on this benchmark.

  • The Sun SPARC Enterprise M4000 and M5000 servers processed online users and concurrent batch jobs simultaneously in 34.72 minutes on this benchmark, which reflects a complex, multi-tier environment and uses a large back-end database of nearly 1 TB.

  • The combination of Oracle's PeopleSoft Enterprise Financial Management 9.00.00.331, PeopleSoft PeopleTools 8.49.23 and Oracle WebLogic Server was run on the Sun SPARC Enterprise M4000 server, and Oracle Database 11g Release 2 was run on the Sun SPARC Enterprise M5000 server for this benchmark.

Performance Landscape

The following table shows the current result and the single previously disclosed result for this benchmark. Results are elapsed times, so smaller numbers are better.

Servers                      CPU                     Tier      Batch (mins)   Batch w/Online (mins)
Sun SPARC Enterprise M4000   2.66 GHz SPARC64 VII+   Web/App   33.09          34.72
Sun SPARC Enterprise M5000   2.66 GHz SPARC64 VII+   DB

SPARC T3-1                   1.65 GHz SPARC T3       Web/App   35.82          37.01
Sun SPARC Enterprise M5000   2.5 GHz SPARC64 VII     DB

Configuration Summary

Web/Application Tier Configuration:

1 x Sun SPARC Enterprise M4000
4 x 2.66 GHz SPARC64 VII+ processors
128 GB of memory

Database Tier Configuration:

1 x Sun SPARC Enterprise M5000
8 x 2.66 GHz SPARC64 VII+ processors
128 GB of memory
1 x Sun Storage F5100 Flash Array (74 x 24 GB FMODs)
2 x StorageTek 2540 (12 x 146 GB SAS 15K RPM)
1 x StorageTek 2501 (12 x 146 GB SAS 15K RPM)
1 x Dual-Port SAS Fibre Channel Host Bus Adapters (HBA)

Software Configurations:

Oracle Solaris 10 10/09
PeopleSoft Enterprise Financial Management/SCM 9.00.00.311 64-bit
PeopleSoft Enterprise (PeopleTools) 8.49.23 64-bit
Oracle Database 11g Release 2 11.1.0.6 64-bit
Oracle Tuxedo 9.1 RP36 with Jolt 9.1
Micro Focus COBOL Server Express 4.0 SP4 64-bit

Benchmark Description

This Day-in-the-Life benchmark measured the concurrent batch and online performance for a large database model. This scenario more accurately represents a production environment where users and scheduled batch jobs must run concurrently. This benchmark measured performance results during a Close-the-Books process.

The PeopleSoft Enterprise Financials 9 batch processes included in this benchmark are as follows:

  • Journal Generator: (AE) This process creates journals from accounting entries (AE) generated from various data sources, including non-PeopleSoft systems as well as PeopleSoft applications. In the benchmark, the Journal Generator (FS_JGEN) process is set up to create accounting entries from Oracle's PeopleSoft applications in the same database, such as PeopleSoft Enterprise Payables, Receivables, Asset Management, Expenses, Cash Management. The process is run with the option of Edit and Post turned on to edit and post the journals created by Journal generator. Journal Edit is an AE program and Post is a COBOL program.

  • Allocation: (AE) This process allocates balances held or accumulated in one or more entities to more than one business unit, department or other entities based on user-defined rules.

  • Journal Edit & Post: (AE & COBOL) Journal Edit validates journal transactions before posting them to the ledger. This validation ensures that journals are valid, for example: valid ChartField values and combinations, debits and credits equal, and inter/intra-unit balanced. The Journal Post process posts only valid, edited journals, ensures each journal line posts to the appropriate target detail ledgers, and then changes the journal's status to posted. In this benchmark, Journal Edit & Post is also set up to edit and post data from Oracle's PeopleSoft applications in another database, such as PeopleSoft Enterprise Payroll data.

  • Summary Ledger: (AE) Summary Ledger processing summarizes detail ledger data across selected GL BUs. Summary Ledgers can be generated for reporting purposes or used in consolidations.

  • Consolidations: (COBOL) Consolidation processing summarizes ledger balances and generates elimination journal entries across business units based on user-defined rules.

  • SQR & nVision Reporting: Reporting consists of nVision and SQR reports. A balance sheet, an income statement, and a trial balance are generated for each GL BU by SQR processes GLS7002 and GLS7012. The consolidated nVision reports are run by 10 nVision users using 4 standard delivered report request definitions: BALANCE, INCOME, CONSBAL, and DEPTINC. Each nVision user has ownership of 10 Business Units and submits multiple runs that execute in parallel, generating a total of 40 nVision reports.

Batch processes are run concurrently with more than 1000 emulated users executing 30 pre-defined online applications. Response times for the online applications are collected and must conform to a maximum time.

Key Points and Best Practices

The Sun SPARC Enterprise M4000 and M5000 servers were able to process online users and concurrent batch jobs simultaneously in 34.72 minutes.

The Sun Storage F5100 Flash Array system, which is highly tuned for IOPS, contributed to the result through reduced IO latency.

The family of Sun SPARC Enterprise M-series servers, with Sun Storage F5100 Flash Array systems, forms an ideal environment for hosting complex multi-tier applications. This is the second public disclosure of any system running this benchmark.

The Sun SPARC Enterprise M4000 server hosted the web and application server tiers providing good response time to emulated user requests. The benchmark specification allows 1000 users, but there is headroom for increased load.

The Sun SPARC Enterprise M5000 server was used for the database server along with a Sun Storage F5100 Flash Array system. The speed of the M-series server with the low latency of the Flash Array provided the overall low latency for user requests, even while completing complex batch jobs.

Despite the systems being lightly loaded, the increased frequency of the SPARC64 VII+ processors yielded lower latencies and faster elapsed times than previously disclosed results.

The low latency of the Sun Storage F5100 Flash Array storage contributed to the excellent response times of emulated users by making data quickly available to the database back-end. The array was configured as several RAID 0 volumes and data was distributed across the volumes, maximizing storage bandwidth.

The transaction processing capacity of the Sun SPARC Enterprise M5000 server enabled very fast batch processing times while supporting over 1000 online users.

While running the maximum workload specified by the benchmark, the systems were lightly loaded, providing headroom to grow.

Please see the white paper for information on PeopleSoft payroll best practices using flash.

Disclosure Statement

Oracle's PeopleSoft Financials 9.0 benchmark, Oracle's Sun SPARC Enterprise M4000 (4 2.66 SPARC64 VII+), Oracle's Sun SPARC Enterprise M5000 (8 2.66 SPARC64 VII+), 34.72 min. Results as of 12/02/2010, see www.oracle.com/apps_benchmark/html/white-papers-peoplesoft.html for more about PeopleSoft.

Monday Nov 01, 2010

BestPerf Index 1 November 2010

This is an occasionally-generated index of previous entries in the BestPerf blog.

Oct 26, 2010 3D VTI Reverse Time Migration Scalability On Sun Fire X2270-M2 Cluster with Sun Storage 7210
Oct 11, 2010 Sun SPARC Enterprise M9000 Server Delivers World Record Non-Clustered TPC-H @3000GB Performance
Sep 30, 2010 Consolidation of 30 x86 Servers onto One SPARC T3-2
Sep 29, 2010 SPARC T3-1 Delivers Record Number of Online Users on JD Edwards EnterpriseOne 9.0.1 Day in the Life Test
Sep 28, 2010 SPARC T3 Cryptography Performance Over 1.9x Increase in Throughput over UltraSPARC T2 Plus
Sep 28, 2010 SPARC T3-2 Delivers First Oracle E-Business X-Large Benchmark Self-Service (OLTP) Result
Sep 27, 2010 Sun Fire X2270 M2 Super-Linear Scaling of Hadoop Terasort and CloudBurst Benchmarks
Sep 27, 2010 SPARC T3-1 Shows Capabilities Running Online Auction Benchmark with Oracle Fusion Middleware
Sep 24, 2010 SPARC T3-2 sets World Record on SPECjvm2008 Benchmark
Sep 24, 2010 SPARC T3 Provides High Performance Security for Oracle Weblogic Applications
Sep 23, 2010 Sun Storage F5100 Flash Array with PCI-Express SAS-2 HBAs Achieves Over 17 GB/sec Read
Sep 23, 2010 SPARC T3-1 Performance on PeopleSoft Enterprise Financials 9.0 Benchmark
Sep 22, 2010 Oracle Solaris 10 9/10 ZFS OLTP Performance Improvements
Sep 22, 2010 SPARC T3-1 Supports 13,000 Users on Financial Services and Enterprise Application Integration Running Siebel CRM 8.1.1
Sep 21, 2010 ProMAX Performance and Throughput on Sun Fire X2270 and Sun Storage 7410
Sep 21, 2010 Sun Flash Accelerator F20 PCIe Cards Outperform IBM on SPC-1C
Sep 21, 2010 SPARC T3 Servers Deliver Top Performance on Oracle Communications Order and Service Management
Sep 20, 2010 Schlumberger's ECLIPSE 300 Performance Throughput On Sun Fire X2270 Cluster with Sun Storage 7410
Sep 20, 2010 Sun Fire X4470 4 Node Cluster Delivers World Record SAP SD-Parallel Benchmark Result
Sep 20, 2010 SPARC T3-4 Sets World Record Single Server Result on SPECjEnterprise2010 Benchmark
Aug 25, 2010 Transparent Failover with Solaris MPxIO and Oracle ASM
Aug 23, 2010 Repriced: SPC-1 Sun Storage 6180 Array (8Gb) 1.9x Better Than IBM DS5020 in Price-Performance
Aug 23, 2010 Repriced: SPC-2 (RAID 5 & 6 Results) Sun Storage 6180 Array (8Gb) Outperforms IBM DS5020 by up to 64% in Price-Performance
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ANSYS FLUENT Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Demonstrates Outstanding Single Node Performance on MSC.Nastran Benchmarks
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ABAQUS Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Sets World Record on SPEC OMP2001 Benchmark
Jun 29, 2010 Sun Fire X4170 M2 Sets World Record on SPEC CPU2006 Benchmark
Jun 29, 2010 Sun Blade X6270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4470 Sets World Records on SPEC OMP2001 Benchmarks
Jun 28, 2010 Sun Fire X4470 Sets World Record on SPEC CPU2006 Rate Benchmark
Jun 28, 2010 Sun Fire X4470 2-Node Configuration Sets World Record for SAP SD-Parallel Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Records on SPEC CPU2006 Rate Benchmarks
Jun 10, 2010 Hyperion Essbase ASO World Record on Sun SPARC Enterprise M5000
Jun 09, 2010 PeopleSoft Payroll 500K Employees on Sun SPARC Enterprise M5000 World Record
Jun 03, 2010 Sun SPARC Enterprise T5440 World Record SPECjAppServer2004
May 11, 2010 Per-core Performance Myth Busting
Apr 14, 2010 Oracle Sun Storage F5100 Flash Array Delivers World Record SPC-1C Performance
Apr 13, 2010 Oracle Sun Flash Accelerator F20 PCIe Card Accelerates Web Caching Performance
Apr 06, 2010 WRF Benchmark: X6275 Beats Power6
Mar 29, 2010 Sun Blade X6275/QDR IB/ Reverse Time Migration
Feb 23, 2010 IBM POWER7 SPECfp_rate2006: Poor Scaling? Or Configuration Confusion?
Jan 25, 2010 Sun/Solaris Leadership in SAP SD Benchmarks and HP claims
Jan 21, 2010 SPARC Enterprise M4000 PeopleSoft NA Payroll 240K Employees Performance (16 Streams)
Dec 16, 2009 Sun Fire X4640 Delivers World Record x86 Result on SPEC OMPL2001
Nov 24, 2009 Sun M9000 Fastest SAP 2-tier SD Benchmark on current SAP EP4 for SAP ERP 6.0 (Unicode)
Nov 20, 2009 Sun Blade X6275 cluster delivers leading results for Fluent truck_111m benchmark
Nov 20, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Nov 19, 2009 SPECmail2009: New World record on T5240 1.6GHz Sun 7310 and ZFS
Nov 18, 2009 Sun Flash Accelerator F20 PCIe Card Achieves 100K 4K IOPS and 1.1 GB/sec
Nov 05, 2009 New TPC-C World Record Sun/Oracle
Nov 02, 2009 Sun Blade X6275 Cluster Beats SGI Running Fluent Benchmarks
Nov 02, 2009 Sun Ultra 27 Delivers Leading Single Frame Buffer SPECviewperf 10 Results
Oct 28, 2009 SPC-2 Sun Storage 6780 Array RAID 5 & RAID 6 51% better $/performance than IBM DS5300
Oct 25, 2009 Sun C48 & Lustre fast for Seismic Reverse Time Migration using Sun X6275
Oct 25, 2009 Sun F5100 and Seismic Reverse Time Migration with faster Optimal Checkpointing
Oct 23, 2009 Wiki on performance best practices
Oct 20, 2009 Exadata V2 Information
Oct 15, 2009 Oracle Flash Cache - SGA Caching on Sun Storage F5100
Oct 13, 2009 Oracle Hyperion Sun M5000 and Sun Storage 7410
Oct 13, 2009 Sun T5440 Oracle BI EE Sun SPARC Enterprise T5440 World Record
Oct 13, 2009 SPECweb2005 on Sun SPARC Enterprise T5440 World Record using Solaris Containers and Sun Storage F5100 Flash
Oct 13, 2009 Oracle PeopleSoft Payroll (NA) Sun SPARC Enterprise M4000 and Sun Storage F5100 World Record Performance
Oct 13, 2009 SAP 2-tier SD Benchmark on Sun SPARC Enterprise M9000/32 SPARC64 VII
Oct 13, 2009 CP2K Life Sciences, Ab-initio Dynamics - Sun Blade 6048 Chassis with Sun Blade X6275 - Scalability and Throughput with Quad Data Rate InfiniBand
Oct 13, 2009 SAP 2-tier SD-Parallel on Sun Blade X6270 1-node, 2-node and 4-node
Oct 13, 2009 Halliburton ProMAX Oil & Gas Application Fast on Sun 6048/X6275 Cluster
Oct 13, 2009 SPECcpu2006 Results On MSeries Servers With Updated SPARC64 VII Processors
Oct 13, 2009 MCAE ABAQUS faster on Sun F5100 and Sun X4270 - Single Node World Record
Oct 12, 2009 MCAE ANSYS faster on Sun F5100 and Sun X4270
Oct 12, 2009 MCAE MCS/NASTRAN faster on Sun F5100 and Fire X4270
Oct 12, 2009 SPC-2 Sun Storage 6180 Array RAID 5 & RAID 6 Over 70% Better Price Performance than IBM
Oct 12, 2009 SPC-1 Sun Storage 6180 Array Over 70% Better Price Performance than IBM
Oct 12, 2009 Why Sun Storage F5100 is a good option for Peoplesoft NA Payroll Application
Oct 12, 2009 1.6 Million 4K IOPS in 1RU on Sun Storage F5100 Flash Array
Oct 11, 2009 TPC-C World Record Sun - Oracle
Oct 09, 2009 X6275 Cluster Demonstrates Performance and Scalability on WRF 2.5km CONUS Dataset
Oct 02, 2009 Sun X4270 VMware VMmark benchmark achieves excellent result
Sep 22, 2009 Sun X4270 Virtualized for Two-tier SAP ERP 6.0 Enhancement Pack 4 (Unicode) Standard Sales and Distribution (SD) Benchmark
Sep 01, 2009 String Searching - Sun T5240 & T5440 Outperform IBM Cell Broadband Engine
Aug 28, 2009 Sun X4270 World Record SAP-SD 2-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Aug 27, 2009 Sun SPARC Enterprise T5240 with 1.6GHz UltraSPARC T2 Plus Beats 4-Chip IBM Power 570 POWER6 System on SPECjbb2005
Aug 26, 2009 Sun SPARC Enterprise T5220 with 1.6GHz UltraSPARC T2 Sets Single Chip World Record on SPECjbb2005
Aug 12, 2009 SPECmail2009 on Sun SPARC Enterprise T5240 and Sun Java System Messaging Server 6.3
Jul 23, 2009 World Record Performance of Sun CMT Servers
Jul 22, 2009 Why does 1.6 beat 4.7?
Jul 21, 2009 Zeus ZXTM Traffic Manager World Record on Sun T5240
Jul 21, 2009 Sun T5440 Oracle BI EE World Record Performance
Jul 21, 2009 Sun T5440 World Record SAP-SD 4-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Jul 21, 2009 1.6 GHz SPEC CPU2006 - Rate Benchmarks
Jul 21, 2009 Sun Blade T6320 World Record SPECjbb2005 performance
Jul 21, 2009 New SPECjAppServer2004 Performance on the Sun SPARC Enterprise T5440
Jul 21, 2009 Sun T5440 SPECjbb2005 Beats IBM POWER6 Chip-to-Chip
Jul 21, 2009 New CMT results coming soon....
Jul 14, 2009 Vdbench: Sun StorageTek Vdbench, a storage I/O workload generator.
Jul 14, 2009 Storage performance and workload analysis using Swat.
Jul 10, 2009 World Record TPC-H@300GB Price-Performance for Windows on Sun Fire X4600 M2
Jul 06, 2009 Sun Blade 6048 Chassis with Sun Blade X6275: RADIOSS Benchmark Results
Jul 03, 2009 SPECmail2009 on Sun Fire X4275+Sun Storage 7110: Mail Server System Solution
Jun 30, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Jun 26, 2009 Sun Fire X2270 Cluster Fluent Benchmark Results
Jun 25, 2009 Sun SSD Server Platform Bandwidth and IOPS (Speeds & Feeds)
Jun 24, 2009 I/O analysis using DTrace
Jun 23, 2009 New CPU2006 Records: 3x better integer throughput, 9x better fp throughput
Jun 23, 2009 Sun Blade X6275 results capture Top Places in CPU2006 SPEED Metrics
Jun 19, 2009 Pointers to Java Performance Tuning resources
Jun 19, 2009 SSDs in HPC: Reducing the I/O Bottleneck BluePrint Best Practices
Jun 17, 2009 The Performance Technology group wiki is alive!
Jun 17, 2009 Performance of Sun 7410 and 7310 Unified Storage Array Line
Jun 16, 2009 Sun Fire X2270 MSC/Nastran Vendor_2008 Benchmarks
Jun 15, 2009 Sun Fire X4600 M2 Server Two-tier SAP ERP 6.0 (Unicode) Standard Sales and Distribution (SD) Benchmark
Jun 12, 2009 Correctly comparing SAP-SD Benchmark results
Jun 12, 2009 OpenSolaris Beats Linux on memcached Sun Fire X2270
Jun 11, 2009 SAS Grid Computing 9.2 utilizing the Sun Storage 7410 Unified Storage System
Jun 10, 2009 Using Solaris Resource Management Utilities to Improve Application Performance
Jun 09, 2009 Free Compiler Wins Nehalem Race by 2x
Jun 08, 2009 Variety of benchmark results to be posted on BestPerf
Jun 05, 2009 Interpreting Sun's SPECpower_ssj2008 Publications
Jun 03, 2009 Wide Variety of Topics to be discussed on BestPerf
Jun 03, 2009 Welcome to BestPerf group blog!

Tuesday Oct 26, 2010

3D VTI Reverse Time Migration Scalability On Sun Fire X2270-M2 Cluster with Sun Storage 7210

This Oil & Gas benchmark shows the Sun Storage 7210 system delivers almost 2 GB/sec bandwidth and realizes near-linear scaling performance on a cluster of 16 Sun Fire X2270 M2 servers.

Oracle's Sun Storage 7210 system attached via QDR InfiniBand to a cluster of sixteen of Oracle's Sun Fire X2270 M2 servers was used to demonstrate the performance of a Reverse Time Migration application, an important application in the Oil & Gas industry. The total application throughput and computational kernel scaling are presented for two production sized grids of 800 samples.

  • Both the Reverse Time Migration I/O and combined computation show near-linear scaling from 8 to 16 nodes on the Sun Storage 7210 system connected via QDR InfiniBand to a Sun Fire X2270 M2 server cluster:

      1243 x 1151 x 1231: 2.0x improvement
      2486 x 1151 x 1231: 1.7x improvement
  • The computational kernel of the Reverse Time Migration has linear to super-linear scaling from 8 to 16 nodes in Oracle's Sun Fire X2270 M2 server cluster:

      1243 x 1151 x 1231 : 2.2x improvement
      2486 x 1151 x 1231 : 2.0x improvement
  • Intel Hyper-Threading provides additional performance benefits to both the Reverse Time Migration I/O and computation when going from 12 to 24 OpenMP threads on the Sun Fire X2270 M2 server cluster:

      1243 x 1151 x 1231: 8% - computational kernel; 2% - total application throughput
      2486 x 1151 x 1231: 12% - computational kernel; 6% - total application throughput
  • The Sun Storage 7210 system delivers the Velocity, Epsilon, and Delta data to the Reverse Time Migration at a steady rate even when timing includes memory initialization and data object creation:

      1243 x 1151 x 1231: 1.4 to 1.6 GBytes/sec
      2486 x 1151 x 1231: 1.2 to 1.3 GBytes/sec

    One can see that when doubling the size of the problem, the additional complexity of overlapping I/O and multiple node file contention only produces a small reduction in read performance.

Performance Landscape

Application Scaling

Performance and scaling results of the total application, including I/O, for the reverse time migration demonstration application are presented. Results were obtained using a Sun Fire X2270 M2 server cluster with a Sun Storage 7210 system for the file server. The servers were running with hyperthreading enabled, allowing for 24 OpenMP threads per server.

Application Scaling Across Multiple Nodes
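
| Grid Size | Number Nodes | Total Time (sec) | Kernel Time (sec) | Total Speedup | Kernel Speedup |
|---|---|---|---|---|---|
| 1243 x 1151 x 1231 | 16 | 504 | 259 | 2.0 | 2.2\* |
| 1243 x 1151 x 1231 | 14 | 565 | 279 | 1.8 | 2.0 |
| 1243 x 1151 x 1231 | 12 | 662 | 343 | 1.6 | 1.6 |
| 1243 x 1151 x 1231 | 10 | 784 | 394 | 1.3 | 1.4 |
| 1243 x 1151 x 1231 | 8 | 1024 | 560 | 1.0 | 1.0 |
| 2486 x 1151 x 1231 | 16 | 1024 | 551 | 1.7 | 2.0 |
| 2486 x 1151 x 1231 | 14 | 1191 | 677 | 1.5 | 1.6 |
| 2486 x 1151 x 1231 | 12 | 1426 | 817 | 1.2 | 1.4 |
| 2486 x 1151 x 1231 | 10 | 1501 | 856 | 1.2 | 1.3 |
| 2486 x 1151 x 1231 | 8 | 1745 | 1108 | 1.0 | 1.0 |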

\* Super-linear scaling due to the compute kernel fitting better into available cache
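
The speedup columns are relative to the 8-node run of the same grid. A minimal Python sketch of that calculation, using only the 8-node and 16-node timings from the table; the function and variable names are illustrative and are not taken from the RTM application:

```python
# Speedup relative to the 8-node baseline, using timings from the table above.
# All names here are illustrative; they do not come from the benchmark code.

def speedup(baseline_sec, scaled_sec):
    """Return how many times faster the scaled run is than the baseline."""
    return baseline_sec / scaled_sec

# (total_sec, kernel_sec) for the 8-node and 16-node runs, per grid size
timings = {
    "1243 x 1151 x 1231": {8: (1024, 560), 16: (504, 259)},
    "2486 x 1151 x 1231": {8: (1745, 1108), 16: (1024, 551)},
}

def scaling_summary(grid):
    """Total and kernel speedup going from 8 to 16 nodes, rounded to 0.1."""
    base_total, base_kernel = timings[grid][8]
    total, kernel = timings[grid][16]
    return (round(speedup(base_total, total), 1),
            round(speedup(base_kernel, kernel), 1))

for grid in timings:
    total_x, kernel_x = scaling_summary(grid)
    print(f"{grid}: total {total_x}x, kernel {kernel_x}x")
```

Doubling the node count gives an ideal speedup of 2.0, so a kernel speedup above 2.0 is what the footnote calls super-linear.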

Application Scaling – Hyper-Threading Study

The effects of hyperthreading are presented when running the reverse time migration demonstration application. Results were obtained using a Sun Fire X2270 M2 server cluster with a Sun Storage 7210 system for the file server.

Hyper-Threading Comparison – 12 versus 24 OpenMP Threads

| Grid Size | Number Nodes | Threads per Node | Total Time (sec) | Kernel Time (sec) | Total HT Speedup | Kernel HT Speedup |
|---|---|---|---|---|---|---|
| 1243 x 1151 x 1231 | 16 | 24 | 504 | 259 | 1.02 | 1.08 |
| 1243 x 1151 x 1231 | 16 | 12 | 515 | 279 | 1.00 | 1.00 |
| 2486 x 1151 x 1231 | 16 | 24 | 1024 | 551 | 1.06 | 1.12 |
| 2486 x 1151 x 1231 | 16 | 12 | 1088 | 616 | 1.00 | 1.00 |
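
The Hyper-Threading percentages quoted in the summary bullets follow from these timings. A short Python sketch, assuming the gain is expressed as the 12-thread time divided by the 24-thread time, minus one; the helper name is hypothetical:

```python
# Percentage gain from running 24 OpenMP threads instead of 12 on 16 nodes.
# The helper name is hypothetical; the timings are taken from the table above.

def ht_gain_percent(time_12_threads, time_24_threads):
    """Percent improvement of the 24-thread run over the 12-thread run."""
    return round((time_12_threads / time_24_threads - 1) * 100)

# (12-thread seconds, 24-thread seconds)
small_grid = {"total": (515, 504), "kernel": (279, 259)}    # 1243 x 1151 x 1231
large_grid = {"total": (1088, 1024), "kernel": (616, 551)}  # 2486 x 1151 x 1231

for name, grid in (("1243 x 1151 x 1231", small_grid),
                   ("2486 x 1151 x 1231", large_grid)):
    kernel = ht_gain_percent(*grid["kernel"])
    total = ht_gain_percent(*grid["total"])
    print(f"{name}: {kernel}% kernel, {total}% total")
```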

Read Performance

Read performance is presented for the velocity, epsilon and delta files running the reverse time migration demonstration application. Results were obtained using a Sun Fire X2270 M2 server cluster with a Sun Storage 7210 system for the file server. The servers were running with hyperthreading enabled, allowing for 24 OpenMP threads per server.

Velocity, Epsilon, and Delta File Read and Memory Initialization Performance

| Grid Size | Number Nodes | Overlap MBytes Read | Time (sec) | Time Relative 8-node | Total GBytes Read | Read Rate GB/s |
|---|---|---|---|---|---|---|
| 1243 x 1151 x 1231 | 16 | 2040 | 16.7 | 1.1 | 23.2 | 1.4 |
| 1243 x 1151 x 1231 | 8 | 951 | 14.8 | 1.0 | 22.1 | 1.6 |
| 2486 x 1151 x 1231 | 16 | 2040 | 36.8 | 1.1 | 44.3 | 1.2 |
| 2486 x 1151 x 1231 | 8 | 951 | 33.0 | 1.0 | 43.2 | 1.3 |

Configuration Summary

Hardware Configuration:

16 x Sun Fire X2270 M2 servers, each with
2 x 2.93 GHz Intel Xeon X5670 processors
48 GB memory (12 x 4 GB at 1333 MHz)

Sun Storage 7210 system connected via QDR InfiniBand
2 x 18 GB SATA SSD (logzilla)
40 x 1 TB 7200 RPM SATA disks

Software Configuration:

SUSE Linux Enterprise Server SLES 10 SP 2
Oracle Message Passing Toolkit 8.2.1 (for MPI)
Sun Studio 12 Update 1 C++, Fortran, OpenMP

Benchmark Description

This Reverse Time Migration (RTM) demonstration application measures the total time it takes to image 800 samples of various production size grids and write the final image to disk. In this version, each node reads in only the trace, velocity, and conditioning data to be processed by that node plus a four element inline 3-D array pad (spatial order of eight) shared with its neighbors to the left and right during the initialization phase. It represents a full RTM application including the data input, computation, communication, and final output image to be used by the next work flow step involving 3D volumetric seismic interpretation.

Key Points and Best Practices

This demonstration application represents a full Reverse Time Migration solution. Many references to the RTM application tend to focus on the compute kernel and ignore the complexity that the input, communication, and output bring to the task.

I/O Characterization without Optimal Checkpointing

Velocity, Epsilon, and Delta Files - Grid Reading

The additional amount of overlapping reads to share velocity, epsilon, and delta edge data with neighbors can be calculated using the following equation:

    (number_nodes - 1) x (order_in_space) x (y_dimension) x (z_dimension) x (4 bytes) x (3 files)

For this particular benchmark study, the additional 3-D pad overlap for the 16 and 8 node cases is:

    16 nodes: 15 x 8 x 1151 x 1231 x 4 x 3 = 2.04 GB extra
    8 nodes: 7 x 8 x 1151 x 1231 x 4 x 3 = 0.95 GB extra

For the first of the two test cases, the total size of the three files used for the 1243 x 1151 x 1231 case is

    1243 x 1151 x 1231 x 4 bytes = 7.04 GB per file x 3 files = 21.13 GB

With the additional 3-D pad, the total amount of data read is:

    16 nodes: 2.04 GB + 21.13 GB = 23.2 GB
    8 nodes: 0.95 GB + 21.13 GB = 22.1 GB

For the second of the two test cases, the total size of the three files used for the 2486 x 1151 x 1231 case is

    2486 x 1151 x 1231 x 4 bytes = 14.09 GB per file x 3 files = 42.27 GB

With the additional pad based on the number of nodes, the total amount of data read is:

    16 nodes: 2.04 GB + 42.27 GB = 44.3 GB
    8 nodes: 0.95 GB + 42.27 GB = 43.2 GB

Note that the amount of overlapping data read grows not only with the number of nodes but also with the y and z dimensions.
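
The grid-file arithmetic above can be sketched in Python as follows. The function and constant names are illustrative, and GB is assumed to mean 10^9 bytes because that is what reproduces the quoted figures:

```python
# Overlap and total read size for the velocity, epsilon, and delta files,
# following the equation above. Names are illustrative, and GB is assumed
# to mean 10**9 bytes.

ORDER_IN_SPACE = 8    # spatial order of eight
BYTES_PER_VALUE = 4   # the "4 bytes" term in the equation
NUM_FILES = 3         # velocity, epsilon, delta

def overlap_bytes(nodes, ny, nz):
    """Extra bytes read so that neighbouring nodes can share edge data."""
    return (nodes - 1) * ORDER_IN_SPACE * ny * nz * BYTES_PER_VALUE * NUM_FILES

def grid_bytes(nx, ny, nz):
    """Combined size of the three grid files."""
    return nx * ny * nz * BYTES_PER_VALUE * NUM_FILES

def total_read_gb(nodes, nx, ny, nz):
    """Total data read across all nodes, in GB."""
    return (overlap_bytes(nodes, ny, nz) + grid_bytes(nx, ny, nz)) / 1e9

for nx in (1243, 2486):
    for nodes in (16, 8):
        gb = total_read_gb(nodes, nx, 1151, 1231)
        print(f"{nx} x 1151 x 1231, {nodes} nodes: {gb:.1f} GB")
```

Because the overlap term has no x dimension in it, doubling the grid in x leaves the extra read volume unchanged for a given node count.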

Trace Reading

The additional amount of overlapping reads to share trace edge data with neighbors can be calculated using the following equation:

    (number_nodes - 1) x (order_in_space) x (y_dimension) x (4 bytes) x (number_of_time_slices)

For this particular benchmark study, the additional overlap for the 16 and 8 node cases is:

    16 nodes: 15 x 8 x 1151 x 4 x 800 = 442MB extra
    8 nodes: 7 x 8 x 1151 x 4 x 800 = 206MB extra

For the first case the size of the trace data file used for the 1243 x 1151 x 1231 case is

    1243 x 1151 x 4 bytes x 800 = 4.578 GB

With the additional pad based on the number of nodes, the total amount of data read is:

    16 nodes: .442 GB + 4.578 GB = 5.0 GB
    8 nodes: .206 GB + 4.578 GB = 4.8 GB

For the second case the size of the trace data file used for the 2486 x 1151 x 1231 case is

    2486 x 1151 x 4 bytes x 800 = 9.156 GB

With the additional pad based on the number of nodes, the total amount of data read is:

    16 nodes: .442 GB + 9.156 GB = 9.6 GB
    8 nodes: .206 GB + 9.156 GB = 9.4 GB

As the number of nodes is increased, the overlap causes more disk lock contention.
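
The trace-file figures can be checked the same way. This is a sketch under the same assumptions (decimal MB and GB, illustrative names), with the 800 time slices used in this study:

```python
# Trace-file overlap, following the trace equation above.
# MB and GB are taken as powers of ten; all names are illustrative.

ORDER_IN_SPACE = 8
BYTES_PER_VALUE = 4
TIME_SLICES = 800

def trace_overlap_bytes(nodes, ny):
    """Extra trace bytes read so neighbouring nodes can share edge data."""
    return (nodes - 1) * ORDER_IN_SPACE * ny * BYTES_PER_VALUE * TIME_SLICES

def trace_file_bytes(nx, ny):
    """Size of the trace file for an nx-by-ny surface."""
    return nx * ny * BYTES_PER_VALUE * TIME_SLICES

def trace_total_gb(nodes, nx, ny):
    """Trace data read across all nodes, in GB."""
    return (trace_overlap_bytes(nodes, ny) + trace_file_bytes(nx, ny)) / 1e9

for nx in (1243, 2486):
    for nodes in (16, 8):
        print(f"nx={nx}, {nodes} nodes: {trace_total_gb(nodes, nx, 1151):.1f} GB")
```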

Writing Final Output Image

1243x1151x1231 - 7.1 GB per file:

    16 nodes: 78 x 1151 x 1231 x 4 = 442MB/node (7.1 GB total)
    8 nodes: 156 x 1151 x 1231 x 4 = 884MB/node (7.1 GB total)

2486x1151x1231 - 14.1 GB per file:

    16 nodes: 156 x 1151 x 1231 x 4 = 930 MB/node (14.1 GB total)
    8 nodes: 311 x 1151 x 1231 x 4 = 1808 MB/node (14.1 GB total)

Resource Allocation

It is best to allocate one node as the Oracle Grid Engine resource scheduler and MPI master host. This is especially true when running with 24 OpenMP threads in hyperthreading mode to avoid oversubscribing a node that is cooperating in delivering the solution.

Disclosure Statement

Copyright 2010, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 10/20/2010.

Monday Oct 11, 2010

Sun SPARC Enterprise M9000 Server Delivers World Record Non-Clustered TPC-H @3000GB Performance

Oracle's Sun SPARC Enterprise M9000 server delivered a single-system TPC-H 3000GB world record performance. The Sun SPARC Enterprise M9000 server, running Oracle Database 11g Release 2 on the Oracle Solaris operating system proves the power of Oracle's integrated solution.

  • Oracle beats IBM Power with better performance and price/performance (3 Year TCO). This shows that Oracle's focus on integrated system design provides more customer value than IBM's focus on "per core performance"!

  • The Sun SPARC Enterprise M9000 server is 27% faster than the IBM Power 595.

  • The Sun SPARC Enterprise M9000 server is 22% faster than the HP ProLiant DL980 G7.

  • The Sun SPARC Enterprise M9000 server has a 26% lower (better) price/performance than the IBM Power 595.

  • The Sun SPARC Enterprise M9000 server is 2.7 times faster than the IBM Power 595 for data loading.

  • The Sun SPARC Enterprise M9000 server is 2.3 times faster than the HP ProLiant DL980 for data loading.

  • The Sun SPARC Enterprise M9000 server is 2.6 times faster than the IBM p595 for Refresh Function.

  • The Sun SPARC Enterprise M9000 server is 3 times faster than the HP ProLiant DL980 for Refresh Function.

  • Oracle used Storage Redundancy Level 3 as defined by the TPC-H 2.12.0 specification, which is the highest level. IBM is the only other vendor to secure the storage to this level.

  • One should focus on the performance of the complete hardware and software stack since server implementation details such as the number of cores or the number of threads will obscure the important metrics of delivered system performance and system price/performance.

  • The Sun SPARC Enterprise M9000 server configured with SPARC64 VII processors, Sun Storage 6180 arrays, and running Oracle Solaris 10 operating system combined with Oracle Database 11g Release 2 achieved World Record TPC-H performance of 198,907.5 QphH@3000GB for non-clustered systems.

  • The Sun SPARC Enterprise M9000 server is over three times faster than the HP Itanium2 Superdome.

  • The Sun Storage 6180 array configuration (a total of 16 6180 arrays) in this benchmark delivered IO performance of over 21 GB/sec Sequential Read performance as measured by the vdbench tool.

  • This TPC-H result demonstrates that the Sun SPARC Enterprise M9000 server can handle the increasingly large databases required of DSS systems. The server delivered more than 18 GB/sec of real IO throughput as measured by the Oracle Database 11g Release 2 software.

  • Both Oracle and IBM had the same level of hardware discounting as allowed by TPC rules to provide an effective comparison of price/performance.

  • IBM has not shown any delivered I/O performance results for the high-end IBM POWER7 systems. In addition, they have not delivered any commercial benchmarks (TPC-C, TPC-H, etc.) which have heavy I/O demands.

Performance Landscape

TPC-H @3000GB, Non-Clustered Systems

| System | CPU type | Memory | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available |
|---|---|---|---|---|---|---|---|---|
| Sun SPARC Enterprise M9000 | 2.88GHz SPARC64 VII | 512GB | 198,907.5 | $15.27 | 182,350.7 | 216,967.7 | Oracle | 12/09/10 |
| HP ProLiant DL980 G7 | 2.27GHz Intel Xeon X7560 | 512GB | 162,601.7 | $2.68 | 185,297.7 | 142,601.7 | SQL Server | 10/13/10 |
| IBM Power 595 | 5.0GHz POWER6 | 512GB | 156,537.3 | $20.60 | 142,790.7 | 171,607.4 | Sybase | 11/24/09 |
| Unisys ES7000 7600R | 2.6GHz Intel Xeon | 1024GB | 102,778.2 | $21.05 | 120,254.8 | 87,841.4 | SQL Server | 05/06/10 |
| HP Integrity Superdome | 1.6GHz Intel Itanium | 256GB | 60,359.3 | $32.60 | 80,838.3 | 45,068.3 | SQL Server | 05/21/07 |

QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity

Complete benchmark results found at the TPC benchmark website http://www.tpc.org.
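
The relative-performance claims in the summary bullets can be reproduced from the composite and price/performance columns. A small Python sketch; the dictionary layout and helper names are illustrative only:

```python
# Relative performance derived from the TPC-H @3000GB table above.
# The data layout and helper names are illustrative only.

results = {
    # system: (composite QphH, price/performance in $/QphH)
    "Sun SPARC Enterprise M9000": (198907.5, 15.27),
    "HP ProLiant DL980 G7": (162601.7, 2.68),
    "IBM Power 595": (156537.3, 20.60),
    "HP Integrity Superdome": (60359.3, 32.60),
}

def percent_faster(system, other):
    """How much higher system's composite QphH is than other's, in percent."""
    return (results[system][0] / results[other][0] - 1) * 100

def percent_lower_price_perf(system, other):
    """How much lower (better) system's $/QphH is than other's, in percent."""
    return (1 - results[system][1] / results[other][1]) * 100

m9000 = "Sun SPARC Enterprise M9000"
print(round(percent_faster(m9000, "IBM Power 595")))            # 27
print(round(percent_faster(m9000, "HP ProLiant DL980 G7")))     # 22
print(round(percent_lower_price_perf(m9000, "IBM Power 595")))  # 26
```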

Configuration Summary and Results

Server:

Sun SPARC Enterprise M9000
32 x SPARC64 VII 2.88 GHz processors
512 GB memory
4 x internal SAS (4 x 300 GB)

External Storage:

16 x Sun Storage 6180 arrays (16 x 16 x 300 GB)

Software:

Operating System: Oracle Solaris 10 10/09
Database: Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 3000 GB (Scale Factor 3000)
TPC-H Composite: 198,907.5 QphH@3000GB
Price/performance: $15.27/QphH@3000GB
Available: 12/09/2010
Total 3 year Cost: $3,037,900
TPC-H Power: 182,350.7
TPC-H Throughput: 216,967.7
Database Load Time: 3:40:11

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multi user modes. The benchmark requires reporting of price/performance, which is the ratio of total HW/SW cost plus 3 years of maintenance to QphH.
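
As a worked check, the TPC-H specification defines the composite metric as the geometric mean of the Power and Throughput metrics, and the reported $15.27/QphH is consistent with dividing the total 3-year cost by the composite. A Python sketch using the audited numbers from this entry; the function names are illustrative:

```python
# Composite metric and price/performance for the M9000 result.
# QphH is the geometric mean of Power (QppH) and Throughput (QthH);
# price/performance is total 3-year cost divided by QphH.
# Function names are illustrative only.
import math

def composite_qphh(power, throughput):
    """TPC-H composite metric from the Power and Throughput metrics."""
    return math.sqrt(power * throughput)

def price_performance(total_cost_usd, qphh):
    """Price/performance in dollars per QphH (smaller is better)."""
    return total_cost_usd / qphh

qphh = composite_qphh(182350.7, 216967.7)
print(f"Composite: about {qphh:,.0f} QphH@3000GB")
print(f"Price/performance: ${price_performance(3037900, 198907.5):.2f}/QphH")
```

The computed composite agrees with the audited 198,907.5 QphH to within rounding of the published inputs.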

Key Points and Best Practices

  • The Sun Storage 6180 array showed good scalability and these sixteen 6180 arrays showed over 21 GB/sec Sequential Read performance as measured by the vdbench tool.
  • Oracle Solaris 10 10/09 required little system tuning.
  • The optimal 6180 configuration for the benchmark was to set up 1 disk per volume instead of multiple disks per volume and let Oracle Solaris Volume Manager (SVM) mirror. Presenting as many volumes as possible to Oracle database gave the highest scan rate.

  • The storage was managed by SVM with 1MB stripe size to match with Oracle's database IO size. The default 16K stripe size is just too small for this DSS benchmark.

  • All the Oracle files, except TEMP tablespace, were mirrored under SVM. Eight 6180 arrays (128 disks) were mirrored to another eight 6180 arrays using a 128-way stripe. IO performance was good and balanced across all the disks with a round robin order. Read performance was the same with or without the mirror. With the SVM mirror the benchmark passed the ACID (Atomicity, Consistency, Isolation and Durability) test.

  • Oracle tables were 128-way partitioned and parallel degree for each table was set to 128 because the system had 128 cores. This setting worked the best for performance.

  • CPU usage during the Power run was moderate. Because the parallel degree was set to 128 for the tables and indexes, most of the queries used 128 vcpus, while the system had 256 vcpus.

Disclosure Statement

Sun SPARC Enterprise M9000 198,907.5 QphH@3000GB, $15.27/QphH@3000GB, avail 12/09/10, IBM Power 595 156,537.3 QphH@3000GB, $20.60/QphH@3000GB, avail 11/24/09, HP Integrity Superdome 60,359.3 QphH@3000GB, $32.60/QphH@3000GB avail 06/18/07, TPC-H, QphH, $/QphH tm of Transaction Processing Performance Council (TPC). More info www.tpc.org.

Thursday Sep 30, 2010

BestPerf Index 1 October 2010

This is an occasionally-generated index of previous entries in the BestPerf blog.

Sep 30, 2010 Consolidation of 30 x86 Servers onto One SPARC T3-2
Sep 29, 2010 SPARC T3-1 Delivers Record Number of Online Users on JD Edwards EnterpriseOne 9.0.1 Day in the Life Test
Sep 28, 2010 SPARC T3 Cryptography Performance Over 1.9x Increase in Throughput over UltraSPARC T2 Plus
Sep 28, 2010 SPARC T3-2 Delivers First Oracle E-Business X-Large Benchmark Self-Service (OLTP) Result
Sep 27, 2010 Sun Fire X2270 M2 Super-Linear Scaling of Hadoop Terasort and CloudBurst Benchmarks
Sep 27, 2010 SPARC T3-1 Shows Capabilities Running Online Auction Benchmark with Oracle Fusion Middleware
Sep 24, 2010 SPARC T3-2 sets World Record on SPECjvm2008 Benchmark
Sep 24, 2010 SPARC T3 Provides High Performance Security for Oracle Weblogic Applications
Sep 23, 2010 Sun Storage F5100 Flash Array with PCI-Express SAS-2 HBAs Achieves Over 17 GB/sec Read
Sep 23, 2010 SPARC T3-1 Performance on PeopleSoft Enterprise Financials 9.0 Benchmark
Sep 22, 2010 Oracle Solaris 10 9/10 ZFS OLTP Performance Improvements
Sep 22, 2010 SPARC T3-1 Supports 13,000 Users on Financial Services and Enterprise Application Integration Running Siebel CRM 8.1.1
Sep 21, 2010 ProMAX Performance and Throughput on Sun Fire X2270 and Sun Storage 7410
Sep 21, 2010 Sun Flash Accelerator F20 PCIe Cards Outperform IBM on SPC-1C
Sep 21, 2010 SPARC T3 Servers Deliver Top Performance on Oracle Communications Order and Service Management
Sep 20, 2010 Schlumberger's ECLIPSE 300 Performance Throughput On Sun Fire X2270 Cluster with Sun Storage 7410
Sep 20, 2010 Sun Fire X4470 4 Node Cluster Delivers World Record SAP SD-Parallel Benchmark Result
Sep 20, 2010 SPARC T3-4 Sets World Record Single Server Result on SPECjEnterprise2010 Benchmark
Aug 25, 2010 Transparent Failover with Solaris MPxIO and Oracle ASM
Aug 23, 2010 Repriced: SPC-1 Sun Storage 6180 Array (8Gb) 1.9x Better Than IBM DS5020 in Price-Performance
Aug 23, 2010 Repriced: SPC-2 (RAID 5 & 6 Results) Sun Storage 6180 Array (8Gb) Outperforms IBM DS5020 by up to 64% in Price-Performance
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ANSYS FLUENT Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Demonstrates Outstanding Single Node Performance on MSC.Nastran Benchmarks
Jun 29, 2010 Sun Fire X2270 M2 Achieves Leading Single Node Results on ABAQUS Benchmark
Jun 29, 2010 Sun Fire X2270 M2 Sets World Record on SPEC OMP2001 Benchmark
Jun 29, 2010 Sun Fire X4170 M2 Sets World Record on SPEC CPU2006 Benchmark
Jun 29, 2010 Sun Blade X6270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4270 M2 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4470 Sets World Records on SPEC OMP2001 Benchmarks
Jun 28, 2010 Sun Fire X4470 Sets World Record on SPEC CPU2006 Rate Benchmark
Jun 28, 2010 Sun Fire X4470 2-Node Configuration Sets World Record for SAP SD-Parallel Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Record on SPECjbb2005 Benchmark
Jun 28, 2010 Sun Fire X4800 Sets World Records on SPEC CPU2006 Rate Benchmarks
Jun 10, 2010 Hyperion Essbase ASO World Record on Sun SPARC Enterprise M5000
Jun 09, 2010 PeopleSoft Payroll 500K Employees on Sun SPARC Enterprise M5000 World Record
Jun 03, 2010 Sun SPARC Enterprise T5440 World Record SPECjAppServer2004
May 11, 2010 Per-core Performance Myth Busting
Apr 14, 2010 Oracle Sun Storage F5100 Flash Array Delivers World Record SPC-1C Performance
Apr 13, 2010 Oracle Sun Flash Accelerator F20 PCIe Card Accelerates Web Caching Performance
Apr 06, 2010 WRF Benchmark: X6275 Beats Power6
Mar 29, 2010 Sun Blade X6275/QDR IB/ Reverse Time Migration
Feb 23, 2010 IBM POWER7 SPECfp_rate2006: Poor Scaling? Or Configuration Confusion?
Jan 25, 2010 Sun/Solaris Leadership in SAP SD Benchmarks and HP claims
Jan 21, 2010 SPARC Enterprise M4000 PeopleSoft NA Payroll 240K Employees Performance (16 Streams)
Dec 16, 2009 Sun Fire X4640 Delivers World Record x86 Result on SPEC OMPL2001
Nov 24, 2009 Sun M9000 Fastest SAP 2-tier SD Benchmark on current SAP EP4 for SAP ERP 6.0 (Unicode)
Nov 20, 2009 Sun Blade X6275 cluster delivers leading results for Fluent truck_111m benchmark
Nov 20, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Nov 19, 2009 SPECmail2009: New World record on T5240 1.6GHz Sun 7310 and ZFS
Nov 18, 2009 Sun Flash Accelerator F20 PCIe Card Achieves 100K 4K IOPS and 1.1 GB/sec
Nov 05, 2009 New TPC-C World Record Sun/Oracle
Nov 02, 2009 Sun Blade X6275 Cluster Beats SGI Running Fluent Benchmarks
Nov 02, 2009 Sun Ultra 27 Delivers Leading Single Frame Buffer SPECviewperf 10 Results
Oct 28, 2009 SPC-2 Sun Storage 6780 Array RAID 5 & RAID 6 51% better $/performance than IBM DS5300
Oct 25, 2009 Sun C48 & Lustre fast for Seismic Reverse Time Migration using Sun X6275
Oct 25, 2009 Sun F5100 and Seismic Reverse Time Migration with faster Optimal Checkpointing
Oct 23, 2009 Wiki on performance best practices
Oct 20, 2009 Exadata V2 Information
Oct 15, 2009 Oracle Flash Cache - SGA Caching on Sun Storage F5100
Oct 13, 2009 Oracle Hyperion Sun M5000 and Sun Storage 7410
Oct 13, 2009 Sun T5440 Oracle BI EE Sun SPARC Enterprise T5440 World Record
Oct 13, 2009 SPECweb2005 on Sun SPARC Enterprise T5440 World Record using Solaris Containers and Sun Storage F5100 Flash
Oct 13, 2009 Oracle PeopleSoft Payroll (NA) Sun SPARC Enterprise M4000 and Sun Storage F5100 World Record Performance
Oct 13, 2009 SAP 2-tier SD Benchmark on Sun SPARC Enterprise M9000/32 SPARC64 VII
Oct 13, 2009 CP2K Life Sciences, Ab-initio Dynamics - Sun Blade 6048 Chassis with Sun Blade X6275 - Scalability and Throughput with Quad Data Rate InfiniBand
Oct 13, 2009 SAP 2-tier SD-Parallel on Sun Blade X6270 1-node, 2-node and 4-node
Oct 13, 2009 Halliburton ProMAX Oil & Gas Application Fast on Sun 6048/X6275 Cluster
Oct 13, 2009 SPECcpu2006 Results On MSeries Servers With Updated SPARC64 VII Processors
Oct 13, 2009 MCAE ABAQUS faster on Sun F5100 and Sun X4270 - Single Node World Record
Oct 12, 2009 MCAE ANSYS faster on Sun F5100 and Sun X4270
Oct 12, 2009 MCAE MCS/NASTRAN faster on Sun F5100 and Fire X4270
Oct 12, 2009 SPC-2 Sun Storage 6180 Array RAID 5 & RAID 6 Over 70% Better Price Performance than IBM
Oct 12, 2009 SPC-1 Sun Storage 6180 Array Over 70% Better Price Performance than IBM
Oct 12, 2009 Why Sun Storage F5100 is a good option for Peoplesoft NA Payroll Application
Oct 12, 2009 1.6 Million 4K IOPS in 1RU on Sun Storage F5100 Flash Array
Oct 11, 2009 TPC-C World Record Sun - Oracle
Oct 09, 2009 X6275 Cluster Demonstrates Performance and Scalability on WRF 2.5km CONUS Dataset
Oct 02, 2009 Sun X4270 VMware VMmark benchmark achieves excellent result
Sep 22, 2009 Sun X4270 Virtualized for Two-tier SAP ERP 6.0 Enhancement Pack 4 (Unicode) Standard Sales and Distribution (SD) Benchmark
Sep 01, 2009 String Searching - Sun T5240 & T5440 Outperform IBM Cell Broadband Engine
Aug 28, 2009 Sun X4270 World Record SAP-SD 2-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Aug 27, 2009 Sun SPARC Enterprise T5240 with 1.6GHz UltraSPARC T2 Plus Beats 4-Chip IBM Power 570 POWER6 System on SPECjbb2005
Aug 26, 2009 Sun SPARC Enterprise T5220 with 1.6GHz UltraSPARC T2 Sets Single Chip World Record on SPECjbb2005
Aug 12, 2009 SPECmail2009 on Sun SPARC Enterprise T5240 and Sun Java System Messaging Server 6.3
Jul 23, 2009 World Record Performance of Sun CMT Servers
Jul 22, 2009 Why does 1.6 beat 4.7?
Jul 21, 2009 Zeus ZXTM Traffic Manager World Record on Sun T5240
Jul 21, 2009 Sun T5440 Oracle BI EE World Record Performance
Jul 21, 2009 Sun T5440 World Record SAP-SD 4-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Jul 21, 2009 1.6 GHz SPEC CPU2006 - Rate Benchmarks
Jul 21, 2009 Sun Blade T6320 World Record SPECjbb2005 performance
Jul 21, 2009 New SPECjAppServer2004 Performance on the Sun SPARC Enterprise T5440
Jul 21, 2009 Sun T5440 SPECjbb2005 Beats IBM POWER6 Chip-to-Chip
Jul 21, 2009 New CMT results coming soon....
Jul 14, 2009 Vdbench: Sun StorageTek Vdbench, a storage I/O workload generator.
Jul 14, 2009 Storage performance and workload analysis using Swat.
Jul 10, 2009 World Record TPC-H@300GB Price-Performance for Windows on Sun Fire X4600 M2
Jul 06, 2009 Sun Blade 6048 Chassis with Sun Blade X6275: RADIOSS Benchmark Results
Jul 03, 2009 SPECmail2009 on Sun Fire X4275+Sun Storage 7110: Mail Server System Solution
Jun 30, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Jun 26, 2009 Sun Fire X2270 Cluster Fluent Benchmark Results
Jun 25, 2009 Sun SSD Server Platform Bandwidth and IOPS (Speeds & Feeds)
Jun 24, 2009 I/O analysis using DTrace
Jun 23, 2009 New CPU2006 Records: 3x better integer throughput, 9x better fp throughput
Jun 23, 2009 Sun Blade X6275 results capture Top Places in CPU2006 SPEED Metrics
Jun 19, 2009 Pointers to Java Performance Tuning resources
Jun 19, 2009 SSDs in HPC: Reducing the I/O Bottleneck BluePrint Best Practices
Jun 17, 2009 The Performance Technology group wiki is alive!
Jun 17, 2009 Performance of Sun 7410 and 7310 Unified Storage Array Line
Jun 16, 2009 Sun Fire X2270 MSC/Nastran Vendor_2008 Benchmarks
Jun 15, 2009 Sun Fire X4600 M2 Server Two-tier SAP ERP 6.0 (Unicode) Standard Sales and Distribution (SD) Benchmark
Jun 12, 2009 Correctly comparing SAP-SD Benchmark results
Jun 12, 2009 OpenSolaris Beats Linux on memcached Sun Fire X2270
Jun 11, 2009 SAS Grid Computing 9.2 utilizing the Sun Storage 7410 Unified Storage System
Jun 10, 2009 Using Solaris Resource Management Utilities to Improve Application Performance
Jun 09, 2009 Free Compiler Wins Nehalem Race by 2x
Jun 08, 2009 Variety of benchmark results to be posted on BestPerf
Jun 05, 2009 Interpreting Sun's SPECpower_ssj2008 Publications
Jun 03, 2009 Wide Variety of Topics to be discussed on BestPerf
Jun 03, 2009 Welcome to BestPerf group blog!

Consolidation of 30 x86 Servers onto One SPARC T3-2

One of Oracle's SPARC T3-2 servers was able to consolidate the database workloads off of thirty older x86 servers in a secure virtualized environment.

  • The thirty x86 servers required 6.7 times more power than the consolidated workload on the SPARC T3-2 server.

  • The x86 configuration used 10 times the rack space of the consolidated workload on the SPARC T3-2 server.

  • In addition to power & space considerations, there are also administrative cost savings resulting from having to manage just one server, as opposed to thirty servers.

  • Gartner says, "They need to realize that removing a single x86 server from a data center will result in savings of more than $400 a year in energy costs alone".

  • The total transaction throughput for the SPARC T3-2 server (132,000) was almost the same as the aggregate throughput achieved by the thirty x86 servers (138,000), where each x86 server ran at 10% utilization.

  • The average transaction response time on the SPARC T3-2 server (24 ms) was just a little higher than the average transaction response time on the Intel servers (19.5 ms).

Performance Landscape

| System | Oracle Instances | Average System Utilization | Transactions/min/system | Average Response Time (ms) | Watts/system | OS |
|---|---|---|---|---|---|---|
| Sun Fire X4250, 2x 3.0GHz Xeon | 1 | 10% | 4,600 | 19.5 | 320 | Linux |
| SPARC T3-2, 2x 1.65GHz SPARC T3 | 30 | 80% | 132,000 | 24.0 | 1400\* | Solaris |

\* power consumption includes storage and peripheral devices

Notes:
total throughput for 30 Intel systems = 30 \* 4600 = 138,000
total watts for 30 Intel systems = 30 \* 320 = 9600
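
The arithmetic in these notes can be checked with a short Python sketch. The variable names are illustrative, and the inputs are the per-system figures from the table above, so the derived ratios are approximate and need not match the summary figures exactly.

```python
# Per-system figures taken from the Performance Landscape table.
X86_SERVERS = 30
X86_TPM_PER_SYSTEM = 4_600      # transactions/min on one Sun Fire X4250
X86_WATTS_PER_SYSTEM = 320
T3_TPM = 132_000                # transactions/min on the SPARC T3-2
T3_WATTS = 1_400                # includes storage and peripheral devices

x86_total_tpm = X86_SERVERS * X86_TPM_PER_SYSTEM
x86_total_watts = X86_SERVERS * X86_WATTS_PER_SYSTEM

# Ratios derived from the table values as published.
throughput_ratio = T3_TPM / x86_total_tpm
power_ratio = x86_total_watts / T3_WATTS

print(f"x86 aggregate throughput: {x86_total_tpm:,} transactions/min")
print(f"x86 aggregate power: {x86_total_watts:,} watts")
print(f"SPARC T3-2 throughput relative to x86 aggregate: {throughput_ratio:.1%}")
print(f"x86 power relative to SPARC T3-2: {power_ratio:.1f}x")
```

Computed from the table values, the power ratio lands slightly above the 6.7x quoted in the summary; the table figures appear to be rounded, so a small difference is expected.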

Results and Configuration Summary

x86 Server Configuration:

30 x Sun Fire X4250 servers, each with
2 x Intel 3.0 GHz E5450 processors
16 GB memory
6 x internal 146 GB 15K SAS disks
RedHat Linux 5.3
Oracle Database 11g Release 2

SPARC T3 Server Configuration:

1 x SPARC T3-2 server
2 x 1.65 GHz SPARC T3 processors
256 GB memory
2 x 300 GB 10K internal SAS disks
1 x Sun Storage F5100 Flash Array storage
1 x Sun Fire X4270 server as COMSTAR target
Oracle Solaris 10 9/10
Oracle Database 11g Release 2

Benchmark Description

This demonstration was designed to show the benefits of virtualization when upgrading from older x86 systems to one of Oracle's T-series servers. A 30:1 consolidation was shown, moving from thirty x86 Linux servers to a single T-series server running Oracle Solaris in a secure virtualized environment. After the consolidation, there was still 20% headroom in the SPARC T3-2 server for additional growth in the workload.

The 200 scale iGen OLTP workload was used to test the consolidation. The x86 system was loaded with iGen clients up to a level of 10% CPU utilization. This load level is typical of x86 systems in many data centers.

Thirty Oracle Solaris zones (containers) were created on the SPARC T3-2 server, with each zone configured identically to the Oracle configuration on the x86 server. The throughput on each zone was ramped up to the same level as on the Intel base server.

The overall CPU utilization on the SPARC T3-2 server, the average iGen transaction response times, and the power consumption were then measured.

Key Points and Best Practices

  • Each Oracle Solaris container was assigned to a processor set consisting of eight virtual CPUs. This use of processor sets was critical to obtaining the reported performance number. Without processor sets, the performance was reduced to about one-half the reported performance number.

  • Once the first container was completely configured (with Oracle 11g and iGen installed), the remaining containers were created by a simple cloning procedure, which took no more than a few minutes for each container.

  • Setting up a standalone x86 server with Linux, Oracle and iGen is a far more time-consuming task than setting up additional containers once the first container has been created.
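
As a rough illustration of the processor-set layout described in the first point, the sketch below partitions virtual CPUs into fixed-size sets, one per container. The total of 256 virtual CPUs (2 processors x 16 cores x 8 threads), the contiguous CPU numbering, and the names zone01 through zone30 are assumptions for illustration; the source does not give the actual CPU IDs or the commands used.

```python
def plan_processor_sets(total_vcpus, containers, vcpus_per_set):
    """Assign each container a disjoint, contiguous block of virtual CPU IDs.

    Returns (plan, leftover) where plan maps a container name to its CPU IDs
    and leftover lists the CPUs that remain in the default processor set.
    """
    needed = containers * vcpus_per_set
    if needed > total_vcpus:
        raise ValueError(f"need {needed} vCPUs but only {total_vcpus} exist")
    plan = {}
    for i in range(containers):
        start = i * vcpus_per_set
        # Hypothetical zone names; the real container names are not given.
        plan[f"zone{i + 1:02d}"] = list(range(start, start + vcpus_per_set))
    leftover = list(range(needed, total_vcpus))
    return plan, leftover


# Assumed topology: 2 SPARC T3 processors x 16 cores x 8 hardware threads.
TOTAL_VCPUS = 2 * 16 * 8
plan, default_set = plan_processor_sets(TOTAL_VCPUS, containers=30, vcpus_per_set=8)

print(plan["zone01"])    # CPU IDs assigned to the first container
print(len(default_set))  # vCPUs left outside the 30 sets
```

Under the contiguous-numbering assumption, each block of eight virtual CPUs lines up with a single core, so no two containers would share a core.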

Disclosure Statement

Copyright 2010, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/20/2010.

Wednesday Sep 29, 2010

SPARC T3-1 Delivers Record Number of Online Users on JD Edwards EnterpriseOne 9.0.1 Day in the Life Test

Using the online component of the "Day in the Life" test, which represents the most popular of Oracle's JD Edwards EnterpriseOne applications, Oracle's SPARC T3-1 server running Oracle Application Server 10g Release 3 and JD Edwards EnterpriseOne 9.0.1 in Oracle Solaris Containers, in tandem with Oracle's Sun SPARC Enterprise M3000 server running Oracle Database 11g, delivered a record result of 4200 users.

  • A SPARC T3-1 server paired with a Sun SPARC Enterprise M3000 server beat the IBM POWER7 result by 5% on the JD Edwards EnterpriseOne 9.0.1 Day in the Life test.

  • The JD Edwards EnterpriseOne 9.0.1 application takes advantage of the large number of threads in the SPARC T3-1 server on the application tier, achieving 4200 users with a response time under 2 seconds.

  • This benchmark uses Oracle Solaris Containers for Web and Application tier consolidation. Containers provide flexibility, easier maintenance and better CPU utilization of the server leaving processing capacity for additional growth.

  • The Sun SPARC Enterprise M3000 server provides enterprise-class RAS features for customers deploying the database tier. Customers can take advantage of the SmartFlash feature in Oracle Database 11g Release 1 on Oracle Solaris.

  • To obtain this leading result, a number of Oracle technologies were used: Oracle Solaris 10, Oracle Solaris Containers, Oracle Java Hotspot Server VM, Oracle OC4J Application Server, Oracle Database 11g Release 1, SPARC T3-1 server, and Sun SPARC Enterprise M3000 server.

Performance Landscape

JDE EnterpriseOne DIL Kit Performance Chart

| System | Memory (GB) | OS | # Users | JD Edwards Version | Rack Units |
|---|---|---|---|---|---|
| SPARC T3-1, 1x1.65 GHz SPARC T3 | 128 | Solaris 10 | 4200 | 9.0.1 | 2U |
| IBM Power 750, 1x3.55 GHz POWER7 | 120 | IBM i 7.1 | 4000 | 9.0 | 4U |
| IBM Power 570, 4x4.2 GHz POWER6 | 128 | IBM i 6.1 | 2400 | 8.12 | 4U |
| IBM x3650M2, 2x2.93 GHz X5570 | 64 | OVM | 1000 | 9.0 | 2U |

Results as of 9/17/2010.

Results and Configuration Summary

Hardware Configuration:

1 x SPARC T3-1 server
1 x 1.65 GHz SPARC T3 processor
128 GB memory
1 x 10 GbE NIC
1 x Sun SPARC Enterprise M3000 server
1 x 2.76 GHz SPARC64 VII processor
64 GB memory
1 x 10 GbE NIC
2 x StorageTek 2540/2501

Software Configuration:

JD Edwards EnterpriseOne 9.0 Update 1
Tools 8.98.3.0
Oracle Database 11g Release 1
Oracle OC4J application server 10g R3
Oracle Solaris 10 9/10
Mercury LoadRunner 9.10 with Oracle DIL kit for JD Edwards EnterpriseOne 9.0 Update 1

Benchmark Description

Oracle's JD Edwards EnterpriseOne is an integrated applications suite of Enterprise Resource Planning software.

  • Oracle offers 70 JD Edwards EnterpriseOne application modules to support a diverse set of business operations.
  • Oracle's Day-In-Life (DIL) kit is a suite of scripts that exercises the most common transactions of JD Edwards EnterpriseOne applications, including business processes such as payroll, sales order, purchase order, work order, and other manufacturing processes, such as ship confirmation. These are labeled by industry acronyms such as SCM, CRM, HCM, SRM and FMS.
  • The DIL kit's scripts execute transactions typical of a mid-sized manufacturing company.

The workload consists of online transactions. It does not include the batch processing job components.

LoadRunner is used to run the workload and collect user transaction response times against increasing numbers of users. The key metric used to evaluate performance is the transaction response time reported by LoadRunner.

Key Points and Best Practices

Two JD Edwards EnterpriseOne instances and two Oracle Application Server instances on the SPARC T3-1 server were hosted in four separate Oracle Solaris Containers to demonstrate consolidation of multiple application and web servers.

  • Each Oracle Solaris container was bound to a separate processor set, each containing 3 cores (with 4 cores left for the default processor set). This was done to improve performance by using the physical memory closest to the processors, thereby reducing memory access latency and processor cross-calls. The default processor set was used for network and disk interrupt handling.

  • The applications were executed in the FX scheduling class to improve performance by reducing the frequency of context switches.

  • The database server was run in an Oracle Solaris Container hosted on the Sun SPARC Enterprise M3000 server.
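
The core allocation in the first point can be sketched as follows. The container names are hypothetical, and the mapping of cores to virtual CPU IDs (8 hardware threads per core, numbered contiguously) is an assumption about the SPARC T3 layout rather than something stated in the source.

```python
CORES = 16              # one SPARC T3 processor
THREADS_PER_CORE = 8    # hardware threads (virtual CPUs) per core
CORES_PER_SET = 3
# Hypothetical container names; the source only says there were four.
CONTAINERS = ["jde-app1", "jde-app2", "web1", "web2"]


def cpus_for_cores(first_core, n_cores):
    """Return the virtual CPU IDs covered by n_cores starting at first_core."""
    start = first_core * THREADS_PER_CORE
    return list(range(start, start + n_cores * THREADS_PER_CORE))


layout = {}
next_core = 0
for name in CONTAINERS:
    layout[name] = cpus_for_cores(next_core, CORES_PER_SET)
    next_core += CORES_PER_SET

# Whatever remains stays in the default set for interrupt handling.
default_cores = CORES - next_core
layout["default"] = cpus_for_cores(next_core, default_cores)

for name, cpus in layout.items():
    print(f"{name}: {len(cpus)} vCPUs ({cpus[0]}-{cpus[-1]})")
```

Allocating whole cores keeps each container's threads on cores it does not share with another container, which is consistent with the locality goal described above.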

Disclosure Statement

Copyright 2010, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/20/2010.

Tuesday Sep 28, 2010

SPARC T3 Cryptography Performance Over 1.9x Increase in Throughput over UltraSPARC T2 Plus

In this study, the pk11rsaperf cryptographic microbenchmark program was used to compare the throughput performance of the UltraSPARC T2 Plus and SPARC T3 processors.
  • For the standard RSA 1024-bit public key encryption, the SPARC T3 showed a 1.93x performance improvement over the UltraSPARC T2 Plus when performing multiple decryptions of a reference text encrypted with a fixed key pair.
  • The SPARC T3 achieved nearly 80,000 ops/s for the standard RSA 1024-bit public key encryption, when performing multiple decryptions of a reference text encrypted with a fixed key pair in a 1U server.

As security has taken on unprecedented importance in all facets of the IT industry, organizations today are proactively adopting cryptographic mechanisms to protect their business information from unauthorized access and to ensure its confidentiality and integrity during transit and storage.

Cryptographic operations are heavily compute-intensive, burdening the host system with additional CPU cycles and network bandwidth and resulting in significant degradation of the overall throughput of the system and its hosted applications.

Oracle's T-series systems based on Oracle's SPARC T3 processor provide the industry's fastest on-chip hardware cryptographic capabilities to accelerate the following ciphers and related operations:

  • AES (ECB, CBC, CTR, CCM, GCM, CFB)
  • RSA, DSA
  • Diffie-Hellman (key pair gen, derive)
  • Elliptic Curve (ECDH, ECDSA, key pair gen)
  • MD5, SHA1, SHA256, SHA384, SHA512
  • Hardware RNG
In contrast, the Intel Westmere processor only adds instructions to accelerate AES.
  • "The Intel AES-NI consists of seven instructions. Six of them offer full hardware support for AES. Four instructions support AES encryption and decryption, and the other two instructions support AES key expansion. The seventh aids in carry-less multiplication. The AES instructions have the flexibility to support all usages of AES, including all standard key lengths, standard modes of operation, and even some nonstandard or future variants." Reference
  • Further, Westmere's AES-NI instructions are \*not\* hypervisor aware: VM guests do not use the feature when given workloads, and Java Cryptography Extensions do not provide an AES-NI library.

Performance Landscape


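PK11 RSA 1024-bit Benchmark Test

| Processor | Processes | Threads per Process | Total Threads | Aggregate Performance (ops/sec) |
|---|---|---|---|---|
| SPARC T3 | 8 | 16 | 128 | 79,558 |
| SPARC T3 | 2 | 64 | 128 | 76,877 |
| SPARC T3 | 1 | 128 | 128 | 52,660 |
| UltraSPARC T2 Plus | 8 | 8 | 64 | 41,285 |
| UltraSPARC T2 Plus | 2 | 32 | 64 | 39,823 |
| UltraSPARC T2 Plus | 1 | 64 | 64 | 39,856 |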

Results and Configuration Summary

Hardware Configuration:

SPARC T3-1
1 x 1.65 GHz SPARC T3

Sun SPARC Enterprise T5240
2 x 1.6 GHz UltraSPARC T2 Plus (1 blacklisted)

Software Configuration:

Oracle Solaris 10 10/09

Benchmark Description

The RSA/AES-256 Cryptography benchmark suite was internally developed by Sun to measure maximum throughput of RSA private key (sign) operations and AES-256 operations that a system can perform. Multiple processes are used to achieve the maximum throughput.

pk11rsaperf measures the performance of RSA 1024-bit processing as performed by the Solaris Cryptographic Framework via the PKCS#11 API. Different data sizes and varying numbers of concurrent threads can be tested. The metric is ops/sec.
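
To show the general shape of such a measurement, here is a minimal Python sketch of a threaded ops/sec harness. It is not pk11rsaperf and does not use PKCS#11 or the Solaris Cryptographic Framework: the RSA private-key operation is stood in for by plain modular exponentiation with a toy key built from two known Mersenne primes, and all names are illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Toy RSA key built from two known Mersenne primes (NOT secure; illustration only).
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (requires Python 3.8+)

MESSAGE = 0x1234567890ABCDEF
CIPHERTEXT = pow(MESSAGE, E, N)     # reference text encrypted with the fixed key


def worker(ops):
    """Perform `ops` private-key operations and return how many succeeded."""
    done = 0
    for _ in range(ops):
        if pow(CIPHERTEXT, D, N) == MESSAGE:
            done += 1
    return done


def measure(threads, ops_per_thread):
    """Return (total ops, aggregate ops/sec) for the given thread count."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        total = sum(pool.map(worker, [ops_per_thread] * threads))
    elapsed = time.perf_counter() - start
    return total, total / elapsed


if __name__ == "__main__":
    for threads in (1, 2, 4):
        total, rate = measure(threads, ops_per_thread=20)
        print(f"{threads} threads: {total} ops, {rate:,.0f} ops/sec")
```

Because of CPython's global interpreter lock, these threads will not scale the way native threads driving hardware crypto units do; the sketch only illustrates how aggregate throughput is derived from a fixed amount of work and the elapsed wall-clock time.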

Key Points and Best Practices

  • When running the SPARC T3 at full capacity, at least 2 processes (64 threads each) are recommended, as this increases throughput by over 45% compared with using just 1 process (128 threads) for RSA processing.
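
The headline ratios can be re-derived from the table in the Performance Landscape section. The following Python sketch uses the published ops/sec values; the dictionary layout and variable names are illustrative.

```python
# Aggregate ops/sec from the PK11 RSA 1024-bit table,
# keyed by (processes, threads per process).
sparc_t3 = {(8, 16): 79_558, (2, 64): 76_877, (1, 128): 52_660}
ultrasparc_t2_plus = {(8, 8): 41_285, (2, 32): 39_823, (1, 64): 39_856}

# Best result on each processor.
best_t3 = max(sparc_t3.values())
best_t2_plus = max(ultrasparc_t2_plus.values())
speedup = best_t3 / best_t2_plus

# Gain from splitting 128 threads across 2 processes instead of 1.
two_vs_one = sparc_t3[(2, 64)] / sparc_t3[(1, 128)] - 1

print(f"SPARC T3 vs UltraSPARC T2 Plus: {speedup:.2f}x")
print(f"2 processes vs 1 process on SPARC T3: +{two_vs_one:.0%}")
```

Comparing the best configuration on each processor gives the 1.93x figure, and the 2-process versus 1-process comparison gives the gain of just over 45%.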

Disclosure Statement

Copyright 2010, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/20/2010.

About

BestPerf is the source of Oracle performance expertise. In this blog, Oracle's Strategic Applications Engineering group explores Oracle's performance results and shares best practices learned from working on Enterprise-wide Applications.
