Sunday Jan 30, 2011

PeopleSoft Financials 9.0 (Day-in-the-Life) Benchmark on Oracle Sun

It is very rare to see any vendor publishing a benchmark on competing products of their own let alone products that are 100% compatible with each other. Well, it happened at Oracle|Sun. M-series and T-series hardware was the subject of two similar / comparable benchmarks; and PeopleSoft Financials 9.0 DIL was the benchmarked workload.

Benchmark report URLs

PeopleSoft Financials 9.0 on Oracle's SPARC T3-1 Server
PeopleSoft Financials 9.0 on Oracle's Sun SPARC Enterprise M4000 Server

Brief description of workload

The benchmark workload simulated Financial Control and Reporting business processes that a customer typically performs when closing their books at period end. "Closing the books" generally involves Journal generation, editing & posting; General Ledger allocations, summary & consolidations and reporting in GL. The applications that were involved in this process are: General Ledger, Asset Management, Cash Management, Expenses, Payables and Receivables.

The benchmark execution simulated the processing required for closing the books (background batch processes) along with some online concurrent transaction activity by 1000 end users.

Summary of Benchmark Test Results

The following table summarizes the test results of the "close the books" business processes. For the online transaction response times, check the benchmark reports (too many transactions to summarize here). Feel free to draw your own conclusions.

As of this writing no other vendor published any benchmark result with PeopleSoft Financials 9.0 workload.

(If the following table is illegible, click here for cleaner copy of test results.)

Hardware Configuration Elapsed Time Journal Lines per Hour Ledger Lines per Hour
Batch only Batch + 1K users Batch only Batch + 1K users Batch only Batch + 1K users
DB + Proc Sched  1 x Sun SPARC Enterprise M5000 Server
 8 x 2.53 GHz QC SPARC64 VII processors, 128 GB RAM
App + Web  1 x SPARC T3-1 Server
 1 x 1.65 GHz 16-Core SPARC T3 processor, 128 GB RAM
24.15m
Reporting: 11.67m
25.03m
Reporting: 11.98m
6,355,093 6,141,258 6,209,682 5,991,382
DB + Proc Sched  1 x Sun SPARC Enterprise M5000 Server
 8 x 2.66 GHz QC SPARC64 VII+ processors, 128 GB RAM
App + Web  1 x Sun SPARC Enterprise M4000 Server
 4 x 2.66 GHz QC SPARC64 VII+ processors, 128 GB RAM
21.74m
Reporting: 11.35m
23.30m
Reporting: 11.42m
7,059,591 6,597,239 6,898,060 6,436,236

Software Versions

Oracle’s PeopleSoft Enterprise Financials/SCM 9.00.00.331
Oracle’s PeopleSoft Enterprise (PeopleTools) 8.49.23 64-bit
Oracle’s PeopleSoft Enterprise (PeopleTools) 8.49.23 32-bit on Windows Server 2003 SP2 for generating reports using nVision
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 64-bit + RDBMS patch 9699654
Oracle Tuxedo 9.1 RP36 Jolt 9.1 64-bit
Oracle WebLogic Server 9.2 MP3 64-bit (Java version "1.5.0_12")
MicroFocus Server Express 4.0 SP4 64-bit
Oracle Solaris 10 10/09 and 09/10

Acknowledgments

It is one of the complex and stressful benchmarks that I have ever been involved in. It is a collaborative effort from different teams within Oracle Corporation. A sincere thanks to the PeopleSoft benchmark team for providing adequate support throughout the execution of the benchmark and for the swift validation of benchmark results numerous times (yes, "numerous" - it is not a typo.)

Thursday Feb 18, 2010

PeopleSoft Campus Solutions 9.0 benchmark on Sun SPARC Enterprise M4000 and X6270 blades

Oracle|Sun published PeopleSoft Campus Solutions 9.0 benchmark results today. Here is the direct URL to the benchmark results white paper:

      PeopleSoft Enterprise Campus Solutions 9.0 using Oracle 11g on a Sun SPARC Enterprise M4000 & Sun Blade X6270 Modules

Sun published three PeopleSoft benchmarks on SPARC platform over the last 12 month period -- one OLTP and two batch benchmarks[1]. The latest benchmark is somewhat special for at least couple of reasons:

  • Campus Solutions 9.0 workload has both online transactions and batch processes, and
  • This is the very first time ever Sun published a PeopleSoft benchmark on x64 hardware running Oracle Enterprise Linux

The summary of the benchmark test results is shown below. These numbers were extracted from the very first page of the benchmark results white papers where Oracle|PeopleSoft highlights the significance of the test results and the actual numbers that are of interest to the customers. Test results are sorted by the hourly throughput (invoices & transcripts per hour) in the descending order. Click on the link that is underneath the vendor name to open corresponding benchmark result.

While analyzing these test results, remember that the higher the throughput, the better. In the case of online transactions, it is desirable to keep the response times as low as possible.

(Prettier version of the following table is at:
        Oracle PeopleSoft Campus Solutions 9.0 Benchmark Test Results
)

Oracle PeopleSoft Campus Solutions 9.0 Benchmark Test Results

Vendor Hardware Configuration OS Resource Utilization Response/Elapsed Times at Peak Load (6,000 users)
Online Transactions: Avg Response Times (sec) Batch Throughput/hr
CPU% Mem (GB) Logon LSC Page Load Page Save Invoice Transcripts
Sun DB 1 x M4000 with 2 x 2.53GHz SPARC64 VII QC processors, 32GB RAM
1 x Sun Storage Flash Accelerator F20 with 4 x 24GB FMODs
1 x ST2540 array with 11 × 136.7GB SAS 15K RPM drives
Solaris 10 37.29 20.94 0.64 0.78 0.82 1.57 31,797 36,652
APP 2 x X6270 blades with 2 x 2.93GHz Xeon 5570 QC processors, 24GB RAM OEL4 U8 41.69\* 4.99\*
WEB+PS 1 x X6270 blade with 2 x 2.8GHz Xeon 5560 QC processors, 24GB RAM OEL4 U8 33.08 6.03

HP DB 1 x Integrity rx6600 with 4 x 1.6GHz Itanium 9050 DC procs, 32G RAM
1 x HP StorageWorks EVA8100 array with 58 x 146GB drives
HP-UX 11iv3 61 30 0.71 0.91 0.83 1.63 22,753 30,257
APP 2 x BL460c blade with 2 x 3.16GHz Xeon 5460 QC procs, 16GB RAM RHEL4U5 61.81 3.6
WEB 1 x BL460c blade with 2 x 3GHz Xeon 5160 DC procs, 8GB RAM RHEL4U5 44.36 3.77
PS 1 x BL460c blade with 2 x 3GHz Xeon 5160 DC procs, 8GB RAM RHEL4U5 21.90 1.48

HP DB 1 x ProLiant DL580 G4 w/ 4 x 3.4GHz Xeon 7140M DC procs, 32G RAM
1 x HP StorageWorks XP128 array with 28 x 73GB drives
Win2003R2 70.37 21.26 0.72 1.17 0.94 1.80 17,621 25,423
APP 4 x BL480c G1 blades with 2 x 3GHz Xeon 5160 DC procs, 12GB RAM Win2003R2 65.61 2.17
WEB 1 x BL460c G1 blades with 2 x 3GHz Xeon 5160 DC procs, 12GB RAM Win2003R2 54.11 3.13
PS 1 x BL460c G1 blades with 2 x 3GHz Xeon 5160 DC procs, 12GB RAM Win2003R2 32.44 1.40

This is all public information. Feel free to compare the hardware configurations & the data presented in the table and draw your own conclusions. Since both Sun and HP used the same benchmark workload, toolkit and ran the benchmark with the same number of concurrent users and job streams for the batch processes, comparison should be pretty straight forward.

Hopefully the following paragraphs will provide relevant insights into the benchmark and the application.

Caution in interpreting the Online Transaction Response Times

Average response times for the online transactions were measured using HP's QuickTest Pro (QTP) tool. This is a benchmark requirement. QTP test scripts have a dependency on the web browser (IE in particular) -- hence it is extremely sensitive to the web browser latencies, remote desktop/VNC latencies and other latencies induced by the operating system. Be aware that all these latencies will be factored into the transaction response times and due to this, the final average transaction response times might be skewed a little. In other words, the reported average transaction response times may not necessarily be very accurate. In most of the cases we might be looking at the approximate values and the actual values might be far better than the ones reported in the benchmark report. (I really wish Oracle|PeopleSoft would throw away some of the skewed samples to make the data more accurate and reliable.). Please keep this in mind when looking at the response times of the online transactions.

Quick note about Consolidation

In our benchmark environment, we had the PeopleSoft Process Scheduler (batch server) set up on the same node as that of the web server node. In general, Oracle recommends setting up the process scheduler either on the database server node or on a dedicated system. However in the benchmark environment, we chose not to run the process scheduler on the database server node as it would hurt the performance of the online transactions. At the same time, we noticed plenty of idle CPU cycles on the web server node even at the peak load of 6,000 concurrent users, so we decided to run the PS on the web server node. In case if customers are not comfortable with this kind of setup, they can use any supported virtualization technology (eg., Logical Domains, Containers on Solaris, Oracle VM on OEL) to separate the process scheduler from the web server by allocating the system resources as they like. It is just a matter of choice.

PeopleSoft Load Balancing

PeopleSoft has load balancing mechanism built into the web server to forward the incoming requests to appropriate application server in the enterprise, and within the application server to send the request to an appropriate application server process, PSAPPSRV. (I'm not 100% sure but I think application server balances the load among application server processes in a round robin fashion on \*nix platforms whereas on Windows, it forwards all the requests to a single application server process until it reaches the configured limit before moving on to the next available application server process.). However this in-built load balancing is not perfect. Most of the times, the number of requests processed by each of the identically configured application server processes [running on different application server nodes in the enterprise] may not be even. This minor shortcoming could lead to uneven resource usage across different nodes in the PeopleSoft deployment. You can notice this in the CPU and memory usage reported for the two app server nodes in the benchmark environment (check the benchmark results white paper).

Sun Flash Accelerator F20 PCIe Card

To reduce I/O latency, hot tables and hot indexes were placed on a Sun Flash Accelerator F20 PCIe Card in this benchmark. The F20 card has a total capacity of 96 GB with 4 x 24GB Flash Modules (FMODs). Although this workload is moderately I/O intensive, the batch processes in this benchmark generate a lot of I/O for few minutes in the steady state of the benchmark. The flash accelerator handled the burst of I/O activity pretty well, and as a result the performance of the batch processesing was improved.

Check the white paper Best Practices for Oracle PeopleSoft Enterprise Payroll for North America using the Sun Storage F5100 Flash Array or Sun Flash Accelerator F20 PCIe Card to know more about the top flash products offered by Oracle|Sun and how they can be deployed in a PeopleSoft environment for maximum benefit.

Solaris specific Tuning

Almost on all versions of Solaris 10, the kernel uses 4M as the maximum page size despite the fact that the underlying hardware supports as high as 256M pages. However large pages may improve the performance of some of the memory intensive workloads such as Oracle database by reducing the number of virtual <=> physical translations there by reducing the expensive dTLB/iTLB misses. In the benchmark environment, the following values were set in the /etc/system configuration file of the database server node to enable 256MB pages for the process heap and ISM.


\* 256M pages for process heap
set max_uheap_lpsize=0x10000000

\* 256M pages for ISM
set mmu_ism_pagesize=0x10000000

While we are on the same topic, Linux configuration is out-of-the-box. No OS tuning was performed in this benchmark.

Tuning Tip for Solaris Customers

Even though we did not set up the middle-tier on a Solaris box in this benchmark, this particular tuning tip is still valid and may help all those customers running the application server on Solaris. Consider lowering the shell limit for the file descriptors to a value of 512 or less if it was set to any value greater than 512. As of today (until the release of PeopleTools 8.50), there are certain parts of code in PeopleSoft calls the file control routine, fcntl(), and the file close routine, fclose(), in a loop "ulimit -n" number of times to close a bunch of files which were opened to perform a specific task. In general, PeopleSoft processes won't open hundreds of files. Hence the above mentioned behavior results in ton of dummy calls that error out. Besides, those system calls are not cheap -- they consume CPU cycles. It gets worse when there are a number of PeopleSoft processes that exhibit this kind of behavior simultaneously. (high system CPU% is one of the symptoms that helps identifying this behavior). Oracle|PeopleSoft is currently trying to address this performance issue. Meanwhile customers can lower the file descriptors shell limit to reduce its intensity and impact.

We have not observed this behavior on OEL when running the benchmark. But be sure to trace the system calls and figure out if the shell limit for the file descriptors need be lowered even on Linux or other supported platforms.


______________________________________

Footnotes:

1. PeopleSoft benchmarks on Sun platform in year 2009-2010

  1. PeopleSoft HRMS 8.9 SELF-SERVICE Using ORACLE on Sun SPARC Enterprise M3000 and Enterprise T5120 Servers -- online transactions (OLTP)
  2. PeopleSoft Enterprise Payroll 9.0 using Oracle for Solaris on a Sun SPARC Enterprise M4000 (8 streams) -- batch workload
  3. PeopleSoft Enterprise Payroll 9.0 using Oracle for Solaris on a Sun SPARC Enterprise M4000 (16 streams) -- batch workload



2. \*HP's benchmark results white paper did not show the CPU and memory breakdown numbers separately for each of the application server nodes. It only shows the average of average CPU and memory utilization for all app server nodes under "App Servers". Sun's average CPU, memory numbers [shown in the above table] were calculated in the same way for consistency.

Wednesday Jan 20, 2010

PeopleSoft NA Payroll 240K EE Benchmark with 16 Job Streams : Another Home Run for Sun

Poor Steve A.[1] ... This entry is not about Steve A. though. It is about the new PeopleSoft NA Payroll benchmark result that Sun published today.

First things first. Here is the direct URL to our latest benchmark results:

        PeopleSoft Enterprise Payroll 9.0 using Oracle for Solaris on a Sun SPARC Enterprise M4000 (16 job streams[2] -- simply referred as 'stream' hereonwards)

The summary of the benchmark test results is shown below only for the 16 stream benchmarks. These numbers were extracted from the very first page of the benchmark results white papers where Oracle|PeopleSoft highlights the significance of the results and the actual numbers that are of interest to the customers. The results in the following table are sorted by the hourly throughput (payments/hour) in the descending order. The goal is to achieve as much hourly throughput as possible. Click on the link that is underneath the hourly throughput values to open corresponding benchmark result.

Oracle PeopleSoft North American Payroll 9.0 - Number of employees: 240,000 & Number of payments: 360,000
Vendor OS Hardware Config #Job Streams Elapsed Time (min) Hourly Throughput
Payments per Hour
Sun Solaris 10 5/09 1x Sun SPARC Enterprise M4000 with 4 x 2.53 GHz SPARC64-VII Quad-Core processors and 32 GB memory
1 x Sun Storage F5100 Flash Array with 40 Flash Modules for data, indexes
1 x Sun Storage J4200 Array for redo logs
16 43.78 493,376
HP HP-UX 1 x HP Integrity rx6600 with 4 x 1.6 GHz Intel Itanium2 9000 Dual-Core processors and 32 GB memory
1 x HP StorageWorks EVA 8100
16 68.07 317,320

This is all public information. Feel free to compare the hardware configurations and the data presented in both of the rows and draw your own conclusions. Since both Sun and HP used the same benchmark toolkit, workload and ran the benchmark with the same number of job streams, comparison should be pretty straight forward.

If you want to compare the 8 stream results, check the other blog entry: PeopleSoft North American Payroll on Sun Solaris with F5100 Flash Array : A blog Reprise. Sun used the same hardware to run both benchmark tests with 8 and 16 streams respectively. We could have gotten away with 20+ Flash Modules (FMODs), but we want to keep the benchmark environment consistent with our prior benchmark effort around the same benchmark workload with 8 job streams. Due to the same hardware setup, now we can easily demonstrate the advantage of parallelism (simply by comparing the test results from 8 and 16 stream benchmarks) and how resilient and scalable the F5100 Flash array is.

Our benchmarks showed an improvement of ~55% in overall throughput when the number of job streams were increased from 8 to 16. Also our 16 stream results showed ~55% improvement in overall throughput over HP's published results with the same number of streams at a maximum average CPU utilization of 45% compared to HP's maximum average CPU utilization of 89%. The half populated Sun Storage F5100 Flash Array played the key role in both of those benchmark efforts by demonstrating superior I/O performance over the traditional disk based arrays.

Before concluding, I would like to highlight a few known facts (just for the benefit of those people who may fall for the PR trickery):

  1. 8 job streams != 16 job streams. In other words, the results from an 8 stream effort is not comparable to that of a 16 stream result.
  2. The throughput should go up with increased number of job streams [ only up to some extent -- do not forget that there will be a saturation point for everything ]. For example, the throughput with 16 streams might be higher compared to the 8 stream throughput.
  3. The Law of Diminishing Returns applies to the software world too, not just for the economics. So, there is no guarantee that the throughput will be much better with 24 or 32 job streams.

Other blog posts and documents of interest:

  1. Best Practices for Oracle PeopleSoft Enterprise Payroll for North America using the Sun Storage F5100 Flash Array or Sun Flash Accelerator F20 PCIe Card
  2. PeopleSoft Enterprise Payroll 9.0 using Oracle for Solaris on a Sun SPARC Enterprise M4000 (8 streams benchmark white paper)
  3. PeopleSoft North American Payroll on Sun Solaris with F5100 Flash Array : A blog Reprise
  4. App benchmarks, incorrect conclusions and the Sun Storage F5100
  5. Oracle PeopleSoft Payroll (NA) Sun SPARC Enterprise M4000 and Sun Storage F5100 World Record Performance
































Notes:

[1] Steve A. tried so hard and his best to make everyone else believe that HP's 16 job stream NA Payroll 240K EE benchmark results are on par with Sun's 8 stream benchmark results. Apparently Steve A. failed and gave up after we showed the world a few screenshots from a published and eventually withdrawn benchmark [ by HP ]. You can read all his arguments, comparisons etc., in the comments section of my other blog entry PeopleSoft North American Payroll on Sun Solaris with F5100 Flash Array : A blog Reprise as well as in Joerg Moellenkamp's blog entries around the same topic.

[2] In PeopleSoft terminology, a job stream is something that is equivalent to a thread.

Tuesday Nov 10, 2009

PeopleSoft North American Payroll on Sun Solaris with F5100 Flash Array : A blog Reprise

During the "Sun day" keynote at OOW 09, John Fowler stated that we are #1 in PeopleSoft North American Payroll performance. Later Vince Carbone from our Performance Technologies group went on comparing our benchmark numbers with HP's and IBM's in BestPerf's group blog at Oracle PeopleSoft Payroll (NA) Sun SPARC Enterprise M4000 and Sun Storage F5100 World Record Performance. Meanwhile Jeorg Moellenkamp had been clarifying few things in his blog at App benchmarks, incorrect conclusions and the Sun Storage F5100. Interestingly it all happened while we have no concrete evidence in our hands to show to the outside world. We got our benchmark results validated right before the Oracle OpenWorld, which gave us the ability to speak about it publicly [ and we used it to the extent we could use ]. However Oracle folks were busy with their scheduled tasks for OOW 09 and couldn't work on the benchmark results white paper until now. Finally the white paper with the NA Payroll benchmark results is available on Oracle Applications benchmark web site. Here is the URL:

        PeopleSoft Enterprise Payroll 9.0 using Oracle for Solaris on a Sun SPARC Enterprise M4000

Once again the summary of results is shown below but in a slightly different format. These numbers were extracted from the very first page of the benchmark results white papers where PeopleSoft usually highlights the significance of the results and the actual numbers that they are interested in. The results are sorted by the hourly throughput (payments/hour) in the descending order. The goal is to achieve as much hourly throughput as possible. Since there is one 16 stream result as well in the following table, exercise caution when comparing 8 stream results with 16 stream results. In general, 16 parallel job streams are supposed to yield better throughput when compared to 8 parallel job streams. Hence comparing a 16 stream number with an 8 stream number is not an exact apple-to-apple comparison. It is more like comparing an apple to another apple that is half in size. Click on the link that is underneath the hourly throughput values to open corresponding benchmark result.

Oracle PeopleSoft North American Payroll 9.0 - Number of employees: 240,000 & Number of payments: 360,000
Vendor OS Hardware Config #Job Streams Elapsed Time (min) Hourly Throughput
Payments per Hour
Sun Solaris 10 5/09 1x Sun SPARC Enterprise M4000 with 4 x 2.53 GHz SPARC64-VII Quad-Core processors and 32 GB memory
1 x Sun Storage F5100 Flash Array with 40 Flash Modules for data, indexes
1 x Sun Storage J4200 Array for redo logs
8 67.85 318,349
HP HP-UX 1 x HP Integrity rx6600 with 4 x 1.6 GHz Intel Itanium2 9000 Dual-Core processors and 32 GB memory
1 x HP StorageWorks EVA 8100
16 68.07 317,320
HP HP-UX 1 x HP Integrity rx6600 with 4 x 1.6 GHz Intel Itanium2 9000 Dual-Core processors and 32 GB memory
1 x HP StorageWorks EVA 8100
8 89.77 240,615\*
IBM z/OS 1 x IBM zSeries 990 model 2084-B16 with 313 Feature with 6 x IBM z990 Gen1 processors (populated: 13, used: 6) and 32 GB memory
1 x IBM TotalStorage DS8300 with dual 4-way processors
8 91.7 235,551

This is all public information -- so, feel free to draw your own conclusions. \*At this time of writing, HP's 8 stream results were pulled out of Oracle Applications benchmark web site for some reason I do not know why. Hopefully it will show up again on the same web site soon. If it doesn't re-appear even after a month, probably we can simply assume that the result is withdrawn.

As these benchmark results were already discussed by different people in different blogs, I have nothing much to add. The only thing that I want to highlight is that this particular workload is moderately CPU intensive, but very I/O bound. Hence the better the I/O sub-system, the better the performance. Vince provided an insight on Why Sun Storage F5100 is a good option for this workload, while Jignesh Shah from our ISV-Engineering organization focused on the performance of this benchmark workload with F20 PCIe Card.

Also when dealing with NA Payroll, it is very unlikely to achieve a nice out-of-the-box performance. It requires a lot of database tuning too. As the data sets are very large, we partitioned the data in some of the very hot objects and it showed good improvement in query response times. So if you are a PeopleSoft customer running Payroll application with millions of rows of non-partitioned data, consider partitioning the data. [Updated 11/30/09]We are currently working on a best practices blueprint document for PeopleSoft North American Payroll that presents a variety of tuning tips like these in addition to the recommended practices for F5100 flash array and flash accelerator F20 PCIe card. Stay tuned .. Sun published a best practices blueprint document with a variety of tuning tips like these in addition to the recommended practices for F5100 flash array and flash accelerator F20 PCIe card. You can download the blueprint from the following location:

    Best Practices for Oracle PeopleSoft Enterprise Payroll for North America using the Sun Storage F5100 Flash Array or Sun Flash Accelerator F20 PCIe Card

Related Blog Post:

About

Benchmark announcements, HOW-TOs, Tips and Troubleshooting

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today