Monday Aug 17, 2009

T5440 Rocks [again] with Oracle Business Intelligence Enterprise Edition Workload

A while ago, I blogged about how we scaled Siebel 8.0 up to 14,000 concurrent users by consolidating the entire Siebel stack on a single Sun SPARC® Enterprise T5440 server with 4 x 1.4 GHz eight-core UltraSPARC® T2 Plus Processors. An OLTP workload was used in that performance benchmark effort.

We repeated a similar effort in collaboration with Oracle Corporation, this time with an OLAP workload. Today Sun and Oracle announced the 28,000 user Oracle Business Intelligence Enterprise Edition (OBIEE) 10.1.3.4 benchmark results on a single Sun SPARC Enterprise T5440 server with 4 x 1.6 GHz eight-core UltraSPARC T2 Plus Processors running the Solaris 10 5/09 operating system. An Oracle white paper with Sun's 28,000 user benchmark results is available on Oracle's benchmark web site.

Some of the notes and key takeaways from this benchmark are as follows:

  • Key specifications for the Sun SPARC Enterprise T5440 system under test are: 4 x UltraSPARC T2 Plus processors, 32 cores, 256 compute threads and 128 GB of memory in a 4RU space.

  • The entire OBIEE solution was deployed on a single Sun SPARC Enterprise T5440 server using Oracle BI Cluster software.

  • The BI Cluster was configured with 4 x BI nodes. Each of those BI nodes was configured to run inside a Solaris Container.

    1. Each Solaris Container was configured with one physical processor (that is, 8 cores or 64 virtual cpus), and 32 GB physical memory.

    2. Each BI node was configured to run BI Server, Presentation Server and OC4J Web Server

    3. Two of the BI nodes have the BI Cluster Controller running (primary & secondary)

    4. One out of four Containers was sharing CPU and memory resources with Oracle 11g RDBMS and the host operating system that are running in the global zone

  • Caching was turned ON at the application server, which led to minimal database activity on the server.

    1. In other words, one can use these results only to size the hardware requirements for a complete BI EE deployment excluding the database server.

    2. All the OBIEE benchmark results published so far were obtained with caching turned ON. This fact was not explicitly mentioned in some of the benchmark results white papers. Check the Competitive Landscape section for pointers to the benchmark results published by different vendors.

  • From our experiments with the OBIEE benchmark workload, it appears that a BI deployment with a single non-cluster BI node could reasonably scale well up to 7,500 active users on a T5440 server. To scale beyond 7,500 concurrent users, you might need another instance of BI. Of course, your mileage may vary.

  • BI EE exhibited excellent horizontal scalability when multiple BI nodes were clustered using BI Cluster software. Four BI nodes in the Cluster were able to handle 28,000 concurrent users with minimal impact on the overall average transaction response times.

      It appeared as though we could simply add more BI nodes to the BI Cluster to cope with an increase in the user base. However, due to the limited hardware resources, we could not try running beyond 4 nodes in the BI Cluster. As of today, the theoretical limit for the number of BI nodes in a Cluster is 16.

  • The underlying hardware must behave well in order for the application to scale and perform well -- so, credit goes to UltraSPARC T2 Plus powered Sun SPARC Enterprise T5440 server as well. In other words, it is fair to say the combination of (T5440 + OBIEE) performs and scales well on Solaris.

  • A summary of the results with system-wide averages of CPU and memory utilization is shown below.

    #Vusers  Clustered  #BI Nodes  #CPU  #Core  RAM     Avg CPU  Avg Memory  Avg Trx Response Time  #Trx/sec
    7,500    No         1          1     8      32 GB   72.85%   18.11 GB    0.22 sec               155
    28,000   Yes        4          4     32     128 GB  75.04%   76.16 GB    0.25 sec               580

  • Internal Solid State Drive (SSD) with ZFS file system showed significant I/O performance improvement over traditional disk for the BI catalog activity. In addition, ZFS helped get past the UFS limitation of 32,767 sub-directories in a BI catalog directory.

  • The benchmark demonstrated that 64-bit BI EE platform is immune to the 4 GB virtual memory limitation of the 32-bit BI EE platform -- hence can potentially support even more users and have larger caches as long as the hardware resources are available.

      Solaris runs in 64-bit mode by default on SPARC platform. Consider running 64-bit BI EE on Solaris.

  • 2,107 watts is the average power consumption when all the 28,000 concurrent users are in the steady state of the benchmark test. That is, in the case of similarly configured workloads, T5440 supports 13.2 users per watt of the power consumed; and supports 7,000 users per rack unit.
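
The per-watt, per-rack-unit and scalability figures in the notes above follow from simple arithmetic on the published numbers. The sketch below is not part of the benchmark material: the ratio helper is a made-up name, and awk is assumed to be available. It also derives a rough scaling efficiency by comparing the 4-node throughput against four times the single-node throughput, which is one possible way (an assumption, not Oracle's or Sun's metric) to put a number on "excellent horizontal scalability". Note that 28,000 / 2,107 works out to about 13.29, which the post quotes as 13.2.

```shell
#!/bin/sh
# ratio NUM DEN [DECIMALS]: print NUM / DEN with DECIMALS decimal places (default 2).
# Hypothetical helper, used only for this illustration.
ratio() {
    awk -v n="$1" -v d="$2" -v p="${3:-2}" \
        'BEGIN { fmt = "%." p "f\n"; printf fmt, n / d }'
}

users=28000        # concurrent users in the 4-node run
watts=2107         # average power draw at steady state
rack_units=4       # the T5440 is a 4RU server

echo "users per watt:      $(ratio "$users" "$watts")"
echo "users per rack unit: $(ratio "$users" "$rack_units" 0)"

# Scaling efficiency: measured cluster throughput as a percentage of
# (single-node throughput x number of nodes), using the #Trx/sec column.
single_tps=155
cluster_tps=580
nodes=4
echo "scaling efficiency:  $(ratio "$((cluster_tps * 100))" "$((single_tps * nodes))" 1)%"
```

With the numbers from the results table, this prints 13.29 users per watt, 7000 users per rack unit and a scaling efficiency of 93.5%.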

TOPOLOGY DIAGRAM:

A picture is worth a thousand words. The following topology diagrams say it all about the configuration.

1. Single Node BI Non-Cluster Configuration : 7,500 Concurrent Users

Even though the Solaris Container was shown in a cloud like graphical form, it has nothing to do with the "Cloud Computing". It is just a side effect of fancy drawing.

2. Four Node BI Cluster Configuration : 28,000 Concurrent Users

COMPETITIVE LANDSCAPE

Here is a quick summary of all the results that are published by different vendors. Feel free to draw your own conclusions. All this is public information. Check the corresponding benchmark reports by clicking on the URLs under the "#Users" column.

Server                          Chips  Cores  Threads  GHz  Processor Type      #Users  OS
1 x Sun SPARC Enterprise T5440  4      32     256      1.6  UltraSPARC T2 Plus  28,000  Solaris 10 5/09
5 x Sun Fire T2000              1      8      32       1.2  UltraSPARC T1       10,000  Solaris 10 11/06
3 x HP DL380 G4                 2      4      4        2.8  Intel Xeon          5,800   OEL
1 x IBM x3755                   4      8      8        2.8  AMD Opteron         4,000   RHEL4

CAUTION

Although T5440 possesses a ton of great qualities, it might not be suitable for deploying workloads with heavy single-threaded dependencies. The T5440 is an excellent hardware platform for multi-threaded, and moderately single-threaded/multi-process workloads. When in doubt, it is a good idea to leverage Sun Microsystems' Try & Buy program to try the workloads on the T5440 server before making the final call.


Check the second part of this blog post for the best practices for configuring / deploying Oracle Business Intelligence on top of Solaris 10 running on Sun CMT hardware.

Monday Oct 13, 2008

Siebel on Sun CMT hardware : Best Practices

The following suggested best practices are applicable to all Siebel deployments on CMT hardware (Tx000, T5x20, T5x40) running Solaris 10. [Note: some of this tuning also applies to Siebel running on conventional hardware under Solaris.] These recommendations are based on our observations from the 14,000 user benchmark on Sun SPARC Enterprise T5440. Your mileage may vary.

All Tiers
  • Ensure that the system's firmware is up-to-date.

  • Upgrade to the latest update release of Solaris 10.

      Note to the customers running Siebel on Solaris 10 5/08: apply the kernel patch 137137-07 as soon as it is available on the sunsolve.sun.com web site. Patch 137137-07 and later revisions, as well as Solaris 10 10/08, will include the workaround for a critical Siebel-specific bug. Oracle Corporation will eventually fix the bug in their codebase. In the meantime, Solaris is covering for Siebel and all other 32-bit applications with their own memory allocators that return unaligned mutexes. For more details, check RFE 6729759 "Need to accommodate non-8-byte-aligned mutexes" and Oracle's Siebel support document 735451.1 "Do NOT apply Kernel Patch 137111-04 on Solaris 10".


  • Enable 256M large pages on all nodes. By default, the latest update of Solaris 10 will use a maximum of 4M pages even when 256M pages are a good fit.

      256M pages can be enabled with the following /etc/system tunable.
      * 256M pages
      set max_uheap_lpsize=0x10000000


  • Pro-actively avoid running into stdio's 256 file descriptors limitation.

      Set the following in a shell or add the following lines to the shell's profile (bash/ksh).
      ulimit -n 2048
      export LD_PRELOAD_32=/usr/lib/extendedFILE.so.1:$LD_PRELOAD_32

      Technically the file descriptor limit can be set to as high as 65536. However from the application's perspective, 2048 is a reasonable limit.


  • Improve scalability with an MT-hot memory allocation library, libumem or libmtmalloc.

    To improve the scalability of multi-threaded workloads, preload an MT-hot, object-caching memory allocation library such as libumem(3LIB) or mtmalloc(3MALLOC).

      e.g., to preload the libumem library, set the LD_PRELOAD_32 environment variable in the shell (bash/ksh) as shown below.

      export LD_PRELOAD_32=/usr/lib/libumem.so.1:$LD_PRELOAD_32

      Web and the Application servers in the Siebel Enterprise stack are 32-bit. However Oracle 10g or 11g RDBMS on Solaris 10 SPARC is 64-bit. Hence the path to the libumem library in the PRELOAD statement differs slightly in the database-tier as shown below.

      export LD_PRELOAD_64=/usr/lib/sparcv9/libumem.so.1:$LD_PRELOAD_64

    Be aware that the trade-off is the increase in memory footprint -- you may notice 5 to 20% increase in the memory footprint with one of these MT-hot memory allocation libraries preloaded. Also not every Siebel application module benefits from MT-hot memory allocators. The recommendation is to experiment before implementing in production environments.

  • TCP/IP tunables

    Application fared well with the following set of TCP/IP parameters on Solaris 10 5/08.

    ndd -set /dev/tcp tcp_time_wait_interval 60000
    ndd -set /dev/tcp tcp_conn_req_max_q 1024
    ndd -set /dev/tcp tcp_conn_req_max_q0 4096
    ndd -set /dev/tcp tcp_ip_abort_interval 60000
    ndd -set /dev/tcp tcp_keepalive_interval 900000
    ndd -set /dev/tcp tcp_rexmit_interval_initial 3000
    ndd -set /dev/tcp tcp_rexmit_interval_max 10000
    ndd -set /dev/tcp tcp_rexmit_interval_min 3000
    ndd -set /dev/tcp tcp_smallest_anon_port 1024
    ndd -set /dev/tcp tcp_slow_start_initial 2
    ndd -set /dev/tcp tcp_xmit_hiwat 799744
    ndd -set /dev/tcp tcp_recv_hiwat 799744
    ndd -set /dev/tcp tcp_max_buf  8388608
    ndd -set /dev/tcp tcp_cwnd_max  4194304
    ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500
    ndd -set /dev/udp udp_xmit_hiwat 799744
    ndd -set /dev/udp udp_recv_hiwat 799744
    ndd -set /dev/udp udp_max_buf 8388608
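
Settings made with ndd do not survive a reboot, so a list like the one above is commonly reapplied from a startup script. The following is a minimal sketch of one way to keep the values in a single table and generate the commands from it. The ndd_commands helper is a hypothetical name, not something from the benchmark setup, and it only prints the commands rather than running them, so the output can be reviewed before it is piped to sh as root on a Solaris 10 host. Only three of the parameters are shown.

```shell
#!/bin/sh
# ndd_commands: read "driver parameter value" lines on stdin and print the
# matching ndd command for each one. Hypothetical helper for illustration;
# it prints the commands and does not execute them.
ndd_commands() {
    while read -r driver param value; do
        # Skip blank lines and comment lines.
        case "$driver" in
            ''|'#'*) continue ;;
        esac
        printf 'ndd -set /dev/%s %s %s\n' "$driver" "$param" "$value"
    done
}

# A subset of the values listed above, kept in one table.
ndd_commands <<'EOF'
# driver  parameter               value
tcp       tcp_time_wait_interval  60000
tcp       tcp_conn_req_max_q      1024
udp       udp_max_buf             8388608
EOF
```

Keeping the values in one table makes it easier to compare the tuning between hosts and to revert a single parameter while experimenting.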

Siebel Application Tier
  • All T-series systems (T1000/T2000, T5120/T5220, T5140/T5240, T5440) support the 256M page size. However, Siebel's siebmtshw script restricts the page size to 4M. Comment out the following lines in $SIEBEL_HOME/siebsrvr/bin/siebmtshw.
      # This will set 4M page size for Heap and 64 KB for stack
      MPSSHEAP=4M
      MPSSSTACK=64K
      MPSSERRFILE=/tmp/mpsserr
      LD_PRELOAD=/usr/lib/mpss.so.1
      export MPSSHEAP MPSSSTACK MPSSERRFILE LD_PRELOAD

  • Experiment with fewer Siebel Object Managers.

      Configure the Object Managers in such a way that each OM handles at least 200 active users. Siebel's standard recommendation of 100 or fewer users per Object Manager is suitable for conventional systems but not ideal for CMT systems like Tx000, T5x20, T5x40, T5440. Sun's CMT systems are ideal for running multi-threaded processes with tons of LWPs per process. Besides, the overall memory footprint improves significantly with fewer Siebel Object Managers.

  • Try Oracle 11g R1 client in the application-tier. Oracle 10g R2 clients may crash under high load. For the symptoms of the crash, check Solaris/SPARC: Oracle 11gR1 client for Siebel 8.0.

      Oracle 10g R2 10.2.0.4 32-bit client is supposed to have a fix for the process crash issue - however it wasn't verified in our test environment.
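
To put the Object Manager guidance above into numbers, the sketch below computes how many Object Managers a given user load needs at a chosen users-per-OM ratio. It is only a sizing illustration: the oms_for_load name and the use of ceiling division are assumptions made here, not part of Siebel's tooling or the benchmark configuration.

```shell
#!/bin/sh
# oms_for_load USERS USERS_PER_OM: number of Object Managers needed to carry
# USERS active users at USERS_PER_OM users each (rounded up).
# Hypothetical helper for illustration only.
oms_for_load() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

echo "14,000 users at 200 users/OM: $(oms_for_load 14000 200) OMs"
echo "14,000 users at 100 users/OM: $(oms_for_load 14000 100) OMs"
echo " 7,000 users at 200 users/OM: $(oms_for_load 7000 200) OMs"
```

At 200 users per OM the 14,000 user load needs 70 Object Managers instead of the 140 implied by 100 users per OM, which is where the memory footprint saving comes from.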


Siebel Database Tier
  • Eliminate double buffering by forcing the file system to use direct I/O.

    Oracle database caches the data in its own cache within the shared global area (SGA), known as the database block buffer cache. Database reads and writes are cached in the block buffer cache so that subsequent accesses to the same blocks do not need to re-read the data from the operating system. On the other hand, file systems on Solaris default to reading the data through the global file system cache for improved I/O operations. That is, by default each read is potentially cached twice: one copy in the operating system's file system cache, and the other copy in Oracle's block buffer cache. In addition to double caching, there is also some extra CPU overhead for the code that manages the operating system's file system cache. The solution is to eliminate double caching by forcing the file system to bypass the OS file system cache when reading from and writing to the disk.

      In the 14,000 user benchmark setup, the UFS file systems (holding the data files and the redo logs) were mounted with the forcedirectio option.

      e.g.,
      mount -o forcedirectio /dev/dsk/<partition> <mountpoint>


  • Store data files separate from the redo log files -- If the data files and the redo log files are stored on the same disk drive and if that disk drive fails, the files cannot be used in the database recovery procedures.

      In the 14,000 user benchmark setup, there were two Sun StorageTek 2540 arrays connected to the T5440: one array held the data files, whereas the other held the Oracle redo log files.

  • Size online redo logs to control the frequency of log switches.

      In the 14,000 user benchmark setup, two online redo logs were configured, each with 10 GB of disk space. When all 14,000 concurrent users are on-line, there is only one log switch in a 60 minute period.

  • If the storage array supports write caching, enable it. When the array's write cache is enabled, a write is committed to the cache as opposed to the disk, and the application is signaled that the write has been committed.


    Oracle Database Initialization Parameters

  • Set Oracle's initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT to an appropriate value. The DB_FILE_MULTIBLOCK_READ_COUNT parameter specifies the maximum number of blocks read in one I/O operation during a sequential scan.

      In the 14,000 user benchmark configuration, DB_BLOCK_SIZE was set to 8 kB. During the benchmark run, the average reads were around 18.5 kB per second. Hence setting DB_FILE_MULTIBLOCK_READ_COUNT to a high value does not necessarily improve the I/O performance. A value of 8 for the database init parameter DB_FILE_MULTIBLOCK_READ_COUNT seems to perform better.


  • On T5240 and T5440 servers, set the database initialization parameter CPU_COUNT to 64. Otherwise, by default Oracle RDBMS assumes 128 and 256 for the CPU_COUNT on T5240 and T5440 respectively. Oracle's optimizer might use a completely different execution plan when it notices such a large number for the CPU_COUNT; and the resulting execution plan need not necessarily be an optimal one. In the 14,000 user benchmark, setting CPU_COUNT to 64 produced optimal execution plans.


  • On T5240 and T5440 servers, explicitly set the database initialization parameter _enable_NUMA_optimization to FALSE. On these multi-socket servers, _enable_NUMA_optimization will be set to TRUE by default. During the 14,000 user benchmark run, we noticed intermittent shadow process crashes with the default behavior. We didn't realize any additional gains either with the default NUMA optimizations.
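
The redo log sizing note in this section (10 GB logs, one log switch in 60 minutes) implies an upper bound on the redo generation rate. The sketch below is a back-of-the-envelope estimate, assuming that a log fills completely before each switch and that 1 GB is 1024 MB. The redo_rate_mb_per_sec helper is a made-up name and awk is assumed to be available. The same arithmetic can be turned around to pick a log size for a target switch frequency.

```shell
#!/bin/sh
# redo_rate_mb_per_sec LOG_SIZE_GB SWITCHES_PER_HOUR: approximate redo
# generation rate in MB/s, assuming every log is full when the switch happens.
# Hypothetical helper for illustration only.
redo_rate_mb_per_sec() {
    awk -v gb="$1" -v sw="$2" 'BEGIN { printf "%.2f\n", gb * 1024 * sw / 3600 }'
}

# Benchmark setup: 10 GB online redo logs, one switch per hour.
echo "approx. redo rate: $(redo_rate_mb_per_sec 10 1) MB/s"
```

For the benchmark values this gives roughly 2.84 MB/s of redo at most, which is a modest sustained write load for a dedicated array.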

Siebel Web Tier
  • Upgrade to the latest service pack of Sun Java Web Server 6.1 (32-bit).

  • Run the Sun Java Web Server in multi-process mode by setting the MaxProcs directive in magnus.conf to a value that is greater than 1. In the multi-process mode, the web server can handle requests using multiple processes with multiple threads in each process.

      When you specify a value greater than 1 for the MaxProcs, the web server relies on the operating system to distribute connections among/between multiple web server processes. However many modern operating systems including Solaris do not distribute connections evenly, particularly when there are a small number of concurrent connections.

  • Tune the maximum number of simultaneous requests by setting the RqThrottle parameter in the magnus.conf file to an appropriate value. A value of 1024 was used in the 14,000 user benchmark.

Siebel 8.0 on Sun SPARC Enterprise T5440 - More Bang for the Buck!!

Today Sun announced the 14,000 user Siebel 8.0 PSPP benchmark results on a single Sun SPARC Enterprise T5440. An Oracle white paper with Sun's 14,000 user benchmark results is available on Oracle's Siebel benchmark web site. The content in this blog post complements the benchmark white paper.

Some of the notes and highlights from this competitive benchmark are as follows:

  • Key specifications for the Sun SPARC Enterprise T5440 system under test are: 4 x UltraSPARC T2 Plus processors, 32 cores, 256 compute threads and 128 GB of memory in a 4RU space.

  • The entire Siebel 8.0 solution was deployed on a single Sun SPARC Enterprise T5440 including the web, gateway, application, and database servers.

      9 load driver clients with dual-core Opteron and Xeon processors were used to load up 14,000 concurrent users

  • The Web, Application and Database servers were isolated from each other by creating three Solaris Containers (non-global zones or local zones), one dedicated to each of those servers.

      Solaris 10 Binary Application Guarantee Program guarantees the binary compatibility for all applications running under Solaris native host operating system environments as well as Solaris 10 OS running as a guest operating system in a virtualized platform environment.

  • The Siebel Gateway server and the Siebel Application servers were installed and configured in one of the three Solaris Containers. Two identical Siebel Application server instances were configured, each handling a 7,000 user load.

      From our experiments with the Siebel 8.0 benchmark workload, it appears that a single instance of Siebel Application server could scale up to 10,000 active users. Siebel Connection Broker (SCBroker) component becomes the bottleneck at the peak load in a single instance of the Siebel Application server.

  • To keep it simple, the benchmark publication white paper limits itself to an overview of the system configuration. The full details are available in the diagram below.

    Topology Diagram



    The breakdown of the approximate averages of CPU and memory utilization by each tier is shown below.

    Tier  CPU  Memory
    Web   78%  4.50 GB
    App   76%  69.00 GB
    DB    72%  20.00 GB

    System-wide averages are as follows:

    Tier            CPU  Memory
    Web + App + DB  82%  93.5 GB


  • 1276 watts is the average power consumption when all the 14,000 concurrent users are in the steady state of the benchmark test. That is, in the case of similarly configured workloads, T5440 supports 10.97 users per watt of the power consumed; and supports 3500 users per rack unit.

Based on the above notes: the Sun SPARC Enterprise T5440 is inexpensive, requires less power and data center footprint, is ideal for consolidation and, equally important, scales well.



Vendor-to-Vendor comparison

How does our new 14,000 user benchmark result compare with the high watermark benchmark results published by other vendors using the same Siebel 8.0 PSPP workload?

Besides Sun, IBM and HP are the only other vendors who have published benchmark results so far with the Siebel 8.0 PSPP benchmark workload. IBM's highest user count is 7,000, whereas HP's is 5,200. Here is a quick comparison of the throughputs based on the results published by Sun, IBM and HP with the highest number of active users.

Sun Microsystems' 14,000 user benchmark on a single T5440 outperformed:

  • IBM's 7,000 user benchmark result by 1.9x

  • HP's 5,200 user benchmark result by 2.5x
      HP published the 5,200 user result with a combination of 2 x BL460c running Windows Server 2003 and 1 x rx6600 HP system running HP-UX.

  • Sun's own 10,000 user benchmark result on a combination of 2 x T5120 and 2 x T5220s by 1.4x

From the operating system perspective, Solaris outperformed AIX, Windows Server 2003 and HP-UX. Linux is nowhere to be found in the competitive landscape.

A simple comparison of all the published Siebel 8.0 benchmark results (as of today) by all vendors justifies the title of this blog post. As IBM and HP do not post the list price of all of their servers, I am not even attempting to show the price/performance comparison in here. On the other hand, Sun openly lists out all the list prices at store.sun.com web site.

CAUTION

Although T5440 possesses a ton of great qualities, it might not be suitable for deploying workloads with heavy single-threaded dependencies. The T5440 is an excellent hardware platform for multi-threaded, and moderately single-threaded/multi-process workloads. When in doubt, it is a good idea to leverage Sun Microsystems' Try & Buy program to try the workloads on this new and shiny T5440 before making the final call.



I would like to share the tuning information from the OS and the underlying hardware perspective for a couple of reasons: 1. Oracle's benchmark white paper does not include any of the system-specific tuning information, and 2. it may take quite a bit of time for Oracle Corporation to update the Siebel Tuning Guide for Solaris with some of the tuning information that you find here.

Check the second part of this blog post for the best practices for running Siebel on Sun CMT hardware.
