Friday Feb 28, 2014

[Solaris] Changing hostname, Parallel Compression, pNFS, Upgrading SRUs and Clearing Faults

[1] Solaris 11+ : changing hostname

Starting with Solaris 11, a system's identity (nodename) is configured through the config/nodename property of the svc:/system/identity:node SMF service. Solaris 10 and prior releases keep this information in the /etc/nodename configuration file.

The following example demonstrates the commands to change the hostname from "ihcm-db-01" to "ehcm-db-01".

eg.,
# hostname
ihcm-db-01

# svccfg -s system/identity:node listprop config
config                       application        
config/enable_mapping       boolean     true
config/ignore_dhcp_hostname boolean     false
config/nodename             astring     ihcm-db-01
config/loopback             astring     ihcm-db-01
#

# svccfg -s system/identity:node setprop config/nodename="ehcm-db-01"

# svccfg -s system/identity:node refresh	<== or: svcadm refresh svc:/system/identity:node
# svcadm restart system/identity:node

# svccfg -s system/identity:node listprop config
config                       application        
config/enable_mapping       boolean     true
config/ignore_dhcp_hostname boolean     false
config/nodename             astring     ehcm-db-01
config/loopback             astring     ehcm-db-01

# hostname
ehcm-db-01

[2] Parallel Compression

This topic is not Solaris specific, but it certainly helps Solaris users who are frustrated with the single-threaded implementations of all officially supported compression tools such as compress, gzip, and zip.

pigz (pig-zee) is a parallel implementation of gzip that is well suited to the latest multi-processor, multi-core machines. By default, pigz breaks the input into chunks of 128 KB and compresses each chunk in parallel using lightweight threads. The number of compression threads defaults to the number of online processors. Both the chunk size and the number of threads are configurable.

Compressed files can be restored to their original form using the -d option of pigz, or with gzip/gunzip. As per the man page, decompression is not parallelized out of the box, but it may still show some improvement compared to the older tools.

The following example demonstrates the advantage of using pigz over gzip in compressing and decompressing a large file.

eg.,

Original file, and the target hardware.

$ ls -lh PT8.53.04.tar 
-rw-r--r--   1 psft     dba         4.8G Feb 28 14:03 PT8.53.04.tar

$ psrinfo -pv
The physical processor has 8 cores and 64 virtual processors (0-63)
  The core has 8 virtual processors (0-7)
	...
  The core has 8 virtual processors (56-63)
    SPARC-T5 (chipid 0, clock 3600 MHz)

gzip compression.

$ time gzip --fast PT8.53.04.tar 

real    3m40.125s
user    3m27.105s
sys     0m13.008s

$ ls -lh PT8.53*
-rw-r--r--   1 psft     dba         3.1G Feb 28 14:03 PT8.53.04.tar.gz

/* the following prstat, vmstat outputs show that gzip is compressing the
   tar file using a single thread - hence low CPU utilization. */

$ prstat -p 42510

   PID USERNAME  SIZE   RSS STATE   PRI NICE      TIME  CPU PROCESS/NLWP      
 42510 psft     2616K 2200K cpu16    10    0   0:01:00 1.5% gzip/1

$ prstat -m -p 42510

   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP  
 42510 psft      95 4.6 0.0 0.0 0.0 0.0 0.0 0.0   0  35  7K   0 gzip/1

$ vmstat 2

 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 776242104 917016008 0 7 0 0 0  0  0  0  0 52 52 3286 2606 2178  2  0 98
 1 0 0 776242104 916987888 0 14 0 0 0 0  0  0  0  0  0 3851 3359 2978  2  1 97
 0 0 0 776242104 916962440 0 0 0 0 0  0  0  0  0  0  0 3184 1687 2023  1  0 98
 0 0 0 775971768 916930720 0 0 0 0 0  0  0  0  0 39 37 3392 1819 2210  2  0 98
 0 0 0 775971768 916898016 0 0 0 0 0  0  0  0  0  0  0 3452 1861 2106  2  0 98

pigz compression.

$ time ./pigz PT8.53.04.tar 

real    0m25.111s	<== wall clock time is 25s compared to gzip's 3m 40s
user    17m18.398s
sys     0m37.718s

/* the following prstat, vmstat outputs show that pigz is compressing the 
        tar file using many threads - hence busy system with high CPU utilization. */

$ prstat -p 49734

   PID USERNAME  SIZE   RSS STATE   PRI NICE      TIME  CPU PROCESS/NLWP      
49734 psft       59M   58M sleep    11    0   0:12:58  38% pigz/66

$ vmstat 2

 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 778097840 919076008 6 113 0 0 0 0 0  0  0 40 36 39330 45797 74148 61 4 35
 0 0 0 777956280 918841720 0 1 0 0 0  0  0  0  0  0  0 38752 43292 71411 64 4 32
 0 0 0 777490336 918334176 0 3 0 0 0  0  0  0  0 17 15 46553 53350 86840 60 4 35
 1 0 0 777274072 918141936 0 1 0 0 0  0  0  0  0 39 34 16122 20202 28319 88 4 9
 1 0 0 777138800 917917376 0 0 0 0 0  0  0  0  0  3  3 46597 51005 86673 56 5 39

$ ls -lh PT8.53.04.tar.gz 
-rw-r--r--   1 psft     dba         3.0G Feb 28 14:03 PT8.53.04.tar.gz

$ gunzip PT8.53.04.tar.gz 	<== shows that the pigz compressed file is 
                                         compatible with gzip/gunzip

$ ls -lh PT8.53*
-rw-r--r--   1 psft     dba         4.8G Feb 28 14:03 PT8.53.04.tar

Decompression.

$ time ./pigz -d PT8.53.04.tar.gz 

real    0m18.068s
user    0m22.437s
sys     0m12.857s

$ time gzip -d PT8.53.04.tar.gz 

real    0m52.806s <== compare gzip's 52s decompression time with pigz's 18s
user    0m42.068s
sys     0m10.736s

$ ls -lh PT8.53.04.tar 
-rw-r--r--   1 psft     dba         4.8G Feb 28 14:03 PT8.53.04.tar

Of course, other tools such as Parallel BZIP2 (PBZIP2), a parallel implementation of bzip2, are worth a try too. The idea here is to highlight that there are better tools out there to get the job done quicker than the existing/old tools bundled with the operating system distribution.
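The trick pigz relies on can be sketched with stock tools: compress independent chunks concurrently, then concatenate them, since a sequence of back-to-back gzip members is itself a valid gzip stream. This is only a rough illustration (file names are invented for the demo, and unlike pigz there is no shared dictionary across chunks, so the compression ratio suffers slightly):

```shell
# create a small sample file to compress
head -c 1048576 /dev/urandom > sample.bin

# split into 128 KB chunks -- pigz's default chunk size
split -b 131072 sample.bin chunk.

# compress every chunk in parallel; each .gz becomes an independent gzip member
for c in chunk.*; do gzip "$c" & done
wait

# concatenated gzip members form a valid gzip stream ...
cat chunk.*.gz > sample.bin.gz

# ... which stock gunzip restores in a single pass
gunzip -c sample.bin.gz > restored.bin
cmp sample.bin restored.bin && echo "round-trip OK"
```

That single-pass restore is also why pigz output remains fully compatible with gzip/gunzip.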


[3] Solaris 11+ : Upgrading SRU

Assuming the package repository is already set up for network updates on a Solaris 11+ system, the following commands are helpful in upgrading an SRU.

  • List all available SRUs in the repository.

    # pkg list -af entire
  • Upgrade to the latest and greatest.

    # pkg update

    To find out what changes will be made to the system, try a dry run of the system update.

    # pkg update -nv
  • Upgrade to a specific SRU.

    # pkg update entire@<FMRI>

    Find the Fault Managed Resource Identifier (FMRI) string by running the pkg list -af entire command.

Note that downgrading to a lower SRU is not so easy, as it may break the system. Should there be a need to downgrade or switch between different SRUs, relying on Boot Environments (BE) might be a good idea. Check the Creating and Administering Oracle Solaris 11 Boot Environments document for details.


[4] Parallel NFS (pNFS)

Just a quick note — RFC 5661, Network File System (NFS) Version 4.1, introduced a new feature called "Parallel NFS" or pNFS, which allows NFS clients to access storage devices containing file data directly. When the file data for a single NFS v4.1 server is stored on multiple and/or higher-throughput storage devices, using pNFS can significantly improve file access performance. However, pNFS is an optional feature in NFS v4.1. Though a prototype was made available a few years ago when OpenSolaris was still alive, as of today Solaris has no support for pNFS. Stay tuned for updates from the Oracle Solaris teams.

Here is an interesting write-up from one of our colleagues at Oracle|Sun (dated 2007) -- NFSv4.1's pNFS for Solaris.

(Credit to Rob Schneider and Tom Gould for initiating this topic)


[5] SPARC hardware : Check for and clear faults from ILOM

There are a couple of ways to check for faults using the ILOM command line interface, by running:

  1. the show faulty command from the ILOM command prompt, or
  2. the fmadm faulty command from within the ILOM faultmgmt shell

Once found, use the clear_fault_action property with the set command to clear the fault for a FRU.

The following example checks for the faulty FRUs from ILOM faultmgmt shell, then clears it out.

eg.,

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

faultmgmtsp> fmadm faulty

------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2014-02-26/16:17:11 18c62051-c81d-c569-a4e6-e418db2f84b4 PCIEX-8000-SQ  Critical
        ...
        ...
Suspect 1 of 1
   Fault class  : fault.io.pciex.rc.generic-ue
   Certainty    : 100%
   Affects      : hc:///chassis=0/motherboard=0/cpuboard=1/chip=2/hostbridge=4
   Status       : faulted

   FRU
      Status            : faulty
      Location          : /SYS/PM1
      Manufacturer      : Oracle Corporation
      Name              : TLA,PM,T5-4,T5-8
        ...

Description : A fault has been diagnosed by the Host Operating System.

Response    : The service required LED on the chassis and on the affected
              FRU may be illuminated.

        ...

faultmgmtsp> exit

-> set /SYS/PM1 clear_fault_action=True
Are you sure you want to clear /SYS/PM1 (y/n)? y
Set 'clear_fault_action' to 'True'

Note that this procedure clears the fault from the SP but not from the host.

Saturday Sep 10, 2011

Oracle RDBMS : Generic Large Object (LOB) Performance Guidelines

This blog post is generic in nature and based on my recent experience with a content management system where SecureFile BLOBs are critical in storing and retrieving the checked-in content. It is strongly suggested to check the official documentation in addition to these brief guidelines. In general, Oracle Database SecureFiles and Large Objects Developer's Guide 11g Release 2 (11.2) is a good starting point when creating tables involving SecureFiles and LOBs.

Guidelines

  • Tablespace: create the LOB in a separate tablespace, isolated from the rest of the database
  • Block size: consider a larger block size (default 8 KB) if the expected size of the LOB is big
  • Chunk size: consider a larger chunk size (default 8 KB) if larger LOBs are expected to be stored and retrieved
  • Inline or Out-of-line: choose "DISABLE STORAGE IN ROW" (out-of-line) if the average LOB size is expected to be > 4 KB. The default inlining is fine for smaller LOBs
  • CACHE or NOCACHE: consider bypassing the database buffer cache (NOCACHE) if a large number of LOBs are stored and not expected to be retrieved frequently
  • COMPRESS or NOCOMPRESS: choose the COMPRESS option if storage capacity is a concern or a constraint. It saves disk space at the expense of some performance overhead. In a RAC database environment, it is recommended to compress the LOBs to reduce the interconnect traffic
  • De-duplication: by default, duplicate LOBs are stored as separate copies in the database. Choosing the DEDUPLICATE option enables sharing the same data blocks for identical files, thus reducing storage overhead and simplifying storage management
  • Partitioning: consider partitioning the parent table to maximize application performance. Hash partitioning is one option if there is no natural partition key in the table
  • Zero-Copy I/O protocol: turned on by default. Turning it off in a RAC database environment could be beneficial. Set the initialization parameter _use_zero_copy_io=FALSE to turn off the Zero-Copy I/O protocol
  • Shared I/O pool: the database uses the shared I/O pool to perform large I/O operations on SecureFile LOBs. The shared I/O pool uses shared memory segments. If this pool is not large enough, or if there is not enough memory available in it for a SecureFile LOB I/O operation, Oracle uses a portion of the PGA until sufficient memory becomes available in the shared I/O pool. Hence it is recommended to size the shared I/O pool appropriately by monitoring the database during peak activity. Relevant initialization parameters: _shared_io_pool_size and _shared_iop_max_size
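To make several of the options above concrete, here is a hypothetical CREATE TABLE sketch mapping them onto the SecureFile LOB storage clause. The table, column, and tablespace names are invented, and the exact clauses should be verified against the SecureFiles and Large Objects Developer's Guide:

```sql
CREATE TABLE checked_in_docs (
  doc_id   NUMBER PRIMARY KEY,
  content  BLOB
)
LOB (content) STORE AS SECUREFILE (
  TABLESPACE lob_ts           -- isolated tablespace for the LOB segment
  DISABLE STORAGE IN ROW      -- out-of-line: average LOB expected > 4 KB
  CHUNK 32768                 -- larger chunk size for larger LOBs
  NOCACHE                     -- bypass the buffer cache for rarely read LOBs
  COMPRESS MEDIUM             -- save disk space at some CPU cost
  DEDUPLICATE                 -- share data blocks among identical LOBs
);
```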

Also see:
Oracle Database Documentation : LOB Performance Guidelines

Sunday Jan 09, 2011

Oracle 11g : Poor Performance Accessing V$SESSION_FIX_CONTROL

PeopleSoft HCM and Financials/SCM 9.x customers may have to patch their Oracle database server with RDBMS patch 9699654. The rest of the Oracle customers: read the symptoms and decide.

In a couple of PeopleSoft deployments, it was observed that the following SQL is the top query when all queries are sorted by elapsed time or CPU time. The Oracle database server version is 11.2.0.1.0.


SELECT VALUE FROM V$SESSION_FIX_CONTROL WHERE BUGNO = :B1 AND SESSION_ID = USERENV('SID')

The target query is being executed thousands of times. The poor performance is due to the lack of a proper index. Here is the explain plan that exhibits the performance issue.

-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Starts | E-Rows | A-Rows |   A-Time   |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |      1 |        |      1 |00:00:00.02 |
|*  1 |  FIXED TABLE FULL| X$QKSBGSES |      1 |      1 |      1 |00:00:00.02 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("BUGNO_QKSBGSEROW"=:B1 AND
	      "SID_QKSBGSEROW"=USERENV('SID') AND "INST_ID"=USERENV('INSTANCE')))

20 rows selected.

Oracle Corporation accepted this behavior as a bug and agreed to fix it in Oracle RDBMS 12.1. Meanwhile, an RDBMS patch was made available to customers running 11.2.0.1 or later. 9699654 is the bug# (Bad performance of V$SESSION_FIX_CONTROL query) - so, Solaris SPARC customers can download RDBMS patch 9699654 directly from the support web site. Customers on other platforms: please search the bug database and support web site with appropriate keywords.

After applying the RDBMS patch 9699654, the optimizer was using an index and the query performance was improved as expected. Also the target SQL query was no longer the top SQL - in fact, no references to this particular query were found in the AWR report. The new explain plan is shown below.

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name               | Starts | E-Rows | A-Rows |   A-Time   |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                    |      1 |        |      1 |00:00:00.01 |
|*  1 |  FIXED TABLE FIXED INDEX| X$QKSBGSES (ind:1) |      1 |      1 |      1 |00:00:00.01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("BUGNO_QKSBGSEROW"=:B1 AND "SID_QKSBGSEROW"=USERENV('SID') AND
              "INST_ID"=USERENV('INSTANCE')))

20 rows selected.

Wednesday Dec 15, 2010

Oracle Solaris Studio C/C++ : Inline Functions

Function inlining improves runtime performance by replacing a call to a function with the body of the function. This eliminates the overhead of jumping to and returning from a subroutine. An additional advantage is that placing the function code "inline" exposes it to further optimization.

The C++ compiler performs two kinds of inlining: front-end (parser) and back-end (code generator). The C and Fortran compilers support only back-end inlining. The same code generator is used for all compilers on a platform. The C++ compiler performs front-end inlining because it can use its knowledge of C++ semantics to eliminate extra copies of objects among other things that the code generator would not be able to do. The back-end inlining does not depend on the programming language.

The C++ compiler front end attempts to inline a function declared inline. If the function is too large, a warning message is printed to stdout when the +w or +w2 ("more warnings") option is used. The +d option prevents the front end from attempting to inline any function. The -g option also turns off front-end inlining, whereas the -O options do not affect it. The C++ front end turns off function inlining when a combination of -g and -O options is specified on the compile line, which may lead to a loss of runtime performance. To avoid this, it is suggested to use -g0 instead of -g. C does not have -g0, so use -g instead.

With an optimization level of -O4 or higher, the code generator examines all functions, independent of how they were declared in the source code, and replaces function calls with inline code wherever it thinks the replacement will be beneficial. No diagnostic messages are displayed about back-end inlining. The +d option has no impact on back-end inlining.

A couple of trivial examples demonstrate the compiler behavior.

eg.,

% cat inline.c

#include <stdio.h>

inline void printmespam() {
        printf("print me"); printf("print me"); 
        printf("print me"); printf("print me");
        printf("print me"); printf("print me"); 
        printf("print me"); printf("print me");
        printf("print me"); printf("print me"); 
        printf("print me"); printf("print me");
        printf("print me"); printf("print me"); 
        printf("print me"); printf("print me");
        printf("print me");
}

inline void printme() {
        printf("print me");
}

int main() {
        printme();
        printmespam();
        return (0);
}

% CC +w2 inline.c
"inline.c", line 17: Warning: "printmespam()" is too large and will not be expanded inline.
1 Warning(s) detected.

In the above example, printmespam() was not inlined by the compiler though it was explicitly requested to do so. The keyword inline is only a request but not a guarantee.

How to check if a routine is inlined?

A: Check the symbol table of the executable. If the routine doesn't show up in the symbol table, it is a good indication that the routine was inlined: the compiler replaced the function call with the body of the function.

% elfdump -CsN.symtab a.out | grep printme
      [85]  0x00010e68 0x000000a4  FUNC GLOB  D    0 .text       void printmespam()

printme is inlined, whereas printmespam is not.
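The same symbol-table check works with generic toolchains as well. Here is a small sketch using gcc and nm (the function twice and the flags are invented for this demo, not part of the original example):

```shell
cat > demo.c <<'EOF'
#include <stdio.h>

/* small static inline function: at -O2 the compiler inlines every call
   and emits no out-of-line copy, so no symbol appears in the binary */
static inline int twice(int x) { return 2 * x; }

int main(void) {
        printf("%d\n", twice(21));
        return 0;
}
EOF

gcc -O2 demo.c -o demo
nm demo | grep twice || echo "no symbol: twice was inlined"
./demo
```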

Another way is to check the assembly code being generated. To generate the assembly code, compile with the -S option of the Oracle Solaris Studio compilers.

eg.,

% cat swap.c

void swap (int *a, int *b) {
        int t = *a;
        *a = *b;
        *b = t;
}

int main (int argc, char *argv[]) {
        int x = 5, y = 2;
        swap (&x,&y);
        return (0);
}

% CC +w2 -S swap.c

% grep call swap.s
        call    __1cEswap6Fpi0_v_

% dem __1cEswap6Fpi0_v_
__1cEswap6Fpi0_v_ == void swap(int*,int*)

From the above output, it is clear that the function was not inlined, since a call instruction to the routine swap has been generated. Let's add the keyword inline to the function definition.

% cat swap.c

inline void swap (int *a, int *b) {
        int t = *a;
        *a = *b;
        *b = t;
}

int main (..) { .. }

% CC +w2 -S swap.c
% grep call swap.s
%

After instructing the compiler to inline the routine swap, the compiler was able to inline the function in main(), mainly because it was not too big. That is why no call instruction to swap was generated.

Another example to demonstrate slightly different behavior.

eg.,

% cat inline2.c

#include <stdio.h>

int globvar = 0;

inline void setglob () {
        globvar= 25;
}

int main (int argc, char *argv[]) {
        globvar= 5;
        setglob ();
        printf ("Now global variable holds %d\n", globvar);
        return (0);
}

% cc -o test inline2.c
Undefined                       first referenced
 symbol                             in file
setglob                             inline2.o
ld: fatal: Symbol referencing errors. No output written to test

The above code violates a C rule: an inline definition without extern does not create an external definition of the function, and calling such a function has undefined results. The fix is to declare setglob with external linkage as shown below.

C++ has a different rule for inline functions: the compiler is required to figure out how to generate a defining instance if one is needed, without any special action by the programmer. So the above example is valid C++ code, but it is not valid C.

 
% cat inline2.c

#include <stdio.h>

int globvar = 0;

extern inline void setglob () {
        globvar= 25;
}

int main (..) { .. }

% cc inline2.c
% ./a.out
Now global variable holds 25

Notes:

  1. Do not use if(0) in an inline function. Use #if 0 instead

  2. Do not put a return statement in the "then" part of an "if" statement. Rearrange the code to put the return in the "else" part or outside the if-else entirely.

Acknowledgements:
Steve Clamage

(Location of original blogpost:
http://technopark02.blogspot.com/2005/04/sun-cc-compilers-inlining-routines.html)

Sunday Nov 07, 2010

Instructions to Turn ON/OFF Hardware Prefetch on SPARC64 Systems

The hardware prefetch is ON by default on M-series servers such as the M8000/M9000, M4000/M5000, and M3000.

The following excerpt is from a SPARC64 document:

Hardware speculatively issues the prefetch operation based on the prediction that there is high possibility to access to the following continuous address in the future, if there have been load accesses for a consecutive address.

Although this feature is designed to improve the performance of various workloads, due to its speculative nature not all workloads may benefit from the default behavior. For example, in our experiments we noticed a 10+% improvement in CPU utilization while running some of the PeopleSoft workloads on M-series hardware with hardware prefetch turned off. Hence, irrespective of the application/workload, the recommended approach is to conduct a few experiments by running representative customer workloads on the target M-series hardware with and without hardware prefetch turned on.

Instructions to Turn On/Off Hardware Prefetch:

  1. Connect to the system Service Processor (XSCF)

    % ssh -l <userid> <host>
    
  2. Check the current prefetch mode by running the following command at XSCF> prompt

    XSCF> showprefetchmode
    
  3. Find the domain id of all mounted system boards (or skip to next step)

    XSCF> showboards -a
    
  4. Power-off all configured domains

    XSCF> poweroff -d <domainid> [OR]
    XSCF> poweroff -a
    

    From my experience, on larger systems with multiple domains configured, all domains must be powered off before the SP allows changing the prefetch mode. If someone has a correction to this information, or better instructions that minimize disruption, please let me know. I'd be happy to update these instructions.

  5. Wait until the domain(s) are completely powered off. Check the status by running showlogs command

    XSCF> showlogs power
    
  6. Change the prefetch mode to the desired value

    XSCF> setprefetchmode -s [on|off]
    
  7. Verify the prefetch mode

    XSCF> showprefetchmode
    
  8. Finally power-on all configured domains

    XSCF> poweron -d <domainid> [OR]
    XSCF> poweron -a
    
  9. Disconnect from SP, and wait for the OS to boot up

Note to Sun-Oracle customers:

If the default value of hardware prefetch is changed, please make sure to mention this in any service requests, bug reports, etc., that you may file with Oracle Corporation. Unfortunately none of the standard commands on Solaris report the status of hardware prefetch - so, providing this additional piece of information beforehand will help the person who is analyzing/diagnosing the case.

Wednesday Apr 28, 2010

Oracle Database: Say No Thanks! to a New Index

.. unless you are working with a database that is largely read-only, or the new index is meant to fix a critical performance issue with no side effect(s).

Two topics covered in this blog entry with plenty of simple examples:

  1. Why creating new indexes on a heavily modified table may not be a good idea, and
  2. How to identify unused indexes?

Read on.

Indexes are double-edged swords: they may improve the performance of targeted queries, but in some cases they may inadvertently degrade the performance of other queries that were not targeted. In any case, exercise caution while adding a new index to the database. After adding a new index, monitor the overall performance of the database, not just the targeted query.

If DML statements that modify data (INSERT, UPDATE, or DELETE) are executed a large number of times on a table, make sure that the addition of a new index on the same table does not negatively affect the performance of those DML operations. Usually this is not a problem if the SQL being executed simply retrieves, and does not add or modify, existing data. In all other cases, there is some performance overhead induced by each new index. For example, if there are 10 indexes on a table DUMMY, adding a new row of data to DUMMY may require the database management system to update all 10 indexes behind the scenes.

Here is an example demonstrating the performance overhead of a new index on a table.


SQL> CREATE TABLE VIDEO
  2  (BARCODE VARCHAR(10) NOT NULL,
  3  TITLE VARCHAR2(25) NOT NULL,
  4  FORMAT VARCHAR2(10),
  5  PRICE NUMBER,
  6  DATA_OF_RELEASE DATE)
  7  /

Table created.

SQL> insert into VIDEO values ('9301224321', 'AVATAR', 'BLU-RAY', 19.99, '22-APR-2010');

1 row created.

..

SQL> insert into VIDEO values ('3782460017', 'THE SIMPSONS - SEASON 20', 'BLU-RAY', 29.99, '04-JUL-2009');

1 row created.

SQL> select * from VIDEO;

BARCODE 	TITLE				    FORMAT		 PRICE DATA_OF_RELEASE
--------------- ----------------------------------- --------------- ---------- ---------------
9301224321	AVATAR				    BLU-RAY		 19.99 22-APR-10
7619203043	BEN-HUR 			    VHS 		  9.79 12-MAR-63
7305832093	THE MATRIX			    DVD 		 12.29 03-DEC-99
4810218795	MEMENTO 			    DVD 		  8.49 02-FEB-02
3782460017	THE SIMPSONS - SEASON 20	    BLU-RAY		 29.99 04-JUL-09

SQL> select * from USER_INDEXES where TABLE_NAME = 'VIDEO';

no rows selected

SQL> alter session set events '10046 trace name context forever, level 8';

Session altered.

SQL> select * from VIDEO where FORMAT = 'BLU-RAY';

BARCODE 	TITLE				    FORMAT		 PRICE DATA_OF_RELEASE
--------------- ----------------------------------- --------------- ---------- ---------------
9301224321	AVATAR				    BLU-RAY		 19.99 22-APR-10
3782460017	THE SIMPSONS - SEASON 20	    BLU-RAY		 29.99 04-JUL-09

SQL> alter session set events '10046 trace name context off';

Session altered.

SQL trace file has the following contents.


SQL ID: 0pu5s70nsdnzv
Plan Hash: 3846322456
SELECT *
FROM
 VIDEO WHERE FORMAT = :"SYS_B_0"


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0         16          0           2
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0         16          0           2

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 28  

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  TABLE ACCESS FULL VIDEO (cr=16 pr=0 pw=0 time=3 us cost=4 size=100 card=2)

Let's create an index and see what happens.


SQL> create index VIDEO_IDX1 on VIDEO (FORMAT);

Index created.

SQL>  alter session set events '10046 trace name context forever, level 8';

Session altered.

SQL> select * from VIDEO where FORMAT = 'BLU-RAY';

BARCODE 	TITLE				    FORMAT		 PRICE DATA_OF_RELEASE
--------------- ----------------------------------- --------------- ---------- ---------------
9301224321	AVATAR				    BLU-RAY		 19.99 22-APR-10
3782460017	THE SIMPSONS - SEASON 20	    BLU-RAY		 29.99 04-JUL-09

SQL> alter session set events '10046 trace name context off';

Session altered.

The latest contents of the trace file are as follows. Notice the reduction in buffer gets from 16 to 4. That is, the new index improved the query performance by 75%.


SQL ID: 0pu5s70nsdnzv
Plan Hash: 2773508764
SELECT *
FROM
 VIDEO WHERE FORMAT = :"SYS_B_0"


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          4          0           2
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0          4          0           2

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 28  (CS90)

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  TABLE ACCESS BY INDEX ROWID VIDEO (cr=4 pr=0 pw=0 time=12 us cost=2 size=100 card=2)
      2   INDEX RANGE SCAN VIDEO_IDX1 (cr=2 pr=0 pw=0 time=10 us cost=1 size=0 card=2)(object id 76899)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      2   TABLE ACCESS (BY INDEX ROWID) OF 'VIDEO' (TABLE)
      2    INDEX   MODE: ANALYZED (RANGE SCAN) OF 'VIDEO_IDX1' (INDEX)

So far so good. Let's add a new row of data and examine the trace file one more time. From here on, keep an eye on the "current" column (logical IOs performed due to an INSERT, UPDATE, or DELETE) and notice how it changes with different actions: adding and removing indexes, inserting new rows of data, etc.


SQL ID: dnb2d8cpdj56p
Plan Hash: 0
INSERT INTO VIDEO 
VALUES
 (:"SYS_B_0", :"SYS_B_1", :"SYS_B_2", :"SYS_B_3", :"SYS_B_4")


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          1          7           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0          1          7           1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 28  (CS90)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  LOAD TABLE CONVENTIONAL  (cr=1 pr=0 pw=0 time=0 us)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  INSERT STATEMENT   MODE: ALL_ROWS
      0   LOAD TABLE CONVENTIONAL OF 'VIDEO'

Now drop the index, re-insert the last row and get the tracing data again.


SQL> drop index VIDEO_IDX1;

Index dropped.

SQL> delete from VIDEO where BARCODE ='4457332907';

1 row deleted.

SQL> commit;

Commit complete.

SQL> alter session set events '10046 trace name context forever, level 8';

Session altered.

SQL> insert into VIDEO values ('4457332907', 'KING OF THE HILL - ALL', 'DVD', 90.00, '01-JAN-2011');

1 row created.

SQL> alter session set events '10046 trace name context off';

Session altered.

The contents of the latest trace file are shown below.


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.01       0.00          0          2          5           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.01       0.00          0          2          5           1

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 28  (CS90)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  LOAD TABLE CONVENTIONAL  (cr=1 pr=0 pw=0 time=0 us)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  INSERT STATEMENT   MODE: ALL_ROWS
      0   LOAD TABLE CONVENTIONAL OF 'VIDEO'

This time create two indexes and see what happens.


SQL> CREATE INDEX VIDEO_IDX1 ON VIDEO (FORMAT);

Index created.

SQL> CREATE INDEX VIDEO_IDX2 ON VIDEO (TITLE);

Index created.

Trace file contents:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          1          9           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0          1          9           1

Notice the two additional logical IOs (look under "current" column). Those additional logical input/output operations are the result of the new indexes. The number goes up as we add more indexes and data to the table VIDEO.


SQL> delete from VIDEO where BARCODE ='4457332907';

1 row deleted.

SQL> commit;

Commit complete.

SQL> create index VIDEO_IDX3 on VIDEO (PRICE, DATA_OF_RELEASE);

Index created.

SQL>  alter session set events '10046 trace name context forever, level 8';

Session altered.

SQL> insert into VIDEO values ('4457332907', 'KING OF THE HILL - ALL', 'DVD', 90.00, '01-JAN-2011');

1 row created.

SQL> alter session set events '10046 trace name context off';

Session altered.


SQL trace:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          1         11           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0          1         11           1

You can try other operations such as UPDATE, DELETE on your own.

Since there are only a few rows of data in the table VIDEO, it is hard to notice the real performance impact in these examples. If you really want to see the negative performance impact of a large number of indexes on a heavily updated table, try adding thousands or millions of rows and a few more indexes.

Moral of the story: indexes aren't always cheap -- they carry maintenance overhead. Be aware of that overhead and ensure that it does not offset the performance gains from the indexes created on a particular table.


Monitoring Index Usage

Now we know the possible disadvantage of having too many indexes on a heavily updated table. One way to reduce the index maintenance overhead is to instrument the indexes so we can monitor their usage from time to time and remove the unused indexes. To start monitoring the index usage, alter the index by specifying the keywords MONITORING USAGE.


SQL> select index_name from user_indexes where table_name = 'VIDEO';

INDEX_NAME
--------------------------------------------------------------------------------
VIDEO_IDX3
VIDEO_IDX1
VIDEO_IDX2

SQL> alter index VIDEO_IDX1 MONITORING USAGE;

Index altered.

SQL> alter index VIDEO_IDX2 MONITORING USAGE;

Index altered.

SQL> alter index VIDEO_IDX3 MONITORING USAGE;

Index altered.

Once the indexes are instrumented, query the V$OBJECT_USAGE view occasionally to see if the instrumented indexes are being used in executing SQL queries.


SQL> select * from VIDEO where BARCODE LIKE '%22%';

BARCODE 	TITLE				    FORMAT		 PRICE DATA_OF_RELEASE
--------------- ----------------------------------- --------------- ---------- ---------------
9301224321	AVATAR				    BLU-RAY		 19.99 22-APR-10

SQL> select * from VIDEO where FORMAT = 'VHS';

BARCODE 	TITLE				    FORMAT		 PRICE DATA_OF_RELEASE
--------------- ----------------------------------- --------------- ---------- ---------------
7619203043	BEN-HUR 			    VHS 		  9.79 12-MAR-63

SQL> select * from VIDEO where PRICE < 20;

BARCODE 	TITLE				    FORMAT		 PRICE DATA_OF_RELEASE
--------------- ----------------------------------- --------------- ---------- ---------------
4810218795	MEMENTO 			    DVD 		  8.49 02-FEB-02
7619203043	BEN-HUR 			    VHS 		  9.79 12-MAR-63
7305832093	THE MATRIX			    DVD 		 12.29 03-DEC-99
9301224321	AVATAR				    BLU-RAY		 19.99 22-APR-10

SQL> select * from VIDEO where FORMAT = 'BLU-RAY' AND DATA_OF_RELEASE < '01-JAN-2010';

BARCODE 	TITLE				    FORMAT		 PRICE DATA_OF_RELEASE
--------------- ----------------------------------- --------------- ---------- ---------------
3782460017	THE SIMPSONS - SEASON 20	    BLU-RAY		 29.99 04-JUL-09


SQL> column INDEX_NAME format A25
SQL> column START_MONITORING format A30

SQL> select INDEX_NAME, USED, START_MONITORING 
  2  from V$OBJECT_USAGE 
  3  where INDEX_NAME LIKE 'VIDEO_IDX%'
  4  /

INDEX_NAME		  USED	    START_MONITORING
------------------------- --------- ------------------------------
VIDEO_IDX1		  YES	    04/27/2010 01:10:20
VIDEO_IDX2		  NO	    04/27/2010 01:10:25
VIDEO_IDX3		  YES	    04/27/2010 01:10:31

In the above example, the index VIDEO_IDX2 was not in use during the period of index monitoring. If we are convinced that the queries that will be executed are similar to the ones that were executed during the index monitoring period, we can go ahead and remove the index VIDEO_IDX2 to reduce the performance overhead during updates on table VIDEO.

To stop monitoring the index usage, alter the index with the keywords NOMONITORING USAGE.


SQL> alter index VIDEO_IDX1 NOMONITORING USAGE;

Index altered.

Saturday Apr 03, 2010

Oracle 11g R1: Poor Data Pump Performance when Exporting a Partitioned Table

Symptom(s)

The Data Pump Export utility, expdp, performs well with non-partitioned tables but exhibits extremely poor performance when exporting objects from a partitioned table of similar size. In some cases the degradation can be 3X or worse.

SQL traces may show that much of the time is being spent in a SQL statement that is similar to:


	UPDATE "schema"."TABLE" mtu 
        SET mtu.base_process_order = NVL((SELECT mts1.process_order FROM "schema"."TABLE" mts1 
        WHERE ..

Here is an example data export session:


Export: Release 11.1.0.7.0 - 64bit Production on Wednesday, 31 March, 2010 6:56:50

Copyright (c) 2003, 2007, Oracle.  All rights reserved.
;;; 
Connected to: Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCHMA"."SYS_EXPORT_TABLE_01":  SCHMA/******** DIRECTORY=exp_dir DUMPFILE=SOME_DUMMY_PART_FULL.DMP TABLES=SOME_DUMMY_PART
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 20.56 GB
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
. . exported "SCHMA"."SOME_DUMMY_PART":"DUMMY_PART_P01"  1.143 GB 13788224 rows
. . exported "SCHMA"."SOME_DUMMY_PART":"DUMMY_PART_P02"  1.143 GB 13788224 rows
. . exported "SCHMA"."SOME_DUMMY_PART":"DUMMY_PART_P03"  1.143 GB 13788224 rows
...
. . exported "SCHMA"."SOME_DUMMY_PART":"DUMMY_PART_P32"  151.1 MB 1789216 rows
. . exported "SCHMA"."SOME_DUMMY_PART":"DUMMY_PART_P33"  11.37 MB  136046 rows
. . exported "SCHMA"."SOME_DUMMY_PART":"DUMMY_PART_P00"      0 KB       0 rows
Master table "SCHMA"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCHMA.SYS_EXPORT_TABLE_01 is:
  /DBDUMP/SOME_DUMMY_PART_FULL.DMP
Job "SCHMA"."SYS_EXPORT_TABLE_01" successfully completed at 11:22:36

Solution(s) / Workaround

This is a known issue (that is, a bug) and a solution is readily available. Try any of the following to resolve the issue:

  • Apply the 11g database patch 8845859
  • Upgrade to 11.2.0.2 patchset when it is available, or
  • Specify "VERSION=10.2.0.3" expdp option as a workaround

I ran into this issue and chose the workaround to make some quick progress. With the string "VERSION=10.2.0.3" appended, the export time went down from 265 minutes to a little over 60 minutes.
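For reference, the workaround amounts to appending the VERSION parameter to the export command line. Here is a hedged sketch; the directory object, dump file and table names are the placeholders from the session log above, and the credentials are hypothetical:

```shell
# Not runnable as-is: expdp requires an Oracle 11g client and valid credentials.
# VERSION=10.2.0.3 makes Data Pump generate a dump file compatible with that
# release, sidestepping the slow code path.
expdp SCHMA/password DIRECTORY=exp_dir DUMPFILE=SOME_DUMMY_PART_FULL.DMP \
      TABLES=SOME_DUMMY_PART VERSION=10.2.0.3
```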

Monday Aug 17, 2009

Oracle Business Intelligence on Sun : Few Best Practices

(Updated on 10/16/09 with additional content and restructured the blog entry for clarity and easy navigation)

The following suggested best practices are applicable to all Oracle BI EE deployments on Sun hardware (CMT and M-class) running Solaris 10 or later. These recommendations are based on our observations from the 50,000 user benchmark on the Sun SPARC Enterprise T5440. This is not a complete list, and your mileage may vary.

Hardware : Firmware

Ensure that the system's firmware is up-to-date.

Solaris Recommendations

  • Upgrade to the latest update release of Solaris 10.

  • Solaris runs in 64-bit mode by default on SPARC platform. Consider running 64-bit BI EE on Solaris.

      64-bit BI EE is immune to the 4 GB virtual memory limitation of the 32-bit BI EE platform -- hence it can potentially support more users and larger caches, as long as the hardware resources are available.

  • Enable 256M large pages on all nodes. By default, the latest update of Solaris 10 will use a maximum of 4M pages even when 256M pages are a good fit.

      256M pages can be enabled with the following /etc/system tunables.
      
      * 256M pages for the process heap
      set max_uheap_lpsize=0x10000000
      
      * 256M pages for ISM
      set mmu_ism_pagesize=0x10000000
      
      

  • Increase the file descriptor limits by adding the following lines to /etc/system on all BI nodes.
      
      * file descriptor limits
      set rlim_fd_cur=65536
      set rlim_fd_max=65536
      
      
  • On larger systems with more CPUs or CPU cores, try not to deploy Oracle BI EE in the global zone.

      In our benchmark testing, we observed unpredictable and abnormal behavior of the BI server process (nqsserver) in the global zone under moderate loads. This behavior is clearly noticeable when more than 64 vcpus are allocated to the global zone.

  • If the BI presentation catalog is stored on a local file system, create a ZFS file system to hold the catalog.

      If there are more than 25,000 authorized users in a BI deployment, the default UFS file system may run into a "Too many links" error when the Presentation Server tries to create more than 32,767 sub-directories (refer to LINK_MAX on Solaris).
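To check this limit up front, the link limit of the file system that will hold the catalog can be queried with getconf. A small sketch ("." stands in for the catalog's top-level directory):

```shell
# LINK_MAX bounds the number of hard links -- and therefore sub-directories --
# a directory can hold; UFS reports 32767, ZFS typically a much larger value.
getconf LINK_MAX .
```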

  • Store the Presentation Catalog on a disk with faster I/O, such as a Solid State Drive (SSD). For uniform reads and writes across the disk drives [and, of course, for better performance], we recommend creating a ZFS file system on top of a zpool with multiple SSDs.

    Here is an example that shows the ZFS file system creation steps for the BI Presentation Catalog.

    
    # zpool create -f BIshare c1t2d0s6 c1t3d0s0 c1t4d0s6 c1t5d0s6
    
    # zpool list
    NAME      SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
    BIshare   118G    97K   118G     0%  ONLINE  -
    
    # zfs create BIshare/WebCat
    
    # fstyp /dev/dsk/c1t2d0s6
    zfs
    
    # zpool status -v
      pool: BIshare
     state: ONLINE
     scrub: none requested
    config:
    
            NAME        STATE     READ WRITE CKSUM
            BIshare     ONLINE       0     0     0
              c1t2d0s6  ONLINE       0     0     0
              c1t3d0s0  ONLINE       0     0     0
              c1t4d0s6  ONLINE       0     0     0
              c1t5d0s6  ONLINE       0     0     0
    
    errors: No known data errors
    
    

    Observe the I/O activity on ZFS file system by running zpool iostat -v command.

Solaris : ZFS Recommendations

  • If the file system is mainly used for storing the Presentation Catalog, consider setting the ZFS record size to 8K, since reads from and writes to the BI Catalog are relatively small (8K or less).

    eg.,
    
            # zfs set recordsize=8K BIshare/WebCat
    
    

    In the case of a database, you may have to set the ZFS record size to the database block size.

  • Even though disabling the ZFS Intent Log (ZIL) may improve the performance of synchronous write operations, disabling ZIL is not a recommended practice: doing so may compromise data integrity.

      Disabling the ZIL on an NFS Server can lead to client side corruption.

  • When running CPU intensive workloads, consider disabling ZFS metadata compression to free up CPU cycles for the application.

      Starting with Solaris 10 11/06, metadata compression can be disabled and enabled dynamically as shown below.

      To disable the metadata compression:

      
              # echo zfs_mdcomp_disable/W0t1 | mdb -kw
      
      

      To enable the metadata compression:

      
              # echo zfs_mdcomp_disable/W0t0 | mdb -kw
      
      

      To permanently disable the metadata compression, set the following /etc/system tunable.

      
              set zfs:zfs_mdcomp_disable=1
      
      

Solaris : NFS Recommendations

One of the requirements of OBIEE is that the BI Presentation Catalog must be shared across the BI nodes in the BI Cluster (there will be only one copy of the presentation catalog). Unless the catalog is replicated on each node, there is no choice but to share it across the nodes. One way to do this is to create an NFS share of the catalog's top-level directory and mount it over NFS on the BI nodes.

  • Version 4 is the default NFS version on Solaris 10. However, as of this writing, NFS v4 does not appear to be as mature as v3, so we recommend experimenting with both versions to see which one better fits the needs of the BI deployment.

    To enable NFS v3 on both the server and the client, edit /etc/default/nfs and make the changes shown below.

      NFS Server
      NFS_SERVER_VERSMIN=3
      NFS_SERVER_VERSMAX=3

      NFS Client
      NFS_CLIENT_VERSMIN=3
      NFS_CLIENT_VERSMAX=3
  • Experiment with the following NFS tunables.

      NFS Server
      
      NFSD_SERVERS=<desired_number> <-- on CMT systems with large number of hardware threads you can go as high as 512
      NFS_SERVER_DELEGATION=[ON|OFF] <-- ON is the default. Experiment with OFF
      NFSMAPID_DOMAIN=<network_domain_where_BI_was_deployed>
      
      
      NFS Client
      NFSMAPID_DOMAIN=<network_domain_where_BI_was_deployed>
  • Monitor the DNLC hit rate and tune the directory name look-up cache (DNLC).

      To monitor the DNLC hit rate, run "vmstat -s | grep cache" command. It is ideal to see a hit rate of 95% or above.

      Add the following tunable parameter to /etc/system on NFS server with a desired value for the DNLC cache.

      
              set ncsize=<desired_number>
      
      
  • Mounting NFS Share

    Mount the NFS share that contains the Presentation Services Catalog on all the NFS clients (BI nodes in this context) using the following mount options:

    
            rw, forcedirectio, nocto
    
    

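Putting the pieces together, the catalog share might be mounted on a BI node as follows. This is a hedged example: the host name and paths are placeholders, and on Solaris the NFS file system type is selected with -F nfs.

```shell
# Not runnable outside a Solaris BI node: mounts the shared Presentation
# Catalog read-write, with forced direct I/O and close-to-open consistency
# checking disabled (nocto).
mount -F nfs -o rw,forcedirectio,nocto nfs-server:/export/BIshare/WebCat /export/catalog
```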
Oracle BI EE Cluster Deployment Recommendations

  • Ensure that all the BI components in the cluster are configured in a many-to-many fashion

  • For proper load balancing, configure all BI nodes to be identical in the BI Cluster

  • When planning to add an identically configured new node to the BI Cluster, simply clone an existing well-configured BI node running in a non-global zone.

      Cloning a BI node running in a dedicated zone results in an exact copy of the BI node being cloned. This approach is simple, less error prone and eliminates the need to configure the newly added node from scratch.

Oracle BI Presentation Services Configuration Recommendations

  • Increase the file descriptor limit. Edit SAROOTDIR/setup/systunesrv.sh to increase the value from 1024 to a value of your choice. In addition, increase the shell limit using the ulimit -n command.

    eg.,
    
    	ulimit -n 2048
    
    

  • Configure 256M large pages for the JVM heap of the Chart server and the OC4J web server (this recommendation is equally applicable to other web servers such as WebLogic or Sun Java System Web Server). Also use the parallel garbage collector, and restrict the number of parallel GC threads to one eighth of the number of virtual CPUs.

    eg.,

    
    	-XX:LargePageSizeInBytes=256M -XX:+UseParallelGC -XX:ParallelGCThreads=8
    
    

  • The Oracle BI Presentation Server keeps the access information of all users in the Web Catalog. When there is a large number of unique BI users, looking up a user can take a significant amount of time if all users reside in a single directory. To avoid this, hash the user directories by adding the following entry to SADATADIR/web/config/instanceconfig.xml.

    eg.,

    
    	<Catalog>
    		<HashUserHomeDirectories>2</HashUserHomeDirectories>
    	</Catalog>
    
    

    HashUserHomeDirectories specifies the number of characters used to hash user names into sub-directories. When this element is turned on with a value of 2, for example, the default home directory for user steve becomes /users/st/steve.
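The directory layout produced by this setting can be illustrated with a few lines of shell (this sketches the naming scheme only; it is not OBIEE code):

```shell
# With HashUserHomeDirectories set to 2, the first two characters of the
# user name become an intermediate sub-directory.
user=steve
prefix=$(printf '%s' "$user" | cut -c1-2)
echo "/users/$prefix/$user"    # prints /users/st/steve
```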

  • The BI Server and BI Presentation Server processes create many temporary files while rendering reports and dashboards, which can result in significant I/O activity on the system. The I/O waits can be minimized by pointing the temporary directories to a memory-resident file system such as /tmp on Solaris. To achieve this, add the following line to the instanceconfig.xml configuration file.

    eg.,

    
    	<TempDir>/tmp/OracleBISAW</TempDir>
    
    

    Similarly, the temporary directory (SATEMPDIR) can be pointed to a memory-resident file system such as /tmp to minimize I/O waits.

  • Consider tuning the value of CacheMaxEntries in instanceconfig.xml. A value of 20,000 was used in the 50,000 user OBIEE benchmark on T5440 servers. Be aware that the Presentation Services process (sawserver64) consumes more virtual memory when this parameter is set to a high value.

    eg.,
    
    	<CacheMaxEntries>20000</CacheMaxEntries>
    
    
  • If the Presentation Services log contains errors such as "The queue for the thread pool AsyncLogon is at it's maximum capacity of 50 jobs.", consider increasing the size of the Presentation Services asynchronous job queue; 50 is the default.

    The following example increases the job queue size to 200.

    
    	<ThreadPoolDefaults>
    		<AsyncLogon>
    			<MaxQueue>200</MaxQueue>
    		</AsyncLogon>
    	</ThreadPoolDefaults>
    
    
  • Increase the query cache expiry time, especially when the BI deployment is expected to handle a large number of concurrent users. The default is 60 minutes; however, under very high loads a cache entry may be evicted in under an hour if many queries are being run. Hence it may be necessary to tune the CacheMaxExpireMinutes parameter in the Presentation Services' instanceconfig.xml.

    The following example increases the query cache expiry time to 3 hours.

    
    	<CacheMaxExpireMinutes>180</CacheMaxExpireMinutes>
    
    
  • Consider increasing the Presentation Services' cache timeout values to keep the cached data intact for longer periods.

    The following example increases the cache timeout values to 5 hours in instanceconfig.xml configuration file.

    
    	<AccountIndexRefreshSecs>18000</AccountIndexRefreshSecs>
    	<AccountCacheTimeoutSecs>18000</AccountCacheTimeoutSecs>
    	<CacheTimeoutSecs>18000</CacheTimeoutSecs>
    	<CacheCleanupSecs>18000</CacheCleanupSecs>
    	<PrivilegeCacheTimeoutSecs>18000</PrivilegeCacheTimeoutSecs>
    
    

Oracle BI Server Configuration Recommendations

  • Enable caching at the BI server and control/tune the cache expiry time for each table based on your organization's needs.

  • Unless the repository needs to be edited online frequently, consider setting the repository to "read only" mode. It may ease lock contention to some extent.

  • Increase the session limit and the number of requests per session, especially when the BI deployment is expected to handle a large number of concurrent users. Also increase the number of BI server threads.

    The following configuration was used in 50,000 user OBIEE benchmark on T5440 servers.

    
    (Source configuration file: NQSConfig.INI)
    
    [ CACHE ]
    ENABLE = YES;
    DATA_STORAGE_PATHS = "/export/oracle/OracleBIData/cache" 500 MB;
    MAX_ROWS_PER_CACHE_ENTRY = 0;
    MAX_CACHE_ENTRY_SIZE = 10 MB;
    MAX_CACHE_ENTRIES = 5000;
    POPULATE_AGGREGATE_ROLLUP_HITS = NO;
    USE_ADVANCED_HIT_DETECTION = NO;
    
    // Cluster-aware cache
    GLOBAL_CACHE_STORAGE_PATH = "/export/oracle/OracleBIsharedRepository/GlobalCacheDirectory" 2048 MB;
    MAX_GLOBAL_CACHE_ENTRIES = 10000;
    CACHE_POLL_SECONDS = 300;
    CLUSTER_AWARE_CACHE_LOGGING = NO;
    
    [ SERVER ]
    READ_ONLY_MODE = YES;
    MAX_SESSION_LIMIT = 20000 ;
    MAX_REQUEST_PER_SESSION_LIMIT = 1500 ;
    SERVER_THREAD_RANGE = 512-2048;
    SERVER_THREAD_STACK_SIZE = 0;
    DB_GATEWAY_THREAD_RANGE = 512-512;
    
    #SERVER_HOSTNAME_OR_IP_ADDRESSES = "ALLNICS";
    CLUSTER_PARTICIPANT = YES;
    
    

Related Blog Posts

Saturday Dec 20, 2008

Siebel on Sun Solaris: More Performance with Fewer mprotect() Calls

By default, each transaction in the Siebel CRM application makes a large number of serialized mprotect() calls, which may degrade Siebel's performance. Under very high load on the Siebel application servers, the mprotect() calls are serialized by the operating system kernel, resulting in a high number of context switches and low CPU utilization.

If a Siebel deployment exhibits the pathological conditions described above, the performance and scalability of the application can be improved by limiting the number of mprotect() calls made by the application server processes at run time. To achieve this, set the value of Siebel CRM's AOM tunable parameter MemProtection to FALSE. Starting with Siebel 7.7, MemProtection is a hidden parameter with a default value of TRUE. To set its value to FALSE, run the following command from srvrmgr, the CLI version of Siebel Server Manager.


change param MemProtection=False for comp <component_alias_name> server <siebel_server_name>

where:

component_alias_name is the alias name of the AOM component to be configured (e.g., SCCObjMgr_enu is the alias for the Call Center Object Manager), and

siebel_server_name is the name of the Siebel Server for which the component is being configured.

Note that this parameter is not dynamic -- the Siebel application server(s) must be restarted for it to take effect.

Run truss -c -p <pid_of_any_busy_siebmtshmw_process> before and after the change to see how the mprotect system call count varies.

For more information about this tunable on Solaris platform, check Siebel Performance Tuning Guide Version 7.7 or later in Siebel Bookshelf.

See Also:
Siebel on Sun CMT hardware : Best Practices

(Originally posted on blogger at:
Siebel on Sun Solaris: More Performance with Less mprotect() Calls)

Tuesday Apr 08, 2008

Running Batch Workloads on Sun's CMT Servers

Ever since Sun introduced Chip Multi-Threading (CMT) hardware in the form of the UltraSPARC T1 based T1000/T2000, our internal mail aliases have been inundated with customer stories, the majority of which go like this: 'batch jobs take 12+ hours on a T2000, whereas they take only 3 or 4 hours on a US-IV+ based v490'. Even two and a half years after the introduction of this revolutionary CMT hardware, the majority of Sun customers still appear to be under the impression that Sun's CMT systems such as the T2000 and T5220 are not capable of handling CPU intensive batch workloads. That concern is unfounded. CMT processors such as the UltraSPARC T1, T2 and T2 Plus can handle batch workloads just as well as traditional processors such as the UltraSPARC-IV+, SPARC64-VI, AMD Opteron, Intel Xeon and IBM POWER6. However, CMT awareness and a little effort are required on the customer's end to achieve good throughput on CMT systems.

First of all, end users must realize that the maximum clock speed of the existing CMT processor line-up (UltraSPARC T1, UltraSPARC T2, UltraSPARC T2 Plus) is only 1.4 GHz; on top of that, each strand (individual hardware thread) within a core shares CPU cycles with the other strands on the same core (note: each core operates at the speed of the processor). Given these facts, it is no surprise to see batch jobs take longer when only one or a very few single-threaded batch jobs are submitted to the system; in such cases the system resources are heavily under-utilized in addition to the longer elapsed times. One trick to achieve the required throughput in the expected time frame is to split the workload into multiple jobs. For example, if an EDU customer needs to generate 1000 transcripts, the customer should consider submitting 4 individual jobs of 250 transcripts each, or 8 jobs of 125 each, rather than one job for all 1000 transcripts. Ideally the customer should observe resource utilization (CPU%, for example) and experiment with the number of jobs submitted until the system achieves the desired throughput within the expected time frame.
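The splitting strategy above can be sketched in a few lines of shell. This is a hedged illustration: run_transcripts is a hypothetical stand-in for the real batch driver, shown here as a stub.

```shell
# Split a 1000-transcript batch across 4 parallel jobs of 250 each.
total=1000
jobs=4
per_job=$((total / jobs))

run_transcripts() {   # placeholder: process $2 transcripts starting at offset $1
    echo "job: offset=$1 count=$2"
}

i=0
while [ "$i" -lt "$jobs" ]; do
    run_transcripts $((i * per_job)) "$per_job" &   # launch each chunk in parallel
    i=$((i + 1))
done
wait    # all four jobs run concurrently; each handles 250 transcripts
```

Each background job keeps one more hardware strand busy; the right job count is found by watching CPU utilization, as suggested above.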

Case study: Oracle E-Business Suite Payroll 11i workload on Sun SPARC Enterprise T5220

To prove beyond a reasonable doubt that the aforementioned methodology works, let's take Oracle's E-Business Suite 11.5.10 Payroll workload as an example. On a single T5220 with one 1.4 GHz UltraSPARC T2 processor, acting as the batch, application and database server, 4 payroll threads generated 5,000 paychecks in 31.53 minutes while consuming only 6.04% CPU on average -- a projected hourly throughput of ~9,500 paychecks. This is a classic example of what the majority of Sun's CMT customers are experiencing today: long batch processing times with little resource consumption. Keep in mind that the UltraSPARC T2 and UltraSPARC T2 Plus processors can each execute up to 64 jobs in parallel (as a side note, the UltraSPARC T1 can execute up to 32). To put the idling resources to effective use, and thereby improve the elapsed times and the overall throughput, a few experiments were conducted with up to 64 payroll threads, and the results are very impressive. With a maximum of 64 payroll threads, it took only 4.63 minutes to process 5,000 paychecks at an average of 40.77% CPU utilization. In other words, a similarly configured T5220 can process ~64,700 paychecks per hour using less than half of the available CPU cycles. A word of caution: just because the processor can execute 64 threads in parallel doesn't mean it is always optimal to submit 64 parallel jobs on systems like the T5220. A very high number of batch jobs (payroll threads in this scenario) may be overkill for simple tasks like NACHA in the Payroll process.
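The projected hourly figure can be sanity-checked with simple arithmetic; the white paper's 64,748 figure presumably uses the exact elapsed seconds, so the rough estimate below differs slightly:

```shell
# 5,000 paychecks in 4.63 minutes, extrapolated to one hour.
awk 'BEGIN { printf "%d\n", 5000 * 60 / 4.63 }'    # prints 64794
```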

The following white paper has more detailed information about the nature of the workload and the results from the experiments with various number of threads for different components of the Oracle Applications' Payroll batch workload. Refer to the same white paper for the exact tuning information as well.

Link to the white paper:
     E-Business Suite Payroll 11i (11.5.10) using Oracle 10g on a Sun SPARC Enterprise T5220

Here is the summary of the results that were extracted from the white paper:

Hardware configuration
          1x Sun SPARC Enterprise T5220 for running the application, batch and the database servers
              Specifications: 1x 1.4 GHz 8-core UltraSPARC T2 processor with 64 GB memory

Software configuration
          Oracle E-Business Suite 11.5.10
          Oracle 10g R1 10.1.0.4 RDBMS
          Solaris 10 8/07

Results
Oracle E-Business Suite 11i Payroll - Number of employees: 5,000
Component            #Threads   Time (min)   Avg. CPU%   Hourly Throughput
Payroll process         64         1.87        90.56            160,714
PrePayments             64         0.20        46.33          1,500,000
Ext. Proc. Archive      64         1.90        90.77            157,895
NACHA                    8         0.05         2.52          6,000,000
Check Writer            24         0.38         9.00            782,609
Costing                 48         0.23        32.50          1,285,714
Total or Average        NA         4.63        40.77             64,748

It is evident from the average CPU% that the Payroll process and External Process Archive components are extremely CPU intensive, and hence take the longest to complete. That is why 64 threads were configured for those components, to run the system at its full potential. Light-weight components like NACHA need fewer threads to complete the job efficiently; configuring 64 threads for NACHA would have a negative impact on the throughput. In other words, we would be wasting CPU cycles for no apparent improvement.

It is the responsibility of the customers to tune the application and the workload appropriately. One size doesn't fit all.

The Payroll 11i results on the T5220 demonstrate clearly that Sun's CMT systems are capable of handling batch workloads well. It would be interesting to see how they perform against systems equipped with traditional processors running at higher clock speeds. For this comparison, we can use a couple of results published by UNISYS and IBM with the same workload. The following table summarizes the results from the two white papers listed below; for the sake of completeness, Sun's CMT results are included as well.

Source URLs:
  1. E-Business Suite Payroll 11i (11.5.10) using Oracle 10g on a UNISYS ES7000/one Enterprise Server
  2. E-Business Suite Payroll 11i (11.5.10) using Oracle 10g for Novell SUSE Linux on IBM eServer xSeries 366 Servers
Oracle E-Business Suite 11i Payroll - Number of employees: 5,000
Vendor OS Hardware Config #Threads Time (min) Avg. CPU% Hourly Throughput
UNISYS Linux: RHEL 4 Update 3 DB/App/Batch server: 1x Unisys ES7000/one Enterprise Server (4x 3.0 GHz Dual-Core Intel Xeon 7041 processors, 32 GB memory) 12 [1] 5.18 min 53.22% 57,915
IBM Novell SUSE Linux Enterprise Server 9 SP1 DB, App servers: 2x IBM eServer xSeries 366 4-way server (4x 3.66 GHz Intel Xeon MP Processors (EM64T), 32 GB memory) 12 8.42 min 50+% [2] 35,644
Sun Solaris 10 8/07 DB/App/Batch server: 1x Sun SPARC Enterprise T5220 (1x 1.4 GHz 8-core UltraSPARC T2 processor, 64 GB memory) 8 to 64 4.63 min 40.77% 64,748

The results speak for themselves. One 1.4 GHz UltraSPARC T2 processor outperformed four 3 GHz / 3.66 GHz processors in average CPU utilization and, most importantly, in hourly throughput (the hourly throughput calculation relies on the total elapsed time).

Before we conclude, let us reiterate a few things purely based on the factual evidence presented in this blog post:

  • Sun's CMT servers such as the T2000, T5220 and T5240 (a two-socket system with UltraSPARC T2 Plus processors) are well suited to batch workloads like Oracle Applications Payroll 11i,

  • Sun's CMT servers such as the T2000, T5220 and T5240 are well suited to running the Oracle 10g RDBMS when the DML/DDL/SQL statements that make up the majority of the workload are not very complex, and

  • when the application is tuned appropriately, CMT processors can outperform some of the traditional processors that were touted to deliver the best single-thread performance

Footnotes

1. There is a note in the UNISYS/Payroll 11i white paper that says "[...] the gains {from running increased numbers of threads} decline at higher numbers of parallel threads." This is quite contrary to what Sun observed in its Payroll 11i experiments on the UltraSPARC T2 based T5220: a higher number of parallel threads (maximum: 64) improved the throughput on the T5220, whereas UNISYS' observation is based on experiments with a maximum of 12 parallel threads. Moral of the story: do NOT treat all hardware alike.

2. IBM's Payroll 11i white paper has no references to the average CPU numbers. 50+% was derived from the "Figure 3: Average CPU Utilization".
