« February 2008 | Main | April 2008 »

March 2008 Archives

March 1, 2008

Common Incorrect beliefs about process address limits on 32bit platforms

Preface

Most people believe that one cannot go beyond 1.2g-1.7g or so as a maximum heapsize for Java servlet based applications deployed with 9iAS/10gAS. The ignorant would cite this as a limitation of 32bit Intel architecture.

In reality, this assumption is only partially true as it applies only to SMP Kernels and that too with the default -unlimited native stack size per process. People might say "Your results may vary" but the truth is that there are several factors like process address space & stack limits, kernel limits, shared memory support etc which need to be understood to make an educated comment on the question of  - What is the best limit I can push the JVM heap to for my application and How can I do it  without compromising stability ?.

Kernel Limitations and known issues of JDK1.4 Heaps on 32-bit platform vis-a-vis 64-bit platforms

While it is generally believed that the maximum memory addressable space for a process on 32-bit platforms is 1.7g , this is only true for the SMP kernel.

Right? Well, only if unlimited(usually the default, given by $ ulimit -s command ) native stack size is used. If any lesser value like 10m, 8m, 6m, 4, 2m are used, the user address space increases to 2.7g.

There are white papers on metalink having details how the SGA (implemented as OS shared memory area) of an oracle database on 32-bit platform can be pushed to 3.5g. This is achieved by lowering the starting address point in the memory address stack for the oracle executable and relinking it, so that the memory addressable space increases, when a process from the oracle executable is spawned.

For JVMs, the application specific golden limit has really to be found out by trial and error by tweaking ulimit -s and some other factors.

The easy command for finding the maximum heapsize limit can be found by raising the value of -Xmx and -Xms switches at jvm startup:

# e.g. if you want to see if a heap of 1400m can be allocated and committed at startup
$ java -version -Xmx1400m -Xms1400m 

If this doesnt fail and you get the version of java in the output, keep pushing the OS with higher heap sizes and see when the segmentation fault Signal 11 errors start coming. You have reached the forbidden limit and would do well to keep ~300-400m off this limit for accomodating native stack space for java threads. The native stack space requiement would vary as per concurrent user load of the application.

The golden memory rule:

The golden rule of thumb for max address space per process to be kept in mind is:


(Total size of managed heap + native heap) + (number of threads X thread stack size) = max process address space allowed by Kernel


Viewing it in a more wholistic way,


Data segment (=Total size of managed heap + native heap) + stack segment(= # of threads X thread stack size) = max user address space allowed by Kernel

More possibilities with JDK1.5

There are some exciting things which can be done using JDK1.5, like addressing 3G & more of heapsize & using LargePages feature, using which its possible to eliminate heavy swapping, iowait & Garbage collection, since the heap is kept in real memory.
It should be noted that JDK1.5 is only supported at this time with Oracle Applications 11.5.10. Its not yet certified yet with 11.5.9 mostly due to logistical reasons and for avoiding regression testing efforts with JDK1.6, which Oracle plans supposedly plans on doing.

Related Links and Articles

Case study: How to leverage LargePages option of J2SE1.5 on Linux hugemem kernel for 11i Oracle Apps

Redhat Linux Kernels and process space limits using Java.. Some eye-openers

Different Kernels available and their memory limits using Java

There are at least 3 different kind of Kernels available in Redhat linux:
1) SMP kernel:- This kernel is the most common variant used and supports only upto 4GB Virtual Memmory ((2**32 = 4GB) linear addresses). It gives an address space of 3G virtual memory peruser process & 1G for kernel, but after deducing stack overhead, the maximum user addressable space limit comes to 1.7g.
The split is:
1.0 GB Text/code
1.7 GB available process memory for mmap, malloc / address space
     (e.g. usable by shared Pool)
0.3 GB Stack
1.0 GB Kernel
2) Enterprise Kernel:- In this kernel, the process address space can go up to 64GB VM (2**36 = 64GB linear addresses). This feature is called Page Address Extension (PAE) and is also referred to as Very Large Memory (VLM) support.
The enterprise kernel provides VLM support out of the box, BUT, it requires relinking of the kernel, which is a pretty unpopular option with Redhat and usually last resort of system administrators.
Note: You should be aware that Oracle Corp. will stop supporting any Oracle
          installation on a recompiled kernel, particularly Red Hat.
3) Hugemem Kernel:- Red Hat Enterprise Linux 3.0/4.0 includes a new kernel known as the hugemem kernel. This kernel supports a 4GB per process user space (versus 3GB for the other kernels), and a 4GB direct kernel space.
  The hugemem kernel is usually required in order to use all the memory in system configurations containing more than 16GB of memory. The hugemem kernel can also benefit configurations running with less memory (if running an application that could benefit from the larger per process user space).
The virtual address space is split as:
1.0 GB Text/code
2.7 GB available process memory for mmap, malloc / address
        space (e.g. usable by shared Pool)
0.3 GB Stack
4.0 GB Kernel

Understanding how one kernel is better than other...

It is important to understand the pros and cons between using SMP, Enterprise and Hugemem kernels. For applications having large virtual memory requirements, even the enterprise kernel could be a solution using the VLM feature in RHEL3/4, but it would entail creating a very large buffercache, using options mentioned in Note 262004.1 for Oracle Databases (one kind of application).
On top of the page addressing problem for 64GB linear addresses, there will be a problem in kernel memory if we have a lot of memory. It would be required to build a Page Translation Extension (PTE) for all available memory pages and this restricts the memory we can address.
Typically the kernel memory of 1gb is completely used up if its required to address just 24GB of memory. To overcome this problem we need a large kernel memory and this is where the hugemem kernel beats the Enterprise kernel as it provides 4G of Kernel memory.

Experimenting with process limits across different factors..

We can find out the maximum number of threads which can be spawned on a platform across any JVM version by running a program written in Java (ThreadTest.java) -- listing available in Appendix A.
This program just keeps on spawning threads and keeps them alive almost forever. When it reaches the Java.Lang.OutOfMemory error on spawning a thread, it outputs the maximum number of threads it was able to spawn.
Based on this information and the native java thread stack size specified on JVM startup and using the golden memory rule, we can calculate the total virtual memory address space of the JVM process.
For a listing of the java source code of and the unix shell script check_thread_limits.sh, please refer to Appendix B-1. The results of the max threads spawned and Maximum user address space by varying JDK version, Kernel (SMP, Hugemem), ulimit, stacksize are listed in Appendix B-2.
As can be seen from Appendix B-2:
For SMP kernels:
With ulimit -s unlimited:
(Total size of managed heap + native heap) + (number of threads X thread stack size) ~ 1.7G
With ulimit -s 10m or 8m or 6m or 4m or 2m:
(Total size of managed heap + native heap) + (number of threads X thread stack size) ~ 2.7G
For Hugemem kernels:
With ulimit -s unlimited:
(Total size of managed heap + native heap) + (number of threads X thread stack size) ~ 2.5G
With ulimit -s 10m or 8m or 6m or 4m or 2m:
(Total size of managed heap + native heap) + (number of threads X thread stack size) ~ 3.7G
Note: This also implies that the "unlimited" native stack size constant is not the same as 10m as might be seen in comments in /usr/src/linux/include/sched.h for _STK_LIM constant, rather its some valule which is very high as compared to 10m. Sometime, seeing and reading comments in source code files is not believing, as I found out.


Appendix A : Listing of ThreadTest.java

public class ThreadTest {

public static void main(String[] pArgs) throws Exception {

try {
// keep spawning new threads forever
      while (true) {
      new TestThread().start();
      }
   }
// when out of memory error is reached, print out the number of
// successful threads spawned and exit
   catch ( OutOfMemoryError e ) {
     System.out.println(TestThread.CREATE_COUNT);
     System.exit(-1);
   }
  }
  static class TestThread extends Thread {
    private static int CREATE_COUNT = 0;
    public TestThread() {
     CREATE_COUNT++;
    }
// make the thread wait for eternity after being spawned
    public void run() {
      try {
        sleep(Integer.MAX_VALUE);
      }
// even if there is an interruption, dont do anything
      catch (InterruptedException e) {
      }
    }
  }
}

Appendix B-1 : Listing of check_thread_limits.sh

check_thread_limits code:

Appendix B-2 : Results of check_threads_limits.sh across different Kernel, Java Version and native Java stack size

SMP Kernel results:

smp_check_thread_limits.sh_output:

Hugemem Kernel results:

hugemem_check_thread_limits.sh_output:

Case study: How to leverage +LargePages option of J2SE1.5 on Linux hugemem kernel for 11i Oracle Applications java servlet based modules

Preface

This case study deals with Oracle configurator product (shortcode CZ) of 11i Oracle Applications and how to make it more stable and scalable by leveraging the LargePages option provided in J2SE1.5. This paper is in continuation to earlier article Common Incorrect beliefs about process address limits on 32bit platforms
It is assumed that the maintenance release of 11i Oracle Applications is 11.5.10 as per certification specified in Note 304099.1 - Using J2SE Version 5.0 with Oracle E-Business Suite 11i. To find the latest version of Java certified with the E-Business Suite application tier, please see OracleMetaLink Note 300482.1, titled "Overview of Using Java with Oracle E-Business Suite Release 11i".
This study assumes RHAS3 as the running platform OS.

The general approach and the case for this study

To increase stability and response time of a java servlet based application (say Oracle configurator), at least these two conditions must be satistifed:

  • Garbage collection must be minimized - The more the overall heapsize, the lesser will be the contention for space across different sessions, assuming the user load remains the same. The lesser the demand for object or session allocation in the JVM heap, the lesser will be the garbage collection.
  • Response time must be reasonably high - This, again is dependent upon garbage collection. During full garbage collection (GC) cycles, the application is not able to respond to the user activity because its doing background maintenance. Therefore, the lesser the garbage collection, the higher is the response time.
There might also be a requirement for doing a considerable caching of configurator models in the JVM heap, which adds to the memory overhead. Using LargePage option can be useful in such a situation.

Installing the hugemem kernel

Based on the article Redhat Linux Kernels and process space limits using Java.. Some eye-openers , it is already established that the hugemem kernel allows to address upto 3.4G of process address space.

Install the hugemem kernel rpm for the same linux kernel level on the configurator tier. As an example For 2.4.21-40 level, the kernel-hugemem-2.4.21-40.EL.i686.rpm is the corresponding rpm.

As root:

# rpm -Uvh kernel-hugemem-2.4.21-40.EL.i686.rpm
Since you would be persisting with the hugemem kernel across boots, make Hugemem the default kernel by specifying the default=0 line in the Boot loader configuration file. Assuming the boot Loader is GRUB, the file would be /etc/grub.conf. default= specifies the order of the kernel listed in loader configuration file, starting with 0 as the first. It might be set to different value(1,2,3) depending on however many kernels installed on the machine.


Note: Installing the hugemem kernel from the rpm does put the hugemem kernel as the first entry in the boot loader configuration file (/etc/grub.conf), therefore default=0 will be needed to be specified.


#more /etc/grub.conf


# grub.conf generated by anaconda


#


# Note that you do not have to rerun grub after making changes


# to this file


# NOTICE: You have a /boot partition. This means that


# all kernel and initrd paths are relative to /boot/, eg.


# root (hd0,0)


# kernel /vmlinuz-version ro root=/dev/rootvg/lv1


# initrd /initrd-version.img


#boot=/dev/sda


default=0


timeout=10


splashimage=(hd0,0)/grub/splash.xpm.gz


title Red Hat Enterprise Linux ES (2.4.21-40.ELhugemem)


root (hd0,0)


kernel /vmlinuz-2.4.21-40.ELhugemem ro root=/dev/rootvg/lv1 hda=ide-scsi


initrd /initrd-2.4.21-40.ELhugemem.img


.....


.....


Create a symbolic link for System.map in /boot to point to System.map file specific to hugemem kernel.

#pwd
/boot
# ls -l System.map
lrwxrwxrwx 1 root root 30 May 24 20:55 System.map -> System.map-2.4.21-40.ELhugemem
This is a also good time to install the jdk-1_5_0_06-linux-i586.rpm (J2SE(TM) Development Kit 5.0 Update 6 from http://java.sun.com).


As root:

# rpm -Uvh jdk-1_5_0_06-linux-i586.rpm

Warn the Operating System that the limits will be challenged:

  • Add the following in /etc/security/limits.conf

applmgr soft nofile 65536
applmgr hard nofile 65536


applmgr soft nproc 16384


applmgr hard nproc 16384


applmgr soft stack unlimited


applmgr hard stack unlimited


applmgr soft memlock unlimited


applmgr hard memlock unlimited

  • Set the ulimit -a for the unix user running the JVM  like this:

# ulimit -a
time(cpu-seconds) unlimited


file(blocks) unlimited


coredump(blocks) 0


data(kbytes) unlimited


stack(kbytes) unlimited


lockedmem(kbytes) unlimited


memory(kbytes) unlimited


nofiles(descriptors) 65536


processes 16384

Implement hugepage & supporting features in OS

Make kernel configuration changes for supporting Hugepages and shared memory segments. The example assumes that you would be locking 9GB of real memory on the machine with 12G of RAM.

  • Add this the /etc/sysctl.conf :

##############################################################
# Added for huge page access


#


# This is worth of 9GB locked in Real Memory


# This limit needs to be changed as per the available memory

# on the Box with 12g RAM. Change this as per your

# requirement
#


##############################################################


vm.hugetlb_pool=9216


##############################################################


# Added to keep maxsize of shared memory segment at 3.5g


#


# This is required to support LargePages feature of JDK1.5

# and to avoid the following message at JVM startup:
# Java HotSpot(TM) Server VM warning: Failed to reserve

# shared memory (errno = 22).
##############################################################


kernel.shmmax=3758096384


kernel.shmall=3758096384


##############################################################

  • Mount hugepage filesystem as root:
# mkdir -p /mnt/hugepages
  • The following entries should be there in /etc/fstab:

# vi /etc/fstab
..


..


##############################################################


# These entries are needed to make hugemem work


##############################################################


none /dev/shm tmpfs defaults,size=9G 0 0


##############################################################


# In the above setting, size should be modified as real


# memory being locked in sysctl.conf for vm.hugetlb_pool


##############################################################


nodev /mnt/hugepages hugetlbfs rw 0 0


##############################################################

  • Reboot the machine to boot into Hugemem Kernel and for allocating hugepages to reflect in /proc/meminfo. After the reboot, it should look like this:

$ grep -i huge /proc/meminfo
HugePages_Total: 4608


HugePages_Free: 4608


Hugepagesize: 2048 kB

Note: For dual core CPU Dell machines, it was found that even though Hugepagesize was 2mb, the number pages to be specified in /etc/sysctl.conf had to be doubled to make it worth a particular memory limit, say 9G. You are advised to experiment & verify with the output of previous command on your machine to check if this holds true for you too.

  • Check that the hugepagefs and tmpfs filesystems have been mounted with the required options:

# mount | grep huge
nodev on /mnt/hugepages type hugetlbfs (rw)


# mount | grep tmpfs


none on /dev/shm type tmpfs (rw,size=9G)

Prepare the Java servlet application for leveraging the LargePage JVM option


$ whoami


applmgr


$ vi $IAS_ORACLE_HOME/Apache/Apache/bin/java.sh


# ###############################################################


#


# This file is automatically generated by AutoConfig. It will be


# read and overwritten. If you were instructed to edit this file,


# or if you are not able to use the settings created by


# AutoConfig, refer to Metalink document 165195.1 for assistance.


#


# ###############################################################


# Added ulimit -s 8192 to make jdk1.5 work with modified native


# thread stack size ONLY for this servlet. This will keep the


# effect local.


#################################################################


ulimit -s 8192


JSERVJAVA=/usr/java/jdk1.5.0_06/jre/bin/java


#################################################################


ARGV="$*"


JVMLOGDIR=/ORACLE/prod2/9iAS/Apache/Jserv/logs/jvm


STDOUTLOG=$JVMLOGDIR/jserv.stdout


STDERRLOG=$JVMLOGDIR/jserv.stderr


...


...


...


#################################################################

  • In $IAS_ORACLE_HOME/Apache/Jserv/etc/jserv.properties, add the JVM switches directive line in the # Begin customizations & # End customizations block at the very end of the file. Remember to put that all these JVM switches in the same line:
# vi $IAS_ORACLE_HOME/Apache/Jserv/etc/jserv.properties


...


...


...


# Begin Customizations


################################################################


# Added to make JVM1.5 work with Oracle Configurator


################################################################


wrapper.bin=/ORACLE/prod3/9iAS/Apache/Apache/bin/java1.5.sh


################################################################


# the max heapsize specified by -Xmx will have to be changed as


# per the max available RAM On the config machine e.g. if the


# RAM=12g and we are reserving 9g for JVM heaps, then each JVM


# can have a max Heapsize of 2.5g or so:


# (2.7 g + 300g -> for perm generation overhead ) x 3 JVMs ~ 9g


# minus spare


################################################################


wrapper.bin.parameters=-verbose:gc -Xmx2750M -Xms2750m -Xss128k -XX:PermSize=128m -XX:MaxPermSize=256M -XX:NewRatio=3 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:ParallelGCThreads=4


-XX:+PrintGCTimeStamps -XX:+UseTLAB -XX:+UseLargePages -server


################################################################


# End Customizations


Note: It is important to keep these directives in # Begin & End Customization section so as to preserve these settings across AutoConfig sessions.

  • You will need to comment out the following 2 lines in the starting section of $IAS_ORACLE_HOME/Apache/Jserv/etc/jserv.properties :
#######################


# Execution parameters


#######################


# The Java Virtual Machine interpreter.


# Syntax: wrapper.bin=[filename] (String)


# Note: specify a full path if the interpreter is not visible in


# your path.


# wrapper.bin=/ORACLE/prod2/9iAS/Apache/Apache/bin/java.sh


# Arguments passed to Java interpreter (optional)


# Syntax: wrapper.bin.parameters=[parameters] (String)


# Default: NONE


################################################################


# comment out this line so that the java servlet will at least


# startup


################################################################


# wrapper.bin.parameters=-verbose:gc -Xmx1024M -Xms128M -XX:MaxPermSize=128M -XX:NewRatio=2 -XX:+PrintGCTimeStamps -


XX:+UseTLAB


################################################################


Note: It should be remembered that the default wrapper.bin directive for java.sh and default JVM options will get uncommented after running Autoconfig. So you would have to remember to comment them again after running Autoconfig, or else the Apache servlet will not start.

Why dont we do it this way?

One good question would be: why dont we put the JVM switches in the autoconfig token jvm_options in $CONTEXT_FILE? This would alleviate the need for manually updating the $IAS_ORACLE_HOME/Apache/Jserv/etc/jserv.properties each time after AutoConfig. For example.


<jvm_options oa_var="s_jvm_options" osd="Linux">-verbose:gc -Xmx1024M -Xms128M -XX:MaxPermSize=128M -XX:NewRatio=2 -XX:+PrintGCTimeStamps -XX:+UseTLAB</jvm_options>


The answer is simple. Setting the JVM options here will cause EACH servlet setup in the configurator tier to use the same switches. Assume that there are 2 more servlets configured to be running on the same machine (iPayment and OXTA servlets), the total memory taken up the servlets will be 3 X 3g ~ 9G. If you were planning on running 3 confugurator JVMs with LargePage option, all your real memory is gone already.


Hence the manual change step. At the same time, if your IAS_ORACLE_HOME is setup in such a way through the *.properties file that ONLY the configurator servlet is supposed to startup, then this method would work as well. So, this is also a viable option & probably a better one in this situation.

Reap the benefits

Now when you start the confgurator servlet using $SCRIPTS/adapcctl.sh, you can easily verify that the servlet JVM process has 3G of heapsize allocated and that the Garbage collection time spent is considerably reduced after using the parallel GC JVM options used above.


........


........


45990.174: [ParNew 1640767K->1624634K(2457536K), 0.0106000 secs]


46718.218: [ParNew 1640888K->1624758K(2457536K), 0.0133860 secs]


46913.196: [ParNew 1641014K->1628789K(2457536K), 0.0291830 secs]


........


.........


The advantage of using largepages is that the heap memory will be locked and is not eligible for paging out on swap. The access of objects in heap in real memory is obviously faster.


Hence, using Largepages option is a good option  for achieving performance goals, especially when the caching demands in the servlet are considerable and there is a section of RAM on the machine running servlet, which is not being put to any use.

Related articles and links

HugePages on Linux: What It Is... and What It Is Not... Metalink Note:361323.1


Bigpages vs. Hugetlb on RedHat Linux                            Metalink Note:261889.1


Java support for Large Memory Pages


Java Tuning White Paper

A tuning case study: The goofy optimizer (9i.x RDBMS )

Preface

They say: Tune the application first. And then, many a times, the CBO behaviour is not predictable or favourable, inspite of its best intentions or internal hueristic rules or availability of latest statistics with good sampling percent.
This is one such example and how the usage of hints had to be used as one of the last options to improve the performance of a concurrent program.
Ideally speaking, the CBO chooses what is usually either the best or one of the best available data access plan for executing and fetching the sql output. It should take into considerations all the underlying objects statistics and use the least costly data access plan. Sometimes, it doesnt happen out of the box.
Note: Some seeded 11i Oracle Application modules also depend heavily on hints to make the performance of their code acceptable. One such heavy user is the QP or Advanced Pricing module.

The Problem Scenario...

At a client site, a shipping related custom concurrent program started ~5 hrs to complete on an average, rather then the usual ~3 hrs. From preliminary investigation using RDBMS trace (event 10046) and PL/SQL dbms profiler, that the maximum time (about ~4.5 hrs) was taken up by the following innocent looking SQL itself!!
SELECT HP.CATEGORY_CODE CATEGORY_CODE ,
       HC.SALES_CHANNEL_CODE  SALES_CHANNEL_CODE ,
       HC.CUSTOMER_CLASS_CODE CUSTOMER_CLASS_CODE ,
       SUM(L.SHIPPED_QUANTITY * L.UNIT_SELLING_PRICE)
         ORDER_SHIP_TOT ,
       L.ACTUAL_SHIPMENT_DATE SHIP_DATE ,
       DECODE (H.ORG_ID, 362, 'US', 365, 'CAD',
                         5624, 'MEX', H.ORG_ID) OPERATING_UNIT ,
       TT.NAME ORDER_TYPE
FROM
  ONT.OE_ORDER_HEADERS_ALL H ,
  APPS.OE_ORDER_LINES_ALL L ,
  APPS.HZ_PARTIES HP ,
  APPS.HZ_CUST_ACCOUNTS HC ,
  APPS.HZ_CUST_SITE_USES_ALL HCSU ,
  APPS.HZ_CUST_ACCT_SITES_ALL HCAS ,
  APPS.OE_ORDER_SOURCES OS ,
  APPS.OE_TRANSACTION_TYPES_TL TT ,
  XXIW.XXIW_BS_OM_ORDER_TYPES XOT
WHERE
     H.INVOICE_TO_ORG_ID = HCSU.SITE_USE_ID(+)
AND H.ORDER_SOURCE_ID = OS.ORDER_SOURCE_ID
AND OS.NAME <> 'Satcom Closed'
AND H.ORDER_TYPE_ID = TT.TRANSACTION_TYPE_ID
AND TT.LANGUAGE = 'US'
AND TT.NAME NOT LIKE 'INTERNAL%'
AND TRUNC(L.ACTUAL_SHIPMENT_DATE)
     BETWEEN NVL(:B2 , TRUNC(L.ACTUAL_SHIPMENT_DATE))
         AND NVL(:B1 , TRUNC(L.ACTUAL_SHIPMENT_DATE))
AND HCSU.CUST_ACCT_SITE_ID = HCAS.CUST_ACCT_SITE_ID(+)
AND HCAS.CUST_ACCOUNT_ID = HC.CUST_ACCOUNT_ID(+)
AND HC.PARTY_ID = HP.PARTY_ID(+)
AND UPPER(HC.CUSTOMER_CLASS_CODE) <> 'INTERNAL'
AND UPPER(HC.SALES_CHANNEL_CODE) <> 'INTERNAL'
AND H.FLOW_STATUS_CODE NOT IN ('CANCELLED','ENTERED')
AND H.BOOKED_DATE IS NOT NULL AND L.FLOW_STATUS_CODE = 'CLOSED'
AND L.LINE_CATEGORY_CODE <>'RETURN'
AND L.HEADER_ID = H.HEADER_ID
AND TT.NAME = XOT.ORDER_TYPE
AND XOT.TYPE = 'ALL'
GROUP BY HP.PARTY_NAME, HP.CATEGORY_CODE, HC.SALES_CHANNEL_CODE,
         HC.CUSTOMER_CLASS_CODE, H.ORDER_NUMBER, H.HEADER_ID,   
         H.BOOKED_DATE, L.ACTUAL_SHIPMENT_DATE,
         H.FLOW_STATUS_CODE, H.ORG_ID, OS.NAME, TT.NAME
iw_query_bad_plan:
The top time taking code section from pl/sql DBMS profiler data. As can be seen, the maximum time is spent in the Fetch of the cursor executing this query:
iw_query_bad_profiler_output:

Evaluating options

The first thing which comes to the mind is: Can we play with different sampling estimations? e.g. 10%, 50%, 70%. While the idea is appealing, it can destablize many other SQL queries and its an uphill task to judge the impact of such a change on the remaining application peformance as such. Heavy price to pay for tuning one lone SQL.
Again, the statistics of the objects were also recent enough (Stats gathered on a weekly basis as usual). There were no big changes in the data volumes as such.

Getting into the optimizer's thought process

From the SQL plan being used, It can be readily seen that the optimizer acted greedily like this: It saw that there are several small HZ* tables which could easily fit into PGA/SGA  memory and so it thought -- Cool, I can do a quick Hash join of these folks (that would also take care of the outer join predicate evaluation). After that, it proceeded to retrofit the remaining tables to be joined into the plan which was developing.
Fine approach. Only problem is that the index range scan of Header_id column using the non-unique index on the oe_order_lines_all table simply kills the performance of the fetch phase. It is very simply, very very inefficient.
19101503 INDEX RANGE SCAN OE_ORDER_LINES_N1 (object id 334899)
A much better approach would have been to first filter off the header_id values from OE_ORDER_HEADERS_ALL table so that it would use the unique primary key index of OE_ORDER_HEADERS_ALLand join with the OE_ORDER_LINES_ALL table using nested loop operation.
This itself would filter off a huge chunk of the data (which is mostly centered around OE_ORDER_LINES_ALL table). The remaining HZ* & miscellaneous data could easily be plugged in via Hash Joins. The idea was to drive the query using OE_ORDER_HEADERS_ALL to eliminate the maximum chunk of un-required data first.

Also, the predicate AND h.booked_date IS NOT NULL was replaced with
and h.booked_flag = 'Y'
. Why use a NOT NULL clause when a flag is available in OE_ORDER_HEADERS_ALL?

Also, a custom index xxoe_order_lines_n11 was found to exist on oe_order_lines_all.actual_shipment_date column. We thought we'll make use of it too. The more the merrier.

So then, after grouping together the related predicates together, formatting the query and Putting these ideas into practice, the original query was changed as following:
SELECT /*+ index (h,oe_order_headers_u1)
index (l,xxoe_order_lines_n11)
           use_nl(h,l)
       */
      hp.category_code  category_code
     ,hc.sales_channel_code sales_channel_code
     ,hc.customer_class_code customer_class_code
     ,SUM(l.shipped_quantity * l.unit_selling_pric order_ship_tot
     ,l.actual_shipment_date ship_date
     ,DECODE (h.org_id,
        362,  'US',
        365,  'CAD',
        5624, 'MEX',
        h.org_id)  operating_unit
     ,tt.NAME  order_type
   FROM
         ont.oe_order_headers_all  h
        ,apps.oe_order_lines_all   l
        ,apps.hz_parties           hp
     ,apps.hz_cust_accounts        hc
     ,apps.hz_cust_site_uses_all   hcsu
     ,apps.hz_cust_acct_sites_all  hcas
     ,apps.oe_order_sources        os
     ,apps.oe_transaction_types_tl tt
     ,xxiw.xxiw_bs_om_order_types  xot
   WHERE
          h.header_id                    = l.header_id
      AND h.ORDER_SOURCE_ID              = os.ORDER_SOURCE_ID
      AND h.ORDER_TYPE_ID                = tt.TRANSACTION_TYPE_ID
      AND h.flow_status_code not in ('CANCELLED','ENTERED')
      -- AND h.booked_date IS NOT NULL
      -- Replaced with the following condition                 
      and h.booked_flag = 'Y'
      -- Replaced with the above condition
  
      AND tt.LANGUAGE                    = 'US'
      AND tt.NAME                 not like 'INTERNAL%'
      AND l.flow_status_code             = 'CLOSED'
      AND l.LINE_CATEGORY_CODE          <>'RETURN'
      AND TRUNC(L.ACTUAL_SHIPMENT_DATE)
          BETWEEN NVL(:B2 , TRUNC(L.ACTUAL_SHIPMENT_DATE))
              AND NVL(:B1 , TRUNC(L.ACTUAL_SHIPMENT_DATE))
      AND h.invoice_to_org_id             = hcsu.site_use_id (+)
      AND hcsu.cust_acct_site_id = hcas.cust_acct_site_id(+)
      AND hcas.cust_account_id           = hc.cust_account_id(+)
      AND hc.party_id                    = hp.party_id(+)
      AND UPPER(hc.customer_class_code) <> 'INTERNAL'
      AND UPPER(hc.sales_channel_code)  <> 'INTERNAL'
      and tt.name                        = xot.order_type
      and xot.type                       = 'ALL'
      AND os.name                       <> 'Satcom Closed'
   GROUP BY
      hp.party_name,
      hp.category_code,
      hc.sales_channel_code,
      hc.customer_class_code,
      h.order_number,
      h.header_id,
      h.booked_date,
      l.actual_shipment_date,
      h.flow_status_code,
      h.org_id,
      os.name,
      tt.name;

The Improvement

The same query now executed under 5 minutes!!! ~4.5 hrs to ~5 mins. Thats quantum.

The improved plan was as follows:

iw_query_good_plan:

Conclusion

While it may not be ideal, at times, it may be required to supplement of coach the CBO to achieve acceptable levels of application performance. The effect may be more long lasting if the "relative" data volume pattern of the underlying objects is not expected to change drastically in the system.

Yet Another Case Study: The over-commit trap!

Introduction

Its always rewarding to talk about lesser known performance tuning scenarios, which are not all that well appreciated in the DBA or developer world. Ask anyone as to what performance tuning has one worked upon and quite likely, you may hear - Ah well, I made sure the stats were gathered latest, the right index was being used or I used a hint to use the right index. How often do we look at other aspects like the frequencey of commits?

An atypical performance problem scenario

At a particular client site, a custom concurrent progam had a history of taking more than 8-10 hrs to complete.
The Applicaton load nature was a mixture of high OLTP during the regular business day hours and more than moderate batch processing during the evening to early morning hours. Needless to say, a lot of long running batch jobs were contending for shared resources like Rollback & temp segments.
A lot of diagnosis was done as to why the custom concurrent program would take so long. Quite often, the program would terminate with the notorious ORA-1555 (snapshot too old error), even though the data volume involved was quite less (7000 records)  and abundant rollback segment space -- 120g.
Even setting a high rollback segment retention initialization parameter undo_retention  as 8 hours did not help out in ironing out the performance issue.

Examining the possibilities...

There could have been several reasons why the performance lag was manifesting:
  • The query had a bad plan and the compile time plan was different than runtime plan. This happens many times, but it didnt seem to be the case
  • A hint for using unique index (from using index range scans) to avoid db sequential file read wait event seemed to make no difference
  • The level of  DML activity against tables within the process could be simply too much for the batch load to complete successfully in a suitable time window. This theory did not sound right since rollback retention period ~ 8 hours
  • The program was not doing commit or rollback often enough to be stay within the rollback retention period?

Foxy observations..

  While sql  tracing (event 10046) identified that a simple select on a custom table having not so much data was having a hard time, the corresponding SQL plan was using a unique index (!!). What more could one ask for? The situation was almost wierd.
At the same time, pl/sql profiling (dbms_profiler) identified that FETCH  phase of that very simple select was taking long. The HTML output generated from plsql profiler runs using the profiler.sql script of Metalink Note 243755.1 was especially helpful in getting to this stage of analysis to determine where was the maximum delay in the entire process.

Getting to the bottom of it

  Again, one more thing identified, which could be missed easily, in the tkprof'ed output of raw sql trace was that the CR blocks read for the fetch phase of the cursor were high. The sql engine had to read a huge number of block reads for forming a read consistent image for the FETCH phase. This aspect is not readily discernable since the regular thought process is to look at elapsed time and SQL plan.
What could the reason for such a high  number of block reads for forming a read consistent image ? Too few commits/rollbacks or too much  commits/rollbacks? Overdoing anything is usually not a rewarding experience.

Looking closer into the code...

  On re-examining the PL/SQL profiler HTML output, it was seen that the program logic was executing a commit statement within a loop for every input record processed.
This would obviously overtax the rollback segments and causing the user server process to record a read consistent image of the corresponding transaction data at every commit.
Taking cue from this observation, when commits were done in a batch fashion at some threshold, the entire load ran in 4 hours (Input volume: 7000 lines).

Conclusion

Neither is it good to do commits less frequently, nor is it good to over-commit. Moderation is the key for a happy medium and at times, it can have bearings on performance problems as well.
Who said software cannot learn from Philosophy?

An Application Tuning Case Study: The Deadly Deadlock

Preface


handcuff:
Lets face it. Deadlock and Latching issues have always been hard to debug. Many a times, we don't focus too well on application tuning, even though, it always advisable to look at the application design and user access trends first. These things don't come to the mind too readily.
Many a times, we do face situations where the users are deadlocked, sitting and watching blue screens or the application was freezing up. While there are many reasons for this to happen: Poor OS implementation, (swapping/disk/Memory issues) , RDBMS bugs, RDBMS initialization parameter values, having a lot of non sharable SQL, a good share of cases can also be caused due to Poor application coding, without undertanding the sequence in which resources should be accessed or acted upon (updated/deleted etc).
This is one such case study in which a deadlock was traced back to a deadlock situation un-aware code and the solution was to enhance the code to not cause deadlocks in the future. While the change was really simple, picturizing the deadlock in action was really difficult.
This article also goes into the mechanics of contructing the deadlock situation in slow motion. This makes it an instructive case study to read through for future reference.

Getting the Bad Press...


The client had a big OLTP implementation of Order management (Order to cash business flow) 11i E Business suite and the main bread and butter was to take orders for their configurable products.
Many a times during any business day, the customer service representatives (~200 people) would get blue screens or get hung up, while trying to work on Orders. Getting frantic helpdesk calls for angry customer service reps was normal and obviously this was under the management's scanner since it was highly visible.
Customer service reps would keep making small talk to their clients who would call in to order chairs etc and eventually, the clients would give up on the wait. As a fallback of this, the client was irate (they were losing their customers) and frustrated at implementing order management.
Everybody in customer service would say: Why is Oracle so slow?

Scratching the surface...


This continued on for months and it was normal for several myths about bad oracle performance to float around. More so, no one had any idea what was happening. Someone blamed workflow, someone blamed the network and so on..
The customer had done a lot of customizations. Preliminary investigations revealed that often there was a deadlock detected between a concurrent program session and a forms user session.
As a corollary, the custom "Order Acknowledgment" program would sometimes hang inordinately for > half hour, whereas otherwise, it would "normally" finish in 1-2 mins.
Analyzing wait events (using v$session_wait) and 10046 event session tracing didnt lead anywhere. The pattern was not predictable too. It was really very puzzling behaviour.

Understanding the dead-lock in slow motion...


The "XX Sales Order Acknowledgement XML publishing" program used to run every minute. There had to be some co-relation between the programs code (table/object access/locking pattern) and the activity being done on Order entry forms.
After analyzing the code and seeing how underlying tables were being accessed or updated, It became clear that there was a problem in the "XX Sales Order Acknowledgement XML publishing" code.
The complete picture of the deadlock pattern was available now. Lets see it in slow motion here:

Deadlock Situation 1:


From the entire bunch of qualifying Order header records to be processed for Order Acknowledgements, a batch of 25 Orders were updated at a time. After the updation of 25 orders, a commit was issued, which then freed the locks.
Now, out of these 25 records, if "XX Sales Order Acknowledgement XML publishing" concurrent program (say A) was trying to update the first Order, and there were be a Customer Rep (say B) who was working on the very same order in OE screen, "A" HAD TO WAIT till B freed the record by either commit/rollback action in Order Entry form.
TILL THE TIME B freed the same record, "A" was not be able to proceed with the remaining 24 Orders in the batch and kept on waiting for the row level exclusive lock.
It eventually either timed out or acquired the lock on the row. Similarly, "A" was not be able to proceed till "B" was done with the updates on the Order Header in OEXOETEL (Order Entry) form.
The same applied for any other Order which A had to update. Thus the situation when a single run of "XX Sales Order Acknowledgement XML publishing" concurrent program would be taking 20 mins or so. During this time, one would witness a high Order Entry udpating activity at the Customer Support site.
Now, lets reverse the situation.

Deadlock Situation 2:


At anytime, there may be 20 (say) Customer Reps (say "B"), who are working on 25 or 30 orders. Therefore, in theory, around 25-20 Order Entry Headers rows may be locked in exclusive mode.
Now, every next minute, the "XX Sales Order Acknowledgement XML publishing" concurrent program was trying to update these very Order Entry Headers rows and was having a hard time waiting for the row level exclusive locks to get released.
When "B" saved its changes, the commit released the row level exclusive locks. Only then was "A" able to proceed with its job. The real time deadlocks seen during the day were caused by a mix of Deadlock situation 1 and 2.
Both the above situations were happening at the same time and vying for the same resource, the same row in the Order Entry headers table, to become free.
This was a vicious, endless cycle and caused locking grief for people working on Order Entry.

Breaking the cycle


Once the nature of the application design and sequence of accessing resources were understood, it was clear that the vicious cycle had to be broken.
It was best to make the custom program more intelligent by checking first if the Order header record was locked (can be done by using the SELECT ..... FOR UPDATE NOWAIT clause).
Any already locked record was skipped for update. The logic used was that the skipped Order header record would be picked up for processing by the custom program next time AFTER the custom service rep would finally commit or rollback on the Order Entry form (the program was running every minute, remember).

Summary


The problem was with the custom application design after analyzing the table/object access/locking pattern & understanding the code. Essentially, in this case, what was happening was that the Custom order acknowledgment program used to update a batch of 25 Order header records, in a sequential fashion, and only then issue a commit.
The update statement in culprit concurrent program never checked if the Order header record was already locked or not by another session and hence, as per default behavior, the update statement waited till the lock timeout happened -- this is a long time until NOWAIT clause is specified.
This could happen for any of the 25 Order header records it would try to update. When a customer service rep wanted to update the same Order which was being processed by the order acknowledgment program, the person would hang since the form session couldnt acquire a lock on the Order header record till the time a commit was issued by order acknowledgment program.
On the other hand, when a customer service rep would update an Order through the Order Entry screen, they would lock up an order for a good 10-20 mins or so (they kept doing stuff to it, while talking to the customer on the phone) and the custom order acknowledgment program would keep waiting for that exclusive lock on the order header record.
Again, this could happen for any of those 25 records in the batch, as per the design. So the deadlock was happening both ways.
And it was really subtle. NO amount of tracing or latch analysis or metalink research would have brought the troubleshooter to this stage of understanding the real issue.

Related Links


Other Performance Related Articles

A SQL Tuning Case Study: Could we K.I.S.S. Please?

direct shot:

Preface

They say, it pays to K.I.S.S (Keep It Simple and Smart). The simplest ideas are the greatest ideas.This philosophy also applies to sql coding when we tell the oracle SQL engine what to fetch and how to fetch it (Indirectly). There are great rewards for simplistic thinking.
Lets discuss this principle through a case study in which the user was trying to fetch very simple data through an over-kill query, involving  RA_* views which were internally using some of the same tables. The SQL engine was therefore just trying to fetch and filter too much data from the same sources. It was not certainly efficient.
In this case study, we see how its important to understand the "Why" of a query, rather than the "How" of a query. Many times, just rethinking the whole approach pays great dividends.

The Poorly Performing Query

The client had a custom concurrent program which had this query as the worst offender.. it was taking 90% of the time and the program used to take ~ 1 hour to finish.
The formatted query is listed below:
SELECT SUBSTR(A_CUST.CUSTOMER_NUMBER,1,8) CUST_NUMBER, 
       HZ_CUST_SITE_USE.LAST_UPDATE_DATE LAST_UPDATE_DATE,
       SUBSTR(A_CUST.CUSTOMER_NUMBER,1,8) BILL_CUST_ID
FROM
    RA_CUSTOMERS A_CUST,
    HZ_CUST_ACCT_SITES_ALL HZ_CUST_ACCT_SITE,
    HZ_CUST_SITE_USES_ALL HZ_CUST_SITE_USE,
    RA_ADDRESSES_ALL RAA
WHERE
  A_CUST.CUSTOMER_NUMBER = :B1
  AND A_CUST.CUSTOMER_ID =  HZ_CUST_ACCT_SITE.CUST_ACCOUNT_ID
  AND HZ_CUST_ACCT_SITE.CUST_ACCT_SITE_ID = 
      HZ_CUST_SITE_USE.CUST_ACCT_SITE_ID
  AND HZ_CUST_ACCT_SITE.PARTY_SITE_ID =  RAA.PARTY_SITE_ID
  AND HZ_CUST_ACCT_SITE.STATUS = 'A'
  AND HZ_CUST_SITE_USE.STATUS = 'A'
  AND HZ_CUST_SITE_USE.SITE_USE_CODE =  'BILL_TO'
  AND RAA.ADDRESS1 = 'ICC LEASE'
/
Lets look at the TKPROF stats and runtime SQL plan used along with row counts.
plan of bad iw query:
Its interesting to observe that while only ~3000 rows are returned as the output of this sql, a huge number of rows were fetched from one of the HZ* tables.
Lets check the underlying tables being used by the Views RA_ADDRESSES_ALL and RA_CUSTOMERS. RA_ADDRESSES_ALL was a synonym for RA_ADDRESSES_MORG, while RA_CUSTOMERS was a synonym for RA_HCUSTOMERS.
  1  select referenced_name, referenced_type
  2  from dba_dependencies
  3* where name='RA_HCUSTOMERS'
SQL> /
REFERENCED_NAME         REFERENCED_TYPE
----------------------- -----------------
HZ_CUST_ACCOUNTS        TABLE
HZ_PARTIES              TABLE
SQL> l
  1  select referenced_name, referenced_type from dba_dependencies
  2  where name='RA_ADDRESSES_MORG'
  3* and referenced_type='TABLE'
SQL> /
REFERENCED_NAME                REFERENCED_TYPE
------------------------------ -----------------
HZ_PARTY_SITES                 TABLE
HZ_LOCATIONS                   TABLE
HZ_CUST_ACCT_SITES_ALL         TABLE
HZ_LOC_ASSIGNMENTS             TABLE
Lets also check the data volume involved in the base HZ* tables being referred and the percetange of estimation sample Percentage used.. (~70%) by Gather statistics program:
hz table stats information:

Do we need this complexity?


This is a good time for Reality Check: This is certainly overkill.
Why would you want to join complex views with 3-4 base tables being referred inside the views. Not an intelligent approach, shall we say? Better, why wouldnt one want to query the same data by understanding the data model relationship between the base tables involved and see if the same results can be achieved through them.
Spending time on checking the data volume involved, the % Estimation by Gather Stats, index cardinality etc would not lead to anything conclusive or direct inference in this case.
Another good learning from the http://etrm.oracle.com website (requires a Metalink Login: Note:150230.1) regarding the RA_* views is that they should not be used directly in ANY CUSTOMIZATIONS ever! All the more reason to eliminate the views from the query altogether and find an alternate way of getting the same result.
RA_ADDRESSES_MORG internal only:
ra_hcustomers internal only:

The Better Way...


Now, Only 1 column was being really queried. Inspite of this, after understanding the object model relationship diagrams for HZ* tables, at least five HZ* tables were required to be queried for a supplied customer_number value.

What was the SQL trying to achieve:


For a particular Customer_number, It wanted the find the Last_update_date for an ACTIVE Site which was being used as an ACTIVE BILL_TO location, with an address of ICC LEASE in its information.
Lets look at the index information:
select column_position, column_name
from dba_ind_columns
where index_name='&1'
/
Enter value for 1: HZ_LOCATIONS_N1
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 ADDRESS1
SQL> /
Enter value for 1: HZ_PARTY_SITES_U1
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 PARTY_SITE_ID
SQL> /
Enter value for 1: HZ_PARTY_SITES_N2
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 LOCATION_ID
SQL> /
Enter value for 1: HZ_CUST_ACCT_SITES_N1
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 PARTY_SITE_ID
SQL> /
Enter value for 1: HZ_CUST_SITE_USES_N1
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 CUST_ACCT_SITE_ID
              2 SITE_USE_CODE
              3 PRIMARY_FLAG
The same SQL query could be re-written such that it fetched the same data using base HZ_* tables, after connecting the foreign Key relationships.
Such an approach had many advantages:
  • RA_* views avoided as they are not supported except for in standard 11i products, also, its highly inefficient to access 7 tables/views for getting 1 column
  • The new query fetches less than half the number of consistent read buffers as compared to the previous query (verified using set autotrace on sql command)
  • The new query has just 14 access steps as opposed to 21 access steps for previous un-tuned query
  • The new query uses three unique Primery Key indexes (primary key) out of the five tables and two Foriegn key indexes for the remaining two, which is a really selective way of querying data
It simply cannot get better than that!
The performance improvement was very visible as the improved query ran within 2 minutes. Compare that with its previous cousin which used to run for ~1 hour.
SELECT SUBSTR(hca.account_NUMBER,1,8) CUST_NUMBER, 
       hcsu.LAST_UPDATE_DATE LAST_UPDATE_DATE,
       SUBSTR(hca.account_NUMBER,1,8) BILL_CUST_ID
FROM
   hz_cust_accounts       hca,
   hz_cust_acct_sites_all hcas,
   hz_cust_site_uses_all  hcsu,
   hz_party_sites         hps,
   hz_locations           loc
WHERE
--
-- Joining all the tables through primray keys
-- and foreign key relationships first
--
      hca.account_number = :B1
  and hca.cust_account_id = hcas.cust_account_id
  and hcas.party_site_id = hps.party_site_id
  and hca.party_id = hps.party_id
  and hps.location_id = loc.location_id
  and hcas.cust_acct_site_id = hcsu.cust_acct_site_id
--
-- the actual data query criteria starts here
--
  AND hcas.STATUS = 'A'
  AND hcsu.STATUS = 'A'
  AND hcsu.SITE_USE_CODE =  'BILL_TO'
  AND loc.ADDRESS1 = 'ICC LEASE'
--
-- Could perhaps add this criteria too ???
-- AND hcas.bill_to_flag = 'Y'
--
/
Before:
before - iw query plan and autotrace stats:
After:
improved iw query plan and stats:

Summary

It pays to ask the "why" questions sometimes, rather than focusing on "How" and "What" and going on a wild goose chase. The "Why" questions can get you more quickly to the answer by re-thinking the strategy.
Einstein himself said, "everything should be made as simple as possible, but no simpler."
The defence rests.

Related Links


Other Performance Related Articles

When Conventional Thinking Fails: A Performance Case Study in Order Management Workflow customization

small things put together can achieve great things.gif:

Foreword

Order management workflow interacts with a variety of components like core workflow, order management and pricing application call interfaces.

Customizing it without understanding the implications can be a double edged sword and the nature of customization may unwittingly worsen the situation. An undesirable combination of the aspects can cause a very difficult to diagnose and troubleshoot situation as shall be demonstrated in this case study.

This article is essentially the distilled wisdom of severity 1 situation at a client site who was unable to ship ordered items to their customers since OM order line workflow background process had a severe performance issue. The business impact was tremendous: $ 1.5 to 2 million worth of shipping and invoicing was being prevented.

Many a times, performance troubleshooting involves having some functional insight as well. TKPROF and 10046 trace is not the panacea for all performance issues. In this case, some insight into the nature of Order Management APIs was also required for getting traction on the performance problem. A painful discovery path followed, riddled with bumps and insightful discoveries. The troubleshooting approach used was really out of the box and involved breaking down the code being executed and using logical deduction.

The generic learnings from the ordeal are presented in this case study. It is hoped that the learnings will help oracle applications user community to be concious of hidden implications of customizing OM workflow.

Feeling the pinch...

oeol high backlog image.JPG:
The client had heavily customized their order management workflow during an implementation of Order to cash business cycle of Oracle applications 11i.

Immediately after the April 2007 financial quarter end, the OEOL workflow background process started exhibiting extremely long execution times, such that the backlog of unprocessed OEOLs came to ~150,000. This was a sharp contrast from the normal processing throughput.

This was a moving problem. The hanging of OEOL workflow background process was quite apparenly due to a 'few' order lines, belonging to a few orders. Once the 'problematic' orders' OEOL workflow were either CANCELLED or ERRORED, the backlog started clearing at a slow pace. This went on for a little while, until similar Order lines OEOL workflow seemed to take inordinately long to process and held up the entire workflow queue.

Coming to grips with it..

As a knee jerk approach, many things were tried, including killing the currently running workflow background process and re-submitting it. The clincher was that once or twice the workflow background process was killed , it marked the OEOL ITEM_KEY being processed into ERROR state and hence took it out of the deferred queue. This forced the Workflow background process to consider the next deferred OEOL item_keys in line.

In ~50% of the cases, it would work and the newly thread would start exhibiting a more acceptable throughput. But after a certain point, even killing and re-submitting workflow background process was simply not working. It did seem that the logic being executed by FNDWFBG was taking extremely long for processing one order line.

The observation was baffling. The following query was used to find out approximately which order line was a single thread of workflow background:

Query 1:

SQL> select * from (select to_char(enq_time,'dd-mon-rr hh24:mi') , 
     user_data


from wf_deferred_table_m where state = 0 and corrid = 'APPSOEOL'


order by priority, enq_time) where rownum < 5;


TO_CHAR(ENQ_TIM USER_DATA(ITEMTYPE, ITEMKEY, ACTID, FUNCTION_NAME, PARAM_LIST, RESULT)


----------------------------------------------------------------------


02-apr-07 14:25 WF_PAYLOAD_T('OEOL', '30273337', 141255, NULL, NULL, NULL)


02-apr-07 16:25 WF_PAYLOAD_T('OEOL', '30458068', 140850, NULL, NULL, NULL)


02-apr-07 16:25 WF_PAYLOAD_T('OEOL', '30458075', 140850, NULL, NULL, NULL)


02-apr-07 16:25 WF_PAYLOAD_T('OEOL', '30458089', 140850, NULL, NULL, NULL)


4 rows selected.


Tying it back to the order management tables, the following query tells us order level details like order_number, line_number, ordered_item etc, which is even more useful:

Query 2:
select distinct oh.org_id, oh.order_number,
ol.line_number, ol.shipment_number,              
ol.option_number, ol.ordered_item, ol.line_id, wiash.begin_date,
priority, wdtm.enq_time
from applsys.wf_deferred_table_m wdtm,
    applsys.wf_process_activities wpa,
     applsys.wf_item_activity_statuses wiash,
     apps.oe_order_lines_all ol,
     apps.oe_order_headers_all oh
where substr (wdtm.corrid,5) = 'OEOL'
and (wdtm.delay < sysdate or wdtm.delay is null)
and wpa.instance_id=wdtm.user_data.actid
and wiash.item_type=wdtm.user_data.itemtype
and wiash.item_key=wdtm.user_data.itemkey
and wiash.process_activity=wdtm.user_data.actid
and wiash.item_key = ol.line_id
and oh.header_id = ol.header_id
and wdtm.state = 0
group by
oh.org_id, oh.order_number, ol.line_number, ol.shipment_number,
ol.option_number,
ol.ordered_item, ol.line_id, wiash.begin_date, priority, wdtm.enq_time
order by priority, wdtm.enq_time ;

Observing the output over sometime gave the capability of identifying if workflow background process was stuck on a specific OEOL workflow.

Limited insight by conventional analysis..

The conventional debugging approaches provided limited insight into the root cause of the issue.

10046 trace could not bring out the drill down of time spent on pl/sql API calls clearly. In addition, there was an OS file limit on the trace file on most production servers, even if the max_dump_file_size=unlimited is used.

Another option was to use dbms_profiler API on FNDWFBG call. It could have worked, except the FNDWFBG API call never completed, so the data could never be collected.

Searching the metalink gives a host of notes for getting OM debugs (level 5), Pricing debug etc.

1. Note 255040.1:bde_wf_data.sql - Profiling Workflow Runtime Data related
2. Note 163208.1:bde_last_analyzed.sql - Verifies Statistics for all
3. Note 353991.1:OMSuiteDataChk.sql and OMSuiteDataChk11i10.sql


This is all good information, except that there can also be too much of un-meaningful and un-directed information. Interpreting detailed application debug logs can be useful when it is known what should be checked.

Taking the whitebox approach....

Lot of debug/trace information could be generated, but it was probably best to take the FNDWFBG execution apart in slow motion. Once the execution model has been understood, more brainstorming using divide and conquer strategy is effective.

An alternate way of looking at FNDWFBG is that it is nothing but a serial simulator of steps outlined in the oracle workflow designer.
compressed OM workflow from start fulfillment.JPG:

A whitebox (a software engineering term, as opposed to blackbox) approach was the need of the hour, coupled with some simple thinking. Using the workflow status monitor from Workflow Administrator responsbility in Oracle Applications 11i, the step at which an OEOL is currently at can be seen.

compressed workflow monitor output for a suspended OEOL workflow.JPG:

This may also be verified by the output of $FND_TOP/sql/wfstat.sql. Combining the observations from Query 2 above and drilling down via Workflow Monitor, it was clear that almost all the lines having performance issue were at Fulfill-Deferred step, which means that if FNDWFBG were to pick them up, it would then execute the next activity in line, which was Fulfill step.

Note: A quick workaround was to 1) Kill the FNDWFBG process 2) SUSPEND (this status avoids any processing by workflow) the problematic OEOL workflow via workflow monitor and 3) resubmit the FNDWFBG process.

Irreconcilable paradoxes of a Test case..

We really needed to analyze the drill down on the bad test case.

From the online examination of sqls being executed by FNDWFBG session, it was seen that a lot of time was being spent on XXONT Cascade Terms and Agreement, multiple times for apparently processing one OEOL ITEM_KEY. This was a really paradoxical observation and did not make any sense.

The only acid test of this was to call OE_FULFILL_WF.START_FULFILLMENT for specific OEOL line id or ITEM_KEY. The script run_start_fulfillment_with_om_debug.sql (Listing in Appendix A) was used to simulate a single execution of Fulfill step, which in turn called OE_FULFILL_WF.START_FULFILLMENT and did a rollback at the end. One such case ran for ~6 hours and finally finished.

Most surprisingly, an examination of raw trace file did show that sub-sequent steps like XXONT Cascade Terms and Agreement and Invoice Interface- Line had indeed been called multiple times. Therefore, it seemed that a single call to OE_FULFILL_WF.START_FULFILLMENT for one kind of OEOL was triggering off sequential execution of OM workflow for other OEOLs in the same Order, which were somehow related to the order line in question.

$ grep -i xxont_cascade_pkg.set_ln_trm_agr_proc prod_ora_677526.trc 
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end; 
value="XXONT_CASCADE_PKG"
value="XXONT_CASCADE_PKG"

This simply did not make any sense because Fulfill and XXONT Cascade Terms and Agreement step were discrete steps, as verfiable from viewing the OM workflow diagram flow above.

Another very interesting and puzzling observation was that without fail, the top 2 sql calls in TKPROF output (raw trace sorted with sort=prsela,exeela,fchela) showed that ~45% of the time was spent on OE_FULFILL_WF.START_FULFILLMENT API and ~45% of it was spent on the call to xxont_cascade_pkg.set_ln_trm_agr_proc API.

"tkprof output for 10046 of start_fulfillment API call"

Resolving the paradox...

The answer, which is difficult to digest, does make sense. Order management has unique processing requirements for configured ordered items created using Oracle Configurator module. Since OE_FULFILL_WF.START_FULFILLMENT is a workflow aware API (notice the _WF in its name), it was obviously calling more Workflow routines inside it to progress the business flow to next logical set of activities.

We first need to understand the concept of Fulfillment set. A fulfillment set is a set of related order lines that represent a configurable item. Whenever a fulfillment set pr a model is ready to fulfill or get built in production floor - i.e., all members' OEOL workflow are internally in FULFILL_LINE:Notified (blocked) status. The last member reaching this activity signals completion of FULFILL_LINE activities of all other members before proceeding itself. Since the XXONT Cascade Terms and Agreement step comes right after FULFILL_LINE step, XXONT Cascade Terms and Agreement step will be executed for all members of the current model or fulfillment set in that one go.

The relation between other lines whose OM workflow was being resumed and the problematic line in question could be found using the following sql:

Select top_model_line_id from oe_order_lines_all where line_id=<problematic_line>;

This query defines a FULFILLMENT SET (a set of lines which are related together
and constitute an orderable item):

Select line_id from oe_order_lines_all where top_model_line_id=<top_model_line_id_from_above_query>;

The user first configures the components of a confgurable product/item and then the item is actually built. Depending on the kind of configured model (Assemble to order -- ATO or Pick to order -- PTO), the oe_order_lines_all.flow_status_code (order management workflow status) stays at either SHIPPED or BOOKED status or becomes AWAITING_FULFILLMENT (internally blocked).

Here is an example of this intermediate stage of processing for different kind of configured Items (ATO/PTO/Stock):

compressed sql output of an example of configured Item.JPG:

Now, since a lot of time was being spent on XXONT Cascade Terms and Agreement step, After reviewing the code in the package XXONT_CASCADE_PKG.pkb, it could be seen that OE_ORDER_PUB.PROCESS_ORDER() API was being called to sync up the terms and agreements across Order header and line in question.

call sequence of xxont_cascade_pkg.GIF:

There were some penalties for doing this: the payment term and agreement fields on the order line triggered repricing of the order lines. Also, for especially large orders (~4000 lines) with mostly mis-matched payment terms and agreement lines, repeated calls to OE_ORDER_PUB.PROCESS_ORDER() were causing a huge overhead.

The reason why the timings for OE_FULFILL_WF.START_FULFILLMENT() showed high in the trace files was that FULFILL_LINE step syncronizes accross all components in a configurable model or fulfillment set. The following output from a workflow query in Appendix B lists out the time recorded by workflow for executing XXONT_PROD_SEC_STANDARD_LINE->FULFILL_LINE step:

complexity of the start_fulfillment api call.GIF:

Visualizing it in slow motion...

This means that, whenever a fulfillment set pr a model is ready to fulfill - i.e., all members OEOL workflow is internally in FULFILL_LINE:Notified (blocked) status, the last member reaching this activity will complete FULFILL_LINE activities of all other members before proceeding itself.

Since the Cascading activity comes right after FULFILL_LINE, it means that it will be executed for all members of the current model or fulfillment set in that one go. The aggregate timings for completing all other related OEOL workflows in the same fulfillment set goes into the timing recorded for OE_FULFILL_WF.START_FULFILLMENT() step.

The following diagrams attempt to represent the dynamics more clearly:

BEFORE calling OE_FULFILL_WF.START_FULFILLMENT() step:

before - status of oeol fulfillment set.GIF:

AFTER calling OE_FULFILL_WF.START_FULFILLMENT() step:

After - status of oeol fulfillment set.GIF:

Projecting the poor performance..

Based on the situation described above, a simple projection of poor performance can be done for a particular kind of order. Lets assume that there are 100 lines in a configurable model/Item. Also, lets assume that for 75% of the order lines, payment terms or/and agreements do not match with those in the Header. For each mis-matched (terms and agreements - with header) line, the time taken for pricing varied from 150 seconds to 750 seconds.

When we calcualte the cumulative time taken for executing xxont_cascade_pkg.set_ln_trm_agr_proc API for about 75% mis-matched lines of 100 lines, with ~5 mins taken for each line, the overall time taken comes to 75 x 5 mins ~ 8 hrs. That's the time for processing OE_FULFILL_WF.START_FULFILLMENT() step for one model. That was exactly the performance problem being experienced.

Now, just imagine the performance strain if the order had 10 such configurable items.

Mathmatically speaking:

complexity of the start_fulfillment api call.GIF:

The Solution: Simplify by taking the Highest Common Factor...

There were two important dimensions to the solution:

There was a bug in 11.5.9 Order management module which prevented the cascading order header changes to order lines from working properly (Profile option OM: Sales Order Form: Cascade Header Changes to Line should be Yes). This was prompting the calls to OE_ORDER_PUB.PROCESS_LINE() APIs.

Patch 5981436 is to be delivered by Oracle order management development for resolving this issue. This fix is also made part of 5950291 -11.5.9 OM CUMULATIVE PATCH. This would automatically match the payment terms and agreements from Order Header to order lines for majority of the cases, except if the user purposely chose to say No to the cascade option.

Another smart option was to do batch processing of all the mis-matched component lines of a configured model in ONE call to PROCESS_LINE() API by passing them as a pl/sql table argument. Also, this was to be done ONLY if the order line in question was for a Shippable Item. This was especially beneficial since only 1-5% of sub-components in a configured model are shippable. So that made it quite restrictive. This single change took out 95-97% of the repetitive work being done by OE_ORDER_PUB.PROCESS_ORDER() calls.

       --
       -- Calling API to update Terms and/or Agreement value(s) on the order line
       --
       oe_order_pub.process_line (p_line_tbl      => l_line_tbl,
                                  x_line_out_tbl  => o_line_tbl,
                                  x_return_status => l_return_status,
                                  x_msg_count     => l_msg_count,
                                  x_msg_data      => l_msg_data);

Seeing the performance Boost...

Before putting in the code change:

For a fulfillment set with top_model_line_id=9905479, there were 59 mis-matched lines, and the avg time for processing one call of process_line() was ~ 580 seconds.

before the code change.JPG:

After putting in the code change:

The time taken to same 59 mis-matched lines (of the same fulfillment set with top_model_line_id=29905479) in one pls/sql table (as an argument) in process_line() API call was ~ 625 seconds.

after the code change.JPG:

Putting statspack data into perspective...

This also explained why statspack reports showed xxont_cascade_pkg.set_ln_trm_agr_proc as one of  the major I/O offenders per execution:

statspack snippet for xxont_cascade_pkg api.JPG:

Summary of learnings from the Case study..

  • Don't have expensive custom activities defined after START_FULFILLMENT step in OM workflow. It will pronounce the performance hit many times, especially when Oracle configurator and order management modules are being used in tamdem (Better, don't  have expensive workflow custom activities at all)
  • Batch processing should be over piece-meal processing, especially when an API has provision for the same. This reduces most of the repetitive processing
  • The whitebox (drill down) approach works for taking apart a baffling performance problem. Trying to simulate a test case usually leads to the heart of the performance issue in a more reliable way
  • Getting extensive debugs and traces is great, but only when it is known what is being looked for. Asking the right probing questions is very important. Question, question, question.
  • A well though out plan with minimal tracing and using a drill down approach, can bring better results than a shot gun or blunderbuss approach
  • Sometimes, a high level functional knowledge of the processing being done can be very useful in understanding the nature of problem. A balance between strict technical and pure functional knowledge can be fruitful towards solving performance issues

TRIVIA: Why was the customization put in the first place?

Cascading is the process in which values from the sales order header level automatically populate to the corresponding fields at the sales order line level.

Due to a bug in 11.5.9, Agreement field at Header level was not populating to the child lines of a configuration.

Due to this, the autoinvoicing program split lines into separate invoices by agreement. There were thousands of invoices where the components were being invoicing separately from the top model. It was an invoicing fiasco.

Appendix A:

sandbox:dev> more run_start_fulfillment_with_om_debug.sql
var resultout varchar2(1000)
set serveroutput on time on timing on
prompt
prompt setting userid, responsbility and application id for JHASLAGE FOR order management super user
prompt
define user_id=7962
define resp_id=52560
define appl_id=660
define item_key=29904309
define activity_id=141236

declare
l_debug_file varchar2(400);
l_org_id NUMBER;
l_count NUMBER;
l_result VARCHAR2(30);
l_spid varchar2(20);
begin
fnd_global.apps_initialize(&user_id,&resp_id,&appl_id);
select p.spid
into l_spid
from v$process p, v$session s
where s.audsid=userenv('sessionid')
and paddr = addr;
OE_Standard_WF.OEOL_SELECTOR
(p_itemtype => 'OEOL'
,p_itemkey => '&&item_key'
,p_actid => 12345
,p_funcmode => 'SET_CTX'
,p_result => l_result
);
oe_debug_pub.debug_off;
--enable OM debug
oe_debug_pub.debug_on;
oe_debug_pub.initialize;
l_debug_file := OE_DEBUG_PUB.Set_Debug_Mode('FILE');
oe_debug_pub.setdebuglevel(5);
dbms_output.put_line('Debug file is : '|| l_debug_file);
--enable trace
EXECUTE IMMEDIATE ('alter session set max_dump_file_size=unlimited');
EXECUTE IMMEDIATE ('alter session set tracefile_identifier=''gverma_start_fulfillment_with_om_debug_&&item_key''');
EXECUTE IMMEDIATE ('ALTER SESSION SET EVENTS ''10046 TRACE NAME CONTEXT FOREVER, LEVEL 12''');
OE_FULFILL_WF.START_FULFILLMENT('OEOL','&&item_key',&&activity_id,'RUN',:resultout);
--disable trace
EXECUTE IMMEDIATE ('ALTER SESSION SET EVENTS ''10046 TRACE NAME CONTEXT OFF''');
--turn debug off
oe_debug_pub.debug_off;
dbms_output.put_line('File name '||OE_DEBUG_PUB.G_DIR||'/'||OE_DEBUG_PUB.G_FILE);
dbms_output.put_line('Trace file generated for this request:'||'/*'||l_spid||'*.trc');
rollback;
EXCEPTION WHEN OTHERS THEN
dbms_output.put_line('reached error in script ' ||sqlerrm);
rollback;
end;
/
select :resultout from dual
/

Appendix B:

col ITEM_KEY for a10 Head "Item Key"
col BEGIN_DATE for a18 head "Begin Date"


col END_DATE for a18 head "End Date"


col ROOT_ACTIVITY for a30 head "Root Activity"


col RESULT for a15


col STATUS for a8 Head Status


col time_taken_mins for 99999 head "Time(Mins)"


col step format a60

set pages 100 time on timing on lines 150

spool get_details_for_wf_step

select ias.item_key,
to_char(ias.begin_date,'DD-MON-RR HH24:MI:SS') begin_date, to_char(ias.end_date,'DD-MON-RR HH24:MI:SS') end_date,


(ias.end_date-ias.begin_date)*24*60 time_taken_mins,


pa.process_name||'->'||pa.instance_label step,


ias.activity_status Status,


ias.activity_result_code Result


from wf_item_activity_statuses ias,


wf_process_activities pa


where ias.item_type = 'OEOL'


and ias.process_activity = pa.instance_id


and ias.begin_date >= sysdate-14


and pa.process_name||'->'||pa.instance_label = 'XXONT_PROD_SEC_STANDARD_LINE->FULFILL_LINE'


and ias.activity_status = 'COMPLETE'


--


-- and the step was not cancelled


--


and ias.activity_result_code != '#FORCE'


--


-- it took more than so many mins to complete this step


--


and (ias.end_date-ias.begin_date)*24*60 > 30


order by (ias.end_date-ias.begin_date)*24*60 desc, ias.begin_date

/
spool off

Workflow performance case study: Dont Repeat History, Learn from it

shlok for xxont_custom_atp 25012007.gif:

A Sanskrit Shlok underlying the importance of careful planning

Preface

They say: "Those who do not know history are condemned to repeat it."
But why repeat history if you dont have to?
Many questions come to light on this topic of discussion:
  • Is there a better way of designing conditional implementation in workflow?
  • What are the merits/demerits of repeated checking of a particular condition as opposed to executing something when the actual condition happens? What are the performance implications?
  • Are there workflow APIs available which simulate event based subscription model?
  • What are the performance advantages of using the basic Workflow APIs that simulate this event based subscription model?
This case study discusses the performance benefits of keeping unnecessary looping execution in workflow design a a bare minimum and if possible, eliminate it. If there are several short timed loops in the workflow design, that get executed very often, the bad news is that the System spends valuable resources spinning the wheels in vain.
The intended audience is workflow designers, support personnel and Oracle Apps DBA alike. This case study can potentially help users to design their workflow more optimally and succintly, which would have a minimal impact on the execution. The I/O profile and time taken for Workflow Background Process bears out all this information. As far as possible, trends have been depicted from the customer's Production system.

The Business Scenario

The business requirement to be implemented was during the ATP (Available to Promise) phase in the life of an Order line. ATP simply means that we can guarentee the customer that a particular Item can be made available from a particular warehouse on a particular date.

When scheduling an Order Line, if a financial hold existed on the Order line due to some reason, it needed to be unschedule. Also, as soon as the financial hold was released by the Accounts Receivables department, the requirement was to proceed with the remaining business flow in the OEOL workflow design.

How It Was Designed

This is how it was designed:

xxont_custom_atp initial loop design.jpg:

Let us translate the flow:

  • When the WF background program executes custom ATP process, it checks if the order is on hold or not.
  • If the order header or one of the lines is on hold the custom activity "Unschedule Lines On Hold" will unschedule all those lines and waits for an hour before checking the line status again.
  • This loop repeats until all holds on the header/line are released.
Lets look more closely at the Wait Activity definition:
activity details of wait.JPG:
Checking its Node Attributes tab:
activity details of wait -- timeout attribute.JPG:
Now, checking its actual timeout value:
activity details of wait -- timeout attribute value.JPG:
So, the bottomline is that the WAIT activity was timing out after every 15 mins (60 x 0.0416 ~ 15 mins). Just imagine the volume of OEOL item keys at this activity on any given hour. What kind of a load would they be putting on the system.
Now, is that an intelligent way of designing the workflow for this requirement?

Some shocking revelations..

A little analysis of the volume in WF_ITEM_ACTIVITY_STATUSES_H (history) table showed something shocking (!!):
xxont_custom_atop getting the highest looped activity.jpg:
The XXONT_CUSTOM_ATP->WAIT step was generating the highest history records. In this case, it had created 27 million rows (75% of the entire volume) out of 35 millions rows in in WF_ITEM_ACTIVITY_STATUSES_H table.
PS: This is one of the reasons as to why this case study is very interesting: simply because its impact was tremendous.

How are history records created?

Lets clear up some air and get our basics right about how and when are history records created.

When the workflow engine encounters an activity which has a status of COMPLETE, it moves (figuratively) that record from WF_ITEM_ACTIVITY_STATUSES to WF_ITEM_ACTIVITY_STATUSES_H and creates a new record in WF_ITEM_ACTIVITY_STATUSES with status ACTIVE.

Therefore, the more times a COMPLETed activity is re-visited, the more history records for that activity are created. This can happen when a loop having a WAIT activity with a short timeout is designed. The more the workflow background process flows through the activities of such a loop, the more records will get created in WF_ITEM_ACTIVITY_STATUSES_H.

The performance implications...

Lets try to think as to what are the other implications of such poor design?

  • The system keeps looping on idle timeout activity every 15 mins
  • Immediately after the timeout completes, the activity becomes eligible to be processed by the workflow.
  • The next activity (check if hold has been released) checks if the hold has been released?
  • Statistically speaking, as the probability of a financial hold being released is not immediate, the control goes back to the WAIT activity, wherein the cycle repeats
While this is happening:
  • Every 15 mins, the backlog (timedout or deferred) count of unprocessed OEOL item type bloats
  • Due to such frequent time-out activities, other more important processes/activities in the OEOL workflow are not executed in a realistic timeframe
  • Valuable system resource are spent on scheduling and executing the loop's activities on the Production machine
  • The OS on Production machine spends vital I/O resources in servicing these application calls
Simply speaking, that is very in-efficient.
This is like a vicious cycle. The more data you have in in WF_ITEM_ACTIVITY_STATUSES and WF_ITEM_ACTIVITY_STATUSES_H tables, the more time it takes to query and insert  into them (also, the more indexes you have, the more time it would take to delete/update/insert due to automatic index re-organization effort required). It just kills the online performance at times.
The following diagram bears out the above ideas. It shows the top activity (in terms of number of item_keys) in the breakup of OEOL backlog. It can be seen that, without fail, XXONT_CUSTOM_ATP was the top wait activity in the OEOL backlog. Not only was it a liablity to itself, it was also a liability to other legitimate activites perfectly eligible to be processed:
xxont_custom_atp being the highest OEOL wait activity.JPG:

Design Considerations: The Danger Of Introducing Infinite Loops

There is another big danger here: that of introducing infinite loop in the execution of Workflow Background Process (also referred to as FNDWFBG) for OEOL.

Let us assume that there is a standard WAIT activity with a very short timeout/wait time in a loop with a condition checking activity. When the condition is checked, the activity will be deferred or timed out for a very short time.
Meanwhile, Workflow Background Process would be busy processing all other deferred or timed out activities. If FNDWFBG is not done processing other deferred activities by the time the wait/time-out time of previous activities expires, the RE-timedout or deferred activities will be enqueued and re-processed AGAIN by the very same FNDWFBG (Workflow Background Process for say item_type=OEOL/OEOH) job over and over, as long as the condition (e.g "line on hold") is met.
This creates an infinite loop, so FNDWFBG process almost never completes, eating up CPU time, creating massive DBMS archive logs (if turned on), and pumping rows into the WF activity statuses table. This would also create unnecessary rows in the WF_ITEM_ACTIVITY_STATUS_H table, which can be nemesis of online order entry performance in the 11i EBusiness Suite later on.

It is a good idea for workflow activities in custom workflows to have Wait/Timeout time greater than the longest normal execution time of Workflow Background process.
Also, a workflow activity having a near zero Relative (Wait) Time, or Wait Mode stated as Absolute Date, but no Absolute Date specified, which is equivalent to zero Relative Wait Time can cause the same issue (infinite loop) to occur for FNDWFBG (Workflow Background Process for say item_type=OEOL/OEOH) runs with Process Deferred = Yes.

Hence, these poor design practices should be avoided, because, remember, once a runtime for a particular WF definition gets initiated and even if the workflow definition is corrected later and uploaded, the previous runtime data will not be automatically corrected. So its prudent to take in good design considerations before letting your production data run on it.

More design considerations..

Whenever possible, standard Wait activity ( WF_STANDARD.WAIT() ) should not be placed in a loop with a small wait time (shorter than several hours). Alternative workflow designs should be evaluated.
WF_STANDARD.WAIT() with wait time of zero or near zero, or standard Defer activity (WF_STANDARD.DEFER()), should be NEVER be placed in a loop in a workflow process design diagram.

Extending the Wait time should only be used if this solution is not acceptable for a very good reason.

The Superior Design Alternative....

For the above business scenario, the best solution is not to use Wait activity in a loop at all. A much more attractive option is to use published Workflow blocking/contnuation APIs like WF_STANDARD.BLOCK() or OE_STANDARD_WF.STANDARD_BLOCK() should be used for stopping processing flow of a workflow item_key. WF_ENGINE.COMPLETEACTIVITY() should be used for resuming processing flow of a workflow item_key.

An additional advantage of using OE_STANDARD_WF.STANDARD_BLOCK is that  any OEOL workflow item which is blocked on OE_STANDARD_WF.STANDARD_BLOCK can also be progressible through Order Entry form via "Actions" button Progress Order option. This gives more flexibility to the user/designer.

If one wants to progress the flow of a workflow item programatically, WF_ENGINE.COMPLETEACTIVITY() API can be called. This API notifies the Workflow Engine that the specified activity has been completed for a particular item. It signals the Workflow Engine that an asynchronous activity has been completed. This procedure requires that the activity currently has a status of 'Notified' (Internal status equivalent to BLOCKED). An optional activity completion result can also be passed. The result can determine what transition the process takes next.

Example

/* Complete the BLOCK activity which is used in multiple
* subprocesses in parallel splits. */

wf_engine.CompleteActivity('ORDER', '10045',
'XXONT_CUSTOM_ATP:WAIT', 'null');

To enhance the design AND achieve the same business functionality, we need to replace the WAIT activity with a BLOCK activity which will wait till the last hold is released from the order header/line without causing any looping. The COMPLETEACTIVITY API would be called from the Financial Hold release form at Order Header level.

Its as simple as that. The new design would look like:

xxont_custom_atp improved loop design.jpg:

Seeing the Performance Benefits in Production..

Adopting this approach will create only 1 record in WF_ITEM_ACTIVITY_STATUSES table and 2 records in WF_ITEM_ACTIVITY_STATUSES_H table.


Contrast this against continuous looping till a condition is met: for each execution of the WF logic, records are created in WF_ITEM_ACTIVITY_STATUS* tables.

As a result of this change, the landscape of OEOL backlog also changed drastically.

changed -- xxont_custom_atp being NOT the highest OEOL wait activity.JPG:

The average runtime of the Workflow background Process for OEOL also came down consistently.
trend of FNDWFBG OEOL avg runtimes.GIF:

Summary of Learnings

  • It pays to know the published Workflow APIs. The Workflow Developer Guide is a good place to start. Spend time reading the fine manual. More information can never hurt.
  • It pays to spend time and effort in evaluating re-design of loops in workflow design early, especially since once initiated, a specific version of workflow runtime data does not automatically get updated after a newer version of workflow design is uploaded (through WFLOAD)
  • Poor design involving WAIT and poll activities in a loop can cause infinitely running Workflow Background Process (FNDWFBG)
  • Short-timeout activities in loops should be replaced with OE_STANDARD_WF.STANDARD_BLOCK or WF_STANDARD.BLOCK APIs and WF_ENGINE.COMPLETEACTIVITY APIs
  • Simple design changes can bring HUGE performance benefits, especially when a lot of volume is involved
  • One thing which was also learnt in hindsight was that the Development/Testing team needs to be extremely careful to account for all instantiated financial Holds.  We ran into a problem with the removal of the loop where a certain set of financial holds were not being completed.  For this set of scenarios (automatically released holds not going through the quick order form) we created a custom concurrent program to identify the blocks and call the WF_ENGINE.COMPLETEACTIVITY API. 


Simply put, Know your facts, Be smart and Dont repeat history if you dont have to. Above all, do you testing well.

Suggested Related Readings

When Conventional Thinking Fails: A Performance Case Study in Order Management Workflow customization --
Posted on 17 May 2007; 12:09:31 PM


http://blogs.oracle.com/gverma/performance/om_wf_xxont_cascade

A Performance Case Study: Taming the Workflow Purge --
Posted on 21 March 2007; 6:58:11 PM


http://blogs.oracle.com/gverma/2007/03/21#a72

We're Skewed! Using 10053 event for being Histogram-wise --
Posted on 27 March 2007; 3:02:33 PM


http://blogs.oracle.com/gverma/2007/03/27#a80

March 8, 2008

Troubleshooting case study for 9i RAC ..PRKC-1021 : Problem in the clusterware

Preface

This is a troubleshooting case study for a 9i RAC environment on HP-UX PA RISC 64 bit, which was using HP ServiceGuard solution (third party) for cluster maintenance.

The problem

The problem was that the srvctl command was giving the following error:

(oracle@rac2):  gsdctl stat
PRKC-1021 : Problem in the clusterware
Failed to get list of active nodes from clusterware

(oracle@rac1):  srvctl config
PRKC-1021 : Problem in the clusterware

(oracle@rac1):  pwd
/u01/oracle/uatdb/9.2/bin
(oracle@rac1):  ./lsnodes
rac1
rac2
app1
app2


Troubleshooting Approach

Followed note 178683.1 to edit $ORACLE_HOME/bin/gsd.sh like this and traced gsd.sh:

# added -DTRACING.ENABLED=true -DTRACING.LEVEL=2 before -classpath
exec $JRE -DPROGRAM=gsd -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath $CLA
SSPATH oracle.ops.mgmt.daemon.OPSMDaemon $MY_OHOME


$ $ORACLE_HOME/bin/gsd.sh
...
...
[main] [13:15:42:852] [line# N/A]  my property portnum=null
[main] [13:15:42:942] [NativeSystem.&lt;init>:123]  Detected Cluster
[main] [13:15:42:946] [NativeSystem.&lt;init>:124]  Cluster existence = true
[main] [13:15:42:947] [UnixSystem.&lt;init>:118]  Going to load SRVM library
[main] [13:15:42:950] [UnixSystem.&lt;init>:118]  loaded libraries
[main] [13:15:42:950] [OPSMDaemon.main:726]  Initializing the daemon ...
[main] [13:15:42:951] [OPSMDaemon.&lt;init>:188]  NativeDebug is set to true
[main] [13:15:42:952] [OPSMDaemon.&lt;init>:188]  UnixSystem.initializeDaemon: groupName is opsm
[main] [13:15:42:953] [OPSMDaemon.&lt;init>:188]  Unsatisfied Link Error caught. Could not initialize the cluster
[main] [13:15:42:954] [OPSMDaemon.main:726]  initializeDaemon status = false
[main] [13:15:42:955] [OPSMDaemon.main:726]  Failed to initialize and register with clusterware
[main] [13:15:42:955] [OPSMDaemon.main:726]  OPSMErrCode = 1003
[main] [13:15:42:958] [OPSMDaemon.main:729]  java.rmi.RemoteException: Unable to initialize with the clusterware
java.rmi.RemoteException: Unable to initialize with the clusterware
        at oracle.ops.mgmt.daemon.OPSMDaemon.&lt;init>(OPSMDaemon.java:195)
        at oracle.ops.mgmt.daemon.OPSMDaemon.main(OPSMDaemon.java:726)

[main] [13:15:42:958] [line# N/A]  Exiting from main..no point trying to start the daemon


At this point, one option was to initialize the srvconfig raw device (OCR) and then add the RAC instances etc manuall using srvctl add command:

(oracle@rac1):  srvconfig -init
java.lang.UnsatisfiedLinkError: readRawObject
        at oracle.ops.mgmt.nativesystem.UnixSystem.readRawObject(UnixSystem.java:410)
        at oracle.ops.mgmt.rawdevice.RawDevice.readMagicString(RawDevice.java:187)
        at oracle.ops.mgmt.rawdevice.RawDeviceVersion.readVersionString(RawDeviceVersion.java:175)
        at oracle.ops.mgmt.rawdevice.RawDeviceVersion.isValidConfigDevice(RawDeviceVersion.java:75)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.&lt;init>(RawDeviceUtil.java:147)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.main(Compiled Code)
Exception in thread "main" (oracle@rac1): 

(oracle@rac1):  srvconfig  -version  
java.lang.UnsatisfiedLinkError: readRawObject
        at oracle.ops.mgmt.nativesystem.UnixSystem.readRawObject(UnixSystem.java:410)
        at oracle.ops.mgmt.rawdevice.RawDevice.readMagicString(RawDevice.java:187)
        at oracle.ops.mgmt.rawdevice.RawDeviceVersion.readVersionString(RawDeviceVersion.java:175)
        at oracle.ops.mgmt.rawdevice.RawDeviceVersion.isValidConfigDevice(RawDeviceVersion.java:75)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.&lt;init>(RawDeviceUtil.java:147)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.main(Compiled Code)

If the config file is pointing to a raw device the following type of output should be returned:

     $ raw device version "9.0.0.0.0"

Since we were not getting that output, there was either a problem in accessing the OCR raw device or the soft link was not working due to a permissions issue. Outputting the contents of the OCR on the standard output using $ dd if=/dev/orar1/rrawuat.conf bs=1500 showed that the OCR device was readable all right.

$  more /var/opt/oracle/srvConfig.loc
srvconfig_loc=/dev/orar1/rrawuat.conf

$ ls -l /dev/orar1/rrawuat.conf
crw-rw----   1 oracle    dba         64 0x110004 Apr 11  2007 /dev/orar1/rrawuat.conf


Then one idea was to relink the srv* binaries using the make command, but that also resulted in error:

(oracle@rac2):  cd $ORACLE_HOME/srvm/lib  
(oracle@rac2):  make -f ins_srvm.mk install
nm -gp /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.a | grep T  | grep Java | awk '{ print "-u " $3 }' >  /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.def;
/u01/oracle/uatdb/9.2/bin/echodo ld +s -G -b -o libsrvm.sl -c
/u01/oracle/uatdb/9.2/srvm/lib/libsrvm.def /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.a                           -L/u01/oracle/uatdb/9.2/lib32/ -L/u01/oracle/uatdb/9.2/srvm/lib/  -L/usr/lib -lc  -lclntsh -lwtc9 -lnls9  -lcore9 -lnls9 -lcore9  -lnls9 -lxml9 -lcore9
-lunls9 -lnls9 /opt/nmapi/nmapi2/lib/libnmapi2.sl -lm `cat /u01/oracle/uatdb/9.2/lib/sysliblist` ;
rm /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.def
ld +s -G -b -o libsrvm.sl -c /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.def /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.a -L/u01/oracle/uatdb/9.2/lib32/ -L/u01/oracle/uatdb/9.2/srvm/lib/
-L/usr/lib -lc -lclntsh -lwtc9 -lnls9 -lcore9 -lnls9 -lcore9 -lnls9 -lxml9 -lcore9 -lunls9 -lnls9 /opt/nmapi/nmapi2/lib/libnmapi2.sl -lm -l:libcl.sl -l:librt.sl -lpthread -l:libnss_dns.1 -l:libdld.sl
ld: Mismatched ABI (not an ELF file) for -lclntsh, found /u01/oracle/uatdb/9.2/lib32//libclntsh.sl
Fatal error.
*** Error exit code 1

Stop.

The turning point

Saw bug 6281672 and kind of got hint from there.

Compared the file $ORACLE_HOME/lib32/libsrvm.sl on another RAC system (duat).

(oracle@duat1):  ls -l $ORACLE_HOME/lib*/libsrvm*
-rwxr-xr-x   1 oracle    dba          57344 Nov  7 20:04 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl
-rwxr-xr-x   1 oracle    dba          57344 Nov  7 08:11 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl0
-rwxr-xr-x   1 oracle    dba          36864 Nov  7 20:04 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl
-rwxr-xr-x   1 oracle    dba          36864 Nov  7 08:11 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl0
(oracle@duat1): 


On rac1/2, the /u01/oracle/uatdb/9.2/lib32/libsrvm.sl file was missing. Saw that the .sl and .sl0 files were copied of each other.

so on rac1/2, did the following:

$ cd /u01/oracle/uatdb/9.2/lib32
$ cp libsrvm.sl0 libsrvm.sl


Before:

(oracle@rac2):  ls -l $ORACLE_HOME/lib32/libsrvm*
-r-xr-xr-x   1 oracle    dba          57344 Oct 17  2004 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl0
-rwxr-xr-x   1 oracle    dba          36864 Feb  9 06:49 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl
-r-xr-xr-x   1 oracle    dba         106496 Nov 30  2004 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl0


After:

(oracle@rac2):  ls -l $ORACLE_HOME/lib32/libsrvm*
-r-xr-xr-x   1 oracle    dba          57344 Mar  5 15:14 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl
-r-xr-xr-x   1 oracle    dba          57344 Oct 17  2004 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl0
-rwxr-xr-x   1 oracle    dba          36864 Feb  9 06:49 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl
-r-xr-xr-x   1 oracle    dba         106496 Nov 30  2004 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl0

The fact that the size of libsrvmocr.sl0 was not matching with libsrvmocr.sl did not seem to be a showstopper. The Oracle system seemed to be taking libsrvm.sl into account for srv* related commands.

After this, the gsdctl and srvctl commands started working:

(oracle@rac1):  gsdctl stat                                             
GSD is running on the local node

(oracle@rac1):  srvctl config
uat

(oracle@rac2):  gsdctl stat
GSD is not running on the local node
(oracle@rac2): 

(oracle@rac2):  gsdctl start
Successfully started GSD on local node

(oracle@rac2):  srvctl status database -d uat
Instance uat1 is not running on node rac1
Instance uat2 is not running on node rac2

But there was "just" one more problem..

Interestingly, srvctl stop database/instance worked fine at this point, but srvctl start did not.

(oracle@rac2):  srvctl start instance -d uat -i uat1
PRKP-1001 : Error starting instance uat1 on node rac1
ORA-00119: invalid specification for system parameter local_listener
ORA-00132: syntax error or unresolved network name 'uat1' reserved.
ORA-01078: failure in processing system parameters local_listener
ORA-00132: syntax error or unresolved network name 'uat1' reserved.

(oracle@rac2):  srvctl start instance -d uat -i uat2
PRKP-1001 : Error starting instance uat2 on node rac2
ORA-00119: invalid specification for system parameter local_listener
ORA-00132: syntax error or unresolved network name 'uat2' reserved.
ORA-01078: failure in processing system parameters
SQL> Disconnected
ORA-00132: syntax error or unresolved network name 'uat2' reserved.

initUAT2.ora:
-------------
uat1.local_listener='uat1'
uat2.local_listener='uat2'
uat1.remote_listener='uat2'
uat2.remote_listener='uat1'

(oracle@rac1):  strings spfileuat1.ora | grep listener
uat1.local_listener='uat1'
uat2.local_listener='uat2'
uat1.remote_listener='uat2'
uat2.remote_listener='uat1'

Tnsping utility was working for both UAT1 and UAT2 service names..

(oracle@rac2):  tnsping uat1
TNS Ping Utility for HPUX: Version 9.2.0.6.0 - Production on 05-MAR-2008 16:19:36

Copyright (c) 1997 Oracle Corporation.  All rights reserved.

Used parameter files:
/u01/oracle/uatdb/9.2/network/admin/uat2_rac2/sqlnet_ifile.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=rac1.test.com)(PORT=1522)) (CONNECT_DATA= (SERVICE_NAME=uat) (INSTANCE_NAME=uat1)))
OK (0 msec)

(oracle@rac2):  tnsping uat2
TNS Ping Utility for HPUX: Version 9.2.0.6.0 - Production on 05-MAR-2008 16:19:39

Copyright (c) 1997 Oracle Corporation.  All rights reserved.

Used parameter files:
/u01/oracle/uatdb/9.2/network/admin/uat2_rac2/sqlnet_ifile.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=rac2.test.com)(PORT=1522)) (CONNECT_DATA= (SERVICE_NAME=uat) (INSTANCE_NAME=uat2)))
OK (0 msec)

Added entries for local_listener and remote_listener to $TNS_ADMIN/tnsnames.ora on rac1 (and rac2 resp) and made sure that tnsping to them was working:


local_listener=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=rac2.test.com)(PORT=1522))
        )
remote_listener=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=rac1.test.com)(PORT=1522))
        )

(oracle@rac2):  tnsping local_listener
TNS Ping Utility for HPUX: Version 9.2.0.6.0 - Production on 05-MAR-2008 16:44:05

Copyright (c) 1997 Oracle Corporation.  All rights reserved.

Used parameter files:
/u01/oracle/uatdb/9.2/network/admin/uat2_rac2/sqlnet_ifile.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=rac2.test.com)(PORT=1522)))
OK (10 msec)


(oracle@rac2):  tnsping remote_listener
TNS Ping Utility for HPUX: Version 9.2.0.6.0 - Production on 05-MAR-2008 16:44:13
Copyright (c) 1997 Oracle Corporation.  All rights reserved.

Used parameter files:
/u01/oracle/uatdb/9.2/network/admin/uat2_rac2/sqlnet_ifile.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=rac1.test.com)(PORT=1522)))
OK (10 msec)

The final clincher

I saw the light when I realized that the Database ORACLE_HOME was using autoconfig, which meant that the $TNS_ADMIN variable had $CONTEXT_NAME in it, and was not just plain vanilla $ORACLE_HOME/network/admin. Therefore, the listener.ora and tnsnames.ora files in $TNS_ADMIN were not the same as those in $ORACLE_HOME/network/admin

To get around this issue graciously, srvctl setenv command could be used:

$ srvctl setenv instance -d uat -i uat1 -t TNS_ADMIN=/u01/oracle/uatdb/9.2/network/admin/uat1_rac1


$ srvctl getenv instance -d uat -i uat1
TNS_ADMIN=/u01/oracle/uatdb/9.2/network/admin/uat1_rac1

$ srvctl setenv instance -d uat -i uat2 -t TNS_ADMIN=/u01/oracle/uatdb/9.2/network/admin/uat2_rac2

$ srvctl getenv instance -d uat -i uat2
TNS_ADMIN=/u01/oracle/uatdb/9.2/network/admin/uat2_rac2


(oracle@rac1):  srvctl start instance -d uat -i uat1

(oracle@rac1):  srvctl status instance -d uat -i uat1
Instance uat1 is running on node rac1

(oracle@rac1):  srvctl start instance -d uat -i uat2

(oracle@rac1):  srvctl status instance -d uat -i uat2
Instance uat2 is running on node rac2

Some pending issues..

Some issues were still pending, pointing towards some more mis-configuration, but at least we were able to get over the initial error.

(oracle@rac1):  srvconfig  -version
java.lang.NullPointerException
        at java.text.MessageFormat.format(Compiled Code)
        at java.text.MessageFormat.format(MessageFormat.java)
        at java.text.MessageFormat.format(MessageFormat.java)
        at oracle.ops.mgmt.nls.MessageBundle.getMessage(MessageBundle.java:225)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.main(Compiled Code)


March 10, 2008

Troubleshooting done to make root.sh work after a 10gR2 CRS (10.2.0.1) installation on HP-UX PA RISC 64-bit OS

Preface

This posting describes the troubleshooting done to make root.sh work after a 10gR2 CRS (10.2.0.1) installation on HP-UX PA RISC 64-bit OS. We were upgrading from 9i (9.2.0.6) to 10gR2.

The problem

(root@testdb):/stage11i/software# /u01/oracle/testdb/CRS/root.sh
WARNING: directory '/u01/oracle/testdb' is not owned by root
WARNING: directory '/u01/oracle' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured
Checking to see if any 9i GSD is up

Setting the permissions on OCR backup directory
Setting up NS directories
PROT-3: Failed to create export file
Failed to upgrade Oracle Cluster Registry configuration


The fix was mentioned in Bug 5487374 -- TST&PERF:CRS UPGRADE NEEDS TO CREATE OCR BACKUP DIR DURING OUT OF PLACE UPGRADE

We had to create sub-directory  "crs" under /u01/oracle/testdb/CRS/cdata/crs10g and re-run root.sh. The root.sh expects to find a subdirectory for the previous RAC cluster's name.

More trouble..

There were some more errors while running root.sh
...
...
Format of 1 voting devices complete.

/u01/oracle/testdb/CRS/install/rootconfig[938]: /sbin/init.d/init.crs: Execute permission denied.
/u01/oracle/testdb/CRS/install/rootconfig[941]: /sbin/init.d/init.cssd: Execute permission denied.
CRS daemons not set to start. See /var/adm/messages for details.


(root@testdb):/u01/oracle/testdb/CRS/cdata/crs10g# echo $?
1


Fix:

As per the SR 6807470.992 we performed the following action Plan.

Login as root and grant the following permissions -

# cd /sbin/init.d
# chmod 755 init.crs
# chmod 755 init.crsd
# chmod 755 init.cssd
# chmod 555 init.evmd


Then check the permissions are correct -

(root@testdb):/sbin/init.d# ls -rlt init*
-rwxr-xr-x 1 root sys 2226 Mar 3 18:32 init.crs
-rwxr-xr-x 1 root sys 4845 Mar 3 18:32 init.crsd
-rwxr-xr-x 1 root sys 36798 Mar 3 18:32 init.cssd
-rwxr-xr-x 1 root sys 3185 Mar 3 18:32 init.evmd


When we tried to execute root.sh we are getting following warning:

(root@testdb):/stage11i/software# /u01/oracle/testdb/CRS/root.sh
WARNING: directory '/u01/oracle/testdb' is not owned by root
WARNING: directory '/u01/oracle' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured


At this point, we edited the file /var/opt/oracle/scls_scr/testdb/oratest1/cssfatal from disable to enable in both the nodes (there is only 1 line in this file)

(root@testdb):more /var/opt/oracle/scls_scr/testdb/oratest1/cssfatal   
enable


Also give permissions as follows:

(root@testdb):/var/opt/oracle# chmod -R 777 /var/opt/oracle/scls_scr

Then started the CRS.

# $ORA_CRS_HOME/bin/crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly

About March 2008

This page contains all entries posted to Experiments from the Field..Based on True Stories in March 2008. They are listed from oldest to newest.

February 2008 is the previous archive.

April 2008 is the next archive.

Many more can be found on the main index page or by looking through the archives.

Creative Commons License
This weblog is licensed under a Creative Commons License.
Powered by
Movable Type and Oracle