Tuesday Aug 03, 2010

Understanding AQs in Workflow

This blog post contributed by Shivdas Tomar.

In Oracle Workflow, Advanced Queues are used for inter-process communication, background processing, sending messages to external agents and so on. Many times the messages may not be dequeued by the consumers such as Workflow Agent Listeners or Workflow Background Engines.

Symptoms

  • Workflow Agent Listeners or Background engines are not able to dequeue messages in READY state.
  • Messages continue to be in WAIT state even after passing the "delay" interval
  • Processed messages are not purged after passing the "retention" interval
  • Queue propagation is not happening

Checks

Always check AQ view (AQ$<Queue_Table_Name> for message state and not the under lying table directly.

  • If messages are in READY state but consumers such as Workflow Agent Listeners and Workflow Background Engines are not able to dequeue, then there might be issues with message recipients or AQ subscribers. Check the CONSUMER_NAME and ADDRESS columns value in that AQ and then the subscriber rule sets.
  • If event messages are not moving from one state to another, check AQ_TM_PROCESSES parameter's value to be greater than 0.
Note: In 10gR1 or later database releases, the AQ time management and many other background processes are automatically controlled by a coordinator-slave architecture called Queue Monitor Coordinator (QMNC). QMNC dynamically spawns slaves named qXXX depending on the system load.
If you want to disable the Queue Monitor Coordinator, then you must set AQ_TM_PROCESSES = 0 in your pfile or spfile. Oracle strongly recommends that you DO NOT set AQ_TM_PROCESSES = 0. If you are using Oracle Streams, setting this parameter to zero (which Oracle Database respects no matter what) can cause serious problems. So either remove AQ_TM_PROCESSES parameter or do not set its value as zero.
  • Check if processes, named like 'qmnc' or 'ora_qxxx' are running at the OS level. If there are no such process(es) running then bounce the Oracle Database after removing AQ_TM_PROCESSES or changing its value to a non-zero value as recommended in Oracle Applications setup documentation.

  • If Queue propagation is not happening, job_queue_processes parameter's value should be checked.
Note: In 11gR1 or later database releases, the init.ora parameter JOB_QUEUE_PROCESSES does not need to be set for AQ propagations. AQ propagation is now handled by DBMS_SCHEDULER jobs rather than DBMS_JOBS.
Propagation takes advantage of the event based scheduling features of DBMS_SCHEDULER for better scalability. If the value of the JOB_QUEUE_PROCESSES database initialization parameter is zero, then that parameter does not influence the number of Oracle Scheduler jobs that can run concurrently. However, if the value is non-zero, it effectively becomes the maximum number of scheduler jobs and job queue jobs that can run concurrently. If a non-zero value is set, it should be large enough to accommodate a scheduler job for each Messaging Gateway agent to be started.

Gathering Stats for Workflow's Queue Tables

This blog post contributed by Ajay Verma.

Statistics gathering is an important activity in order to enable the cost based optimizer to take a decision on the execution plan to choose from the various possible ones with which a particular SQL statement can be run. It is recommended to gather the statistics for an object when it is in steady state, i.e. when the corresponding tables have a fair amount of data giving out statistics which would lead the best execution plan to be picked by CBO.

From 10g onwards , Oracle database introduced Automatic Optimizer Statistics Collection feature in which a scheduled job GATHER_STATS_JOB runs in the nights and gathers stats for tables with either empty or stale statistics using DBMS_STATS  package APIs. This indeed is a great feature but for highly volatile tables such as AQ/Streams tables  it is quite possible that when the stats collection job runs, these tables may not have the data that is representative of their full load period. One scenario can be stats collection on queue tables when they were empty, and the optimizer, as a result, choosing poor execution plans when they have a lot of data in them during normal workload hours.

As a direct fix to tackle the above issue, from 10gR2 database onwards, AQ table stats are getting locked immediately after its creation so that the auto stats collector doesn't change it in the future. This can be confirmed as below:

SQL> CREATE TYPE event_msg_type AS OBJECT   
                  (  name            VARCHAR2(10), 
                     current_status  NUMBER(5),  
                     next_status     NUMBER(5)  
                   ); 
Type created.
--Create Queue Table
SQL> EXEC DBMS_AQADM.create_queue_table(
              queue_table=>'test_queue_tab',
              queue_payload_type=>'event_msg_type')
PL/SQL procedure successfully completed. 
--Create Queue
SQL> EXEC DBMS_AQADM.create_queue(queue_name =>'test_queue',
                             queue_table=>'test_queue_tab');
PL/SQL procedure successfully completed.
--Check for locks on stats
SQL> select owner, table_name, stattype_locked
     from dba_tab_statistics
     where stattype_locked is not null
     and owner='APPS' and table_name='TEST_QUEUE_TAB';
OWNER       TABLE_NAME      STAT
---------- ---------------- ----- 
APPS        TEST_QUEUE_TAB   ALL 
--Gather stats
SQL> EXEC dbms_stats.gather_table_stats('APPS','TEST_QUEUE_TAB')
BEGIN dbms_stats.gather_table_stats('APPS','TEST_QUEUE_TAB'); END; 
* 
ERROR at line 1: 
ORA-20005: object statistics are locked (stattype = ALL) 
ORA-06512: at "SYS.DBMS_STATS", line 18408 
ORA-06512: at "SYS.DBMS_STATS", line 18429 
ORA-06512: at line 1

As we can see above , queue table stats are locked and hence further stats gathering fails. We can still go ahead and use force parameter provided in DBMS_STATS package to override any lock on statistics:

SQL>exec dbms_stats.gather_table_stats('APPS','TEST_QUEUE_TAB'
					      ,force=>TRUE)
PL/SQL procedure successfully completed.

The recommended use is to gather statistics with a representative queue message load and keep the stats locked to avoid getting picked up by auto stats collector.

What happens in Apps Instance?

From Oracle Apps 11i  (on top of Oracle Database 10gR2) instance onwards, the automatic statistics gathering job GATHER_STATS_JOB is disabled using $APPL_TOP/admin/adstats.sql , as can be confirmed below:

 
SQL> select job_name, enabled from DBA_SCHEDULER_JOBS 
WHERE job_name = 'GATHER_STATS_JOB';
JOB_NAME                       ENABLE
------------------------------ ------
GATHER_STATS_JOB               FALSE	

Oracle Apps provides separate concurrent programs which make use of procedures in FND_STATS package to gather statistics for apps database objects. FND_STATS is basically a wrapper around DBMS_STATS and is recommended by Oracle for stats collection in Oracle Apps environment because of the flexibility it provides in identifying the objects with empty and stale stats. The modification threshold ( % of rows used to estimate) is fixed to 10% in DBMS_STATS while it can vary from 0 to 100% incase of FND_STATS.

But FND_STATS doesn't provide the force parameter to override any lock on statistics and hence fails while gathering statistics for a queue table whose stats are locked :

 
SQL> select owner, table_name, stattype_locked ,last_analyzed
  2   from dba_tab_statistics where table_name='WF_DEFERRED'
  3  and owner='APPLSYS';
OWNER          TABLE_NAME      STATT LAST_ANAL
-------------- --------------- ----- ---------
APPLSYS        WF_DEFERRED     ALL   23-FEB-09
SQL> exec fnd_stats.gather_table_stats('APPLSYS','WF_DEFERRED');
BEGIN fnd_stats.gather_table_stats('APPLSYS','WF_DEFERRED'); END;
*
ERROR at line 1:
ORA-20005: object statistics are locked (stattype = ALL)
ORA-06512: at "APPS.FND_STATS", line 1505
ORA-06512: at line 1	

The same error is observed on running the Gather Table Statistics concurrent program for such queue tables. Gather Schema Statistics concurrent program in general skips the table incase any error is encountered while gathering its stats and hence the queue tables with locked stats are also skipped.

Hence currently ,  in order to carry out stat collection for Workflow queue tables whose stats are locked,the approach should be to unlock the queue table temporarily using DBMS_STAT's Unlock APIs when the table have a representative load for the correct stats to be gathered, gather the statistics using FND_STATS and lock it again.

SQL> begin
  2    dbms_stats.unlock_table_stats('APPLSYS','WF_DEFERRED');
  3    fnd_stats.gather_table_stats('APPLSYS','WF_DEFERRED');
  4    dbms_stats.lock_table_stats('APPLSYS','WF_DEFERRED');
  5  end;
  6  /
PL/SQL procedure successfully completed.
SQL> select owner, table_name, stattype_locked ,last_analyzed
  2  from dba_tab_statistics where table_name='WF_DEFERRED'
  3  and owner='APPLSYS';
OWNER            TABLE_NAME       STATT LAST_ANALYZED
---------------- ---------------- ----- -------------
APPLSYS          WF_DEFERRED      ALL   12-APR-09	

Another approach can be to unlock the WF queue table stats during the adpatch run itself as Oracle's automatic stats collection feature is disabled in Apps environments and the stats collection in general occurs through the Apps provided concurrent programs. What say?

About

This blog is dedicated to bring latest information on Oracle Workflow new features, best practices, troubleshooting techniques and important fixes directly from it's Development Team.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today