Tuesday Jun 12, 2012

Oracle E-Business Suite Tip : SQL Tracing

Issue:

Attempts to enable SQL tracing from the concurrent request form fail with the error:

Function not available to this responsibility.
Change Responsibilities or contact your System Administrator

Resolution:

Switch to the "System Administrator" responsibility. Navigate to System -> Profiles and query for "%Diagnostics%" to find the "Utilities : Diagnostics" profile. Once the profile is found, change its value to "Yes". Restart the web browser and try enabling SQL trace again.

Tuesday May 08, 2012

OBIEE 11g: Resolving Presentation Services Startup Failure

ISSUE:

Starting Presentation Services fails with the error:

[OBIPS] [ERROR:1] [] [saw.security.odbcuserpopulationimpl.getbisystemconnection] [ecid: ] [tid: ] Authentication Failure.
Odbc driver returned an error (SQLDriverConnectW).
State: 08004.  Code: 10018.  [NQODBC] [SQL_STATE: 08004] [nQSError: 10018] Access for the requested connection is refused.
[nQSError: 43113] Message returned from OBIS.
[nQSError: 43126] Authentication failed: invalid user/password. (08004)[[

Connecting to the metadata repository (RPD) in online mode also fails with a similar error.

Looking through the BI server log, nqserver.log, you may find an error message similar to the following:

[OracleBIServerComponent] [ERROR:1] [] [] [ecid: 0001J1LfUetFCC3LVml3ic0000pp000000] [tid: 1] 
[13026] Error in getting roles from BI Security Service:	 
'Error Message From BI Security Service: [nQSError: 46164] HTTP Server returned 404 (Not Found) for URL .'
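
When the logs are long, it can help to pull out just the nQSError codes before searching My Oracle Support. The following is only a minimal Python sketch: the sample text is copied from the excerpts above, while the function name and the regular expression are my own illustration and not part of any OBIEE tooling.

```python
import re

# Sample text copied from the two log excerpts in this post.
LOG_TEXT = """\
State: 08004.  Code: 10018.  [NQODBC] [SQL_STATE: 08004] [nQSError: 10018] Access for the requested connection is refused.
[nQSError: 43113] Message returned from OBIS.
[nQSError: 43126] Authentication failed: invalid user/password. (08004)[[
'Error Message From BI Security Service: [nQSError: 46164] HTTP Server returned 404 (Not Found) for URL .'
"""

# Capture the numeric code and the text that follows it, up to the next
# "[" or the end of the line.
NQS_ERROR = re.compile(r"\[nQSError:\s*(\d+)\]\s*([^\[\n]*)")


def extract_nqs_errors(text):
    """Return (code, message) pairs in the order they appear in the text."""
    return [(int(code), msg.strip()) for code, msg in NQS_ERROR.findall(text)]


if __name__ == "__main__":
    for code, message in extract_nqs_errors(LOG_TEXT):
        print(code, message)
```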

RESOLUTION:

  • Connect to WebLogic Server (WLS) Console -> Deployments. Ensure that all deployed components are in 'Active' state.

  • If any of the components is in 'Prepared' state, select that application and then click on "start servicing all requests"

  • Restart BI Server and Presentation Services

In some cases, the following additional step might be needed to resolve the issue.

  • Access the Enterprise Manager Fusion Middleware control: http://<host.domain>:<port>/em

  • Navigate to Business Intelligence -> coreapplication

  • 'Capacity Management' tab -> 'Scalability' sub-tab

  • Click on 'Lock and Edit Configuration' button

  • Enter the IP address in the 'Listen Address' field

  • Click on 'Activate Changes' followed by 'Release Configuration' buttons

  • Restart BI Server and Presentation Services

Also check these My Oracle Support (MOS) documents for more clues and information.

1387283.1 Authentication failed: invalid user/password
1251364.1 Error: "[nQSError: 10018] Access .. Refused. [nQSError: 43126] Authentication Failed .." when Installing OBIEE 11g
1410233.1 How To Bind Components / Ports To A Specific IP Address On Multiple Network Interface (NIC) Machines

Friday Apr 27, 2012

Solaris Volume Manager (SVM) on Solaris 11

SVM is not installed on Solaris 11 by default.

# metadb
-bash: metadb: command not found

# /usr/sbin/metadb
-bash: /usr/sbin/metadb: No such file or directory

Install it using the pkg utility.

# pkg info svm
pkg: info: no packages matching the following patterns you specified are
installed on the system.  Try specifying -r to query remotely:

        svm

# pkg info -r svm
          Name: storage/svm
       Summary: Solaris Volume Manager
   Description: Solaris Volume Manager commands
      Category: System/Core
         State: Not installed
     Publisher: solaris
       Version: 0.5.11
 Build Release: 5.11
        Branch: 0.175.0.0.0.2.1
Packaging Date: October 19, 2011 06:42:14 AM 
          Size: 3.48 MB
          FMRI: pkg://solaris/storage/svm@0.5.11,5.11-0.175.0.0.0.2.1:20111019T064214Z

# pkg install storage/svm
           Packages to install:   1
       Create boot environment:  No
Create backup boot environment: Yes
            Services to change:   1

DOWNLOAD                                  PKGS       FILES    XFER (MB)
Completed                                  1/1     104/104      1.6/1.6

PHASE                                        ACTIONS
Install Phase                                168/168 

PHASE                                          ITEMS
Package State Update Phase                       1/1 
Image State Update Phase                         2/2 

# which metadb
/usr/sbin/metadb

This time metadb may fail with a different error.

# metadb
metadb: <HOST>: /dev/md/admin: No such file or directory

Check if md.conf exists.

# ls -l  /kernel/drv/md.conf 
-rw-r--r--   1 root     sys          295 Apr 26 15:07 /kernel/drv/md.conf

Dynamically re-scan md.conf so the device tree gets updated.

# update_drv -f md

# ls -l  /dev/md/admin
lrwxrwxrwx   1 root root 31 Apr 20 10:12 /dev/md/admin -> ../../devices/pseudo/md@0:admin

# metadb
metadb: <HOST>: there are no existing databases

Now Solaris Volume Manager is ready to use.

eg.,
#  metadb -f -a c0t5000CCA00A5A7878d0s0

# metadb
        flags           first blk       block count
     a        u         16              8192          /dev/dsk/c0t5000CCA00A5A7878d0s0

Friday Mar 30, 2012

Resolving "PLS-00201: identifier 'DBMS_SYSTEM.XXXX' must be declared" Error

Here is a failure sample.

SQL> set serveroutput on
SQL> alter package APPS.FND_TRACE compile body;

Warning: Package Body altered with compilation errors.

SQL> show errors
Errors for PACKAGE BODY APPS.FND_TRACE:

LINE/COL ERROR
-------- -----------------------------------------------------------------
235/6    PL/SQL: Statement ignored
235/6    PLS-00201: identifier 'DBMS_SYSTEM.SET_EV' must be declared
..

By default, the DBMS_SYSTEM package is accessible only from the SYS schema, and no public synonym exists for it. The solution is to create a public synonym and grant the "execute" privilege on the DBMS_SYSTEM package either to a specific database user or to all users.

eg.,

SQL> CREATE PUBLIC SYNONYM dbms_system FOR dbms_system;

Synonym created.

SQL> GRANT EXECUTE ON dbms_system TO APPS;

Grant succeeded.

- OR -

SQL> GRANT EXECUTE ON dbms_system TO PUBLIC;

Grant succeeded.

SQL>  alter package APPS.FND_TRACE compile body;

Package body altered.

Note that merely granting the execute privilege is not enough -- creating the public synonym is equally important to resolve this issue.

Tuesday Feb 28, 2012

Oracle RDBMS & Solaris : Few Random Tips (Feb 2012)

These tips are just some quick solutions or workarounds. Use these quickies at your own risk.

[#1] Oracle Data Pump

Q: How to exclude the table definition while importing a table using Oracle Data Pump import utility?

A: Use EXCLUDE=TABLE/TABLE option.

eg.,

impdp login/password DUMPFILE=<DUMP_FILENAME> LOGFILE=<LOGFILE_NAME> \
 DIRECTORY=<DB_DIR_NAME> TABLES=<TABLE_NAME> EXCLUDE=TABLE/TABLE



[#2] Workaround to ORA-01089: immediate shutdown in progress - no operations are permitted

If another shutdown or startup is attempted while the database is in the middle of an instance shutdown, Oracle RDBMS may throw the above ORA-01089 error. The workaround is to force Oracle to start the database instance using the startup force option. This option shuts down the database instance (if running) in abort mode and then starts it up.

eg.,

SQL> STARTUP FORCE



[#3] Quick steps to upgrade the Oracle database from version 11.2.0.[1 or 2] to 11.2.0.3

Execute the following in the same sequence as sysdba.

startup upgrade
-- pre-upgrade information tool ("?" expands to $ORACLE_HOME in SQL*Plus)
@?/rdbms/admin/utlu112i.sql
exec dbms_stats.gather_dictionary_stats (DEGREE => 64);
-- create/modify data dictionary tables
@?/rdbms/admin/catupgrd.sql
-- all components should be in VALID state
@?/rdbms/admin/utlu112s.sql
shutdown immediate
startup
-- upgrade actions that do not require DB in UPGRADE mode
@?/rdbms/admin/catuppst.sql
-- recompile stored PL/SQL and Java code
@?/rdbms/admin/utlrp.sql
-- verify that all packages and classes are valid
SELECT count(*) FROM dba_invalid_objects;
exit



[#4] Q: Solaris: how to get rid of zombie processes?

A: Run the following with appropriate user privileges.

ps -eaf | grep defunct | grep -v grep | preap `awk '{ print $2 }'`

Alternative way: (not as good as the previous one - still may work as expected)

prstat -n 500 1 1 | grep zombie | preap `awk '{ print $1 }'`
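
If you would rather review the list of zombie PIDs before handing them to preap, the filtering step can be done separately. Here is a small Python sketch of that idea. The sample ps output is made up for illustration (the PIDs and user names are hypothetical), and the function only parses text; it does not call preap itself.

```python
# Hypothetical "ps -eaf" output; real output will differ from system to system.
SAMPLE_PS_OUTPUT = """\
     UID   PID  PPID   C    STIME TTY         TIME CMD
    root     0     0   0   Feb 20 ?           0:01 sched
  oracle  4242  4100   0        - ?           0:00 <defunct>
  oracle  4243  4100   0        - ?           0:00 <defunct>
    root  5000  4999   0 10:12:01 pts/1       0:00 grep defunct
"""


def defunct_pids(ps_output):
    """Return the PIDs (second column) of lines whose command is <defunct>."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Requiring the last field to be exactly "<defunct>" skips the header
        # line and the "grep defunct" process, so no "grep -v grep" is needed.
        if len(fields) >= 2 and fields[-1] == "<defunct>" and fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids


if __name__ == "__main__":
    print(defunct_pids(SAMPLE_PS_OUTPUT))
```

The resulting PIDs can then be passed to preap by hand or from a script, with appropriate user privileges.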



[Added on 03/01/2012]

[#5] Solaris: Many TCP listen drops

eg.,

# netstat -sP tcp | grep tcpListenDrop
        tcpListenDrop       =2442553     tcpListenDropQ0     =     0

To alleviate numerous TCP listen drops, bump up the value of the tunable tcp_conn_req_max_q:

# ndd -set /dev/tcp tcp_conn_req_max_q <value>
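
Since the netstat counters are cumulative, a single large number does not say whether drops are still happening. Comparing two samples taken some time apart is more telling. The following Python sketch parses the counter line shown above and computes the difference; the second sample value is hypothetical and the function names are my own.

```python
import re

# First sample is the line from this post; the second one is a made-up
# later reading used only to illustrate the comparison.
SAMPLE_BEFORE = "        tcpListenDrop       =2442553     tcpListenDropQ0     =     0\n"
SAMPLE_AFTER = "        tcpListenDrop       =2442600     tcpListenDropQ0     =     0\n"

# Matches "name = value" pairs such as "tcpListenDrop       =2442553".
COUNTER = re.compile(r"(\w+)\s*=\s*(\d+)")


def parse_counters(text):
    """Turn netstat -s style 'name = value' pairs into a dict of ints."""
    return {name: int(value) for name, value in COUNTER.findall(text)}


def listen_drops_between(before_text, after_text):
    """Number of new tcpListenDrop events between two samples."""
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    return after["tcpListenDrop"] - before["tcpListenDrop"]


if __name__ == "__main__":
    print(parse_counters(SAMPLE_BEFORE))
    print(listen_drops_between(SAMPLE_BEFORE, SAMPLE_AFTER))
```

If the difference stays at zero under load after raising tcp_conn_req_max_q, the new value is probably large enough.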



[Added on 03/02/2012]

[#6] Solaris ZFS: listing all properties and values for a zpool

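Run: zfs get all <zpool_name> as any OS user. This lists the properties of the pool's top-level dataset; pool-level properties can be listed with zpool get all <zpool_name>.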

eg.,

% zpool list
NAME    SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
rpool   276G   167G   109G    60%  ONLINE  -
spec    556G   168G   388G    30%  ONLINE  -

% zfs get all rpool
NAME   PROPERTY              VALUE                  SOURCE
rpool  type                  filesystem             -
rpool  creation              Fri May 27 17:06 2011  -
...
rpool  compressratio         1.00x                  -
rpool  mounted               yes                    -
rpool  quota                 none                   default
rpool  reservation           none                   default
rpool  recordsize            128K                   default
...
rpool  checksum              on                     default
rpool  compression           off                    default
...
rpool  logbias               latency                default
rpool  sync                  standard               default
rpool  rstchown              on                     default



[#7] Solaris: listing all ZFS tunables

Run: echo "::zfs_params" | mdb -k with root/super-user privileges

eg.,

# echo "::zfs_params" | mdb -k
arc_reduce_dnlc_percent = 0x3
zfs_arc_max = 0x10000000
zfs_arc_min = 0x10000000
arc_shrink_shift = 0x5
zfs_mdcomp_disable = 0x0
zfs_prefetch_disable = 0x0
..
..
zio_injection_enabled = 0x0
zvol_immediate_write_sz = 0x8000
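
The values are reported in hexadecimal, which is not very friendly for size-type tunables. A few lines of Python can convert them. This is just a sketch based on the sample lines above; the function name is my own and nothing here talks to mdb.

```python
# A few of the lines from the ::zfs_params output shown in this post.
SAMPLE = """\
arc_reduce_dnlc_percent = 0x3
zfs_arc_max = 0x10000000
zfs_arc_min = 0x10000000
arc_shrink_shift = 0x5
..
zvol_immediate_write_sz = 0x8000
"""


def parse_zfs_params(text):
    """Map each 'name = 0x...' line to an integer; other lines are ignored."""
    params = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            # int() with base 16 accepts the leading "0x" prefix.
            params[name.strip()] = int(value.strip(), 16)
    return params


if __name__ == "__main__":
    params = parse_zfs_params(SAMPLE)
    # Size-type tunables are in bytes, so convert to MB for readability.
    print("zfs_arc_max (MB):", params["zfs_arc_max"] // (1024 * 1024))
    print("zvol_immediate_write_sz (bytes):", params["zvol_immediate_write_sz"])
```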

Tuesday Dec 27, 2011

Oracle Application Testing Suite (OATS): Few Tips & Tricks

OATS is a suite of applications that can be used for performance and scalability testing as well as functional and regression testing. It is a thin client application that runs within a web browser, so the tool is easy to use from anywhere as long as the web server running on the host node is accessible. Hopefully the following tips and tricks will benefit some users of the Oracle Application Testing Suite.

A few technical details first: OATS is a 32-bit Java application that runs in a WebLogic Server (WLS) container, with an Oracle XE database as the backend store for test session data.



[Trick] Issue : OATS software fails to install on 64-bit Windows systems

Resolution:
Download and install 64-bit .NET framework manually before installing the OATS software. Look for .NET framework on Microsoft's downloads website.




[Trick] Issue : OATS software fails to install on systems with large number of [virtual] CPUs

Resolution:
On systems with many cores/vCPUs, Oracle database in general requires large amounts of memory to be configured for SGA - so, one solution would be to allocate as much memory as required. However Oracle XE limits the memory utilization within the database to 1 GB. Besides, XE uses only one CPU even if there are multiple CPUs available on a system. Hence one workaround is to limit the number of vCPUs that the system exposes during the installation of OATS software. The steps are shown below.

  • Start button -> Run -> type "msconfig"
  • Click on Boot tab -> Advanced Options
  • Check "Number of processors" and set appropriate value (I believe we can go up to 16)
  • Reboot Windows
  • Uninstall failed OATS installation and try installing again
  • Undo the above made changes after the successful installation of OATS
  • Reboot Windows one final time

Thanks to my colleague Bao Doan for providing this workaround.




[Trick] Issue : During runtime, OATS drives the load and executes the test as expected but fails to collect runtime statistics

Resolution:
This is another limitation of the Oracle XE database. Oracle 10g XE limits the maximum amount of user data in the database to 4 GB; the limit was raised to 11 GB in Oracle 11g XE. OATS 9.x releases bundle Oracle 10g XE. To take advantage of the larger data limit, install Oracle 11g XE manually before installing the OATS software; the OATS installer gives the option to use an existing installation of Oracle XE. Note that it is not possible to have multiple Oracle XE installations on a single box (that's another XE limitation).

For existing installations, one workaround is to remove old and unwanted sessions to make room for new sessions in the database. Listed below are the steps.

  • Connect to the Oracle Load Testing (OLT) tool
  • Click on "Manage" top-level menu (upper right corner) -> Sessions
  • Click on any unwanted session and press "Delete" button (I recommend deleting one session at a time)



[Trick] Issue : Under load, there are many network timeouts with a large number of sockets in TIME_WAIT state on OATS agent systems, including the OATS Controller node

Resolution:
Tune TCP/IP parameters on Windows as shown below.

  • Launch the Windows Registry Editor (regedit)
  • Navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\TcpIP\parameters
  • Configure the following two parameters. If not found, create those parameters by selecting Edit -> New -> DWORD Value from the menu bar. Select "Decimal" under Base.
      TcpTimedWaitDelay : 30 [seconds]
      MaxUserPort : 65534
  • Reboot Windows

Thanks to my colleagues Dino and Vishnu for sharing this workaround.
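
A back-of-the-envelope calculation shows why these two registry values matter. The Python sketch below is only a rough estimate under stated assumptions: it assumes the dynamic port range starts at 1025, that the side closing the connection keeps the port in TIME_WAIT for the full delay, and that the untuned values are MaxUserPort=5000 and TcpTimedWaitDelay=240 (commonly quoted defaults for older Windows releases; check your own system).

```python
def max_new_connections_per_second(max_user_port, time_wait_seconds, first_port=1025):
    """Rough ceiling on new outbound connections per second from one client IP.

    Each closed connection is assumed to hold its local port in TIME_WAIT for
    time_wait_seconds, so at steady state no more than
    (usable ports / time_wait_seconds) ports become free every second.
    first_port=1025 is an assumption about where the dynamic range starts.
    """
    usable_ports = max_user_port - first_port + 1
    return usable_ports / time_wait_seconds


if __name__ == "__main__":
    # Values suggested in the registry settings above.
    tuned = max_new_connections_per_second(65534, 30)
    # Assumed defaults for older Windows releases (see the note above).
    untuned = max_new_connections_per_second(5000, 240)
    print("tuned:   about %d new connections/sec" % tuned)
    print("untuned: about %d new connections/sec" % untuned)
```

Real limits depend on many other factors, but the difference of roughly two orders of magnitude explains why a load agent can run out of ports quickly with the untuned values.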




[Trick] Issue : OATS Controller does not show any graphs or analysis reports

Resolution:
Install Adobe Flash Plugin and try again.




[Trick] Issue : Under load, OATS Controller stops collecting runtime statistics at some random point

Resolution:
Check Oracle database alert log for some clue(s). If there is an error message such as "ORA-12516: TNS:listener could not find available handler with matching protocol stack", connect to the database, query v$resource_limit view and compare the values reported under CURRENT_UTILIZATION and MAX_UTILIZATION for the resource "processes". If the current utilization is pretty close to the configured maximum value, raise the value for processes parameter in [S]PFILE.
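
The comparison itself is simple enough to script if you check it often. Below is a Python sketch of the check, given one row of v$resource_limit as a dictionary. The sample numbers and the 90% threshold are hypothetical; pick whatever "pretty close" means in your environment.

```python
def is_near_limit(row, threshold=0.9):
    """Return True if utilization is at or above threshold * LIMIT_VALUE.

    row is a dict holding the CURRENT_UTILIZATION, MAX_UTILIZATION and
    LIMIT_VALUE columns of v$resource_limit. LIMIT_VALUE is handled as text
    because it can also read UNLIMITED.
    """
    limit = str(row["LIMIT_VALUE"]).strip()
    if limit.upper() == "UNLIMITED":
        return False
    peak = max(row["CURRENT_UTILIZATION"], row["MAX_UTILIZATION"])
    return peak >= threshold * int(limit)


if __name__ == "__main__":
    # Hypothetical row for the "processes" resource.
    processes = {
        "RESOURCE_NAME": "processes",
        "CURRENT_UTILIZATION": 38,
        "MAX_UTILIZATION": 40,
        "LIMIT_VALUE": "        40",
    }
    print(is_near_limit(processes))
```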




[Tip] Balancing the load among multiple OATS agent systems

One simple way is to create a VU Agent System Group based on the available agent systems. Steps listed below.

  • Connect to the Oracle Load Testing (OLT) tool
  • Click on "Manage" top-level menu (upper right corner) -> Systems
  • Click on "VU Agent System Group" in the left hand side
  • On the right hand side, click on "New" option
  • Select all the agent systems that you want to be part of the "VU Agent System Group"
  • Finally name the newly created system group and save

Note that it is not possible to attach weights to the agent systems - so, it is suggested to have agent systems with similar hardware configurations in the VU Agent System Group.



[Tip] Balancing the load among multiple web servers using OATS Controller

If there are multiple web server instances running in an enterprise application deployment, and OATS software is being used to test the performance and scalability of the application, parameterizing the web server hostname and port number in the OATS test script will take care of the web server load balancing problem. Of course there are many alternatives to this approach, such as using a hardware load balancer or a web server reverse proxy.



[Added on 01/19/2012]

[Tip] How to check the available space in the USERS tablespace?

Run the following on OATS Controller node:

Start -> All Programs -> Oracle Database XX Express Edition -> Run SQL Command Line

SQL> connect / as sysdba

SQL> SELECT /* + RULE */  df.tablespace_name "Tablespace",
       df.bytes / (1024 * 1024) "Size (MB)",
       SUM(fs.bytes) / (1024 * 1024) "Free (MB)",
       Nvl(Round(SUM(fs.bytes) * 100 / df.bytes),1) "% Free",
       Round((df.bytes - SUM(fs.bytes)) * 100 / df.bytes) "% Used"
  FROM dba_free_space fs,
       (SELECT tablespace_name,SUM(bytes) bytes
          FROM dba_data_files
         GROUP BY tablespace_name) df
 WHERE fs.tablespace_name (+)  = df.tablespace_name
 GROUP BY df.tablespace_name,df.bytes
UNION ALL
SELECT /* + RULE */ df.tablespace_name tspace,
       fs.bytes / (1024 * 1024),
       SUM(df.bytes_free) / (1024 * 1024),
       Nvl(Round((SUM(fs.bytes) - df.bytes_used) * 100 / fs.bytes), 1),
       Round((SUM(fs.bytes) - df.bytes_free) * 100 / fs.bytes)
  FROM dba_temp_files fs,
       (SELECT tablespace_name,bytes_free,bytes_used
          FROM v$temp_space_header
         GROUP BY tablespace_name,bytes_free,bytes_used) df
 WHERE fs.tablespace_name (+)  = df.tablespace_name
 GROUP BY df.tablespace_name,fs.bytes,df.bytes_free,df.bytes_used
 ORDER BY 4 DESC;

Copy/paste the above SQL code into a plain text file with the .sql extension and run it from the SQL> command prompt. eg., assuming the code was saved in a file called chktblspcusg.sql under the C:\ drive, execute the SQL script as shown below:

SQL> @C:\chktblspcusg.sql




[Added on 06/27/2012]

[Trick] Issue : An attempt to open a test script in OpenScript fails with error

'Failed to open script' has encountered a problem.
Failed to open <script_name>. See error log for details.

Clicking on "Details" button provides the following clue.

The project description file (.project) for '<script_name>' is missing

In addition the title bar shows "Relocating Eclipse Projects: The project description file (.project) for XXX is missing".

Resolution:
Navigate to C:\Documents and Settings\Administrator\osworkspace\.metadata\.plugins\org.eclipse.core.resources\.projects\

Look for the directory by name "<failing_script_name>" and remove it



[Added on 08/03/2012]

[Trick] Issue: Unexpected Agent exit. Code = 51 in the middle of an OLT load test

When running a load scenario in Oracle Load Testing (OLT) that uses a databank, the scenario runs fine for some time and then all of a sudden fails with the following error: Unexpected Agent exit. Code = 51.

Workaround:

The following settings may alleviate the issue.

  • toggle/experiment with the settings for "Clear cache between iterations" and "Clear cache before playing back"
    • those settings can be found under the test script preferences -> Playback -> Web Functional -> Miscellaneous
  • experiment with different values for "Maximum users per process" setting
    • this setting is under OLT -> Configure all parameters -> Advanced
  • increase the Java heap size (both min & max) in file <OATS_HOME>\agentmanager\bin\AgentManagerService.conf
    • default values: min heap size: 16 MB; max heap size: 64 MB

Contributors: John Snyder, Richard Barry

[Added 02/25/13]

Another colleague Dave Suri has an alternate tip to resolve the Agent 51 issue.

Edit <OATS_HOME>\agentmanager\processDescriptors\JavaAgent.properties

Change the following lines:

#process.debug=y
#process.debug.suspend=y
#process.debug.port=8123
#process.debug.custom=

To:

process.debug=y
#process.debug.suspend=y
#process.debug.port=8123
process.debug.custom=-verbose:gc -XX:+HeapDumpOnOutOfMemoryError -Xms512M -Xmx1536M -jrockit -Xrs -XgcPrio:deterministic -XpauseTarget=50ms -XX:+UseCallProfiling -XX:+UseAdaptiveFatSpin -XX:+ExitOnOutOfMemoryError -XXnoSystemGC -XX:+UseFastTime


Tuesday Dec 13, 2011

Solaris Tip: Resolving "statd: cannot talk to statd at <target_host>, RPC: Timed out(5)"

Symptom:

System log shows a bunch of RPC timed out messages such as the following:


Dec 13 09:23:23 gil08 last message repeated 1 time
Dec 13 09:29:14 gil08 statd[19858]: [ID 766906 daemon.warning] statd: cannot talk to statd at ssc23, RPC: Timed out(5)
Dec 13 09:35:05 gil08 last message repeated 1 time
Dec 13 09:40:56 gil08 statd[19858]: [ID 766906 daemon.warning] statd: cannot talk to statd at ssc23, RPC: Timed out(5)
..

Those messages are the result of an apparent communication failure between the status daemons (statd) of both local and remote hosts using RPC calls.
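
On a busy system it may not be obvious which remote hosts are involved. The following Python sketch counts the hosts named in such warnings. The sample lines are the ones shown above; the function name and the regular expression are my own illustration.

```python
import re
from collections import Counter

# Sample syslog lines copied from this post.
SYSLOG = """\
Dec 13 09:23:23 gil08 last message repeated 1 time
Dec 13 09:29:14 gil08 statd[19858]: [ID 766906 daemon.warning] statd: cannot talk to statd at ssc23, RPC: Timed out(5)
Dec 13 09:35:05 gil08 last message repeated 1 time
Dec 13 09:40:56 gil08 statd[19858]: [ID 766906 daemon.warning] statd: cannot talk to statd at ssc23, RPC: Timed out(5)
"""

# Capture the host name that follows "cannot talk to statd at".
PATTERN = re.compile(r"statd: cannot talk to statd at ([^,\s]+), RPC: Timed out")


def unreachable_statd_hosts(text):
    """Count the warnings per target host ("last message repeated" lines are not counted)."""
    return Counter(PATTERN.findall(text))


if __name__ == "__main__":
    for host, count in unreachable_statd_hosts(SYSLOG).items():
        print(host, count)
```

Each host reported this way is a candidate entry to look for under /var/statmon/sm.bak.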

Workaround/Solution:

If the target_host is reachable, execute the following to stop the system from generating those warning messages: stop the network status monitor, remove the target host entry from the /var/statmon/sm.bak directory, and start the network status monitor again. Removing the target host entry from sm.bak keeps that machine from being aware that it may have to participate in locking recovery.

eg.,


# ps -eaf | fgrep statd 
  daemon 14304 19622   0 09:47:16 ?           0:00 /usr/lib/nfs/statd
    root 14314 14297   0 09:48:03 pts/15      0:00 fgrep statd

# svcs -a | grep "nfs/status"
online          9:52:41 svc:/network/nfs/status:default

# svcadm -v disable nfs/status
svc:/network/nfs/status:default disabled.

# ls /var/statmon/sm.bak
ssc23

# rm /var/statmon/sm.bak/ssc23

# svcadm -v enable nfs/status
svc:/network/nfs/status:default enabled.

Friday Nov 18, 2011

Siebel Troubleshooting : An ODBC error occurred; SBL-GEN-03006: Error calling function: DICFindTable m_pReqTbl

Symptom:

A newly installed Siebel application server fails to start despite successful ODBC connectivity to the database. SRProc process logs ODBC error messages similar to the following:


Message: GEN-13,
 Additional Message: dict-ERR-1109: 
       Unable to read value from export file (Data length (32) > Column definition (3)).

Message: GEN-13,
 Additional Message: dict-ERR-1107: Unable to read row 0 from export file (UTLDataValRead pBuf, col 4 ).

GenericLog  GenericError  1     0002157..  11-11-18 13:28  Message: Generated SQL statement:,
 Additional Message: SQLFetch:
   SELECT RDOBJ.DOCK_ID, RDOBJ.RELATED_DOCK_ID, RDOBJ.SQL_STATEMENT, RDOBJ.CHECK_VISIBILITY,
          'N', RDOBJ.COMMENTS, RDOBJ.ACTIVE, RDOBJ.SEQUENCE, RDOBJ.VIS_STRENGTH,
          RDOBJ.REL_VIS_STRENGTH, RDOBJ.VIS_EVT_COLS
     FROM ORAPERF.S_DOCK_REL_DOBJ RDOBJ, ORAPERF.S_DOCK_OBJECT DOBJ
    WHERE RDOBJ.REPOSITORY_ID = (SELECT ROW_ID FROM ORAPERF.S_REPOSITORY WHERE NAME = ?)
      AND DOBJ.ROW_ID = RDOBJ.DOCK_ID
      AND (DOBJ.INACTIVE_FLG = 'N' OR DOBJ.INACTIVE_FLG IS NULL)
      AND (RDOBJ.INACTIVE_FLG = 'N' OR RDOBJ.INACTIVE_FLG IS NULL)

Message: Error: An ODBC error occurred,
 Additional Message: Function: DICGetRDObjects; ODBC operation: SQLFetch

Message: GEN-13,
 Additional Message: dict-ERR-1109: Unable to read value from export file (UTLCompressFRead (fseek)).

Message: GEN-13,
 Additional Message: dict-ERR-1107: Unable to read row 0 from export file (UTLDataValRead pBuf, col 0 ).

Message: GEN-10,
 Additional Message: Calling Function: DICLoadDObjectInfo; Called Function: Calling DICGetRDObjects

Message: GEN-10,
 Additional Message: Calling Function: DICLoadDict; Called Function: DICLoadDObjectInfo

GenericError
(srpdb.cpp (860) err=3006 sys=2) SBL-GEN-03006: Error calling function: DICFindTable m_pReqTbl
(srpsmech.cpp (74) err=3006 sys=0) SBL-GEN-03006: Error calling function: DICFindTable m_pReqTbl
(srpmtsrv.cpp (107) err=3006 sys=0) SBL-GEN-03006: Error calling function: DICFindTable m_pReqTbl
(smimtsrv.cpp (1203) err=3006 sys=0) SBL-GEN-03006: Error calling function: DICFindTable m_pReqTbl
SmiLayerLog Error       Terminate process due to unrecoverable error: 3006. (Main Thread)

An inconsistent or corrupted dictionary file "diccache.dat" is likely the cause.

Solution:

  • Stop the application server and manually kill the remaining Siebel application specific processes

    eg.,

    stop_server all
    
    pkill siebmtsh
    pkill siebproc
    ..
    
  • Remove $SIEBEL_HOME/bin/diccache.dat file. It will be re-generated during the application server startup

  • Start the application server
    start_server all
    

Monday Oct 10, 2011

Oracle Database on NFS : Resolving "ORA-27086: unable to lock file - already in use" Error

Some Context

The Oracle database was hosted on a ZFS Storage Appliance (NAS), with the database files accessible from the database server node via NFS-mounted filesystems. Solaris 10 is the operating system on the DB node.

Someone forgot to shut down the database instance and unmount the remote filesystems before rebooting the database server node. After the system booted up, Oracle RDBMS failed to bring up the database due to locked-out data files.

eg.,

SQL> startup
ORACLE instance started.

Total System Global Area 1.7108E+10 bytes
Fixed Size		    2165208 bytes
Variable Size		 9965671976 bytes
Database Buffers	 6845104128 bytes
Redo Buffers		  295329792 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/orclvol4/entDB/system01.dbf'

======================
Extract from alert log:
======================

...
ALTER DATABASE OPEN
Fri Aug 05 21:30:54 2011
Errors in file /oracle112/diag/rdbms/entdb/entDB/trace/entDB_dbw0_7235.trc:
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/orclvol4/entDB/system01.dbf'
ORA-27086: unable to lock file - already in use
SVR4 Error: 11: Resource temporarily unavailable
Additional information: 8
Additional information: 21364
Errors in file /oracle112/diag/rdbms/entdb/entDB/trace/entDB_dbw0_7235.trc:
ORA-01157: cannot identify/lock data file 2 - see DBWR trace file
ORA-01110: data file 2: '/orclvol4/entDB/sysaux01.dbf'
ORA-27086: unable to lock file - already in use
SVR4 Error: 11: Resource temporarily unavailable
Additional information: 8
Additional information: 21364
...

Reason for the lock failure:

Because of the sudden, ungraceful shutdown of the database, the file locks on the data files were not released by the NFS server (the ZFS SA in this case). The NFS server held on to the file locks even after the NFS client (the DB server node in this example) was restarted. As a result, Oracle RDBMS could not lock the data files residing on the NFS server, and the database instance failed to start up in exclusive mode.
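
To see quickly which data files are affected, the alert log can be scanned for ORA-01110 lines that are followed by ORA-27086. Here is a minimal Python sketch of that idea, using lines from the alert log extract above. The function name and the pairing logic are my own illustration, not an Oracle-provided tool.

```python
import re

# Lines taken from the alert log extract in this post.
ALERT_LOG = """\
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/orclvol4/entDB/system01.dbf'
ORA-27086: unable to lock file - already in use
SVR4 Error: 11: Resource temporarily unavailable
Additional information: 8
Additional information: 21364
ORA-01157: cannot identify/lock data file 2 - see DBWR trace file
ORA-01110: data file 2: '/orclvol4/entDB/sysaux01.dbf'
ORA-27086: unable to lock file - already in use
"""

DATA_FILE = re.compile(r"ORA-01110: data file (\d+): '([^']+)'")


def locked_data_files(text):
    """Return (file number, path) for each data file reported with ORA-27086."""
    locked = []
    current = None  # most recent data file named by an ORA-01110 line
    for line in text.splitlines():
        match = DATA_FILE.search(line)
        if match:
            current = (int(match.group(1)), match.group(2))
        elif line.strip().startswith("ORA-27086") and current is not None:
            locked.append(current)
            current = None
    return locked


if __name__ == "__main__":
    for number, path in locked_data_files(ALERT_LOG):
        print(number, path)
```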

Workaround

Manually clear the NFS locks as outlined below.

On NFS Client (database server node):

  1. Shut down the mounted database
  2. Unmount remote (NFS) filesystems
  3. Execute: clear_locks -s <nfs_server_host>

    eg.,

    # clear_locks -s sup16
    Clearing locks held for NFS client ipsedb1 on server sup16
    clear of locks held for ipsedb1 on sup16 returned success
    

On NFS Server (ZFS SA):
    (this step may not be necessary but wouldn't hurt to perform)

  1. Execute: clear_locks <nfs_client_host>

    eg.,

    sup16# clear_locks 10.129.207.93
    Clearing locks held for NFS client 10.129.207.93 on server sup16
    clear of locks held for 10.129.207.93 on sup16 returned success
    

Again back on NFS Client (database server node):

  1. Restart NFS client
        (this step may not be necessary but wouldn't hurt to perform)
    # svcadm -v disable nfs/client
    # svcadm -v enable nfs/client
    
  2. Mount remote/NFS filesystems
  3. Finally start the database

Also see:
Listing file locks on Solaris 10

Thursday Oct 06, 2011

Siebel Connection Broker Load Balancing Algorithm

Siebel server architecture supports spawning multiple application object manager processes. The Siebel Connection Broker, SCBroker, tries to balance the load (incoming requests) across different object manager processes running in a single Siebel server.

Least Loaded or Round Robin?

By default, SCBroker forwards each incoming request to the object manager process that is least loaded, meaning the process with the fewest running tasks. In Siebel terminology, this behavior is referred to as the "least-loaded" or "LL" connection forwarding algorithm. While the default LL algorithm provides optimal behavior in the best case, it may lead to serious availability problems if one of several object manager processes running in a Siebel server stops responding in a timely fashion [for some reason]. Such an object manager may still accept requests even though they may time out. At some point, the unresponsive/hung or erroneous object manager will have the fewest tasks, which may prompt the SCBroker component to forward new incoming requests to that object manager process - which in turn leads to a stalemate. To avoid such situations, it is recommended to configure the "round-robin" or "RR" algorithm in the SCBroker component. When the round-robin algorithm is configured, SCBroker ignores the number of running tasks per object manager process and routes requests to all object managers in a round-robin fashion.

While both algorithms have their strengths and weaknesses, customers must weigh both options and choose the one that fits best in their deployment.
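
The difference between the two algorithms can be illustrated with a toy simulation. The Python sketch below is not how SCBroker is implemented; it is a deliberately simplified model with hypothetical object manager names (om1, om2, om3) and task counts, where om3 is hung, its requests time out, and so its running task count never grows.

```python
from itertools import cycle


def least_loaded(running_tasks):
    """Pick the object manager with the fewest running tasks (LL)."""
    return min(running_tasks, key=running_tasks.get)


def failed_requests(algorithm, requests, hung="om3"):
    """Count requests routed to the hung object manager in a toy model.

    Simplifying assumptions: healthy object managers hold a steady number of
    running tasks, and every request sent to the hung one times out, so its
    task count stays at zero.
    """
    running_tasks = {"om1": 4, "om2": 5, "om3": 0}  # hypothetical load
    round_robin = cycle(sorted(running_tasks))
    failed = 0
    for _ in range(requests):
        if algorithm == "LL":
            target = least_loaded(running_tasks)
        else:  # "RR"
            target = next(round_robin)
        if target == hung:
            failed += 1
    return failed


if __name__ == "__main__":
    print("LL:", failed_requests("LL", 9), "of 9 requests hit the hung process")
    print("RR:", failed_requests("RR", 9), "of 9 requests hit the hung process")
```

In this model LL sends every request to the hung process, while RR limits the damage to its share of the rotation. On the other hand, RR will not favor a lightly loaded process when all of them are healthy, which is the trade-off to weigh.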

eg.,

Find the current load balancing algorithm:

srvrmgr>  list advanced param ConnForwardAlgorithm for comp SCBroker \
             show PA_ALIAS, PA_VALUE, PA_NAME

PA_ALIAS              PA_VALUE  PA_NAME                                    
--------------------  --------  -----------------------------------------  
ConnForwardAlgorithm  LL        Connection Forward algorithm for SCBroker

Configure SCBroker to use round-robin algorithm:

srvrmgr> change param ConnForwardAlgorithm=RR for comp SCBroker server SERVER_NAME
Command completed successfully.

srvrmgr> list advanced param ConnForwardAlgorithm for comp SCBroker \
            show PA_ALIAS, PA_VALUE, PA_NAME

PA_ALIAS              PA_VALUE  PA_NAME                                    
--------------------  --------  -----------------------------------------  
ConnForwardAlgorithm  RR        Connection Forward algorithm for SCBroker

Other SCBroker parameters of interest: ConnForwardTimeout and ConnRequestTimeout
