FMW Diagnostic Framework: Automatic Capture of Diagnostic Data Upon First Failure!

Introduction

There is nothing more frustrating than a problem that "cannot be reproduced". Logs and configuration files have been analysed, but there just isn't enough information to establish the root cause. The issue may be closed, but you are left with the feeling that the problem will rear its ugly head again in the future. Trouble is, to resolve such issues you need to capture diagnostic data at the exact time the incident occurs. Step forward Fusion Middleware Diagnostic Framework!

Diagnostic Framework monitors WebLogic Managed Servers and delivers "Automatic capture of diagnostic data upon first failure". To quote from

Oracle Fusion Middleware Administrator's Guide 11g Release 1 (11.1.1)
Chapter 13 Diagnosing Problems

"When a critical error occurs ... the Diagnostic Framework automatically collects diagnostics, such as thread dumps, DMS metric dumps, and WebLogic Diagnostics Framework (WLDF) server image dumps ... The data is stored in a file-based repository and is accessible with command-line utilities."

In other words, the data collected upon first failure, especially the thread and image dumps, provides a snapshot of the system as, or immediately after, the problem occurs. The table below shows the types of WebLogic Server issues which fall into the scope of Diagnostic Framework.

How to Configure Diagnostic Framework?

Depending on your Fusion Middleware product choice you may not need to do anything! Diagnostic Framework is automatically installed, configured and initiated for any WebLogic Domain which has the Oracle Java Required Files (JRF) template applied. This template is applied by default whenever you configure WebLogic Managed Servers for products such as

  • Portal / Forms / Reports / Discoverer
  • Identity Management (OID, OAM, OIM, etc.)
  • WebCenter
  • SOA

Check your WebLogic Domain directory structure. If you have an "adr" sub directory under

DOMAIN_HOME/servers/<servername>/

then the JRF template has been applied and Diagnostic Framework will be in play.
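The directory check above is easy to script if you have several servers to look at. Below is a minimal Python sketch, assuming the standard DOMAIN_HOME/servers/<servername>/ layout described above. The function name servers_with_adr is hypothetical and not part of any Oracle tooling.

```python
import os


def servers_with_adr(domain_home):
    """Map each directory under DOMAIN_HOME/servers to True if it has an 'adr' sub directory."""
    servers_dir = os.path.join(domain_home, "servers")
    result = {}
    if not os.path.isdir(servers_dir):
        # Not a domain directory, or no server has been created yet.
        return result
    for name in sorted(os.listdir(servers_dir)):
        server_dir = os.path.join(servers_dir, name)
        if os.path.isdir(server_dir):
            result[name] = os.path.isdir(os.path.join(server_dir, "adr"))
    return result
```

Call it with the path to your domain, for example servers_with_adr("/oracle/middleware/user_projects/domains/mydomain"). Any server reported as False is a candidate for the My Oracle Support article mentioned next. Note that the servers directory may also hold directories which are not servers, so treat the result as a hint rather than proof.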

Should the "adr" sub directory not exist, review the advice given in My Oracle Support article

How to Apply FMW (EM) Control and JRF to a WebLogic Domain and Managed Servers [ID 947043.1]

If you are working with a standalone WebLogic Server solution and applying Oracle JRF is not acceptable, consider using WLDF, the WebLogic Diagnostic Framework. (Fusion Middleware Diagnostic Framework makes use of WLDF under the covers.) A couple of useful links about WLDF are listed below

How to Get Started With Diagnostic Framework

To be frank, the Fusion Middleware Administrator's Guide is the best place to start your learning

Oracle Fusion Middleware Administrator's Guide 11g Release 1 (11.1.1)
Chapter 13 Diagnosing Problems

There is a lot of reading here, but if you are in a hurry and just want to get the right information to Oracle Support to help resolve your issue, check out the next section.

How to Upload Diagnostic Framework Incident Data to Oracle Support

Some Background Information

There are three interfaces to the Repository:

  1. Enterprise Manager Cloud Control (Support Workbench)
  2. WLST (Command Line)
  3. ADRCI (Command Line)

Enterprise Manager Cloud Control provides a nice GUI to search, view and package Diagnostic Framework incidents. However, this software is not to be confused with Fusion Middleware (EM) Control. Cloud Control (formerly known as Grid Control) is part of the Enterprise Manager media package and has its own install and configuration story. Therefore, for the benefit of those yet to install and play with Cloud Control, I am going to describe how to use the command line tools.

Ideally, you would only need one command line interface, but currently I suggest using both, mainly because ADRCI SHOW INCIDENTS does not reveal the description behind the Diagnostic Framework error code.

Instructions:

Note:

WLST and ADRCI are case sensitive when it comes to handling parameter values. If you make a mistake, expect an unfriendly syntax error message.

1) Find the incident

Note:

The managed server which you are troubleshooting must be up and running. If the managed server is down, ensure the domain's Admin Server is accessible. If you cannot connect to the Admin Server or the Managed Server the example WLST commands will not work.

a) Launch WLST 

Note: Use the WLST which resides in the "oracle_common" directory (not WL_HOME/common/bin). Otherwise the Diagnostic Framework commands are not loaded and you will get an error like the one below

Traceback (innermost last):
  File "<console>", line 1, in ?
NameError: listIncidents

The launcher to use is

MW_HOME/oracle_common/common/bin/wlst.sh

b) Connect to the managed server or the admin server e.g.

wls:/offline> connect('weblogic','welcome1','t3://localhost:7020')

c) Run the command

wls:/mydomain/serverConfig> listIncidents()

This will list the incidents for the server to which you have connected. If you have connected to the Admin Server and want to list the incidents for a managed server within the domain, use the command

wls:/mydomain/serverConfig> listIncidents(adrHome='diag\ofm\mydomain\mymanagedserver',server='mymanagedserver')

Example output

Incident Id     Problem Key              Incident Time
        1       DFW-99998 [java.lang.NullPointerException][oracle.error.simulator.ErrorSimulator.createNullPointerException][errorWebApp_1-0-0-0]        Fri Nov 02 10:38:46 GMT 2012

The text which follows the DFW-99998 error code in the Problem Key column is the description you do not see when using the ADRCI SHOW INCIDENTS command.

Make a note of the incident id. You are ready to move to step 2.
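If you save the listIncidents() output and want to pull the error code and its description apart, something like the following Python sketch may help. It assumes every problem key has the shape seen in the example output above (an error code such as DFW-99998 followed by bracketed parts). That shape is inferred from this one example, so other problem keys may not match. The name split_problem_key is hypothetical.

```python
import re

# An error code such as "DFW-99998", then zero or more [bracketed] parts.
# Whitespace (including a wrapped line) between the parts is tolerated.
_PROBLEM_KEY = re.compile(r"^\s*([A-Z]+-\d+)\s*((?:\[[^\]]*\]\s*)*)$")


def split_problem_key(problem_key):
    """Split a problem key into (error_code, [description parts])."""
    match = _PROBLEM_KEY.match(problem_key)
    if match is None:
        raise ValueError("unrecognised problem key: %r" % problem_key)
    error_code = match.group(1)
    # Extract the text inside each pair of square brackets, in order.
    parts = re.findall(r"\[([^\]]*)\]", match.group(2))
    return error_code, parts
```

For the example incident, the first bracketed part is the exception class, which is exactly the piece of information ADRCI leaves out.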

2) Package the incident

a) Set up the environment - example commands below are for Unix

cd <DOMAIN_HOME>/bin
. ./setDomainEnv.sh

LD_LIBRARY_PATH=$WL_HOME/server/adr; export LD_LIBRARY_PATH

If you want ADRCI to run a Remote Diagnostic Agent (RDA) collection (recommended) at generate package time, point ORACLE_HOME at oracle_common

ORACLE_HOME=$MW_HOME/oracle_common; export ORACLE_HOME

To prevent ADRCI from running RDA at generate package time, point ORACLE_HOME at the WL_HOME/server/adr directory.

ORACLE_HOME=$WL_HOME/server/adr; export ORACLE_HOME
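The two ORACLE_HOME choices above can be captured in one small helper if you plan to drive ADRCI from a script. This is a minimal Python sketch which only mirrors the export commands shown above. The function name adrci_environment and its run_rda flag are hypothetical, and Unix-style paths are assumed.

```python
import posixpath


def adrci_environment(mw_home, wl_home, run_rda=True):
    """Return the environment variables described above for running adrci.

    run_rda=True points ORACLE_HOME at oracle_common (RDA collection runs);
    run_rda=False points it at WL_HOME/server/adr (RDA collection is skipped).
    """
    adr_dir = posixpath.join(wl_home, "server", "adr")
    if run_rda:
        oracle_home = posixpath.join(mw_home, "oracle_common")
    else:
        oracle_home = adr_dir
    return {"LD_LIBRARY_PATH": adr_dir, "ORACLE_HOME": oracle_home}
```

The returned dictionary can be merged into a copy of os.environ before launching adrci. You still need the variables set by setDomainEnv.sh, which this sketch does not try to reproduce.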

b) Launch adrci

$WL_HOME/server/adr/adrci

c) Set BASE and HOMEPATH

adrci> SET BASE /oracle/middleware/user_projects/domains/mydomain/servers/mymanagedserver/adr
adrci> SET HOMEPATH diag/ofm/mydomain/mymanagedserver
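Both values follow a fixed pattern: BASE is the server's "adr" directory and HOMEPATH is diag/ofm/<domain>/<server>. The Python sketch below derives them from the domain home, domain name and server name, going by the example above. The function names are hypothetical and Unix-style paths are assumed.

```python
import posixpath


def adr_paths(domain_home, domain_name, server_name):
    """Return (base, homepath) for a server, following the pattern in the example above."""
    base = posixpath.join(domain_home, "servers", server_name, "adr")
    homepath = posixpath.join("diag", "ofm", domain_name, server_name)
    return base, homepath


def adrci_set_commands(domain_home, domain_name, server_name):
    """Return the SET BASE and SET HOMEPATH commands as a list of strings."""
    base, homepath = adr_paths(domain_home, domain_name, server_name)
    return ["SET BASE " + base, "SET HOMEPATH " + homepath]
```

Remember that ADRCI is case sensitive about parameter values, so pass the domain and server names exactly as they appear on disk.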

d)  Optionally run SHOW INCIDENTS e.g.

adrci> SHOW INCIDENTS -MODE DETAIL
ADR Home = /oracle/middleware/user_projects/domains/mydomain/servers/mymanagedserver/adr/diag/ofm/mydomain/mymanagedserver:
*************************************************************************

**********************************************************
INCIDENT INFO RECORD 1
**********************************************************
   INCIDENT_ID                   1
   STATUS                        ready
   CREATE_TIME                   2012-11-02 10:38:46.468000 +00:00
   PROBLEM_ID                    1
   CLOSE_TIME                    <NULL>
   FLOOD_CONTROLLED              none
   ERROR_FACILITY                DFW
   ERROR_NUMBER                  99998
   ERROR_ARG1                    <NULL>
   ERROR_ARG2                    <NULL>
   ERROR_ARG3                    <NULL>
   ERROR_ARG4                    <NULL>
   ERROR_ARG5                    <NULL>
   ERROR_ARG6                    <NULL>
   ERROR_ARG7                    <NULL>
   ERROR_ARG8                    <NULL>
   ERROR_ARG9                    <NULL>
   ERROR_ARG10                   <NULL>
   ERROR_ARG11                   <NULL>
   ERROR_ARG12                   <NULL>
   SIGNALLING_COMPONENT          <NULL>
   SIGNALLING_SUBCOMPONENT       <NULL>
   SUSPECT_COMPONENT             <NULL>
   SUSPECT_SUBCOMPONENT          <NULL>
   ECID                          5162744c6a2eea5e:155ff445:13ac0aae7cb:-8000-0000000000000325
   IMPACTS                       0
1 rows fetched

e)  Create a logical package

IPS CREATE PACKAGE INCIDENT incident_number

e.g.

adrci> IPS CREATE PACKAGE INCIDENT 1
Created package 1 based on incident id 1, correlation level typical

f) Generate the package

IPS GENERATE PACKAGE package_number IN path

e.g.

adrci> IPS GENERATE PACKAGE 1 IN /tmp
Generated package 1 in file /tmp/DFW99998j_20121102113633_COM_1.zip, mode complete

Note:

If the generate package command hangs, ADRCI may be experiencing an issue when running RDA. To avoid such trouble, exit ADRCI, point the ORACLE_HOME environment variable at WL_HOME/server/adr and then launch ADRCI again.
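If you package incidents often, the interactive steps above could be scripted. The Python sketch below assumes that the adrci shipped under WL_HOME/server/adr accepts the same exec="command; command" option as the database ADRCI. I have not confirmed this for every Fusion Middleware release, so treat it as an assumption. All function names are hypothetical. The two parse functions rely only on the message formats shown in the example output above.

```python
import re
import subprocess


def exec_argument(commands):
    """Join ADRCI commands into a single exec= argument."""
    return "exec=" + "; ".join(commands)


def parse_package_id(output):
    """Return the package number from 'Created package N ...', or None if absent."""
    match = re.search(r"Created package (\d+)", output)
    return int(match.group(1)) if match else None


def parse_package_file(output):
    """Return the zip path from 'Generated package N in file PATH, ...', or None if absent."""
    match = re.search(r"Generated package \d+ in file (\S+?),", output)
    return match.group(1) if match else None


def run_adrci(adrci_path, commands, env=None):
    """Run adrci once with the given commands and return its output as text."""
    completed = subprocess.run(
        [adrci_path, exec_argument(commands)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        env=env,
        check=True,
    )
    return completed.stdout
```

Because the package number is only known after IPS CREATE PACKAGE has run, a script would call run_adrci twice: once with SET BASE, SET HOMEPATH and IPS CREATE PACKAGE INCIDENT, then, after parse_package_id, again with SET BASE, SET HOMEPATH and IPS GENERATE PACKAGE.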

3) Upload the package zip to Oracle Support via your Service Request

a) Log into My Oracle Support and locate your Service Request

b) Click on "Add Attachments"


c) And upload the zip file


Comments:

Nice :)

and thank you for sharing.

Posted by guest on November 06, 2012 at 11:27 AM GMT #

In your post in the following section "How to Upload Diagnostic Framework Incident Data to Oracle Support" you have spoken of EMCC as a support tool.
could you please let me know how to integrate it with EMCC

thanks.

Posted by guest on August 20, 2013 at 12:16 PM BST #

See the following documentation on configuring Enterprise Manager Cloud Control 12c to manage Oracle Fusion Middleware targets.

Oracle Enterprise Manager Cloud Control Getting Started with Oracle Fusion Middleware Management 12c Release 3 (12.1.0.3)
http://docs.oracle.com/cd/E24628_01/install.121/e24215/toc.htm

I guess you have already looked at this.

I have to admit I have not yet had the opportunity to install and configure EM Cloud Control 12c. I spoke to a colleague who has and reports back that there are no special steps for configuring Support Workbench.

The problem, at present, however is that the documentation does not provide much more than a paragraph or two on the subject of support workbench e.g. "3.9 Support Workbench" in document referenced above.

I would like to see half or a whole chapter which describes in detail how to use the Support Workbench features, including how to realise the benefits of its integration with Diagnostic Framework. I raised this concern with the EM Cloud Control product managers and they have logged a documentation bug, and promised to explore the idea of creating a video or presentation which can be exposed via the Oracle Learning Library. Hopefully we will see some useful material emerge in the coming months.

Sorry, I cannot give you more insights in the short term.

Posted by Dan Mortimer on August 23, 2013 at 03:27 PM BST #

About

This is the blog of the Oracle Fusion Middleware Proactive Support Delivery Team. Here we will provide information about our activities, publications, product related information and more. Feedback welcome.
