Wednesday Apr 03, 2013

Diagnostics Enhancements in SOA Suite 11g PS6 (11.1.1.7)

What's new with Diagnostics in PS6?



Interval Sampling

Purpose: To collect Diagnostic Dumps at specified intervals for as long as they need to be collected.
Activation: By default there are samplings configured for the Diagnostic Dumps 'jvm.threads' and 'jvm.histogram' but they will not begin collecting samples until an Incident is generated on the server. New samples can be started and managed through WLST.

NOTE: To enable dump sampling on a healthy server that has not yet generated a DFW Incident you must activate dump sampling for a healthy server. Steps:
  1. Login in the EM Console
  2. Right click on the domain name in the left menu and select 'System MBean Browser'
  3. Navigate to the MBean: '‘Application Defined MBeans’ -> ‘oracle.dfw’ -> ‘Domain: <domain>’ -> ‘oracle.dfw.jmx.DiagnosticsConfigMbean’
  4. Select the 'DiagnosticsConfig' mbean
  5. Change the value of the 'DumpSamplingIdleWhenHealthy' attribute to 'false'
  6. Click 'Apply'. No restart is necessary for the change to take effect


Collection: The samples are loaded into memory and maintained there until the sampling session is stopped. To collect the current sample archive requires only the execution of a simple WLST command.

Example Commands:
Start and connect to a running server with WLST:
  1. Start WLST with '/oracle_common/common/bin/wlst.sh'
  2. Connect to the running server with 'connect('<user>','<pwd>','t3://<host>:<port>')'

List the currently running dump samples:

wls:/soasingle/serverConfig> listDumpSamples()

(default output)
Name:JVMThreadDump
Dump Name:jvm.threads
Application Name:
Sampling Interval:5
Rotation Count:10
Dump Implicitly:true
Append Samples:true
Dump Arguments:timing=true, context=true


Name:JavaClassHistogram
Dump Name:jvm.classhistogram
Application Name:
Sampling Interval:1800
Rotation Count:5
Dump Implicitly:false
Append Samples:true
Dump Arguments:


Collect the current memory buffer for the 'JVMThreadDump' sample:

wls:/soasingle/serverConfig> getSamplingArchives(sampleName='JVMThreadDump',outputFile='<file>')

wrote 194751 bytes to <file>

The archive is written to the specified location as a zip. In the case of JVMThreadDumps the archive contains a .dmp file containing the actual thread dumps and a text file listing the sampling parameters and the dumps that were taken.


Start a new sampling session:

wls:/soasingle/serverConfig> addDumpSample(sampleName='ThreadsSample',diagnosticDumpName='jvm.threads',samplingInterval=20,rotationCount=20, toAppend=true,args={'context' : 'true'})

ThreadsSample is added

You can then confirm the sampling session with the 'listDumpSamples()' command.


Kill a sampling session:

wls:/soasingle/serverConfig> removeDumpSample(sampleName='ThreadsSample')

Removed ThreadsSample

You can confirm that the sampling session is gone by again running the 'listDumpSamples()' command.

back to top



Contextual SOA Thread Dump Translation


In PS6 we now have the ability to generate thread dumps from DFW that will tell us which composite and component a particular active SOA thread is running. This is best illustrated through an example.

Execute the jvm.threads Diagnostic Dump from WLST against an active SOA Server:

NOTE: You MUST pass the 'context' argument in order to have the SOA context information included in the dump.

wls:/soasingle/serverConfig> executeDump(name='jvm.threads',outputFile='<file>',args={'context' : 'true'})

(no output in the command window)

Go to the location where you wrote the file and open it. The file will look like a typical thread dump but scroll past the stack traces and you'll find some CPU statistics for each thread and below that the SOA context information. Here we list a partial stack for an active SOA thread followed by the corresponding context entry:

"[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'"
id=69
idx=0x100 tid=27191 prio=5 alive, in native, daemon
    at jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII)I(Native Method)
    at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:32)
    at java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(SocketInputStream.java)
    at java/net/SocketInputStream.read(SocketInputStream.java:129)
    at oracle/net/nt/MetricsEnabledInputStream.read(TcpNTAdapter.java:730)
    at oracle/net/ns/Packet.receive(Packet.java:302)
    at oracle/net/ns/DataPacket.receive(DataPacket.java:108)
    at oracle/net/ns/NetInputStream.getNextPacket(NetInputStream.java:317)
    at oracle/net/ns/NetInputStream.read(NetInputStream.java:262)
    ....


===== THREAD CONTEXT INFORMATION =====
idECIDRIDContext Values
-----------------------------------------------------------------------------------------------------------------------------
id=205911d1def534ea1be0:39fa34d1:....0dbRID=0:6
id=209511d1def534ea1be0:39fa34d1:....0dbRID=0:6
id=6011d1def534ea1be0:39fa34d1:....0dbRID=0:6
id=69
11d1def534ea1be0:39fa34d1:....0WEBSERVICE_PORT.name=execute_pt
dbRID=0:9
composite_name=DiagProject
component_instance_id=DF4389F07C6611E2BFBECB6C185E5342
component_name=TestProject_BPELProcess1
J2EE_MODULE.name=fabric
WEBSERVICE_NAMESPACE.name=http://xmlns.oracle.com/TestApp/DiagProject/TestProject_Mediator1
activity_name=AQ_Java_Embedding1:BxExe3:405
J2EE_APP.name=soa-infra
WEBSERVICE.name=TestProject_Mediator1_ep
composite_instance_id=182858


The entries of primary interest are highlighted in red.
  1. Thread ID: This is the ID of the thread from the thread dump entry. We list them all but only provide the context information for SOA threads
  2. composite_name: The name of the composite that is running in that thread
  3. component_name: The name of the component inside the composite that is running in the thread
  4. activity_name: The name of the BPEL activity that is currently running
  5. composite_instance_id: The instance id of the composite that can be looked up in EM
There are obviously some other entries but these are the one that we think will be of most benefit.

NOTE: In the initial release you may see instances where Mediator is listed as the running component when it ought to be BPEL. In these instances the 'activity_name' should still be accurate and list the running BPEL activity

back to top



BPEL Diagnostic Dumps


There are 3 new Diagnostic Dumps for BPEL in PS6:
  • bpel.dispatcher: Dumps the current state of the BPEL thread pools including Engine, Invoke and System.
  • bpel.adt: Dumps the average time that messages for asynchronous BPEL processes wait in the DLV_MESSAGE table before being consumed by an Engine thread
  • bpel.apt: Dumps the average processing time for all BPEL components in every deployed composite

Examples:

List the available SOA Diagnostic Dumps after starting WLST with '/oracle_common/common/bin/wlst.sh' and connecting to a running SOA server.

wls:/soasingle/serverConfig> listDumps(appName='soa-infra')

adf.ADFConfigDiagnosticDump
bpel.apd
bpel.apt
bpel.dispatcher
mediator.resequencer
soa.adapter.connpool
soa.adapter.ra
soa.adapter.stats
soa.composite
soa.composite.trail
soa.config
soa.db
soa.edn
soa.env
soa.wsdl


(To list the dumps that are part of the system default)

wls:/soasingle/serverConfig> listDumps()

dfw.samplingArchive
dms.ecidctx
dms.metrics
http.requests
jvm.classhistogram
jvm.flightRecording
jvm.threads
odl.activeLogConfig
odl.logs
odl.quicktrace
wls.image



Execute bpel.dispatcher:

wls:/soasingle/serverConfig> executeDump(name='bpel.dispatcher',appName='soa-infra',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location. Sample

The report first lists the BPEL thread pools along with their configured sizes. It then breaks them down by statistics such as number of active threads, processed messages, errored messages, etc. Keep in mind that synchronous requests are handled by the default WLS Work Manager and therefore will not be reflected here. These pools handle callbacks (Invoke) and asynchronous requests (Engine).


Execute bpel.adt:

wls:/soasingle/serverConfig> executeDump(name='bpel.adt',appName='soa-infra',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location. No sample available at the moment report looks identical to bpel.apt, listing the asynchronous BPEL instances and their avg delay in processing messages from the DLV_MESSAGE table.


Execute bpel.apt:

wls:/soasingle/serverConfig> executeDump(name='bpel.apt',appName='soa-infra',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location. Sample

The report lists the deployed BPEL components and their avg processing times in seconds.

back to top



Adapater Diagnostic Dumps


In PS6 there are 3 new Adapter Diagnostic Dumps:
  • soa.adapter.connpool: Dumps the current adapter connection pool statistics
  • soa.adapter.ra: Dumps the adapter Connection Factory configuration parameters
  • soa.adapter.stats: Dumps the current DMS statistics for all deployed adapter instances


Execute soa.adapter.connpool:
wls:/soasingle/serverConfig> executeDump(name='soa.adapter.connpool',appName='soa-infra',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location. Sample

The report groups the connection pool statistics by the adapter instance so you'll see a header including the instance name followed by the list of statistics.


Execute soa.adapter.ra:
wls:/soasingle/serverConfig> executeDump(name='soa.adapter.ra',appName='soa-infra',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location. Sample

The report groups the Connection Factory configurations by deployed adapter so again you'll see the header with the Adapter type followed by the configuration parameters.


Execute soa.adapter.stats:
wls:/soasingle/serverConfig> executeDump(name='soa.adapter.stats',appName='soa-infra',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location. Sample

The report lists statistics such as min time, max time and avg time along with number of processed / error instances. They are grouped by adapter instance.

back to top



Service Bus Diagnostic Dumps


In PS6 there are 3 new Service Bus Diagnostic Dumps:
  • OSB.derived-cache: Dumps the current statistics and contents of all Service Bus derived caches
  • OSB.jms-async-table: Dumps the current contents of the JMS async table which stores message correlation IDs, timestamps, etc
  • OSB.mq-async-table: Dumps the current contents of the MQ async table which stores similar information to the JMS async table


Execute OSB.derived-cache:
wls:/soasingle/serverConfig> executeDump(name='OSB.derived-cache',appName='OSB',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location and the specific caches dumped are:

  • ArchiveClassLoader: Cache of dependency-aware archive class-loaders
  • ArchiveSummary: Cache of archive summaries
  • CodecFactory: Cache of codec factories
  • EffectiveWSDL: Cache of the EffectiveWSDL objects that are derived from the service or wsdl resources of Proxy or Business services
  • Flow_Info: Cache of flow info objects
  • LightweightEffectiveWSDL: Cache of the EffectiveWSDL objects that are derived from the service or wsdl resources of Proxy or Business services
  • MflExecutor: Cache of the MFL executors
  • RouterRuntime: Cache of the compiled router runtimes for proxy services
  • RuntimeEffectiveWSDL: Cache of the Session valid EffectiveWSDL objects that are derived from the service or wsdl resources of Proxy or Business services
  • RuntimeEffectiveWSPolicy: Cache of WS Policies for Proxy or Business services
  • SchemaTypeSystem: Type system cache for MFL, XS and WSDLs
  • ServiceAlertsStatisticInfo: Cache of service alert statistics for Proxy or Business services
  • ServiceInfo: Cache of compile service info for Proxy or Business services and WSDLs
  • Wsdl_Info: Cache of WSDL Info objects
  • WsPolicyMetadata>: Cache of compiled WS Policy Metadata
  • XMLSchema_Info: XML Schema info cache for xml schema objects
  • XqueryExecutor: Cache of Xquery executors
  • XsltExecutor: Cache of XSLT executors
  • alsb.transports.ejb.bindingtype: Cache of EJB Binding Info for EJB Business Services
  • alsb.transports.jejb.business.bindingtype: Cache of JEJB Binding Info
  • alsb.transports.jejb.proxy.bindingtype: Cache of JEJB Binding Info



Execute OSB.jms-async-table:
wls:/soasingle/serverConfig> executeDump(name='OSB.jms-async-table',appName='OSB',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location.


Execute OSB.mq-async-table:
wls:/soasingle/serverConfig> executeDump(name='OSB.mq-async-table',appName='OSB',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location.

back to top



There is one new Diagnostic Dump for Mediator which dumps the content of the resequencer table to help in troubleshooting resequencer issues. To execute:

wls:/soasingle/serverConfig> executeDump(name='mediator.resequencer',appName='soa-infra',outputFile='<file>')

(There is no console output)

The output is a text file in the specified location.


Troubleshooting Example with the new Features


Most of the new Diagnostic Dumps report performance related information so we will use a performance example. Our composite called 'DiagProject' is failing to process messages at the expected rate. What might be the problem?

We can start at a number of places but here is a list of Diagnostic Dumps to collect up front:

wls:/soasingle/serverConfig> executeDump(name='bpel.dispatcher',appName='soa-infra',outputFile='<file>')

wls:/soasingle/serverConfig> executeDump(name='bpel.apt',appName='soa-infra',outputFile='<file>')

wls:/soasingle/serverConfig> executeDump(name='soa.adapter.connpool',appName='soa-infra',outputFile='<file>')

wls:/soasingle/serverConfig> executeDump(name='soa.adapter.stats',appName='soa-infra',outputFile='<file>')

wls:/soasingle/serverConfig> executeDump(name='jvm.threads',outputFile='<file>',args={'context' : 'true'})

So what do we have?
  • bpel.dispatcher shows that all of the BPEL thread pools are fine and running without a backlog. This means that the bottleneck is not a lack of available BPEL threads
  • bpel.apt clearly shows that average execution time of the bpel component is over 5 seconds which is much longer than expected
  • soa.adapter.connpool shows that only one adapter, AQ, is running and its connection pool is healthy
  • soa.adapter.stats again shows that the only adapter running is AQ and its performance is quite good with an average processing time of 421ms
  • jvm.threads may actually be the first thing you want to look at and when viewing the context information at the bottom it's clear that everything is running the BPELProcess1 component, indicating that it is taking by far the most time to complete. Looking a little closer at the 'activity_name' we can see that many of them are running an embedded activity which is indeed where the delay is (Thread.sleep()).

So clearly the bottleneck is within the BPEL process itself and on further review you find that someone left a 5 second Wait activity in the process. Contrived to be sure but hopefully still somewhat helpful.

back to top
About

This is the official blog of the SOA Proactive Support Team. Here we will provide information on our activities, publications, product related information and more. Additionally we look forward to your feedback to improve what we do.

Search

Categories
Archives
« April 2013 »
SunMonTueWedThuFriSat
 
1
2
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    
       
Today