Monday Jul 22, 2013

The ateamsoab2b blogs have been moved to www.ateam-oracle.com

The A-Team has a new web site: The A-Team Chronicles

All of our articles and posts from various locations, including this site, have been consolidated there. Please visit the new site at www.ateam-oracle.com

Please note that all ateamsoab2b posts will be removed from this site on August 30th.

Thanks,

Pete Farkas

Sunday Mar 10, 2013

EDN Debugging

A customer recently asked me how to debug EDN. This post shows how to debug EDN and which tools you can use to do it.

1. Using EDN-DB-LOG

EDN comes with a useful EDN DB logging servlet for viewing logging information generated by the EDN component. It is only available for EDN-DB, which is based on AQ; it will not work for EDN with JMS. The servlet uses a table called “EDN_LOG_MESSAGES” in the SOA_INFRA schema. It logs the operations on the “main” event queue and the OAOO queue, with timestamp information.

The default URL is http://<host_name>:<port_number>/soa-infra/events/edn-db-log. In this servlet you can enable, disable and clear the logs, but you need the administrative role to access it. It is a good tool for displaying dynamic counts of un-deq'ed (potentially "stuck") events in the "main" and "OAOO" queues. The log also records when the EDN bus connects to AQ-DB. In the screenshot below, “EVENT_SEQ:202” shows that the EDN bus is being started.

When logging is enabled, the EDN_LOG_MESSAGES table is populated with messages until logging is disabled, so it is inadvisable to leave logging turned on while a large number of events is being processed. It is recommended to clear the log regularly.

Messages in the log are grouped together. Usually the first line in a group indicates what operation is being performed (e.g. enqueuing an event or handling an event that has been dequeued). The event sequence number is used to group the messages, and each group is highlighted in the same color. In the screenshots below, “EVENT_SEQ:204” is dequeuing an event and “EVENT_SEQ:205” is enqueuing an event.

2. Database tables

The second method is to examine the database tables. You can check the count of potentially “stuck” events currently in the following queue tables:

  • EDN_EVENT_QUEUE_TABLE – This table is for “EDN_EVENT_QUEUE” AQ. Every event published is temporarily enqueued into this table.
  • EDN_OAOO_DELIVERY_TABLE – This table only stores events with “OAOO” (one-and-only-one) delivery target(s). Such events are temporarily enqueued into this table for the EDN_OAOO_QUEUE AQ.

An event with an OAOO delivery target travels through both tables: it is first stored in EDN_EVENT_QUEUE_TABLE and then in EDN_OAOO_DELIVERY_TABLE.

This example shows the event enq'ed in "edn_event_queue".

Another alternative is to check the count from the following database views:

  • AQ$EDN_EVENT_QUEUE_TABLE: There are two rows for every event enqueued into "edn_event_queue".
  • AQ$EDN_OAOO_DELIVERY_TABLE: There is one row for every event enqueued into "edn_oaoo_queue". 

This example shows further details about that event which is deq'ed by the subscribers of "edn_event_queue".

The AQ$EDN_EVENT_QUEUE_TABLE.MSG_STATE shows the state of the message.  The states are listed in the table below:

State Code | Value | Description
0 | Ready | The message is ready to be processed, i.e., either the delay time of the message has passed or the message did not have a delay time specified.
1 | Wait | The delay specified by message_properties_t.delay while executing dbms_aq.enqueue has not been reached.
2 | Processed | The message has been successfully processed (dequeued) but will remain in the queue until the retention_time specified for the queue while executing dbms_aqadm.create_queue has been reached.
3 | Expired | The message was not successfully processed (dequeued) in either 1) the time specified by message_properties_t.expiration while executing dbms_aq.enqueue or 2) the maximum number of dequeue attempts (max_retries) specified for the queue while executing dbms_aqadm.create_queue.
8 | Deferred | Buffered messages enqueued by a Streams Capture process.
10 | Buffered Expired | User-enqueued expired buffered messages.
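If you export the state codes from the queue table and want to tally them, a small lookup helps. The following is a minimal sketch that only encodes the table above; the function names and the sample list of codes are made up for illustration.

```python
# State codes and names copied from the table above.
MSG_STATE_NAMES = {
    0: "Ready",
    1: "Wait",
    2: "Processed",
    3: "Expired",
    8: "Deferred",
    10: "Buffered Expired",
}


def describe_state(code):
    """Return the state name for a numeric code, or a placeholder if unknown."""
    return MSG_STATE_NAMES.get(code, "Unknown ({})".format(code))


def count_by_state(codes):
    """Tally how many messages are in each state."""
    counts = {}
    for code in codes:
        name = describe_state(code)
        counts[name] = counts.get(name, 0) + 1
    return counts


# Hypothetical sample: state codes as they might come back from a query.
print(count_by_state([0, 0, 2, 3, 0]))
```

A large and growing "Ready" count is the usual hint that events are not being dequeued.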

If the subscriber type is equal to 2 when there are no subscribers to the message, and there is no transaction id due to an invalid transaction, the message will be marked as UNDELIVERABLE.

When the message state is Expired, AQ$EDN_EVENT_QUEUE_TABLE.EXPIRATION_REASON will be populated with one of the following values:

  • Messages to be cleaned up later
  • MAX_RETRY_EXCEEDED
  • TIME_EXPIRATION
  • INVALID_TRANSACTION
  • PROPAGATION_FAILURE

3. Server Logs

The third method is to use the EM log configuration and log viewer. There are a few logger names related to EDN:

  • oracle.integration.platform.blocks.event
  • oracle.integration.platform.blocks.event.saq
  • oracle.integration.platform.blocks.event.jms

You can set log level to one of the following to capture more details:

    • TRACE:1 (FINE) - Logging the event content details, XPath filter results, event enqueue, dequeue, publish and send operations
    • TRACE:16 (FINER) – Logging the begin, commit and rollback statements of XA transaction (for OAOO) and retry count.
    • TRACE:32 (FINEST)  - All above.

    The log level changes take effect immediately without a server restart. However, if you want the changes to persist after a server restart, make sure to check the “Persist Log Level State Across Component Restarts” option before restarting.

    At the FINER or FINEST level, you may see log entries like "Began XA for OAOO." and "Rolled back XA for OAOO." These are normal messages of OAOO event delivery when there are no events waiting to be delivered. They are NOT error conditions. You may turn off these messages by setting the Java logging level to "TRACE:1 (FINE)" or a higher value. All detailed logging goes into the SOA server's diagnostic.log file configured in EM. Below is a snippet of the diagnostic log showing the event delivery to an OAOO subscriber:


    [SRC_METHOD: finerBeganXA] Began XA for OAOO.

    [SRC_METHOD: fineEventPublished] Received event: Subject: ... Sender: .... Event: ...

    [SRC_METHOD: fineFilterResults] Filter [XPath Filter: …] for subscriber "..." returned true/false

    [SRC_METHOD: fineDequeuedEvent] Dequeued event, Subject: ... [source type ..]: business-event...

    [SRC_METHOD: fineOAOOEnqueuedEvent] Enqueued OAOO event, Subject: ... [source: ..., target: ... ]: business-event...

    [SRC_METHOD: fineOAOODequeuedEvent] Dequeued OAOO event, Subject: ... [source: ..., target: ...]: business-event...

    [SRC_METHOD: finerInsertedTryCount] Inserted try count for msgId: .... Status: ...

    [SRC_METHOD: finerRemovedTryCount] Removed try count for msgId: ...

    [SRC_METHOD: fineSentOAOOEvent] Sent OAOO event [QName: ... to target: ...]: business-event...

    [SRC_METHOD: fineCommittedOAOODelivery] Committed OAOO Delivery, Subject: ... [source: ..., target: ...]: business-event...

    [SRC_METHOD: finerBeganXA] Began XA for OAOO.

    [SRC_METHOD: finerRolledbackXA] Rolled back XA for OAOO.
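When reading a long diagnostic.log offline, the idle "Began XA" / "Rolled back XA" pairs can drown out the interesting entries. The sketch below is one way to filter them out; it assumes each log entry is a single line carrying a [SRC_METHOD: ...] tag as in the snippet above, and the helper names are invented for this example.

```python
import re

# Matches the [SRC_METHOD: name] tag shown in the log snippet above.
SRC_METHOD = re.compile(r"\[SRC_METHOD: (\w+)\]")


def src_method(line):
    """Return the SRC_METHOD name found in a log line, or None."""
    match = SRC_METHOD.search(line)
    return match.group(1) if match else None


def drop_idle_polls(lines):
    """Drop each 'Began XA' entry that is immediately followed by a rollback.

    Such a pair is the normal idle-poll pattern described above, not an error.
    """
    kept = []
    i = 0
    while i < len(lines):
        is_idle_pair = (
            src_method(lines[i]) == "finerBeganXA"
            and i + 1 < len(lines)
            and src_method(lines[i + 1]) == "finerRolledbackXA"
        )
        if is_idle_pair:
            i += 2  # skip both lines of the idle pair
            continue
        kept.append(lines[i])
        i += 1
    return kept


sample = [
    "[SRC_METHOD: finerBeganXA] Began XA for OAOO.",
    "[SRC_METHOD: fineOAOODequeuedEvent] Dequeued OAOO event",
    "[SRC_METHOD: fineCommittedOAOODelivery] Committed OAOO Delivery",
    "[SRC_METHOD: finerBeganXA] Began XA for OAOO.",
    "[SRC_METHOD: finerRolledbackXA] Rolled back XA for OAOO.",
]
for line in drop_idle_polls(sample):
    print(line)
```

A "Began XA" entry that is followed by real work is kept, so actual deliveries stay visible.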

    In some cases, more than one method may be necessary to assist in the debugging process. Below is a comparison of server and DB logging that might help you in evaluating and determining which method(s) is/are most suitable in your environment.

    Server Logging

    • EDN will generate standard Java logging messages when events are published, when they are pulled from persistent storage and when they are delivered.
    • The logger used by EDN depends on the implementation. For instance, EDN-DB uses "oracle.integration.platform.blocks.event.saq" and EDN-JMS uses "oracle.integration.platform.blocks.event.jms".
    • As in all Java logging, messages are written at different log levels from ERROR to FINEST. The most detailed messages (including the event body) use FINEST.
    • Loggers can also be configured in logging.xml in your config directory.

    DB Logging

    • If you are using EDN-DB, much of the debugging information is not accessible from the server, because many of the activities take place in the database and cannot be logged by the server. A servlet web page that reads the debug logging table is therefore provided to assist debugging. The page is located at http://hostname:port/soa-infra/events/edn-db-log, and you need the administrative role to access it.
    • There are commands on the servlet web page to enable and disable logging and for clearing the log table. The table will be filled with messages, so it is inadvisable to leave logging turned on for large amounts of events. It is recommended to clear the log regularly.
    • Messages in the log are grouped together. Usually the first line in the group will indicate what operation is being performed (e.g. enqueuing an event or handling an event that has been dequeued).

    Mediator Instance Tracking

    Mediator supports three instance tracking modes, selected by changing the audit level in EM->SOA->SOA-INFRA->SOA Administration->Mediator Properties:

    1. Off - No instance tracking for successfully completed instances; however, instances and faulted instances are still created in this mode. No audit trail is created with this setting.
    2. Production - Instance tracking is enabled for all. All audit details are logged except the details of assign activities, but the instances and payloads are not captured.
    3. Development - Instance tracking is enabled for all.  All audit details are logged, and the instances and payloads are also captured.

    The following tables are used by Mediator to store the instance and audit trail data:

    1. MEDIATOR_INSTANCE - This table contains one row for each mediator instance. Each instance has a unique id. It stores the ecid, composite instance id and parent component id from the normalized message, and the overall state of the instance in the component_state column. The component state depends on the combination of the mediator case instance states; the states are listed here.
    2. MEDIATOR_CASE_INSTANCE - This table contains one row for each mediator routing rule, including the fault information for that routing rule. Each case instance has a unique id. It stores the mediator instance id and case name, related fault information and information pertaining to retries. This is the base table for executing automatic retries using fault policies.
    3. MEDIATOR_CASE_DETAIL - This table contains multiple rows for each routing rule and stores the mediator audit trail xml as a blob for each routing rule. The case detail rows are bound together by case id. Each row stores the case detail state and the audit trail for that case detail. The state of the latest case detail is the current state of the case.
    4. MEDIATOR_AUDIT_DOCUMENT - This table stores the payload at each stage of the mediator message flow; payloads are stored only when the instance tracking audit level is set to "Development". Each row in this table stores the payload at one point in the message flow, e.g. the transformed payload or the payload being sent to the target service.

    Below is a screenshot of a basic mediator project with 2 routing rules which polls an xml file from an input folder, transforms the content and writes the xml file to a folder. 

    When the mediator receives a message, it creates a mediator instance, and then, depending on the number of routing rules, one or more case instances are created in the MEDIATOR_CASE_INSTANCE table. The engine then initializes the audit trail xml and stores it as an XML document. After each processing point (e.g. transformation, filter evaluation etc.), it adds the trail messages to the audit trail xml and persists it to the audit trail tables (MEDIATOR_CASE_DETAIL.AUDIT_TRAIL and/or MEDIATOR_AUDIT_DOCUMENT); then the mediator instance state is updated.

    1. When the mediator instance kicks off, a composite instance is created in the COMPOSITE_INSTANCE table, and a unique ECID is assigned to the instance.

    select * from composite_instance where ecid='1b7e5955c26b51de:-56440391:13d41f410c6:-8000-000000000000144b';

    2. Using the ECID, you can retrieve the mediator instance data and the component state from the MEDIATOR_INSTANCE table. From this point onward, MEDIATOR_INSTANCE.ID is used to retrieve the mediator case data.

    select * from mediator_instance where ecid='1b7e5955c26b51de:-56440391:13d41f410c6:-8000-000000000000144b';

    3. Depending on the number of routing rules, the mediator stores each routing rule separately in the MEDIATOR_CASE_INSTANCE table, and MEDIATOR_CASE_INSTANCE.ID is used to retrieve the case detail for each routing rule. In the above example, there are 2 routing rules.

    select * from mediator_case_instance where instance_id = 'C64B82E086BB11E2BFBE1B53FB1929E1';

    4. The audit trail of each routing rule is stored in the MEDIATOR_CASE_DETAIL table in compressed format.

    select * from mediator_case_detail where instance_id = 'C64B82E086BB11E2BFBE1B53FB1929E1';

    Below are the xml data that are stored in the MEDIATOR_CASE_DETAIL.AUDIT_TRAIL column.  In the example below, two routing rules were being executed. The first event routing rule’s result was equal to “false”, then the second routing rule was executed. The second event routing rule’s result was successful, subsequently the message was transformed and published to the destination. If you have the audit trail level set to “Development”, you can use the audit id in the case trail to retrieve the payload from the MEDIATOR_AUDIT_DOCUMENT table for further investigation.

    CASE_ID= C64BA9F086BB11E2BFBE1B53FB1929E1

    <case_trail>

      <event type="inputPayloadReceived" status="Completed"

             parentId="C64B82E086BB11E2BFBE1B53FB1929E1" date="1362615182063"

             auditId="C64BA9F086BB11E2BFBE1B53FB1929E1">

        <message>MediatorAudit_29</message>

      </event>

    </case_trail>

    CASE_ID= C66EE96086BB11E2BFBE1B53FB1929E1

    <case_trail>

      <event type="case" id="C66EE96086BB11E2BFBE1B53FB1929E1"

             parentId="C64B82E086BB11E2BFBE1B53FB1929E1" caseName="USCustomer.Write"

             date="1362615182073" auditId="C64BA9F086BB11E2BFBE1B53FB1929E1">

        <message>MediatorAudit_0#USCustomer.Write</message>

      </event>

      <event type="condition" status="Completed"

             parentId="C66EE96086BB11E2BFBE1B53FB1929E1" date="1362615182074"

             auditId="C64BA9F086BB11E2BFBE1B53FB1929E1">

        <message>MediatorAudit_1#false#$in.CustomerData/imp1:CustomerData/Country='US'</message>

      </event>

    </case_trail>


    CASE_ID= C670700086BB11E2BFBE1B53FB1929E1

    <case_trail>

      <event type="case" id="C670700086BB11E2BFBE1B53FB1929E1"

             parentId="C64B82E086BB11E2BFBE1B53FB1929E1"

             caseName="CanadaCustomer.Write" date="1362615182083"

             auditId="C64BA9F086BB11E2BFBE1B53FB1929E1">

        <message>MediatorAudit_0#CanadaCustomer.Write</message>

      </event>

      <event type="condition" status="Completed"

             parentId="C670700086BB11E2BFBE1B53FB1929E1" date="1362615182083"

             auditId="C64BA9F086BB11E2BFBE1B53FB1929E1">

        <message>MediatorAudit_1#true#$in.CustomerData/imp1:CustomerData/Country='CA'</message>

      </event>

      <event type="transform" status="Completed"

             parentId="C670700086BB11E2BFBE1B53FB1929E1" date="1362615182102"

             auditId="C67292E086BB11E2BFBE1B53FB1929E1">

        <message>MediatorAudit_3#Customer#xsl/CustomerData_To_Customer_2.xsl</message>

      </event>

      <event type="publish" status="Completed"

             parentId="C670700086BB11E2BFBE1B53FB1929E1" date="1362615182124"

             auditId="C67292E086BB11E2BFBE1B53FB1929E1" parentRefId="mediator:C64B82E086BB11E2BFBE1B53FB1929E1:C670700086BB11E2BFBE1B53FB1929E1:oneway">

        <message>MediatorAudit_9#Write#CanadaCustomer</message>

      </event>

    </case_trail>
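Once you have extracted the audit trail xml as text, it is easy to summarize with a short script. The following is a minimal sketch, not a supported tool. It assumes the blob has already been decompressed into a string, that the date attribute is milliseconds since the Unix epoch, and that the message text is a '#'-separated list of fields. Both assumptions are inferred from the sample above rather than documented behaviour.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def parse_case_trail(xml_text):
    """Return one dict per <event> element in a <case_trail> document."""
    events = []
    for event in ET.fromstring(xml_text).findall("event"):
        message = event.findtext("message", default="")
        millis = int(event.get("date"))
        events.append({
            "type": event.get("type"),
            "status": event.get("status"),
            "audit_id": event.get("auditId"),
            # Assumption: 'date' is epoch time in milliseconds.
            "when": datetime.fromtimestamp(millis / 1000, tz=timezone.utc),
            # Assumption: the message is a list of '#'-separated fields.
            "fields": message.split("#"),
        })
    return events


# Two events copied from the second routing rule's trail above.
SAMPLE = """<case_trail>
  <event type="condition" status="Completed"
         parentId="C670700086BB11E2BFBE1B53FB1929E1" date="1362615182083"
         auditId="C64BA9F086BB11E2BFBE1B53FB1929E1">
    <message>MediatorAudit_1#true#$in.CustomerData/imp1:CustomerData/Country='CA'</message>
  </event>
  <event type="transform" status="Completed"
         parentId="C670700086BB11E2BFBE1B53FB1929E1" date="1362615182102"
         auditId="C67292E086BB11E2BFBE1B53FB1929E1">
    <message>MediatorAudit_3#Customer#xsl/CustomerData_To_Customer_2.xsl</message>
  </event>
</case_trail>"""

for ev in parse_case_trail(SAMPLE):
    print(ev["when"].isoformat(), ev["type"], ev["fields"][1:])
```

The audit_id value of each event is what you would use to look up the payload in MEDIATOR_AUDIT_DOCUMENT when the audit level is "Development".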

    Tuesday Dec 04, 2012

    How to Achieve OC4J RMI Load Balancing

    This is an old Oracle SOA and OC4J 10G topic. In fact, it is not even a SOA topic per se. Questions about RMI load balancing arise when you develop custom web applications that access human tasks running on a remote SOA 10G cluster. Having just returned from a customer who faced challenges with OC4J RMI load balancing, I felt there is still some confusion in the field about how OC4J RMI load balancing works. Hence I decided to dust off an old tech note that I wrote a few years back and share it with the general public.

    Here is the tech note:

    Overview

    A typical use case in Oracle SOA is that you are building a web-based, custom human task UI that will interact with the task services housed in a remote BPEL 10G cluster. Or, more generically, you are building a web-based application in Java that needs to interact with the EJBs in a remote OC4J cluster. In either case, you are talking to an OC4J cluster as an RMI client. You must then immediately ask yourself the following questions:

    1. How do I make sure that the web application, as an RMI client, evenly distributes its load across all the nodes in the remote OC4J cluster?

    2. How do I make sure that the web application, as an RMI client, is resilient to node failures in the remote OC4J cluster, so that in the unlikely case that one of the remote OC4J nodes fails, my web application will continue to function?

    That is the topic of how to achieve load balancing with an OC4J RMI client.

    Solutions

    You need to configure and code RMI load balancing in two places:

    1. The provider URL can be specified as a comma separated list of URLs, so that the initial lookup will land on one of the available URLs.

    2. Choose a proper value for the oracle.j2ee.rmi.loadBalance property, which, alongside the PROVIDER_URL property, is one of the JNDI properties passed to the JNDI lookup (http://docs.oracle.com/cd/B31017_01/web.1013/b28958/rmi.htm#BABDGFBI).

    More details below:

    About the PROVIDER_URL

    The job of the JNDI property java.naming.provider.url is to provide a list of RMI contexts when the client looks up a new context for the very first time in the client session.

    The value of the JNDI property java.naming.provider.url is either a single URL or a comma separated list of URLs.
    • A single URL. For example: opmn:ormi://host1:6003:oc4j_instance1/appName1
    • A comma separated list of multiple URLs. For example: opmn:ormi://host1:6003:oc4j_instance1/appName, opmn:ormi://host2:6003:oc4j_instance1/appName, opmn:ormi://host3:6003:oc4j_instance1/appName

    When the client looks up a new Context for the very first time in the client session, it sends a query to the OPMN referenced by the provider URL. The OPMN host and port specify the destination of the query, and the OC4J instance name and appName act as its “where clause”.

    When the PROVIDER URL references a single OPMN server

    Let's consider the case when the provider URL references only a single OPMN server of the destination cluster. In this case, that single OPMN server receives the query and returns a list of the qualified Contexts from all OC4Js within the cluster, even though there is only a single OPMN server in the provider URL. A context represents a particular starting point at a particular server for subsequent object lookups.

    For example, if the URL is opmn:ormi://host1:6003:oc4j_instance1/appName, then, OPMN will return the following contexts:

    • appName on oc4j_instance1 on host1
    • appName on oc4j_instance1 on host2
    • appName on oc4j_instance1 on host3
    (provided that host1, host2, host3 are all in the same cluster)

    Please note that
    • One OPMN will be sufficient to find the list of all contexts from the entire cluster that satisfy the JNDI lookup query. You can do an experiment by shutting down appName on host1, and observe that OPMN on host1 will still be able to return appName on host2 and appName on host3.

    When the PROVIDER URL references a comma separated list of multiple OPMN servers


    When the JNDI property java.naming.provider.url references a comma separated list of multiple URLs, the lookup will return exactly the same thing as with a single OPMN server: a list of qualified Contexts from the cluster.

    The purpose of having multiple OPMN servers is to provide high availability for the initial context creation: if OPMN on host1 is unavailable, the client will try the lookup via OPMN on host2, and so on. After the initial lookup returns and the client caches the list of contexts, the JNDI URL(s) are no longer used in the same client session. That explains why removing the 3rd URL from the list of JNDI URLs will not stop the client from getting the EJB on the 3rd server.
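To make the URL format concrete, here is a small sketch that assembles the comma separated provider URL from a list of OPMN hosts. It is only an illustration of the string format described above: a real client sets this value as a JNDI property in Java, and the host names, port, instance name and application name below are placeholders.

```python
def opmn_url(host, port, instance, app):
    """Build one URL in the opmn:ormi://host:port:instance/app form shown above."""
    return "opmn:ormi://{}:{}:{}/{}".format(host, port, instance, app)


def provider_url(hosts, port, instance, app):
    """Join one URL per OPMN host into a comma separated list."""
    return ",".join(opmn_url(h, port, instance, app) for h in hosts)


# Placeholder hosts, port, OC4J instance and application name.
jndi_properties = {
    "java.naming.provider.url": provider_url(
        ["host1", "host2", "host3"], 6003, "oc4j_instance1", "appName"
    ),
}
print(jndi_properties["java.naming.provider.url"])
```

Listing more than one OPMN host only protects the very first lookup; it does not change which contexts come back.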


    About the oracle.j2ee.rmi.loadBalance Property

    After the client acquires the list of contexts, it caches it on the client side as the “list of available RMI contexts”. This list includes all the servers in the destination cluster and stays in the cache until the client session (JVM) ends. RMI load balancing against the destination cluster happens on the client side, as the client switches between the members of the list.

    Whether and how often the client refreshes the Context from the list of Contexts is determined by the value of oracle.j2ee.rmi.loadBalance. The documentation at http://docs.oracle.com/cd/B31017_01/web.1013/b28958/rmi.htm#BABDGFBI lists all the available values for oracle.j2ee.rmi.loadBalance.

    Value | Description
    client | If specified, the client interacts with the OC4J process that was initially chosen at the first lookup for the entire conversation.
    context | Used for a Web client (servlet or JSP) that will access EJBs in a clustered OC4J environment. If specified, a new Context object for a randomly-selected OC4J instance will be returned each time InitialContext() is invoked.
    lookup | Used for a standalone client that will access EJBs in a clustered OC4J environment. If specified, a new Context object for a randomly-selected OC4J instance will be created each time the client calls Context.lookup().


    Please note that, regardless of the setting of the oracle.j2ee.rmi.loadBalance property, the “refresh” only occurs at the client. The client can only choose from the "list of available contexts" that was returned and cached at the very first lookup. That is, the client merely gets a new Context object from the “list of available RMI contexts” in the client-side cache. The client will NOT go to the OPMN server again to get the list. This also implies that if you add a node to the server cluster AFTER the client’s initial lookup, the client will not know about it, because neither the server nor the client will initiate a refresh of the “list of available servers” to reflect the new node.

    About High Availability (i.e. Resilience Against Node Failure of Remote OC4J Cluster)

    What we have discussed above is about load balancing. Let's also discuss high availability.

    This is how High Availability works in RMI: when the client uses a context but gets an exception, such as a closed socket, it knows that the server referenced by that Context is problematic and will try to get another unused Context from the “list of available contexts”. Again, this is the list that was returned and cached at the very first lookup in the entire client session.
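The behaviour described in this note can be modelled in a few lines. The following is a toy simulation, not OC4J code: the class and method names are invented, a ConnectionError stands in for the "socket is closed" exception, and removing a failed context from the list is my simplification of "try another unused Context".

```python
import random


class CachedContexts:
    """Toy model of the client-side "list of available RMI contexts"."""

    def __init__(self, contexts, rng=None):
        # Filled once at the first lookup and never refreshed from OPMN.
        self.available = list(contexts)
        self.rng = rng or random.Random()

    def pick(self):
        """Load balancing: choose a context at random from the cached list."""
        if not self.available:
            raise RuntimeError("no contexts left in the cached list")
        return self.rng.choice(self.available)

    def call(self, invoke):
        """Failover: try contexts until one works, dropping those that fail."""
        while True:
            ctx = self.pick()
            try:
                return invoke(ctx)
            except ConnectionError:
                # Treat the server behind this context as down; try another.
                self.available.remove(ctx)


def invoke(ctx):
    """Hypothetical remote call: host1 and host2 are down in this example."""
    if ctx in ("host1", "host2"):
        raise ConnectionError("socket is closed")
    return "ok from " + ctx


cache = CachedContexts(["host1", "host2", "host3"])
print(cache.call(invoke))
```

Note that a node added to the cluster after the cache is built never shows up in this model either, which mirrors the limitation described above.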

    Using BPEL Performance Statistics to Diagnose Performance Bottlenecks

    Tuning the performance of Oracle SOA 11G applications can be challenging. Because SOA is a platform for building composite applications that connect many applications and "services", when the overall performance is slow, the bottleneck could be anywhere in the system: the applications/services that SOA connects to, the infrastructure database, or the SOA server itself. Quickly identifying the bottleneck is therefore crucial in tuning the overall performance.

    Fortunately, the BPEL engine in Oracle SOA 11G (and 10G, for that matter) collects BPEL Engine Performance Statistics, which show the latencies of low level BPEL engine activities. The BPEL engine performance statistics can make it a bit easier for you to identify the performance bottleneck.

    Although the BPEL engine performance statistics are always available, accessing and interpreting them is somewhat obscure in the early and current (PS5) 11G versions.

    This blog offers instructions to help you enable, retrieve and interpret the performance statistics until future versions provide a more pleasant user experience.

    Overview of BPEL Engine Performance Statistics 

    SOA BPEL has a feature that collects some performance statistics and stores them in memory.

    One MBean attribute, StatLastN, configures the size of the memory buffer that stores the statistics. This memory buffer is a "moving window": old statistics are flushed out by new ones when the amount of data exceeds the buffer size. Since the buffer size is limited by StatLastN, the impact of statistics collection on performance is minimal. By default StatLastN=-1, which means no performance data is collected.
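The "moving window" idea is easy to picture with a bounded queue. The sketch below is only a conceptual model, not how the BPEL engine is implemented: it assumes StatLastN counts entries, and the class name and sample latencies are made up.

```python
from collections import deque


class LastNStats:
    """Keep only the most recent N latency samples (conceptual model)."""

    def __init__(self, stat_last_n):
        # A value of -1, the default described above, disables collection.
        self.enabled = stat_last_n > 0
        self.samples = deque(maxlen=stat_last_n if self.enabled else 0)

    def record(self, latency_ms):
        """Add a sample; the oldest one is dropped once the window is full."""
        if self.enabled:
            self.samples.append(latency_ms)

    def summary(self):
        """Return min, max, average and count, or None if nothing was kept."""
        if not self.samples:
            return None
        return {
            "min": min(self.samples),
            "max": max(self.samples),
            "average": sum(self.samples) / len(self.samples),
            "count": len(self.samples),
        }


window = LastNStats(3)
for latency in (10, 20, 30, 40):
    window.record(latency)
print(window.summary())
```

Because the window is bounded, memory use stays constant no matter how long the server runs, which is why the overhead is small.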

    Once the statistics are collected in the memory buffer, they can be retrieved via another MBean, oracle.as.soainfra.bpel:Location=[Server Name],name=BPELEngine,type=BPELEngine.

    My friend in Oracle SOA development wrote this simple 'bpelstat' web app that looks up and retrieves the performance data from the MBean and displays it in a human readable form. It does not have beautiful UI but it is fairly useful.

    Although from Oracle SOA 11.1.1.5 onwards the same statistics can be viewed via a more elegant UI under "request break down" at EM -> SOA Infrastructure -> Service Engines -> BPEL -> Statistics, some unsophisticated minds like mine may still prefer the simplicity of the 'bpelstat' JSP. One thing the simple JSP does well is that you can save the page and send it to someone else for further analysis.

    Below are the instructions for installing and invoking the BPEL statistics JSP. My friend in SOA Development will soon blog about interpreting the statistics. Stay tuned.

    Step 1: Enable BPEL Engine Statistics for Each SOA Server via Enterprise Manager

    First, you need to set StatLastN to some number to enable the collection of BPEL Engine Performance Statistics.

    • EM Console -> soa-infra(Server Name) -> SOA Infrastructure -> SOA Administration -> BPEL Properties
    • Click on "More BPEL Configuration Properties"
    • Click on the attribute "StatLastN" and set its value to some integer. Typically you want to set it to 1000 or more.

    Step 2: Download and Deploy bpelstat.war File to Admin Server

    Note: the WAR file contains a JSP that does NOT have any security restriction. You do NOT want to keep it on your production server for long, as it is a security hazard. Deactivate the WAR once you are done.
    • Download the bpelstat.war to your local PC
    • At WebLogic Console, Go to Deployments -> Install
    • Click on the "upload your file(s)"
    • Click the "Browse" button to upload the deployment to Admin Server
    • Accept the uploaded file as the path, click next
    • Check the default option "Install this deployment as an application"
    • Check "AdminServer" as the target server
    • Finish the rest of the deployment with default settings


    • Console -> Deployments
    • Check the box next to "bpelstat" application
    • Click on the "Start" button. It will change the state of the app from "prepared" to "active"

    Step 3: Invoke the BPEL Statistic Tool


    • The BPELStat tool merely calls the MBean of the BPEL server, then collects and displays the in-memory performance statistics. You usually want to do that after some peak loads.
    • Go to http://<admin-server-host>:<admin-server-port>/bpelstat
    • Enter the correct admin hostname, port, username and password
    • Enter the SOA Server Name from which you want to collect the performance statistics. For example, SOA_MS1, etc.
    • Click Submit
    • Keep doing the same for all SOA servers.

    Step 4: Interpret the BPEL Engine Statistics

    You will see a few categories of BPEL Statistics from the JSP Page.

    First it starts with the overall latency of BPEL processes, grouped by synchronous and asynchronous processes. Then it provides the further break down of the measurements through the life time of a BPEL request, which is called the "request break down".

    1. Overall latency of BPEL processes

    The top of the page shows that the elapsed time of executing the synchronous process TestSyncBPELProcess from the composite TestComposite averages about 1543.21ms, while the elapsed time of executing the asynchronous process TestAsyncBPELProcess from the composite TestComposite2 averages about 1765.43ms. The maximum and minimum latencies are also shown.

    Synchronous process statistics
    <statistics>
        <stats key="default/TestComposite!2.0.2-ScopedJMSOSB*soa_bfba2527-a9ba-41a7-95c5-87e49c32f4ff/TestSyncBPELProcess" min="1234" max="4567" average="1543.21" count="1000">
        </stats>
    </statistics>



    Asynchronous process statistics
    <statistics>
        <stats key="default/TestComposite2!2.0.2-ScopedJMSOSB*soa_bfba2527-a9ba-41a7-95c5-87e49c32f4ff/TestAsyncBPELProcess" min="2234" max="3234" average="1765.43" count="1000">
        </stats>
    </statistics>


    2. Request break down

    Below the overall latency, categorized by synchronous and asynchronous processes, is the "Request breakdown". Organized by statistic keys, the Request breakdown gives finer-grained performance statistics through the lifetime of the BPEL requests. It uses indentation to show the hierarchy of the statistics.

    Request breakdown
    <statistics>
        <stats key="eng-composite-request" min="0" max="0" average="0.0" count="0">
            <stats key="eng-single-request" min="22" max="606" average="258.43" count="277">
                <stats key="populate-context" min="0" max="0" average="0.0" count="248">



    Please note that in SOA 11.1.1.6, the statistics under Request breakdown are aggregated across all the BPEL processes based on statistic keys. They do not differentiate between BPEL processes. If two BPEL processes happen to have statistics that share the same statistic key, the statistics from the two BPEL processes will be aggregated together. Keep this in mind when we go through more details below.
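If you save the page and want to rank the statistic keys by latency, a short script can walk the nested stats elements. This is a minimal sketch under stated assumptions: it expects a well-formed statistics document, so the sample below is the fragment shown above with the closing tags added by hand, and the function names are invented for this example.

```python
import xml.etree.ElementTree as ET


def walk_stats(element, depth=0):
    """Yield (depth, key, average, count) for every nested <stats> element."""
    for stats in element.findall("stats"):
        yield (
            depth,
            stats.get("key"),
            float(stats.get("average")),
            int(stats.get("count")),
        )
        yield from walk_stats(stats, depth + 1)


def slowest(xml_text, top=5):
    """Return the rows with the highest average latency, slowest first."""
    rows = list(walk_stats(ET.fromstring(xml_text)))
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]


# The fragment from the page above, with closing tags added so it parses.
SAMPLE = """<statistics>
    <stats key="eng-composite-request" min="0" max="0" average="0.0" count="0">
        <stats key="eng-single-request" min="22" max="606" average="258.43" count="277">
            <stats key="populate-context" min="0" max="0" average="0.0" count="248">
            </stats>
        </stats>
    </stats>
</statistics>"""

for depth, key, average, count in walk_stats(ET.fromstring(SAMPLE)):
    # Indent by depth to mirror the hierarchy shown on the page.
    print("{}{} avg={}ms count={}".format("  " * depth, key, average, count))
```

Because the keys are aggregated across processes, a high average here points to a statistic key, not necessarily to one particular BPEL process.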


    2.1 BPEL process activity latencies

    A very useful measurement in the Request Breakdown is the performance statistics of the BPEL activities you put in your BPEL processes: Assign, Invoke, Receive, etc. The names of the measurements in the JSP page come directly from the names you assign to each BPEL activity. These measurements are under the statistic key "actual-perform".

    Example 1:
    Below is the measurement for the BPEL activity "AssignInvokeCreditProvider_Input", which looks like an Assign activity in a BPEL process that assigns an input variable before passing it to the invocation:


                                   <stats key="AssignInvokeCreditProvider_Input" min="1" max="8" average="1.9" count="153">
                                        <stats key="sensor-send-activity-data" min="0" max="1" average="0.0" count="306">
                                        </stats>
                                        <stats key="sensor-send-variable-data" min="0" max="0" average="0.0" count="153">
                                        </stats>
                                        <stats key="monitor-send-activity-data" min="0" max="0" average="0.0" count="306">
                                        </stats>
                                    </stats>

    Note: as mentioned previously, the statistics across all BPEL processes are aggregated by statistic key, so if two BPEL processes happen to give their Invoke activities the same name, they will show up as one measurement (i.e. one statistic key).
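
    To see why a shared key can hide per-process behavior, here is a minimal arithmetic sketch. It is not the engine's actual implementation; it only assumes that pooling two sets of samples under one key sums the counts, keeps the widest min/max, and yields a count-weighted average. The Stat class, the merge function, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stat:
    """One statistic entry: latency bounds, mean latency, and sample count."""
    min: float
    max: float
    average: float
    count: int


def merge(a: Stat, b: Stat) -> Stat:
    """Combine two entries that share a statistic key into one entry."""
    total = a.count + b.count
    if total == 0:
        return Stat(0, 0, 0.0, 0)
    # Entries with no samples carry placeholder bounds, so ignore them.
    sampled = [s for s in (a, b) if s.count > 0]
    return Stat(
        min=min(s.min for s in sampled),
        max=max(s.max for s in sampled),
        # Count-weighted mean of the two averages.
        average=(a.average * a.count + b.average * b.count) / total,
        count=total,
    )


# Hypothetical figures for an Invoke activity named identically in two processes.
process_a = Stat(min=1, max=13, average=3.0, count=100)
process_b = Stat(min=40, max=900, average=250.0, count=50)
combined = merge(process_a, process_b)
print(combined)
```

    In this made-up case the combined average matches neither process, so a fast activity and a slow one with the same name become indistinguishable. Giving activities distinct names across processes avoids the problem.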

    Example 2:
    The following is the measurement of the BPEL activity called "InvokeCreditProvider". You can see not only that it takes 3.31 ms on average to finish this call (pretty fast), but also, from the further breakdown, that most of this 3.31 ms was spent in "invoke-service".

                                    <stats key="InvokeCreditProvider" min="1" max="13" average="3.31" count="153">
                                        <stats key="initiate-correlation-set-again" min="0" max="0" average="0.0" count="153">
                                        </stats>
                                        <stats key="invoke-service" min="1" max="13" average="3.08" count="153">
                                            <stats key="prep-call" min="0" max="1" average="0.04" count="153">
                                            </stats>
                                        </stats>
                                        <stats key="initiate-correlation-set" min="0" max="0" average="0.0" count="153">
                                        </stats>
                                        <stats key="sensor-send-activity-data" min="0" max="0" average="0.0" count="306">
                                        </stats>
                                        <stats key="sensor-send-variable-data" min="0" max="0" average="0.0" count="153">
                                        </stats>
                                        <stats key="monitor-send-activity-data" min="0" max="0" average="0.0" count="306">
                                        </stats>
                                        <stats key="update-audit-trail" min="0" max="2" average="0.03" count="153">
                                        </stats>
                                    </stats>
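
    If you save the statistics to a file, the same reading can be done programmatically. The following is a minimal sketch, assuming the output is well-formed XML shaped like the fragment above; the child_shares helper is hypothetical, and the sample is an abbreviated copy of the "InvokeCreditProvider" entry. It compares total time (average multiplied by count) because child counts can differ from the parent count.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the "InvokeCreditProvider" entry shown above.
SAMPLE = """
<stats key="InvokeCreditProvider" min="1" max="13" average="3.31" count="153">
    <stats key="invoke-service" min="1" max="13" average="3.08" count="153">
        <stats key="prep-call" min="0" max="1" average="0.04" count="153"/>
    </stats>
    <stats key="update-audit-trail" min="0" max="2" average="0.03" count="153"/>
</stats>
"""


def child_shares(parent):
    """Return {child key: fraction of the parent's total time} for direct children."""
    parent_total = float(parent.get("average")) * int(parent.get("count"))
    shares = {}
    for child in parent.findall("stats"):  # direct children only
        child_total = float(child.get("average")) * int(child.get("count"))
        shares[child.get("key")] = child_total / parent_total if parent_total else 0.0
    return shares


root = ET.fromstring(SAMPLE)
shares = child_shares(root)
for key, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{key}: {share:.1%}")
```

    Sorting by share puts the dominant child first, which is usually the place to start looking when an activity is slow.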



    2.2 BPEL engine activity latency

    Another type of measurement under Request breakdown is the latency of underlying system-level engine activities. These activities are not directly tied to a particular BPEL process or process activity, but they are critical factors in overall engine performance. They include saving asynchronous requests to the database and process dehydration.

    My friend Malkit Bhasin is working on providing more information on interpreting the statistics on engine activities on his blog (https://blogs.oracle.com/malkit/). I will update this blog once the information becomes available.

    Update on 2012-10-02: My friend Malkit Bhasin has published a detailed interpretation of the BPEL service engine statistics on his blog at http://malkit.blogspot.com/2012/09/oracle-bpel-engine-soa-suite.html.

    Retrieve Performance Data from SOA Infrastructure Database

    Here I would like to offer examples of some basic SQL queries you can run against the infrastructure database of Oracle SOA Suite 11G to obtain performance statistics for a given period of time. The final version of the script will prompt for the start and end time of the period of interest. [Read More]

    Monday Sep 24, 2012

    2 way SSL between SOA and OSB

    This blog describes all the steps to set up 2-way SSL between SOA and OSB. With some adjustments where appropriate, the steps should also apply if the external service is hosted on a server other than OSB. [Read More]

    Thursday Aug 30, 2012

    SOA Suite 11g Asynchronous Testing with soapUI

    Although there are various write-ups on the topic of testing asynchronous web services using soapUI, this blog is intended to provide a very simple guide to setting this up in the context of SOA Suite 11g. With this knowledge you can use soapUI free edition to go beyond the test harness that comes bundled with SOA Suite 11g Enterprise Manager. This also serves as a nice introduction to another blog of mine: SOA Suite 11g Dynamic Payload Testing with soapUI Free Edition.[Read More]

    SOA Suite 11g Dynamic Payload Testing with soapUI Free Edition

    When running various tests like smoke tests, unit tests, load tests, etc. tools like soapUI are frequently used. Although soapUI is a very easy to use tool, there are things that can be done to expand beyond the basics which many may or may not consider "easy to use". For example, how do you create dynamic payloads for stress tests without using the Pro version of soapUI? Well, this blog will show one way to use soapUI free edition to run tests with dynamic payloads.[Read More]

    Friday Jun 29, 2012

    BAM design pointers

    In working recently with a large Oracle customer on SOA and BAM, I discovered that some BAM best practices are not as well known as I had always assumed! There is a doc bug out to formally incorporate these learnings, but here are a few notes.

     EMS-DO parity

    When using EMS (Enterprise Message Source) as a BAM feed, the best practice is to use one EMS to write to one Data Object. There is a possibility of collisions and duplicates when multiple EMS write to the same row of a DO at the same time. This customer had 17 EMS writing to one DO at the same time. Every sensor in their BPEL process wrote to a single topic, but that topic was read by a separate EMS for each sensor. They then used XSL within BAM to transform the payload into the BAM DO format. Hence, for a given BPEL instance, 17 sensors fired and populated 1 JMS topic, which was consumed by 17 EMS that in turn wrote to 1 Data Object. (You can imagine what would happen in later versions of the application that need to send more information to BAM!)

    We modified their design to use one master XSL, keyed on sensor name, for all sensors relating to a DO (say, the Data Object 'Orders'), and were thus able to reduce the 17 EMS to 1.
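
    The idea behind the master transformation can be sketched outside XSL. The following is only an illustration of dispatching on sensor name inside a single consumer, written in Python rather than the XSL the customer used; the sensor names, payload fields, and Data Object columns are all hypothetical.

```python
# Hypothetical sensor names and Data Object columns, for illustration only.
FIELD_MAPS = {
    "OrderReceived": {"orderId": "ORDER_ID", "receivedAt": "RECEIVED_TIME"},
    "OrderShipped": {"orderId": "ORDER_ID", "shippedAt": "SHIPPED_TIME"},
}


def to_data_object_row(sensor_name, payload):
    """Map one sensor message to the columns of the shared Data Object."""
    mapping = FIELD_MAPS.get(sensor_name)
    if mapping is None:
        raise KeyError(f"no mapping for sensor {sensor_name!r}")
    # Copy only the fields this sensor actually sent.
    return {column: payload[field] for field, column in mapping.items() if field in payload}


row = to_data_object_row(
    "OrderShipped", {"orderId": "1001", "shippedAt": "2012-06-29T10:15:00"}
)
print(row)
```

    Because one consumer handles every sensor, writes to a given row are no longer spread across many concurrent writers.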

    For those of you wondering how squeaky clean this design is, you are right! It is indeed not squeaky clean, and that brings us to yet another 'inferred' best practice. (I try very hard not to state the obvious in my blogs, in the hope that every time I blog it is very useful, but this one is an exception.)

    Transformations and Calculations

    It is optimal to do transformations within an engine like BPEL. Not only does this provide modelling ease with a nice GUI XSL mapper in JDeveloper, the XSL engine in BPEL is quite efficient at runtime as well. And so, doing XSL transformations in BAM is not quite prudent. 

    The same is true for any non-trivial calculations as well. It is best to do all transformations and calculations, and to sanitize the data, in BPEL or a similar layer and then send the result to BAM (via JMS, WS, etc.). This leaves Oracle BAM with only report rendering and the mechanics of real-time reporting, which is what it is best suited to do.

    All nulls are not created equal
    Here is yet another fact that may already be known but is worth reiterating.
    For an EMS with an Upsert operation:

    a) If empty tags or tags with no value are sent, like <Tag1/> or <Tag1></Tag1>, the corresponding DO field will be overwritten with --null--.
    b) If empty tags are suppressed, i.e. not generated at all, the corresponding DO field will NOT be overwritten. The field will keep whatever value existed previously.

    For an EMS with an Insert operation, both tags with an empty value and missing tags result in --null-- being written to the DO.
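
    On the sending side, this means the producer decides whether an Upsert clears a field. Below is a minimal sketch, assuming the payload is built in code before it is put on the JMS topic; the build_payload helper, the element names, and the values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_payload(root_tag, fields, suppress_empty=True):
    """Serialize fields to XML; optionally omit tags whose value is None or ''."""
    root = ET.Element(root_tag)
    for name, value in fields.items():
        if value in (None, ""):
            if suppress_empty:
                continue  # no tag at all: an Upsert leaves the DO field untouched
            ET.SubElement(root, name)  # empty tag: an Upsert overwrites with null
        else:
            ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical update that carries a new status but no new amount.
update = {"OrderId": "1001", "Status": "SHIPPED", "Amount": None}
print(build_payload("Order", update))
print(build_payload("Order", update, suppress_empty=False))
```

    The first call omits the Amount element entirely, so an Upsert would keep the existing value; the second emits an empty Amount element, which would overwrite it with null.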

    Hope this helps ..

    Happy 4th!

    Thursday May 24, 2012

    How to Set JVM Parameters in Oracle SOA 11G

    You know you need to tune the JVM and you already know what parameters to set: -Xmx, -verbose:gc ... Plus, there is plenty of WebLogic and Java documentation about them. Now you just need to ... set them. But such a simple task is sometimes confusing.

    In Oracle SOA 11G running within WebLogic, how and where do you set these parameters? In which file: startWebLogic.sh, setDomainEnv.sh, or the server-start entries in config.xml? And where in the file should you set them, given that the scripts seem to set and modify the parameter values a lot?

    If you have this question, please see my blog How to Set JVM Parameters in Oracle SOA 11G

    Tuesday Apr 10, 2012

    Running Built-In Test Simulator with SOA Suite Healthcare 11g in PS4 and PS5

    SOA Suite Healthcare Integration Pack provides a built-in utility to simulate an external endpoint for HL7 messaging flows. This lightweight, ant-based utility can be very useful to quickly simulate an endpoint for setting up round-trip HL7 messaging in a standalone, closed environment.

    This note gives an overview of the setup and usage of the simulator utility. It also points out the differences that users have to keep in mind when migrating from the PS4 release to the PS5 release.

    [Read More]

    Wednesday Mar 14, 2012

    Enterprise-class SOA on Exalogic... what, why and how?

    Exalogic A-Team member Rupesh Ramachandran has recently blogged about his OpenWorld 2011 talk on enterprise-class SOA systems on Exalogic. He covers some of the basic "what, why, and how" questions about Exalogic. Please follow this link to the A-Team Exalogic blog to read more.

    Wednesday Feb 08, 2012

    Tune Audit Trail in SOA 11G to Avoid Memory and Transaction Problems

    Before 11.1.1.3, BPEL audit trails were saved to the database in the same JTA transaction as the main transaction. This causes three main problems. What are these problems, and what does SOA 11.1.1.3 do differently to solve them? Please read my blog at BlogSpot, Tune Audit Trail in SOA 11G to Avoid Memory and Transaction Problems, for more details. [Read More]

    Tuesday Feb 07, 2012

    11g purging white paper

    It's finally released! The 11g white paper on purging is now available on OTN. This white paper was written by Michael Bousamra of Oracle SOA development, with contributions from me and Sai of SOA development. You can find the 11g white paper here.

    As always, comments are welcome!

    Deepak
    About


    This is the blog for the Oracle FMW Architects team fondly known as the A-Team. The A-Team is the central, technical, outbound team as part of the FMW Development organization working with Oracle's largest and most important customers. We support Oracle Sales, Consulting and Support when deep technical and architectural help is needed from Oracle Development.
    This blog is primarily tailored to SOA issues (BPEL, OSB, BPM, Adapters, CEP, B2B, JCAP) that our team encounters. Expect real solutions to customer problems encountered during customer engagements.
    We will highlight best practices, workarounds, architectural discussions, and discuss topics that are relevant in the SOA technical space today.
