
Technical info and insight on using Oracle Documaker for customer communication and document automation

ODEE and JMS Performance

Andy Little
Technical Director

So you've just installed a brand-new Oracle Documaker Enterprise Edition system ("ODEE"), and at some point during your implementation you're going to have to scale it. You're probably already familiar with ODEE's scaling properties, but let's review a little bit. In the past, with Standard Edition (a/k/a Documaker, Documaker RP, "ODSE", or various other names), scaling meant figuring out how to split input files into multiple jobs, distributing those jobs to multiple executions of ODSE, commingling the intermediary output, and then running a final print process. This is a rigid framework that has to be scaled manually to meet increased volumes or reduced processing windows. Some years ago, Docupresentment (a/k/a IDS) came along and suddenly Documaker was adorned with a service-based interface that allows for real-time document generation, both in batch and in "batches of one". Docupresentment added some enhanced scaling capabilities, but it still requires some manual intervention for scaling large batches and has limited automatic scaling capabilities. With ODEE's database-backed processing combined with scalable technologies, you're in the driver's seat of a supercar in the world of truly scalable document automation. Under the hood, ODEE uses JMS queues to push internal units of work from schedulers to workers, and as such requires a well-tuned JMS server to obtain the best performance. In this post, I'm going to discuss JMS configuration within WebLogic and how you can implement it for high availability and failover with ODEE. Finally, we'll cover one facet of tuning: JMS performance. Let's get started!

JMS Implementation

Let's review some of the JMS implementation details within WebLogic. The JMS components deployed by the ODEE installer consist of:

  • A Managed Server - a discrete JVM which hosts the JMS services. The managed server is named jms_server by default, but you can of course change this. Note that it's also possible to target the JMS services onto another managed server, and that's fine (however, if you choose to target JMS services to the administrative server 'AdminServer', a WebLogic patch set is needed; refer to your ODEE documentation for specific details).
  • A JMS Server - not to be confused with the JVM, the JMS Server is a management container for the JMS modules (and queues therein) that are targeted to it. It tracks which persistent store is used for any messages that arrive on its destinations and maintains the state of durable subscribers created on those destinations.
  • A Persistent Store - a physical storage location, either file-based or database-based, that is used to house persistent messages. If message delivery is critical, use of a persistent store is recommended, and database storage is more scalable and flexible than file storage.
  • A Module - a collection of JMS queues and a connection factory which can be deployed to a target JMS Server. 
  • A Subdeployment - a grouping of JMS queues.
  • A Queue Connection Factory (QCF) - the JMS object through which clients create connections; it exposes general configuration parameters covering client connections, default delivery, load balancing, and security.
  • Multiple Queues - the JMS queues themselves, each exposing its own configuration parameters and load-balancing options.

The hierarchy of these objects looks like this in a default installation with a single assembly line:

  • Managed Server "jms_server"
    • JMS Server "AL11Server"
      • Persistent Store "AL1FileStore"
      • Module "AL1Module"
        • QCF "AL1QCF"
        • Subdeployment "AL1Sub" (note that the queues are organized at the module level, but are grouped for deployment purposes into a subdeployment).
        • Q-1 "ArchiverReq"
        • Q-2 "ArchiverRes"
        • Q-n ...

An ODEE Assembly Line has its own set of workers and therefore needs its own set of JMS resources - this is why the hierarchy of components is structured as it is: an Assembly Line has a JMS Server, JMS Module, Subdeployment, Queues, and a QCF. These can be collectively retargeted and migrated as scaling needs change. 
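If you want to poke around these objects yourself, the WebLogic configuration MBean tree in WLST is a quick way to do it. Below is a minimal, read-only sketch; the connection details are placeholders, and the AL1* names are assumptions based on the default single-assembly-line install described above.

# Read-only WLST sketch: browse the JMS objects the ODEE installer created.
# Connection details are placeholders; AL1* names assume the default install.
connect('<weblogic_user_id>', '<weblogic_password>', 't3://<hostname>:<port>')

# The JMS Server and its persistent store
cd('/JMSServers/AL1Server')
ls()                                    # shows the persistent store, targets, etc.

# The JMS module, its connection factory, queues, and subdeployment
cd('/JMSSystemResources/AL1Module/JMSResource/AL1Module')
ls('ConnectionFactories')               # AL1QCF
ls('Queues')                            # ArchiverReq, ArchiverRes, ...
cd('/JMSSystemResources/AL1Module/SubDeployments/AL1Sub')
ls()                                    # how the queues are grouped and targeted

disconnect()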

JMS Operation 

First, it is important to know that WebLogic JMS provides two load-balancing algorithms: Round Robin (the default) and Random. In the round-robin algorithm, WebLogic maintains an ordering of physical destinations within the distributed destination, and the messaging load is distributed across the physical destinations one at a time, in the order they are defined in the WebLogic Server configuration. Each WebLogic Server maintains an identical ordering but may be at a different point within it. Multiple threads of execution within a single server using a given distributed destination affect each other with respect to which physical destination a member is assigned each time they produce a message. Round-robin does not need to be configured and is the recommended setting for Documaker.

When an ODEE Worker starts, it must connect to a queue destination as a consumer. When distributed destinations are used, WebLogic JMS must find a physical destination that the worker will receive messages from. The choice of which destination member to use is made only upon initial connection by using one of the load-balancing algorithms. From that point on, the consumer gets messages from that member only. When testing failover behavior of Workers and queues, you will notice how ODEE handles loss of queue connections. When a distributed JMS destination member goes down, the Worker will lose connection to the member, and will destroy the existing consumer. The Worker will attempt to re-establish queue connection by creating a new consumer, according to the selected load-balancing algorithm.

When a producer sends a message, WebLogic JMS looks at the destination where the message is being sent. If the destination is a distributed destination, WebLogic JMS decides where the message will be sent: the producer sends to one of the destination members according to the selected load-balancing algorithm, and it makes this decision each time it sends a message. There is no compromise of ordering guarantees between a consumer and producer, however, because consumers are load balanced once and are then pinned to a single destination member. If a producer attempts to send a persistent message to a distributed destination, every effort is made to first forward the message to distributed members that utilize a persistent store. However, if none of the distributed members utilize a persistent store, the message will still be sent to one of the members according to the selected load-balancing algorithm. It is therefore important to understand that JMS Servers do not share messages in a cluster unless additional configuration is performed to forward JMS messages between distributed queue members.
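If you do experiment with uniform distributed queues (keeping in mind the support caveat discussed next), both behaviors described above are controlled per destination: the load-balancing policy and the forwarding of messages between members. The sketch below is a hedged example of inspecting and changing those attributes in WLST; it assumes you have created the queue as a uniform distributed queue (which is not the ODEE default), and the module and queue names are illustrative only.

# Hedged WLST sketch: load balancing and member forwarding on a uniform
# distributed queue. Assumes the queue exists as a UDQ (not the ODEE default);
# module and queue names are illustrative.
connect('<weblogic_user_id>', '<weblogic_password>', 't3://<hostname>:<port>')
edit(); startEdit()

cd('/JMSSystemResources/AL1Module/JMSResource/AL1Module/UniformDistributedQueues/ArchiverReq')
print(get('LoadBalancingPolicy'))       # 'Round-Robin' (default) or 'Random'
print(get('ForwardDelay'))              # -1 by default: messages are never forwarded
set('ForwardDelay', 10)                 # forward messages on a member with no consumers after 10 seconds

save(); activate()
disconnect()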

This forwarding configuration is really a JMS clustering concern; however, in our testing with ODEE 12.6.2 we found that it does not properly support the use of clustered JMS queues (some older versions may support them). A primary objective in implementing high availability is to eliminate single points of failure (SPoFs), and clustering is a typical remediation for SPoFs. However, there is another option available in WebLogic that remediates SPoFs: service migration, a feature of WebLogic high availability. In this configuration, a cluster of WebLogic managed servers is created (and can be scaled), the JMS service is pinned to one cluster member, and it is automatically (or manually, if you prefer) migrated from an unhealthy cluster member to a healthy one. This model requires a bit more effort to ensure the cluster members are sized appropriately to handle the work being passed through the system; however, in our testing we have found that JMS services are extremely lightweight, and the performance hit on system processing speed is trivial.

Summary:

  • WebLogic JMS implementation supports a high number of consumers and connections with just a single server.
  • WebLogic JMS connections are distributed round-robin across JMS servers.
  • Connections are established at worker startup and are held for the life of the worker.
  • JMS Messages are not shared across clustered JMS queues by default, and can be forwarded – but this is not the default behavior and must be explicitly set. Some ODEE versions do not support clustered JMS with forwarded messages, so this is not the best practice.
  • JMS Messages are not shared across uniform distributed queues if they do not all utilize persistent storage.
  • A low worker instance/thread count results in fewer connections to JMS servers and will not saturate the available connections.
  • Worker starvation can occur if messages are concentrated on one JMS server over another. Conversely, workers can be overworked by the same concentration.
  • High availability is achieved by implementing migratable services, which ensures that JMS services are available on a healthy cluster member at all times.

Configuration

Failover configuration for JMS can take several forms depending on your level of tolerance for message loss. Since this post is specifically dealing with performance I'm not going to cover failover in great detail. In general, JMS services can be configured for service migration which meets the failover requirement. To modify the default deployment of ODEE to support highly available configuration, perform the following configuration steps in WebLogic Console.

These instructions assume you have an existing ODEE installation that is already deployed to WebLogic, which means you have a machine (node) on which there are multiple managed servers (one of which hosts the JMS modules). They also assume some familiarity with WebLogic Console, which is where this configuration takes place; a WLST sketch of the key cluster and migration settings follows the steps.

  1. Create an additional machine ("machine2")
  2. Create an additional managed server ("jms_server2")
  3. Create a cluster containing the original managed server hosting JMS, and the new additional managed server.
  4. Configure the server to support migration
    1. On the cluster containing the JMS managed servers, click on the Migration tab and change the Migration Basis to Consensus
    2. Make sure the available machines are shown in chosen Candidate Machines.
    3. Save
    4. Under Environment > Clusters > Migratable Targets, select the JMS server hosting the JMS module and change the Migration policy to Auto-Migrate Exactly Once. 
    5. Make sure the available JMS managed servers are shown in chosen Constrained Candidate Servers.
    6. Save.
  5. Configure the JMS Server for migration (you can reuse the existing JMS server created by ODEE install, or create a new one)
    1. Update the JMS server to use a JDBC persistent store, targeted to the migratable managed server
    2. Change the JMS server to target the migratable managed server
    3. Change the JMS module to target the JMS cluster -- all members.
    4. Change the JMS subdeployment to target the JMS server targeted to the migratable managed server.
  6. Update the jms.provider.URL setting in Documaker Administrator (Systems -> Assembly Line n -> Configure -> QUEUES_CFG -> Bus) with a comma-delimited list of hostnames and ports for your servers. For example, for the machine and managed server created in steps 1 and 2, you would set jms.provider.URL to something like "t3://server_a:11001,server_b:11001", matching the hostname and port of each member. The port should be the same across all machines.
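For reference, the cluster and migration pieces of steps 3 and 4 boil down to a couple of MBean attributes. The sketch below is a hedged WLST equivalent (names such as jms_cluster and jms_server2 are assumptions based on the steps above); the JDBC store, retargeting, and Documaker Administrator changes in steps 5 and 6 are still performed as described.

# Hedged WLST sketch of steps 3-4: cluster the JMS managed servers and enable
# automatic service migration. All names here are illustrative.
connect('<weblogic_user_id>', '<weblogic_password>', 't3://<hostname>:<port>')
edit(); startEdit()

# Step 3: create a cluster containing the original and new JMS managed servers
cd('/')
create('jms_cluster', 'Cluster')
cd('/Servers/jms_server');  cmo.setCluster(getMBean('/Clusters/jms_cluster'))
cd('/Servers/jms_server2'); cmo.setCluster(getMBean('/Clusters/jms_cluster'))

# Step 4.1: use consensus leasing as the migration basis
cd('/Clusters/jms_cluster')
set('MigrationBasis', 'consensus')

# Step 4.4: auto-migrate the migratable target hosting the JMS server exactly once
cd('/MigratableTargets/jms_server (migratable)')
set('MigrationPolicy', 'exactly-once')

save(); activate()
disconnect()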

During a failover scenario, this configuration should act as follows: if the server instance hosting the JMS deployment fails, the services are automatically migrated to the next member of the cluster. The producers and consumers using those JMS resources will fail to connect to the now-nonexistent service on the dead server, and a connection will then be established to the next server in the list provided by the jms.provider.URL setting. Messages remain intact if the persistent store is a database.

JMS Monitoring

One method of performance tuning an ODEE implementation involves determining how efficiently workers are handling the workload. Because every implementation is different (different inputs, documents, and rules), there isn't a one-size-fits-all solution. There are a number of activities you can undertake to gain visibility into your system, and one such activity is to monitor your JMS queues. Each queue can expose information about how many messages it contains, the high water mark of messages (i.e. the maximum number of messages that have existed in the queue), the number of active consumers, and more. For our purposes, we are interested in, for each queue, the number of consumers and messages and the high water mark of messages. If you've spent any time digging around in WebLogic Console, you will soon learn that capturing enough of this information to conduct trend analysis is somewhat painful, requiring a lot of configuration and overhead. Luckily, I have put together a handy script that you can run in WLST to capture or display this information; the script is included below.

########## USER SETTINGS ############
# connection to WebLogic Instance
username='<weblogic_user_id>'
password='<weblogic_password>'
wlsUrl='t3://<hostname>:<port>'
# milliseconds to wait between polls to JMS queues
sleepTime=5000;
# comma-delimited list of managed servers hosting JMS services to query.
includeServer = ['jms_server'];
# comma-delimited list of JMS servers (note: not managed servers!) to query.
includeJms = ['AL1Server'];
# comma-delimited list of JMS destinations to query.
includeDestinations = ['IdentifierReq','PresenterReq','AssemblerReq','DistributorReq','ArchiverReq']
#ReceiverReq,ReceiverRes,PubNotifierReq,BatcherReq,SchedulerReq,PublisherReq
# path/file name of logfile to write output
logfilename = 'jmsmon.csv';
# Logging output options:
# 0 - log to screen and file
# 1 - log to file
# 2 - log to screen
logoption = 0
############ END USER SETTINGS ###########
import time
from time import gmtime, strftime
from java.lang import Thread   # WLST runs on Jython, so Java classes are available

def getTime():
    return strftime("%Y-%m-%d %H:%M:%S", gmtime())

def monitorJms():
    servers = domainRuntimeService.getServerRuntimes()
    for server in servers:
        serverName = server.getName()
        if serverName not in includeServer:
            continue
        jmsRuntime = server.getJMSRuntime()
        for jmsServer in jmsRuntime.getJMSServers():
            jmsName = jmsServer.getName()
            if jmsName not in includeJms:
                continue
            for destination in jmsServer.getDestinations():
                destName = destination.getName()
                # strip the member prefix (e.g. "Module!Server@Queue" -> "Queue")
                destName = destName[destName.find('@')+1:]
                if destName not in includeDestinations:
                    continue
                try:
                    if (logoption < 2):
                        f.write("%s,%s,%s,%s,%s,%s,%s\n" % (getTime(), serverName, jmsName, destName, destination.getMessagesCurrentCount(), destination.getMessagesHighCount(), destination.getConsumersCurrentCount()))
                    if (logoption == 0 or logoption == 2):
                        print("%s\t%s\t%s\t%s\t%s,%s\t\t\t%s" % (getTime(), serverName, jmsName, destName, destination.getMessagesCurrentCount(), destination.getMessagesHighCount(), destination.getConsumersCurrentCount()))
                except:
                    if (logoption < 2):
                        f.write('ERROR_DATA\n')
                    if (logoption == 0 or logoption == 2):
                        print('ERROR_DATA!')

connect(username, password, wlsUrl)
if (logoption < 2):
    f = open(logfilename, 'a+')
    f.write('Time,ServerName,JMSServer,Destination,Msgs Cur,Msgs High,ConsumersCur\n')
if (logoption == 0 or logoption == 2):
    print 'Time\t\t\tServerName\tJMSServer\tDestName\tMesg Cur,High\tCons. Cur Count'
try:
    while 1:
        monitorJms()
        if (logoption == 0 or logoption == 2):
            print('--')
        Thread.sleep(sleepTime)
except KeyboardInterrupt:
    if (logoption < 2):
        f.close()

This script will output, either to a file (as comma-separated values) or to the terminal (as tab-formatted output), a listing of each of the desired JMS servers and queues along with the current message depth, high water mark, and consumer count. To configure it for your environment, drop the contents of the above into a file called jmsmon.py in your [ODEE_HOME]/documaker/j2ee/weblogic/oracle11g/scripts folder, and then add a shell script to execute it, which is a simple file with these commands:

#!/bin/sh

. ./set_middleware_env.sh > /dev/null

wlst.sh jmsmon.py

Edit the .py file and adjust the settings as necessary. You'll notice that the user settings are contained at the top of the file. The only settings you must change are the username, password, and WebLogic connection URL for server/port. You can optionally change the settings for includeServer, includeJms, and includeDestinations. Each of these settings is a comma-delimited array of names that you want to be polled and included in the results. If you have multiple JMS managed servers, add them to includeServer. If you have multiple JMS servers, add them to includeJms. You can specify which destinations are included by adding them to includeDestinations - note that this group is used for all managed servers and JMS servers. In this way, if you have a clustered configuration or multiple assembly lines, you can capture the statistics for all of them using this script. Note that the default settings are to log to screen and file, and the screen uses tab-formatted output, while file output is comma-separated values for analysis in a software package like Excel.
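As an example, in a clustered configuration like the one described earlier, with a second JMS managed server and a second assembly line, the settings block might look like this (the jms_server2 and AL2Server names are hypothetical):

# Hypothetical example: two JMS managed servers and two assembly lines
includeServer = ['jms_server', 'jms_server2']
includeJms = ['AL1Server', 'AL2Server']
includeDestinations = ['IdentifierReq','PresenterReq','AssemblerReq','DistributorReq','ArchiverReq']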

The script is meant to be executed during a load test, usually of at least 100 transactions, to get useful data for analysis. Run the script and start your test. While the load test is underway, you will see the current message count and high-water mark on these queues ramp up considerably, because these are the workers that typically take more time to complete a unit of work, so a backlog of work builds up. In my particular test case, I'm running 1,000 transactions, all of which are routed for manual intervention and so will not proceed beyond the Assembler worker. If I modify the script to query only the Assembler queue, we can see the number of messages waiting. This test tells us that the Assembler is pumping through around 125 transactions every 5 seconds or so, with a single Assembler instance running. I happen to know that these are relatively complex transactions, and this particular system is a virtual machine running on a laptop with the database, application server, and processing services consolidated into a single VM, so my performance expectations are low.

By examining the consumer count (1), we know that the load-balancing algorithm built into ODEE is not kicking in under the default configuration. The load balancer allows the Scheduler worker to query each worker at regular intervals: if the worker responds within a specified time frame, it is deemed idle; if it does not, it is deemed busy. After a predefined number of busy responses, the Scheduler starts another worker instance (or thread pool, depending on the type of worker), as long as the configured maximum has not been reached. In the example above, if I were unhappy with the amount of time taken to run this batch of jobs, I could lower the threshold for load balancing to kick in, or I could preconfigure a higher number of instances at startup. In either case, the goal is to prevent worker starvation across the assembly line by having enough workers to satisfy demand, while balancing this within the confines of the processing cluster. I reviewed the Identifier queue figures in another run: the high water mark for messages in this queue was under 50 and the current message count was very low, meaning the Identifier was keeping up with demand.
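Once you have a jmsmon.csv from a run, a quick way to turn it into throughput numbers is to diff the current message count between polls for each destination. Here is a small, hedged sketch in plain Python (run it outside WLST); it assumes the CSV header written by the script above and the default 5-second poll interval, and it only approximates consumption since producers may still be adding messages during the test.

# Rough throughput analysis of jmsmon.csv from the monitoring script above.
# Assumes header: Time,ServerName,JMSServer,Destination,Msgs Cur,Msgs High,ConsumersCur
import csv
from collections import defaultdict

POLL_SECONDS = 5                          # must match sleepTime in jmsmon.py (milliseconds / 1000)

samples = defaultdict(list)               # destination -> successive current message counts
with open('jmsmon.csv') as f:
    for row in csv.DictReader(f):
        try:
            samples[row['Destination']].append(int(row['Msgs Cur']))
        except (TypeError, ValueError):
            pass                          # skip repeated headers and ERROR_DATA rows

for dest, counts in samples.items():
    # a falling queue depth between polls approximates messages consumed
    drained = sum(max(prev - cur, 0) for prev, cur in zip(counts, counts[1:]))
    elapsed = max(len(counts) - 1, 1) * POLL_SECONDS
    print('%s: high water %d, ~%.1f msgs/sec drained' % (dest, max(counts), float(drained) / elapsed))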

There is no predetermined performance configuration that will meet all needs, since each implementation is different, but this exercise will give you information to determine how to configure ODEE for your implementation and environment. Good luck!

Sources:

  1. High Availability Guide for FMW 
  2. JMS Configuration for WebLogic
