Tuesday Mar 17, 2009

This blog is continued on http://frankkieviet.blogspot.com

A few months ago I decided to look at a different place to host my blog. I compared Google's Blogger and WordPress, and finally decided to go with the former.

The full URL is: http://frankkieviet.blogspot.com/ .

View Frank Kieviet's new blog on blogger.com

Wednesday May 21, 2008

JavaOne 2008

It's a week after JavaOne 2008 now. I finally have time to post a blog. I've been extremely busy before and during JavaOne: not only with the presentations that I gave at JavaOne, but also because the Java CAPS 6 code freeze was the week before JavaOne.

At JavaOne I gave three presentations:

For Java University (the day before JavaOne), I presented a part of Joe Boulenouar's class "How Java EE 5 and SOA Help in Architecting and Designing Robust Enterprise Applications". In my part I covered ESBs, JBI and Composite Applications.

A technical session: TS-5301 Sun Java Composite Application Platform Suite: Implementing Selected EAI Patterns. I presented this with Michael Czapski, a colleague in Sun's field organization in Australia. He's also the author of the book Java CAPS Basics: Implementing Common EAI Patterns. In this session we went over a number of EAI patterns from Hohpe and Woolf's book and showed that when you use the right Integration Middleware, you use these patterns almost without realizing it.

A Birds-of-a-feather session: BOF-6211: Transactions and Java Business Integration (JBI): More Than Java Message Service (JMS). I presented this with Murali Pottlapelli, a colleague in Monrovia. Since there was interest in the slides that we presented, and because unlike Sessions, the slides of BOFs are not made available by the JavaOne organization, you can download the slides of Transactions and JBI: More Than JMS from my blog. I also recorded the sound using my MP3 player, but the quality of the recording is pretty bad. Nevertheless, I've also uploaded the mp3 of Transactions and JBI: More Than JMS.

What's next? Now that CAPS 6 is almost out of the door, we're going to focus on the next release. Even more than in the past, we'll be doing this in open source. More to come!

Sunday Oct 07, 2007

Server side Internationalization made easy

Last year I wrote a blog entry on my gripes with Internationalization in Java for server side components. Sometime in January I built a few utilities for JMSJCA that make internationalization for server side components a lot easier. To make them available to a larger audience, I added the utilities to the tools collection on http://hulp.dev.java.net. What do these utilities do?

Generating resource bundles automatically

The whole point was that when writing Java code, I would like to keep internationalizable texts close to my Java code. Rather than in resource bundles, I prefer to keep texts in my Java code so that:

  1. While coding, you don't need to keep switching between a Java file and a resource bundle.
  2. No more missing messages because of typos in error identifiers; no more obsolete messages in resource bundles.
  3. You can easily review code to make sure that error messages make sense in the context in which they appear, and you can easily check that the arguments for the error messages indeed match.

Instead of writing code like this:

        sLog.log(Level.WARNING, "e_no_match_w_pattern", new Object[] { dir, pattern, ex }, ex);

Prefer code like this:

        sLog.log(Level.WARNING, sLoc.t("E131: Could not find files with pattern {1} in directory {0}: {2}",
            dir, pattern, ex), ex);

Here's a complete example:

public class X {
    Logger sLog = Logger.getLogger(X.class.getName());
    Localizer sLoc = Localizer.get();
    public void test() {
        sLog.log(Level.WARNING, sLoc.t("E131: Could not find files with pattern {1} in directory {0}: {2}",
            dir, pattern, ex), ex);
    }
}

Hulp has an Ant task that goes over the generated classes and extracts these phrases and writes them to a resource bundle. E.g. the above code results in this resource bundle:

# DO NOT EDIT
# THIS FILE IS GENERATED AUTOMATICALLY FROM JAVA SOURCES/CLASSES
# net.java.hulp.i18ntask.test.TaskTest.X
TEST-E131 = Could not find files with pattern {1} in directory {0}\: {2}

To use the Ant task, add something like this to your Ant script, typically between <javac> and <jar>:

<taskdef name="i18n" classname="net.java.hulp.i18n.buildtools.I18NTask" classpath="lib/net.java.hulp.i18ntask.jar"/>
<i18n dir="${build.dir}/classes" file="src/net/java/hulp/i18ntest/msgs.properties" prefix="TEST" />

How does the Ant task know which strings should be copied into the resource bundle? It uses a regular expression. By default it looks for strings that start with a single uppercase letter, followed by three digits and a colon, i.e. the regular expression [A-Z]\d\d\d: .*
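To see the default pattern in action, here is a small self-contained check. The class name is mine; the pattern string is the one from the text above:

```java
import java.util.regex.Pattern;

public class MsgPatternDemo {
    // Default pattern used to recognize internationalizable strings:
    // an uppercase letter, three digits, a colon and a space, then the text.
    static final Pattern MSG = Pattern.compile("[A-Z]\\d\\d\\d: .*");

    public static void main(String[] args) {
        // Qualifies for extraction into the resource bundle:
        System.out.println(MSG.matcher(
            "E131: Could not find files with pattern {1} in directory {0}: {2}").matches());
        // An ordinary string is left alone:
        System.out.println(MSG.matcher("just a plain string").matches());
    }
}
```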

Getting messages out of resource bundles

With the full English message in the Java code, how is the proper localized message obtained? In the code above, this is done in this statement:

sLoc.t("E131: Could not find files with pattern {1} in directory {0}: {2}", dir, pattern, ex)

The method t takes the string, extracts the message ID out of it (E131), uses the message ID plus prefix (TEST) to look up the message in the right resource bundle, and returns the substituted text. The method t lives in the class Localizer. This is a class that needs to be declared in the package where the resource bundles are placed. The class derives from net.java.hulp.i18n.LocalizationSupport. E.g.:

public class Localizer extends net.java.hulp.i18n.LocalizationSupport {
    public Localizer() {
        super("TEST");
    }
    private static final Localizer s = new Localizer();
    public static Localizer get() {
        return s;
    }
}
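The lookup that t performs can be sketched in plain Java. This is a hypothetical reconstruction for illustration, not Hulp's actual code; the in-memory bundle map, the "TEST" prefix handling, and the fallback to the embedded English text are my assumptions:

```java
import java.text.MessageFormat;
import java.util.Map;

// Hypothetical sketch of a t()-style lookup; net.java.hulp.i18n may differ.
public class LookupSketch {
    // Stand-in for a loaded resource bundle (prefix TEST).
    static final Map<String, String> BUNDLE = Map.of(
        "TEST-E131", "Could not find files with pattern {1} in directory {0}: {2}");

    static String t(String msg, Object... args) {
        String id = msg.substring(0, msg.indexOf(':'));        // "E131"
        String localized = BUNDLE.getOrDefault("TEST-" + id,   // look up by prefix + id
            msg.substring(msg.indexOf(':') + 2));              // fall back to embedded text
        return MessageFormat.format(localized, args);
    }

    public static void main(String[] args) {
        System.out.println(t("E131: Could not find files with pattern {1} in directory {0}: {2}",
            "/tmp", "*.xml", "permission denied"));
        // prints: Could not find files with pattern *.xml in directory /tmp: permission denied
    }
}
```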

The class name should be Localizer so that the Ant task can be extended later to automatically detect which packages use which resource bundles.

Using the compiler to enforce internationalized code

It would be nice if the compiler could force internationalized messages to be used. To do that, Hulp includes a wrapper around java.util.logging.Logger that only takes objects of class LocalizedString instead of just String. The class LocalizedString is a simple wrapper around String. The Localizer class produces these strings. By avoiding using java.util.logging.Logger directly, and instead using net.java.hulp.i18n.Logger the compiler will force you to use internationalized texts. Here's a full example:

public class X {
    net.java.hulp.i18n.Logger sLog = Logger.getLogger(X.class);
    Localizer sLoc = Localizer.get();
    public void test() {
        sLog.warn(sLoc.x("E131: Could not find files with pattern {1} in directory {0}: {2}",
            dir, pattern, ex), ex);
    }
}

Logging is one area that requires internationalization, another is exceptions. Unfortunately there's no general approach to force internationalized messages in exceptions. You can only do that if you define your own exception class that takes the LocalizedString in the constructor, or define a separate exception factory that takes this string class in the factory method.
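For instance, an exception class of this kind might look as follows. The LocalizedString here is a minimal stand-in I wrote so the example is self-contained; Hulp's real class, and the E042 message, are illustrative only:

```java
// Sketch: an exception whose constructor only accepts localized messages,
// so the compiler rejects plain String literals.
public class I18nException extends Exception {
    // Minimal stand-in for net.java.hulp.i18n.LocalizedString.
    public static final class LocalizedString {
        private final String s;
        public LocalizedString(String s) { this.s = s; }
        @Override public String toString() { return s; }
    }

    public I18nException(LocalizedString msg, Throwable cause) {
        super(msg.toString(), cause);
    }

    public static void main(String[] args) {
        Exception e = new I18nException(
            new LocalizedString("E042: Something went wrong"), null);
        System.out.println(e.getMessage());
        // new I18nException("plain string", null) would not compile.
    }
}
```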

Download

Go to http://hulp.dev.java.net to download these utilities. The jars (the Ant task and utilities) are also hosted on the Maven repository on java.net.

Sunday Sep 30, 2007

Using Nested Diagnostics Contexts in Glassfish

What is a Nested Diagnostics Context?

Let's say that we're writing a message driven bean (MDB) that we'll deploy on Glassfish. Let's say that the MDB's onMessage() method grabs the payload of the message and calls into a stateless session bean (SLSB) for processing. Let's say that the implementation of the SLSB calls into org.apache.xparser:

my.company.MDB > my.company.SLSB > org.apache.xparser

The xparser package may log some warnings if the payload is not properly formatted. No problem so far. Now suppose that the application is put in production together with a dozen other applications, and that many of these applications use this library. The administrator once in a while finds these warnings in Glassfish's server.log:

[#|2007-09-17T18:36:03.247-0400|WARN|sun-appserver9.1|org.apache.xparser.Parser
    |_ThreadID=18; ThreadName=ConsumerMessageQueue:(1);
    |Encoding missing, assuming UTF-8|#]

Let's say that the administrator wants to relay this information to the developer responsible for this application. Using the category name org.apache.xparser.Parser, the administrator can find out what code is responsible (a third party component in this case), but how can the administrator find out which application is responsible for this log output?

One approach is to always log the application name before calling into the SLSB, so that the administrator can find the application name using the _ThreadID: he would look at the _ThreadID of the warning, then look for a message earlier in the log that has the same _ThreadID that identifies the application. Not only is this cumbersome, it's also a big problem that the application now fills up the log with the application name just in case the SLSB would log something.

It would be nice if somehow the MDB could associate the thread with the application name, so that if code downstream logs anything, the log message will be adorned with the application name:

[#|2007-09-17T18:36:03.247-0400|WARN|sun-appserver9.1|org.apache.xparser.Parser
    |_ThreadID=18; Context=Payrollsync; ThreadName=ConsumerMessageQueue:(1);
    |Encoding missing, assuming UTF-8|#]

In Log4J, this is quite simple using Log4J's NDC class: before the MDB calls into the SLSB, it would call NDC.push("Payrollsync") to push the context onto the stack, and after the SLSB it would call NDC.pop(). NDC stands for Nested Diagnostic Context. It's called nested because it maintains a stack, so that the SLSB could push another context onto the stack, hiding the context of the MDB, and pop the context off the stack to restore the stack in its original state before returning. Of course each thread has to have its own stack.

The NDC is a nice facility in Log4J. Unfortunately, in java.util.logging there's no such facility. Let's build one!

Building a Nested Diagnostic Context

The Nested Diagnostics Context will have to keep a stack per thread. When the logging mechanism logs something, it needs to peek at the stack and add the top most item on the stack to the log message. The stack needs to be accessible somehow by both the application that sets the context and by the logging mechanism. A complicating factor in this is that it needs to work with both delegating-first and self-first classloaders. The latter is found in some web applications (special setting in sun-web.xml) and in some JBI components. Furthermore, we would like to use this mechanism in Glassfish and avoid having to make changes to the Glassfish codebase. Lastly, we need to avoid making changes to the application that would cause the application to be no longer portable to other application servers.
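The per-thread stack at the heart of such a facility can be sketched with a ThreadLocal. This is a sketch of the data structure only, not the downloadable implementation; the class name Ndc is mine:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Core of a Nested Diagnostics Context: one context stack per thread.
public class Ndc {
    private static final ThreadLocal<Deque<String>> STACK =
        ThreadLocal.withInitial(ArrayDeque::new);

    public static void push(String ctx) { STACK.get().push(ctx); }
    public static String pop()          { return STACK.get().pop(); }
    // Topmost context for the current thread, or null if none is set;
    // this is what the logging mechanism would peek at.
    public static String peek()         { return STACK.get().peek(); }

    public static void main(String[] args) {
        Ndc.push("Payrollsync");
        System.out.println(Ndc.peek());  // Payrollsync
        Ndc.pop();
        System.out.println(Ndc.peek());  // null: stack restored
    }
}
```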

How do we expose an API to the application so that it can push and pop contexts onto and off the stack? We could define a new Java API, but that would mean that unless an application packages the jar with that new API, it cannot be deployed on an application server instance that doesn't have the NDC code in its classpath. Here's a solution that doesn't require a new API: reuse the existing java.util.logging.Logger API! We'll define two special logger names, one for pushing contexts onto the stack, and one for popping contexts off the stack. Since we're tapping into the log stream anyways, this is not as far a stretch as it may seem. Here's how an application uses this mechanism:

    Logger.getLogger("com.sun.EnterContext").fine("Payrollsync");
    slsb.process(msg.getText());
    Logger.getLogger("com.sun.ExitContext").fine("Payrollsync");

The loggers com.sun.EnterContext and com.sun.ExitContext are special loggers that we'll develop; messages written to these loggers directly interact with the context stack. Through these special loggers, this example will result in adding the context to any log messages that are produced in the slsb.process(msg) call. On other application servers without these special loggers, this will result in logging the context at FINE level before and after the call to the SLSB is made, so that one can associate a log message using the _ThreadID; it will not do anything if FINE logging is turned off.

What if we want to add more than one context parameter to the log message? For instance, what if we want to add the ID of the message that we're processing?

    Logger.getLogger("com.sun.EnterContext")
        .log(Level.FINE, "{0}={1}, {2}={3}", new Object[] {"Application", "Payrollsync", "Msgid", msg.getMessageID()});
    slsb.process(msg.getText());
    Logger.getLogger("com.sun.ExitContext")
        .log(Level.FINE, "{0}={1}, {2}={3}", new Object[] {"Application", "Payrollsync", "Msgid", msg.getMessageID()});

The special logger will take the Object[] and push these onto the stack. The message string "{0}={1}, {2}={3}" is there merely for portability: if the code is deployed onto an application server to which we didn't install the NDC facilities, this will simply log the context parameters at FINE level.

Implementation

In a stand alone Java application, you would simply set your own LogManager and implement the NDC functionality there. Glassfish already comes with its own LogManager, and we don't want to override that. Rather, we want to plug in new functionality without any changes to the existing code base. Here's what we need to do:
  1. create the special loggers com.sun.EnterContext and com.sun.ExitContext
  2. hookup these special loggers
  3. hook into the log stream to print out the context

To create the special loggers, we can simply create a new class that derives from java.util.logging.Logger, say EntryLogger. Next, we need to make sure that when someone calls Logger.getLogger("com.sun.EnterContext"), it is this class that is returned. Without making any changes to the LogManager, this can be accomplished by instantiating the new EntryLogger and registering it with the LogManager immediately. This has to be done before anybody calls Logger.getLogger("com.sun.EnterContext"), in other words, before any application starts. Glassfish has an extensibility mechanism for this called LifeCycleListeners: an object that implements this interface can be loaded by Glassfish automatically upon startup.
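A sketch of such a special logger, using only java.util.logging. This is not the downloadable implementation: the embedded per-thread stack is a simplified stand-in, and install() is where a LifeCycleListener would hook in:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.logging.Level;
import java.util.logging.LogManager;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Sketch: a Logger that, instead of logging, pushes each message onto a
// per-thread context stack. Registering it with the LogManager up front means
// later Logger.getLogger("com.sun.EnterContext") calls return this instance.
public class EntryLogger extends Logger {
    static final ThreadLocal<Deque<String>> CONTEXTS =
        ThreadLocal.withInitial(ArrayDeque::new);
    private static EntryLogger instance;  // strong reference; LogManager may hold weak refs

    public EntryLogger() {
        super("com.sun.EnterContext", null);
        setLevel(Level.ALL);  // FINE messages must not be filtered out
    }

    @Override
    public void log(LogRecord record) {
        CONTEXTS.get().push(record.getMessage());  // the message *is* the context
    }

    // Call from a Glassfish LifeCycleListener at server startup,
    // before any application obtains this logger.
    public static void install() {
        instance = new EntryLogger();
        LogManager.getLogManager().addLogger(instance);
    }

    public static void main(String[] args) {
        install();
        Logger.getLogger("com.sun.EnterContext").fine("Payrollsync");
        System.out.println(CONTEXTS.get().peek());  // Payrollsync
    }
}
```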

Lastly, we need to find a way to add the context to the log entries in the log. Glassfish already has a mechanism to add key-value pairs to each log entry: when formatting a LogRecord for printing, Glassfish calls LogRecord.getParameters() and checks each object in the returned Object[] for objects that implement java.util.Map and java.util.Collection. For objects that implement java.util.Map, Glassfish adds the key-value pairs to the log message. For objects that implement java.util.Collection, Glassfish adds each entry as a String to the log message.

If each LogRecord can somehow be intercepted before it reaches Glassfish's Formatter, the context can be added as an extra parameter to the LogRecord's parameter list. This can be done by adding a new java.util.logging.Handler to the root-Logger before Glassfish's own Handler. For each LogRecord that this new Handler receives, it will inspect the Context stack and add a Map with the Context to the LogRecord. Next, the root-Logger will send the LogRecord to Glassfish's own Handler which takes care of printing the message into the log. Once again, the LifeCycleListener is the ideal place to register the new Handler.
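Such a decorating Handler might look like the sketch below. The context value is hard-coded here where the real implementation would consult the per-thread stack; registering it on the root Logger before Glassfish's own Handler is what makes the ordering work:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.logging.Handler;
import java.util.logging.LogRecord;

// Sketch: a Handler that produces no output itself; it only decorates each
// LogRecord with a Map, so a downstream Formatter that understands Map
// parameters (as Glassfish's does) prints the key-value pairs into the log.
public class ContextHandler extends Handler {
    @Override
    public void publish(LogRecord record) {
        // In the real implementation this would come from the NDC stack.
        Map<String, String> ctx = Map.of("Context", "Payrollsync");
        Object[] params = record.getParameters();
        if (params == null) {
            record.setParameters(new Object[] { ctx });
        } else {
            Object[] extended = Arrays.copyOf(params, params.length + 1);
            extended[params.length] = ctx;
            record.setParameters(extended);
        }
    }

    @Override public void flush() { }
    @Override public void close() { }
}
```

Registration would be a one-liner such as `Logger.getLogger("").addHandler(new ContextHandler())`, run from the LifeCycleListener.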

Give it a spin!

You can download the jar that has these new classes and/or download the sources. Put the jar in Glassfish's lib directory. Restart the server and install the LifeCycleListener:

LifeCycleListener Configuration

Tuesday May 15, 2007

JavaOne 2007

All of last week I was at JavaOne. It was an exhausting but very interesting week. Like last year, there were many interesting sessions, too many to list here. The one I enjoyed most was Neal Gafter's session on Closures for the Java Programming Language (BOF-2358). I can't wait until they're in the Java language!

Not only did I attend sessions and BOFs, I also presented BOFs. Three of them to be precise. I recorded the audio on my MP3 player. Unfortunately the quality of the audio is pretty bad. I'm posting the audio recordings below. I'm also posting the slides. Here they are:

BOF8847: Developing Components for Java Business Integration: Binding Components and Service Engines

Presented by Frank Kieviet, Alex Fung, Sherry Weng, and Srinivasan Chikkala
Attendance: about 100

You cannot cover how to write JBI components in just 45 minutes, and we were not sure what the audience would be interested in. We assumed that the audience would consist mostly of people who had never written a JBI component before and were relatively new to JBI, so we decided to talk mostly about general information on JBI and JBI components, highlight the power of JBI, and discuss how to go about developing a component.

As an experiment I wanted to try a new format (at least new for me): rather than slicing up the session into four parts of 10 minutes, we cast the session into a "discussion forum". Of course the questions and answers (and even the jokes) were well rehearsed.

Unfortunately, the audio/visual people who control the meeting rooms had forgotten to start the session timer. As a result the audio was cut unexpectedly, just a minute before we could finish up.

Nevertheless, I think it was an interesting session.

Presentation JavaOne07-BOF8847 (pdf)

Audio JavaOne07-BOF8847 (mp3)


BOF8745: Leveraging Java EE in JBI and vice versa

Presented by Frank Kieviet and Bhavanishankara Sapaliga
Attendance: about 60

This BOF was originally to be presented by Vikas Awasthi and Bhavanishankara Sapaliga, but Vikas couldn't make it, so I replaced him. We focused the session on how JBI and EE can play together, trying to make it interesting for both JBI application developers as well as for EE developers. At the end I ran a demo with NetBeans showing three different scenarios. The demo-gods were with me: the demo went very smoothly. Unfortunately I forgot to demo how to add an EJB to a composite application. Another valuable lesson learned.

Presentation JavaOne07-BOF8745 (pdf)

Audio JavaOne07-BOF8745 (mp3)


BOF9982: The java.lang.OutOfMemoryError: PermGen Space error demystified

Presented by Edward Chou and Frank Kieviet
Attendance: about 116

This session was on Thursday night at 10pm. That night was the JavaOne After dark bash. Free beers, music and snacks for everyone. Therefore we didn't expect much of an attendance: memory leaks are a rather dry subject, and why leave the party early to go to this session? Also, some of our thunder had been stolen by SAP who demo-ed a tool to track memory leaks in a morning-session earlier that week. So we were quite surprised when about 116 people turned up for our session. Most stayed until the very end, and there were also quite a few interesting questions. Apparently a lot of people struggle with memory leaks in permgen space -- in my presentation I mention that I get about a hundred hits on my blog every day from people who search for this memory exception in Google.

Presentation JavaOne07-BOF9982 (pdf)

Audio JavaOne07-BOF9982 (mp3)

Wednesday May 02, 2007

JavaOne / memory leaks revisited...

Memory leaks in print 

A few months ago, Gregg Sporar together with A. Sundararajan started an article on memory leaks in the magazine Software Test & Performance. While writing that, he stumbled upon my blog and decided to cover the "java.lang.OutOfMemoryError: PermGen space" exception too. I offered to collaborate on the article. The article eventually grew so much it was split in two. Part one was published a month ago. Yesterday, part two was published.

Memory leaks at JavaOne

Edward Chou submitted a proposal for a BOF at JavaOne 2007. He and I will be presenting a BOF on the "java.lang.OutOfMemoryError: PermGen space" exception. I'll try to record the session with my MP3 player and post it on my blog.

In preparation for our presentation, we've been looking at some real-life examples of permgen memory leaks. We took a few memory dumps that came from actual customers in actual production environments. We discovered a few more improvements we could make to jhat: it was already fairly simple to track the leaks with jhat; with these changes it becomes really simple. We were actually quite surprised how simple. More on that in a future entry, either on my blog or on Edward's.

More at JavaOne

Speaking about JavaOne... I have my hands full. Next to the memory leaks BOF, I'm also presenting a BOF on JBI ("How to develop JBI components") and I'll be co-presenting another BOF on "EE and JBI."


Sunday Feb 04, 2007

Using Resource Adapters outside of EE containers

Java EE (J2EE) application servers usually are the ideal choice to host your back-end applications, especially if your applications require transactions, security or state management. Running in an application server, your application can easily connect to external systems provided by Resource Adapters. E.g. if your application would have to connect to a CRM (e.g. PeopleSoft), to an ERP (e.g. SAP) or to a system such as JMS, or even to a database (e.g. Oracle), it would use a Resource Adapter. Through this, the application would use advanced features such as connection pooling, connection failure detection and recovery, transactions, security propagation, etc.

As mentioned, EE or J2EE application servers are ideal containers for hosting business-side logic. But they're not the best choice in every situation. There are cases where you would prefer to write your application as a stand-alone Java program. There's nothing strange about that: you should always use the best tool for the job at hand, and no single tool is best in all situations.

Now say that you need to write a stand-alone Java application, and in that application you would need to connect to an external system. Wouldn't it be nice to be able to use an off-the-shelf Resource Adapter for this connectivity, so that you would not have to hand-code features such as connection pooling, connection failure detection and recovery etc? As I will show, this is not that difficult.

Hacking a Resource Adapter

Resource Adapters are distributed in RAR files, i.e. a file with a .rar extension. A RAR is nothing more than a ZIP file. When you open up a RAR, you will see a bunch of jars and a descriptor file with the name ra.xml. In a nutshell, this is what you need to do to use a Resource Adapter:

  1. Add the jars to your application's classpath
  2. Instantiate the Resource Adapter classes (you can find out which ones by examining the ra.xml)
  3. Configure the Resource Adapter by calling a few setter methods (you can find which ones by examining the ra.xml)
  4. Activate the Resource Adapter (you can find out how by reading the Java Connector API specification, or just read on)

These four steps are basically what the application server does when it loads a Resource Adapter. There's a lot of logic involved in this, but in a stand-alone Java application we can take a lot of short-cuts so that the logic that we need to write is not too involved.

There's quite a big difference in how a Resource Adapter is used to provide outbound connectivity versus inbound connectivity. By outbound connectivity, I mean a situation where an application obtains a connection to an external system and reads or writes data to it. By inbound, I mean a situation where the Resource Adapter listens for events from the external system and calls into your application when such an event occurs.

As an example we will use the JMSJCA resource adapter. This is an open-source adapter for JMS connectivity to various JMS servers.

Outbound connectivity

In a nutshell, what we need to do is instantiate the Resource Adapter class, configure it, instantiate a Managed Connection Factory, and from that obtain the application-facing connection factory.

When you open up the ra.xml file, you can find out which class implements the ResourceAdapter interface:

    <resourceadapter>
        <resourceadapter-class>com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter</resourceadapter-class>

Per the specification, the Resource Adapter class has a no-args constructor and implements the javax.resource.spi.ResourceAdapter interface. You could instantiate the ResourceAdapter simply by doing this:

    com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter ra = new com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter();

The drawback of this is that, although the chances are small, if the classname changes in a future version of the Resource Adapter, your application would no longer compile. A better approach would be to read the classname dynamically from the ra.xml. I'll leave that as "an exercise for the reader".
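For the curious, a minimal sketch of that dynamic lookup, using only the JDK's XML parser. The class and method names here are mine, not part of JMSJCA, and the inline ra.xml fragment is abbreviated for the demo:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Sketch: read the resource adapter classname from ra.xml instead of
// hard-coding it, so the code survives a classname change.
public class RaXmlReader {
    public static String adapterClass(InputStream raXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().parse(raXml);
        // ra.xml has exactly one <resourceadapter-class> element.
        return doc.getElementsByTagName("resourceadapter-class")
            .item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String raXml = "<connector><resourceadapter>"
            + "<resourceadapter-class>com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter</resourceadapter-class>"
            + "</resourceadapter></connector>";
        String cls = adapterClass(new ByteArrayInputStream(raXml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(cls);  // com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter
        // Then: ResourceAdapter ra = (ResourceAdapter) Class.forName(cls).newInstance();
    }
}
```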

The Resource Adapter is a Java Bean with getters and setters. This is how you can configure the Resource Adapter. For example, we could set the connection URL as follows:

    ra.setConnectionURL("stcms://localhost:18007");

Next, we need to instantiate a ManagedConnectionFactory. Again, you can find the classname in the ra.xml:

    <outbound-resourceadapter>
        <connection-definition>
            <managedconnectionfactory-class>com.stc.jmsjca.core.XMCFUnifiedXA</managedconnectionfactory-class>

Again, the ManagedConnectionFactory must have a no-arg constructor and is a Java bean, so you can simply instantiate and configure one as follows:

    com.stc.jmsjca.core.XMCFUnifiedXA mcf = new com.stc.jmsjca.core.XMCFUnifiedXA();
    mcf.setUserName("Administrator");
    mcf.setPassword("STC");

Next, you may need to associate the newly created ManagedConnectionFactory with the ResourceAdapter. Those that require this association (most likely all of them) implement the javax.resource.spi.ResourceAdapterAssociation interface. This leads to the following code:

    mcf.setResourceAdapter(ra);

Lastly, you create the application-facing connection factory. In the case of JMS, this is a javax.jms.ConnectionFactory. You can find evidence of this in the ra.xml:

    <outbound-resourceadapter>
        <connection-definition>
            <managedconnectionfactory-class>com.stc.jmsjca.core.XMCFUnifiedXA</managedconnectionfactory-class>
            ...
            <connectionfactory-interface>javax.jms.ConnectionFactory</connectionfactory-interface>
            <connectionfactory-impl-class>com.stc.jmsjca.core.JConnectionFactoryXA</connectionfactory-impl-class>
            <connection-interface>javax.jms.Connection</connection-interface>
        </connection-definition>

This leads to the following code:

    javax.jms.ConnectionFactory f = (javax.jms.ConnectionFactory) mcf.createConnectionFactory();

Now, putting it all together, this is what you would need to create a JMS connection factory from the JMSJCA Resource Adapter:

    com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter ra = new com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter();
    com.stc.jmsjca.core.XMCFUnifiedXA mcf = new com.stc.jmsjca.core.XMCFUnifiedXA();
    ra.setConnectionURL("stcms://localhost:18007");
    ra.setUserName("Administrator");
    ra.setPassword("STC");
    ra.setOptions("JMSJCA.NoXA=true");
    mcf.setResourceAdapter(ra);
    javax.jms.ConnectionFactory f = (javax.jms.ConnectionFactory) mcf.createConnectionFactory();

And that's all there is to it.

As I mentioned, one of the advantages of using a Resource Adapter over using a client runtime directly is that a Resource Adapter typically provides connection pooling and other nifty features. JMSJCA for instance provides a powerful and configurable connection manager with blocking behavior, time-out behavior, connection failure detection, etc. It even enlists the connection in the transaction if it detects that there is a transaction active when the connection is created.

Not all resource adapters will provide such a comprehensive connection manager: check the documentation of the resource adapter that you're planning to use. If it doesn't provide a connection manager to your liking, you can provide your own connection manager. If you need to write one, take a look at the one in JMSJCA: you may want to use it as a starting point.

Inbound connectivity

Inbound connectivity is where a resource adapter is used to receive messages from an external system; the resource adapter delivers these messages to a Message Driven Bean. In the case of JMS, this is javax.jms.MessageListener, with its void onMessage(javax.jms.Message) method. For other types of resource adapters, you will find other Message Driven Bean interfaces. Look in the ra.xml to find out which one.

Setting up inbound connectivity is a bit more involved than outbound connectivity. That is because with inbound, you need to explain to the Resource Adapter how it should obtain a new instance of the Message Driven Bean, and how to obtain a thread that will call the onMessage() method or equivalent method.

We start with instantiating and configuring a ResourceAdapter object; the class can be found in ra.xml as I showed in the Outbound Connectivity section:

    com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter ra = new com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter();
    ra.setConnectionURL("stcms://localhost:18007");
    ra.setUserName("Administrator");
    ra.setPassword("STC");

Next, we need to call start() on the ResourceAdapter object. This method takes a javax.resource.spi.BootstrapContext object. You need to provide an implementation for this class; the most important method that you need to implement is the public javax.resource.spi.work.WorkManager getWorkManager() method. As you can see, this method should return a WorkManager object. This class has a number of methods that provide access to a threadpool. It would help at this point if you have a bit of knowledge of the internals of the Resource Adapter that you're trying to work with: some Adapters don't use a WorkManager, and the ones that do, typically only use one of the methods on the WorkManager object. For instance, here's a WorkManager that could be used with JMSJCA:

public class XWorkManager implements WorkManager {
    private final java.util.concurrent.ExecutorService mPool;

    public XWorkManager(int poolsize) {
        mPool = java.util.concurrent.Executors.newFixedThreadPool(poolsize);
    }

    public void scheduleWork(Work work) throws WorkException {
        try {
            mPool.execute(work);
        } catch (java.util.concurrent.RejectedExecutionException e) {
            throw new WorkException(e);
        }
    }

    // other methods just throw an exception
}

Fortunately we can make use of the java.util.concurrent tools introduced in JDK 5.0 for the threadpool. Next, we need to provide an implementation for javax.resource.spi.endpoint.MessageEndpointFactory. This is an interface with only two methods: one indicates whether message delivery should be transacted, and the other creates a MessageEndpoint. A MessageEndpoint is a proxy around the Message Driven Bean. This proxy should implement the onMessage() or equivalent method, which should simply delegate to the MessageListener or equivalent object in your application. The proxy should also implement three additional methods:

    void afterDelivery()
    void beforeDelivery(Method method)
    void release()

The beforeDelivery() and afterDelivery() methods are called just before and just after the Resource Adapter calls the onMessage() or equivalent method. You could start and commit a transaction in these methods; if you're not using transactions, you can leave these methods unimplemented. The release() method is called by the Resource Adapter when it's done using a Message Driven Bean. If you don't implement some sort of pooling mechanism for Message Driven Beans (and you probably won't), you can leave this method empty as well.

Now that you have implementations for the BootstrapContext, WorkManager, MessageEndpointFactory, and MessageEndpoint, you can finally tell the Resource Adapter to start delivering messages to your Message Driven Bean. You do that by calling the void endpointActivation(javax.resource.spi.endpoint.MessageEndpointFactory, javax.resource.spi.ActivationSpec) method on the ResourceAdapter. The ActivationSpec object is implemented by the Resource Adapter; you can find the classname in the ra.xml file. It is a Java bean that is used to configure the message delivery. For instance, in the case of JMS, you specify from which queue or topic to get messages.

As you can see, the inbound connectivity case is not as straightforward as the outbound connectivity case, but it is still very much doable.

Real examples

For a full sample source listing, take a look at the test suite in JMSJCA. You can also look at the JMS BC that is part of the Open JBI Components project. This Binding Component uses the JMSJCA Resource Adapter as described in this blog entry; by doing so, a lot of development effort was saved dealing with the idiosyncrasies of various JMS server implementations, connection management, etc.

Sunday Jan 21, 2007

JMSJCA, a feature rich JMS Resource Adapter, is now a java.net project

JMSJCA is now a java.net project. It can be found here: http://jmsjca.dev.java.net.

The project is currently used in Java CAPS, JMS Grid and the JMS BC as part of the open-jbi-components project.

The connector can be used as a J2EE 1.4 Resource Adapter, but its libraries can also be used as an abstraction layer to JMS servers from non-J2EE code. As such, the adapter acts like a library that hides the complexities of transactions, concurrency, connection failure detection, JMS server implementation idiosyncrasies, etc. That is how it is used in the JMS BC that is part of the open-jbi-components project.

Monday Jan 15, 2007

Logless transactions

A few months ago, in my blog entry Transactions, disks, and performance I went into the importance of minimizing the number of writes. Transaction logging is one of those cases where minimizing the number of writes greatly enhances performance. In this entry, I'll describe a way to avoid transaction logging altogether.

What is transaction logging? Transaction logging refers to persisting the state of a two-phase transaction so that in the event of a crash, the transaction can either be committed or rolled back (recovered). I won't go into the details of what XA is; more information about XA transactions can be found elsewhere, e.g. in Mike Spille's XA Exposed.

Let me illustrate what recovery is using a "diagram". Consider an XA two phase transaction with three Resource Managers (RMa, RMb, and RMc). To indicate what happens at what time, I'll put all actions in a table; each row corresponds to a different time.

time   RMa                       RMb                       RMc                       Coordinator
t1     start(xid1a, TMNOFLAGS)
t2                               start(xid1b, TMNOFLAGS)
t3                                                         start(xid1c, TMNOFLAGS)
t4     end(xid1a, TMSUCCESS)
t5                               end(xid1b, TMSUCCESS)
t6                                                         end(xid1c, TMSUCCESS)
t7     prepare(xid1a)
t8                               prepare(xid1b)
t9                                                         prepare(xid1c)
t10                                                                                  log
t11    commit(xid1a, false)
t12                              commit(xid1b, false)
t13                                                        commit(xid1c, false)
t14                                                                                  delete from log

At t10 the transaction manager records the decision to commit in the log. Let's say that the system crashes after t10, say between t11 and t12. When the system restarts, it will call recover() on all known Resource Managers and it will read the transaction log. In the transaction log it will find that xid1 was marked for commit. Through recover() it will find that xid1b and xid1c are in doubt. It knows that these two need to be committed because of the commit decision in the log.

What happens if the system crashes before the commit decision is written to the log, for example between t8 and t9? Upon recovery, the recover() methods of RMa and RMb return xid1a and xid1b (RMc returns nothing because prepare() had not been called on it yet). The transaction manager will roll back RMa and RMb because no commit decision was found in the log.
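The recovery rule in both crash scenarios can be condensed into a few lines. This is a simplified model (branch ids are plain strings and the method name is invented for illustration), not an actual transaction manager API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Recovery decision with a transaction log: an in-doubt branch is committed
// only if the log records a commit decision for its global transaction id;
// otherwise it is rolled back.
class RecoveryWithLog {
    static Map<String, String> decide(Set<String> loggedCommits, List<String> inDoubtBranches) {
        Map<String, String> actions = new LinkedHashMap<>();
        for (String branch : inDoubtBranches) {
            // branch ids look like "xid1b": global id "xid1" plus qualifier "b"
            String globalId = branch.substring(0, branch.length() - 1);
            actions.put(branch, loggedCommits.contains(globalId) ? "commit" : "rollback");
        }
        return actions;
    }
}
```

For the crash between t11 and t12, decide({xid1}, [xid1b, xid1c]) yields commit for both in-doubt branches; for the crash between t8 and t9, decide({}, [xid1a, xid1b]) yields rollback for both.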

SeeBeyond's Logless XA Transactions

Let's take a look at the recover() method on the XAResource. This method returns an array of Xid objects. Each Xid object holds two byte[] arrays. These two arrays represent the global transaction ID and the branch qualifier. They are typically random numbers picked by the transaction manager. The Resource Managers that receive these Xids should use these objects as identifiers and return them in the recover() method unmodified.

At SeeBeyond, Jerry Waldorf and Venugopalan Venkataraman came up with an idea to use the storage space in the byte[] arrays of the Xid as a way to persist the transaction state. Here's how it works. Let's modify the above example by removing transaction logging:

time   RMa                       RMb                       RMc
t1     start(xid1a, TMNOFLAGS)
t2                               start(xid1b, TMNOFLAGS)
t3                                                         start(xid1c, TMNOFLAGS)
t4     end(xid1a, TMSUCCESS)
t5                               end(xid1b, TMSUCCESS)
t6                                                         end(xid1c, TMSUCCESS)
t7                                                         prepare(xid1c)
t8                               prepare(xid1b)
t9     prepare(xid1a)
t10                                                        commit(xid1c, false)
t11                              commit(xid1b, false)
t12    commit(xid1a, false)


A commit decision is still being made, but this decision is no longer persisted in a separate transaction log. Instead, it is persisted in xid1a. If the system finds xid1a upon recovery, it knows that a commit decision was made; if it doesn't find xid1a, it knows that a commit decision was not made. Note that the order in which both prepare and commit are called on the three Resource Managers is very important.

As in the first example, if the system crashes before a commit decision has been made, the remaining resources will be rolled back upon recovery. E.g., if the system crashes between t8 and t9, recovery will encounter xid1c and xid1b and will call rollback() on them because it cannot find a record of a commit decision for xid1, i.e. it cannot find xid1a.

If the system crashes after a commit decision has been made, for example between t10 and t11, it will find xid1b and xid1a. Since xid1a signifies a commit decision, both xid1b and xid1a should be committed.

So far so good. But how does the transaction manager know that if it encounters xid1b it should look for xid1a to figure out whether a commit decision was made? This is where the transaction manager uses the byte[] arrays of the Xid: it stores this information in one of them.
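Here is a hedged sketch of the trick: a custom Xid whose branch qualifier carries the coordinator's bookkeeping. The format id, the two-byte layout, and the isMarker() convention are all invented for illustration; the actual SeeBeyond encoding is not public.

```java
import javax.transaction.xa.Xid;
import java.nio.charset.StandardCharsets;

// An Xid whose branch qualifier stores this branch's index plus the index of
// the "marker" branch; resource managers return Xids unmodified from recover(),
// so the coordinator gets its bookkeeping back for free after a crash.
class MarkedXid implements Xid {
    static final int FORMAT_ID = 987654; // arbitrary format id for this sketch
    private final byte[] gtrid;
    private final byte[] bqual;

    MarkedXid(String globalId, int branch, int markerBranch) {
        this.gtrid = globalId.getBytes(StandardCharsets.US_ASCII);
        this.bqual = new byte[] { (byte) branch, (byte) markerBranch };
    }

    public int getFormatId() { return FORMAT_ID; }
    public byte[] getGlobalTransactionId() { return gtrid.clone(); }
    public byte[] getBranchQualifier() { return bqual.clone(); }

    // On recovery: does this in-doubt Xid itself represent the commit decision?
    static boolean isMarker(Xid xid) {
        byte[] q = xid.getBranchQualifier();
        return q.length == 2 && q[0] == q[1];
    }
}
```

During recovery, the coordinator would commit all in-doubt branches of a global transaction if any of them satisfies isMarker(), and roll them back otherwise.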

Complicating factors

A problem in this scheme occurs when the prepare(xid1a) method returns XA_RDONLY. If that happens, commit(xid1a, false) cannot be called, and RMa will not return xid1a upon calling recover(). Recall that xid1a had special significance! Hence it is important to order the Resource Managers such that the first one on which prepare() is called is both reliable and will not return XA_RDONLY. However, in normal EE applications the application prescribes the order in which resources are enlisted in a transaction. Hence, to use this logless transaction scheme, the application server needs to be extended either with a way to specify this resource a priori, or with a learning capability, so that it knows which resources are enlisted in a particular operation and can pick the right resource manager to record the commit decision in.

The SeeBeyond logless transaction approach is one of the ways in which transaction logging can be made less expensive. In a future blog, I'll cover additional ones.

Monday Dec 11, 2006

Short note: Running Java CAPS on Java SE 6

Today Java SE 6 was released. It comes with many new features and cool tools, one of them being jmap, as described in a previous blog on PermGen exceptions. The Integration Server is not officially supported on SE 6 yet. However, if you want to run the Java CAPS Integration Server on SE 6, this is what you can do:

  1. Install JDK 6 somewhere, e.g. c:\java
  2. Install the IS somewhere, e.g. c:\logicalhost
  3. Rename c:\logicalhost\jre to c:\logicalhost\jre.old
  4. Copy c:\java\jre1.6.0 to c:\logicalhost\jre
  5. Copy c:\java\jdk1.6.0\lib\tools.jar to c:\logicalhost\jre\lib
  6. Copy c:\logicalhost\jre\bin\javaw.exe to c:\logicalhost\jre\bin\is_domain1.exe
  7. Copy c:\logicalhost\jre\bin\javaw.exe to c:\logicalhost\jre\bin\isprocmgr_domain1.exe
  8. Edit c:\logicalhost\is\domains\domain1\config\domain.xml and comment out these lines:
<!--
<jvm-options>-Dcom.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager=com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager</jvm-options>
<jvm-options>-Dorg.xml.sax.driver=com.sun.org.apache.xerces.internal.parsers.SAXParser</jvm-options>
<jvm-options>-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl</jvm-options>
<jvm-options>-Dcom.sun.org.apache.xerces.internal.xni.parser.XMLParserConfiguration=com.sun.org.apache.xerces.internal.parsers.XIncludeParserConfiguration</jvm-options>
<jvm-options>-Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl</jvm-options>
<jvm-options>-Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl</jvm-options>
<jvm-options>-Djavax.xml.soap.MessageFactory=com.sun.xml.messaging.saaj.soap.ver1_1.SOAPMessageFactory1_1Impl</jvm-options>
<jvm-options>-Djavax.xml.soap.SOAPFactory=com.sun.xml.messaging.saaj.soap.ver1_1.SOAPFactory1_1Impl</jvm-options>
<jvm-options>-Djavax.xml.soap.SOAPConnectionFactory=com.sun.xml.messaging.saaj.client.p2p.HttpSOAPConnectionFactory</jvm-options>
-->

i.e. add <!-- before and add --> after these lines.

Also comment out this line:

<!--        <jvm-options>-server</jvm-options>-->

After these changes you can run the Integration Server with Java SE 6. These are not “official” recommendations (as mentioned, there’s no support for SE 6 just yet); also, the lines commented out are optimizations that have not yet been re-established for SE 6, so don’t do any performance comparisons just yet.



Sunday Dec 03, 2006

Moving out of people management

Last week it was four years ago that I started at SeeBeyond. At SeeBeyond, I managed products (the JMS server and J2EE application server in Java CAPS), technology and people. The latter was part of the culture at SeeBeyond: the only way to have influence on any product was to have a team of people reporting to you.

Having responsibility for a team has been interesting. Over the past four years, I've seen people grow. I've seen people "turn around" on whom I was ready to give up. This was very satisfying. With them and through them, I've grown as well. However, in the past year I felt it was time for me to move to the next level. Also, with the increasing number of people in my team (two originally, eight at one point, and six lately), there was less and less time to stay involved with technology at a deep enough level.

When Sun acquired SeeBeyond last year, I was classified as a people manager because I had people reporting to me. As it turned out, Sun's culture is quite different from SeeBeyond's: there is a dual career ladder with appreciation and growth opportunities for both people managers and individual contributors. But unlike at SeeBeyond, people managers primarily manage people and are less involved with technology; 11 people per manager is seen as the norm. It is the technical individual contributors who manage and shape products.

Moving up to the next step on Sun's career ladder, I requested a "diagonal promotion": up one level and from the people-management track to the technology track. This week I got my promotion: I'm now a Senior Staff Engineer. Does this mean I'm now a heads-down techie? No, of course not. Sure, there will be no more managing reports, but I'll now devote more time to influencing people in other teams, even outside of the organization. And yes, hopefully there'll be more time to dive a little deeper into a piece of technology.

Thursday Nov 16, 2006

Using java.util.logging in BEA Weblogic

This blog has moved to http://frankkieviet.blogspot.com

Wednesday Nov 15, 2006

More on... How to fix the dreaded "java.lang.OutOfMemoryError: PermGen space" exception (classloader leaks)

I got quite a few comments on my last blog (How to fix the dreaded "java.lang.OutOfMemoryError: PermGen space" exception (classloader leaks)). Apparently more people have been struggling with this problem.


Why bring this up? What's the news? Edward Chou continued to explore options to diagnose classloader leaks. First of all, he explored how to generate a list of orphaned classloaders with jhat. An orphaned classloader is a classloader that is not referenced by any object directly but cannot be garbage collected. The thinking behind this is that programs that create classloaders (e.g. application servers) do maintain references to them. So if there's a classloader that is no longer directly referenced, this classloader is probably a leak. Read about it on his blog (Find Orphaned Classloaders).

Still we were not satisfied: when examining some memory dumps from code that we were not familiar with, we explored yet some other options to diagnose classloader leaks: duplicate classes and duplicate classloaders. Let me explain.

Let's say that your application has a com.xyz.Controller class. If you find many instances of this class object, you likely have a classloader leak. Note the phrase "instances of this class object". What I mean by this: the class com.xyz.Controller is loaded multiple times, i.e. multiple instances of the com.xyz.Controller.class are present.  You can use jhat to run this query: simply list all instances of java.lang.Class.

Edward modified jhat to generate a list of all classloader instances that have an identical set of classes that it loaded. Typically there's no reason why someone would create two classloader instances and load exactly the same set of classes into them. If you find any in your memory dump, you should get suspicious and take a closer look. Monitor Edward's blog for more details on this.

One more thing: Edward found out that the method java.lang.String.intern() allocates memory in PermGen space. So if your application frequently uses this method with different strings, watch out. Fortunately these strings are subject to garbage collection. But if your application holds references to these strings, thereby making garbage collection impossible, your application may cause the dreaded "java.lang.OutOfMemoryError: PermGen space" exception. No classloaders involved this time.
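A tiny illustration of what intern() does: it returns one canonical instance per string value. Where that canonical copy lives depends on the JVM; on the JDK 6-era HotSpot discussed here it was PermGen (later JDKs moved interned strings to the regular heap, so this particular failure mode no longer manifests the same way).

```java
// intern() maps equal string values onto a single canonical String object;
// holding references to many distinct interned values keeps them all alive.
class InternDemo {
    static boolean sameCanonicalInstance(String a, String b) {
        return a.intern() == b.intern(); // identity comparison, not just equals()
    }
}
```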

Thursday Oct 19, 2006

How to fix the dreaded "java.lang.OutOfMemoryError: PermGen space" exception (classloader leaks)

This blog has moved to http://frankkieviet.blogspot.com

Monday Oct 16, 2006

Classloader leaks: the dreaded "java.lang.OutOfMemoryError: PermGen space" exception

This blog has moved to http://frankkieviet.blogspot.com

Wednesday Oct 11, 2006

Transactions, disks, and performance

So far I've written only a handful of blogs. My blogs are technical and dry and consequently the hit count of my blog is low. However, it has generated some interest from people in far away places and generated some interesting discussions through email. One of the things that came up is data integrity and disk performance. And that's the topic of this blog...

Maintaining data integrity

Consider a simple scenario: an MDB reads from a JMS queue, and for each message that it receives, it writes a record to a database. Let's say that you want to minimize the chance that you would lose or duplicate data in the case of failure. What kind of failure? Let's say that power failures or system crashes are your biggest worries.

How do we minimize the chance of losing or duplicating data? First of all, you want to make sure that receiving the message from the queue and the writing to the database happens in one single transaction. One of the easiest ways of doing that is to use an application server like Glassfish. How does Glassfish help? It starts a distributed (two phase) transaction. The transactional state is persisted by the JMS server and database, and by Glassfish through its transaction log. In the event of a system crash, the system will come back up and by using the persisted transactional state, Glassfish can make sure that either the transaction is fully committed or fully rolled back. Ergo, no data is lost or duplicated. The details of this are fascinating, but let's look at that some other time.

Thus, reliable persistence that is immune to system crashes (e.g. the Blue Screen of Death (BSOD)) or power failures is important. That means that when a program thinks that the data has been persisted on the disk, it in fact should have been written in the magnetic media of the disk. Why do I stress this? Because it's less trivial than you might think: the operating system and drive may cache the data in the write cache. Therefore you will have to disable the write cache in the OS on the one hand; in your application you will have to make sure that the data is indeed sync-ed to disk on the other hand.

Turning off the write cache, syncing to disk... no big deal, right? Unfortunately there are many systems in which the performance drops like a stone when you do that. Let's look at why that is, and how good systems solve this performance problem.

The expense of writing to disk

As you know, data on a disk is organized in concentric circles. Each circle (track) is divided up into sectors. When the disk controller writes data to the disk, it has to position the write head over the right track and wait until the disk has rotated such that the beginning of the sector is under the write head. These are mechanical operations; they are measured not in nanoseconds or microseconds, but in milliseconds. As it turns out, a drive can do about one hundred of these operations per second. So, as a rule of thumb, a drive can perform one hundred writes per second. This hasn't changed much in the past few decades, and it won't change much in the next decade.

Writes per second is one aspect; the number of bytes that can be written per second is another. The latter is measured in megabytes per second. Fortunately, over the past decades this rate has steadily increased, and it will likely continue to do so. That's why formatting a 300 Gb drive today doesn't take 10,000 times longer than formatting a 30 Mb drive did 15 years ago.

The key is that the more data you can send to the disk in one write operation, the higher your throughput. To demonstrate this, I've done some measurements on my desktop machine.
[Chart: writes per second vs. bytes per write]

This chart shows that the number of write operations per second remains constant up to approximately 32 kb (note that the x-axis is logarithmic). That means that whether you try to write 32768 bytes or just one single byte per write operation, the maximum number of writes per second remains the same. Of course, the amount of data you can write to the disk goes up if you put more data in a single write operation. Put another way, if you need to write 1 kb chunks of data, you can process 32 times faster by combining 32 chunks into one 32 kb write.
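A quick back-of-the-envelope model of this seek-bound region (the 100 writes/sec figure is the rule of thumb from above, not a measured constant):

```java
// Below the ~32 kb threshold the drive is bound by write operations per
// second, so throughput is simply writes/sec times bytes per write.
class DiskModel {
    static double throughputBytesPerSec(int bytesPerWrite, double writesPerSec) {
        return writesPerSec * bytesPerWrite;
    }
}
```

For example, 100 writes/sec at 32 kb per write gives roughly 3.2 Mb/sec, while 1 kb per write gives 32 times less.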
[Chart: bytes per second vs. bytes per write]

What you can see in this chart is that the amount of data you can write to the disk per second (the data rate) increases with the number of bytes per write. Of course that cannot increase indefinitely: the data rate eventually becomes constant (note that the x-axis and y-axis are both logarithmic). The data rate at a write size of 32 kb is approximately 3.4 Mb/sec, and it levels off at about 6 Mb/sec for write sizes of approximately 1 Mb and up.

These measurements were done on a $300 PC with a cheap hard drive running Windows. I've done the same measurements on more expensive hardware with more expensive drives and found similar results: most drives perform slightly better, but the overall picture is the same.

The lesson: if you have to write small amounts of data, try to combine these in one write operation.

Multi threading

Back to our example of JMS and a database. As I mentioned, for each message picked up from the queue and written to the database, there are several write operations. For now, let's assume that the data written for the processing of single message cannot be combined into one write operation. However, if we process the messages concurrently, we can combine the data being written from multiple threads in one write operation!

Good transaction loggers, good databases that are optimized for transaction processing, and good JMS servers all try to combine the data from multiple concurrent transactions into as few write operations as possible. As an example, here are some measurements I did against STCMS on the same hardware as above. STCMS is the JMS server that ships with Java CAPS.
[Chart: messages per second vs. number of threads]

As you can see, for small messages the throughput of the system increases almost linearly with the number of threads. This happens because one of the design principles of STCMS is to consolidate as much data as possible in a single write operation. Of course when you increase the size of the messages, or if you increase the number of threads indefinitely, you will eventually hit the limit on the data rate. This is why the performance for large messages does not scale like it does for small messages. Of course when increasing the number of threads to large numbers, you will also hit other limits due to the overhead in thread switching.

Note that the measurements in this chart were done on a queue-to-queue and topic-to-topic scenario about three years ago. The STCMS shipping today performs a lot better, e.g. there's no difference in performance anymore between queues and topics, so don't use these numbers as benchmark numbers for STCMS; I'm just using them to prove the point of the power of write-consolidation.
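The write-consolidation principle can be sketched as a toy "group commit" log: producer threads append records and receive a future, and a single writer thread drains everything that has accumulated into one physical write before completing all the waiting futures. All names here are made up for illustration, and the actual disk write is only indicated by a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Toy group-commit log: many producers, one writer, one write per batch.
class GroupCommitLog {
    static final class Pending {
        final byte[] record;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Pending(byte[] record) { this.record = record; }
    }

    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    final List<Integer> batchSizes = new ArrayList<>(); // one entry per physical write

    CompletableFuture<Void> append(byte[] record) {
        Pending p = new Pending(record);
        queue.add(p);
        return p.done; // producer blocks on this until its batch is flushed
    }

    // One iteration of the writer loop: block for the first record, then drain
    // whatever else has accumulated into the same write.
    void writeOnce() {
        List<Pending> batch = new ArrayList<>();
        try {
            batch.add(queue.take());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        queue.drainTo(batch);
        // ... a real log would concatenate the records, write() once, and fsync() ...
        batchSizes.add(batch.size());
        for (Pending p : batch) p.done.complete(null);
    }
}
```

In a real server the writer loop runs continuously, so under concurrency each flush amortizes over many transactions; that is exactly why the throughput in the chart scales with the number of threads.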

Other optimizations in the transaction manager

Combining data that needs to be written to the disk from many threads into one write operation is definitely a good thing for both resource managers (the JMS server and database server in our example) and the transaction manager.  Are there other things that can be done specifically in the area of the transaction manager? Yes, there are: let's see if we can minimize the number of times data needs to be written for each message.

Last agent commit. The first thing that comes to mind is the last-agent-commit optimization. In our example this means that one of the resource managers will not do a two-phase commit but only a single-phase commit, thereby saving another write. Most transaction managers can do this; e.g. the transaction manager in Glassfish does.

Piggybacking on another persistence store. Instead of writing to its own persistence store, the transaction log could write its data to the persistence store of one of the resource managers participating in the transaction. By doing so, it can push its data into the same write operation that the resource manager needs to do anyway. For instance, the transaction manager can write its transaction state into the database; an extra table in the database is all it takes. Alternatively, the JMS server could provide some extensions so that the transaction log can write its data to the persistence store of the JMS server.

Logless transactions. If the resource manager (e.g. JMS) doesn't have an API that allows the transaction manager to use its persistence store, the transaction manager could sneak some data of its own into the XID. The XID is just a string of bytes that the resource manager treats as an opaque entity. The transaction manager could put some data in the XID that identifies the last Resource Manager; in the case of a system crash, the transaction manager will query all known resource managers and obtain a list of in-doubt XIDs. The presence of the in-doubt XID of the last resource in the transaction signifies that all resources were successfully prepared and that commit should be called; its absence signifies that the prepare phase was incomplete and that rollback should be called. This mechanism was invented by SeeBeyond a few years ago (patent pending). There are some intricacies in this system; perhaps a topic for a future blog? Leave a comment to this blog if you're interested.

Beyond the disk

The scenario we've been looking at is one in which data needs to be persisted so that no data gets lost in case of a system crash or power failure.  What about a disk failure? We could use a RAID system. Are there other ways? Sure! We could involve multiple computers: clustering! If each machine that needs to safeguard transaction data would also write its data to another node in a cluster, that data would survive a crash of the primary node. Assuming that a crash of two nodes at the same time is very unlikely, this forms a reliable solution. On both nodes, data can be written to the disk using a write caching scheme so that the number of writes per second is no longer the limiting factor.

We can go even further for short-lived objects. Consider this scenario:
    insert into A values ('a', 'b')
    update A set value = 'c' where pk = 'a'
    delete A where pk = 'a'
Why write the value ('a', 'b') to the disk at all? A smart write cache can avoid any disk writes for short-lived objects. Of course, typical data that the transaction manager generates consists of short-lived objects. JMS messages are another example of objects that may have a short life span.

A pluggable transaction manager... and what about Glassfish?

It would be nice if transaction managers had a pluggable architecture so that depending on your specific scenario, you could choose the appropriate transaction manager log persistence strategy.

Isn't clustering something difficult and expensive? Difficult to develop yes, but not difficult to use. Sun Java System Application Server 8.x EE already has strong support for clustering. Glassfish will soon have support for clustering too. And with that clustering will become available to everybody!

Monday Sep 18, 2006

A free course in Java EE (link)

I found this note in my mail:

The 11th session of "Java EE Programming (with Passion!)" free online course will start from October 23rd, 2006. The course contents are revised with new "look and feel" and new hands-on labs. Please see the following course website for more information. Course website: http://www.javapassion.com/j2ee Course FAQ: http://www.javapassion.com/j2ee/coursefaq.html

If you know anybody who wants to learn Java EE programming hands-on using NetBeans, please let them know about this course.

Thursday Sep 14, 2006

Funny videos from Sun

Who knew... Sun makes pretty funny commercials. See here: Sun videos. While you're there, also take a look at Project Looking Glass... impressive stuff!

Sunday Sep 10, 2006

Testing connection failures in resource adapters

In a previous blog (see J2EE JCA Resource Adapters: Poisonous pools) I talked about the difficulty of detecting faulty connections to the external system (EIS) that the Resource Adapter is interfacing with. A common cause of connection failures is simply connection loss. In my previous blog I detailed the problems of detecting connection loss in the RA and presented a number of ways to address this problem.

What I didn't discuss is how you can build unit tests so that you can easily test connection failures and connection failure recovery. That is the topic of this blog.

What is it that I will be looking at? Say that you have a resource adapter that connects to an external system (EIS) using one or more TCP/IP connections. The failure I want to test is that of a simple connection drop. This can happen if the EIS is rebooted, crashes, or simply because the network temporarily fails (e.g. someone stumbles over a network cable).

An automated way to induce failures

How would you manually test this type of failure? You could simply kill the EIS, or unplug the network connection. Simple and effective, but very manual. What I prefer is an automated way, so that I can include failure tests in an automated unit test suite. Here's a way to do that: use a port-forwarder proxy. A port-forwarder proxy is a process that listens on a particular port. Any time a connection comes in on that port, it creates a new connection to a specified server and port. Bytes coming in from the client are sent to the server unmodified; likewise, bytes coming in from the server are sent to the client unmodified. The client in this case is the RA; the server is the EIS. Hence, the port-forwarder proxy sits between the Resource Adapter and the EIS.

A failure can be induced by instructing the port-forwarder proxy to drop all the connections. To simulate a transient failure, the port-forwarder proxy should then again accept connections and forward them to the EIS.

Here is an example of how this port-forwarder proxy can be used in a JUnit test. Let's say that I'm testing a JMS resource adapter. Let's say that I'm testing the RA in an application server: I have an MDB deployed in the application server that reads JMS messages from one queue (e.g. Queue1) and forwards them to a different queue (e.g. Queue2). The test would look something like this:

  1. start the port-forwarder proxy; specify the server name and port number that the EIS is listening on
  2. get the port number that proxy is listening on
  3. generate a connection URL that the RA will use to connect to the EIS; the new connection URL will have the proxy's server name and port number rather than the server name and port number of the EIS
  4. update the EAR file with the new connection URL
  5. deploy the EAR file
  6. send 1000 messages to Queue1
  7. read these 1000 messages from Queue2
  8. verify that the port-forwarder has received connections; then tell the port-forwarder to kill all active connections
  9. send another batch of 1000 messages to Queue1
  10. read this batch of 1000 messages from Queue2
  11. undeploy the EAR file

As you can see, I'm assuming in this example that you are using an embedded resource adapter, i.e. one that is embedded in the EAR file. Of course you can also make this work using global resource adapters; you just need a way to automatically configure the URL in the global resource adapter.

A port forwarder in Java

The first Java program I ever wrote was a port-forwarder proxy (Interactive Spy). I'm not sure exactly when that was, but I do remember that people were just starting to use JDK 1.3. In that version of the JDK, only blocking IO was available. With that restriction, one way to write a port-forwarder proxy is to listen on a port and create two threads for each incoming connection: one thread exclusively reads from the client and sends the bytes to the server; the other reads from the server and sends the bytes to the client. This was not a very scalable or elegant solution. I did use it successfully for a number of tests in the JMS test suite at SeeBeyond for the JMS Intelligent Queue Manager in Java CAPS (better known to engineers as STCMS). However, when I was developing the connection-failure tests for the JMSJCA Resource Adapter for JMS, I ran into these scalability issues, and I saw test failures due to problems in the test setup rather than bugs in the application server, JMSJCA, or STCMS.

A better approach is to make use of the non-blocking NIO capabilities in JDK 1.4. For me this was the opportunity to explore the capabilities of NIO. The result is a relatively small class that fully stands on its own; I pasted the source code of the resulting class below. The usual restrictions apply: this code is provided "as-is" without warranty of any kind; use at your own risk, etc. I've omitted the unit test for this class.

This is how you use it: instantiate a TCPProxyNIO passing it the server name and port number of the EIS. The proxy will find a port to listen on in the range of 50000. Use getPort() to find out what the port number is. The proxy now listens on that port and is ready to accept connections. Use killAllConnections() to kill all connections. Make sure to destroy the proxy after use: call close().


/*
* The contents of this file are subject to the terms
* of the Common Development and Distribution License
* (the "License"). You may not use this file except
* in compliance with the License.
*
* You can obtain a copy of the license at
* glassfish/bootstrap/legal/CDDLv1.0.txt or
* https://glassfish.dev.java.net/public/CDDLv1.0.html.
* See the License for the specific language governing
* permissions and limitations under the License.
*
* When distributing Covered Code, include this CDDL
* HEADER in each file and include the License file at
* glassfish/bootstrap/legal/CDDLv1.0.txt. If applicable,
* add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your
* own identifying information: Portions Copyright [yyyy]
* [name of copyright owner]
*/

package com.stc.jmsjca.test.core;

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.Semaphore;
import java.util.logging.Level;
import java.util.logging.Logger;

/**
* A proxy server that can be used in JUnit tests to induce connection
* failures, to assure that connections are made, etc. The proxy server is
* set up with a target server and port; it will listen on a port that it
* chooses itself and delegates all data coming in to the server, and vice
* versa.
*
* Implementation: each incoming connection (client connection) maps into
* a Conduit; this holds both ends of the line, i.e. the client end
* and the server end.
*
* Everything is based on non-blocking IO (NIO). The proxy creates one
* extra thread to handle the NIO events.
*
* @author fkieviet
*/
public class TcpProxyNIO implements Runnable {
private static Logger sLog = Logger.getLogger(TcpProxyNIO.class.getName());
private String mRelayServer;
private int mRelayPort;
private int mNPassThroughsCreated;
private Receptor mReceptor;
private Map mChannelToPipes = new IdentityHashMap();
private Selector selector;
private int mCmd;
private Semaphore mAck = new Semaphore(0);
private Object mCmdSync = new Object();
private Exception mStartupFailure;
private Exception mUnexpectedThreadFailure;
private boolean mStopped;

private static final int NONE = 0;
private static final int STOP = 1;
private static final int KILLALL = 2;
private static final int KILLLAST = 3;

private static final int BUFFER_SIZE = 16384;

/**
* Constructor
*
* @param relayServer
* @param port
* @throws Exception
*/
public TcpProxyNIO(String relayServer, int port) throws Exception {
mRelayServer = relayServer;
mRelayPort = port;

Receptor r = selectPort();
mReceptor = r;

new Thread(this, "TCPProxy on " + mReceptor.port).start();

mAck.acquire();
if (mStartupFailure != null) {
throw mStartupFailure;
}
}

/**
* Utility class to hold data that describes the proxy server
* listening socket
*/
private class Receptor {
public int port;
public ServerSocket serverSocket;
public ServerSocketChannel serverSocketChannel;

public Receptor(int port) {
this.port = port;
}

public void bind() throws IOException {
serverSocketChannel = ServerSocketChannel.open();
serverSocketChannel.configureBlocking(false);
serverSocket = serverSocketChannel.socket();
InetSocketAddress inetSocketAddress = new InetSocketAddress(port);
serverSocket.bind(inetSocketAddress);
}

public void close() {
if (serverSocketChannel != null) {
try {
serverSocketChannel.close();
} catch (Exception ignore) {

}
serverSocket = null;
}
}
}

/**
* The client or server connection
*/
private class PipeEnd {
public SocketChannel channel;
public ByteBuffer buf;
public Conduit conduit;
public PipeEnd other;
public SelectionKey key;
public String name;

public PipeEnd(String name) {
buf = ByteBuffer.allocateDirect(BUFFER_SIZE);
buf.clear();
buf.flip();
this.name = "{" + name + "}";
}

public String toString() {
StringBuffer ret = new StringBuffer();
ret.append(name);
if (key != null) {
ret.append("; key: ");
if ((key.interestOps() & SelectionKey.OP_READ) != 0) {
ret.append("-READ-");
}
if ((key.interestOps() & SelectionKey.OP_WRITE) != 0) {
ret.append("-WRITE-");
}
if ((key.interestOps() & SelectionKey.OP_CONNECT) != 0) {
ret.append("-CONNECT-");
}
}
return ret.toString();
}

public void setChannel(SocketChannel channel2) throws IOException {
this.channel = channel2;
mChannelToPipes.put(channel, this);
channel.configureBlocking(false);
}

public void close() throws IOException {
mChannelToPipes.remove(channel);
try {
channel.close();
} catch (IOException e) {
// ignore

}
channel = null;
if (key != null) {
key.cancel();
key = null;
}
}

public void listenForRead(boolean on) {
if (on) {
key.interestOps(key.interestOps() | SelectionKey.OP_READ);
} else {
key.interestOps(key.interestOps() &~ SelectionKey.OP_READ);
}
}

public void listenForWrite(boolean on) {
if (on) {
key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
} else {
key.interestOps(key.interestOps() &~ SelectionKey.OP_WRITE);
}
}
}

/**
* Represents one link from the client to the server. It is an association
* of the two ends of the link.
*/
private class Conduit {
public PipeEnd client;
public PipeEnd server;
public int id;

public Conduit() {
client = new PipeEnd("CLIENT");
client.conduit = this;

server = new PipeEnd("SERVER");
server.conduit = this;

client.other = server;
server.other = client;

id = mNPassThroughsCreated++;
}
}

/**
* Finds a port to listen on
*
* @return a newly initialized receptor
* @throws Exception on any failure
*/
private Receptor selectPort() throws Exception {
Receptor ret;

// Find a port to listen on; try up to 100 port numbers

Random random = new Random();
for (int i = 0; i < 100; i++) {
int port = 50000 + random.nextInt(1000);
try {
ret = new Receptor(port);
ret.bind();
return ret;
} catch (IOException ignore) {
// Ignore

}
}
throw new Exception("Could not bind port");
}

/**
* The main event loop
*/
public void run() {
// ===== STARTUP ==========
// The main thread will wait until the server is actually listening and ready
// to process incoming connections. Failures during startup should be
// propagated back to the calling thread.

try {
selector = Selector.open();

// Acceptor
mReceptor.serverSocketChannel.configureBlocking(false);
mReceptor.serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT);
} catch (Exception e) {
synchronized (mCmdSync) {
mStartupFailure = e;
}
}

// ===== STARTUP COMPLETE ==========

// The main thread is waiting on the ack lock; notify the main thread.
// Startup errors are communicated through the mStartupFailure variable.
mAck.release();
if (mStartupFailure != null) {
return;
}

// ===== RUN: event loop ==========

// The proxy thread spends its life in this event handling loop in which
// it deals with requests from the main thread and from notifications from
// NIO.
try {
loop: for (;;) {
int nEvents = selector.select();

// ===== COMMANDS ==========

// Process requests from the main thread. The communication mechanism
// is simple: the command is communicated through a variable; the main
// thread waits until the mAck lock is set.
switch (getCmd()) {
case STOP: {
ack();
break loop;
}
case KILLALL: {
PipeEnd[] pipes = toPipeArray();
for (int i = 0; i < pipes.length; i++) {
pipes[i].close();
}
ack();
continue;
}
case KILLLAST: {
PipeEnd[] pipes = toPipeArray();
Conduit last = pipes.length > 0 ? pipes[0].conduit : null;
if (last != null) {
for (int i = 0; i < pipes.length; i++) {
if (pipes[i].conduit.id > last.id) {
last = pipes[i].conduit;
}
}
last.client.close();
last.server.close();
}
ack();
continue;
}
}

//===== NIO Event handling ==========

if (nEvents == 0) {
continue;
}
Set keySet = selector.selectedKeys();
for (Iterator iter = keySet.iterator(); iter.hasNext();) {
SelectionKey key = (SelectionKey) iter.next();
iter.remove();

//===== ACCEPT ==========

// A client connection has come in. Perform an async connect to
// the server. The remainder of the connect is going to be done in
// the CONNECT event handling.
if (key.isValid() && key.isAcceptable()) {
sLog.fine(">Incoming connection");
try {
Conduit pt = new Conduit();
ServerSocketChannel ss = (ServerSocketChannel) key.channel();

// Accept

pt.client.setChannel(ss.accept());

// Do asynchronous connect to relay server
pt.server.setChannel(SocketChannel.open());
pt.server.key = pt.server.channel.register(
selector, SelectionKey.OP_CONNECT);
pt.server.channel.connect(new InetSocketAddress(
mRelayServer, mRelayPort));
} catch (IOException e) {
System.err.println(">Unable to accept channel");
e.printStackTrace();
// selectionKey.cancel();

}
}

//===== CONNECT ==========
// Event that is generated when the connection to the server has
// completed. Here we need to initialize both pipe-ends. Both ends
// need to start reading. If the connection had not succeeded, the
// client needs to be closed immediately.
if (key != null && key.isValid() && key.isConnectable()) {
SocketChannel c = (SocketChannel) key.channel();
PipeEnd p = (PipeEnd) mChannelToPipes.get(c); // SERVER-SIDE

if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">CONNECT event on " + p + " -- other: " + p.other);
}

boolean success;
try {
success = c.finishConnect();
} catch (RuntimeException e) {
success = false;
if (sLog.isLoggable(Level.FINE)) {
sLog.log(Level.FINE, "Connect failed: " + e, e);
}
}
if (!success) {
// Connection failure

p.close();
p.other.close();

// Unregister the channel with this selector
key.cancel();
key = null;
} else {
// Connection was established successfully

// Both need to be in readmode; note that the key for
// "other" has not been created yet
p.other.key = p.other.channel.register(selector, SelectionKey.OP_READ);
p.key.interestOps(SelectionKey.OP_READ);
}

if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">END CONNECT event on " + p + " -- other: " + p.other);
}
}

//===== READ ==========

// Data was received. The data needs to be written to the other
// end. Note that data from client to server is processed one chunk
// at a time: a chunk of data is read from the client, and no new
// data is read from the client until the complete chunk has been
// written to the server. This is why the interest fields in the
// key are toggled back and forth. Of course, the same holds
// true for data from the server to the client.
if (key != null && key.isValid() && key.isReadable()) {
PipeEnd p = (PipeEnd) mChannelToPipes.get(key.channel());
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">READ event on " + p + " -- other: " + p.other);
}

// Read data

p.buf.clear();
int n;
try {
n = p.channel.read(p.buf);
} catch (IOException e) {
n = -1;
}

if (n >= 0) {
// Write to other end

p.buf.flip();
int nw = p.other.channel.write(p.buf);

if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">Read " + n + " from " + p.name + "; wrote " + nw);
}

p.other.listenForWrite(true);
p.listenForRead(false);
} else {
// Disconnected

if (sLog.isLoggable(Level.FINE)) {
sLog.fine("Disconnected");
}

p.close();
key = null;
if (sLog.isLoggable(Level.FINE)) {
sLog.fine("Now present: " + mChannelToPipes.size());
}

p.other.close();

// Stop reading from other side

if (p.other.channel != null) {
p.other.listenForRead(false);
p.other.listenForWrite(true);
}
}

if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">END READ event on " + p + " -- other: " + p.other);
}
}

//===== WRITE ==========

// Data was sent. As with READ, data is processed in chunks, which
// is why the interest READ and WRITE bits are flipped.
// When a connection failure is detected, there may still be
// data in the READ buffer that has not been read yet. For example,
// the client sends a LOGOFF message to the server, and the server
// sends a BYE message back to the client; depending on when the
// write failure event comes in, the BYE message may still be in
// the buffer and must be read and sent to the client before the
// client connection is closed.
if (key != null && key.isValid() && key.isWritable()) {
PipeEnd p = (PipeEnd) mChannelToPipes.get(key.channel());

if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">WRITE event on " + p + " -- other: " + p.other);
}

// More to write?

if (p.other.buf.hasRemaining()) {
int n = p.channel.write(p.other.buf);
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">Write some more to " + p.name + ": " + n);
}
} else {
if (p.other.channel != null) {
// Read from input again

p.other.buf.clear();
p.other.buf.flip();

p.other.listenForRead(true);
p.listenForWrite(false);

} else {
// Close

p.close();
key = null;
if (sLog.isLoggable(Level.FINE)) {
sLog.fine("Now present: " + mChannelToPipes.size());
}
}
}
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">END WRITE event on " + p + " -- other: " + p.other);
}
}
}
}
} catch (Exception e) {
sLog.log(Level.SEVERE, "Proxy main loop error: " + e, e);
e.printStackTrace();
synchronized (mCmdSync) {
mUnexpectedThreadFailure = e;
}
}

// ===== CLEANUP =====

// The main event loop has exited; close all connections
try {
selector.close();
PipeEnd[] pipes = toPipeArray();
for (int i = 0; i < pipes.length; i++) {
pipes[i].close();
}
mReceptor.close();
} catch (IOException e) {
sLog.log(Level.SEVERE, "Cleanup error: " + e, e);
e.printStackTrace();
}
}

private PipeEnd[] toPipeArray() {
return (PipeEnd[]) mChannelToPipes.values().toArray(
new PipeEnd[mChannelToPipes.size()]);
}

private int getCmd() {
synchronized (mCmdSync) {
return mCmd;
}
}

private void ack() {
setCmd(NONE);
mAck.release();
}

private void setCmd(int cmd) {
synchronized (mCmdSync) {
mCmd = cmd;
}
}

private void request(int cmd) {
setCmd(cmd);
selector.wakeup();

try {
mAck.acquire();
} catch (InterruptedException e) {
// ignore
}
}

/**
* Closes the proxy
*/
public void close() {
if (mStopped) {
return;
}
mStopped = true;

synchronized (mCmdSync) {
if (mUnexpectedThreadFailure != null) {
throw new RuntimeException("Unexpected thread exit: " + mUnexpectedThreadFailure, mUnexpectedThreadFailure);
}
}

request(STOP);
}

/**
* Restarts after close
*
* @throws Exception
*/
public void restart() throws Exception {
close();

mChannelToPipes = new IdentityHashMap();
mAck = new Semaphore(0);
mStartupFailure = null;
mUnexpectedThreadFailure = null;

mReceptor.bind();
new Thread(this, "TCPProxy on " + mReceptor.port).start();
mStopped = false;

mAck.acquire();
if (mStartupFailure != null) {
throw mStartupFailure;
}
}

/**
* Returns the port number this proxy listens on
*
* @return port number
*/
public int getPort() {
return mReceptor.port;
}

/**
* Kills all connections; data may be lost
*/
public void killAllConnections() {
request(KILLALL);
}

/**
* Kills the last created connection; data may be lost
*/
public void killLastConnection() {
request(KILLLAST);
}

/**
* Closes the proxy
*
* @param proxy
*/
public void safeClose(TcpProxyNIO proxy) {
if (proxy != null) {
proxy.close();
}
}
}


Other uses of the port-forwarding proxy

What else can you do with this port-forwarding proxy? First of all, its use is not limited to outbound connections: of course you can also use it to test connection failures on inbound connections. Next, you can use it to make sure that connections are in fact being made the way you expected. For example, the CAPS JMS server can use both SSL and non-SSL connections. I added a few tests to our internal test suite to test this capability from JMSJCA. Here, just to make sure that the test itself works as expected, I'm using the proxy to find out whether the URL was indeed modified and whether the connections are indeed going to the JMS server's SSL port. Similarly, you can use the proxy to count the number of connections being made. E.g., if you want to test that connection pooling really works, you can use this proxy and assert that the number of connections created does not exceed the number of connections in the pool.

Missing components

In the example I mentioned updating the EAR file and automatically deploying it to the application server. I've created tools for those tasks too, so you can do them programmatically as well. Drop me a line if you're interested and I'll blog about those too.

Sunday Aug 27, 2006

SeeBeyond was acquired by Sun a year ago. What changed?

It's been one year now since Sun Microsystems acquired SeeBeyond. What did it mean for SeeBeyond employees like me? What did and will it mean for customers?

What didn't change. First of all, Sun didn't come in and turn the place upside down. Instead, it left it pretty much untouched. There were no major reorganizations. Nobody got fired. We didn't change the way we develop software. We didn't change the plans for the products we were working on.

Culture shock. The old SeeBeyond was a company of secrecy and need-to-know-only, but the Sun culture is one of openness and transparency. For the first time ever, employees at all levels had some insight into plans and directions. We could find out what other groups within Sun were doing. We were invited to participate and to share our plans. Even our openness to the outside world changed. For example, this blog would have been unthinkable a little over a year ago.

Integrating products and what that means to customers: there's some overlap between SeeBeyond's product offering and Sun's. We're trying to integrate both offerings as much as possible. That means that in some cases we'll invest less in products that have a better counterpart in Sun. It surely makes the release of a product a lot more complicated: we now need to make sure that all the parts that we depend on and are produced by other groups within Sun are all ready at the same time and work together properly. But for customers it means a better product offering. And it also means a wider product offering because customers now get easier access to products that SeeBeyond didn't offer. Big wins for customers.

Information overload. The interdependencies with other groups within Sun require us to keep track of many developments. What are the release plans of the Glassfish team? What is the road map for the Message Queue? What is the Tools group up to? Which groups are working on NetBeans? And so on. Conference calls several times a week. Wikis and internal sites by the hundreds. At times I get a distinct feeling of information overload and wish I could ignore it all.

Opportunities for SeeBeyond employees. SeeBeyond has definitely become a more interesting place to work since it became part of Sun. It is also a place with more opportunities: smart people, cool products and a good environment mean more opportunities. Last week I talked with a long-time Sun employee and he mentioned career paths within Sun. "Career path" is a term I had not heard in many years.

Changes to come and what it means to customers: Sun's new approach to software is that of open source and radically different revenue models. The old SeeBeyond had a revenue model based on license fees, and a sales model in which the first contact with the customer was through an RFP. That will change. Software will be downloadable by anybody and can be used by anybody free of charge. That should draw developers to try out our software. The first contact with customers will be right there. Through more open communication with the end-user, we'll be able to build products that better meet customers' requirements. Since the developers we are targeting have the freedom to choose, we'll also be forced to change and improve our products quite a bit compared with previous versions.

Saturday Aug 26, 2006

"When Connection.close() should not close", or the J2EE JCA ManagedConnection life cycle

One of the things I've struggled with most when I was developing the JMSJCA Resource Adapter at SeeBeyond, was the life cycle of the ManagedConnection. I wasted numerous hours going into fruitless directions simply because I did not fully grasp the intricacies of the ManagedConnection life cycle. If you're involved in developing Resource Adapters, you're likely to stumble onto the same problems, so read on!

What do you mean, close() should not close?

Let's take a look at how you might use a JMS Connection in an EJB:

@Resource(name="jms/cf") ConnectionFactory cf;
@Resource(name = "jmx/q1") Queue q;

@TransactionAttribute(TransactionAttributeType.REQUIRED)
public void myBusinessMethod() throws Exception {
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close();
}

Should c.close() really close the connection?

Keep in mind that since J2EE 1.4, JMS providers interface with the application server using JCA. JMS connections should be pooled so that repeated execution of the statement cf.createConnection() is cheap. So the answer is no: the connection should not really be closed.

Should c.close() return the connection to the pool then, so that another EJB could use it? Remember that there is a transaction in progress, so the container should hold on to the connection until the transaction is completed. Only then should the container return the connection to the pool.

Are you appreciating yet the complexities that the application server has to deal with? If not, let's add the following to the method above:

@TransactionAttribute(TransactionAttributeType.REQUIRED)
public void myBusinessMethod() throws Exception {
ConnectionFactory fact = (ConnectionFactory) new InitialContext().lookup("jms/cf");
Connection c = fact.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
s.createProducer(s.createQueue("q")).send(s.createTextMessage("Hello world"));
c.close();

otherMethod();
}


public void otherMethod() throws Exception {
ConnectionFactory fact = (ConnectionFactory) new InitialContext().lookup("jms/cf");
Connection c = fact.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
s.createProducer(s.createQueue("q")).send(s.createTextMessage("Goodbye world"));
c.close();
}

When myBusinessMethod() is called, which in turn calls otherMethod(), how many connections are created? Good application servers like the Java CAPS Integration Server, the Sun Java System Application Server, and Glassfish use only one connection in this example. The connection that is used in otherMethod() is the same connection that was created in myBusinessMethod().

Let's look in more detail at what is happening under the covers, and why that is important.

Alphabet soup

Before we begin, let me reiterate some of the terms and abbreviations used. JCA stands for the Java Connector Architecture. It provides the interfacing between applications running in an application server, and external systems such as CRM packages, but also systems such as JMS.

A Resource Adapter is a set of classes that implement the JCA. The central interface for outbound communications is the ManagedConnection. An application, e.g. an EJB, never gets access to a ManagedConnection directly. Instead, it gets a connection handle. The handle in the above example is the JMS Connection (actually, it is the JMS Session -- but let's not go into that right now). This Connection object is not the JMS Connection that is implemented by the JMS provider; rather, it is a wrapper around such a connection. The wrapper is implemented by the Resource Adapter.

A ManagedConnection holds the physical connection to the external system, e.g. the JMS Connection and Session. Because the physical connection and hence the ManagedConnection is expensive to create, the application server tries to pool the ManagedConnections.
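To make the split between the handle and the ManagedConnection concrete, here is a minimal model. All class names are mine and deliberately simplified; this is not the real javax.resource.spi API, just an illustration of the pattern: closing the handle leaves the physical connection open and merely notifies a listener, which is how the application server learns that the handle is no longer in use.

```java
import java.util.ArrayList;
import java.util.List;

public class HandleDemo {

    // Stand-in for the expensive physical connection to the EIS.
    static class PhysicalConnection {
        boolean open = true;
    }

    // Greatly simplified stand-in for a JCA ManagedConnection.
    static class ManagedConnection {
        final PhysicalConnection physical = new PhysicalConnection();
        final List<Runnable> closedListeners = new ArrayList<>();
        int openHandles;

        // Hands out a wrapper, never the physical connection itself.
        Handle getConnection() {
            openHandles++;
            return new Handle(this);
        }

        // Called by a handle when the application "closes" it; the app
        // server listens for this event to decide about pooling.
        void handleClosed() {
            openHandles--;
            for (Runnable l : closedListeners) {
                l.run();
            }
        }

        // Only destroy() really closes the physical connection.
        void destroy() {
            physical.open = false;
        }
    }

    // The "Connection" the application sees: a thin wrapper.
    static class Handle {
        private ManagedConnection mc;

        Handle(ManagedConnection mc) {
            this.mc = mc;
        }

        // close() does NOT close the physical connection; it only reports
        // that this handle is no longer in use.
        void close() {
            if (mc != null) {
                mc.handleClosed();
                mc = null;
            }
        }
    }

    public static void main(String[] args) {
        ManagedConnection mc = new ManagedConnection();
        final boolean[] notified = { false };
        mc.closedListeners.add(() -> notified[0] = true);

        Handle h = mc.getConnection();
        h.close();

        System.out.println(mc.physical.open); // true: physical connection survives
        System.out.println(notified[0]);      // true: the "app server" was told
    }
}
```

In the real API the notification is a ConnectionEvent sent to registered ConnectionEventListeners, but the shape of the interaction is the same.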

State transitions, and why they are important

We've already seen that a ManagedConnection can be in an idle (pooled) state or in use. It would be very nice if the application server told the ManagedConnection when these state transitions happen. But it doesn't. Why would the ManagedConnection care to know when it is being used or when it is being returned to the pool? Let's look at an example in JMS, changing the earlier example a little bit:

@TransactionAttribute(TransactionAttributeType.REQUIRED)
public void usingTemp() throws Exception {
ConnectionFactory fact = (ConnectionFactory) new InitialContext().lookup("jms/cf");
Connection c = fact.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
Queue temp = s.createTemporaryQueue();
Message request = s.createTextMessage("541897-9841");
request.setJMSReplyTo(temp);
sendRequest(request);
Message reply = s.createConsumer(temp).receive();
c.close();
}

The sendRequest() method would somehow start a new transaction and send the request message. The question is when the temporary destination should be deleted. According to the JMS spec, the temporary destination should be invalidated as soon as the connection is closed. However, a transaction is still in progress, so the JMS provider will throw an exception if the temporary destination is deleted when c.close() is called. Can't we just ignore the temporary destination? After all, it will be deleted some time, e.g. when the physical JMS connection is closed. However, since the connection is pooled, it may take a long time before the physical JMS connection is closed. During that time temporary destinations just keep piling up. This approach will exhaust the JMS server.

The temporary destination can be deleted safely when the ManagedConnection is returned to the pool. And that's why it's important to know the state transitions.

How to detect state transitions

What hints does the application server give the ManagedConnection about the state transitions? Let's take a look at the ManagedConnection interface:

public interface ManagedConnection {
Object getConnection(Subject subject, ConnectionRequestInfo connectionRequestInfo) throws ResourceException;
void destroy() throws ResourceException;
void cleanup() throws ResourceException;
void associateConnection(Object object) throws ResourceException;
void addConnectionEventListener(ConnectionEventListener connectionEventListener);
void removeConnectionEventListener(ConnectionEventListener connectionEventListener);
XAResource getXAResource() throws ResourceException;
LocalTransaction getLocalTransaction() throws ResourceException;
ManagedConnectionMetaData getMetaData() throws ResourceException;
void setLogWriter(PrintWriter printWriter) throws ResourceException;
PrintWriter getLogWriter() throws ResourceException;
}

The getConnection() method tells the ManagedConnection to create a new connection handle, so after that method the ManagedConnection knows that it is not in the pooled state.

The destroy() method destroys the ManagedConnection so it signals a state transition to "non-existent".

Doesn't the cleanup() method indicate that a connection is no longer used and will be returned to the pool? It depends on the application server when exactly this method is called. Most application servers will call cleanup() immediately when the application calls Connection.close(). At that moment the connection may still be enlisted in a transaction, and may be reused as we've seen in the examples above.

As it turns out, there are two states that the ManagedConnection needs to keep track of so that it can detect a state transition from "in-use" to "pooled". The first state keeps track of whether the application has access to the ManagedConnection through a connection handle; in other words, whether the ManagedConnection has any outstanding connection handles. See the figure below.

The second state is a transactional state: it keeps track of whether the ManagedConnection is enlisted in a transaction. See the figure below:

Let's look at a few examples that illustrate this concept:

@Resource EJBContext ctx;
@Resource(name="jms/cf") ConnectionFactory cf;
@Resource(name = "jmx/q1") Queue q;

public void ex1() {
ctx.getUserTransaction().begin();
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close();
ctx.getUserTransaction().commit();
}

In this example the getConnection() method is called when cf.createConnection() is executed: the ManagedConnection is in use. At the same time the connection is enlisted in the transaction; this is done through the XAResource obtained via ManagedConnection.getXAResource(). When c.close() is called, the ManagedConnection is no longer accessible to the application: all the connection handles are closed. The application server may, and probably will, call ManagedConnection.cleanup(). However, the connection is still enlisted in the transaction, so it is not returned to the pool yet. That happens when getUserTransaction().commit() is called. While the connection is still enlisted, the resource adapter should not try to delete any objects such as the temporary destinations in the example mentioned above.

In the following example the enlistment happens a little later. See the inline comments for the state transitions.

public void ex2() {
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE); // Accessible, but not enlisted
ctx.getUserTransaction().begin(); // Accessible and enlisted
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close(); // Inaccessible and enlisted
ctx.getUserTransaction().commit(); // Inaccessible and not enlisted
// Return to pool
}

There is also a possible transition from "Inaccessible and enlisted" back to "Accessible and enlisted":

public void ex3() {
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE); // Accessible, but not enlisted
ctx.getUserTransaction().begin(); // Accessible and enlisted
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close(); // Inaccessible and enlisted

c = cf.createConnection();
s = c.createSession(false, Session.AUTO_ACKNOWLEDGE); // Accessible and enlisted
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close(); // Inaccessible and enlisted
ctx.getUserTransaction().commit(); // Inaccessible and not enlisted

// Return to pool
}

The following example shows that there can be multiple connection handles:

public void ex4() {
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE); // Accessible, but not enlisted
ctx.getUserTransaction().begin(); // Accessible and enlisted
s.createProducer(q).send(s.createTextMessage("Hello world"));

Connection c2 = cf.createConnection();
Session s2 = c2.createSession(false, Session.AUTO_ACKNOWLEDGE); // Accessible and enlisted
s2.createProducer(q).send(s2.createTextMessage("Hello world"));
c2.close(); // Accessible and enlisted
ctx.getUserTransaction().commit(); // Accessible and not enlisted
c.close(); // Inaccessible and not enlisted
// Return to pool
}

How to monitor transaction enlistment

To monitor enlistment and the commit() or rollback() calls on the XAResource, it is possible to write a wrapper around the XAResource of the underlying provider. There are some drawbacks associated with that, and it is better to monitor the transaction through a javax.transaction.Synchronization object. I've described this in more detail in my blog of July 23, 2006: J2EE JCA Resource Adapters: The problem with XAResource wrappers.

Conclusion

By keeping track of both the number of outstanding connection handles given out to the application and the enlistment in a transaction, the ManagedConnection can figure out when it is returned to the pool so that it can undertake necessary actions such as destruction of objects indirectly created by the application.
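That bookkeeping amounts to a tiny state machine. The sketch below uses hypothetical names (it is not the JCA API) and replays the transitions annotated in ex2:

```java
public class ReturnToPoolDemo {

    // Tracks the two states from the article: the number of outstanding
    // connection handles and whether the connection is enlisted in a
    // transaction. Names are illustrative, not the JCA API.
    static class ManagedConnectionState {
        private int handles;
        private boolean enlisted;
        private int returnsToPool;

        void handleOpened() { handles++; }
        void enlist()       { enlisted = true; }

        void handleClosed() { handles--; maybeReturnToPool(); }
        void delisted()     { enlisted = false; maybeReturnToPool(); } // after commit/rollback

        // Back in the pool only when BOTH conditions hold; this is the safe
        // point to clean up things like temporary destinations.
        private void maybeReturnToPool() {
            if (handles == 0 && !enlisted) {
                returnsToPool++;
            }
        }

        int returnsToPool() { return returnsToPool; }
    }

    public static void main(String[] args) {
        ManagedConnectionState mc = new ManagedConnectionState();
        // The sequence of ex2: create a handle, begin tx, close the handle, commit.
        mc.handleOpened();                      // accessible, not enlisted
        mc.enlist();                            // accessible and enlisted
        mc.handleClosed();                      // inaccessible but still enlisted
        System.out.println(mc.returnsToPool()); // 0: not pooled yet
        mc.delisted();                          // commit completed
        System.out.println(mc.returnsToPool()); // 1: now returned to the pool
    }
}
```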

Sunday Aug 13, 2006

JMS request/reply from an EJB

This blog has moved to http://frankkieviet.blogspot.com

Monday Jul 31, 2006

J2EE JCA Resource Adapters: Poisonous pools

Introduction

As a developer of a JCA Resource Adapter (RA) you're responsible for all aspects of the communication between the EIS and the EJB, including handling connection failures so that poisoned pools are avoided.

Wait! Too many acronyms in one sentence? Poisoned pools? What am I talking about? Here's a short refresher. JCA is the Java Connector Architecture and defines how Enterprise Java Beans (your application) can communicate with Enterprise Information Systems (EIS). Examples of Enterprise Information Systems are ERP systems and CRM systems, as well as other enterprise systems such as databases and JMS. The "conduit" between the EJB and the EIS is the Resource Adapter (RA). The communication can originate both from the EIS and from the EJB. The former is called inbound, the latter outbound. In this write-up I'm looking at outbound only.

Creating a connection from your application to an EIS is often expensive. That is why the container (the application server) provides for connection pooling so that connections can be reused rather than getting recreated. This brings with it that there is a risk that faulty connections accumulate in the pool, thus causing a poisoned pool. In such a situation the application can no longer communicate with the EIS.

Seems simple enough, doesn't it? Let's look a bit closer...

Typical time scales

One of the services that an application server provides to applications is the pooling of resources. As such, when your application uses an outbound connection of a resource adapter, the application server will maintain a pool of connections. When your application needs a connection, the application server tries to satisfy this request first by checking the pool of idle connections; if there are no idle connections, a new connection is created or the application is blocked until a connection is returned to the pool. When the application closes a connection, it is returned to the pool.

Creating a new connection typically involves creating one or more new TCP/IP connections, authentication by the EIS, creating an internal session in the EIS and its associated memory structures and data, etc. This makes creating a new connection expensive: the time scale of connection creation is usually in the order of 50-300 ms. These expensive operations can be avoided by reusing an idle connection: the time scale of reuse is measured in microseconds rather than milliseconds. Next to consider is the time scale of connection use by your application, typically in the range of 1-5 ms.

To show the effects of connection pooling on throughput, let's assume that your application takes 3 ms to process a request, and that the time it takes to create a new connection is 300 ms, while reusing a connection takes 0.03 ms. The processing time is 3.03 ms with pooling, and 303 ms without pooling. A difference of a factor hundred! Sure, I made up the numbers in this example, but they are likely close to what you'll encounter in everyday practice.
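The arithmetic above can be checked with a trivial sketch. The numbers are the made-up ones from the text, not measurements:

```java
// Sanity check of the made-up numbers above: per-request time is the
// connection-acquisition time plus the processing time.
class PoolingMath {
    static double timePerRequestMs(double acquireMs, double processMs) {
        return acquireMs + processMs;
    }

    public static void main(String[] args) {
        double pooled = timePerRequestMs(0.03, 3);   // reuse an idle connection
        double unpooled = timePerRequestMs(300, 3);  // create a connection per request
        System.out.println("pooled: " + pooled + " ms, unpooled: " + unpooled + " ms");
        System.out.println("slowdown without pooling: about a factor "
            + Math.round(unpooled / pooled));
    }
}
```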

In addition to the sub-second time scale of connection use and creation, there is another timescale to consider: the typical duration of a failure. Communication failures are most often caused by the EIS becoming temporarily unavailable. This can have two causes: a loss of network connectivity, or a restart of the EIS. The latter is more probable and will be considered here as the typical failure scenario. Restarting an EIS typically takes from half a minute to several minutes. It is important to keep this timescale in mind when considering error handling strategies.

Mechanics of connection pooling

An outbound connection from the application to the EIS is represented by a ManagedConnection. The ManagedConnection holds the physical connection to the EIS. The lifecycle of a ManagedConnection is under control of the application server: a resource adapter creates a ManagedConnection when the application server tells it to; likewise the resource adapter destroys a connection only when instructed to do so by the application server.

A problem occurs when there is a communication failure with the EIS. For example, if a resource adapter connects to an external EIS, and this external server is restarted, the connections in the pool are all invalid. If the application were to use one of these connections, a failure would certainly occur. The failure would be propagated to your application code through an exception. A likely result would be that the transaction would be rolled back, and that the operation would be attempted again. The application server may not be able to distinguish this communication failure due to a faulty connection from other errors, so it may use the same faulty connection again on the next attempt, thereby ensuring that the same problem will happen for the next transaction. Effectively, the whole application has become inoperable because of the “poisoned pool”. To break out of this cycle, the resource adapter should let the application server know that connections are faulty so that the application server then can make the resource adapter recreate a new connection and avoid putting faulty connections back in the pool. There are several ways to do this.

  • Signal to the application server that a connection is no longer valid
  • Respond negatively when the application server asks whether a connection is valid

Signalling the application server that a ManagedConnection is faulty

After the application server instructs the Resource Adapter to create a new ManagedConnection, it calls the following method on that ManagedConnection:

public interface ManagedConnection
{
    void addConnectionEventListener(ConnectionEventListener listener);

    // other methods omitted for clarity
}

The managed connection uses this ConnectionEventListener object to notify the application server of connections being closed. This object can also be used to let the application server know that the connection is faulty using the CONNECTION_ERROR_OCCURRED event. Upon receiving this event, the application server will typically destroy the connection immediately.
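As a sketch of this mechanism, the managed connection might look like the code below. The types are simplified stand-ins for the javax.resource.spi interfaces so that the snippet is self-contained; in a real adapter, the listener callback takes a ConnectionEvent constructed with the CONNECTION_ERROR_OCCURRED event id:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for javax.resource.spi.ConnectionEventListener; the
// real interface receives a ConnectionEvent rather than a bare exception.
interface ConnectionEventListener {
    void connectionErrorOccurred(Exception cause);
}

class SketchManagedConnection {
    private final List<ConnectionEventListener> listeners = new ArrayList<>();

    // The application server registers a listener right after asking the
    // resource adapter to create this ManagedConnection.
    public void addConnectionEventListener(ConnectionEventListener l) {
        listeners.add(l);
    }

    // Called by the adapter itself when it detects that the physical
    // connection is broken; the server will typically destroy the
    // ManagedConnection immediately upon receiving this notification.
    public void signalConnectionError(Exception cause) {
        for (ConnectionEventListener l : listeners) {
            l.connectionErrorOccurred(cause);
        }
    }
}
```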

This approach can only be used if the resource adapter has some way of finding out when the connection is broken. In practice this turns out to be quite difficult: most resource adapters are not written from scratch but rather make use of some client jar that takes care of the communication with the EIS. Often, the vendor of the resource adapter is not the vendor of the EIS, and even if this were the case, the vendor of the EIS most likely needs to make a client jar available independent from a resource adapter anyway. It is not uncommon that these client jars don't provide any mechanism to propagate connection failures to the caller. For instance, if the EIS is JMS, the client jar will expose only the JMS API and there is no way in the JMS API to tell the caller of a method that the physical connection is faulty.

Because the application server will destroy the connection immediately upon receiving the CONNECTION_ERROR_OCCURRED event and will roll back the transaction, it is important that the ManagedConnection does not fire this event for application-level error conditions that do not indicate a faulty connection.

Fortunately there are alternatives to the CONNECTION_ERROR_OCCURRED mechanism.

The ValidatingManagedConnectionFactory

Rather than telling the application server that the connection is faulty, the managed connection can also wait for the application server to ask the managed connection whether it is still a valid connection. To make that work, the managed connection factory needs to implement the ValidatingManagedConnectionFactory interface:

public interface ValidatingManagedConnectionFactory
{
    Set getInvalidConnections(Set connectionSet);
}

The application server will check if the managed connection factory implements this interface, and if so, it will call the getInvalidConnections() method periodically.

Again it is often a problem for the managed connection factory to know if a connection is valid or not. For example, if the resource adapter wraps a database connection, there is often no way to find out if a connection is still "live" or not. E.g. on a JDBC connection the isClosed() method does not return any status information on the physical connection but only returns whether the close() method was called.

Passive negative checks
One way for a managed connection to keep track of possible connection problems is to monitor exceptions being thrown from the client runtime to the application. If there is a way for the ManagedConnection to discern application errors (e.g. a syntax error in a prepared statement) from communication failures, the managed connection can assume that it may be faulty if the exception count is greater than zero. For example, if the resource adapter wraps a JMS provider, it could reasonably assume that exceptions from methods like send(), publish() etc. indicate connectivity problems.

Passive positive checks
If it is not possible to passively monitor connection failures, perhaps it is possible to keep track of when a connection was used without any problem for the last time. If a connection was not used for more than say 30 seconds, you could mark that connection as invalid. Of course there is a risk that the application uses the connection less often than once every 30 seconds; if that is the case, the expense of recreating a connection may not be that bad.

Active validity check
Another way is for the managed connection to actively check the connection validity. If the managed connection uses an Oracle connection underneath, it could do a select on the DUAL table. However, it is important that this check is not very time consuming.
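The passive negative and passive positive heuristics can be combined into one small bookkeeping class. This is an illustrative sketch with invented names, not spec API; an active ping would be layered on top of it:

```java
// Sketch (names invented) combining the heuristics above: passive negative
// (count connectivity exceptions) and passive positive (track the last
// successful use). An active validity check could override the verdict.
class ConnectionHealth {
    private final long maxIdleMs;   // e.g. 30_000 ms, the text's ballpark window
    private long lastSuccessMs;
    private int connectivityErrors;

    ConnectionHealth(long maxIdleMs, long nowMs) {
        this.maxIdleMs = maxIdleMs;
        this.lastSuccessMs = nowMs;
    }

    void onSuccessfulUse(long nowMs) { lastSuccessMs = nowMs; }

    void onConnectivityError() { connectivityErrors++; }

    // Called from getInvalidConnections(): err on the side of "maybe broken".
    boolean maybeInvalid(long nowMs) {
        if (connectivityErrors > 0) return true;     // passive negative check
        return nowMs - lastSuccessMs > maxIdleMs;    // passive positive check
    }
}
```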

Keep an eye on expenses!
Above it was mentioned that the application server will call the getInvalidConnections() method periodically. How often does the application server call this method? The application server may have a timer thread that goes over all the idle connections in the connection pool and checks to see if they are still valid. There are some serious problems with this approach if it is the only time that the application server calls this method: when the system is processing at or near capacity, the application server will hardly ever find idle connections in the pool.

That's why application servers typically will call the getInvalidConnections() method before they give out a connection to the application. A simple but expensive approach is for the application server to call this method every time a connection is given to the application. A smarter approach is to do this not more often than every so many seconds, a value that is configurable for the server. This value is chosen based on the expected failure duration. As was mentioned earlier, the expected failure duration is likely greater than 30 seconds. Hence, it makes little sense for the application server to call getInvalidConnections() more often than every 30 seconds.

Keep in mind however that there is no standard on what application servers do, so it is important to make sure that the getInvalidConnections() method is fast on average. If calling an expensive method is the only way to find out if a connection is valid, the managed connection factory could keep track of when it was called last, so that it will not call this expensive method more often than every so many seconds. A guess can be made at a reasonable time span by looking at how expensive the check is, and keeping the timescales of connection failures in mind, again 30 seconds being a reasonable ballpark number.

Desperate measures
If there's really no way for the managed connection to find out anything about the validity of the connection, it could resort to a crude but effective workaround: it can set a limit on the lifetime of the connection, e.g. one minute. Again, this time interval is based on the timescales of connectivity failures. This will have a small adverse effect on performance: connections are destroyed and recreated more often than they need to be. This effect will not be very big however: most of the time the application can in fact reuse an existing connection. In the example above with a connection time of 300 ms, the throughput goes down by 0.5% when the maximum lifetime of a connection is 1 minute.

Should a connection failure occur, the faulty connection will be reused for less than one minute, so the problem will eventually correct itself. If the application is used continuously, and if the expected downtime is more than one minute, this will not make any difference to the application, because during the one minute in which the connection is faulty, the EIS is unavailable anyway.
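A sketch of the maximum-lifetime workaround, with invented names, and a check of the overhead estimate: recreating a 300 ms connection once per minute costs 300 / 60000 = 0.5% of capacity:

```java
// Sketch of the maximum-lifetime workaround (names invented here).
class LifetimeLimit {
    // Destroy-and-recreate once a connection exceeds its maximum lifetime.
    static boolean expired(long createdAtMs, long nowMs, long maxLifetimeMs) {
        return nowMs - createdAtMs > maxLifetimeMs;
    }

    // Fraction of time spent reconnecting, e.g. 300 ms per 60 s = 0.005.
    static double reconnectOverhead(double connectMs, double maxLifetimeMs) {
        return connectMs / maxLifetimeMs;
    }
}
```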

Complicating factor: transaction enlistment

Resource adapters declare in the ra.xml what level of transactions they support. There are three levels: XATransaction, LocalTransaction and NoTransaction. If a resource adapter supports XATransaction, this means that the resource adapter supports XA; the application server will call getXAResource() on the ManagedConnection to get hold of the XAResource object to control the transaction. Resource adapters that support LocalTransaction return an instance of the LocalTransaction class when the application server calls getLocalTransaction() on the ManagedConnection. This interface has methods begin(), commit() and rollback(). Resource adapters that only support NoTransaction don't participate at all in transactions.

If a resource adapter supports XATransaction, the managed connection will have to be enlisted in every transaction. The transaction manager in the application server will call start() on the XAResource. The start() call is the very first method that the application server calls on the managed connection after getting it out of the pool. The start() method will typically call into the EIS, causing an exception if the connection is faulty. The best way of dealing with this is for the application server to discard the connection, i.e. call destroy() and remove the connection from the pool. Some application servers (e.g. the Integration Server in Java CAPS) do that. So for these application servers it may suffice to do nothing in your resource adapter and still avoid poisoned connection pools. However, there are plenty of other application servers that will propagate the exception to the application and return the connection to the pool. And you do want your resource adapter to work well with any application server, don't you?

For application servers that don't destroy connections when the enlistment fails, it is critical that the resource adapter provides a fault detection strategy. Unfortunately, an exception on the start() method is difficult to detect for most resource adapters, because resource adapters often expose the XAResource from the client runtime directly to the application server's transaction manager. There's good reason for this, because there's an inherent problem with XAResource wrappers as I noted in my previous blog. In these situations the passive positive check as I outlined above may be useful.

Conclusion

When developing a resource adapter, it's crucial to provide for connection failure detection. Keep in mind:

  • different application servers behave differently, e.g. different frequency of calling getInvalidConnections(), different behavior when the enlistment of a connection fails
  • transaction enlistment failures may be the only failures that occur; can you detect them?
  • There are different ways of guessing if a connection is valid, even if the monitoring of failures doesn't work:
    • track when a connection was used without failure
    • assign a maximum lifetime
  • Keep an eye on expenses! Make sure that connections are not recreated every time, and make sure that active health checks don't happen too often.

With all this, keep in mind the different time scales:
  • how long it takes to create a new connection
  • how long a typical connection failure lasts
  • how many requests an application is likely to process per second

Sunday Jul 23, 2006

J2EE JCA Resource Adapters: The problem with XAResource wrappers

Let's say that you're writing an outbound JCA Resource Adapter. Let's say that it supports XA. Let's say that you need to know when the transaction is committed. You would be tempted to provide a wrapper around the "native" XAResource. If you are, read on: there are some problems you need to consider before doing that! Warning: technical content ahead! The remainder of this posting is full of technical terms like XAResource and ManagedConnection.


First let me explain in more detail what I am talking about.

Introduction

A client runtime typically is a library that takes care of the communication between a Java client and a server. For instance, a JMS client runtime is one or more jars that implement the JMS API and take care of the communication with the JMS server. Likewise, a JDBC client runtime library implements the JDBC API to provide connectivity with a database server.

Many adapters that support XA either wrap around or build on top of an existing client runtime. For example, a JMS resource adapter typically wraps around an existing JMS client runtime. A workflow engine adapter may internally use a JDBC connection for persistence so it can be said that it builds on top of a JDBC client runtime.


How does the native XAResource fit in with the JCA container? When the application server requests the XAResource from the managed connection through the getXAResource() call, the managed connection may return its own implementation of the XAResource object, or it may return the XAResource implemented by the client runtime (the "native" XAResource). The former type is essentially a wrapper around the XAResource implemented by the client runtime.


Why is this important? Often it is necessary for a managed connection to be notified of the progress of a transaction: a managed connection may need to update its state after the transaction has been committed or rolled back. The JCA spec does not provide a standard way of doing this other than through the XAResource. This may invite you (the developer of the adapter) to write a wrapper around the XAResource instead of exposing the XAResource of the underlying client runtime directly.


There are some problems associated with the wrapper-approach, which will next be discussed in detail.


How should isSameRM() be implemented?

The isSameRM() method is called by the transaction manager to find out if two XAResource-s use the same underlying resource manager. If this is the case, instead of creating a new transaction branch, the transaction manager will join the second XAResource into the same transaction branch.

The method isSameRM() can be implemented as follows:

class WrappedXAResource implements XAResource {
    private XAResource delegate;

    public boolean isSameRM(XAResource other) throws XAException {
        if (other instanceof WrappedXAResource) {
            return delegate.isSameRM(((WrappedXAResource) other).delegate);
        } else {
            return delegate.isSameRM(other);
        }
    }
}

Let's look at a scenario where there are three resources to be enlisted in the same transaction. Two resources belong to the same resource adapter (say W1 and W3), and the other resource belongs to an unknown entity, say R2. Let's assume that the underlying resource manager is the same. This can happen for example when the resource adapter builds on top of a JDBC driver, and the other entity is in fact a database connection to the same database.

Let's say that the application server enlists the resources in this order in the transaction: W1, R2, W3. The transaction manager may call the isSameRM() method as follows:

Case A

  1. enlist W1:
  2. enlist R2:
  3. W1.isSameRM(R2); // returns true; R2 is joined into W1
  4. enlist W3:
  5. W1.isSameRM(W3); // returns true; W3 is joined into W1

In this case, all resources are joined, i.e. one branch with W1 receiving the prepare/commit/rollback calls.

Alternatively, the transaction manager may invoke the isSameRM() call as follows:

Case B

  1. enlist W1:
  2. enlist R2:
  3. R2.isSameRM(W1); // returns false
  4. enlist W3:
  5. W3.isSameRM(W1); // returns true; W3 is joined with W1

In this case there will be two transaction branches with W1 and R2 receiving both the prepare/commit/rollback calls.

Exactly how the transaction manager invokes the isSameRM() method depends on the implementation of the transaction manager and may differ from one implementation to another.

Now let's look at what happens if the resources happen to be enlisted in this order: R1, W2, W3

Case C

  1. enlist R1
  2. enlist W2
  3. R1.isSameRM(W2); // returns false
  4. enlist W3
  5. R1.isSameRM(W3); // returns false
  6. W2.isSameRM(W3); // returns true; W3 is joined into W2

In this case there will be two branches with R1 and W2 receiving the prepare/commit/rollback calls.

Case D

  1. enlist R1
  2. enlist W2
  3. W2.isSameRM(R1); // returns true; W2 is joined into R1
  4. enlist W3
  5. W3.isSameRM(R1); // returns true; W3 is joined into R1

This case results in one transaction branch with R1 receiving the prepare/commit/rollback calls, and neither W2 nor W3 receiving any.

To avoid case D where none of the wrappers receive prepare/commit/rollback calls, the implementation of isSameRM() should only consider other wrappers, and never consider an unwrapped XAResource:

public boolean isSameRM(XAResource other) throws XAException {
    if (other instanceof WrappedXAResource) {
        return delegate.isSameRM(((WrappedXAResource) other).delegate);
    } else {
        return false;
    }
}

This will also take care of the asymmetric behavior of isSameRM() where W1.isSameRM(R2) returns true, while R2.isSameRM(W1) returns false.

Note that if multiple wrappers are joined together, only one wrapper will receive the prepare/commit/rollback calls. It is possible to keep track of all resources that are joined together, but, although feasible, this code becomes rather complicated. A simpler approach is to always return false in the isSameRM() method:

public boolean isSameRM(XAResource other) {
    return false;
}

The obvious drawback is that this will result in more transaction branches and will be more expensive.

There's another complication that may result in the wrapper not getting any commit/rollback calls. This has to do with optimizations in the resource manager.

XAResource.prepare()

If R1 and W2 are really using the same resource manager, but the isSameRM() call returned false, there will be two transaction branches from the perspective of the transaction manager. The underlying resource manager, however, will see two branches with the same global transaction id. The resource manager may then decide to join these two branches together internally. The result is that when the transaction manager calls XAResource.prepare() on W2, the underlying XAResource may return XA_RDONLY. If the transaction manager receives this signal, it should not call commit() or rollback() on that resource.

The wrapper can provide more code to deal with this situation: instead of delegating the call to prepare() to the underlying XAResource and returning the return value to the caller (the transaction manager), the wrapper should make sure that it will never return XA_RDONLY. It should store this fact in its internal state, so that when the transaction manager calls commit() or rollback(), the wrapper will check if it had overruled XA_RDONLY and not call commit() or rollback() on the underlying XAResource.
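A sketch of that bookkeeping follows. The class name is invented here; the javax.transaction.xa types ship with the JDK, so the snippet is self-contained:

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Sketch: a wrapper that never returns XA_RDONLY to the transaction manager.
// It remembers that the delegate already finished the branch, and suppresses
// the later commit()/rollback() on the delegate.
class RdonlyMaskingXAResource implements XAResource {
    private final XAResource delegate;
    private boolean sawRdonly;

    RdonlyMaskingXAResource(XAResource delegate) { this.delegate = delegate; }

    public int prepare(Xid xid) throws XAException {
        int rc = delegate.prepare(xid);
        if (rc == XA_RDONLY) {
            sawRdonly = true; // the resource manager already completed this branch
            return XA_OK;     // keep the branch alive so we still see the outcome
        }
        return rc;
    }

    public void commit(Xid xid, boolean onePhase) throws XAException {
        if (!sawRdonly) delegate.commit(xid, onePhase);
        // else: nothing left to commit in the resource manager
    }

    public void rollback(Xid xid) throws XAException {
        if (!sawRdonly) delegate.rollback(xid);
    }

    // Only recognize sibling wrappers, per the isSameRM() discussion above.
    public boolean isSameRM(XAResource other) throws XAException {
        return other instanceof RdonlyMaskingXAResource
            && delegate.isSameRM(((RdonlyMaskingXAResource) other).delegate);
    }

    // The remaining methods delegate unchanged.
    public void start(Xid xid, int flags) throws XAException { delegate.start(xid, flags); }
    public void end(Xid xid, int flags) throws XAException { delegate.end(xid, flags); }
    public void forget(Xid xid) throws XAException { delegate.forget(xid); }
    public Xid[] recover(int flag) throws XAException { return delegate.recover(flag); }
    public int getTransactionTimeout() throws XAException { return delegate.getTransactionTimeout(); }
    public boolean setTransactionTimeout(int s) throws XAException { return delegate.setTransactionTimeout(s); }
}
```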

The expense of having multiple branches

The performance difference between a transaction with a single branch and a transaction with two branches is enormous. In the case of a single branch, the transaction manager can skip the call to prepare() and only needs to call commit(onephase=true), and it does not need to log any state to its transaction log. Any write operation to the disk, both by the underlying resource manager and by the transaction manager writing to its transaction log, is expensive. This is because to be able to guarantee transactional integrity, the write operations have to guarantee that the data is in fact on the disk, and not in some write cache. This is done by “syncing” the data to disk. This is an expensive operation; even a fast hard drive cannot sync to the disk faster than say 100 times per second. So, changing a single-branch transaction into a transaction with two branches is in fact very expensive.

An alternative

Instead of using wrappers around the XAResource, it's also possible to register interest in the outcome of the transaction by registering a javax.transaction.Synchronization object with the transaction manager. This interface declares two methods: beforeCompletion() and afterCompletion(). The latter takes an argument to indicate if the transaction was committed or rolled back.

The Synchronization object needs to be registered with the javax.transaction.Transaction object using the registerSynchronization(Synchronization sync) method; this object can be obtained from the javax.transaction.TransactionManager object using the getTransaction() method. The question is how to obtain a handle to the TransactionManager. This is not specified in the J2EE spec, and different application servers make the transaction manager available in different ways. As it turns out, most application servers bind the transaction manager in JNDI; for a few others, some extra code is necessary to obtain it by invoking methods on proprietary classes. A notable exception is IBM WebSphere, which does not provide access to the javax.transaction interfaces but provides its own proprietary interfaces. However, with some extra code, the same behavior can be obtained. The bottom line is that it is doable to develop code that can register a Synchronization object on all current application servers.
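As a sketch of what such a listener does, consider the code below. The interface is a simplified stand-in for javax.transaction.Synchronization (which lives in the JTA API jar, not the JDK), and the status constants mirror javax.transaction.Status:

```java
// Simplified stand-in for javax.transaction.Synchronization; the real
// interface is registered via Transaction.registerSynchronization(sync).
interface Synchronization {
    void beforeCompletion();
    void afterCompletion(int status);
}

class TxObserver implements Synchronization {
    // Values mirroring javax.transaction.Status
    static final int STATUS_COMMITTED = 3;
    static final int STATUS_ROLLEDBACK = 4;

    boolean inTransaction = true;
    int outcome = -1;

    public void beforeCompletion() {
        // Last chance to flush pending work while the transaction is active.
    }

    public void afterCompletion(int status) {
        // The transaction has ended; update the managed connection's state.
        inTransaction = false;
        outcome = status;
    }
}
```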

The approach using a Synchronization object does not suffer from the performance penalty of causing multiple transaction branches when only one would suffice. Hence, this is a better alternative than using wrappers.


Saturday Jul 22, 2006

Resource adapters at JavaOne

At JavaOne 2006 I gave the presentation "Developing J2EE Connector Architecture Resource Adapters". I did that together with Sivakumar Thyagarajan. He works in Sun's Bangalore office.

That he works there, while I work in Monrovia California... that's one of the interesting things about working at Sun: it's very international. Sun has people in all corners of the world. Last week I was on a phone call with perhaps 20 other people, and 12 of them were from other countries.

I talked to Sivakumar a few times on the phone before JavaOne, but met him face to face for the first time at JavaOne. That's also one of the nice things about JavaOne: you get to meet people face to face whom you otherwise only talk with on the phone.

The presentation went pretty well, and the audience agreed: at the end of the presentation all audience members can fill in an evaluation form. Here are the feedback results:

                     Overall quality   Speakers
Our presentation     4.32              4.28
Average at JavaOne   3.99              3.90

If you're interested, here's some more information:
JavaOne session information of "Developing J2EE Connector Architecture Resource Adapters"
Slide presentation "Developing J2EE Connector Architecture Resource Adapters"

I've recorded the presentation on my MP3 player:
Audio presentation of "Developing J2EE Connector Architecture Resource Adapters" (.wav)
Audio presentation of "Developing J2EE Connector Architecture Resource Adapters" (.mp3)

Next: JCA Resource adapters

At Sun I'm responsible for the application server and the JMS server that are shipping as part of Java CAPS. As such I've been involved quite a bit in resource adapters (Java Connector Architecture, or JCA).

All this serves as an introduction to some more technical blogs on resource adapters.

Sunday Jul 16, 2006

Automatic log translation

Why would I want to translate logs from one locale to another?

Say that I were to build a product, and I did a nice job making it internationalizable. The product is a success, and it is localized into various languages. Next, a customer in a far-away place sends an email complaining about the server failing to start up. For my convenience, he attached the log file. Since I did a good job writing decent error and logging messages, I expect it to be no problem to diagnose what's going wrong. But oops, since I did such a nice job internationalizing the product, the log is in some far-away language, say Japanese! Now what?

Let me give an example. Let's say that the log contains this entry:
(F1231) Bestand bestelling-554 in lijst bestellingen kon niet geopend worden: (F2110) Een verbinding met 
server 35 kon niet tot stand gebracht worden.
Looks foreign to you? (It doesn't to me, but I would have great problems if this example were in Japanese).

Fortunately, through the error codes, we could make an attempt to figure out what it says by looking up the error codes. However, if there are many log entries, this becomes a laborious affair. Would it be possible to obtain an English log file without tedious lookups and guess work?

I think there are a few different approaches to this problem:
  1. always create two log files: an English one and a localized one.
  2. store the log in a language-neutral form, and use log viewers that render the log in a localized form
  3. try to automatically "translate" the localized log file into English
The trouble with the first two approaches is that in Java, localization happens early rather than late. Let me explain what I mean by that. If you have a compound message as in the example, at each point that information is added to the message, the message is localized. The example above could have occurred through the following code sequence:
try {
    ...
    throw new Exception(msgcat.get("F2110: Could not establish connection with server {0}", serverid));
} catch (Exception e) {
    String msg = msgcat.get("F1231: Could not open file {1} in directory {0}: {2}", dir, file, e);
    throw new IOException(msg, e);
}
The exception message is already localized when it is thrown; in the catch block, there is no language-neutral message anymore. It would have been nice if there were a close cousin to the Object.toString() method, one that takes the locale: toString(Locale), and if the Exception class would take an Object instead of limiting itself to a String.

In a previous product where I had more control over the complete codebase, I approached this problem by introducing a specialized text class that supported the toString(Locale) method, and Exception classes that could return this text class. This solution was also ideal for storing text in locale-neutral form in a database, so that different customers could view the data in different locales.

There is a kludgy work-around: we could change the msg.get() method so that it returns a String that is locale neutral rather than localized. A separate method would convert the locale-neutral String into a localized String, e.g. msg.convert(String, Locale). This method would have to be called any time a String is prepared for viewing, e.g. in the logging code for a localized log.

In the products that I am currently working on, these approaches to support locale-neutral strings are not feasible because they would require widespread changes. So let's take a look at option 3.

Given the resource bundle
F1231 = Bestand {1} in lijst {0} kon niet geopend worden: {2}
F2110 = Een verbinding met de server {0} kon niet tot stand gebracht worden.
and
F1231 = Could not open file {1} in directory {0}: {2}
F2110 = Could not establish connection with server {0}
let's see if there is a way to automatically translate the message
(F1231) Bestand bestelling-554 in lijst bestellingen kon niet geopend worden: (F2110) Een verbinding met
de server 35 kon niet tot stand gebracht worden.
into
(F1231) Could not open file bestelling-554 in directory bestellingen: (F2110) Could not establish
connection with server 35
I think it is possible to build a tool that can do that. The tool would read in all known resource bundles (possibly by pointing it at the installation image, after which the tool would scan all jars to extract all resource bundles) and translate them into regular expressions. It would have to be able to recognize error codes (e.g. \([A-Z]\d{4}\)) and use these to successively expand the error message into its full locale-neutral form. In the example, the neutral form is:
[F1231, {0}=bestellingen, {1}=bestelling-554, {2}=[F2110, {0}=35]]
The neutral form then can be easily converted into the localized English form.
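As a proof of concept, here is a rough sketch of such a reverse-localizer, assuming the "(Fnnnn) " prefix convention from the example; a real tool would build the pattern tables by scanning the jars:

```java
import java.util.*;
import java.util.regex.*;

public class Relocalizer {
    // Hypothetical bundles; the real tool would read these from the installation image.
    static final Map<String, String> nl = Map.of(
        "F1231", "Bestand {1} in lijst {0} kon niet geopend worden: {2}",
        "F2110", "Een verbinding met de server {0} kon niet tot stand gebracht worden.");
    static final Map<String, String> en = Map.of(
        "F1231", "Could not open file {1} in directory {0}: {2}",
        "F2110", "Could not establish connection with server {0}");

    /** Re-localize a "(Fnnnn) ..." message from the src bundle to the dst bundle. */
    static String relocalize(String msg, Map<String, String> src, Map<String, String> dst) {
        Matcher code = Pattern.compile("^\\(([A-Z]\\d{4})\\) ").matcher(msg);
        if (!code.find()) return msg;                       // no error code: plain argument
        String id = code.group(1);
        String body = msg.substring(code.end());
        // Turn the source pattern into a regex: each {n} placeholder becomes a group.
        String pattern = src.get(id);
        StringBuilder rx = new StringBuilder();
        List<Integer> order = new ArrayList<>();
        Matcher ph = Pattern.compile("\\{(\\d)\\}").matcher(pattern);
        int pos = 0;
        while (ph.find()) {
            rx.append(Pattern.quote(pattern.substring(pos, ph.start()))).append("(.+?)");
            order.add(Integer.parseInt(ph.group(1)));
            pos = ph.end();
        }
        rx.append(Pattern.quote(pattern.substring(pos)));
        Matcher m = Pattern.compile(rx.toString(), Pattern.DOTALL).matcher(body);
        if (!m.matches()) return msg;
        // Recover the arguments; recurse so nested messages are re-localized too.
        String[] args = new String[10];
        for (int i = 0; i < order.size(); i++) {
            args[order.get(i)] = relocalize(m.group(i + 1), src, dst);
        }
        // Substitute the arguments into the target-locale pattern.
        String out = dst.get(id);
        for (int i = 0; i < args.length; i++) {
            if (args[i] != null) out = out.replace("{" + i + "}", args[i]);
        }
        return "(" + id + ") " + out;
    }

    public static void main(String[] a) {
        System.out.println(relocalize(
            "(F1231) Bestand bestelling-554 in lijst bestellingen kon niet geopend worden: "
            + "(F2110) Een verbinding met de server 35 kon niet tot stand gebracht worden.",
            nl, en));
    }
}
```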

Saturday Jul 15, 2006

Internationalization of logging and error messages for the server side (cont'd)

In Thursday's entry I proposed that a tool is needed to make it easier to internationalize sources where the English error messages are kept in the source file, and the foreign language messages are in resource bundles.

Let's talk about this tool. There are a few different ways to go about building it. As Tim Foster remarked in his comments, it's possible to parse the source code. This approach is doable, especially when existing tools are used (Tim mentioned http://open-language-tools.dev.java.net).

Another approach is to parse the compiled byte code. Using tools like BCEL, it's fairly simple to read a .class file, and extract all the strings in there. It could easily be run on the finished product: just add some logic to go over the installation image, find all jars, and then iterate over all .class files in the jars.

Fortunately the compiler merges a string that is split over multiple lines in the source into a single constant:
String msg = msgcat.get("F1231: Could not open file {1}"
+ " in directory {0}: {2}", dir, file, ex);
is found in the .class file as:
F1231: Could not open file {1} in directory {0}: {2}
So it's simple to extract strings from a .class file. But how can we discern strings that represent actual error messages from other strings? Error messages can be recognized because they start with an error code that follows a particular pattern. In the example
String msg = msgcat.get("F1231: Could not open file {1} in directory {0}: {2}", dir, file, ex);
the pattern (regular expression) is
[A-Z]\d{4}:
Note that a similar trick would have to be used when parsing source code, unless some logic is applied to find only those strings that are used in particular constructs, like calls to logger methods or exception constructors. This can quickly become very complicated because often log wrapper classes are used instead of java.util.logging.Logger.
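Whichever way the strings are obtained, the filtering step itself is straightforward. A minimal sketch, using the convention above:

```java
import java.util.*;
import java.util.regex.*;

public class ErrorMessageFilter {
    // Strings starting with an error code like "F1231: " are error messages.
    static final Pattern ERROR_CODE = Pattern.compile("^[A-Z]\\d{4}: ");

    /** Keep only the strings that match the error-message convention. */
    static List<String> errorMessages(Collection<String> constants) {
        List<String> result = new ArrayList<>();
        for (String s : constants) {
            if (ERROR_CODE.matcher(s).find()) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Constants as they might come out of a .class file's constant pool.
        List<String> constants = List.of(
            "F1231: Could not open file {1} in directory {0}: {2}",
            "com/stc/jms/server/SegmentMgr",   // a class name, not a message
            "yyyy-MM-dd");                     // a date format, not a message
        System.out.println(errorMessages(constants));
        // prints: [F1231: Could not open file {1} in directory {0}: {2}]
    }
}
```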

This is also the answer to a question that I didn't pose yet: in the following code,
String msg = msgcat.get("F1231: Could not open file {1} in directory {0}: {2}", dir, file, ex);
throw new IOException(msg, ex);
how does the msgcat object localize messages? It does that by extracting the error code (F1231) from the message, applying the same regular expression, or by splitting the string on the colon. In either case, it's important to have a convention for how the error code is embedded in the message.
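In code, that convention could look something like the following hypothetical sketch of msgcat (the real class would load its bundle from the locale's .properties file):

```java
import java.text.MessageFormat;
import java.util.*;

// Hypothetical sketch of msgcat: the English message lives in the source,
// and the error code is used to look up the localized pattern.
class MsgCat {
    private final Map<String, String> bundle; // error code -> localized pattern

    MsgCat(Map<String, String> bundle) {
        this.bundle = bundle;
    }

    String get(String msg, Object... args) {
        int colon = msg.indexOf(':');
        String code = colon > 0 ? msg.substring(0, colon) : "";
        String pattern = msg; // fall back to the English text from the source
        if (code.matches("[A-Z]\\d{4}") && bundle.containsKey(code)) {
            pattern = code + ": " + bundle.get(code);
        }
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        MsgCat nl = new MsgCat(Map.of(
            "F1231", "Bestand {1} in lijst {0} kon niet geopend worden: {2}"));
        System.out.println(nl.get(
            "F1231: Could not open file {1} in directory {0}: {2}",
            "orders", "order-554", "disk full"));
        // prints: F1231: Bestand order-554 in lijst orders kon niet geopend worden: disk full
    }
}
```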

Next problem: how to re-localize an existing log so that an American support engineer can read a log from his product that was created on a Japanese system?

Thursday Jul 13, 2006

Internationalization of logging and error messages for the server side (cont'd)

I ended my previous blog with "Isn't there a better way?" Well... how would I like to use error messages in Java source code? I would simply want to write something like this:

String msg = msgcat.get("F1231: Could not open file {1} in directory {0}: {2}", dir, file, ex);
throw new IOException(msg, ex);
The advantages are clear:
  1. No switching to a different file to add the error message
  2. The error message is visible right there in the source code -- easy to review!
  3. It's easy to see that the arguments in {0}, {1} are correct   
But how would we deal with the same error message (same error code) being used in multiple places in the source? Well... there's not a good answer for that. Fortunately, error messages tend to be unique, i.e. rarely would the same error message be reused.

How is this source file internationalized? We need a tool! The tool will
  1. locate all error messages in the code base
  2. generate properties files for all desired languages if they don't exist     
  3. print out a list of all properties files that need to be localized
Here's what I would like to see as the output of the tool:
msgs_en_US.properties
# AUTOMATICALLY GENERATED
# DO NOT EDIT
# com.stc.jms.server.SegmentMgr
F1231 = Could not open file {1} in directory {0}: {2}
and   
msgs_nl_NL.properties
# com.stc.jms.server.SegmentMgr
# F1231 = Could not open file {1} in directory {0}: {2}
# ;;TODO:NEW MESSAGE;;F1231 =

so that a human translator can easily add the translations to the foreign properties files. As you can see, it includes the location (Java class) where the message was encountered.

msgs_nl_NL.properties
# com.stc.jms.server.SegmentMgr
# F1231 = Could not open file {1} in directory {0}: {2}
F1231 = Bestand {1} in lijst {0} kon niet geopend worden: {2}
Of course the tool would also have a way of handling the case where you change the error message in the source code. The translated properties file would then look like this:
msgs_nl_NL.properties
# com.stc.jms.server.SegmentMgr
# F1231 = File {1} in directory {0} could not be opened: {2}
# F1231 = Could not open file {1} in directory {0}: {2}
# ;;TODO: MESSAGE CHANGED;;
F1231 = Bestand {1} in lijst {0} kon niet geopend worden: {2}
Any ideas on how to build this tool?
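As a starting point, here is a rough sketch of the bundle-updating step; the extraction of messages from the code is assumed to have happened already, and all names below are made up:

```java
import java.util.*;

public class BundleUpdater {
    /**
     * Produce the body of an updated translated .properties file.
     * 'extracted' maps error code to the English message found in the source,
     * 'existing' holds the current translations, and 'lastSeen' holds the
     * English text each translation was made against.
     */
    static String update(Map<String, String> extracted,
                         Map<String, String> existing,
                         Map<String, String> lastSeen) {
        StringBuilder out = new StringBuilder();
        // TreeMap gives deterministic, sorted output for the translator.
        for (Map.Entry<String, String> e : new TreeMap<>(extracted).entrySet()) {
            String code = e.getKey(), english = e.getValue();
            out.append("# ").append(code).append(" = ").append(english).append('\n');
            String translation = existing.get(code);
            if (translation == null) {
                // New message: generate a placeholder for the human translator.
                out.append("# ;;TODO:NEW MESSAGE;;").append(code).append(" =\n");
            } else if (!english.equals(lastSeen.get(code))) {
                // The English text changed since the translation was made.
                out.append("# ;;TODO: MESSAGE CHANGED;;\n")
                   .append(code).append(" = ").append(translation).append('\n');
            } else {
                out.append(code).append(" = ").append(translation).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(update(
            Map.of("F1231", "Could not open file {1} in directory {0}: {2}"),
            Map.of(), Map.of()));
    }
}
```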

Monday Jul 10, 2006

Internationalization of logging and error messages for the server side

This is how one would throw an internationalizable message:
String msg = msgcat.get("EOpenFile", dir, file, ex);
throw new IOException(msg, ex);
The localized message is typically in a .properties file, e.g.
EOpenFile=F1231: Could not open file {1} in directory {0}: {2}
Each language has its own .properties file. The msgcat class is a utility class that loads the .properties file. Logging messages to a log file typically uses the same constructs.

Looks cool, right? So what's my gripe?
  1. When coding this, you have to update the .properties file in addition to the .java file you're working on
  2. It's easy to make a typo in the message identifier, EOpenFile in the example above; there is no compile-time checking for these "constants"
  3. It's difficult to check that the right parameters are used in the right order ({0}, {1}, etc.)
  4. When reviewing the .java file, it's difficult to check that the error message is used in the right context and that it is meaningful there
  5. When reviewing the .properties files, it's difficult to determine where the error messages are used (if at all!) -- you can only find out through a full text search
Isn't there a better way?