The Changing Face of Open Source

I've been involved in open source for many years now, both as a consumer and a provider. In 2000 I started a project (as yet unfinished and now defunct) for a UML modeling tool called Thorn. Back then the web was still new, but a lot of infrastructure was already in place for creating robust open source projects. JUnit wasn't exactly new, but it was (and still is) a great tool. Abbot was a tool for testing user interfaces.

Online documentation was maturing from simple text files to complete suites of HTML pages that you could access locally. JavaHelp let you build a help system that users could reach from within your Swing application. Of course, Javadoc was there for producing the documentation needed by the developers that you might coax into helping with your open source project. Ant was the primary build system of choice.

Fast-forward nine years and I find myself working on another open source project, this one called StarBridge (not on SourceForge yet). It's a small utility program designed to work with another open source project called Celestia. Celestia is a "space simulation" program that allows you to view the stars from anywhere in the universe. It works on PC, Mac and Linux platforms and is really a beautiful piece of work. I love the program, but I wanted a way to get current satellite information into Celestia quickly and easily. Believe it or not, satellite orbits change significantly over time, as does the orbit of the International Space Station. The information on satellite orbits is updated on a daily basis, but the format for that data is not very "people friendly". So I embarked on writing my StarBridge utility to help move information into Celestia.

One of the great things about getting older is that you already know a lot and don't have to spend as much time learning new things. One of the drawbacks of getting older is that many of the things you thought you knew have changed. Open source technologies are no exception. Much has changed since 2000. Let's review some of those changes and how they can bite you if you are living on past assumptions.

Swing is the Thing!

As I started on my little StarBridge utility program, I decided to write it in Swing and to provide a command line interface (as all good utilities should have). The command line was pretty simple, of course. Swing, however, presented some interesting changes.

The Swing API has not changed much in the last 9 years, which I think is a testament to its flexible/extensible design. I do hear people disparage Swing, especially with regard to its speed. I wonder how many of them have first-hand experience with the speed issue. The work I've done with Swing has always shown me that Swing is pretty darned fast, instantaneous for my applications at least.

One thing that has changed with Swing is the improved guidance for creating threads. Threading has always been a challenging topic for many engineers and threading in Swing is no exception. The Java designers have seen fit to provide a uniform method for starting the GUI aspects of a Swing application in their own thread. The code below shows how this is done:

SwingUtilities.invokeLater(new Runnable() {
   // run() executes on the event dispatching thread; build and show the GUI inside it.
   public void run() { }
});

Now many of the older Swing apps DO NOT do this, but that is acceptable. However, Swing developers are now encouraged to take this approach in order to ensure that their GUI is running in the event dispatching thread, rather than the main thread of the application.
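
As a fuller sketch of this idiom, the small program below hands its GUI start-up to the event dispatching thread and reports whether the code really ran there. The class name StarBridgeLauncher, the window contents and the helper methods createAndShowGui and startGui are hypothetical names made up for illustration; the only Swing calls it leans on are SwingUtilities.invokeLater and SwingUtilities.isEventDispatchThread.

```java
import java.awt.GraphicsEnvironment;
import java.util.concurrent.CountDownLatch;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

public class StarBridgeLauncher {

    // Hypothetical helper that builds and shows the main window.
    // It should only ever be called on the event dispatching thread.
    static void createAndShowGui() {
        JFrame frame = new JFrame("StarBridge");
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.add(new JLabel("Running on the event dispatching thread"));
        frame.pack();
        frame.setVisible(true);
    }

    // Queues the GUI start-up on the event dispatching thread, waits for it
    // to run, and returns true if it really ran on that thread.
    static boolean startGui() {
        final boolean[] ranOnEdt = new boolean[1];
        final CountDownLatch started = new CountDownLatch(1);
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                try {
                    ranOnEdt[0] = SwingUtilities.isEventDispatchThread();
                    // Skip the window on machines without a display (a build server, say).
                    if (!GraphicsEnvironment.isHeadless()) {
                        createAndShowGui();
                    }
                } finally {
                    started.countDown();
                }
            }
        });
        try {
            started.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return ranOnEdt[0];
    }

    public static void main(String[] args) {
        // main() runs on the main thread; the queued Runnable does not.
        System.out.println("main() on the EDT? " + SwingUtilities.isEventDispatchThread());
        System.out.println("GUI code on the EDT? " + startGui());
    }
}
```

The latch is only there so the sketch can report back to main(); a real application would normally just call invokeLater and return.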

How Good Tools Get Bad Names
(a.k.a. XML Beans is Slow)
Next, I had to store some configuration information for the StarBridge application. In the past I have used property files, but the information that I needed to store in this utility was more hierarchical in nature, so I opted to go with XML and XML Beans instead of the traditional properties file. I also used XML Beans to store the main star data (over 113,000 stars). Reading in and generating the 113K worth of star data goes fairly quickly (a few seconds). The trouble began when I then iterated through the data, instantiating a Java object based on the XML Bean class for each star. That took 68 minutes!

I am convinced that this is the standard scenario for giving tools and technologies the title of "slow". It is true that when I instantiated each XML Bean so that I could write that bean out into a different text format (the .DAT file format used by Celestia), the process took 68 minutes. The problem here is not that XML Beans is slow, though. The problem is that instantiating that many objects in a serial manner is slow. XML Beans just happened to be the technology I was using at the time.

The lesson here is that it is very tempting to decry a tool or technology as being slow, especially when we don't understand it. I can't tell you how many times I've talked with people who say XML is slow. For example, one company I spoke with was telling me that XML is slow because when they pass a loan document from server to server it takes a "long time". OK, I certainly believe them. We all know slow when we see it. So I asked a few questions to try to discover the reason for this slowness:

Q: "How large is the document?"
A: "About 150 megabytes."

Now that seemed a little large to me, even for a complex message.

Q: "What goes into this 150MB message?"
A: "Oh, lots of information, including PDF files and some JPEGs."

Q: "The transmission time alone on 150MB can add up. Do you need to pass around the images and the PDF files inside of the message? Couldn't they be stored on a server farm and referred to in the main message?"
A: "Well, that's the way we designed the system 10 years ago, and all of our existing systems rely on that architecture."

Q: "Ok, fair enough. The transmission times for 150 MB alone can't account for the slowness you've described. Anything else that you think contributes to the slowness?"
A: "Well, we have to examine the information within the document and route it appropriately several times. As soon as we added that routing logic, it really slowed down. The XPath and XQuery engines are real dogs!"

I think you can see where I am going with this. The customer wasn't just passing a large document around the network. Every time the customer performed any conditional routing on the message, the entire 150MB of message had to be realized into memory and examined. That does take time, especially when you have a lot of those messages whizzing around your network being realized into memory repeatedly. Putting the data that the routing logic depended on into SOAP headers would have addressed the second problem, and taking the time to store the static elements of the message outside of the message itself would have cut the transmission time costs tremendously.
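
To make the header idea concrete, here is a minimal sketch of routing on a SOAP header without realizing the body. It uses the standard StAX pull parser (javax.xml.stream), which is my choice for illustration and not something the customer's system is known to have used. The routeTo header element, the HeaderRouter class and the sample message are all hypothetical.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class HeaderRouter {

    // Returns the text of the hypothetical <routeTo> header element, or null
    // if the Body starts (or the document ends) before one is found.
    static String routingKey(Reader message) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(message);
            try {
                while (reader.hasNext()) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue;
                    }
                    String name = reader.getLocalName();
                    if ("routeTo".equals(name)) {
                        return reader.getElementText().trim();
                    }
                    if ("Body".equals(name)) {
                        return null; // stop here; the large payload is never walked
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read the message header", e);
        }
    }

    public static void main(String[] args) {
        String message =
                "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
              + "<soap:Header><routeTo>underwriting</routeTo></soap:Header>"
              + "<soap:Body><loan>the large payload would sit here</loan></soap:Body>"
              + "</soap:Envelope>";
        System.out.println("Route to: " + routingKey(new StringReader(message)));
    }
}
```

Because a pull parser only reads as far as it is asked to, the routing decision costs roughly the size of the header rather than the size of the whole message (the parser may still buffer a little past the point where it stops).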

The problem that the customer had was not that XML was slow, but that their architecture was slow. Whether those messages were passed around in XML, or Java RMI or some binary format would have little effect on the performance. The bottleneck was their architecture. But no one wants to blame their architecture.

It's much easier to blame a technology you don't really understand. We are all guilty of this. The first time I used XML Beans for that star information and it took 68 minutes to complete, I thought to myself, "uh oh, did I choose a bad technology for this?"

There is no substitute for understanding and real expertise. It is not a question of intelligence. Being smart does not keep us from making these snap judgments. Hopefully, our experience will help us to resist the temptation to blame the tools, especially new tools that we are still learning to use efficiently. A car with square wheels won't go very fast, no matter what size engine you install.

Oh, and that 68-minute problem I had? The solution was pretty simple. I just used an XmlInputStream and parsed the input stream instead of instantiating the whole object. That dropped the time back down to several seconds for a 35MB document.
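
The same stream-instead-of-instantiate approach can be sketched with the standard StAX API. To be clear, this is a stand-in and not the XML Beans call described above, and the star element with its id, ra and dec attributes is a hypothetical layout, as is the one-line-per-star output (it is not the real Celestia .DAT format).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StarStreamConverter {

    // Streams through the XML and writes one text line per <star> element.
    // No per-star object is built, so memory use stays flat however many
    // stars the input holds. Returns the number of stars written.
    static int convert(Reader xml, Writer out) {
        int count = 0;
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(xml);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "star".equals(reader.getLocalName())) {
                    // Hypothetical attributes: id, right ascension, declination.
                    out.write(reader.getAttributeValue(null, "id") + " "
                            + reader.getAttributeValue(null, "ra") + " "
                            + reader.getAttributeValue(null, "dec") + "\n");
                    count++;
                }
            }
            reader.close();
        } catch (XMLStreamException | IOException e) {
            throw new IllegalStateException("Could not convert the star data", e);
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<stars>"
                + "<star id=\"1\" ra=\"10.5\" dec=\"-3.2\"/>"
                + "<star id=\"2\" ra=\"20.0\" dec=\"45.1\"/>"
                + "</stars>";
        StringWriter out = new StringWriter();
        int written = convert(new StringReader(xml), out);
        System.out.print(out);
        System.out.println(written + " stars written");
    }
}
```

For a real file you would pass a buffered FileReader and FileWriter in place of the in-memory strings.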

Open Source Publishing?
The other interesting thing I've discovered is that publishing is changing radically. It used to be that you would create a User's Guide for your open source project and include that guide in your ZIP or JAR file. Not any longer. While I have known of and used Wikipedia for years now, I was completely unaware of Wikibooks! People are now writing their books using Wikibooks. It's simple and free. People who like to write are often not developers. As a result, they are often not familiar with the standard tools we developers know and love (like SourceForge and Subversion). Wikibooks allows people interested in writing to contribute to your documentation without having to be developer-savvy. Besides, what is the benefit of including documentation in your project if you can just refer to it instead? Just like the story I told earlier about passing around the 150MB document, you are much better off storing the relatively static information elsewhere (Wikibooks) and simply referring to it from within your project.

The Bookshelf
And speaking of books, I thought I'd mention a couple of books that I have recently read. I have a ton of books on my bookshelf and I thought it might be a good idea to share some of my favorites with you.

Swing: A Beginner's Guide - Herbert Schildt (McGraw-Hill)
Even though I have done a fair bit of work with Swing in the past, it had been so long that I felt a refresher course would be a good idea. In this book Herbert Schildt does a good job of walking the reader through the major elements of Swing that every Swing developer ought to know. The book is well written and easy to read. I found a fair bit of value in it just because it covered the basics so well.

Java Concurrency in Practice - Brian Goetz, et al. (Addison-Wesley)
A good guide to threading in Java. Talks about threading in fairly simple terms and is up-to-date for Java 1.6, which was important to me. Definitely a good book to have handy if you are writing (or better yet, about to write) a multi-threaded application. Even a simple tool like StarBridge benefited from understanding the Java threading model better.

Ask the Experts
John Graves provided the following tip for scripting Oracle Service Bus. This script will turn on monitoring for all proxy services in an environment. Not only is the script itself handy, it's also a great example of using the WebLogic Scripting Tool (WLST) to solve real-world problems.
Thanks John!
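from com.bea.wli.sb.management.configuration import SessionManagementMBean
from com.bea.wli.sb.management.configuration import ALSBConfigurationMBean
from com.bea.wli.sb.management.configuration import ProxyServiceConfigurationMBean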
from com.bea.wli.config import Ref


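# Utility function to look up the session MBean and start a new session.
# Assumes WLST is already connected and domainRuntime() has been called.
def getSessionMBean(sessionName):
   SessionMBean = findService(SessionManagementMBean.NAME,SessionManagementMBean.TYPE)
   SessionMBean.createSession(sessionName)
   return SessionMBean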


sessionName = String("SessionScript"+Long(System.currentTimeMillis()).toString())
smb = getSessionMBean(sessionName)
alsbSession = findService(String(ALSBConfigurationMBean.NAME + ".").concat(sessionName), ALSBConfigurationMBean.TYPE)
psConfig = findService(String(ProxyServiceConfigurationMBean.NAME + ".").concat(sessionName),ProxyServiceConfigurationMBean.TYPE)

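# List every resource in the domain and enable monitoring on each proxy service.
allRefs = alsbSession.getRefs(Ref.DOMAIN)
for ref in allRefs:
   typeId = ref.getTypeId()
   if typeId == "ProxyService":
      print "Enabling Monitoring for: " + ref.getFullName()
      psConfig.enableMonitoring(ref)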

smb.activateSession(sessionName, "Turn on proxy monitoring")


That's all for now. Keep your code clean and your hands dirty!

- Jeff


Hi Jeff,

Do you know if there's any way to set the proxy service monitoring level? Service vs. Pipeline vs. Action..


Posted by dreeb on July 27, 2012 at 02:12 PM PDT

Hi Dreeb,
I have some links here that should get you pointed in the right direction. I also cover reporting and monitoring in my book. But assuming you don't have that handy, I think these links will serve you well.

- Jeff

Posted by Jeff on July 27, 2012 at 02:35 PM PDT
