Friday Jan 08, 2010

Publisher Analytics - the feature that almost made it

The Sun Software Library team always subscribed to Agile development practices.  We've adhered to the "Deliver working software frequently" principle by pushing new software to production every two weeks.  As soon as we finish developing a feature, we make it available to our users, so we can get feedback, and make the feature even better (Tim Bray says it, and quite a bit more, quite eloquently.)

Some stuff happened that forced change.

There is one feature in particular that almost made it.  We called it Publisher Analytics - the ability for each publisher to see how many page views, visitors, and click throughs they got through the Sun Software Library for their particular entries.  The usual approaches (e.g. Google Analytics) don't work for Rich Internet Applications, so we had to devise our own approach.  To implement this feature, we first built a set of web services for accessing the data (these are available in production), then used EJS Chart's excellent JavaScript based charting package to implement the front end.

This image shows the feature as developed.  This feature is currently only available to the super user.  In the screen snapshot below, we are actually showing production data.  The actual name of the publisher data rendered is blurred out.  The snapshot was taken in early December 2009, hence the lack of December data.

Sun Software Library Publisher Analytics

Ray Maslinski, Sree Vidya Allada, and Jonathan Leone did a great job on this implementation.

Monday Nov 30, 2009

Continuous Integration with embedded API and browser testing

A "good" software release is one that has more features and fewer issues than the previous one.  Yet, Agile software practices stress the importance of frequent and iterative software releases.  Do these two statements contradict each other?  For example, doesn't it make sense that if you want a quality release, you should spend more time testing software, so you can discover more issues, and hence have more time to fix them?  This seems to imply that waterfall based software development practices make more sense than Agile based ones.  What's going on here?

Turns out, for a variety of reasons, that the above statement is actually not true.  This blog entry dives into one specific aspect of how Agile helps produce better software.  There are others that I'll explore in other blog entries.

From personal experience, as well as from papers that I've read, it's best from a quality and time to market perspective to know about an issue as soon as possible after it's been introduced into the software.  There are many reasons for this, but the most intuitive one is this: the developer still has the context for the buggy code fresh in their mind, so they can find the issue and fix it faster.  If months or years elapse, the developer will need more time to regain the context (assuming the same developer is even still involved with the software), and will have far less confidence that the fix actually improves the software without introducing even more issues.  Comically, the same thing happens with computers: if a memory page is swapped out of main memory, it takes longer to retrieve it.  Perhaps the human mind works the same way, where short term memory is somehow "faster to retrieve"?

Anyway, detecting issues as quickly as possible after they have been introduced into the code base seems like a good, logical idea.  But how do you actually implement this?  Agile gives us several different tools to help us here: this blog entry describes the tools that we, the Sun Software Library engineering team, have embedded into our continuous integration cycles, and some of our thoughts around them.

Since videos are worth a thousand words, we first share our Continuous Integration Process in a screen cast.  For those of you looking for slick screencasts, please look elsewhere - we're engineers, not marketing people.  It's the content that matters to us, not the polish :-)

We use Hudson for managing our Continuous Integration process.  Hudson polls our repository periodically, and kicks off a build if anything new has been checked in.  We also do a build overnight at a fixed time each night.  This ensures that we get at least one build a day.

We have a custom Ant based top level build script that builds the dozen or so NetBeans projects that together form the Sun Software Library.  The screencast above runs this top level build script manually for demonstration purposes.  We developers also run this script before we commit any code, to make sure that our build works.  The top level build script follows these steps:

  • Build all the projects.  This is a basic check - does the code even compile?
  • Some of the NetBeans projects have been modified to include embedded testing and/or automation tools.
    • Our Web UI project build script has the following tools integrated into NetBeans' build.xml script:
    • Our CQS (CatalogQuerServer) Project, which is the component that implements our web services interface, has the following:
      • A set of jUnit based unit tests that exercise our code directly
      • Modified build instructions, so the code can be instrumented to support Cobertura.  Our code coverage measurement for our backend includes all of our tests, including jUnit, API Functional, and Selenium based Web UI tests.
  • Reloads the database with a known set of reference data (this data is needed and used by the various tests).
  • If deploying to GlassFish, configure the necessary GlassFish resources.
  • Deploys all the projects to an environment dedicated for continuous integration.  This environment has both a Tomcat instance as well as a GlassFish instance in it.  Even though we only use GlassFish for our production environment, we still validate our builds against both Tomcat and GlassFish - this doesn't cost us much, and has helped us resolve some issues.
  • Runs a set of API functional tests, written using the Apache Commons libraries
  • Runs a set of Selenium based tests, which drive the browser to exercise our JavaScript based Rich Internet Application (RIA) UI.
  • Runs schemaspy if the database schema has been changed
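
To make the ordering above concrete, here is a minimal sketch of the steps as a fail-fast sequence.  It is plain JavaScript rather than our actual Ant script, and every step name, the `ctx` fields, and the `runPipeline` function are made up for illustration.

```javascript
// Illustrative only: step names are hypothetical, not targets from the real Ant script.
const steps = [
  { name: "build-all-projects" },
  { name: "reload-reference-data" },
  { name: "configure-glassfish-resources", when: (ctx) => ctx.target === "glassfish" },
  { name: "deploy-to-ci-environment" },
  { name: "run-api-functional-tests" },
  { name: "run-selenium-tests" },
  { name: "run-schemaspy", when: (ctx) => ctx.schemaChanged },
];

// Runs each applicable step in order and stops at the first failure,
// so later (slower) stages never run against a broken build.
// `execute` is a caller-supplied function that returns true on success.
function runPipeline(ctx, execute) {
  const ran = [];
  for (const step of steps) {
    if (step.when && !step.when(ctx)) continue; // conditional step does not apply
    ran.push(step.name); // recorded even if the step then fails
    if (!execute(step.name)) {
      return { ok: false, failedAt: step.name, ran };
    }
  }
  return { ok: true, failedAt: null, ran };
}
```

The point of the ordering is that the cheap checks (does it compile?) run first, and the expensive browser driven tests only run once everything before them has passed.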

Is this all perfect?  Of course not, nothing ever is.  The following issues still exist:

  • The automated testing that is integrated into our continuous integration environment only uses Firefox for testing.  We have deployed several Windows based environments into our lab using virtualization technologies, and intend to integrate IE based testing in an automated fashion, but we're not there yet.
  • We still do some manual Q/A testing before each release, since there are coverage gaps in our automated tests (for example, our automated tests only cover our primary applications, not our customer help center).
  • We still haven't fully analyzed the cost/benefit aspect of this.  There is definitely a cost of implementation (though it's not that much, especially since you can replicate what we've implemented).  We intuitively feel that the implementation has saved us more than it has cost us, but a more formal study may be worthwhile.
  • There may also be alternative tools that are better or easier to use; we haven't really researched this.

PS While writing this blog, I ran across these blog entries, written by John Morrison, one of my colleagues at Sun, which describe testing in an Agile world.  They are interesting reads, and restate much more elegantly and precisely my own personal thoughts on this matter.

Monday Aug 10, 2009

Most viewed date range and support for new browsers - build #79 deployed

Earlier today, we deployed build #79 of the Sun Software Library into production, enhancing the library functionality in the following areas:

  • Most viewed date range: You can now select which time period should be displayed in the "most viewed" area of our home page, so you can see what's popular over different date ranges. 

    library most viewed date selector

    The default setting is "all time", so you'll see what's popular since the beginning of the Sun Software Library, since this is the view that was presented all along.  If you want to see a different date range, simply change the setting.

    There are two aspects of this date range that are worth noting:
    • The data for this is updated once a day.  Our "most viewed" data does not change all that often, and we didn't think it warranted the extra processing required to make this more real time.  If you disagree, tell us.
    • We keep the data for month boundaries only, so when you select "Current Month", you'll be seeing the "most viewed" entries for the current calendar month, not the past 30 days.  For example, if you select "Current Month" on August 10th, you'll see the most popular entries for the month of August (i.e. the last 10 days only).
  • Upgraded our Web UI to ExtJS 3.0: As mentioned elsewhere in this blog, we use the Ext JS Cross-Browser Rich Internet Application Framework widget set for developing our Web UI.  Ext JS recently released version 3.0 of their library.  We upgraded our Web UI to this new version, and found some bugs along the way.  The new library enabled us to support more browser types.
  • Support for IE8 & Chrome: We now officially test and support Internet Explorer 8 and Google Chrome browsers for our web site.  IE8 has some significant improvements for JavaScript developers.  To quote Joe Hewitt (creator of Firebug): "I couldn't be happier that Microsoft completely copied Firebug for IE8."  We agree, debugging JavaScript on IE8 is finally reasonable.
  • JSLint: We have incorporated JSLint, the JavaScript code quality tool written by Douglas Crockford, into our NetBeans and Hudson based development process, as described in Ari Shamash's blog.  I typically agree with Douglas Crockford, but JSLint did not hurt our feelings.  We are human after all, and we welcome tools that make our lives better and our development more efficient, as I've mentioned several times in this blog.
  • Lots of bug fixes, as always.
There are more improvements in store, keep the feedback coming!
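
To illustrate the difference between "Current Month" and a rolling 30 day window described above, here is a small sketch.  The function names are invented for this example and dates are handled in UTC for simplicity; it is not the code that runs in the library.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// "Current Month": from the first day of the calendar month up to today.
function currentMonthRange(today) {
  const start = new Date(Date.UTC(today.getUTCFullYear(), today.getUTCMonth(), 1));
  return { start, end: today };
}

// Rolling window: the previous `days` days, regardless of month boundaries.
function rollingRange(today, days) {
  return { start: new Date(today.getTime() - days * MS_PER_DAY), end: today };
}

const today = new Date(Date.UTC(2009, 7, 10)); // August 10, 2009 (months are 0-based)
const monthStart = currentMonthRange(today).start.toISOString().slice(0, 10);
const rollingStart = rollingRange(today, 30).start.toISOString().slice(0, 10);
console.log(monthStart);   // 2009-08-01
console.log(rollingStart); // 2009-07-11
```

On August 10th the two windows differ by about three weeks of data, which is why the "Current Month" list can look sparse early in a month.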

Monday Jun 29, 2009

Build #76 deployed - numerous UI improvements

Earlier today, we deployed build #76 to the Sun Software Library.  For those of you who keep up with our numbering scheme, you'll notice a jump in our build numbers.  We were occupied with a side project, one that hopefully we'll be able to show at some point, but since JavaOne, we've been back working on improving the Sun Software Library full time.

Here is what we added:

  • Browser Based WYSIWYG editor for descriptions & reviews: we've finally joined the 21st century in implementing this functionality.  To prevent various cross-site scripting security issues, we only support a subset of HTML for the description field:

    Browser based WYSIWYG Editor

    Formatted reviews can easily be created:

    Browser based WYSIWYG Editor

    For those of you who still want to edit the HTML, you can easily switch back to that mode using the rightmost button in the editor toolbar.  As an aside, we also fixed a bug in the logo preview section of the "Basic Information" tab.  Previously, under certain circumstances, the logo was being distorted.

  • Tag Management improvements:  Tagging entries with the right tags in the Sun Software Library will make the entries easier to find and use.  Based on user feedback, we've revisited the "Add Tags" section of our Web UI and improved it as follows:
    • You can now add a "most used" tag directly to your page:

    • The "other tag" field is now easier to navigate, as the search results all show up in one list:

  • Most Recently Updated section of the home page is now more accurate.  Previously, an entry was considered "most recently updated" only if the entry itself was updated (e.g. the description field was updated, or the name was updated, etc.)  Now, an entry is considered "updated" if it or any of its associated data is updated.  For example, if somebody adds a software version to a software record, or if somebody writes a review, that entry will be considered "updated".  Kudos to the "Tech Tracker" team for making this request.

  • Help Center UI improvements: Our help center finally got a much needed face lift, including styling to make it look like our primary site, as well as an embedded screencast orienting new users to our app.
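
The new "Most Recently Updated" rule above can be sketched as follows.  The record shape (`updatedAt`, `versions`, `reviews`) and the function names are assumptions made for this example, not our actual schema or code, and timestamps are plain numbers to keep it short.

```javascript
// Hypothetical record shape: field names are assumptions, not the library's schema.
// An entry counts as updated when it, or any of its associated data, was updated.
function effectiveUpdatedAt(entry) {
  const related = [...(entry.versions || []), ...(entry.reviews || [])];
  return related.reduce(
    (latest, item) => Math.max(latest, item.updatedAt),
    entry.updatedAt
  );
}

// Returns a new array sorted newest first by the effective timestamp.
function mostRecentlyUpdated(entries) {
  return [...entries].sort((a, b) => effectiveUpdatedAt(b) - effectiveUpdatedAt(a));
}

const entries = [
  { name: "A", updatedAt: 100, reviews: [{ updatedAt: 300 }] },
  { name: "B", updatedAt: 200, versions: [] },
];
const order = mostRecentlyUpdated(entries).map((e) => e.name);
console.log(order); // [ 'A', 'B' ]
```

Under the old rule, entry B would have been listed first, because only the entry's own timestamp was considered.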

We've also fixed numerous bugs, etc.  There are more improvements in store, keep the feedback coming!

Monday Jun 01, 2009

Get a free T-shirt by writing a review!

Write reviews for your favorite applications in the Sun Software Library and get a free t-shirt*.

For details on this promotion see:

 *** Please note that writing a comment to this blog entry does NOT qualify you for a free t-shirt. Please read the instructions provided in the link above carefully.

Wednesday Apr 01, 2009

Building a JavaScript UI that interacts with both SSL and non-SSL based web services

As you know from reading earlier posts on this blog, we have been working on improving the performance of our Web UI. The authentication user interaction is tightly integrated with the rest of the Web UI to provide a seamless experience in the rich Web UI. Combining this with the requirement to encrypt user names and passwords, we initially ran the entire application under SSL. Ideally, we'd figure out a way to use SSL only for the parts of the communication that require encryption, and regular HTTP traffic for all other aspects of our application (e.g. search results, images, etc.).

We are experimenting with a new feature that enables the web UI to interact with both SSL based web services and non-SSL based web services within the same browser page. This is one optimization that can help the web site to better scale with increasing traffic.

One technique is the use of iframes with fragment identifiers (the text after the hash character in a URL). This combination enables the web UI to establish a communication channel for passing the small amount of information needed between iframes to interact with both SSL and non-SSL service points. For further explanation and a working example, please see this blog post.
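
As a rough sketch of the idea (not our production code), a small message can be packed into a frame URL's fragment and read back on the other side. The `msg=` prefix, the function names, and the example.com URL below are all assumptions made for illustration.

```javascript
// Hypothetical helpers; the "msg=" prefix is an assumption for this sketch.

// Returns the frame URL with the message encoded into its fragment identifier.
function withMessage(frameUrl, message) {
  const url = new URL(frameUrl);
  url.hash = "msg=" + encodeURIComponent(JSON.stringify(message));
  return url.toString();
}

// Extracts the message from a frame URL, or returns null if there is none.
function readMessage(frameUrl) {
  const hash = new URL(frameUrl).hash; // includes the leading "#"
  if (!hash.startsWith("#msg=")) return null;
  return JSON.parse(decodeURIComponent(hash.slice("#msg=".length)));
}

const sent = withMessage("https://secure.example.com/login-frame.html", { status: "ok" });
const received = readMessage(sent);
console.log(received.status); // ok
```

In a browser, the sending side would assign the resulting URL to the other frame's location, and the receiving frame would watch its own `location.hash` for changes. Changing only the fragment does not reload the frame, which is what makes this usable as a message channel.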

This use of fragment identifiers as a cross-domain communication channel is in direct overlap with popular practices for history management, which also use fragment identifiers to store page rendering state. Here is a sample page that shows how this feature works.

This leaves us with a potential situation where a page's rendering state can be wiped out whenever iframes use the same fragment identifiers to pass messages among them. In our particular use case, this is not a problem when the message actually leads to a new page state. If this is not the case, then it may lead to an inconsistent state between the representation in the URL and what's displayed on the page. One workaround is to reset or synchronize the page state for such scenarios, which is a minor de-optimization that can generate repeat AJAX calls. It will be an interesting challenge to integrate these conflicting features. Look for this performance improvement in an upcoming release.
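
One way to read the workaround above, sketched under assumptions: treat fragments with a reserved prefix as messages, and put the page state fragment back once the message has been handled. The `msg=` prefix, the state strings, and the `handleFragment` function are hypothetical and only illustrate the idea.

```javascript
// Hypothetical sketch: "msg=" marks cross-frame messages; anything else is page state.
function handleFragment(fragment, currentState) {
  if (fragment.startsWith("msg=")) {
    // A message overwrote the state fragment: deliver it, keep the current
    // state, and ask the caller to restore the fragment so the URL matches
    // what is displayed on the page.
    return {
      message: decodeURIComponent(fragment.slice("msg=".length)),
      state: currentState,
      restoreFragment: currentState,
    };
  }
  // Ordinary history navigation: the fragment is the new page state.
  return { message: null, state: fragment, restoreFragment: null };
}

const afterMessage = handleFragment("msg=login%20ok", "search/sun");
const afterNavigation = handleFragment("detail/42", "search/sun");
console.log(afterMessage.message, afterMessage.restoreFragment); // login ok search/sun
console.log(afterNavigation.state); // detail/42
```

Restoring the fragment is where the repeat AJAX calls mentioned above can come from, since a history manager may treat the restored fragment as a fresh navigation.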

Wednesday Feb 11, 2009

API Changes for Authentication and Batch Tagging

Sun Software API changes for authentication and batch tagging.

Monday Jan 05, 2009

Automated Information Extraction(IE) Experiments

There is a tremendous amount of data on the web about software. This data is often not structured and/or categorized; that is, there is no structured interface to it. The library provides such an interface and relies primarily on manual data entry, which often yields the highest quality data when compared to other approaches. We are running some experiments to measure what type of information can be extracted from various other sources using automation. This approach is described in my blog.

Saturday Dec 20, 2008

Sun Software Library Quality Process

I would like to share with you a couple of diagrams that describe the activity flow of the development iteration, the development build infrastructure, and the QA testing model.  As I always say, a picture is worth a thousand words.


Tuesday Dec 09, 2008

Zembly Widgets and Sun Software Library Pages, O My!

As a followup to all the work that Rinaldo Di Giorgio did (see this blog entry), I was able to trivially embed the Zembly SearchSoftware widget into this blog. 

Try typing "sun" into the widget below.  That will search for all entries in the Sun Software Library that start with "sun", and present them in a summary list below the widget.  If you select one of the items in the summary list, your browser will open up a new tab with the details for that particular entry.

The implementation for this is still rough around the edges - the search summary list could be made more intuitive - but the idea is very powerful. We can build powerful widgets using Zembly, and you can embed them into your web pages (blogs, web sites, etc.)

The HTML code for embedding this widget into this blog entry is trivial.  This HTML snippet can be embedded into any web site, or you can navigate to the URL directly in your web browser:

<iframe src=";iframe">
  Text to display if browser does not support iframes
</iframe>

Are there specific widgets you want to see? Tell us.

Monday Dec 08, 2008 & (Intro)

zembly provides an environment for development and deployment of widgets, services and apis in your browser. I created a simple interface to  We will be extending the current interface, we have defined two things for you to get started.

  • A service called SearchSoftware that provides a simple searchall api to start with
  • An example widget built with extjs, you can try the widget out here.

zembly allows you to use this code easily with other applications.  In fact, you could use this widget and service as a starting point to put library services into other sites supported by zembly like Facebook and Myspace, or develop an app for your iPhone.

Some notes on the development. I have been using OO languages for close to 20 years, starting with Ada, then C++ and then Java. I had been avoiding JavaScript because of its horrible syntax and its poorly thought out approach to classes and scoping.  I have been using JavaScript like C when it came to classes and extensions. In order to interface to extjs correctly, I have to give JavaScript more respect. The correct way to interface to extjs is via a ZemblyProxy class. I have started work on that class and will attempt to provide an implementation in the future, or talk someone else into doing it.

The next example at will provide an example using a linked data view of by Henry Story


Welcome to the Sun Software Library blog, where you will find interesting updates and tidbits about using the Sun Software Library.

