Wednesday May 07, 2008

Sun Studio Compilers on LINUX?

With the magic of Mercurial, you can see changesets, like this one, which Serguei Spitsyn integrated recently.

But wait, what does this changeset actually mean? Sun Studio on Linux? Does that make sense? YES! It does and it's true. Mind you, it's just Hotspot that can be built with the Sun Studio compilers on Linux right now, but it's an important piece: the Hotspot C++ code is not a trivial pile of code to compile and optimize correctly. My hat is off to the Sun Studio team for making this all happen. What should be interesting now is how well the rest of the tools, like dbx and the analyzer/collector, work.

More can be read about the Sun Studio Linux Compilers at the Sun Studio Site.

Hmm, I guess I'm now on the hook to see if the rest of the OpenJDK can be set up to build with Sun Studio on Linux... Back to work...


P.S. No, we are not abandoning gcc/g++, just providing choices on how the OpenJDK can be built.

InfoQ Article on Git, Mercurial, and Bzr

A while back Sébastien Auvray asked me some questions about the OpenJDK Mercurial conversion. His article was recently published at InfoQ. It has some interesting information and stats about Git, Mercurial, and Bzr.


Friday Apr 18, 2008

OpenJDK: Dude, Where's My Changeset?

Developers are asking where the changesets are, which reminds me of that movie Dude, Where's My Car?, so: "Dude, Where's My Changeset?"

So it seemed appropriate to bring up the "OpenJDK Integration Wheel" again and talk about how a changeset moves around:

Let's look at the changesets (fixes) pushed to the Swing Team Area. Let's pretend a changeset was pushed to the Swing area and is now publicly available. Not picking on the Swing team, just didn't want to change the picture :^) Feel free to change the example from "Swing" to "Build" and the Build Team Area.

And Then? Immediately, anyone in the Swing team can pull these changesets down and have access to them. Other developers from other teams could also pull from this Swing area to get the change, but that's a bit abnormal, certainly possible, just not normal.

And Then? Depending on the Integration Schedule these changesets will likely remain in the Swing Team Area until the official Swing integrator pushes those changesets into the Master area (the Integration Process).

And Then? Once integrated into the master area, the other teams still may not see these changesets until their own integrators choose to "sync up" or send these new changesets in the Master area into their own team areas. This "sync up" usually happens as a team integrator prepares for the integration, so based on the order in the Integration Schedule, it could be a day or many days before a team will see changes that have been pushed into the master area.

And Then? Even after the team area has the changesets, a developer won't see the changesets until he pulls the changesets into his own area.
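The "And Then?" steps above can be sketched as a toy model. This is only an illustration, not how Mercurial or the OpenJDK servers are implemented: each area is treated as a plain set of changeset ids, and the area names, the `where_is` and `propagate` helpers, and the changeset id are all made up for the example.

```python
# Hypothetical model: every area is just a set of changeset ids.
areas = {"swing": set(), "master": set(), "build": set(), "build-developer": set()}

def where_is(changeset):
    """Return the names of the areas that currently contain the changeset."""
    return [name for name, contents in areas.items() if changeset in contents]

def propagate(source, dest):
    """Copy every changeset from one area into another (a push or a pull)."""
    areas[dest].update(areas[source])

fix = "swing-fix"  # made-up changeset id

areas["swing"].add(fix)                  # developer pushes to the Swing team area
after_push = where_is(fix)

propagate("swing", "master")             # Swing integrator integrates into Master
after_integration = where_is(fix)

propagate("master", "build")             # Build integrator syncs up from Master
after_sync = where_is(fix)

propagate("build", "build-developer")    # a Build developer pulls from the team area
after_pull = where_is(fix)

for label, seen in [("push", after_push), ("integration", after_integration),
                    ("sync up", after_sync), ("developer pull", after_pull)]:
    print(f"after {label}: {seen}")
```

Each step is a separate, manually triggered action, which is why the changeset can sit in one area for days before it shows up in the next.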

So you can see how sometimes it's hard to find a changeset. Somewhat independent of the changesets flowing into the various team areas, the Release Engineering Team will use the Master area and attempt to create a promoted build, and if successful will create tags in the Master repositories to record what changesets were included in a promotion.

Some people will find this whole process frustrating, but there are some big advantages. The delay of the changesets getting into the Master area allows for the team most closely associated with the change to test and verify, from that team's perspective. Odds are that bad changesets will get caught by the team, and this protects the other teams. The process the team integrator goes through includes additional testing to verify the changesets being integrated are solid for everyone. It's not unusual for an integrator to run into a problem with his team's changesets, which again protects all the other teams from potential disasters. Granted, regressions will still happen, but really nasty regressions are usually caught early.

Hope this helps explain things.


Thursday Mar 27, 2008

OpenJDK, Mercurial, and The Changeset View

Why do I have to create a "Merge" changeset when there was nothing to merge?

For most of us old TeamWare users, and maybe other SCM users, the need for all the Mercurial "Merge" changesets (or, as some people politely refer to them, 'merge turds') seems confusing and messy. If the changes don't involve the same file, it can be hard to understand why you need a Merge changeset.

What did TeamWare look like?

In TeamWare a 'resolve' was necessary only when there was a conflict, meaning that two people changed the same file. The tool 'filemerge' provided a way to easily deal with each file conflict, but merging changes is and will always be a risky business. Everyone has had an experience with a 'bad merge', and they are nasty problems. No Source Code Management (SCM) tool completely removes the need for merging, and our only hope is for the merging tools to help us out here. It is probably true that a Distributed SCM like TeamWare, Mercurial, or git may create the need for more frequent merging, but the end result is often the same as with a non-Distributed SCM, so maybe with a DSCM the merge work is also distributed? Anyway, I digress.

With TeamWare, the 'resolve' action resulted in multiple revisions in each SCCS file that had a conflict. The TeamWare tool 'vertool' provided a way to pick an SCCS file and view its revision history. Again, this was on a per-file basis, and although that created some benefits for developers, like being able to 'putback' just one file change, it also made it a little difficult to record the true state of the entire workspace. Here is a snapshot of vertool in action for anyone that hasn't seen it:

Notice the SCCS revision graph. When conflicts happen, the graph gets a little more complicated, but unless the changes are abandoned, it always connects back up to the main trunk of the graph. With TeamWare, every file was controlled with SCCS, and every file had a graph. The connections between files were never formally managed by TeamWare, but TeamWare provided some tools like 'freezept' to allow you to try and manage them.

And with Mercurial ...

The changes come in changesets, or grouped changes to files, which are treated and tracked as changes to the repository. Yes, the changes are made to specific files, but the revision tracking is done for the entire repository. When a merge situation in Mercurial happens, and they will be frequent, a new changeset has to be created to potentially carry any file merge changes, but most importantly to identify the merged or joined results of two changesets. Every changeset other than the very first one has at least one parent changeset, and a Merge changeset has two parent changesets.

Every time you do an 'hg pull' that adds new changesets to your repository, and your repository has changesets that have not been pushed yet, you have created what is called a 'multiple head' situation and you will need a Merge changeset. A 'head' is a changeset with no descendants. The tip changeset is a head, and it must be the only head if you want to push your changesets to the OpenJDK repositories (we do not allow any multiple head pushes with the OpenJDK repositories). This unfortunately means that people who do frequent "syncs" with their parent repository may be creating many Merge changesets. That's just the way it is; like taxes, we will need to learn to live with it.
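To make the 'multiple head' rule concrete, here is a minimal sketch that models a repository as a mapping from each changeset to its parents. It is not Mercurial's own data structure; the `heads` helper and the changeset names are invented for the illustration.

```python
def heads(parents):
    """Return the changesets that are not a parent of any other changeset."""
    all_parents = {p for ps in parents.values() for p in ps}
    return sorted(c for c in parents if c not in all_parents)

# Each changeset maps to a tuple of its parents (names are made up).
repo = {"A": (), "B": ("A",)}     # history shared with the parent repository

repo["L1"] = ("B",)               # a local commit that has not been pushed yet
heads_before_pull = heads(repo)   # one head, so a push would be allowed

repo["R1"] = ("B",)               # someone else's changeset arrives via 'hg pull'
heads_after_pull = heads(repo)    # two heads, even if different files were touched

repo["M"] = ("L1", "R1")          # the Merge changeset has two parents
heads_after_merge = heads(repo)   # back to a single head

print(heads_before_pull, heads_after_pull, heads_after_merge)
```

Note that nothing in the model looks at which files changed: the second head appears purely because two changesets share the parent `B`, which is why a Merge changeset is needed even when there is no file conflict.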

The 'hg view' command of Mercurial can provide some insight into this Merge business. To use 'hg view' you need to:

  1. Enable the hgk extension in your ~/.hgrc file.
  2. Make sure that the hgk tool is available from your PATH environment variable setting. You may need to download the Mercurial source bundle that matches the version of Mercurial you are using and get the hgk file from the contrib directory.
  3. Make sure the wish tool is available from your PATH environment variable setting. Note that Solaris Express has a /usr/bin/wish that works, and Mac OS X 10.5 has a /usr/bin/wish that works, but you may need to do a little searching to find a wish that is acceptable to hgk. Solaris 10 and older machines may have one at /opt/sfw/bin/wishx or /usr/sfw/bin/wish8.[34]

For example, to see what the most recent changesets pushed to the OpenJDK jdk7/jdk repository look like:

   hg clone yourjdk
   cd yourjdk
   hg view

You should then see something like this:

Looks a little like a Public Transportation System. Notice the groups of changesets created by developers, usually generated one right after the other. If two developers manage to line up (luck), the sequence is simple, but the second one to do a push had to do a pull and create a merge changeset. Layered on top of that are the integrations of the various teams into the master repository, which should appear as major additions to the graph.

Since a changeset is a repository revision this has tremendous benefits. For example, anyone can re-create the state of a repository (all the files) as of any changeset by simply doing:

   hg clone -r 82c85cfd8402 yourjdk trimmedjdk

This creates a separate repository that represents the state of everything as of that specific changeset id (which happens to be a changeset I created).
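What ends up in such a trimmed clone is the requested changeset plus all of its ancestors. A rough sketch of that idea, using a made-up parent map rather than anything from Mercurial's internals (the `ancestors_including` helper and the changeset names are hypothetical):

```python
def ancestors_including(parents, rev):
    """Return rev together with every changeset reachable through parent links."""
    seen = set()
    stack = [rev]
    while stack:
        changeset = stack.pop()
        if changeset not in seen:
            seen.add(changeset)
            stack.extend(parents[changeset])
    return seen

# Made-up history: C and D both descend from B, M merges them, E follows M.
repo = {
    "A": (),
    "B": ("A",),
    "C": ("B",),
    "D": ("B",),
    "M": ("C", "D"),
    "E": ("M",),
}

trimmed_at_c = ancestors_including(repo, "C")   # D, M and E are left out
trimmed_at_m = ancestors_including(repo, "M")   # both sides of the merge come along

print(sorted(trimmed_at_c))
print(sorted(trimmed_at_m))
```

Because the changeset id identifies a whole-repository revision, this one id is enough to pin down every file's contents at that point.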

I hope this has been helpful to at least a few people. Send me comments if I can clarify this more for people.

Thanks to Chris Mason for creating the great hgk extension.


Tuesday Mar 04, 2008

My First OpenJDK7 Mercurial Push

I was finally able to push my first OpenJDK7 changes into the Mercurial jdk7/build area forest, specifically to the three repositories: jdk7/build/jdk jdk7/build/jdk, and jdk7/build/jdk.

You can actually browse the changesets by clicking on the changeset links. If you haven't browsed around a Mercurial repository, it's pretty cool.

So What Happens Now?

This is just one of the many team areas for the JDK7 project. As the build team changes accumulate, at some point we will decide (as a team) to do some more detailed build&test runs, and when we are satisfied all is well, we will reserve a time to integrate all these build changes into the master area Mercurial forest. (The various teams have to take turns integrating to the master area.)

Finally, at some point the Release Engineering team will do their thing and create formal promoted builds from the master repositories, tagging the repositories to indicate when the promotion took place. The Release Engineering team will have to push the tag additions to the master repositories as a final step. At that point, cloning the repositories using that specific tag should get you the exact source used to create the promoted build.

Things are heating up in the OpenJDK world... ;^)


Sunday Mar 02, 2008

NetBeans, Mercurial, Ant, Mac OS X, and getting the right PATH set

It took me some time, but I figured out how to change the environment variables that the launched Mac applications will get. Why? Because I wanted NetBeans to be running with a PATH environment variable setting that matched what Ant got when I used the build.xml file from the command line. When using the "<exec>" Ant task to run an executable, hard-coding full paths or modifying the PATH is very platform specific, hard to maintain, and a huge pain. Having the PATH set properly in the environment is the best way.

Now I could launch "netbeans" from a command line and it would work, but here is the answer for setting the environment for applications launched:

<> cat ~/.MacOSX/environment.plist 
PATH = "/Users/ohair/ant/bin:/Users/ohair/findbugs/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/sbin:/sbin";

The directory ~/.MacOSX will need to be created. Apparently this file is read in at login time, so if you change it you will need to log out and log back in again for any change to make a difference. In my case I added the path to my Ant, my Findbugs, and /usr/local/bin, which contains the Mercurial (hg) I want to have available in the PATH. I'm not sure NetBeans will actually use the version of Ant in the PATH, but that hasn't been an issue for me.

Why Does hg need to be in the PATH?

It was the need for hg (in /usr/local/bin) in the PATH that got me started on all this, because in my Ant file I wanted the build to automatically pull the version information out of the repository and make it available to the built product as a property setting. Effectively I need to run:

    # Get the last changeset
    hg tip --template '{node|short}\n'
    # Get the latest tag with a Version string in it
    hg log -l 1 --template '{desc|firstline}\n' -k "Version:"
    # Get the date of the last changeset
    hg tip --template '{date|shortdate}\n'

I used Ant rules something like this:

    <target name="hgpropfile" description="Create property file">
        <exec executable="hg" outputproperty="">
            <arg value="tip"/>
            <arg value="--template"/>
            <arg value="{node|short}\n"/>
        </exec>
        <exec executable="hg" outputproperty="hg.last.tag.summary">
            <arg value="log"/>
            <arg value="-l"/>
            <arg value="1"/>
            <arg value="-k"/>
            <arg value="Version:"/>
            <arg value="--template"/>
            <arg value="{desc|firstline}\n"/>
        </exec>
        <exec executable="hg" outputproperty="">
            <arg value="tip"/>
            <arg value="--template"/>
            <arg value="{date|shortdate}\n"/>
        </exec>
<!-- Indentation is critical here -->
<echo file="${dist.dir}/" append="false">
product.version=${} ${hg.last.tag.summary} [${}]
</echo>
    </target>

This property and its value would need to be read in or made available at runtime. The actual Java code would then just need to call getProperty("product.version") to get the version string.

I used a tag to track the version code name: the most recent tag changeset whose description contains "Version:" provides the product code name. The date and changeset id come from the tip, or most recent changeset.
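As a rough illustration of the round trip (three values from hg written into one property line, then looked up again by key), here is a small sketch. It stands in for the Ant echo and the Java getProperty call rather than reproducing them; the helper names and the sample tag summary and date are assumptions made up for the example.

```python
def format_version_line(changeset, tag_summary, date):
    """Build the property line in the same shape the echo task writes."""
    return f"product.version={changeset} {tag_summary} [{date}]"

def read_property(text, key):
    """Look up a key in simple key=value text, skipping blanks and # comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None

# Sample values only: in the real build these come from the three hg commands.
line = format_version_line("82c85cfd8402", "Version: Duck Soup", "2008-03-02")
property_file_text = "# generated at build time\n" + line + "\n"

version = read_property(property_file_text, "product.version")
print(version)
```

The changeset id is the part that identifies the exact source; the code name and date are there to make the string easier for people to read.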

Automating the creation of a new version code name tag can be done with a special Ant target used when needed. Effectively it needs to run:

    hg tag -f -m "Version: Name" TAG-YYYY-MM-DD

I just manually created a file with a few hundred code names and picked one based on the day of the year. This "AllVersions" file could look as simple as:

    Humor Risk (1921), previewed once and never released; thought to be lost
    The Cocoanuts (1929), released by Paramount Pictures
    Animal Crackers (1930), released by Paramount
    The House That Shadows Built (1931), released by Paramount (short subject)
    Monkey Business (1931), released by Paramount
    Horse Feathers (1932), released by Paramount
    Duck Soup (1933), released by Paramount
    A Night at the Opera (1935), released by MGM
    A Day at the Races (1937), released by MGM
    Room Service (1938), released by RKO Radio Pictures
    At the Circus (1939), released by MGM
    Go West (1940), released by MGM
    The Big Store (1941), released by MGM
    A Night in Casablanca (1946), released by United Artists
    Love Happy (1949), released by United Artists
    The Story of Mankind (1957), released by Warner Brothers

But ideally you would want enough code names in the list to avoid the name getting re-used too many times. The "hg tip --template '{node|short}\n'" is your real version; these code names are just a way to help people quickly identify a version.

The Ant target looks something like:

    <target name="new_version" description="Create new version tag">
        <tstamp>
            <format property="" pattern="D"/>
            <format property="date.ymd" pattern="yyyy-MM-dd"/>
        </tstamp>
        <property name="versions.file" value="AllVersions"/>
        <!-- Count the lines in the versions file -->
        <exec executable="wc" outputproperty="version.count.temp" input="${versions.file}">
            <arg value="-l"/>
        </exec>
        <!-- Strip the leading blanks that wc may print -->
        <exec executable="sed" outputproperty="version.count" inputstring="${version.count.temp}">
           <arg value="-e"/>
           <arg value="s@^[\ ]*\([1-9][0-9]*\)$@\1@"/>
        </exec>
        <!-- Day of the year modulo the number of code names -->
        <exec executable="expr" outputproperty="version.selection">
            <arg value="${}"/>
            <arg value="%"/>
            <arg value="${version.count}"/>
        </exec>
        <!-- Print only the selected line of the versions file -->
        <exec executable="sed" outputproperty="" input="${versions.file}">
            <arg value="-e"/>
            <arg value="${version.selection},${version.selection}p"/>
            <arg value="-n"/>
        </exec>
        <exec executable="hg">
            <arg value="tag"/>
            <arg value="-f"/>
            <arg value="-m"/>
            <arg value="Version: ${}"/>
            <arg value="TAG${date.ymd}"/>
        </exec>
    </target>

Or in a Makefile you could do a similar thing:

DATE_YEAR_DAY:=$(shell date +%j)
DATE_YMD:=$(shell date +%Y-%m-%d)
VERSION_COUNT:=$(shell cat $(VERSIONS_FILE) | wc -l | sed -e 's@^[\ ]*\([1-9][0-9]*\)$$@\1@')
        hg tag -f -m "Version: $(VERSION_NAME)" TAG$(DATE_YMD)
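The selection logic that the Ant target and Makefile above are driving at (day of the year modulo the number of code names) can also be sketched in a few lines. This is an illustration under assumptions, not the build's actual code: the function names are invented, the three-name list stands in for the real AllVersions file, and the tag format follows the TAG-YYYY-MM-DD form shown earlier. Indexing a zero-based list sidesteps any special case when the remainder is zero.

```python
import datetime

def pick_code_name(names, day_of_year):
    """Pick a code name by day of the year, wrapping around the list."""
    return names[day_of_year % len(names)]

def tag_command(names, today):
    """Build the 'hg tag' argument list for the given date (does not run it)."""
    day_of_year = today.timetuple().tm_yday
    name = pick_code_name(names, day_of_year)
    return ["hg", "tag", "-f", "-m", f"Version: {name}", f"TAG-{today.isoformat()}"]

# Short stand-in for the AllVersions file.
names = ["Duck Soup", "Horse Feathers", "Monkey Business"]

command = tag_command(names, datetime.date(2008, 3, 2))
print(command)
```

The argument list could be handed to `subprocess.run` from a release script, but that step is left out here so the sketch has no side effects on a repository.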

Now creating tags like this may not be advised or necessary with all repositories, but the basic principle can work in many situations. For example, with the OpenJDK project, the Release Engineering people will create the major milestone tags, and using those you could effectively identify a JDK version with a name (e.g. JDK 7 Build 23) and an exact changeset id. The trick is to get the version information out of the repository, into the build tool (make or ant), and from there into the product installation or baked into the product executable, so that it is available to the product at runtime and easily seen when looking at an installation of the product. One issue I see is dealing with the situation where you are building a plain source tree without the Mercurial data or the Mercurial tools. Somehow, when the plain source tree is created, the version data would need to be left in the source bundle.

Hope this is of some use to people. I'm sure there might be a better way, so if anyone has any ideas please add your comments. Ultimately I'd like a product to be able to provide enough details to a user so that the original source tree could be made quickly available. Given the changeset id, the exact and complete source could be re-created with hg clone --rev. Of course that gets more complicated with a forest, but it's still pretty simple.



Various blogs on JDK development procedures, including building, build infrastructure, testing, and source maintenance.

