Friday Apr 18, 2008

OpenJDK: Dude, Where's My Changeset?

Developers are asking where the changesets are, which reminds me of that movie Dude, Where's My Car? So: "Dude, Where's My Changeset?"

So it seemed appropriate to bring up the "OpenJDK Integration Wheel" again and talk about how a changeset moves around:

Let's pretend a changeset (fix) was pushed to the Swing Team Area and is now publicly available. Not picking on the Swing team, just didn't want to change the picture :^) Feel free to change the example from "Swing" to "Build" and the Build Team Area.

And Then? Immediately, anyone in the Swing team can pull these changesets down and have access to them. Developers from other teams could also pull from this Swing area to get the change. That's certainly possible, just not normal.

And Then? Depending on the Integration Schedule these changesets will likely remain in the Swing Team Area until the official Swing integrator pushes those changesets into the Master area (the Integration Process).

And Then? Once integrated into the Master area, the other teams still may not see these changesets until their own integrators choose to "sync up", that is, pull these new changesets from the Master area into their own team areas. This "sync up" usually happens as a team integrator prepares for an integration, so depending on the order in the Integration Schedule, it could be a day or many days before a team sees changes that have been pushed into the Master area.

And Then? Even after the team area has the changesets, a developer won't see the changesets until he pulls the changesets into his own area.
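The hops described above can be sketched as a toy model in which each area is just a set of changeset IDs. This is only an illustration of the flow, not how Mercurial or the OpenJDK infrastructure is implemented; the area names, the `propagate` helper, and the `swing-fix` changeset are all made up for the example.

```python
# Toy model of the integration wheel: every area is a set of changeset IDs.
# Area names and changeset IDs are hypothetical, chosen to match the example above.
areas = {name: {"base"} for name in ("swing_team", "master", "build_team", "build_dev")}


def propagate(source, dest):
    """Copy the changesets that dest is missing from source; return what was new."""
    new = areas[source] - areas[dest]
    areas[dest] |= new
    return new


def who_has(changeset):
    """List the areas that currently contain the changeset."""
    return sorted(name for name, held in areas.items() if changeset in held)


# A Swing developer pushes a fix to the Swing team area.
areas["swing_team"].add("swing-fix")
visibility = [who_has("swing-fix")]

# Each later hop only happens when somebody chooses to do it.
hops = [
    ("swing_team", "master"),    # Swing integrator integrates into Master
    ("master", "build_team"),    # Build integrator syncs up with Master
    ("build_team", "build_dev"), # Build developer pulls from the team area
]
for source, dest in hops:
    propagate(source, dest)
    visibility.append(who_has("swing-fix"))

for step in visibility:
    print(step)
```

Each printed line shows who can see the fix after one more hop, which is why a changeset can be "public" and still be missing from your own repository.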

So you can see how sometimes it's hard to find a changeset. Somewhat independent of the changesets flowing into the various team areas, the Release Engineering Team will use the Master area and attempt to create a promoted build, and if successful will create tags in the Master repositories to record what changesets were included in a promotion.

Some people will find this whole process frustrating, but there are some big advantages. The delay of the changesets getting into the Master area allows for the team most closely associated with the change to test and verify, from that team's perspective. Odds are that bad changesets will get caught by the team, and this protects the other teams. The process the team integrator goes through includes additional testing to verify the changesets being integrated are solid for everyone. It's not unusual for an integrator to run into a problem with his team's changesets, which again protects all the other teams from potential disasters. Granted, regressions will still happen, but really nasty regressions are usually caught early.

Hope this helps explain things.


Thursday Mar 27, 2008

OpenJDK, Mercurial, and The Changeset View

Why do I have to create a "Merge" changeset when there was nothing to merge?

For most of us old TeamWare users, and maybe other SCM users, the need for all the Mercurial "Merge" changesets (or, as some people politely refer to them, 'merge turds') seems confusing and messy. If the changes don't involve the same file, it can be hard to understand why you need a Merge changeset.

What did TeamWare look like?

In TeamWare a 'resolve' was necessary only when there was a conflict, meaning that two people changed the same file. The tool 'filemerge' provided a way to easily deal with each file conflict, but merging changes is and will always be a risky business. Everyone has had an experience with a 'bad merge', they are nasty problems. No Source Code Management (SCM) tool completely removes the need for merging, and our only hope is for the merging tools to help us out here. It is probably true that a Distributed SCM like TeamWare, Mercurial, or git may create the need for more frequent merging, but the end result is often the same as a non-Distributed SCM, so maybe with a DSCM the merge work is also distributed? Anyway, I digress.

With TeamWare, the 'resolve' action resulted in multiple revisions in each SCCS file that had a conflict. The TeamWare tool 'vertool' provided a way to pick an SCCS file and view its revision history. Again, this was on a per-file basis, and although that created some benefits for developers, like being able to 'putback' just one file change, it also made it a little difficult to record the true state of the entire workspace. Here is a snapshot of vertool in action for anyone that hasn't seen it:

Notice the SCCS revision graph. When conflicts happen, the graph gets a little more complicated, but unless the changes are abandoned, it always connects back up to the main trunk of the graph. With TeamWare, every file was controlled with SCCS, and every file had a graph. The connections between files were never formally managed by TeamWare, but TeamWare provided some tools like 'freezept' to allow you to try and manage them.

And with Mercurial ...

The changes come in changesets, or grouped changes to files, which are treated and tracked as changes to the repository. Yes, the changes are made to specific files, but the revision tracking is done for the entire repository. When a merge situation happens in Mercurial, and they will be frequent, a new changeset has to be created to potentially carry any file merge changes, but most importantly to identify the merged or joined results of two changesets. Every changeset except the very first has at least one parent changeset, and Merge changesets have two. Every time you do an 'hg pull' that adds new changesets to your repository while your repository has changesets that have not been pushed yet, you have created what is called a 'multiple head' situation and you will need a Merge changeset. A 'head' is a changeset with no descendants. The tip changeset is a head, and it must be the only head if you want to push your changesets to the OpenJDK repositories (we do not allow any multiple head pushes with the OpenJDK repositories). This unfortunately means that people who do frequent "syncs" with their parent repository may be creating many Merge changesets. That's just the way it is; like Taxes, we will need to learn to live with it.
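To make the 'multiple head' idea concrete, here is a minimal sketch that models a repository as a mapping from each changeset to its parents. It is a simplification for illustration only: the changeset names are invented (real ones are hashes), named branches are ignored, and `can_push` is a hypothetical stand-in for the single-head rule, not the actual check the OpenJDK servers run.

```python
# Toy changeset graph: each changeset ID maps to a tuple of its parent IDs.
# IDs are invented for the example; the first changeset has no parents.

def heads(graph):
    """Return the changesets that no other changeset names as a parent."""
    has_child = {parent for parents in graph.values() for parent in parents}
    return sorted(cs for cs in graph if cs not in has_child)


def can_push(graph):
    """Hypothetical stand-in for the 'no multiple heads' push rule."""
    return len(heads(graph)) == 1


repo = {"A": (), "B": ("A",)}       # the history you originally cloned
repo["mine"] = ("B",)               # your local commit, not pushed yet
print(heads(repo), can_push(repo))

repo["theirs"] = ("B",)             # a pull brings in someone else's changeset
print(heads(repo), can_push(repo))  # two heads now, even if no file overlaps

repo["merge"] = ("mine", "theirs")  # the Merge changeset has two parents
print(heads(repo), can_push(repo))  # back to a single head
```

Note that nothing in the sketch looks at file contents: the merge changeset is needed purely to join the two lines of history, which is why it appears even when no file was changed by both sides.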

The 'hg view' command of Mercurial can provide some insight into this Merge business. To use 'hg view' you need to:

  1. Enable the hgk extension in your ~/.hgrc file.
  2. Make sure that the hgk tool is available from your PATH environment variable setting. You may need to download the Mercurial source bundle that matches the version of Mercurial you are using and get the hgk file from the contrib directory.
  3. Make sure the wish tool is available from your PATH environment variable setting. Note that Solaris Express has a /usr/bin/wish that works, and Mac OS X 10.5 has a /usr/bin/wish that works, but you may need to do a little searching to find a wish that is acceptable to hgk. Solaris 10 and older machines may have one at /opt/sfw/bin/wishx or /usr/sfw/bin/wish8.[34]
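The three steps above could be checked with a small preflight script along these lines. This is a rough sketch, not an official tool: `hgk_enabled` and `missing_tools` are hypothetical helper names, the sample configuration text is made up, and Python's `configparser` is only an approximation of how Mercurial reads its configuration, so a real ~/.hgrc with unusual syntax may not parse this way.

```python
import configparser
import shutil


def hgk_enabled(hgrc_text):
    """Return True if the [extensions] section enables hgk.

    Assumes the extension is listed as 'hgk' or 'hgext.hgk', and that a
    value starting with '!' means the extension is disabled.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(hgrc_text)
    if not parser.has_section("extensions"):
        return False
    for key, value in parser.items("extensions"):
        if key in ("hgk", "hgext.hgk") and not (value or "").startswith("!"):
            return True
    return False


def missing_tools(tools=("hgk", "wish")):
    """Return the tools from the list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Made-up configuration text standing in for the contents of ~/.hgrc.
sample = """\
[ui]
username = Jane Doe <jane@example.com>

[extensions]
hgk =
"""

print("hgk enabled:", hgk_enabled(sample))
print("not on PATH:", missing_tools())
```

To check a real setup, read the text of your own ~/.hgrc and pass it to `hgk_enabled`. If wish lives under a different name on your machine, pass that name in the `tools` tuple instead.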

For example, to see what the most recent changesets pushed to the OpenJDK jdk7/jdk repository look like:

   hg clone yourjdk
   cd yourjdk
   hg view

You should then see something like this:

Looks a little like a Public Transportation System. Notice the groups of changesets created by developers, usually generated one right after the other. If two developers manage to line up (luck), the sequence is simple, but otherwise the second one to push had to pull first and create a merge changeset. Layered on top of that are the integrations of the various teams into the master repository, which should appear as major additions to the graph.

Since a changeset is a repository revision this has tremendous benefits. For example, anyone can re-create the state of a repository (all the files) as of any changeset by simply doing:

   hg clone -r 82c85cfd8402 yourjdk trimmedjdk

This creates a separate repository that represents the state of everything as of that specific changeset ID (which happens to be a changeset I created).
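One way to picture what a clone limited to a single revision carries with it: the named changeset plus everything reachable through its parent links, and nothing that came later. The sketch below models only that history selection on a toy graph with invented changeset names; `clone_at` is a hypothetical helper, not a Mercurial API.

```python
# Toy changeset graph: each changeset ID maps to a tuple of its parent IDs.
# All IDs are invented for the example.
history = {
    "A": (),
    "B": ("A",),
    "C": ("B",),
    "D": ("B",),
    "E": ("C", "D"),  # merge changeset
    "F": ("E",),
}


def ancestors(graph, rev):
    """Return rev together with everything reachable via parent links."""
    seen = set()
    stack = [rev]
    while stack:
        cs = stack.pop()
        if cs not in seen:
            seen.add(cs)
            stack.extend(graph[cs])
    return seen


def clone_at(graph, rev):
    """Hypothetical helper: keep only rev and its ancestors."""
    keep = ancestors(graph, rev)
    return {cs: parents for cs, parents in graph.items() if cs in keep}


print(sorted(clone_at(history, "C")))  # history as it stood at C
print(sorted(clone_at(history, "E")))  # the merge pulls in both of its parents
```

Because the revision identifies the whole repository rather than one file, picking a single changeset is enough to pin down every file at that point.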

I hope this has been helpful to at least a few people. Send me comments if I can clarify this more for people.

Thanks to Chris Mason for creating the great hgk extension.




