OpenJDK 6: Logistics of Partial Merge with 6u10
By darcy on Oct 02, 2008
A large fraction of my work for OpenJDK 6 build 12 was porting all of the cumulative fixes in selected areas of the 6u10 code base into OpenJDK 6. Internally, like the forest of Mercurial repositories of JDK 7, the code base of OpenJDK 6 is composed of a set of teamware workspaces for different areas: corba, hotspot, jaxp, jaxws, jdk, and langtools. Previously, the non-HotSpot code lived in a single "j2se" workspace, which was split as part of the JDK 7 transition to Mercurial. I worked on merging in the fixes from the corba, jaxp, jaxws, and langtools areas. Jon helped with langtools too.
A variety of techniques were used to first find the fixes in the 6 update train that were absent in the OpenJDK 6 code base and then apply them as appropriate. The three basic strategies were:
Examining the bug database to find bugs fixed in some 6 update release but not in OpenJDK 6. Patches implementing those fixes could then be applied.
Teamware bringover and merge. The teamware bringover command pulls down changes from the parent workspace; per-file SCCS histories track changes and are used to identify any merge conflicts that need to be resolved.
Raw diffs of source files. Crude, but sometimes effective with the right preprocessing.
There were various complications using these techniques. Not all areas follow the same bug database practices, so using the bug database alone was not sufficient. Standard practice for maintaining JDK code is to have each SCCS delta tagged with a set of bug numbers (or tagged as a merge or a copyright update) and to not add empty deltas when the code has not changed. However, some areas of the JDK, jaxws in particular, are effectively maintained externally and only occasionally synced. In those cases, this SCCS discipline is not necessarily followed, rendering a teamware merge operation ineffective since many files will have spurious conflicts. However, even in areas where standard practices are followed, a simple teamware bringover and merge may not suffice. For example, a file may have been removed from OpenJDK 6 but not removed from the 6 update train; a simple bringover would recreate the file. Raw diffs of source files can also be informative, but some preprocessing is needed to bring both code bases to a common format.
Compared to the 6 update train, as part of open sourcing and in preparation for transition to Mercurial, the OpenJDK 6 code base has undergone a number of pervasive transformations. On a file level:
(generally) a GPL license is used instead of the Sun TLDA
SCCS keywords have been purged
whitespace has been normalized: tabs have been replaced by spaces and source files end with a newline
Therefore, to compute a meaningful raw diff, each of these transformations needs to be applied to the 6 update source or undone from the OpenJDK 6 source, although diff -w can of course bridge inconsistent whitespace conventions. However, even after being brought to a common format, since OpenJDK 6 is a backward branch from JDK 7, certain refactorings are present in the OpenJDK 6 train but absent from the 6 update train, obscuring both teamware-based and diff-based file comparisons. On a workspace level, the 6 update releases still use a monolithic j2se workspace in contrast to the split component workspaces in OpenJDK 6. Consequently, the contents of some subdirectories are now spread across multiple areas, further complicating bringover logistics.
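As a rough illustration of the file-level preprocessing described above, here is a minimal Python sketch that brings one source file's text to a common format before diffing. It is not the script actually used for the merge. The function name `normalize_source`, the tab width of 8, and the two keyword patterns (unexpanded SCCS keywords such as `%W%`, and expanded strings starting with `@(#)`) are assumptions made for the example, and the patterns are deliberately crude.

```python
import re

# Assumed keyword patterns (not taken from the original tooling):
# unexpanded SCCS id keywords such as %W% or %E%, and expanded
# "what" strings that begin with @(#) and run to the end of the line.
SCCS_UNEXPANDED = re.compile(r"%[A-Z]%")
SCCS_EXPANDED = re.compile(r"@\(#\)[^\n]*")


def normalize_source(text, tab_width=8):
    """Return text with SCCS keywords removed and whitespace normalized."""
    lines = []
    for line in text.splitlines():
        line = SCCS_EXPANDED.sub("", line)
        line = SCCS_UNEXPANDED.sub("", line)
        # Replace tabs with spaces and drop trailing whitespace.
        line = line.expandtabs(tab_width).rstrip()
        lines.append(line)
    # Drop trailing blank lines, then end the file with exactly one newline.
    while lines and not lines[-1]:
        lines.pop()
    return "\n".join(lines) + "\n"
```

Applying the same function to both the 6 update copy and the OpenJDK 6 copy of a file leaves license text as the main remaining source of noise in a raw diff.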
The initial approach to evaluate merging in changes to each OpenJDK 6 component workspace was to:
Generate a list of source and test directories of the component workspace. (The make directories were excluded from consideration in the merge attempt since the workspace split fundamentally changed the makefile structure.)
Do a trial bringover -n of those directories from a 6u10 workspace. A workspace has a "biological parent" which gives birth to it; however, a bringover can use a different parent for comparison purposes instead. In this case, for the merge operation I used the 6u10 j2se workspace as a temporary adoptive parent of my OpenJDK 6 component workspace. However, the eventual results of the merge would be committed to the biological parent OpenJDK 6 workspace.
Refine the directory list to avoid including spurious files.
Evaluate the utility of the bringover results in terms of merging files and generating conflicts.
While this procedure provided useful information, a simple bringover from 6u10 into OpenJDK 6 was never sufficient to properly capture the set of fixes without introducing other sorts of regressions, such as open source related file and license cleanup. On a per component workspace basis, those complications were:
corba: Generally the bringover for corba went very smoothly; there were only 12 conflicting files and 8 new files that would be created. However, none of the 8 new files ended up being added to the OpenJDK 6 corba workspace; four of the new files now live in the jdk workspace after the workspace split and the other four were unneeded binary files previously removed from the workspace in preparation for open sourcing. Resolving the actual file conflicts was straightforward and the fixes for eight bugs were brought over. After the merge, the OpenJDK 6 corba workspace had only a slightly different structure than in 6u10 since the OpenJDK 6 corba was changed to no longer require a Scheme interpreter during the build!
jaxp: In several dozen directories, there were about 30 conflicting files and two new files that would be created. Many of the differences were small, such as purging of SCCS keywords in OpenJDK 6. Where an SCCS keyword was purged, the purge was kept in the merged result; likewise licensing refinements and cleanups in OpenJDK 6 were also preserved in the merge. However, all changes that affected the semantics of the running code were brought in from 6u10. In summary, for the 30 or so conflicting files, generally the code matches the code in 6u10 and the license matches the license in OpenJDK 6. In previous syncs with the JDK, jaxp didn't always follow bug database discipline so a marker merge bug was created to supplement the known fix that was brought over. One of the files that would have been created was previously removed in JDK 7 so it was not resurrected and the other potential new file was not necessary and thus not created either.
jaxws: The jaxws code is externally maintained and occasionally synced; however, standard teamware delta practices are not followed, resulting in thousands of spurious merge conflicts being reported on a bringover. Therefore, by using a script that stripped off the leading comment in a file (removing license differences) and normalized whitespace, transformed versions of the OpenJDK 6 and 6u10 jaxws sources in a common format could be compared. The only nontrivial differences were for the upgrade to JAF 1.1.1. Other than difficulties getting the tests for these changes to work in jprt, incorporating the fixes was straightforward. In addition, the jaxws team verified no other fixes went into 6u10 that were not already present in OpenJDK 6 (the same security fixes had been applied independently to both releases).
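The kind of comparison described for jaxws can be sketched as follows. This is a hypothetical reconstruction in Python, not the script that was actually used. It assumes each source tree has already been read into a dictionary mapping relative path to file contents, and the names `canonical` and `differing_files` are invented for the example. Collapsing all whitespace is crude (it also touches whitespace inside string literals), so any file it reports still needs a manual look.

```python
import re

# Matches one leading /* ... */ block comment, which is assumed to
# hold the license header.
LEADING_COMMENT = re.compile(r"\A\s*/\*.*?\*/", re.DOTALL)


def canonical(text):
    """Drop the leading block comment and collapse all whitespace runs."""
    body = LEADING_COMMENT.sub("", text, count=1)
    return " ".join(body.split())


def differing_files(tree_a, tree_b):
    """Return sorted paths that differ after canonicalization,
    including paths present in only one of the two trees."""
    paths = set(tree_a) | set(tree_b)
    return sorted(
        p for p in paths
        if p not in tree_a
        or p not in tree_b
        or canonical(tree_a[p]) != canonical(tree_b[p])
    )
```

With license headers and whitespace out of the way, the list returned by `differing_files` should be short enough to review file by file.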
langtools: The javac in OpenJDK 6 inherited a number of restructuring changes from JDK 7 that permeated the code, limiting the effectiveness of both teamware-based and source-file-based comparisons to find true differences. Instead, queries on the bug database were used to identify bugs fixed in some 6 update release, but not in OpenJDK 6. The effective set of bugs fixed in OpenJDK 6 compared to the originally shipped JDK 6 was computed as:
The bugs fixed in JDK 7 before the backward branch to create OpenJDK 6.
PLUS bugs directly fixed in OpenJDK 6 since the inception of its workspaces.
MINUS any "antibugs" that were annihilated in OpenJDK 6. An antibug is a change from JDK 7 that is inappropriate for a Java SE 6 implementation, such as OpenJDK 6, that is fixed by undoing the change. For example, a change in JDK 7 that added a class or method in the java.\* or javax.\* namespaces is inappropriate for a Java SE 6 implementation and must be removed for Java SE 6 conformance.
The set of bugs cumulatively fixed in 6u10 is just a union of the bugs fixed in each update release up to and including 6u10: 6u1, 6u2, 6u3, 6u4, 6u5, 6u6, 6u7, and 6u10. In the end, the patches for about five groups of langtools bugs needed to be applied.
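The bookkeeping above is plain set arithmetic, which a short Python sketch makes concrete. The function names and the shape of the inputs (collections of bug identifiers, plus a dictionary from update release name to the bugs fixed in it) are assumptions for illustration; the real data came from bug database queries.

```python
def effective_fixes(jdk7_before_branch, openjdk6_direct, antibugs):
    """Bugs effectively fixed in OpenJDK 6 relative to the shipped JDK 6:
    JDK 7 fixes before the backward branch, PLUS direct OpenJDK 6 fixes,
    MINUS antibugs whose changes were undone."""
    return (set(jdk7_before_branch) | set(openjdk6_direct)) - set(antibugs)


def missing_fixes(update_release_fixes, effective):
    """Bugs fixed in some update release but not effectively in OpenJDK 6.

    update_release_fixes maps a release name (for example "6u1") to the
    collection of bug identifiers fixed in that release."""
    cumulative = set().union(*update_release_fixes.values())
    return cumulative - set(effective)
```

The result of `missing_fixes` is the candidate list of fixes whose patches still have to be applied.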
When reviewing the changes to corba, jaxp, and jaxws, two sets of webrevs were generated, one comparing the merge result against 6u10 and another comparing the merge result against OpenJDK 6. Each was important to fully verify the correctness of the change both in terms of licensing and semantics.