Building javac for JDK 7
By jjg on Nov 20, 2009
Back in August, Kelly posted a blog entry about the Anatomy of the JDK build. However, upcoming new features for javac mean that building the JDK is about to get more interesting. More specifically, building the langtools component of the JDK is about to get a whole lot more challenging.
Currently, it is a requirement that we can build a new version of JDK using the previous version. In other words, we can use JDK 6 to build images for JDK 7. However, we also want to be able to use new JDK features, including new language features, throughout most of the JDK source code. This means we need to be able to use the new version of javac to compile the Java code in the new version of JDK, which in turn imposes a restriction that we must at least be able to compile the new version of javac with the previous version of javac.
In practice, this means langtools is built using the boot JDK, which it uses to build bootstrap versions of javac, javah and javadoc, which understand the latest version of Java source code and class files, but which can be run by the boot JDK. These bootstrap tools will be used through the rest of the JDK build. In addition, the langtools build uses the new bootstrap javac to compile all of the langtools code for eventual inclusion in the new version of JDK. This is shown here, in Figure 1. This directly corresponds to step 1 in Anatomy of the JDK build.
Figure 1: Building langtools today
The main body represents the langtools build; inputs are on the left, and deliverables (for the downstream parts of the build) are shown on the right.
In a recent blog entry, JSR 199 meets JSR 203, I described a new file manager for use with javac that can make use of the new NIO APIs now available in JDK 7. Separately, Project Jigsaw is working to provide a Java module system that is available at both compile time and runtime, with consequential changes to javac. These two projects have one thing in common: both require that the new version of javac be able to access and use new API that is only available in JDK 7, which is at odds with the restriction that we should be able to compile the new javac with the previous version of javac.
The problem, therefore, is: how do we build javac for JDK 7?
Using the source path
One might think we could simply put the JDK 7 API source files on the source path used to compile javac. If only it were that simple! Various problems get in the way: some of the new API already uses new language features which will not be recognized by earlier versions of javac -- for example, some of the Jigsaw code already uses the new diamond operator. Also, javac ends up trying to read the transitive closure of all classes it reads from the source path, and when you put all of JDK on the source path, you end up reading a whole lot of JDK classes! Even though the new javac may just directly reference the NIO classes, to compile those classes, the transitive closure eventually leads you to AWT (really!) and to a couple of show stoppers: some of the classes are platform specific (i.e. in src/platform/classes instead of src/share/classes) and worse, some of the source files do not even exist at the time javac is being compiled -- they are generated while building the jdk repository, which happens much later in the JDK build process. (Step 6 in Anatomy of the JDK build.) So, simply putting the JDK 7 API source files on the source path is not a viable solution -- and reorganizing the build to generate the automatically generated source code earlier would be a very big deal indeed.
Using an import JDK
So, clearly, you can no longer build all of a new javac using the previous version of javac. But, we could leave out the parts of the new javac that depend on the new API, provided that we can build a bootstrap javac that functions "well enough" to be able to build the rest of javac and the JDK. However, we would still need to be able to build the new version of javac to be included in the final JDK image.
If you temporarily ignore chickens and eggs and their temporal relationships, the problems would all go away if you could put the classes for JDK 7 on the (boot) class path used to compile javac. This is very similar to the use of an "import JDK" used elsewhere by the JDK build system when performing partial JDK builds: an import JDK is used to provide access to previously built components when they are not otherwise part of the current build environment, which is somewhat the case here.
This is shown here, in Figure 2, and is not so different from what we are currently doing.
Figure 2: Building langtools with an import JDK
In a full JDK build, we cannot compile against the JDK source code on the source path, and we cannot assume the availability of an import JDK to use on the (boot) class path. The solution is to provide stub files for the necessary JDK 7 API, which are sufficient for the purpose of compiling javac. Stub files have the same public signature as the files they represent, but none of the implementation detail, so they do not suffer from the same extensive transitive closure problem as occurred when trying to compile against the real JDK 7 API source code. And, we only need stub files for those classes required by javac that are either new or changed from their JDK 6 counterparts. This also simplifies the problem substantially.
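To make the idea of a stub concrete, here is a minimal sketch. The classes `RealCounter` and `StubCounter` are hypothetical and are not part of the JDK or of langtools; they only illustrate "same public signature, none of the implementation detail". For brevity the check compares public methods only; fields and constructors would need the same treatment.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical example: none of these classes exist in the JDK.
class StubShapeDemo {

    // Stands in for a real API class; its implementation pulls in other types.
    public static class RealCounter {
        private final java.util.concurrent.atomic.AtomicInteger value =
                new java.util.concurrent.atomic.AtomicInteger();
        public int next() { return value.incrementAndGet(); }
        public String describe() { return "count=" + value.get(); }
        private void audit() { /* implementation detail, absent from the stub */ }
    }

    // The corresponding stub: same public methods, trivial bodies, no extra dependencies.
    public static class StubCounter {
        public int next() { return 0; }
        public String describe() { return null; }
    }

    // Collects "returnType name[paramTypes]" for each public, non-synthetic declared method.
    static Set<String> publicSignatures(Class<?> c) {
        Set<String> result = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                result.add(m.getReturnType().getName() + " " + m.getName()
                        + Arrays.toString(m.getParameterTypes()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A client compiled against the stub sees the same public methods as the real class.
        System.out.println(publicSignatures(RealCounter.class)
                .equals(publicSignatures(StubCounter.class)));
    }
}
```

Because the stub has no field of type `AtomicInteger` and no method bodies, compiling it does not require reading any further classes, which is why stubs avoid the transitive closure problem.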
The number of files involved, and the rate at which some of the files are changing, makes it impractical to create and maintain such stub files manually. The solution is to generate the stub files automatically from the latest JDK 7 API that would otherwise be used instead. The stub generator is built from parts of javac -- it reads in the JDK 7 source files to create javac ASTs, it rewrites the ASTs by removing as many implementation details as possible, then writes out the modified AST in Java source form to be used in place of the original. And, as a minor added complication, although the output stub files must be readable by a JDK 6 compiler, the input source files may contain JDK 7 artifacts (remember I said that the Jigsaw code already uses the diamond operator), so the stub generator must be built on top of the new javac -- or at least, those parts of the new javac that can be compiled by the old javac.
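The parse, strip, and write-out steps can be sketched as follows. This is not the langtools stub generator: the real tool rewrites javac ASTs, whereas this sketch uses the public Compiler Tree API (`com.sun.source`) and simply prints signatures with trivial bodies. All class and method names here are hypothetical, and the sketch assumes it runs on a JDK (so that `ToolProvider.getSystemJavaCompiler()` is non-null). It handles only plain classes and ignores fields, type parameters, superclasses, interfaces, and annotations.

```java
import com.sun.source.tree.*;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Collections;
import java.util.Set;
import javax.lang.model.element.Modifier;
import javax.lang.model.type.TypeKind;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical sketch: NOT the langtools stub generator.
class StubGeneratorSketch {

    // In-memory source file, so the example needs nothing on disk.
    static class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    // Prints class and method signatures; the scanner parameter is the enclosing class name.
    static class StubPrinter extends TreeScanner<Void, String> {
        final StringBuilder out = new StringBuilder();

        @Override public Void visitClass(ClassTree c, String outer) {
            Set<Modifier> mods = c.getModifiers().getFlags();
            if (mods.contains(Modifier.PRIVATE)) return null;
            out.append(flags(mods)).append("class ").append(c.getSimpleName()).append(" {\n");
            for (Tree member : c.getMembers()) {
                // Assumption: only methods and nested classes are kept in this sketch.
                if (member instanceof MethodTree || member instanceof ClassTree) {
                    scan(member, c.getSimpleName().toString());
                }
            }
            out.append("}\n");
            return null;
        }

        @Override public Void visitMethod(MethodTree m, String className) {
            Set<Modifier> mods = m.getModifiers().getFlags();
            if (mods.contains(Modifier.PRIVATE)) return null;   // not part of the public shape
            Tree ret = m.getReturnType();                        // null for constructors
            out.append("    ").append(flags(mods))
               .append(ret == null ? className : ret + " " + m.getName()).append('(');
            String sep = "";
            for (VariableTree p : m.getParameters()) {
                out.append(sep).append(p.getType()).append(' ').append(p.getName());
                sep = ", ";
            }
            out.append(')');
            sep = " throws ";
            for (Tree t : m.getThrows()) { out.append(sep).append(t); sep = ", "; }
            out.append(m.getBody() == null ? ";" : " " + trivialBody(ret)).append('\n');
            return null;                                         // never descend into the body
        }
    }

    static String flags(Set<Modifier> mods) {
        StringBuilder sb = new StringBuilder();
        for (Modifier m : mods) sb.append(m).append(' ');
        return sb.toString();
    }

    // Replaces the original body with the smallest body that still compiles.
    static String trivialBody(Tree returnType) {
        if (returnType == null) return "{ }";                    // constructor
        if (returnType instanceof PrimitiveTypeTree) {
            TypeKind kind = ((PrimitiveTypeTree) returnType).getPrimitiveTypeKind();
            if (kind == TypeKind.VOID) return "{ }";
            return kind == TypeKind.BOOLEAN ? "{ return false; }" : "{ return 0; }";
        }
        return "{ return null; }";
    }

    // Parses the source (no attribution needed) and returns the stub as Java source text.
    static String generateStub(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null,
                Collections.singletonList(new StringSource(className, source)));
        StubPrinter printer = new StubPrinter();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                if (unit.getPackageName() != null) {
                    printer.out.append("package ").append(unit.getPackageName()).append(";\n");
                }
                for (ImportTree i : unit.getImports()) {
                    printer.out.append("import ").append(i.isStatic() ? "static " : "")
                           .append(i.getQualifiedIdentifier()).append(";\n");
                }
                for (Tree decl : unit.getTypeDecls()) {
                    if (decl instanceof ClassTree) printer.scan(decl, null);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return printer.out.toString();
    }

    public static void main(String[] args) {
        String source = "import java.util.*;\n"
                + "public class Registry {\n"
                + "    private final List<String> names = new ArrayList<>();\n"
                + "    public int size() { return names.size(); }\n"
                + "    private void reset() { names.clear(); }\n"
                + "}\n";
        System.out.print(generateStub("Registry", source));
    }
}
```

Note how the input uses the diamond operator in a field initializer, yet nothing from that initializer survives into the output. That is the property the build relies on: the generator must understand new syntax, but what it emits must stay readable by the older compiler.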
The final result is shown here, in Figure 3.
Figure 3: Building langtools using generated stubs
The langtools build.xml file uses three new properties. Two are statically defined in build.properties, and specify the langtools source files that depend on new JDK 7 API, and the API that is depended upon; the third is provided by the user and can specify the location of either an import JDK or a jdk repository.
When building a full JDK, the langtools build.xml must be given the location of the jdk/ repository. The langtools build will create and compile against stub files generated from the necessary JDK source code. [Figure 3, above.] In a full JDK control build, the location of the jdk/ repository is passed in automatically by the Makefile from the JDK_TOPDIR make variable, which exists for this purpose.
When building langtools by itself, a developer may choose to pass in the location of an import JDK. In this case, the langtools build will compile against rt.jar in the import JDK, thus precluding the need to generate and use stub files. [Figure 2, above.]
If no value is passed in for the jdk/ repository or import JDK, the langtools build will not build those classes that require the use of JDK 7 API. This allows a developer to create a compiler that is just "a better JDK 6" compiler. [Figure 1, above.]
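The three cases above amount to a small decision on the single user-supplied location. The sketch below is illustrative only: the enum, the method name, and in particular the way each kind of location is recognized (looking for `jre/lib/rt.jar` or `src/share/classes`) are assumptions, not a description of how build.xml actually tells them apart.

```java
import java.io.File;

// Hypothetical sketch of the three-way choice; names and layout checks are assumptions.
class LangtoolsModeSketch {

    enum Mode {
        BETTER_JDK6,      // no location given: skip classes that need JDK 7 API (Figure 1)
        IMPORT_JDK,       // compile against rt.jar from an import JDK (Figure 2)
        GENERATED_STUBS   // generate stubs from the jdk/ repository sources (Figure 3)
    }

    static Mode chooseMode(File userLocation) {
        if (userLocation == null) {
            return Mode.BETTER_JDK6;
        }
        // Assumption: an import JDK is recognized by the presence of jre/lib/rt.jar.
        if (new File(userLocation, "jre/lib/rt.jar").isFile()) {
            return Mode.IMPORT_JDK;
        }
        // Assumption: a jdk repository is recognized by its src/share/classes directory.
        if (new File(userLocation, "src/share/classes").isDirectory()) {
            return Mode.GENERATED_STUBS;
        }
        throw new IllegalArgumentException(
                "neither an import JDK nor a jdk repository: " + userLocation);
    }

    public static void main(String[] args) {
        File location = args.length > 0 ? new File(args[0]) : null;
        System.out.println(chooseMode(location));
    }
}
```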
It is also worth noting the compiler options are quite tricky for these different cases, and specifically, for the boxes in the diagrams labelled "compile product classes".
javac itself is run using the bootstrap javac classes on the JVM boot class path (
If being used, the stub files go on the compiler's source path (-sourcepath), together with -Xprefer:source. Together, these mean that the stub files are used in preference to any files from the boot JDK, and that class files are not generated for the stub files. Other JDK API comes from the normal boot class path. Note that unlike other situations when overriding the standard JDK API, the stub files cannot go on the boot class path because source files are not read from that path.
If an import JDK is being used, it is used together with the javac output directory for the compiler's boot class path (-Xbootclasspath). This completely replaces the normal boot class path used by the compiler, so all JDK classes are read from the import JDK.
Unless an import JDK is being used, the javac output directory is prefixed to the normal boot class path (-Xbootclasspath/p:). This means that langtools classes are used in preference to classes on the normal boot class path, while not hiding any classes not defined by langtools.
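As a rough illustration of how the compiler options for "compile product classes" differ between the cases, the sketch below assembles an option list. It only builds strings and does not invoke javac. The method and parameter names are hypothetical, and the order in which the output directory and the import JDK's rt.jar appear on the replaced boot class path is an assumption. The JVM boot class path used to run javac itself is not modelled here.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: assembles javac options for compiling the product classes.
class ProductClassOptionsSketch {

    // stubDir and importRtJar may be null; at most one of them is expected to be set.
    static List<String> productOptions(String outputDir, String stubDir, String importRtJar) {
        List<String> opts = new ArrayList<>();
        if (stubDir != null) {
            // Stubs are read as source, preferred over boot JDK classes,
            // and no class files are generated for them.
            opts.add("-sourcepath");
            opts.add(stubDir);
            opts.add("-Xprefer:source");
        }
        if (importRtJar != null) {
            // Replace the normal boot class path entirely.
            // Assumption: the output directory is listed before rt.jar.
            opts.add("-Xbootclasspath:" + outputDir + File.pathSeparator + importRtJar);
        } else {
            // Prefix the output directory so langtools classes win,
            // without hiding anything else on the normal boot class path.
            opts.add("-Xbootclasspath/p:" + outputDir);
        }
        return opts;
    }

    public static void main(String[] args) {
        System.out.println(productOptions("build/classes", "build/genstubs", null));
        System.out.println(productOptions("build/classes", null, "/import/jdk/jre/lib/rt.jar"));
        System.out.println(productOptions("build/classes", null, null));
    }
}
```

The directory names in `main` are placeholders chosen for the example, not paths taken from the langtools build.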
Summary
With these build changes, it is possible to allow limited references from javac into new JDK 7 API, which are forward references in terms of the normal build process. Furthermore, this can be done without changing the overall structure of the JDK build pipeline.
Acknowledgements
Thanks to Kelly and Maurizio for reviewing the work here.