JDK Build Musings
By kto on Apr 17, 2012
Here are some somewhat random musings on building the JDK, along with various build system observations. Some of these may sound like whining, but I can assure you that whining is not allowed in my blog, only constructive criticism; it's everyone else that is whining. :^) Apologies for the length.
Build and test of the JDK has multiple dimensions, and I cannot say that I have them all covered here; it is, after all, just a blog, and I am not endowed with any supernatural guidance here.
Continuous Build & Smoke Test
Every component, every integration area, every junction point or merge point should be constantly built and smoke tested. I use the term smoke test because completely testing a product can take many weeks, and many tests do not lend themselves to being fully and reliably automated. The job of complete testing belongs to the testing organization. Smoke tests should provide a reasonable assurance that a product is not brain dead and has no major flaws that would prevent further testing later on down the road. These smoke tests should be solid, reliable tests, and any failure of these tests should signify a flaw in the changes to the product and raise major red flags for the individuals or teams that integrated any recent changes. Over the last year or more we have been trying to identify these tests for the JDK, and it's not an easy task.
Everyone cuts corners for the sake of productivity; it's just important to cut those corners with eyes wide open. The ideal would be that every changeset is known to have passed the build and smoke test; the reality is far from that, but we know where we want to be.
Build and Test Machines
The hardware/machine resources for a build and test system are cheap, and a bargain if they keep all developers shielded from bad changes by finding issues as early in the development process as possible. But it is also true that hardware/machine resources do not manage themselves, so there is also an expense in managing the systems; some of it can be automated, but not everything. Virtual machines can provide benefits here, but they also introduce complications.
Continuous Integration
Depending on who you talk to, this can mean a variety of things. If it includes building and smoke testing before the changes are integrated, this is a fantastic thing. If people consider this to mean that developers should continuously integrate changes without any verification whatsoever that the changes work and don't cause regressions, that could be a disaster. Some so-called 'Wild West' projects purposely want frequent integrations with little or no build and test verification. Granted, for a small, tight team, going 'Wild West' can work very well, but not when the lives of innocent civilians are at risk. Wild West projects must be contained; all members must be agile, wear armor, and be willing to accept the consequences of arbitrary changes sending a projectile through their foot.
Multiple Platforms
The JDK is not a pure Java project, and builds must be done on a set of different platforms. When I first started working on the JDK, it became obvious to me that this creates a major cost to the project, or any project. Expecting all developers to have access to all types of machines, be experienced with all platforms, and take the time to build and test manually on all of them is silly; you need some kind of build and test system to help them with this. Building on multiple platforms (OS releases or architectures) is hard to set up, regardless of the CI system used; this is a significant issue that is often underestimated. Typically the CI system wants to treat all systems the same, and the fact of the matter is, they are not, and somewhere these differences have to be handled very carefully. Pure Linux projects, or pure Windows projects, will quickly become tethered to that OS and the various tools on it. Sometimes that tethering is good, sometimes not.
Multiple Languages
Again, the JDK is not a pure Java project; many build tools try to focus on one language or set of languages. Building a product that requires multiple languages, where the components are tightly integrated, is difficult. Pure Java projects and pure C/C++ projects have a long list of tools and build assists for creating the resulting binaries. Less so for things like the JDK, where not only do we have C/C++ code in the JVM, but also C/C++ code in the various JNI libraries, and C/C++ code in JVM agents (very customized). The GNU make tool is great for native code, the Ant tool is great for small Java projects, but there aren't many tools that work well in all cases.
Picking the right tools for the JDK build is not a simple selection.
Multiple Compilers
Using different C/C++ compilers requires a developer to be well aware of the limitations of all the compilers, and to some degree, if the Java code is also being compiled by different Java compilers, the same awareness is needed. This is one of the reasons that builds and tests on all platforms are so important, and also why changing compilers, even just to a new version of the same compiler, can make people paranoid.
Partial Builds
With the JDK we have a history of doing what we call partial builds. The hotspot team rarely builds the entire jdk; instead they just build hotspot (because that is the only thing they changed) and then place their hotspot in a vetted jdk image that was built by the Release Engineering team at the last build promotion. Ditto for the jdk teams that don't work on hotspot: they rarely build hotspot. This was and still is considered a developer optimization, but it is really only possible because the way the JVM interfaces with the rest of the jdk rarely changes. To some degree, successful partial builds can indicate that the changes have not created an interface issue and can be considered somewhat 'compatible'.
These partial builds create issues when there are changes in both hotspot and the rest of the jdk, where both changes need to be integrated at the same time or, more likely, in a particular order. For example, hotspot integrates a new extern interface, and later the jdk team integrates a change that uses or requires that interface, ideally after the hotspot changes have been integrated into a promoted build so everyone's partial builds have a chance of working.
The partial builds came about mostly because of build time, but also because of the time and space needed to hold all the sources of parts of the product you never really needed. I also think there is a comfort effect for a developer in not even having to see the sources to everything he or she doesn't care about. I'm not convinced that the space and time of getting the sources is that significant anymore, although I'm sure I would get arguments on that. Build speed could also become less of an issue as the new build infrastructure speeds up building and makes incremental builds work properly. But stay tuned on this subject; partial builds are not going away, but it's clear that life would be less complicated without them.
Build Flavors
Similar to many native code projects, we can build a product, a debug, or a fastdebug version. My term for these is build flavors. My goal in the past has been to make sure that the build process stays the same, and it's just the flavor that changes. Just like ice cream. ;^) (fastdebug == -O -g + asserts)
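As a rough illustration, the flavor idea boils down to a table of compiler flags keyed by flavor name, while the build steps themselves never change. This is a sketch only; the flag sets below are simplified stand-ins, not the real JDK makefile values.

```python
# Sketch of the build-flavor idea: one build procedure, several flavors.
# These flag sets are simplified stand-ins, not the real JDK makefile values.
FLAVOR_CFLAGS = {
    "product":   ["-O2"],                   # optimized, no asserts
    "debug":     ["-g", "-DDEBUG"],         # full debug
    "fastdebug": ["-O", "-g", "-DASSERT"],  # -O -g + asserts
}

def cflags_for(flavor):
    """The build process stays the same; only the flags differ per flavor."""
    if flavor not in FLAVOR_CFLAGS:
        raise ValueError(f"unknown build flavor: {flavor}")
    return FLAVOR_CFLAGS[flavor]
```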
Plug and Play
This relates to build flavors: it has been my feeling that regardless of the build flavor, the APIs should not change. This allows someone to take an existing product build, replace a few libraries with their debug versions, and run tests that will run with the best performance possible, while still being able to debug in the area of interest. This cannot happen if the debug or fastdebug versions have different APIs, like MSVCRTD.DLL and MSVCRT.DLL.
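The plug-and-play swap can be sketched in a few lines. The image layout and library name below are hypothetical, and the throwaway directories only stand in for real product and debug images; the swap only works because every flavor keeps the same APIs.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical sketch: drop a debug build of one library into a product image,
# keeping a backup of the product copy so it can be restored afterwards.
def swap_in_debug_lib(product_image: Path, debug_image: Path, lib: str) -> None:
    target = product_image / "lib" / lib
    backup = target.parent / (lib + ".orig")
    shutil.copy2(target, backup)                     # keep the product copy
    shutil.copy2(debug_image / "lib" / lib, target)  # swap in the debug version

# Demo with throwaway directories standing in for real images.
root = Path(tempfile.mkdtemp())
for flavor, bits in [("product", "product bits"), ("debug", "debug bits")]:
    (root / flavor / "lib").mkdir(parents=True)
    (root / flavor / "lib" / "libawt.so").write_text(bits)

swap_in_debug_lib(root / "product", root / "debug", "libawt.so")
```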
Mercurial
Probably applies to Git or any distributed Source Code Management system too.
Face it, DSCMs are different. They provide some extremely powerful abilities over single-repository SCMs, but they also create unique issues. CI systems typically want to treat these SCM systems just like SVN or CVS; in my opinion that is a mistake. I don't have any golden answers here, but anyone who has worked or does work with a distributed SCM will struggle with CI systems that treat Mercurial like Subversion.
The CI systems are not the only ones. Many tools seem to have this concept of a single repository that holds all the changes, when in reality with a DSCM, the changes can be anywhere, and may or may not become part of any master repository.
Nested Repositories
Not many projects have cut up their sources like the OpenJDK. There were multiple reasons for it, but it often creates issues for tools that either don't understand the concept of nested repositories or just cannot handle them. It is not clear at this time how this will change in the future, but I doubt the nested repositories will go away. It has been observed by many that the lack of bookkeeping with regards to the state of all the repositories can be an issue; the build promotion tags may not be enough to track how all the repository changeset states line up with what was built together.
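One way to imagine that bookkeeping is a simple manifest recording which changeset of each nested repository went into a build. This is a sketch only; in a real forest the ids would come from something like `hg id -i` in each repository, and the repo names and changeset ids below are made up.

```python
# Hypothetical sketch: record which changeset of every nested repository went
# into a build, so "what was built together" can be answered later.
def build_manifest(repo_states):
    """repo_states maps repo name -> changeset id; returns one line per repo."""
    return "\n".join(f"{name} {cs}" for name, cs in sorted(repo_states.items()))

# Made-up repository names and changeset ids, for illustration only.
manifest = build_manifest({
    "hotspot": "0f9e8d7c6b5a",
    "jdk":     "a1b2c3d4e5f6",
})
```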
Managing Build and Test Dependencies
Some build and test dependencies are just packages or products installed on a system; I've often called those "system dependencies". But many are just tarballs or zip bundles that need to be placed somewhere and referred to. In my opinion, this is a mess, and we need better organization here. Yeah yeah, I know someone will suggest Maven or Ivy, but it may not be that easy.
We will be trying to address this better in the future, no detailed plans yet, but we must fix this and fix it soon.
Resolved Bugs and Changesets
Having a quick connection between a resolved bug and the actual changes that fixed it is so extremely helpful that you cannot be without it. The connection needs to be both ways, too. It may be possible to do this completely in the DSCM (Mercurial hooks), but in any case it is really critical to have that easy path between changes and bug reports. And if the build and test system has any kind of archival capability, also to that job data.
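For example, a Mercurial pretxncommit hook could refuse any changeset whose description does not start with a bug id. The sketch below shows only the check such a hook might run; the 7-digit "1234567: Summary" form is an assumed convention for illustration, not a statement of actual policy.

```python
import re

# Hypothetical sketch of the check a Mercurial pretxncommit hook might run:
# every changeset description must start with a bug id it can be linked to.
# The 7-digit "1234567: Summary" form is an assumed convention.
BUGID_RE = re.compile(r"^\d{7}: \S")

def description_links_to_bug(desc: str) -> bool:
    """True if the changeset description starts with a resolvable bug id."""
    return bool(BUGID_RE.match(desc))
```

Wired into a repository's hgrc under `[hooks]`, a rejection here keeps the change-to-bug path intact in both directions: the bug database can index changesets by id, and the changeset always names its bug.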
Test Automation
Some tests cannot be automated, some tests should not be automated, some automated tests should never be run as smoke tests, some smoke tests should never have been used as smoke tests, some tests can seriously mess up automation and even the system being used... No matter what, automating testing is not easy. You cannot treat testing like building; it has unique differences that cannot be ignored. If you want the test runs to be of the most benefit to a developer, you cannot stop on the first failure; you need to find all the failures. That failure list may be the evidence that links the failures to the change causing them, e.g. only tests using -server fail, or only tests on X64 systems fail, etc. At the same time, it is critical to drive home the fact that the smoke tests "should never fail"; it is a slippery slope to start allowing smoke tests to fail. Sometimes you need hard and fast rules on test failures, and the smoke tests are those. If accepting a failing smoke test is ever made policy, that same policy needs to exclude the failing smoke test so that life can go on for everyone else.
In an automated build and test system, you have to protect yourself from the tests polluting the environment or the system and impacting the testing that follows. Redirection of the user.home and java.io.tmpdir properties can help, or at least making sure these areas are consistent from test run to test run. Creating a separate and unique DISPLAY for X11 systems can also protect your test system from being impacted by automated GUI tests that can change settings of the DISPLAY.
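A sketch of that redirection: build a per-run java command line whose user.home and java.io.tmpdir point into a throwaway directory, with GUI tests aimed at a private X display. The display number `:99` (as if started with Xvfb) and the test class name are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical sketch: isolate each test run in its own HOME, tmpdir, and
# DISPLAY so one run cannot pollute the next.
def isolated_java_cmd(test_main: str):
    run_dir = Path(tempfile.mkdtemp(prefix="testrun-"))
    (run_dir / "home").mkdir()
    (run_dir / "tmp").mkdir()
    cmd = [
        "java",
        f"-Duser.home={run_dir / 'home'}",
        f"-Djava.io.tmpdir={run_dir / 'tmp'}",
        test_main,
    ]
    env = {"DISPLAY": ":99"}  # a private X server, e.g. started via Xvfb :99
    return cmd, env

cmd, env = isolated_java_cmd("SomeSmokeTest")
```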
Distributed Builds
Unless you can guarantee that all systems used are producing the exact same binary bits, in my opinion, distributed builds are unpredictable and therefore unreliable. A developer might be willing to accept this potential risk, but a build and test system cannot, unless it has extremely tight control over the systems in use. It has been my experience that parallel compilation (GNU make -j N) on systems with many CPUs is a much preferred and more reliable way to speed up builds.
However, if there are logically separate builds or sub-builds that can be distributed to different systems, that makes a great deal of sense. Having the debug, fastdebug, and product builds done on separate machines is a big win. Cutting up the product build can create difficult logistics in terms of pulling it all together into the final build image.
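The preferred single-machine speedup above is just one make invocation with a job count sized to the machine. A minimal sketch; the fallback of 4 jobs is an arbitrary assumption.

```python
import os

# Sketch: speed up the build with parallel compilation on one well-controlled
# machine (GNU make -j N) rather than distributing compiles across systems.
jobs = os.cpu_count() or 4   # fall back to 4 if the CPU count is unknown
make_cmd = ["make", f"-j{jobs}"]
```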
Distributed Testing
Having one large testbase, one large testlist, and one very long testrun that can take multiple hours is not ideal. Generally, you want to make the testbase easy to install anywhere, and create batches of tests, so that using multiple machines of the same OS/arch allows the tests to be run in a distributed way, getting the same testing done in a fraction of the time of running one large batch. If the batches are too small, you spend more time on test setup than on running the tests. The goal should be to run the smoke tests as fast as possible in the most efficient way possible, and more systems to test with should translate into the tests getting done sooner. This also allows for new smoke test additions as the tests run faster and faster. Unlike distributed building, the testing is not creating the product bits, and even if the various machines used are slightly different, that comes closer to matching the real world anyway. Ideally you would want to test all possible real world configurations, but we all know how impractical that is.
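The batching trade-off can be sketched as a simple chunking of the testlist, with batch size as the tunable: too small and setup time dominates, too large and you lose parallelism. The batch size and test names below are made up.

```python
# Hypothetical sketch: split one long testlist into batches that several
# machines of the same OS/arch can run in parallel.
def make_batches(tests, batch_size):
    """Chunk the testlist into batches of at most batch_size tests."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [tests[i:i + batch_size] for i in range(0, len(tests), batch_size)]

tests = [f"test{i:03d}" for i in range(10)]   # made-up testlist
batches = make_batches(tests, 4)              # 3 batches: 4 + 4 + 2 tests
```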
Killing Builds and Tests
At some point, you need to be able to kill off a build or test, probably many builds and many tests on many different systems. This can be easy on some systems, and hard on others. Using virtual machines or ghosting of disk images provides a chance of just doing system shutdowns and restarts with a pristine state, but that's not simple logic to get right for all systems.
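One portable trick on POSIX systems is to start each build or test in its own process group, so the whole tree (driver, compilers, child JVMs) can be killed with a single signal. A sketch; the `sleep 60` child merely stands in for a long-running build.

```python
import os
import signal
import subprocess
import time

# Sketch: run the "build" in its own session/process group so the entire
# process tree can be signalled at once. `sleep 60` stands in for a build.
build = subprocess.Popen(["sleep", "60"], start_new_session=True)
time.sleep(0.5)                        # give the child time to start
os.killpg(build.pid, signal.SIGTERM)   # one signal reaches the whole group
build.wait()
print("build tree killed, returncode:", build.returncode)
```

With `start_new_session=True` the child's process-group id equals its pid, so `os.killpg(build.pid, ...)` reaches every descendant that stayed in the group.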
Automated System Updates
Having systems do automatic updates while builds and tests are running is insane. The key to a good build and test system is reliability, you cannot have that if the system you are using is in a constant state of flux. System updates must be contained and done on a schedule that prevents any disturbance to the build and tests going on in the systems. It is completely unacceptable to change the system during a build or test.
Anti-Virus Software
AV software can be extremely disruptive to the performance of a build and test system. It is important, but it must be deployed in a way that preserves the stability and reliability of the build and test system. Dynamic AV scanning is a great invention, but it has the potential to disturb build and test processes in very negative ways.
Hopefully this blabbering provided some insights into the world of JDK build and test.