Wednesday May 25, 2011

Build Infrastructure Project

So the Build Infrastructure Project for OpenJDK has finally gotten started. Please go to the project page for details; this is just some ramblings and babbling that people may find interesting. Hopefully it does not contain too many outright lies and mistakes, but if there are any, they belong to me and me alone. Most importantly:

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

So what is this Build Infrastructure project all about? Does "makefiles".equals("BuildInfrastructure")? No, but close, maybe "BuildInfrastructure".startsWith("makefiles")&&"BuildInfrastructure".endsWith("makefiles") :^)

For a long time now, most changes to the JDK makefiles and build process have been evolving slowly, some may say glacially, and they have certainly been fragmented. :^( I've been involved for many years now in trying to make simplifications and changes to the original Sun JDK makefiles, first via the Peabody (JRL) project, and then OpenJDK.

I can't speak to the very original JDK makefiles (JDK 1.1), but from where I entered the picture (and this is just my opinion), the makefiles mostly served the Release Engineering (RE) team for building the product. Developers just had to navigate around and find useful patterns that allowed them to get their job done, often building small parts of the JDK any way they could (and avoiding complete JDK builds at all costs). The important builds came from RE, and as long as their builds were successful, always from scratch, all was well with the world. But the developers suffered from:

  • No good or reliable incremental builds
  • Slow builds of Java source
  • Incorrect assumptions about *.java timestamps and what needs to be compiled
  • Implicit compilations by javac confusing matters
  • The "one pile" of "classes/" build scheme (related to the implicit issue)
  • Poor automation around source list selections and when and how javah is run
  • Like plastic bags, it was important to avoid sticking your head into the makefiles too completely.

Multiple events happened over time that triggered evolutionary changes to the JDK build processes. I don't have actual dates for these, but you get the general idea:

  • Hotspot, fastdebug builds, plug&play shared libraries (removal of java_g)

    These may seem unrelated, but when the Hotspot team started building "fastdebug" VM libraries (using -g -O and including assertions) that could just be plugged into any JDK image, that was a game changer. It became possible to plug&play native components when building this way, instead of the java_g builds where all the components had to be built the same way: an all-or-nothing run environment that was horribly slow and limiting. So we tried to create a single build flow, with just variations on the builds (I sometimes called them "build flavors" of product, debug, and fastdebug). Of course, at the same time, the machines got faster, and perhaps now using a complete debug build of a JDK makes more sense? In any case, that fastdebug event influenced matters. We do want different build flavors, but we don't want separate make logic for them.

  • Ant scripts

    Ant scripts seem to create intense discussions. I originally found Ant tolerable, and actually pretty efficient for small vanilla Java projects. The fact that IDEs (like NetBeans) worked so well with them also made them interesting, so I jumped on board and let the ants crawl all over me; or rather, my manager had me work on the JavaFX Ant scripts and they crawled all over me. :^( (Trust me, large disjoint sets of Ant scripts are very hard to manage.) Over time, I discovered that the Ant "<parallel>" capabilities are not useful, that the XML scripting is verbose and difficult to use once you go beyond that simple Java project model, and that I have probably tracked down more issues involving Ant and Ant scripts since they were introduced into the JDK build than any other change to the build process. Ant can be very tricky. It is a Java app, which is kind of cool, but when you are actually building a JDK, it becomes more of an annoyance. I have always resisted any situation where Make runs Ant which runs Make, or where Ant runs Make which runs Ant; I figured my sanity was more important. So the bottom line is: I think we need a better answer, and it probably looks more like Make than Ant, and will likely not even include Ant. Sorry if that offends the Ant lovers. :^(

  • Peabody (JRL) builds and the need to build on a wider variety of platforms, and by a larger population of developers.

    Before Peabody and the JRL source release, the number of developers was limited, and the fact that the JDK was so difficult to build wasn't as big a factor. A developer would spend a day or so getting his system ready to do a build, and then would never have to look at it again until he had a new system. It was a one-time build setup, per developer. But as the number of people wanting to build the JDK exploded (note I said "build", not "develop"), it was obvious that the complex build setups were more of a problem. In addition, the antiquated Linux/Solaris/Windows systems used previously did not match what this new crop of JDK builders had access to. The various makefile sanity checks that worked so well for RE now needed to become warnings instead of fatal build errors.

  • OpenJDK, and a further explosion of developers and builders of the source.

    These same build issues continued with OpenJDK. We tried to make life easier and provided the README-builds.html document to try to help guide people. But we knew, and we were being told day after day, that it was not as good as it could be.

  • Separate repos for hotspot, langtools, jaxp, jaxws, corba, ...

    The fact that we used nested repositories has been an issue from day one. The isolation it provides for various teams, and the extra dimension it adds to distributed development, does have its benefits, but it also comes with some issues. Many tools that say they "support Mercurial" don't actually support nested repositories, and without using something like the Mercurial "subrepos" functionality, coordinating changesets between repositories is still an issue. The Forest Extension upon which we relied has never become any kind of official extension and continues to be problematic. For now we are sticking with nested repositories, but expect some changes in this area in the future.

  • The jaxp and jaxws source drop bundles

    These source drops served their purpose, but we need a better answer here. They add a complexity to the build that needs to be fixed.

  • The complications of building a product from open and closed sources

    This is unique to our JDK builds that use all or most of the OpenJDK sources but then augment the build with additional closed sources. We have tried to keep this from impacting the actual overall build process; this is our pain, but it certainly impacts what we do in the OpenJDK from time to time.

  • And in comes GNU make's "$(eval)" function.

    At the time I thought this feature was significant, so I did some experiments and discovered that it only worked well with GNU make 3.81, while we seemed to be stuck on GNU make 3.78.1 or 3.79, and in some cases people were using GNU make 3.80. The $(eval) function allows a value to be evaluated and parsed again as make syntax. This might not seem like a sane or important feature, but it actually is a very special and powerful one, so an effort was made to get us up-to-date and working well on GNU make 3.81. In any case, this turned out to be a critical feature for what people will see moving forward in the Build Infrastructure Project.
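To give a feel for why $(eval) matters, here is a minimal, hedged sketch (the file name, variable names, and flavors are made up for illustration; it assumes GNU make 3.81+ is on the PATH as "make"). One template stamps out per-flavor make variables, instead of copying the same logic for every build flavor:

```shell
# Write a tiny makefile that uses $(eval) to expand one template per
# build flavor, then run it.  All names here are hypothetical.
cat > /tmp/eval-demo.mk << 'EOF'
# Template: per-flavor variables (could just as well be whole rules)
define FlavorVars
$(1)_OUTPUTDIR := build/$(1)
endef

FLAVORS := product fastdebug debug

# Expand the template once per flavor; $(eval) re-parses the result
# as make syntax, defining product_OUTPUTDIR, fastdebug_OUTPUTDIR, ...
$(foreach f,$(FLAVORS),$(eval $(call FlavorVars,$(f))))

all: ; @echo $(fastdebug_OUTPUTDIR)
EOF
make -f /tmp/eval-demo.mk all
```

The attraction is that one template can generate whole variable sets or rule sets at parse time, which is exactly the kind of thing that keeps build flavors from needing separate make logic.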

So this is what we need to keep in mind:

  • Different build flavors, same build flow
  • The ability to use 'make -j N' on large multi-CPU machines is critical, as is being able to quickly and reliably get incremental builds done. This means:
    • target dependencies must be complete and accurate
    • nested makes should be avoided
    • Ant scripts should be avoided for multiple reasons (Ant is a form of nested make), but we need to allow for IDE builds at the same time
    • rules that generate targets will need to avoid timestamp changes when the result has not changed
    • Java package compilations need to be made parallel, and we also need to consider some kind of javac server setup (something that was talked about a long time ago)
  • Continued use of different compilers: gcc/g++ (various versions), Sun Studio (various versions), and Windows Visual Studio (various versions)
  • Allow for clean cross compilation; this means making sure we just build it and do not run it as part of the build
  • Nested repositories need to work well, so we need a way to share common make logic between repositories
  • The build dependencies should be managed as part of the makefiles
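One of the bullets above, avoiding timestamp changes when a generated result has not changed, deserves a concrete sketch. This is a generic idiom, not code from the project (the file name and generator function are made up): generate into a temporary file, and only replace the real target when the content differs, so downstream dependencies are not rebuilt for nothing.

```shell
# Hypothetical generator standing in for javah, version stamping, etc.
generate_header() { echo '#define BUILD_FLAVOR "product"'; }

# Generate to a temp file first, never directly onto the target.
generate_header > flavor.h.tmp
if cmp -s flavor.h.tmp flavor.h 2>/dev/null ; then
  rm -f flavor.h.tmp            # unchanged: keep the old timestamp
else
  mv flavor.h.tmp flavor.h      # changed (or new): update the target
fi
```

Run from a make rule, this keeps 'make -j N' incremental builds honest: targets only look newer when they actually are.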

So as the Build Infrastructure Project progresses, expect some revolutionary changes.

-kto

Thursday Sep 23, 2010

JDK7: Java vendor property changes

Very soon now, the OpenJDK7 sources (including the source bundles) and the JDK7 binary bundles will include changes to the values of the various "vendor" java properties and also the Windows COMPANY file properties.

If you have code that depends on any of these settings starting with "Sun", now might be the time to include "Oracle" as part of that comparison so that your use of OpenJDK7 or JDK7 builds will operate properly.

Specifically, these Java system properties currently contain:

java.vendor = Sun Microsystems Inc.
java.vendor.url = http://java.sun.com/
java.vm.vendor = Sun Microsystems Inc.
java.specification.vendor = Sun Microsystems Inc.
java.vm.specification.vendor = Sun Microsystems Inc.

And will soon contain:

java.vendor = Oracle Corporation
java.vendor.url = http://java.oracle.com/
java.vm.vendor = Oracle Corporation
java.specification.vendor = Oracle Corporation
java.vm.specification.vendor = Oracle Corporation
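For scripts that compare these values, the safest path is a tolerant check that accepts both the old and new strings. A minimal sketch (the vendor value here is hard-coded for illustration; in practice it would be read from the java.vendor system property of the JDK being used):

```shell
# Hypothetical value as it would be read from java.vendor on a JDK7 build
vendor="Oracle Corporation"

# Accept both the pre- and post-change vendor strings
case "${vendor}" in
  "Sun Microsystems Inc." | "Oracle Corporation")
    echo "recognized vendor" ;;
  *)
    echo "unrecognized vendor: ${vendor}" ;;
esac
```

The same both-values approach applies to code written in Java that does a String comparison against these properties.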

In addition, on Windows, the DLL and EXE files will have their COMPANY file property changed to match the java.vendor property value.

There are other vendor properties throughout the JDK7 implementation that will also be changing, but it may take some time to complete all the other changes.

NOTE: this only applies to JDK7 and newer releases.

-kto

Thursday Dec 10, 2009

Mercurial Forest: Pet Shell Trick of the Day

Now be careful with this, but here is a simple bash shell script that will do Mercurial hg commands on a forest pretty quickly. It assumes that the forests are no deeper than 3 levels, e.g. */*/*/.hg. Every hg command is run in a parallel shell process, so very large forests might be dangerous; use at your own risk:

#!/bin/sh

# Shell script for a fast parallel forest command

tmp=/tmp/forest.$$
rm -f -r ${tmp}
mkdir -p ${tmp}

# Remove tmp area on any termination, normal or abnormal
trap 'rm -f -r ${tmp}' EXIT

# Only look in specific locations for possible forests (avoids long searches)
hgdirs=`ls -d ./.hg ./*/.hg ./*/*/.hg ./*/*/*/.hg 2>/dev/null`

# Derive repository names from the .hg directory locations
repos=""
for i in ${hgdirs} ; do
  repos="${repos} `echo ${i} | sed -e 's@/.hg$@@'`"
done

# Echo out what repositories we will process
echo "# Repos: ${repos}"

# Run the supplied command on all repos in parallel, save output until end
n=0
for i in ${repos} ; do
  n=`expr ${n} '+' 1`
  (
    cline="hg --repository ${i} $*"
    echo "# ${cline}"
    eval "${cline}"
    echo "# exit code $?"
  ) > ${tmp}/repo.${n} 2>&1 && cat ${tmp}/repo.${n} &
done

# Wait for all hg commands to complete
wait

# Cleanup
rm -f -r ${tmp}

# Terminate with exit 0 all the time (hard to know when to say "failed")
exit 0

Run it like: hgforest status, or hgforest pull -u.

As always, should any member of your IMF forest be caught or killed, the secretary will disavow all knowledge of your actions. This tape will self-destruct in five seconds. Good luck Jim.

-kto

Wednesday Nov 25, 2009

Faster OpenJDK Build Tricks

Here are a few tips and tricks to get faster OpenJDK builds.

  • RAM

    RAM is cheap; if you don't have at least 2GB of RAM, go buy yourself some RAM for Xmas. ;^)

  • LOCAL DISK

    Use local disk if at all possible; the difference in build time is significant. This mostly applies to the repository or forest you are building (and where the build output is also landing), and, to a lesser degree, frequently accessed items like the boot JDK (ALT_BOOTDIR). Local disk is your best choice, and where available, /tmp on some systems is even better.

  • PARALLEL_COMPILE_JOBS=N

    This make variable (or environment variable) is used by the jdk repository and should be set to the number of native compilations that will be run in parallel when building native libraries. It does not apply to Windows. This is a very limited use of the GNU make -j N option in the jdk repository, only addressing the native library building. A recommended setting is the number of CPUs you have, or a little higher, but no more than 2 times the number of CPUs. Setting this too high can tend to swamp a machine. If all the machines you use have at least 2 CPUs, a standard value of 4 is a reasonable setting. The default is 1.

  • HOTSPOT_BUILD_JOBS=N

    Similar to PARALLEL_COMPILE_JOBS, this one applies to the hotspot repository; however, hotspot uses the GNU make -j N option at a higher level in the Makefile structure. Since more makefile rules are impacted by this setting, there may be a higher chance of a build failure using HOTSPOT_BUILD_JOBS, although reports of problems have not been seen for a while. It also does not apply to Windows. A recommended setting is the same as PARALLEL_COMPILE_JOBS. The default is 1.

  • NO_DOCS=true

    Skip the javadoc runs unless you really need them.
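Putting the tips above together, one way to pick a job count is to derive it from the CPU count, staying within the suggested bound of twice the number of CPUs. A small sketch (the getconf spelling varies by OS; this form is common on Linux and Solaris, and the fallback of 1 is a made-up safe default):

```shell
# Number of online CPUs, falling back to 1 if getconf is unavailable
cpus=`getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1`

# Upper bound recommended above: no more than 2x the CPU count
max=`expr ${cpus} \* 2`

echo "Using PARALLEL_COMPILE_JOBS=${cpus} (upper bound ${max})"
```

A build could then be started as, for example, gnumake PARALLEL_COMPILE_JOBS=${cpus} HOTSPOT_BUILD_JOBS=${cpus} NO_DOCS=true, with the variables behaving as described above.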

Hope someone finds this helpful.

-kto

Friday Aug 14, 2009

Warning Hunt: Hudson and Charts on FindBugs, PMD, Warnings, and Test Results

"Drink, ye harpooners! drink and swear, ye men that man the deathful whaleboat's bow -- Death to Moby Dick!"

Once again, my sincere apologies to the Whales and Whale lovers out there, just remember Moby Dick is just a fictional story and I am in no way condoning the slaughter of those innocent and fantastic animals.

Hudson and Trends

Using Hudson to watch or monitor trends in your projects is fantastic. Although my graphs in this example are not that interesting, you can see how this could help in keeping track of recent test, warnings, FindBugs, and PMD trends:

This blog entry demonstrates how you can reproduce what you see above.

To get Hudson and start it up, do this:

  1. Download Hudson (hudson.war)
  2. In a separate shell window, run:
    java -jar hudson.war

Congratulations, you now have Hudson running. Go to http://localhost:8080/ to view your Hudson system. Initially you need to add some Hudson plugins:

  1. Select Manage Hudson
  2. Select Manage Plugins
  3. Select Available
  4. Select the check boxes for the following plugins: Findbugs Plugin, Mercurial Plugin, and PMD Plugin
  5. Select Install at the bottom of the page.
  6. Select Restart Now

We need a small Java project to demonstrate the charting, one that has been set up to create the necessary charting data. You will need Mercurial installed, so if hg version doesn't work, get one of the Mercurial binary packages installed first. I will use this small jmake project from kenai.com. So get yourself a Mercurial clone repository with:

hg clone https://kenai.com/hg/jmake~mercurial ${HOME}/jmake

and verify you can build it from the command line first:

cd ${HOME}/jmake
ant import
ant

The ant import will download junit, findbugs, and pmd. Access to junit, findbugs, and pmd could be handled in a variety of ways; this import mechanism just happens to be the way this ant script is set up.

This jmake build.xml script is already set up to create the necessary xml files that Hudson will chart for junit, findbugs, and pmd. You may want to look at the build.xml file, specifically the lines at the end of the file, to see how this is done. Your build.xml needs to create the raw data (the xml files) for the charting to work. This build.xml file happens to be a NetBeans-configured ant script, so it can be used in NetBeans too.

Now let's create a Hudson job to monitor your jmake repository. Go to http://localhost:8080/ and:

  1. Select New Job
  2. Name the job, I'll use "jmake".
  3. Select Build a free-style software project.
  4. Select OK.

Next we need to configure the job. You should be sitting in the job configuration page now; this is what you need to do:

  1. Under Source Code Management, select Mercurial and enter the path ${HOME}/jmake.
  2. Under Build Triggers select Poll SCM and enter */5 * * * * so that the repository is checked every 5 minutes.
  3. Under Build select Add Build Step and Invoke Ant, add import as the ant target.
  4. Select Add Build Step again and Invoke Ant, leave the ant target blank.
  5. Select Publish JUnit test report result and provide the path build/test/results/TESTS-*.xml.
  6. Select Scan for compiler warnings.
  7. Select Publish PMD analysis report and provide the path build/pmd.xml.
  8. Select Publish Findbugs analysis report and provide the path build/findbugs.xml.
  9. Finally hit the Save at the bottom of the job configuration page.

Lastly, you need to use Build Now a few times to build up some build history before the charts show up.

This Hudson job will automatically run when changes are added to the repository.

Hope this helps. Using Hudson for this is a definite plus; everyone should invest some time in setting these kinds of systems up.

-kto

About

Various blogs on JDK development procedures, including building, build infrastructure, testing, and source maintenance.
