Wednesday Nov 25, 2009

Faster OpenJDK Build Tricks

Here are a few tips and tricks to get faster OpenJDK builds.

  • RAM

    RAM is cheap, if you don't have at least 2Gb RAM, go buy yourself some RAM for Xmas. ;\^)

  • LOCAL DISK

    Use local disk if at all possible, the difference in build time is significant. This mostly applies to the repository or forest you are building (and where the build output is also landing). Also, to a lesser degree, frequently accessed items like the boot jdk (ALT_BOOTDIR). Local disk is your best choice, and if possible /tmp on some systems is even better.

  • PARALLEL_COMPILE_JOBS=N

    This make variable (or environment variable) is used by the jdk repository and should be set to the number of native compilations that will be run in parallel when building native libraries. It does not apply to Windows. This is a very limited use of the GNU make -j N option in the jdk repository, only addressing the native library building. A recommended setting would be the number of cpus you have or a little higher, but no more than 2 times the number of cpus. Setting this too high can tend to swamp a machine. If all the machines you use have at least 2 cpus, using a standard value of 4 is a reasonable setting. The default is 1.

  • HOTSPOT_BUILD_JOBS=N

    Similar to PARALLEL_COMPILE_JOBS, this one applies to the hotspot repository, however, hotspot uses the GNU make -j N option at a higher level in the Makefile structure. Since more makefile rules get impacted by this setting, there may be a higher chance of a build failure using HOTSPOT_BUILD_JOBS, although reports of problems have not been seen for a while. Also does not apply to Windows. A recommended setting would be the same as PARALLEL_COMPILE_JOBS. The default is 1.

  • NO_DOCS=true

    Skip the javadoc runs unless you really need them.

Hope someone finds this helpful.

-kto

Thursday Oct 08, 2009

Teamware/SCCS history conversion to Mercurial

Teamware/SCCS history conversion to Mercurial

Originally posted back in December 2007, I've added some new references and some possible strategies, at the end.

Silver Falls, Oregon. No, it doesn't use Mercurial, yet.

Just a few notes on converting source file change history from Teamware/SCCS to Mercurial. These are just notes because in the JDK area and in any Teamware JDK workspaces being converted, we don't plan on converting the old source change history into Mercurial. The major reason why we aren't is a legal issue, and you can imagine what the legal issues are with regards to non-open sources that become open. I won't get into that. But there are some technical issues too, which I will try and cover in case someone decides to attempt such a conversion.

Why convert the revision history?

The complete source history is an extremely valuable asset, being able to know when and who made a change years ago is often essential to understanding a problem in a product. Initially we wanted to preserve this source change history and assumed it wasn't a difficult job. Most engineers have been upset that our current plans don't include this history conversion, but read on if you are curious as to the problems encountered.

The Basic Idea

The basic idea in doing an 'ideal' source history conversion would be to create a Mercurial changeset for each Teamware putback. That means you need to identify the putback event, the specific SCCS revisions of the files, and any file renames or deletes. And each changeset is built upon the previous changeset, so the ordering of the changesets is critical here.

Sounds simple right? Well, read on, it's not so simple.

The Problems

History Files: You need to understand how the Teamware history file works. The Codemgr_wsdata/history file in a workspace does not propagate, so the specifics on a putback won't percolate around your tree of workspaces. This means that each workspace has a history of the Teamware events that happened to it, but not the details of anything that happened to the other workspaces. So to get accurate Teamware history you need the entire tree of integration workspaces (any workspaces that might be the target of a putback) and all that ever existed, then you'd need to fold all the events in these history files together in the proper time order. So the more complicated the Teamware workspace integration tree, the more difficult this task becomes. The JDK workspaces (there are many different workspaces) each have 6-20 different integration workspace instances, and some of these workspaces go back quite a few years, so we are talking some major source change history here.

SCCS file revision numbers: The details in the Teamware history will just list the files involved in a putback or bringover, not the specific SCCS Rev numbers for the files. So matching up the specific SCCS Revs on files to the specific putback event that putback these SCCS revisions is not trivial. (I think there may have been an option in Teamware to record the SCCS revision numbers in the workspace history file, but it is off by default, which is shame). So to create a nice neat Mercurial changeset means you need to somehow match up the filelists and timestamps of the putbacks with the individual SCCS revision numbers of source files. Unfortunately, the SCCS files record a time but no timezone, so if anyone decides to do this kind of history conversion will need to have lots of fuzzy timestamp logic to match up the right SCCS revisions with the putbacks. The username is included in the Teamware history file and the SCCS revisions, so that may also help, except that often an integrator of changes isn't the same person that did the SCCS revision.

SCCS Revision Tree: The SCCS revision tree for each file can be fairly complex graph, depending on how many file merges happened to the file. You might be able to just use the top level SCCS revision number, but information in the SCCS comments of the other revisions will contain important information to preserve.

Deleted files: Teamware deletes files by moving them from where they are to a top level deleted_files directory. So they don't really get deleted, just renamed. However, a common practice with many teams is to purge the deleted_files directory once a product reaches a major milestone. So some of the files may actually be permanently gone, and this needs to be taken into consideration. At some point, you can't recreate the complete source tree if this has happened.

Empty comments: Empty SCCS revision comments, and empty putback comments would also create problems if you planned on using these comments or cookies of information in these comments to connect up the files to the putback events (e.g. bug id numbers or change request numbers). So more specific SCCS revision comments and more specific putback comments might make this job easier.

Approaches

We considered multiple approaches to doing a source revision history conversion. You could come at it from the putback events, using the history files to identify 'real' changesets, and hope any deleted files are still around. What you'd use as Revision 1 of the files might be a little tricky. Or you could try and just look at the SCCS revisions, and figure out via timestamps, usernames, and perhaps SCCS comments, which files were changed as a group. Or a combination of both. Or you could try to come at if from a time perspective, e.g. all the changes to get you from April 1, 2004 to May 1, 2004.

The simple approach of one changeset per SCCS revision isn't really that simple because Mercurial changesets have an order to them. To do it right you'd need to view the Teamware workspace as a large graph of file nodes, with small sub-graphs of SCCS revisions. Then pick a time T to start Revision 1 of the Mercurial sources, find all the file instances at time T, add these files as a changeset to Mercurial, then repeat that for T+1. Or perhaps T+N where N is selected based on sampling timestamps after T for a quiet time (to avoid picking a time that might split up file changes that happened in a group). Just some wild ideas.

But it just feels wrong, no putback data, the files won't be bunched right, and the resulting repository would contain inaccurate source state in any of these converted changesets.

We never fully explored all the approaches because once the legal constraint came in, there seemed no need to pursue it. It's an interesting and complicated problem, but ultimately one we decided we didn't have to solve.

Conclusion

So the bottom line is that whatever can be created would likely have questionable data if someone asked to have the sources per a particular date or if they wanted to know the state of the entire source tree when a given change was made... Hard to ever be perfect here, and not being perfect could send a few engineers down some deep rabbit holes. :\^(

The old history isn't being destroyed, it's just being left in the old Teamware workspaces. So we will still have access to it, just not via Mercurial repositories. As time passes, we'll build up new and better history in our Mercurial repositories, and maybe by the time I retire, it won't matter much. ;\^)

Update: Some Ideas

Jesse's conversion script turns out to be a possibility. He documents the problems with it, but it's certainly a step toward something.

With the OpenJDK6 repositories which were originally in TeamWare, we had two ways to gain some history. With each build promotion while in TeamWare, we saved a source bundle, so we had a raw snapshot of the source for each build. By using these as potential working set files, this allowed us to start rev0 with Build 0 source bundles, then for each build promoted after that, repeat the steps:

  1. Delete the working set files
  2. Copy in new working set files from the source bundle for Build promotion N.
  3. Run: hg addremove ; hg commit --message "Build Promotion N" ; hg tag BuildN
This provides a large grain history, not great, but could be very valuable to narrow down when a change came in. Adding in more specific history required patch files that you would apply in between, but you needed the patches, you needed to know what Build a change went in, and most critically, you needed to know the order of the patches or changes in case two fixes modified the same files. Ultimately, it worked for OpenJDK6, to a degree. The Build Promotion revisions were accurate, but sometimes getting the others accurate was hard to do. And unfortunately, all the changesets in OpenJDK6 look like they were created by me, which is right in a way, but I really wasn't responsible for many of the patches. So the authors, dates, and SCCS comments were not included, but the bugids were.

Anyway, just thought I would update this rather old posting.

-kto

Friday Oct 02, 2009

JavaFX Survey

Are you a JavaFX user? If so, please take a short JavaFX survey and tell us what you think.

-kto

Wednesday Sep 30, 2009

Parfait: Finding more bugs

The Parfait tool checks C/C++ source code for common systems and security bugs. The OpenSolaris project is using it successfully, and we are currently investigating it's use with the OpenJDK project. Dr. Cristina Cifuentes has more information on "Building a Better Bug-Checker". More publications are available at the Sun Labs site.

If we could only get rid of all compiler warning errors, findbugs errors, and Parfait errors, the world would be squeaky clean. ;\^)

-kto

Friday Aug 14, 2009

Warning Hunt: Hudson and Charts on FindBugs, PMD, Warnings, and Test Results

"Drink, ye harpooners! drink and swear, ye men that man the deathful whaleboat's bow -- Death to Moby Dick!"

Once again, my sincere apologies to the Whales and Whale lovers out there, just remember Moby Dick is just a fictional story and I am in no way condoning the slaughter of those innocent and fantastic animals.

Hudson and Trends

Using Hudson to watch or monitor trends in your projects is a fantastic tool, although my graphs in this example are not that interesting, you can see how this could help in keeping track of recent test, warnings, findbugs, and pmd trends:

This blog should demonstrate how you can reproduce what you see above.

To get Hudson and start it up, do this:

  1. Download Hudson (hudson.war)
  2. In a separate shell window, run:
    java -jar hudson.war

Congratulations, you now have Hudson running. Now go to http://localhost:8080/ to view your Hudson system. Initially you need to add some Hudson plugins, so you need to:

  1. Select Manage Hudson
  2. Select Manage Plugins
  3. Select Available
  4. Select the check boxes for the following plugins: Findbugs Plugin, Mercurial Plugin, and PMD Plugin
  5. Select Install at the bottom of the page.
  6. Select Restart Now

We need a small Java project to demonstrate the charting, and one that has been setup to create the necessary charting data. You will need Mercurial installed, so if hg version doesn't work, get one of the Mercurial binary packages installed first. I will use this small jmake project from kenai.com. So get yourself a mercurial clone repository with:

hg clone https://kenai.com/hg/jmake~mercurial ${HOME}/jmake
and verify you can build it from the command line first:
cd ${HOME}/jmake
ant import
ant
The ant import will download junit, findbugs, and pmd. Access to junit, findbugs, and pmd could be done in a variety of ways, this import mechanism is just happens to be the way this ant script is setup.

This jmake build.xml script is already setup to create the necessary xml files that Hudson will chart for junit, findbugs, and pmd. You may want to look at the build.xml file, specifically the lines at the end of the file for how this is done. Your build.xml needs to create the raw data (the xml files) for the charting to work. This build.xml file happens to be a NetBeans configured ant script so it can be used in NetBeans too.

Now let's create a Hudson job to monitor your jmake repository. Go to http://localhost:8080/ and:

  1. Select New Job
  2. Name the job, I'll use "jmake".
  3. Select Build a free-style software project.
  4. Select OK.

Next we need to configure the job. You should be sitting in the job configuration page now, this is what you need to do:

  1. Under Source Code Management, select Mercurial and enter the path ${HOME}/jmake.
  2. Under Build Triggers select Poll SCM and enter \*/5 \* \* \* \* so that the repository is checked every 5 minutes.
  3. Under Build select Add Build Step and Invoke Ant, add import as the ant target.
  4. Select Add Build Step again and Invoke Ant, leave the ant target blank.
  5. Select Publish JUnit test report result and provide the path build/test/results/TESTS-\*.xml
  6. Select Scan for compiler warnings.
  7. Select Publish PMD analysis report and provide the path build/pmd.xml.
  8. Select Publish Findbugs analysis report and provide the path build/findbugs.xml
  9. Finally hit the Save at the bottom of the job configuration page.

Lastly, you need to use the Build Now a few times to build up some build history before the charts show up.

This Hudson job will automatically run when changes are added to the repository.

Hope this helps. Using Hudson for this is a definite plus, everyone should invest some time in setting these kind of systems up.

-kto

About

Various blogs on JDK development procedures, including building, build infrastructure, testing, and source maintenance.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today