Testing javac: more, faster

Every now and then [1,2], it is fun to look back at how far we've come in improving javac testing, at least within the main langtools unit and regression test suite.

Here are some numbers I collected recently for the "state of the art" testing as it was for JDK 5, 6, 7 and now for JDK 8. The numbers were gathered by running the javac tests for each release on the corresponding JDK build. For each run, the following information was recorded:

  • The number of javac tests
  • The elapsed time of the test run
  • The number of invocations of javac, as measured by the number of JavaCompiler objects that were created
  • An indication of the number of unit tests, as measured by the number of Context objects that were created

The runs were all done on the same hardware (a medium-sized server, running Ubuntu 12.04), which had no other jobs running on it at the time. The same version of jtreg was used for all the test runs. The tests in JDK 5 were not set up to work with jtreg's "samevm" mode and so had to be run in the slower "othervm" mode, in which each compilation and test execution creates a new JVM. All the tests were run serially, i.e. without using jtreg features for executing tests concurrently. Also, I make no claims that these experiments were done in a carefully controlled, scientific fashion.

                                                                   Release/build
Description                          Notes                      5u51       6u51       7u25       8-b92
exec mode                                                       othervm    samevm     samevm     samevm
elapsed time (mm:ss.00)              same hardware, same jtreg  08:12.64   03:02.82   05:29.83   12:19.22
%CPU                                                            145%       171%       217%       492%
#javac tests                         in test/tools/javac/       845        1140       1705       2436
#compilations                        #JavaCompiler.<init>       848        1594       41922      381301
#contexts (indicative of unit tests) #Context.<init>            850        1790       271642     591950
average time per test                                           00:00.583  00:00.160  00:00.193  00:00.303
average time per compilation                                    00:00.581  00:00.115  00:00.008  00:00.002
average compilations per test                                   1.00       1.40       24.59      156.53
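
The last three rows of the table are derived from the measured rows above them. As a minimal sketch of that arithmetic (the class and method names here are made up for illustration and are not part of javac or jtreg), the averages can be recomputed like this:

```java
import java.util.Locale;

/** Recomputes the derived rows of the table from its measured rows. */
final class JavacRunStats {

    /** Converts an elapsed time written as mm:ss.00 into seconds. */
    static double toSeconds(String elapsed) {
        String[] parts = elapsed.split(":");
        return Integer.parseInt(parts[0]) * 60 + Double.parseDouble(parts[1]);
    }

    /** Average seconds per item (test or compilation) for one run. */
    static double secondsPer(String elapsed, int count) {
        return toSeconds(elapsed) / count;
    }

    /** Average number of compilations per test for one run. */
    static double compilationsPerTest(int compilations, int tests) {
        return (double) compilations / tests;
    }

    public static void main(String[] args) {
        // Measured values copied from the table.
        String[] release = {"5u51", "6u51", "7u25", "8-b92"};
        String[] elapsed = {"08:12.64", "03:02.82", "05:29.83", "12:19.22"};
        int[] tests = {845, 1140, 1705, 2436};
        int[] compilations = {848, 1594, 41922, 381301};
        for (int i = 0; i < release.length; i++) {
            System.out.printf(Locale.ROOT,
                    "%-6s %.3f s/test  %.3f s/compilation  %.2f compilations/test%n",
                    release[i],
                    secondsPer(elapsed[i], tests[i]),
                    secondsPer(elapsed[i], compilations[i]),
                    compilationsPerTest(compilations[i], tests[i]));
        }
    }
}
```

The averages are simple quotients of elapsed time and counts, so they include harness and JVM startup overhead as well as the time spent inside javac itself.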

What do the numbers show?

The simplest comparisons are between the first (5u51) and last (8-b92) columns. On the same hardware, the JDK 8 tests take 50% longer to run than the JDK 5 tests. That's a somewhat unfair comparison, since we didn't have the same hardware back in 2005. My personal recollection is that running the tests used to take nearer to half an hour. But even if you allow that the tests may take 50% longer to run, in that time we run ...
  • Nearly 3 times as many tests (2436 vs. 845)
  • 450 times as many compilations (381301 vs. 848)
  • Over 210,000 unit tests (591950-381301), compared to none before

The change in elapsed time from 5u51 (08:12.64) to 6u51 (03:02.82) is interesting and shows the benefit of changing from othervm mode to samevm mode.

The %CPU numbers are as reported by the Linux time command, and reflect the availability of multiple processors. Even in JDK 5, HotSpot was able to take advantage of multiple processors. The higher numbers in more recent releases reflect that the tests too will take advantage of more processors, especially the tests in JDK 8.

What did we do?

The improvements are the results of a number of factors, which together can be summarized as "end to end TLC, to do more, in less time".

The following list is an overview of at least some of the steps we've taken along the way:

  • Fix jtreg samevm mode.
  • Fix javac to work well with the test harness, including a special testing mode for diagnostics.
  • Reduce use of shell tests [3].
  • Improve tests by running many, sometimes thousands, of test cases within each test.
  • Optimize tests to share javac infrastructure as much as possible between test cases.
  • Add jtreg agentvm mode, which enabled concurrent test execution.
  • Add hooks into javac to facilitate better unit testing.
  • Leverage modern hardware by running test cases concurrently within a test.
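
To give a flavor of two of the items above (running many test cases within one test, and sharing javac infrastructure between them), here is a minimal sketch using only the standard javax.tools API. It is not taken from the langtools test suite: the class names ManyCasesOneTest and StringSource and the sample test cases are hypothetical, and the real tests use more elaborate frameworks and javac-internal hooks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.StandardLocation;
import javax.tools.ToolProvider;

/** Hypothetical example: many small compilations inside a single test. */
final class ManyCasesOneTest {

    /** A source file held in memory, so test cases need no input files on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Compiles each case in turn; returns the number of error diagnostics per case. */
    static Map<String, Long> run(Map<String, String> cases) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler available");
        }
        Map<String, Long> errors = new LinkedHashMap<>();
        // The compiler and file manager are created once and shared by all cases.
        try (StandardJavaFileManager fm = compiler.getStandardFileManager(null, null, null)) {
            // Class files go to a scratch directory (not cleaned up in this sketch).
            Path out = Files.createTempDirectory("many-cases");
            fm.setLocation(StandardLocation.CLASS_OUTPUT, List.of(out.toFile()));
            for (Map.Entry<String, String> c : cases.entrySet()) {
                DiagnosticCollector<JavaFileObject> diags = new DiagnosticCollector<>();
                JavaFileObject src = new StringSource("T", c.getValue());
                compiler.getTask(null, fm, diags, null, null, List.of(src)).call();
                long n = diags.getDiagnostics().stream()
                        .filter(d -> d.getKind() == Diagnostic.Kind.ERROR)
                        .count();
                errors.put(c.getKey(), n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, String> cases = new LinkedHashMap<>();
        cases.put("returns int", "class T { int f() { return 1; } }");
        cases.put("returns String", "class T { int f() { return \"s\"; } }");
        run(cases).forEach((name, n) -> System.out.println(name + ": " + n + " error(s)"));
    }
}
```

Each case here is a separate compilation, but the cost of locating the compiler and setting up the file manager is paid only once. This sketch runs its cases sequentially; running them concurrently would need care, since a shared file manager is not guaranteed to be safe for concurrent use.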

What's next?

Earlier, I said the test runs were done on a server, using othervm or samevm mode. I did one additional run, to take more advantage of the server's capabilities: this run allowed 16 tests to run concurrently. Here are the relevant rows from the previous table, extended with the data for that additional test run.

                               Release/build
Description    5u51       6u51       7u25       8-b92      8-b92
exec mode      othervm    samevm     samevm     samevm     agentvm
concurrency    1          1          1          1          16
elapsed time   08:12.64   03:02.82   05:29.83   12:19.22   02:06.91
%CPU           145%       171%       217%       492%       2198%

One might expect a bigger speedup running 16 tests in parallel, but there are a few slow-running tests that extended the total elapsed time by 46 seconds. If you set those few tests aside, the rest of the tests completed in 1:23. Whichever way you look at the numbers, Woohoo!
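
For the curious, here is a small sketch of the speedup arithmetic behind those numbers. The class name is made up for illustration, and the comparison is only indicative, since the serial JDK 8 run used samevm mode while the concurrent run used agentvm mode.

```java
import java.util.Locale;

/** Speedup of the concurrent JDK 8 run relative to the serial JDK 8 run. */
final class ConcurrencySpeedup {

    /** Converts an elapsed time written as mm:ss or mm:ss.00 into seconds. */
    static double toSeconds(String elapsed) {
        String[] parts = elapsed.split(":");
        return Integer.parseInt(parts[0]) * 60 + Double.parseDouble(parts[1]);
    }

    /** Ratio of serial elapsed time to concurrent elapsed time. */
    static double speedup(String serial, String concurrent) {
        return toSeconds(serial) / toSeconds(concurrent);
    }

    public static void main(String[] args) {
        int concurrency = 16;
        double all = speedup("12:19.22", "02:06.91");     // whole concurrent run
        double withoutSlowTests = speedup("12:19.22", "1:23"); // slow tests set aside
        System.out.printf(Locale.ROOT, "speedup: %.2fx (%.0f%% of ideal)%n",
                all, 100 * all / concurrency);
        System.out.printf(Locale.ROOT, "speedup without the slow tests: %.2fx (%.0f%% of ideal)%n",
                withoutSlowTests, 100 * withoutSlowTests / concurrency);
    }
}
```

That works out to a speedup of roughly 5.8 times for the whole run, and closer to 9 times once the few slow tests are set aside. With 16 tests in flight, a handful of long-running tests can dominate the tail of the run, which is why the overall figure falls well short of 16.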


The improvements described here are the result of the combined contributions of the many people who have worked on javac, and on testing javac, over many years, including Maurizio, Brian, Joe, Kumar, Vicente, Alex, Steve, Sue, Sonali, Matherey, and the JCK-compiler team.

1. Raising the (langtools) quality bar, Dec 2008
2. Speeding up tests again: another milestone, May 2011
3. Shelling tests, May 2013
