Musings on JDK development

Test where the failures are likely to be

There is an old joke about walking along one night and coming across someone looking down underneath a streetlight for lost keys. Stopping to help look, after a minute or two of searching you remark, "Your keys don't seem to be here. Where did you drop them?" "Well, I dropped them over in that alley, but it's way too dark to look there!"

While at Berkeley, one of the lessons I learned from Professor Kahan was "test where the failures are likely to be," which he stated much more mathematically as "seeking the singular points of analytic functions."
Especially for numerical applications, the tricky inputs to the code can differ markedly from algorithm to algorithm. For example, this was the underlying reason the Pentium FDIV bug was not caught sooner. A new SRT divider algorithm was being used, and while billions and billions of existing tests were run and looked fine, new tests targeting the new algorithm were apparently not written.
After learning of the general problem, Professor Kahan was able to write a short test program that probed at likely failure points, boundaries in a lookup table, and found incorrect quotients after executing for under a minute.

I keep Professor Kahan's advice in mind when writing regression tests for my JDK work, especially on numerics. At least on occasion, this methodology has flagged a bug unrelated to the code at hand. Tests I wrote for an initially internal "getExponent" method on floating-point numbers included checking adjacent floating-point values around each transition to the next exponent; the lucky by-catch of this was a HotSpot bug which was then corrected.
From a code coverage perspective testing at every exponent value is not needed since the code executed is the same, but such thoroughness helps provide robustness against other kinds of failures and didn't take much more time or code in this case.
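
As a rough illustration of that kind of boundary probing, here is a minimal sketch written against the public Math.getExponent(double) method. This is not the actual JDK regression test; the class name ExponentBoundaryCheck and the method countMismatches are made up for this example. For every normal exponent e it looks at 2^e and at its two floating-point neighbors.

```java
// Hypothetical sketch, not the JDK's own regression test.
final class ExponentBoundaryCheck {

    /** Counts unexpected getExponent results around each power of two. */
    static int countMismatches() {
        int mismatches = 0;
        // Normal doubles have exponents MIN_EXPONENT (-1022) through MAX_EXPONENT (1023).
        for (int e = Double.MIN_EXPONENT; e <= Double.MAX_EXPONENT; e++) {
            double power = Math.scalb(1.0, e);   // exactly 2^e
            double below = Math.nextDown(power); // largest double less than 2^e
            double above = Math.nextUp(power);   // smallest double greater than 2^e

            if (Math.getExponent(power) != e) mismatches++;
            if (Math.getExponent(above) != e) mismatches++;
            // Just below 2^e the exponent is one smaller. For e == MIN_EXPONENT the
            // neighbor is subnormal, and getExponent returns MIN_EXPONENT - 1 for it,
            // so the same expectation holds across the whole loop.
            if (Math.getExponent(below) != e - 1) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```

The loop touches only about six thousand values, so walking every exponent costs almost nothing compared with checking a handful of them.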

While the mathematics behind certain math library tests can be quite sophisticated, in some ways the structure of their input is relatively simple compared to, say, the set of legal strings to a Java compiler. In the worst case, for a single-argument floating-point method an exhaustive test "just" has to make sure each of the 2^32 or 2^64 possible inputs has the proper value. The set of possible Java programs is much, much larger and categorizing the set of notable transition points can be challenging, but looking for likely failures is still applicable and worthwhile as one aspect of testing.
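
For the float case, the exhaustive approach can be sketched by walking the int bit patterns directly. The example below is only an assumption-laden illustration: it checks Math.getExponent(float) against a reference computed from the raw exponent field, and the names ExhaustiveFloatCheck, referenceExponent, and countMismatches are invented here. A real math library test would need an independent, trusted reference for the method under test.

```java
// Hypothetical sketch of an exhaustive single-argument float test.
final class ExhaustiveFloatCheck {

    /** Reference result: the 8-bit biased exponent field minus the bias of 127. */
    static int referenceExponent(int bits) {
        return ((bits >>> 23) & 0xFF) - 127;
    }

    /**
     * Checks 'count' consecutive bit patterns starting at 'startBits'
     * (wrapping around on int overflow) and returns the number of mismatches.
     */
    static long countMismatches(int startBits, long count) {
        long mismatches = 0;
        int bits = startBits;
        for (long i = 0; i < count; i++, bits++) {
            float x = Float.intBitsToFloat(bits);
            // Zeros and subnormals give -127; infinities and NaNs give 128.
            // Both match the unbiased-field formula used as the reference.
            if (Math.getExponent(x) != referenceExponent(bits)) {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

Calling countMismatches(0, 1L << 32) visits every float bit pattern once. The same loop shape does not scale to the 2^64 double inputs, which is where picking the likely failure points comes back in.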


Comments ( 2 )
  • Damon Hart-Davis Sunday, April 27, 2008

    I try to code test cases (and encourage others to do so too) to cover each major distinct area of operation of a module entry point (e.g. with each significant boolean flipped), edge cases, and *cases that have failed before*.

    The last of these because we want a bug to stay fixed, clearly, but also because it's likely to continue to indicate a weak point in the system design or implementation and thus things like it may well break under stress in future. In that case I also try to 'fuzz' the inputs a little by setting randomly any apparently 'don't care' values, thus testing a larger state space over time...



  • Joe Darcy Monday, April 28, 2008


    Yes, I agree keeping around old failure cases is prudent. When fixing a bug, in addition to testing the particular failing input in question, I also like to test "nearby" inputs to make sure the problem is fully gone. Nearby can mean fuzzing some parts of the input or inputs with a similar structure (e.g. powers of 2).
