Everything Older is Newer Once Again

Catching up on writing about more numerical work from years past, the second article in a two-part series finished last year discusses some low-level floating-point manipulation methods I added to the platform over the course of JDKs 5 and 6. Previously, I published a blog entry reacting to the first part of the series.

JDK 6 enjoyed several numerics-related library changes. Constants for MIN_NORMAL, MIN_EXPONENT, and MAX_EXPONENT were added to the Float and Double classes. I also added to the Math and StrictMath classes the following methods for low-level manipulation of floating-point values:

  • copySign(double magnitude, double sign)

  • getExponent(double d)

  • nextAfter(double start, double direction)

  • nextUp(double d)

  • scalb(double d, int scaleFactor)

There are also overloaded methods for float arguments. In terms of the IEEE 754 standard from 1985, the methods above provide the core functionality of the recommended functions. In terms of the 2008 revision to IEEE 754, analogous functions are integrated throughout different sections of the document.
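
As a quick illustration, the following sketch exercises the five JDK 6 methods (copySign, getExponent, nextAfter, nextUp, and scalb) along with the new constants. The class name RecommendedFunctionsDemo is made up for this example; everything else is standard java.lang API.

```java
class RecommendedFunctionsDemo {
    public static void main(String[] args) {
        // copySign: magnitude of the first argument, sign of the second.
        System.out.println(Math.copySign(3.0, -0.0));        // -3.0

        // getExponent: unbiased exponent of the argument (8.0 == 1.0 * 2^3).
        System.out.println(Math.getExponent(8.0));           // 3

        // nextUp: the adjacent floating-point value toward +infinity.
        System.out.println(Math.nextUp(1.0) - 1.0 == Math.ulp(1.0)); // true

        // nextAfter: the adjacent value in the direction of the second argument.
        System.out.println(Math.nextAfter(1.0, 0.0) < 1.0);  // true

        // scalb: d * 2^scaleFactor, computed without calling pow.
        System.out.println(Math.scalb(1.5, 4));              // 24.0

        // The float overloads behave analogously.
        System.out.println(Math.getExponent(8.0f));          // 3

        // The new constants line up with the methods: MIN_NORMAL is 2^MIN_EXPONENT.
        System.out.println(
            Double.MIN_NORMAL == Math.scalb(1.0, Double.MIN_EXPONENT)); // true
    }
}
```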

While a student at Berkeley, I wrote a tech report on algorithms I developed for an earlier implementation of these methods, an implementation written many years ago when I was a summer intern at Sun. The implementation of the recommended functions in the JDK is a refinement of the earlier work, a refinement that simplified code, added extensive and effective unit tests, and sported better performance in some cases. In part the simplifications came from not attempting to accommodate IEEE 754 features not natively supported in the Java platform, in particular rounding modes and sticky flags.

The primary purpose of these methods is to assist in the development of math libraries in Java, such as the recent pure Java implementation of floor and ceil (6908131). This expected use-case drove certain API differences with the functions sketched by IEEE 754. For example, the getExponent method simply returns the unbiased exponent corresponding to the value stored in the exponent field of a floating-point value rather than doing additional processing, such as computing the exponent needed to normalize a subnormal number, additional processing called for in some flavors of the 754 logb operation. Such additional functionality can actually slow down math libraries since libraries may not benefit from the additional filtering and may actually have to undo it.
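
To make the getExponent behavior concrete, here is a small sketch contrasting the raw result with a logb-style normalized exponent. The helper normalizedExponent and the class name ExponentDemo are hypothetical, written only for this illustration; they are not part of the JDK, and the helper assumes a finite, nonzero argument.

```java
class ExponentDemo {
    // Number of explicit fraction bits in the double format.
    static final int FRACTION_BITS = 52;

    /**
     * Hypothetical helper (not a JDK method): a logb-style exponent that
     * normalizes subnormal inputs. Assumes d is finite and nonzero.
     */
    static int normalizedExponent(double d) {
        int e = Math.getExponent(d);
        if (e >= Double.MIN_EXPONENT) {
            return e; // Normal number: getExponent is already the answer.
        }
        // Subnormal: scaling by 2^52 is exact and yields a normal value,
        // so take its exponent and undo the scaling.
        return Math.getExponent(Math.scalb(d, FRACTION_BITS)) - FRACTION_BITS;
    }

    public static void main(String[] args) {
        System.out.println(Math.getExponent(1.0));              // 0
        // Subnormals and zeros report MIN_EXPONENT - 1, with no normalization.
        System.out.println(Math.getExponent(Double.MIN_VALUE)); // -1023
        System.out.println(Math.getExponent(0.0));              // -1023
        // NaN and infinity report MAX_EXPONENT + 1.
        System.out.println(Math.getExponent(Double.NaN));       // 1024
        // The extra logb-style processing, done only when a caller wants it.
        System.out.println(normalizedExponent(Double.MIN_VALUE)); // -1074
    }
}
```

A library that only needs to classify its argument can test the raw getExponent result against MIN_EXPONENT and MAX_EXPONENT and skip the normalization entirely.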

The Math and StrictMath specifications of copySign have a small difference: the StrictMath version always treats NaNs as having a positive sign (a sign bit of zero) while the Math version does not impose this requirement. The IEEE standard does not ascribe a meaning to the sign bit of a NaN, and different processors have different conventions for NaN representations and how they propagate. However, if the sign argument is not a NaN, the two copySign methods will produce equivalent results. Therefore, even if being used in a library where the results need to be completely predictable, the faster Math version of copySign can be used as long as the sign argument is known to be numerical.
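
The difference can be seen in a short sketch. The class CopySignDemo and the helper agree are made up for this example, and the negative-signed NaN is built by hand from a bit pattern; what Math.copySign returns for it is deliberately not stated, since the specification leaves that open.

```java
class CopySignDemo {
    /** Hypothetical helper: do Math and StrictMath return identical bits? */
    static boolean agree(double magnitude, double sign) {
        return Double.doubleToRawLongBits(Math.copySign(magnitude, sign))
            == Double.doubleToRawLongBits(StrictMath.copySign(magnitude, sign));
    }

    public static void main(String[] args) {
        // A quiet NaN whose sign bit is set (illustrative bit pattern).
        double negativeNaN = Double.longBitsToDouble(0xfff8000000000000L);

        // StrictMath is specified to treat a NaN sign argument as positive.
        System.out.println(StrictMath.copySign(1.0, negativeNaN)); // 1.0

        // Math makes no such promise; the result's sign is unspecified here.
        System.out.println(Math.copySign(1.0, negativeNaN));

        // With a non-NaN sign argument the two versions agree, including
        // for signed zeros and infinities.
        System.out.println(agree(2.5, -0.0));                      // true
        System.out.println(agree(2.5, Double.NEGATIVE_INFINITY));  // true
    }
}
```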

The recommended functions can also be used to solve a little floating-point puzzle: generating the interesting limit values of a floating-point format just starting with constants for 0.0 and 1.0 in that format:

  • NaN is 0.0/0.0.

  • POSITIVE_INFINITY is 1.0/0.0.

  • MAX_VALUE is nextAfter(POSITIVE_INFINITY, 0.0).

  • MIN_VALUE is nextUp(0.0).

  • MIN_NORMAL is MIN_VALUE/(nextUp(1.0)-1.0).
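
The puzzle's solution above can be checked directly against the constants in Double. This is a sketch for the double format only; the class and method names are invented for the example.

```java
class LimitValues {
    // The only two constants the puzzle allows.
    static final double ZERO = 0.0;
    static final double ONE = 1.0;

    static double nan()              { return ZERO / ZERO; }
    static double positiveInfinity() { return ONE / ZERO; }

    // The largest finite value is one step below infinity.
    static double maxValue() { return Math.nextAfter(positiveInfinity(), ZERO); }

    // The smallest positive (subnormal) value is one step above zero.
    static double minValue() { return Math.nextUp(ZERO); }

    // nextUp(1.0) - 1.0 is 2^-52, so the quotient is 2^-1074 / 2^-52 = 2^-1022.
    static double minNormal() { return minValue() / (Math.nextUp(ONE) - ONE); }

    public static void main(String[] args) {
        System.out.println(Double.isNaN(nan()));                            // true
        System.out.println(positiveInfinity() == Double.POSITIVE_INFINITY); // true
        System.out.println(maxValue() == Double.MAX_VALUE);                 // true
        System.out.println(minValue() == Double.MIN_VALUE);                 // true
        System.out.println(minNormal() == Double.MIN_NORMAL);               // true
    }
}
```

The same construction should carry over to float by switching to the float overloads and the constants in Float.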
