I was heartened to recently come across the article "Java's new math, Part 1: Real numbers", which detailed some of the additions I made to Java's math libraries over the years in JDK 5 and 6, including hyperbolic trigonometric functions (sinh, cosh, tanh), cube root, and base-10 log.
A few comments on the article itself: I would describe java.lang.StrictMath as java.lang.Math's fussy twin rather than evil twin. The availability of the StrictMath class allows developers who need cross-platform reproducible results from the math library to get them. Just because floating-point arithmetic is an approximation to real arithmetic doesn't mean it shouldn't be predictable! There are non-contrived circumstances where numerical programs are helped by having such strong reproducibility available. For example, to avoid unwanted communication overhead, certain parallel decomposition algorithms rely on different nodes being able to independently compute consistent numerical answers.
While the java.lang.Math class is not constrained to use the particular FDLIBM algorithms required by StrictMath, any valid Math class implementation must still meet the stated quality-of-implementation criteria for the methods. The criteria usually include a low worst-case relative error, as measured in ulps (units in the last place), and semi-monotonicity: whenever the mathematical function is non-decreasing, so is the floating-point approximation; likewise, whenever the mathematical function is non-increasing, so is the floating-point approximation.
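To make the two criteria concrete, here is a minimal sketch of how one might probe them. The class and method names (QualityChecks, errorInUlps, nonDecreasing) are hypothetical and are not part of the JDK or its test suite. Note also that using StrictMath.sin as the reference is only an approximation of the true error, since StrictMath's result is itself rounded.

```java
import java.util.function.DoubleUnaryOperator;

class QualityChecks {
    /** Distance between approx and reference, in units of ulp(reference). */
    static double errorInUlps(double approx, double reference) {
        return Math.abs(approx - reference) / Math.ulp(reference);
    }

    /**
     * Walks `steps` adjacent doubles upward from `start` and reports whether
     * f's results ever decrease. Only meaningful on a range where the
     * mathematical function itself is non-decreasing.
     */
    static boolean nonDecreasing(DoubleUnaryOperator f, double start, int steps) {
        double x = start;
        double prev = f.applyAsDouble(x);
        for (int i = 0; i < steps; i++) {
            x = Math.nextUp(x);            // next representable double above x
            double cur = f.applyAsDouble(x);
            if (cur < prev) {
                return false;              // a semi-monotonicity violation
            }
            prev = cur;
        }
        return true;
    }

    public static void main(String[] args) {
        double x = 0.5;
        System.out.println("Math.sin vs StrictMath.sin, in ulps: "
                + errorInUlps(Math.sin(x), StrictMath.sin(x)));
        System.out.println("log10 non-decreasing near 2.0: "
                + nonDecreasing(Math::log10, 2.0, 100_000));
    }
}
```

A spot check like this can only find violations, not prove their absence; it samples a tiny slice of the argument range.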
Simply adding more FDLIBM methods to the platform was quite easy to do; much of the effort for the math library additions went toward developing new tests, both to verify that the general quality-of-implementation criteria were being met and to verify that the particular algorithms were being used to implement the StrictMath methods. I'll discuss the techniques I used to develop those tests in a future blog entry.
Joe--Interesting topic. I'd be interested in hearing how you (or the team) decided which methods should be implemented in Java, and which should be in C, and why.
Thanks!
Patrick
Patrick,
Deciding which to implement in Java and which to implement in C was pretty easy. Basically, if FDLIBM had an "interesting" function already implemented in C, I chose to use that. For a long time I've wanted to port all the portions of FDLIBM we use to Java, but that hasn't been a high-priority project. The smaller methods like copySign, getExponent, and nextAfter I just implemented in Java; they are relatively easy (and fun) to write and straightforward to test.
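To give a flavor of why the smaller methods are easy to write in Java, here is a rough sketch of bit-level versions of copySign and getExponent for double. This is not the JDK source; the class and method names (BitSketch, copySignSketch, getExponentSketch) are made up for illustration, and the sketch assumes only the IEEE 754 double layout (1 sign bit, 11 exponent bits, 52 significand bits).

```java
class BitSketch {
    private static final long SIGN_MASK = 0x8000000000000000L;

    /** Magnitude of the first argument combined with the sign bit of the second. */
    static double copySignSketch(double magnitude, double sign) {
        long magBits = Double.doubleToRawLongBits(magnitude) & ~SIGN_MASK;
        long signBit = Double.doubleToRawLongBits(sign) & SIGN_MASK;
        return Double.longBitsToDouble(magBits | signBit);
    }

    /** Unbiased exponent: the 11-bit exponent field minus the bias of 1023. */
    static int getExponentSketch(double d) {
        return (int) ((Double.doubleToRawLongBits(d) >>> 52) & 0x7FF) - 1023;
    }

    public static void main(String[] args) {
        // Compare the sketches against the library methods on a few inputs.
        System.out.println(copySignSketch(3.0, -0.0) + " vs " + Math.copySign(3.0, -0.0));
        System.out.println(getExponentSketch(8.0) + " vs " + Math.getExponent(8.0));
        System.out.println(getExponentSketch(0.0) + " vs " + Math.getExponent(0.0));
    }
}
```

Testing methods like these is straightforward because the interesting inputs are easy to enumerate: zeros of both signs, subnormals, powers of two, infinities, and NaN.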
A more interesting question was which subset of FDLIBM to add to Java. From the beginning, commonly used methods like sin, cos, and tan were included in Math. I tried to include the set of next most commonly used methods, such as the hyperbolic transcendental functions and log10. More esoteric functions, like the family of gamma functions, didn't make the cut.
Hi Joe
Thanks for the reply. I asked about implementations in C because I was surprised at some point a few years ago when I found out that some methods I expected to be in Java were actually neither in Java nor accessed via JNI but were compiled into the VM as "intrinsics". I believe (looking at a header file from HotSpot, vmSymbols.hpp) this includes a handful of methods from the Math class. So I was interested in how that discussion went--what goes into Java (possibly, as you imply, requiring resources to write it), what is accessed from an external lib via JNI, and what is an intrinsic inside the VM. Intrinsics just aren't a topic I've seen covered much in my experience with Java. It would seem to me to be a case where you can get top performance from the method in question, but you have to be pretty careful about what's in there, for reasons of stability since it's part of the VM itself.
Thanks again for the reply.
Patrick
Hello.
Ah yes, intrinsics.
First, some background. Compared to calling a Java method, calling a C/C++ function via JNI is a relatively heavyweight process since the VM has to be in an appropriate state to transition to native code. Also in general various protections need to be in place in case the native function performs an operation that affects certain aspects of the overall program state, such as allocating memory. The individual FDLIBM math functions exposed via java.lang.Math run relatively quickly and don't allocate memory or perform other operations that require such safeguards from the JVM. So in general the FDLIBM functions can be called using lower-overhead trusted calling sequences.
In addition, some operations on some platforms have distinct "intrinsic" implementations in the JVM rather than the libraries. Those operations include the sin, cos, tan and pow methods from the Math class on x86 platforms. By using intrinsics, we can take advantage of x87 hardware instructions to speed up those methods while still obeying the specified semantics. Achieving speed with sufficient semantic control in these cases is not really possible unless you have full control over the instruction sequences, as one has in a VM intrinsic.
Hi Joe (if you're reading this)
I take it hypot() is one of your new functions. I excitedly came across this recently, having been using my own version (basically sqrt(x\*x+y\*y)), and fully expected hypot() to be faster as well as more accurate.
_But_ - I've timed it on my Dell laptop (a 2GHz Pentium, 32-bit, Vista) and whereas my version takes about 40 ns, Math.hypot() takes 1100 ns !! This is disastrous.
I get this result with the client versions of both JRockit 6.0 (?) and JRE 1.6.
I couldn't quite believe this result, so I've tried writing it various ways, but it makes no difference. Am I doing something wrong, or does the extra accuracy of hypot() come at a huge cost in speed?
Hi Jonathan.
A few comments on your findings.
First, you don't say how your performance results were gathered, so to help rule out a methodological artifact I'll reference the usual Java microbenchmarking cautions: verify what you're measuring and have your program run for at least 10 seconds to avoid any VM startup jitter. I recommend searching the web for "cliff click microbenchmark" to get both of Cliff's informative talks on how to do microbenchmarking well. For Sun's JDK, "java -server" should also be tried if you want maximum long-term performance (startup may not be as good as with -client).
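As a rough illustration of those cautions, a timing loop might look something like the sketch below. The class name HypotTiming, the helper nanosPerCall, and the iteration counts are all assumptions of mine, not a recommended harness; the point is only to show repeated rounds (so early, un-optimized rounds can be discarded) and a result that is actually consumed (so the compiler cannot remove the work being timed).

```java
import java.util.function.DoubleBinaryOperator;

class HypotTiming {
    // Written after each run so the timed calls cannot be discarded as dead code.
    static volatile double sink;

    /** Average wall-clock nanoseconds per call of f over `iterations` calls. */
    static double nanosPerCall(DoubleBinaryOperator f, int iterations) {
        double acc = 0.0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += f.applyAsDouble(i + 0.5, i + 1.5);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        DoubleBinaryOperator naive = (x, y) -> Math.sqrt(x * x + y * y);
        DoubleBinaryOperator library = Math::hypot;
        int iterations = 5_000_000;
        // Treat the first few rounds as warm-up and look only at the later ones.
        for (int round = 0; round < 10; round++) {
            double naiveNs = nanosPerCall(naive, iterations);
            double libraryNs = nanosPerCall(library, iterations);
            System.out.printf("round %d: naive %.1f ns/call, hypot %.1f ns/call%n",
                    round, naiveNs, libraryNs);
        }
    }
}
```

The numbers this prints will vary by VM, flags, and hardware, and the loop overhead is included in both measurements, so it is best read as a relative comparison only.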
The straightforward code you are using, sqrt(x\*x+y\*y), is all implemented in hardware so it would be hard to be faster! The speed of sqrt would be the limiting factor; the add and multiplies are comparatively low-latency and can often be pipelined.
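For context on what the extra work in hypot buys, Math.hypot is specified to compute the result without intermediate overflow or underflow, which the straightforward formula cannot promise. The small sketch below (the class name HypotRange and the helper naiveHypot are hypothetical) shows the difference at the extremes of the double range.

```java
class HypotRange {
    /** The straightforward formula; x*x or y*y can overflow or underflow. */
    static double naiveHypot(double x, double y) {
        return Math.sqrt(x * x + y * y);
    }

    public static void main(String[] args) {
        double big = 1e200;    // big * big exceeds Double.MAX_VALUE
        double tiny = 1e-200;  // tiny * tiny is below Double.MIN_VALUE

        System.out.println("naive(big, big)   = " + naiveHypot(big, big));
        System.out.println("hypot(big, big)   = " + Math.hypot(big, big));
        System.out.println("naive(tiny, tiny) = " + naiveHypot(tiny, tiny));
        System.out.println("hypot(tiny, tiny) = " + Math.hypot(tiny, tiny));
    }
}
```

If your inputs are known to stay well inside the range where squaring is safe, the straightforward formula may be a perfectly reasonable choice.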
Looking at the C code currently used to implement hypot, src/share/native/java/lang/fdlibm/src/e_hypot.c in the jdk Mercurial repository, there is nothing that looks hideously expensive, but certain ranges of arguments have faster paths than others.
I suspect your program may be getting hit by the cost of Java -> native -> Java transitions to call out to the C code and come back. For a long time, I've wanted to port FDLIBM in the JDK to Java to avoid these, but I haven't gotten around to it yet.