Cosine redux

It's been quite a while since my last blog entry: I had a bit of a technology meltdown that I'm not quite done with yet :-(

I was doing some recreational hacking over the holidays that involved evaluating cosines. I ended up doing (once again!) an implementation of cosine (don't ask why). Given the confused flamage that my previous posts on cosine generated, I figure that showing some code would be useful. I would never have thought that cosine would generate so much email traffic. Yes, I know about Taylor series. No I wasn't trying to insult centuries of standard mathematical practice. Performance is often the art of cheating carefully.

So, here's my implementation of cosine, with its argument in turns (1 turn == 360 degrees):

public static float cost(float f) {
    int bits = Float.floatToRawIntBits(f);
    int mantissa = (bits&0x7FFFFF)|(1<<23);
    int shift = (bits<<1>>>24)-127+9; // exponent, unbiased, with shift
    if(shift>=32 || shift<=-32) return 1;
    int fractionBits = shift>=0 ? mantissa<<shift : mantissa>>-shift;
    int tableIndex = (fractionBits>>>(30-resultPrecision))&tableSizeMask;
    switch(fractionBits>>>30) { // Quadrant is top two bits
        case 0: return cosTable[tableIndex];
        case 1: return -cosTable[tableSizeMask-tableIndex];
        case 2: return -cosTable[tableIndex];
        default/\*case 3\*/: return cosTable[tableSizeMask-tableIndex];
Lets go through this slowly:
  1. int bits = Float.floatToRawIntBits(f);
    Get the IEEE 754 bits
  2. int mantissa = (bits&0x7FFFFF)|(1<<23);
    The mantissa is the bottom 23 bits - to which the hidden bit must be prepended.
  3. int shift = (bits<<1>>>24)-127+9;
    Extract the exponent, correct for the exponent bias, then add a bias to move the binary point to the top of the word.
  4. if(shift>=32 || shift<=-32) return 1;
    If the shift is too large, the fraction bits would be zero, therefore the result is 1.
  5. int fractionBits = shift>=0 ? mantissa<<shirt : mantissa>>-shift;
    Shift the mantissa so that it's a fixed point number with the binary point at the top of the 32 bit int. The magic is in what's not here: because the argument is in turns, I get to ignore all of the integer bits (range reduction made trivial); and because it's the cosine function, which is symmetric about the origin, I get to ignore the sign.
  6. int tableIndex = (fractionBits>>>(30-resultPrecision))&tableSizeMask;
    The top two bits are the quadrant... extract the bits below that to derive a table index.
  7. switch(fractionBits>>>30) {
    One case for each quadrant. This switch could be eliminated by making the table 4 times larger.
  8. case 0: return cosTable[tableIndex];
    Yes! It's just a table lookup! Truly trivial.
  9. case 1: return -cosTable[tableSizeMask-tableIndex];
Since this is just a table lookup, the resulting approximation can be pretty jagged if the table is small. But it's easy to tune the table size depending on accuracy needs. The smaller the table is, the higher the cache hit rate will be, and the more likely it is that the whole table will fit in cache.

Table lookups are a very common way to implement mathematical functions, particularly the periodic ones like cosine. There are all kinds of elaborations. One of the most common for improving accuracy is to do some sort of interpolation between table elements (linear or cubic, usually).


Jeff, I think the problem that your solutions are not helping to solve here is the performance, i.e. the one that Java is having in calling trig functions. Specifically I cannot see how CORDIC will eliminate the runtime overhead that argument reduction brings about in Java when calling x86 hardware trig instructions.

Posted by Alex Lam on January 21, 2006 at 08:08 AM PST #

It's a tradeoff. The better you understand the precision requirements of your application, the better you can tune it. There is a spectacular amount of literature on different tricks on evaluating functions like cosine. The less precision you need, the faster you can make it go. The problem with a general-purpose library like Java's is that it has no information about the precision requirements of the application invoking it. So the safe thing to do is to produce the most accurate answer possible. The difficulty is that in this case, the x86 hardware doesn't do a great job, so we correct for it. The grotesque solution would be to define functions that had extra parameters for (at least) the number of result bits that were needed to be accurate, and the range over which the argument might vary: if these were constants, then a good optimizer could substitute appropriatly tricked-up code fragments.

Posted by James Gosling on January 21, 2006 at 02:02 PM PST #

It is inconvenient that radians depend on PI and PI is an irrational number. At the moment I am writing an arbitrary-precision arithmetics package and is working on implementing the non-linear (trig, exp, log, pow etc.) functions. Apart from revising my maths knowledge I have also looked up (seemingly) similar projects on the Internet (JUMP, JScience, JCM etc.) for possible inspirations. Now I think I have a better sense of direction on how to work things out; I sincerely hope that after all the pain-staking processes I'd come up with an almighty useful Java library ;)

Posted by Alex Lam on January 22, 2006 at 01:04 AM PST #

Hi Alex

I'm searching for a good arbitrary-precision arithmetics package for numerical computations with support for trigonometric functions.
Can you suggest one?

Posted by Axel Kramer on January 22, 2006 at 07:59 PM PST #

怎么啦? 在文章中间出现很多 这样的成人电影广告! 成人电影 成人小电影 免费成人电影 免费观看成人电影 日本成人电影 黄色电影 黄色小电影 免费黄色电影 免费观看黄色电影 日本黄色电影 激情电影 激情小电影 免费激情电影 免费观看激情电影 日本激情电影 色情电影 色情小电影 免费色情电影 免费观看色情电影 日本色情电影 性爱电影 性爱小电影 免费性爱电影 免费观看性爱电影 日本性爱电影 what goes wrong??

Posted by nile black on January 22, 2006 at 10:32 PM PST #

What ever happened to the Java Grande project? Among many other interesting things they were working on - I thought one of the proposals was for a math library that could be processor dependent. In cases in which it was more important to have performance than to have identical answers on all platforms there could be specific optimizations. These would be contained in libraries where it would be clear that this was the trade off you were making.

Posted by Daniel Steinberg on January 22, 2006 at 11:38 PM PST #

"The grotesque solution would be to define functions that had extra parameters for (at least) the number of result bits that were needed to be accurate, and the range over which the argument might vary: if these were constants, then a good optimizer could substitute appropriatly tricked-up code fragments."

What's so grotesque about that? If every function requires information about the error bounds, then enforcing the desired error limits becomes quite complicated. This is doubly true, while there is no standard way to track and describe error bounds. Perhaps JSR 275 will address that.

I would think that "The Jackpot Way" to solve this problem would be to provide an editor for complex functions and equations. Then a function evaluation engine can evaluate the entire function, tracking the error along the way. Radically different evaluation methods could be chosen, simply by changing the acceptable error.

Total error is a pain to track, but it is almost always the only important thing. The error involved in any single stage of a calculation is only important to the extent that it impacts total error.

If you think this approach requires too much intelligence from the function evaluation engine to ever compete in terms of performance, would you say the same thing about Java itself?

Posted by Curt Cox on January 23, 2006 at 12:04 AM PST #

you sir, are a hacker. I challenge you to pistols at dawn!

Posted by wes on January 23, 2006 at 07:06 PM PST #

Hi Axel, AFAIK, there isn't one (except that there would be one once I've done with it :-D) It's not like extension of non-linear functions from decimal to arbitrary-precision is that straight-forward, I suppose~

Posted by Alex Lam on January 23, 2006 at 09:20 PM PST #

Post a Comment:
Comments are closed for this entry.



« April 2014