More bit-twiddling intrinsics

Today I pushed the changes for 6823354 which adds intrinsics for {Integer,Long}.{numberOfLeadingZeros,numberOfTrailingZeros}() methods.  The speedups are quite good:


Integer Long
numberOfLeadingZeros numberOfTrailingZeros numberOfLeadingZeros numberOfTrailingZeros
Intel Nehalem 32-bit 3.18 3.96 1.36 1.90
64-bit 3.83 3.74 2.02 2.17
AMD Shanghai 32-bit 1.94 3.55 0.98 2.44
32-bit w/ lzcnt 4.90 - 1.46 -
64-bit 2.52 3.09 1.86 3.26
64-bit w/ lzcnt 6.77 - 3.71 -
UltraSparc T2 32/64-bit 2.01 2.22 1.55 1.91

"w/ lzcnt" in the table means the numbers are using AMD's LZCNT (count leading zeros) instruction which is part of SSE4a.

The SPARC intrinsics need a hardware implementation of the POPC instruction.

Yet I haven't found a real-world application that uses these methods extensively (including bitCount), but if anyone knows one, please let me know.

Comments:

java.util.BitSet.nextSetBit / nextClearBit (typically used when sieving primes, using the Erathostenes sieve) can benefit from such optimizations. It's a pity that I have no access to such processors to publish a benchmark.

Posted by Edson Watanabe on May 08, 2009 at 12:20 PM CEST #

A quick search didn't reveal an application that uses these methods extensively. But if you provide me a benchmark, I will publish some numbers.

Posted by Christian Thalinger on May 08, 2009 at 01:01 PM CEST #

Today's strongest chess-playing problems use so-called "bitboard" representations and could benefit from those operations. See, for instance, "How DarkThought Plays Chess":

http://people.csail.mit.edu/heinz/dt/node7.html

Posted by Mauro Persano on May 08, 2009 at 02:58 PM CEST #

Post a Comment:
  • HTML Syntax: NOT allowed
About

Random posts from a HotSpot engineer.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today