Apache Harmony: Thanks for the TreeMap Work!

I'd like to thank Apache Harmony for their JDK library performance efforts. We were given a tip that the Harmony folks were doing some interesting work with the TreeMap collection class, and low and behold they were. The work is surrounding fat TreeMap nodes where each node contains several TreeMap entries. This greatly improves in order traversal of the TreeMap entries, and since SPECjbb2005 traverses over a TreeMap I can see why they did this. Controlled measurements showed the "fat" TreeMap significantly faster with in order traversals, and improved our SPECjbb2005 score by a solid 3-5% depending on the platform.

Considering the potential performance gain we started the effort of porting the Apache Harmony TreeMap code to JDK 6. We ran into a few performance snags in our performance regression testing, but with a bit more work we were able to develop a solution that was able to realize the performance gains without any negative impact on other workloads. We are now making the necessary steps to give back our code changes to Apache Harmony.

The new TreeMap is included in JDK 6 Update 6 Performance Release which is available for download at http://java.sun.com/performance.

Thanks again Harmony! Keep the JDK performance optimizations coming.







Comments:

Hi,

I got through the entire survey thing to again discover that this is a JRE whereas I need a JDK (for javac compilation of JSPs). Would be helpful to flag that up IN BIG BOLD LETTERS for stupid people like me!

Rgds

Damon

Posted by Damon Hart-Davis on July 07, 2008 at 08:39 PM EDT #

BTW, could you make available the new TreeMap implementation and/or diffs for the terminally obsessed (eg me) to look at?

Rgds

Damon

Posted by Damon Hart-Davis on July 07, 2008 at 08:42 PM EDT #

Hey David, good to hear that you can make use of Harmony code in your performance release.

Please do send along information on the performance snags you saw. You can dump the info into:
http://issues.apache.org/jira/browse/HARMONY

Thanks.

Posted by Tim Ellison on July 07, 2008 at 09:00 PM EDT #

Hi, Dave!
We prefer to call it "folded TreeMap" rather than "fat TreeMap". Glad to hear the code is used not only in Harmony!

Thanks.

Posted by Aleksey Shipilev on July 08, 2008 at 05:30 PM EDT #

@Aleksey & Harmony folks: We had been considering a similar idea for quite some time. Seeing that Harmony had an implementation of a "fat" or "folded" TreeMap allowed us to get there a little earlier. I happen to be the guy who did the work to migrate the (as you call it "follded") Harmony TreeMap and made it Java 6 compliant & integrated it into Sun HotSpot class libraries. As you know there's a lot of new functionality added to Java 6 TreeMap, (NavigableMap support). You'll see all this when we give you the updated TreeMap back. There are some general issues we are continuing to work on surrounding a "folded" TreeMap, especially in the area of footprint for situations such as; worst case scenario is you could end up with 1 element per 64 element node, and for applications (which we have found many) which only populate a TreeMap with 10 or so elements, there's quite a bit of waste in a 64 element TreeMap node. At the moment we are further analyzing the distribution of TreeMap use cases and evaluating their performance. We will likely make some additional changes on top of the Java 6 fat / folded TreeMap we introduced in JRE 6u6p to better handle a wider distribution of TreeMap use cases. Thanks again to Harmony for their initial fat / folded TreeMap.

Posted by charlie hunt on July 09, 2008 at 01:47 AM EDT #

So. Unbelievably. Awesome.

This is the nirvana vision of multiple open sourced JVMs. Healthy competition, multiple eyes, multiple perspectives, new implementations tested and inspected and shared and <faints into the thick curls of Richard Stallman's beard>

Well.. Okay, it's just one class and one case, but great work guys. Both teams should be very proud.

Viva la Java!

Posted by Adam Malter on July 09, 2008 at 04:28 AM EDT #

Ok, in other words, p6 is all about a optimized TreeMap implementation?

Posted by mbien on July 09, 2008 at 09:53 AM EDT #

I am very glad that idea of "folded" TreeMap is used more widely.

[to charlie hunt ] When I designed Harmony's "folded" TreeMap we evaluated the problem with memory footprint and "partially filled bucket". Here we have classic problem of choosing right balance.
Just for examle (The following numbers are right for Harmony objects layout and can be differ on other VMs):
"folded" TreeMap up to 64 elements constantly need 588 bytes of memory.
Classic RB-TreeMap:
10 elements need 320 bytes vs 588
20 elements need 640 bytes vs 588
64 elements need 2048 bytes vs 588

We loose in case of 10 elements, but we significantly win in case of 64 elements.
That is why we decided to do such choice.
The following possibilities also were considered.
- Limit folded bucket by 32 elements. At this case "folded" needs less memory starting from 11 elemnts in tree.
- Implements dynamic adjustment of bucket size.

Also, I hope there are a lot of other possibilities and you will choose a right solution.

Best Regards, Sergey Kuksenko

Posted by Sergey Kuksenko on July 09, 2008 at 10:04 PM EDT #

This is fantastic - yet another benefit of a free and open Java ecosystem. Being able to share code between the various implementations was one of our hopes and goals when we started Harmony. It's great to see this!

Now, if Sun would only let the Apache Software Foundation have the TCK to test compatibility with the spec.... ;)

geir

Posted by Geir Magnusson Jr on July 09, 2008 at 10:55 PM EDT #

@mbien: 6u6p only a TreeMap optimization? We wouldn't do a release for a single optimization. It's just too costly to do a release for one optimization. But, from a # of lines of source code, this was probably the biggest change in 6u6p. The initial folded / fat TreeMap which we started from, the Harmony version, is based on support of Java 5 interfaces. So, we had to implement support for Java 6 interfaces. That almost doubled the number of lines of source code in TreeMap, (most of the new code is to support the NavigableMap interface). As for other optimizations in 6u6p, there's quite a few JIT compiler optimizations in addition to a couple other class library optimizations.

Posted by charlie hunt on July 10, 2008 at 12:34 AM EDT #

@Sergey Kuksenko: Indeed, often times it's the case of choosing the right balance. And, that's always an interesting challenge of finding a good representative distribution of uses and use cases, (a task I kind of enjoy doing in addition to performance optimization work). Then, flushing out ways to auto-configure or auto-tune based on information available at runtime and choosing a good balance from there. I've been exploring quite a few alternatives, but it's too premature to say anything too conclusive other than the 64 node size works well for large TreeMaps where its use is dominated by TreeMap iteration. An interesting twist on a red black tree I recently looked at is a left leaning red black tree. Here's a link to good presentation about it: http://www.cs.princeton.edu/~rs/talks/LLRB/RedBlack.pdf I think you'll enjoy it.

Posted by charlie hunt on July 10, 2008 at 12:58 AM EDT #

@charlie hunt
thank you for your answer, thats why I asked (again) because there is nothing mentioned about any other optimization (the tree map opt is actually the only optimization I found through googling...).

You fill in a form and get a different RE without knowing anything about the differences...
A short overview over the optimizations would be great (or links to the RFEs).

Posted by mbien on July 10, 2008 at 05:02 AM EDT #

@mbien: I'll pass along your feedback to release engineering. You are right, we should do a better job at communicating what optimizations have been added to a release. And, IMO, I think we should also say what you need to do to take advantage of them. For example, most of the time the optimizations are introduced under the -XX:+AggressiveOpts command line flag. There are exceptions to this general rule such as 6u6p's TreeMap.

Posted by charlie hunt on July 11, 2008 at 03:05 AM EDT #

Post a Comment:
Comments are closed for this entry.
About

dagastine

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today