Superduper slow jar?

It's well known that jar is a "little" slow:-) How slow? On my "aged" Sun Blade 1000, it takes about 1 minute and 40 seconds to jar the whole rt.jar in "cf0M" mode (no compression, no manifest), and a little longer in compress mode.

We did feel this was a little bit "too slow", but then figured that we are talking about jarring on the order of ten thousand classes with a total size of 50+M; given the number of files and the total size, it might just need that much time. So it has been this slow for years, and we never took the time to figure out whether it really needs that long, until someone "accidentally" noticed that "the CPU went to 100% busy for quite some time, perhaps a minute or more on my laptop, before starting to hit the disk to create the jar archive".

That sounded strange. AFAIK the major job jar is supposed to do is copy and compress the files (into the jar), so it should hit the disk from the very beginning to the end. So I took my first peek into the jar source code in many years, and it turned out we had a very "embarrassing" bug in the jar code: we were doing an O(n) look-up on a Hashtable (via the contains() method, which searches the values rather than the keys) for each and every file we were jarring, where it really should be an O(1) look-up with a HashSet. Given the number of files we are working on, this "simple" mistake made us spend the majority of the time (that 1 min 40+ sec) "collecting" the list of files to jar, instead of doing the real "jarring" work, sigh:-(
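To make the difference concrete, here is a minimal sketch of the two look-up styles. It is not the actual jar source; the class name ContainsLookupDemo, the helper methods and the generated file names are all made up for illustration. The only library behavior it relies on is that Hashtable.contains(Object) scans the values, while HashSet.add() is a hashed look-up.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Hashtable;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not taken from the jar tool's source.
final class ContainsLookupDemo {

    // Slow variant: Hashtable.contains(value) walks the whole table,
    // so collecting n distinct paths costs O(n^2) comparisons overall.
    static List<String> collectWithHashtable(List<String> paths) {
        Hashtable<String, String> seen = new Hashtable<>();
        List<String> result = new ArrayList<>();
        for (String p : paths) {
            if (!seen.contains(p)) {   // linear scan over the values
                seen.put(p, p);
                result.add(p);
            }
        }
        return result;
    }

    // Fast variant: HashSet.add() returns false for duplicates and
    // costs O(1) on average, so the whole pass is O(n).
    static List<String> collectWithHashSet(List<String> paths) {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String p : paths) {
            if (seen.add(p)) {
                result.add(p);
            }
        }
        return result;
    }

    // Generates made-up entry names; the naming scheme is an assumption.
    static List<String> makePaths(int n) {
        List<String> paths = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            paths.add("pkg" + (i % 100) + "/Class" + i + ".class");
        }
        return paths;
    }

    public static void main(String[] args) {
        List<String> paths = makePaths(10_000);

        long t0 = System.nanoTime();
        int slowCount = collectWithHashtable(paths).size();
        long t1 = System.nanoTime();
        int fastCount = collectWithHashSet(paths).size();
        long t2 = System.nanoTime();

        System.out.println("Hashtable.contains: " + slowCount + " entries, "
                + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("HashSet.add:        " + fastCount + " entries, "
                + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Both methods return the same list; only the time spent deciding "have I seen this file already?" differs, and that gap grows quickly with the number of files.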

With that fixed (in JDK7 build44 and later), jar is much faster now. Below are quick timing numbers from 10 runs of jarring/zipping rt.jar/zip, in "store only" mode and "zip compression" mode.

b43: the JDK7/build43, which does not have the fix.
b47: the JDK7/build47, which does have the fix.
zip: the default zip installed on my Solaris, which is zip2.3/1999

(1) jar cf0M / zip -r0q (simply "store", no zip compression)

  b43      b47    zip     (elapsed time, [min:]sec)
  1:43.7   20.6   10.2
  1:40.3   20.2    9.2
  1:40.1   21.0    9.0
  1:40.5   19.6   10.4
  1:40.9   19.6    8.7
  1:40.2   19.6    9.1
  1:40.0   18.6   10.0
  1:39.1   20.0    8.6
  1:41.3   18.5    9.0
  1:42.1   19.6    9.6

(2) jar cfM / zip -rq (with zip compression)

  b43      b47    zip     (elapsed time, [min:]sec)
  1:47.0   25.3   15.7
  1:45.9   23.4   14.2
  1:44.7   23.3   14.9
  1:45.4   23.7   14.3
  1:45.6   23.3   14.3
  1:44.9   23.6   14.0
  1:45.9   23.2   14.6
  1:44.0   23.0   14.2
  1:44.9   23.3   14.8
  1:45.8   23.5   14.2

Jar has made big progress and is doing much, much better, though it is still slower than "zip". So we will continue our "catch-up" going forward (I have to say it:-) Actually I do have some code that makes jar much closer to zip, but it will take a while to get it into the product).

  *1 The fix is currently only in JDK7.
  *2 If you are interested in the details of the fix, you can have a peek here

Comments:

I was under the impression that the rt.jar is expanded and cached in a specialized format such that it can be loaded really fast into memory, even without bytecode validation etc.

Posted by Casper on March 12, 2009 at 03:55 AM PDT #

Casper, rt.jar is used here simply as an example to demonstrate the issue, mostly because I jar/unjar it dozens of times every day:-) Yes, some special "indexing" is being done on rt.jar to improve start-up, but this is not what I'm talking about here...and rt.jar is still a jar file on your disk.

Posted by Xueming Shen on March 14, 2009 at 01:51 PM PDT #

Interesting; I just tried to speed up our build process and wondered about the 30s (jar) vs 3s (zip) difference I encountered.

Will this fix get into older JDK versions as well?

Posted by lars on March 28, 2009 at 01:31 AM PDT #

I will consider backporting the change to earlier releases after it has baked in 7 long enough.

Posted by Xueming Shen on March 31, 2009 at 05:21 AM PDT #

Please please please backport this to Java 6. I work on a product that embarrassingly has 35K classes in a single jar sized just over 100M. Building the jar can take 6+ minutes using 1.6.0_12.

Posted by Tom Flynn on April 14, 2009 at 04:55 PM PDT #

Please please please backport this. The jarring is by far the longest bit of our build process.

Posted by cagcowboy on June 24, 2009 at 02:42 AM PDT #

The good news is that the backport is now in the latest 6u18 ws; the not-so-good news is that it will probably take another couple of months for this release to go public:-)

Posted by xueming on August 13, 2009 at 12:56 PM PDT #
