[This is a repost of the 10/24/05 Sleepycat blog entry.]
Recently, I've been working on a JE user's performance test. They have
a need for high insert throughput while simultaneously reading records
from the same database. While tuning the insertion test I learned some
interesting things about the Java Garbage Collector and how it
interacts with the JE Cache.
The benchmark is simple: it writes
one million records with keys that are sequential "longs". The total
database size is about 380MB. In order to keep the database in-memory,
I set the cache size to 512MB and the log buffer size to 64MB. With
those parameters as a baseline, I began my tuning.
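For reference, here is a minimal sketch of how that baseline could be written down as JE properties. The class name JeBaselineConfig is hypothetical, and it is an assumption that the log buffers were sized through je.log.totalBufferBytes; the post itself only names je.maxMemory for the cache. The printed lines are what one might place in a je.properties file in the environment home directory.

```java
import java.util.Properties;

// Hypothetical helper: collects the baseline settings described in the post.
final class JeBaselineConfig {
    static final long MB = 1024L * 1024L;

    // Assumption: cache size maps to je.maxMemory and log buffer size maps to
    // je.log.totalBufferBytes. Both values are expressed in bytes.
    static Properties baseline() {
        Properties props = new Properties();
        props.setProperty("je.maxMemory", Long.toString(512 * MB));
        props.setProperty("je.log.totalBufferBytes", Long.toString(64 * MB));
        return props;
    }

    public static void main(String[] args) {
        Properties props = baseline();
        // Print in name=value form, the format a properties file uses.
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```

Keeping the sizes in one place like this makes it easier to vary the cache size between runs, which is what the rest of the tuning does.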
One of the first things I tried was to adjust the JVM heap size (-Xmx).
I saw some odd results. For instance, if I ran the benchmark with no
-Xmx parameter on the command line, it performed better than if I
specified -Xmx512M. I was running on a "server class" machine (a Sun
V20Z, 2GB, 2 x Opteron 244 CPU, Solaris 10, which, by the way, really
screams) so I knew I was getting the server JVM by default. When I
printed out Runtime.maxMemory() the JVM returned the same value (512MB)
whether I invoked it with or without -Xmx512M. Yet with that command
line option being the only difference, the results were clearly slower
with the -Xmx parameter than without. What I didn't know was that if I
didn't specify -Xmx, the JVM chose its values as described in
Ergonomics in the 5.0 Java[tm] Virtual Machine.
Specifically, on a server class machine, the initial heap size is 1/64
of the physical memory (1/64th of 2GB is 32MB) and the max heap size is
1/4 of the physical memory (512MB in my case). But if you do specify
-Xmx, then the initial heap size is not chosen based on the ergonomics
parameters. So, for example, a simple test program that prints
Runtime.maxMemory() and Runtime.totalMemory() shows the following:
> java Test # Java Ergonomics chooses values
maxMemory: 517013504 # 512MB
totalMemory: 33554432 # 32MB
freeMemory: 33269360
> java -Xmx512M Test # Java Ergonomics does not choose values
maxMemory: 517013504 # 512MB as specified by -Xmx512M
totalMemory: 5111808 # something smaller than 32MB
freeMemory: 4826736
> java -Xmx512M -Xms32M Test # User supplies both
maxMemory: 517013504 # 512MB as specified by -Xmx512M
totalMemory: 33554432 # 32MB as specified by -Xms32M
freeMemory: 33269360
So that explained why I was seeing different results when I specified -Xmx (but not -Xms).
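The test program itself is not part of this post. A minimal sketch of what such a program might look like follows; the class name HeapReport is hypothetical (the runs above used a class called Test), and the output format is only modeled on the listing above.

```java
// Hypothetical stand-in for the small test program: prints the heap figures
// that the JVM reports through java.lang.Runtime.
final class HeapReport {
    // Builds the three-line report for the given runtime.
    static String describe(Runtime rt) {
        return "maxMemory: " + rt.maxMemory() + "\n"      // upper bound the heap may grow to
             + "totalMemory: " + rt.totalMemory() + "\n"  // heap currently reserved by the JVM
             + "freeMemory: " + rt.freeMemory();          // unused part of totalMemory
    }

    public static void main(String[] args) {
        System.out.println(describe(Runtime.getRuntime()));
    }
}
```

Running it once with no heap options, once with -Xmx512M, and once with both -Xmx512M and -Xms32M should make the difference in totalMemory visible, though the exact numbers depend on the JVM version and the machine.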
The next (and more interesting) observation came when I adjusted the JE cache size (je.maxMemory).
The benchmark's baseline setting of 512MB was clearly enough to hold
all of the entries in the database. But just to confirm that, I
decreased the cache size. Surprisingly, the benchmark performance
improved! Additional decreases of the cache size all the way down to
8MB continued to produce improved performance. The reason behind this
counter-intuitive phenomenon lay with the GC. The document "Tuning
Garbage Collection with the 5.0 Java[tm] Virtual Machine" provided
assistance in narrowing this down. By using the various GC debugging
command line options available in the J2SE 5 JVM ("-verbose:gc",
"-XX:+PrintGCDetails", "-XX:+PrintGCTimeStamps") it was pretty easy to
figure out what was going on with the GC.
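The command line flags write their output to the console. As a complement, and not what was used for this benchmark, the same JVM also exposes per-collector counters through java.lang.management. The sketch below (the class name GcCounters is hypothetical) prints how many collections each collector has run and the accumulated time, which can be sampled before and after a benchmark run.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: summarizes the JVM's garbage collector counters.
final class GcCounters {
    // One line per collector: its name, collection count, and total time in ms.
    static String snapshot() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

Collector names vary by JVM and by the collector in use, so the young-generation and old-generation collectors have to be identified from the names printed on the machine at hand.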
GC's of free objects in "Young Space" are cheap, but full GC's that migrate
objects are expensive. By increasing the cache size, JE was leaving
more objects in the cache. Even though they were never accessed again
(since this was just an insertion test), with a large cache those
objects had to be migrated to "Tenured Space". But with a smaller
cache, the objects could be evicted (i.e. "freed") thereby allowing the
GC to free them from the "Young Space" (cheap). The highest performance
was obtained when the heap, and with it the "Young Space", was sized
large enough that no Full GC's were necessary.
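The effect can be imitated outside of JE with a toy model. This is not JE's cache or evictor; BoundedCache and InsertOnlyDemo are hypothetical names, and the map below simply drops its oldest entry once a limit is reached. With a small limit most inserted values become unreachable almost immediately, while with a limit as large as the data set every value stays reachable and has to survive collection after collection.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy stand-in for a size-limited cache: evicts the oldest entry once the
// number of entries exceeds maxEntries.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, false); // false = keep insertion order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

final class InsertOnlyDemo {
    // Inserts sequential long keys, as in the benchmark, and returns how many
    // entries the cache still holds (and therefore keeps reachable) afterwards.
    static int insert(Map<Long, byte[]> cache, int records, int valueBytes) {
        for (long key = 0; key < records; key++) {
            cache.put(key, new byte[valueBytes]);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        int records = 100_000;
        System.out.println("small cache retains "
                + insert(new BoundedCache<Long, byte[]>(1_000), records, 100));
        System.out.println("large cache retains "
                + insert(new BoundedCache<Long, byte[]>(records), records, 100));
    }
}
```

Running each variant with -verbose:gc is one way to compare how the collector behaves when retained data is small versus large; the actual counts depend on the JVM and heap settings.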