
Adler32 vs the GC

Recently a user opened a support request for what appeared to be a JE
bug: their application was getting OutOfMemoryErrors, and all it was
doing was updating and querying records in the database. We were
fortunate enough to have a test case that reproduced the problem fairly
quickly. The test case started 16 threads. Fifteen of these threads
picked a key at random between 1 and 3000 and then either created a
record for that key or updated the existing record at that key. The
other thread scanned the entire set of 3000 keys. The inserters slept
between 1 ms and 1 sec between inserts, and the full-scan thread woke up
once a minute. Oh, did I say that the data for each record was 512KB?
The user reported that the program ran just fine with JE 2.0.90, but
would consistently OOME with 2.1.30.
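
For readers who want a feel for the shape of that workload, here is a minimal sketch. It is not the user's test case: a ConcurrentHashMap stands in for the JE database, the class name WorkloadSketch and all of its parameters are made up for illustration, and the numbers in main are scaled down so that it finishes in a second or so.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the test case: an in-memory map replaces JE.
final class WorkloadSketch {

    /** Runs the workload and returns how many distinct keys ended up with a record. */
    static int run(int writers, int keys, int recordBytes,
                   int opsPerWriter, int maxSleepMs, int scanIntervalMs) {
        Map<Integer, byte[]> store = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(writers + 1);
        CountDownLatch writersDone = new CountDownLatch(writers);

        for (int w = 0; w < writers; w++) {
            pool.execute(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                try {
                    for (int i = 0; i < opsPerWriter; i++) {
                        int key = rnd.nextInt(1, keys + 1);     // random key in [1, keys]
                        store.put(key, new byte[recordBytes]);  // insert or overwrite
                        Thread.sleep(rnd.nextInt(1, maxSleepMs + 1));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    writersDone.countDown();
                }
            });
        }

        // Scanner: read every key, then wait for the next interval
        // or stop once all writers have finished.
        pool.execute(() -> {
            try {
                do {
                    for (int key = 1; key <= keys; key++) {
                        store.get(key);
                    }
                } while (!writersDone.await(scanIntervalMs, TimeUnit.MILLISECONDS));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return store.size();
    }

    public static void main(String[] args) {
        // Scaled down: the real test used 512KB records, sleeps of up to
        // 1 sec, and a scan once a minute.
        System.out.println(run(15, 3000, 4 * 1024, 20, 5, 50));
    }
}
```

To get closer to the real thing, the map operations would be replaced with JE puts and gets and the record size raised to 512KB.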

So I did a lot of the usual things that one does when trying to debug
OOMEs in Java: (1) run it in your favorite memory profiling tool, (2)
run it in the J2SE 1.6 JVM to try to get a better stack trace from the
OOME, (3) run with -verbose:gc and the handful of other -XX: GC flags,
etc.

When I tried (1), everything looked just fine.
The major consumer of memory (byte[]'s) held pretty firm and well
within the JE cache limit. When I tried (2), the program worked. Hmmm,
go figure. When I ran with (3), everything looked pretty normal -- the
GC was collecting memory quite nicely and then all of a sudden, bam! an
OOME appeared.

I tried increasing the -Xmx JVM parameter while
pinning down the JE cache size. While it ran longer, it still
eventually OOME'd.

I tried diddling the GC parameters, but nothing.

In desperation, I started checking out snapshots of the source tree
between 2.0.90 and 2.1.30, doing a binary search to see which versions
worked and which didn't. The eventual culprit? The
java.util.zip.Adler32 class -- sort of.

Looking at the native source code behind the java.util.zip.Adler32 class, we see this:


    JNIEXPORT jint JNICALL
    Java_java_util_zip_Adler32_updateBytes(JNIEnv *env, jclass cls, jint adler,
                                           jarray b, jint off, jint len)
    {
        Bytef *buf = (*env)->GetPrimitiveArrayCritical(env, b, 0);
        if (buf) {
            adler = adler32(adler, buf + off, len);
            (*env)->ReleasePrimitiveArrayCritical(env, b, buf, 0);
        }
        return adler;
    }


Checking the JNI documentation for GetPrimitiveArrayCritical, we see:


These restrictions make it more likely that the native code will obtain
an uncopied version of the array, even if the VM does not support
pinning. For example, a VM may temporarily disable garbage collection
when the native code is holding a pointer to an array obtained via
GetPrimitiveArrayCritical.


JE passes the entire array (all 512KB of it in this case) to Adler32 in
a single call, and so it seems that the GC is blocked for as long as
that call takes. In a multi-threaded environment, this can be a big
problem.
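
Since the native code above takes and releases the array once per updateBytes call, one way to shorten each critical region would be to feed Adler32 smaller slices. The sketch below illustrates that idea only; it is not what JE does and it was not tried against this bug. The class name ChunkedAdler32 and the 8KB chunk size are assumptions chosen for illustration. It relies only on the fact that repeated Adler32.update calls yield the same checksum as one large call.

```java
import java.util.zip.Adler32;

// Hypothetical helper, not part of JE.
final class ChunkedAdler32 {

    // Assumed chunk size; not a value taken from JE or tuned in any way.
    static final int CHUNK_SIZE = 8 * 1024;

    /** Checksums data[off, off + len) with at most CHUNK_SIZE bytes per update call. */
    static long checksum(byte[] data, int off, int len) {
        Adler32 adler = new Adler32();
        int end = off + len;
        for (int pos = off; pos < end; pos += CHUNK_SIZE) {
            int n = Math.min(CHUNK_SIZE, end - pos);
            adler.update(data, pos, n); // each call holds the array only for n bytes
        }
        return adler.getValue();
    }

    public static void main(String[] args) {
        byte[] record = new byte[512 * 1024];
        for (int i = 0; i < record.length; i++) {
            record[i] = (byte) (i * 31);
        }
        Adler32 whole = new Adler32();
        whole.update(record, 0, record.length);
        // The chunked and single-call checksums should agree.
        System.out.println(whole.getValue() == checksum(record, 0, record.length));
    }
}
```

Whether this actually relieves the GC depends on how a given JVM implements GetPrimitiveArrayCritical, so treat it as something to measure rather than a fix.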

The workaround is pretty easy. We have a flag, -Dje.disable.java.adler32=true,
that can be used to force JE to use our own Adler32. It's slower than
java.util.zip.Adler32, but it doesn't suffer from GC blockage. Why
do we have this flag? A while back there was a bug in the J2SE 1.4 JVM
where Adler32 would cause JVM crashes, so we wrote our own to work
around this. We noticed that the 1.5 JVM fixes the bug, so we put in
conditional code to check whether we were running on a 1.5 JVM or not,
and if so, use java.util.zip.Adler32. We added a flag to override
the conditional use of the class.
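
To make the idea concrete, here is a rough sketch of a pure-Java Adler32 and of choosing between it and the JDK class based on the flag. This is not JE's actual code: the class name PureJavaAdler32 and the newChecksum method are hypothetical, the JVM version check described above is left out, and the per-byte modulo is the simplest correct form rather than a fast one.

```java
import java.util.zip.Adler32;
import java.util.zip.Checksum;

// Hypothetical pure-Java Adler-32; never enters native code.
final class PureJavaAdler32 implements Checksum {
    private static final int MOD = 65521; // largest prime below 2^16

    private int a = 1; // running sum of bytes, plus one
    private int b = 0; // running sum of the successive values of a

    @Override
    public void update(int oneByte) {
        a = (a + (oneByte & 0xff)) % MOD;
        b = (b + a) % MOD;
    }

    @Override
    public void update(byte[] buf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            update(buf[i]);
        }
    }

    @Override
    public long getValue() {
        return ((long) b << 16) | a;
    }

    @Override
    public void reset() {
        a = 1;
        b = 0;
    }

    /** Picks an implementation from the je.disable.java.adler32 system property. */
    static Checksum newChecksum() {
        // Boolean.getBoolean is true only when the property is set to "true".
        return Boolean.getBoolean("je.disable.java.adler32")
                ? new PureJavaAdler32()
                : new Adler32();
    }

    public static void main(String[] args) {
        byte[] data = "Wikipedia".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        Checksum c = newChecksum();
        c.update(data, 0, data.length);
        System.out.println(Long.toHexString(c.getValue()));
    }
}
```

Running it with and without -Dje.disable.java.adler32=true on the java command line should print the same checksum either way; only the implementation behind it changes.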


About This Entry

This page contains a single entry from the blog posted on September 22, 2006 1:27 PM.
