Repost: Tips and tricks for dealing with a fragmented Java heap

Originally posted in December 2007, reposted on request. The problem described here is generic and applies to all JVMs using mark-n-sweep or similar GC algorithms.

Heap fragmentation occurs when a Java application allocates a mix of small and large objects that have different life times. The main negative effect of fragmentation is that long GC pause times caused by the JVM being forced to compact the Java heap. These long pause times are typically triggered when your Java program attempts to allocate a large object, such as an array.

As described in my previous blog entry on this topic, you can use JRockit Mission Control to find out how fragmented the heap is. But a fragmented heap is only a problem if it leads to long pause times (or an OutOfMemoryError). To find out the impact on pause times, you can run with the -Xverbose:gcpause command line flag, which will give you something like:

[INFO ][gcpause] old collection phase 1-0 pause time: 73.214054 ms, (start time: 15.807 s)
[INFO ][gcpause] (pause includes compaction: 1.029 ms (external), update ref: 1.532 ms)
[INFO ][gcpause] Threads waited for memory 66.612 ms starting at 17.568 s
[INFO ][gcpause] old collection phase 1-0 pause time: 66.449507 ms, (start time: 17.569 s)
[INFO ][gcpause] (pause includes compaction: 1.236 ms (internal), update ref: 1.488 ms)

The pauses in this example are clearly not a problem, but it can sometimes be much longer than this.

If you don't want to restart your JVM, you can enable this during runtime by running jrmcd <pid> verbosity set=gcpause=info. After you have the data you need, disable informational logs with jrcmd <pid> verbosity set=gcpause=error.

Or you can do a JRA recording (see the previous blog entry) and look in the GC details tab, where the time spent in compaction is clearly visible:


Before we look into the possible strategies for dealing with fragmentation, it is crucial to understand what causes it. The first key observation is that fragmentation is caused by GC. When the JVM performs GC it will clear out dead objects. It's the act of removing these dead objects that creates the holes in the heap. Memory allocation only has an indirect impact, in that it can create a pattern in the heap that later leads to the GC fragmenting the heap.

A second key observation is that fragmentation is only a problem if you can't use the holes in the heap. As long as you only allocate small objects, it doesn't matter how fragmented the heap is.

With these two observations in mind, here are some tips:

1. Increase the heap size

Increasing the heap size will decrease the frequency of GCs. One benefit of this is that objects are more likely to be dead than if GCs are very frequent, and if more objects are dead then there will be fewer live objects around that can contribute to create holes. In other words, the heap holes will on average be larger, which implies less fragmentation. Also, if GCs are less frequent, you can possible afford longer pause times since the impact of GC on throughput will be lower. Be aware that increasing the heap size can cause a slight increase in pause times.

A special case is to run with an infinitely large heap and never GC, which will of course avoid fragmentation completely.

2. Use a generational GC

Running JRockit with -Xgc:gencon or -Xgc:genpar will enable the use of a nursery or young space. The nursery will store recently allocated objects and when it is full a nursery GC will be performed, in which objects that are still alive will be moved to the old space. Since all objects that survive are (eventually) moved to the old space, the nursery will never be fragmented. And fragmentation of the old space will happen much more slowly since objects moved there will on average survive for a long time. Also, since the old space will fill up less rapidly, fragmentation-causing old space GCs will be less frequent.

A common strategy for avoiding the cost of old space GCs (often called "full GCs") is to configure your nursery size so that almost all objects die before they reach the old space. If you do this carefully, you can postpone old space GCs for a very long time. I've seen some installations where the app has been configured to avoid old space GCs for a full day, after which it is restarted to force creation of a "clean" heap. This may be more frequent among non-JRockit users, since our compaction is fairly efficent and the pause times tend to be acceptable. One word of warning here: This strategy is not guaranteed to avoid full GCs, since that depends on the load on your application, exact heap layout etc, so don't rely on it too much and configure it with a large safety margin.

3. Tune compaction parameters

The default behavior of JRockit is to analyze the fragmentation of the heap and do a little bit of compaction every old space GC cycle. The proportion of the heap that it decides to defragment is called the compaction ratio which is typically stated as a percentage of the heap size, where a common figure would be perhaps 5%. If your application causes a lot of fragmentation, you can configure this ratio manually which gives you the ability to create a balance of power between the GC and your memory-hungry Java program. You can try with -XXcompactratio=10 or so to start with. A high number will lead to longer pause times, but also means that the JVM will be able to cope with higher fragmentation.

If you want to do advanced tuning, look in the JRockit reference guide for parameters that impact compaction. Two examples are -XXinternalcompactratio and -XXexternalcompaction.

4. Don't allocate memory

Ok, so it's time to look at what you can do in your Java code. The most obvious tip is avoid memory allocation. This will have direct impact on both the frequency of GCs and indirectly on the GC pause time, since less objects will be alive at the time of a GC. You can use the Memory Leak Detector to analyze your application's allocation pattern and trace down excessive allocation to where in your source code it occurs. See Marcus Hirt's blog entry on this subject for tips.

5. Avoid allocating large objects

Arrays and other large objects are always the biggest culprit when it comes to fragmentation. They cause the heap to fill up quickly, leading to frequent GCs, they create irregular patterns in the heap and they request big contiguous chunks of memory on the heap at allocation time, which can be impossible for the JVM to fulfill without first doing compaction. To avoid excessive allocation of large objects, think twice before copying arrays etc. All code involving string processing (char arrays), XML, I/O (byte arrays) etc is a target for optimization in this area. Again, the Memory Leak Detector is a very powerful tool for analyzing this.

6. Allocate objects with similar lifespan in chunks

This is my last tip, and it is a bit esoteric so please bear with me... The idea is that any larger operation can decrease its impact on fragmentation by allocating the memory it needs in a chunk and then keep it alive until the operation is complete. Consider a J2EE transaction such as a servlet request. When you start this request, you can have one metaobject created that allcates most or all of the objects that you need to process that particular request. Since these allocations will all be performed by a single thread and very closely spaced in time, they will typically end up stored as a contiguous block in the Java heap. If you keep all these objects alive until the transaction are done, they will all become subject for GC at the same time, and cleared as dead objects during the same GC cycle. Ergo, less fragmentation. So nulling out objects prematurely to decrease live data may not be the best in the long run.

Final words

That's it for this time... I hope you found this useful! Don't hestitate to ask if you found something unclear. Keep up the good coding!



thanks for tips.

A litle precision from oracle documentation
the default of last Jrockit releases as R28 use by default a generational heap with dynamics strategy : througput (with -Xgc:gencon and a dynamique nursery size)

Posted by foobar on November 14, 2011 at 08:47 AM PST #

Post a Comment:
  • HTML Syntax: NOT allowed



« July 2016