Did I mention that the low pause collector maintains free lists
for the space available in the tenured generation and
that fragmentation can become a problem? If you're using the low pause collector and things are
going just peachy for days and days and then there is a huge (relatively speaking) pause,
the cause may be fragmentation in the tenured generation.
In 1.4.2 and older releases in order to do a young generation collection
there was a requirement that there be a contiguous chunk of
free space in the tenured generation that was big enough to hold
all of the young generation. In the GC tuning documents
this is referred to as the young generation guarantee. Basically
during a young generation collection, any data that survives may have to be
promoted into the tenured generation and we just don't know how much is going to
survive. Being our usual conservative selves we assumed all of it would survive and
so there needed to be room in the tenured generation for all of it. How does this
cause a big pause? If the young generation is full and needs to be collected but
there is not enough room in the tenured generation, then a full collection of
both the young generation and the tenured generation is done. And this collection
is a stop-the-world collection, not a concurrent collection, so you generally see a
pause much longer than you want. By the way this full
collection is also a compacting collection so there is no fragmentation at the
end of the full collection.
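To make the young generation guarantee concrete, here is a minimal sketch of the check as described above. The class, method, and parameter names are all made up for illustration; this is not the actual HotSpot code, just the decision it had to make.

```java
// Hypothetical sketch of the pre-5.0 "young generation guarantee" check.
// None of these names come from HotSpot.
final class YoungGenGuarantee {
    enum Decision { YOUNG_COLLECTION, FULL_COMPACTING_COLLECTION }

    /**
     * @param youngGenBytes           size of the young generation (worst case: all of it survives)
     * @param largestFreeTenuredChunk largest contiguous free block in the tenured generation
     */
    static Decision decide(long youngGenBytes, long largestFreeTenuredChunk) {
        // Conservative assumption: every byte in the young generation may be promoted.
        if (largestFreeTenuredChunk >= youngGenBytes) {
            return Decision.YOUNG_COLLECTION;
        }
        // The guarantee cannot be met, so fall back to a stop-the-world,
        // compacting collection of both generations.
        return Decision.FULL_COMPACTING_COLLECTION;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Only the largest contiguous chunk counts, however much total free space there is.
        System.out.println(decide(64 * mb, 100 * mb));
        System.out.println(decide(64 * mb, 48 * mb));
    }
}
```

Note that the second call falls back to a full collection even if the tenured generation has far more than 64 MB free in total, which is exactly how fragmentation turns into a long pause.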
In 5.0 we added the ability in the low pause collector to start a young
generation collection and then to back out of it if there was not enough
space in the tenured generation. Being able to back out of a young generation
collection allowed us to make a couple of changes.
We now keep an average of the amount of space
that is used for promotions and use that (with some appropriate
padding to be on the safe side) as the requirement for the space
needed in the tenured generation. Additionally we no longer need
a single contiguous chunk of space for the promotions so we look at the total
amount of free space in the tenured generation in deciding if we can
do a young generation collection. Not having to have a single contiguous chunk of space to support
promotions is where fragmentation comes in (or rather where it doesn't come in as often).
Yes, sometimes using the averages for the
amount promoted and the total amount of free space in the tenured generation tells us to
go ahead and do a young generation collection and we get surprised (there really isn't enough
space in tenured generation). In that situation we have to back out of the
young generation collection. It's expensive to back out of a collection, but it's doable.
That's a very long way of saying that fragmentation is less of
a problem in 5.0. It still occurs, but we have better ways of dealing with it.
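The 5.0 check described above (a padded average of past promotions compared against the total free space) could be sketched roughly like this. The decaying average, the way the padding is computed from the deviation, and every name in the block are assumptions made for illustration; the text only says that an average with some padding is used.

```java
// Hypothetical sketch of the 5.0-style check; not HotSpot source.
final class PromotionEstimate {
    private final double weight;   // share given to the newest sample, between 0 and 1 (assumed)
    private final double padding;  // how many deviations to add as a safety margin (assumed)
    private double average;        // decaying average of bytes promoted per young collection
    private double deviation;      // decaying average of the absolute error of that estimate
    private boolean seeded;

    PromotionEstimate(double weight, double padding) {
        this.weight = weight;
        this.padding = padding;
    }

    /** Record how many bytes the last young generation collection promoted. */
    void sample(long promotedBytes) {
        if (!seeded) {
            average = promotedBytes;
            seeded = true;
            return;
        }
        double error = Math.abs(promotedBytes - average);
        average = (1 - weight) * average + weight * promotedBytes;
        deviation = (1 - weight) * deviation + weight * error;
    }

    /** The average plus a safety margin; this is the space we want to see free. */
    long paddedAverage() {
        return (long) Math.ceil(average + padding * deviation);
    }

    /** Total free space is compared, not the largest contiguous chunk. */
    boolean canAttemptYoungCollection(long totalFreeTenuredBytes) {
        return totalFreeTenuredBytes >= paddedAverage();
    }

    public static void main(String[] args) {
        PromotionEstimate estimate = new PromotionEstimate(0.5, 2.0);
        estimate.sample(10);
        estimate.sample(30);
        System.out.println(estimate.paddedAverage());               // 40
        System.out.println(estimate.canAttemptYoungCollection(35)); // false
    }
}
```

The sketch does not model the back-out path. When the estimate says yes and the promotions still don't fit, the real collector has to undo the young generation collection, which is the expensive case mentioned above.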
What should you do if you run into a fragmentation problem?
You could try a larger total heap and/or smaller young generation.
If your application is on the edge, it might give you just enough
extra space to fit all your live data. But often it just delays the problem.
Or you can try to make your application do a full, compacting collection
at a time which will not disturb your users.
If your application can go for a day without hitting a
fragmentation problem, try a System.gc() in the middle of the
night. That will compact the heap and you can hopefully go
another day without hitting the fragmentation problem. Clearly no help for an
application that does not have a logical "middle of the night".
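If you want to try the middle-of-the-night System.gc(), one way to schedule it from inside the application is sketched below. It assumes Java 8 or later (for java.time), and the class name, thread name, and choice of quiet time are made up. Keep in mind that System.gc() is only a request, and that it does nothing at all if the JVM is run with -XX:+DisableExplicitGC.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: request a full collection once a day at a quiet time.
final class NightlyCompaction {

    /** Time from 'now' until the next occurrence of 'target' (later today or tomorrow). */
    static Duration delayUntil(LocalDateTime now, LocalTime target) {
        LocalDateTime next = now.toLocalDate().atTime(target);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    /** Starts a daemon thread that calls System.gc() every 24 hours, starting at quietTime. */
    static ScheduledExecutorService schedule(LocalTime quietTime) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "nightly-gc");
            t.setDaemon(true); // do not keep the JVM alive just for this task
            return t;
        });
        long initialDelayMillis = delayUntil(LocalDateTime.now(), quietTime).toMillis();
        // A fixed 24-hour period drifts by an hour across daylight saving changes.
        scheduler.scheduleAtFixedRate(System::gc, initialDelayMillis,
                TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

Calling NightlyCompaction.schedule(LocalTime.of(3, 0)) once at startup would request a collection at about 3 a.m. local time each day.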
Or if by chance most
of the data in the tenured generation is read in when your application
first starts up and you can do a System.gc() after you complete
initialization, that might help by compacting all data into
a single chunk leaving the rest of the tenured generation
available for promotions. Depending on the allocation pattern
of the application, that might be adequate.
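The System.gc() after initialization idea amounts to nothing more than ordering three steps. A minimal sketch, with hypothetical names for the two phases of the application:

```java
// Hypothetical startup sequence: load the long-lived data, compact, then start serving.
final class StartupCompaction {
    static void start(Runnable loadLongLivedData, Runnable serveRequests) {
        // Phase 1: read in the data that will live in the tenured generation.
        loadLongLivedData.run();
        // Request a full collection so the long-lived data ends up in one
        // compacted chunk. This is only a request to the JVM.
        System.gc();
        // Phase 2: normal operation, with the rest of the tenured generation
        // available for promotions.
        serveRequests.run();
    }
}
```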
Or you might want to start the
concurrent collections earlier. The low pause collector tries to start a concurrent
collection just in time (with some safety factor) to collect the
tenured generation before it is full. If you are doing concurrent
collections and freeing enough space, you can try starting a concurrent collection sooner so that
it finishes before the fragmentation becomes a problem.
The concurrent collections don't do a compaction, but they do
coalesce adjacent free blocks so larger chunks of free space
can result from a concurrent collection.
One of the triggers for starting a concurrent collection is the amount
of free space in the tenured generation.
You can cause a concurrent collection to
occur early by setting the
percentage of the tenured generation that is in use above which a
concurrent collection is started. This will increase the overall time you spend
doing GC but may avoid the fragmentation problem. And this will be more
effective with 5.0 because a single contiguous chunk of space is not required.
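In HotSpot the percentage is set with the option -XX:CMSInitiatingOccupancyFraction=<nn>. To pick a value it helps to know how full the tenured generation normally gets, and the sketch below prints the occupancy of each heap memory pool using the standard java.lang.management API (available since 5.0). The class name and the percentUsed helper are made up, pool names differ between collectors and JVM versions, and the percentage printed here may not be computed exactly the way the collector computes its own trigger.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper for watching how full each heap pool is.
final class TenuredOccupancy {

    /** Percentage of 'capacity' in use, or -1 if the capacity is not known. */
    static int percentUsed(long used, long capacity) {
        if (capacity <= 0) {
            return -1;
        }
        return (int) (used * 100 / capacity);
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined; use committed instead.
            long capacity = usage.getMax() > 0 ? usage.getMax() : usage.getCommitted();
            System.out.printf("%-25s %3d%% used%n",
                    pool.getName(), percentUsed(usage.getUsed(), capacity));
        }
    }
}
```

If the tenured pool usually sits well below the point where collections start, a lower initiating percentage gives the concurrent collection more room to finish, at the cost of running it more often.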
By the way, I've increased the comment period for my blogs. I hadn't realized it was so short.