    Tuesday, November 28, 2006

Presenting the Permanent Generation

Have you ever wondered how the permanent generation fits into our
generational system? Ever been curious about what's in the permanent
generation? Are objects ever promoted into it? Ever promoted out?
Well, you're not alone. Here are some of the answers.

Java objects are instantiations of Java classes. Our JVM has an internal
representation of those Java objects and those internal representations
are stored in the heap (in the young generation or the tenured generation).
Our JVM also has an internal representation of the Java classes and those
are stored in the permanent generation. That relationship is shown in
the figure below.

The internal representation of a
Java object and an internal representation of a Java class are very
similar. From this point on let me just call them Java objects and Java
classes and you'll understand that I'm referring to their internal
representation. The Java objects and Java classes
are similar to the extent that during a garbage collection both
are viewed just as objects and are collected in exactly the same way. So
why store the Java classes in a separate permanent generation? Why not just
store the Java classes in the heap along with the Java objects?

Well, there is a philosophical reason and a technical reason. The
philosophical reason is that the classes are part of our JVM implementation
and we should not fill up the Java heap with our data structures. The
application writer has a hard enough time understanding the amount
of live data the application needs and we shouldn't confuse the issue
with the JVM's needs.

The technical reason comes in parts.
First, a caveat: the origins of the permanent generation predate my joining
the team, so I had to do some code archaeology to get the story straight
(thanks Steffen for the history lesson).

Originally there was no permanent generation. Objects and classes
were just stored together.

Back in those days classes were mostly static. Custom class loaders were
not widely used and so it was observed that
not much class unloading occurred. As a performance optimization
the permanent generation was created and classes were put into it.
The performance improvement was significant back then. With the amount
of class unloading that occurs with some applications, it's not clear that
it's always a win today.

It might be a nice simplification to not have a permanent generation, but
the recent implementation of the parallel collector for the tenured
generation (aka parallel old collector)
has made a separate permanent generation again desirable. The issue
with the parallel old collector has to do with the order in which
objects and classes are moved. If you're interested, I describe this
at the end.

So the Java classes are stored in the permanent generation. What all
does that entail? Besides the basic fields of a Java class there are:


  • Methods of a class (including the bytecodes)

  • Names of the classes (in the form of an object that points to a string also in the permanent generation)

  • Constant pool information (data read from the class file, see chapter 4 of the JVM
    specification for all the details).

  • Object arrays and type arrays associated with a class (e.g., an object array
    containing references to methods).

  • Internal objects created by the JVM (java/lang/Object or java/lang/Exception
    for instance)

  • Information used for optimization by the compilers (JITs)

That's it for the most part. There are a few other bits of information that
end up in the permanent generation but nothing of consequence in terms of size. All these are allocated in the permanent generation and stay
in the permanent generation. So now you know.
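
If you want to see how your own JVM splits its memory, here is a minimal sketch that lists the memory pools the running JVM reports through the standard java.lang.management API. The class name MemoryPools and the method describePools are made up for this illustration. The pool names are chosen by the JVM and the collector in use, so which entry (if any) corresponds to the permanent generation is something to check on your own JVM rather than something this code can promise.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class MemoryPools {
    // Builds one descriptive line per memory pool reported by the running JVM.
    // Pool names are implementation-specific and differ between collectors.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            // getUsage() may return null if the pool is no longer valid.
            MemoryUsage usage = pool.getUsage();
            long usedBytes = usage == null ? -1 : usage.getUsed();
            lines.add(pool.getName() + " (" + kind + "): " + usedBytes + " bytes used");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Running it with different collectors selected is a quick way to see that the set of pools, and their names, is a property of the JVM implementation and not of the application.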

This last part is really, really extra credit.
During a collection the garbage collector needs
to have a description of a Java object (i.e., how big is it and what
does it contain). Say I have an object X and X has a class K.
I get to X in the collection and I need K to tell me what X
looks like. Where's K? Has it been moved already?
With a permanent generation during a collection we move the
permanent generation first so we know that all the K's are in
their new location by the time we're looking at any X's.

How do the classes in the permanent generation get collected while the
classes are moving? Classes also have classes that describe their content.
To distinguish these classes from those classes we spell the former klasses.
The classes of klasses we spell klassKlasses. Yes, conversations around the
office can be confusing. Klasses are instantiations of klassKlasses, so
the klassKlass KZ of klass Z has already been allocated before Z can be
allocated.
Garbage collections in the permanent generation
visit objects in allocation order and that allocation order is
always maintained during the collection. That is, if A is allocated
before B then A always
comes before B in the generation. Therefore if a Z is
being moved it's always the case that KZ has already been moved.
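
The allocation-order argument can be sketched with a toy model. This is not the JVM's code: Cell, allocate, markLive and compact are hypothetical names invented for the illustration, sizes and addresses are plain integers, and a null descriptor stands in for "the root of the description chain" as an assumption of the sketch. The model only shows that if a descriptor must exist before the thing it describes can be allocated, and compaction slides live cells in allocation order, then a descriptor has always been moved before anything it describes.

```java
import java.util.ArrayList;
import java.util.List;

public class AllocationOrderSketch {
    // Toy stand-in for something stored in the permanent generation.
    static final class Cell {
        final String name;
        final Cell descriptor; // describes this cell's layout; null = root (sketch assumption)
        final int size;
        boolean live;
        int newAddress = -1;   // stays -1 until the cell has been moved

        Cell(String name, Cell descriptor, int size) {
            this.name = name;
            this.descriptor = descriptor;
            this.size = size;
        }
    }

    // Cells are kept in allocation order.
    private final List<Cell> cells = new ArrayList<>();

    // A cell can only be allocated after its descriptor has been allocated.
    Cell allocate(String name, Cell descriptor, int size) {
        if (descriptor != null && !cells.contains(descriptor)) {
            throw new IllegalArgumentException("descriptor must be allocated first");
        }
        Cell cell = new Cell(name, descriptor, size);
        cells.add(cell);
        return cell;
    }

    // A live cell keeps its whole chain of descriptors alive.
    static void markLive(Cell cell) {
        for (Cell c = cell; c != null && !c.live; c = c.descriptor) {
            c.live = true;
        }
    }

    // Slides live cells toward address 0 in allocation order. Returns true if
    // every descriptor had already been moved when its cell was visited.
    boolean compact() {
        int nextFree = 0;
        boolean descriptorAlwaysMovedFirst = true;
        for (Cell c : cells) {
            if (!c.live) {
                continue; // dead cells are simply squeezed out
            }
            if (c.descriptor != null && c.descriptor.newAddress < 0) {
                descriptorAlwaysMovedFirst = false;
            }
            c.newAddress = nextFree;
            nextFree += c.size;
        }
        cells.removeIf(c -> !c.live);
        return descriptorAlwaysMovedFirst;
    }

    public static void main(String[] args) {
        AllocationOrderSketch perm = new AllocationOrderSketch();
        Cell klassKlass = perm.allocate("klassKlass KZ", null, 4);
        perm.allocate("klass Y (unreachable)", klassKlass, 8);
        Cell klassZ = perm.allocate("klass Z", klassKlass, 8);
        markLive(klassZ);
        System.out.println("descriptor always moved first: " + perm.compact());
        System.out.println("new address of klass Z: " + klassZ.newAddress);
    }
}
```

The check inside compact never fails in this model because the loop is serial and follows allocation order, which is exactly the property the serial collection of the permanent generation relies on.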

And why not use the same knowledge about allocation order to
eliminate the permanent generation even in the parallel old
collector case?
The parallel old collector does maintain allocation order of
objects, but objects are moved in parallel. When the collection
gets to X, we no longer know if K has been moved. It might be
in its new location (which is known) or it might be in its
old location (which is also known) or part of it might have
been moved (but not all of it). It is possible to keep track
of where K is exactly, but it would complicate the collector
and the extra work of keeping track of K might make it a performance
loser. So we take advantage of the fact that classes are kept in the permanent
generation by collecting the permanent generation before collecting
the tenured generation. And the permanent generation is currently collected serially.


Comments ( 11 )
  • Bharath R Tuesday, November 28, 2006
    A couple of clarifications:

    1) Is it right to state that when (and only when) the perm generation needs to be GC'ed (which is presumably less often than young and tenured generations), it precedes collections in the young and tenured generations? At all other times, major/minor collections go on independent of it?

    2) From what I gather, is Klass an internal representation of a class object, and an instantiation of klassKlass?

    Thanks

  • Patrick Wright Wednesday, November 29, 2006
    Very interesting, thanks. Some follow-up questions I have for a future blog:
    1) What, exactly, is the effect of calling System.gc()?
    2) Why, exactly, is hotswapping of classes (changing class definitions on-the-fly) difficult to implement on the JVM? I've never seen a thorough answer, only suggestions that it has to do either with security or with optimization.
    3) There are some known issues with some J2EE containers, where one can run out of perm gen space on redeploying multiple times--is this an issue with the custom classloader implementation, or is it just a limitation of the design of the perm generation? The solution I read and used was to increase the max perm generation size.
    Thanks for posting this, it's great information.
    Regards
    Patrick
  • Damon Hart-Davis Wednesday, November 29, 2006
    Thanks!

    I assume that native machine code (the output of C1/C2) is stored in the PermGen hanging off the class, like the bytecodes. Does this mean that the native code has to be PIC (Position Independent Code) to allow it to be shifted around during GC and is the penalty for this significant (I've seen a few percent between static and shared versions of the same lib for tight code)? If so, would it be worth having a REALLY PermPermGen for code that is REALLY heavily used so that it can be pinned down and use non-PIC instructions?

    Rgds

    Damon

  • Jon Umasasmit Wednesday, November 29, 2006
    Bharath,

    The permanent generation is collected every time the tenured generation is collected. It's not really a separate generation the way the young generation is a separate generation and there is not the mechanism to collect it separately. The tenured generation and the permanent generation are collected when either becomes full.

    Yes, Klass is an internal representation of class. It is like an instantiation of KlassKlass in that KlassKlass describes a Klass the way Klass describes a Java object. Please ask again if that is not clear.

  • Jon Umassamit Wednesday, November 29, 2006
    Patrick,

    I'll do the easy ones here.

    1) In our JVM a System.gc() does a full collection
    (young generation, tenured generations, and permanent generation).
    The specification for System.gc() says that a call to it is
    advisory so other actions (including no action) can occur on
    other JVM's.

    2) Regarding hotswapping I'll see if I can get someone to
    explain better than I, but my understanding is that the difficulty
    has to do with the treatment of objects that have already been
    instantiated from a class that is being hotswapped. If the hotswapping
    adds fields or methods to the class or modifies existing methods, what should
    the JVM do about the original instantiations and, more interesting, what
    should it do about methods of the original instantiations that are
    executing?

    3) With regard to redeploying and filling up the permanent generation,
    the permanent generation is pretty much agnostic about what it
    contains. All the permanent generation does is allocate space
    and it doesn't know what it's used for. However,
    a class and its classloader both have to be unreachable in order for them
    to be unloaded. A class X with classloader A and the same class X with
    classloader B will result in two distinct objects (klasses) in the
    permanent generation. Lots of that and lots of redeployments
    would cause pressure on the permanent generation. But I'm really
    guessing here. Try running "jmap -permstat" in JDK5 and later
    (on solaris and linux)
    and that will give you information about the classes in the permanent
    generation.

  • Jon Umassamit Wednesday, November 29, 2006
    Damon,

    Actually native code generated by C1 and C2 is not stored
    in the permanent generation. Perhaps that is to avoid the
    performance problem you mention. The compilers allocate directly
    out of the C heap. When a class (and its methods) is unloaded,
    the code for any compiled methods is put on a list for
    deallocation.

  • Jon Umsasamit Thursday, November 30, 2006
    Patrick,

    Here's a paper that talks about implementing hotswapping. Part of the paper is a discussion on why and how the implementation would be staged. Section 3 lists the proposed stages and will give you an idea of some of the complexity involved. Sections 4 and 5 go into more detail.

    http://www.experimentalstuff.com/Technologies/HotSwapTool/runtime-evol.pdf

  • Elliott Hughes Thursday, November 30, 2006
    I've really been enjoying this series of articles. One minor typo I noticed in this one: "constant poll" instead of "constant pool".
  • Jon Usmasamit Tuesday, December 5, 2006
    Elliott,

    Thanks for pointing out the typo. By the way, I fixed it by
    just editing my blog. I don't know if there is any
    external indication that it's been updated. So
    the typo really was there, and now it's not.

  • Sachin Sunday, December 17, 2006
    Just one question. This may not be related to this post. Is a "static final" directly allocated into old gen. I guess in most of the cases it will most likely land into old gen over the time I suppose?
    Thanks for the nice posts.
    Sachin
  • Jon Usmasamti Friday, December 22, 2006
    In general everything is allocated out of the young generation. The exceptions are objects that are too large to fit in the young generation and some corner cases when free space in the young generation is low. In the latter case
    there are different policies for different collectors which may allow an allocation directly out of the old generation in the belief that direct allocation would be more efficient for some reason. For
    example, there might not be enough room in the young generation for an exceptionally large allocation but still
    might be plenty of free space in the young generation. In such a situation it might be better to do that one large allocation out of the old generation
    rather than collect the young generation.

    Being "clever" like this has sometimes gotten us into trouble. In the corner cases of these corner cases we've
    found ourselves continually allocating out of the old generation. The really bad thing about that is that it's hard for the user (and us) to recognize what is happening. Everything looks like it's
    working. The application just slows down.

    Sorry I didn't get back sooner. I took a few days off to visit family and see the bright (Christmas) lights in New York City. New York was great. The flights were not. Although compared to Denver, our misadventures are not
    worth mentioning.
