Presenting the Permanent Generation

Have you ever wondered how the permanent generation fits into our generational system? Ever been curious about what's in the permanent generation? Are objects ever promoted into it? Ever promoted out? Well, you're not alone. Here are some of the answers.

Java objects are instantiations of Java classes. Our JVM has an internal representation of those Java objects and those internal representations are stored in the heap (in the young generation or the tenured generation). Our JVM also has an internal representation of the Java classes and those are stored in the permanent generation. That relationship is shown in the figure below.

The internal representation of a Java object and the internal representation of a Java class are very similar. From this point on let me just call them Java objects and Java classes and you'll understand that I'm referring to their internal representations. The Java objects and Java classes are similar to the extent that during a garbage collection both are viewed just as objects and are collected in exactly the same way. So why store the Java classes in a separate permanent generation? Why not just store the Java classes in the heap along with the Java objects?

Well, there is a philosophical reason and a technical reason. The philosophical reason is that the classes are part of our JVM implementation and we should not fill up the Java heap with our data structures. The application writer has a hard enough time understanding the amount of live data the application needs and we shouldn't confuse the issue with the JVM's needs.

The technical reason comes in parts. Firstly the origins of the permanent generation predate my joining the team so I had to do some code archaeology to get the story straight (thanks Steffen for the history lesson).

Originally there was no permanent generation. Objects and classes were just stored together.

Back in those days classes were mostly static. Custom class loaders were not widely used and so it was observed that not much class unloading occurred. As a performance optimization the permanent generation was created and classes were put into it. The performance improvement was significant back then. With the amount of class unloading that occurs with some applications, it's not clear that it's always a win today.

It might be a nice simplification to not have a permanent generation, but the recent implementation of the parallel collector for the tenured generation (aka parallel old collector) has made a separate permanent generation again desirable. The issue with the parallel old collector has to do with the order in which objects and classes are moved. If you're interested, I describe this at the end.

So the Java classes are stored in the permanent generation. What all does that entail? Besides the basic fields of a Java class there are

  • Methods of a class (including the bytecodes)
  • Names of the classes (in the form of an object that points to a string also in the permanent generation)
  • Constant pool information (data read from the class file, see chapter 4 of the JVM specification for all the details).
  • Object arrays and type arrays associated with a class (e.g., an object array containing references to methods).
  • Internal objects created by the JVM (java/lang/Object or java/lang/Exception for instance)
  • Information used for optimization by the compilers (JITs)

That's it for the most part. There are a few other bits of information that end up in the permanent generation but nothing of consequence in terms of size. All these are allocated in the permanent generation and stay in the permanent generation. So now you know.
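If you want to see which memory pools your own JVM reports, the standard java.lang.management API will list them. What follows is a minimal sketch, not a description of how the JVM lays out the permanent generation internally. The class name MemoryPools and the method describePools are made up for illustration. Pool names are specific to the JVM version and the collector in use, so don't assume a pool with "Perm Gen" in its name will show up; later JVMs have no permanent generation at all.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named only for this example.
class MemoryPools {
    /** Builds one line per memory pool: name, type (heap or non-heap), bytes in use. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();        // null if the pool is no longer valid
            long used = (usage == null) ? -1 : usage.getUsed();
            lines.add(pool.getName() + " [" + pool.getType() + "] used=" + used);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Pool names vary by JVM and collector; print whatever this JVM reports.
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

On a JVM that has a permanent generation, it appears among the non-heap pools, which fits the point above that class data is kept out of the Java heap.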

This last part is really, really extra credit. During a collection the garbage collector needs to have a description of a Java object (i.e., how big is it and what does it contain). Say I have an object X and X has a class K. I get to X in the collection and I need K to tell me what X looks like. Where's K? Has it been moved already? With a permanent generation during a collection we move the permanent generation first so we know that all the K's are in their new location by the time we're looking at any X's.

How do the classes in the permanent generation get collected while the classes are moving? Classes also have classes that describe their content. To distinguish these classes from those classes we spell the former klasses. The classes of klasses we spell klassKlasses. Yes, conversations around the office can be confusing. Klasses are instantiations of klassKlasses so the klassKlass KZ of klass Z has already been allocated before Z can be allocated. Garbage collections in the permanent generation visit objects in allocation order and that allocation order is always maintained during the collection. That is, if A is allocated before B then A always comes before B in the generation. Therefore if a Z is being moved it's always the case that KZ has already been moved.
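The allocation-order argument can be sketched with a toy model. This is not HotSpot code. The names SlidingCompaction, Obj, describedBy and compact are invented for the example, and having the root klassKlass describe itself is an assumption made so the toy has a base case. The sketch slides live objects toward address 0 in allocation order and counts how often an object is visited before its describer has been moved.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; all names here are hypothetical.
class SlidingCompaction {
    /** Stand-in for anything stored in the generation. */
    static final class Obj {
        final String name;
        final int size;
        final boolean live;
        Obj describedBy;       // the klass (or klassKlass) that describes this object's layout
        int newAddress = -1;   // -1 means "not moved yet"

        Obj(String name, int size, boolean live) {
            this.name = name;
            this.size = size;
            this.live = live;
        }
    }

    /**
     * Visits objects in the order given, assigning each live object the next free address.
     * Returns how many live objects were visited before their describer had been moved.
     */
    static int compact(List<Obj> visitOrder) {
        int free = 0;
        int violations = 0;
        for (Obj o : visitOrder) {
            if (!o.live) {
                continue;                        // dead objects are simply skipped
            }
            boolean selfDescribing = (o.describedBy == o);
            if (!selfDescribing && o.describedBy.newAddress < 0) {
                violations++;                    // we would not know where the describer is
            }
            o.newAddress = free;
            free += o.size;
        }
        return violations;
    }

    /** Three objects in allocation order: a klassKlass, a dead klass, then klass Z. */
    static List<Obj> example() {
        Obj klassKlass = new Obj("klassKlass", 4, true);
        klassKlass.describedBy = klassKlass;     // assumption: the root describes itself
        Obj dead = new Obj("dead klass", 8, false);
        dead.describedBy = klassKlass;
        Obj klassZ = new Obj("klass Z", 6, true);
        klassZ.describedBy = klassKlass;

        List<Obj> order = new ArrayList<>();
        order.add(klassKlass);
        order.add(dead);
        order.add(klassZ);
        return order;
    }
}
```

Visiting the example in allocation order produces no violations, because the klassKlass was allocated first and so is moved first. Visiting the same objects in reverse order produces one, which is the situation the allocation-order guarantee rules out.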

And why not use the same knowledge about allocation order to eliminate the permanent generation even in the parallel old collector case? The parallel old collector does maintain allocation order of objects, but objects are moved in parallel. When the collection gets to X, we no longer know if K has been moved. It might be in its new location (which is known) or it might be in its old location (which is also known) or part of it might have been moved (but not all of it). It is possible to keep track of where K is exactly, but it would complicate the collector and the extra work of keeping track of K might make it a performance loser. So we take advantage of the fact that classes are kept in the permanent generation by collecting the permanent generation before collecting the tenured generation. And the permanent generation is currently collected serially.

Comments:

A couple of clarifications:

1) Is it right to state that when (and only when) the perm generation needs to be GC'ed (which is presumably less often than young and tenured generations), it precedes collections in the young and tenured generations? At all other times, major/minor collections go on independent of it?

2) From what I gather, is Klass an internal representation of a class object, and an instantiation of klassKlass?

Thanks

Posted by Bharath R on November 28, 2006 at 03:18 PM PST #

Very interesting, thanks. Some follow-up questions I have for a future blog: 1) What, exactly, is the effect of calling System.gc()? 2) Why, exactly, is hotswapping of classes (changing class definitions on-the-fly) difficult to implement on the JVM? I've never seen a thorough answer, only suggestions that it has to do either with security or with optimization. 3) There are some known issues with some J2EE containers, where one can run out of perm gen space on redeploying multiple times--is this an issue with the custom classloader implementation, or is it just a limitation of the design of the perm generation? The solution I read and used was to increase the max perm generation size. Thanks for posting this, it's great information. Regards Patrick

Posted by Patrick Wright on November 28, 2006 at 05:04 PM PST #

Thanks!

I assume that native machine code (the output of C1/C2) is stored in the PermGen hanging off the class, like the bytecodes. Does this mean that the native code has to be PIC (Position Independent Code) to allow it to be shifted around during GC and is the penalty for this significant (I've seen a few percent between static and shared versions of the same lib for tight code)? If so, would it be worth having a REALLY PermPermGen for code that is REALLY heavily used so that it can be pinned down and use non-PIC instructions?

Rgds

Damon

Posted by Damon Hart-Davis on November 28, 2006 at 11:56 PM PST #

Bharath,

The permanent generation is collected every time the tenured generation is collected. It's not really a separate generation the way the young generation is a separate generation and there is no mechanism to collect it separately. The tenured generation and the permanent generation are collected when either becomes full.

Yes, Klass is an internal representation of class. It is like an instantiation of KlassKlass in that KlassKlass describes a Klass the way Klass describes a Java object. Please ask again if that is not clear.

Posted by Jon Umasasmit on November 29, 2006 at 05:27 AM PST #

Patrick,

I'll do the easy ones here.

1) In our JVM a System.gc() does a full collection (young generation, tenured generation, and permanent generation). The specification for System.gc() says that a call to it is advisory so other actions (including no action) can occur on other JVMs.
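One way to see what System.gc() does on a particular JVM is to compare the collection counts reported by the standard GarbageCollectorMXBean interface before and after the call. This is only a sketch. The class name ExplicitGc and its methods are made up for the example, and because the call is advisory the difference may be zero on some JVMs or with some settings.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, named only for this example.
class ExplicitGc {
    /** Sums the collection counts of all collectors, skipping any that report -1 (undefined). */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Requests a collection and returns how many collections were recorded across the call. */
    static long collectionsAcrossGcCall() {
        long before = totalCollections();
        System.gc();                       // advisory: the JVM may do a full collection or nothing
        return totalCollections() - before;
    }

    public static void main(String[] args) {
        System.out.println("collections recorded across System.gc(): " + collectionsAcrossGcCall());
    }
}
```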

2) Regarding hotswapping I'll see if I can get someone to explain better than I can, but my understanding is that the difficulty has to do with the treatment of objects that have already been instantiated from a class that is being hotswapped. If the hotswapping adds fields or methods to the class or modifies existing methods, what should the JVM do about the original instantiations and, more interesting, what should it do about methods of the original instantiations that are executing?

3) With regard to redeploying and filling up the permanent generation, the permanent generation is pretty much agnostic about what it contains. All the permanent generation does is allocate space and it doesn't know what it's used for. However, a class and its classloader both have to be unreachable in order for them to be unloaded. A class X with classloader A and the same class X with classloader B will result in two distinct objects (klasses) in the permanent generation. Lots of that and lots of redeployments would cause pressure on the permanent generation. But I'm really guessing here. Try running "jmap -permstat" in JDK5 and later (on Solaris and Linux) and that will give you information about the classes in the permanent generation.
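The point about class X being loaded by classloaders A and B can be tried out in plain Java. This is a minimal sketch under some assumptions: TwoLoaders, DefiningLoader, emptyClassBytes and loadTwice are names invented for the example, and the class file is assembled by hand (an empty public class with no fields or methods) only so the example needs no compiler or files on disk. A container would load real class files instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, named only for this example.
class TwoLoaders {
    /** Hand-assembles the class file of an empty public class in the unnamed package. */
    static byte[] emptyClassBytes(String simpleName) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);                 // magic number
            out.writeShort(0);                        // minor version
            out.writeShort(52);                       // major version 52
            out.writeShort(5);                        // constant pool count (entries + 1)
            out.writeByte(7); out.writeShort(2);      // #1 Class, name at #2
            out.writeByte(1); out.writeUTF(simpleName);          // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);      // #3 Class, name at #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8: superclass
            out.writeShort(0x0021);                   // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                        // this_class = #1
            out.writeShort(3);                        // super_class = #3
            out.writeShort(0);                        // no interfaces
            out.writeShort(0);                        // no fields
            out.writeShort(0);                        // no methods
            out.writeShort(0);                        // no attributes
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** A loader that defines a class directly from bytes instead of delegating to its parent. */
    static final class DefiningLoader extends ClassLoader {
        Class<?> define(String name, byte[] classFile) {
            return defineClass(name, classFile, 0, classFile.length);
        }
    }

    /** Defines the same class bytes in two separate loaders and returns both results. */
    static Class<?>[] loadTwice(String simpleName) {
        byte[] classFile = emptyClassBytes(simpleName);
        Class<?> viaA = new DefiningLoader().define(simpleName, classFile);
        Class<?> viaB = new DefiningLoader().define(simpleName, classFile);
        return new Class<?>[] { viaA, viaB };
    }

    public static void main(String[] args) {
        Class<?>[] pair = loadTwice("DemoX");
        System.out.println(pair[0].getName() + " vs " + pair[1].getName()
                + ", same Class object: " + (pair[0] == pair[1]));
    }
}
```

The two Class objects have the same name but are not the same class, and each stays loaded for as long as it or its loader is reachable.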

Posted by Jon Umassamit on November 29, 2006 at 06:40 AM PST #

Damon,

Actually native code generated by C1 and C2 is not stored in the permanent generation. Perhaps that is to avoid the performance problem you mention. The compilers allocate directly out of the C heap. When a class (and its methods) is unloaded, the code for any compiled methods is put on a list for deallocation.

Posted by Jon Umassamit on November 29, 2006 at 06:56 AM PST #

Patrick,

Here's a paper that talks about implementing hotswapping. Part of the paper is a discussion on why and how the implementation would be staged. Section 3 lists the proposed stages and will give you an idea of some of the complexity involved. Sections 4 and 5 go into more detail.

http://www.experimentalstuff.com/Technologies/HotSwapTool/runtime-evol.pdf

Posted by Jon Umsasamit on November 29, 2006 at 11:47 PM PST #

I've really been enjoying this series of articles. One minor typo I noticed in this one: "constant poll" instead of "constant pool".

Posted by Elliott Hughes on November 30, 2006 at 12:47 AM PST #

Elliott,

Thanks for pointing out the typo. By the way, I fixed it by just editing my blog. I don't know if there is any external indication that it's been updated. So the typo really was there, and now it's not.

Posted by Jon Usmasamit on December 05, 2006 at 02:23 AM PST #

Just one question. This may not be related to this post. Is a "static final" directly allocated into old gen? I guess in most cases it will most likely land in old gen over time? Thanks for the nice posts. Sachin

Posted by Sachin on December 17, 2006 at 12:21 PM PST #

In general everything is allocated out of the young generation. The exceptions are objects that are too large to fit in the young generation and some corner cases when free space in the young generation is low. In the latter case there are different policies for different collectors which may allow an allocation directly out of the old generation in the belief that direct allocation would be more efficient for some reason. For example, there might not be enough room in the young generation for an exceptionally large allocation but still might be plenty of free space in the young generation. In such a situation it might be better to do that one large allocation out of the old generation rather than collect the young generation.

Being "clever" like this has sometimes gotten us into trouble. In the corner cases of these corner cases we've found ourselves continually allocating out of the old generation. The really bad thing about that is that it's hard for the user (and us) to recognize what is happening. Everything looks like it's working. The application just slows down.

Sorry I didn't get back sooner. I took a few days off to visit family and see the bright (Christmas) lights in New York City. New York was great. The flights were not. Although compared to Denver, our misadventures are not worth mentioning.

Posted by Jon Usmasamti on December 22, 2006 at 03:43 AM PST #
