Understanding Java class loading - part 2

This is a continuation of my earlier post.

Recap

  • The JVM calls java.lang.ClassLoader.loadClass(String) to load a class.
  • The loader on which the JVM calls loadClass is known as the initiating loader.
  • The initiating loader can delegate loading to another loader, which can itself delegate, and so on. Eventually some loader calls the java.lang.ClassLoader.defineClass method. That loader is called the defining loader.
  • To resolve class references from a class com.acme.Foo, the JVM uses the defining loader L1 of com.acme.Foo as the initiating loader.
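
The difference between initiating and defining loader can be seen with a few lines of code. This is only a minimal sketch: DelegatingLoader and definingLoaderOf are names I made up for illustration. The loader defines nothing itself, so whatever it is asked to load ends up being defined by one of its ancestors.

```java
public class InitiatingVsDefining {
    /** Hypothetical loader that defines nothing itself; it only delegates to its parent. */
    static class DelegatingLoader extends ClassLoader {
        DelegatingLoader(ClassLoader parent) {
            super(parent);
        }
    }

    /** Uses a fresh DelegatingLoader as initiating loader and returns the defining loader. */
    public static ClassLoader definingLoaderOf(String className) {
        ClassLoader initiating = new DelegatingLoader(InitiatingVsDefining.class.getClassLoader());
        try {
            return initiating.loadClass(className).getClassLoader();
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Defined by the bootstrap loader, which Java code sees as null.
        System.out.println(definingLoaderOf("java.lang.Object"));
        // Defined by whichever loader loaded this demo class, never by DelegatingLoader.
        System.out.println(definingLoaderOf(InitiatingVsDefining.class.getName()));
    }
}
```

In both calls the DelegatingLoader instance is the initiating loader, but Class.getClassLoader() reports the defining loader, which is a different loader each time.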

Notation

At runtime, a Java class is uniquely identified by the pair: fully qualified name and defining loader instance. We denote a class by


    <qualified-class-name, defining-loader>    

Examples:

    <java.lang.Object, null>
    <com.acme.MyAppClass, sun.misc.Launcher$AppClassLoader@768812>
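
To see that the name alone does not identify a class, here is a small sketch that defines the same class bytes a second time with another loader. SelectiveLoader, Payload and loadIsolated are hypothetical names used only for this illustration, and the sketch assumes the compiled .class files can be read as resources from the class path.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ClassIdentityDemo {
    /** Trivial class that gets defined twice. */
    public static class Payload { }

    /** Hypothetical loader: defines the classes named in "own", delegates everything else. */
    static class SelectiveLoader extends ClassLoader {
        private final Set<String> own;

        SelectiveLoader(ClassLoader parent, String... own) {
            super(parent);
            this.own = new HashSet<>(Arrays.asList(own));
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                if (!own.contains(name)) {
                    return super.loadClass(name, resolve);
                }
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                return c;
            }
        }

        /** Reads the same .class file that the parent loader would use. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            try (InputStream in = getResourceAsStream(name.replace('.', '/') + ".class")) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Returns Payload as defined by a fresh SelectiveLoader. */
    public static Class<?> loadIsolated() {
        String name = Payload.class.getName();
        try {
            return new SelectiveLoader(ClassIdentityDemo.class.getClassLoader(), name).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> a = Payload.class;
        Class<?> b = loadIsolated();
        System.out.println(a.getName().equals(b.getName()));   // same qualified name
        System.out.println(a == b);                            // but not the same class
        System.out.println(a.getClassLoader() + " vs " + b.getClassLoader());
    }
}
```

The two Class objects share a name and even the same bytes, yet they are distinct runtime classes because their defining loaders differ.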

When the JVM uses a loader Li as the initiating loader to load class C, we denote it by

    C^Li

(the initiating loader is written as a superscript). With both initiating loader Li and defining loader Ld, we write

    <C, Ld>^Li

Examples:

    <java.awt.Frame, null>^sun.misc.Launcher$AppClassLoader@768812
    <com.acme.AppUtil, sun.misc.Launcher$AppClassLoader@768812>^sun.misc.Launcher$AppClassLoader@768812

The first one denotes the class with the name "java.awt.Frame" defined by null (the bootstrap loader); the JVM initiated loading using the application class loader. The second one denotes the class with the name "com.acme.AppUtil" defined by the application class loader; the JVM initiated loading using the application class loader itself.

Loader constraints

There are two ways in which a Java class "A" can refer to another Java class "B":
  1. class A reads/writes a field of class B
  2. class A calls a method of class B
Consider case (1): A refers to B's field "f". Assume field "f" is of reference type "T".

class A {
    void func(B b) {
        T t =  b.f;
        // operate on "t"
    }
}

class B {
    public T f;
}


During compilation, javac verifies that B has a field "f" of the same type "T" (or a subtype of "T"). Here, "same type" during compilation just means the same fully qualified class name. At runtime, if both A and B are loaded (i.e., defined) by the same class loader, say L, then we have no problem. The JVM would use the same class loader "L" as the initiating loader to load "T", the field type. In other words, both class <A, L> and class <B, L> would have the same notion of type "T".

But what if A and B are loaded (defined) by different loaders? Let us assume <A, L1> and <B, L2> with L1 != L2. In this case, to resolve type "T" from <A, L1>, the JVM will use L1 as the initiating loader. To resolve type "T" from <B, L2>, the JVM will use L2 as the initiating loader. Unless care is taken, these two loadClass calls could result in two different classes, and hence the assumption of type equality made during compilation would break! One quick and dirty solution would be this: whenever a field is resolved at runtime, force class loading of "T" by both L1 and L2 and check equality. But that means we are leaving lazy loading behind: we are forcing early loading!

Instead, the JVM takes a different approach: it imposes loader constraints.

  • <A, L1> refers to a field "f" of type "T" from <B, L2>.
  • During this field resolution, the JVM records a loader constraint of the form T^L1 == T^L2: "T" as initiated from L1 and from L2 should result in the same type. In other words, L1.loadClass("T") should be the same as L2.loadClass("T").
  • But the JVM does not immediately verify this constraint. At each class load, the recorded loader constraints are checked. If any constraint is violated during a class load, that particular class load attempt fails!
  • Suppose that by the time of recording the constraint we already have T^L1 != T^L2, i.e., T has already been loaded using both L1 and L2 and the results are not equal. Then we are too late to impose the constraint, and a LinkageError will be thrown for the field linking.
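
The "too late" case from the last bullet can be provoked on purpose. The following is a sketch under assumptions, not production code: SelectiveLoader and readFieldAcrossLoaders are hypothetical names, the .class files are assumed to be readable as resources, and the behavior described is what I expect from HotSpot. Loader l2 defines B and its own T, loader l1 defines A and its own T, and both versions of T are loaded before A touches B's field.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.InvocationTargetException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class FieldConstraintDemo {
    public static class T { }
    public static class B { public T f; }
    public static class A {
        // A refers to B's field "f" of type T.
        public static Object readField() { return new B().f; }
    }

    /** Hypothetical loader: defines the classes named in "own", delegates everything else. */
    static class SelectiveLoader extends ClassLoader {
        private final Set<String> own;

        SelectiveLoader(ClassLoader parent, String... own) {
            super(parent);
            this.own = new HashSet<>(Arrays.asList(own));
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                if (!own.contains(name)) {
                    return super.loadClass(name, resolve);
                }
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            try (InputStream in = getResourceAsStream(name.replace('.', '/') + ".class")) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Returns the error raised while A reads B.f, or null if nothing went wrong. */
    public static Throwable readFieldAcrossLoaders() {
        String a = A.class.getName(), b = B.class.getName(), t = T.class.getName();
        try {
            ClassLoader app = FieldConstraintDemo.class.getClassLoader();
            SelectiveLoader l2 = new SelectiveLoader(app, b, t);  // defines B and its own T
            SelectiveLoader l1 = new SelectiveLoader(l2, a, t);   // defines A and its own T, gets B from l2
            l1.loadClass(t);
            l2.loadClass(t);                                      // two different T classes now exist
            l1.loadClass(a).getMethod("readField").invoke(null);
            return null;
        } catch (InvocationTargetException e) {
            return e.getCause();
        } catch (LinkageError e) {
            return e;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readFieldAcrossLoaders());
    }
}
```

If the two loadClass calls for T are left out, the constraint is only recorded and this particular program finishes without an error, because nothing ever loads a second T.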

Now consider case (2) above: class "A" calls a method of class "B". Let us say A refers to a method of the following form:


   class B {
       R foo(A1 a1, A2 a2,... An an);
   }

   class A {
      void func(B b) {
         b.foo(/* args here */);
      }
   }



and assume that all Ai (i = 1..n) are reference types and R (the return type) is also a reference type. Again, during compilation javac would have verified that a method "foo" with the correct signature exists in class "B". Now again assume classes A and B are defined by two different loaders L1 and L2. While linking this method "foo", the JVM would impose the following loader constraints:

  • A1^L1 == A1^L2
  • A2^L1 == A2^L2 ....
  • An^L1 == An^L2 and
  • R^L1 == R^L2

Again, these constraints are not verified immediately. They are just recorded in an internal data structure, and the JVM verifies them at subsequent class load attempts.

There is one more case we have ignored so far: class B extends A and overrides method "foo". Again, the overriding method B.foo is substitutable everywhere A.foo can be called (subtype substitution).

   class A {
       R foo(A1 a1, A2 a2... An an) {}
   }

   class B extends A {
       @Override R foo(A1 a1, A2 a2... An an) {}
   }

If A and B are defined by two different loaders L1 and L2, then the JVM imposes the following loader constraints at the time of preparation of B:
  • A1^L1 == A1^L2
  • A2^L1 == A2^L2 ....
  • An^L1 == An^L2 and
  • R^L1 == R^L2
As mentioned above, these constraints are just recorded initially and are verified at each subsequent class load.
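
The overriding case can be sketched the same way. Again, SelectiveLoader and linkOverridingClass are hypothetical names, the .class files are assumed to be readable as resources, and the outcome is what I expect from HotSpot. Loader l1 defines A and its own T, loader l2 defines B and its own T, and B overrides foo(T).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class OverrideConstraintDemo {
    public static class T { }
    public static class A {
        public Object foo(T t) { return "A"; }
    }
    public static class B extends A {
        @Override public Object foo(T t) { return "B"; }
    }

    /** Hypothetical loader: defines the classes named in "own", delegates everything else. */
    static class SelectiveLoader extends ClassLoader {
        private final Set<String> own;

        SelectiveLoader(ClassLoader parent, String... own) {
            super(parent);
            this.own = new HashSet<>(Arrays.asList(own));
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                if (!own.contains(name)) {
                    return super.loadClass(name, resolve);
                }
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            try (InputStream in = getResourceAsStream(name.replace('.', '/') + ".class")) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Returns the error raised while loading and linking <B, l2>, or null if there is none. */
    public static Throwable linkOverridingClass() {
        String a = A.class.getName(), b = B.class.getName(), t = T.class.getName();
        try {
            ClassLoader app = OverrideConstraintDemo.class.getClassLoader();
            SelectiveLoader l1 = new SelectiveLoader(app, a, t);  // defines A and its own T
            SelectiveLoader l2 = new SelectiveLoader(l1, b, t);   // defines B and its own T, gets A from l1
            l1.loadClass(t);
            l2.loadClass(t);                                      // two different T classes now exist
            Class.forName(b, true, l2);                           // load, link and initialize B
            return null;
        } catch (LinkageError e) {
            return e;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(linkOverridingClass());
    }
}
```

Here no method is ever called. Merely linking B is enough, because B.foo(T) must be usable wherever A.foo(T) is expected.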

Case Study

If all of the above sounds too abstract, let us consider a concrete problem: a bug we faced in a web container. Web containers support "application reloading" without restarting the web server. The container in question used class loaders for this: it reloaded an application with a different class loader. For example, if one of the .jsp files of the webapp was modified (detected by filesystem timestamp), the container would recompile that particular .jsp and reload the app using a new class loader instance. So far so good; everything worked as expected. But then someone thought it might not be optimal to reload all classes of the webapp just because one particular class, the one corresponding to the modified JSP, had changed. So he optimized with the following change:

  • Create a new class loader instance, say L2, for the reload.
  • Make the existing class loader of the same app, say L1, the parent of the new loader.
  • Make the new loader delegate all class loading to the parent loader, except for the modified "foo.jsp"'s "FooServlet.class".
This way, he achieved "sharing" of all the other, unmodified classes of the webapp. Again, everything looked right. Then someone modified a .jsp file by adding the following code:

 <!-- foo.jsp file fragment -->

 <!-- newly added part of JSP -->
 <%!
    class Util {
      public void func() {}
    }
 %>

 <%!Util u = new Util();%>


Then he attempted to "reload" the webapp and hit the infamous "loader constraint violation" at the line where "new Util()" is executed! Why? The JSP file is compiled to a servlet class, say "FooServlet.class", and his "Util" class became an inner class of FooServlet. javac passes the outer instance to the inner class constructor; this is how instance variables of the outer class are made accessible from the inner class instance.

  • Util's constructor accepts the FooServlet as a parameter. It becomes FooServlet$Util.<init>(FooServlet outerThis).
  • The old FooServlet class was loaded (defined) by loader L1.
  • The new FooServlet generated from the new JSP was loaded (defined) by L2.
  • L2 does not load "FooServlet$Util" itself: it delegated to the old loader "L1", and "L1" loaded FooServlet$Util.
  • In effect, <FooServlet, L2> refers to the constructor of <FooServlet$Util, L1>.
  • The constructor accepts a FooServlet object as the outer "this". For this method resolution, the JVM has to impose loader constraints on the parameter and return types.
  • So the JVM imposed the loader constraint FooServlet^L1 == FooServlet^L2, i.e., FooServlet as initiated from L1 and from L2 should be the same class. Clearly, this cannot be true: L1 is the old loader that defined the old FooServlet and L2 is the new loader that defined the new FooServlet. So the attempt to record this constraint failed while resolving the FooServlet$Util.<init> method!
One particular fix for this problem (ugly workaround?) would be to force L2, the new loader, to load (define) all inner, nested and anonymous classes of the FooServlet class and to delegate loading of all other classes to the parent loader.
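
The whole scenario, including the workaround, fits in one file. This is a sketch under assumptions rather than the container's actual code: WebappLoader, reload and the stand-in FooServlet are names invented for this illustration, the .class files are assumed to be readable as resources, and the described outcome is what I expect from HotSpot. The flag decides whether the new loader also defines the servlet's inner classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.InvocationTargetException;

public class InnerClassReloadDemo {
    /** Stand-in for the servlet generated from foo.jsp. */
    public static class FooServlet {
        public class Util {                 // inner class: its constructor takes the outer FooServlet
            public void func() { }
        }
        public Object makeUtil() { return new Util(); }
    }

    /** Hypothetical webapp loader: defines the servlet and, optionally, its inner classes. */
    static class WebappLoader extends ClassLoader {
        private final String servlet;
        private final boolean ownsInnerClasses;

        WebappLoader(ClassLoader parent, String servlet, boolean ownsInnerClasses) {
            super(parent);
            this.servlet = servlet;
            this.ownsInnerClasses = ownsInnerClasses;
        }

        private boolean isOwn(String name) {
            return name.equals(servlet) || (ownsInnerClasses && name.startsWith(servlet + "$"));
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                if (!isOwn(name)) {
                    return super.loadClass(name, resolve);
                }
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            try (InputStream in = getResourceAsStream(name.replace('.', '/') + ".class")) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Simulates a reload; returns the error hit by "new Util()", or null if it worked. */
    public static Throwable reload(boolean newLoaderOwnsInnerClasses) {
        String servlet = FooServlet.class.getName();
        try {
            ClassLoader app = InnerClassReloadDemo.class.getClassLoader();
            WebappLoader l1 = new WebappLoader(app, servlet, true);   // original deployment
            l1.loadClass(servlet);                                    // old FooServlet is in use
            WebappLoader l2 = new WebappLoader(l1, servlet, newLoaderOwnsInnerClasses);
            Class<?> reloaded = l2.loadClass(servlet);                // new FooServlet, defined by l2
            Object instance = reloaded.getDeclaredConstructor().newInstance();
            reloaded.getMethod("makeUtil").invoke(instance);          // executes "new Util()"
            return null;
        } catch (InvocationTargetException e) {
            return e.getCause();
        } catch (LinkageError e) {
            return e;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("delegating inner classes: " + reload(false));
        System.out.println("defining inner classes:   " + reload(true));
    }
}
```

With reload(false) the new FooServlet calls a constructor of a Util class defined by the old loader, so the two loaders must agree on FooServlet and cannot. With reload(true) both classes come from the new loader and no constraint is needed.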

Comments:

Hello!

Thanks for the informative content.

A Java "Tech Tips" article just came out that outlined "The Singleton Pattern". Then an update came out saying that if you wind up instantiating 2 instances of your class using different classloaders then you wind up not having a singleton as you might expect. I have a question; maybe you can answer this:

Would it be possible to load 2 different instances of your singleton unintentionally and then fail by assuming a == b since you made a singleton, but in fact the JVM considers the two not equal and simply returns false. Or, are you destined to always fail with a linkage error instead -- meaning that, yes, you did instantiate two different instances of what you called a singleton, but any time you refer to them from a piece of code that is "crossing" the two classloaders you will get a failure.

Hopefully that got my question out clearly.

What do you think? -- Or, what do you know...!

Thanks again,
Steven.

Posted by Steven Coco on March 15, 2006 at 07:45 PM IST #

Hi Steven: It depends on how you refer to those singleton instances "a" and "b". If the singleton class is referred to by name in a "Client" class, then two references to it (one for "a" and one for "b") will resolve to the same type: the JVM would use "Client"'s defining loader to resolve "Singleton" for the first reference and use the same class (from cache) for the second reference as well, so there won't be any conflict. That is, you can't refer to <Singleton, L1> and <Singleton, L2> [L1 != L2] from <Client, B> by name. But if you refer to one of those singletons using reflection [maybe you called L2.loadClass("Singleton") and invoked the static singleton getter method for "a", while for "b" you called Singleton.getInstance() directly], or if you compare using java.lang.Object as the type, then the comparison will just return false and there won't be any loader constraints. If you can post an example, we can analyze it.

Posted by A. Sundararajan on March 16, 2006 at 02:36 AM IST #

I don't have any code that falls under this topic -- that I know of. What came to mind was the aspect of exactly how and when you could experience a problem that didn't lead to an exception. I wondered if the cases that did not lead to an exception might boil down to some specific usages.

Your answer was definitely enlightening; and I think it answered my question. Even if I de-serialize an instance, if I am referring to it by name then the class loader is going to look up the name [in the serialized stream] in its own cache and use that definition to proceed -- having no bearing on the class loader that did in fact instantiate the object. But in fact, you won't even see a linkage error unless you attempt to invoke something on the object -- a simple comparison can't raise this exception.

It was interesting to also compare the section on enums (8.9) in the current JLS, in the first comment that explains why an enum is guaranteed to be unique (serialization and reflection are handled specially, among others); since these are similar areas of concern.

I am also intrigued by what a rigorous breakdown of what can and cannot be expected when one looks at content of the actual Java source file and relates the actual input to what is or is not possible to manufacture given only that input (relating perhaps to types and names or something) -- you might find a "boundary". But at this point you are surely allowed to let me go and "read the manual"! And I do appreciate the time you have taken for this.

Thanks again!
Steven.

Posted by Steven Coco on March 17, 2006 at 02:10 AM IST #

Hi Steven: JLS section 8.9 on enums does guarantee that we can't create instances of an enum class outside of the class itself, neither by reflection nor by deserialization. But this still does not say anything about class loaders. If we have an enum class called "Color" loaded by two different loaders, <Color, L1> and <Color, L2>, then the "corresponding" static field values are different (even if the classes are loaded from identical class bytes). Example:

File: Color.java

public enum Color {
   RED, GREEN, BLUE;
}

File: Main.java

import java.io.*;
import java.lang.reflect.*;

public class Main {
    static class MyLoader extends ClassLoader {
        public Class loadClass(String name) throws ClassNotFoundException {
            if (name.equals("Color")) {
                return findClass(name);
            } else {
                return super.loadClass(name);
            }
        }

        public Class findClass(String name) throws ClassNotFoundException {
            // we load the class from current directory
            File f = new File(name.replace(".", "/") + ".class");
            byte[] buf = new byte[(int) f.length()];
            // readFully reads the whole file; try-with-resources closes the stream
            try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
                in.readFully(buf);
                return defineClass(name, buf, 0, buf.length);
            } catch (IOException exp) {
                throw new ClassNotFoundException(exp.getMessage(), exp);
            }
        }

    }

    public static void main(String[] args) throws Exception {
        MyLoader loader = new MyLoader();
        Class colorClazz = loader.loadClass("Color");

        /* If you use the following line instead of the one above, red and Color.RED will compare equal */
        // Class colorClazz = Class.forName("Color");

        Field redField = colorClazz.getDeclaredField("RED");
        Object red = redField.get(null);

        // When I refer as "Color" by name, the application class loader
        // loaded it. "RED" field from that class is different from "RED"
        // field of <Color, loader>


        System.out.println(colorClazz.getClassLoader());
        System.out.println(Color.class.getClassLoader());

        if (red == Color.RED) { 
            System.out.println("What? How is that possible??"); 
        } else { 
            System.out.println("Yes, I expected this!");
        }  
    }
}

In the above code, "Yes, I expected this!" is printed - well, as expected :-)

Posted by A. Sundararajan on March 17, 2006 at 08:53 AM IST #

Hi!

That example did behave as I would have expected. It was clear. The behavior produced by your commented optional line in the main method is also understood: the application class loader is used to load the class. And that does point to my thoughts about the boundary provided or implied by the syntax: You referred to the class by name in the getClass() invocation; but specifically, the code does not contain a parsed identifier, or something that is checked by the compiler. And you were able to load 2 different classes. So; I wouldn't say that I've narrowed it all down to exactly when you can be holding 2 different classes in the same block, but I do know some more now!

Thanks again for the helpful information!

Steven.

Posted by Steven Coco on March 17, 2006 at 06:51 PM IST #
