The JSR292 endgame

The current JSR292 proposal has come a very long way and is indeed very close to its final design. A lot of problems and design choices have been resolved successfully over time by a worldwide group led by John Rose. The current design looks good. However, as a member of the JRockit team that will have to implement JSR292 in our JVM, I feel a small trepidation over the proposed transform API in the MethodHandles factory class.

Before examining the details of the proposed API, I would like to summarize JSR292 for readers who have not followed the design process. The latest JSR292 API proposal offers three different concepts that together make the life of the dynamic language runtime developer significantly easier. Easier means that generation of bytecode can be avoided at runtime. In fact, all of the problems described below can be (and have been) solved by generating bytecode at runtime. If only a few classes had to be generated at runtime, this approach would probably work. Unfortunately, the pre-JSR292 runtime for Ruby required huge numbers of classes to be generated. The JVM could therefore not be said to have a good abstraction layer for supporting dynamic languages.

[Figure: threeblobs.png]

The Marker

When a Ruby program is compiled into Java bytecode a new bytecode instruction (invokedynamic) indicates that a dynamic language runtime specific call occurs at this point. The destination will be resolved by the dynamic language runtime, not through any JVM specific code. Remember, it is spelled invokedynamic, but it is pronounced build call site. A bytecode disassembly could look like this:

bytecode example:
bipush        123
invokedynamic #9; //Method java/dyn/InvokeDynamic.dosomething:(I)I

java example:
java.dyn.InvokeDynamic.dosomething<int>(123);

The javac support for compiling method calls prefixed with java.dyn.InvokeDynamic into the invokedynamic bytecode is useful for testing and evaluating invokedynamic with a pure Java program. The syntax <int> is necessary to tell javac what return type should be encoded at the call site, since this is not visible from the Java code itself. (Normal Java calls fetch the return type from the class definition.)

The method name, dosomething, is in fact not a method. It is only information handed to the dynamic language runtime to assist it in choosing the correct destination. No dosomething need actually exist anywhere. This leads us to the next part: how does the dynamic runtime make the call site identified by the string "dosomething" call, let's say, the static negate method in the class Reziver?

The Binding

Virtual and interface calls are not flexible enough for the dynamic language runtime to bind the call site to a destination. It must be possible, during runtime, to make class A call a method in class B even though A has never heard of B, B's interfaces or inheritance hierarchy before. To solve this problem the MethodHandle was introduced. The MethodHandle is not something new; similar constructs have existed in other OO languages for quite some time (Smalltalk/Squeak perform:, Objective-C performSelector:, C++ pointer to member function).

The method handle is a generic concept and is a worthwhile addition to the Java standard by itself. Let's see an example of how a method handle is created and used. In the example, we want to call the static method Reziver.negate with the argument 123:

bytecode example:
ldc #9; //Method Reziver.negate:(I)I
bipush 123
invokevirtual #6; //Method java/dyn/MethodHandle.invoke:(I)I

java example:
MethodHandle mh = #Reziver.negate;
mh.invoke<int>(123);

Here I propose a Java syntax for acquiring a MethodHandle. Currently JSR292 gives several factory methods to do this. Unfortunately these do not offer javac compile time checking which means that if you spell the name of the method wrong, you will only find out later, at runtime.

Using this syntax, javac has to perform a lexical lookup of Reziver.negate. If there had been more than one static negate in the class Reziver, we might have had to specify the signature for the benefit of javac:

MethodHandle mh = #Reziver.negate<int>(int);
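For comparison, here is a minimal runnable sketch of the factory-method route, written against the java.lang.invoke API that later shipped with Java 7. That API is an assumption on my part about what the reader has available; it is not the # syntax proposed in this post, and the class name HandleLookupSketch and the nested stand-in Reziver are made up for illustration. Note how the method name is just a string, so a misspelling is only detected at runtime:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class HandleLookupSketch {
    // Stand-in for the Reziver class used in the text.
    static class Reziver {
        static int negate(int i) { return -i; }
    }

    // Looks up Reziver.negate by name and descriptor, then calls it.
    static int callNegate(int value) {
        try {
            MethodHandle mh = MethodHandles.lookup().findStatic(
                    Reziver.class, "negate",
                    MethodType.methodType(int.class, int.class));
            // invokeExact requires the call site type to be exactly (int)int.
            return (int) mh.invokeExact(value);
        } catch (Throwable t) {
            // A misspelled method name would surface here, at runtime.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callNegate(123));
    }
}
```

The explicit MethodType argument plays the role of the <int>(int) suffix in the proposed syntax: it selects one overload when several methods share a name.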

If the method is not static, then we would get a method handle that requires a receiver. Without the receiver the method would not know which instance to work on:

Reziver r = new Reziver();
MethodHandle mh = #Reziver.update<void>(int);
mh.invoke<void>(r,123);

A method handle that needs a receiver can be bound permanently to a specific receiver. This can be done with the factory method MethodHandles.bind(mh, alpha), or using the syntax I propose here:

Reziver r = new Reziver();
MethodHandle mh = #r.update;
mh.invoke<void>(123);
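The difference between an unbound and a bound handle can be sketched in runnable form as follows. This again assumes the java.lang.invoke API that later shipped with Java 7, where binding is spelled bindTo rather than MethodHandles.bind; BindReceiverSketch and its nested Reziver are hypothetical names:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class BindReceiverSketch {
    // Stand-in for the Reziver class used in the text.
    static class Reziver {
        int v;
        void update(int i) { v = i; }
    }

    // Returns the field value after the unbound call and after the bound call.
    static int[] updateThroughHandles() {
        try {
            Reziver r = new Reziver();
            // Unbound handle: the receiver is the first argument,
            // so the handle type is (Reziver,int)void.
            MethodHandle unbound = MethodHandles.lookup().findVirtual(
                    Reziver.class, "update",
                    MethodType.methodType(void.class, int.class));
            unbound.invokeExact(r, 123);
            int afterUnbound = r.v;

            // Bound handle: the receiver is fixed, the type shrinks to (int)void.
            MethodHandle bound = unbound.bindTo(r);
            bound.invokeExact(456);
            return new int[] { afterUnbound, r.v };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        int[] values = updateThroughHandles();
        System.out.println(values[0] + " " + values[1]);
    }
}
```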

The dynamic language runtime will be called by the JVM through a bootstrap method (examples given here). Within this method the dynamic language runtime will have access to a call site variable (CallSite cs). Assume that the dynamic language runtime wants to make the callsite calling "dosomething" execute (static) method Reziver.negate, then it would execute the code:

cs.setTarget(#Reziver.negate);

This would transform:

java.dyn.InvokeDynamic.dosomething<int>(123);

into:

(#Reziver.negate).invoke<int>(123);
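Plain Java source cannot emit an invokedynamic instruction, but the relinking behaviour of setTarget can be approximated with a runnable sketch. It assumes the MutableCallSite class from the java.lang.invoke API that later shipped with Java 7, uses the call site's dynamicInvoker() as a stand-in for the invokedynamic instruction, and introduces a hypothetical second target named twice:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class CallSiteSketch {
    static int negate(int i) { return -i; }
    static int twice(int i) { return 2 * i; }

    // Calls the same call site twice, relinking it in between.
    static int[] relink() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(int.class, int.class);

            // The call site is created with a fixed type, like the (I)I
            // descriptor on the invokedynamic instruction.
            MutableCallSite cs = new MutableCallSite(type);
            // The invoker always dispatches to the current target.
            MethodHandle invoker = cs.dynamicInvoker();

            cs.setTarget(lookup.findStatic(CallSiteSketch.class, "negate", type));
            int first = (int) invoker.invokeExact(123);

            cs.setTarget(lookup.findStatic(CallSiteSketch.class, "twice", type));
            int second = (int) invoker.invokeExact(123);
            return new int[] { first, second };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        int[] results = relink();
        System.out.println(results[0] + " " + results[1]);
    }
}
```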

Transform

Dynamic languages not only bind methods to call sites very late, dynamic languages also do a large amount of argument conversion automatically. For example a string "578" is easily translated into an integer argument only for the purpose of a single method call.

A dynamic language runtime developer can easily create a transform class with an adapt method suitable for being called through a method handle. The adapt method will adjust/adapt the arguments before calling the destination method handle. For example assume that the invokedynamic callsite is:

java.dyn.InvokeDynamic.dosomething<int>(123);

But the target selected by the dynamic runtime has a different descriptor (aka MethodType in JSR292 speak):

#Reziver.add<int>(int,int)

This is easy enough. The adapt method only needs to call the target with an extra integer argument, for example the constant 10. But this assumes that the dynamic language runtime developer had anticipated this and created a class with such a method beforehand. OK, not completely unthinkable. Let's have a look at such a class:

class AppendIntArgument {
   MethodHandle target;
   int extra_int;

   public static MethodHandle create(MethodHandle t, int extra) 
		throws Cantdothat
   {
        if (!t.type().equals(#<int>(int,int))) throw new Cantdothat();  
        AppendIntArgument a = new AppendIntArgument();
        a.target = t;
        a.extra_int = extra;      
        return #a.adapt;
   }  

   private int adapt(int i) { return target.invoke<int>(i, extra_int); }
}

The dynamic language runtime would then bind the call site to its destination with the code:

cs.setTarget(AppendIntArgument.create(#Reziver.add, 10));

But then, suddenly, the runtime has to append an integer to a call with the signature (String,Socket) and this is where it gets complicated for the poor dynamic runtime developer. Clearly it is unthinkable that all possible signatures and targets could be covered beforehand! (And do not even mention bytecode generation, we are trying to get away from that!)

The current solution and its problems

The current JSR292 proposal tries to solve this by creating a factory for generating these adapter classes. The factory interface is implemented by the JVM and is therefore final. The available adapt commands include appendArgument, prependArgument and dropArgument, as well as commands that cast arguments, check the type of arguments, etc. The factory is supposed to be able to generate all "reasonable" adapter classes. This can be done by iteratively applying a transform on a transform on a method handle.
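To make the factory style concrete, here is a small runnable sketch. It assumes the java.lang.invoke API that later shipped with Java 7, where the corresponding factory methods are named insertArguments and dropArguments; those names may not match the draft names mentioned above, and FactoryAdapterSketch is a hypothetical class:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class FactoryAdapterSketch {
    static int add(int a, int b) { return a + b; }

    // Builds two adapters by stacking transforms and calls each once.
    static int[] adapt() {
        try {
            MethodHandle add = MethodHandles.lookup().findStatic(
                    FactoryAdapterSketch.class, "add",
                    MethodType.methodType(int.class, int.class, int.class));

            // Fix the second argument to 10: (int,int)int becomes (int)int.
            MethodHandle addTen = MethodHandles.insertArguments(add, 1, 10);

            // A transform on a transform: accept and ignore a leading String,
            // so (int)int becomes (String,int)int.
            MethodHandle ignoring =
                    MethodHandles.dropArguments(addTen, 0, String.class);

            int first = (int) addTen.invokeExact(123);
            int second = (int) ignoring.invokeExact("unused", 123);
            return new int[] { first, second };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        int[] results = adapt();
        System.out.println(results[0] + " " + results[1]);
    }
}
```

Every adapter here is produced by a JVM-provided factory call; the runtime developer never writes the adapt logic in Java.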

This is the source of my trepidation! Why should we limit ourselves to "reasonable" transforms set in stone by the JVM? What happens if a runtime really needs an unreasonable transform? Clearly the dynamic language runtime developer cannot prepare the new transform for all signatures, therefore the runtime has to generate bytecode, and this we do not want!

There will also be a huge number of new adapter classes, even though the creation of these is hidden behind a factory interface. The large number of classes is also something that the pre-JSR292 dynamic language runtimes had problems with. Java is not designed (and definitely not implemented) to have millions of classes. (It is quite feasible, though, to have billions of instances of classes, as long as your heap is large enough.)

Of course the JVM might implement the adapter classes in a completely different way internally. But this would not help the JVM implementors! It would still require a serious investment in new infrastructure. For JVMs with an interpreter it is far too easy to simply concatenate a bit of bytecode internally and call it anonymous classes and execute it. Our JVM, JRockit, does not have an interpreter, everything is compiled. We cannot take the easy route using an interpreter.

Clearly we want to be able to express the transforms using the Java language itself! If there was a way we could make the above AppendIntArgument class to express AppendArgument for any class and primitive type within Java, then we would not need the magic factory class!

I can hear the C++ crowd shouting templates below my window here in Stockholm. But no, the solution is much simpler than that.

The new solution and why it is better

The solution is to loosen up the requirements on an exact MethodType match at the call site when invoking a method handle. Thus, a method handle invoke will always be attempted as long as the MethodType of the handle and the callsite agree on the number of arguments! This does sound counterintuitive, I know, but please read on!

What do I mean by attempted? I mean that as long as the number of arguments is the same, the interpreter will cast each argument to an Object (after boxing any primitive arguments). Then the interpreter will cast each argument to the correct type of the destination. If the destination argument is a primitive, the Object is first cast to, for example, Integer, and then unboxed. If there is any mismatch, the invocation will fail with a ClassCastException.
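A rough runnable approximation of this behaviour: in the java.lang.invoke API that later shipped with Java 7, the generic MethodHandle.invoke (as opposed to invokeExact) boxes, casts and unboxes arguments in a similar spirit. I am assuming that API here, and it is not necessarily identical to the calling convention proposed in this post; LooseInvokeSketch is a hypothetical class:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class LooseInvokeSketch {
    static int negate(int i) { return -i; }

    // Calls negate through a call site whose type is (Object)Object.
    static Object invokeLoosely(Object arg) {
        try {
            MethodHandle mh = MethodHandles.lookup().findStatic(
                    LooseInvokeSketch.class, "negate",
                    MethodType.methodType(int.class, int.class));
            // The Object argument is cast to Integer and unboxed for the
            // target; the int result is boxed again for the caller.
            return mh.invoke(arg);
        } catch (ClassCastException e) {
            throw e; // the argument could not be cast to the target type
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // True when the loose invocation is attempted but a cast fails.
    static boolean failsWithClassCast(Object arg) {
        try {
            invokeLoosely(arg);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeLoosely(5));
        System.out.println(failsWithClassCast("five"));
    }
}
```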

It might sound like this will simply delay the failure of a call until a cast fails. Well, it does, but the real reason is that by using this design you can always call a method that takes the same number of arguments, but where all arguments are of the class Object! And just as important, you can call any method handle using only Object references as arguments. If you link together several adapters working with Object references, the cast (and potential box) will only happen at the initial invoke. As long as it is adapter code calling adapter code, all arguments remain Object references. Finally, when the real target is invoked, the references are downcast to the required argument types.

With this single change of the MethodHandle invocation principle it is now possible to write class independent argument appending adapters! Let's have a look at that example again:

class AppendArgument {
   final MethodHandle target;
   final Object extra;

   public static MethodHandle create(MethodHandle t, Object e) 
		throws Cantdothat
   {
        AppendArgument a = new AppendArgument(t,e);
        if (t.type().parameterCount() == 0) throw new Cantdothat();
        else if (t.type().parameterCount() == 1) return #a.adapt1; 
        else if (t.type().parameterCount() == 2) return #a.adapt2;
        else if (t.type().parameterCount() == 3) return #a.adapt3;
        throw new Cantdothat();
   }  

   private AppendArgument(MethodHandle t, Object e) {
        target = t; 
        extra = e;
   }

   private Object adapt1() { 
        return target.invoke<Object>(extra); 
   }
   private Object adapt2(Object o) { 
        return target.invoke<Object>(o, extra); 
   }
   private Object adapt3(Object o, Object p) { 
        return target.invoke<Object>(o, p, extra); 
   }
}

Obviously we need as many adapt methods as we might have arguments. But this is definitely manageable. (If there is a dynamic language that regularly deals with 20+ arguments to method calls, it might want to create an adapt class with 20+ methods that transform arguments into a method handle with the type #<Object>(Object[]) and vice versa. Then the dynamic language specific adapter can work on object arrays instead.)

Therefore a single Java class is enough for appending an argument to all possible calls with all possible signatures and return types! The dynamic language runtime would then bind the call site to its destination with the code:

cs.setTarget(AppendArgument.create(#Reziver.add<int>(int,int), 10));
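The AppendArgument idea can be tried out today in runnable form. The sketch below assumes the java.lang.invoke API that later shipped with Java 7, whose generic invoke is used here as a stand-in for the loose invocation proposed above; the class name AppendArgumentSketch, the use of IllegalArgumentException instead of Cantdothat, and the join target are my own illustrative choices:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class AppendArgumentSketch {
    final MethodHandle target;
    final Object extra;

    private AppendArgumentSketch(MethodHandle t, Object e) {
        target = t;
        extra = e;
    }

    // Returns a bound handle that takes one argument fewer than the target.
    static MethodHandle create(MethodHandle t, Object e)
            throws ReflectiveOperationException {
        int n = t.type().parameterCount();
        if (n < 1 || n > 3) {
            throw new IllegalArgumentException("unsupported arity " + n);
        }
        AppendArgumentSketch a = new AppendArgumentSketch(t, e);
        // adaptN has n - 1 Object parameters and returns Object.
        MethodType adapterType = MethodType.genericMethodType(n - 1);
        return MethodHandles.lookup()
                .findVirtual(AppendArgumentSketch.class, "adapt" + n, adapterType)
                .bindTo(a);
    }

    // One adapt method per supported arity; all of them only see Objects.
    Object adapt1() throws Throwable { return target.invoke(extra); }
    Object adapt2(Object o) throws Throwable { return target.invoke(o, extra); }
    Object adapt3(Object o, Object p) throws Throwable {
        return target.invoke(o, p, extra);
    }

    // Two unrelated targets served by the same adapter class.
    static int add(int a, int b) { return a + b; }
    static String join(String a, String b) { return a + b; }

    static Object[] demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle addTen = create(lookup.findStatic(
                    AppendArgumentSketch.class, "add",
                    MethodType.methodType(int.class, int.class, int.class)), 10);
            MethodHandle exclaim = create(lookup.findStatic(
                    AppendArgumentSketch.class, "join",
                    MethodType.methodType(String.class, String.class, String.class)), "!");
            return new Object[] { addTen.invoke(123), exclaim.invoke("Hello") };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Object[] results = demo();
        System.out.println(results[0] + " " + results[1]);
    }
}
```

The same class serves both an (int,int)int target and a (String,String)String target, which is the point of the proposal: instances of one class instead of one class per signature.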

You might think that, even though this looks like a good theoretical solution to the problem of adapting calls, it will not be fast! However, this is not so! Firstly, if a call chain is not hot, it does not matter how convoluted the call chain is! Secondly, if a call chain is indeed hot, then it will be compiled with all the optimization techniques available to the JVM. Thirdly, all the necessary compilation techniques to generate efficient code from the append adapter above already exist in today's JVMs!

I will first show that it is indeed possible to compile such a loose method handle invocation efficiently even when the method handle is not final and not bound. (I will soon tell you what is possible when the method handle is indeed final and bound.)

For those who have not worked with JVM compilers before, invokestatic, invokevirtual and invokeinterface are all compiled into different types of call sites. It is therefore not at all surprising that invokevirtual on a MethodHandle will be yet another call site. Let's have a look at some pseudo-code for such a method handle call site:

The invocation of the bound method handle mh = #reziver.negate:

    mh.invoke<int>(123);

is compiled into the following pseudo-code:

if (mh.type().equals(#<int>(int))) {
   // If mh is un-bound, do virtual lookup on first argument,
   // then remove this argument for the actual call:
   call-method-standard-entry // since arguments now match perfectly.
} else if (mh.type().parameterCount() == 1) {
   // If mh is un-bound, the first argument is the receiver. Its type
   // must be an instance of the receiver type, or else we cannot do a
   // virtual lookup.
   if (mh is un-bound) {
       if (!(arg0 instanceof mh.type().parameterType(0))) {
           throw WrongMethodTypeException
       } 
       ...remove receiver argument
   }
   ...Box all primitive arguments and cast to Object
   ...Cast all other arguments to Object, i.e. a no-op.
   call-method-safe-entry // that will cast/unbox arguments and, if needed box the return value
   ...cast return value to Integer
   ...unbox
} else {
   throw WrongMethodTypeException
}

[Figure: callsite_method.png]

Each method has, in addition to the standard entry which is used for all normal Java calls, a safe entry where the method assumes every argument to be an Object pointer that needs to be cast and potentially unboxed. When the method is called through the safe entry, the method will also return a boxed value if the original method returns a primitive.

  • If call site arguments are all references (no primitives) then the whole process of casting all arguments at the call site is a no-op. This is true for all adapt methods that invoke the target.
  • The safe entry can point to the standard entry if the method only takes Objects as arguments and returns an Object! Again this is true for all adapt methods.
  • For a chain of generic adapters, boxing will only occur at the first method handle call and for the return value.
  • The safe entry can be lazily generated if it is needed, i.e. only for those methods that are really invoked by a method handle through an adapter.

By separating the complexity of argument adjustment into the call site and into the safe entry of the method, it is possible to compile a complicated method handle (i.e. one where the MethodType cannot be deduced beforehand) into efficient code.
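The safe entry is something the JVM would generate internally, but its effect can be written out by hand in ordinary Java. This is only an illustration of the concept under my own naming (SafeEntrySketch, negateSafeEntry), not code any JVM actually exposes:

```java
class SafeEntrySketch {
    // Standard entry: what ordinary Java call sites use.
    static int negate(int i) { return -i; }

    // Hand-written equivalent of a safe entry: every argument arrives as an
    // Object, is cast and unboxed, and the primitive result is boxed again.
    static Object negateSafeEntry(Object arg) {
        int i = (Integer) arg; // ClassCastException if arg is not an Integer
        return negate(i);      // the int result is autoboxed to Integer
    }

    // A method that already takes and returns Objects needs no separate
    // safe entry: the standard entry already has the right shape.
    static Object identity(Object o) { return o; }

    public static void main(String[] args) {
        System.out.println(negateSafeEntry(123));
        System.out.println(identity("unchanged"));
    }
}
```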

Now, if the JVM knows that a method handle variable mh always contains a certain MethodHandle, either because mh is final or because mh was created within the same method context, then the JVM can compile the call into code with the same speed as a standard Java invocation. I.e. the following calls will be identical in speed!

// An un-bound method handle is stored 
// in a final field.
final MethodHandle mh = 
   #GraphicObject.drawOn<void>(Context);
...
for (GraphicObject g : objects) {
    // The method handle invoke
    mh.invoke<void>(g, mycontext); 
    // is identical in speed to the 
    // normal virtual java call:
    g.drawOn(mycontext);   
} 
// An un-bound method handle is created 
// locally, this invoke is 
(#update).invoke(reziver);
// identical in speed to the 
// normal virtual java call:
reziver.update();


If a method handle is both final and bound, then the invocation call is identical to a static call! This means that the invocation can be inlined! I purposely made sure the AppendArgument put the target into a final field and returned a bound method handle. The result of this is that all adapt methods will be inlined into each other since they are usually very short.

Please have a look at the method test() below. What will a modern JVM optimize this code into? I have only tested this on JRockit, but similar results should be possible on other highly optimizing JVMs as well.

public class Example {        

    Object extra;     
    static int y;
    
    public Example(Object o) {
        extra = o;
    }
    
    public static void main(String[] args) {    
         for (;;) {
             Example.test();
         }
    }    

    static int test() {
        Example e = new Example(new Integer(10));
        return ((Integer)e.adapt2(123)).intValue();
    }

    Object adapt2(Object o) {
        return add(o, extra);
    }

    Object add(Object a, Object b) {
        Integer i = (Integer)a;
        Integer j = (Integer)b;
        return new Integer(i.intValue()+j.intValue());
    }
}

When JRockit has optimized test(), the optimized code will be:

mov    $0x00000085,%eax

Yep, that is right, a single machine code mov, that moves 123+10 = 133 = 0x85 into %eax which is the return register.

You can test this yourself: download the latest JRockit Mission Control here,
then execute:

jrockit.../bin/java -Xverbose:opt,gc Example

Even if you cannot examine the actual generated code, you can verify that
as soon as JRockit has optimized test(), there will be no further garbage 
collections! This is due to the fact that test() no longer allocates Integer 
objects.

If we now change the constant 123 into y, then the optimized code will be:

mov    $0x7f65df16d648,%r12
mov    0x0(%r12),%eax
add    $0x0000000a,%eax

First it loads the static value y into the return register %eax and then it adds the constant value 10 to %eax.

What we can see here is that a modern JVM makes heavy use of inlining, type inference and escape analysis. My trivial example highlights a (sort of) invokedynamic call chain where the initial int 123 is boxed into an Integer, adapt2 is an AppendArgument adapter that calls the target add with the given argument o and the stored extra.

The add method obviously only works with Integers, but nothing in the interface tells you so. The JVM will inline first to generate a chain of variable assignments: Integer(123) -> Object o -> Object a -> Integer i. Using type inference it will realize that the final type cast to Integer is unnecessary, since it was an Integer to start with. Then it realizes that creating the Integer objects is unnecessary, since, among other things, they do not escape. So it reduces everything down to primitives.

Clearly the adapter code for the appendArgument does not in this example cause any extra runtime computational logic at all. The shuffling of arguments takes place fully within the compiler logic. The same should be true for dropArguments, insertArgument, permuteArguments.

Spread and Collect are a bit trickier. A few tests show that JRockit can track symbolic references going in and out of arrays as long as the array indexes are constants. But it does not optimize as well when we use loops to iterate over the arrays. It should be possible to improve this significantly when the for loop bounds can be deduced and are small, which is reasonable to assume when the arrays are used for method argument lists.

Also arrayElementGetter and arrayElementSetter will be unnecessary since we can create static methods that do the right thing. GuardWithTest and CheckArguments are simply Java classes that contain final method handles named test, target, fallback and the code is simple Java code.
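As an illustration of how little code such a class needs, here is a runnable sketch of a one-argument GuardWithTest adapter. It assumes the java.lang.invoke API that later shipped with Java 7 and uses its generic invoke in place of the loose invocation proposed in this post; GuardWithTestSketch and the isString, shout and describe helpers are hypothetical names:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class GuardWithTestSketch {
    final MethodHandle test;
    final MethodHandle target;
    final MethodHandle fallback;

    private GuardWithTestSketch(MethodHandle test, MethodHandle target,
                                MethodHandle fallback) {
        this.test = test;
        this.target = target;
        this.fallback = fallback;
    }

    // Returns a bound (Object)Object handle that dispatches on the test.
    static MethodHandle create(MethodHandle test, MethodHandle target,
                               MethodHandle fallback)
            throws ReflectiveOperationException {
        GuardWithTestSketch g = new GuardWithTestSketch(test, target, fallback);
        return MethodHandles.lookup()
                .findVirtual(GuardWithTestSketch.class, "adapt1",
                        MethodType.genericMethodType(1))
                .bindTo(g);
    }

    // The whole adapter is ordinary Java control flow.
    Object adapt1(Object o) throws Throwable {
        if ((boolean) test.invoke(o)) {
            return target.invoke(o);
        }
        return fallback.invoke(o);
    }

    static boolean isString(Object o) { return o instanceof String; }
    static String shout(String s) { return s.toUpperCase(); }
    static Object describe(Object o) { return "not a string: " + o; }

    static Object[] demo() {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            Class<?> c = GuardWithTestSketch.class;
            MethodHandle guarded = create(
                    l.findStatic(c, "isString",
                            MethodType.methodType(boolean.class, Object.class)),
                    l.findStatic(c, "shout",
                            MethodType.methodType(String.class, String.class)),
                    l.findStatic(c, "describe",
                            MethodType.methodType(Object.class, Object.class)));
            return new Object[] { guarded.invoke("abc"), guarded.invoke(42) };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Object[] results = demo();
        System.out.println(results[0] + ", " + results[1]);
    }
}
```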

Finally, when setTarget is called on the call site object, the JVM is in full control of this call and can therefore treat the set method handle as if it were final! If the invokedynamic code path is hot, the final target is short and all the adapter methods are short, everything will be inlined into the invokedynamic call site. It can't get any faster than this!

Conclusions

With this change to the method handle calling convention the JSR292 can be simplified into two generic concepts:

[Figure: twoblobs.png]

By allowing method handle calls where all arguments and return values are allowed to be (optimistically) cast to Object, we get the following benefits:

  1. All adapter code can be expressed with normal Java code. No need for a magic factory.
  2. The invokedynamic call site will be as fast as possible based on ordinary optimization techniques, not because of special JVM code compiling the adapters.
  3. Instead of creating many new unique classes (hidden behind a factory or not) the adapter trees in this solution are represented as numerous instances of a limited number of classes. This is what the JVM is designed for.
  4. The time spent on creating adapter code factories can instead be spent on improving already existing optimizations in the JVM to generate truly fast code for the adapters. This will benefit all Java applications, not only dynamic language runtimes.

Thanks for reading all of this text!

Fredrik Öhrström

Javac syntax proposal (and small bytecode addition)

The hash character # signifies the start of a MethodType or a MethodHandle. If it begins with #< then it is a MethodType, otherwise a MethodHandle. This syntax is inspired by Smalltalk:

; Two identical Smalltalk invocations:
reziver update
reziver perform: #update
; Three identical Java invocations:
reziver.update();
(#update).invoke(reziver);
(#reziver.update).invoke();

class Test {
    public static int negate(int i) { return -i; } 
    public static long handle(int i) { return i*i; } 
    public static long handle(String i) { return Long.parseLong(i); } 
    public void update(int i) { v=i; } 
    public void step(int i) { v=i; } 
    public void step(String i) { v=Integer.parseInt(i); } 

    int v;
 
    public void test() {
         MethodHandle m; 
         // Lookup unique negate inside the current context.
         m = #negate;
         if (!m.type().equals(#<int>(int))) throw new Error();

         // Lookup unique negate in Test.
         m = #Test.negate;
         if (!m.type().equals(#<int>(int))) throw new Error();

         // Lookup overloaded handle method in Test.
         m = #Test.handle<long>(String);
         if (!m.type().equals(#<long>(String))) throw new Error();

         // Lookup unique non-static method in current context.
         m = #update;
         if (!m.type().equals(#<void>(Test,int))) throw new Error();

         // Lookup unique non-static method in Test.
         m = #Test.update;
         if (!m.type().equals(#<void>(Test,int))) throw new Error();

         // Lookup overloaded non-static method in Test.
         m = #Test.step<void>(String);
         if (!m.type().equals(#<void>(Test,String))) throw new Error();

         // Lookup and bind non-static method to this.
         m = #this.update;
         if (!m.type().equals(#<void>(int))) throw new Error();

         // These would fail since the methods are not unique.
         // m = #handle;
         // m = #Test.handle;
         // m = #step;
         // m = #Test.step;
         // m = #this.step;
    }
}

The hash character is going to be used here as well. These two proposals are perhaps compatible.

According to JSR292 it is possible to use "ldc CONSTANT_Methodref_info" and "ldc CONSTANT_InterfaceMethodref_info" to immediately create a MethodHandle and load it onto the operand stack. Therefore we should also allow "ldc CONSTANT_Utf8_info" (where info describes a valid descriptor) to immediately create a MethodType.

More flexible MethodHandle invoke proposal

A call site that invokes a method handle will either be:

  1. a successful fast case where the method type matches perfectly,
  2. a slow case where the call site and the MethodHandle have the same number of arguments; all arguments and the return value are converted to Object, so the call might fail with a failed cast when the arguments are cast to the required destination type, or
  3. a quick failure when the number of arguments differs.

class Test {
    public static int alpha(int i) { return -i; } 
    public static String beta(String s) { return s+s; } 
    public static void gamma(String s) { System.out.println(s); } 
    public void delta(String s) { System.out.println(s); }
 
    public void test() {
         MethodHandle m;
         int x;
         long y;
         String s;
         Object o;
 
         m = #alpha;
         x = m.invoke<int>(5);
         // x will contain -5
         x = m.invoke<Integer>(Integer.valueOf(5)).intValue();
         // x will contain -5
         m.invoke<void>(5);
         // The return value is ignored. 

         m = #beta;
         s = m.invoke<String>("Hello");
         // s will contain "HelloHello"
         o = m.invoke<Object>((Object)"Hello");
         // o will point to a String "HelloHello"