Pulling a Machine Code Rabbit Out of a Java Hat

How good can modern JVMs be at optimizing code that makes excessive use of auto boxing for method arguments? As it turns out, at least JRockit can be pretty darn good!

<script language='javascript' type='text/javascript'> </script>

The question arose because of the proposal to allow Java code to express the transforms necessary for dynamic language runtimes in JSR292. According to the proposal, the different steps in a method handle invocation would be:

1) If there is an exact type match between the call site and the method handle target, then the call succeeds (as specified by the current JSR292 spec):

2) If the types do not match exactly, but the number of arguments are the same, then (instead of throwing a WrongMethodTypeException as the current JSR292 spec says) the call is attempted by casting/boxing the arguments to Object. Such a call might of course fail halfway if the casts are unsuccessful.

3) If the number of arguments differ between the call site and the target, then the call will fail with a WrongMethodTypeException.

The proposed generic invocation can be efficiently implemented. It minimizes the amount of allocations by not going through Object[] arrays. This is important as long as the code is interpreted and not yet optimized. If the code path is hot, then the code will be optimized and all the auto boxing will be removed! To be proved later in this text!

It is backwards compatible with the current JSR292 design since the generic invocation will only be attempted in situations where the current JSR292 design would throw a WrongMethodTypeException!



Why introduce this generic invocation in method handles?

The purpose of the generic invocation is to help the dynamic language runtime developer to avoid the combinatorial explosion of possible method types! The current JSR292 proposal aims to solve this by only offering a limited set of factory methods for creating transforms. This factory and its transforms can very well remain in the spec, there is nothing wrong with these per se. However denying the actual Java Virtual Machine bytecode from performing the same task would be very limiting!

With method handles that support generic invocation you can write generic transform methods yourself! For example to drop an argument and to permute the arguments.



What would the current JSR292 transforms look like?

Here are a few such transforms expressed with plain Java code. (When you read the Java source for the transforms, the syntax #alfa.adapt simply means MethodHandles.Lookup.bind(alfa, "adapt") the reasons for this syntax is explained here.)

The transforms can be implemented in two different ways depending on how advanced the JVM is. The longer versions do not cause extra object arrays to be allocated for each call and are easier to optimize. The shorter versions can be used with highly optimizing JVMs.

  • Collect.to Turn the fixed number of arguments into a varargs Object[] array. View source
  • Spread.to Turn a varargs Object[] array into to a method call with a fixed number of arguments. View source
  • InsertArgument.create Call the target method with an extra (constant) argument. View source
  • DropArgument.create Call the target method with one less argument. View source
  • PermuteArgument.create Call the target method with the same arguments, but with a different order. View source
  • ArrayElement Extract an element out of an array. View source
  • GuardWithTest.create Act as an if-statement. View source
  • CombineArguments.create First execute a method handle that returns a new argument, to be inserted into the argument list when calling the final target method. View source
  • ExactInvoker.create Use the first argument as a method handle and invoke the rest of the arguments on it. View source



What can these transforms be used for?

Let us construct an example transform using the user defined methods:

boolean compare(int a);
int sub(int a, int b, int c);
int mul(int a, int b);
int constant123();

The transform is then created using the following statement:

MethodHandle test = 
    GuardWithTest.create(
        #compare, 
        InsertArgument.create(
            CombineArguments.create(
                #sub, 
                1, 
                #mul), 
            1, 
            0x567),
        #constant123);

In effect, this transform equals the following Java code:

int test(int i) {
  if (compare(i)) {
    return sub(i, mul(i, 0x567), 0x567);
  }
  return 0x123;
}

Now, JRockit does not support MethodHandles yet. Therefore I have to simulate the methodhandles using standard Java code. Remember, this was about if a JVM could automatically optimize away all those auto boxing operations. The Java code below simulates the constructed transform above with generic invocation support. (It takes the more difficult way by using Object[] arrays.) The question is, how well will JRockit optimize int test(int i)?

public class Example {		
    Object insert_arg; 	
    Object compare_arg; 	

    public static void main(String[] args) {
	for (int i=0; ; ++i) {			
	    int x = Example.test(i);	   
	    if (i%1000 == 0) { i = 0; }
	}
    }
	
    static int test(int i)
    {		
	Example e = new Example();
	e.insert_arg = 0x567;
	e.compare_arg = 0x890;
	return ((Integer)e.collect(i)).intValue();
    }

    Object collect(Object o)
    {
	Object[] args = { o };
	return guardWithTest(args);
    }

    Object compare(Object[] args) {
	int x = (int)(Integer)args[0];
	int y = (int)(Integer)compare_arg;
	return x > y;
    }		 
	
    Object constant123(Object[] args) {
	return 0x123;
    }

    Object guardWithTest(Object[] in_args) {
	boolean b =  (boolean)(Boolean)compare(in_args);
	if (b) {
	    return insertArgument(in_args); 
	} else {
	    return constant123(in_args);
	}
    }
	
    Object insertArgument(Object[] in_args)
    {
	Object[] out_args = new Object[in_args.length+1];
	System.arraycopy(in_args, 0, out_args, 0, in_args.length);
	out_args[in_args.length] = insert_arg;
	return combineArguments(out_args);
    }	
    
    Object combineArguments(Object[] args) {
	Object v = mul(args);
	Object[] out_args = new Object[args.length+1];
	System.arraycopy(args, 0, out_args, 0, 1);
	System.arraycopy(args, 1, out_args, 2, 1);
	out_args[1] = v;
	return sub(out_args);
    }
    
    Object mul(Object[] args)
    {
	int a = (int)(Integer)args[0];
	int b = (int)(Integer)args[1];
	return a*b;
    }
    
    Object sub(Object[] args)
    {						
	int x = (int)(Integer)args[0];
	int y = (int)(Integer)args[1];
	int z = (int)(Integer)args[2];
	return (x-y-z);
    }
}

It turned out that the latest JRockit development version almost made a perfect job. I only needed to fix a few glitches in the optimizer to have JRockit produce the following machine code:

           0x8592e3f9:	mov    %eax,%ecx
           0x8592e3fb:	cmp    $0x00000890,%eax
           0x8592e400:	jg     0x8592e409
           0x8592e402:	mov    $0x00000123,%ecx
           0x8592e407:	jmp    0x8592e419
           0x8592e409:	mov    %eax,%esi
           0x8592e40b:	imul   $0x00000567,%esi
           0x8592e411:	sub    %esi,%ecx
           0x8592e413:	sub    $0x00000567,%ecx
           0x8592e419:	mov    %ecx,%eax
           0x8592e41b:	ret

As you can see, every single array and auto boxing operation have been removed! This code is as fast, as if the transform had been written in plain Java or generated using bytecodes and a class loader!



Pulling a machine code rabbit out of a Java hat

What we have here is the ability to craft a tree of existing methods linked together with final and bound method handles. To pull the machine code out of the method handle tree, you simply have to make sure that the method handle is hot!

You can use such transforms to avoid generating bytecode when you:

  • are executing a dynamic language
  • are rendering dynamic web pages
  • are building a new XSLT processor
  • are creating an expert system

The generic invocation proposal is backwards compatible with the current JSR292 proposal and it can be efficiently optimized. It makes it possible for dynamic language runtime developers to construct their own transforms easily and with readable code.

Fredrik Öhrström












class Collect {

  final MethodHandle target;

  private Collect(MethodHandle t) {
    target = t;
  }

  public static 
  MethodHandle to(MethodHandle target, 
                  int nargs) 
    throws CantDoThat
  {
    int nargs=target.type().parameterCount();
    if (!target.type().equals(
             #<Object>(Object[]))) {
      throw new CantDoThat();
    }
    Collect s = new Collect(target);
    switch (nargs) {
      case 0 : return #s.adapt0;
      case 1 : return #s.adapt1;
      case 2 : return #s.adapt2;
      case 3 : return #s.adapt3;    
      // continue to ~20
    }
    throw new CantDothat();
  }

  private Object adapt0() {
    return target.invoke<Object>(new Object[0]);
  }

  private Object adapt1(Object a) {
    Object[] out_args = { a };
    return target.invoke<Object>(out_args);  
  }

  private Object adapt2(Object a, 
                        Object b) {
    Object[] out_args = { a, b };
    return target.invoke<Object>(out_args);  
  }

  private Object adapt3(Object a, 
                        Object b, 
                        Object c) {
    Object[] out_args = { a, b, c };
    return target.invoke<Object>(out_args);  
  }
  // continue to ~20
}	
class Spread {

  final MethodHandle target;

  private Spread(MethodHandle t) {
    target = t;
  }

  public static 
  MethodHandle to(MethodHandle target) 
    throws CantDoThat
  {
    int nargs = target.type().parameterCount();
    Spread s = new Spread(target);

    switch (n_toargs) {
      case 0 : return #s.adapt0;
      case 1 : return #s.adapt1;
      case 2 : return #s.adapt2;
      case 3 : return #s.adapt3;    
      // continue to ~20
    }
    throw new CantDothat();
  }

  private Object adapt0(Object[] args) {
    return target.invoke<Object>();
  }

  private Object adapt1(Object[] args) {
    return target.invoke<Object>(args[0]);
  }

  private Object adapt2(Object[] args) {
    return target.invoke<Object>(args[0], 
                                 args[1]);
  }

  private Object adapt3(Object[] args) {
    return target.invoke<Object>(args[0], 
                                       args[1], 
                                       args[2]);
  }
  // continue to ~20
}	

Not using varargs:

class InsertArgument
{
  final MethodHandle target;
  final int pos;
  final Object arg;

  private InsertArgument(MethodHandle t, 
                         int p,
                         Object ia) 
  {
    target = t;
    pos = p;
    arg = ia;
  }

  public static 
  MethodHandle create(MethodHandle target, 
                      int pos,
                      Object arg) 
    throws CantDoThat
  {
    int nargs=target.type().parameterCount();
    if (nargs == 0) {
      throw new CantDoThat();
    }
    PermuteArguments p = 
       new InsertArgument(target, pos, arg);
    switch (nargs) {
      case 1: return #p.insert0;
      case 2: return #p.insert1;
      case 3: return #p.insert2;
      // continue to ~20
    }
    throw new CantDoThat();
  }

  private Object insert0() {
    return target.invoke<Object>(arg);
  }

  private Object insert1(Object a) {
    switch (pos) {
      case 0:
        return target.invoke<Object>(arg, a);
    }
    return target.invoke<Object>(a, arg);
  }

  private Object insert2(Object a,
                         Object b) {
    switch (pos) {
      case 0:
        return target.invoke<Object>(arg, a, b);
      case 1:
        return target.invoke<Object>(a, arg, b);
    }
    return target.invoke<Object>(a, b, arg);
  }

  // continue to ~20
}

Shorter impl using varargs:

class InsertArgument 
{
  final MethodHandle target;
  final int pos, nargs;
  final Object arg;

  private InsertArgument(MethodHandle t, 
                         int p, int n, 
                         Object a) {
    target = Spread.to(t);
    pos = p;
    nargs = n;
    arg = a;
  }

  public static 
  MethodHandle create(MethodHandle target, 
                      int pos, 
                      Object arg) 
    throws CantDoThat
  {
    int nargs = target.type().parameterCount();
    if (nargs == 0) {
      throw new CantDoThat();
    }
    InsertArgument i = 
      new InsertArgument(target, pos, nargs, arg);
    return Collect.to(#s.insert, nargs-1);
  }

  private Object insert(Object[] in) {
    Object[] out = new Object[nargs];
    System.arraycopy(in, 0, out, 0, pos);
    System.arraycopy(in, pos, 
                     out, pos+1, nargs-pos);
    out[pos] = obj;
    return target.invoke<Object>(out);
  }
}

Not using varargs:

class DropArgument
{
  final MethodHandle target;
  final int pos;

  private DropArguments(MethodHandle t, 
                        int p) 
  {
    target = t;
    pos = p;
  }

  public static 
  MethodHandle create(MethodHandle target, 
                      int pos) 
    throws CantDoThat
  {
    int nargs=target.type().parameterCount();
    DropArguments p = 
       new DropArguments(target, pos);
    switch (nargs) {
      case 0: return #p.drop1;
      case 1: return #p.drop2;
      case 2: return #p.drop3;
      // continue to ~20
    }
    throw new CantDoThat();
  }

  private Object drop1(Object a) {
    return target.invoke<Object>();
  }

  private Object drop2(Object a, 
                       Object b) {
    switch (pos) {
      case 0: return target.invoke<Object>(b);
    }
    return target.invoke<Object>(a);
  }

  private Object drop3(Object a, 
                       Object b,
                       Object c) {
    switch (pos) {
      case 0: return target.invoke<Object>(b, c);
      case 1: return target.invoke<Object>(a, c);
    }
    return target.invoke<Object>(a, b);
  }
  // continue to ~20
}

Shorter impl using varargs:

class DropArgument
{
   final MethodHandle target;
   final int pos, nargs;

   private DropArguments(MethodHandle t, 
                         int p, int n) 
   {
     target = Spread.to(t);
     pos = p;
     nargs = n;
   }

   public static 
   MethodHandle create(MethodHandle target, 
                       int pos) 
    throws CantDoThat
   {
     int nargs=target.type().parameterCount();
     DropArguments p = 
        new DropArguments(target, pos, nargs);
     return Collect.to(#s.drop, nargs+1);
   }

   private Object drop(Object[] in) {
     Object[] out = new Object[nargs];
     System.arraycopy(in, 0, out, 0, pos);
     System.arraycopy(in, pos+1, 
                      out, pos, nargs-pos);
     return target.invoke<Object>(out);
  }
}

Not using varargs:

class PermuteArguments
{
  final MethodHandle target;
  final int[] reorder;

  private PermuteArguments(MethodHandle t, 
                           int[] r) 
  {
    target = t;
    reorder = r;
  }

  public static 
  MethodHandle create(MethodHandle target, 
                      int[] reorder) 
    throws CantDoThat
  {
    int nargs=target.type().parameterCount();
    PermuteArguments p = 
       new PermuteArguments(target, reorder);
    switch (nargs) {
      case 0: return #p.permute0;
      case 1: return #p.permute1;
      case 2: return #p.permute2;
      case 3: return #p.permute3;
      // continue to ~20
    }
    throw new CantDoThat();
  }

  private Object permute0() {
    return target.invoke<Object>();
  }

  private Object permute1(Object a) {
    return target.invoke<Object>(a);
  }

  private Object permute2(Object a, 
                          Object b) {
    Object out = { a, b };
    return target.invoke<Object>(out[reorder[0]], 
                                 out[reorder[1]]);
  }

  private Object permute3(Object a, 
                          Object b, 
                          Object c) {
    Object out = { a, b, c};
    return target.invoke<Object>(out[reorder[0]], 
                                 out[reorder[1]], 
                                 out[reorder[2]]);
  }
  // continue to ~20
}

Shorter impl using varargs:

class PermuteArguments
{
   final MethodHandle target;
   final int   nargs;
   final int[] reorder;

   private PermuteArguments(MethodHandle t, 
              int[] r) 
   {
     target = Spread.to(t);
     reorder = r;
     nargs = r.length;
   }

   public static 
   MethodHandle create(MethodHandle target, 
             int[] reorder) 
     throws CantDoThat
   {
     int nargs=target.type().parameterCount();
     PermuteArguments p = 
       new PermuteArguments(target, reorder);
     return Collect.to(#s.permute);
   }

   private Object permute(Object[] in_args) {
     Object[] out_args = new Object[nargs];
     for (int i=0; i<nargs; ++i) {
       out_args[i] = in_args[reorder[i]]; 
     }
     return target.invoke<Object>(out_args);
  }
}

class ArrayElement { 

  public static 
  MethodHandle getter(int[] array) {
    return #getint;
  } 

  public static 
  MethodHandle setter(int[] array) {
    return #setint;
  } 

  public static 
  MethodHandle setter(Object[] array) {
    return #setref;
  } 

  public static 
  MethodHandle getter(Object[] array) {
    return #getref;
  } 

  private static 
  int getint(int[] array, int i) {
    return array[i];
  } 

  private static 
  void setint(int[] array, int i, int value) {
    array[i] = value;
  }

  private static 
  Object getref(Object[] array, int i) {
    return array[i];
  } 

  private static 
  void setref(Object[] array, int i, 
              Object value) {
    array[i] = value;
  }  
}

Not using varargs:

class GuardWithTest {

  final MethodHandle test, target, fallback;

  private GuardWithTest(MethodHandle te, 
                        MethodHandle ta, 
                        MethodHandle fa) {
    test = te; 
    target = ta;
    fallback = fa;
  }

  public static 
  MethodHandle create(MethodHandle test,  
                      MethodHandle target,  
                      MethodHandle fallback)
  {  
    int nargs = test.type.parameterCount();
    GuardWithTest g = 
      new GuardWithTest(test, target, fallback);

    switch (nargs) {
      case 0: return #g.test0;
      case 1: return #g.test1;
      case 2: return #g.test2;
      case 3: return #g.test3;
      // continue to ~20
    }
    throw new CantDoThat();
  }

  private Object test0() {
    if (test.invoke<boolean>()) {
      return target.invoke<Object>();
    } 
    return fallback.invoke<Object>();
  }

  private Object test1(Object a) {
    if (test.invoke<boolean>(a)) {
      return target.invoke<Object>(a);
    } 
    return fallback.invoke<Object>(a);
  }

  private Object test2(Object a, 
                         Object b) {
    if (test.invoke<boolean>(a, b)) {
      return target.invoke<Object>(a, b);
    } 
    return fallback.invoke<Object>(a, b);
  }

  private Object test3(Object a, 
                       Object b, 
                       Object c) {
    if (test.invoke<boolean>(a, b, c)) {
      return target.invoke<Object>(a, b, c);
    } 
    return fallback.invoke<Object>(a, b, c);
  }
  // continue to ~20
}

Shorter impl using varargs:

class GuardWithTest {

  final MethodHandle test, target, fallback;

  private GuardWithTest(MethodHandle te, 
                        MethodHandle ta, 
                        MethodHandle fa) {
    test = Spread.to(te); 
    target = Spread.to(ta);
    fallback = Spread.to(fa);
  }

  public static 
  MethodHandle create(MethodHandle test,  
                      MethodHandle target,  
                      MethodHandle fallback)
  {  
    int nargs = test.type.parameterCount();
    GuardWithTest g = 
       new GuardWithTest(test, target, fallback);
    return Collect.to(#g.test, nargs);
  }

  private Object test(Object[] args) {
    if (test.invoke<boolean>(args)) {
      return target.invoke<Object>(args);
    } 
    return fallback.invoke<Object>(args);
  }
}

Not using varargs:

class CombineArguments {  
  final MethodHandle target;
  final int pos;
  final MethodHandle combiner;

  private 
  CombineArguments(MethodHandle t, 
                   int p, 
                   MethodHandle c) {
    target = t;
    pos = p;
    combiner = c;
  }

  public static 
  MethodHandle create(MethodHandle target,
                      int pos,   
                      MethodHandle combiner)
    throws CantDoThat;
  {  
    int nargs=target.type.parameterCount();
    int cargs=combiner.type().parameterCount();
    if (nargs < 1 || nargs != cargs+1 ||
        pos < 0 || pos > nargs) {
      throw new CantDoThat();
    }

    CombineArguments g = 
      new CombineArguments(target, pos, combiner);

    switch (nargs-1) {
      case 0: return #g.combine0;
      case 1: return #g.combine1;
      case 2: return #g.combine2;
      case 3: return #g.combine3;
      // continue to ~20
    }
    throw new CantDoThat();
  }

  private Object combine0() {
    Object v = combiner.invoke<Object>();
    return target.invoke<Object>(v); 
  }

  private Object combine1(Object a) {
    Object v = combiner.invoke<Object>(a);
    if (pos==0) {
      return target.invoke<Object>(v, a);
    }
    return target.invoke<Object>(a, v);		
  }

  private Object combine2(Object a, 
                          Object b) {
    Object v = combiner.invoke<Object>(a, b);
    switch (pos) {
      case 0:
        return target.invoke<Object>(v, a, b);
      case 1:
        return target.invoke<Object>(a, v, b);
    }
    return target.invoke<Object>(a, b, v);
  }

  private Object combine3(Object a, 
                          Object b, 
                          Object c) {
    Object v = combiner.invoke<Object>(a, b, c);
    switch (pos) {
      case 0:
        return target.invoke<Object>(v, a, b, c);
      case 1:
        return target.invoke<Object>(a, v, b, c);
      case 2:
        return target.invoke<Object>(a, b, v, c);
    }
    return target.invoke<Object>(a, b, v);
  }

  // continue to ~20
 }

Shorter impl using varargs:

class CombineArguments {  
  final MethodHandle target;
  final int pos, nargs;
  final MethodHandle combiner;

  private 
  GuardWithTest(MethodHandle t, 
                int p, int n, 
                MethodHandle c) {
    target = Spread.to(t);
    pos = p;
    nargs = n;
    combiner = Spread.to(c);
  }

  public static 
  MethodHandle create(MethodHandle target,
                      int pos,   
                      MethodHandle combiner)
  {  
    int nargs=target.type().parameterCount();
    int cargs=combiner.type.().parameterCount();
    if (nargs < 1 || nargs != cargs+1 ||
        pos < 0 || pos > nargs) {
      throw new CantDoThat();
    }

    CombineArguments g = 
      new CombineArguments(target, 
                           pos, nargs, 
                           combiner);

    return Collect.to(#g.combine, nargs);
  }

  private Object combine(Object[] in_args) {
    Object v = combiner.invoke<Object>(in_args);
    Object[] out_args = new Object[nargs];
    System.arraycopy(in_args, 0, 
                     out_args, 0, pos);
    System.arraycopy(in_args, pos+1, 
                     out_args, 0, nargs-pos);
    out_args[pos] = v;
    return target.invoke<Object>(out_args); 
  }
}

class ExactInvoker {  

  public static 
  MethodHandle create(MethodType mt)
    throws CantDoThat;
  {
    int nargs = mt.parameterCount();
    switch (nargs) {
      case 0: return #invoker1;
      case 1: return #invoker2;
      case 2: return #invoker3;
      // continue to ~20
    }
    throw new CantDoThat();
  }

  private static Object invoker1(Object a) {
    MethodHandle mh = (MethodHandle)a;
    return mh.invoke<Object>(); 
  }

  private static Object invoker2(Object a,
                                 Object b) {
    MethodHandle mh = (MethodHandle)a;
    return mh.invoke<Object>(b); 
  }

  private static Object invoker3(Object a,
                                 Object b,
                                 Object c) {
    MethodHandle mh = (MethodHandle)a;
    return mh.invoke<Object>(b, c); 
  }

  // continue to ~20
}
Comments:

Post a Comment:
Comments are closed for this entry.
About

bocadmin_ww

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today