method handles in a nutshell

The JVM prefers to interconnect methods via static reference or dispatch through a class or interface. The Core Reflection API lets programmers work with methods outside these constraints, but only through a simulation layer that imposes extra complexity and execution overhead. This note gives the essential outlines of a design for method handles, a way to name and interconnect methods without regard to method type or placement, and with full type safety and native execution speed. We will do this in three and a half swift movements...

1. Direct method handles

Given any method M that I am able to invoke, the JVM provides me a way to produce a method handle H(M). I can use this handle later on, even after forgetting the name of M, to call M as often as I want. Moreover, if I provide this handle to other callers, they also can invoke M through the handle, even if they do not have access rights to call M by name. If the method is non-static, the method handle always takes the receiver as its first argument. If the method is virtual or interface, the method handle performs the dispatch on the receiver.

A method handle will confess its type reflectively, as a series of Class values, through the type operation.

In pseudo-code:

MHD h1 = H(Object.equals);
MHD h2 = H(System.identityHashCode);
MHD h3 = Hs(String.hashCode);
assert h1.type() == SIG[(Object,Object)boolean];
assert h1.invoke(r1,a1) == r1.equals(a1);
assert h2.invoke(a2) == System.identityHashCode(a2);
assert h3.invoke(r3) == r3.invokespecial:String.hashCode();

The actual name of the type MHD will be given shortly. The actual API for H and Hs is uninterestingly straightforward, and may be found at the end with the other details.

To complete the low-level access (and fill a gap in the Core Reflection API), there is a variation Hs(M) which forces static linkage just like an invokespecial instruction, and is allowed only if I have the right to issue an invokespecial instruction on M.

From the JVM implementor’s point of view, there are probably three or four distinct subclasses of direct method handle, corresponding to the distinct varieties of invoke instruction. To round things out, one kind of method handle should work for invoking a method handle itself. These are low-level concerns, which hide nicely behind the H (and Hs) operator described above.
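
As a rough illustration, the pseudo-code in this part can be approximated with java.lang.invoke, the API that eventually shipped in Java 7 in place of the java.dyn package sketched in this note. That substitution is an assumption of the example, not part of the design described here. The classes Base, Derived, and DirectHandlesSketch are hypothetical; because String is final, the Hs case is shown on a small class hierarchy instead of String.hashCode.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo classes; only the java.lang.invoke calls are library API.
class Base {
    String name() { return "Base"; }
}

class Derived extends Base {
    @Override
    String name() { return "Derived"; }

    // Hs(Base.name): statically linked, as if Derived issued invokespecial.
    // The lookup succeeds because Derived itself is the special caller.
    static String superNameOf(Derived d) {
        try {
            MethodHandle hs = MethodHandles.lookup().findSpecial(
                    Base.class, "name", MethodType.methodType(String.class), Derived.class);
            return (String) hs.invokeExact(d);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}

final class DirectHandlesSketch {
    // H(Object.equals): virtual, so the receiver is the leading argument.
    static MethodHandle equalsHandle() {
        try {
            return MethodHandles.lookup().findVirtual(
                    Object.class, "equals", MethodType.methodType(boolean.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // H(System.identityHashCode): static, so there is no receiver argument.
    static MethodHandle identityHashHandle() {
        try {
            return MethodHandles.lookup().findStatic(
                    System.class, "identityHashCode", MethodType.methodType(int.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes the handle with the exact signature (Object,Object)boolean.
    static boolean callEquals(Object receiver, Object arg) {
        try {
            return (boolean) equalsHandle().invokeExact(receiver, arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Invokes the handle with the exact signature (Object)int.
    static int callIdentityHash(Object arg) {
        try {
            return (int) identityHashHandle().invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Note that the handle for the virtual method dispatches on its first argument, while the findSpecial handle always reaches Base.name, even when handed a Derived.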

2. Invoking method handles

Given a method handle H, I can invoke it by issuing an invokeinterface bytecode against it. The signature I use must exactly match the original signature of the target method. (Even beyond the spelling, the linked meaning of class names must be the same, in the argument and return types.) The method name I use must always be invoke (not the name of the target method). In pseudo-code:
MHI h1 = ...;
h1.invoke(a1...)

The type MHI is a special interface type known to the JVM. (Its actual name will be given shortly.)

MHI functions as a marker interface to tell the JVM that this occurrence of the invokeinterface bytecode must be treated specially, different from all other interface invocations. For one thing, normal JVM linking rules cannot apply, because the signature of the call site relates to the target method, not to the marker interface. This kind of call site works on direct method handles (type MHD) created in part 1 above. In a moment we will drop the other shoe and observe that it works on other types of method handles.

The invokeinterface instruction is uniquely suited for this sort of JVM extension, because the rules for bytecode verification allow any object to serve as the receiver of an interface invocation.
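
The exact-match rule can be observed in the API that later shipped as java.lang.invoke, used here as a stand-in for the design above. In that API the call site is a signature-polymorphic invokeExact call (compiled to invokevirtual rather than the invokeinterface described in this note), so the sketch below illustrates the rule, not the bytecode. The class ExactInvokeSketch and its method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

final class ExactInvokeSketch {
    // A handle on String.concat, of type (String,String)String.
    static MethodHandle concatHandle() {
        try {
            return MethodHandles.lookup().findVirtual(
                    String.class, "concat", MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // The call site signature is (String,String)String: an exact match.
    static String exactCall(String a, String b) {
        try {
            return (String) concatHandle().invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Without the cast, the call site signature is (String,String)Object.
    // That is a near miss, and the invocation is rejected.
    static boolean nearMissIsRejected() {
        try {
            Object result = concatHandle().invokeExact("a", "b");
            return false;
        } catch (WrongMethodTypeException expected) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```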

3. Adapting method handles

The type MHI provides a very flexible jumping off point, for the bytecodes of one method to call any other method, of any given signature. The next question is whether the calling method and receiving method have to agree exactly on the signature, and the answer is “no”. This brings us to the third and final major design point, of adapting method calling sequences.

The most important case of adaptation is partial invocation (sometimes known as currying or binding). A direct method handle by itself is really quite boring because, unlike nearly everything else in an object-oriented system, it is pure code, with no data to modify its meaning.

Thus, given a method handle and some arguments for it, the JVM will give me a partial invocation of that method handle, which is a new method handle that remembers those arguments, and, when invoked on the remaining arguments, will invoke the original method handle with the grand total set of arguments.

At the very least, the JVM is willing to let me specify the first argument R of a virtual or interface method handle H(M), because that lets it perform method dispatch when the handle is created, and hand me back a method handle Adapt(H(M),R) that not only remembers the argument R, but has also pre-resolved the method dispatch R.M. This special case of partial invocation, sometimes called “bound method references”, is enough of a hook to let programmers introduce the usual object-oriented flexibilities into method handles.

In pseudo-code:

MHD h1 = H(Object.equals);  // SIG[(Object,Object)boolean]
MHB h2 = Bind(h1, (Object)"foo");
assert h2.type() == SIG[(Object)boolean];
assert h2.invoke(a2) == "foo".equals(a2);

The type MHB stands for a bound method reference. (Please wait a moment for its actual spelling.)
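
For comparison, here is a minimal sketch of the same binding in java.lang.invoke, the API that later shipped, where Bind corresponds roughly to MethodHandle.bindTo. Whether a given JVM pre-resolves the dispatch at bind time is an implementation matter that this example does not show. The class BoundHandleSketch is hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class BoundHandleSketch {
    // h2 = Bind(H(Object.equals), "foo"): the receiver is remembered,
    // so the resulting type shrinks from (Object,Object)boolean to (Object)boolean.
    static MethodHandle equalsFoo() {
        try {
            MethodHandle h1 = MethodHandles.lookup().findVirtual(
                    Object.class, "equals", MethodType.methodType(boolean.class, Object.class));
            return h1.bindTo("foo");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Equivalent to "foo".equals(a), but routed through the bound handle.
    static boolean isFoo(Object a) {
        try {
            return (boolean) equalsFoo().invokeExact(a);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```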

3.5 Further adaptation

As long as we are messing with arguments, there is a fairly unsurprising range of other adaptations that arise naturally from the richness of JVM signatures, and the conversions that apply between various data types. (The details of varargs and reflective invocation also bear on this design.)

Specifically, given two method signatures (A)T and (A')T', and a method handle H(M) of type (A)T, there is a library routine which will create me a new method handle H' = Adapt(H(M), (A')T'). It is my responsibility to help the library routine match up the corresponding arguments of the two signatures, to direct it to drop unneeded arguments in A', to supply preset values for arguments in A missing in A' (this is where partial invocation comes into the general picture), and to tell it of the presence of varargs in either signature. The library is happy to insert casts, primitive conversions, and boxing (or unboxing) to make the arguments match up completely.

Here are some pseudo-code examples:

MHD h1 = H(String.concat);  // SIG[(String,String)String]
MHA h2 = Adapt(h1, SIG[(String,String)String], $1, $0);
MHA h3 = Adapt(h1, SIG[(String)String], $0, $0);
MHA h4 = Adapt(h1, SIG[(String)String], $0, ".java");
assert h2.invoke(a,b) == b.concat(a);
assert h3.invoke(c) == c.concat(c);
assert h4.invoke(c) == c.concat(".java");
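
The three adaptations above, plus a boxing conversion, can be approximated with the combinators of java.lang.invoke (the API that later shipped), assuming permuteArguments, insertArguments, and asType as rough counterparts of Adapt. The class AdaptSketch and its helper names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class AdaptSketch {
    private static final MethodType SS_S =
            MethodType.methodType(String.class, String.class, String.class);
    private static final MethodType S_S =
            MethodType.methodType(String.class, String.class);

    // h1 = H(String.concat), of type (String,String)String.
    static MethodHandle concat() {
        try {
            return MethodHandles.lookup().findVirtual(String.class, "concat", S_S);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // h2: swap the two incoming arguments ($1, $0).
    static MethodHandle swapped() {
        return MethodHandles.permuteArguments(concat(), SS_S, 1, 0);
    }

    // h3: use the single incoming argument twice ($0, $0).
    static MethodHandle doubled() {
        return MethodHandles.permuteArguments(concat(), S_S, 0, 0);
    }

    // h4: preset the second argument to ".java".
    static MethodHandle withSuffix() {
        return MethodHandles.insertArguments(concat(), 1, ".java");
    }

    // Boxing and unboxing: Math.max(int,int) viewed as (Integer,Integer)Integer.
    static Integer boxedMax(Integer a, Integer b) {
        try {
            MethodHandle max = MethodHandles.lookup().findStatic(
                    Math.class, "max", MethodType.methodType(int.class, int.class, int.class));
            MethodHandle boxed = max.asType(
                    MethodType.methodType(Integer.class, Integer.class, Integer.class));
            return (Integer) boxed.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Convenience caller for the String-returning handles above.
    static String call(MethodHandle h, Object... args) {
        try {
            return (String) h.invokeWithArguments(args);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```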

That is a longish step beyond bound method references, but I believe the sweet spot of the design will supply a flexible set of method signature adaptations (including currying), and let JVM implementors choose how much of that the JVM wants to take responsibility for.

At a minimum, bound method references must be special-cased by the JVM, but everything else could be supplied by a Java library (one which is willing to dynamically code-generate many of its adapter methods).

At a maximum, the JVM could supply a Swiss Army Knife combinator which interpretively handles all possible argument wrangling. This is probably the right way to go for HotSpot, since the HotSpot JIT is as well suited for optimizing complex adapters as simple ones, and having the complex ones appear to the compiler as single intrinsics is no big deal.

Breaking the suspense: And the name of the winner is...

So we have four different types floating around:
  • MHD - a direct handle to a user-requested method (either virtual or static)
  • MHI - the magic type which warns the JVM of a method handle call site
  • MHB - a bound method handle, which remembers the method receiver
  • MHA - a more complex adapted method handle
I can see no particular benefit in distinguishing all these types in an API design. Therefore, I believe the proper spelling for all these types is something all-encompassing: java.dyn.MethodHandle. Clearly there will be other types under the covers, such as the concrete types chosen by the JVM for specific direct method handles (MHD), or various implementation classes of adapted methods (MHB, MHA). But there is no reason to distinguish them to the user.

However, one specific case of bound method handles is important to consider from the user’s viewpoint. If a receiver object R has a public method (in a public API type) already named invoke, with a signature of (S)T, then R is already looking very much like a bound method handle for its own invoke method, with signature (S)T.

For completeness of exposition, let’s give this kind of non-primitive method handle its own informal type name:

  • MHJ - a Java object that implements MethodHandle and a type-consistent invoke operation

So, at the risk of adding a chore to the JVM implementor’s list, I think an object of such a type (MHJ) should serve (uniformly in the contexts described above) as a method handle. (It may be necessary to ask that R implement the marker interface and the type method, but that is something the system could also figure out well enough on its own.) I admit that this is not a necessary feature, but it could cut in half the number of small method-like objects running around in some systems. And the MHA implementation above probably requires an MHJ anyway.
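
The API that eventually shipped (java.lang.invoke) does not treat such objects as method handles directly. The closest approximation I know of is an explicit step that binds the receiver's own invoke method, sketched below with Lookup.bind. The interface StringToInt and the classes LengthOf and InvokeObjectSketch are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical receiver type R with a public invoke method of signature (String)int.
interface StringToInt {
    int invoke(String s);
}

final class LengthOf implements StringToInt {
    @Override
    public int invoke(String s) { return s.length(); }
}

final class InvokeObjectSketch {
    // Turns an object with a suitable invoke method into a bound method handle
    // whose type is exactly the signature of that invoke method.
    static MethodHandle asHandle(Object r, MethodType type) {
        try {
            return MethodHandles.lookup().bind(r, "invoke", type);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Calls r.invoke(s) through the handle, with the exact signature (String)int.
    static int apply(Object r, String s) {
        try {
            MethodHandle h = asHandle(r, MethodType.methodType(int.class, String.class));
            return (int) h.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```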

Background: How did we get here?

One of the biggest puzzles for dynamic language implementors on the JVM, and therefore for the JSR 292 (invokedynamic) Expert Group, is how to represent bits of code as small but composable units of behavior. The JVM makes it easy to compose objects according to fixed APIs, but it is surprisingly hard to do this from the back end of a compiler, when (potentially) each call site is a little different from its neighbors, and none of them match some fixed API. The missing link is an object which will represent a chunk of callable behavior, but will not require an early commitment to a fixed calling sequence. In theory-language, we want an object whose API is polymorphic over all possible method signatures, so the compiler (and runtime call site linker, in turn) can manage calls in a common framework, not one framework per signature.

Put another way, we cannot represent all callees as Runnable or Callable, because fixed interfaces like those serve just a subset of all interesting call signatures. APIs which attempt to represent all possible calls, notably Java’s Core Reflection API, simulate all signatures by boxing arguments into an array, but this is a simulation (with telltale overheads) rather than a native JVM realization of each signature.
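
The difference between simulating a signature and realizing it can be seen by calling String.length both ways. The sketch uses java.lang.reflect.Method for the simulation and java.lang.invoke (the API that later shipped) for the native-signature call. The class BoxingContrastSketch is hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

final class BoxingContrastSketch {
    // Core Reflection: arguments travel in an Object[] and the int result
    // comes back boxed as an Integer.
    static Object reflectiveLength(String s) {
        try {
            Method m = String.class.getMethod("length");
            return m.invoke(s, new Object[0]);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Method handle: the call site carries the real signature (String)int,
    // so no argument array and no boxed result are required.
    static int handleLength(String s) {
        try {
            MethodHandle h = MethodHandles.lookup().findVirtual(
                    String.class, "length", MethodType.methodType(int.class));
            return (int) h.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```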

We know signature polymorphism is powerful, from our experience with many dynamic and functional languages. (For an old example, consider the Lisp APPLY function, which is an efficient but universal call generator.) Integrating such polymorphism into the Java language is challenging; that’s why the function types in Neal Gafter’s closures proposal are a significant portion of the specification.

Happily, it is a simpler matter to integrate signature polymorphism into the JVM. As part of the JSR 292 process, I have been worrying about this for some time. The result is the present story of method handles which (a) JVMs can implement efficiently, which (b) are useful to language backends, and which (c) have a workable Java API. That last is actually the hardest, which is why I have not given it yet. (See previous paragraph.)

Before giving the API, I want to emphasize a few more points. First, method handles (per se) are completely stateless and opaque. They self-report their signature (S)T (via a type operation on MethodHandle) but they reveal nothing else about their target. They do not perform any of the symbol table queries supplied by the Core Reflection API.
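
A small sketch of that self-reporting, using MethodType from java.lang.invoke (the API that later shipped). One difference from the sketch in this note is assumed here: the shipped MethodType exposes returnType() as its own method instead of using an index of -1. The class TypeReportSketch is hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class TypeReportSketch {
    // The type reported by a handle on String.concat: (String,String)String.
    static MethodType typeOfConcat() {
        try {
            MethodHandle h = MethodHandles.lookup().findVirtual(
                    String.class, "concat", MethodType.methodType(String.class, String.class));
            return h.type();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Walks the reported signature as a series of Class values.
    static String describe(MethodType t) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < t.parameterCount(); i++) {
            sb.append(t.parameterType(i).getSimpleName()).append(' ');
        }
        return sb.append("-> ").append(t.returnType().getSimpleName()).toString();
    }
}
```

The type is all the handle gives away: there is no operation here that returns the target's name or declaring class.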

Every native call site for a method handle is hardwired with a particular signature. Compiler writers have every right to expect that, if the target method has a similar signature, the call will have only a few instructions of overhead. Likewise, a method handle’s signature is intrinsic to the handle, and completely rigid. Calls to near-miss signatures will fail, as will violations of class loader naming consistency.

Besides signature simulation, one serious overhead in the Core Reflection API is the requirement that, on every call to a reflected method, the JVM look at the caller’s identity and perform an access check to make sure that he is not calling someone else’s private method. The method handle design respects all such access checks, but performs them up front at handle creation, where (presumably) they are more affordable. But you can publish a handle to your own private method, if you choose.
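
Here is a minimal sketch of the up-front access check, again using java.lang.invoke (the API that later shipped) as a stand-in. The class Secret publishes a handle to its own private method; an outsider is refused when it tries to create the same handle by name, yet can call freely through the published one. The classes Secret and AccessCheckSketch are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical class that chooses to publish a handle to its own private method.
final class Secret {
    private static String whisper(String s) { return "psst: " + s; }

    // The access check happens here, once, against Secret's own lookup.
    static MethodHandle publishWhisper() {
        try {
            return MethodHandles.lookup().findStatic(
                    Secret.class, "whisper", MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class AccessCheckSketch {
    // An outsider with only public access cannot create the handle by name.
    static boolean outsiderIsRefused() {
        try {
            MethodHandles.publicLookup().findStatic(
                    Secret.class, "whisper", MethodType.methodType(String.class, String.class));
            return false;
        } catch (IllegalAccessException expected) {
            return true;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // The same outsider can call through the published handle.
    static String callPublished(String s) {
        try {
            return (String) Secret.publishWhisper().invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```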

One use case (which I have used to test the quality of this design) is whether it can be used to re-implement the invoke functionality in the Core Reflection API, for better speed and code compactness. This has long been a sore spot for language implementors (for reasons detailed above). This is one reason I have included varargs in the competency of the method adaptation API.
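
To make that use case concrete, here is one way a reflective-style invoke could be layered on method handles, using java.lang.invoke (the API that later shipped) and assuming unreflect and asSpreader as the relevant pieces. It is a sketch of the idea only, not how any Core Reflection implementation actually works. The class ReflectiveInvokeSketch and its methods are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;

final class ReflectiveInvokeSketch {
    // A stand-in for Method.invoke: convert the Method to a handle (access is
    // checked here), then adapt the handle to take all of its arguments,
    // receiver included, from one Object[].
    static Object invokeLike(Method m, Object... receiverAndArgs) {
        try {
            MethodHandle h = MethodHandles.lookup().unreflect(m);
            MethodHandle spread = h.asSpreader(Object[].class, h.type().parameterCount());
            return spread.invoke(receiverAndArgs);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Usage example: "foo".concat("bar") through the reflective-style entry point.
    static Object concatDemo() {
        try {
            Method concat = String.class.getMethod("concat", String.class);
            return invokeLike(concat, "foo", "bar");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A real replacement would create the adapted handle once per Method and cache it, so that the access check and adaptation are not repeated on every call.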

The calling sequence for a method handle (in part 2 above) will be approximately as fast as today’s interface invocations. Searching for an invoke method in a receiver is the same sort of task as searching for an interface (and its associated “vtable”, if you use such things). The search can be sped up by the usual sorts of pre-indexing. A JVM-managed method handle will advertise its signature prominently in its header, so that a pointer equality check (remember, signature agreement is exact) is all that needs to happen before the caller jumps through a hardware-level function address.

Details and a hasty exit

Finally, here is a sketch of the API:
package java.dyn;

public interface MethodHandle /*>*/ {
    // T type();  public R invoke(A...);

    public MethodType type();
}

public interface MethodType {
    public Class parameterType(int num);  // -1 => return type
    public int parameterCount();
}

public class MethodHandles {
    public static MethodHandle
    findStatic(Class defc, String name, MethodType type);

    public static MethodHandle
    findVirtual(Class defc, String name, MethodType type);

    public static MethodHandle
    findSpecial(Class defc, String name, MethodType type);

    public static MethodHandle
    unreflect(java.lang.reflect.Method m);

    public static MethodHandle
    convertArguments(MethodHandle mh, MethodType newType);

    public static MethodHandle
    insertArgument(MethodHandle mh, Object value);
...

    // The whole enchilada:
    public static MethodHandle
    adaptArguments(MethodHandle mh, MethodType newType,
                   String argumentMovements, Object values);
}

That’s it, in a nutshell. Perhaps a rather large coconut shell. Actually, quite small, if you are used to Unix shells.

You will have noticed that there is no way to call these guys from Java code, unless you assemble yourself a class file around the required invokeinterface. It is simple enough to create a Java API for calling method handles. Getting performance beyond the reflective boxed-varargs style of calling is a little messier, but doable. Dynamic language implementors solve this sort of thing as they fight to remove simulation overheads from their system. Given closures in Java, there would be nicer bridges for interoperability, to say nothing of implementing closures on top of method handles.
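
For what it is worth, the API that later shipped (java.lang.invoke) does include one such bridge, MethodHandleProxies.asInterfaceInstance, which wraps a handle in an object implementing a single-method interface. The sketch below assumes that bridge and uses java.util.function.Function as the interface. It is the convenient route, with no claim about its performance. The class JavaBridgeSketch is hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandleProxies;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

final class JavaBridgeSketch {
    // Wraps a handle on String.toUpperCase() as an ordinary Java object,
    // so plain Java code can call it without touching the handle itself.
    @SuppressWarnings("unchecked")
    static Function<String, String> upperCaser() {
        try {
            MethodHandle h = MethodHandles.lookup().findVirtual(
                    String.class, "toUpperCase", MethodType.methodType(String.class));
            return MethodHandleProxies.asInterfaceInstance(Function.class, h);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```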

But the point is not calling or using these things from Java; the point is using them, down near the metal, to assemble the next 700 witty and winsome programming languages.

Comments:

At the risk of sounding like I don't get out much, I'm really pleased to see the progress being made on JVM, or should I say MLVM, changes.

As someone who uses one language evolution of Java that leverages the VM, AspectJ, I think that the VM changes are *the* crucial stream of work on Java 7, especially where the VM forces less efficient approaches.

Keep up the great work!

Posted by Neale on April 17, 2008 at 11:08 PM PDT #

Compiler error: You use MethodHandle.getType() in the pseudo code but you have defined MethodHandle.type() in the API sketch.

Posted by Andrea Francia on April 18, 2008 at 05:05 AM PDT #

Neale: Thanks. Working on it...

Andrea: Grazie. I simplified getType => type.

There's some conversation on this over at http://groups.google.com/group/jvm-languages/t/f8df67386ad3c17d .

Posted by John Rose on April 18, 2008 at 06:05 AM PDT #

Great Work.

Are you folks also considering a way to extend classes in Java the way "prototype" extends a class in JavaScript?

Posted by Mayur Patel on October 06, 2008 at 04:49 AM PDT #

Is there any bridge between the old Method class and the new MethodHandle?
I understand that while Methods allow users to get information about method that they can't invoke, MethodHandle cannot.

I suppose such a bridge is easy to build (just take the Class, the method name and the parameter types). Am I wrong?

Posted by Emmanuel CASTRO on November 03, 2009 at 09:15 PM PST #

Mayur: We have some ideas on the back burner; hopefully I'll have time to blog them some day.

Emmanuel: The bridge is java.dyn.MethodHandles.unreflect, which takes a java.lang.reflect.Method, checks it against the caller's access rights, and returns a method handle.

Posted by John Rose on November 10, 2009 at 07:43 AM PST #

About

John R. Rose

Java maven, HotSpot developer, Mac user, Scheme refugee.

Once Sun and present Oracle engineer.
