dynamic invocation in the VM
By John.Rose-Oracle on May 07, 2008
For several years now, JSR 292 has promised an
invokedynamic instruction in one form or another. The problem has been with picking the one form that simultaneously enables a good range of use cases, addresses several architectural challenges in the JVM, and can be optimized by a variety of commercial JVMs. It has been a restless search for “one bytecode to rule them all”.
The EG has decided to propose an answer, in the form of an Early Draft Review, which is (to me at least) surprisingly simple to specify and implement. It does not even introduce a change to the bytecode format or verifier, yet it provides a hook which refers all important decisions at a dynamic call site out of the JVM and into Java code. This note builds on a previous blog entry, giving more concrete details and use cases.
The previous blog entry promised an EDR “in a few weeks”; it will have been twenty-one weeks when the JCP releases the EDR next week. The internal reason for the delay was our sense that early versions of the design tried to do too much in one monolithic bytecode. The current design, serving the same set of use cases and requirements, is refactored by heavy use of method handles, which greatly reduces complexity and clarifies the various roles of language implementors and the JVM.
Why add another invoke bytecode? The answer is that call sites (instances of invoke bytecodes) are useful, and yet the existing formulas for invocation are tied so closely to the Java language that the natural capabilities of the JVM are not fully available to languages that would benefit from them. The key restrictions are:
- the receiver type must conform to the resolved type of the call site
- there is no generic way to create adapters around call targets (a corollary of the previous point)
- the call site must link, which means the resolved method always pre-exists
- the symbolic call name is the name of an actual method (a corollary of the previous point)
- argument matching is exact with no implicit coercions (another corollary)
- linkage decisions cannot be reversed (although optimization decisions change, invisibly)
Dynamic language implementors expend much time and effort working around these limitations, simulating generic calls in terms of JVM invoke bytecodes constrained by the Java language. Although individual language requirements differ in detail, the following generic requirements seem (to me) to be representative:
- the receiver can be any object, not just one created by a specific language runtime
- call site linkage is under the complete control of the runtime
- call site linkage can change over time
- type checking of receivers (and all other arguments) is under runtime control
- there are generic ways for the runtime to build wrappers, ways that work for all relevant descriptors
- despite all this, a call site can make a direct call to its target method
There is another set of requirements, too, which is about the practical problems of introducing new bytecode behavior into a large, mature Java ecosystem, one with many independent vendors and creators of JVMs, tools, and libraries. The new facilities must be as backward compatible as possible, in all dimensions:
- the new bytecode behaviors must have a simple and precise specification
- tools that manipulate bytecodes must be minimally disrupted
- the new behaviors must be reasonably simple to implement, across the range of JVM implementation styles
- existing JVMs must be able to readily process them, with good code optimization
Solution: The linkage state is a method handle

Our solution to these requirements is in three steps. First, we factor out method handles as a simple and generic way of managing methods (arbitrary JVM methods) as units of behavior, which are (as methods should be) directly callable. Second, we define an invokedynamic instruction with one machine word of linkage state, a handle to the call site’s target method. Third, we define a set of core Java APIs for managing linkage state and creating the target method handles for call sites, taking care that these APIs can present the right optimization opportunities to JVMs that wish to exploit them.
Step one: method handles

The first step, of creating method handles, is described in my previous post on the subject. The method handle lifecycle requires new APIs for creating, adapting, and calling them. The key features of method handles are:
- a call to a method handle directly calls the wrapped method
- method handle invocation can potentially include the receiver-based dispatch behavior
- method handle invocation can potentially include access privileges enjoyed by the creating class (including private access)
- method handles can invoke both static and non-static methods
- there is an API to bind a method handle to a receiver, creating a bound method handle
- there is an API to adapt a method handle to include changes in argument and return type and value
- an invokeinterface bytecode on a method handle receiver has extended linkage behavior, allowing any descriptor (call signature) to be paired with the symbolic name
- when a method handle’s invoke method is called, the resolved descriptor exactly matches the receiver’s method handle type
This is the most complex part of the invokedynamic design, but it is also the most boring part, because every functional language includes the same sort of function types and their associated operations. The least boring part is where method handles touch (and therefore provide direct handles to) pre-existing JVM capabilities, notably interface and virtual invocation.
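Since the java.dyn API described here is still a draft, a pure-Java sketch may make the idea concrete. The names below (SimHandle, bindTo, adaptReturn) are invented for illustration only; the real draft API differs in detail. The point is simply that a method handle is a directly callable unit of behavior that can be bound to a receiver and adapted in argument and return value:

```java
// A tiny pure-Java sketch of the method-handle idea: a directly callable
// unit of behavior that can be bound and adapted. All names are invented.
import java.util.function.Function;

public class MethodHandleSketch {
    // A "method handle": a directly callable unit of behavior.
    interface SimHandle {
        Object invoke(Object... args);
    }

    // Simulate binding a handle to a receiver: the bound handle prepends
    // the receiver to the argument list of the underlying handle.
    static SimHandle bindTo(SimHandle h, Object receiver) {
        return args -> {
            Object[] full = new Object[args.length + 1];
            full[0] = receiver;
            System.arraycopy(args, 0, full, 1, args.length);
            return h.invoke(full);
        };
    }

    // Simulate adapting a handle: transform its return value.
    static SimHandle adaptReturn(SimHandle h, Function<Object, Object> conv) {
        return args -> conv.apply(h.invoke(args));
    }

    public static void main(String[] unused) {
        // A handle on String.concat(String): receiver plus one argument.
        SimHandle concat = args -> ((String) args[0]).concat((String) args[1]);
        SimHandle greet = bindTo(concat, "hello, ");
        SimHandle loud = adaptReturn(greet, r -> ((String) r).toUpperCase());
        System.out.println(greet.invoke("world"));  // hello, world
        System.out.println(loud.invoke("world"));   // HELLO, WORLD
    }
}
```

In the real design, of course, the JVM knows the structure of these handles and can call through them without the boxing shown here.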
Once we have a firm foundation for working generically with methods as directly callable units of behavior, the remaining steps are relatively easy.
Step two: dynamic linking

The invokedynamic instruction is almost identical to an invokeinterface instruction, in that it has a symbolic method name and descriptor, and can process any type of non-null receiver. (Note that the JVM verifier does not make a static requirement on the receiver of an interface call.) However, an invokedynamic instruction does not specify a particular receiver in which the method name and descriptor are linked. In fact, an invokedynamic instruction specifies no particular interface type against which to link. Syntactically, it is identical to an invokeinterface instruction, except that its symbolic interface is the dummy type java.dyn.Dynamic. (This interface is defined in the JVM, but it has no methods, and nobody ever needs to implement it.) The verifier passes such instructions without objection, in all past and future JVMs. The same is true for any other tool (such as Pack200) that is less restrictive than the verifier about the bytecodes that it will process.
Each instance of an invokedynamic instruction is associated with a hidden variable, called the target method, which encodes that call site’s linkage state. This variable is managed by the JVM but not visible as a named variable in any Java class. Its type is a method handle, but it starts out null, which means the site is not yet linked. When the site is linked, any call to that site is equivalent to a call to the target method handle.

A linked invokedynamic site works very much like a normal method handle invocation site, with the same generality of calling sequences (descriptor polymorphism). The argument and return type matching rules are the same: the resolved descriptor of the call site must exactly match the target method’s type. But there are these differences from explicit method handle invocation:
- the called method (the explicit receiver with non-dynamic invocation) comes from the linkage state word
- the symbolic method name can be any constant string whatsoever
- there is an unlinked state which causes special processing of the call site
- the JVM may have more opportunity to constant fold and/or inline the linkage state, if it appears to be stable
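The linkage state machine described above can be modeled in a few lines of pure Java. This is only an illustrative sketch with invented names (SimCallSite, SimHandle); in the real design the target is JVM-managed hidden state, not a visible field, and the dispatch costs nothing more than an indirect call:

```java
// A pure-Java model of invokedynamic linkage state: one word of state
// (the target method handle), null meaning "unlinked". Invented names.
public class CallSiteModel {
    interface SimHandle { Object invoke(Object... args); }

    static class SimCallSite {
        private final SimHandle bootstrap;   // runtime's "call for help"
        // The one machine word of linkage state. Volatile, since the
        // runtime may relink while other threads are calling.
        private volatile SimHandle target;

        SimCallSite(SimHandle bootstrap) { this.bootstrap = bootstrap; }

        boolean isLinked() { return target != null; }
        void setTarget(SimHandle t) { target = t; }  // null unlinks

        // Calling the site: if linked, go straight to the target;
        // if unlinked, fall into the bootstrap logic instead.
        Object call(Object... args) {
            SimHandle t = target;
            return (t != null) ? t.invoke(args) : bootstrap.invoke(args);
        }
    }

    public static void main(String[] unused) {
        SimCallSite site = new SimCallSite(args -> "unlinked: ran bootstrap");
        System.out.println(site.call());              // unlinked: ran bootstrap
        site.setTarget(args -> "linked: direct call");
        System.out.println(site.call());              // linked: direct call
    }
}
```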
Step three: target method management

Beyond the basic APIs for managing method handles, there are specific APIs for managing the linkage state of call sites. The most important API pertains to bootstrap methods, which I have previously discussed. Other APIs have to do with the specification of bootstrap methods, bulk invalidation of multiple dynamic call sites, and optimizable combinators which produce useful target method handles.
There is not room in this blog for full treatment of these APIs; I will merely sketch them. More details are given in the JSR 292 EDR, which is coming out next week. (The ‘E’ in ‘EDR’ stands for early. It is still early.) Full details will evolve over time, and will be easily observable as javadoc comments on the reference implementation (RI) in the Da Vinci Machine repository forest, and will of course be duly separated out from the RI into the final JSR 292 specification.
To continue... Each class containing invokedynamic instructions must also specify a bootstrap method. This specification happens either via a classfile attribute, or else via an explicit registration call.
The bootstrap method is itself a method handle, whose signature is universal in about the same way that reflective invocation is universal: The arguments are boxed into an object array, and primitive return values are unboxed. (Unlike reflective invocation, any thrown exceptions are left unwrapped.) In effect, the JVM allocates one additional machine word per class to hold the method handle of the bootstrap method for that class.
When an unlinked dynamic call site is executed, the bootstrap method is called. It receives the following information:
- the outgoing arguments in an array, with all primitives boxed
- static information, of type java.dyn.StaticContext, about the context of the call: caller class, method name, resolved descriptor, etc.
- a capability object, of type java.dyn.CallSite, which supports getting and setting of the call site’s target method (linkage state)
First of all, the bootstrap method is responsible for fulfilling the current call. It may ignore the CallSite value, or it may store it in a table somewhere, or it may immediately compute an appropriate target method (as a method handle) and use the CallSite.setTarget method to link the call site.
The extra information (beyond the arguments themselves) reifies (or makes real to Java) the call site itself. Most of this reification is merely informative (introspective), but the CallSite, crucially, lets the language runtime change the reified call site’s target method. The bootstrap method can be viewed as a call for help from the call site itself, a messenger temporarily incapable of delivering a message, sending a meta-message to the language runtime. The answer to the call for help is the setTarget call, a meta-message back to the messenger. When linkage is complete, the messenger disappears again, and the call site may again be viewed as a simple message, which calls the target method.
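To make the call-for-help protocol concrete, here is a pure-Java sketch of one-time linkage: the unlinked site up-calls the bootstrap, which computes a target, installs it with setTarget, and fulfills the current call. Apart from setTarget, the names (SimHandle, SimCallSite, callSiteInvoke) are invented; the draft types are java.dyn.CallSite and java.dyn.StaticContext:

```java
// Sketch of the bootstrap protocol: the site's first execution calls the
// bootstrap method, which links the site and answers the current call.
import java.util.concurrent.atomic.AtomicInteger;

public class BootstrapSketch {
    interface SimHandle { Object invoke(Object... args); }

    static class SimCallSite {
        volatile SimHandle target;            // null means unlinked
        void setTarget(SimHandle t) { target = t; }
    }

    static final AtomicInteger bootstrapCalls = new AtomicInteger();

    // The bootstrap method: "a call for help from the call site itself".
    static Object bootstrap(SimCallSite site, String methodName, Object[] args) {
        bootstrapCalls.incrementAndGet();
        // One-time linkage: find a target for this name and install it.
        SimHandle found;
        if (methodName.equals("double")) {
            found = a -> ((Integer) a[0]) * 2;
        } else {
            throw new LinkageError("no such method: " + methodName);
        }
        site.setTarget(found);
        // Fulfill the current call as well.
        return found.invoke(args);
    }

    // What the JVM does at the call site: go through the target if
    // linked, otherwise up-call to the bootstrap method.
    static Object callSiteInvoke(SimCallSite site, String name, Object... args) {
        SimHandle t = site.target;
        return (t != null) ? t.invoke(args) : bootstrap(site, name, args);
    }

    public static void main(String[] unused) {
        SimCallSite site = new SimCallSite();
        System.out.println(callSiteInvoke(site, "double", 21)); // 42 (links the site)
        System.out.println(callSiteInvoke(site, "double", 8));  // 16 (direct call)
    }
}
```

After the first call the bootstrap is out of the picture; every later call goes straight through the installed target.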
By the way, passing a null to setTarget will immediately unlink the corresponding call site, restoring it to its original unlinked state. Passing it another method handle will immediately change the linkage of the call site. Interactions with the memory model have to be nailed down, but the setTarget call will probably have a similar effect (regarding memory order) to setting either a plain static variable, or a volatile one.
Performance considerations

All this will be disappointingly slow if the JIT is not able to process dynamic call sites with optimizations similar to those it applies to other call sites. Because the linkage state is a target method, it can be directly invoked, the same as with a normal method handle call.
Because the linkage state is a single reference, it is relatively simple for a non-optimizing JIT to compile a dynamic call site with a pluggable pointer, not too different from the old monomorphic call pattern. The pointer calls through to the current target method, or perhaps jumps through a signature adapter to the bootstrap method. The bootstrap adapter can be generated by brute force in the JVM, or (more likely) by an up-call to let Java code do the heavy lifting.
The linkage state is not an exposed static variable, as in the case of pure Java simulations of dynamic calls. Having it be hidden means that the JIT can conspire with the JVM to do a deep analysis on the structure of the target method. Getting to the bottom of this analysis requires one more thing: The target method handle, if it is more complex than a simple direct method reference, must still be simple enough for the JIT to analyze (as a compile-time constant object) and “see through to the bottom”.
In particular, if the target method is an adapter which performs type tests and dispatches to one of several ultimate targets, that dispatch logic must be fully foldable. Who is responsible for making this happen? First, the JVM implementor must provide the APIs which create such cascading, adapting method handles, and make them have a format which is transparent to the JIT. (This is a JVM-specific task.)
Second, the dynamic language implementor should use those APIs which the JIT is able to optimize, rather than using custom versions unknown to the JVM implementor. (There is a style of higher-order function called a combinator which I think will be useful, but this blog entry is already long enough.) Nevertheless, sometimes the Java coders lead while the JIT follows as best it can. We will sort this out during the EDR period, as we prototype.
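To hint at what such a combinator might look like, here is a pure-Java sketch of a guard-with-test style combinator: it builds a handle that runs a type test on the first argument and dispatches to one of two targets. All names (SimHandle, guardWithTest) are invented; the JVM implementor's real versions would be built in a form the JIT can see through and fold:

```java
// Sketch of a "guard with test" combinator: a higher-order function that
// composes method handles into a small decision tree. Invented names.
import java.util.function.Predicate;

public class CombinatorSketch {
    interface SimHandle { Object invoke(Object... args); }

    // Build a handle that tests the first argument and dispatches to one
    // of two targets. Cascading these yields the decision trees the JIT
    // must be able to fold down to direct calls.
    static SimHandle guardWithTest(Predicate<Object> test,
                                   SimHandle ifTrue, SimHandle ifFalse) {
        return args -> test.test(args[0]) ? ifTrue.invoke(args)
                                          : ifFalse.invoke(args);
    }

    public static void main(String[] unused) {
        SimHandle intPlus = a -> ((Integer) a[0]) + ((Integer) a[1]);
        SimHandle strPlus = a -> String.valueOf(a[0]) + a[1];
        // A tiny decision tree for a dynamic "+" operation.
        SimHandle plus = guardWithTest(x -> x instanceof Integer,
                                       intPlus, strPlus);
        System.out.println(plus.invoke(2, 3));      // 5
        System.out.println(plus.invoke("a", "b"));  // ab
    }
}
```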
Applications

Here are a few of the applications of this design:
- simple one-time linkage (bootstrap method finds a handle, installs it as final target, and calls it in tail position)
- call to Java method (bootstrap method emulates JVM linkage, does one-time linkage)
- call to pseudo-Java extension method (bootstrap method finds extension method, does one-time linkage)
- monomorphic inline cache (bootstrap method installs an optimistic target, with a type test and fall back to bootstrap again)
- polymorphic inline cache (like previous, but bootstrap method re-balances a decision tree each time a new type is seen)
- call site warmup (bootstrap method increments a count and gathers type info until the call site is mature for optimization, then installs a well-crafted target method)
- traits (like polymorphic inline cache, but method lookup does pattern matching and method instantiation)
- pattern matcher (decision tree within target method has patched-in recognizers, can grow over time as new cases appear)
- PyPy-style block-oriented JIT (each branch point is a patchable invoke site, adds more compiled blocks as new types are seen; requires tailcall also)
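One of these patterns, the monomorphic inline cache, can be sketched in pure Java today. The bootstrap links an optimistic target guarded by a class check; on a miss, the guard falls back to the bootstrap, which relinks. Everything here (SimHandle, SimCallSite, the "len:" method body) is invented for illustration:

```java
// Pure-Java sketch of a monomorphic inline cache built on the
// bootstrap/setTarget protocol described above. Invented names.
public class InlineCacheSketch {
    interface SimHandle { Object invoke(Object... args); }

    static class SimCallSite { volatile SimHandle target; }

    static int lookups = 0;  // counts (slow) method lookups

    // Slow path: "look up" the method implementation for a receiver class.
    static SimHandle lookup(Class<?> receiverClass) {
        lookups++;
        return args -> "len:" + args[0].toString().length();
    }

    // Bootstrap: install an optimistic, class-guarded target, then call it.
    static Object bootstrap(SimCallSite site, Object... args) {
        Class<?> seen = args[0].getClass();
        SimHandle fast = lookup(seen);
        site.target = a -> (a[0].getClass() == seen)
                ? fast.invoke(a)         // cache hit: direct call
                : bootstrap(site, a);    // cache miss: relink for new type
        return fast.invoke(args);
    }

    static Object call(SimCallSite site, Object... args) {
        SimHandle t = site.target;
        return (t != null) ? t.invoke(args) : bootstrap(site, args);
    }

    public static void main(String[] unused) {
        SimCallSite site = new SimCallSite();
        System.out.println(call(site, "abc"));  // len:3 (links, guarded by String)
        System.out.println(call(site, "hi"));   // len:2 (cache hit, no lookup)
        System.out.println(call(site, 42));     // len:2 (Integer miss; relinks)
    }
}
```

The polymorphic variant would keep several guarded targets in a small decision tree instead of replacing the single cached one.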
All of these applications (patterns, really) can be coded up today as pure Java patterns. In fact, if the invokedynamic feature is to be successful, the same APIs I described above as belonging to upgraded JVMs will also have to be implemented, backward compatibly and more expensively, on existing JVM releases. The fact that this is possible does not remove the need to extend the JVM architecture.

Such backward-compatible code patterns are really simulations, not direct encodings, because the unit of behavior cannot be represented today as a simple method, named via a method handle. In the backward compatible code, a unit of behavior will be represented by a method, wrapped in a class, named by a throwaway name, loaded in a throwaway classloader (if GC is required), installed in the JVM’s internal dictionary, implementing a signature-specific interface, and finally stored as a target method in a static variable (or worse, in a table of some sort). Everything I just described beyond the method handle is the overhead required by a simulation of method handles in existing systems. By integrating method handles into the JVM, then building invokedynamic on method handles, we get a new, more direct way of accessing the power of modern JVMs for dynamic code.