In the Java Virtual Machine, method calls are the way work gets done. This note is a slightly simplified description of the parts of a call site and what they do. I will also sketch some of the implications of this design on the JVM’s support for languages other than Java.
For the absolutely complete and correct details, you’ll have to read the Java VM Specification, version 3 as amended by JSR 202. If the alert reader of the account below finds an inconsistency with the JVM specification, the latter is of course correct (and I myself would appreciate an alert).
Here are the parts that make up any method call, as found in the JVM bytecodes:
(The term “symbolic” is adopted from the JVM specification. The JVM specification uses the term “target” instead of “receiver”, but the HotSpot JVM uses the term “receiver”, which dates back to its Smalltalk roots.)
Here, a “fixed” part of the call is one which can be determined at some point before the first call. A “variable” part is one which may change over time, each time the call is executed.
A bytecode instruction is one of four kinds:
invokeinterface. The format of all these is substantially the same. The instruction format includes an operand field which refers to a constant pool entry that encodes the symbolic type, name, and descriptor.
If the receiver is missing, the bytecode must be invokestatic. If the receiver is present and the resolved type is an interface, the bytecode must be invokeinterface. Otherwise, the symbolic type is a class or array, and the bytecode must be
The resolved type is derived from the symbolic type at some point before the first execution of the call. Likewise, the resolved method is derived from the resolved type by searching for the given symbolic name and descriptor. If these derivations fail, errors are raised, and the call is never executed. We shall generally pass over such errors with a respectful silence.
Both the resolved type and resolved method must be accessible to the class containing the call instruction.
If the method returns normally, it produces a return value (if not void). If the method returns abnormally, it produces a thrown exception. As a third possibility, it might never return at all.
If there is a receiver, the JVM ensures that it is not null. It does not convert the receiver in any way, although it may dynamically test the receiver’s type. In the case of invokevirtual, the JVM’s verifier proves statically that the receiver type is always a subtype of the resolved type.
In the case of invokeinterface, the verifier allows any reference type, but the JVM performs a dynamic check on the receiver type, on every call. This check ensures that the receiver type is actually a subtype of the resolved type.
From one point of view, the receiver is not an argument, because its type does not appear in the call’s symbolic descriptor. However, in the layout of the JVM stack (of the caller) and locals (of the callee), the receiver looks exactly like an initial argument before the arguments mentioned in the descriptor.
We can therefore talk about the effective call descriptor of a call, which is the type of the tuple on the stack which the call consumes, plus the type of the return value which the call produces on the stack. If the call is not invokestatic, this effective descriptor can be spelled by modifying the symbolic descriptor, prepending the call’s symbolic type.
A callee method also has an effective method descriptor, which is the type of the tuple initially in the JVM local variables on entry to the callee (and also describes the value eventually returned back to the caller). Again, this effective descriptor may be spelled (if the method is non-static) by prepending the method’s defining class to the method’s symbolic descriptor. The caller and callee must agree exactly on the symbolic descriptor used to make the call, but they may disagree in the first argument of their effective descriptors. This is safe because the JVM tests the receiver type so as to ensure that no improper method is called.
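The spelling of an effective descriptor can be sketched in a few lines of Java. This is only an illustration: it uses the `java.lang.invoke.MethodType` class from later JDKs (not something this note relies on), and the class and method names are hypothetical.

```java
import java.lang.invoke.MethodType;

class EffectiveDescriptorSketch {
    // Spell the effective descriptor of a non-static call by prepending
    // the receiver type to the parameters of the symbolic descriptor.
    static String effectiveDescriptor(Class<?> receiverType, String symbolicDescriptor) {
        // A null class loader means "use the system class loader" here.
        MethodType symbolic = MethodType.fromMethodDescriptorString(symbolicDescriptor, null);
        return symbolic.insertParameterTypes(0, receiverType).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // The callee String.charAt has the symbolic descriptor (I)C.
        System.out.println(effectiveDescriptor(String.class, "(I)C"));
        // A caller may name the same method through the supertype CharSequence.
        System.out.println(effectiveDescriptor(CharSequence.class, "(I)C"));
    }
}
```

The two printed descriptors share the symbolic part `(I)C` and differ only in the leading receiver type, which is the one place where caller and callee are allowed to disagree.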
The JVM does not perform conversion on arguments. The JVM’s verifier proves statically that each argument type will match the corresponding descriptor type. The matching is exact for primitive types in a descriptor, except that any type smaller than int can convert (with sign extension) to an int. The matching for class or array types follows the class hierarchy in the expected way. An interface type in a descriptor will match any reference argument, whether it implements that interface or not. (The JVM defers interface type checking until the reference is used as the receiver of a call.)
There is no essential reason the JVM cannot convert the arguments, as long as the conversions “preserve information”. More specifically, they should not violate intentions shared by the caller and the callee. The JVM cannot know such intentions, but it can provide conventions which align well with the implicit conversions found in most languages.
For example, if the caller passes an int and the callee receives an Integer wrapping the passed value, no type safety is violated, no information is destroyed, and the intentions of caller and callee should continue to match accurately. The Java compiler performs this conversion (called “autoboxing”) as part of method calls. The inverse conversion (“unboxing”) is also reasonable. Conversion between reference types is also reasonable: If a caller passes a String and the callee expects any Object (which includes strings), there would be no harm if the JVM allowed the different descriptors to match. The inverse conversion (casting, with the possibility of a ClassCastException) is also reasonable. Again, the Java compiler performs this conversion when it converts between generically typed methods and their erased types.
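As a rough illustration of boxing and reference widening applied between a caller's descriptor and a callee's, here is a sketch using `MethodHandle.asType` from the `java.lang.invoke` package of later JDKs. That API is an assumption external to this note, and the class and method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ArgumentConversionSketch {
    // Hypothetical callee: receives a boxed Integer and any Object.
    static String describe(Integer n, Object o) {
        return n + ":" + o;
    }

    // The caller's descriptor is (int, String)String, while the callee's is
    // (Integer, Object)String. asType reconciles the two by boxing the int
    // and widening the String reference to Object.
    static String callThroughAdapter(int n, String s) {
        try {
            MethodHandle callee = MethodHandles.lookup().findStatic(
                    ArgumentConversionSketch.class, "describe",
                    MethodType.methodType(String.class, Integer.class, Object.class));
            MethodHandle adapted = callee.asType(
                    MethodType.methodType(String.class, int.class, String.class));
            return (String) adapted.invokeExact(n, s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughAdapter(42, "x"));
    }
}
```

The callee never learns that the caller used a different descriptor; the conversions are attached to the call path rather than written into either method.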
A more spectacular (but still valid) conversion would be to package up some or all of the argument tuple into an object array, and pass it as a single argument to the callee. As long as the callee is expecting this format of arguments, again the structure of the program as a whole is preserved. The Java compiler performs this transformation for varargs methods. The inverse would also make sense: A caller could pass an object array, with the callee receiving a tuple of arguments. The Java Core Reflection API performs this conversion on every call to
Here are some basic argument transformations which could be considered to be intention preserving:
Such conversions would violate the rules of Java method resolution, but other languages would find them useful. (Maybe even a future version of Java…) In particular, dynamic languages routinely convert back and forth between more and less specific call descriptors. For example, Lisp’s APPLY function performs argument list unboxing. Any kind of bridge from a dynamically typed language to Java APIs is likely to perform many unboxing and cast conversions on arguments passed to regular Java methods. Although specific languages are likely to require even more types of argument conversions (such as between strings and numbers, or lists and arrays), it is likely that the JVM can reduce their implementation complexity by providing a basic repertoire of conversions between calling sequences, including some of the above conversions.
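Packaging a tuple of arguments into an array, and the APPLY-style inverse, might look like the following sketch. It assumes the `asCollector` and `asSpreader` adapters of `java.lang.invoke.MethodHandle` in later JDKs, which this note does not itself describe. All class and method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ArgumentListSketch {
    // Hypothetical callee that expects its arguments boxed into one array.
    static int count(Object[] args) {
        return args.length;
    }

    // Hypothetical callee that expects two positional arguments.
    static String join(String a, String b) {
        return a + "+" + b;
    }

    // Argument list boxing: the caller passes a tuple, the callee sees an array.
    static int collectThree(Object a, Object b, Object c) {
        try {
            MethodHandle callee = MethodHandles.lookup().findStatic(
                    ArgumentListSketch.class, "count",
                    MethodType.methodType(int.class, Object[].class));
            MethodHandle collector = callee.asCollector(Object[].class, 3);
            return (int) collector.invokeExact(a, b, c);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Argument list unboxing, in the manner of Lisp's APPLY: the caller passes
    // an array, the callee sees a tuple (each element is cast to String).
    static String spreadTwo(Object[] packed) {
        try {
            MethodHandle callee = MethodHandles.lookup().findStatic(
                    ArgumentListSketch.class, "join",
                    MethodType.methodType(String.class, String.class, String.class));
            MethodHandle spreader = callee.asSpreader(Object[].class, 2);
            return (String) spreader.invokeExact(packed);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(collectThree("a", "b", "c"));
        System.out.println(spreadTwo(new Object[] {"x", "y"}));
    }
}
```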
All this about implicit conversions of arguments assumes that there is some way of linking a symbolic method with one descriptor to an actual method of a different descriptor… read on for that part!
There is a symmetry between the treatment of outgoing argument values and incoming return values. Both are subject to the same type restrictions, as mediated by the descriptor.
The JVM’s verifier proves statically that the returned value matches the descriptor type of the method doing the return, except for interface values.
Return values could be subjected to intention-preserving transformations much as arguments could be.
More interestingly, there is no fundamental reason the JVM cannot return several values from a single call. It would be possible to slightly extend the syntax of method descriptors to allow several return values to be specified just as several arguments can be. This would be useful for languages that feature tuples; it would allow compilers to avoid boxing a tuple value on return from a method.
For languages which support dynamically selected multiple value returns (e.g., Common Lisp), a varargs return convention would be simple to create, corresponding to the varargs argument passing convention already in the JVM. Conversion between varargs and positional value passing would be intention preserving for return values just as for argument values.
The verifier does not check exception types. The JVM is always ready to receive any throwable from any call, without distinction. Exception handlers are defined by a per-method table indexed by bytecode index ranges, and a handler is often shared by several call instructions. A thrown exception which does not match any handler terminates the calling method abruptly, directing control to its caller. This process of popping the stack continues until a handler is found or the thread stack is completely emptied.
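One visible consequence of the verifier ignoring exception types is that Java's checked-exception rules exist only in the compiler. The sketch below uses a well-known generics trick (sometimes called a "sneaky throw") to throw a checked exception from a method that does not declare it. The class and method names are hypothetical.

```java
import java.io.IOException;

class SneakyThrowSketch {
    // The cast to E is erased, so at run time this is a plain athrow of t.
    @SuppressWarnings("unchecked")
    static <E extends Throwable> void sneakyThrow(Throwable t) throws E {
        throw (E) t;
    }

    // Declares no checked exceptions, yet throws an IOException.
    static void undeclared() {
        SneakyThrowSketch.<RuntimeException>sneakyThrow(new IOException("undeclared"));
    }

    // Reports the class name of whatever the call above throws.
    static String observe() {
        try {
            undeclared();
            return "no exception";
        } catch (Throwable t) {
            return t.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(observe());
    }
}
```

The caller has to catch `Throwable` here, because the compiler would reject a `catch (IOException e)` around a call that declares no such exception, even though the JVM delivers one.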
In the case of invokestatic, the actual method is just the resolved method. Bad things happen if the symbolic method cannot be resolved, or if the actual method is abstract or native and not loadable. After resolution, the call site can jump directly into the actual method.
In the case of invokeinterface, the actual method is derived from the resolved method by searching for an override in the receiver type.
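A small Java example may help keep the terms apart. In the sketch below (all names hypothetical), the resolved method is fixed by the static type named at the call site, while the actual method is found by searching the receiver's class for an override.

```java
class DispatchSketch {
    interface Shape {
        String name();
    }

    static class Base implements Shape {
        public String name() { return "base"; }
    }

    static class Derived extends Base {
        @Override
        public String name() { return "derived"; }
    }

    // javac compiles this call as invokeinterface; the resolved method is Shape.name.
    static String viaInterface(Shape s) {
        return s.name();
    }

    // javac compiles this call as invokevirtual; the resolved method is Base.name.
    static String viaClass(Base b) {
        return b.name();
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        // Both call sites select Derived.name as the actual method.
        System.out.println(viaInterface(d));
        System.out.println(viaClass(d));
    }
}
```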
These rules ensure various type safety properties of the JVM. Here are some crucial ones:
invokespecial, which has other limitations)
Oddly enough, interface types per se are never statically verified. The JVM verifier (though not the Java language) will allow any reference to be passed as an actual argument under an interface in a descriptor; it does not attempt to prove interface types. This is why invokeinterface must always perform a dynamic type check.
Here are more details: A method M defined in a concrete class C overrides a method N in a class or interface B if the following are all true:
The accessibility restrictions complicate things a bit, since package-private methods with the same name and descriptor and a common superclass can be mutually inaccessible, and therefore neither overrides the other.
Since two independent class loaders can assign the same class name C to two distinct (and incompatible) types, purely symbolic descriptors are not strong enough to ensure the type safety of arguments passed under the name C. That is why there are extra checks for the meaning of names found in method descriptors. These checks are called signature constraints, and that’s all I want to say about them in this note.
A JVM call instruction raises various kinds of errors when its symbolic method fails to correspond to a resolved method. Currently, this raises an error of some sort (about which we are being vague), but such a mismatch also provides room for interesting extensions. Most dynamic languages have a “message not understood” hook, which can be used to assign meaning to method calls that have no built-in meaning. In a specific language, such a hook will usually contain reflective code provided by a library writer (sometimes an end user) which looks around for a way to satisfy the caller’s intentions in a less literal way. For example, many languages provide a way for a dictionary object (Java calls it Map) to implement any protocol as long as the dictionary contains a suitably keyed closure for each of the protocol’s methods.
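In Java itself, the dictionary-as-protocol idea can be approximated with `java.lang.reflect.Proxy`, which places the hook on the object rather than on the call site. The following is a minimal sketch under that assumption, with hypothetical names (`Greeter`, `implement`) and lambdas standing in for closures.

```java
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class MapProtocolSketch {
    // Hypothetical protocol to be implemented by a dictionary.
    interface Greeter {
        String greet(String who);
    }

    // Builds an object that answers each protocol method by looking up a
    // closure keyed by the method's name.
    @SuppressWarnings("unchecked")
    static <T> T implement(Class<T> protocol, Map<String, Function<Object[], Object>> table) {
        return (T) Proxy.newProxyInstance(
                protocol.getClassLoader(),
                new Class<?>[] { protocol },
                (proxy, method, args) -> {
                    Function<Object[], Object> closure = table.get(method.getName());
                    if (closure == null) {
                        // The "message not understood" case for this sketch.
                        throw new UnsupportedOperationException(method.getName());
                    }
                    return closure.apply(args == null ? new Object[0] : args);
                });
    }

    static String demo() {
        Map<String, Function<Object[], Object>> table = new HashMap<>();
        table.put("greet", a -> "hello, " + a[0]);
        return implement(Greeter.class, table).greet("world");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that every call through such a proxy repeats the lookup and boxes its arguments, which is the per-call overhead that a linkage-time hook would avoid.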
In a multi-language VM (and with the JVM, specifically), such a hook needs to be placed on call sites, not on specific objects, because it is call sites that are most directly tied to the intentions of their language. An object might be shared between several languages, and in fact it would be a sign of weak VM design if each language had to implement its own types like Object, etc. Therefore, instead of several languages competing to define the API of shared types like Integer, each language needs its own hook for extending the shared APIs.
There is another difference between the JVM and single-language systems which bears on the design of this hook, and that is the JVM’s strong distinction between reflective and normal method invocation. As discussed below, reflective calling sequences are slower and more complex because they perform many steps of boxing, unboxing, dispatching, and access control on every call. With normal JVM calls, these steps, if done at all, are finished in a linkage step before the first call executes. A “message not understood” hook appropriate to the JVM needs to work this way also: It needs to perform its linkage work once before a number (potentially unlimited) of actual method calls. When the call to the hook delivers an actual method to be used, this method should be associated with that call site and reused for similar calls in the future.
A final difference between the JVM and a specific language’s runtime is that the details of the “message not understood” hook should not be tailored to one language, but should rather be a general and flexible means of satisfying method calls. Single inheritance or even single-argument dispatch are too limited a range of functionality, especially for dynamic languages. (Consider the case of an extensible “add” operation in a symbolic algebra library.) This means that there needs to be a low-level convention for associating the receiver and argument types with an actual method that has previously been associated with those types, and can be reused in the future without further up-calls to the hook.
We will call this matching actual method the applicable method. It is characterized not only by a particular method to invoke, but also a set of guards (argument and receiver types, or other constraints) which must be satisfied if the actual method is to be run. Because the guards can fail, there is the logical possibility that a given call site might have several applicable methods, with distinct guards.
Thus, the best future shape of a “message not understood” hook for the JVM is probably a bootstrap method with the following characteristics:
A bootstrap method would be invoked, at a compiler’s request, when the JVM cannot link the call site normally, nor can it find a previously used actual method that is applicable. The returned applicable method would be cacheable (or not!) and reusable when the applicable method’s guards permit.
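To make the caching behavior concrete, here is one possible sketch of a call site that runs a linkage hook only when no cached applicable method passes its guard. It is written against `MutableCallSite` and `MethodHandles.guardWithTest` from the `java.lang.invoke` package of later JDKs, which is an assumption on my part and not a design taken from this note. The hook, the guard, and the two target methods are all hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class GuardedCallSiteSketch {
    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    static final MethodType SITE_TYPE = MethodType.methodType(String.class, Object.class);

    final MutableCallSite site = new MutableCallSite(SITE_TYPE);
    int linkCount = 0; // how many times the hook had to run

    GuardedCallSiteSketch() {
        try {
            // Initially the call site has no applicable method, only the hook.
            site.setTarget(LOOKUP.findVirtual(GuardedCallSiteSketch.class,
                    "linkAndCall", SITE_TYPE).bindTo(this));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical applicable methods.
    static String describeString(Object o) { return "string:" + o; }
    static String describeOther(Object o) { return "other:" + o; }

    // Guard: the cached method applies only to receivers of exactly this class.
    static boolean sameClass(Class<?> expected, Object receiver) {
        return receiver != null && receiver.getClass() == expected;
    }

    // The hook: choose a method, cache it under a guard, then make the call.
    String linkAndCall(Object receiver) throws Throwable {
        linkCount++;
        String name = (receiver instanceof String) ? "describeString" : "describeOther";
        MethodHandle target = LOOKUP.findStatic(GuardedCallSiteSketch.class, name, SITE_TYPE);
        MethodHandle guard = LOOKUP.findStatic(GuardedCallSiteSketch.class, "sameClass",
                MethodType.methodType(boolean.class, Class.class, Object.class))
                .bindTo(receiver.getClass());
        // If the guard fails later, fall back to the previous target chain.
        site.setTarget(MethodHandles.guardWithTest(guard, target, site.getTarget()));
        return (String) target.invokeExact(receiver);
    }

    String call(Object receiver) {
        try {
            return (String) site.dynamicInvoker().invokeExact(receiver);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        GuardedCallSiteSketch s = new GuardedCallSiteSketch();
        System.out.println(s.call("a") + " " + s.call("b") + " " + s.call(1));
        System.out.println("hook ran " + s.linkCount + " times");
    }
}
```

Each new receiver class adds one guarded entry in front of the older ones, so a single call site can end up with several applicable methods under distinct guards.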
Note that the applicable method could differ in its symbolic name and symbolic descriptor from those of the call site. This is a great difference from the current JVM behavior, which uses exact mapping of name and descriptor to drive method linkage. The name (in this setting) is irrelevant, but the descriptors must be reconciled somehow, or the JVM will no longer be type-safe. The bootstrap method could always make a mistake, and return a grossly mismatched method; this is logically equivalent to a linkage error.
More interestingly, the bootstrap method could return an applicable method whose descriptor differs from that of the call site, but only by low-level argument conversions, such as were discussed above. In this case, the call sequence itself should include the necessary argument conversions, without further ceremony.
Some of these features can be intrinsic to the JVM. Others could be implemented by low-level Java code, either specific to one language or (hopefully) shared by a group of languages. Some descriptor reconciliation is so obvious and low-level that it can be done by the JVM, while language-specific conversion must be handled by the bootstrap method. For example, a language which can convert numbers to strings should call Object.toString in a bridge method that then invokes the intended actual method, but it is the bridge method that must be returned from the bootstrap method to the call site.
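The number-to-string bridge just described could be assembled roughly as follows. This sketch relies on `MethodHandles.filterArguments` from later JDKs (an assumption, not part of this note), and `shout` is a hypothetical intended method.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class BridgeSketch {
    // Hypothetical intended method: it accepts only a String.
    static String shout(String s) {
        return s.toUpperCase() + "!";
    }

    // Builds a bridge of type (Object)String which calls Object.toString on
    // its argument and then invokes the intended method with the result.
    static MethodHandle bridge() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle intended = lookup.findStatic(BridgeSketch.class, "shout",
                    MethodType.methodType(String.class, String.class));
            MethodHandle toString = lookup.findVirtual(Object.class, "toString",
                    MethodType.methodType(String.class));
            return MethodHandles.filterArguments(intended, 0, toString);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stands in for a call site whose descriptor passes a plain Object.
    static String callWith(Object argument) {
        try {
            return (String) bridge().invokeExact(argument);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callWith(42));
    }
}
```

A bootstrap method would return the bridge, not `shout` itself, since only the bridge has a descriptor that matches the call site.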
An actual method has the following parts relevant to its invocation by a call instruction:
The actual method’s bytecode is executed in its own stack frame, whose first few locals are initialized to the incoming receiver and arguments. Eventually, a return instruction may pass a value back to the caller, or an exception may be thrown back to the caller.
If call site linkage is extended to include method handles, then the actual method would be a method handle, an object in its own right which would be invoked. It is natural to think of such an actual method as an arbitrary closure, an object which the JVM would call in order to fulfill the intention of the call site. If the actual method is in fact literally a method handle, then the JVM would immediately call that method (either virtually or statically, as the case may be).
Sometimes complex or language-specific argument transformations are needed before the intended actual method is entered. (By “intended” I mean the one which the programmer thinks is getting called directly.) These should be handled by a bridge method (aka adapter method). The bridge takes control, adjusts the arguments, calls the intended method, and then adjusts any return values or exceptions on the way back. When it is linked into a call site (by the bootstrap method) it must take a form compatible with a plain method handle. This implies that method handles and closures should be very similar in form, and to some degree interoperable.
Such bridging of calling sequences is complex to describe, but straightforward to implement, and to execute. There is one relevant optimization which is not obvious, and that is tail recursion elimination. When a call must convert arguments but not return values, it is most efficient for the eventual method and the bridge method to share a stack frame. This can be described (from the bridge method’s point of view) as a tail call at the end of the bridge method to the eventual method. When this tail call completes, the control stack looks as though the caller had directly called the eventual method, and the eventual method will return directly to that caller.
As I have pointed out in another note, tail calls are useful in their own right, if the language makes firm promises about them. (If they are an optional optimization, they are less useful, because programmers cannot rely on them to code up state machines such as loops.) A call is a tail call when the caller offers up its own stack frame to be reused by the callee, instead of requiring the callee to create its own stack frame.
Another future form of method target could be a continuation or coroutine. Stack frames could be allocated on the heap, and/or serialized and deserialized in groups to secondary data structures. But this note is already too long to talk about those ideas.
The JVM is type-safe. This prevents bad behaviors such as the following:
These false steps, if allowed, could crash the JVM, expose private data, or allow an attacker to perform privileged actions.
Type safety depends on many factors, such as the fact that the JVM heap is automatically managed and each heap object has a specific type which is easy to check and cannot be changed. The type safety of calls derives from the correct matching of arguments and receiver with the actual method’s formal parameters and formal receiver. In particular, the values consumed from the caller’s stack must agree in number and type with the values stored in locals on entry to the callee.
The JVM also provides certain assurances about access protection, for example:
(More details: A protected member of class D can be accessed from a class C that is a subclass of D or in the same package as D, and there are additional constraints on the symbolic type and receiver type. An overridden method in class A can be invoked, via invokespecial, on a receiver whose class C overrides that method, but only from within some class B between A and C inclusive.)
The effect of access control is that programmers can mark parts of their code so as to prevent untrusted code from touching those parts. The high-level rule is that nobody can request a non-public service or make a non-public state change except parties that have a right to do so. Of course, such privileged parties can, as mediators, provide those actions to the public. This is why, for example, many private fields have public getter methods.
(For many of us, in practice, the benefit of access control is not so much self-protection as protection-against-self. I don’t trust the code I write next month to interact properly with some finely-balanced algorithm I wrote this week, so I often put fences around chunks even if there are no security problems that could arise. And the same goes between me and my esteemed colleagues, both ways. Good fences make good coders.)
There is not much need for an overhaul of the JVM’s type and access control model. But there are a few points to mention, in connection with the theme of enhancing method calls.
Certain aspects of Java inner classes, and closures, will be easier to compile if the JVM would let groups of classes share private methods or fields, and this is true of non-Java languages also.
If the JVM were to allow separate method loading (“autonomous methods”), then such a method could benefit from sharing of privates, if the system gave permission to load it into a host class’s interior. For example, a debugging method could be adjoined to String, which would give it access to String’s private fields.
It is quite helpful (when making proofs of correctness or security) that a JVM object cannot change its class. Languages which provide objects with typestate or mutable classes are generally implemented with an extra level of indirection, and the JVM will accommodate such patterns reasonably. There may be a call in the future for “class splitting” or “class changing” features, where a JVM object can (in some limited and structured way) modify its class pointer. An example of class splitting could be refining the raw type List into a number of types List<Integer>, etc., so that list objects can “know” what their creator intended their element type to be. The array types native to the JVM are something like a set of splits from the raw array type, an array of
The most aggressive enhancement to JVM object mutability is probably Smalltalk’s become method, which replaces one object (everywhere) by another. This feature certainly interferes with many optimizations, and requires expensive and pervasive checks (akin to null checks). But it could be the right answer if the JVM were to support lazy functional programming, or languages with “futures”. The JVM’s GC could give special help in updating all references to an object which had changed its identity.
Any call instruction (except invokespecial) can be emulated via the method java.lang.reflect.Method.invoke. The key emulated parts are as follows:
invoke call (which is ignored if the call has no receiver)
InvocationTargetException, which is then thrown
Various error conditions are reported via thrown exceptions. A few errors are logically unique to reflection:
Since reflection does not use symbolic method references, reflective calls cannot produce symbol resolution errors.
Reflection can produce linkage errors if the caller of the invoke method cannot access the resolved method. These access checks are performed dynamically, every time Method.invoke is called. They can be quite expensive, since the invoke method must walk the stack to identify its caller. Method objects can be configured to omit these checks.
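The reflective emulation of a call can be seen in a short example. The sketch below calls `String.charAt` through `Method.invoke`, once successfully and once so that the callee throws. The wrapper class and its method names are hypothetical.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class ReflectiveCallSketch {
    // Emulates the call receiver.charAt(index) through Core Reflection.
    static Object charAtReflectively(String receiver, int index)
            throws ReflectiveOperationException {
        // No symbolic reference is resolved; the method is looked up by name.
        Method m = String.class.getMethod("charAt", int.class);
        // The int argument is boxed and packed into an Object[] by the varargs
        // call, and the char result comes back boxed as a Character.
        return m.invoke(receiver, index);
    }

    static Object normalResult() {
        try {
            return charAtReflectively("abc", 1);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // An exception thrown by the callee arrives wrapped, and must be unwrapped.
    static Throwable wrappedFailure() {
        try {
            charAtReflectively("abc", 10);
            return null;
        } catch (InvocationTargetException e) {
            return e.getCause();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(normalResult());
        System.out.println(wrappedFailure().getClass().getName());
    }
}
```

The per-call access check can be switched off for a given Method object with `setAccessible(true)`, subject to the security policy in force.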
Method objects are not only used for invoking methods, but also for responding to a wide variety of symbolic queries. For example, even if you have no privilege to invoke a particular method, you can still ask for its name, parameter and return types, and other information. These purposes require data structure overheads and API complexity far beyond the simple task of method invocation.
There are several weaknesses in the current Core Reflection API:
invokespecial instruction cannot be emulated
All these weaknesses can be solved by introducing a lower-level JVM construct called method handles. This construct deserves a note of its own, but in short it is:
The last point (unchecked invocation) might sound like a security hole, but it is not. The checking is simply done when the handle is created, rather than when it is used. For example, a handle to a private method could only be created by code which already had the right to call that method. The alert reader can see that this aspect of method handles is exactly as secure as inner classes, which can also provide public wrappers for private methods.
The main difference between method handles and an emulation with inner classes is a simplified API. Specifically, there is neither an interface which holds the invocation descriptor, nor a class which holds the bridge method. Instead, the JVM simply provides a direct connection between any callee method and its eventual caller. This leads to fewer overheads when defining and using method handles.
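The point about checking access at creation time can be illustrated with the method handle API that later JDKs provide in `java.lang.invoke`. That API is used here as an assumption, since this note only sketches the construct. `HandleOwner` and `HandleUser` are hypothetical classes.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical class that owns a private method and chooses to share it.
class HandleOwner {
    private static String secret() {
        return "owner-only";
    }

    // Access is checked here, where the handle is created, against HandleOwner.
    static MethodHandle shareSecret() {
        try {
            return MethodHandles.lookup().findStatic(
                    HandleOwner.class, "secret", MethodType.methodType(String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical class with no right to call HandleOwner.secret directly.
class HandleUser {
    // Invoking a handle it was given involves no further access check.
    static String invoke(MethodHandle handle) {
        try {
            return (String) handle.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Returns true if this class is refused when it tries to make the handle itself.
    static boolean refusedDirectLookup() {
        try {
            MethodHandles.lookup().findStatic(
                    HandleOwner.class, "secret", MethodType.methodType(String.class));
            return false;
        } catch (IllegalAccessException e) {
            return true;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invoke(HandleOwner.shareSecret()));
        System.out.println(refusedDirectLookup());
    }
}
```

The owner acts as the mediator: it may publish the handle, much as it might publish a public wrapper method, but nobody else can mint one.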
Just as the ldc instruction was extended to apply to Class constants (in JSR 202), it might be natural to extend it to apply to method handle constants in a JVM which supports method handles. Thus, there would be two interesting things to do with a symbolic method: Call it, or get a handle to it, preparatory to calling it later. Both operations would be subject to the same linkage and access rules.
I hope this has been an interesting summary for those who are curious about the inner workings of the JVM. I have also tried to explain and motivate some plausible future expansions for the JVM, to support a broad array of languages. There is active work and discussion in this area, with JSR 292, and with the jvm-languages group at googlegroups.com. Stay tuned!