tail calls in the VM

Or, how to chase one's tail without throwing up.

Introduction

Tail-calling is an old idea; see the MIT “Lambda the Ultimate” papers. It is usually treated as an optimization. If A calls B calls C, and the last action of B is to call C, then the machine resources (such as stack) are adjusted so that it is as if A had directly called C. In effect, B pops its stack frame back to A, pushes arguments to C, and jumps to C, so that when C pops its arguments and returns, it branches directly back to A.

This optimization can save instructions, stack space, and instruction cache activity, so it is often found in compilers. But as long as it’s only an elective optimization of what’s really just a recursive call, it matters only as a performance hack. Call this hack “soft tail call”.

On the other hand, a “hard tail call” is a guaranteed tail call, one not only allows the compiler to improve the call site (from B to C then back to A), but requires the improvement. Some languages, notably Scheme, Haskell, and Fortress, feature hard tail calls, and use them to form iterative constructs, threaded interpreters, and other forms of control flow delegation.

JVM Support for Hard Tail Calls

We propose a new bytecode prefix called tailcall, which may precede any invoke instruction. This prefix has no other action than to require that the following instruction constitute a tail call, and be implemented as a hard tail call.

The byte code for tailcall is numerically the same as the wide bytecode.

The verifier statically checks the following properties:

  • The callee method’s return signature is identical with the caller method’s return signature.
  • The invoke instruction is immediately followed by a return instruction.
  • No exception handlers apply to the invoke instruction.
  • The caller method is not synchronized.
  • The caller method is holding no object locks. (It has executed sufficient monitorexit instructions to undo the effect of any monitorenter instructions.)
  • The callee method is accessible and linkable from the caller method.

The JVM executes the invoke instruction as if the tailcall prefix were not present, except that the JVM discards the stack frame for the caller before executing the callee. The following return instruction is never directly executed; it is present only to simplify code analysis.

The effects of removing the caller’s stack frame are visible to some APIs, notably access control checks and stack tracing. It is as if the caller’s caller had directly called the callee. Any privileges possessed by the caller are discarded after control is transferred to the callee. However, the linkage and accessibility of the callee method are computed before the transfer of control, and take into account the tail-calling caller.

Removing the caller’s stack prevents StackOverflowError conditions which would otherwise be eventually caused by iteratively tail-calling one or more methods but never returning.

Because of the difficulty of ensuring correct caller access, the following methods may not be the subject of a hard tail call. In particular, Method.invoke simulates access checks which properly occur when the caller links to the callee, and therefore requires the caller to be visible on the stack.

  • java.lang.Class.\*
  • java.lang.reflect.Method.\*
  • java.lang.reflect.Constructor.\*
  • java.security.AccessController.\*
  • (what else?)

Soft Tail Calls

The JVM may also support a soft tail call optimization, but this must be invisible to programmers. In particular, stack traces and access control checks must take the caller into account, even if the callee has been soft-tail-called.
Comments:

Interesting stuff John. It would be interesting to see some examples that would benefit from such optimization. Am I correct that the following code could use this feature to achieve better performance? int f(int i) { return i==1 ? 1 : i + f(i-1); } Do you think if javac compiler could such optimizations automatically for the regular Java language.

Posted by Eugene Kuleshov on July 12, 2007 at 12:44 PM PDT #

This wouldn't have anything to do with invokedynamic's call sites, would it? :) I can certainly see where one use case for tailcall would be to remove dynamic invocation sites from the call stack. Don't get me wrong...I'm on board for tailcall.

Posted by Charles Oliver Nutter on July 12, 2007 at 06:19 PM PDT #

This would be a wonderful addition to Java 7. I personally have far more interest in better support for functional languages on the JVM than dynamically typed languages, so the lack of tail calls really hurts. However, as you note, there are problems with access control checks and other techniques that require stack inspection. Have you seen a paper by Clements and Felleisen "A Tail-Recursive Machine with Stack Inspection"(PDF)? It appears to describe an approach which would preserve the information necessary for access control checks without inflating the stack.

Posted by Neil Bartlett on July 12, 2007 at 09:45 PM PDT #

It sounds like you have covered all the issues. I wonder if you could simplify the rules by only allowing tail calls when the receiver is "this"? Also, do you really need to keep the return instruction?

Posted by Peter Ahé on July 13, 2007 at 06:04 AM PDT #

Eugene: Yes, javac could have a soft tail call optimization if it could issue a hard tail call bytecode. But the call to 'f' in your code example is not a tail-call. See Anatomy of a Loop or Tail-recursive functions.

Charles: It would apply to invokedynamic along with the others. Having hard tailcall would give you more options for implementing dynamic calls: You could branch to a dispatch routine, which would then tailcall the normal routine. (In fact, that's how java.lang.reflect.Method.invoke could work today, at least if return values are not to be boxed.) Since Scheme is a dynamic language, tailcalls are loosely related to JSR 292; see also Tail Recursion in Groovy?.

Neil: Yes, I care about functional languages also. (See my autonomous method post, and watch this space for more.) Yes, I've seen the Clements/Felleisen paper; it's very informative. It seems easier to me to limit hard tail calls to cases which do not violate access control, rather than inventing a Java-specific thread-local state machine for access tracking.

Peter: Thanks! I don't see how a restriction to 'this' buys anything special. The extra return instruction makes it a little easier to upgrade tool chains. It also gives JVM implementors more options for implementing tailcall. Lazy stack contraction might be preferable when running under a debugger. The 'tailcall' instruction would then mean 'this frame is GC-able'.

Posted by John Rose on July 13, 2007 at 06:59 AM PDT #

John: if you only allow tail calls when this is the receiver, I do not think you need a black-list of classes that cannot receive a tail call.

Regarding the return instruction, I guess I'm not convinced that the envisioned benefits to implementations are really that valuable. I'm thinking about how will this look a few revisions from now? Will it just look like yet another complication that wasn't really worth it?

Posted by Peter Ahé on July 13, 2007 at 10:17 AM PDT #

Anything to help Scala compete with OCaml and F# gets my vote!

Posted by Jon Harrop on July 15, 2007 at 08:06 PM PDT #

Eugene's code could easily be rewriten to
int f(int i) {
  if (i == 1) return 1;
  return i + f(i - 1);
}
In that case it would be a tail call, and the compiler should be able to rewrite Eugene's code to mine.

Posted by Roel Spilker on July 25, 2007 at 10:21 AM PDT #

Great news. Now, please make this feature as nice as possible to Java programmers, not only to other languages (like functional / dynamic ones) compilable to the JVM. For one thing, since there are so many rules to make tailcall valid, less experienced programmers could easily write recursive code that they think will tail-call, but won't, resulting in a StackOverflow exception (which will probably only happen in production, as Murhpy's law goes). My suggestion is having any kind of syntax feature -- perhaps a statement-scoped annotation -- to let the programmer require that some specific call is a tail-call; javac would flag an error otherwise. Or perhaps, javac could just have a non-default warning that complains about every recursive call that's not a tail call. Such a warning, e.g. "Recursive call is not tail-callable because of xxx and may result in StackOverflow", would likely produce "false positives" for legacy code, but not so much because recursion is not a widely used feature in Java today... but at least in post-Java7 code I'd expect programmers to educate themselves about tail calls and good recursive coding, and write recursive methods only/mostly with tail calls.

Posted by Osvaldo Pinali doederlein on July 30, 2007 at 01:38 AM PDT #

Roel: It could be rewritten thus, but it would still not be a tail call. That's why the design distinction between hard and soft tail calls is so important: Especially in languages where there is no pre-existing practice of tail calling, programmers often intend a tail call in a place where one is not possible. Only a hard tail call convention will inform them of their error.

Osvaldo: You are advocating a hard tail call syntax in Java. I agree it's the only way to retrofit tail calls into Java. I have little opinion on the syntax, other than that it must be different from the current syntax. (And I don't want to burden a JVM blog entry with language design, though I do love language design!)

Posted by John Rose on July 30, 2007 at 08:11 AM PDT #

Tail calls would be great, indeed, at least for implementors of other languages on the JVM.

One way to achieve that Java programmers could use tail calls comfortably would be to allow the "goto" keyword whereever "return" is allowed now.

Note that perl5 uses "goto" in this way for a very long time and it is not being abused by the average programmer. No wonder, since most people do not even understand what a tail call is.

Posted by Ingo Wechsung on September 17, 2007 at 08:09 PM PDT #

Hoare's CSP theory uses the idea of tail recursion. If JVM supported tail calls it would be great possibility for implementing CSP-based languages on top of JVM. Enabling tail calls for me is not just an optimization, but a feature which would allow to benefit from converging the best practices of object oriented, parallel and functional programming. I think JVM support for hard tail calls is a great idea. Is it going to be implemented in Java 7?

Posted by Dmitriy Kumshayev on September 24, 2007 at 04:39 AM PDT #

Eugene's code should be rewritten as

int f(int i) {
return _f(i, 0);
}

int _f(int i, int sum) {
if (i == 1) return sum;
return _f(i - 1, sum + i);
}

Tried with Scala 2.6 (it has tail recursion support, see scala-lang.org), function "_f" is truly tail-call optimized (you may analyze effect with bytecode decompiler)

VS

Posted by Valery Silaev on September 24, 2007 at 08:21 PM PDT #

The presence of strict tail-call elimination in the language would facilitate this style of programming:

asyncOp(new EventListener() {
public void onCompleted() { ... }
public void onTimeout() { ... }
});

If asyncOp invokes the callbacks in the tail position, then the programmer can be guaranteed that the stack frame for the operation will be popped before proceeding. This would allow event listeners to be arbitrarily nested without fear of stack overflow, or fear of holding onto garbage from ancestor listeners.

Posted by Warren Harris on October 23, 2007 at 01:07 PM PDT #

I would clarify this statement:
- The caller method is holding no object locks.

It could be misinterpreted to mean that the calling thread holds no locks, but what you mean is that any locks acquired (reentrantly or not) during the current call frame have been released. In other words, the state of all locks held by this thread is the same as it was when the caller method was entered.

Posted by Brian Goetz on October 31, 2007 at 05:59 AM PDT #

In regards to a hard syntax for tail calls in Java, why not use the already-reserved-but-useless "goto"? Here's an example using the favorite example function in the article:

int f(int i, int sum) {
if (i == 1) return sum;
goto f(i - 1, sum + i);
}

Posted by David M. Lloyd on April 20, 2009 at 04:28 AM PDT #

Just a comment on the security front - The classes listed above are not the only ones that do access checks. In fact \*anyone\* can do an access check, and as defined here, a tail-recursion call looks like it's pretty much a backdoor around access restrictions (if your method is the only one on the call stack, you can effectively perform actions with your caller's privilege level by tail-recursing "out" of your access control context).

Maybe it would be better to restrict tail-recursion calls to classes within, say, your classloader, thus guaranteeing that your access control context is unchanged?

Posted by David M. Lloyd on May 15, 2009 at 03:10 AM PDT #

Last week at Devoxx, during the jdk7 BoF-session, lead by Mark Reinholds, Joe Darcy, Alex Buckley and Brian Goetz, there was a hint that tail call optimization might be on the agenda for jdk7. That said, they also hinted that there were still some problems with the implementation. If I understood correctly, it had to do access restrictions, as already David mentioned. John, can you give us an update of the current state of affairs?

Posted by Roel Spilker on November 25, 2009 at 12:52 AM PST #

This is a late comment but TCO in Java would be so great, why not revive the thread again?

One common misunderstanding about TCO is due to the word 'optimization'. It is indeed a space optimization (don't use more space than goto, as Guy said in the 1970s) but a language should implement TCO in support of PROPER DESIGN. To wit, go through the OO design pattern books and inspect all the little patterns. Pick some -- say interpreter or composite -- and design your Java program accordingly. Then run a stress test and weep. Java blows up even if all the method calls are tail-recursive because it doesn't support TCO. Now do the same in PLT Scheme's class system and smile. It works -- for all inputs.

See my ECOOP keynote (2004) for some more details -- Matthias

Posted by matthias Felleisen on November 27, 2009 at 01:43 AM PST #

Post a Comment:
  • HTML Syntax: NOT allowed
About

John R. Rose

Java maven, HotSpot developer, Mac user, Scheme refugee.

Once Sun and present Oracle engineer.

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today