Bravo for the dynamic runtime!

This week several of us from Sun attended the Lang.NET Symposium. The symposium was intensely technical and not too large to fit into a single room, so the presentations and conversations were a fine exercise in what one might call N-3: Nerd-to-Nerd Networking. Sometimes they were even downright inspiring—bravo Anders, Jim, Erik, Gilad, Peli, Jeffrey. Our hosts kindly welcomed presentations from three of us at Sun: Dan Ingalls showed everyone the Lively Kernel, while Charles Nutter and yours truly introduced the Da Vinci Machine project.

The centerpiece of the conference was of course Microsoft’s Common Language Runtime, and especially the new Dynamic Language Runtime, Jim Hugunin’s encore to IronPython which factors out the reusable logic that glues dynamic languages on top of the CLR.

Why am I suddenly excited about Microsoft technology? Two or three reasons. First, the DLR (with IronPython and IronRuby) is further evidence that we are in some sort of renaissance or resurgence of programming language design. For some reason, people are inventing programming languages again in a big way, expecting to get audiences, and sometimes getting them. I think the “some reason” is a combination of better tools, higher-level runtimes, cheaper CPU cycles, and the open source movement.

These new conditions are combining in a deep and exciting way with some very old ideas. (I mean “very old” in the field of computing, dating back no more than fifty years.) Somehow the basic ideas were sketched early. I am thinking of distinctions of syntax (concrete, abstract, and “sugar”), data structures (including procedural, object-oriented, functional, symbolic, and relational), the idea of exploratory programming, full inter-conversion between program, data, and text, declarative versus imperative notations, lexical versus dynamic versus message-based scoping, maximal versus minimal versus extensible languages, closures, continuations, reflection, garbage collection, caching, lazy evaluation, and more. (I apologize for the length and baldness of the list, and would welcome a reference to a good retrospective survey to improve on the list.) People like Peter Landin were already mapping the landscape of programming languages in the 1960’s. The categories have shifted little, because they remain amazingly useful.

(Side note: The symposium ended with a lecture on, among other things, the merits of data-driven programming and XML. This led to an exchange, apparently ongoing, about the merits of XML versus JSON, which was a new thing to me. As the exchange petered out in comments on robustness and compactness, I just had to loudly propose “S-expressions!”, the 50-year-old Lisp syntax that may be more robust, regular, and compact than either.)

Although I do appreciate history for history’s sake, what is really interesting to me is observing the new changes on the old themes. Thus it is informative to characterize new languages like Ruby in quite old terms (Lisp in C syntax). It is exciting when a dormant idea comes into widespread practical use, such as when Java popularized garbage collection, or the Hotspot JVM applied message optimization techniques originally developed for Smalltalk and Self. In the case of the DLR, it is exciting to see those techniques extended into a programmable metaobject protocol.

As readers of this blog know, Sun has also been designing and developing technology to apply the power of the JVM to the difficult problems of dynamic language implementation. The second thing that excited me at Redmond (along with the Microsoft people) was a striking case of parallel evolution between the DLR over the CLR on one hand and the Da Vinci Machine over the JVM on the other. My talk was shortly after Jim’s, and (as I remarked at the time) I had less explaining to do than I expected, since Jim had already explained a very similar architecture in the DLR. In my work on JVM dynamic invocation, I knew I was applying tried and true ideas from Smalltalk, Self, and CLOS, but I was encouraged to find that a colleague (and competitor) had been busy proving the practicality of those ideas.

The differences between the CLR and JVM extensions are interesting to note. The DLR team works completely above the level of the CLR without significantly enhancing it, while we are developing the JVM and libraries at the same time. I have been busy with the JVM, while Charles Nutter has been doing great work rebuilding JRuby downward toward the JVM. The latter work has converted JRuby from a tree-walking interpreter to a compiler which emits just the sort of bytecodes the JVM most likes to optimize. I think of what we are doing as a sort of transcontinental railroad, with the JVM building out from the West as JRuby extends from the East; we are putting in the Golden Spike this year.

One reason for this difference in approach is that the Microsoft CLR JIT does not appear to be under active development; its optimization level is as rudimentary as the earliest Java JITs. In the CLR that kind of performance is just the accepted cost of running managed code. The CLR has no notion of inlining interface calls or compiling under dynamic profiles, so all the dynamism of the DLR has to be inside the little call-site objects that the DLR itself manages. We Sun people realized afresh (and so did our colleagues) what an astonishing thing it is to have a great compiler in your VM, which can use online type information to inline and optimize huge swathes of bytecode.

Another contrast between the DLR and our work is our design of method guards and targets, versus theirs of tests and targets. (The CLOS invocation protocol speaks of discrimination functions and applicable methods.) In all these systems, an up-call populates the call site with one or more target methods, and each target method has an associated argument-testing rule.
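To make the shared shape concrete, here is a minimal sketch of a call site built from test/target pairs. It uses the java.lang.invoke API that later shipped with Java 7, so it is not the Da Vinci Machine prototype as it stood when this was written, and it is certainly not the DLR's mechanism. The class name, the describe* targets, and the is* guards are all hypothetical names chosen for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class GuardedCallSite {
    // Hypothetical target methods for one dynamic operation.
    static String describeInteger(Object x) { return "integer " + x; }
    static String describeString(Object x)  { return "string '" + x + "'"; }
    static String describeOther(Object x)   { return "object " + x; }

    // Hypothetical argument-testing rules, one per specialized target.
    static boolean isInteger(Object x) { return x instanceof Integer; }
    static boolean isString(Object x)  { return x instanceof String; }

    // The "call site": a chain of (test, target) pairs ending in a fallback.
    static final MethodHandle SITE = buildCallSite();

    static MethodHandle buildCallSite() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType targetType = MethodType.methodType(String.class, Object.class);
            MethodType guardType = MethodType.methodType(boolean.class, Object.class);
            Class<?> self = GuardedCallSite.class;

            MethodHandle onInteger = lookup.findStatic(self, "describeInteger", targetType);
            MethodHandle onString = lookup.findStatic(self, "describeString", targetType);
            MethodHandle fallback = lookup.findStatic(self, "describeOther", targetType);
            MethodHandle integerTest = lookup.findStatic(self, "isInteger", guardType);
            MethodHandle stringTest = lookup.findStatic(self, "isString", guardType);

            // if (isInteger) describeInteger else if (isString) describeString else describeOther
            MethodHandle chain = MethodHandles.guardWithTest(stringTest, onString, fallback);
            return MethodHandles.guardWithTest(integerTest, onInteger, chain);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes the call site; checked Throwables are wrapped to keep callers simple.
    static String dispatch(Object arg) {
        try {
            return (String) SITE.invokeExact(arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(42));    // integer 42
        System.out.println(dispatch("hi"));  // string 'hi'
        System.out.println(dispatch(3.5));   // object 3.5
    }
}
```

In a real system the up-call would grow this chain lazily as new argument types show up, rather than building it all up front as this sketch does.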

So what’s the contrast here? The DLR up-call expresses the test and target as a pair of abstract syntax trees, which the lower layers of the DLR combine with previous ASTs to compile a new version of the call site. The Da Vinci Machine design combines the guard and target into a single chunk of behavior, called a method handle; a dynamic invocation site can have a small set of these handles. (Currently JRuby simulates this pattern above the level of the JVM.) The JVM interpreter will sift through them, invoking each eagerly, expecting success, and fielding the small exception a handle can throw if its guard logic fails.
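The "invoke eagerly, catch the miss" pattern can be simulated in plain Java, much as the parenthesis about JRuby suggests. What follows is only a rough sketch under my own assumptions: the names ExceptionGuardSite, Handle, GuardFailed, and link are hypothetical, the guard is a simple exact-class check, and the "up-call" just manufactures a handle for the receiver's class. It is not the actual Da Vinci Machine interface.

```java
import java.util.ArrayList;
import java.util.List;

public class ExceptionGuardSite {
    /** Signals a failed guard. Preallocated, with no stack trace, so throwing it stays cheap. */
    static final class GuardFailed extends RuntimeException {
        static final GuardFailed INSTANCE = new GuardFailed();
        private GuardFailed() {
            super(null, null, false, false); // no suppression, no writable stack trace
        }
    }

    /** Guard and target fused into one chunk of behavior. */
    interface Handle {
        Object invoke(Object receiver);
    }

    private final List<Handle> handles = new ArrayList<>();
    int upCalls = 0; // counts how often the site had to ask for a new handle

    /** Tries each handle eagerly, expecting success; a miss falls through to the next one. */
    Object invoke(Object receiver) {
        for (Handle h : handles) {
            try {
                return h.invoke(receiver);
            } catch (GuardFailed miss) {
                // Guard rejected this receiver; try the next handle.
            }
        }
        // Nothing matched: up-call for a fresh handle and remember it at the site.
        Handle fresh = link(receiver);
        handles.add(fresh);
        return fresh.invoke(receiver);
    }

    /** Hypothetical up-call: builds a handle specialized to the receiver's exact class. */
    private Handle link(Object receiver) {
        upCalls++;
        final Class<?> expected = receiver.getClass(); // assumes a non-null receiver
        final String label = expected.getSimpleName();
        return r -> {
            if (r.getClass() != expected) throw GuardFailed.INSTANCE; // guard
            return label + ":" + r;                                    // target
        };
    }

    public static void main(String[] args) {
        ExceptionGuardSite site = new ExceptionGuardSite();
        System.out.println(site.invoke(1));   // Integer:1
        System.out.println(site.invoke("a")); // String:a
        System.out.println(site.invoke(2));   // Integer:2
        System.out.println(site.upCalls);     // 2
    }
}
```

The third call reuses the handle installed by the first, so only two up-calls happen. The whole scheme depends on the exception path being cheap, which is why the sketch throws a single preallocated instance that carries no stack trace.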

Eventually the JIT will kick in, observe the state of the call site, and generate a suitably optimized decision tree based on the method handles it sees on the call site. (Since method handles are immutable, it will be able to inline their structure completely, if that is desirable.) The JVM can afford to use exceptions for call site composition, because they are cheap. One surprise for me at Redmond was learning that the CLR architecture makes it nearly impossible to compile exceptions to simple “goto” jumps, not only because their JIT does not optimize much, but also because CLR exceptions have much more complex semantics than those of Java. Hotspot has been optimizing exceptions since the JVM98 performance sweepstakes. This reminds me: What a great thing competition is, and specifically in the Java ecosystem. Hotspot is as fast as it is thanks to a reasonable set of standard benchmarks, and our race with our other JVM competitors.

This leads me to another metaphor about gold: In Redmond I realized (as I said above) that systems built on Hotspot and other JVMs are sitting on a gold mine of performance. While IronPython on the DLR has to do hard, brilliant work to “iron” out the wrinkles in the CLR JIT’s weak performance profile, the “irony” is that Hotspot has already been optimizing highly dynamic programs for almost a decade. JRuby is already tapping our gold mine of JIT optimizations, and showing benchmark performance even better than that of the original C-coded version of Ruby. It can only get better from here. (Jim, all, I hope you’ll forgive the metallurgical puns... The Scot in me loves a cheap laugh.)

The final reason I am excited about Microsoft’s DLR is that I am pleased for their customers, since they will enjoy using the emerging variety of new languages on the CLR. It is time for those languages to arise, because the platforms are strong and the CPU cycles cheap. But, as you might guess, I am even more excited for the customers of the JVM, because they will also enjoy the new languages in an expansive open-source community, and on their choice of blazingly fast Java virtual machines. Starting (I hope) with Sun’s Hotspot.

Bravo for a new golden age of language design, and a renaissance of high level languages!

Comments:

To call this exciting would be putting it very lightly. This is a tremendously encouraging development. I'm sure the external developer community is eagerly looking to get involved more actively in the mlvm project.
In this regard, John, can you please explain what kind of external assistance you might look for with respect to the DVM? What areas of interest/expertise should such external contributors possess? Any recommended reading lists?
(You might choose to respond to these questions either on your blog or on the mlvm mailing list. But either way, what I'm requesting is a short "getting started" note that specifies the scope/purview of the project and the kind of expertise required of an external contributor.)
Thanks for your time.

Regards,
Bharath

Posted by Bharath R on February 02, 2008 at 01:02 AM PST #

I'm not sure if you've seen XKCD before, but your post reminded me of this comic: http://xkcd.com/297/

Posted by Joel Franusic on February 02, 2008 at 11:17 AM PST #

Thanks for a very interesting and entertaining post. I was surprised by the observations on the CLR JIT and its low level of runtime optimization. What's the .NET community's take on this? Are they really happy with an "accepted cost" as you say, and resorting to the "unsafe" keyword when needed? I can see how it's easier to accept that when you can assume a known OS underneath. If anyone has pointers to further information on CLR performance characteristics, I would greatly appreciate it.

Posted by Kjetil Valstadsve on February 04, 2008 at 05:50 PM PST #

What I would really like to see in a new (mainstream) VM is support for lightweight processes and interprocess communication like Erlang and its BEAM VM offer.

Any chance that the Da Vinci project team will consider this feature?

Regards

Posted by Lars Schneider on February 08, 2008 at 10:39 PM PST #

About

John R. Rose

Java maven, HotSpot developer, Mac user, Scheme refugee.

Once Sun and present Oracle engineer.
