anonymous classes in the VM

Or, showing up in class without registering.

Introduction

This post describes a VM feature called anonymous classes. This feature is being prototyped in the multi-language project called the Da Vinci Machine, and it is tracked by Sun bug 6653858.

One pain point in dynamic language implementation is managing code dynamically. While the implementor’s focus is on the body of a method, and the linkage of that body to some desired calling sequence, there is a host of surrounding details required by the JVM to properly place that code. These details include:

  • method name
  • enclosing class name
  • various access restrictions relative to other named entities
  • class loader and protection domain
  • linkage and initialization state
  • placement in class hierarchy (even if the class is never instantiated)

These details add noise to the implementor’s task, and often enough they cause various execution overheads. Because a class of a given name (and class loader) must be defined exactly once, and must afterwards be recoverable from its name alone (via Class.forName), the JVM must connect each newly-defined class to its defining class loader and to a data structure called the system dictionary, which will handle later linkage requests. These connections take time to make, especially since they must grab various system locks. They also make it much harder for the GC to collect unused code.
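To make the name-based contract concrete, here is a minimal sketch using only the standard Class.forName call. The class name NamedLookupDemo and the helper sameClassTwice are made up for this illustration. It shows that asking one loader for the same name twice yields the identical Class object, which is the bookkeeping that the system dictionary exists to support.

```java
final class NamedLookupDemo {
    // Hypothetical helper: resolves the same name twice through one loader
    // and reports whether both lookups produced the identical Class object.
    static boolean sameClassTwice(String name) {
        ClassLoader loader = NamedLookupDemo.class.getClassLoader();
        try {
            // 'false' means: find or load the class, but do not initialize it.
            Class<?> first = Class.forName(name, false, loader);
            Class<?> second = Class.forName(name, false, loader);
            return first == second;
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("no such class: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameClassTwice("java.util.ArrayList"));
    }
}
```

A class that has no globally-defined name cannot be found this way, which is why it also does not need to be recorded this way.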

Anonymous classes can partially address these problems, and we are prototyping this feature in the Da Vinci Machine (which is the fancy name for the OpenJDK Multi-Language VM project). Desired features of anonymous classes:

  • load an arbitrary class from a block of bytecodes
  • associate the new class with a pre-existing host class, inheriting its access, linkage, and permission characteristics (as if in an inner/outer relation to the host class)
  • do not associate the new class with any globally-defined name
  • do not make the new class reachable from the class loader of its host class
  • put the class in the class hierarchy logically, but allow it to be garbage collected when unused
  • allow the definer to patch class elements in the constant pool, to provide local access to previously defined anonymous classes
  • allow the definer to patch constants in the constant pool, to provide local access to dynamically specified data relevant to the language implementation
  • allow UTF8 elements in the constant pool to be patched, to make it easier to build glue classes from canned templates

The key motivation is that we want to cut ClassLoaders and the system dictionary out of the loop. This means there will be fewer locks and no GC entanglements. Drop the last object, and the class goes away too, just as it should.

Why the patching stuff? There are a few corner cases where, because we are dealing with anonymous entities, an essentially symbolic constant pool is not up to the task. Since the standard class file format is specified as a byte stream, there is no way to introduce live objects or classes to the newly-loaded class, unless they first are given names. Therefore, there must be some sort of substitution mechanism for placing constants (classes, at least) into the classfile as it is loaded. Given that requirement, it is an easy step to generalize this substitution mechanism to other types of constant pool entries.

The resulting facility is powerful in interesting ways. You can often build a template classfile by passing Java code through the javac compiler, and then use constant substitution to customize the code.

Generated code often needs to get to complex constants (e.g., lists or tables) and this provides a hook to introduce them directly via the CP. The string-type constant pool entry is extended to support arbitrary objects, if they are substituted into the loaded anonymous class. This need not scare the verifier; it just treats non-strings as general objects. General objects, of course, are not a problem for dynamic languages.
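The substitution idea can be pictured with a toy model in plain Java. This is not the VM's implementation and it does not parse a real classfile. It assumes, purely for illustration, that the patches are given as an array aligned index-for-index with the constant pool, and that a null entry means "leave this slot alone". The names ConstantPatchSketch and applyPatches are hypothetical.

```java
import java.util.Arrays;

final class ConstantPatchSketch {
    // Hypothetical helper: returns a copy of 'constants' in which every slot
    // with a non-null entry in 'patches' has been replaced by that entry.
    static Object[] applyPatches(Object[] constants, Object[] patches) {
        Object[] result = constants.clone();
        if (patches == null) {
            return result;              // the patch array is optional
        }
        if (patches.length != constants.length) {
            throw new IllegalArgumentException("patch array must match pool size");
        }
        for (int i = 0; i < patches.length; i++) {
            if (patches[i] != null) {
                result[i] = patches[i]; // a live object takes the slot
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Slot 0 is unused, as in a real constant pool; the rest are stand-ins.
        Object[] pool = { null, "TemplateName", "placeholder-string", 42 };
        Object[] patches = new Object[pool.length];
        // Replace the string slot with a table that has no symbolic form.
        patches[2] = Arrays.asList("alpha", "beta", "gamma");
        System.out.println(Arrays.toString(applyPatches(pool, patches)));
    }
}
```

The point of the model is only that a string-shaped slot in a canned template can end up holding an arbitrary object once the patch has been applied.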

Here is a toy example, which actually works in a current prototype. Note that the anonymous classes are defined in a chain, with each new one a subclass of a previously loaded one.

The API is a single static method defineAnonymousClass, which is privileged. In the current prototype it is in sun.misc.Unsafe, which is a non-standard class. If it is standardized, it will be given a suitable name. (And suitable security checks!) This method takes an array of bytecodes, a host class, and an optional array of constant pool patches. It returns the newly created anonymous class. Unlike all other class queries, it can never return the same class twice.

[New text:] Here is a more polished API.

As you can see from the sample output in the test code, an anonymous class has a name (via getName) which consists of the original name of the template class, followed by a slash (which never otherwise appears in a class name) and the identity hash code of the anonymous class.
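The naming scheme can be imitated in ordinary Java, as a rough sketch. The helper anonymousStyleName below is hypothetical and only builds a string of the same shape. Whether the prototype prints the hash code in decimal or in some other radix is not stated here, so decimal is an assumption. The second helper checks the claim that ordinary names returned by getName contain no slash.

```java
final class AnonymousNameSketch {
    // Hypothetical helper: builds a name shaped like the one described above,
    // using the identity hash code of some object (decimal is an assumption).
    static String anonymousStyleName(String templateName, Object identity) {
        return templateName + "/" + System.identityHashCode(identity);
    }

    // Reports whether the name returned by getName contains a slash.
    static boolean hasSlash(Class<?> c) {
        return c.getName().indexOf('/') >= 0;
    }

    public static void main(String[] args) {
        System.out.println(anonymousStyleName("MyTemplate", new Object()));
        // Nested classes use '$' and packages use '.', so no slash appears.
        System.out.println(hasSlash(java.util.Map.Entry.class));
    }
}
```

Because the slash is not used in the names of ordinarily loaded classes, a name of this shape cannot collide with one that Class.forName could resolve.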

This prototype will help provide the basis of further experimentation with other constructs, notably anonymous methods and method handles.

Comments:

* This bug is not available.

More information is available at http://developers.sun.com/resources/bugsFAQ.html#s4q2

Posted by Curt Cox on January 22, 2008 at 10:10 AM PST #

Great work. Will this mechanism be useful for erasing erasure along the lines of NextGen?

Posted by Howard Lovatt on January 22, 2008 at 11:57 AM PST #

Great, great, great,
I'm like a kid with a new toy.

In your sample, instead of using getResourceAsStream(), you can get the URL using getResource().
It's interesting because URLConnection has a method getContentLength() that returns the length of the resource
(or -1 :( ).

By the way, available() can return a value lower than the length of the resource without being broken,
for example, if the resource is on a network disk.

Rémi

Posted by Rémi Forax on January 22, 2008 at 06:52 PM PST #


Howard: It's a low-level component for such a project.
You'd also need a method for what I call "class splitting", where a
single Java-level class (which might be final) is refined into a set of more or
less equivalent variants.

Rémi: Thanks for the coding suggestion. And yes, we
are going to have some pretty cool toys to play with this year.

Posted by John Rose on January 24, 2008 at 08:12 AM PST #

Re. Non-erased Generics,

The scheme I had in mind was to present the bytecode stream template for the generic class and then patch the type for the particular generic variation. For the next generic load, present the same bytecodes again, but this time patch differently.

The disadvantage of this approach would be that two lists of integers, say, would have different classes. The alternative is to do what NextGen currently does and load the patched classes as normal. That way you have a global reference and you can check if you have already loaded list of integer, say, and therefore you would only have one class for each generic variation.

A question that comes to mind is, what does instanceOfAnonymousClass.getClass().getName() return?

Posted by Howard Lovatt on January 24, 2008 at 11:49 AM PST #

[Trackback] One of the aspects we have to work around building and improving a dynamic language implementation on the Java Virtual Machine is the way the JVM loads and executes bytecode. In order for JRuby to take advantage of the Hotspot just-in-time (JIT) compil...

Posted by Nick Sieger on February 20, 2008 at 01:52 PM PST #

[Trackback] Use anonymous class of the Da Vinci VM to implement the runtime support of the property spec.

Posted by Rémi Forax's Blog on March 23, 2008 at 07:09 AM PDT #

[Trackback] The first public, beta-quality build of JDK 6 Update 14 is available, introducing another batch of important enhancements for everyone...

Posted by Osvaldo Pinali Doederlein's Blog on February 10, 2009 at 08:48 PM PST #

[Trackback] The first public, beta-quality build of JDK 6 Update 14 is available, introducing another batch of important enhancements for everyone...

Posted by Osvaldo Pinali Doederlein's Blog on March 03, 2009 at 11:02 PM PST #

What about serialization of anonymous classes? Will it be possible, or are there technical limitations preventing it (or philosophical reasons for not doing it)?

Posted by Alessio Stalla on July 08, 2009 at 12:04 AM PDT #

[Trackback] This entry shows how to implement an Expression Tree like the one that comes with the DLR in Java on top of the JSR 292 API.

Posted by Rémi Forax's Blog on July 31, 2009 at 08:35 AM PDT #

About

John R. Rose

Java maven, HotSpot developer, Mac user, Scheme refugee.

Once Sun and present Oracle engineer.
