    May 31, 2008

with Android and Dalvik at Google I/O

John Rose
Invited by some friends at Google, I went to Google I/O this week to find out about Android, and specifically their Java story. I went to a few talks and had some excellent chats with various colleagues.

The top ten things I learned about Android and the Dalvik VM

  1. Android is a slimmed-down Linux/JVM stack. They rewrote libc to be 200 KB, redoing speed-vs.-space optimizations, and throwing out C++ exceptions and C-level wide char support. (As far as the JVM is concerned, this reminds me of recent work we have done to “kernelize” HotSpot on Windows.)

  2. A special strength of the platform is their attention to detail about reducing the cost of private pages. Many pages are read-only mapped from files, which means they can be dropped from RAM at a moment’s notice. Many other pages are shared with only rare use of copy-on-write (e.g., between JVMs). (This is similar to the work we have done on HotSpot with class data sharing.)

  3. The first and main reason they give for using Harmony instead of OpenJDK is the GNU license (GPL). Cell phone makers want to link proprietary value-add code directly into the system (into JVM-based apps and/or service processes), and they do not want to worry about copyleft. Perhaps there is some education needed here about the class path exception. (I know I don’t understand it; maybe they don’t either. And, their license wonks appear to have a well-considered preference for Apache 2 over GPL+CPE.)

  4. The VM they have for Android 1.0 is very basic: A “malloc-like” heap, interpreter only. This means they still have many build-vs-buy decisions to make. I told them they should adopt HotSpot’s first-level JIT (C1), that we should work together on kernelization, and that it is time for a classfile format update anyway.

  5. Key reasons against using JVM bytecodes are interpreter complexity and dirty page footprint. The Dalvik bytecode design executes Java code with less power (fewer CPU and memory cycles) and with more compact linkage data structures (their constant pool replacement looks like that of Pack200, and reminds me of some recent experiments with adapting the JVM to load Pack archives directly).

  6. The VM uses “dex” files like Java cards use their own internal instruction sets. The tool chain does use class files, but there is a sizeable (70K LOC) tool called “dx” that cooks JARs into DEX assemblies. The dex format is loaded into the phone, which then verifies and quickens the bytecodes and performs additional local optimizations.

  7. Something like the dx tool can be forced into the phone, so that Java code could in principle continue to generate bytecodes, yet have them be translated into a VM-runnable form. But, at present, Java code cannot be generated on the fly. This means Dalvik cannot run dynamic languages (JRuby, Jython, Groovy). Yet. (Perhaps the dex format needs a detuned variant which can be easily generated from bytecodes.)

  8. The “dx” tool turns classfiles into SSA and then (after reorganization) into dex files. However, optimizations are missing: loop invariants must still be pulled up manually by programmers. The dex format is not known to be easily JIT-able; however, the designers have given some thought to making it so. (Probably the dex format needs some work in this direction. Let’s do that, and standardize on it!)

  9. The dex format has the usual merged, typed constant pool with 32-bit clean indexes. (Cf. Pack200.) This work is likely to stimulate the Java world to update the classfile format standard in that direction. Hopefully we can do this in a way that benefits much of the ecosystem.

  10. People are thankful to Sun for past stewardship of Java, but are not seeing much guidance from Sun toward the future. At least, that is the story I heard. Whether that story is mere perception or an actual leadership vacuum, our actions with OpenJDK need to reverse it.
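To make item 9 a little more concrete, here is a toy sketch of what merging per-class constant pools buys you. It models only a single string pool addressed by 32-bit `int` indexes; the real dex format keeps separate typed pools and has its own on-disk layout, none of which is reproduced here. The class name `MergedPoolSketch` and the sample entries are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of one merged string pool shared by several classes (not the real dex layout). */
public class MergedPoolSketch {
    private final Map<String, Integer> indexOf = new LinkedHashMap<>();
    private final List<String> entries = new ArrayList<>();

    /** Returns the pool index of the string, adding it on first use. */
    public int intern(String s) {
        Integer existing = indexOf.get(s);
        if (existing != null) {
            return existing;            // already present: share the entry
        }
        int index = entries.size();     // next free 32-bit index
        indexOf.put(s, index);
        entries.add(s);
        return index;
    }

    public String get(int index) {
        return entries.get(index);
    }

    public int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        MergedPoolSketch pool = new MergedPoolSketch();
        // Two hypothetical classes whose per-class pools overlap in one entry.
        String[] classA = {"java/lang/Object", "toString", "()Ljava/lang/String;"};
        String[] classB = {"java/lang/Object", "hashCode", "()I"};
        int total = 0;
        for (String[] perClass : new String[][] {classA, classB}) {
            for (String s : perClass) {
                pool.intern(s);
                total++;
            }
        }
        System.out.println(total + " per-class entries -> " + pool.size() + " merged entries");
    }
}
```

With only two tiny classes the saving is a single entry; the point is that shared names are stored once per file instead of once per class.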

Bonus: The view from Yahoo

I also met Sam Pullara, and had a good long chat with him about (among other things) what Yahoo would like to see JVMs do better. Here are my notes:

about NIO

  • Customers want simplified non-stateful buffers; buffer.flip is inscrutable.

  • Customers want poll not select. That is, when the data is ready, give it to the listener without an additional fetch, or else when the listener wakes up the data is likely to be gone again (swapped out).
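For readers who have not run into the `flip` complaint, here is a minimal sketch of the stateful protocol in question, using only the standard `java.nio.ByteBuffer` API. The class and method names (`FlipSketch`, `roundTrip`, `remainingWithoutFlip`) are mine, chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class FlipSketch {
    /** Writes text into a buffer, flips it, and reads the same bytes back. */
    static String roundTrip(String text) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.put(text.getBytes(StandardCharsets.UTF_8)); // writing: position advances
        buf.flip();                                     // limit = position, position = 0
        byte[] out = new byte[buf.remaining()];         // remaining() is now the bytes written
        buf.get(out);                                   // reading: drains up to the limit
        return new String(out, StandardCharsets.UTF_8);
    }

    /** Without flip(), remaining() describes the unwritten tail, not the data. */
    static int remainingWithoutFlip(String text) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.put(text.getBytes(StandardCharsets.UTF_8));
        return buf.remaining();                         // capacity minus bytes written
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
        System.out.println(remainingWithoutFlip("hello"));
    }
}
```

The same buffer object means two different things depending on whether `flip` has been called, which is the statefulness customers trip over.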

about runaway JVMs

  • Need a way to watch for runaway memory and/or CPU usage

  • Customers want to kill a whole VM that goes off the rails. (Perhaps applets also.)

  • I noted that TLAB fill events are the right place to cut in a per-thread allocation monitor.

  • JMX is really helpful here. It can report memory threshold events.

  • So let’s add thread-specific ones, and CPU threshold events.
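As a rough sketch of where things stand, the code below arms the memory usage threshold notification that JMX already provides, and then approximates a per-thread CPU threshold by polling `ThreadMXBean`. The polling half is a stand-in I made up for the events proposed above, which do not exist as notifications; the class name, method names, and budget parameter are all hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import javax.management.NotificationEmitter;

public class RunawayWatchSketch {
    /** Arms the existing usage-threshold notification on heap pools that support it. */
    static int armHeapThresholds(double fraction) {
        // The platform MemoryMXBean is documented to be a NotificationEmitter.
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener((notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
                System.err.println("memory threshold exceeded: " + notification.getMessage());
            }
        }, null, null);

        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() <= 0) {
                continue; // pool is invalid or has no defined maximum
            }
            if (pool.getType() == MemoryType.HEAP && pool.isUsageThresholdSupported()) {
                pool.setUsageThreshold((long) (usage.getMax() * fraction));
                armed++;
            }
        }
        return armed;
    }

    /** Polls per-thread CPU time and returns ids of threads over the budget. */
    static List<Long> threadsOverCpuBudget(long budgetNanos) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<Long> over = new ArrayList<>();
        if (!threads.isThreadCpuTimeSupported()) {
            return over;
        }
        for (long id : threads.getAllThreadIds()) {
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if dead or measurement disabled
            if (cpuNanos > budgetNanos) {
                over.add(id);
            }
        }
        return over;
    }

    public static void main(String[] args) {
        System.out.println("heap pools armed: " + armHeapThresholds(0.9));
        System.out.println("threads over 1s of CPU: " + threadsOverCpuBudget(1_000_000_000L));
    }
}
```

Polling is a poor substitute for an event: it costs cycles when nothing is wrong and reacts late when something is, which is part of the case for real thread-specific thresholds.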

JVMs as sandboxes for bad old code

Customers are eager to use Java VMs as containers for sandboxing old C libraries. These libraries are often non-reentrant and may age badly (crash, run out of memory) after heavy use. But server systems need to run multiple instances of them, in order to scale across HW resources. It is not enough just to load one into your JVM and wrap a thread around it.

Customers are sometimes desperate enough to use the “nested VM” hack (JVM interprets MIPS code generated by gcc). Surely there is an opportunity here!

So why (I am asked) is there no action on isolates? They look to the customer like a no-brainer (like Android’s zygote). In any case, it is clear to me we need better plumbing and monitoring for isolating such old code. A better story could look like:

  • Keep pre-warmed JVM ready to form new isolates.

  • When we want to start up another instance of a C library, we fork a new isolate.

  • On the client VM side, we have some nice light RPC binding.

  • On the service VM side, we have swig-generated tight (unsafe) binding to the C library.

  • The service instance is monitored and can die or be killed if it misbehaves.
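Isolates are not available in the standard JDK, so the last bullet can only be sketched today with a cruder stand-in: a separate OS process that a supervisor starts, waits on, and kills if it overruns. The sketch below assumes exactly that substitution. It shows nothing of the pre-warming, RPC binding, or swig binding, and `ServiceSupervisorSketch`, `runSupervised`, and `KILLED` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class ServiceSupervisorSketch {
    /** Sentinel returned when the service had to be killed. */
    static final int KILLED = Integer.MIN_VALUE;

    /** Starts the service process and returns its exit code, or KILLED if it overran. */
    static int runSupervised(List<String> command, long timeoutMillis) {
        Process service;
        try {
            service = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD) // Java 9+
                    .start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            if (service.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return service.exitValue();      // the service died on its own
            }
            service.destroyForcibly().waitFor(); // misbehaving: kill it and reap it
            return KILLED;
        } catch (InterruptedException e) {
            service.destroyForcibly();
            Thread.currentThread().interrupt();  // preserve the interrupt for the caller
            return KILLED;
        }
    }

    public static void main(String[] args) {
        // Stand-in "service": the current JVM's launcher printing its version.
        String java = System.getProperty("java.home") + "/bin/java";
        int code = runSupervised(List.of(java, "-version"), 60_000);
        System.out.println(code == KILLED ? "killed" : "exit code " + code);
    }
}
```

A process per library instance gives the isolation and the kill switch, but at a startup and memory cost that isolates forked from a pre-warmed JVM are meant to avoid.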

Crunchy search goodness

Sam showed me a cool Yahoo search plugin for sifting between versions of Java APIs. Here is an example query for hashmap, which probably does not work in your browser until the plugin is installed.

(Thanks, Sam and Dan, for sending some corrections. Any remaining errors are still my fault.)


Comments ( 6 )
  • Casper Sunday, June 1, 2008

    > Customers want to kill a whole VM that goes off the rails. (Perhaps applets also.)

    Sure, and it would also be nice if it were possible to name the java process; it is very hard to identify an offender when all you see is six similarly named java processes running.

    In the same category, allow a JVM instance to be restarted without having to rely on native wrappers (like nb.exe). Even JSR-296 refused to include this in the form of simply firing up another instance using a new classloader.

  • guest Monday, June 2, 2008

    The JDK "jps" tool is often very useful to distinguish multiple java processes.

  • Nick Evgeniev Monday, June 2, 2008

    Speaking of java and native libraries, I would like to mention the JNA library https://jna.dev.java.net/ as it's the way to go (not swig). So if it is possible to incorporate this effort into the jvm, or introduce tweaks to make JNA work faster (minimize overhead), it would be nice.

  • Viktor Tamas Thursday, June 5, 2008

    'The VM uses “dex” files like Java cards use their own internal instruction sets.'

    Do you mean only that each has its own internal instruction set, or might there be further relations between the DEX and JavaCard formats / VM instructions, e.g. some kind of compatibility?

  • bob pasker Sunday, April 12, 2009

    Sam, who used to work for me, knows that I have been asking the Java people to solve the runaway thread problem for over 10 years. Here's just my latest rant on it.


  • guest Wednesday, August 5, 2009

    About the zygote, it seems like the application processes share live core libraries (shared dirty, read-only) and zygote heap (copy-on-write). Are the preloaded classes preloaded into the zygote heap?
