Is the JDK losing its edge(s)?

One of the goals for JDK 7 is to get us to a modular platform. Getting there will be hard as it's a very interconnected code base with many undesirable dependencies between APIs and different areas of the implementation. These dependencies have built up over many years and releases. To give an example (from a couple of builds ago but mostly applicable to JDK 6 too): Suppose you are using the Logging API (meaning java.util.logging). Logging requires NIO (for file locking) and JMX (as loggers are managed). JMX requires JavaBeans, JNDI, RMI and CORBA (the JMX remote API mandates that the RMI connector support both JRMP and IIOP). JNDI requires java.applet.Applet (huh?) and JavaBeans has dependencies on AWT, Swing, and all things client. Not satisfied with this, JavaBeans has persistence delegates that create dependencies on JDBC and more. I could continue but it should be clear that this seemingly innocent use of the logging API results in transitive dependencies that envelop almost the entire platform. And just to be clear - these are just dependencies, and logging shouldn't of course require any of CORBA's 1600+ classes to actually be loaded. Think of it more like dinner for two except that she hires a fleet of buses to bring her extended family and friends to wait outside the restaurant.
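
To make the "transitive dependencies" point concrete, here is a minimal sketch that computes the transitive closure over the edges listed in the paragraph above. The class name `TransitiveDeps`, the `closure` method, and the string labels are illustrative assumptions, not real module names or part of any JDK tool, and the code is written against a current JDK (Java 9+ for `Map.of`), not JDK 7.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TransitiveDeps {
    // Direct dependencies as described in the post. The labels are informal
    // names chosen for this sketch, not actual module names.
    static final Map<String, List<String>> DIRECT = Map.of(
        "Logging", List.of("NIO", "JMX"),
        "JMX", List.of("JavaBeans", "JNDI", "RMI", "CORBA"),
        "JNDI", List.of("Applet"),
        "JavaBeans", List.of("AWT", "Swing", "JDBC"));

    // Returns everything reachable from 'start' by following dependency edges.
    static Set<String> closure(String start, Map<String, List<String>> graph) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String node = work.pop();
            for (String dep : graph.getOrDefault(node, List.of())) {
                // Only visit each dependency once, so cycles terminate.
                if (seen.add(dep)) {
                    work.push(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("Logging pulls in: " + closure("Logging", DIRECT));
    }
}
```

With just the edges named above, the closure of Logging has ten entries, including CORBA and Applet. Removing the single Logging to JMX edge leaves only NIO, which is the kind of edge removal this work is about.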

The good news is that we've started to make progress over the last few builds to address many of these issues. Logging no longer requires JMX (this required an API change to be backed-out/re-visited). We've separated out the RMI-IIOP transport so you can do remote management without CORBA being present. JMX will now do its own introspection when JavaBeans is not present. JNDI no longer requires java.applet.Applet, JavaBeans no longer requires JDBC, AWT no longer requires RMI, and many more.

So where are we at? At this time we have a tentative base module that is essentially the core libraries (think lang/io/net/util/nio/security). The dependencies that used to exist from the classes in this module on JNDI, deployment code, AWT, the preferences and logging APIs, and JMX have been removed or inverted. There is a remaining dependency on XML parsing (from java.util.Properties) and that will be solved in time.

All things Swing, AWT, 2D, etc. are grouped into a tentative client module. The APIs in this module are deeply interconnected and so pose a big challenge. There are still a few dependencies from other modules (like web services) on client that will require work but ultimately it should be possible to chop off the head, say when deploying on an embedded device.

We have several fine-grained modules that could potentially be grouped, maybe into coarser-grained profiles in the future: Logging, RMI, JSSE (SSL), SASL, JDBC, JNDI, LDAP and other JNDI providers, PKCS11 and other security providers, to name some of them. JSSE is a good example of work done to decouple it from other areas of the platform. One would think it should be possible to do secure networking without requiring the world but, as SSL can negotiate to use Kerberos-based authentication, it was tied to Kerberos/JGSS. For this case, the dependency is now optional. If Kerberos is installed then SSL will include the Kerberos cipher suites when negotiating the security context. When not installed it won't negotiate to use Kerberos.
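
The optional-dependency idea can be sketched as follows. This is not the JSSE implementation; it only illustrates the general pattern of probing for a class at run time and enabling a feature when it is found. The class `OptionalFeature`, its methods, and the choice of probe class and cipher suite names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class OptionalFeature {
    // Probes for a class without initializing it. A missing class means the
    // optional module is treated as "not installed".
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, OptionalFeature.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Builds the list of suites to offer: the base list, plus the Kerberos
    // suites only when Kerberos support was detected.
    static List<String> enabledSuites(List<String> base, List<String> kerberos,
                                      boolean kerberosPresent) {
        List<String> suites = new ArrayList<>(base);
        if (kerberosPresent) {
            suites.addAll(kerberos);
        }
        return suites;
    }

    public static void main(String[] args) {
        // Probe class chosen for illustration only.
        boolean krb = isPresent("javax.security.auth.kerberos.KerberosTicket");
        System.out.println(enabledSuites(
            List.of("TLS_RSA_WITH_AES_128_CBC_SHA"),
            List.of("TLS_KRB5_WITH_3DES_EDE_CBC_SHA"),
            krb));
    }
}
```

The point of the pattern is that the calling code has no compile-time or link-time reference to the optional classes, so it still loads and runs when they are absent.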

I mentioned CORBA above as it is often used as the whipping boy by those that are critical of the compatibility baggage that the JDK carries. A potential module that has been suggested is a compatibility module for deprecated, legacy, unloved and other baggage. Good work from Sean Mullan and Vincent Ryan in jdk7 b78 removes the dependencies on the deprecated security classes so that they don't need to be in the base module. Other potential candidates are the legacy converters. We should have ditched these years ago but there are still JDBC drivers that haven't removed their dependencies. We've also got many classes in sun.misc that aren't used anymore but we can't remove them because there may be naughty applications out there using them directly. Legacy protocol handlers and content handlers are other candidates. I'm sure the reader can think of others.

In the above it is worth pointing out that modules aren't necessarily aligned on package boundaries. It's clearly desirable for a module to contain all the classes that are in one or more complete packages but there are many cases where this isn't possible. I mentioned JavaBeans above and that is a clear case where one has to separate out the property event support and annotations from the introspection and other classes that tie one to the client area. APIs such as New I/O and Logging have management interfaces and it makes a lot more sense to group the management interfaces into the management module along with JMX. I also mentioned the separation of the IIOP transport from the RMI Connector above. For that one, the rmic compiler will generate the stubs and ties to the package but we wouldn't want to group these into the management module as it would create a dependency on CORBA. If one were to split up the base module further then it would mean looking at packages such as java.util that contain a lot more than the collection APIs.

I hope the above gives you a feel for the work that has been happening in jdk7. A more modular JDK should get us closer to our goals to improve performance (download and startup time), enable the platform to scale down, and more. For those interested in diving into this further, the Jigsaw project page and mailing list are a good starting point. The jigsaw/tools repository has the ClassAnalyzer tool that we've been using to analyze dependencies and guide the changes. The tool consumes module definitions and generates several files per module, including the list of classes and the dependencies. There are summary files in various forms, including DOT files for those interested in visualizing the dependencies. Much of the work mentioned above can be thought of as removing edges from the dependency graph. Mandy has been working on the next step, the build changes so that the JDK build will generate modules rather than rt.jar. This will take a few steps to get there. Initially the build will generate JAR files but ultimately we will of course transition to a better container format.
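
For readers who haven't used DOT, here is a minimal sketch of turning a module dependency map into a DOT graph that Graphviz can render. It is not the ClassAnalyzer's code or its actual output format; the class `DotWriter`, the method `toDot`, and the module names are hypothetical and chosen only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DotWriter {
    // Emits one "a" -> "b" edge per dependency. Sources are sorted so the
    // output is stable from run to run.
    static String toDot(Map<String, List<String>> deps) {
        StringBuilder sb = new StringBuilder("digraph modules {\n");
        for (Map.Entry<String, List<String>> e : new TreeMap<>(deps).entrySet()) {
            for (String target : e.getValue()) {
                sb.append("  \"").append(e.getKey())
                  .append("\" -> \"").append(target).append("\";\n");
            }
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        // Hypothetical module names, not the tentative Jigsaw definitions.
        System.out.print(toDot(Map.of(
            "logging", List.of("base"),
            "management", List.of("base", "logging"))));
    }
}
```

Saving the output to a file and running it through a Graphviz layout tool such as `dot` gives a picture in which each removed dependency shows up as a missing edge.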


Thanks for sharing the information on the progress!

I really think that the interdependencies of the JDK packages are the biggest argument against using the Java Platform. Nobody wants a monolithic architecture when it comes to software.

I guess, in a perfect world we would have 'modules' with 'extensions', e.g.:

logging.nio, logging.jmx (or better jmx.logging?)
beans.awt, beans.swing, beans.jdbc (or awt.beans, .. )

Unfortunately, this would lead to a namespace explosion but can hopefully be solved with jdk7's module language extensions.

P.S. Also glad to hear that even interface changes are acceptable for the greater good.

Posted by Michael Nischt on December 13, 2009 at 12:35 AM PST #

I just love this - keep up the good work!

I so hope for the JVM to become as small and have as fast startup as the Flash runtime. If the notion of running a small systray application in a JVM becomes something else than a complete absurdity, I think "java on the desktop" could at last become a real and actual concept.

PS, on a tangent: A "do a full GC, full compaction incl. shedding of SoftRefs, then release it all back to the OS and shrink the footprint to this absolute minimal size" method would do heaps of good in regard to java on the desktop: An application could invoke this gcAndMemoryCompaction() method after so and so many seconds of inactivity (by some app-specific plausible not-in-use logic), or on "minimize to tray" or similar - to play nice with the rest of the OS and hence the user. It would also make the JVM slightly less hostile to swapping.

This post is interesting, and I asked some similar questions in the comment section:

Posted by Endre Stølsvik on December 13, 2009 at 07:33 AM PST #

I can see how this will reduce the (one-time) download time of the platform, but I don't see how it improves startup time. The VM today doesn't load classes that aren't actually used.

Posted by Neal Gafter on December 13, 2009 at 08:41 AM PST #

Yes. Reducing start-up time requires more efficient classloading - this might actually increase the time, or is that my impression?

Posted by paulo on December 13, 2009 at 11:15 PM PST #

Even though the JVM only loads the required classes, as can be seen with -verbose, rt.jar is 43MB, which means that the JVM has to load this large file, and that takes time.

Posted by Carl Antaki on December 13, 2009 at 11:52 PM PST #

Neal: you are right that startup is another area. It will require optimizations at module installation or configuration time and changes to get faster loading.

Posted by Alan on December 13, 2009 at 11:56 PM PST #

Sounds very similar to the dependency refactoring we undertook in the NetBeans Platform, especially circa 2003-2005.

Fixing dependencies without breaking compatibility of published APIs is unfortunately very hard in a statically typed language with closed classes like Java, since dependencies may be physically mentioned in public signatures. For a while we resorted to bytecode postprocessing tricks to maintain a degree of binary compatibility while refactoring sources, but this proved difficult to maintain and eventually we just decided to break binary compatibility for some of the oldest APIs. What would have made the job far easier is some way of expressing bytecode-level transformations of clients, to be interpreted by class loaders. For example, say you could remove java.awt.Image getIcon(int) from BeanInfo yet somehow place an annotation on the BeanInfo class declaration indicating that an interface exists in the java.awt module defining this method, together with some adapter providing implementations. Existing clients which called myBeanInfo.getIcon(myType) could be rewired during class loading to call something like sun.reflect.Rewire.invoke(BeanInfo.class, "Ljava/awt/Image;getIcon(I)", myBeanInfo, myType) which would somehow look up the right implementation in java.awt and call it, without requiring the java.beans module to have either a compile-time or link-time dependency on java.awt. ( is a somewhat analogous tool for source transformations.)

One thing we continue to do is to apply load-time transformations to module dependency declarations (as opposed to class "imports" in bytecode). The primary purpose is to deprecate part of a published API (at class granularity) and split it off into a deprecated module. Old client modules will then get an implied dependency on the deprecated module, while new clients known to be free of deprecated dependencies can skip loading it. Format:

By the way, do you plan as part of the separation to physically divide JDK sources (mainly jdk/src/share/classes) into separate source roots according to module membership? This is the clearest way to enforce good dependencies and ensure that developers think from the beginning about where a piece of code belongs, rather than problems being flagged only in a late packaging step. It is also friendliest to IDEs.

Posted by Jesse Glick on December 14, 2009 at 02:41 AM PST #

This is great work.

With regard to: "We've also got many classes in sun.misc that aren't used anymore but we can't remove them because there may be naughty applications out there using them directly"

Surely this support doesn't need to be maintained? Considering that the documentation and the compiler have scary warnings about the dangers of using sun.misc. It was always explicit that it was not a supported part of the platform.

Posted by Lachlan O'Dea on December 14, 2009 at 09:42 AM PST #

How about removing those nasty yahoo toolbar dependencies while you're at it?

Posted by Rick Minerich on December 14, 2009 at 10:31 PM PST #

A plain Java class with only an empty main() defined causes 545 classes to be loaded (java -verbose:class). Seems like there is room for improvement; am glad that you all are working on it.

Posted by Patrick Wright on December 15, 2009 at 06:44 AM PST #

I did some work analyzing the dependencies mainly in the XML area, reported here:

which may be of interest. A few small changes can help a lot. (In the end, though, I shipped with the Apache Xerces parser rather than the one from the JDK, partly because it was easier to extricate, but more importantly because it is much more reliable and conformant.)

Posted by Michael Kay on December 16, 2009 at 06:34 PM PST #

Jesse: we definitely want to learn from other projects although we have to be very careful not to break compatibility. I think the only thing we've broken so far is support for jdk1.1 style security policy files. API dependencies are the most problematic. You mentioned Beans but that's not too bad in that the remaining dependencies are in implementation code (not API signatures) so there is potential to separate it into its own module. We don't have any plans to do load-time transformations. If we did then it could be done statically to avoid the runtime cost. The build and layout in the repository will need to be attacked at some point but for now we're focused more on the runtime.

Patrick: the number of classes loaded for the empty main case should be ~300.

Michael: Thanks for the link. In our module definitions (all very tentative at this point) you'll see there are Xerces and Xalan modules, in addition to the API module. This might be interesting to your work.

Posted by Alan on December 16, 2009 at 11:11 PM PST #

Would the base module include support for JSR 223 and 292? I'm wondering if the new lightweight Java VM wouldn't be a very good candidate for creating a cross platform/cross browser JavaScript engine.

While there are quite a few good entrants in this race: WebKit's SquirrelFish Extreme, Google's V8, Mozilla's TraceMonkey, and Opera's Carakan; I would suggest that the race is far from over yet.

I think an argument could be made that a more memory efficient and performant JVM for dynamic languages that can go anywhere would be very valuable to the browser platform makers targeting a consistent development experience for desktops, netbooks, and mobile devices.

Posted by Micah J on December 25, 2009 at 05:46 AM PST #

When will the NIO api be updated to take advantage of GetQueuedCompletionStatusEx on 2008 Server?

Posted by John Davis on December 25, 2009 at 10:19 PM PST #

@Alan: "the number of classes loaded for the empty main case should be ~300."

Oops, I think there might have snuck in two zeros in that string; You probably meant ~3? String, Object, and the class in question?

Posted by Endre Stølsvik on December 26, 2009 at 09:09 AM PST #

Micah: jsr223 and jsr292 aren't currently in our base module.

John: I'd suggest bringing up your question on nio-dev as it's off-topic here.

Endre: No, I meant that the no-op test will likely load ~300 classes (not >500 as suggested).

Posted by Alan on December 28, 2009 at 05:19 AM PST #
