The massive, monolithic JDK

Space is big. Really big. You just won’t believe how vastly, hugely, mindbogglingly big it is. I mean, you may think it’s a long way down the road to the chemist, but that’s just peanuts to space.

— Douglas Adams, The Hitchhiker’s Guide to the Galaxy

The JDK is big, too—though not (yet) as big as space.

It’s big because over the last thirteen years the Java SE platform has grown from a small system originally intended for embedded devices into a rich collection of libraries serving a wide variety of needs across a broad range of environments.

It’s incredibly handy to have such a large and capable Swiss-Army knife at one’s disposal, but size is not without its costs.

Size

The JDK and its runtime subset, the JRE, have always been delivered as massive, indivisible artifacts. The growth of the platform has thus inevitably led to the growth of the basic JRE download, which now stands at well over 14MB despite heroic engineering efforts such as the Pack200 class-file compression format.

Complexity

The JDK is big, and it’s also deeply interconnected. It has been built, on the whole, as a monolithic software system. In this mode of development it’s completely natural to take advantage of other parts of the platform when writing new code or even just improving old code, relying upon the flexible linking mechanism of the Java virtual machine to make it all work at runtime.

Over the years, however, this style of development can lead to unexpected connections between APIs—and between their implementations—leading in turn to increased startup time and memory footprint. A trivial command-line “Hello, world!” program, e.g., now loads and initializes over 300 separate classes, taking around 100ms on a recent desktop machine despite yet more heroic engineering efforts such as class-data sharing. The situation is even worse, of course, for larger applications.
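To get a rough feel for that class count on your own machine, here is a minimal sketch, not a rigorous measurement. It uses the standard java.lang.management API to ask the running JVM how many classes it has loaded; the class name LoadedClassCount is just an illustrative choice. Note that loading the management classes inflates the number somewhat, so treat the result as an upper bound for a bare "Hello, world!".

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class LoadedClassCount {

    /** Returns the number of classes currently loaded in this JVM. */
    public static int loadedClasses() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("Hello, world!");
        // The management API itself pulls in extra classes, so this
        // figure is somewhat higher than for a bare hello-world program.
        System.out.println("Classes loaded so far: " + loadedClasses());
    }
}
```

Running any program with the JVM's -verbose:class option gives a less intrusive view, since it logs each class as it is loaded.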

Palliatives

The Java Kernel and Quickstarter features in the JDK 6u10 release do improve download time and (cold) startup time, at least for Windows users. These techniques really just address the symptoms of long-term interconnected growth, however, rather than the underlying cause.

The modular JDK

The most promising way to improve the key metrics of download time, startup time, and memory footprint is to attack the root problem head-on: Divide the JDK into a set of well-specified and separate, yet interdependent, modules.

The process of restructuring the JDK into modules would force all of the unexpected interconnections out into the open where they can be analyzed and, in many cases, either hidden or eliminated. This would, in turn, reduce the total number of classes loaded and thereby improve both startup time and memory footprint.

If we had a modular JDK then at download time we could deliver just those modules required to start a particular application, rather than the entire JRE. The Java Kernel is a first step toward this kind of solution; a further advantage of having well-specified modules is that the download stream could be customized, in advance, to the particular needs of the application at hand.
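As a sketch of what "just those modules required" could mean in practice: if every module declared the modules it directly depends on, the set to deliver for an application would be the transitive closure of its dependencies. The module names and the graph below are entirely hypothetical, invented for illustration, and the code assumes Java 9 or later for Map.of and List.of.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ModuleClosure {

    // Hypothetical module graph: module name -> modules it directly requires.
    public static final Map<String, List<String>> REQUIRES = Map.of(
        "base",    List.of(),
        "logging", List.of("base"),
        "xml",     List.of("base"),
        "desktop", List.of("base", "xml"),
        "rmi",     List.of("base", "logging"),
        "corba",   List.of("base", "rmi"));

    /** Returns every module reachable from the given root modules, sorted by name. */
    public static Set<String> closure(Map<String, List<String>> graph, String... roots) {
        Set<String> needed = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>(List.of(roots));
        while (!pending.isEmpty()) {
            String module = pending.pop();
            // Only expand a module the first time we see it.
            if (needed.add(module)) {
                pending.addAll(graph.getOrDefault(module, List.of()));
            }
        }
        return needed;
    }

    public static void main(String[] args) {
        // An application that only needs the (hypothetical) desktop module.
        System.out.println(closure(REQUIRES, "desktop")); // prints [base, desktop, xml]
    }
}
```

In this toy graph a desktop-only application never pulls in the rmi or corba modules, which is the kind of saving a customized download stream could exploit.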

Now, wouldn’t all that be cool?

Going further, the modularization process could be applied not just to the JDK but to libraries and applications themselves so as to improve these metrics even more. Doing so might also enable us to address some other longstanding problems related to the packaging and delivery of Java code.

Hmm …

Comments:

Mark

I am glad to see you raise this issue, as it shows that the Sun Java team is not deaf to the concerns we Java developers have as the JDK grows and grows. I think many of us would be happy to see a more modular JDK. What will be interesting to see is how far you can go while maintaining compatibility and TCK compliance.

Any hints on a timeframe for addressing this? Java 7? Java 8?

Also, do you envision this as part of a JSR?

Good luck with the project, please keep us posted!

Patrick

Posted by Patrick Wright on November 24, 2008 at 06:19 PM PST #

Glad to see you blogging on this. So, we're ready to modularize it right now, aren't we?

Posted by Weijun on November 24, 2008 at 07:23 PM PST #

The JDK is big because Java never specified any industrial way of managing software dependencies.

Thus the only way to deploy the Java stack reliably was to bundle it in a huge monolithic monster.

Incidentally, this does not work for third parties, only for Sun and the JDK. The worst consequence of this lack of dependency management is not the bloated JDK but all the unmanageable apps which have been produced over time, with hardcoded classpath strings and massive forking (because if you can't manage deps and update them independently you may as well fork the private copy of the jars you're forced to bundle with your apps).

You don't have to look much farther to understand why Java only ever thrived in J2EE servers (that provided a bit of the management lacking in the base Java platform).

Posted by nim-nim on November 24, 2008 at 08:01 PM PST #

I totally agree with the point of this article.

We need to learn from the industrial revolution and from manufacturing industries.

Some sizes to show how far away software development is from manufacturing industries.

* A Boeing 747 has some six million (6,000,000) parts!
* Java Core (jdk 1.6_2) has:
  - Just two million (2,035,060) lines (code, javadoc comments and comments).
  - The source exists in just seven thousand (7,069) source files.
  - The source exists in just under five hundred (480) directories.

Even in this primitive state we do not yet have a firm foundation on which to build,
i.e. we are currently building on SAND.

Dave

Posted by David M. Gaskin on November 24, 2008 at 10:42 PM PST #

I guess this is all a symptom of "backward compatibility". Seems the Java Kernel is a good step in the right direction.

Why does the existing packaging name space (java.io, java.nio, etc) not already capture some of the "modular"-ism issues being discussed?

Why does the importing of necessary packages for the application not also address some of the concerns here?
On this one, I assume this really only addresses the application-specific requirements and not the JRE/JVM level.

In the end, I guess this is a side effect of classpath pains.

So is the need to have the basic kernel, as found in the recent updates, with an added use case: when a class is not found in the classpath, it downloads any additional optional packages, perhaps from a repository (apt-like, Maven?) somewhere? (This does assume network connectivity, though.)

Would one of the upcoming JSRs (294) address some of these concerns?

Posted by Eric on November 24, 2008 at 10:43 PM PST #

What can I do to help?

Posted by Collin on November 24, 2008 at 10:49 PM PST #

Good call, Collin. I am willing to invest some man-hours in this endeavor as well. Where do I sign up?

Posted by Chris Butler on November 24, 2008 at 11:28 PM PST #

A modular JDK is the way to go. Will that be in Java 7?

Although I have yet to fully understand how the Patch-In-Place feature of consumer Java works now, or how it could be used to build Java 7.

For example, suppose a user has installed the full Java 7.0 JRE and runs an applet. The applet detects that it needs the Java 7.01 version of a core Java module library (e.g. Swing), so it downloads that module library. Now, does it patch the full Java 7.0 JRE? Or is the Java 7.01 module available to that applet only?

So I think a modular JDK will probably have versioning issues as a result of patching.

Posted by GeekyCoder on November 25, 2008 at 12:19 AM PST #

While it may be "cool", I doubt that this is a top priority for most Java developers.

I have this bad feeling that you might be swayed by a handful of positive responses to your blog, without any idea of whether they're representative of the Java developer community as a whole.

Even just fixing the bugs that have the most votes would be a better strategy than just working on what's "cool".

I suppose "listening to your customers" is just one of those old-fashioned, closed-source ideas that has been thrown overboard. But look at the bright side...you now have two random developers who say they're willing to help. Good luck with that.

Posted by Andy Tripp on November 25, 2008 at 01:26 AM PST #

Mark,

You hit the nail on the head. I've been pushing for precisely this sort of work to be done for a long long time. I was very happy to see Sun agree to Java Kernel because it is a huge step in the right direction.

There are two major problems that must be overcome before this is settled, and they aren't technical in nature:

1) From Sun's point of view, we're asking it to do "house cleaning" that brings in zero revenue at a time when it's laying off 20% of its staff. There is never a good time to do house cleaning, but it still needs to get done.

2) Sun has to understand that modularity inevitably leads to integration bugs and it needs to be okay with this. We *should* be able to evolve a module at a different speed than another and not necessarily sync major releases to official JDK releases. The way I envision this working is that each module would progress at its own rate. Sun would tag a specific version of each module at every official release and developers would be able to choose to either depend on those tested tags or take a risk by using the newer versions available in the wild.

The good news is that once this is settled it will also let us finally remove deprecated classes and methods because those can make up yet another module that can be optionally downloaded.

Posted by Gili on November 25, 2008 at 01:31 AM PST #

Andy,

I agree with you that Sun should work on longstanding issues with hundreds of votes before adding "cool" new features. The reason I agree with you is that I think that if this came up to a vote it *too* would receive hundreds of votes. If it receives fewer votes than other issues then those other issues should be worked on first.

The first step in my view is to retrofit BugParade. It is a monster that needs to die. If you want open-source Java development we need two-way communication with Sun and BugParade is decidedly one way. Half the time I leave comments I'm not sure anyone ever reads them, and vice versa, I would love to get notifications when Sun Engineers leave comments on a regular basis as opposed to once in a blue moon.

Posted by Gili on November 25, 2008 at 01:40 AM PST #

Euh, are we getting into this again? Wasn't this all well underway before the great clash of Sun's superpackages and OSGi, and the two JSRs that had very similar goals, etc.?

I thought with GlassFish using OSGi the siege between the two camps had ended.

Meh, I must have been mistaken, but I guess that's not that hard looking from such a distance.

Posted by Michael B on November 25, 2008 at 03:19 AM PST #

A modular JDK (or rather JRE) is completely irrelevant to enterprise users. I think enterprise users prefer the "blob" as it is now, because modules mean dependencies and that smells like "DLL hell". The blob is easy to distribute and patch, that's what counts. Also, Java has good upward compatibility: Not only "write once, run everywhere", but also "write once, run forever", which means great return on investment (ROI). This is one of the reasons people prefer Java over .NET. MS's strategy has always been: "Here's the new version of insert-favorite-hype-here, please retrain and port all your apps." MS technologies are way too short-lived. The Java platform is already modular: The modules are third-party libraries that my app needs. I like the way Sun only makes libraries part of the platform when they have become mature enough. Robustness and reliability are key to Java's success. So please go back now and fix those remaining bugs. I really liked that aspect of the 6u10 release!

Posted by will69 on November 25, 2008 at 03:23 AM PST #

In a broadband world, 14MB is no big deal.

In a GPL world where Java comes embedded with your OS and apps, 14MB is no big deal.

Re-using code from one area of the JRE while implementing another not only makes it easier on the developers, but reduces bugs. That is a huge deal.

Posted by Chris Hubick on November 25, 2008 at 04:24 AM PST #

Sorry. I don't think it's a great idea. And we don't need it. What we need is a fast run time, not a modular one. And what's the point of making the JDK modular? Does anyone use the JDK other than for development purposes?

Downloading core JRE components dynamically? Are you kidding? Not everyone in this world has a super fast internet connection.

There are so many things to be improved in the JRE. For example, the JRE update can be made much more efficient by making it update incrementally.

Posted by James Selvakumar on November 25, 2008 at 09:32 AM PST #

I don't understand you, James.
The run time is already fast. For Java to gain in the consumer market, we need to decrease start up times, among other things. We hope to achieve this by dividing the environment into modules, to highlight and remove unnecessary dependencies between classes (see Mark's example of 300 classes used to print hello world).

The Patch-In-Place system has already been introduced, with 6u10. And the fact that not everyone has a high speed connection is precisely WHY we should distribute the JRE in modules to consumers, rather than as a big blob. Entire modules could remain not downloaded until needed. How often do you use CORBA? Many (most?) consumers have the JRE installed only to view applets. How many of them use RMI? Or SQL? Or javax.management?

Posted by J on November 25, 2008 at 06:46 PM PST #

For those nodding in agreement and wondering when they can get their hands on a modular JDK, the answer is: right now. Apache Harmony has already done this. It uses the Java standard for modularity (JSR 291, aka OSGi) to define module dependencies and it works right now.

It's great news if this blog post indicates that Sun intends to catch up in this area. Though not so great news if they intend to use a proprietary or non-standard module system.

Posted by Neil Bartlett on November 25, 2008 at 10:13 PM PST #

First of all, the number of applications that connect to privately owned websites of their own accord should be very, very limited. Second of all, a program should run whether or not the OS is connected to the internet. These two norms preclude a dynamically self-updating JRE.

Posted by Michiel de Boer on November 26, 2008 at 04:38 PM PST #

Good call!

For now, versions are hell in Java, with too many requirements at a global VM level. In order to break things up we must face versioning on a package level with an awt-version, nio-version just like the apache-version. It also requires a more relaxed package management so we don't have to download every version in order to comply with super strict enterprise edition rules and configurations.

Will this give us a Java DLL problem?
Well, compared with Perl and Python package management, it's sometimes ugly, but less of an island than Java.

/jonas

Posted by jonas bosson on November 26, 2008 at 11:35 PM PST #

I posted in Charles Nutter's blog (http://blog.headius.com/2008/11/noise-cancelling.html) some long considerations about adapting the JRE for dynamic/scripting languages. In this context, a well-modularized JRE (both APIs and VM) would be invaluable.

Posted by Osvaldo Pinali Doederlein on November 28, 2008 at 03:01 AM PST #

I'd really like to hear Mark comment on Apache Harmony's already modularized JRE - a modularization which is done via OSGi. Even better, the modularization of Harmony is integrated directly into the PDE of Eclipse so it's something that developers can make use of right away.

I seem to remember Mark saying somewhere that OSGi was insufficient for the needs of modularizing the JRE, so I'd love to see his criticism of what Harmony has done and why it is completely insufficient for the needs of Sun. I'd love to see Mark's reasoning as to why using a well defined, existing standard simply won't do for what he has in mind.

Posted by Hal on November 29, 2008 at 03:25 AM PST #

Hehe, a good taste for engineers, meaningless for others.
I am happy to see such topics.

Posted by neoe on December 04, 2008 at 03:21 PM PST #


This blog has moved to http://mreinhold.org/blog.
