Guidance on measuring the size of a language change

A project will soon begin considering a to-be-determined set of small language changes for JDK 7. Given the rough timeline for JDK 7 and other ongoing efforts to change the language, such as modules and annotations on types, only a limited number of small changes can be considered for JDK 7. That does not imply that larger changes aren't appropriate or worthwhile at some point in the future; in the meantime such changes can be explored and honed for JDK 8 or later.

Criteria for evaluating the utility of a language change, as distinct from its size, will be discussed in a future blog entry.

The JCP process defines three deliverables for a JSR:

  • Specification

  • Reference Implementation

  • Compatibility Tests

These three distinct aspects of a language change, specification, implementation, and general testing, exist whether or not the change is managed under a JSR. For this project, a language change will be judged small if it is simultaneously a small-enough effort under all three of specification, implementation, and testing. In other words, if a change is medium sized or larger in a single area, it is not a small change. (This corresponds to using an infinity norm to measure size; see "Norms: How to Measure Size".) Another concern is the size of change to developers, but if the change is small in these three areas, it is likely to be small for developers to learn and adopt too. Because there is limited fungibility between the people working on specification, implementation, and testing, a single oversize component can't necessarily be compensated for by the other two components being small enough to be managed on their own.
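The infinity-norm rule above can be sketched in a few lines of Java. This is only an illustration: the class name ChangeSize, the Size scale, and the method names are hypothetical and not part of any JDK or JCP tooling. The overall size of a change is taken to be the largest of its three component sizes, so one oversize component is enough to disqualify it.

```java
// A minimal sketch, assuming sizes can be placed on a simple ordinal scale.
// All names here are hypothetical and exist only for illustration.
class ChangeSize {
    // Ordinal scale, smallest to largest; declaration order defines the ordering.
    enum Size { TINY, VERY_SMALL, SMALL, MEDIUM, LARGE, HUGE }

    // Infinity norm: the overall size is the maximum of the three components.
    static Size overall(Size specification, Size implementation, Size testing) {
        Size max = specification;
        if (implementation.compareTo(max) > 0) {
            max = implementation;
        }
        if (testing.compareTo(max) > 0) {
            max = testing;
        }
        return max;
    }

    // A change is small only if every component is SMALL or below.
    static boolean isSmall(Size specification, Size implementation, Size testing) {
        return overall(specification, implementation, testing).compareTo(Size.SMALL) <= 0;
    }

    public static void main(String[] args) {
        // A small spec change with a medium implementation effort is not small overall.
        System.out.println(overall(Size.SMALL, Size.MEDIUM, Size.TINY));
        System.out.println(isSmall(Size.SMALL, Size.MEDIUM, Size.TINY));
        System.out.println(isSmall(Size.SMALL, Size.VERY_SMALL, Size.SMALL));
    }
}
```

Taking the maximum rather than a sum reflects the limited fungibility point: two tiny components do not offset one medium component.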

The size of a specification change is not just related to the amount of text that is altered; it also depends on which text, how many new concepts are needed, and the complexity of those concepts. Similarly, the implementation effort can be large if a limited amount of tricky code is involved as well as if a large volume of prosaic code is needed. An estimate of the future maintenance effort should factor into judging the net implementation cost too. The specification size and implementation size are often not closely related; a small spec change can require large implementation efforts and vice versa. JCK-style conformance testing is based on testing assertions in the specification, so the size of this kind of testing effort should have some positive correlation with the size of the specification change. Likewise, regression testing should have at least a weak positive correlation with the size of the implementation change. However, adequate conformance testing can be disproportionately large compared to the size of the specification change depending on how the assertions interact and how many programs they affect.

Due to the complexity of the Java type system and the desire to maintain backwards compatibility, almost any type system change will be at least a medium-sized effort for the implementation, specification, or both. Each new feature of the type system can interact with all the existing features, as well as all the future ones, so type system changes must be approached with healthy skepticism.

As a point of reference, the set of Java SE 5 language features will be sized according to the above criteria; from smallest to largest:

  • Normal maintenance, Size: Tiny
    In the course of maintaining the platform, small changes and corrections are made to the Java Language Specification (JLS) and javac. These changes, even taken together, are not large enough to warrant a JSR separate from the platform umbrella JSR.

  • Hexadecimal floating-point literals, Size: Very small
    Hexadecimal floating-point literals were a small new feature added to the language in JDK 5 under maintenance. Only very localized grammatical changes were needed in the JLS together with well-bounded supporting library methods.

  • for-each loop, Size: Small
    Part of JSR 201, the enhanced for statement required a new section in the JLS and a straightforward desugaring by the compiler. However, there were still complications; calamity was narrowly averted in the new libraries needed to support the for loop. A new java.lang.Iterator type that would have broken migration compatibility was dropped in favor of reusing the less than ideal java.util.Iterator.

  • static import, Size: Small, but more complicated than expected
    Static import added more ways to influence the mapping of simple names in source code to the binary names in class files. The mapping already had complexities, including rules for hiding, shadowing, and obscuring; static import introduced more interactions.

  • enum types, Size: Medium
    By introducing a new kind of type, enum types involved a type system modification and so were a medium-sized change. While the normative JLS text devoted to enums is brief, JVMS changes were also required, as well as surprisingly time-consuming and intricate libraries work, including interactions with IIOP serialization.

  • autoboxing and unboxing, Size: Medium
    The complications with autoboxing and unboxing come not from the feature directly, but from its interactions with generics and method resolution.

  • Annotation types, Size: Large
    Just as an enum was a new kind of specialized class, an annotation type, introduced in JSR 175, was a new kind of specialized interface. Besides being a type change, annotation types required coordinated JVM and library modifications as well as a new tool and framework, and a subsequent standardization, to fulfill the potential of the feature.

  • Generics, Size: Huge
    Generics were a pervasive change to the platform, introducing many new concepts in the specification, considerable change to the compiler, and far-reaching libraries updates.
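To make a few of the smaller items in the list above concrete, here is a hedged sketch of three of them in source form: a hexadecimal floating-point literal with one of its supporting library methods, the enhanced for statement next to an approximation of its desugaring, and one place where autoboxing interacts with method resolution. The class and method names are hypothetical, and the desugared loop is only roughly the translation the JLS describes for an Iterable, not the exact output of javac.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; names are for illustration only.
class Java5Features {
    // Hexadecimal floating-point literal: 0x1.8p1 means 1.5 * 2^1, that is, 3.0.
    static final double THREE = 0x1.8p1;

    // Enhanced for statement; each Integer element is also unboxed to int.
    static int sumForEach(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Approximately what the loop above is translated into for an Iterable,
    // reusing java.util.Iterator.
    static int sumDesugared(List<Integer> values) {
        int sum = 0;
        for (Iterator<Integer> it = values.iterator(); it.hasNext(); ) {
            int v = it.next().intValue();
            sum += v;
        }
        return sum;
    }

    // Autoboxing meets method resolution: List declares both remove(int index)
    // and remove(Object o). An int argument selects the index-based overload
    // without boxing; an Integer argument selects the element-based overload.
    static List<Integer> removeDemo() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(10, 20, 30));
        list.remove(1);                   // removes the element at index 1 (the value 20)
        list.remove(Integer.valueOf(10)); // removes the element equal to 10
        return list;                      // only 30 remains
    }

    public static void main(String[] args) {
        System.out.println(THREE);
        System.out.println(Double.toHexString(3.0));
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println(sumForEach(values) == sumDesugared(values));
        System.out.println(removeDemo());
    }
}
```

The remove example shows why the complications of autoboxing are described as coming from interactions: either overload is a plausible reading of list.remove(1) once boxing exists, and the resolution rules had to keep choosing remove(int) so that existing code would keep its meaning.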

Some examples of bigger-than-small language changes that have been discussed in the community include:

  • BGGA closures: Independent of the technical merit of the proposal, BGGA closures would be a large change to the language and platform.

  • Properties: While a detailed judgment would have to be made against a specific proposal, properties, as a new kind of type, would most likely be at least medium-sized.

  • Reification: The addition of information about the type parameters of objects at runtime would involve language changes, require nontrivial JVM changes to maintain efficiency, and raise compatibility issues.
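For readers unfamiliar with what the reification item refers to, the following small sketch shows the current, erased behavior: type arguments are not recorded on objects at runtime, so two differently parameterized lists share a single runtime class. The class and method names are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class illustrating erased (non-reified) generics.
class ErasureDemo {
    // Returns true because both lists have the same runtime class;
    // the String and Integer type arguments exist only at compile time.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> integers = new ArrayList<Integer>();
        return strings.getClass() == integers.getClass();
    }

    // The runtime class name carries no type argument information.
    static String runtimeClassName() {
        List<String> strings = new ArrayList<String>();
        return strings.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(runtimeClassName());
    }
}
```

A reified design would need each object to carry its type arguments in some form, which is where the JVM and compatibility costs mentioned above would come from.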

Specific small language changes we at Sun are advocating for JDK 7 will be discussed in the near future.

Comments:

What about import aliases? Do you think they could be added to Java 7? It would be such a great addition.

Posted by Marek Lewczuk on December 11, 2008 at 06:23 PM PST #

@Joseph,
You forgot another small change included in JDK 5:
.class on a non-primitive type is compiled using a variant of LDC.

@Marek, I'm against aliases (of any kind) because
they would create as many dialects as developers.

Rémi

Posted by Rémi Forax on December 13, 2008 at 01:29 AM PST #

I agree with Remi about aliases.

Posted by Hervé on December 13, 2008 at 10:20 PM PST #

@Remi,

The syntax for class literals goes back long before JDK 5, to at least 1.3. It is true that the class file idiom javac uses to compile down those constructs did change for target 1.5 or higher because of the expanded allowable arguments to LDC. While that does have slightly different runtime semantics than the old idiom, see Sun bug 4993813 "(reflect) Need a way to force a class to be initialized," I view this difference as down in the "maintenance, tiny" area and not a full-fledged language change. Likewise, over the years different idioms have been used to translate inner classes.

Posted by Joe Darcy on December 15, 2008 at 03:00 AM PST #

Hi Joe,
Class literals were introduced in 1.1;
see http://java.sun.com/docs/books/jls/first_edition/html/1.1Update.html
section D.7.3
(I am currently trying to write a LR grammar of the JLS with version information)

This change, unlike the way accessors for inner class members are generated, requires a change to the compiler and a change to the VM spec, so it's a small change, not a tiny one, because it requires coordination between compilers and VMs.

Rémi

Posted by Rémi Forax on December 15, 2008 at 04:41 PM PST #

@Remi,

There are a few contracts around the Java Language Specification and Java compilers:

1) The contract between the language specification and the programmer (what is compiled and what it means)

2) The contract between the compiler and the JVM (how something is compiled)

3) Contracts the compiler keeps with itself

Changes in 1) are primarily what I consider language changes since they directly affect developers and what programs are accepted by the compiler. The language changes I've been writing about recently are all of this kind.

As you point out, over time there are also changes of the second kind; one example is using the LDC instruction to load class literals, another is the tightened naming requirements of local and anonymous classes as of JLSv3:
"13.1 The Form of a Binary"
http://java.sun.com/docs/books/jls/third_edition/html/binaryComp.html#13.1

While such changes certainly impact compilers and similar tools that process source and class files, they don't necessarily have much direct impact on developers. Rather than being language changes, these are more so changes to the "compiler specification," a largely implicit specification with room for different approaches.

The third kind of contract is negotiated and kept within a compiler implementation team.

Posted by Joe Darcy on December 16, 2008 at 02:02 AM PST #

I recently read http://hamletdarcy.blogspot.com/2008/12/java-7-update-from-mark-reinhold-at.html and your blog, and I'm wondering why you are keeping such a strong focus on the contract between the compiler and the JVM. Why keep this focus instead of providing a means to run different versions side by side, as .NET does?
I would bet that the changes for generics in Java 5 were huge because of the compatibility issues, and the result wasn't a massive improvement to the language itself; the compiler hardly infers any types.

Inferring the types isn't groundbreaking science, and it looks like it will take two major versions until something decent is available.

Now, it sounds like even properties might not make it, and we will be doomed to clutter the code with getters and setters for many years to come.

Posted by Erik Putrycz on December 16, 2008 at 04:29 AM PST #

@Erik,

A few comments. One reason the compiler-JVM contract has been mostly stable is that it is considerably more work to coordinate changes on both sides of the contract than on a single side. However, John Rose's Da Vinci Machine project (http://openjdk.java.net/projects/mlvm/) is exploring adding exciting new functionality to the JVM, primarily to support non-Java languages. Even without those additions, many non-Java languages currently use the JVM as an execution environment.

The JSR 14 effort to add generics to the language came with a constraint of migration compatibility, that is, the ability to not require libraries and their clients to be converted to generics all at the same time. For many years, no one knew how to have a system with reified generics that met the migration compatibility constraint. The people working on that effort were certainly aware of reification and its benefits, but meeting the constraints was an open problem and candidate solutions were not developed until it was too late to incorporate them in JDK 5. At the recent JVM Language Summit, a number of speakers actually said the erased generics at the JVM level made targeting the JVM from other languages easier.

As adding properties to Java would be a bigger-than-small change, they will not be included as part of the small language effort in JDK 7. However, for graphical applications a new JVM-targeting language does have properties amongst other new features (http://www.javafx.org/).

Posted by Joe Darcy on December 17, 2008 at 07:54 AM PST #
