Musings on JDK development

  • Java
    April 17, 2008

Kinds of Compatibility: Source, Binary, and Behavioral

Every change is an incompatible change. A
risk/benefit analysis is always required.

—Martin Buchholz

Veteran JDK Engineer

When evolving the JDK, compatibility concerns
are taken very seriously.
However, different standards are applied to evolving various aspects
of the platform. From a certain point of view, it is true that any
observable difference could potentially cause some unknown
application to break. Indeed, just changing the reported version
number is incompatible in this sense because, for example, a JNLP file
can refuse to run an application on later versions of the platform.
Therefore, since not making any changes at all is clearly not viable
for evolving the platform, changes need to be evaluated against and
managed according to a variety of compatibility contracts.

For Java programs, there are three main categories of compatibility:

  1. Source: Source compatibility concerns translating Java
    source code into class files.

  2. Binary: Binary compatibility is defined in The Java Language
    Specification (JLSv3 13.2, "What Binary Compatibility Is and Is
    Not") as preserving the ability to link without error.

  3. Behavioral: Behavioral compatibility includes the
    semantics of the code that is executed at runtime.

Note that non-source compatibility is sometimes colloquially referred
to as "binary compatibility." Such usage is incorrect since the JLS
spends an entire chapter precisely defining the term binary
compatibility; often behavioral compatibility is the intended notion
instead.

There are many other observable aspects of the JDK not related to Java
programs, such as file layout, etc. Those will not be further
discussed in this note.

The basic challenge of compatibility is the difficulty of finding and
modifying all the software and systems impacted by a change. In a
closed-world scenario where all the clients of an API are known and
can in principle be simultaneously changed, introducing "incompatible"
changes is just a matter of being able to coordinate the engineering
necessary to evaporate the liquid in a small body of water, perhaps only
a puddle or pot on a stove. In contrast, for APIs that are used as
widely as the JDK, rigorously finding all the possible programs
impacted by an incompatible change is as impractical as boiling the
oceans, so evolving such APIs is quite constrained by comparison.

Generally, we will consider whether a program P is compatible
in some fashion (or not) with respect to two versions of a library
L1 and L2 that differ in some way.
(We will not consider the compatibility impact of such changes to
independent implementers of L.)
Sometimes only a particular program is of interest; is the change from
L1 to L2 compatible with
this program? When evaluating how the platform should evolve,
a broader consideration of the programs of concern is used. For
example, does the change from L1 to
L2 cause a problem for any program that
currently exists? If so, what fraction of existing programs is
affected? Finally, the broadest consideration is whether the change
affects any program that could exist. Often once a platform
version is released, the latter two notions are similar because
imperfect knowledge about the set of actual programs means it can be
more tractable to consider the worst possible outcome for any
potential program rather than estimate the impact over actual
programs. Stated more formally, depending on the change being
considered, judging the change based on the worst possible outcome for
any program is more appropriate than judging based on some other kind
of norm of the disruption
over the space of known programs.

Generally each kind of compatibility has both positive and negative
aspects; that is, the positive aspect of keeping things that "work"
working and the negative aspect of keeping things that "don't work"
not working. For
example, the TCK tests for Java compilers include both positive tests
of programs that must be accepted and negative tests of programs that
must be rejected.
In many circumstances, preserving or expanding the positive behavior
is more acceptable and important than maintaining the negative
behavior and we will focus on positive compatibility in this entry.

In terms of relative severity, source compatibility problems are
usually the mildest since there are often straightforward workarounds,
such as adjusting import statements or switching to fully qualified
names. Gradations of source compatibility are identified and
discussed below. Behavioral compatibility problems can have a range of
impacts while true binary compatibility issues are problematic since
linking is prevented.

Source Compatibility

The basic job of any linker or loader is simple: It binds more
abstract names to more concrete names, which permits programmers to
write code using the more abstract names. (Linkers and Loaders,
http://linker.iecc.com/)

A Java compiler's job also includes mapping more abstract names to
more concrete ones, specifically mapping simple and qualified names
appearing in source code into binary names in class files.
Source compatibility concerns this mapping of source code into class
files, not only whether or not such a mapping is possible, but also
whether or not the resulting class files are suitable. Source
compatibility is influenced by changing the set of types available
during compilation, such as adding a new class, as well as changes
within existing types themselves, such as adding an overloaded method.
The JLS examines a large set of possible changes to classes
(JLSv3 13.4) and interfaces (JLSv3 13.5) for
their binary compatibility impact. All these changes could also be
classified according to their source compatibility repercussions, but
only a few kinds of changes will be analyzed below.

The most rudimentary kind of positive source compatibility is whether
code that compiles against L1 will continue to
compile against L2; however, that is not the
entirety of the space of concerns since the class file resulting from
compilation might not be equivalent.
Java source code often uses simple names for types;
using information about imports, the compiler will interpret
these simple names (JLSv3 6.5) and transform them into
binary names (JLSv3 13.1) for use in
the resulting class file(s). In a class file, the binary name of an
entity (along with its signature in the case of methods and
constructors) serves as the unique, universal identifier to allow the
entity to be referenced.
So different degrees of source compatibility can be identified:

  • Does the code still compile (or not compile)?

  • If the code still compiles, do all the names resolve to the
    same binary names in the class file?

  • If the code still compiles and the names do not all
    resolve to the same binary names, does a behaviorally
    equivalent class file result?
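As a small illustration of the gap between the names written in source and the binary names used for linking, the following sketch prints both forms for a nested JDK type. The class name NameForms is hypothetical; Class.getName and Class.getCanonicalName are the standard reflection methods that report the two forms.

```java
import java.util.Map;

// Hypothetical demo class, introduced only for this illustration.
class NameForms {
    // Binary name (JLS 13.1), the form used to identify the type at link time.
    static String binaryName(Class<?> type) {
        return type.getName();
    }

    // Canonical name, the form a programmer would write in source code.
    static String sourceName(Class<?> type) {
        return type.getCanonicalName();
    }

    public static void main(String[] args) {
        // Source code refers to the nested interface with a '.' separator ...
        System.out.println(sourceName(Map.Entry.class)); // java.util.Map.Entry
        // ... while its binary name separates the nested type with a '$'.
        System.out.println(binaryName(Map.Entry.class)); // java.util.Map$Entry
    }
}
```

Two compilations of the same source are binary-preserving, in the sense used in this entry, when every name in the source maps to the same binary name both times.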

Whether or not a program is valid can also be affected by language
changes. Usually previously invalid programs are made valid, as when
generics were added, but sometimes existing programs are rendered
invalid, as when keywords were added (strictfp, assert (JSR 41), and
enum (JSR 201)).
The version number of the resulting class file is also an external
compatibility issue of sorts since that affects which platform
versions the code can be run on.

Full source compatibility with any existing program is usually
not achievable because of * imports. For example,
consider L1 with packages foo and
bar where foo includes the class
Quux. Then L2 adds class
bar.Quux. This program

    import foo.*;
    import bar.*;

    public class HelloQuux {
        public static void main(String... args) {
            Object o = Quux.class;
            System.out.println("Hello " + o.toString());
        }
    }

will compile under L1 but not under
L2 since the name "Quux" is now
ambiguous as reported by javac:

HelloQuux.java:6: reference to Quux is ambiguous, both class bar.Quux in bar and
class foo.Quux in foo match
Object o = Quux.class;
1 error

An adversarial program could almost always include *
imports that conflict with a given library.[1] Therefore, judging source compatibility
by requiring all possible programs to compile is an overly
restrictive criterion. However, when naming their types, API
designers should not reuse "String",
"Object", and other names of core classes from packages like
java.lang and java.util to avoid this kind of
annoying name conflict.

Due to the * import wrinkle, a more reasonable definition
of source compatibility considers programs transformed to only use
fully qualified names (JLSv3 6.7). Let FQN(P, L) be program P
where each name is replaced by its fully qualified form in the context
of libraries L. Call a library transformation from
L1 to L2 binary-preserving source compatible
with source program P if
FQN(P, L1) equals FQN(P, L2). This is a strict form of source
compatibility that will usually result in class files for P
using the same binary names when compiled against both versions of the
library. Class files with the same binary names will result when each
type has a distinct fully qualified name. Multiple types can have the
same fully qualified name but differing binary names; those cases do
not arise when the standard naming conventions are being followed.
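The same wrinkle can be seen inside the JDK itself: java.lang.reflect and java.net each declare a type named Proxy, so a program importing both packages on demand cannot use the simple name. The sketch below, with the hypothetical class name FqnDemo, shows what a program already in "FQN form" looks like; its meaning does not depend on which other types the two packages happen to contain.

```java
import java.lang.reflect.*;
import java.net.*;

// Hypothetical demo class illustrating a program written with fully qualified names.
class FqnDemo {
    // Proxy p = Proxy.NO_PROXY;   // would not compile: reference to Proxy is ambiguous

    // A fully qualified name picks out one type regardless of the * imports above.
    static java.net.Proxy.Type directType() {
        return java.net.Proxy.NO_PROXY.type();
    }

    static boolean isDynamicProxy(Class<?> type) {
        return java.lang.reflect.Proxy.isProxyClass(type);
    }

    public static void main(String[] args) {
        System.out.println(directType());                 // DIRECT
        System.out.println(isDynamicProxy(String.class)); // false
    }
}
```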

Adding overloaded methods has the potential to change method
resolution and thus change the signatures of the method call sites in
the resulting class file. Whether or not such a change is problematic
with respect to source compatibility depends on what semantics are
required and how the different overloaded methods operate on the same
inputs, which interacts with behavioral equivalence notions. Assume
class C originally has a method void m(T t)
and then an overload void m(S s) is added. Some cases
of interest include:

  • S and T are both reference types:

    • If there is no typing relationship between S and
      T, overload resolution will not be affected.

    • If there is a typing relationship between S and
      T, such as S being a subtype of T, call sites in existing
      source whose arguments have type S may now resolve to the new
      method. Well-written programs will follow the Liskov substitution
      principle and C will do "the same" operation on the
      argument no matter which overloaded method is called. Less than
      well-written programs may fail to follow this principle.

  • S and T are both primitive types:
    By extension, if a numerical value can be represented in
    multiple primitive types, overloaded methods taking a type with that
    value should usually perform an equivalent operation. However, the
    silent loss of precision in primitive widening conversion can affect
    the actual value that gets passed to an overloaded method.

    Concretely, consider class C with methods
    m(int) and m(double). The call site
    "m(123L)" will undergo primitive widening conversion,
    converting the argument value to double before
    m(double) is called. Now if m(long) is
    added to C, the call site will resolve to the new method.
    Even assuming each m method does an equivalent operation
    when passed a numerically equal value, there can still be differences
    after the third method is added since some long values
    lose precision when converted to double, for example,
    Long.MAX_VALUE. Therefore, a client when compiled
    against the two versions of C can have different runtime
    behavior even if each m method behaves reasonably.
    This kind of subtle change in overloading behavior occurred with the
    addition of a BigDecimal constructor taking a long
    as part of JSR 13 (Decimal Arithmetic Enhancement).

  • One of S and T is a reference
    type, the other is primitive:
    Before generics were added to the
    language, two methods which differed in the primitive/reference status
    of the ith parameter could not
    possibly be applicable to the same arguments. But, along with
    generics came boxing (JLSv3 5.1.7) and unboxing (JLSv3 5.1.8)
    conversions that
    can map, for example, a value of an int primitive type to a
    java.lang.Integer object with a reference type, and vice
    versa. These mappings have the potential to introduce ambiguities in
    method resolution such that adding a method could introduce an
    ambiguity that prevented previously valid code from compiling;
    however, the rules for method invocation expressions (JLSv3 15.12.2)
    were updated to avoid such potential ambiguities from
    boxing/unboxing as well as var-args.
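The primitive widening case can be made concrete with a small sketch. CBefore and CAfter are hypothetical stand-ins for the two versions of class C, and each m method simply reports which overload ran and the value it received; the argument 2^53 + 1 is chosen because it cannot be represented exactly as a double.

```java
// Hypothetical stand-in for C before m(long) is added.
class CBefore {
    static String m(int i)    { return "m(int) saw " + i; }
    static String m(double d) { return "m(double) saw " + (long) d; }
}

// Hypothetical stand-in for C after m(long) is added.
class CAfter {
    static String m(int i)    { return "m(int) saw " + i; }
    static String m(double d) { return "m(double) saw " + (long) d; }
    static String m(long l)   { return "m(long) saw " + l; } // newly added overload
}

class WideningDemo {
    public static void main(String[] args) {
        long value = 9007199254740993L;       // 2^53 + 1
        // Only m(double) applies to a long, so the argument is widened and rounded.
        System.out.println(CBefore.m(value)); // m(double) saw 9007199254740992
        // The same call site now binds to m(long) and the value arrives intact.
        System.out.println(CAfter.m(value));  // m(long) saw 9007199254740993
    }
}
```

The source text of the call is identical in both cases; only recompiling against the newer library changes which method descriptor ends up in the client's class file.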

If a new method cannot change resolution, then it is a
binary-preserving source transformation. If a new method can change
resolution but the different class file that results has acceptably
similar behavior, the change may still be acceptable, while changing
resolution in a way that does not preserve semantics is
likely problematic. Changing a library in such a way that current
clients no longer compile is seldom appropriate.
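A minimal sketch of a resolution change between reference types, assuming the added parameter type (String) is a subtype of the existing one (Object). LibBefore and LibAfter are hypothetical names for the two library versions, and each method just reports which overload was selected.

```java
// Hypothetical library before the overload is added.
class LibBefore {
    static String m(Object t) { return "m(Object)"; }
}

// Hypothetical library after a more specific overload is added.
class LibAfter {
    static String m(Object t) { return "m(Object)"; }
    static String m(String s) { return "m(String)"; } // newly added overload
}

class OverloadDemo {
    public static void main(String[] args) {
        // The same source-level call binds to different methods at compile time.
        System.out.println(LibBefore.m("hello"));         // m(Object)
        System.out.println(LibAfter.m("hello"));          // m(String)
        // A cast to the original parameter type keeps the old binding.
        System.out.println(LibAfter.m((Object) "hello")); // m(Object)
    }
}
```

Whether the new binding is acceptable depends on whether the two overloads do "the same" thing with a String argument, which is a behavioral question rather than a purely source-level one.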

[Figure: Source compatibility levels of FQN programs]

Binary Compatibility

JLSv3 §13.2
What Binary Compatibility Is and Is Not

A change to a type is binary compatible with (equivalently,
does not break binary compatibility with) preexisting binaries
if preexisting binaries that previously linked without error will
continue to link without error.

The JLS defines binary compatibility strictly according to linkage; if
P links with L1 and continues to link with
L2, the change made in L2 is
binary compatible. The runtime behavior after linking is not
included in binary compatibility:

JLSv3 13.4.22 Method and Constructor Body

Changes to the body of a method or constructor do not break [binary]
compatibility with pre-existing binaries.

As an extreme example, if the body of a method is changed to throw an
error instead of compute a useful result, while the change is
certainly a compatibility issue, it is not a binary
compatibility issue since client classes would continue to link.
Also, it is not a binary
compatibility issue to add methods to an interface (JLSv3 13.5.3).
Class files
compiled against the old version of the interface will still link
against the new interface despite the class not having an
implementation of the new method. If the new method is called at
runtime, an AbstractMethodError is thrown; if
the new method is not called, the existing methods can be used without
incident. (Adding a method to an interface is a source
incompatibility that can break compilation though.)

A design requirement from the addition of generics via JSR 14 was
migration compatibility. Migration compatibility requires that
a library can be generified and existing (nongeneric) clients can
continue to compile and link against the generic version.
Meeting this constraint led to the use of erasure, a
controversial aspect of the generics design. During JSR 14, it
was not known how to add generics in a way that supported both
reification and migration compatibility; future work might address this
issue (Sun bug 5098163).
Behavioral Compatibility

Intuitively, behavioral compatibility should mean that with the same
inputs program P does "the same" or an "equivalent" operation
under different versions of libraries or the platform. Defining
equivalence can be a bit involved; for example, even just defining a
proper equals method in a class can be nontrivial. In this
case, to formalize this concept would require an operational
semantics for the JVM for the aspects of the system a program
was interested in. For example, there is a fundamental difference in
visible changes between programs that introspect on the system and
those that do not. Examples of introspection include calling core
reflection, relying on stack trace output, using timing measurements
to influence code execution, and so on. For programs that do not use,
say, core reflection, changes to the structure of libraries, such as
adding new public methods, are entirely transparent. In
contrast, a (poorly behaved) program could use reflection to look up
the set of public methods on a library class and throw an
exception if any unexpected methods were present. A tricky program
could even make decisions based on information like a timing side
channel. For example, two threads could repeatedly run different
operations and make some indication of progress, for example,
incrementing an atomic counter (AtomicInteger.incrementAndGet), and
the relative rates of progress could be compared. If the ratio is
over a certain threshold, some unrelated action could be taken, or
not. This allows a program to create a dependence on the optimization
capabilities of a particular JVM, which is generally outside a
reasonable behavioral compatibility contract.
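To make the reflective case concrete, here is a sketch of the kind of poorly behaved client described above. LibraryV1, LibraryV2, and BrittleClient are hypothetical names; the client accepts a library class only when its set of public methods is exactly the one it was written against.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical library class, first version.
class LibraryV1 {
    public void open() {}
    public void close() {}
}

// Hypothetical library class after a compatible-looking addition.
class LibraryV2 {
    public void open() {}
    public void close() {}
    public void reset() {} // newly added public method
}

// A poorly behaved client that introspects on the library's structure.
class BrittleClient {
    static final Set<String> EXPECTED = new TreeSet<String>(Arrays.asList("close", "open"));

    // Collects the names of the public methods declared directly in the class.
    static Set<String> publicMethodNames(Class<?> type) {
        Set<String> names = new TreeSet<String>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    // True only if the class has exactly the expected public methods.
    static boolean accepts(Class<?> type) {
        return publicMethodNames(type).equals(EXPECTED);
    }
}
```

A client that only calls open and close works with both versions; the introspecting client rejects the second one even though nothing it uses has changed.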

The evolution of a library is constrained by the library's contract
included in its specification; for final classes this
contract doesn't usually include a prohibition of adding new public
methods! While an end-user may not care why a program does not work
with a newer version of a library, what contracts are being followed
or broken should determine which party has the onus for fixing the
problem. That said, there are times in evolving the JDK when
differences are found between the specified behavior and the actual
behavior (for example, Sun bugs 4707389, "{Float, Double}.valueOf
erroneously accepts integer strings," and 6365176,
"java.math.BigInteger.ZERO.multiply(null)"). The two basic
approaches to fixing these bugs are to change the implementation to
match the specified behavior or to change the specification (in a
platform release) to match the implementation's (perhaps
long-standing) behavior; often the latter option is chosen since it
has a lower de facto impact on behavioral compatibility.

Case Study

Consider two versions of a simple enum representing the crew of the
USS Enterprise, one for the first season:

    public enum StarTrekCast {
        // ... (other crew constants elided)
        JANICE_RAND("Yeoman Rand"),
        UHURA("Uhura"); // Any first name for Uhura is non-canon.
        private String nickname;
        StarTrekCast(String nickname) {
            this.nickname = nickname;
        }
        public String nickname() { return nickname; }
    }

and another for the second season:

    public enum StarTrekCast {
        // ... (other crew constants elided)
        /* JANICE_RAND("Yeoman Rand"), */ // Only in 8 episodes!
        PAVEL_CHEKOV("Chekov"), // Introduced in season 2.
        UHURA("Uhura"); // Any first name for Uhura is non-canon.
        private String nickname;
        StarTrekCast(String nickname) {
            this.nickname = nickname;
        }
        public String nickname() { return nickname; }
    }

Compared to the first season, the second season:

  1. Deletes yeoman Janice Rand

  2. Adds Pavel Chekov

  3. Reorders Bones, Scotty, and Spock to better reflect the order of
    who commands the ship if the Captain and others are unavailable.

These changes have varying source, binary, and behavioral compatibility
impacts:

  1. Deleting JANICE_RAND is source incompatible, able to break
    compilations. The deletion is also binary incompatible. Besides
    being observable via reflection, the deletion affects the behavior of
    various built-in methods on the enum, including values and
    valueOf. In addition, the deletion will break previously
    serialized streams with this constant.

  2. Adding PAVEL_CHEKOV is binary-preserving source compatible.
    Likewise, the addition of a new public static final field is binary
    compatible. However, the addition of a new constant is visible to
    reflection and alters the behavior of built-in enum methods. Existing
    serialized instances continue to work after a new constant is added.

  3. Reordering McCoy, Scotty, and Spock is a binary-preserving
    source compatible and binary compatible change, but the reordering
    changes the behavior of built-in methods, most notably those that
    depend on declaration order, such as ordinal and compareTo.
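The behavioral effects of these changes can be sketched with two trimmed-down enums. CrewSeason1 and CrewSeason2 are hypothetical stand-ins for the two versions of StarTrekCast (two distinct names are needed to put both in one file), and SCOTTY and SPOCK are assumed constant names used only for illustration.

```java
// Hypothetical, simplified stand-ins for the two versions of the enum.
enum CrewSeason1 { JANICE_RAND, SCOTTY, SPOCK, UHURA }
enum CrewSeason2 { PAVEL_CHEKOV, SPOCK, SCOTTY, UHURA }

class EnumEvolutionDemo {
    // True if valueOf can find the named constant in the given enum type.
    static <E extends Enum<E>> boolean hasConstant(Class<E> type, String name) {
        try {
            Enum.valueOf(type, name);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // valueOf rejects names that are no longer declared
        }
    }

    public static void main(String[] args) {
        // Deletion: a name that used to resolve no longer does.
        System.out.println(hasConstant(CrewSeason1.class, "JANICE_RAND")); // true
        System.out.println(hasConstant(CrewSeason2.class, "JANICE_RAND")); // false
        // Reordering: compareTo follows declaration order, so the result flips.
        System.out.println(CrewSeason1.SPOCK.compareTo(CrewSeason1.SCOTTY) > 0); // true
        System.out.println(CrewSeason2.SPOCK.compareTo(CrewSeason2.SCOTTY) > 0); // false
    }
}
```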

JDK Platform and Update Release Compatibility Policies

The compatibility policies
we apply to platform releases, like JDK 7, differ from
those applied to maintenance and update releases, like JDK 6 updates.
For both kinds of releases, binary compatibility must be maintained
for JCP-managed APIs. Update releases must maintain source
compatibility, but platform releases are able to break source
compatibility given sufficient justification. In update releases,
behavioral compatibility is regarded as very important; programs may
be relying on specified-to-be-unspecified behavior of a particular
implementation and switching to another update in the same release
family should be seamless whenever possible. In contrast, platform releases have fewer
restrictions on changing such behavior. So, for example, modifying the
order of iteration of elements in a HashMap to allow faster
hashing algorithms would be quite appropriate for a platform release
(the HashMap specification says: "This class makes no
guarantees as to the order of the map; in particular, it does not
guarantee that the order will remain constant over time."), but
would be much less suited to an update release.
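From the client side, tolerating this kind of change means not depending on the unspecified order in the first place. The sketch below uses the hypothetical class name IterationOrderDemo to contrast a method whose result order is whatever HashMap happens to produce with one that imposes an explicit order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class contrasting brittle and robust use of a HashMap.
class IterationOrderDemo {
    // Brittle: the list order is whatever this HashMap implementation yields.
    static List<String> keysInMapOrder(Map<String, Integer> map) {
        return new ArrayList<String>(map.keySet());
    }

    // Robust: the order is imposed by the caller, not by the implementation.
    static List<String> keysSorted(Map<String, Integer> map) {
        List<String> keys = new ArrayList<String>(map.keySet());
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Integer> ranks = new HashMap<String, Integer>();
        ranks.put("uhura", 3);
        ranks.put("kirk", 1);
        ranks.put("spock", 2);
        System.out.println(keysSorted(ranks)); // [kirk, spock, uhura]
    }
}
```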

Managing Compatibility

Original Preface to JLS

Except for timing dependencies or other non-determinisms and given
sufficient time and sufficient memory space, a program written in the
Java programming language should compute the same result on all
machines and in all implementations.

The above statement from the original JLS could be regarded as
vacuously true about any platform: except for the non-determinisms, a
program is deterministic. The difference was that in Java, with
programmer discipline, the set of deterministic programs was nontrivial
and the set of predictable programs was quite large. In other
words, the platform provider and the programmer both have
responsibilities in making programs portable in practice; the platform
should abide by the specification and conversely programs should
tolerate any valid implementation of the specification.

To make continued evolution of the platform more tractable, it may be
helpful to introduce more structured ways of tracking behavioral
changes so that programs could in principle be audited for depending
on aspects of the platform in ways that are not recommended. For
example, potentially annotations could be used to:

  1. Mark classes and methods whose specification has changed in a
    release (analogous to change bars in a written document).

  2. Record stability information about a method's contract
    (deterministic, non-deterministic, volatile (expected to change over
    time), etc.); for example, whether the hashCode of a class is
    specified to return particular values or just obey the general
    contract of Object.hashCode.

  3. Using com.sun.* annotations, annotate constructs whose
    implementations we have changed in our specific implementation in a
    particular release, such as HashMap ordering.

Annotation processing is a general purpose meta-programming framework,
standardized as part of the platform
as of JDK 6. Annotation processors, probably also using
the javac tree API, could be written to check for
usage of changed or problematic APIs in source code. The D compiler in DTrace can enforce analogous limits on the stability levels and dependency classes of D scripts.
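As a rough sketch of the first idea, consider a marker annotation and a check that finds marked methods. Everything here is hypothetical: SpecChanged, SampleLibrary, and SpecAudit do not exist in the JDK, and a runtime reflective scan is used as a simpler stand-in for the compile-time annotation processor the text has in mind.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical annotation marking a specification change in a given release.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface SpecChanged {
    String release();
    String note() default "";
}

// Hypothetical library class with one method whose contract changed.
class SampleLibrary {
    @SpecChanged(release = "7", note = "iteration order changed")
    public void iterate() {}

    public void size() {}
}

class SpecAudit {
    // Returns the sorted names of methods marked as changed in the given release.
    static List<String> changedIn(Class<?> type, String release) {
        List<String> names = new ArrayList<String>();
        for (Method m : type.getDeclaredMethods()) {
            SpecChanged mark = m.getAnnotation(SpecChanged.class);
            if (mark != null && mark.release().equals(release)) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

A real tool would more likely inspect client source or class files for uses of the marked methods, rather than listing the marks themselves.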

While there would be considerable cost and complication to designing
such a scheme and retrofitting it onto at least a subset of the JDK,
the ability to define and then programmatically test policies for
behavioral compatibility issues could enable platform providers and
programmers to have a smoother joint stewardship of keeping
applications running and Java usage growing.


Compatibility is a multifaceted concept, with nuances within each
broad category. In the future, annotation processors or other
program analyzers might help manage source, binary, and behavioral
compatibility by direct analysis or program markup.


Éamonn McManus gave useful feedback on a draft of this entry.


[1] There are some cases where such an adversarial program could be
thwarted in practice. For example, when the Unicode version supported
by the JDK platform is upgraded, previously illegal identifier strings are
often allowed. A new JDK platform class could use the newly valid
names not open to preexisting malicious clients, although new
adversaries could afterward use the new name. This assumes the
compatibility threat model only includes class files generated from
Java sources. As of class file version 49.0 for JDK 5 and later, at
the JVM level many more identifiers are legal than those accepted in
Java source.

Even code that always uses fully qualified names is not completely
immune from ambiguities and unintended (or malicious) changes in the
meaning of names stemming from changes in the library environment
since distinct types can have the same fully qualified name. For
example, the type name "a.b.C" could refer to:

  • class C in package a.b:

    package a.b;
    public class C {}

  • class C nested inside class b where
    class b is a member of package a:

    package a;
    public class b {
        public static class C {}
    }

  • class C nested inside class b which
    is in turn nested inside class a where class
    a is a member of an unnamed package (unnamed packages are not
    Immortal; see JLSv3 7.4.2):

    public class a {
        public static class b {
            public static class C {}
        }
    }

These three classes cannot all be compiled together ("package a.b
clashes with class of same name"); however, they can be compiled
separately to the same output location and so can all appear on a
classpath when another file is compiled. If all three are on the
classpath together, when other code is compiled the qualified name
"a.b.C" resolves to the doubly-nested class C in an
unnamed package.

To avoid such name collisions, binary names use "$"
instead of "." to separate the name of an enclosing class
from a nested class, leading to the distinct binary names
"a.b.C", "a.b$C", and "a$b$C",
respectively, for the classes in question. Following the recommended
naming conventions (JLSv3 6.8) avoids
such name clashes. Therefore, such name clashes should be rare in
practice when compiling against libraries following the conventions,
as JCP moderated java.* and javax.* APIs
should do. As an extreme case, do not write this program:

    public class java {
        public static class lang {
            public static class String {
                String(Object o){}
            }
        }

        public static void main(String... args) {
            java.lang.String s =
                new java.lang.String("I don't think this means " +
                                     "what you think it means.");
            if (!s.getClass().getName().equals("java.lang.String"))
                // Illustrative body: reached because s is a java$lang$String.
                throw new RuntimeException(s.getClass().getName());
        }
    }


In this perverse example, the nested class java.lang obscures
(JLSv3 6.3.2) the venerable java.lang package and the local
java.lang.String declaration shadows (JLSv3 6.3.1) the
standard java.lang.String.

Further Reading

  1. Evolving Java-based APIs, by Jim des Rivières,
    http://wiki.eclipse.org/index.php/Evolving_Java-based_APIs

  2. Different kinds of compatibility, by Alex Buckley,
    http://blogs.sun.com/abuckley/entry/on_compatibility
