Naming the null type

Everyone knows that java.lang.Object is the common superclass of all Java classes. It is also the common supertype of all interfaces, which do not 'extend' Object but do support the Object protocol. This makes it the Top type, useful for programming generic algorithms.

Top represents all values in a programming language. It ensures that the type hierarchy is a complete partial order by providing an upper bound for every pair of types. Computing the upper bound of types is what makes assignment and method call work (via widening reference conversion), so a well-founded type hierarchy is important.

(Ignore that the complete partial order for primitive types is distinct from the complete partial order for reference types. Sigh.)
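A minimal sketch of Object acting as the upper bound, using only java.lang types. The class name TopTypeDemo and the describe method are made up for illustration.

```java
class TopTypeDemo {
    // Any reference can be passed here via widening reference conversion to Object.
    static String describe(Object value) {
        return value == null ? "null" : value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        // An interface type does not 'extend' Object, yet it supports the Object protocol.
        CharSequence cs = "text";
        Object top = cs; // widening reference conversion, no cast needed
        System.out.println(cs.hashCode() == top.hashCode()); // prints "true"

        // Class types, boxed primitives and arrays all have Object as an upper bound.
        System.out.println(describe(top) + " " + describe(42) + " " + describe(new int[0]));
        // prints "String Integer int[]"
    }
}
```

The int passed to describe is boxed to Integer first, which is the reference-type side of the split mentioned in the aside above.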

The counterpart to Top is Bottom, a type that is the common subtype of all other types. Bottom makes the type hierarchy into a lattice because it ensures every pair of types has a lower bound. Lower bounds play a role in Java wildcards - specifically, capture conversion and type inference - so it could be useful to know that every type has a lower bound.
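For readers who have not met lower bounds in practice, here is a small sketch of a lower-bounded wildcard. LowerBoundDemo and its methods are illustrative names, not anything from the JLS.

```java
import java.util.ArrayList;
import java.util.List;

class LowerBoundDemo {
    // Integer is the lower bound: the list's element type is Integer or some supertype of it.
    static void fill(List<? super Integer> sink) {
        sink.add(1); // adding an Integer is always safe
        sink.add(2);
    }

    // Reading is only safe at type Object, the upper bound of whatever the wildcard stands for.
    static Object first(List<? super Integer> source) {
        return source.get(0);
    }

    public static void main(String[] args) {
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();
        fill(numbers);
        fill(objects);
        System.out.println(numbers + " " + first(objects)); // prints "[1, 2] 1"
    }
}
```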

Java has the null type. Pre-JLS3, the null type was not officially the subtype of any type, and the null reference was not officially a value of any type except the null type. A decree made the null reference castable to any reference type for pragmatic reasons. (This is similar to the decree that makes List<?> assignable to a List<T> formal parameter even though List<?> is not a subtype of List<T>. You know the decree as capture conversion.) JLS3 defines the null type as a subtype of every type, so it looks an awful lot like Bottom.

(Strictly, JLS3 restricts the null type to be a subtype of every reference type. Again, just ignore primitive types.)
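A short sketch of what that subtyping buys in everyday code. The names NullAssignDemo, isString and nothing are invented for the example.

```java
import java.util.List;

class NullAssignDemo {
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // null is acceptable as a T for every reference type T, so this compiles without a cast.
    static <T> T nothing() {
        return null;
    }

    public static void main(String[] args) {
        // The null reference is assignable to class, interface and array types alike.
        String s = null;
        List<Integer> list = null;
        Runnable r = null;
        int[] array = null;
        // int i = null; // does not compile: the null type is not a subtype of a primitive type

        String viaGeneric = nothing();
        System.out.println(isString(s));   // prints "false": instanceof is false for null
        System.out.println(isString("x")); // prints "true"
    }
}
```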

The null type is expressible, i.e. can be the type of a term. The compiler will expose it if necessary, e.g. in the error it reports for int x = true?null:null;. But it is not denotable, i.e. cannot be written as the type of a term. You can't write NullType v = null;. An RFE asks for a name for the null type. Is this a good idea?

Beyond the use case in the RFE, being able to denote NullType would be useful in certain situations where type inference fails, because NullType may be a better actual type argument than Object. So that's in NullType's favor.

Bottom is usually not a denotable (or even expressible) type in textbook type systems because type rules must be special-cased to ignore it. (See Pierce 15.4, 16.4.) But in Java, the presence of a value for the null type means expression evaluation has always had to consider the null type, responding with a NullPointerException. (Indeed, the null reference means that the null type is not a true Bottom type.) Introducing NullType would allow more variables to store the null reference, but such variables evaluate to the null reference just like any variable of reference type can.

Statements would need tweaking. Consider the if statement: "The Expression must have type boolean or Boolean, or a compile-time error occurs." A type system with Bottom would allow the expression to have the Bottom type by subsumption, so traditionally an extra rule would catch that case and assign Bottom as the type of the whole statement. We just want if ([expression of null type]) ... to be illegal, so we would need an "exactly" before "type boolean or Boolean".

[The first version of this blog entry said this wasn't necessary because final types didn't have any subtypes, not even the null type. Prompted by Remi's comment, I changed my mind. While a final class has no further implementations, special subtypes are possible.]
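To see why a null condition is worth rejecting statically, consider what already happens at run time when a Boolean variable holds null. This is only a sketch of the dynamic analogue; NullConditionDemo and outcome are hypothetical names.

```java
class NullConditionDemo {
    // Reports what happens when 'flag' is used as the condition of an if statement.
    static String outcome(Boolean flag) {
        try {
            if (flag) { // unboxing conversion calls flag.booleanValue()
                return "true";
            }
            return "false";
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        // if (null) { } // rejected by javac: the null type is neither boolean nor Boolean
        System.out.println(outcome(true));  // prints "true"
        System.out.println(outcome(false)); // prints "false"
        System.out.println(outcome(null));  // prints "NullPointerException"
    }
}
```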

So, since Java already has the null reference, there is less problem adding NullType than if the null reference didn't exist.

Arrays cause a slight pain. A NullType[] can store only null values, which appears useless but someone will want it. On the face of it, we need the null type to be reified to enforce array covariance:

NullType[] n = new NullType[5];
Object[] o = n;
o[0] = new Object(); // Statically safe, dynamically unsafe - Object, a supertype of NullType, is now stored in n

To avoid reification, we could define NullType[] as a static equivalent of List<? super NullType>. Then, elements could be added to a NullType[] but not removed (except as Object). A more drastic idea is to make arrays of NullType denotable but uninstantiatable, like arrays of generic types. The value of all this is becoming questionable.
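Since NullType cannot be written today, the same covariance hazard can be sketched with String[] standing in for NullType[]. The run-time store check shown here is what reification pays for; ArrayCovarianceDemo and storeFails are illustrative names.

```java
class ArrayCovarianceDemo {
    // Attempts target[0] = value and reports whether the run-time store check rejected it.
    static boolean storeFails(Object[] target, Object value) {
        try {
            target[0] = value;
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] strings = new String[5]; // stand-in for NullType[]
        Object[] objects = strings;       // allowed: arrays are covariant

        System.out.println(storeFails(objects, new Object())); // prints "true": caught dynamically
        System.out.println(storeFails(objects, "ok"));         // prints "false"
        System.out.println(storeFails(objects, null));         // prints "false": null fits any reference array
    }
}
```

A reified NullType[] would presumably reject every store except the last one.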

Denoting the null type is less useful in Java than might be expected. Consider the classic uses of the Bottom type:


  • A return type for a function which doesn't return. Since the Bottom type is empty, the function has no possible return value. Those of you now hoping that NullType could indicate a method which tail-calls itself are out of luck, because the method could just return null;. What you need is Neal's Unreachable type, which is a true Bottom type because it's a subtype of everything and it's empty.
  • Signalling errors. Java has exceptions. Next.
  • A stand-in where no other reference type will do, as exemplified in the RFE. Here, the special subtyping properties of Bottom are less interesting than its emptiness. Neal's Void type is useful here.
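The first and third uses can be approximated in today's Java. The sketch below uses java.lang.Void, which is not Neal's Void or Unreachable type; it merely behaves similarly because null is its only value. VoidDemo, log and fail are names invented here.

```java
import java.util.concurrent.Callable;

class VoidDemo {
    // java.lang.Void cannot be instantiated by ordinary code, so the only thing to return is null.
    static Void log(StringBuilder out, String message) {
        out.append(message);
        return null;
    }

    // Never returns normally. Java cannot say so in the signature, so any T will do.
    static <T> T fail(String message) {
        throw new IllegalStateException(message);
    }

    public static void main(String[] args) throws Exception {
        StringBuilder out = new StringBuilder();
        // Void as the stand-in type argument where a reference type is required.
        Callable<Void> task = () -> log(out, "done");
        Void result = task.call();
        System.out.println(out + " / " + result); // prints "done / null"

        try {
            String unreachable = fail("boom");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "boom"
        }
    }
}
```

Note that nothing stops log from being replaced by a method that loops forever and still claims to return Void, which is exactly the weakness described in the first bullet.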

To summarize, the null reference makes NullType in Java weaker than Bottom, which in turn makes NullType less problematic than Bottom but also less useful. No other major programming language denotes NullType, let alone Bottom, so it is hard to claim that Java is falling behind by not having NullType. It doesn't make things simpler, nor radically expand the space of programs that can be easily written, so don't look for it in JLS4.

Comments:

Even though not defined as a "type" per se, the "nil" constant in Objective-C does "feel" like the sub-type of all types, as it responds to all messages (which could be seen as methods for argument's sake) by doing nothing and returning a default value (0 for primitives and nil for references).

This might cause unwanted bugs, but at the same time prevents the need to check for "nil" in most simple cases. Imagine the following:

String name = person.getParent().getChildren().get(0).getName(); // get the eldest sibling's name.

To be completely NPE safe, I would have to make many checks. With each method returning null if the reference itself was null, all I need to check is that "name" is not null.

Posted by Aviad Ben Dov on September 21, 2007 at 04:30 AM PDT #
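The chained call in the comment above can be made null-tolerant in later versions of Java with java.util.Optional (Java 8 onward, so not available when this was posted). This is only a sketch: the Person class, its fields and eldestSiblingName are assumptions made up to match the comment's example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class NullIgnoreDemo {
    // Hypothetical domain class, invented for this example.
    static final class Person {
        final String name;
        Person parent;
        final List<Person> children = new ArrayList<>();

        Person(String name) {
            this.name = name;
        }

        Person addChild(String childName) {
            Person child = new Person(childName);
            child.parent = this;
            children.add(child);
            return child;
        }
    }

    // Each map step is skipped once a null has been seen, so one final check is enough.
    static String eldestSiblingName(Person person) {
        return Optional.ofNullable(person)
                .map(p -> p.parent)
                .map(p -> p.children)
                .filter(c -> !c.isEmpty())
                .map(c -> c.get(0))
                .map(p -> p.name)
                .orElse(null);
    }

    public static void main(String[] args) {
        Person parent = new Person("Pat");
        parent.addChild("Ann");
        Person younger = parent.addChild("Bob");
        System.out.println(eldestSiblingName(younger)); // prints "Ann"
        System.out.println(eldestSiblingName(parent));  // prints "null": Pat has no parent
    }
}
```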

Alex, I think you're too quick at dismissing the uses of type-of-null for writing generic functions. For example, assume we have a visitor class parameterized by the return type of the visitor methods: Visitor<R>. Furthermore, assume we have a method that operates on visitors that return strings, and one that operates on visitors that return booleans: void doSomething1(Visitor<String> visitor) and void doSomething2(Visitor<Boolean> visitor). Now let's say you want to write a visitor that tests how these methods handle null. What would be the type of this visitor?

Posted by Peter Ahé on September 21, 2007 at 06:02 AM PDT #
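One workaround for the visitor question above, in a Java that cannot denote the null type, is a generic factory method whose type argument is inferred at each call site (this relies on Java 8 or later inference and lambdas). It is a sketch only: the single-method Visitor interface, the method bodies and nullVisitor are assumptions; only the doSomething1 and doSomething2 signatures come from the comment.

```java
class NullVisitorDemo {
    // Hypothetical one-method visitor; the comment does not say what Visitor<R> looks like.
    interface Visitor<R> {
        R visit(String node);
    }

    static void doSomething1(Visitor<String> visitor) {
        String result = visitor.visit("node");
        System.out.println("String visitor returned " + result);
    }

    static void doSomething2(Visitor<Boolean> visitor) {
        Boolean result = visitor.visit("node"); // kept boxed so a null result is not unboxed
        System.out.println("Boolean visitor returned " + result);
    }

    // Returning null type-checks for every R because null is a value of every reference type.
    static <R> Visitor<R> nullVisitor() {
        return node -> null;
    }

    public static void main(String[] args) {
        doSomething1(nullVisitor()); // R inferred as String; prints "String visitor returned null"
        doSomething2(nullVisitor()); // R inferred as Boolean; prints "Boolean visitor returned null"
    }
}
```

The factory still hands out a Visitor<String> in one place and a Visitor<Boolean> in the other. A single variable usable at both call sites is where a type like Visitor<NullType> would be needed.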

Aviad,

Very interesting. I started looking at the null type because someone pointed out that 'null instanceof ...' is false, the rationale being that returning true will probably lead to messages being sent to null which can only lead to NPE. Stephen Colebourne has proposed that Java be able to simulate the Obj-C behavior with his null-ignore invocation syntax [1].

The dual of null-ignore invocation has also been proposed: void methods would return non-null objects [2]. It has apparently been implemented in OpenJDK, but I think it raises many unanswered questions. Formal self types would be better.

In Obj-C, I note that the static return type of the message sent to nil matters - if the return type is non-pointer-sized (e.g. float, struct), the return value from nil is undefined. I guess this is an implementation detail, of nil's representation, showing through the spec.

[1] http://www.jroller.com/scolebourne/entry/java_7_null_ignore_invocation

[2] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6373386

Posted by Alex Buckley on September 21, 2007 at 06:23 AM PDT #

A language shouldn't have a null, NULL, nil, ... It's a language design error!

Posted by Anonymous on September 21, 2007 at 07:41 PM PDT #

Alex,

You are right by saying that this is an implementation detail, but it is of how messages work in ObjC.

Ultimately, nil is also ((void*)0). The difference is made at runtime, where messages sent to nil are captured and nil is returned. The reason a non-pointer-sized return value is undefined is that the runtime cannot figure out the expected return type from the message request, unlike in Java, where methods' return types are known at call time.

That's why I believe it would be easier to implement this ability in Java.

Posted by Aviad Ben Dov on September 21, 2007 at 11:04 PM PDT #

Hi Alex,
about the if statement, I think the JLS should allow any type that is assignable to boolean.
According to the spec, this should not compile:

List<? extends Boolean> list=null;
if (list.get(0)) {

}

Hopefully, javac doesn't follow that part of the spec :)

About the fact that Boolean is final:
final is not currently used by the type system. For example,
List<String> is not equivalent to
List<? extends String>.
I think it's because removing final is not considered a change that breaks backward compatibility.

Do you plan to change that in JLS4?

Rémi

Posted by Rémi Forax on September 22, 2007 at 05:27 AM PDT #

Remi, your bug 6404665 was one of the first I looked at, and I agree that many statements should allow more relaxed types. (This goes slightly counter to the edit I've made to the post, about "exactly type XYZ", but I am not trying to redefine everything right now.) No changes are planned w.r.t. final.

Posted by Alex Buckley on September 22, 2007 at 11:48 AM PDT #

I think that maybe Void could be used, because it already has the right property of only having null as its value. Thus there is a natural extension of assigning a Void to anything since you will be assigning null.

As an aside; I also think it is useful if methods of type Void didn't return a value, then they automatically return null. I suggested this in my C3S proposal for Java:

http://www.artima.com/weblogs/viewpost.jsp?thread=182412

See rule 13

Posted by Howard Lovatt on September 23, 2007 at 03:47 PM PDT #

ANY and NONE. ANY is similar to Java's Object - i.e. every class extends it. Since Eiffel has no interfaces and it has multiple inheritance, there's no need for special creed about interfaces conforming to Object protocol.

Eiffel's NONE is what it says - no object is of that type, but null expression is.

Any object is assignable to ANY, and nothing is assignable to NONE except for null. NONE 'extends' all classes and therefore null - the only expression of type NONE- is assignable to any type.

I agree with Alex. Since Java has quite simplified (and easier, possibly more practical) relations between classes, interfaces, Object and null, there's not much we can do about it. Who would *really* need a NullType? It sounds to me like quite an absurd abstraction.

Posted by Peter Kehl on October 06, 2007 at 07:15 AM PDT #

About

Alex Buckley is the Specification Lead for the Java language and JVM at Oracle.
