Self types (aka type of this)

Neal directed my attention to Laird Nelson's struggle with generics.

Many people have found themselves in a similar situation and I can certainly empathize. Generally speaking, the problem is the need to refer to the type of the current class. This is called a self-type, or the type of this. But I am getting ahead of myself; let us first examine whether the problem can be solved in the current language.

Design with Generic Types

Software development starts with the design phase. This is when you design your software in a small group by the whiteboard and draw informal diagrams. The boxes you draw on the whiteboard represent concepts in the application domain and will eventually result in classes and interfaces. You may also draw lines between concepts that are related. At this time you are not too concerned about how to implement the behavior but about how the concepts are related.

The design phase is the ideal time to decide what type variables you need. In most cases you will need a type parameter when a class aggregates (behaves as a collection of) other objects of varying types depending on usage. For example, consider event handlers: if you design a general event handler that can handle all sorts of events, this event handler does not need to be parameterized with the type of event it handles. On the other hand, if you have multiple kinds of event handlers, one for mouse events, one for keyboard events, and one for timer events, then it may make sense to have a single event handler class that is parameterized with the event type.
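To make the second case concrete, here is a minimal sketch of a handler that is parameterized with its event type. The names MouseEvent, KeyEvent, EventHandler, and EventSource are invented for this illustration and do not refer to any existing API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event types, invented for this illustration.
class MouseEvent {
    final int x, y;
    MouseEvent(int x, int y) { this.x = x; this.y = y; }
}

class KeyEvent {
    final char key;
    KeyEvent(char key) { this.key = key; }
}

// One handler abstraction, parameterized with the kind of event it handles.
interface EventHandler<E> {
    void handle(E event);
}

// Hypothetical event source: it only accepts handlers that can deal with
// its own event type (or a supertype of it).
class EventSource<E> {
    private final List<EventHandler<? super E>> handlers = new ArrayList<>();

    void addHandler(EventHandler<? super E> handler) {
        handlers.add(handler);
    }

    void fire(E event) {
        for (EventHandler<? super E> handler : handlers) {
            handler.handle(event);
        }
    }
}

class EventHandlerDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        EventSource<MouseEvent> mouse = new EventSource<>();
        mouse.addHandler(e -> log.add("mouse at " + e.x + "," + e.y));
        mouse.fire(new MouseEvent(3, 4));
        System.out.println(log); // prints [mouse at 3,4]
    }
}
```

The type parameter earns its place here because it shows up on the whiteboard: a mouse handler and a keyboard handler are the same concept applied to different kinds of events.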

Ignore how to implement the classes. If a type variable does not make sense on the whiteboard, it does not belong in your code.

It does not matter if you are new to generic types, a brilliant type theoretician, or a recovering C++ template meta-programming addict: it is too easy to use type variables for things they are not suited for!

This lesson is particularly hard to learn if you are used to C++ templates. Generic types are not templates and you should not use type parameters for implementation convenience.

Laird Nelson's Scenario

Laird Nelson describes a scenario with objects, references, and adapters. Objects can be canonically identified and references can be dereferenced. I do not know why references are needed and I would personally go for something simpler. However, I will assume that there are good reasons for having all the classes and interfaces described by Laird Nelson but note that I do not fully understand the purpose of all of them.

The goal is to be able to reference objects type-safely as illustrated by this example:

Person p = null;
Reference</* what goes here? */> ref = p.getCanonicalReference();
Person p2 = ref.dereference();

My Design

So I put all the above theory to the test and started drawing Laird Nelson's example on my whiteboard. Types like Dereferenceable and Reference are naturally parameterized to specify the type of what they reference. Similarly, a BaseObjectAdapter can be parameterized over what it adapts. I am not so sure about CanonicallyIdentified so I chose not to parameterize it. It is easy to do so if it makes sense, though. Since I do not know anything about BaseObject I do not see a need to add any parameters. Certainly neither a Party nor a Person is parameterized.

So after the design phase, it could look like this:

class DereferenceException extends Exception {}

interface Dereferenceable<T extends BaseObject> {}

class Reference<T extends BaseObject> implements Dereferenceable<T> {}

interface CanonicallyIdentified {}

interface BaseObject extends CanonicallyIdentified {}

interface Party extends BaseObject {}

interface Person extends Party {}

class BaseObjectAdapter<T extends BaseObject> implements BaseObject {}

Compare this to where Laird Nelson gave up.

My suggestion:

class DereferenceException extends Exception {}

interface Dereferenceable<T extends BaseObject> {}

class Reference<T extends BaseObject>
    implements Dereferenceable<T> {}

interface CanonicallyIdentified {}

interface BaseObject extends CanonicallyIdentified {}

interface Party extends BaseObject {}

interface Person extends Party {}

class BaseObjectAdapter<T extends BaseObject>
    implements BaseObject {}

Laird Nelson's original example after generics:

class DereferenceException extends Exception {}

interface Dereferenceable<T extends BaseObject<T>> {}

class Reference<T extends BaseObject<T>>
    implements Dereferenceable<T> {}

interface CanonicallyIdentified /* crap. */ {}

interface BaseObject<T extends BaseObject<T>>
    extends CanonicallyIdentified<T> {}

interface Party<T extends BaseObject<T>>
    extends BaseObject<T> {}

interface Person<T extends Person<T>>
    extends Party<T> {}

class BaseObjectAdapter<T extends BaseObject<T>>
    implements BaseObject<T> {}

Implementing My Design

During the design phase implementation details are not too important. However, once the design is mature we do need to worry about how to implement it. There are a few tricks to learn but most of it is straightforward; at least Dereferenceable and Reference are:

interface Dereferenceable<T extends BaseObject> {
    T dereference() throws DereferenceException;
}

class Reference<T extends BaseObject> implements Dereferenceable<T> {
    public T dereference() throws DereferenceException {
        return null; // or something
    }
}

CanonicallyIdentified may not seem as straightforward. And what about Laird Nelson's example:

Person p = null;
Reference</* what goes here? */> ref = p.getCanonicalReference();
Person p2 = ref.dereference();

So clearly, I must parameterize CanonicallyIdentified? No. I was unsure at the whiteboard and decided not to parameterize CanonicallyIdentified, and it is not used in the example anyway. I will use a wildcard:

interface CanonicallyIdentified {
    Reference<?> getCanonicalReference();
}

On the other hand, had I decided (at the whiteboard) that it did make sense to parameterize CanonicallyIdentified it could look like this:

interface CanonicallyIdentified<T> {
    Reference<? extends T> getCanonicalReference();
}

This decision is based on the design of the object hierarchies, not on where types flow. BaseObject looks like this:

interface BaseObject extends CanonicallyIdentified {}

If I chose to parameterize CanonicallyIdentified, it would look like this:

interface BaseObject extends CanonicallyIdentified<BaseObject> {}

BaseObject is not parameterized and should not be.

Party and Person are not parameterized either but we do want to know the type of the canonical reference. The answer is to override the method and specialize the return type (covariant return types):

interface Party extends BaseObject {
    String getSortName(); // or whatever
    Reference<? extends Party> getCanonicalReference();
}

interface Person extends Party {
    Reference<? extends Person> getCanonicalReference();
}

This also answers the question about what type argument to use in the example above:

Person p = null;
Reference<? extends Person> ref = p.getCanonicalReference();
Person p2 = ref.dereference();

Finally, BaseObjectAdapter:

class BaseObjectAdapter<T extends BaseObject> implements BaseObject {
    /* various instance fields... */
    private Reference<T> canonicalReference; // with the usual getters and setters
    public Reference<T> getCanonicalReference() {
        return canonicalReference;
    }
}
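To check that the pieces fit together, here is a sketch that puts the interfaces above into one compilable unit. The class SimplePerson, the constructor on Reference, and the in-memory target field are my assumptions for the sake of the example; a real Reference would presumably look its target up somewhere.

```java
class DereferenceException extends Exception {}

interface Dereferenceable<T extends BaseObject> {
    T dereference() throws DereferenceException;
}

class Reference<T extends BaseObject> implements Dereferenceable<T> {
    private final T target; // assumption: the reference simply holds its target

    Reference(T target) { this.target = target; }

    public T dereference() throws DereferenceException {
        if (target == null) throw new DereferenceException();
        return target;
    }
}

interface CanonicallyIdentified {
    Reference<?> getCanonicalReference();
}

interface BaseObject extends CanonicallyIdentified {}

interface Party extends BaseObject {
    String getSortName();
    Reference<? extends Party> getCanonicalReference(); // covariant return
}

interface Person extends Party {
    Reference<? extends Person> getCanonicalReference(); // narrowed again
}

// Hypothetical implementation, invented for this illustration.
class SimplePerson implements Person {
    private final String sortName;

    SimplePerson(String sortName) { this.sortName = sortName; }

    public String getSortName() { return sortName; }

    public Reference<SimplePerson> getCanonicalReference() {
        return new Reference<SimplePerson>(this);
    }
}

class CanonicalReferenceDemo {
    // The example from the scenario, with the type argument filled in.
    static String roundTrip(Person p) throws DereferenceException {
        Reference<? extends Person> ref = p.getCanonicalReference();
        Person p2 = ref.dereference();
        return p2.getSortName();
    }

    public static void main(String[] args) throws DereferenceException {
        System.out.println(roundTrip(new SimplePerson("Doe, Jane"))); // prints Doe, Jane
    }
}
```

Note that no type in the hierarchy from CanonicallyIdentified down to Person needs a type parameter; each override only narrows the wildcard bound.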

Is there a lesson to learn from this? Generic types are not C++ templates. Design your types to have the type parameters they naturally need and do not add unnecessary type parameters to save yourself from typing. Joe and I have been talking about best practices when using generics for software design at JavaOne 2005 and 2006. We recommend that you try to avoid unnecessary type variables. Sometimes the solution is to not use generics.

Generic Methods

Type parameters on generic methods are different, but not by much. Never use type parameters on public methods if they only benefit the implementation. Instead use a wildcard and a private generic method if you need to name a type when implementing the behavior. For example, consider how to implement Collections.reverse:

public static void reverse(List<?> list) {
    reverse0(list);
}
private static <T> void reverse0(List<T> list) {
    ListIterator<T> fwd = list.listIterator();
    ListIterator<T> rev = list.listIterator(list.size());
    for (int i=0, mid=list.size()>>1; i<mid; i++) {
        T tmp = fwd.next();
        fwd.set(rev.previous());
        rev.set(tmp);
    }
}
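The same pattern works for any method that needs to name the element type only internally. Here is a small sketch; the class WildcardCapture and the method swapEnds are names I made up for this example and are not part of the Collections API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WildcardCapture {
    // Public signature: callers only see a wildcard, no type parameter.
    public static void swapEnds(List<?> list) {
        if (list.size() > 1) {
            swapEnds0(list); // wildcard capture gives the helper a name for the type
        }
    }

    // Private helper: T exists only so the temporary variable can be typed.
    private static <T> void swapEnds0(List<T> list) {
        int last = list.size() - 1;
        T tmp = list.get(0);
        list.set(0, list.get(last));
        list.set(last, tmp);
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("a", "b", "c"));
        swapEnds(words);
        System.out.println(words); // prints [c, b, a]
    }
}
```

Without the helper, the body could not call list.set with an element read from a List<?>, because the compiler has no name for the element type.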

Self-Types

What about self-types? The Java™ programming language does not have self-types right now. Self-types would allow you to refer to the type of the receiver (the type of this, the current class). Imagine that we used the keyword this to denote the self-type; we could then avoid overriding getCanonicalReference:

interface CanonicallyIdentified {
    Reference<? extends this> getCanonicalReference();
}

interface Party extends BaseObject {
    String getSortName(); // or whatever
}

interface Person extends Party {}

Compare this to when I used covariant return types.

With self-types:

interface CanonicallyIdentified {
    Reference<? extends this> getCanonicalReference();
}

interface Party extends BaseObject {
    String getSortName(); // or whatever
}

interface Person extends Party {}

Without self-types:

interface CanonicallyIdentified {
    Reference<?> getCanonicalReference();
}

interface Party extends BaseObject {
    String getSortName(); // or whatever
    Reference<? extends Party> getCanonicalReference();
}

interface Person extends Party {
    Reference<? extends Person> getCanonicalReference();
}

The Strongtalk type system for Smalltalk has self-types. You can download the Strongtalk system from www.strongtalk.org. The LOOJ paper by Bruce and Foster includes a proposal for adding ThisClass to the Java programming language.

If self-types were added to the Java programming language, Object.clone() would be an obvious candidate for retrofitting:

protected native this clone() throws CloneNotSupportedException;

Unfortunately, this is not possible because the specification of that method does not require:

x.clone().getClass() == x.getClass()

This equality is only recommended, so such a change could break existing programs that follow the specification. Although we sometimes have to break source compatibility, breaking programs that follow the specification is not a viable option. In situations where a new API is defined or the subtypes of a class are controlled, it is possible to take advantage of self-types on the clone method:

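class NewClass implements Cloneable {
    // hypothetical syntax: "this" as a return type is not valid Java today
    protected this clone() throws CloneNotSupportedException {
        return (this)super.clone(); // cast required as we cannot retrofit Object.clone()
    }
}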

Since it is already possible to use covariant return types to simulate this behavior today, and since we cannot retrofit Object.clone(), I consider it unlikely that we will add self-types to the Java programming language anytime soon.
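For comparison, here is a sketch of how clone can be given a precise return type with covariant return types in the language as it is. The classes Point and ColoredPoint are hypothetical and exist only to illustrate the idiom.

```java
// Hypothetical class, used only to illustrate the idiom.
class Point implements Cloneable {
    final int x, y;

    Point(int x, int y) { this.x = x; this.y = y; }

    // Covariant return type: callers get a Point without casting.
    @Override
    public Point clone() {
        try {
            return (Point) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: Point is Cloneable
        }
    }
}

// Each subclass has to override again to narrow the return type further.
// This repetition is exactly what a self-type would remove.
class ColoredPoint extends Point {
    final String color;

    ColoredPoint(int x, int y, String color) {
        super(x, y);
        this.color = color;
    }

    @Override
    public ColoredPoint clone() {
        return (ColoredPoint) super.clone();
    }
}

class CloneDemo {
    public static void main(String[] args) {
        ColoredPoint original = new ColoredPoint(1, 2, "red");
        ColoredPoint copy = original.clone(); // no cast at the call site
        System.out.println(copy != original
                && copy.getClass() == original.getClass()); // prints true
    }
}
```

The casts are safe here only because every clone method in this small hierarchy delegates to super.clone(), which is the kind of control over subtypes mentioned above.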

Acknowledgments

Joseph D. Darcy and Alex Buckley provided me with a lot of useful feedback on the early drafts and helped me get the flow better.

About

The Former Weblog of Peter Ahé