Scope ambiguities between outer and super
By john.rose on Aug 01, 2007
Gilad Bracha has published a nice paper reviewing an interaction between inheritance and block scoping, specifically the problem of deciding whether a simple (unqualified) name comes from a superclass or from an outer scope. This issue only arises in languages which can nest classes within block scopes, and intermix regular definitions with subclass definitions. This issue arose for Java when we added nested classes.
Similar problems arise any time two definitions of the same name come into scope in one place. Examples include shadowing local variable definitions, import statements, multiple inheritance, or method overloading. The usual way to resolve ambiguity in a name is to arrange the candidate definitions in some sort of order of specificity. (It need not be a total order, but total orders are usually the simplest choice. If there is a total order, it can be viewed as a search order, in which potential definition sites are searched.) Then the language uses the "most specific" definition, if there is a unique one, else the program is in error. When such rules apply to block structure, they almost always prefer more recent, more "inner" definitions over earlier, "outer" definitions. (This is more or less what block scoping is.)
When this generic practice is applied to a combination of block scoping and inheritance, the current class at a given scope level imports its inherited names, as if they had been defined along with the current class. (A subclass definition can be viewed as including an import or even a reassertion of the superclass's definitions. Try following this logic to give a meaning to block-local import statements in Java, and you'll see some interesting language extensions.) Thus, we look for definitions in an inner class's super-class chain before consulting the next-most-inner class's super-class chain, and so on out to any top-level super-class chain. As Gilad points out, this has been called "comb semantics" for identifier lookup, because a comb-shaped (single-spine) tree is being searched.
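The search order described above can be sketched as a small model. This is only an illustration, not how javac is implemented: the names CombLookup, ClassScope, and resolve are hypothetical, and block-local variables are left out so that only the class levels of the comb remain. The outer loop walks the spine (lexically enclosing classes) and the inner loop walks one tooth (a superclass chain).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CombLookup {
    /** Hypothetical model of a class: declared names, superclass, and lexically enclosing class. */
    static final class ClassScope {
        final String name;
        final ClassScope superclass; // null at the top of the inheritance chain
        final ClassScope enclosing;  // null for a top-level class
        final Set<String> declared;

        ClassScope(String name, ClassScope superclass, ClassScope enclosing, String... declared) {
            this.name = name;
            this.superclass = superclass;
            this.enclosing = enclosing;
            this.declared = new HashSet<>(Arrays.asList(declared));
        }
    }

    /** Returns the name of the class whose definition of id is found first, or null if none. */
    static String resolve(ClassScope innermost, String id) {
        // Spine of the comb: from the innermost class out to the top level.
        for (ClassScope level = innermost; level != null; level = level.enclosing) {
            // One tooth of the comb: the superclass chain at this level.
            for (ClassScope c = level; c != null; c = c.superclass) {
                if (c.declared.contains(id)) {
                    return c.name;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ClassScope s = new ClassScope("S", null, null, "m");
        ClassScope outer = new ClassScope("Outer", null, null, "m", "n");
        ClassScope inner = new ClassScope("Inner", s, outer); // Inner extends S, nested in Outer
        System.out.println(resolve(inner, "m")); // S: the inherited name wins
        System.out.println(resolve(inner, "n")); // Outer: found only after Inner's chain is exhausted
    }
}
```

In this model the inherited m from S is found before Outer's m, even though Outer's definition is the one a reader sees in the surrounding source text.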
A decade ago, I briefly contributed to this topic in an aside in the original whitepaper from Sun:
Sometimes the combination of inheritance and lexical scoping can be confusing. For example, if the class E inherited a field named array from Enumeration, the field would hide the parameter of the same name in the enclosing scope. To prevent ambiguity in such cases, Java 1.1 allows inherited names to hide ones defined in enclosing block or class scopes, but prohibits them from being used without explicit qualification. [Sun, Inner Classes White Paper, 1997]
This rule has been dubbed the "Mother May I" rule, because when a programmer uses a name with this ambiguity (between outer and super), the compiler reports an error, and requires the programmer to supply a more specific intention, by providing a qualifier which refers either to the outer scope or the inherited scope. (If there is no such qualifier possible, the intended variable is a local, and it may therefore be renamed to remove the ambiguity.) Instead of using the classic total order (the comb rule) to resolve the ambiguity, the compiler refuses to pick between the alternatives, and (with a mildly annoying maternal nudge) asks the programmer to make an explicit choice.
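Here is a minimal sketch of the two qualifiers the rule asks for. The class and method names (QualifierDemo, S, Inner, viaThis, viaOuter) are made up for illustration and are not taken from the whitepaper. Because every call is explicitly qualified, the code means the same thing whether or not the compiler enforces the rule.

```java
class QualifierDemo {
    static class S {
        String m() { return "S.m"; }
    }

    // A method with the same name in the enclosing (outer) class.
    String m() { return "QualifierDemo.m"; }

    class Inner extends S {
        // 'this' selects the inherited scope.
        String viaThis() { return this.m(); }

        // 'QualifierDemo.this' selects the enclosing scope.
        String viaOuter() { return QualifierDemo.this.m(); }
    }

    public static void main(String[] args) {
        QualifierDemo outer = new QualifierDemo();
        Inner inner = outer.new Inner();
        System.out.println(inner.viaThis());  // S.m
        System.out.println(inner.viaOuter()); // QualifierDemo.m
    }
}
```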
This rule rejects ambiguity by refusing to totally order the choices. It is something like Java's original rule for local variable shadowing. Java does not allow two definitions of the same local variable, when one is in the scope of the other. The error is reported when the inner definition is parsed, but the rule can also be thought of as a refusal to linearly order the two definitions, so that any use of the name in the shared scope fails to match a unique most specific definition.
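The local variable rule can be seen in a few lines. This is a sketch with a hypothetical class name, ShadowDemo. The rejected declaration is left as a comment so that the rest compiles; uncommenting it should make javac report that the variable is already defined in the method.

```java
class ShadowDemo {
    static int x = 1; // a field named x

    static int shadowField() {
        int x = 2;    // allowed: a local variable may shadow a field
        return x;     // refers to the local
    }

    static int nestedLocals() {
        int y = 3;
        {
            // int y = 4; // rejected: a second local y inside the scope of the first
            y = y + 1;    // so this use of y can only mean the one definition above
        }
        return y;
    }

    public static void main(String[] args) {
        System.out.println(shadowField());  // 2
        System.out.println(nestedLocals()); // 4
    }
}
```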
The "Mother May I" rule was enforced in the 1.1 through 1.3 versions of the Java language. It was not always perfectly enforced in 1.1; see for example the 1997 Sun bug 4030368 against javac, which includes a concise code example. If you try this example in version 1.3 or earlier, you'll get a message like this:
InheritHideBug.java:10: m() is inherited from InheritHideBug.S and hides method in outer class InheritHideBug. An explicit 'this' qualifier must be used to select the desired instance.
    m(); // BAD
    ^
In 1.4 or later versions of javac, the compiler quietly uses the total order from the comb rule, and (like the kind of mother who puts your things away for you) selects the inherited method as the more desirable choice. Is this a recurrence of bug 4030368? Not exactly; the "Mother May I" rule defined in 1.1 was quietly removed in 1.4 when generics were introduced. I disagreed with this choice when it was made, but I was working on different things (the HotSpot JVM) at the time, and thus was born a small subspecies of Java puzzler.
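A small sketch of the puzzler as it behaves under a current javac. This is not the code from bug 4030368; the names SilentCapture, S, Inner, callM, and describe are invented for illustration. Both unqualified names compile without complaint and resolve to the inherited definition, not to the one visible in the enclosing scope.

```java
class SilentCapture {
    static class S {
        String label = "S.label";
        String m() { return "S.m"; }
    }

    // Same method name in the enclosing class.
    String m() { return "SilentCapture.m"; }

    class Inner extends S {
        // Unqualified call: the comb rule finds the inherited S.m first.
        String callM() { return m(); }
    }

    // The parameter 'label' is in scope lexically, but the anonymous class
    // inherits a field of the same name from S, and the field wins.
    static String describe(final String label) {
        S s = new S() {
            @Override
            String m() { return label; }
        };
        return s.m();
    }

    public static void main(String[] args) {
        System.out.println(new SilentCapture().new Inner().callM()); // S.m
        System.out.println(describe("parameter"));                   // S.label
    }
}
```

In describe there is no qualifier that names the parameter, so the only way to reach it is to rename it, as the parenthetical about locals above suggests.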
I ran into this subspecies again a few months ago, during a visit to Sun by Neal Gafter, who cited it as an interesting source of bugs arising from inner classes, a problem that can be repaired by closures. (As fate would have it, this was in Sun's "Mother May I" conference room. I am not making this up.) I rejoined that this used to be a solved problem.
I'm glad that Gilad has taken up this problem, and I agree (as I have since 1997) with his contention that, in a conflict between an unseen inherited name and a visible (or global) definition in an outer scope, the programmer usually wants the uplevel name, not the inherited name. As his current paper's conclusion says,
We argue that the classic "comb" semantics whereby inherited members may obscure lexically visible ones are counterintuitive. Raising this issue is itself a contribution of this paper. Beyond that, we have demonstrated a solution in the context of Newspeak, a language derived from Smalltalk. We believe that lexically visible definitions should be given priority over inherited ones, either implicitly via scope rules or by requiring references to inherited features to explicitly specify their intended receiver. [Bracha, On ... Inheritance and Nesting, 2007]
I contend that the "Mother May I" rule (an instance of the last option he mentions) provides the best way to simply inform the programmer of lurking ambiguities, and to protect references to uplevel names from being silently captured by superclasses.
You may be wondering why Sun took out this rule in 1.4, about five years ago. The answer is also stranger than I could make up. Gilad was responsible for writing the updated specification for Java which included generics. (As both a language wonk and a programmer I enjoy the brilliant work he did adding generics to Java.) He removed the rule from 1.1, apparently preferring the simplicity of the comb rule. When I asked him to reconsider, he refused. At this point the term "Mother May I rule" was first applied to this problem, though each of us thinks the other coined this usage. (Gilad recalls that I used that phrase generically for a variety of rules that Java imposes on programmers for their own protection--whether they want it or not. For my part, I prefer to think in terms of ambiguity hazards and having the compiler press the programmer for clearer formulations.)
It is good that Gilad has taken a deeper look at this problem, and that his new language will not suffer from Java's problems in this matter.
Better yet, thank you, Gilad, for re-raising the issue. I see that it's past time to file a bug and ask for the return of the "Mother May I" rule in Java.