A wrinkle with 'module'

We hoped very much that the 'module' restricted keyword could be disambiguated everywhere in the language with only a fixed syntactic lookahead. That is, a compiler could treat 'module' as an identifier everywhere except in certain productions, for which a simple algorithm would use the immediate context to determine if 'module' was a modifier or an identifier. Even edge cases seemed to support this hope:

module class C { ...
module module module;
module module module() { ...

However, consider this code:

class foo {
module foo() { ...

Is foo() a method with a return type of 'module', or a module-private constructor? The former is legal today, and though it's very bad practice to have a method take the name of the class, it must remain legal in JDK7. So how can we disambiguate it from a module-private constructor?
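For concreteness, here is a complete sketch of the legal-today reading. It assumes a user-defined class that happens to be named 'module'; that class and its describe() method are hypothetical and are declared here only to make the example self-contained. In a compiler where 'module' is an ordinary identifier, foo() below should be read as a method, because it has a return type and merely shares its name with the enclosing class.

```java
// A user-defined type that happens to be called 'module' (hypothetical,
// declared only so that this example is self-contained).
class module {
    String describe() { return "a module object"; }
}

class foo {
    // A method, not a constructor: it declares a return type ('module')
    // and merely shares its name with the enclosing class.
    module foo() { return new module(); }
}
```

Calling new foo().foo() first runs the implicit default constructor of foo and then invokes the method foo(), which returns a 'module' object.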

One option is to do semantic analysis of the subsequent method body. javac currently treats any 'return' statement with an expression in the body of a constructor as an error, regardless of control flow, so finding such a 'return' would be enough to conclude that the declaration is not a constructor. But this level of analysis is completely inappropriate in a parser.

Another option, which we prefer, is to recognize that method name == class name is bad practice and that it's defensible to parse the term:

  'module' <identifier> '('

as a module-private constructor if the identifier is the same as the class name. If you're currently using this form to declare a method that returns a 'module' object, the compiler will complain that the method's 'return' statement(s) have an expression. But fear not: you will be able to put the 'package' modifier on your dubious declaration to make the compiler realize that 'module' is a return type, not a modifier. (Actually, any accessibility modifier will do.) Admittedly, this means that the 'module' restricted keyword is not 100% backward-compatible, but it's pretty close. We've thought for years about introducing 'package' as an explicit modifier, to increase consistency and to allow package-private interface members. It finally looks like we have a compelling reason to do it.
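The preferred rule can be sketched as a small token-level check. This is an illustration only, not javac's parser: the ModuleLookahead class, its classify method, the Kind enum, and the pre-split token list are all hypothetical simplifications introduced here. It assumes the tokens start at the beginning of a member declaration inside a class whose name is known.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the proposed rule; not javac's actual parser.
final class ModuleLookahead {
    enum Kind { MODULE_PRIVATE_CONSTRUCTOR, METHOD_RETURNING_MODULE, OTHER }

    // Accessibility modifiers, including the proposed explicit 'package'.
    private static final Set<String> ACCESS =
            Set.of("public", "protected", "private", "package");

    // tokens: the first tokens of a member declaration.
    // className: the name of the enclosing class.
    static Kind classify(List<String> tokens, String className) {
        int i = 0;
        boolean sawAccessModifier = false;
        // Skip leading accessibility modifiers, remembering that one was present.
        while (i < tokens.size() && ACCESS.contains(tokens.get(i))) {
            sawAccessModifier = true;
            i++;
        }
        // Only the pattern  'module' <identifier> '('  is of interest here.
        if (i + 2 >= tokens.size()
                || !tokens.get(i).equals("module")
                || !tokens.get(i + 2).equals("(")) {
            return Kind.OTHER;
        }
        String identifier = tokens.get(i + 1);
        // With an explicit accessibility modifier, 'module' cannot itself be
        // the accessibility modifier, so it is read as the return type.
        if (sawAccessModifier) {
            return Kind.METHOD_RETURNING_MODULE;
        }
        // Otherwise the identifier decides: same name as the class means constructor.
        return identifier.equals(className)
                ? Kind.MODULE_PRIVATE_CONSTRUCTOR
                : Kind.METHOD_RETURNING_MODULE;
    }
}
```

Under these assumptions, classify(List.of("module", "foo", "("), "foo") is read as a module-private constructor, while prefixing the same tokens with "package" makes it a method whose return type is 'module'.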

If constructors were more strongly called out in the language, then no ambiguity would occur for 'module'. A 'constructor' modifier would suffice, or mandating a reserved name like 'init', or just defining any method-like declaration as a constructor if it has the name of the class.

Compared to class types, enum types are simple. Their constructors cannot be 'public' or 'protected' because creation of enum objects is heavily controlled. Making module-private constructors illegal is a no-brainer. Therefore, in a poorly named method (one that shares the enum's name), a leading 'module' is automatically a return type. Interface types are also simple; with no constructors to worry about, the simple syntactic lookahead rules disambiguate 'module'-as-modifier from 'module'-as-identifier in any method, poorly named or not.
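The way the answer depends on the kind of enclosing type can be summarized in a short sketch. The ModuleByTypeKind class, the TypeKind enum, and the isModulePrivateConstructor helper are hypothetical names used only for illustration. The sketch assumes the declaration has the form 'module' <identifier> '(' with no other accessibility modifier.

```java
// Hypothetical summary of the per-type rules; names are illustrative only.
final class ModuleByTypeKind {
    enum TypeKind { CLASS, ENUM, INTERFACE }

    // Returns true if  'module' <identifier> '('  (with no other accessibility
    // modifier) would be parsed as a module-private constructor.
    static boolean isModulePrivateConstructor(TypeKind kind, String typeName, String identifier) {
        switch (kind) {
            case CLASS:
                // In a class, sharing the class name means constructor.
                return identifier.equals(typeName);
            case ENUM:       // module-private enum constructors would be illegal
            case INTERFACE:  // interfaces have no constructors
            default:
                // 'module' can only be a return type (or a method modifier) here.
                return false;
        }
    }
}
```

Only the CLASS case ever needs to compare the identifier with the type name; for enums and interfaces the answer is fixed regardless of how the method is named.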

Edit: Clarified that a 'return' in an existing 'module <identifier> (' method would have an expression, which is illegal in a constructor.

Comments:

'return' in a constructor is perfectly legal. If javac no longer accepts it, that is a regression.

Posted by Neal Gafter on July 28, 2008 at 12:16 PM PDT #

Also, semantic analysis of the body might not help if it never completes normally, such as if it only contains a "throw" statement.

Posted by Anonymous on July 28, 2008 at 12:24 PM PDT #

I think adding a source statement would help; see:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6519124

This would allow new keywords to be introduced without causing backward compatibility problems.

Posted by Howard Lovatt on July 28, 2008 at 04:03 PM PDT #

This seems like a reasonable compromise, although I think a proper 'module' keyword would be a lot cleaner. I assume that 'package' will be the package scope modifier?

Posted by Stephen Colebourne on July 28, 2008 at 07:31 PM PDT #

+1 to Stephen on the simplicity count. Plenty of food for new puzzlers :-) but I realize it is probably inevitable.

I have some experience in migrating large codebases. My suggestion, in case of doubt:
* by default, interpret the code as if it was pre-Java7,
* require explicit instruction for new semantics to be used.

The reason is that if I want a module-visible constructor, it means I am writing new code, and adding a disambiguating annotation is easy. However, if somewhere deep down in my code-base I have methods of this kind, all I want is for them to compile with Java7 just like they did before - I don't want to touch what I don't need to.

My 2c.

Posted by Yardena on July 28, 2008 at 10:16 PM PDT #

I am a big fan of using a "source" or "version" keyword at the beginning of each file. Then you could make "module" a real keyword for "version 7". That way, the developers can work on a mixed source base while they are migrating to java7 without complicating the compilation process. Currently, since you have to specify the source version as a compiler option, it is very difficult to compile mixed source bases. By the way, the same goes for an "encoding" keyword. So each file would/could start with

version 7;
encoding UTF-8;
package x.y.z;

Posted by Roel Spilker on July 28, 2008 at 11:24 PM PDT #

what about CORBA? it already has that, we should wait for CORBA2.

Posted by raveman on July 29, 2008 at 03:28 AM PDT #

I think the parser should raise a warning if 'module' is used as anything other than a modifier, to educate developers.

Rémi

Posted by Rémi Forax on July 29, 2008 at 09:23 AM PDT #

The addition of an encoding keyword as well as a source/version keyword is a good suggestion. These additions make life easier for programmer, IDE, and compiler alike.

Posted by Howard Lovatt on July 29, 2008 at 03:07 PM PDT #

+1 for making 'module' a true keyword. Just do what you do with 'enum'. IDE search/replace is an extremely easy solution for migration.

Posted by Paul on July 29, 2008 at 03:46 PM PDT #

I applaud Alex's effort to try to make the "restricted keyword" notion work, vs. how the assert and enum keywords were introduced. Source version specifiers, whether as a compiler option or embedded in the source file, are insufficient if you need to refer to a pre-existing API element that shares the new keyword's name: at least there should be an escape syntax to specify the keyword as a regular identifier (I believe Scala has such a syntax).

Posted by Anonymous on July 30, 2008 at 02:49 PM PDT #

@Anonymous, in the proposal:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6519124

There is an escape mechanism proposed that would allow pre-existing APIs that use the module keyword as an identifier to be called.

Posted by Howard Lovatt on July 31, 2008 at 09:58 AM PDT #

Howard, using backticks seems to be the favorite for doing that.

Posted by Paul on August 01, 2008 at 04:51 AM PDT #

About

Alex Buckley is the Specification Lead for the Java language and JVM at Oracle.
