Friday Jun 14, 2013
Monday Oct 06, 2008
By jjg on Oct 06, 2008
A while back, we created a new OpenJDK project, Compiler Grammar, so that we could investigate integrating an ANTLR grammar for Java into javac. Thanks to some hard work by our intern Yang Jiang, with assistance from Terence Parr, the initial results of that work are now available.
The grammar currently supports Java version 1.5, although the goal is to fully support the -source option and support older (and newer) versions of the language as well. Right now, the performance is slower than that of standard javac, so this will not be the default lexer and parser for javac for a while, but even so, it should prove an interesting code base for anyone wishing to experiment with potential new language features. And, it does mean that the grammar files being used have been fully tested\* in the context of a complete Java compiler.
We are also looking to align the grammar more closely with the grammar found in the Java Language Specification (JLS).
This version of javac is in the langtools component of the compiler-grammar set of Mercurial OpenJDK repositories.
\*There are currently a few test failures in the regression test suite. Some are to be expected, because the error messages generated by the parser do not match the errors given by the standard version of javac; the other failures are being investigated.
Wednesday May 07, 2008
By jjg on May 07, 2008
In javac land, we're looking at improving the diagnostic messages generated by the compiler ...
Here's one of my anti-favorite messages that I got from the compiler recently:
It is my pet peeve message type ("can't apply method...") and also includes wildcards, captured type variables, and a <nulltype>. The text of the error message, excluding the source file name and highlighted lines, is a whopping 577 characters :-) Who says we don't need to improve this?
We have various ideas in mind. This first list is about the content and form of the messages generated by javac.
Omit package names from types when the package is clear from the context. For example, use String instead of java.lang.String, etc., when those are the only classes named String, etc., in the context.
Don't embed long signatures in the message. Instead, restructure the message into a shorter summary plus supporting information:
    Method name cannot be applied to given types
        required: types
        found: types
Don't embed captured and similar types in signatures, since they inject wordy non-Java constructions into the context of a Java signature. Instead, use short placeholders, and a key.
For example, in the message above, replace
java.lang.Iterable<capture#81 of ? extends javax.tools.JavaFileObject>
with a note following the rest of the message:
where #1 is a capture of ? extends javax.tools.JavaFileObject
Put these suggestions together, and the message above becomes:
It is a lot shorter (less than half the length of the original message, if you're counting), and more importantly, it breaks the message down into segments that are easier to read and understand, one at a time. It still has a long file name in it, and I'll address that below.
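To make the rewriting rules above a bit more concrete, here is a minimal sketch in Java of how a type string could be shortened. It is not how javac does it: the class name `TypeSimplifier`, its `Result` type and both regular expressions are hypothetical, and it works on the printed form of a type rather than on the compiler's own type objects. It also assumes that dropping package names is always safe (no two classes in the message share a simple name) and that capture bounds are plain class names without type arguments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of javac.
final class TypeSimplifier {
    // Matches printed captured types such as
    // "capture#81 of ? extends javax.tools.JavaFileObject".
    // Simplification: the bound must be a plain (possibly qualified) name.
    private static final Pattern CAPTURE =
            Pattern.compile("capture#\\d+ of (\\?(?: (?:extends|super) [\\w.]+)?)");

    // Matches a package prefix: lower-case segments, each followed by a dot,
    // directly before a capitalised class name.
    private static final Pattern PACKAGE_PREFIX =
            Pattern.compile("\\b(?:[a-z_]\\w*\\.)+(?=[A-Z])");

    /** The shortened type plus one "where" note per replaced capture. */
    static final class Result {
        final String text;
        final List<String> notes;

        Result(String text, List<String> notes) {
            this.text = text;
            this.notes = notes;
        }
    }

    static Result simplify(String type) {
        List<String> notes = new ArrayList<>();
        StringBuilder out = new StringBuilder();
        Matcher m = CAPTURE.matcher(type);
        int last = 0;
        while (m.find()) {
            // Replace each captured type with a short placeholder (#1, #2, ...)
            // and remember what the placeholder stands for.
            int key = notes.size() + 1;
            notes.add("where #" + key + " is a capture of " + m.group(1));
            out.append(type, last, m.start()).append('#').append(key);
            last = m.end();
        }
        out.append(type, last, type.length());
        // Assumption: every package name is redundant in this context.
        String shortened = PACKAGE_PREFIX.matcher(out).replaceAll("");
        return new Result(shortened, notes);
    }

    public static void main(String[] args) {
        Result r = simplify(
                "java.lang.Iterable<capture#81 of ? extends javax.tools.JavaFileObject>");
        System.out.println(r.text);
        for (String note : r.notes) {
            System.out.println(note);
        }
    }
}
```

For the type used in the example earlier, this gives `Iterable<#1>` with the note "where #1 is a capture of ? extends javax.tools.JavaFileObject". A real implementation would have to check for name clashes before dropping a package name.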
The following ideas are more about the presentation of messages. javac is typically used in two different ways: batch mode (output to a console), and within an IDE, where the messages might be presented as "popup" messages near the point of failure, and in a log window within the IDE.
When used in batch mode, either directly from the command line or in a build system, the compiler could allow the user to control the verbosity of the diagnostics. If you're compiling someone else's library, you might not be worried about the details in any warnings that might be generated. If you're compiling your own code, you might be comfortable with a quick summary of each diagnostic, or you might want as much detail as possible.
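As a rough illustration of that idea, here is a small Java sketch of a message that carries a short summary plus optional supporting lines, and is rendered according to a verbosity setting. Everything in it is hypothetical: `StructuredMessage` and `Verbosity` are names invented for this example, and they do not correspond to any javac option or internal class.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a diagnostic with a summary and supporting details.
final class StructuredMessage {
    enum Verbosity { SUMMARY, FULL }

    private final String summary;
    private final List<String> details;

    StructuredMessage(String summary, String... details) {
        this.summary = summary;
        this.details = Arrays.asList(details);
    }

    /** Returns just the summary, or the summary followed by indented details. */
    String render(Verbosity verbosity) {
        StringBuilder sb = new StringBuilder(summary);
        if (verbosity == Verbosity.FULL) {
            for (String detail : details) {
                sb.append("\n    ").append(detail);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StructuredMessage msg = new StructuredMessage(
                "method m cannot be applied to given types",
                "required: String",
                "found: int");
        System.out.println(msg.render(Verbosity.SUMMARY));
        System.out.println(msg.render(Verbosity.FULL));
    }
}
```

Keeping the summary and the details apart in the data, rather than in one preformatted string, is what lets the same diagnostic be shown briefly in one setting and in full in another.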
When used in an IDE, it would be good to provide the IDE with more access to the component elements of the diagnostic, so that the IDE could improve the presentation of the message. For example,
- display the base name of the file containing the error, and link it to the compilation unit, instead of displaying the full file name as above
- use different fonts for the message text and the Java code or signatures contained within it
- hyperlink types used in the diagnostic to their declaration in the source code
- given the resource key for the message, an IDE could use the key as an index into additional documentation specific to the type of the error message, explaining the possible causes for the error, and more importantly, what might be done to fix the problem.
Here's how these ideas could be used to improve the presentation of the example message.
OK, I'll leave the real presentation design to the UI experts, but I hope you get the idea of the sort of improvements that might be possible.
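Some of the pieces mentioned above are already reachable through the standard javax.tools API, so here is a small sketch of what a tool can pull out of a diagnostic today. The class `DiagnosticParts`, its nested `StringSource` and the `describe` format are made up for this example; the javax.tools calls are the standard ones. Note that the string returned by `getCode()` is implementation-specific, so treating it as a documentation key is an assumption about the compiler in use, and the example needs a JDK (not just a JRE) to run.

```java
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical example class; only the javax.tools types are real API.
final class DiagnosticParts {
    /** An in-memory source file, so the example needs no files on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/')
                    + Kind.SOURCE.extension), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Compiles one source string and returns the collected diagnostics. */
    static List<Diagnostic<? extends JavaFileObject>> compile(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler available");
        }
        DiagnosticCollector<JavaFileObject> collector = new DiagnosticCollector<>();
        compiler.getTask(null, null, collector, null, null,
                Collections.singletonList(new StringSource(className, code))).call();
        return collector.getDiagnostics();
    }

    /** Presents the base file name, line, kind, key and message separately. */
    static String describe(Diagnostic<? extends JavaFileObject> d) {
        JavaFileObject source = d.getSource();
        String name = (source == null) ? "<no source>" : source.getName();
        String baseName = name.substring(name.lastIndexOf('/') + 1);
        return baseName + ":" + d.getLineNumber()
                + " [" + d.getKind() + "] key=" + d.getCode()
                + " :: " + d.getMessage(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        String code = "class Demo { void f() { String s = 42; } }";
        for (Diagnostic<? extends JavaFileObject> d : compile("Demo", code)) {
            System.out.println(describe(d));
        }
    }
}
```

What this API does not give you is the structure inside the message text itself, such as which parts are types or signatures. That is the kind of additional access the list above is asking for.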
Finally, we'll be looking at improving the focus of error messages. For example, this means that if the compiler can determine which of the arguments is at fault in a particular invocation, it should give a message about that particular argument, instead of about the invocation as a whole. However, care must also be taken not to narrow the focus of an error message incorrectly, so that the message becomes misleading. A typical example of that is when the compiler is parsing source code, and having determined that the next token is not one of A or B, it then checks C. If that is not found the compiler may then report "C expected", when a better message would have been "A, B or C expected." This means that such optimizations have to be studied carefully on a case-by-case basis, whereas all of the preceding suggestions can be applied more generally to all diagnostics.
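The "A, B or C expected" case can be sketched in a few lines of Java. This is only an illustration of the idea of remembering every alternative tried at the current position; `ExpectedTokens` is a hypothetical class and has nothing to do with how javac's parser is actually written.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical token matcher that remembers every alternative it has tried.
final class ExpectedTokens {
    private final String[] tokens;
    private int pos = 0;
    // Alternatives that failed at the current position, in the order tried.
    private final Set<String> tried = new LinkedHashSet<>();

    ExpectedTokens(String... tokens) {
        this.tokens = tokens;
    }

    /** Consumes the next token if it matches; otherwise records the failed alternative. */
    boolean accept(String expected) {
        if (pos < tokens.length && tokens[pos].equals(expected)) {
            pos++;
            tried.clear(); // we moved on, so earlier alternatives no longer apply
            return true;
        }
        tried.add(expected);
        return false;
    }

    /** Builds a message naming all alternatives tried at the current position. */
    String error() {
        if (tried.isEmpty()) {
            return "no alternatives tried";
        }
        StringBuilder sb = new StringBuilder();
        int i = 0;
        int n = tried.size();
        for (String t : tried) {
            if (i > 0) {
                sb.append(i == n - 1 ? " or " : ", ");
            }
            sb.append(t);
            i++;
        }
        return sb.append(" expected").toString();
    }

    public static void main(String[] args) {
        ExpectedTokens p = new ExpectedTokens("D");
        if (!p.accept("A") && !p.accept("B") && !p.accept("C")) {
            System.out.println(p.error());
        }
    }
}
```

With the single input token D, the three failed checks are all remembered and the message reads "A, B or C expected" rather than just "C expected".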
So, do you have any "pet peeve" messages you get from the compiler? Do you have suggestions on how the messages could be improved, or how they get presented? Add a comment here, or mail your suggestions to the OpenJDK compiler group mailing list, compiler-dev at openjdk.java.net.
Thanks to Maurizio and others for contributing some of the suggestions here.