
Sundararajan's Weblog

Printing parse trees using nashorn directive prologues

The ECMAScript specification allows for "directive prologues" (http://www.ecma-international.org/ecma-262/5.1/#sec-14.1). A directive prologue is a sequence of string literal expression statements at the start of a program or function body, and each directive in it is an instruction to the ECMAScript engine. Apart from the standard "use strict" directive (which makes the enclosing program or function "strict"), the specification allows implementation-defined directives as well:

Implementations may define implementation specific meanings for ExpressionStatement productions which are not a Use Strict Directive and which occur in a Directive Prologue.
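
To make the "prologue" part concrete, here is a minimal sketch using only the standard "use strict" directive. The function names strictFunc and sloppyFunc are made up for illustration, and the sketch assumes the surrounding code is not itself strict (so it is a plain script, not an ES module). A string literal counts as a directive only at the very start of the function body; anywhere else it is an ordinary expression statement with no effect.

```javascript
// Hypothetical names, for illustration only.
function strictFunc() {
    "use strict";              // in the directive prologue: the function is strict
    return this === undefined; // a strict function called plainly gets undefined "this"
}

function sloppyFunc() {
    var n = 1;                 // a statement before the string ends the prologue
    "use strict";              // just an expression statement, not a directive
    return this === undefined; // non-strict: "this" is the global object
}

var strictResult = strictFunc(); // true
var sloppyResult = sloppyFunc(); // false, assuming non-strict surrounding code
```

The same placement rule applies to implementation-defined directives: they have to sit in the prologue to be recognized.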

Nashorn supports a few directives apart from the standard "use strict", but only in its debug mode. All Nashorn directives start with the word "nashorn". To turn on the Nashorn "debug mode", you need to set the Java system property "nashorn.debug".

Two of the Nashorn-specific directives are "nashorn print ast" and "nashorn print lower ast". If you use these directives at the start of a program or a function, Nashorn prints the abstract syntax tree (AST) of that program or function when running in debug mode. In the non-debug (default) mode, Nashorn-specific directives are simply ignored. The "lower ast" is the AST after Nashorn has processed it for unreachable statements, inlined "finally" blocks and so on, that is, after "lowering" the AST!

I'll demonstrate the use of these directives with a simple "automatic semicolon insertion" example. Perhaps you omit semicolons and expect the ECMAScript engine to insert them for you. While this seems to work in most cases, there are corner cases.


In the following example, where do you think the semicolon is inserted?

function func() {
    // return an object literal with one property or undefined?
    // in other words, semicolon inserted after "return" or after "object literal"?
    return
    { x: 44 }
}
func()

How about printing the AST to see what happens? The same example with the nashorn directive to print AST inserted:

function func() {
    "nashorn print ast"; // ask nashorn to print AST
    return
    { x: 44 }
}
func()


jjs -J-Dnashorn.debug=true file.js
[function root { @0x057e1b0c]
[block body { @0x0ea1a8d5]
[statements[0..3]]
[expression statements[0] string @0x30a3107a]
[literal expression = '"nashorn print ast"' @0x7a765367]
[return statements[1] = 'return' [Terminal] @0x17d677df]
[block statements[2] { @0x78e67e0a]
[block block { @0x0bd8db5a]
[statements[0..1]]
[label statements[0] ident @0x4b553d26]
[block body decimal @0x069a3d1d]
[statements[0..1]]
[expression statements[0] decimal @0x086be70a]
[literal expression = '44' @0x2a556333]

As you can see, the semicolon is inserted after the "return" keyword. So, effectively, you have "return;", which returns undefined! And there is an unreachable statement after that return statement. It is not an object literal expression at all. It is a block statement containing one labeled statement: the label "x" labels a literal expression statement ("44")!
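
For comparison, here is a small sketch of the same function next to a version that keeps the opening brace on the same line as "return". The names brokenFunc and fixedFunc are made up for illustration; the behavior follows from the semicolon insertion described above.

```javascript
// Hypothetical names, for illustration only.
function brokenFunc() {
    return       // a semicolon is inserted here, so this is "return;"
    { x: 44 }    // unreachable block with a labeled statement, not an object literal
}

function fixedFunc() {
    return {     // the brace on the same line starts an object literal
        x: 44
    };
}

var brokenResult = brokenFunc(); // undefined
var fixedResult = fixedFunc();   // an object whose x property is 44
```

Starting the returned expression on the same line as "return" avoids the problem, with or without a trailing semicolon.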

You can also print the "lowered" AST, which is the AST after a bit of processing by Nashorn, in particular with unreachable statements removed.


function func() {
    "nashorn print lower ast";
    return
    { x: 44 }
}
func()


jjs -J-Dnashorn.debug=true file.js
Lower AST for: 'func'
[function root { @0x7181ae3f]
[block body { [Terminal] @0x1188e820]
[statements[0..2]]
[expression statements[0] string @0x679b62af]
[literal expression = '"nashorn print lower ast"' @0x799d4f69]
[return statements[1] = 'return' [Terminal] @0x290dbf45]

Now you can see that the unreachable block statement has been removed by Nashorn! Apart from having some debugging fun with Nashorn, the moral of the story is: it is better to stay away from automatic semicolon insertion as much as possible. Typing semicolons is not that hard really ;)
