Geertjan's Blog

  • March 17, 2007

Programming for Non-Programmers and Heinrich Schliemann

Geertjan Wielenga
Product Manager
To anyone who is not a programmer, programming is seen as a dry activity, comparable to filling out a tax form or applying for a visa. Fundamentally, it is typing after all, and it is typing code, which, compared to words, is not alive. Code cannot express human longing, cannot express hope and passion, does not have the vocabulary for pain, remorse and loss (unless you consider garbage collection a subset of loss). Of course, this is comparing apples with oranges. Code has no need to express those things. But, just because it has no need to express those things and therefore doesn't, a dry activity it is not. It does engage in and interact with language. And this is where the NetBeans APIs are interesting, because one of their strengths lies in their ability to work with language. Code completion, syntax highlighting, and other language support features such as these, are products that help someone work in an editor with a programming language, similar to how a dictionary and a thesaurus are products that help someone with a human language. And the ability to provide these features is central to the NetBeans APIs. It is for this reason that I argue, also in the interview with Roman recently, that in the area of editor functionality, the NetBeans APIs give the NetBeans Platform a very strong reason for existence. Whatever happens in the direction of JSR-296, the editor functionality exposed by the NetBeans APIs will be a fundamental reason for building editors on top of it, rather than on top of an alternative.

So if you are from the "programming is dry" school of thought, set aside that perspective for a minute while I show you some cool new toys that facilitate the interaction between language and programming. Hopefully, at the end of it all, you'll see that there is a fun, creative element to programming, something akin to a jigsaw puzzle or a Rubik's Cube, rather than the image of the lone programmer hunched over a keyboard with only a flickering green screen for company. The latter image is from an outdated movie anyway, so you're showing your age, my anti-programming friend. For readers of this blog who are programmers, specifically those who are familiar with the plethora of NetBeans APIs involved in providing editor functionality, you're about to be amazed, if you're not yet familiar with something known as NetBeans' "Project Schliemann". You're probably aware of the fact that providing editor functionality involves implementing the Editor Completion API, the Syntax Highlighting API, the Hyperlinking API, the Navigator API, the Code Folding API, and about 10 other APIs, all described in some detail in the upcoming Rich Client Programming: Plugging into the NetBeans Platform. Let's now look at the Schliemann alternative to all of these APIs.

A fundamental point to understand is that Schliemann is a framework for language programming. What does that mean? Common to all frameworks everywhere is that they provide an infrastructure, with the intention that the framework's client can focus on only those things that are important to the client and not those things that are common to each and every other client. For example, as Prakash Narayan said recently in his presentation at the Sun Tech Days in Hyderabad, generally you don't need to worry about your house's infrastructure, such as its plumbing. Except on rare occasions when the plumbing goes faulty, which implies a failure of the infrastructure, you can live very comfortably in your house without worrying about its plumbing. You don't even need to know about the plumbing. All you need is the phone number of a good (and cheap) plumber. (Isn't this simultaneously an argument for modular development and an example of loose coupling?) Similarly, the Struts Web Framework, and the hundreds of other web frameworks, provides the plumbing of your web application and the NetBeans Platform provides the plumbing of your Java desktop application. In the former case, you get a struts-config.xml file to handle the navigation between pages, while the latter gives you a window system on top of which you build your desktop application. Navigation is a fundamental concern for all web applications, while a window system is a fundamental concern for all Java desktop applications that have outgrown the JFrame.

What's the connection between all of this and Schliemann? Typically, when you're providing language support for a programming language, you need to define a multiplicity of Java classes, which extend the NetBeans APIs, for all the various features you're providing. And this needs to be done by every provider of new features for every new programming language. However, all along you, as a provider of the language support features, are really ONLY interested in describing the language. In the realm of human languages, this situation is comparable to someone creating a new language and then being told: "Your language is only official once you have applied in triplicate to the United Nations, the European Union, NATO, and the World Bank. And remember that each has a different submission procedure and each has different forms to fill in!" As a framework, Schliemann provides all this "red tape" right out of the box. You simply need to define your language and then state where you want to use which parts of it. Some parts you might want to give a particular syntax color, other parts you might want to show in the Navigator, and so on. This means the end of bureaucracy (which can be slow!) for language support features, allowing you to focus on your expert area... that of the programming language itself. Read the "Goals", "Overview", and "Why Schliemann?" sections of the Schliemann Project page to see that these are the exact underlying concerns for this project's existence. From personal experience, I've worked on tutorials for providing features such as syntax highlighting in NetBeans IDE, code completion in NetBeans IDE, and so on, as well as in the context of the book, so I know how much work it is just to describe these APIs, never mind learn them. Of course, doing so is necessary in many situations and Schliemann doesn't pretend to be a silver bullet. Schliemann provides a framework only for those programming languages that do not require compilation. Which is why... it is specifically useful for scripting languages. It is a lightweight solution, intended to help writers of scripting languages everywhere quickly incorporate their language into NetBeans IDE. Think of it... what better way to improve the use of your scripting language than to provide support for it in an existing editor? This is as extreme as asking: "What better way to increase the use of your human language than to provide a dictionary?"

So, with that background, let's look at the Schliemann approach to manifest files. Although manifest files do not use a scripting language, they are not compiled, and hence meet the requirements for Schliemann's applicability. At the same time, I'm hoping to counter the "programming is dry" anti-pattern, by showing that its engagement with language should be very appealing to the non-programming soul.

When dealing with a programming language in the context of editors, a fundamental concern is tokens. Before the non-programmer's eyes glaze over, let's quickly state that "verb", "noun", and "adjective" are tokens of human languages. Okay? No worries, we're just dealing with parts of speech. So, we need to define the parts of speech of our language. Here's an example of a manifest file:

Name: java/util/
Specification-Title: "Java Utility Classes"
Specification-Version: "1.2"
Specification-Vendor: "Sun Microsystems, Inc."
Implementation-Title: "java.util"
Implementation-Version: "build57"
Implementation-Vendor: "Sun Microsystems, Inc."

What parts of speech can we deduce from the above manifest file? We see a colon-separated list of entries. Before the colon we see something normally referred to as the "key", after the colon we see something often called the "value". There's always something before the colon, there's always a colon, and there's always something after the colon. Often, there may be a comment in the manifest file, like this:

#need to check with Bill about this one:

In this case, the comment is just a note to the person creating the manifest, and one hopes that it will be removed at the time that the application is packaged and distributed.

So, here we have the following parts of speech (i.e., tokens): key, colon, value, comment. Sounds like the basis of a language! Wouldn't it be nice if all the keys had one color, all the colons another color, all the values yet another color, and the comments were easily discernible somehow? First, let's look at the Manifest File Syntax Highlighting Tutorial to see how this is traditionally done in NetBeans IDE. Note that since the writing of that tutorial, things have simplified to some extent, via something known as "incremental lexing" (a topic deserving a separate blog entry, such as the one here, where the creator of the Lexer modules is interviewed in this blog) via the NetBeans Lexer modules, which are described in the upcoming book. Again, the fundamental problem is that, in the tutorial, we're doing a lot more than creating tokens. It would be pretty cool if we could define the tokens using regular expressions, like this:

TOKEN:comment:( "#" [^ "\n" "\r"]* ["\n" "\r"]+ )
TOKEN:key:( [^"#"] [^ ":" "\n" "\r"]* ):<VALUE>
<VALUE> {
    TOKEN:whitespace:( ["\n" "\r"]+ ):<DEFAULT>
    TOKEN:operator:( ":" ):<IN_VALUE>
}
<IN_VALUE> {
    TOKEN:whitespace:( ["\n" "\r"]+ ):<DEFAULT>
    TOKEN:value:( [^ "\n" "\r"]* )
}

Maybe slightly dense on first reading, but the above tokens make use of something called state. Don't let your eyes glaze over yet, and remember that everything before a colon is a key and everything after it is a value. Therefore, if we know where we are, we know what token is applicable. (If I know you're in the state "drunk", then I know the token "random insults" is forthcoming.) Here the concept of a "tokenizer", also known as a "lexical analyzer", is applicable too. Inside NetBeans IDE is a lexical analyzer, something that reads a document whenever it is opened in NetBeans IDE. If the tokens above are assigned to the type of document in question, the lexical analyzer applies the tokens to the content of the document. In the above example, if the first character it finds in a line is #, the lexical analyzer is entering a comment. That's what the first line above says. If the first character in a line is not a #, the lexical analyzer is in a key, which runs up to the colon. That's what the second line above tells us, and its <VALUE> suffix means that the <VALUE> state is entered next. In that state, a colon is an operator and moves the lexical analyzer into the <IN_VALUE> state, where the rest of the line is a value. In both states, a line break leads back to the <DEFAULT> state, ready for the next line. That, briefly, is what happens line by line in a document that is opened in NetBeans IDE, if the document is assigned to the above file. To see how that is done, see yesterday's blog entry. All you need to do is define a small XML file called a MIME resolver, which associates a file extension (here it would be "mf") with a MIME type (here it could be "text/mf") and then register it in the XML layer, where the file containing the tokens above is also registered. Note that the file should have the "nbs" file extension, which stands for "NetBeans scripting".
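For readers who would like to see the "state" idea run outside the IDE, here is a minimal sketch in plain Python. It is not Schliemann and not a NetBeans API: the names RULES and tokenize are hypothetical, and the regular expressions are simply my translation of the token definitions above into Python's syntax. Each state has its own short list of tokens, and a matched token may switch the state.

```python
import re

# Each state maps to an ordered list of (token name, pattern, next state).
# A next state of None means "stay in the current state".
# Assumption: this table is a hand translation of the NBS tokens above.
RULES = {
    "DEFAULT": [
        ("comment", re.compile(r"#[^\n\r]*[\n\r]+"), None),
        ("key", re.compile(r"[^#][^:\n\r]*"), "VALUE"),
    ],
    "VALUE": [
        ("whitespace", re.compile(r"[\n\r]+"), "DEFAULT"),
        ("operator", re.compile(r":"), "IN_VALUE"),
    ],
    "IN_VALUE": [
        ("whitespace", re.compile(r"[\n\r]+"), "DEFAULT"),
        ("value", re.compile(r"[^\n\r]*"), None),
    ],
}


def tokenize(text):
    """Return a list of (token name, matched text) pairs."""
    state, pos, tokens = "DEFAULT", 0, []
    while pos < len(text):
        for name, pattern, next_state in RULES[state]:
            match = pattern.match(text, pos)
            # Ignore empty matches so the loop always makes progress.
            if match and match.end() > pos:
                tokens.append((name, match.group()))
                pos = match.end()
                state = next_state or state
                break
        else:
            raise ValueError(f"no token matches at offset {pos} in state {state}")
    return tokens


if __name__ == "__main__":
    for name, text in tokenize("#note\nName: java/util/\n"):
        print(name, repr(text))
```

Note that only the current state's tokens are ever tried, which is why "whitespace" can safely be listed twice: once per state.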

So, now we have tokens (i.e., parts of speech in our manifest file language). In the Manifest File Syntax Highlighting Tutorial, you will see that this is MUCH more difficult when done in Java. A knowledge of regular expressions is all one really needs to create the above tokens (and there are plenty of tutorials on that, such as here in the Swing tutorials).

Okay, great, now we have tokens. So, now what? In the same file above, we can immediately assign colors to tokens:

COLOR:key: {
    foreground_color: "blue";
}
COLOR:operator: {
    foreground_color: "black";
}
COLOR:value: {
    foreground_color: "magenta";
}

Note that we didn't assign a color to the "comment" token. Why? Because tokens with this name receive a grey color automatically by the Schliemann support modules. Another token, "keyword", is automatically dark blue. We could use that keyword here, and even override its color, but here we've used the token name "key" instead. (Note to self: Must find out what all the pre-defined tokens are.)
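The "explicit color first, pre-defined color otherwise" behavior can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Schliemann modules are implemented: the names DEFAULT_COLORS, LANGUAGE_COLORS, and color_for are hypothetical, and the defaults table contains only the two pre-defined tokens mentioned above.

```python
# Pre-defined colors. Only "comment" (grey) and "keyword" (dark blue) are
# known from the text above; the full list is an open question.
DEFAULT_COLORS = {"comment": "grey", "keyword": "dark blue"}

# Explicit assignments, mirroring the three COLOR entries above.
LANGUAGE_COLORS = {"key": "blue", "operator": "black", "value": "magenta"}


def color_for(token_name, overrides=LANGUAGE_COLORS, defaults=DEFAULT_COLORS):
    """Pick a foreground color: explicit assignment first, then the default."""
    if token_name in overrides:
        return overrides[token_name]
    # None means "no special color for this token".
    return defaults.get(token_name)


if __name__ == "__main__":
    for name in ("key", "operator", "value", "comment"):
        print(name, color_for(name))
```

Overriding a pre-defined color is then just a matter of adding the token name to the explicit assignments.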

So, great, we have colors. You can, literally, install the module right now, and you'll have syntax coloring for manifest files. However, Schliemann provides much more. Normally to the left of the editor in NetBeans IDE, you see something called a "navigator", which lets you jump to places in the editor itself. Creating a navigator is not the most difficult part of the NetBeans APIs, but still requires some coding, as shown here in this blog. However, this is all that we now need to do, in the same file where we define the tokens and colors:

icon: "/org/netbeans/modules/languages/resources/method.gif";

That's it! We will now have a navigator, listing the keys in the manifest file, together with the "method.gif" icon from the NetBeans sources. It could not be simpler. However, let's complicate matters and suppose that we don't only want to display the key, but also the colon and value. To do that, we first need to create a set of grammar rules. As in human languages, a grammar rule determines the correct combinations of tokens, and the correct order in which they can be expressed. So, here are our grammar rules, again in the same file as above:

StatementRule = KeyRule OperatorRule ValueRule WhiteSpaceRule;
KeyRule = <key>;
OperatorRule = <operator>;
ValueRule = <value>;
WhiteSpaceRule = <whitespace>;

So we create a grammar rule for what a "statement" consists of, which in this case is a rule composed of each of the tokens that we defined earlier. Now we can use the grammar rules in the navigator:

NAVIGATOR:StatementRule: {
    icon: "/org/netbeans/modules/languages/resources/method.gif";
}

This is the result: we now have colors and a navigator for manifest files.

Let's note three things in the context of the navigator. Firstly, we have a full-blown navigator, because Schliemann provides a framework. For example, we can click an entry in the navigator, and then the editor will open (if closed) and the cursor will land on the line corresponding to the entry in the navigator. All we needed to supply was the content and the icon. (And, if you supply no icon, a default one is supplied for you.) Secondly, think about debugging... if something is wrong, where are we going to look for the problem? In one of 10 different Java classes? No, only in the definition shown above. That is a big plus. Thirdly, note that adding additional functionality is really easy. For example, here's the same navigator, but this time with a tooltip that appears when the mouse hovers over an entry in the navigator:

NAVIGATOR:StatementRule : {
    tooltip: "Click to change value $ValueRule$";
    icon: "/org/netbeans/modules/languages/resources/method.gif";
}
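To make the relationship between grammar rules and navigator entries a little more concrete, here is a small Python sketch. It is an illustration only, not Schliemann's implementation: navigator_entries, STATEMENT_RULE, and TOOLTIP_TEMPLATE are hypothetical names, and I am assuming that $ValueRule$ in the tooltip stands for the text matched by ValueRule. The input is a list of (token name, text) pairs such as a tokenizer would produce.

```python
# A statement is a key, an operator, a value, and whitespace, in that order,
# as in StatementRule above.
STATEMENT_RULE = ["key", "operator", "value", "whitespace"]

# Assumption: $ValueRule$ is replaced by the text matched by ValueRule.
TOOLTIP_TEMPLATE = "Click to change value $ValueRule$"


def navigator_entries(tokens):
    """Group (name, text) tokens into statements and describe each one."""
    entries = []
    size = len(STATEMENT_RULE)
    i = 0
    while i < len(tokens):
        window = tokens[i:i + size]
        if [name for name, _ in window] == STATEMENT_RULE:
            key, operator, value, _ = [text for _, text in window]
            entries.append({
                "label": key + operator + value,
                "tooltip": TOOLTIP_TEMPLATE.replace("$ValueRule$", value.strip()),
            })
            i += size
        else:
            # Skip tokens (for example, comments) that do not start a statement.
            i += 1
    return entries


if __name__ == "__main__":
    sample = [
        ("comment", "#note\n"),
        ("key", "Name"),
        ("operator", ":"),
        ("value", " java/util/"),
        ("whitespace", "\n"),
    ]
    for entry in navigator_entries(sample):
        print(entry["label"], "|", entry["tooltip"])
```

The point of the sketch is the division of labor: the grammar rule decides what counts as a statement, and the navigator definition only says how a matched statement should be presented.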

The biggest issue with working with language features this way is that one needs to be comfortable with the Schliemann language used in the "nbs" file, which is really a subset of JavaCC and regular expression language. This document provides guidance, but a lot more needs to be done. For example, code completion for "nbs" files and maybe a visual editor as well. However, clearly, for the languages that fall in the ambit of Schliemann's scope, this is really a fantastic innovation that lifts the NetBeans Platform to a new level. Together with the Visual Library, Schliemann shows the continual reinvention and innovation of the NetBeans Platform, the ongoing search for improvements and simplifications, all aiming to make the life of the end user (i.e., a developer) less cumbersome. And, hopefully, I've also shown that this interaction between language and programming makes the latter far from a "dry" activity.

I plan to blog quite a bit more about Schliemann, because here I've only shown syntax coloring and navigator support. Code folding, code completion, hyperlinking, and a variety of other language features are supported or are planned to be supported, in similar ways to syntax coloring and the navigator above. In the process, it is good to note that Schliemann is flexible enough to encompass a variety of approaches, such as the "state" approach to syntax highlighting shown here, but also shown in the fact that Java programming can play a role in adding more detailed sophistication to a bare-bones Schliemann implementation. And why is it called Schliemann anyway? Read yesterday's blog entry to find out!

Comments ( 3 )
  • Den Saturday, March 17, 2007
    I have read at least in part some of the above... and I have to say that I do not think of programming as that dry art you suggest some may think, at least in the title, and I criticise this theory on slightly different grounds than you; programming is a highly creative exercise, at least when you have the freedom to explore an idea for an application that you have in mind. The choice of application is not a dry thing and can express the inner soul of the programmer as does every consequent line of code... I come from an arty background and to tell the truth programming is not so different from getting out the paint-box and painting some abstract concept or landscape. I cannot claim to be a brilliant programmer, but it is an artform that is thoroughly worthwhile and not at all dry.
  • Sandip Saturday, March 17, 2007
    Thanks Geertjan for the article.

    I just wanted to add this quick info - the page at


    mentions property names for COLOR: tag as:

    • color-name:
    • default-coloring:
    • foreground-color:
    • background-color:
    • underline-color:
    • wave-underline-color:
    • strike-through-color:
    • font-name:

    The use of dash (-) in the name is an error. The correct values are with underscore (_):
    • color_name:
    • default_coloring:
    • foreground_color:
    • background_color:
    • underline_color:
    • wave_underline_color:
    • strike_through_color:
    • font_name:

    I hope this saves some folks some time. I have already filed an issue for it to be corrected.
  • Geertjan Sunday, March 18, 2007
    Yes, Den, that's what I think too. Sandip, thanks for the note. I also like your suggestion of somehow automating a process whereby code snippets are analyzed for tokens and grammar. That would really be cool. Can't you make a module that does that, outputting to an NBS file?