Turtle support for NetBeans 6

Yesterday I added NTriples support for NetBeans. Today it was the turn of Turtle, a notation for RDF that takes human writers into account, and that is being carefully looked after by Dave Beckett, who now works at Yahoo! on a project that seems to be leading to employment opportunities. Of course, making things simple for humans makes things more complicated for the computer, but not so complicated that I could not get most of it done in one day.

Turtle makes things more readable because it allows one to

  • declare namespaces, so as not to have to constantly write out the URLs in full
  • declare a base URL
  • use relative URLs
  • use some punctuation shorthands:
    • use "," when you have sentences that have the same subject and predicate but different objects
    • use ";" when you have sentences that have the same subject but different predicates and objects
  • write [] for anonymous nodes (nodes you can't be bothered to give a URL to). You can place predicate-object statements inside the brackets, meaning that their subject is the anonymous node.
  • write ( a b c ) as a shorthand for lists
Here for example is a section of my foaf file in Turtle:
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix : <http://bblfish.net/people/henry/card#> .

:me    a foaf:Person;
       foaf:depiction <http://farm1.static.flickr.com/164/373663745_1801c2dddf.jpg?v=0>;
       foaf:openid <http://openid.sun.com/bblfish> ;
       foaf:gender "male";
       foaf:birthday "07-29";
       foaf:title "Mr";
       foaf:family_name "Story";
       foaf:givenname "Henry";
       foaf:name "Henry J. Story";
       foaf:homepage <http://bblfish.net/>;
       foaf:schoolHomepage <http://www.bbk.ac.uk/phil/>,
                           <http://www.doc.ic.ac.uk/>,
                           <http://www.kcl.ac.uk/kis/schools/hums/philosophy/>;
       foaf:mbox <mailto:henry.story@bblfish.net>,
                 <mailto:henry.story@gmail.com>,
                 <mailto:henry.story@sun.com>;
       foaf:nick "bblfish".
This is clearly much easier to read and to write than NTriples, but it somewhat hides the fact that everything is named by a URL.
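To make visible what the shorthand hides, here is a minimal Python sketch that expands a prefixed, ";"- and ","-abbreviated description back into one full triple per statement, roughly the way NTriples would spell it out. The helper names `expand_qname` and `expand` and the nested-list input format are made up for this illustration; only the prefixes and values come from the foaf excerpt above.

```python
# Prefix table taken from the two @prefix lines of the foaf excerpt.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "": "http://bblfish.net/people/henry/card#",
}

def expand_qname(qname, prefixes=PREFIXES):
    """Turn a prefixed name such as foaf:nick into the full <URL> it abbreviates."""
    prefix, local = qname.split(":", 1)
    return "<" + prefixes[prefix] + local + ">"

def expand(subject, predicate_objects):
    """Return one (subject, predicate, object) triple per object.

    predicate_objects is a list of (predicate, [objects]) pairs: each pair
    plays the role of a ";" group, each extra object the role of a ",".
    """
    triples = []
    for predicate, objects in predicate_objects:
        for obj in objects:
            triples.append((expand_qname(subject), expand_qname(predicate), obj))
    return triples

triples = expand(":me", [
    ("foaf:nick", ['"bblfish"']),
    ("foaf:mbox", ["<mailto:henry.story@bblfish.net>",
                   "<mailto:henry.story@gmail.com>"]),
])
for triple in triples:
    print(" ".join(triple), ".")
```

Three short lines of Turtle become three long lines in which the subject and predicate URLs are repeated in full each time.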

There are again two main sections to the NetBeans Schliemann (NBS) file. The first declares the tokens:

TOKEN:space:( [" " "\t" "\n" "\r"]+ ) #unicodify
TOKEN:comment:("#" [^ "\n" "\r"]* ["\n" "\r"]+ )
TOKEN:bnode:( "_:" ["A"-"Z" "a"-"z" "0"-"9"]+ )
TOKEN:uriref:( "<" [^ "<" ">" " " "\t"]* ">" ) #unicodify
TOKEN:string:( "\"" [^ "\"" "\n" "\r"]* "\"" )
TOKEN:qname:(["A"-"Z" "a"-"z" "0"-"9"]* ":" ["A"-"Z" "a"-"z" "0"-"9" "_"]+)
TOKEN:longString:("\"\"\"" .* "\"\"\"" )
TOKEN:punct:(";" | "," | "." | "^^" )
TOKEN:integer:([ "+" "-"]? ["0"-"9"]+)
TOKEN:decimal:(["+" "-"]? ((["0"-"9"]+ "." ["0"-"9"]*) | ( "." ["0"-"9"]+) )) # I leave out the decimals that can't be distinguished from integers
TOKEN:exponent:(["e" "E"]["+" "-"]?["0"-"9"]+)
TOKEN:boolean:("true"|"false") 
TOKEN:exists:("[" | "]")
TOKEN:list:( "(" | ")" )
TOKEN:prefix:("@base" | "@prefix")
TOKEN:prefixName:(["A"-"Z" "a"-"z"]["A"-"Z" "_" "a"-"z"]* )
TOKEN:shortrels:("a" | "=")
TOKEN:nameSep:(":")
TOKEN:lang:("@" ["A"-"Z" "a"-"z"]["A"-"Z" "a"-"z"]) #wrong but simpler

I have not yet filled in all the Unicode special cases, as I first wanted to make sure the main pieces were working.
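For readers who do not know the NBS notation, the following Python sketch shows roughly what a handful of these token rules amount to, written as one regular expression with a named group per token. It is an approximation, not a translation: only seven of the rules are covered, the comment rule does not require the trailing newline, and the `tokenize` function is a hypothetical helper rather than anything NetBeans provides.

```python
import re

# Rough regex equivalents of a few TOKEN rules. The alternatives are tried
# in order, so qname must come before prefixName and nameSep.
TOKEN_RE = re.compile(r'''
    (?P<space>      [ \t\n\r]+ )
  | (?P<comment>    \#[^\n\r]* )
  | (?P<uriref>     <[^<> \t]*> )
  | (?P<string>     "[^"\n\r]*" )
  | (?P<prefix>     @base | @prefix )
  | (?P<qname>      [A-Za-z0-9]*:[A-Za-z0-9_]+ )
  | (?P<prefixName> [A-Za-z][A-Za-z_]* )
  | (?P<nameSep>    : )
  | (?P<punct>      [;,.] )
''', re.VERBOSE)

def tokenize(text):
    """Split text into (token_name, lexeme) pairs, dropping spaces and comments."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        if match.lastgroup not in ("space", "comment"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

sample = '@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n:me foaf:nick "bblfish" . # nick\n'
for name, lexeme in tokenize(sample):
    print(name, lexeme)
```

Note how `foaf:` in the @prefix line comes out as a prefixName followed by a nameSep, while `foaf:nick` is a single qname, which is the distinction the token list above draws.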

Then there is the grammar that goes with it.

S = ( Statement )*;
Statement = ( Directive "."  ) | ( Triples  "."  ) ;
Directive = PrefixID | Base ;
PrefixID = "@prefix"  [ <prefixName> ] ":"  <uriref> ;
Base = "@base"  <uriref> ;
Triples = Subject  PredicateObjectList ;
#PredicateObjectList = ( Verb  ObjectList [";"] )+ ; #we have to force the ";" even though this is not necessary
PredicateObjectList = Verb  ObjectList More ; #Here it gets confused about whether the last ";" belongs here or below
More = (  ";"  Verb  ObjectList) *;
ObjectList = Object (  ","  Object )* ;
Verb = Predicate | "a" | "=";
Subject = Resource | Blank ;
Predicate = Resource ;
Object = Resource | Blank | Literal ;
Literal = ( QuotedString [ <lang> ] ) | DatatypeString | <integer> | Double | <decimal> | <boolean> ;
DatatypeString = QuotedString "^^" Resource ;
QuotedString = <string> | <longString> ;
Resource = <uriref> | <qname>;
Blank = <bnode> | <exists,"["><exists,"]"> | <exists,"["> PredicateObjectList <exists,"]"> | Collection ;
Double = <integer> <exponent> | <decimal> <exponent> ;
Collection = <list,"("> [ ItemList ]  <list,")"> ;
ItemList = Object (  Object )*;

The grammar is a little different from Dave's official Turtle spec. For one, I added the "=" sign as a tease. More importantly, I ignore all whitespace and all comments. I tried not to, but the parser in NBS only looks ahead by one token, I think, so the whitespace was confusing it a lot. Removing it does not seem to be problematic, but only time will tell. I did this with the two lines:

SKIP:space
SKIP:comment
More problematic was that I could not get the optional ending of sentences with ";" to work. In Turtle one can have sentences like
:me foaf:knows      [ a foaf:Person;
                   foaf:name "Tim Boudreau";
                   foaf:weblog <http://weblogs.java.net/blog/timboudreau/>
                                ] .
Notice how the last line does not need a semicolon, as it is followed by a "]" which clearly closes the sentence too. Still, it is nice to be able to add the semicolon anyway, as it is one less thing for the programmer to worry about. I could not get it to work, though. Following the Turtle spec I tried:
Triples = Subject  PredicateObjectList ;
PredicateObjectList =  Verb  ObjectList More [";"] ; 
More = (  ";"  Verb  ObjectList) *;

But this confuses the parser, which does not know whether the final ";" is the one at the end of the PredicateObjectList line or the one at the beginning of the More line. So for the moment I have decided not to allow extra semicolons.
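One way around this kind of ambiguity, at least in a hand-written parser, is to consume the ";" first and only then decide whether another Verb ObjectList follows, that is, to parse Verb ObjectList ( ";" [ Verb ObjectList ] )*. The Python sketch below illustrates that idea with a single token of lookahead. It works on a plain list of strings, has no error handling, and says nothing about whether NBS itself would accept a rule of that shape, which remains untested here; the function name and token format are assumptions made for the example.

```python
# Tokens that cannot start a Verb: seeing one after ";" means the ";" was
# a trailing (or repeated) one rather than the start of a new pair.
NOT_A_VERB = {".", "]", ";"}

def parse_predicate_object_list(tokens, pos=0):
    """Parse: Verb ObjectList ( ";" [ Verb ObjectList ] )*

    tokens is a list of strings. Returns (pairs, next_pos), where pairs is a
    list of (verb, [objects]) and next_pos indexes the first unconsumed token.
    Only tokens[pos], the next token, is ever inspected before deciding.
    """
    pairs = []

    def verb_object_list(pos):
        verb = tokens[pos]
        objects = [tokens[pos + 1]]
        pos += 2
        while pos < len(tokens) and tokens[pos] == ",":
            objects.append(tokens[pos + 1])
            pos += 2
        pairs.append((verb, objects))
        return pos

    pos = verb_object_list(pos)
    while pos < len(tokens) and tokens[pos] == ";":
        pos += 1  # consume the ";" unconditionally
        if pos < len(tokens) and tokens[pos] not in NOT_A_VERB:
            pos = verb_object_list(pos)  # another Verb ObjectList follows
    return pairs, pos

with_semicolon = ["a", "foaf:Person", ";", "foaf:name", '"Tim Boudreau"', ";", "]"]
without_semicolon = ["a", "foaf:Person", ";", "foaf:name", '"Tim Boudreau"', "]"]
print(parse_predicate_object_list(with_semicolon))
print(parse_predicate_object_list(without_semicolon))
```

Both inputs yield the same two verb/object pairs and stop at the closing "]", because the choice is made after the ";" has been read rather than before.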

Again since everything is built on URIs (and so also URLs) in RDF, it is nice to add functionality so that one can click on links. Just as in yesterday's demo, I added a line to the nbs file:

HYPERLINK:uriref: net.java.dev.sommer.editors.Turtle.hyperlink

I then wrote a clearly more complicated Turtle.java class. It is more complicated because Turtle allows relative URLs, and lets the base be defined in different parts of the document. Hacking the AST libraries, I put something together that works well enough for me to be satisfied at having done a great day's work, and at having spent a great time here in Prague.
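The core of the problem can be sketched in a few lines of Python: walk the document in order, remember the base currently in effect, and resolve every uriref against it. This is only an illustration of the idea, not what Turtle.java does; the `resolve_urirefs` helper, its (kind, url) input format and the example.org addresses are invented for the example, and `urljoin` from Python's standard library stands in for the URL resolution.

```python
from urllib.parse import urljoin

def resolve_urirefs(items, document_url):
    """Resolve urirefs against the base in effect where each one appears.

    items is a sequence of ("base", url) or ("uriref", url) pairs in document
    order (a hypothetical format for this sketch). Before any @base is seen,
    the document's own URL serves as the base.
    """
    base = document_url
    resolved = []
    for kind, url in items:
        if kind == "base":
            # Assumption: a relative @base is resolved against the current base.
            base = urljoin(base, url)
        else:
            resolved.append(urljoin(base, url))
    return resolved

links = resolve_urirefs(
    [
        ("uriref", "card#me"),
        ("base", "http://example.org/people/"),
        ("uriref", "tim#me"),
        ("uriref", "http://bblfish.net/"),
    ],
    "http://bblfish.net/people/henry/card",
)
for link in links:
    print(link)
```

The same relative reference can therefore lead to different places depending on where in the file it appears, which is why the hyperlink code has to look back through the document rather than at the uriref token alone.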
