RDFa parser for Sesame
By bblfish on Sep 09, 2009
RDFa is the microformat-inspired standard for embedding semantic web relations directly into (X)HTML. It is being used more and more widely, and we are starting to have foaf+ssl annotated web pages, such as Alexandre Passant's home page. This is forcing me to update my foaf+ssl Identity Provider to support RDFa.
The problem was that I have been using Sesame as my semweb toolkit, and there was no RDFa parser for it. Luckily I found out that Damian Steer (aka Shellac) had written a SAX-based RDFa parser for the HP Jena toolkit, which he had put up on GitHub as the java-rdfa project. With a bit of help from Damian and the Sesame team, I adapted the code to Sesame, created a git fork of the initial project, and uploaded the changes to the bblfish java-rdfa git clone. Currently all but three of the 106 tests pass without problem.
$ git clone git://github.com/bblfish/java-rdfa.git
This will download the whole history of changes of this project, so you will be able to see how I moved from Shellac's code to the Sesame RDFa parser. You can then parse Alex's home page by running the following on the command line (thanks a lot to Sands Fish for the Maven tip in his comment on this blog):
$ mvn exec:java -Dexec.mainClass="rdfa.parse" -Dexec.args="http://apassant.net/"
[snip output of sesame-java-rdfa compilation]
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix geo: <http://www.geonames.org/ontology/> .
@prefix rel: <http://purl.org/vocab/relationship/> .
@prefix cert: <http://www.w3.org/ns/auth/cert#> .
@prefix rsa: <http://www.w3.org/ns/auth/rsa#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://apassant.net/> <http://www.w3.org/1999/xhtml/vocab#icon> <http://apassant.net/misc/favicon.ico> ;
    <http://www.w3.org/1999/xhtml/vocab#stylesheet> <http://apassant.net/sites/apassant.net/files/css/css_84042a598208a6aade8783e8c2937a8c.css> , <http://apassant.net/sites/apassant.net/files/css/css_ba2732162a421c6422a6f5a68742254e.css> .

<http://apassant.net/#id> rdfs:label "About"@en .

<http://apassant.net/alex> a foaf:Person ;
    foaf:name "Alexandre Passant"@en ;
    foaf:workplaceHomepage <http://deri.ie> , <http://nuigalway.ie> ;
    foaf:schoolHomepage <http://paris-sorbonne.fr> , <http://dauphine.fr> ;
    foaf:topic_interest <http://dbpedia.org/page/Social_software_%28computer_software%29> , <http://dbpedia.org/resource/Semantic_Web> ;
    foaf:currentProject <http://www.w3.org/2009/sparql/wiki/> , <http://www.w3.org/2005/Incubator/socialweb/> ;
    <http://purl.org/vocab/bio/0.1/olb> """ \\nDr. Alexandre Passant is a postdoctoral researcher at the Digital Enterprise Research Institute, National University of Ireland, Galway. His research activities focus around the Semantic Web and Social Software: in particular, how these fields can interact with and benefit from each other in order to provide a socially-enabled machine-readable Web, leading to new services and paradigms for end-users. Prior to joining DERI, he was a PhD student at Université Paris-Sorbonne and carried out applied research work on \\"Semantic Web technologies for Enterprise 2.0\\" at Electricité De France.
He is the co-author of SIOC, a model to represent the activities of online communities on the Semantic Web, the author of MOAT, a framework to let people tag their content using Semantic Web technologies, and is also involved in various related applications as well as standardization activities.\\n"""@en ;
    foaf:based_near <http://dbpedia.org/resource/Galway> ;
    geo:locatedIn <http://dbpedia.org/resource/Galway> ;
    rel:spouseOf <http://julie.letierce.net/#id> ;
    foaf:holdsAccount <http://www.flickr.com/people/terraces/> , <http://www.linkedin.com/pub/alexandre-passant/1/797/1ab> , <http://last.fm/user/terraces> , <http://slideshare.net/terraces> , <http://twitter.com/terraces> .

<http://apassant.net/#cert> a rsa:RSAPublicKey ;
    cert:identity <http://apassant.net/alex> .

_:node14efunnjjx1 cert:decimal "65537"@en .

<http://apassant.net/#cert> rsa:public_exponent _:node14efunnjjx1 .

_:node14efunnjjx2 cert:hex "8af4cb6d6ec004bd28c08d37f63301a3e63ddfb812475c679cf073c4dc7328bd20dadb9654d4fa588f155ca05e7ca61a6898fbace156edb650d2109ecee65e7f93a2a26b3928d3b97feeb7aa062e3767f4fadfcf169a223f4a621583a7f6fd8992f65ef1d17bc42392f2d6831993c49187e8bdba42e5e9a018328de026813a9f"@en .

<http://apassant.net/#cert> rsa:modulus _:node14efunnjjx2 .
[snip]
This graph can then be queried with SPARQL and merged with other graphs. Just as it links to other resources, those can in turn link back to it and to elements defined therein. As a result, Alexandre Passant can use it in combination with an appropriate X.509 certificate to log into foaf+ssl-enabled web sites in one click, without needing to remember either a password or a URL.
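To make that last step a little more concrete, here is a minimal sketch of the kind of check a foaf+ssl server could run against this graph: a SPARQL query pulls out the modulus and exponent attached to the WebID, and those values are compared with the public key found in the client's certificate. The class name WebIdKeyCheck, the KEY_QUERY constant and the matches helpers are all illustrative assumptions, not code from java-rdfa or from my Identity Provider. Running the query against a Sesame repository is left out, and I am assuming that whitespace inside the hex string can simply be ignored.

```java
import java.math.BigInteger;
import java.security.interfaces.RSAPublicKey;

public class WebIdKeyCheck {

    // Illustrative SPARQL query: selects the hex modulus and decimal exponent
    // of every RSA key whose cert:identity is the WebID from the graph above.
    static final String KEY_QUERY =
        "PREFIX cert: <http://www.w3.org/ns/auth/cert#>\n" +
        "PREFIX rsa: <http://www.w3.org/ns/auth/rsa#>\n" +
        "SELECT ?mod ?exp WHERE {\n" +
        "  ?key cert:identity <http://apassant.net/alex> ;\n" +
        "       rsa:modulus [ cert:hex ?mod ] ;\n" +
        "       rsa:public_exponent [ cert:decimal ?exp ] .\n" +
        "}";

    // Compares the key numbers taken from a certificate with the literal
    // values returned by the query (hex modulus, decimal exponent).
    static boolean matches(BigInteger certModulus, BigInteger certExponent,
                           String hexModulus, String decimalExponent) {
        // Assumption: whitespace in the published hex string is not significant.
        BigInteger modulus = new BigInteger(hexModulus.replaceAll("\\s+", ""), 16);
        BigInteger exponent = new BigInteger(decimalExponent.trim());
        return certModulus.equals(modulus) && certExponent.equals(exponent);
    }

    // Convenience overload for a key extracted from an X.509 certificate.
    static boolean matches(RSAPublicKey key, String hexModulus, String decimalExponent) {
        return matches(key.getModulus(), key.getPublicExponent(),
                       hexModulus, decimalExponent);
    }

    public static void main(String[] args) {
        System.out.println(KEY_QUERY);
        // Toy values only: a real modulus is far longer, like the one in the output above.
        BigInteger certModulus = new BigInteger("8af4cb6d", 16);
        BigInteger certExponent = BigInteger.valueOf(65537);
        System.out.println(matches(certModulus, certExponent, "8af4 cb6d", "65537"));
    }
}
```

In a real server the two strings would come from the query results for the WebID named in the certificate, and the RSAPublicKey from the certificate presented during the SSL handshake.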