Serialising Java Objects to RDF with Jersey

Jersey is the reference implementation of JSR311 (JAX-RS) the Java API for RESTful Web Services. In short JSR311 makes it easy to publish graphs of Java Objects to the web, and implement update and POST semantics - all this using simple java annotations. It makes it easy for Java developers to do the right thing when writing data to the web.

JAX-RS deals with the mapping of java objects to a representation format - any format. Writing a good format, though, is at least as tricky as building RESTful services. So JAX-RS solves only half the problem. What is needed is to make it easy to serialize any object in a manner that scales to the web. For this we have RDF, the Resource Description Framework. By combining JAX-RS and the So(m)mer's @rdf annotation one can remove remove the hurdle of having to create yet another format, and do this in a way that should be really easy to understand.

I have been wanting to demonstrate how this could be done, since the JavaOne 2007 presentation on Jersey. Last week I finally got down to writing up some initial code with the help of Paul Sandoz whilst in Grenoble. It turned out to be really easy to do. Here is a description of where this is heading.

Howto

The code to do this available from the so(m)mer subversion repository, in the misc/Jersey directory. I will refer and link to the online code in my explanations here.

Annotate one's classes

First one needs to annotate one's model classes with @rdf annotations on the classes and fields. This is a way to give them global identifiers - URIs. After all if you are publishing to the web, you are publishing to a global context so global names are the only way to remove ambiguity. So for example our Person class can be written out like this:

@rdf(foaf+"Person")
public class Person extends Agent {
    static final String foaf = "http://xmlns.com/foaf/0.1/";

    @rdf(foaf+"surname") private Collection<String> surnames = new HashSet<String>();
    @rdf(foaf+"family_name") private Collection<String> familynames = new HashSet<String>();
    @rdf(foaf+"plan") private Collection<String> plans = new HashSet<String>();
    @rdf(foaf+"img") private Collection<URL> images = new HashSet<URL>();
    @rdf(foaf+"myersBriggs") private Collection<String> myersBriggss = new HashSet<String>();
    @rdf(foaf+"workplaceHomepage") private Collection<FoafDocument> workplaceHomePages = new HashSet<FoafDocument>();
    ....
}

This just requires one to find existing ontologies for the classes one has, or to publish new ones. (Perhaps this framework could be extended so as to automate the publishing of ontologies as deduced somehow form Java classes? - Probably a framework that could be built on top of this)

Map the web resources to the model

Next one has to find a mapping for web resources to objects. This is done by subclassing the RdfResource<T> template class, as we do three times in the Main class. Here is a sample:

 @Path("/person/{id}")
   public static class PersonResource extends RdfResource<Employee> {
      public PersonResource(@PathParam("id") String id) {
          t = DB.getPersonWithId(id);
      }
   }

This just tells Jersey to publish any Employee object on the server at the local /person/{id} url. When a request for some resource say /person/155492 is made, a PersonResource object will be created whose model object can be found by querying the DB for the person with id 155492. For this of course one has to somehow link the model objects ( Person, Office,... in our example ) to some database. This could be done by loading flat files, querying an ldap server, or an SQL server, or whatever... In our example we just created a simple hard coded java class that acts as a DB.

Map the Model to the resource

An object can contain pointers to other objects. In order for the serialiser to know what the URL of objects are one has to map model objects to web resources. This is done simply with the static code in the same Main class [ looking for improovements here too ]

   static {
      RdfResource.register(Employee.class, PersonResource.class);
      RdfResource.register(Room.class, OfficeResource.class);
      RdfResource.register(Building.class, BuildingResource.class);              
   }

Given an object the serialiser (RdfMessageWriter) can then look up the resource URL pattern, and so determine the object's URL. So to take an example, consider an instance of the Room class. From the above map, the serialiser can find that it is linked to the OfficeResource class from which it can find the /building/{buildingName}/{roomId} URI pattern. Using that it can then call the two getters on that Room object, namely getBuildingName() and getRoomId() to build the URL referring to that object. Knowing the URL of an object means that the serialiser can stop its serialisation at that point if the object is not the primary topic of the representation. So when serialising /person/155492 the serialiser does not need to walk through the properties of /building/SCA22/3181. The client may already have that information and if not, the info is just a further GET request away.

Running it on the command line

If you have downloaded the whole repository you can just run

$ ant run
from the command line. This will build the classes, recompile the @rdf annotated classes, and start the simple web server. You can then just curl for a few of the published resources like this:
hjs@bblfish:0$ curl -i http://localhost:9998/person/155492
HTTP/1.1 200 OK
Date: Wed, 24 Sep 2008 14:37:38 GMT
Content-type: text/rdf+n3
Transfer-encoding: chunked

<> <http://xmlns.com/foaf/0.1/primaryTopic> </person/155492#HS> .
</person/155492#HS> <http://xmlns.com/foaf/0.1/knows> <http://www.w3.org/People/Berners-Lee/card#i> .
</person/155492#HS> <http://xmlns.com/foaf/0.1/knows> </person/528#JG> .
</person/155492#HS> <http://xmlns.com/foaf/0.1/birthday> "29_07" .
</person/155492#HS> <http://xmlns.com/foaf/0.1/name> "Henry Story" .

The representation returned not a very elegant serialisation of the Turtle subset of N3. This makes the triple structure of RDF clear- subject relation object - and it uses relative URLs to refer to local resources. Other serialisers could be added, such as for rdf/xml. See the todo list at the end of this article.

The represenation says simple that this resource <> has as primary topic the entity named by #HS in the document. That entity's name is "Henry Story" and knows a few people, one of which is refered to via a global URL http://www.w3.org/People/Berners-Lee/card#i, and the other via a local URL /person/528#JG.

We can find out more about the /person/528#JG thing by making the following request:

hjs@bblfish:0$ curl -i http://localhost:9998/person/528#JG
HTTP/1.1 200 OK
Date: Wed, 24 Sep 2008 14:38:10 GMT
Content-type: text/rdf+n3
Transfer-encoding: chunked

<> <http://xmlns.com/foaf/0.1/primaryTopic> </person/528#JG> .
</person/528#JG> <http://xmlns.com/foaf/0.1/knows> </person/155492#HS> .
</person/528#JG> <http://xmlns.com/foaf/0.1/knows> </person/29604#BT> .
</person/528#JG> <http://xmlns.com/foaf/0.1/knows> <http://www.w3.org/People/Berners-Lee/card#i> .
</person/528#JG> <http://www.w3.org/2000/10/swap/pim/contact#office> </building/SCA22/3181#it> .
</person/528#JG> <http://xmlns.com/foaf/0.1/birthday> "19-05" .
</person/528#JG> <http://xmlns.com/foaf/0.1/name> "James Gosling" .

... where we find out that the resource named by that URL is James Gosling. We find that James has an office named by a further URL, which we can discover more about with yet another request


hjs@bblfish:0$ curl -i http://localhost:9998/building/SCA22/3181#it
HTTP/1.1 200 OK
Date: Wed, 24 Sep 2008 14:38:38 GMT
Content-type: text/rdf+n3
Transfer-encoding: chunked

<> <http://xmlns.com/foaf/0.1/primaryTopic> </building/SCA22/3181#it> .
</building/SCA22/3181#it> <http://www.w3.org/2000/10/swap/pim/contact#address> _:2828781 .
_:2828781 a <http://www.w3.org/2000/10/swap/pim/contact#Address> .
_:2828781 <http://www.w3.org/2000/10/swap/pim/contact#officeName> "3181" .
_:2828781 <http://www.w3.org/2000/10/swap/pim/contact#street> "4220 Network Circle" .
_:2828781 <http://www.w3.org/2000/10/swap/pim/contact#stateOrProvince> "CA" .
_:2828781 <http://www.w3.org/2000/10/swap/pim/contact#city> "Santa Clara" .
_:2828781 <http://www.w3.org/2000/10/swap/pim/contact#country> "USA" .
_:2828781 <http://www.w3.org/2000/10/swap/pim/contact#postalCode> "95054" .

Here we have a Location that has an Address. The address does not have a global name, so we give it a document local name, _:2828781 and serialise it in the same representation, as shown above.

Because every resource has a clear hyperlinked representation we don't need to serialise the whole virtual machine in one go. We just publish something close to the Concise Bounded Description of the graph of objects.

Browsing the results

Viewing the data through a command line interface is nice, but it's not as fun as when viewing it through a web interface. For that it is best currently to install the Tabulator Firefox plugin. Once you have that you can simply click on our first URL http://localhost:9998/person/155492. This will show up something like this:

picture of tabualator on loading /person/155492

If you then click on JG you will see something like this:

picture tabulator showing James Gosling

This it turns out is a resource naming James Gosling. James knows a few people including a BT. The button next to BT is in blue, because that resource has not yet been loaded, whereas the resource for "Henry Story" has. Load BT by clicking on it, and you get

picture of tabulator showing Henry Story knowning Bernard Traversat

This reveals the information about Bernard Traversat we placed in our little Database class. Click now on the i and we get

Tabulator with info about Tim Berners Lee

Now we suddenly have a whole bunch of information about Tim Berners Lee, including his picture, some of the people he has listed as knowing, where he works, his home page, etc... This is information we did not put in our Database! It's on the web of data.

One of the people Tim Berner's Lee knows is our very own Tim Bray.

tabulator showing info from dbpedia on Tim Bray

And you can go on exploring this data for an eternity. All you did was put a little bit of data on a web server using Jersey, and you can participate in the global web of data.

Todo

There are of course a lot of things that can be done to improove this Jersey/so(m)mer mapper. Here are just a few I can think of now:

  • Improove the N3 output. The code works with the examples but it does not deal well with all the literal types, nor does it yet deal with relations to collections. The output could also be more human readable by avoiding repetiations.
  • Refine the linking between model and resources. The use of getters sounds right, but it could be a bit fragile if methods are renamed....
  • Build serialisers for rdf/xml and other RDF formats.
  • Deal with publication of non information resources, such as http:// xmlns.com/foaf/0.1/Person which names the class of Persons. When you GET it, it redirects you to an information resources:
    hjs@bblfish:1$ curl -i http://xmlns.com/foaf/0.1/Person
    HTTP/1.1 303 See Other
    Date: Wed, 24 Sep 2008 15:30:14 GMT
    Server: Apache/2.0.61 (Unix) PHP/4.4.7 mod_ssl/2.0.61 OpenSSL/0.9.7e mod_fastcgi/2.4.2 Phusion_Passenger/2.0.2 DAV/2 SVN/1.4.2
    Location: http://xmlns.com/foaf/spec/
    Vary: Accept-Encoding
    Content-Length: 234
    Content-Type: text/html; charset=iso-8859-1
    
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html><head>
    <title>303 See Other</title>
    </head><body>
    <h1>See Other</h1>
    <p>The answer to your request is located <a href="http://xmlns.com/foaf/spec/">here</a>.</p>
    </body></html>
    
    This should also be made easy and foolproof for Java developers.
  • make it industrial strength...
  • my current implementation is tied much too strongly to so(m)mer. The @rdf annotated classes get rewritten to create getters and setters for every field, that links to the sommer mappers. This is not needed here. All we need is to automatically add the RdfSerialiser interface to make it easy to access the private fields.
  • One may want to add support for serialising @rdf annotated getters
  • Add some basic functionality for POSTing to collections or PUTing resources. This will require some thought.

Bookmarks: digg+, reddit+, del.icio.us+, dzone+, facebook+

Comments:

In the InfoQ article "JSR 311 Final: Java API for RESTful Web Services", Paul Sandoz and Mark Hadley answer some questions and refer to this article.
see:
http://www.infoq.com/news/2008/09/jsr311-approved

Posted by Henry Story on September 24, 2008 at 02:10 PM CEST #

Post a Comment:
Comments are closed for this entry.
About

bblfish

Search

Archives
« April 2014
MonTueWedThuFriSatSun
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    
       
Today