Java Annotations & the Semantic Web

Intro

The topic of annotations has been making headlines in the blogosphere [1] [2], and so this is probably a good time to write these thoughts out. As a side note, I have already implemented something along these lines in the current version of BlogEd and this is just a generalisation and improvement over that initial work. Anyhow I want to show here how annotations can be used to make the relation between java and the semantic web obvious to any java programmer.

A very quick intro to the Semantic Web

If I can put it really concisely, the semantic web is best thought of as a mathematical structure based on graph theory. It is really very easy to understand. It reduces everything to triples [Subject Relation Object]. The Relation is always identified by a URI. The Subject is sometimes identified by a URI and sometimes left as an unnamed (blank) node. Finally the Object can be a URI, a blank node or a literal, which is a string or an XSD-typed bit of XML. Dates for example are good candidates for literals. You can express anything with this, or so mathematicians have proven, apparently. A quick example:
 
       
        _X rdf:type foaf:Person.
        _X foaf:name "Henry Story".
        _X foaf:mbox mailto:hjs@bblfish.net.
   
The above uses the very useful foaf vocabulary to say that there is an entity _X (an anonymous node) that is a Person, has a name "Henry Story" and a particular mailbox. Notice that vocabularies can be easily mixed. So I could use some geolocation vocabulary to specify where I am at a particular time. Simple. Even more so when one presents it graphically.

The semantic web can be serialised in many ways. The unfortunate RDF/XML format is one way. Much closer to the structure of the model is N-Triples, and easier still for humans to read are Turtle and N3. But for those of us who program in Java, a java serialisation would be best of all. All of them are ways of describing the above graph.
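As a stepping stone towards that java serialisation, the three statements of the example can be held in plain Java objects and printed in an N-Triples-like form. This is only a minimal sketch: the names TripleSketch, Triple, uri, literal and henry are made up for the illustration and do not come from any RDF library, and the rdf: and foaf: prefixes are expanded by hand.

```java
import java.util.Arrays;
import java.util.List;

final class TripleSketch {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String FOAF = "http://xmlns.com/foaf/0.1/";

    /** One statement of the graph. All three parts are kept as strings here. */
    static final class Triple {
        final String subject;
        final String relation;
        final String object;

        Triple(String subject, String relation, String object) {
            this.subject = subject;
            this.relation = relation;
            this.object = object;
        }

        @Override
        public String toString() {
            return subject + " " + relation + " " + object + " .";
        }
    }

    /** Wraps a URI in angle brackets, as N-Triples does. */
    static String uri(String u) {
        return "<" + u + ">";
    }

    /** Wraps a plain string literal in double quotes (no escaping in this sketch). */
    static String literal(String s) {
        return "\"" + s + "\"";
    }

    /** The three statements of the example, about the blank node _:X. */
    static List<Triple> henry() {
        String x = "_:X";
        return Arrays.asList(
                new Triple(x, uri(RDF + "type"), uri(FOAF + "Person")),
                new Triple(x, uri(FOAF + "name"), literal("Henry Story")),
                new Triple(x, uri(FOAF + "mbox"), uri("mailto:hjs@bblfish.net")));
    }

    public static void main(String[] args) {
        henry().forEach(System.out::println);
    }
}
```

Nothing here knows about beans yet. The point is only that a graph is a set of such three-part statements, whatever notation one writes them in.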

Java as a serialisation of the Semantic Web

I'll argue, a little provocatively perhaps, that Java, or at least a superset of JavaBeans, can be thought of as a serialisation of RDF. RDFS [3] gives us a vocabulary to describe object oriented structures such as those given by Java Beans. It has predicates to specify that a URI is a class and that another is a relation. You can specify the domain and the range of a relation. This is all that is needed to describe a Java Bean.

Here is a first attempt [6] at annotating the AtomPerson class. (BlogEd had a bug when I wrote this that stopped me typing the less than and greater than symbols, so any ≤ and ≥ you see in the code on this page stand in for them.)

import java.net.URI;
import java.util.Collection;
import java.util.Date;

// @RDF is the annotation proposed in this post; it takes the URI as its value.
@RDF(AtomPerson.BASE+"Person")
public interface AtomPerson {
    String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";

    @RDF(BASE+"name")
    public void setName(String name);
    public String getName();

    @RDF(BASE+"email")
    public void addEmail(URI email);
    public Collection<URI> getEmails();

    @RDF(BASE+"dateOfBirth")
    public void setDateOfBirth(Date birth);
    public Date getDateOfBirth();
}
So here we annotate the class and the bean methods with URIs. Having done this we could then automatically serialise the above Java Bean in any of the other RDF formats (once one agrees to the correct serializations for some of the primitive java types, such as int, or Date). That is really it. That's how easy RDF is.
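To show what "automatically serialise" could mean in practice, here is a minimal sketch that reads the annotations by reflection and prints N-Triples-like lines. Several things in it are assumptions made for the illustration: the exact shape of the @RDF annotation (a single String value kept at run time), the class names RdfBeanSketch and SimplePerson, the toTriples helper, the cut-down interface with only the name property, and the choice to write every value as a plain string literal.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class RdfBeanSketch {
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /** Assumed shape of the @RDF annotation: one URI string, visible at run time. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    @interface RDF {
        String value();
    }

    /** Cut-down version of the interface above, with the name property only. */
    @RDF(AtomPerson.BASE + "Person")
    interface AtomPerson {
        String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";

        @RDF(BASE + "name")
        void setName(String name);
        String getName();
    }

    /** A hand-written bean, just so that there is something to serialise. */
    static final class SimplePerson implements AtomPerson {
        private String name;
        @Override public void setName(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    /** One line for the class annotation, then one line per annotated setter. */
    static List<String> toTriples(String subject, Object bean, Class<?> iface) {
        List<String> out = new ArrayList<>();
        RDF type = iface.getAnnotation(RDF.class);
        if (type != null) {
            out.add(subject + " <" + RDF_TYPE + "> <" + type.value() + "> .");
        }
        for (Method setter : iface.getMethods()) {
            RDF relation = setter.getAnnotation(RDF.class);
            if (relation == null || !setter.getName().startsWith("set")) {
                continue;
            }
            try {
                // The annotation sits on setXxx, so the value is read through getXxx.
                Method getter = iface.getMethod("get" + setter.getName().substring(3));
                Object value = getter.invoke(bean);
                if (value != null) {
                    out.add(subject + " <" + relation.value() + "> \"" + value + "\" .");
                }
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        Collections.sort(out); // getMethods() returns methods in no particular order
        return out;
    }

    public static void main(String[] args) {
        SimplePerson person = new SimplePerson();
        person.setName("Henry Story");
        toTriples("_:X", person, AtomPerson.class).forEach(System.out::println);
    }
}
```

A real serialiser would also have to settle how int, Date and the like are written as typed literals, which is exactly the agreement mentioned above.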

The above gives us the semantics for RDFS [3]. OWL [4] gives us more power, and brings us closer to (or even beyond I am not sure) what we can get with UML class diagrams. OWL allows us to specify the following properties of relations:

  • functional: if A rel B and A rel C then B == C
  • inverse functional: if A rel C and B rel C then A == B
  • the max number of values for an instance and the min number. This is useful for the add... type beans (the setXXX methods clearly have at most one value).
  • transitivity: if A rel B and B rel C then A rel C
  • symmetry: if A rel B then B rel A
So for example we could add more information to our AtomPerson class above by adding a few more annotations such as:
@RDF(AtomPerson.BASE+"Person")
public interface AtomPerson {
    String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";

    @RDF(BASE+"name")
    public void setName(String name);
    public String getName();

    @RDF(BASE+"email") @InverseFunctional
    public void addEmail(URI email);
    public Collection<URI> getEmails();

    @RDF(BASE+"dateOfBirth") @Cardinality(max=1,min=1)
    public void setDateOfBirth(Date birth);
    public Date getDateOfBirth();

    // a sibling is itself an AtomPerson
    @RDF(BASE+"sibling") @Symmetric @Transitive
    public void addSibling(AtomPerson sibling);
    public Collection<AtomPerson> getAllSiblings();
}
So the sibling relation (uniquely identified by the https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/sibling URI and by the addSibling and getAllSiblings methods) is clearly
  • symmetric: because if Christina is my sibling then I am her sibling
  • transitive: because if Alex is a sibling of Christina then Alex is also my sibling
There we have it. We now have written RDF and OWL in java.
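The step from these annotations back to OWL can be sketched with a little reflection. The following is a rough illustration, not a definitive mapping: the marker annotations are declared here in the simplest way I can think of, the names OwlAnnotationSketch and owlStatements are hypothetical, the output uses the rdf: and owl: prefixes without declaring them, and @Cardinality is left out because in OWL it is expressed as a restriction on a class rather than as a property type.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.net.URI;
import java.util.SortedSet;
import java.util.TreeSet;

final class OwlAnnotationSketch {
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    @interface RDF { String value(); }

    // Marker annotations, assumed to carry no values of their own.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    @interface InverseFunctional {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    @interface Symmetric {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    @interface Transitive {}

    /** Two of the relations from the interface above. */
    interface AtomPerson {
        String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";

        @RDF(BASE + "email") @InverseFunctional
        void addEmail(URI email);

        @RDF(BASE + "sibling") @Symmetric @Transitive
        void addSibling(AtomPerson sibling);
    }

    /** One Turtle-like statement per OWL marker found on an @RDF method. */
    static SortedSet<String> owlStatements(Class<?> iface) {
        SortedSet<String> out = new TreeSet<>();
        for (Method m : iface.getMethods()) {
            RDF relation = m.getAnnotation(RDF.class);
            if (relation == null) {
                continue;
            }
            String property = "<" + relation.value() + ">";
            if (m.isAnnotationPresent(InverseFunctional.class)) {
                out.add(property + " rdf:type owl:InverseFunctionalProperty .");
            }
            if (m.isAnnotationPresent(Symmetric.class)) {
                out.add(property + " rdf:type owl:SymmetricProperty .");
            }
            if (m.isAnnotationPresent(Transitive.class)) {
                out.add(property + " rdf:type owl:TransitiveProperty .");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        owlStatements(AtomPerson.class).forEach(System.out::println);
    }
}
```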

What does it give us?

  • For one we can see how easy it is to understand OWL :-) Please mail me if you still have not understood at henry.story at bblfish.net. I need to know how I can improve this overview. :-)
  • This should provide a much more generalised way of mapping JavaBeans to a database. Using URIs for beans is a lot better than using table names. URIs are *Universal*. SQL table layouts are different from one database to the next. It should be up to the database administrator only to set the mapping from RDF into his private internal scheme. We have done here all the mapping that needs to be done. This has been argued very clearly by the Model Driven Architecture crowd [7]. But I think the Semantic Web adds a generality to the whole concept that simplifies the problem, standardises it with web technologies, and thereby makes it far more accessible than the Model Driven Architecture crowd could make their framework. Furthermore the above annotation scheme seems a lot simpler than the JMI spec.
  • We can implement our beans by using dynamic proxies or other aspect oriented techniques to have our classes mirror an RDF or any normal relational (SQL) database
  • The above now maps very easily into UML class diagrams (which one can argue are just another notation for OWL)
  • We can annotate normal classes too (with some questions as to how the behavior is to be understood)
  • others?
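The dynamic proxy point in the list above can be sketched in a few lines with java.lang.reflect.Proxy. This is a toy under stated assumptions: the backing "database" is just a Map keyed by the property URI, the names ProxyBeanSketch and create are hypothetical, both getter and setter carry the @RDF annotation so that the handler can look the URI up directly, and only setXxx and getXxx are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

final class ProxyBeanSketch {
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    @interface RDF { String value(); }

    interface AtomPerson {
        String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";

        @RDF(BASE + "name") void setName(String name);
        @RDF(BASE + "name") String getName();
    }

    /** Returns an object implementing iface whose state lives in the given store. */
    static <T> T create(Class<T> iface, Map<String, Object> store) {
        InvocationHandler handler = (proxy, method, args) -> {
            RDF relation = method.getAnnotation(RDF.class);
            if (relation == null) {
                // toString, equals and hashCode also end up here in this sketch.
                throw new UnsupportedOperationException(method.getName());
            }
            if (method.getName().startsWith("set")) {
                store.put(relation.value(), args[0]); // the URI is the key
                return null;
            }
            return store.get(relation.value());
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        Map<String, Object> store = new HashMap<>();
        AtomPerson person = create(AtomPerson.class, store);
        person.setName("Henry Story");
        System.out.println(person.getName());
        System.out.println(store.keySet());
    }
}
```

Swapping the Map for an RDF store or a SQL table would not change the interface at all, which is the attraction of keeping the declaration and the implementation apart.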

Backward compatibility

I'll argue that all java beans by default already work this way. All we need to do is give every java class a URI. Luckily that is easy as a unique package naming mechanism was built right into java from the outset. Let us invent a URI scheme for java classes to make this more obvious. Then
java:com.sun.labs.tools.blog.AtomPerson,1.2/
would be the URI of the current class [6] and the name relation could be identified via the URI
java:com.sun.labs.tools.blog.AtomPerson,1.2/name
So any java bean can already be transformed into RDF. The RDF annotation we developed above could then be seen to enable us to override the default URI for the class, interface or property.
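The default naming can be computed mechanically. Here is a small sketch of that, assuming the invented java: scheme above, a version string that the caller supplies (where the 1.2 would come from is left open), and the hypothetical helper names classUri and propertyUri. java.util.Date stands in for the bean so that the example runs on its own.

```java
import java.util.Date;

final class DefaultUriSketch {
    /** java:<fully qualified class name>,<version>/ */
    static String classUri(Class<?> beanClass, String version) {
        return "java:" + beanClass.getName() + "," + version + "/";
    }

    /** Appends the property name derived from a getXxx, setXxx or addXxx method name. */
    static String propertyUri(Class<?> beanClass, String version, String accessorName) {
        String property = accessorName;
        boolean prefixed = accessorName.startsWith("get")
                || accessorName.startsWith("set")
                || accessorName.startsWith("add");
        if (prefixed && accessorName.length() > 3) {
            // getName -> Name -> name
            property = Character.toLowerCase(accessorName.charAt(3)) + accessorName.substring(4);
        }
        return classUri(beanClass, version) + property;
    }

    public static void main(String[] args) {
        System.out.println(classUri(Date.class, "1.2"));
        System.out.println(propertyUri(Date.class, "1.2", "getTime"));
    }
}
```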

Possible simplification

In the above I have only annotated the setter method. One could also annotate the getter, adder, getAll methods or even a field. This ends up creating too many places for annotations I think. Is there a standard solution for this? One option [8] would be to just annotate the field:
@RDF(AtomPerson.BASE+"Person")
public class AtomPerson {
    static final String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";

    @Access(read | write) @RDF(BASE+"name")
    @Cardinality(max=1,min=1) String name;

    @Access(read | write) @RDF(BASE+"email")
    @InverseFunctional URI email;

    @Access(read | write) @RDF(BASE+"dateOfBirth")
    @Cardinality(max=1,min=1)
    Date birth;

    @Access(read | write) @RDF(BASE+"sibling")
    @Symmetric @Transitive AtomPerson sibling;
}
And apparently we can then use apt to create the getters, setters, adders, etc... Getters would clearly be produced only for fields with read access and setters only for fields with write access. This is nice and terse. It will probably be used to generate the more verbose version we had previously, so this would mostly be something one could do for convenience. Still there may be many cases when this is all that is needed.

More Advanced Java Beans

In the above I have been working with a little more developed version of Java Beans, for which we clearly need to develop a . Java Beans don't distinguish very well between a thing that is related to a collection
a R (c,d,e,f)

and a thing having numerous relations of the same type
a R c
a R d
a R e
a R f
In Java Beans there are just setters and getters. RDF allows one to distinguish between those cases. So our more advanced java beans would need addRelation(Object) and getAllRelation() type methods to allow us to add single relation instances to our graph. (Perhaps we can distinguish these cases simply by using an annotation.) There is in fact a pragmatic reason why one may want to have an adder method. Sometimes I am sure it would be a lot faster to do an add operation to a database rather than having to fetch all the elements in order to add one more element to the collection. So there may also be some efficiency reasons for doing this apart from allowing us to map better to RDF and UML.
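The difference between the two shapes is easy to see once they are written out as statements. The sketch below is only an illustration: the class and method names are made up, and it borrows Turtle's parenthesised list notation for the collection case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AdderVersusSetterSketch {
    /** One statement whose object is a single collection: a R ( c d e f ) . */
    static String relatedToCollection(String subject, String relation, List<String> members) {
        return subject + " " + relation + " ( " + String.join(" ", members) + " ) .";
    }

    /** One statement per member: a R c . a R d . and so on. */
    static List<String> relatedToEach(String subject, String relation, List<String> members) {
        List<String> out = new ArrayList<>();
        for (String member : members) {
            out.add(subject + " " + relation + " " + member + " .");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList(":c", ":d", ":e", ":f");
        System.out.println(relatedToCollection(":a", ":R", members));
        relatedToEach(":a", ":R", members).forEach(System.out::println);
    }
}
```

A setter that takes a whole collection corresponds naturally to the first form, an adder called once per value to the second.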

Combined Inverse Functional Properties

One of the uses of Inverse Functional Properties is that they serve as hints to the database that the value is what in the well established relational database world (note that RDF is also all about relations) is known as a primary key. Given an e-mail address for example you can search in the [email inverse(mbox) person] table to find the Person that is identified by the email. inverse(X) is a function that gives us the inverse of a relation. And so inverse(mbox) is a functional relation, since mbox is an inverse functional one (duh!). But how do we deal with compound keys? I found these popping up all over the place in my work on BlogEd.

Compound keys are what I have named Combined Inverse Functional Properties (CIFP). They state that two objects, when known together, identify something uniquely. Let us imagine a world, a kind of Swiss cloud cuckoo land, where by design no two people ever have the same first name and surname. The first name and surname combination always identifies one and only one person. In such a world one could say that the relation from a person to the pair (first name, surname) is inverse functional. Call that relation the fullname relation. How can we now use annotations to specify such a relation?

Well perhaps the following will do the trick. As hinted above the relation we are looking for is towards an ordered pair. And how do we specify an ordered pair in java? With arguments! The arguments of a method are just a bit of syntactic sugar to help us specify a relation to a pair.

@RDF(FoafPerson.BASE + "Person")
interface FoafPerson {
    String BASE = "http://xmlns.com/foaf/0.1/";

    @RDF(BASE+"weblog") void addWeblog(URI uri);
    @RDF(BASE+"weblog") Collection<URI> getWeblogs();

    @RDF(BASE+"surname")
    @Functional String getSurname();
    @RDF(BASE+"surname") void setSurname(String surname);

    @RDF(BASE+"firstName") String getFirstName();
    @RDF(BASE+"firstName") void setFirstName(String firstName);

    // the relation to the ordered pair (firstName, surname)
    @RDF(BASE+"fullName") String[] getFullName();
    @RDF(BASE+"fullName") @InverseFunctional
              void setFullName(@RDF(BASE+"firstName") String firstName,
                               @RDF(BASE+"surname")   String surname);

    @RDF(BASE+"mbox")
    @InverseFunctional void addMbox(URI mbox);
    @RDF(BASE+"mbox") Collection<URI> getAllMbox();

}
So here we have the fullName relation, set through setFullName, which is a relation to a pair. In the arguments we specify further how each member of the pair is itself related to the object in question, by annotating the arguments themselves. This I believe gives us exactly what I was looking for.

So as this is getting just a little complex, let us put together an example of how this might work in practice. Let us imagine that we have a framework that maps our annotated interface to a deductive database. We might then have the following behavior.

   FoafPerson me = factory.create(FoafPerson.class);
   me.setFirstName("Henry");
   me.setSurname("Story");
   me.addMbox(URI.create("mailto:hjs@bblfish.net"));

   //let us create another person
   FoafPerson someone = factory.create(FoafPerson.class);
   someone.setFirstName("Henry");

   assert(!someone.equals(me)); //yes. They should be different. There could be another person named "Henry"
   
   //now let us add the family name
   someone.setSurname("Story");
   
   assert(someone.equals(me));

Well perhaps we don't even need an inferencing backend layer to deduce the above. That could be done with some very simple java. We might find the inferencing helpful if we wanted the following behavior to follow:
 
  assert("mailto:hjs@bblfish.net".equals(someone.getAllMbox().iterator().next().toString()));

The idea is that as soon as the object knows both my names it would deduce that the two variables me and someone refer to the same person and therefore know that all the other properties are the same too. Notice that this does not require some black magic inferencing layer, but some pretty standard simple inferencing on the equality of objects.
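That "pretty standard simple inferencing" could look roughly like the following. It is a sketch under several assumptions and not how BlogEd does it: the classes CifpSketch, Identity, Factory and Person are hypothetical, the (first name, surname) pair is hard-coded as the only compound key rather than read from annotations, and names are assumed never to change once both have been set.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CifpSketch {
    /** Everything known about one real person. Several handles may share it. */
    static final class Identity {
        String firstName;
        String surname;
        final Set<URI> mboxes = new LinkedHashSet<>();
    }

    /** Stand-in for the factory of the example. It owns the compound key index. */
    static final class Factory {
        private final Map<List<String>, Identity> byFullName = new HashMap<>();

        Person create() {
            return new Person(byFullName);
        }
    }

    static final class Person {
        private final Map<List<String>, Identity> byFullName;
        private Identity id = new Identity();

        Person(Map<List<String>, Identity> byFullName) {
            this.byFullName = byFullName;
        }

        void setFirstName(String firstName) { id.firstName = firstName; unify(); }
        void setSurname(String surname) { id.surname = surname; unify(); }
        void addMbox(URI mbox) { id.mboxes.add(mbox); }
        Set<URI> getAllMbox() { return id.mboxes; }

        /** Once both halves of the key are known, merge with any identity already indexed. */
        private void unify() {
            if (id.firstName == null || id.surname == null) {
                return;
            }
            List<String> key = Arrays.asList(id.firstName, id.surname);
            Identity known = byFullName.putIfAbsent(key, id);
            if (known != null && known != id) {
                known.mboxes.addAll(id.mboxes); // keep what this handle already knew
                id = known;                     // both handles now share one identity
            }
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Person && ((Person) other).id == id;
        }

        @Override
        public int hashCode() {
            // Changes when identities merge, so handles should not sit in hash sets.
            return System.identityHashCode(id);
        }
    }

    public static void main(String[] args) {
        Factory factory = new Factory();
        Person me = factory.create();
        me.setFirstName("Henry");
        me.setSurname("Story");
        me.addMbox(URI.create("mailto:hjs@bblfish.net"));

        Person someone = factory.create();
        someone.setFirstName("Henry");
        System.out.println(someone.equals(me));
        someone.setSurname("Story");
        System.out.println(someone.equals(me));
        System.out.println(someone.getAllMbox());
    }
}
```

The design choice is to let several handles point at one shared identity, so that merging two people is just a matter of repointing one of them.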

Prior Work

Frank McCabe made me aware that I had not mentioned similar work in the field. Here is a list of some of the other projects that I am aware of:
  • There is Jastor that works with HP's Jena framework to create Java Beans from OWL files. The Beans created are implemented classes that make calls in the Jena framework.
  • There is a very simple Elmo library that works with the Sesame framework, though I am not sure how extensible it is.
  • rdfreactor [9] like bloged currently creates interfaces and uses dynamic proxy objects to implement the behavior of the interfaces at run time. The interfaces are generated from OWL/XML files. I learnt a lot from this framework.
  • BlogEd 0.7 uses static final variables to annotate methods and interfaces. These interfaces are then used as with rdfreactor by a factory object to create dynamic proxy objects that wrap the behavior.
What I am proposing is closest to what rdfreactor does. But I think that by using annotations we are generalising on the work of all the above in a very useful way.
  • Annotations allow one to express OWL directly in Java, which makes the relation between RDF and Java much easier and clearer for Java programmers to understand, as well as making it much easier to work with.
  • Just as important, annotations allow one to separate the implementation of the behavior from the declaration of it. The behavior can now be coded directly into a class as with Jastor, or it can use Dynamic Proxies as rdfreactor and BlogEd are currently doing, or it could even use aspect oriented programming languages. This will allow one to write mappings from annotated java classes into RDF/OWL and vice versa without specifying the implementation. Other libraries will then more usefully specialise in interpreting the annotations in the way best suited to the database or framework used by the application. This should help interoperability between the RDF frameworks.

Conclusion

Mapping between the triples of the Semantic Web and Java Beans is really easy with annotations, which is quite weird if you think about it, since annotations are a way to add metadata to java, and the semantic web is the most general metadata framework in existence. In any case once this relationship is understood we have gone most of the way towards laying the foundations of an open, standards based, model driven architecture I believe. The Semantic Web builds on the most fundamental part of the Web: its universal naming scheme exemplified by URIs. This gives us something fundamental that all the previous systems lacked or had to invent in an ad hoc manner. This should therefore get us a lot further than any previous systems could. I'll be exploring this advantage further on this blog, and in my BlogEd code.

Does anyone have any feedback on this? Don't hesitate to mail your questions. I believe that if properly explained this is really not very difficult for Java programmers to take on board. So if you don't quite understand, it's my fault!

Update: thanks to feedback by Pete (UK) Kirkham and Danny Ayers.

  1. Annotations: Don't Mess with Java
  2. Annotations are the best thing that has happened to Java in a long time
  3. RDF Vocabulary Description Language 1.0: RDF Schema
  4. Web Ontology Working Group papers
  5. In the spirit of the article Using Annotations to add Validity Constraints to JavaBeans Properties. Though the type of constraints given by that paper would seem best encapsulated in Java by classes. A Date, a social security number, and any other object is probably best identified by a class, which then will have many different serialisations depending on the locale, serialisation format, etc... So though I think the article makes for some very good reading I don't think that it gives the best example for using annotations in java. RDF is a much more powerful and useful tool.
  6. It seems clear to me that setters and getters don't give us quite all that we want. It would be really nice if java beans also had addXXX and getAllXXX. See the section More Advanced Java Beans above.
  7. see especially the paper "Model-driven architecture: Vision, standards, and Emerging technologies" on the JMI page.
  8. see section "Constraints as Part of a Property Annotation" of [5]
  9. I just discovered that the RdfReactor team have submitted a paper to ISWC2005 where they mention that they want to use annotations in their next version. That is a very good read btw.
  10. These thoughts are now (Summer 2006) being developed on the so(m)mer project on dev.java.net.
Comments:

Henry: A few comments.
1. Do you like Omnigraffle?
2. Your characterization of OWL is interesting. There has been some work already in mapping OWL-style ontologies onto java; in particular mapping OWL classes to Java classes.
3. There are some strong limitations to OWL-style notations. For example, it is not possible to state that a property is functionally dependent on another (e.g., the wife of a husband is the same as the husband of a wife).
4. There is a lot more to Java than beans. In fact, JavaBeans are kind of ironic: a weakly typed formalism embedded in a strongly typed language.
Frank

Posted by Frank McCabe on August 28, 2005 at 02:23 PM CEST #

He! It is that easy to see that I used Omnigraffle for the illustration? It's a really good tool :-)

I am aware of a few other groups that have done some mapping from OWL to Java. There is a Jastor that works with HP's Jena to create Java Beans. Someone has written something similar for the Sesame group. In both those cases people go from ontologies written in OWL to Java. What is novel in my case is that by working with these annotations I am

  • allowing the Java developer to work directly with the tools he understands so well,
  • creating a good framework to which the OWL/RDF tools mentioned above can map,
  • not forcing a particular implementation of the annotations. The behavior can be implemented in many ways: I am using dynamic proxies currently, but this could be done just as well or better with aspects, especially as aspects start getting built right into the jvm (I heard BEA is doing this).
  • and I think more intriguingly, and perhaps this will require me to rewrite my blog entry a little, it looks like these annotations may allow one to step quite a way out of the Bean Box. I have already found an interesting way to map methods with multiple arguments to RDF... But as I said this needs thinking about a little more.

Thanks for pointing out the limitation of OWL. I had seen that mentioned before in some paper on April and Go that you had written. I imagine that these types of relations would then probably best be written using Tim Berners-Lee's N3 language and mapped into the intelligent database the state of which the beans are mirroring... Still I suppose I'll see over time how useful OWL really is.

I need to understand this idea that JavaBeans are a weakly typed formalism. What is useful about them though is that they are so well understood. I have heard people say that it is useless to consume RDF because what should one do with vocabulary that one might not understand. Might as well stick with simple XML, the argument of those skeptics goes. But by showing that RDF can be mapped to java beans simply, we have a decade of example tools to show them, where tools such as Netbeans (or Visual Basic) can consume objects that they only have very little understanding of, and yet present a very useful interface to interact with them.

Posted by Henry Story on August 28, 2005 at 07:29 PM CEST #

Hi henry, very good post. For OmniGraffle, I sent them an email something like two months ago, asking them if they could support RDF as an import format at least to draw graphs. They seemed to be interested by this. So if there's enough traction for it, maybe we will have an import feature in OmniGraffle.

Posted by karl Dubost on September 05, 2005 at 02:28 PM CEST #

That would be cool. All that would be needed would be for them to create a Swing version that's as good as what they've done on OSX and they'd be untouchable. Currently there is a gaping hole for someone to do something as beautiful that's cross platform, and it's bound to be filled...

Posted by Henry Story on September 05, 2005 at 06:37 PM CEST #

Unfortunately this does not work.

There are two problems:

1. Subproperties in OWL - maybe you can also annotate them as subproperty of but then again they can have different restrictions and domains on them. This tends to break OO.

2. Open world semantics : In OWL you do not know if a certain individual belongs to a certain class until stated - in other words there are three states for each individual-class (i,c) pair :

i is in c , (explicitly stated)
i is not in c, (explicitly stated)
we don't know if i is in c. (default)

in OO there are two states:
i is in c , (explicitly stated)
i is not in c, (default)

These two reasons are the main obstacles between the semantic web world and the OO world and AFAIK no one came with a good mapping solution yet.

Posted by Emek Demir on August 20, 2007 at 01:14 PM CEST #

Hi Emek,

It does not have to be a perfect mapping to be useful. It just has to be good enough for it to make life for OO programmers a lot easier. Tools such as Hibernate are very successful because they make it easier for developers to work with Relational databases. The mappings can't be perfect or else there would be no OO/Relational DB problem...

> 1. Subproperties in OWL - maybe you can also annotate them as subproperty of but then again they can have different restrictions and domains on them. This tends to break OO.

I would leave this to the RDF inferencing engine. It can relate a subproperty relation to a fact in the DB to deduce the existence of a subrelation mapped to a so(m)mer field.

> 2. Open world semantics : In OWL you do not know if a certain individual belongs to a certain class until stated - in other words there are three states for each individual-class (i,c) pair :
>
> i is in c , (explicitly stated)
> i is not in c, (explicitly stated)
> we don't know if i is in c. (default)

Well you are missing that given information about the domain and range of a relation you can deduce from a relation that an object is member of a class.

Otherwise, if you don't know then so be it. Don't map it.

> in OO there are two states:
> i is in c , (explicitly stated)
> i is not in c, (default)

Well you always know that an object is a java.lang.Object, whatever it is.

You can also map relations to HashMap<Object,Object>. Here you would know of a bunch of relations without knowing what types the objects are.

> These two reasons are the main obstacles between the semantic web world and the OO world and AFAIK no one came with a good mapping solution yet.

As you see I think it is a matter of pragmatics. I think OO programmers will find so(m)mer a lot more helpful than writing a lot of StatementIterators to step through the results of SPARQL queries.
This is also the best way to get immediate traction of the SemWeb with all the java libraries currently in existence.
Henry

Posted by Henry Story on August 20, 2007 at 02:02 PM CEST #
