Friday Jul 13, 2007

The limitations of JSON

A thread on REST-discuss recently turned into a JSON vs XML fight. I had not thought too deeply about JSON before this, but now that I have I thought I should summarize what I have learnt.

JSON has a clear and simple syntax, described on json.org. As far as I could see there is no semantics associated with it directly, just as with XML. The syntax does make space for special tokens such as numbers, booleans, etc., which of course one automatically presumes will be mapped to the equivalent types: i.e. things that one can add or compare with boolean operators. Behind the scenes, of course, a semantics is clearly defined by the fact that it is meant to be used by JavaScript for evaluation purposes. In this it differs from XML, which only assumes it will be parsed by XML-aware tools.

On the list there was quite a lot of confusion about syntax and semantics. The picture accompanying this post shows how logicians understand the distinction. Syntax starts by defining tokens and how they can be combined into well formed structures. Semantics defines how these tokens relate to things in the world, and so how one can evaluate the truth, among other things, of the well formed syntactic structure. In the picture we are using the NTriples syntax, which is very simple: a statement is just three URIs, or two URIs and a string, followed by a full stop. URIs are Universal Names, so their role is to refer to things. In the case of the formula

<http://richard.cyganiak.de/foaf.rdf#cygri> <http://xmlns.com/foaf/0.1/knows> <http://www.anjeve.de/foaf.rdf#AnjaJentzsch> .
the first URI refers to Richard Cyganiak, on the left in the picture, the second URI refers to a special knows relation defined at http://xmlns.com/foaf/0.1/ and depicted by the red arrow in the center of the picture, and the third URI refers to Anja Jentzsch, who is sitting on the right of the picture. You have to imagine the red arrow as being real - that makes things much easier to understand. So the sentence above is saying that the relation depicted is real. And it is: I took the photo this February during the Semantic Desktop workshop in Berlin.

I also noticed some confusion as to the semantics of XML. It seems that many people believe it is the same as the DOM or the Infoset. Those are in fact just objectivisations of the syntax. It would be like saying that the example above just consists of three URIs followed by a dot. One could speak of which URI follows which, and which one comes before the dot. And that would be it. One may even speak about the number of letters that appear in a URI. But that is very different from what that sentence is saying about the world, which is what really interests us in day to day life. I care that Richard knows Anja, not how many vowels appear in Richard's name.

At one point the debate between XML and JSON focused on which had the simpler syntax. I suppose XML, with its entity encoding and DTD definitions, is more complicated, but that is not really a clinching point. Because if syntactic simplicity were an overarching value, then NTriples and Lisp would have to be declared the winners. NTriples is so simple I think one could use the well known, very lightweight grep command line tool to parse it. Try that with JSON! But that is of course not what is attractive about JSON to the people that use it, usually JavaScript developers. What is nice for them is that they can immediately turn the document into a JavaScript structure. They can do that because they assume the JSON document has the JavaScript semantics. [1]

But this is where JSON shows its greatest weakness. Yes, the little semantics JSON data structures have makes them easy to work with. One knows how to interpret an array, how to interpret a number and how to interpret a boolean. But this is very minimal semantics. It is very much pre-web semantics. It works as long as the client and the server, the publisher of the data and the consumer of the data, are closely tied together. Why so? Because there is no use of URIs, Universal Names, in JSON. JSON has a provincial semantics. Compare this to XML, which makes room for the concept of a namespace specified in terms of a URI. To make this clearer let me look at the JSON example from the Wikipedia page (as I found it today):

{
    "firstName": "John",
    "lastName": "Smith",
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": 10021
    },
    "phoneNumbers": [
        "212 732-1234",
        "646 123-4567"
    ]
}

We know there is a map between something related to the string "firstName" and something related to the string "John". [2] But what exactly is this saying? That there is a mapping from the string firstName to the string John? And what is that to tell us? What if I find somewhere on the web another string, "prenom", written by a French person? How could I say that the "firstName" string refers to the same thing the "prenom" string refers to? This does not fall out nicely.

The provincialism is similar to that which led the xmlrpc specification to forget to put time zones on their dates, among other things, as I pointed out in "The Limitations of the MetaWeblog API". To assume that sending dates around on the internet without specifying a time zone makes sense is to assume that everyone in the world lives in the same time zone as you.
The web allows us to connect things just by creating hyperlinks. So to tie the meaning of data to a particular script in a particular page is not to take on the full thrust of the web. It is a bit like the example above, which writes out phone numbers but forgets the country prefix. Is this data only going to get used by people in the US? And what about the provincialism of using a number to represent a postal code? In the UK postal codes are written out mostly with letters. Now those two elements are just modelling mistakes. But if one is going to be serious about creating a data modelling language, then one should avoid making mistakes that are attributable to the idea that strings have universal meaning, as if the whole world spoke English, and as if English were not ambiguous. Yes, natural language can be disambiguated when one is aware of the exact location, time and context of the speaker. But on a web where everything should link up to everything else, that is not and cannot be the case.
That JSON is so closely tied to a web page should not come as a surprise if one looks at its origin as a serialisation of JavaScript objects. JavaScript is a scripting language designed to live inside a web page, with a few hooks to go outwards. It was certainly not designed as a universal data format.

Compare the above with the following Turtle (a subset of N3), which presumably expresses the same thing:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <http://www.w3.org/2000/10/swap/pim/contact#> .

<http://eg.com/joe#p>
   a foaf:Person;
   foaf:firstName "John";
   foaf:family_name "Smith";
   :home [
         :address [
              :city "New York";
              :country "USA";
              :postalCode "10021";
              :street "21 2nd Street";
         ]
   ];
   foaf:phone <tel:+1-212-732-1234>, <tel:+1-646-123-4567>;
.

Now this may require a little learning curve - but frankly not that much - to understand. In fact to make it even simpler I have drawn out the relations specified above in the following graph:

(I have added some of the inferred types)

The RDF version has the following advantages:

  • you can find out what any of the terms mean by clicking on them (append the name to the prefix) and doing an HTTP GET
  • you can make statements of equality between relations and things, such as
    foaf:firstname = frenchfoaf:prenom .
  • you can infer things from the above, such as that
    <http://eg.com/joe#p> a foaf:Agent .
  • you can mix vocabularies from different namespaces as above, just as in Java you can mix classes developed by different organisations. There does not even seem to be the notion of a namespace in JSON, so how would you reuse the work of others?
  • you can split the data about something in pieces. So you can put your information about <http://eg.com/joe#p> at the "http://eg.com/joe" URL, in a RESTful way, and other people can talk about him by using that URL. I could for example add the following to my foaf file:
    <http://bblfish.net/people/henry/card#me> foaf:knows <http://eg.com/joe#p> .
    You can't do that in a standard way in JSON because it does not have a URI as a base type (weird for a language that wants to be a web language, to miss the core element of the web, and yet put so much energy into all these other features such as booleans and numbers!)

Now that does not mean JSON can't be made to work this way, as the SPARQL JSON result set serialisation does. But it does not do the right thing by default. It is a bit like languages before Java that did not have Unicode support by default. The few who were aware of the problems would do the right thing; all the rest would just discover the reality of their mistakes through painful experience.

This does not take away from the major advantage that JSON has of being much easier to integrate with JavaScript, which is a real benefit to web developers. It should be possible to get the same effect with a few good libraries. The Tabulator project provides a JavaScript library to parse RDF, but it would probably require something like a so(m)mer mapping from relations to JavaScript objects for it to be as transparent to those developers as JSON is.

Notes

[1]
Now procedural languages such as JavaScript don't have the same notion of semantics as the one I spoke of previously. The notion of semantics defined there is a procedural one: namely two documents can be said to have the same semantics if they behave the same way.
[2]
The spec says that an "object is an unordered set of name-value pairs", which would mean that a person could have another "firstName", I presume. But I have also heard other people speak of those being hash maps, which only allow unique keys. Not sure which is the correct interpretation...


Monday Jul 02, 2007

refactoring xml

Refactoring is defined as "Improving a computer program by reorganising its internal structure without altering its external behaviour". This is incredibly useful in OO programming, and is what has led to the growth of IDEs such as NetBeans, IntelliJ and Eclipse, and is behind very powerful software development movements such as Agile and Extreme programming. It is what helps every OO programmer get over the insidious writer's block. Don't worry too much about the model or field names now, it will be easy to refactor those later!

If maintaining behavior is what defines refactoring of OO programs - change the code, but maintain the behavior - what would the equivalent be for XML? If XML is considered a syntax for declarative languages, then refactoring XML would be changing the XML whilst maintaining its meaning. So this brings us right to the question of meaning. Meaning in a procedural language is easy to define. It is closely related to behavior, and behavior is what programming languages do their best to specify very precisely. Java pushes this very far, creating very complex and detailed tests for every aspect of the language. Nothing can be called Java if it does not pass the JCP compatibility tests, if it does not act as specified.
So again, what is the meaning of an XML document? XML does not define behavior. It does not even define an abstract semantics specifying how the symbols refer to the world. XML is purely specified at the syntactic level: how one can combine strings to form valid XML documents, or valid subsets of XML documents. If there is no general mapping of XML to one thing, then there is nothing that can be maintained to retain its meaning. There is nothing in general that can be said to be preserved by transforming one XML document into another.
So it is not really possible to define the meaning of an XML document in the abstract. One has to look at subsets of it, such as the Atom syndication format. These subsets are given more or less formal semantics. The Atom syndication format is given an English-readable one, for example. Other XML formats in the wild may have none at all, other than what an English reader will be able to deduce by looking at them. Now it is not always necessary to formally describe the semantics of a language for it to gain one. Natural languages, for example, do not have formal semantics; they evolved one. The problem with artificial languages that don't have a formal semantics is that in order to reconstruct it one has to look at how they are used, and so one has to make very subtle distinctions between appropriate and inappropriate uses. This inevitably ends up being time consuming and controversial. Nothing that is going to make it easy to build automatic refactoring tools.

This is where frameworks such as RDF come in very handy. The semantics of RDF are very well defined using model theory. This defines clearly what every element of an RDF document means, what it refers to. To refactor RDF is then simply to make any change that preserves the meaning of the document. If two RDF names refer to the same resource, then one can replace one name with the other: the meaning will remain the same, or at least the facts described by the one will be the same as those described by the other, which may be exactly what the person doing the refactoring wishes to preserve.
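
As a small illustration, here is what such a rename refactoring could look like in Turtle (the URIs are made up for the example): given a statement of equality between two names, a tool can rewrite triples that use one name into triples that use the other without changing what is said about the world.

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dc:  <http://purl.org/dc/elements/1.1/> .

# before: a fact stated using a local name for the author
<http://eg.com/docs/report> dc:creator <http://eg.com/staff#jdoe> .

# the two names are declared to refer to the same resource
<http://eg.com/staff#jdoe> owl:sameAs <http://other.eg.org/people/jdoe#me> .

# after refactoring: the same fact, stated with the other name
<http://eg.com/docs/report> dc:creator <http://other.eg.org/people/jdoe#me> .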

In conclusion: to refactor a document is to change it at the syntactic level whilst preserving its meaning. One cannot refactor XML in general, and in particular instances it will be much easier to build refactoring tools for documents with clear semantics. XML documents that have clear RDF interpretations will be very easy to refactor mechanically. So if you are ever asking yourself what XML format you want to use: think how useful it is to be able to refactor your Java programs. And consider that by using a format with clear semantics you will be able to make use of similar tools for your data.

Friday Jun 15, 2007

semantics of invalid passports

For those travelers out there who are into XML and semantics, the question is: how does one specify, semantically, that a valid passport has a passport number in it?

In OWL one can specify that a relation has a cardinality. So for example the mortal:parent relation, with domain and range of mortal:Human, would usually be defined as having cardinality 2. So whenever we have something that is a human, we can deduce that it has two parents, even if we don't know who they are.
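
In OWL's RDF syntax such a constraint could be written roughly as follows (the mortal: namespace is of course made up for the example):

@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix mortal: <http://eg.com/mortal#> .

mortal:parent a owl:ObjectProperty ;
    rdfs:domain mortal:Human ;
    rdfs:range  mortal:Human .

# every Human is related by mortal:parent to exactly two things,
# even when nothing else is known about them
mortal:Human rdfs:subClassOf [
    a owl:Restriction ;
    owl:onProperty mortal:parent ;
    owl:cardinality "2"^^xsd:nonNegativeInteger
] .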

Note that this is not expressible in a DTD or even in RelaxNG. The best one could say is that a <Person> element could have 0 to 2 <parent> elements. So one could have something like this:

<Person>
  <name>Henry Story</name>
  <parent><Person>...</Person></parent>
  <parent><Person>...</Person></parent>
</Person>
One cannot say that it must have 2 such elements without thereby specifying documents that are necessarily infinitely long, since each parent is a Person, and so would itself have to have a parent element, etc... With XML we are stuck at the level of syntax.

Working at the level of syntax does have some advantage: it is obvious how to query for the existence of information. One just searches the document using XQuery. So say I have an XML passport and I want to find out if it contains a number: it would be easy to find out by searching for the passportNumber attribute, for example. The disadvantage is that the query will only succeed on certain types of XML documents, namely those that put the information in that spot. It won't work with information written in other XML passport formats, or with real paper passports.

Now how does one specify that a passport has a number printed in it? We don't want to say that a doc:Passport has a relation doc:passportNumber with a cardinality of 1. Though that seems correct, it would fail to help us find invalid passports that did not have a number printed on them, since

  • an OWL reasoner would add the relation to a blank node anyway by following the suggested OWL cardinality rule.
  • there could be a statement of the passport number written down somewhere else completely, which might have been merged with the information about the passport. A passport with a passport number written on a separate piece of paper won't help you cross the border...
  • The passport might have had a passport number in it until I cut that information out of the passport, or it got erased in some way by a mischievous border guard. The government databases would still attribute a passport number to my passport. So as soon as I asked them what it is, I would end up having correct knowledge of my passport, yet my passport would still be invalid.

Here is the solution presented by Tim Berners-Lee:

OWL cardinality might say that a person must have at least one passport number, but it can NOT say that a document about a person contains information about at least one passport number.

N3 rules can, with log:semantics (which relates a document to the graph you get from parsing it), log:includes, and nested graphs:

@forAll x, p, g1.
{   x a Person; passport p.
    p log:semantics g1.
    g1 log:notIncludes { x passportNumber [] }
}
    =>
{   p a InvalidPassport }.

On the semantic web, as anyone can in principle say anything about anything, you can never make statements about how many passportNumber statements there are without specifying the document in question.

A passport is quite clearly both a document and an object. As an object it can have properties such as being in your pocket. As a document it tells us something about the world, among other things information about the owner of the passport. There are many sources of information in the world. If one wants to find out what possible worlds a particular source of information describes, one has to limit one's query to that source of information.

Note: Semantics of Graphs

Following David Lewis, I like to think of a graph as a set of possible worlds which satisfy the patterns of the graph. Tim Berners-Lee's formula is saying: find me the set of possible worlds that correctly interpret the passport I have. If this set includes worlds where I don't have a passport number, then my passport is invalid. That is because I can only have such worlds in my interpretation of my passport if I don't have the number on my passport.

This interpretation of graphs must be a little too strong semantically, as it leads to the following problem:

How does one query for documents about mathematical truths? If a document says that "2+2=4" it will be true in all possible worlds, just like the document that says "1+1=2", and so a query for the one would be satisfied just as well by the other. Perhaps here one has to query literally, namely for the string or regex "2+2=4".

Monday Aug 28, 2006

crystalizing rdf

Today I am going to coin a new term: "rdf crystalization". According to today's Wikipedia a crystal

is a solid in which the constituent atoms, molecules, or ions are packed in a regularly ordered, repeating pattern extending in all three spatial dimensions.
Generally, crystals form when they undergo a process of solidification. Under ideal conditions, the result may be a single crystal, where all of the atoms in the solid fit into the same crystal structure.

As explained previously, rdf allows one to describe relations between objects. Since we live in a post-Einsteinian world, we know that there is no center of the world, no central object from which everything can be described. Every object can be taken as a center, and from there we can describe everything else. So we can just describe things one fact at a time, the way NTriples does:

<http://bblfish.net/people/henry/card#me> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://bblfish.net/people/henry/card#me> <http://www.w3.org/2000/10/swap/pim/contact#home> _:L28C17 .
_:L28C17 <http://www.w3.org/2003/01/geo/wgs84_pos#lat> "48.404532" .
_:L28C17 <http://www.w3.org/2003/01/geo/wgs84_pos#long> "2.700448" .

(There is an XML equivalent for this called TriX btw.)

or we can write it out by taking one object as the root and trying to form some tree structure of relations from it, as I do in my N3 foaf file (N3 is a superset of NTriples). I don't completely succeed there, as there are two roots: one describing me and one describing the document. But anyway,...
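
For comparison, here are the same four facts as the NTriples above, rewritten with the person as the root (the graph is exactly the same, only the layout changes):

@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .
@prefix geo:     <http://www.w3.org/2003/01/geo/wgs84_pos#> .

<http://bblfish.net/people/henry/card#me> a foaf:Person ;
    contact:home [
        geo:lat  "48.404532" ;
        geo:long "2.700448"
    ] .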

So what is rdf crystalization? Well, the above makes it clear how fluid rdf is. You can put your facts in any order, you can organise them in trees if it makes them easier to read, etc, etc, ... But what xml people really need is a more deterministic structure, so that they can easily parse their documents. They need something more solid. Crystalizing an rdf graph, therefore, is to serialise it into an xml format that has a RelaxNG schema (or something similar), and that is easy to manipulate using tools such as XSLT, XPath, XQuery or the DOM.

There are two types of xml documents one can crystalize an rdf graph into.

  1. Ideally one will try to crystalize it to rdf/xml. But to do this one will need to specify a RelaxNG schema in addition to the rdf in order to give the rdf/xml structure. An excellent example of this is the RSS1.1 spec. Doing this can be a good way to:
    • make the rdf more widely consumable - since more people have access to DOM and XSLT tools,
    • make the xml easier to specify - since now you get a well defined semantics for free,
    • and, by carefully specifying RelaxNG extension points, make one's format predictably extensible, that is, it will be clear how to interpret extensions to the xml.
  2. Practically, one will at present most often need to crystalize a graph to plain xml, like Atom. The advantage here is that one picks up all the consumers of that xml. One usually loses clear semantics and extensibility, but the data can always be extracted again using GRDDL.
Now as more people get a good understanding of both rdf and xml, I believe we will end up with a better understanding of what types of improvements one should make to rdf/xml in a version 2 of the spec to ease such crystalizations, i.e. to make rdf/xml crystalizations resemble something xml experts would be really comfortable looking at as plain xml.

And funnily enough, I am told that people do calculate an rdf (radial distribution function) in studying crystallization :-)

Tools

There are few tools to help crystalize an rdf graph at present. Ian Davis has been working on RDFT, a template language that allows one to declaratively crystalize an rdf graph. This certainly looks like something every rdf developer needs in his chest of tools. I look forward to an RDFT Java implementation.

Examples

  • The RSS1.1 spec is a very good example of an rdf/xml crystalization.
  • The French statistics institute INSEE has made its LDAP information available as xml. I don't quite agree with that article that we need a new query language, but it is a good example of how to think about crystalizations.
  • The INSEE has also made its geographical information available online in a crystalized form.

I have started a wiki page on this topic where people can post further good examples, ideas, and techniques.


Wednesday Aug 09, 2006

All XML roads lead to RDF

Where I prove that thinking of the web as a database does not mean thinking of it in terms of an xml database. Rather, we will have to think in terms of graphs.

XML applied to Documents

XML stands for eXtensible Markup Language. The most successful markup language is and remains html and its cleaned up xhtml successor. So let me start with this.

Xhtml is designed to mark up strings such as the following "Still pond A frog jumps in Plop!" with presentation information. In a wysiwyg editor I would do this by highlighting parts of the text using my mouse, then setting a property for that part of the text by clicking some button, such as bold, available to me from some menu. XML allows me to save this information in the document by giving me a way to create arbitrary parenthesizing properties. So for example, using the xhtml markup language, the following <blockquote><font face="papyrus,helvetica">Still pond<br/>A frog jumps in<br/>Plop!<br/></font></blockquote> should display in a compliant browser as

Still pond
A frog jumps in
Plop!

XML applied to Data

Take some information written out one way or another. This will usually be in some tabular format, as in a receipt, a spreadsheet or a database table. Take this receipt I have in front of me for example. The text on it is "1 Fresh Orange Juice 1,95 1 Café au Lait 1,20 1 Sparkling Water 1,40". This is not very easy to parse for a human, let alone a machine. But extrapolating from the experience with html, I can use xml to mark the text up like this:

<bill>
  <item><description>1 Fresh Orange Juice</description><price currency="euro">1,95</price></item>
  <item><description>1 Café au Lait</description><price currency="euro">1,20</price></item>
  <item><description>1 Sparkling Water</description><price currency="euro">1,40</price></item>
</bill>

This is now much more accessible to computer automation, as the machine can deduce from the tags what interpretation to give the enclosed string. And so was born the great enthusiasm for xml formats and web services.

World Wide Data

It is clear that by following the above procedure we can create machine-readable documents for every type of data, stored in every kind of database available worldwide. We just need to mark it up. Wait. We need to do more. We need to agree on a vocabulary and a tree-like way of displaying it, since xml forms a markup tree. Let us assume for this article that the naming problem has somehow been solved, and let's look more closely at the data format problem.

Say I want to describe a house. Then I will want to have an xml format something like this:

<house>
   <owner>...</owner>
   <address>...</address>
   <rooms>
     <room>...</room>
     ...
   </rooms>
</house>

but if I want to describe a person I would of course describe them like this

<person>
   <name>...</name>
   <owns><house>...</house></owns>
   <friend>...</friend>
   <friend>...</friend>
   ....
</person>

In the first case the person object is part of the house document, whereas in the second case the house information is part of the person document. Both are equally valid ways of doing this. This is not an isolated case. It will happen whenever we wish to describe some object. No object has priority over any other. Here is another example. We may want to describe a book like this:

<book>
   <author>Ken Wilber</author>
   <title>Sex, Ecology, Spirituality</title>
   ...
</book>

but of course if I had a CV/resumé of Ken Wilber then my xml would be like this

<CV>
  <name>Ken Wilber</name>
  <publications>
     <book>...</book>
  </publications>
</CV>

again there is no natural way of putting things. In one case the <book> element is the root of the tree, in another it is an element of the tree. It follows that every type of object will require its own type of document to describe it. This would not be a problem if the world were composed just of turtles. But it isn't: there are an infinite number of types of things in our very rich world. Furthermore, what is of interest in each type of document depends completely on the context. In one type of document we may be more interested in the friends a person has, in another in his medical history, in yet another his academic achievements, etc... So there is not even one objective way to describe anything! If we were to create a tree-structured document to describe every type of thing we are interested in, we would therefore also need to create an uncountable number of document formats, one for every different way we wanted to describe each class of objects.
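
In rdf the dilemma disappears, because neither the book nor the person needs to be the root: both are just nodes in a graph. The same facts as in the two xml fragments above could be written, for example, like this (the URIs here are invented for the sake of the example):

@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://eg.com/books/ses> a foaf:Document ;
    dc:title   "Sex, Ecology, Spirituality" ;
    foaf:maker <http://eg.com/people/kenwilber#kw> .

<http://eg.com/people/kenwilber#kw> a foaf:Person ;
    foaf:name "Ken Wilber" .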

Conclusion

This is summarized simply by saying "The World is a Graph". The world can just be described holistically as consisting of objects and relations between those objects. Take any object in the world and you will be able to reach any other object by following relations stemming from it. Make that type of object the root of your tree, and you have an xml format.
So the problem is not so much that it is not possible to describe each subgraph we find using XML. One can! The problem emerges rather when considering the tools required to query and understand these documents. It is clear from the arguments above that when thinking web wide, one has to give up the idea that information will reach one in a limited number of hierarchically structured ways. As a result, tools such as XQuery, which are designed to query documents at the xml structure level, are not adapted for querying information across documents, since the tree structure of the xml documents gets in the way of the graph that the world is and that the documents are attempting to describe. XQuery people know this, which is why they don't like RDF. But it is not RDF that is the problem. It is reality that is the problem. And that is a lot more difficult to change.

To repeat: if RDF had never been invented, your database of documents would end up containing an infinitely large number of different types of xml documents describing the infinite types of objects out there, each of course requiring its own specific interpretation (since XML does not come with a semantics). And so you may as well start off using RDF, since that is where you will end up anyway.

The world is an interconnected graph of things. RDF allows one to describe the world as a graph. SPARQL is the query language to query such a graph. Use the tools that fit the world!

Note

  • This is not to say that rdf/xml is perfect. I myself believe it is a really good first attempt at trying to do something very ambitious. Sadly it was done a little too early. Something better will certainly come along. In the meantime it is good enough for nearly everything anyone may want to do with it when wishing to send data out on the web.
  • Having many XML documents is not a problem for the Semantic Web, since it is easy to convert each of the formats to rdf using GRDDL.