Wednesday Oct 07, 2009

Sketch of a RESTful photo Printing service with foaf+ssl

Let us imagine a future where you own your data. It's all on a server you control, under a domain name you own, hosted at home, in your garage, or on some cloud somewhere. Just as your OS gets updates, so all your server software will be updated and patched automatically. The user interface for installing applications may be as easy as installing an app on the iPhone (as La Distribution is doing).

A few years back, with one click, you installed a myPhoto service, a distributed version of fotopedia. You have been uploading all your work, social, and personal photos there. These services have become really popular, and all your friends work the same way too. When your friends visit you, they are automatically and seamlessly recognized using foaf+ssl in one click. They can browse the photos you took with them, share interesting tidbits, and more... When you organize a party, you can put up a wiki where friends of your friends have write access, can leave notes as to what they are going to bring, and can say whether or not they are coming. Similarly your colleagues have access to your calendar, your work documents and your business related photos. Your extended family, defined through linked data of family relationships (every member of your family just needs to describe their relation to their close family network), can see photos of your family, see the videos of your newborn baby, and organize Christmas reunions, as well as tag photos.

One day you wish to print a few photos. So you go to a web site we will provisionally call print.com. Print.com is neither a friend of yours, nor a colleague, nor family. It is just a company, and so it gets minimal access to the content on your web server. It can't see your photos, and all it may know of you is a nickname you like to use, and perhaps an icon you like. So how are you going to allow print.com access to the photos you wish to print? This is what I would like to sketch a solution for here. It should be very simple, RESTful, and work in a distributed and decentralized environment, where everyone owns and controls their data, and is security conscious.

Before looking at the details of the interactions detailed in the UML Sequence diagram below, let me describe the user experience at a general level.

  1. You go to the print.com site after clicking on a link a friend of yours suggested on a blog. On the home page is a button you can click to add your photos.
  2. You click it, and your browser asks you which WebID you wish to use to identify yourself. You choose your personal ID, as you wish to print some personal photos. Having done that, you are authenticated, and print.com welcomes you using your nickname and displays your icon on the resulting page.
  3. When you click a button that says "Give Print.com access to the pictures you wish us to print", a new frame is opened on your web site.
  4. This frame displays a page from your server, where you are already logged in. The page recognizes you and asks if you want to give print.com access to some of your content. It gives you information about print.com's current stock value on NASDAQ, and recent news stories about the company. There is a link to more information, which you don't bother exploring right now.
  5. You agree to give Print.com access, but only for 1 hour.
  6. When your web site asks you which content you want to give it access to, you select the pictures you would like it to have. Your server knows how to do content negotiation, so even though copying each one of the pictures over is feasible, you'd rather give print.com access to the photos directly, and let the two servers negotiate the best representation to use.
  7. Having done that you drag and drop an icon representing the set of photos you chose from this frame to a printing icon on the print.com frame.
  8. Print.com thanks you, shows you icons of the pictures you wish to print, and tells you that the photos will be on their way to the address of your choosing within 2 hours.

In more detail then we have the following interactions:

  1. Your browser GETs print.com's home page, which returns a page with a "publish my photos" button.
  2. You click the button, which starts the foaf+ssl handshake. The initial ssl connection requests a client certificate, which leads your browser to ask for your WebID in a nice popup, as the iPhone can currently do. Print.com then dereferences your WebId (2a) to verify that the public key in the certificate is indeed correct. Your WebId (Joe's foaf file) contains information about you, your public keys, and a relation to your contact addition service. Perhaps something like the following:
    :me xxx:contactRegistration </addContact> .
    Print.com uses this information when it creates the resulting html page to point you to your server.
  3. When you click the "Give Print.com access to the pictures you wish us to print" you are sending a POST form to the <addContact> resource on your server, with the WebId of Print.com <https://nasdaq.com/co/PRNT#co> in the body of the POST. The results of this POST are displayed in a new frame.
  4. Your web server dereferences Print.com's WebId, getting some information about the company from the NASDAQ URL. Your server puts this information together (4a) in the html it returns to you, asking what kind of access you want to give this company, and for how long.
  5. You give print.com access for 1 hour by filling in the forms.
  6. You give access rights to Print.com to your individual pictures using the excellent user interface available to you on your server.
  7. When you drag and drop the resulting icon depicting the collection of the photos accessible to Print.com, onto its "Print" icon in the other frame - which is possible with html5 - your browser sends off a request to the printing server with that URL.
  8. Print.com dereferences that URL which is a collection of photos it now has access to, and which it downloads one by one. Print.com had access to the photos on your server after having been authenticated with its WebId using foaf+ssl. (note: your server did not need to GET print.com's foaf file, as it still had a fresh version in its cache). Print.com builds small icons of your photos, which it puts up on its server, and then links to in the resulting html before showing you the result. You can click on those previews to get an idea what you will get printed.
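As an aside, the identity check at the heart of steps 2 and 8 - dereference the WebId and compare the public key published there with the one presented in the client certificate - is small enough to sketch in code. Below is a rough Python sketch using rdflib; the cert and rsa vocabulary terms are those the foaf+ssl drafts were discussing at the time, and the details (key layout, literal encodings) should be read as assumptions, not as a spec.

    # Sketch of the server-side foaf+ssl check (e.g. at print.com).
    # Assumes the foaf file describes keys with the cert/rsa vocabulary.
    from rdflib import Graph, Namespace, URIRef

    CERT = Namespace("http://www.w3.org/ns/auth/cert#")
    RSA = Namespace("http://www.w3.org/ns/auth/rsa#")

    def webid_checks_out(webid, cert_modulus, cert_exponent):
        """Dereference the WebId (step 2a) and see whether any published
        RSA key matches the modulus and exponent (given as strings) taken
        from the client certificate presented during the TLS handshake."""
        graph = Graph()
        graph.parse(webid)  # an HTTP GET of the foaf file
        for key in graph.subjects(CERT.identity, URIRef(webid)):
            modulus = graph.value(key, RSA.modulus)
            exponent = graph.value(key, RSA.public_exponent)
            # normalisation of hex/decimal literals is glossed over here
            if (str(modulus), str(exponent)) == (cert_modulus, cert_exponent):
                return True
        return False

If a key matches, the server knows the agent at the other end of the TLS connection controls that WebId; everything after that is ordinary access control.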

So all the above requires very little in addition to foaf+ssl. Just one relation, to point to a contact-addition POST endpoint. The rest is just good user interface design.

What do you think? Have I forgotten something obvious here? Is there something that won't work? Comment on this here, or on the foaf-protocols mailing list.


Thursday May 14, 2009

FOAF+SSL: RESTful Authentication for the Social Web

The European Semantic Web Conference (ESWC) will be held in Heraklion on the Island of Crete in Greece from 31 May to 4 June. I will be presenting the paper "FOAF+SSL: RESTful Authentication for the Social Web" which I co-authored with Bruno Harbulot, Ian Jacobi and Mike Jones. Here is the abstract:

We describe a simple protocol for RESTful authentication, using widely deployed technologies such as HTTP, SSL/TLS and Semantic Web vocabularies. This protocol can be used for one-click sign-on to web sites using existing browsers — requiring the user to enter neither an identifier nor a password. Upon this, distributed, open yet secure social networks and applications can be built. After summarizing each of these technologies and how they come together in FOAF+SSL, we describe declaratively the reasoning of a server in its authentication decision. Finally, we compare this protocol to others in the same space.

The paper was accepted by the Trust and Privacy on the Social and Semantic Web track of the ESWC. There are quite a number of interesting papers there.

I have never been to Greece, so I have a feeling I will really enjoy this trip. Hope to see many of you there.

Tuesday Mar 03, 2009

The foaf+ssl paradigm shift


Foaf+SSL builds on PKI, whose paradigmatic example is that of a traveller crossing a frontier and showing his passport. The problem is that this analogy breaks down for foaf+ssl (wiki page), and that can make it difficult to understand what is going on. What is required is a paradigm shift, and I will walk you through it here. (Thanks to an educational exchange with Bruno Harbulot on the foaf-protocols mailing list.)

Traditional PKI

So first let us step into the old paradigm. You arrive at a web site. It asks you for your certificate, which is somewhat like being asked for a passport at a border. In this analogy, then, you are playing the role of the traveller looking to cross the border (access a resource), and the server is playing the role of the border patrol officer, whose job is to let you through only if you are authorized to do so.

So of course you hand over a certificate. This contains a number of things:

  • Your identifier. This can be the passport number. In X509 it is the Distinguished Name. With foaf+ssl we are also making use of the subject alternative name, a WebId.
  • Something to tie the certificate to you. In the case of the passport this would be the photo, which should match your face. In the certificate this role is played by the public key that corresponds to the private key only you possess. The public key you send is what others see of your face, i.e. the photo on the passport; the private key is your face itself.
  • There may be a few other things written in the passport about you, such as your age, your birthplace, etc...
  • The whole passport is designed to be recognisable as having been published by the Government which issued it - usually it is also signed by them. And indeed in the certificate space we have the same thing: a Certificate Authority takes the place of the Government and signs your certificate.
By handing over this passport to the officer, or by sending your certificate to the server, you have completed your role in identifying yourself.

The server receiving this certificate, playing the role of the border patrol agent, now himself needs to continue the process:

  • First he must identify you. That is, given the passport, his task is to verify that you are indeed the person it refers to. He can do this simply by verifying that the picture matches your face. In the case of TLS this is done through the cryptographic mechanism that established the https connection.
  • Next the officer must verify that the information in the passport was issued by a government agency he trusts. This is the authentication step. (Authentication, from Greek αυθεντικός, real or genuine, from authentes, author.) The officer verifies that the passport is genuinely from the government. To do this he verifies the watermarks, checks for signs of tampering, etc... In the case of the server this is very easy to do using encryption. The certificate is signed by the Certificate Authority in such a way as to make it extremely difficult to tamper with. By verifying the certificate's integrity, and that the signature of the CA matches the one it has on file, the server can be confident that the information it is reading was stated by the Authority it trusts. Since it trusts that authority it can believe the certificate's contents.
  • Having accepted the contents, he can trust that the identifier refers to you, and, finding that you are not on a blacklist, can authorize you to cross the border.

How the analogy breaks down

The problem with that analogy is that it does not help one to understand foaf+ssl, because with foaf+ssl the certificate presented to the server is self signed, and the identity self created!


To see clearly how the analogy breaks down, imagine what would happen if we mapped foaf+ssl back to our border patrol situation. Imagine you arrive happily at the border and the officer asks you for a passport. You give him a piece of paper nicely crafted on your color laser printer at home, with your photo on the right hand side, your self-created WebId - a URL you coined a few days before - your name, and your signature below. On the paper you put a nice logo, saying "Issued on 1 Jan 2009 by Me, valid for 1 year."

Now, I don't recommend doing this during times of high tension between the countries on either side of the border, or unless you have some serious reason to believe that the officers have a good sense of humor. If they do, you can be certain you will merely be sent back where you came from, and not into some dingier place with bars instead of windows.

A better analogy

So let's leave those dreary, bureaucratic and slow moving border control situations, where novelty is frowned upon, far behind us, and instead try a different example.

So imagine now that you are going to a masked party to which only a preselected group of people were invited, you among them. As you arrive in your RoboCop costume, which completely covers your body and muffles your voice, you present a note on which your ID is typewritten. Having verified that the person with that ID is indeed authorized to join, the guard at the door asks you to move your right arm up and down three times, which you do, and then to wiggle your bottom as best you can, which you do. Satisfied that he has identified you, the guard lets you in, and you go party.

So, in your opinion, is this guard doing his job correctly? Has he correctly authorized you? Let me add here that the ID you gave him is a public ID, and that the list of invited people is also public! What do you think? Because this is really not far from the foaf+ssl solution...

Well, it all depends on how the guard came to ask you to move your hand! Imagine that your ID was the URL <tel:+1.510.931.5491>, and that the guard took out his cell phone and called that number. You, in the depths of your RoboCop costume, receive the call. The guard asks: "Hi, are you in front of the party now?" You answer "yes", and the guard hears the voice on the phone answer "yes". He asks you to move your right hand up and down three times, which you immediately do. He asks you to do your best to wiggle your bottom. Which you do. Now has he not identified you as being <tel:+1.510.931.5491>?

Are you thinking: "well yes, but could I not - being inside the costume - have just overheard what the guard was saying, even had I not received the call?" If that thought bothers you, then replay the same scenario, but this time change the ID you hand over to an email address, and have the guard send you an email, which you also receive on your cell phone. This is something we do all the time when signing up to web sites.

Notice now how similar this is to the foaf+ssl protocol. There you give the guard an https URL, and he queries that URL with an HTTP GET. That returns a response containing your public key, which is the one you are using to communicate with the guard, thereby clearly tying you to your ID. Once that link is made, the guard can go straight to the authorization step: are you, or are you not, on the invited people's list?

The Web of Trust

We have been using email URLs for a long time to identify ourselves on sites. So what does foaf+ssl add that we did not have before? Well, it does the same thing in a RESTful manner. REST is the architectural style on which the most successful hypermedia system ever was built. It is designed to make hypermedia easy. The advantage of building in this style is that it is very easy to link information together. So just as the original Web made it very easy to link documents together, by following this style into the hyperdata space we make it easy to link things together. By making identity RESTful we have laid the basic building blocks on which to build a web of trust.

So to illustrate just a little more how this works, let us extend the access rules in our example slightly. To be allowed access to the party you either have to be on a list, or you have to be a friend of someone on the list. This just helps regulate the party somewhat, so that there are clear chains of responsibility that can be drawn on in case of trouble. This time you are not on the list. You are still in your large RoboCop costume, and the guard calls you to ask whom you know. You say you know <tel:+44-161-275-014>. The skeptical guard does not, of course, take everyone at their word, so he does not let you in on this basis alone: you could be lying. But it is easy to verify. The guard checks that <tel:+44-161-275-014> is indeed a core guest of the party, and having checked that, just calls the number and asks the person whether he knows <tel:+1.510.931.5491>. If he answers yes, you are authorized.

With foaf+ssl we can do the same without requiring direct human intervention. By giving your WebId to the guard, he can "call" the WebId using HTTP GET (clicking on the link) and see what information that link returns. If the information returned identifies you (as it does, by returning your public key) then we have, as shown previously, confirmed identification. Now if, as we are supposing in this example, your URL is not on the list of directly authorized ones, the guard can check the document returned to find whether any of your friends are directly authorized. If any of them is, he can "call them" (with an HTTP GET of course) to find out if they claim you as a friend too. So the parallel with the phone conversation above holds very well. (For a more detailed description of this example see "Building a web of Trust without Key Signing parties".)
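For the curious, here is a rough sketch of that guard logic in Python with rdflib. The guest list as a set of WebId URIs, and the rule that one reciprocal foaf:knows claim is enough, are assumptions of this example rather than part of any specification.

    # Sketch of the guard's check: direct guests get in; otherwise a
    # mutually confirmed foaf:knows link to a direct guest suffices.
    from rdflib import Graph, Namespace, URIRef

    FOAF = Namespace("http://xmlns.com/foaf/0.1/")

    def may_enter(webid, guest_list):
        webid = URIRef(webid)
        if webid in guest_list:
            return True                  # directly on the list
        profile = Graph()
        profile.parse(webid)             # "call" the WebId with a GET
        for friend in profile.objects(webid, FOAF.knows):
            if friend in guest_list:
                theirs = Graph()
                theirs.parse(friend)     # now call the friend back
                if (friend, FOAF.knows, webid) in theirs:
                    return True          # the friend claims you too
        return False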

This way of linking between documents works because every object in those relations has an identifying URL, and because the documents are published RESTfully. The REST architecture is very strict about names referring to things independently of who uses them or of any previous state between the client and the server. If it were not, different people would mean different things by the same URL, which would be very confusing. By making it easy to link between documents we have the basic elements needed to grow a web of trust.

Conclusion

So how does foaf+ssl and the usual passport like PKI example compare? Here are a few thoughts:

  • foaf+ssl focuses on identity, as OpenId does, but much more RESTfully. Traditional PKI on the other hand also conceives of itself as certifying extra information: name, age, address...
  • In the passport example we pass a document by value; with foaf+ssl we pass it by reference. The advantage is that the referenced resource can be updated a lot faster than a passport can. One could easily imagine border control situations working like that. All you would need would be to cite your passport Id to the officer, and he could find your record in the government database and check your identity that way (by looking at your picture or checking your fingerprints).
  • To get something similar to the passport example in foaf+ssl, the government would just have to produce its own WebIDs. The content of the representation returned by that resource would then be the government's view of me. (What remains to be done is to find a way to make clear who is speaking for whom - to distinguish the case where the WebId is my employer's from the case where it is mine.)
  • The WebId can much more easily be self created. This makes it easier to say more about oneself than an official source would ever want to be liable for certifying.
  • WebIds can easily be linked to, so other people can relate to you.

Tuesday Nov 11, 2008

REST APIs must be hypertext driven

Roy Fielding recently wrote in "REST APIs must be hypertext-driven"

I am getting frustrated by the number of people calling any HTTP-based interface a REST API. Today's example is the SocialSite REST API. That is RPC. It screams RPC. There is so much coupling on display that it should be given an X rating.

That was pretty much my thought when I saw that spec. In a comment to his post he continues.

The OpenSocial RESTful protocol is not RESTful. It could be made so with some relatively small changes, but right now it is just wrapping RPC results in common Web media types.

Clarification of Roy's points

Roy then goes on to list some key criteria for what makes an application RESTful.

  • A REST API should not be dependent on any single communication protocol, though its successful mapping to a given protocol may be dependent on the availability of metadata, choice of methods, etc. In general, any protocol element that uses a URI for identification must allow any URI scheme to be used for the sake of that identification.

    In section 2.2 of the O.S. protocol we have the following JSON representation for a Person.

    {
        "id" : "example.org:34KJDCSKJN2HHF0DW20394",
        "displayName" : "Janey",
        "name" : {"unstructured" : "Jane Doe"},
        "gender" : "female"
    }
    

    Note that the id is not a URI. Further down, in the XML version of the above JSON, it is made clear that by prepending "urn:guid:" you can turn this string into a URI. By doing this the protocol has in essence tied itself to a URI scheme, since there is no way of expressing another URI type in the JSON - the JSON being the key representation in this JavaScript-specific API by the way, the aim of the exercise being to make the writing of social network widgets interoperable. Furthermore this scheme has some serious limitations: for example, it limits one to one social network per internet domain, it is tied to the quite controversial XRI spec that has been rejected by OASIS, and it does not provide a clear mechanism for retrieving information about the identifier. But that is not the point. The definition of the format is tying itself unnecessarily to a URI scheme, and moreover one that ties it to what is clearly a client/server model.

  • A REST API should not contain any changes to the communication protocols aside from filling-out or fixing the details of underspecified bits of standard protocols, such as HTTP's PATCH method or Link header field.
  • A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

    Most of these so-called RESTful APIs spend a huge amount of time specifying what response a certain resource should give to a certain message. Note for example section 2.1, entitled Responses.

  • A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC's functional coupling].

    In section 6.3 one sees this example:

    /activities/{guid}/@self                -- Collection of activities generated by given user
    /activities/{guid}/@self/{appid}        -- Collection of activities generated by an app for a given user
    /activities/{guid}/@friends             -- Collection of activities for friends of the given user {guid}
    /activities/{guid}/@friends/{appid}     -- Collection of activities generated by an app for friends of the given user {guid}
    /activities/{guid}/{groupid}            -- Collection of activities for people in group {groupid} belonging to given user {uid}
    /activities/{guid}/{groupid}/{appid}    -- Collection of activities generated by an app for people in group {groupid} belonging to given user {uid}
    /activities/{guid}/@self/{appid}/{activityid}   -- Individual activity resource; usually discovered from collection
    /activities/@supportedFields            -- Returns all of the fields that the container supports on activity objects as an array in json and a repeated list in atom.
    

    For some reason it seems that this protocol requires a very precise layout of the URL patterns. Now it is true that this is meant to be specified in an XRDS document. But that document is not linked to from any of the representations as far as I can see. So some "out of band" information exchange has happened, on which the rest of the protocol relies. Furthermore it ties the whole service again to one server. How open is a service which ties you to one server?

  • A REST API should never have "typed" resources that are significant to the client. Specification authors may use resource types for describing server implementation behind the interface, but those types must be irrelevant and invisible to the client. The only types that are significant to a client are the current representation's media type and standardized relation names. [ditto]

    Now clearly one does want URIs to name resources, things, and these things have types. I think Roy is here warning against the danger that expectations are placed on types that depend on the resources themselves. This seems to be tied to the previous point, that one should not have fixed resource names or hierarchies, as we saw above. To see how this is possible check out my foaf file:

    
    $ cwm http://bblfish.net/people/henry/card --ntriples | grep knows | head
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://axel.deri.ie/~axepol/foaf.rdf#me> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://b4mad.net/FOAF/goern.rdf#goern> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://bigasterisk.com/foaf.rdf#drewp> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://crschmidt.net/foaf.rdf#crschmidt> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://danbri.org/foaf.rdf#danbri> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://data.boab.info/david/foaf.rdf#me> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://davelevy.info/foaf.rdf#me> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://dblp.l3s.de/d2r/page/authors/Christian_Bizer> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://dbpedia.org/resource/James_Gosling> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://dbpedia.org/resource/Roy_Fielding> .
    

    Notice that there is no pattern in the URIs on the right. (As it happens there are no ftp URLs there, but it would work just as well if there were.) Yet the Tabulator extension for Firefox knows from the relations above alone that (if it believes my foaf file, of course) the URIs on the right refer to people. This is because the foaf:knows relation is defined as:

    
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    
    foaf:knows  a rdf:Property, owl:ObjectProperty;
             rdfs:comment "A person known by this person (indicating some level of reciprocated interaction between the parties).";
             rdfs:domain foaf:Person;
             rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/>;
             rdfs:label "knows";
             rdfs:range foaf:Person .
    

    This information can then be used by a reasoner (such as the JavaScript one in the Tabulator) to deduce that the resources pointed to by the URIs on either side of the foaf:knows relation are members of the foaf:Person class. (A small sketch of this deduction follows at the end of this list.)

    Note also that there is no knowledge as to how those resources are served. In many cases they may be served by simple web servers sending resources back. In other cases the RDF may be generated by a script. Perhaps the resources could be generated by java objects served up by Jersey. The point is that the Tabulator does not need to know.

    Furthermore, the ontology information above is not out of band. It is GETable at the foaf:knows URI itself. The name of the relation links to information about the relation, which gives us enough to be able to deduce further facts. This is hypertext - hyperdata in this case - at its best. Compare that with the JSON example given above. There is no way to tell what that JSON means outside the context of the totally misnamed 'Open Social RESTful API'. This is a limitation of JSON, or at least of this namespace-less version. One would have to add a mime type to the JSON to make it clear that it had to be interpreted in a particular manner for this application, but I doubt most JSON tools would know what to do with mime-typed JSON versions. And do you really want to go through a mime type registration process every time a social networking application wants to add a new feature or interact with new types of data?

    As Roy summarizes in one of the replies to his blog post:

    When representations are provided in hypertext form with typed relations (using microformats of HTML, RDF in N3 or XML, or even SVG), then automated agents can traverse these applications almost as well as any human. There are plenty of examples in the linked data communities. More important to me is that the same design reflects good human-Web design, and thus we can design the protocols to support both machine and human-driven applications by following the same architectural style.

    To get a feel for this it really helps to play with hyperdata applications other than the ones residing in web browsers. The semantic address book is one such application, which I spent some time writing.

  • A REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API). From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations or implied by the user's manipulation of those representations. The transitions may be determined (or limited by) the client's knowledge of media types and resource communication mechanisms, both of which may be improved on-the-fly (e.g., code-on-demand). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

    That is the out-of-band point made previously, and it confirms the danger of protocols that depend on URI patterns or on resources that are somehow typed at the protocol level. You should be able to pick up a URI and just go from there. With the Tabulator plugin you can in fact do just that on any of the URLs listed in my foaf file, or in other RDF.
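To make the foaf:knows deduction from earlier concrete, here is a small sketch in Python using rdflib together with the owlrl reasoner - my choice of tooling, standing in for the Tabulator's JavaScript reasoner, and assuming both URLs still serve RDF:

    # Sketch: derive that both ends of every foaf:knows arc are
    # foaf:Person, using only the ontology's rdfs:domain and rdfs:range.
    import owlrl
    from rdflib import Graph, Namespace
    from rdflib.namespace import RDF

    FOAF = Namespace("http://xmlns.com/foaf/0.1/")

    g = Graph()
    g.parse("http://bblfish.net/people/henry/card")  # the data above
    g.parse("http://xmlns.com/foaf/0.1/")            # the ontology, also GETable

    owlrl.DeductiveClosure(owlrl.RDFS_Semantics).expand(g)

    for person in g.subjects(RDF.type, FOAF.Person):
        print(person)  # includes everyone I foaf:know, by inference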

What's the point?

Engineers under the spell of the client/server architecture will find some of this very counter-intuitive. This is indeed why Roy's thesis, and the work done before it by the people who engineered the web - whose wisdom is distilled in various writings by the Technical Architecture Group - did something exceedingly original. These very simple principles, which can feel unintuitive to someone not used to thinking at a global information scale, make a lot of sense once you do think at that level. When you write such an open system, one that allows people to access information globally, you want to be able to send someone a URI to any resource you are working with, so that both of you can speak about the same resource. Understanding what resource that URL is about should come from GETting the meaning of the URL. If the meaning of that URL depends on the way you accessed it, then you will no longer be able to just send a URL: you will have to send 8 or 9 URLs, with explanations on how to jump from one representation to the other. If some out-of-band information is needed - if one has to inspect the URL itself to understand what it is about - then you are not setting up an open protocol, but a secret one. Secret protocols may indeed be very useful in some circumstances, and so, as Roy points out, may non-RESTful ones:

That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. That’s fine with me as long as you don’t call the result a REST API. I have no problem with systems that are true to their own architectural style.
But note: it is much more difficult for them to make use of the network effect: the value of information grows exponentially with its ability to be linked to other information. In another reply to a comment Roy puts this very succinctly:
encoding knowledge within clients and servers of the other side’s implementation mechanism is what we are trying to avoid.

Tuesday Jun 24, 2008

Webifying Integrated Development Environments

IDEs should be browsers of code on a Read Write Web. A whole revolution in how to build code editors is, I believe, hidden in those words. So let's imagine it. Fiction anticipates reality.

Imagine your favorite IDE, a future version of NetBeans perhaps, or IntelliJ, which would make downloading a new project as easy as dragging and dropping a project URL onto your IDE. The project home page would point to a description of the location of the code and of the dependencies of this project on other projects, themselves described via URL references and set up in a similar manner. Let's imagine further: instead of downloading all the code from CVS, think of every source code document as having a URL on the web. (Subversion is in fact designed like this, so this is not so far-fetched at all.) And let's imagine that NetBeans thinks about each software component primarily via this URL.
Since every piece of code and every library has a URL, the IDE would be able to use the RESTful architectural principles of the web. A few key advantages of this are:

  • Caching: a key feature of web architecture is the ability to cache information on the network or locally without ambiguity. This is how your web browser works (though it could work better). To illustrate: once a day Google changes its banner image. Your browser, and every browser on earth, fetches that picture only once a day, even if you do 100 searches. Does Google serve one image to each browser? No: numerous caches (company, country, or other) cache that picture and send it to the browser without the request going all the way to the search engine, reducing the load on Google's servers very significantly.
  • Universal names: since every resource has a URL, any resource can relate in one way or another to any other resource wherever it is located. This is what enables hypertext and what is enabling hyperdata.

Back to the IDE. Now that all code and all libraries can be served up RESTfully in a Resource Oriented Architecture, what does this mean for the IDE? Well, a lot. Each point may seem small, but together they pack a huge punch:
  • No need to download libraries twice: if you have been working on open source projects at all frequently you must have noticed how often the same libraries are found in each of the projects you have downloaded. Apache logging is a good example.
  • No need to download source code: it's on the web! You don't therefore need a local cache of code you have never looked at. Download what you need when you need it (and then cache it!): the Just in Time principle.
  • Describe things globally: since you have universal identifiers, you can describe how source code relates to documentation, to the people working on the code, or to anything else, in a global way that will be valid for all. Just describe the resources. There is a framework designed for exactly that, and it is very easy to use with the right introduction.

The above advantages may seem rather insignificant. After all, real developers are tough. They use vi. (And I do.) So why should they change? Well, notice that they also use Adobe Air or Microsoft Silverlight. So productivity considerations do in fact play a very important role in the software ecosystem.
Don't normal developers just work on a few pieces of code? Well, speaking for myself, I have 62 different projects in my /Users/hjs/Programming directory, and in each of these I often have a handful of project branches. As more and more code is open source, owned and tested by different organizations, the number of projects available on the web will continue to explode, and due to the laziness principle the number of projects using code from other projects will grow further. Already whole operating systems consisting of many tens of thousands of different modules can be downloaded and compiled. The ones I have downloaded are just the ones I have had the patience to get. Usually this means jumping through a lot of hoops:

  1. I have to find the web site of the code, and I may only have a jar name to go by. So Google helps. But that is a whole procedure in itself that should be unnecessary. If you have an image in your browser you know where it is located by right-clicking over it and selecting the URL. Why not so with code?
  2. Then I have to browse a web page, which may not be written in my language, and find the repository of the source code.
  3. Then I have to find the command line to download the source code, or the command in the IDE, and also somehow guess which version number produced the jar I am using.
  4. Once it is downloaded, and this can take some time, I may have to work out the build procedure. There are a few out there. Luckily ant and maven are catching on. But some of these files can be very complicated to understand.
  5. Then I have to link the source code on my local file system to the jar on my local file system that my project is using. In NetBeans this is exceedingly tedious - sometimes I have found it to be close to impossible. IntelliJ has a few little tricks to automate some of this, but it can be pretty nasty too, requiring jumping around different forms, especially if a project has created a large number of little jar files.
  6. And then all that work is only valid for me. Because all references are to files on my local file system, they cannot be published. NetBeans is a huge pain here, in that it often creates absolute file URLs in its properties files. By replacing them with relative URLs one can publish some of the results, but at the cost of copying every dependency into the local repository. And working out what is local and what is remote can take up a lot of time. It will work on my system, but not on someone else's.
  7. Once that project is downloaded one may discover that it depends on yet another project, and so we have to go back to step 1.

Doing the above currently causes me huge headaches, even for very simple projects. As a result I do it a lot less often than I could, missing valuable opportunities as a result. Each time I download a project in order to access the sources - to walk through my code and find a bug, or to test out a new component - I have to go through the whole download rigmarole described above. If you have a deadline, this can be a killer.

So why do we have to tie together all the components on our local file system? Because the IDEs are not referring to the resources with global identifiers. The owner of the junit project should say somewhere, in his doap file perhaps, something like:

 
   @prefix java: <http://java.net/ont/java#> . #made this up
   @prefix code: <http://todo.eg/#> .

   <http://project.eg/svn/lib/junit-4.0.jar> a java:Jar;
         code:builtFrom <http://junit.sourceforge.net/> .

   #what would be needed here needs to be worked out more carefully. The point is that we don't
   #at any point refer to any local file.

Because this future IDE we are imagining together will know that it has stored a local copy of the jar somewhere on the local file system, and because it will know where it placed the local copy of the source code, it will know how the cached jar relates to the cached source code, as illustrated in the diagram above. Just as you don't have to do any maintenance, when you click on a link in your web browser, to find out where the images and html files are cached on your hard drive and how one resource (your local copy of an image) relates to the web page, so we should not have to do any of this type of work in our development environment either.
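A minimal sketch of such a browser-style cache in Python; the cache location and file layout are made up, and a real implementation would also revalidate entries with ETag or Last-Modified headers:

    # Sketch of a browser-style cache for code artifacts: URLs in,
    # local paths out. The layout here is illustrative only.
    import hashlib
    import os
    import urllib.request

    CACHE_DIR = os.path.expanduser("~/.ide-cache")  # hypothetical location

    def cached_copy(url):
        """Return a local path for the resource at `url`, fetching it
        once and reusing the copy on later calls."""
        name = hashlib.sha1(url.encode()).hexdigest()
        path = os.path.join(CACHE_DIR, name)
        if not os.path.exists(path):
            os.makedirs(CACHE_DIR, exist_ok=True)
            with urllib.request.urlopen(url) as resp, open(path, "wb") as out:
                out.write(resp.read())
        return path

    # The jar and the source it was built from are both just URLs:
    jar = cached_copy("http://project.eg/svn/lib/junit-4.0.jar")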

From here many other things follow. A couple of years ago I showed how this could be used to link source code to bugs, to create a distributed bug database. Recently I showed how one could use this to improve build scripts. Why even download a whole project if you are stepping through code? Why not just fetch the code that you need when you need it from the web, one HTTP GET at a time? The list of functional improvements is endless. I welcome you to list some that you come up with in the comments section below.

If you want to make a big impact in the IDE space, that will be the way to go.

Friday Mar 28, 2008

RDFAuth: sketch of a buzzword compliant authentication protocol

Here is a proposal for an authentication scheme that is even simpler than OpenId (see sequence diagram), more secure, more RESTful, and with fewer points of failure and fewer points of control - a scheme that is needed to make Open Distributed Social Networks with privacy controls possible.

Update

The following sketch led to the even simpler protocol described in Foaf and SSL creating a global decentralized authentication protocol. It is very close to what is proposed here but builds very closely on SSL, so as to reduce what is new down to nearly nothing.

Background

Ok, so now that I have your attention, I would like to first mention that I am a great fan of OpenId. I have blogged about it numerous times and enthusiastically in this space. I came across the idea I will develop below not because I thought OpenId needed improving, but because I have chosen to follow some very strict architectural guidelines: it had to satisfy RESTful, resource-oriented hyperdata constraints. With the Beatnik Address Book I have proven - to myself at least - that the creation of an Open Distributed Social Network (a hot topic at the moment; see the Economist's recent article on online social networks) is feasible and easy to do. What was missing was a way for people to keep some privacy, clearly a big selling point for the large Social Network Providers such as Facebook. So I went in search of a solution for creating an Open Distributed Social Network with privacy controls. Initially I had thought of using OpenId.

OpenId Limitations

But OpenId has a few problems:

  • First, it is really designed to work within the limitations of current web browsers. It is partly because of this that there is a lot of hopping around from the service to the Identity Provider with HTTP redirects - hopping that hyperdata user agents such as the Tabulator, Knowee or Beatnik would not need.
  • Parts of OpenId 2, and especially the Attribute Exchange spec, really don't feel very RESTful. There is a method for PUTting new property values in a database, and a way to remove them, that uses neither the HTTP PUT method nor the DELETE method.
  • The OpenId Attribute Exchange is nice but not very flexible. It can keep some basic information about a person, but it does not make use of hyperdata, and the way it is set up, it could only do so with great difficulty. A RESTfully published foaf file can give the same information, is a lot more flexible and extensible, makes use of Linked Data, and as it happens also solves the Social Network Data Silo problem. Just that!
  • OpenId requires an Identity Server. There are a couple of problems with this:
    • This server provides a dynamic service but not a RESTful one, i.e. the representations sent back and forth to it cannot be cached.
    • The service is a control point. Anyone owning such a service will know which sites you authenticate onto. True, you can set up your own service, but that is clearly not what is happening: the big players are offering their customers OpenIds tied to particular authentication servers, and that is what most people will accept.
As I found out by developing what I am here calling RDFAuth - for want of a better name - none of these restrictions are necessary.

RDFAuth, a sketch

So, following my strict architectural guidelines, I came up with what I am just calling RDFAuth; but like everything else here this is a sketch and open to change. I am not a security specialist nor an HTTP specialist. I am like someone who comes to an architect in order to build a house on some land he has, with a sketch of what he would like the house to look like, some ideas of the functionality he needs, and the price he is willing to pay. What I want here is something very simple that can be made to work with a few perl scripts.

Let me first present the actors and the resources they wish to act upon.

  • Romeo has a Semantic Web Address Book, his User Agent (UA). He is looking for the whereabouts of Juliette.
  • Juliette has a URL identifier (as I do) which returns a public foaf representation and links to a protected resource.
  • The protected resource contains information she only wants some people to know, in this instance Romeo. It contains information as to her current whereabouts.
  • Romeo also has a public foaf file. He may have a protected one too, but it does not make an entrance in this scene of the play. His public foaf file links to a public PGP key. I described how that is done in Cryptographic Web of Trust.
  • Romeo's Public key is RESTfully stored on a server somewhere, accessible by URL.

So Romeo wants to find out where Juliette is, but Juliette only wants to reveal this to Romeo. Juliette has told her server to allow only Romeo, identified by his URL, to view the site. She could also have had a more open policy, allowing any of her or Romeo's friends to have access, as specified by their foaf files. The server could then crawl their respective foaf files at regular intervals to see if it needed to add anyone to the list of people having access to the site. This is what the DIG group did in conjunction with OpenId. Juliette could also have a policy that decides Just In Time, as a person presents herself, whether or not to grant access; she could use the information in that person's foaf file, related to some trust metric, to make her decision. How Juliette specifies who gets access to the protected resource is not part of this protocol. It is completely up to Juliette and the policies she chooses her agent to follow.

So here is the sketch of the sequence of requests and responses.

  1. First, Romeo's user agent knows that Juliette's foaf name is http://juliette.org/#juliette, so it sends an HTTP GET request to Juliette's foaf file, located of course at http://juliette.org/.
    The server responds with a public foaf file containing a link to the protected resource, perhaps with the N3:
      <> rdfs:seeAlso <protected/juliette> .
    
    Perhaps this could also contain some relations describing that resource as protected, which groups may access it, etc., but that is not necessary.
  2. Romeo's user agent then decides it wants to check out protected/juliette. It sends a GET request to that resource, but this time receives a variation of the Basic Authentication scheme in response, perhaps something like:
    HTTP/1.0 401 UNAUTHORIZED
    Server: Knowee/0.4
    Date: Sat, 1 Apr 2008 10:18:15 GMT
    WWW-Authenticate: RdfAuth realm="http://juliette.org/protected/\*" nonce="ILoveYouToo"
    
    The idea is that Juliette's server returns a nonce (in order to avoid replay attacks), and a realm over which this protection will be valid. But I am really making this up here. Better ideas are welcome.
  3. Romeo's web agent then encrypts some string (the realm?) and the nonce with Romeo's private key. Only an agent trusted by Romeo can do this.
  4. The User Agent then sends a new GET request with the encrypted string, and his identifier, perhaps something like this
    GET /protected/juliette HTTP/1.0
    Host: juliette.org
    Authorization: RdfAuth id="http://romeo.name/#romeo" key="THE_REALM_AND_NONCE_ENCRYPTED"
    Content-Type: application/rdf+xml, text/rdf+n3
    
    Since we need an identifier, why not just use Romeo's foaf name? It happens to also point to his foaf file. All the better.
  5. Juliette's web server can then use Romeo's foaf name to GET his public foaf file, which contains a link to his public key, as explained in "Cryptographic Web of Trust".
  6. Juliette's web server can then query the returned representation, perhaps meshed with some other information in its database, with something equivalent to the following SPARQL query
    PREFIX wot: <http://xmlns.com/wot/0.1/>
    SELECT ?pgp
    WHERE {
         [] wot:identity <http://romeo.name/#romeo>;
            wot:pubkeyAddress ?pgp .
    } 
    
    The nice thing about working at the semantic layer is that it decouples the spec a lot from the representation returned. Of course, as usage grows, those representations that are understood by the most servers will create a de facto convention. Initially I suggest using RDF/XML of course. But it could just as well be N3, RDFa, perhaps even some microformat dialect, or even some GRDDLable XML, as the POWDER working group is proposing to do.
  7. Having found the URL of the PGP key, Juliette's server can GET it - and, as with much else in this protocol, cache it for future use.
  8. Having the PGP key, Juliette's server can now decrypt the encrypted string sent to her by Romeo's User Agent. If the decrypted string matches the expected string, Juliette will know that the User Agent has access to Romeo's private key. So she decides this is enough to trust it.
  9. As a result Juliette's server returns the protected representation.
Now Romeo's User Agent knows where Juliette is, displays it, and Romeo rushes off to see her.
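One remark on steps 3 and 8: "encrypting" a string with a private key is in practice producing a digital signature that anyone holding the public key can check. A minimal sketch, using Python's cryptography package as a modern stand-in for the PGP tooling of the time (all names here are mine):

    # Step 3 (Romeo's side): sign the realm and nonce with the private key.
    # Step 8 (Juliette's side): verify with the public key found via the
    # link in Romeo's foaf file.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding

    def sign_challenge(private_key, realm, nonce):
        return private_key.sign((realm + nonce).encode(),
                                padding.PKCS1v15(), hashes.SHA256())

    def challenge_ok(public_key, signature, realm, nonce):
        try:
            public_key.verify(signature, (realm + nonce).encode(),
                              padding.PKCS1v15(), hashes.SHA256())
            return True
        except InvalidSignature:
            return False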

Advantages

It should be clear from the sketch what the numerous advantages of this system are over OpenId. (I can't speak of other authentication services as I am not a security expert).

  • The User Agent has no redirects to follow. In the above example it needs to request the protected resource twice (steps 2 and 4), but that may only be necessary the first time it accesses the resource; the second time the UA can immediately jump to step 3. [But see the problem with replay attacks raised in the comments by Ed Davies, and my reply.] Furthermore it may be possible - this is a question for HTTP specialists - to merge steps 1 and 2. Would it be possible for request 1 to return a 20x code with the public representation, plus a WWW-Authenticate header, suggesting that the UA can get a more detailed representation of the same resource if authenticated? In any case the redirect rigmarole of OpenId, which is really there to overcome the limitations of current web browsers, is not needed.
  • There is no need for an Attribute Exchange type service. Foaf deals with that in a clear and extensible RESTful manner. This simplifies the spec dramatically.
  • There is no need for an identity server, so there is one less point of failure and one less point of control in the system. The public key plays that role in a clean and simple manner.
  • The whole protocol is RESTful. This means that all representations can be cached, meaning that steps 5 and 7 need only occur once per individual.
  • As RDF is built for extensibility, and we are being architecturally very clean, the system should be able to grow cleanly.

Contributions

I have been quietly exploring these ideas on the foaf and semantic web mailing lists, where I received a lot of excellent suggestions and feedback.

Finally

So I suppose I am now looking for feedback from a wider community. PGP experts, security experts, REST and HTTP experts, semantic web and linked data experts: only you can help this get somewhere. I will never have the time to learn all these fields in enough detail by myself. In any case, all this is absolutely, obviously simple, and so completely unpatentable :-)

Thanks for taking the time to read this.

Friday Jul 13, 2007

The limitations of JSON

A thread on REST-discuss recently turned into a JSON vs XML fight. I had not thought too deeply about JSON before this, but now that I have, I thought I should summarize what I have learnt.

JSON has a clear and simple syntax, described on json.org. As far as I can see there is no semantics associated with it directly, just as with XML. The syntax does make space for special tokens such as numbers and booleans, which one automatically presumes will be mapped to the equivalent types: i.e. things that one can add, or compare with boolean operators. Behind the scenes, of course, a semantics is clearly defined by the fact that JSON is meant to be evaluated by JavaScript. In this it differs from XML, which only assumes it will be parsed by XML-aware tools.

On the list there was quite a lot of confusion about syntax and semantics. The picture accompanying this post shows how logicians understand the distinction. Syntax starts by defining tokens and how they can be combined into well-formed structures. Semantics defines how these tokens relate to things in the world, and so how one can evaluate, among other things, the truth of a well-formed syntactic structure. In the picture we are using the NTriples syntax, which is very simple: three URIs, or two URIs and a string, followed by a full stop. URIs are universal names, so their role is to refer to things. In the case of the formula

<http://richard.cyganiak.de/foaf.rdf#cygri> <http://xmlns.com/foaf/0.1/knows> <http://www.anjeve.de/foaf.rdf#AnjaJentzsch> .
the first URI refers to Richard Cyganiak, on the left in the picture; the second URI refers to a special knows relation defined at http://xmlns.com/foaf/0.1/, depicted by the red arrow in the center of the picture; and the third URI refers to Anja Jentzsch, who is sitting on the right of the picture. You have to imagine the red arrow as being real - that makes things much easier to understand. So the sentence above is saying that the relation depicted is real. And it is: I took the photo in Berlin this February during the Semantic Desktop workshop.

I also noticed some confusion as to the semantics of XML. It seems that many people believe it is the same as the DOM or the Infoset. Those are in fact just objectivisations of the syntax. It would be like saying that the example above just consists of three URIs followed by a dot. One could speak of which URI follows which one, and of which one comes before the dot. And that would be it. One might even speak about the number of letters that appear in a URI. But that is very different from what the sentence says about the world, which is what really interests us in day-to-day life. I care that Richard knows Anja, not how many vowels appear in Richard's name.

At one point the debate between XML and JSON focused on which had the simpler syntax. I suppose XML, with its entity encoding and DTD definitions, is more complicated, but that is not really a clinching point. If syntactic simplicity were an overarching value, then NTriples and Lisp would have to be declared the winners. NTriples is so simple I think one could use the well-known, very lightweight grep command line tool to parse it. Try that with JSON! But that is of course not what is attractive about JSON to the people that use it, namely, usually, JavaScript developers. What is nice for them is that they can immediately turn the document into a JavaScript structure. They can do that because they assume the JSON document has the JavaScript semantics. [1]
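To back up the grep claim: one regular expression gets you most of the way. The sketch below uses Python rather than grep, and deliberately ignores blank nodes, typed literals and escape sequences, so it is a rough illustration rather than a conforming parser:

    # A rough NTriples matcher - enough to make the point.
    import re

    TRIPLE = re.compile(r'<([^>]*)> <([^>]*)> (<[^>]*>|"[^"]*") \.')

    line = ('<http://richard.cyganiak.de/foaf.rdf#cygri> '
            '<http://xmlns.com/foaf/0.1/knows> '
            '<http://www.anjeve.de/foaf.rdf#AnjaJentzsch> .')

    subject, predicate, obj = TRIPLE.match(line).groups()
    print(subject, predicate, obj)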

But this is where JSON shows its greatest weakness. Yes, the little semantics JSON data structures have makes them easy to work with. One knows how to interpret an array, how to interpret a number and how to interpret a boolean. But this is very minimal semantics. It is very much pre-web semantics. It works as long as the client and the server, the publisher of the data and the consumer of the data, are closely tied together. Why so? Because there is no use of URIs, Universal Names, in JSON. JSON has a provincial semantics. Compare this to XML, which makes a place for the concept of a namespace, specified in terms of a URI. To make this clearer let me look at the JSON example from the wikipedia page (as I found it today):

{
    "firstName": "John",
    "lastName": "Smith",
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": 10021
    },
    "phoneNumbers": [
        "212 732-1234",
        "646 123-4567"
    ]
}

We know there is a mapping between something related to the string "firstName" and something related to the string "John". [2] But what exactly is this saying? That there is a mapping from the string firstName to the string John? And what is that to tell us? What if I find somewhere on the web another string, "prenom", written by a French person. How could I say that the "firstName" string refers to the same thing the "prenom" string refers to? This does not fall out nicely.

The provincialism is similar to that which led the xmlrpc specification to forget to put time zones on its dates, among other things, as I pointed out in "The Limitations of the MetaWeblog API". To assume that sending dates around on the internet without specifying a time zone makes sense is to assume that everyone in the world lives in the same time zone as you.

The web allows us to connect things just by creating hyperlinks. So to tie the meaning of data to a particular script in a particular page is not to take on the full thrust of the web. It is a bit like the example above, which writes out phone numbers but forgets the country prefix. Is this data only going to get used by people in the US? And what about the provincialism of using a number to represent a postal code? In the UK postal codes are written out mostly with letters. Now those two elements are just modelling mistakes. But if one is going to be serious about creating a data modelling language, then one should avoid making mistakes that are attributable to the idea that strings have universal meaning, as if the whole world spoke English, and as if English were not ambiguous. Yes, natural language can be disambiguated when one is aware of the exact location, time and context of the speaker. But on a web where everything should link up to everything else, that is not and cannot be the case.

That JSON is so tied to a web page should not come as a surprise if one looks at its origin as a serialisation of JavaScript objects. JavaScript is a scripting language designed to live inside a web page, with a few hooks to go outwards. It was certainly not designed as a universal data format.

Compare the above with the following Turtle (a human readable subset of N3), which presumably expresses the same thing:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <http://www.w3.org/2000/10/swap/pim/contact#> .

<http://eg.com/joe#p>
   a foaf:Person;
   foaf:firstName "John";
   foaf:family_name "Smith";
   :home [
         :address [
              :city "New York";
              :country "New York";
              :postalCode "10021";
              :street "21 2nd Street";
         ]
   ];
   foaf:phone <tel:+1-212-732-1234>, <tel:+1-646-123-4567>;
.

Now this may require a little learning curve - but frankly not that much - to understand. In fact to make it even simpler I have drawn out the relations specified above in the following graph:

(I have added some of the inferred types)

The RDF version has the following advantages:

  • you can know what any of the terms mean by clicking on them (append the name to the prefix) and doing an HTTP GET
  • you can make statements of equality between relations and things, such as
    foaf:firstName = frenchfoaf:prenom .
  • you can infer things from the above, such as that
    <http://eg.com/joe#p> a foaf:Agent .
  • you can mix vocabularies from different namespaces as above, just as in Java you can mix classes developed by different organisations. There does not even seem to be the notion of a namespace in JSON, so how would you reuse the work of others?
  • you can split the data about something into pieces. So you can put your information about <http://eg.com/joe#p> at the "http://eg.com/joe" URL, in a RESTful way, and other people can talk about him by using that URL. I could for example add the following to my foaf file:
    <http://bblfish.net/people/henry/card#me> foaf:knows <http://eg.com/joe#p> .
    You can't do that in a standard way in JSON because it does not have a URI as a base type (weird for a language that wants to be a web language to miss the core element of the web, and yet put so much energy into other features such as booleans and numbers!)

Now that does not mean JSON can't be made to work this way, as the SPARQL JSON result set serialisation shows. But it does not do the right thing by default. It is a bit like languages before Java that did not have unicode support by default: the few who were aware of the problems would do the right thing, all the rest would just discover the reality of their mistakes by painful experience.
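For the record, that serialisation keeps URIs distinguishable from plain strings by typing every binding. A result set for a query asking who Richard knows might look roughly like this (a sketch from memory, so details may vary):

{
    "head": { "vars": [ "who" ] },
    "results": {
        "bindings": [
            { "who": { "type": "uri",
                       "value": "http://www.anjeve.de/foaf.rdf#AnjaJentzsch" } }
        ]
    }
}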

This does not take away from the major advantage that JSON has of being much easier to integrate with JavaScript, which is a real benefit to web developers. It should be possible to get the same effect with a few good libraries. The Tabulator project provides a JavaScript library to parse RDF, but it would probably require something like a so(m)mer mapping from relations to JavaScript objects for it to be as transparent to those developers as JSON is.

Notes

[1]
Now procedural languages such as JavaScript don't have the same notion of semantics as the one I spoke of previously. The notion of semantics defined for them is a procedural one: namely, two programs can be said to have the same semantics if they behave the same way.
[2]
The spec says that an "object is an unordered set of name-value pairs", which would mean that a person could have a second "firstName", I presume. But I have also heard people speak of these as hash maps, which only allow unique keys. I am not sure which is the correct interpretation...


Tuesday Jul 03, 2007

Restful semantic web services

Here is my first stab at an outline of what a RESTful semantic web service could look like.

Let me start with the obvious. Imagine we have an example shopping service, at http://shop.eg/, which sells books. Clearly we would want URLs for every book that we wish to buy, with RDF representations at the given URLs. As I find RDF/XML hard to read and write, I'll show the N3 representations. So, to take a concrete example, let us imagine our example shopping service selling the book "RESTful Web Services" at the URL http://shop.eg/books/isbn/0596529260 . If we do an HTTP GET on that URL we could receive the following representation:


@prefix : <http://books.eg/ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix shop: <http://shopping.eg/ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix currency: <http://bank.eg/currencies#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .


<#theBook> a shop:Book, shop:Product;
   dc:title "Restful Web Services"@en ;
   dc:creator [ a foaf:Person; foaf:name "Sam Ruby"],
             [ a foaf:Person; foaf:name "Leonard Richardson"] ;
   dc:contributor [ a foaf:Person; foaf:name "David Heinemeier Hansson" ];
   dc:publisher <http://www.oreilly.com/>;
   dct:created "08-07-2007T"\^\^xsd:dateTime;
   dc:description """This is the first book that applies the REST design philosophy to real web services. It sets 
 down the best practices you need to  make your design a success, and the techniques you need to turn your 
 design into working code. You can harness the power of the Web for programmable applications: you just 
 have to work with the Web instead of against it. This book shows you how."""@en;
   shop:price "26.39"\^\^currency:dollars;
   dc:subject </category/computing>, </category/computing/web>, </category/computing/web/rest>, 
                   </category/computing/architecture>,</category/computing/architecture/REST>, </category/computing/howto> .


So we can easily imagine a page like this for every product. These pages can be reached by browsing the category pages, by querying a SPARQL endpoint, or in many other ways. It should be very easy to generate such representations for a web site. All it requires is building up an ontology of products - which the shop already has available, if only for the purposes of building inventories - and tying it to the database using a tool such as D2RQ, or a combination of JSR311 and @rdf annotations (see so(m)mer).

Now what is missing is a way to let the browser know what it can do with this product. The simplest possible way of doing this would be to create a specialized relation for that web service, pointing to a Cart resource to which one POSTs a description of the item to be added. Perhaps something like:

<#theBook> shop:addToCart <http://shop.eg/cart/> .

This relation would just mean that one has to POST the url of the book to the cart to have it added there. The cart itself may then have a shop:buy relation to some resource, to which by convention the user agent would need to send a credit card number, expiration date, and other information.

This means that one would have to define a number of RDF relationships, one for every type of action one could do in a shop (and later on the web), and explain the types of messages to be sent to the endpoint, and what their consequences are. This is simple, but it does not seem very extensible. What if one wants to buy the hard copy version of the book, or 10 copies of it? The hard copy version could of course have its own URL, so it may be as simple as placing the buy relation on that page. But is this going to work with PCs, where one can add and remove a huge number of parts? I remember Steve Jobs being proud of the huge number of different configurations one could buy his desktop systems in - well over a hundred thousand. This could make it quite difficult to navigate a store, if one is not careful.

On the current web this is dealt with by using html forms, which allow the user to choose between a large number of variables by selecting check boxes, combo boxes, drop down menus and more, and then to POST a representation to a collection, thereby creating a new action, such as adding the product to the cart, or buying it. The person browsing the site knows what the action does because it is usually written out in a natural language, in a way that makes it quite obvious to a human being. The person then performs the action because he wishes his desires to be fulfilled. Now this may seem very simple, but just consider the innumerable types of actions that we can fulfill using the very simple tool of html forms: we can add things to a virtual cart, buy things, comment on things, search for things, organise a meeting, etc, etc... So forms can be seen both as shortcuts to navigate to a large number of different resources, and as ways to create new resources (usually best done with POST).

If we want software agents to do such tasks for us, we need both something like a machine understandable form, and some way of specifying what effect POSTing the form will have on the world. So we need to find a way to do what the web does, but in a more clearly specified way, so that even machines, or simple user agents, can understand it. Let's look at what a form does:

  • Forms are ways of asking the user to bind values to variables
  • the variables can then be used to build something, such as a URL, or a message.
  • The form then specifies the type of action to perform with the constructed message, such as a GET, POST, PUT, etc...
  • The human readable text explains what the result of the action is, and what the meaning of each of the fields is.

Now what semantic technology binds variables to values? Which one asks questions? SPARQL comes immediately to mind. Seeing this, and remembering a well known motto of sales people, "Satisfy the customer's every desire", a very general but conceptually simple solution to this problem occurred to me. It may seem a little weird at first (and perhaps it will continue to seem weird) but I thought it elegant enough to be used as a starting point. The idea is really simple: the representation returned by the book resource will specify a collection end point to POST RDF to, and it will specify what to POST back by including a SPARQL query in the representation. It is then up to the software agent reading the representation to answer the query if it wishes a certain type of action to occur. If it understands the query it will be able to answer; if it does not, there will be no results. It need not do anything with the query at all.

The following is the first thing that occurred to me. The details are less important than the principle of thinking of forms as asking the client a question.

PREFIX shop: <http://shopping.eg/ns#>
PREFIX bdi: <http://intentionality.eg/ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
CONSTRUCT {
     ?mycart a shop:Cart ;
             shop:contains [ a shop:LineItem;
                               shop:SKU <http://shop.eg/books/isbn/0596529260#theBook> ;
                               shop:quantity ?q ;
                             ]  .
}  WHERE {
       ?mycart a shop:Cart ;
               shop:for ?me ;
               shop:ownedBy <http://shop.eg/>.
       GRAPH ?desire {
               ?mycart shop:contains
                            [ a shop:LineItem;
                               shop:SKU <http://shop.eg/books/isbn/0596529260#theBook> ;
                               shop:quantity ?q ;
                            ]  .

       }
       ?desire bdi:of ?me .
       ?desire bdi:fulfillby "2007-07-30T..."^^xsd:dateTime .
}

So this is saying quite simply: find out if you want to have your shopping cart filled with a number of copies of this book. The user agent (the equivalent of the web browser) asks its data store the given SPARQL query. It asks itself whether it desires to add a number of these books to its shopping cart, and whether it wishes that desire to be fulfilled by a certain time. If the agent does not understand the relations in the query, the CONSTRUCT clause will return an empty graph. If it does understand them, and the query returns a result, then it is because it wished the action to take place. The constructed graph may be something like:

@prefix shop: <http://shopping.eg/ns#> .
 <http://shop.eg/cart/bblfish/> a shop:Cart ;
             shop:contains [ a shop:LineItem;
                               shop:SKU <http://shop.eg/books/isbn/0596529260#theBook> ;
                               shop:quantity 2 ;
                             ]  .

This can then be POSTed to the collection end point http://shop.eg/cart/, with the result of adding two instances of the book to the cart. Presumably the cart would return a graph with the above relations in it, plus another SPARQL query explaining how to buy the items in the cart.
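To make the client side of this concrete, here is a minimal sketch in Java of that POST, using nothing but the standard library. The text/rdf+n3 mime type is my assumption; the service would presumably advertise which RDF serialisations its collection accepts.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class CartClient {
    public static void main(String[] args) throws Exception {
        // The graph constructed from the SPARQL query above.
        String graph =
            "@prefix shop: <http://shopping.eg/ns#> .\n" +
            "<http://shop.eg/cart/bblfish/> a shop:Cart ;\n" +
            "    shop:contains [ a shop:LineItem;\n" +
            "        shop:SKU <http://shop.eg/books/isbn/0596529260#theBook> ;\n" +
            "        shop:quantity 2 ] .\n";

        HttpURLConnection con = (HttpURLConnection)
            new URL("http://shop.eg/cart/").openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "text/rdf+n3"); // assumed mime type
        OutputStream out = con.getOutputStream();
        out.write(graph.getBytes("UTF-8"));
        out.close();
        // A RESTful service would answer 201 Created, with the new
        // resource's URL in the Location header.
        System.out.println(con.getResponseCode() + " "
            + con.getHeaderField("Location"));
    }
}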

So the full RDF for the book page would look something like this:

@prefix : <http://books.eg/ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix shop: <http://shopping.eg/ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix currency: <http://bank.eg/currencies#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .


<#theBook> a shop:Book, shop:Product;
   dc:title "Restful Web Services"@en ;
   dc:creator [ a foaf:Person; foaf:name "Sam Ruby"],
             [ a foaf:Person; foaf:name "Leonard Richardson"] ;
   dc:contributor [ a foaf:Person; foaf:name "David Heinemeier Hansson" ];
   dc:publisher <http://www.oreilly.com/>;
   dct:created "08-07-2007T"\^\^xsd:dateTime;
   dc:description """This is the first book that applies the REST design philosophy to real web services. It sets 
 down the best practices you need to make your design a success, and the techniques you need to turn your
  design into working code. You can harness the power of the Web for  programmable applications: you just 
 have to work with the Web instead of against it. This book shows you how."""@en;
   shop:price "26.39"\^\^currency:dollars;
   dc:subject </category/computing>, </category/computing/web>, </category/computing/web/rest>, 
                      </category/computing/architecture>,</category/computing/architecture/REST>,</category/computing/howto>;
   shop:addToCart [ a shop:Post;
                    shop:collection <http://shop.eg/cart/>;
                    shop:query """
PREFIX shop: <http://shopping.eg/ns#>
PREFIX bdi: <http://intentionality.eg/ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
CONSTRUCT {
     ?mycart a shop:Cart ;
             shop:contains [ a shop:LineItem;
                               shop:SKU <http://shop.eg/books/isbn/0596529260#theBook> ;
                               shop:quantity ?q ;
                             ]  .
}  WHERE {
       ?mycart a shop:Cart ;
               shop:for ?me ;
               shop:ownedBy <http://shop.eg/>.
       GRAPH ?desire {
               ?mycart shop:contains
                            [ a shop:LineItem;
                               shop:SKU <http://shop.eg/books/isbn/0596529260#theBook> ;
                               shop:quantity ?q ;
                            ]  .

       }
       ?desire bdi:of ?me .
       ?desire bdi:fulfillby "2007-07-30T..."^^xsd:dateTime .
}"""
                  ] .

So there are quite a few things still to tie up, but it seems we have the key elements:

  • RESTful web services: we use GET and POST the way they are meant to be used,
  • Resource Oriented Architecture: each shoppable item has its own resource, which can return a representation,
  • the well known motto "hypermedia is the engine of application state": each URL is dereferenceable to further representations, and each page containing a buyable item describes how to proceed to the next step of buying the product. In this case the SPARQL query returns a graph to be POSTed to a given url,
  • the clarity of the Semantic Web framework thrown in too, i.e. we can prove certain things about the statements made, which is very helpful in bringing clarity to a vocabulary. Understanding the consequences of what is said is part and parcel of understanding itself.
Have I reached buzzword compliance yet? :-)

Notes

From discussions around the net (on #swig for example) I was made aware of certain problems.

  • SPARQL is a little too powerful, and may seem to give too much leverage to the service, which could ask all kinds of questions of the user agent, such as the SPARQL equivalent of "What is your bank account number?". Possible answers may be:
    • Of course a user agent that does shopping automatically on the web is going to have to be ready for all kinds of misuses, so whatever is done, this type of problem is going to crop up. Servers also need to protect themselves from probing questions by user agents. So this is something that both sides will need to look at.
    • Forms are pretty powerful too. Are forms really so different from queries? They can ask you for a credit card number, your date of birth, the names of your friends, your sexual inclinations, ... What can web forms not ask you?
  • SPARQL has a couple of things going for it: it has a way of binding variables to a message, and it builds on the solid semantic web structure. But there may be other ways of doing the same thing. OWL-S also uses rdf to describe actions, creates a way to bind answers to messages, and describes the preconditions and postconditions of actions. It even uses the proposed Semantic Web Rule Language (SWRL) standard. As there seems to be a strong relation between SPARQL and a rule language (one can think of a SPARQL query as a rule), it may be that part of the interest in this solution is simply the same reason SWRL emerged in OWL-S. OWL-S has a binding to SOAP and none to a RESTful web service. As I have a poor grasp of SOAP I find that difficult to understand. Perhaps a binding to a more RESTful web service, such as the one proposed here, would make it more amenable to a wider public.

Wednesday Jun 06, 2007

RESTful Web Services: the book

RESTful Web Services is a newly published book that should be a great help in giving people an overview of how to build web services that work with the architecture of the Web. The authors of the book are, I believe, serious RESTafarians. They hang out (virtually) on the yahoo REST-discuss newsgroup, so I know ahead of time that they will most likely never fail on the REST side of things. Such a book should therefore be a great help for people wanting to develop web services.

As an aside, I am currently reading it online via Safari Books, which is a really useful service, especially for people like me who are always traveling and don't have space to carry wads of paper around the world. As I have been intimately involved in this area for a while - I read Roy Fielding's thesis in 2004, and it immediately made sense of my intuitions - I am skipping through the book from chapter to chapter as my interests guide me, using the search tool when needed. As this is an important book, I will write up my comments here in a number of posts as I work my way through it.

What of course is missing in Roy's thesis, which is a high level abstract description of an architectural style, are practical examples, which is what this book sets out to provide. The advantage of Roy's level of abstraction is that it permitted him to make some very important points without losing himself in arbitrary implementation debates. Many implementations can fit his architectural style. That is the power of speaking at the right level of abstraction: it permits one to say something well, in such a way that it can withstand the test of time. Developers of course want to see how an abstract theory applies to their everyday work, and so a cook book such as "RESTful Web Services" is going to appeal to them. The danger is that by stepping closer to implementation details, certain choices are made that turn out to be in fact arbitrary, ill conceived, non optimal or incomplete. The risk is well worth taking if it can help people find their way around more easily in a sea of standards. This is where the rubber hits the road.

Right from the beginning the authors, Sam Ruby and Leonard Richardson coin the phrase "Resource Oriented Architecture".

Why come up with a new term, Resource-Oriented Architecture? Why not just say REST? Well, I do say REST, on the cover of this book, and I hold that everything in the Resource-Oriented Architecture is also RESTful. But REST is not an architecture: it's a set of design criteria. You can say that one architecture meets those criteria better than another, but there is no one "REST architecture."

The emphasis on Resources is, I agree with them, fundamental. Their chapter 4 does a very good job of showing why. URIs name Resources. URLs in particular name Resources that can return representations in well defined ways. REST stands for "Representational State Transfer", and the representations transferred are the representations of resources identified by URLs. The whole thing fits like a glove.

Except that where there is a glove, there are two, one for each hand. And they are missing the other glove, so to speak. And the lack is glaringly obvious. Just as important as Roy Fielding's work, just as abstract, and developed by some of the best minds on the web, even in the world, is RDF, which stands for Resource Description Framework. I emphasize the "Resource" in RDF because for someone writing a book on Resource Oriented Architecture, to make only three short mentions of the framework for describing resources standardized by none less than the World Wide Web Consortium is just ... flabbergasting. Ignoring this work is like trying to walk around on one leg. It is possible. But it is difficult. And certainly a big waste of energy, time and money. Of course, since what they are proposing is so much better than what may have gone on previously, which seems akin to trying to walk around on a gloveless hand, it may not immediately be obvious what is missing. I shall try to make this clear in this series of notes.

Just as REST is very simple, so is RDF. It is easiest to describe something on the web if you have a URL for it. If you want to say something about it - that it relates to something else, for example, or that it has a certain property - you need to specify which property it has. Since a property is a thing, it too is easiest to speak about if it has a URL. Once you have identified the property in the global namespace, you need to specify its value, which can be a string or another object. That's RDF for you. It's so simple I am able to explain it to people in bars within a minute. Here is an example, which says that my name is Henry:

<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/name> "Henry Story" .

Click on the URLs and you will GET their meaning. Since a resource can return any number of representations, different user agents can get the representation they prefer: for the name relation you will get an html representation back if you are requesting it from a browser. With this system you can describe the world. We know this since it is simply a generalization of the system found in relational databases, where instead of identifying things with table dependent primary keys, we identify them with URIs.
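Content negotiation is just a matter of the Accept header. Here is a minimal sketch of how a semantic user agent might ask the same resource for RDF rather than the html a browser would get:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class ConnegGet {
    public static void main(String[] args) throws Exception {
        URLConnection con =
            new URL("http://bblfish.net/people/henry/card").openConnection();
        // A browser would send Accept: text/html; we prefer RDF.
        con.setRequestProperty("Accept", "application/rdf+xml");
        System.out.println("Server chose: " + con.getContentType());
        BufferedReader in = new BufferedReader(
            new InputStreamReader(con.getInputStream(), "UTF-8"));
        for (String line; (line = in.readLine()) != null; )
            System.out.println(line);
    }
}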

So RDF, just like REST, is at its base very easy to understand, and furthermore the two are complementary. Even though REST is simple, it nevertheless needs a book such as "RESTful Web Services" to help make it practical: there are many dispersed standards out there which this book helps bring together. It would have been a great book had it not missed out the other half of the equation. Luckily this should be easy to fix, and I will do so in the following notes, showing how RDF can help you become even more efficient in establishing your web services. Can it really be even easier? Yes. And furthermore without contradicting what this book says.

Friday May 11, 2007

Semantic Web Birds of a Feather at JavaOne 2007

Nova Spivack, Lew Tucker, and Tim Boudreau joined me today in a panel discussion on the Semantic Web at Java One. Given that it was at 8pm and that we were competing with a huge party downstairs with free drinks, robots fighting each other, live bands, and numerous other private pub parties, the turnout of over 250 participants was quite extraordinary [1]. There was a huge amount of material to cover, and we managed to save 13 minutes at the end for questions. The line of questioners was very long, and I think most questions were answered to the satisfaction of those who asked them. It was really great having Nova and Lew over; they brought a lot of experience to the discussion, which I hope gave everyone a feel for the richness of what is going on in this area.

Since many people asked for the presentation it is available here.

[1] It was quite difficult to tell from the stage how many people were in the room, but a good one third of the 1200-seat room was full. 580 people had registered for the talk.

Tuesday May 08, 2007

Dropping some Doap into NetBeans

Yesterday evening I gave a short 10 minute presentation on the Semantic Web in front of a crowd of 1000 NetBeans developers during James Gosling's closing presentation at NetBeans Day in San Francisco.

In interaction with Tim Boudreau we managed to give a super condensed introduction to the Semantic Web, something that is only possible because its foundations are so crystal clear - which was the underlying theme of the talk. It's just URIs and REST and relations to build clickable data. (see the pdf of the presentation)

All of this led to a really simple demonstration of an integration with NetBeans that Tim Boudreau was key in helping me put together. Tim wrote the skeleton of a simple NetBeans plugin (stored in the contrib/doap section of the NetBeans CVS repository), and I used Sesame 2 beta 3 to extract the data from the so(m)mer doap file that got dropped onto NetBeans. As a result NetBeans asked us where we wanted to download the project, and after we selected a directory on my file system, it proceeded to check out the code. On completion it asked us if we wanted to install the modules in our NetBeans project. Now that is simple. Drag a DOAP url onto NetBeans: project checked out!
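The plugin side of that is not much more than the following sketch (the Sesame 2 calls are from memory, and the beta we used may have differed in detail): load the dropped DOAP url into an in-memory store, from which the plugin can then query the project's repository details.

import java.net.URL;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.sail.SailRepository;
import org.openrdf.rio.RDFFormat;
import org.openrdf.sail.memory.MemoryStore;

public class DoapDrop {
    // Load a dropped DOAP url into an in-memory Sesame store,
    // ready to be queried for the project's repository location.
    static RepositoryConnection load(String doapUrl) throws Exception {
        Repository repo = new SailRepository(new MemoryStore());
        repo.initialize();
        RepositoryConnection con = repo.getConnection();
        con.add(new URL(doapUrl), doapUrl, RDFFormat.RDFXML);
        return con;
    }
}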

This Thursday we will be giving a much more detailed overview of the Semantic Web in the BOF-6746 - Web 3.0: This is the Semantic Web, taking place at 8pm at Moscone on Thursday. Hope to see you there!

Wednesday Feb 14, 2007

JSR-311: a Java API for RESTful Web Services?

JSR 311: Java (TM) API for RESTful Web Services has been put forward by Marc Hadley and Paul Sandoz from Sun. The initial expert group members come from Apache, BEA, Google, JBoss, Jerome Louvel of the very nice RESTlet framework, and TmaxSoft.

But it has also created some very strong negative pushback. Elliotte Rusty Harold does not like it at all:

Remember, these are the same jokers who gave us servlets and the URLConnection class as well as gems like JAX-RPC and JAX-WS. They still seem to believe that these are actually good specs, and they are proposing to tunnel REST services through JAX-WS (Java API for XML Web Services) endpoints.

They also seem to believe that "building RESTful Web services using the Java Platform is significantly more complex than building SOAP-based services". I don't know that this is false, but if it's true it's only because Sun's HTTP API were designed by architecture astronauts who didn't actually understand HTTP. This proposal does not seem to be addressing the need for a decent HTTP API on either the client or server side that actually follows RESTful principles instead of fighting against them.

To give you an idea of the background we're dealing with here, one of the two people who wrote the proposal "represents Sun on the W3C XML Protocol and W3C WS-Addressing working groups where he is co-editor of the SOAP 1.2 and WS-Addressing 1.0 specifications. Marc was co-specification lead for JAX-WS 2.0 (the Java API for Web Services) developed at the JCP and has also served as Sun's technical lead and alternate board member at the Web Services Interoperability Organization (WS-I)."

Heavy words indeed.

Roy Fielding is dead against the name:
Marc, I already explained to Rajiv last November that I would not allow Sun to go forward with the REST name in the API. It doesn't make any sense to name one API as the RESTful API for Java, and I simply cannot allow Sun to claim ownership of the name (which is what the JSR process does by design). Change the API name to something neutral, like JAX-RS.

Jerome Louvel, whose RESTlet framework is very promising, has a fuller explanation of what is being attempted in this JSR.

I am still not so sure what they want to do exactly, but it seems to be based on JRA, which has annotations of the form:

@Get
@HttpResource(location="/customers")
public List<Customer> getCustomers();

@Post
@HttpResource(location="/customers")
public void addCustomer(Customer customer);
which frankly does seem a little weird. But I am willing to give it some time to understand it. Marc Hadley has more details about what he is thinking of.

If one were to standardize an API, why not standardize the RESTlet API? That makes more immediate sense to me. It is working code, people are already participating in the process, and the feedback is very good. Now I don't know exactly how JSRs work, nor what the relationship is between the initial proposal and the final solution. Is the final proposal going to be close to the current one, with those types of annotations? Or could it be something much closer to RESTlets?

On the other hand I am pleased to see a JSR that is proposing to standardize annotations. It makes me think the time may be ripe to standardize the @rdf annotations mapping POJOs to the semantic web, the way so(m)mer and elmo are doing.
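To give a flavour of what such a standard might look like - this is only an illustration, with a stand-in annotation type, not the exact so(m)mer or elmo syntax - a POJO mapped to the foaf vocabulary could read:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Collection;

// Stand-in for the so(m)mer/elmo annotations; illustrative only.
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.TYPE, ElementType.FIELD })
@interface rdf { String value(); }

@rdf("http://xmlns.com/foaf/0.1/Person")
class Person {
    @rdf("http://xmlns.com/foaf/0.1/name")
    String name;

    @rdf("http://xmlns.com/foaf/0.1/knows")
    Collection<Person> knows;
}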

So in summary: it is great that REST has attracted so much attention as to stimulate research into how to make it easy for Java developers to do the right thing by default. Let's hope that out of this something is born that does indeed succeed in reaching that goal.

Off topic: Elliotte Rusty Harold recently wrote up 10 predictions for xml in 2007, where he gives a very partial description of RDF, correctly pointing out the importance of GRDDL, but completely missing the importance of SPARQL.

Wednesday Feb 07, 2007

foaf enabling an enterprise

Developing the Beatnik Address Book (BAB), I have found, requires the software to implement the following capabilities:

  1. being able to read rdf files and understand foaf and vcard information at the very least. (on vcard see also the Notes on the vcard format).
  2. being able to store the information locally, keeping track of the source of the information so as to be able to merge and unmerge information on user demand
  3. being able to write semantic files out at some unchanging location
  4. an easy to use GUI (see the article on F3).

I would like to look at point 3, writing out a foaf file, today. At its simplest this is really easy. But in an enterprise environment, if one wants to give every employee a foaf file, so as to allow Universal Drag and Drop of people between applications inside the firewall, some questions need to be answered.

General Solution

The main and obvious solution is just to write a foaf file out to a server using ftp, scp, WebDav or the nascent Atom Protocol. Ftp and scp are a little tricky for the end user, as he would have to understand the relation between the directory structure of the ftp server and its mapping to the web server urls, as well as what is required to specify the mime type of the foaf file, which is very much dependent on the setup of the web server (see what I had to do to enable my personal foaf file). This may end up being a lot of work, with a steep learning curve, for someone who wishes to just publish their contact information. WebDav on the other hand, being a RESTful protocol, makes it much easier to specify the location of the file: wherever the file is PUT, that's its name. Similarly with the Atom Protocol, though I am not sure for either of them how good they are when confronted with arbitrary mime types. My guess is that WebDav should do much better here.

In any case, using either of the above methods one can always later overwrite a previous version if one's address book changes. This is the solution that will satisfy most use cases.
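A sketch of how simple that publishing step becomes over WebDav, where publishing is just an HTTP PUT (the url is of course made up):

import java.io.FileInputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class FoafPublish {
    public static void main(String[] args) throws Exception {
        HttpURLConnection con = (HttpURLConnection)
            new URL("http://eg.com/people/henry/card").openConnection();
        con.setRequestMethod("PUT"); // wherever the file is PUT, that's its name
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "application/rdf+xml");
        OutputStream out = con.getOutputStream();
        FileInputStream in = new FileInputStream("card.rdf");
        byte[] buf = new byte[4096];
        for (int n; (n = in.read(buf)) > 0; )
            out.write(buf, 0, n);
        in.close();
        out.close();
        System.out.println(con.getResponseCode()); // 201 or 204 on success
    }
}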

Professional Solution

In a professional setting though, things get a little more complicated. Consider for example any large Fortune 500 company. These companies already have a huge amount of information on their employees in their ldap directory. This is reliable and authoritative information, and should be used to generate foaf files for each employee. These companies usually have some web interface to the ldap server which aggregates information about a person in human readable form. Such a web interface - call it a Namefinder - could easily point to the machine generated foaf file.

Now the question is: should this foaf file be read only or read/write? If it is read/write then an agent such as the Beatnik Address Book could overwrite the file with information different from that stored in ldap, which could cause confusion and be frowned upon. Of course, the WebDav server could be clever and parse the incoming graph in such a way as to enforce the existence of a certain subgraph. So, given the following graph generated from the ldap database

<#hjs> a foaf:Person;
             foaf:mbox <mailto:henry.story@sun.com>;
             foaf:name "Henry Story";
             org:manager <12345#bt> .

An Address Book that would want to PUT a graph containing the following subgraph

<#hjs> a foaf:Person;
             foaf:mbox <mailto:henry.story@sun.com>;
             foaf:name "Henry Story";
             org:manager <#hjs> .

might

  • get rejected, because the server decides it owns some of the relations, especially the org:manager one in this case. (What HTTP return code should be returned on failure?)
  • or it may decide to rewrite the graph and remove the elements it does not approve of and replace them. That is, replace the triple <#hjs> org:manager <#hjs> with <#hjs> org:manager <12345#bt> for example. (Again what should the HTTP return code be?)

Both of those solutions are valid, but they end up creating a file of mixed ownership. Perhaps it would be better to have the file be read only, officially owned by the company, and have it contain a relation pointing to some other file owned by the user himself. Perhaps something like the following would appear in the file at http://foaf.sun.com/official/294551 :

<>   a foaf:PersonalProfileDocument;
       foaf:maker  <http://www.nasdaq.com/SUNW>;
       lnk:moreInfo </personal/294551> .

</personal/294551> a rights:EditableFile;
                    rights:ownedBy <#hjs> .

That is, in plain English, the resource would say that it is a PersonalProfileDocument, that Sun Microsystems is the maker of the file, and that more information is available at the resource </personal/294551>. It would also state who owns that resource. A PROPFIND on each of those files could easily confirm the access rights of each of them.
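For example, the WebDav access control extension (RFC 3744) defines a current-user-privilege-set property, so the check could be as simple as a PROPFIND along these lines (a sketch of the wire format):

PROPFIND /personal/294551 HTTP/1.1
Host: foaf.sun.com
Depth: 0
Content-Type: text/xml; charset=utf-8

<?xml version="1.0" encoding="utf-8"?>
<D:propfind xmlns:D="DAV:">
  <D:prop><D:current-user-privilege-set/></D:prop>
</D:propfind>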

Now from there it should be possible for the user agent (BAB in this case) to deduce that it has space to write information at </personal/294551>. There it can then write out all the personal information it likes: adding relations to DOAP files, to a personal home page, to interests and to other people known, etc... It could even add a pointer to a public foaf file with a statement such as

<http://foaf.sun.com/official/294551#hjs> owl:sameAs <http://bblfish.net/people/henry/card#me> .

Multiple Agent Problems

Having solved the problem of a writable user agent file, there remains the more distant problem of the same person, in a more semantically enabled future, ending up with multiple user agents all capable of writing foaf files, but each perhaps with slightly different interests. How would these user agents write files while making sure that they don't contradict each other, or overwrite important information that another requires? The NetBeans user agent may want to write out some relations in the foaf file using the doap ontology, to point to the projects the employee is working on... Perhaps it is as easy as just adding those triples to the file; or, if the same problem of ownership arises as above, it may be worth placing each triple into a different user agent space... This seems a bit far out for the moment; I'll look at that problem when I build the next Semantic application. If people have already come across this problem, please let me know.

Questions

The above are just some initial thoughts on how to do this. Are there perhaps already relations out there to help divide up the responsibility for writing these files between different agents, be they political or software ones? Are there other solutions I am missing?

Thursday Dec 14, 2006

Universal Drag And Drop

As the world becomes completely interconnected, we need to be able to drag and drop anything between computers, across operating systems and continents. There is one easy way to do this, which we have been using for a long time without perhaps realizing it: using URLs to send people web pages we like. Now we can generalize this to drag and drop anything.

In RDF we can use URLs to name anything we want. We can name people, for example. My foaf name is http://bblfish.net/people/henry/card#me. If you GET its meaning by clicking on the above link, the resource http://bblfish.net/people/henry/card will return a representation in rdf/xml describing me and the people I know. Using this feature I will be able to drag and drop my foaf name onto the AddressBook I am writing. It will be able to ask the resource for its preferred representation and, from the returned rdf, deduce that the given url points to a foaf:Person, and so display those connections it understands in the associated graph.

But we could go one step further with this mechanism. The operating system could easily be adapted to change the mouse cursor image when the drag event occurs, by fetching a cached representation (REST is designed for caching) or by querying the cached graph to find the type of the resource (a foaf:Person), and so display a stick figure - or, if a relationship to a logo is available, the logo of the person, with perhaps their initials printed below.
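On the receiving side, a Swing drop handler for this needs very little code. A minimal sketch: accept any dropped string that parses as a url, then GET it asking for RDF (the resulting stream would be handed to an RDF parser):

import java.awt.datatransfer.DataFlavor;
import java.awt.datatransfer.Transferable;
import java.net.URL;
import java.net.URLConnection;
import javax.swing.JComponent;
import javax.swing.TransferHandler;

public class UriDropHandler extends TransferHandler {
    public boolean canImport(JComponent comp, DataFlavor[] flavors) {
        // Dropped URLs usually arrive as plain strings.
        for (DataFlavor f : flavors)
            if (f.equals(DataFlavor.stringFlavor)) return true;
        return false;
    }

    public boolean importData(JComponent comp, Transferable t) {
        try {
            String dropped = (String) t.getTransferData(DataFlavor.stringFlavor);
            URLConnection con = new URL(dropped.trim()).openConnection();
            con.setRequestProperty("Accept", "application/rdf+xml");
            con.getInputStream().close(); // hand this stream to an RDF parser
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}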

All one needs for drag and drop are URLs - metadata is just one GET away.

Wednesday Aug 09, 2006

All XML roads lead to RDF

Where I prove that thinking of the web as a database does not mean thinking of it in terms of an xml database. Rather, we will have to think in terms of graphs.

XML applied to Documents

XML stands for eXtensible Markup Language. The most successful markup language is and remains html, with its cleaned up successor xhtml. So let me start there.

Xhtml is designed to mark up strings such as "Still pond A frog jumps in Plop!" with presentation information. In a wysiwyg editor I would do this by highlighting parts of the text with my mouse, then setting a property for that part of the text by clicking some button, such as bold, available to me from some menu. XML allows me to save this information in the document by giving me a way to create arbitrary parenthesizing properties. So for example, using the xhtml markup language, the following

<blockquote><font face="papyrus,helvetica">Still pond<br/>A frog jumps in<br/>Plop!<br/></font></blockquote>

should display in a compliant browser as

Still pond
A frog jumps in
Plop!

XML applied to Data

Take some information written out one way or another. This will usually be in some tabular format, as in a receipt, a spreadsheet or a database table. Take this receipt I have in front of me for example. The text on it is "1 Fresh Orange Juice 1,95 1 Café au Lait 1,20 1 Sparkling Water 1,40". This is not very easy to parse for a human, let alone a machine. But extrapolating from the experience with html, I can use xml to mark the text up like this:

<bill>
  <item><description>1 Fresh Orange Juice</description><price currency="euro">1,95</price></item>
  <item><description>1 Café au Lait</description><price currency="euro">1,20</price></item>
  <item><description>1 Sparkling Water</description><price currency="euro">1,40</price></item>
</bill>

This is now much more accessible to computer automation, as the machine can deduce from the tags what interpretation to give the enclosed string. And so was born the great enthusiasm for xml formats and web services.

World Wide Data

It is clear that by following the above procedure we can create machine readable documents for every type of data, stored in every kind of database available worldwide. We just need to mark it up. Wait - we need to do more. We need to agree on a vocabulary, and on a tree-like way of laying it out, since xml forms a markup tree. Let us assume for this article that the naming problem has somehow been solved, and look more closely at the data format problem.

Say I want to describe a house, then I will want to have an xml format something like this

<house>
   <owner>...</owner>
   <address>...</address>
   <rooms>
     <room>...</room>
     ...
   </rooms>
</house>

but if I want to describe a person I would of course describe them like this

<person>
   <name>...</name>
   <owns><house>...</house></owns>
   <friend>...</friend>
   <friend>...</friend>
   ....
</person>

In the first case the person object is part of the house document, whereas in the second case the house information is part of the person document. Both are equally valid ways of doing this. This is not an isolated case: it will happen whenever we wish to describe some object. No object has priority over any other. Here is another example. We may want to describe a book like this:

<book>
   <author>Ken Wilber</author>
   <title>Sex, Ecology, Spirituality</title>
   ...
</book>

but of course if I had a CV/resumé of Ken Wilber then my xml would be like this

<CV>
  <name>Ken Wilber</name>
  <publications>
     <book>...</book>
  </publications>
</CV>

again there is no natural way of putting things. In one case the <book> element is the root of the tree, in another it is an element of the tree. It follows that every type of object will require its own type of document to describe it. This would not be a problem if the world were composed just of turtles. But it isn't: there are an infinite number of types of things in our very rich world. Furthermore, what is of interest in each type of document depends completely on the context. In one type of document we may be more interested in the friends a person has, in another in his medical history, in yet another his academic achievements, etc... So there is not even one objective way to describe anything! If we were to create a tree structured document to describe every type of thing we are interested in, we would therefore also need to create an uncountable number of document formats for every different way we wanted to describe each class of objects.

Conclusion

This is summarized simply by saying "The World is a Graph". The world can be described holistically as consisting of objects and relations between those objects. Take any object in the world, and you will be able to reach any other object by following relations stemming from it. Make that type of object the root of your graph, and you have an xml format.

So the problem is not so much that it is not possible to describe each subgraph we find using XML. One can! The problem emerges rather when considering the tools required to query and understand these documents. It is clear from the arguments above that, when thinking web wide, one has to give up the idea that information will reach one in a limited number of hierarchically structured ways. As a result, tools such as XQuery, which are designed to query documents at the xml structure level, are not adapted for querying information across documents, since the tree structure of the xml documents gets in the way of the description of the graph that the world is, and that the documents are attempting to describe. XQuery people know this, which is why they don't like RDF. But it is not RDF that is the problem. It is reality that is the problem. And that is a lot more difficult to change.

To repeat: if RDF had never been invented, your database of documents would end up containing an infinitely large number of different types of xml documents, describing the infinite types of objects out there, each of course requiring its own specific interpretation (since XML does not come with a semantics). So you may as well start off using RDF, since that is where you will end up anyway.
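To see the difference, here is the receipt from earlier as a graph, written in Turtle with a made-up shop vocabulary. Nothing is the root; the bill is just one more node that anything else can point to:

@prefix shop: <http://shop.eg/ns#> .

<#bill> a shop:Bill;
   shop:item [ shop:description "1 Fresh Orange Juice"; shop:price "1,95"^^shop:euro ],
             [ shop:description "1 Café au Lait"; shop:price "1,20"^^shop:euro ],
             [ shop:description "1 Sparkling Water"; shop:price "1,40"^^shop:euro ] .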

The world is an interconnected graph of things. RDF allows one to describe the world as a graph. SPARQL is the query language to query such a graph. Use the tools that fit the world!
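As a small taste, here is a SPARQL query over the bill graph sketched above, picking out each item's description and price (the vocabulary is still made up):

PREFIX shop: <http://shop.eg/ns#>
SELECT ?desc ?price
WHERE {
    ?bill a shop:Bill;
          shop:item ?i .
    ?i shop:description ?desc ;
       shop:price ?price .
}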

Note

  • This is not to say that rdf/xml is perfect. I myself believe it is a really good first attempt at doing something very ambitious. Sadly it was done a little too early. Something better will certainly come along. In the meantime it is good enough for nearly everything anyone may want to do with it when wishing to send data out on the web.
  • Having many XML documents is not a problem for the Semantic Web, since it is easy to convert each of the formats to rdf using GRDDL.