Thursday Feb 12, 2009

creating a foaf+ssl cert in a few clicks

In a previous blog I showed how to create a foaf+ssl cert manually. I have now put up a simple test server where you can do the same in a few clicks.

The service will add a certificate to your browser securely, and create a local account to which your certificate points. This account will itself point to your Web Id. An account in this scheme is nothing more than an RDF file, such as the ones listed in the certs directory. You can then log in to other foaf+ssl services on the web in one click, without needing to remember a URL. There is a small but growing number of prototype implementations listed on the foaf+ssl wiki.

All that remains to be done now is to create more interesting and valuable services using the distributed social networks and foaf+ssl. For some ideas on things to do consult the foaf+ssl Use Cases. The foaf protocols mailing list is a great place to get help on implementations and discuss ideas.

The server is written using Wicket and deployed on a GlassFish Application Server. The code is open source under a BSD licence. Hackers welcome!

Saturday Jan 17, 2009

foaf+ssl: creating a web of trust without key signing parties

The concept of a Web of Trust is most closely associated with Phil Zimmermann and PGP. The basic idea is that by signing each other's keys, usually at events such as key signing parties, people could grow the network of keys they trusted to sign or encrypt documents such as email, legal documents, etc. The distributed system of trust feels right, but the idea never really took off, even though the key signing parties must have been fun, probably because they required physical presence. Another problem with the PGP web of trust is that the signers of your key, or your signature on someone else's key, will forever be published on one of the many key servers, making it close to impossible to revoke an association once published.

In foaf+ssl we are also using a Web of Trust mechanism, but as I will show here, this does not require key signing. It should therefore be able to grow much faster, and hopefully give us the same benefits. The friendship relations are furthermore not embedded in the signature. They can be made visible only to the people you choose, and that choice can be changed at any moment.

I wrote this rather long post as I was starting to answer a question John Kemp asked in the comments of my duck rabbit post on the topic of authorization in foaf+ssl. As I found the answer was getting longer and longer, I decided this justified its own blog entry. So I published it here instead.

John asked a question that forced me to detail how the trust mechanism in foaf+ssl works. Here it is:

The problem (I think) with how you use the certs [in foaf+ssl] though is that the real trust (if Juliet does not know Romeo a priori) is that Juliet's friends know Romeo, and when I say "know", I don't mean that in any cryptographic sense (Juliet's friends haven't signed Romeo's key/key fingerprint for example). Why wouldn't it then be enough for Juliet to base her trust on the appearance of Romeo's OpenID in her friends' FOAF files, for example?

I like the web of trust model, but in order for there to be verifiable trust based on certs/keys, don't you also need key/cert/fingerprint signing parties?

Since this is going to require thinking carefully about the foaf+ssl protocol, we may as well have its UML sequence diagram in front of our eyes. Here it is:

Remember that at stage 5 Juliet's server knows only the following about <>

  1. The Agent connecting via ssl has access to the private key that matches the public key sent in the cert (because otherwise he could not have signed the cert, and could not have established the ssl connection)
  2. The Agent wishes to be identified as ""
  3. Dereferencing the information resource <> returns the document <> which states that anyone who can prove to have the private key for the given public key is <>

Juliet's server can then conclude that the Agent making the request is indeed <> - whoever or whatever that is. Juliet's server can be as confident in this fact as the cryptography algorithms allow her to, which is pretty good.

So at this point we have something between identification and authentication, I am not sure.

When Basing Trust on OpenId does not work

Other statements returned by <> are extra claims and are as trustworthy as statements made by <>. They may just by themselves solve some really interesting puzzles for Juliet's server that could make it trust <> more than anyone else it ever met -- but web servers of this intelligence are unlike any I have yet seen.

More realistically, the question for Juliet's server is whether it should authorize access to the protected resource. If Juliet's friends identify Romeo indirectly via his OpenId, and using that only, with something like the following relations in N3

  @prefix foaf: <> .
  @prefix : <#> .

  :anne foaf:knows _:bN .
  _:bN foaf:name "Romeo";
       foaf:openid <> .

and Romeo publishes a relation he has to an OpenId too such as

  @prefix : <#> .  

  :romeo foaf:openid <> .

Then if Juliet's server wants to motivate access to the protected resource it would need to believe the following

  :anne foaf:knows :romeo .

which appears nowhere. It would have to be inferred from a statement such as

  _:bN = :romeo .

This can indeed be inferred from the statements in Romeo's file giving his OpenId, and the statement in :anne's file stating the relation of the blank node _:bN to the same OpenID, since foaf:openid is an inverse functional property. But what confidence can Juliet's server have in that? Well, Juliet's server can be only as confident of that as it is of the assertion made by <> that his openid is <>. This can hardly be said to count as confirmatory evidence. If Juliet's server thinks like that, it might as well make the resource public. For it would be the equivalent of a prison officer freeing a prisoner solely on the basis of the prisoner's claim that he is himself an officer.
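To make the dependency on Romeo's own claim concrete, here is a minimal Python sketch of the inference, under some loud assumptions: statements are modelled as plain (subject, predicate, object, source) tuples rather than with a real RDF library, and the file names and the OpenID URL are invented for illustration. It merges nodes that share a foaf:openid value and records which sources each derived statement rests on.

```python
KNOWS = "foaf:knows"
OPENID = "foaf:openid"

def derive_knows(quads):
    """Derive foaf:knows statements by merging nodes that share an openid.

    Each quad is (subject, predicate, object, source). The result maps each
    derived triple to the set of sources the inference depends on.
    """
    # openid -> {node: source that asserted the node has this openid}
    claims = {}
    for s, p, o, src in quads:
        if p == OPENID:
            claims.setdefault(o, {})[s] = src

    derived = {}
    for s, p, o, src in quads:
        if p != KNOWS:
            continue
        for nodes in claims.values():
            if o not in nodes:
                continue
            # foaf:openid is inverse functional: every node with this
            # openid denotes the same agent as o.
            for other, other_src in nodes.items():
                if other != o:
                    sources = derived.setdefault((s, KNOWS, other), set())
                    sources.update({src, nodes[o], other_src})
    return derived

# Hypothetical data: anne.rdf is :anne's file, romeo.rdf is Romeo's own file.
quads = [
    (":anne", KNOWS, "_:bN", "anne.rdf"),
    ("_:bN", OPENID, "https://openid.example/romeo", "anne.rdf"),
    (":romeo", OPENID, "https://openid.example/romeo", "romeo.rdf"),
]
support = derive_knows(quads)[(":anne", KNOWS, ":romeo")]
print(sorted(support))  # ['anne.rdf', 'romeo.rdf']
```

The derived statement that :anne knows :romeo lists romeo.rdf among its sources, which is exactly why it cannot count as independent evidence.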

Who should one trust?

On the other hand if Juliet's friend :anne had claimed that

   :anne foaf:knows <> . 

then Juliet's server would have had the piece of information needed to authorize access to the protected resource, because that information came from a trusted party.

So in summary, when Juliet's server is looking to evaluate the trust it can have in :romeo it should not ask :romeo himself. It should ask other people in the social networks she trusts. So the graph it needs to search is everything except what is said by :romeo. Juliet's server can go on the following:

  • that it is speaking to <>
  • what Juliet believes
  • what Juliet believes of what her friends claim
  • the consequences of the claims it is able to or willing to calculate
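The rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each trusted friend's document is represented as a set of (subject, predicate, object) tuples, the identifiers are made up, and the function names friends_who_know and authorize are hypothetical. Romeo's own document is simply never consulted.

```python
KNOWS = "foaf:knows"

def friends_who_know(friend_docs, web_id):
    """Return Juliet's friends whose own documents say they know web_id.

    friend_docs maps each trusted friend's identifier to the set of
    (subject, predicate, object) triples found in that friend's document.
    """
    return {
        friend
        for friend, triples in friend_docs.items()
        if (friend, KNOWS, web_id) in triples
    }

def authorize(friend_docs, web_id, minimum=1):
    """Grant access only if enough trusted friends vouch for web_id."""
    return len(friends_who_know(friend_docs, web_id)) >= minimum

# Hypothetical data: only documents published by Juliet's friends appear here.
ROMEO = "https://romeo.example/foaf#me"
friend_docs = {
    ":anne": {(":anne", KNOWS, ROMEO)},
    ":jane": {(":jane", KNOWS, "https://paris.example/foaf#me")},
}
print(authorize(friend_docs, ROMEO))             # True
print(authorize(friend_docs, ROMEO, minimum=2))  # False
```

The minimum parameter covers stricter policies, such as a resource open only to people known by two of Juliet's friends.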

Now the OpenId can in fact come in useful, but not directly as may have been hoped initially. Imagine that Juliet now has another resource that she only gives access to, to people known by two of her friends. If only one of her friends, say :jane makes the assertion that

  :romeo foaf:openid <> .

but all her other friends refer to :romeo indirectly, then her web server could use that information together with the statement made by :anne to deduce that indeed at least two of Juliet's friends know :romeo.

The value of information published by Romeo

Now is it absolutely true that Juliet's server can do nothing with the information returned by the document <>? Not at all! It is just that it has to be used in an exploratory manner. Imagine a third resource that is accessible to friends of friends of Juliet's friends. We could imagine that <> had returned a list of friends for :romeo, perhaps with the following relations:

:romeo foaf:knows <>, <>, <> .

Juliet's server could decide to dereference (HTTP GET) both <> and <> because they were known by her friends and see if any of those claimed to know <>.
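As a rough Python sketch of this exploratory use, assuming triples are plain tuples and that fetching a document is abstracted behind a fetch callable (so the example runs without any network access); every name and URL below is hypothetical:

```python
KNOWS = "foaf:knows"

def confirmed_by(claimed_friends, known_to_juliets_friends, romeo, fetch):
    """Follow only those leads from Romeo's file that Juliet's network vouches for.

    fetch(uri) is expected to return the set of triples published at uri.
    """
    confirmations = []
    for candidate in claimed_friends:
        if candidate not in known_to_juliets_friends:
            continue  # nobody Juliet trusts knows this agent, so skip it
        triples = fetch(candidate)
        # Romeo's claim only counts once the other party confirms it.
        if (candidate, KNOWS, romeo) in triples:
            confirmations.append(candidate)
    return confirmations

# Hypothetical web of documents, keyed by the URI that would be dereferenced.
ROMEO = "https://romeo.example/foaf#me"
BEN = "https://ben.example/foaf#me"
MERC = "https://mercutio.example/foaf#me"
TYB = "https://tybalt.example/foaf#me"
web = {
    BEN: {(BEN, KNOWS, ROMEO)},
    MERC: set(),
    TYB: {(TYB, KNOWS, ROMEO)},
}
result = confirmed_by([BEN, MERC, TYB], {BEN, MERC}, ROMEO, web.__getitem__)
print(result == [BEN])  # True
```

Romeo's list is used only to decide where to look. The confirmation itself still has to come from a document published by someone Juliet's friends know.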

Special case: when the OpenId published by Romeo is trustworthy

This highlights then a special case where the OpenId published by Romeo can work correctly out of the box. This is when his OpenId is his foaf file, i.e. when Romeo publishes something like:

@prefix foaf: <> .

<> a foaf:PersonalProfileDocument;
            foaf:primaryTopic <> .

<> a foaf:Person;
            foaf:openid <> .
Here the foaf personal profile document is the same as the one returned when dereferencing Romeo's web id <>. The openid statement is returned in the same representation as the statements about :romeo's public key that we are already trusting, and it is made about the same person. Since an OpenId is a resource controlled by the person whose openid it is, and since we believe that this resource is now controlled by <> I think we can safely say that Juliet's server knows :romeo's OpenId in this particular case too.

There is another case where the OpenId published by :romeo could work, even when the OpenId is not the same as the foaf:PersonalProfileDocument, but in this case it would require some verification. So let us get back to the situation described earlier in this post where Romeo's WebID claims that

  :romeo foaf:openid <> .
For Juliet to believe this she would have to verify that :romeo controls that OpenId. This could be done very simply if Romeo has that OpenId page link back in some way to his foaf ID. Perhaps, as I suggested in "Foaf and Openid", this could be done simply by adding the following to the header of the OpenId page:
<link rel="meta" type="application/rdf+xml" title="FOAF" href=""/>
Juliet's web server would then be able to fetch the OpenId page, and having found the above link back to :romeo's Personal Profile Document, it would be able to conclude that he controls that page, and so that it is a legitimate OpenId for him. With foaf+ssl Juliet's server can verify an OpenId with only one extra connection!
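A minimal sketch of that link-back check in Python, using only the standard library's html.parser. It assumes the OpenId page has already been fetched into a string, that the href is an absolute URL (relative links are not resolved here), and that the URLs and the helper name openid_links_back are made up for the example.

```python
from html.parser import HTMLParser

class MetaLinkFinder(HTMLParser):
    """Collect href values of <link rel="meta" type="application/rdf+xml"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "link"
                and attributes.get("rel") == "meta"
                and attributes.get("type") == "application/rdf+xml"
                and attributes.get("href")):
            self.hrefs.append(attributes["href"])

def openid_links_back(openid_html, profile_document_url):
    """True if the OpenId page points back at the given profile document."""
    finder = MetaLinkFinder()
    finder.feed(openid_html)
    return profile_document_url in finder.hrefs

# Hypothetical OpenId page content and profile document URL.
page = """<html><head>
<link rel="meta" type="application/rdf+xml" title="FOAF"
      href="https://romeo.example/foaf.rdf"/>
</head><body>Romeo's OpenId page</body></html>"""
print(openid_links_back(page, "https://romeo.example/foaf.rdf"))    # True
print(openid_links_back(page, "https://mallory.example/foaf.rdf"))  # False
```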

Web of Trust and Key Signing

All the above clearly shows that you can create a web of trust without key signing parties. Parties are nice, but requiring key signing parties is something that has seriously dampened the adoption of PGP and the web of trust. By just dragging and dropping a URL from a web page, an email, or another application into your foaf Address Book (be it web based or not), you can grow your web of trust much faster than key signing can. Furthermore you can change your public key when you no longer need it, lose it, or whatever, without needing to re-sign all your keys. This therefore adds to the security of your web of trust.

This is not to say that foaf+ssl is incompatible with key signing, by the way, and it may be interesting to find out where key signing remains useful.

Thursday Jan 15, 2009

The W3C Workshop on the Future of Social Networking Position Papers

picture by Salvador Dalí

I am in Barcelona, Spain (the country of Dali) for the W3C Workshop on the Future of Social Networking. To prepare for this I decided to read through the 75 position papers. This is the conference I have been the best prepared for ever. It really changes the way I can interact with other attendees. :-)

I wrote down a few notes on most of the papers I read through, to help me remember what I read. This took me close to a week, a good part of which I spent trying to track down the authors on the web, find their pictures, familiarise myself with their work, and fill out my Address Book. Anything I could do to find as many connections as possible to help me remember the work. I used delicious to save some subjective notes, which can be found under the w3csn tag. I was going to publish this on Wednesday, but had not quite finished reading through all the papers. I got back to my hotel this evening to find that Libby Miller, who co-authored the foaf ontology, had beaten me to it with the extent and quality of her reviews, which she published in two parts:

Amazing work Libby!

70 papers is more than most people can afford to read. If I were to recommend just a handful of papers that stand out in my mind for now these would be:

  • Paper 36, "Decentralization: The Future of Online Social Networking" by Ching-man Au Yeung, Ilaria Liccardi, Kanghao Lu, Oshani Seneviratne and Tim Berners-Lee, is the must-read paper. I completely agree with this outlook. It also mentions my foaf+ssl position paper, which of course gives it full marks :-) I would perhaps use "distribution" over "decentralisation", or some word that better suggests that the social network should be able to be as much of a peer to peer system as the web itself.
  • "Leveraging Web 2.0 Communities in Professional Organisations" really proves why we need distributed social networks. The paper focuses on the problem faced by Emergency Response organisations. Social Networks can massively improve the effectiveness of such responses, as some recent catastrophes have shown. But ER teams just cannot expect everyone they deal with to be part of just one social network silo. They need to get help from anywhere it can come from: from professional ER teams, from people wherever they are, from information wherever it finds itself. Teams need to be formed ad hoc, on the spot. Not all data can be made public. Distributed Open Secure Social Networks are what is needed in such situations. Perhaps the foaf+ssl proposal (wiki page) can help to make this a reality.
  • In "Social networking across devices: opportunity and risk for the disabled and older community", Henny Swan explains how much social networking information could be put to use to help make better user interfaces for the disabled. Surprisingly enough none of the web sites, so taken by web 2.0 technologies, seem to put any serious effort in this space. Apparently though this can be done with web 2.0 technologies, as Henny explains in her blog. The Semantic Web could help even further, I suggested to her at her talk today, by splitting the data from the user interface. Specialised browsers for the disabled could adapt the information for their needs, making it easy for them to navigate the graph.
  • "Trust and Privacy on the Social Web" starts the discussion in this very important space. If there are to be distributed social networks, they have to be secure, and the privacy and trust issues need to be looked at carefully.
  • On a lighter note, Peter Ferne's very entertaining paper "Collaborative Filtering and Social Capital" comes with a lot of great links and is a pleasure to read. Did you know about the Whuffie Index or CELEBDAQ? Find out here.
  • Many of the telecoms papers, of which Telefonica's "The social network behind telecom networks" is one, reveal the elephant in the room that nobody saw in social networking: the telecoms. Who has the most information about everyone's social network? What could they do with this information? How many people have phones, compared to internet access? Something to think about.
  • Nokia's position paper can then be seen in a different light. How can handset manufacturers help put to use the social networking and location information contemporary devices are able to access? The Address Book in cell phones is the most important application in a telephone. But do people want to only connect to other Nokia users? This has to be another reason for distributed social networks.

    I will blog about other posts as the occasion presents itself in future blogs. This is enough for now. I have to get up early and be awake for tomorrow's talks which start at 8:30 am.

    In the mean time you can follow a lively discussion of the ongoing conference on twitter under the w3csn tag.

    Tuesday Dec 30, 2008

    foaf+ssl, pki and the duck-rabbit

    In part II §xi of the "Philosophical Investigations", Ludwig Wittgenstein introduces the duck-rabbit figure:

    I shall call the following figure derived from Jastrow, the duck-rabbit. It can be seen as a rabbit's head or as a duck's. And I must distinguish between the 'continuous seeing' of an aspect and the 'dawning' of an aspect.

    The picture might have been shewn me, and I never have seen anything but a rabbit in it.

    It is worth stopping here and considering that illustration carefully, making sure you can see it one way then the other. Notice that there is no illusion here. There is not one correct way to see the line. The figure itself is ambiguous. The duck-rabbit therefore shows very simply how the way we perceive the world can change without any new fact appearing in the world.

    Is that not what magic does?

    Much more complex examples of this phenomenon can be found. In some cases it is much more difficult to switch between meanings. I find this for the Young Woman Old Woman image for example. I really need to work hard there to see the other interpretation, and when I find that interpretation I find switching back very difficult.

    Recently I have felt that the foaf+ssl protocol does something similar to Public Key Infrastructure (PKI). We use a tool that was always meant to be used one way, in a completely different way, a way of course that was always permitted, but that nobody saw (or if they did they did not pursue it openly).

    To perceive this different way of using this tool one has to - just as with the duck-rabbit - look at it differently. One has to see it in a new way, or perhaps even use it in a new way. Whereas PKI is used for hierarchical trust, we use it to build a web of trust. Where X509 certs built up a lot on the Distinguished Name hierarchy, we nearly ignore it. Where X509 tried to place information in the certificate, we place it outside at the name location. Even though SSL can request client certificates in the browser, nobody does this, yet we build on this little known feature. Self signed client certificates, which would not have made sense in a traditional PKI because they prove nearly nothing about the client, are what we build everything on....

    All the usual X509 and ssl tools work just as they should, but magically it seems they are suddenly found to be doing something completely different.

    Friday Dec 19, 2008

    what does foaf+ssl give you that openid does not?

    Jason Kolb asked on Twitter "what does foaf+ssl give you that openid does not?". I can make the answer short but not short enough for a tweet. So here are my initial thoughts on this.

    • foaf+ssl gives people and other agents a URL for Identification, just like OpenId does. But in the case of foaf+ssl the user does not need to remember the URL, the browser or keychain does. A login button on a foaf+ssl web site is just a button. No need to enter any identifier. Just click the button. Your browser will then ask you what identity you wish to use. The user does not need to remember the password either (except perhaps that of the keychain if the browser requires it).
    • The foaf+ssl protocol requires a minimum of 1 to 2 network connections. Compare this to the much more complex OpenId sequence diagram. In a world of distributed data where each site can point to data on any other site, this can become really important.
    • The description of foaf+ssl fits on one page. A page is required just to list the OpenId specs.
    • foaf+ssl builds on well established standards: REST, RDF, SSL, X509. That is why of course it takes much less space to explain. It does not invent anything new.
    • foaf+ssl is clearly RESTful. You can GET your foaf file, and if needed update it with PUT. You could create it with POST. No need to reinvent those verbs as OpenId has to do in the OpenId Attribute Exchange spec.
    • It is easy to add new attributes to the rdf file. It is easy to extend, and to give the extensions meaning. Every attribute is a URI, which when clicked on can give you yet more information about the relation, and participate in the Linked Data cloud. New classes can be created. You can add relations to objects, and those objects themselves can have yet more relations (see my foaf file, and how it relates me to an address, which is related to a country). The complex OpenId attribute exchange spec does not offer any of this.
    • You can reason about the foaf. Well that just comes for free with RDF and OWL. (So you can do this too with OpenId, but you'd have to treat it as a special case of RDF for that to work.)
    • Being simpler, it will be easier to
    • With foaf+ssl you get a web of trust. With OpenId you only get trust indirectly if you trust the OpenId provider. So for example you may trust the information gathered by the foaf+ssl attribute exchange mechanism of someone who has an OpenId provider at the url, because you trust Sun Microsystems. With foaf+ssl you can get trust of some file on some web server you never heard about because all your friends point to his foaf file.
    • Foaf+ssl is distributed. There is no need for an OpenId provider. You just need a web server, ideally your own at your own domain name. Yes you can run your OpenId server locally too, but then you lose the trust that might have been associated with that domain name. Have you ever wondered why there were so many very large OpenId providers, and not many small ones?
    • Foaf+ssl requires no HTTP redirects: these are problematic on many cell phones I am told, often in part because telecoms proxies get in the way.

    OpenId is very well known and widely used now. It has made people aware of the power of a URL for identifying people, and is what helped me find this solution. Furthermore it would be quite easy to create a foaf+openid service as I proposed some time ago, simplifying OpenId in the process. So the two technologies are not incompatible.

    More on foaf+ssl on the esw wiki

    foaf+ssl user story 1: web site personalisation

    In Agile development one creates simple User Stories. Here is the simplest one I can think of for foaf+ssl. It only uses the authentication piece, not the authorization part, so all the steps up to and including 5 in the sequence diagram.

    Prerequisite: A User has a foaf+ssl certificate in his browser and corresponding foaf file.

    The User arrives at a new web site he has never been to before. An https connection is made and the server asks for the client certificate. The User chooses one. The web site fetches the user's foaf file at the URI contained in the certificate and uses this to personalise the site. Some things it could do would be

    • Welcome the user by name
    • List friends the user may know on the site
    • List projects the user may be interested in
    • Create an account for the user, ie, some space on the server dedicated to the user.
    What it can do will depend on the site, the information in the foaf file, and the location of the user's URI in the social network known to the web service.

    Wednesday Dec 17, 2008

    python and php implementations of foaf+ssl

    We now have two new implementations of foaf+ssl authentication protocol, in addition to the java one I blogged about earlier. If you have followed the procedure there to create your certificate, add it to your browser, and publish a minimal foaf file you can then try out these two servers.

    Melvin Carvalho, who owns the great domain name, has implemented this in PHP in a very nicely layered fashion. In a recent mail to the foaf protocols list he published the following end points:

    1. a test ssl resource which, over a simple ssl connection that asks for the client certificate, will:
      • Display the output of the $_SERVER global variable
      • Display the details in the supplied Client Certificate
      • Display the Client Public Key info
      • Function returning the Client Public Key info in HEX
      • Function returning the subjectAltName in the Client Certificate
    2. foaf tester that after getting the URI in your certificate from the X509 v3 extensions section will fetch the foaf at that URL and
      • Convert the FOAF into an array of triples which it displays
      • Find the RSA Key of the declared subject ("owner") within a FOAF file
      • Get the list of friends in a FOAF file
    3. and finally the foaf+ssl tester, which Melvin pointed to in another email to the list, which will use the foaf+ssl protocol to log you into a server in one https connection. The server only does authentication and the minimal authorization: if it can authenticate you, then you are authorized

    These three minimal services are very helpful as they allow us to detect and debug each stage in the protocol carefully. I highly recommend this step by step approach (and will therefore have to add this to my own examples!)

    Ian Jacobi from MIT has worked on extending authorization further with his python based server, which also checks your identity in a social network. See his detailed post on this, "TAAC in action". Ian was in fact the first to have a running implementation, I'd like to point out.

    Keep these coming!

    In the meantime I am working on authorization schemes, and am currently reading a complex paper by Vladimir Kolovski, James Hendler, and Bijan Parsia entitled "Formalizing XACML Using Defeasible Description Logics". Clark Kendall is blogging about this under the policy management tag, which contains a less mathematical overview of the paper. I'll report back when I have managed to digest this. Read it if you need an antidote to twitter.

    Thursday Dec 04, 2008

    JavaOne 2009 call for papers

    Picture of JavaOne2008 keynote conference room

    The JavaOne 2009 call for papers is now open (direct link to form). The deadline for paper submissions is December 19th.

    Last year we had three Semantic Web related talks: one panel presentation, an introduction by Dean Allemang, and a small Birds of A Feather session. The talks went very well and were very well attended, surprisingly so given that they were somewhat in the wrong logical order, starting with the panel discussion, and ending with theory. Dean Allemang had over 300 attendees at his talk ( slides ). JavaOne is huge compared to most developer conferences. There are usually over 15 thousand attendees, so it is an excellent venue to speak to and convert a very large crowd to something new in one go.

    I don't expect us to grow at the same rate as we did last year (we had a 200% increase in the number of talks). But I think we really should fit in some presentations on Java Semantic Web Frameworks, such as Sesame, Mulgara, Jena, or something that gives an overview of all of them. But I am not here to decide what goes in these talks. The track to look at is probably the services track, which covers a huge swath from cloud computing to web 2.0, SOA and more.

    Remember that JavaOne attendees are practical people most of all. There is also a very large space for businesses to introduce attendees to their products. So we are here at the point where research meets business.

    I know this clashes with the 6th European Semantic Web Conference in Greece, so I myself may have to do the impossible task of being at both simultaneously. On the other hand it is only one week before the Semantic Technology Conference in San Jose, so it can be a good time to visit the Bay Area, and meet the companies here, or vacation in the sun. :-)

    See: JavaOne2008 or JavaOne tagged photos on flickr.

    video on distributed social network platform NoseRub

    I just came across this video on Twitter by pixelsebi explaining Distributed social networks in a screencast, and especially a php application NoseRub. Here is the video.

    Distributed Social Networking - An Introduction from pixelsebi on Vimeo.

    On a "Read Write Web" article on his video, pixelsebi summarizes how all these technologies fit together:

    To sum it up - if I would have to describe it somebody who has no real clue about it at all:
    1. Distributed Social Networking is an architecture approach for the social web.
    2. DiSo and Noserub are implementations of this "social web architecture"
    3. OpenSocial REST API is one of many ways to provide data in this distributed environment.
    4. OpenSocial based Gadgets might run some time at any node/junction of this distributed environment and might be able to handle this distributed social web architecture.

    So I would add that foaf provides semantics for describing distributed social networks, foaf+ssl is one way to add security to the system. My guess is that the OpenSocial Javascript API can be decoupled from the OpenSocial REST API and produce widgets however the data is produced (unless they made the mistake of tying it too closely to certain URI schemes)

    Tuesday Dec 02, 2008

    foaf+ssl: adding security to open distributed social networks

    For the "W3C Workshop on the Future of Social Networking", taking place in Barcelona January 2009

    Henry Story
    Bruno Harbulot, Ian Jacobi, Toby Inkster
    Melvin Carvalho

    Semantic Web vocabularies such as foaf permit distributed hyperlinked social networks to exist. We would like to discuss a group of related ways we are exploring (mailing list) to add information and services protection to such distributed networks.

    One major criticism of open networks is that they seem to have no way of protecting the personal information distributed on the web or limiting access to resources. Few people are willing to make all their personal information public, many would like large pieces to be protected, making it available only to a select group of agents. Giving access to information is very similar to giving access to services. There are many occasions when people would like services to only be accessible to members of a group, such as allowing only friends, family members, colleagues to post a blog, photo or comment on a site. How does one do this in a maximally flexible way, without requiring any central point of access control?

    Using an intuition made popular by OpenID we show how one can tie a User Agent to a URI by proving that he has write access to it. foaf+ssl is architecturally a simpler alternative to OpenID (fewer connections), that uses X.509 certificates to tie a User Agent (Browser) to a Person identified via a URI. However, foaf+ssl can provide additional features, in particular, some trust management, relying on signing FOAF files, in conjunction with set of locally trusted keys, as well as a bridge with traditional PKIs. By using the existing SSL certificate exchange mechanism, foaf+ssl integrates more smoothly with existing browsers (pictures with Firefox) including mobile devices, and permits automated sessions in addition to interactive ones.

    The steps in the protocol can be summarised simply:

    1. A web page points to a protected resource using an https URL, e.g.
    2. The client fetches the secure http URL .
    3. As part of that exchange the server requests the client certificate. The client returns Romeo's (possibly self-signed) certificate, containing the little known X.509 v3 extensions section:
              X509v3 extensions:
                 X509v3 Subject Alternative Name: 
      Because the connection is encrypted, Juliet's server knows that Romeo's client knows the private key of the public key that is also passed in the certificate. Something like:
            Subject Public Key Info:
                  Public Key Algorithm: rsaEncryption
                  RSA Public Key: (1024 bit)
                      Modulus (1024 bit):
                      Exponent: 65537 (0x10001)
    4. Juliet's server dereferences the URI found in the certificate, fetching a document .
    5. The document's log:semantics is queried for information regarding the public key contained in the previously mentioned X.509. This can be done in part with a SPARQL query such as:
      PREFIX cert: <>
      PREFIX rsa: <>
      SELECT ?modulus ?exp
      WHERE { 
         ?key cert:identity <>;
              a rsa:RSAPublicKey;
              rsa:modulus [ cert:hex ?modulus; ];
              rsa:public_exponent [ cert:decimal ?exp ] .
      }
      If the public key in the certificate is found to be identical to the one published in the foaf file, the server knows that the client has write access over the resource.
    6. Romeo's identity is then checked as to its position in a graph of relations (including friendship ones) in order to determine trust according to some criteria. Juliet's server can get this information by crawling the web starting from her foaf file, or by other means.
    7. Access is granted or denied.
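
    The key comparison in step 5 can be sketched in a few lines of Python. This is only an illustration, not the code of any of the implementations mentioned here: the function names normalise_hex and keys_match are hypothetical, and the sketch assumes the modulus has already been pulled out of the certificate and out of the SPARQL result as hexadecimal strings, and the exponent as an integer and a decimal string respectively.

```python
def normalise_hex(modulus_hex: str) -> int:
    """Turn a hex modulus, possibly containing whitespace or colons, into an int."""
    cleaned = "".join(ch for ch in modulus_hex if ch in "0123456789abcdefABCDEF")
    return int(cleaned, 16)


def keys_match(cert_modulus_hex: str, cert_exponent: int,
               foaf_modulus_hex: str, foaf_exponent_dec: str) -> bool:
    """True if the certificate's RSA key equals the one published in the foaf file."""
    return (normalise_hex(cert_modulus_hex) == normalise_hex(foaf_modulus_hex)
            and cert_exponent == int(foaf_exponent_dec))


# Made-up, very short modulus for illustration only.
print(keys_match("00:B6:C4:2A", 65537, "b6c4 2a", "65537"))
```

    Comparing the numbers rather than the strings means that differences in case, leading zeros or separators between the two sources do not cause a false mismatch.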

    We have tested this on multiple platforms in a number of different languages (Java™, Python, ...) and across a number of existing web browsers (Firefox, Safari, more to come).

    foaf+ssl is one protocol that we would like to concentrate on due to its simplicity. But there are a number of other ways of achieving the same thing, by using OpenID for example. All of them require some extra pieces:

    • An ontology to describe what can be done with the data (copied, republished,...) or what obligations incur in using a service.
    • An ontology to describe who has access to the service. This would be useful to help people decide if they should bother trying to access it, or what else they need to do such as become friends with someone, or reveal a bug in the software somewhere.
    • Other things that might come up.

    We will discuss our experience implementing this, the problems we have encountered and where we think this is leading us to next.

    Sunday Nov 30, 2008

    personalising my blog

    image of the sidebar of my blog

    Those who read me via news feeds (I wonder how many those are), may not have seen the recent additions I have made to my blog pages. I have added a view onto:

    This is quite a lot of personal info. With my friend of a friend network it should be clear how you have more and more of the type of information you could find in social networking sites such as facebook on my blog. And this could keep growing of course.

    The current personalization is mostly powered by JavaScript (with one flash application for ). Here is the code I added to my blog template, pieces of which I found here and there on the web, often in templates provided by the web services themselves.

     <h2>Recent Photos</h2><!-- see -->
        <div id="flickr"><script type="text/javascript" 
        <div class="recentposts">
         <script type="text/javascript" 
        <div id="twitter_div" class="recentposts">
        <a href="">last 5 entries:</a><br/>
    <ul id="twitter_update_list"></ul>
    <script src="" type="text/javascript"></script>
    <script src="" type="text/javascript">
      <h2>Listening To</h2>
    <!-- I am looking for something lighter than this! -->
    <style type="text/css">table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 td 
      {margin:0 !important;padding:0 !important;border:0 !important;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmHead 
         no-repeat 0 0 !important;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmEmbed object {float:left;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmFoot td.lfmConfig a:hover 
        {background:url( no-repeat 0px 0 !important;;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmFoot td.lfmView a:hover 
        {background:url( no-repeat -85px 0 !important;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmFoot td.lfmPopup a:hover 
        {background:url( no-repeat -159px 0 !important;}
    <table class="lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484" cellpadding="0" cellspacing="0" border="0" 
       style="width:184px;"><tr class="lfmHead">
       <td><a title="bblfish: Recently Listened Tracks" href="" target="_blank" 
             no-repeat 0 -20px;text-decoration:none;border:0;">
       <tr class="lfmEmbed"><td>
       <object type="application/x-shockwave-flash" data="" 
         id="lfmEmbed_210272050" width="184" height="199"> 
       <param name="movie" value="" /> 
      <param name="flashvars" value="type=recenttracks&user=bblfish&theme=blue&lang=en&widget_id=chart_0bbc5b054e26d39362c0a10c7761f484" /> 
       <param name="allowScriptAccess" value="always" /> 
        <param name="allowNetworking" value="all" /> 
        <param name="allowFullScreen" value="true" /> 
        <param name="quality" value="high" /> <param name="bgcolor" value="6598cd" /> 
        <param name="wmode" value="transparent" /> <param name="menu" value="true" /> 
        </object></td></tr><tr class="lfmFoot">
        <td style="background:url( repeat-x 0 0;text-align:right;">
        <table cellspacing="0" cellpadding="0" border="0" style="width:184px;">
        <tr><td class="lfmConfig">
       <a href="" 
        title="Get your own widget" target="_blank" 
              no-repeat 0px -20px;text-decoration:none;border:0;">
        </a></td><td class="lfmView" 
        <a href="" title="View bblfish's profile" 
         target="_blank" style="display:block;overflow:hidden;width:74px;height:20px;background:url(
            no-repeat -85px -20px;text-decoration:none;border:0;">
        </td><td class="lfmPopup"
        <a href="" 
           title="Load this chart in a pop up" 
                 no-repeat -159px -20px;text-decoration:none;border:0;" 
           onclick=" + '&resize=0','lfm_popup','height=299,width=234,resizable=yes,scrollbars=yes'); return false;"

    So that as you can see is quite a lot of extra html every time someone wants to download my web page. This would not be too bad, but the above javascript widgets themselves go and fetch a lot of html, javascript, code and other content to further slow down the responsiveness of the web pages. This data is served to everyone whether they want to see all that information or not. Well, if they don't they can subscribe to the rss feed by dragging this page into a feed reader. In which case they will just see the blog posts themselves, and not the sidebar.

    Why add this information to my blog? Well it gives people an idea of where they can find out more about me. A lot of people don't know that I have a feed, so they may not know that they can follow what I am reading over there. This gives the initial feeling of what it would be like to have a deeper view on my activities.

    But as mentioned previously, there are a few problems with this.

    • This makes this page heavier.
    • Every page view on my blog will download that information and start those applets. ( A great way for those services to track the number of people directly visiting these pages btw. )
    • This can become tedious. People who want to follow me can do so by coming to this web page from time to time. But with enough sites like that this is going to become a bit difficult to do. One does not want to spend all day reading the different feeds of information of one's friends. This is what Facebook does for people: it is a giant web based feed reader of social information.
    • Difficult to track change: If I switch to a different book marking service, perhaps a semantic one like faviki, I will have to redo this page, and all my friends are going to have to update their feeds.
    • If I add more of the resources I am working on, this page is going to become unmaintainably long.
    • People who read my feed will not notice the changes occurring here.

    So those are the problems that Web 3.0, the semantic web is going to solve. By just downloading my foaf file, you should have access to my network of friends via linked data, and via pointers to all the other resources on the web that I may be using. Whatever tool you use will be able to then keep all this data easily up to date, and with great search tools, enhance your view of the many linked networks you will be part of and tracking.

    The whole code you see above could then be replaced with one link to my foaf file. That foaf file can itself point to further resources in case it becomes large. To give a list of some of the most interesting accounts I have, I added the following N3 to my foaf file today:

    @prefix : <> .
    @prefix foaf: <> .
    @prefix rdfs: <> .
    :me foaf:holdsAccount 
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's skype account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <>
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's flickr pictures account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <>
                    foaf:accountProfilePage <>
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's music account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <>
                    foaf:accountProfilePage <>
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's delicious bookmarking account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <>
                    foaf:accountProfilePage <>
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's developer account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <>
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's twitter micro blogging account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <>
                    foaf:accountProfilePage <>
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's twine semantic aggregation account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <>
                    foaf:accountProfilePage <>
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's facebook social networking account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <>
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's linked in business social network account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <>
                    foaf:accountProfilePage <>
                  ] .

    First of all it should be clear that the above is a lot more readable than the javascript code shown earlier in this post. Secondly I listed over twice as many online accounts there than I currently have in my side bar. And finally this is in a file that a client would not need to download unless it had an interest in knowing more about me. This could easily be cached over a period of time, and need not be served up again on each page request.
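
    To give an idea of what caching over a period of time could look like on the client side, here is a minimal Python sketch. The class name ProfileCache, the fetch callback and the max_age parameter are all hypothetical, chosen for illustration; a real client would more likely rely on the HTTP caching headers sent by the server.

```python
import time
from typing import Callable, Dict, Tuple


class ProfileCache:
    """Keep fetched documents for max_age seconds before fetching them again."""

    def __init__(self, fetch: Callable[[str], str], max_age: float,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch        # function that actually retrieves a URI
        self._max_age = max_age    # how long a copy counts as fresh
        self._clock = clock        # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, uri: str) -> str:
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]          # still fresh: no network request
        body = self._fetch(uri)      # missing or stale: fetch again
        self._entries[uri] = (now, body)
        return body
```

    The fetch function is passed in rather than hard coded, so the same cache works whether the foaf file is retrieved over http, read from disk, or produced by a test stub.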

    Again for one possible view on the above data it is worth installing the Tabulator Firefox extension and then clicking on my foaf icon. There are of course many more things specialized software could do with that information than present it like that.

    On this topic, you may want to continue by looking at the recently published, excellent and beautiful presentation on the subject of the Social Semantic Web, by John Breslin.

    variation on @timoreilly: hyperdata is the new intel outside

    Context: Tim O'Reilly said "Data is the new Intel Inside".

    Recently in a post "Why I love Twitter":

    What's different, of course, is that Twitter isn't just a protocol. It's also a database. And that's the old secret of Web 2.0, Data is the Intel Inside. That means that they can let go of controlling the interface. The more other people build on Twitter, the better their position becomes.

    The meme was launched in the well known "What is Web 2.0" paper in the section entitled "Data is the next Intel Inside"

    Applications are increasingly data-driven. Therefore: For competitive advantage, seek to own a unique, hard-to-recreate source of data.

    Most of the data is outside your database. It can only be that way, the world is huge, and you are just one small link in the human chain. Linking that data is knowledge and value creation. Hyperdata is the foundation of Web 3.0.

    Tuesday Nov 11, 2008

    REST APIs must be hypertext driven

    Roy Fielding recently wrote in "REST APIs must be hypertext-driven"

    I am getting frustrated by the number of people calling any HTTP-based interface a REST API. Today's example is the SocialSite REST API. That is RPC. It screams RPC. There is so much coupling on display that it should be given an X rating.

    That was pretty much my thought when I saw that spec. In a comment to his post he continues.

    The OpenSocial RESTful protocol is not RESTful. It could be made so with some relatively small changes, but right now it is just wrapping RPC results in common Web media types.

    Clarification of Roy's points

    Roy then goes on to list some key criteria for what makes an application RESTful.

    • REST API should not be dependent on any single communication protocol, though its successful mapping to a given protocol may be dependent on the availability of metadata, choice of methods, etc. In general, any protocol element that uses a URI for identification must allow any URI scheme to be used for the sake of that identification.

      In section 2.2 of the O.S. protocol we have the following JSON representation for a Person.

          "id" : "",
          "displayName" : "Janey",
          "name" : {"unstructured" : "Jane Doe"},
          "gender" : "female"

      Note that the id is not a URI. Further down in the XML version of the above JSON, it is made clear that by prepending "urn:guid:" you can turn this string into a URI. By doing this the protocol has in essence tied itself to a URI scheme, since there is no way of expressing another URI type in the JSON - the JSON being the key representation in this Javascript specific API by the way, the aim of the exercise being to make the writing of social network widgets interoperable. Furthermore this scheme has some serious limitations: for example, it limits one to 1 social network per internet domain, is tied to a quite controversial XRI spec that has been rejected by OASIS, and does not provide a clear mechanism for retrieving information about it. But that is not the point. The definition of the format is tying itself unnecessarily to a URI scheme, and moreover one that ties one to what is clearly a client/server model.

    • A REST API should not contain any changes to the communication protocols aside from filling-out or fixing the details of underspecified bits of standard protocols, such as HTTP's PATCH method or Link header field.
    • A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

      Most of these so-called RESTful APIs spend a huge amount of time specifying what response a certain resource should give to a certain message. Note for example section 2.1 entitled Responses.

    • A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC's functional coupling].

      In section 6.3 one sees this example:

      /activities/{guid}/@self                -- Collection of activities generated by given user
      /activities/{guid}/@self/{appid}        -- Collection of activities generated by an app for a given user
      /activities/{guid}/@friends             -- Collection of activities for friends of the given user {guid}
      /activities/{guid}/@friends/{appid}     -- Collection of activities generated by an app for friends of the given user {guid}
      /activities/{guid}/{groupid}            -- Collection of activities for people in group {groupid} belonging to given user {uid}
      /activities/{guid}/{groupid}/{appid}    -- Collection of activities generated by an app for people in group {groupid} belonging to given user {uid}
      /activities/{guid}/@self/{appid}/{activityid}   -- Individual activity resource; usually discovered from collection
      /activities/@supportedFields            -- Returns all of the fields that the container supports on activity objects as an array in json and a repeated list in atom.

      For some reason it seems that this protocol does require a very precise layout of the patterns of URLs. Now it is true that this is then meant to be specified in an XRDS document. But this document is not linked to from any of the representations as far as I can see. So there is some "out of band" information exchange that has happened and on which the rest of the protocol relies. Furthermore it ties the whole service again to one server. How open is a service which ties you to one server?

    • A REST API should never have "typed" resources that are significant to the client. Specification authors may use resource types for describing server implementation behind the interface, but those types must be irrelevant and invisible to the client. The only types that are significant to a client are the current representation's media type and standardized relation names. [ditto]

      Now clearly one does want to have URIs name resources, things, and these things have types. I think Roy is here warning against the danger that expectations are placed on types that depend on the resources themselves. This seems to be tied to the previous point that one should not have fixed resource names or hierarchies as we saw above. To see how this is possible check out my foaf file:

      $ cwm --ntriples | grep knows | head
          <>     <> <> .
          <>     <> <> .
          <>     <> <> .
          <>     <> <> .
          <>     <> <> .
          <>     <> <> .
          <>     <> <> .
          <>     <> <> .
          <>     <> <> .
          <>     <> <> .

      Notice that there is no pattern in the URIs to the right. (As it happens there are no ftp URLs there, but it would work just as well if there were). Yet the Tabulator extension for Firefox knows from the relations above alone that (if it believes my foaf file of course) the URIs to the right refer to people. This is because the foaf:knows relation is defined as

      @prefix foaf: <> .
      foaf:knows  a rdf:Property, owl:ObjectProperty;
               :comment "A person known by this person (indicating some level of reciprocated interaction between the parties).";
               :domain <>;
               :isDefinedBy <>;
               :label "knows";
               :range foaf:Person .

      This information can then be used by a reasoner (such as the javascript one in the tabulator) to deduce that the resources pointed to by the URIs to the right and to the left of the foaf:knows relation are members of the foaf:Person class.
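
      The deduction itself is simple enough to sketch in Python. This is not the Tabulator's reasoner, just an illustration of domain and range inference: triples are plain tuples of strings, the prefixed names stand in for full URIs, and ex:me and ex:friend are made-up identifiers.

```python
FOAF_KNOWS = "foaf:knows"
FOAF_PERSON = "foaf:Person"
RDF_TYPE = "rdf:type"

# property -> (domain class, range class), as in the definition of foaf:knows
SCHEMA = {FOAF_KNOWS: (FOAF_PERSON, FOAF_PERSON)}


def infer_types(triples):
    """Return the rdf:type triples implied by the domain and range of each property."""
    inferred = set()
    for subject, predicate, obj in triples:
        if predicate in SCHEMA:
            domain, range_ = SCHEMA[predicate]
            inferred.add((subject, RDF_TYPE, domain))   # left side of the relation
            inferred.add((obj, RDF_TYPE, range_))       # right side of the relation
    return inferred


# Hypothetical data: one foaf:knows statement between two made-up identifiers.
data = [("ex:me", FOAF_KNOWS, "ex:friend")]
for triple in sorted(infer_types(data)):
    print(triple)
```

      Nothing in infer_types looks at the shape of the identifiers. The types come entirely from the relation that links them.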

      Note also that there is no knowledge as to how those resources are served. In many cases they may be served by simple web servers sending resources back. In other cases the RDF may be generated by a script. Perhaps the resources could be generated by java objects served up by Jersey. The point is that the Tabulator does not need to know.

      Furthermore, the ontology information above is not out of band. It is GETable at the foaf:knows URI itself. The name of the relation links to the information about the relations, which gives us enough to be able to deduce further facts. This is hypertext - hyperdata in this case - at its best. Compare that with the JSON example given above. There is no way to tell what that JSON means outside of the context of the totally misnamed 'Open Social RESTful API'. This is a limitation of JSON, or at least this namespace-less version. One would have to add a mime type to the JSON to make it clear that the JSON had to be interpreted in a particular manner for this application, but I doubt most JSON tools would know what to do with mime typed JSON versions. And do you really want to go through a mime type registration process every time a social networking application wants to add a new feature or interact with new types of data?

      As Roy summarizes in one of the replies to this blog post:

      When representations are provided in hypertext form with typed relations (using microformats of HTML, RDF in N3 or XML, or even SVG), then automated agents can traverse these applications almost as well as any human. There are plenty of examples in the linked data communities. More important to me is that the same design reflects good human-Web design, and thus we can design the protocols to support both machine and human-driven applications by following the same architectural style.

      To get a feel of this it really helps to play with other hyperdata applications, other than ones residing in web browsers. The semantic address book is one such, that I spent some time writing.

    • A REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API). From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations or implied by the user's manipulation of those representations. The transitions may be determined (or limited by) the client's knowledge of media types and resource communication mechanisms, both of which may be improved on-the-fly (e.g., code-on-demand). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

      That is the out of band point made previously, and confirms the point made about the danger of protocols that depend on URI patterns or resources that are somehow typed at the protocol level. You should be able to pick up a URI and just go from there. With the tabulator plugin you can in fact do just that on any of the URLs listed in my foaf file, or in other RDF.
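
      What "pick up a URI and just go from there" means for a client can be shown with a toy example in Python. Everything here is made up for illustration: the WEB dictionary stands in for real GET requests, the URIs and the link relation names "friends" and "activities" are hypothetical, and the representations are plain dictionaries rather than any real media type.

```python
# A toy "web": URI -> representation. Each representation carries its own links.
WEB = {
    "http://example.org/start": {"links": {"friends": "ftp://example.net/f/17"}},
    "ftp://example.net/f/17": {"links": {"activities": "http://example.com/x?a=9"}},
    "http://example.com/x?a=9": {"links": {}, "items": ["went climbing"]},
}


def follow(start_uri, relations, get=WEB.__getitem__):
    """Start at one URI and follow the named link relations, one hop each."""
    representation = get(start_uri)
    for relation in relations:
        next_uri = representation["links"][relation]  # chosen by the server
        representation = get(next_uri)
    return representation


print(follow("http://example.org/start", ["friends", "activities"])["items"])
```

      The client knows one starting URI and the names of the relations it cares about. It never builds a URL from a pattern, so the servers remain free to use any scheme, host or path they like.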

    What's the point?

    Engineers under the spell of the client/server architecture, will find some of this very counter intuitive. This is indeed why Roy's thesis, and the work done by the people who engineered the web before that and whose wisdom is distilled in various writings by the Technical Architecture Group did something that was exceedingly original. These very simple principles that can feel unintuitive to someone who is not used to thinking at a global information scale, make a lot of sense when you do come to think at that level. When you do write such an Open system, that can allow people to access information globally, you want it to be such that you can send people a URI to any resource you are working with, so that both of you can speak about the same resource. Understanding what the resource that URL is about should be found by GETting the meaning of the URL. If the meaning of that URL depends on the way you accessed it, then you will no longer be able to just send a URL, but you will have to send 8 or 9 URLs with explanations on how to jump from one representation to the other. If some out of band information is needed to understand that one has to inspect the URL itself to understand what it is about, then you are not setting up an Open protocol, but a secret one. Secret protocols may indeed be very useful in some circumstances, and so as Roy points out may non RESTful ones be:

    That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. That’s fine with me as long as you don’t call the result a REST API. I have no problem with systems that are true to their own architectural style.
    but note: it is much more difficult for them to make use of the network effect: the value of information grows exponentially with its ability to be linked to other information. In another reply to a comment Roy puts this very succinctly:
    encoding knowledge within clients and servers of the other side’s implementation mechanism is what we are trying to avoid.

    Friday Sep 12, 2008

    RDF: Reality Distortion Field

    Here is Kevin Kelly's presentation on the next 5000 days on the web, in clear easy English that every member of the family can watch and understand. It explains what the semantic web, also known as Web 3.0, is about and how it will affect technology and life on earth. Where is the web going? I can find no fault in this presentation.

    This is a great introduction. He explains how Metcalfe's law brought us to the web of documents and is leading us inexorably to a web of things, in which we will be the eyes and the hands of this machine called the internet that never stops running.
    For those with a more technical mind, who want to see how this is possible, follow this up with a look at the introductory material to RDF.

    Warning: This may change the way you think. Don't Panic! Things will seem normal after a while.

    Thursday Sep 04, 2008

    Building Secure, Open and Distributed Social Network Applications

    Current Social Networks don't allow you to have friends outside their network. When on Facebook, you can't point to your friend on LinkedIn. They are data silos. This audio enhanced slide show explains how a distributed decentralized social network is being built, how it works, and how to make it secure using the foaf+ssl protocol (a list of pointers on the esw wiki).

    It is licenced under a CC Attribution ShareAlike Licence.
    My voice is a bit odd on the first slide, but it gets better I think as I go along.

    Building Secure Open & Distributed Social Networks( Viewing this slide show requires a flash plugin. Sorry I only remembered this limitation after having put it online. If you know of a good Java substitute let me know. The other solution would have been to use Slidy. PDF and Annotated Open Document Format versions of this presentation are available below. (why is this text visible in Firefox even when the plugin works?) )

    This is the presentation I gave at JavaOne 2008 and at numerous other venues in the past four months.

    The slidecast works a lot better as a presentation format, than my previous semantic web video RDF: Connecting Software and People which I published as a h.264 video over a couple of years ago, and which takes close to 64MB of disk space. The problem with that format is that it is not easy to skip through the slides to the ones that interest you, or to go back and listen to a passage carefully again. Or at least it feels very clunky. My mp3 sound file only takes 17MB of space in comparison, and the graphics are much better quality in this slide show.

    It is hosted by the excellent slideshare service, which translated my OpenOffice odp document ( once it was cleaned up a little: I had to make sure it had no pointers to local files remaining accessible from the Edit>Links menu (which otherwise choked their service)). I used the Audacity sound editor to create the mp3 file which I then placed on my server. Syncing the sound and the slides was then very easy using SlideShare's SlideCast application. I found that the quality of the slides was a lot better once I had created an account on their servers. The only thing missing would be a button in addition to the forward and backward button that would allow one to show the text of the audio, for people with hearing problems - something equivalent to the Notes view in Open Office.

    You can download the OpenOffice Presentation which contains my notes for each slide and the PDF created from it too. These are all published under a Creative Commons Attribution, Share Alike license. If you would like some of the base material for the slides, please contact me. If you would like to present them in my absence feel free to.

    Tuesday Sep 02, 2008

    Getting started with RDF

    So you have seen Kevin Kelly's presentation on the next 5000 days of the web? You don't believe in magic, and you want to see how it can really work? This used to be quite difficult, but it has become a lot easier recently. Here are some pointers I will try to keep up to date.

    Introductory Material

    Read Dean Allemang and Jim Hendler's Book "The Semantic Web for the Working Ontologist". While you are waiting for that book to arrive you can already view and listen to Dean Allemang's excellent presentation at JavaOne 2008. If you are interested in Social Networking, then you could follow that up also with my JavaOne presentation that same year which goes more into the RESTful, self describing web, hyperdata side of things, ie the Web in the Semantic Web.

    One should also remember that one does not need to trust everything one finds on the web. A good semantic web engine will allow you to merge different graphs depending on which ones you trust, which will indeed be something partly subjective, but which can also evolve. The semantic web allows you to change your mind. Good reasoning engines to help make that fast are only just appearing though.
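
    Merging only the graphs you trust can be pictured with a small Python sketch. It is an assumption-laden illustration rather than how any particular semantic web engine works: graphs are sets of triples keyed by the URI they came from, and the function name merge_trusted and the example sources are hypothetical.

```python
def merge_trusted(graphs, trusted_sources):
    """Union the triples of the graphs whose source is currently trusted."""
    merged = set()
    for source, triples in graphs.items():
        if source in trusted_sources:
            merged |= set(triples)
    return merged


# Hypothetical sources and statements.
graphs = {
    "http://example.org/alice": {("ex:alice", "foaf:knows", "ex:bob")},
    "http://example.net/spam": {("ex:alice", "foaf:knows", "ex:spammer")},
}
print(merge_trusted(graphs, {"http://example.org/alice"}))
```

    Because the merged view is recomputed from the sources, changing your mind is just a matter of calling the function again with a different set of trusted sources.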


    A lot of the references on the W3C use the original RDF XML syntax, which happens to be somewhat unintuitive to use and leads people to think too syntactically about the semantic web. XML developers may feel tempted to take out their XML tools, which may not get them what they were looking for. Recently a non official Semantic Web primer was put together that uses the much easier to use Turtle notation, the one the SPARQL query language is inspired from.


    It is good to use different tools, as each has its own advantages. There are too many ( see the sweet tools listing ) for anyone to try them all out. Here are the ones I use regularly:

    • The cwm python script does a lot of useful things. Just downloading RDF/XML, following redirects, etc., and transforming it to your preferred format (Turtle) can be extremely useful. It also has a reasoning engine, has rules, and can be set up and queried with SPARQL.
    • For my programming tasks, as I am a Java developer, I use Sesame 2. It is a Java framework that has a large following. Its competitor in the Java space is the HP backed Jena, which has better out of the box inferencing support.
    • If you want to quickly view your SQL database as an RDF store I recommend D2RQ. It has not been evolving much recently though.
    • The Tabulator Firefox plugin, turns Firefox into a generic RDF browser. It is a prototype, but is very useful.

    Friday Aug 29, 2008

    excel and rdf

    Scott McNealy, rarely had much nice to say about spreadsheet software, when it was not web enabled. And indeed there are huge numbers of problems with them. Off the top of my head, some of these are:

    • Hidden formulas that nobody looks at and that get tweaked without alerting people
    • Data that is never synchronized, with parts of it that are out of date
    • Data that cannot be merged
    • Some products even had virus problems...

    And yet they are immensely popular, especially with the people who never see the problems that they lead to.

    As it happens these are problems within the scope of the semantic web. Every spreadsheet is like a mini SQL database. As long as you query the information inside one database owned by one administrator all is fine. But what happens when you want to merge information from different databases? Ouch! That's really tough, because there is usually no clear understanding of which pieces should fit together. Do the columns in each database mean the same thing? Well, if you have just a few big databases you can tediously link them together, but what if you have thousands of such databases? And each person wielding one is a complete novice to this problem? What if someone just renames a column in one spreadsheet? What does that mean?

    The topic of spreadsheets and the semantic web came to be one of the highlights of the conferences I went to in May. Dean Allemang, in his talk at JavaOne (the sound track enhanced slides are now online!), used this problem in one of his examples. Eric Miller talked about a solution that involved using the momentum behind spreadsheets to help build ontologies (I think, it's a while back now). This is not all new of course. In a reply to this post Mike Bergman pointed to his year old article entitled "RDF123 Makes Generating Flexible RDF a Snap".

    But often a demo helps a lot, and the one that made me see the light was given by Lee Feigenbaum of Cambridge Semantics just before the end of the Semantic Tech Conference. Lee, who had been working on semantic web tools at IBM before going to start his own company, gave me a quick summary of the benefits of his SHAPE middleware. Essentially, by adding URLs into the spreadsheet you can tie the meaning of its contents down a lot more carefully. By writing a plugin for Microsoft Excel (they had a prototype working for OpenOffice before deciding to focus on M$ tools) that works together with the middleware, users can keep on behaving as they are used to, whilst helping link all the information together. Instead of working against each other, people in a company can build a web of information together. Here is a highlight from Lee's talk entitled Getting to Web Semantics for Spreadsheets in the U.S. Government:

    • Tight integration into Excel allows semantic concepts to be dragged and dropped from the semantic repository onto data tables
    • The data table's implicit row/column relations are explicitly stored in an RDF semantic database
    • Cells, columns, and regions are tagged with explicit semantics
    • Publish the data tables on the Web
    Intriguing for sure.
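
    To make the idea of tying columns to URLs a little more concrete, here is a minimal Python sketch. It is not the SHAPE middleware or its API: the sheets, the column URLs and the `rows_to_triples` helper are all assumptions invented for illustration. It only shows that once two differently named columns point at the same URL, their rows can be merged without guessing.

```python
# Hypothetical example: two spreadsheets name the same column differently,
# but each maps its headers to shared URLs, so the rows can be merged.

def rows_to_triples(rows, column_urls, key):
    """Turn spreadsheet rows into (subject, predicate, value) triples.

    rows        -- list of dicts, one per spreadsheet row
    column_urls -- maps a local column header to a URL naming its meaning
    key         -- header of the column that identifies the row's subject
    """
    triples = set()
    for row in rows:
        subject = row[key]
        for header, value in row.items():
            # Columns without an agreed URL are left out rather than guessed at.
            if header != key and header in column_urls:
                triples.add((subject, column_urls[header], value))
    return triples

EX = "http://example.org/terms#"  # made-up vocabulary

sheet_a = [{"id": "http://example.org/emp/1", "Salary": 100}]
sheet_b = [{"employee": "http://example.org/emp/1", "Pay": 100, "Office": "Paris"}]

triples_a = rows_to_triples(sheet_a, {"Salary": EX + "salary"}, key="id")
triples_b = rows_to_triples(
    sheet_b, {"Pay": EX + "salary", "Office": EX + "office"}, key="employee"
)

# "Salary" and "Pay" share a URL, so the merge is a plain set union.
merged = triples_a | triples_b
```

    Renaming a column header in one sheet then changes nothing, as long as the header still maps to the same URL.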

    Spreadsheets may yet be back again, but for the good.

    PS. Please send me further links on this so I can flesh out this story better.


    13 September 2008:

    Tuesday Aug 26, 2008

    Sun Intranet Foaf Experiment

    image of Address Book displaying internal sun foaf

    Building a foaf server from an ldap directory is pretty easy. Rinaldo Di Giorgio put a prototype server together for Sun in less than a week. As a result everyone in Sun now has an experimental temporary foaf id that we can use to try out some things.

    So what can one do with foaf that one could not so easily do with ldap? Well the semantic web is all about linking and meshing information. So one really simple thing to do is to link an external foaf file with the internal one. I did this by adding an owl:sameAs statement to my public foaf file that links my public and my Sun id. (It would be better to link the internal foaf file to the external one, but that would have required a bit more work internally.) As a result, by dragging and dropping my foaf file onto today's release of the AddressBook, someone who is inside the Sun firewall can follow both my internal and my external connections. Someone outside the firewall will not be able to follow the internal link.
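
    The effect of the owl:sameAs link can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ids and the two tiny graphs are invented, and the helper functions are hypothetical, not part of the AddressBook. The owl:sameAs and foaf:knows URIs are the standard ones.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

def same_as_closure(triples, start):
    """Return every id reachable from start via owl:sameAs, in either direction."""
    aliases = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for s, p, o in triples:
            if p != OWL_SAME_AS:
                continue
            for a, b in ((s, o), (o, s)):
                if a == node and b not in aliases:
                    aliases.add(b)
                    frontier.append(b)
    return aliases

def known_by(triples, person):
    """Everyone that person, under any of their sameAs ids, foaf:knows."""
    aliases = same_as_closure(triples, person)
    return {o for s, p, o in triples if p == FOAF_KNOWS and s in aliases}

# Made-up ids: one public, one only resolvable behind the firewall.
ME_PUBLIC = "http://example.org/people/me#i"
ME_INTERNAL = "http://intranet.example/foaf/me#i"

public_graph = {
    (ME_PUBLIC, OWL_SAME_AS, ME_INTERNAL),
    (ME_PUBLIC, FOAF_KNOWS, "http://example.org/people/alice#i"),
}
internal_graph = {
    (ME_INTERNAL, FOAF_KNOWS, "http://intranet.example/foaf/bob#i"),
}

# Outside the firewall only the public graph can be fetched.
outside = known_by(public_graph, ME_PUBLIC)
# Inside, both graphs are merged and the sameAs link joins the two ids.
inside = known_by(public_graph | internal_graph, ME_PUBLIC)
```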

    By extending the internal foaf server a little more one could easily give people inside of Sun a place to link to their external business connection, wherever they might be in the world. To allow other companies to do this too it would of course help if everyone in Sun had a minimally public foaf ID, which would return only minimal information, or whatever the employee was comfortable revealing about themselves. This would allow Sun to present a yet more human face to the world.

    Well that's just a thought, and this is just an experiment. Hopefully it will make the semantic web more real for us here, and allow people to dream up some great ways of bringing all the open source world together, ever closer.

    PS. For people inside of Sun it may be easier to just drag my internal foaf file directly onto the AddressBook (started via jnlp). Otherwise, to get the internal foaf file to download you need to click the "fetch" button next to the "same As" combo box when viewing my info. Then you need to switch to "Last Imported" and back to allow "Bernard Traversat" to appear in the second column. He appears as someone I foaf:know after the merger of the internal and the external foaf. I know this is clumsy, and I'll try thinking up a way to make this more user friendly very soon. You are welcome to participate in the Address Book Project.

    PPS. Sun internal users can get more info on the project home page.

    PPPS. We of course use the Firefox Tabulator plugin too for tests. It gives a different interface to my AddressBook. It is more flexible, but less specialised... The Tabulator web application does not work currently because we only produce Turtle output. This is to avoid developers trying to use DOM tools to process these pages, as we don't want to put work into an RDF crystallisation. (Note: If at some later time you find that the plugin is not compatible with the latest version of Firefox, you can manually disable compatibility checks.)

    Saturday May 17, 2008

    Social Networks and Data Portability at Semantic Tech conference in San Jose

    The upcoming semantic conference in San Jose is getting going tomorrow, with an excellent list of speakers and subjects. Here are some highlights of the sessions relating to topics on which I blog regularly.

    Many more interesting talks will make sure I will spend another packed week. The full program is available online.


    My presentation is now available online with audio as part of the longer Building Secure, Open and Distributed Social Network Applications.

    Monday Apr 21, 2008

    FOAF & SSL: creating a global decentralised authentication protocol

    Following on from my previous post RDFAuth: sketch of a buzzword compliant authentication protocol, Toby Inkster came up with a brilliantly simple scheme that builds very neatly on top of the Secure Sockets Layer of https. I describe the protocol briefly here, and will describe an implementation of it in my next post.

    Simple global ( passwordless if using a device such as the Aladdin USB e-Token ) authentication around the web would be extremely valuable. I am currently crumbling under the number of sites asking me for authentication information, and for each site I need to remember a new id and password combination. I am not the only one with this problem as the data portability video demonstrates. OpenId solves the problem but the protocol consumes a lot of ssl connections. For hyperdata user agents this could be painfully slow. This is because they may need access to just a couple of resources per server as they jump from service to service.

    As before we have a very simple scenario to consider. Romeo wants to find out where Juliette is. Juliette's hyperdata Address Book updates her location on a regular basis by PUTing information to a protected resource which she only wants her friends and their friends to have access to. Her server knows from her foaf:PersonalProfileDocument who her friends are. She identifies them via dereferenceable URLs, as I do, which themselves usually (the web is flexible) return more foaf:PersonalProfileDocuments describing them, and pointing to further such documents. In this way the list of people able to find out her location can be specified in a flexible and distributed manner. So let us imagine that Romeo is a friend of a friend of Juliette's and he wishes to talk to her. The following sequence diagram continues the story...

    sequence diagram of RDF+SSL

    The stages of the diagram are listed below:

    1. First Romeo's User Agent HTTP GETs Juliette's public foaf file located at The server returns a representation ( in RDFa perhaps ) with the same semantics as the following N3:

      @prefix : <#> . 
      @prefix foaf: <> .
      @prefix rdfs: <> .
      @prefix todo: <> .
      @prefix openid: <> .
      <> a foaf:PersonalProfileDocument;
         foaf:primaryTopic :juliette ;
         openid:server <> . # see The Openid Sequence Diagram
      :juliette a foaf:Person;
         foaf:name "Juliette";
         foaf:openid <>;
         foaf:blog </blog>;    
         rdfs:seeAlso <>; 
         foaf:knows <>,
                    <> .
      <> a todo:LocationDocument .

      Romeo's user agent receives this representation and decides to follow the https protected resource because it is a todo:LocationDocument.

    2. The todo:LocationDocument is at an https URL, so Romeo's User Agent connects to it via a secure socket. Juliette's server, who wishes to know the identity of the requestor, sends out a Certificate Request, to which Romeo's user agent responds with an X.509 certificate. This is all part of the SSL protocol.

      In the communication in stage 2, Romeo's user agent also passes along his foaf id. This can be done either by:

      • Sending in the HTTP header of the request an Agent-Id header pointing to the foaf Id of the user. Like this:
        This would be similar to the current From: header, but instead of requiring an email address, a direct name of the agent would be required. (An email address is only an indirect identifier of an agent).
      • The Certificate could itself contain the Foaf ID of the Agent in the X509v3 extensions section:
                X509v3 extensions:
                   X509v3 Subject Alternative Name: 

        I am not sure if this would be a correct use of the X509 Subject Alternative Name field, so it would require more standardization work with the X509 community. But it shows a way where the two communities could meet. The advantage of having the id as part of the certificate is that this could add extra weight to the id, depending on the trust one gives the Certificate Authority that signed the Certificate.

    3. At this point Juliette's web server knows of the requestor (Romeo in this case):
      • his alleged foaf Id
      • his Certificate ( verified during the ssl session )

      If the Certificate is signed by a CA that Juliette trusts and the foaf id is part of the certificate, then she will trust that the owner of the User Agent is the entity named by that id. She can then jump straight to step 6 if she knows enough about Romeo that she trusts him.

      Having Certificates signed by CAs is expensive though. The protocol described here will work just as well with self signed certificates, which are easy to generate.

    4. Juliette's hyperdata server then GETs the foaf document associated with the foaf id, namely <> . Romeo's foaf server returns a document containing a graph of relations similar to the graph described by the following N3:
      @prefix : <#> . 
      @prefix foaf: <> .
      @prefix rdfs: <> .
      @prefix wot: <> .
      @prefix wotodo: <> .
      <> a foaf:PersonalProfileDocument;
          foaf:primaryTopic :romeo .
      :romeo a foaf:Person;
          foaf:name "Romeo";
          is wot:identity of [ a wotodo:X509Certificate;
                               wotodo:dsaWithSha1Sig """30:2c:02:14:78:69:1e:4f:7d:37:36:a5:8f:37:30:58:18:5a:
                                                   eb:8c:11:08:1c:aa:93:7d:71:01""" ;
                             ] ;
          foaf:knows <> .
    5. By querying the semantics of the returned document with a SPARQL query such as
      PREFIX wot: <> 
      PREFIX wotodo: <> 
      SELECT ?sig
      WHERE {
          [] a wotodo:X509Certificate;
            wotodo:signature ?sig;
            wot:identity <> .
      }

      Juliette's web server can discover the certificate signature and compare it with the one sent by Romeo's user agent. If the two are identical, then Juliette's server knows that the User Agent who has access to the private key of the certificate sent to it, and who claims to be the person identified by the URI, is in agreement as to the identity of the certificate with the person who has write access to the foaf file. So by proving that it has access to the private key of the certificate sent to the server, the User Agent has also proven that it is the person described by the foaf file.

    6. Finally, now that Juliette's server knows an identity of the User Agent making the request on the protected resource, it can decide whether or not to return the representation. In this case we can imagine that my foaf file says that
       @prefix foaf: <> .
       <> foaf:knows <> .  
      As a result of the policy of allowing all friends of Juliette's friends to be able to read the location document, the server sends out a document containing relations such as the following:
      @prefix contact: <> .
      @prefix : <> .
          contact:location [ 
                contact:address [ contact:city "Paris";
                                  contact:country "France";
                                  contact:street "1 Champs Elysees" ]
                           ] .
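
    The decision Juliette's server makes in stages 2 to 6 can be sketched in Python as follows. This is a sketch under stated assumptions, not the implementation: the function names, the example ids and the `published_sigs` and `knows` dictionaries are invented here, and they stand in for what a real server would obtain by fetching foaf documents and querying them with SPARQL. The certificate is assumed to arrive as a dictionary shaped like the one returned by Python's `ssl.SSLSocket.getpeercert()`, and the Agent-Id header is the proposal from stage 2, not an existing HTTP header.

```python
def claimed_foaf_id(headers, peer_cert):
    """Stage 2: find the foaf id the user agent claims.

    A URI in the certificate's Subject Alternative Name is preferred,
    falling back to the proposed (hypothetical) Agent-Id header.
    """
    for kind, value in peer_cert.get("subjectAltName", ()):
        if kind == "URI":
            return value
    return headers.get("Agent-Id")

def normalise_sig(sig):
    """Drop whitespace and case differences from a colon separated hex string."""
    return "".join(sig.split()).lower()

def signature_matches(published_sig, presented_sig):
    """Stage 5: compare the signature in the foaf file with the one seen over SSL."""
    return normalise_sig(published_sig) == normalise_sig(presented_sig)

def is_friend_of_friend(knows, owner, requester):
    """Stage 6: is requester a friend, or a friend of a friend, of owner?

    knows maps a foaf id to the set of ids it foaf:knows.
    """
    friends = knows.get(owner, set())
    if requester in friends:
        return True
    return any(requester in knows.get(friend, set()) for friend in friends)

def may_read_location(headers, peer_cert, presented_sig, published_sigs, knows, owner):
    """Combine the stages: identify, verify, then apply the access policy."""
    requester = claimed_foaf_id(headers, peer_cert)
    if requester is None:
        return False
    published = published_sigs.get(requester)
    if published is None or not signature_matches(published, presented_sig):
        return False
    return is_friend_of_friend(knows, owner, requester)

# Made-up ids for illustration.
JULIETTE = "https://juliette.example/#juliette"
FRIEND = "http://friend.example/foaf#me"
ROMEO = "http://romeo.example/#romeo"

# The signature as published in Romeo's foaf file (line break included).
PUBLISHED = """30:2c:02:14:78:69:1e:4f:7d:37:36:a5:8f:37:30:58:18:5a:
               eb:8c:11:08:1c:aa:93:7d:71:01"""
# The same signature as it might be read from the certificate sent over SSL.
PRESENTED = "30:2C:02:14:78:69:1E:4F:7D:37:36:A5:8F:37:30:58:18:5A:EB:8C:11:08:1C:AA:93:7D:71:01"

knows = {JULIETTE: {FRIEND}, FRIEND: {ROMEO}}
published_sigs = {ROMEO: PUBLISHED}
cert = {"subjectAltName": (("URI", ROMEO),)}

allowed = may_read_location({}, cert, PRESENTED, published_sigs, knows, JULIETTE)
denied = may_read_location({}, cert, "00:11:22", published_sigs, knows, JULIETTE)
```

    Preferring the id in the certificate over the header is a design choice made for this sketch, following the remark in stage 2 that an id carried in the certificate can add extra weight.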


    • Create an ontology for X509 certificates.
    • Test this. Currently there is some implementation work going on in the so(m)mer repository in the misc/FoafServer directory.
    • Can one use the Subject Alternative name of an X509 certificate as described here?
    • For self signed certificates, what should the X509 Distinguished Name (DN) be? The DN is really being replaced here by the foaf id, since that is where the key information about the user is going to be located. Can one ignore the DN in a X509 cert, as one can in RDF with blank nodes? One could I imagine create a dummy DN where one of the elements is the foaf id. These would at least, as opposed to DN, be guaranteed to be unique.
    • what standardization work would be needed to make this

    Discussion on the Web



