Wednesday Oct 07, 2009

Sketch of a RESTful photo printing service with foaf+ssl

Let us imagine a future where you own your data. It's all on a server you control, under a domain name you own, hosted at home, in your garage, or on some cloud somewhere. Just as your OS gets updates, so all your server software is updated and patched automatically. The user interface for installing applications may be as easy as installing an app on the iPhone (as La Distribution is doing).

A few years back, with one click, you installed a myPhoto service, a distributed version of fotopedia. You have been uploading all your work, social, and personal photos there. These services have become really popular and all your friends are working the same way too. When your friends visit you, they are automatically and seamlessly recognized using foaf+ssl in one click. They can browse the photos you made with them, share interesting tidbits, and more... When you organize a party, you can put up a wiki where friends of your friends have write access, leave notes as to what they are going to bring, and whether or not they are coming. Similarly your colleagues have access to your calendar schedule, your work documents and your business related photos. Your extended family, defined through linked data of family relationships (every member of your family just needs to describe their relation to their close family network), can see photos of your family, see the videos of your newborn baby, and organize Christmas reunions, as well as tag photos.
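To give an idea of what such linked family data might look like: each member need only publish a few triples about their immediate relations on their own server, perhaps using something like the RELATIONSHIP vocabulary (the people URLs below are made up for the example):

    @prefix rel: <http://purl.org/vocab/relationship/> .

    # on joe.example: Joe describes only his immediate family links
    <http://joe.example/family#me>
        rel:spouseOf <http://jane.example/card#me>;
        rel:parentOf <http://tim.example/card#me> .

By following such links from profile to profile, a server can assemble the extended family graph without anyone having to describe more than their closest relations.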

One day you wish to print a few photos. So you go to a web site we will provisionally call print.com. Print.com is neither a friend of yours, nor a colleague, nor family. It is just a company, and so it gets minimal access to the content on your web server: it can't see your photos, and all it may know of you is a nickname you like to use, and perhaps an icon you like. So how are you going to allow print.com access to the photos you wish to print? This is what I would like to try to sketch a solution for here. It should be very simple, RESTful, and work in a distributed and decentralized environment, where everyone owns and controls their data, and is security conscious.

Before looking at the details of the interactions detailed in the UML Sequence diagram below, let me describe the user experience at a general level.

  1. You go to the print.com site after clicking on a link a friend of yours suggested on a blog. On the home page is a button you can click to add your photos.
  2. You click it, and your browser asks you which WebID you wish to use to identify yourself. You choose your personal ID, as you wish to print some personal photos of yours. Having done that, you are authenticated, and print.com welcomes you using your nickname and displays your icon on the resulting page.
  3. When you click a button that says "Give Print.com access to the pictures you wish us to print", a new frame is opened on your web site.
  4. This frame displays a page from your server, where you are already logged in. The page recognizes you and asks if you want to give print.com access to some of your content. It gives you information about print.com's current stock value on NASDAQ, and recent news stories about the company. There is a link to more information, which you don't bother exploring right now.
  5. You agree to give Print.com access, but only for 1 hour.
  6. When your web site asks you which content you want to give it access to, you select the pictures you would like it to have. Your server knows how to do content negotiation, so even though copying each one of the pictures over is feasible, you'd rather give print.com access to the photos directly, and let the two servers negotiate the best representation to use.
  7. Having done that you drag and drop an icon representing the set of photos you chose from this frame to a printing icon on the print.com frame.
  8. Print.com thanks you, shows you icons of the pictures you wish to print, and tells you that the photos will be on their way to the address of your choosing within 2 hours.

In more detail then we have the following interactions:

  1. Your browser GETs print.com's home page, which returns a page with a "publish my photos" button.
  2. You click the button, which starts the foaf+ssl handshake. The initial ssl connection requests a client certificate, which leads your browser to ask for your WebID in a nice popup as the iPhone can currently do. Print.com then dereferences your WebId in (2a) to verify that the public key in the certificate is indeed correct. Your WebId (Joe's foaf file) contains information about you, your public keys, and a relation to your contact addition service. Perhaps something like the following:
    :me xxx:contactRegistration </addContact> .
    Print.com uses this information when it creates the resulting html page to point you to your server.
  3. When you click the "Give Print.com access to the pictures you wish us to print" you are sending a POST form to the <addContact> resource on your server, with the WebId of Print.com <https://nasdaq.com/co/PRNT#co> in the body of the POST. The results of this POST are displayed in a new frame.
  4. Your web server dereferences Print.com's WebId, from which it gets some information about the company at the NASDAQ URL. Your server puts this information together (4a) in the html it returns to you, asking what kind of access you want to give this company, and for how long you wish to give it.
  5. You give print.com access for 1 hour by filling in the forms.
  6. You give access rights to Print.com to your individual pictures using the excellent user interface available to you on your server (a sketch of the resulting access rule follows this list).
  7. When you drag and drop the resulting icon depicting the collection of the photos accessible to Print.com, onto its "Print" icon in the other frame - which is possible with html5 - your browser sends off a request to the printing server with that URL.
  8. Print.com dereferences that URL, which is a collection of photos it now has access to, and which it downloads one by one. Print.com has access to the photos on your server after having been authenticated with its WebId using foaf+ssl. (Note: your server did not need to GET print.com's foaf file, as it still had a fresh version in its cache.) Print.com builds small icons of your photos, which it puts up on its server and links to in the resulting html before showing you the result. You can click on those previews to get an idea of what you will get printed.
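Steps 5 and 6 amount to your server creating an access control rule. As a very rough sketch of what it might store, using the W3C acl ontology - with a made-up todo:expires property, since time-limited access is not part of that vocabulary - and made-up photo paths:

    @prefix acl: <http://www.w3.org/ns/auth/acl#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    @prefix todo: <http://eg.org/todo#> .

    [] a acl:Authorization;
       acl:agent <https://nasdaq.com/co/PRNT#co>;                 # Print.com's WebId
       acl:mode acl:Read;
       acl:accessTo </photos/party1.jpg>, </photos/party2.jpg>;   # the pictures you selected
       todo:expires "2009-10-07T18:00:00Z"^^xsd:dateTime .        # made up: the 1 hour limit

When Print.com later GETs one of those photos, authenticated with its WebId via foaf+ssl, your server just checks the request against rules such as this one.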

So all the above requires very little in addition to foaf+ssl: just one relation, to point to a contact-addition POST endpoint. The rest is just good user interface design.
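To make that one relation concrete, here is a minimal sketch of the relevant part of your foaf file. The xxx: prefix stands for the yet-to-be-defined ontology of step 2 above, and the cert/rsa terms follow the current foaf+ssl experiments (the exact terms are still in flux):

    @prefix : <#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix cert: <http://www.w3.org/ns/auth/cert#> .
    @prefix rsa: <http://www.w3.org/ns/auth/rsa#> .
    @prefix xxx: <http://eg.org/todo#> .   # placeholder namespace

    <> a foaf:PersonalProfileDocument;
       foaf:primaryTopic :me .

    :me a foaf:Person;
       foaf:nick "joe";
       xxx:contactRegistration </addContact> .

    # the public key of your certificate, published for foaf+ssl
    [] a rsa:RSAPublicKey;
       cert:identity :me;
       rsa:modulus [ cert:hex "9d79bfe2498..." ];        # truncated for the example
       rsa:public_exponent [ cert:decimal "65537" ] .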

What do you think? Have I forgotten something obvious here? Is there something that won't work? Comment on this here, or on the foaf-protocols mailing list.

Notes

Creative Commons License
print.com sequence diagram by Henry Story is licensed under a Creative Commons Attribution 3.0 United States License.
Based on a work at blogs.sun.com.

Wednesday Jan 21, 2009

Outline of a business model for open distributed social networks

illustration of SN data silos by the Economist

The organisers of the W3C Workshop on the Future of Social Networking had the inspiration to include a track on Business Models. What would be the interest for the large players in opening up, in playing ball in a larger ecosystem of social networks? What would the business model be for new entrants? This question was clearly on many of the attendees' minds, and it is one I keep coming across.

First, without Linked Social Networks there are a lot of disincentives to create new web services. This is because users of these services need to duplicate the data they have already created elsewhere, and also because the network effect of the data they are using is bound to be much smaller. I have found myself uninterested in trying out many new web 2.0 offerings for just these reasons. It is much more rewarding for developers to create applications for the Facebook platform, for example, where the users just need to click a button to try out the application on the data they are already using - applications that may yet enrich this data further.

Open Secure Linked Social Networks, such as that made possible by foaf+ssl, give one the benefits enjoyed by application developers of the closed social networks, but in a distributed space, which should allow the unleashing of a huge amount of energy. Good for the consumer. But good for the big network players? I will argue that they need to consider the opportunities this is opening up.

First of all one should notice that the network effect applies just as much to the big players as to the small ones. A growing distributed social network (SN) is a tide that will lift all boats, and since Metcalfe's law states that the value of a network grows with the square of the number of nodes in it, doubling the number of nodes in the network may just quadruple the value of the network to all. Perhaps the big operators are thinking that they control such a large slice of the market that there is not much doubling they can do by linking to the smaller players. As it happens most social networks are geographical monopolies, which would go to strengthen that point of view (see slide 8 of the opening presentation at the W3C workshop). [ Nothing stays still, and everyone should watch out for the potential SN controlled by the telecoms. ]

But the network effect applies to the big players in another way too: if they wish to create small networks, then the value of those networks will be just as insignificant as those of other smaller players. So let me take a simple example of very valuable smaller networks which have a huge and unsatisfied demand for social networking technologies, and the money to pay for tools that could help them quench their need: companies! Companies need to connect their employees to their managers and colleagues inside the same company, to knowledge resources, to paychecks and to many other internal resources such as wikis, blogs, etc... Usually companies of a large enough size have software to deal with these. But even in a relatively large company such as Sun Microsystems, the network effect of that information is not interesting enough for people to participate gladly in keeping it up to date. I often find it easier to go to Facebook to find a picture of someone in Sun. Why? Well, there is a very large value in adding one's picture to Facebook: 200 million other users to connect to. In Sun Microsystems there are only 34 thousand people to connect to - plus, it is true, a financial incentive. Clearly the value of 200 million squared is larger than the incentive of being efficient at work.

One thing is clear: it is impossible to imagine such large software companies allowing their employees to just use the closed SN sites directly to do their work - I mean have all their employees just use the tools on facebook.com, for example. This would give those SN companies way too much insight into the dealings of the company. Certainly very large sections of industry, including defence contractors, large competitive industries such as car manufacturers, and governments, cannot legally and strategically permit such outsourcing to happen, even though the value of the information in such a large network would be astronomically larger. In some cases there are serious privacy and security issues that just cannot be ignored, however attractive the value calculation is.

So large SN players would have to enter the Fortune 1000 with Social Networking software that does not leak information. But in that case they would not be in any seriously better position than the software that is already in there, and they would not be able to compete any better than any of the smaller companies working in this space, as they would not find it easy to leverage their main advantage, namely the network effect of their core offering.

And even if they did manage to leverage this in some ways, they would find it impossible to leverage that advantage in the ways that really count. Companies don't just want their employees to link up with their colleagues; they need them to link up with people outside their company, be it customers, government agencies, researchers, competitors, external contractors, local government, insurance or health agencies, etc, etc..... The list is huge. So even if a large Social Network could convince one of these players of the advantage of its offering, it will never be able to convince every single partner of that company - for that would be to convince the whole world. Companies really want a global SN, and that is what Emergency Response teams really need.

To make such a globally linkable platform a reality one needs to build at the level of abstraction and clarity at which the only such global network in existence is built: the web itself. By using Linked Data one can create distributed social networks where each player can maintain the information they feel the business need to maintain. With the very simple foaf+ssl protocol we have the lightest possible means to build security into a distributed social network. Simplicity and clarity - mathematical clarity - is essential when developing something that is to grow to a global scale.

Companies that work at providing tools for distributed social networks will therefore, if they play well together, find a huge opportunity opening up in front of them: helping enterprises in every industry segment link up their employees with people in every walk of life.

Thursday Jan 15, 2009

The W3C Workshop on the Future of Social Networking Position Papers

picture by Salvador Dalí

I am in Barcelona, Spain (the country of Dalí) for the W3C Workshop on the Future of Social Networking. To prepare for this I decided to read through the 75 position papers. This is the best prepared I have ever been for a conference. It really changes the way I can interact with other attendees. :-)

I wrote down a few notes on most papers I read through, to help me remember what I read. This took me close to a week, a good part of which I spent trying to track down the authors on the web, find their pictures, familiarise myself with their work, and fill out my Address Book. Anything I could do to help me find as many connections as possible to help me remember the work. I used delicious to save some subjective notes, which can be found under the w3csn tag. I was going to publish this on Wednesday, but had not quite finished reading through all the papers. I got back to my hotel this evening to find that Libby Miller, who co-authored the foaf ontology, had beaten me to it with the extent and quality of her reviews, which she published in two parts:

Amazing work Libby!

70 papers is more than most people can afford to read. If I were to recommend just a handful of papers that stand out in my mind for now these would be:

  • Ching-man Au Yeung, Ilaria Liccardi, Kanghao Lu, Oshani Seneviratne and Tim Berners-Lee wrote the must-read paper 36, entitled "Decentralization: The Future of Online Social Networking". I completely agree with this outlook. It also mentions my foaf+ssl position paper, which of course gives it full marks :-) I would perhaps use "distribution" over "decentralisation", or some word that better suggests that the social network should be able to be as much of a peer-to-peer system as the web itself.
  • "Leveraging Web 2.0 Communities in Professional Organisations" really prooves why we need distributed social networks. The paper focuses on the problem faced by Emergency Response organisation. Social Networks can massively improove the effectiveness of such responses, as some recent catastrophes have shown. But ER teams just cannot expect everyone they deal with to be part of just one social network silo. They need to get help from anywhere it can come from. From professional ER teams, from people wherever they are, from infromation wherever it finds itself. Teams need to be formed ad hoc, on the spot. Not all data can be made public. Distributed Open Secure Social Networks are what is needed in such situations. Perhaps the foaf+ssl proposal (wiki page) can help to make this a reality.
  • In "Social networking across devices: opportunity and risk for the disabled and older community", Henni Swan explains how much social networking information could be put to use to help make better user interface for the disabled. Surprisingly enough none of the web sites, so taken by web 2.0 technologies, seem to put any serious, effort in this space. Aparently though this can be done with web 2.0 technologies, as Henny explains in her blog. The semantic Web could help even further I suggested to her at her talk today, by splitting the data from the user interface. Specialised browsers for the disabled could adapt the information for their needs, making it easy for them to navigate the graph.
  • "Trust and Privacy on the Social Web" starts the discussion in this very important space. If there are to be distributed social networks, they have to be secure, and the privacy and trust issues need to be looked at carefully.
  • On a lighter note, Peter Ferne's very entertaining paper "Collaborative Filtering and Social Capital" comes with a lot of great links and is a pleasure to read. Did you know about the Whuffie Index or CELEBDAQ? Find out here.
  • Many of the telecoms papers, such as Telefonica's "The social network behind telecom networks", reveal the elephant in the room that nobody saw in social networking: the telecoms. Who has the most information about everyone's social network? What could they do with this information? How many people have phones, compared to internet access? Something to think about.
  • Nokia's position paper can then be seen in a different light. How can handset manufacturers help put to use the social networking and location information contemporary devices are able to access? The Address Book in cell phones is the most important application on a telephone. But do people want to only connect to other Nokia users? This has to be another reason for distributed social networks.

    I will blog about other papers as the occasion presents itself. This is enough for now. I have to get up early and be awake for tomorrow's talks, which start at 8:30 am.

    In the mean time you can follow a lively discussion of the ongoing conference on twitter under the w3csn tag.

Thursday Sep 04, 2008

    Building Secure, Open and Distributed Social Network Applications

    Current Social Networks don't allow you to have friends outside their network. When on Facebook, you can't point to your friend on LinkedIn. They are data silos. This audio-enhanced slide show explains how a distributed decentralized social network is being built, how it works, and how to make it secure using the foaf+ssl protocol (a list of pointers on the esw wiki).

    It is licensed under a CC Attribution ShareAlike Licence.
    My voice is a bit odd on the first slide, but it gets better I think as I go along.

    Building Secure Open & Distributed Social Networks ( viewing this slide show requires a flash plugin; sorry, I only remembered this limitation after having put it online. If you know of a good Java substitute let me know. The other solution would have been to use Slidy. PDF and Annotated Open Document Format versions of this presentation are available below. (Why is this text visible in Firefox even when the plugin works?) )

    This is the presentation I gave at JavaOne 2008 and at numerous other venues in the past four months.

    The slidecast works a lot better as a presentation format than my previous semantic web video RDF: Connecting Software and People, which I published as an h.264 video a couple of years ago, and which takes close to 64MB of disk space. The problem with that format is that it is not easy to skip through the slides to the ones that interest you, or to go back and listen to a passage carefully again. Or at least it feels very clunky. My mp3 sound file only takes 17MB of space in comparison, and the graphics are much better quality in this slide show.

    It is hosted by the excellent slideshare service, which translated my OpenOffice odp document ( once it was cleaned up a little: I had to make sure it had no pointers to local files remaining accessible from the Edit>Links menu, which otherwise choked their service ). I used the Audacity sound editor to create the mp3 file, which I then placed on my bblfish.net server. Syncing the sound and the slides was then very easy using SlideShare's SlideCast application. I found that the quality of the slides was a lot better once I had created an account on their servers. The only thing missing would be a button, in addition to the forward and backward buttons, that would allow one to show the text of the audio, for people with hearing problems - something equivalent to the Notes view in OpenOffice.

    You can download the OpenOffice Presentation which contains my notes for each slide and the PDF created from it too. These are all published under a Creative Commons Attribution, Share Alike license. If you would like some of the base material for the slides, please contact me. If you would like to present them in my absence feel free to.

    Tuesday Aug 26, 2008

    Sun Intranet Foaf Experiment

    image of Address Book displaying internal sun foaf

    Building a foaf server from an ldap directory is pretty easy. Rinaldo Di Giorgio put a prototype server together for Sun in less than a week. As a result everyone in Sun now has an experimental, temporary foaf id that we can use to try out some things.
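    To give an idea, here is a sketch of the kind of graph such a server might return for one ldap entry; the attribute mapping and the URLs are invented for the illustration:

      @prefix foaf: <http://xmlns.com/foaf/0.1/> .

      <> a foaf:PersonalProfileDocument;
         foaf:primaryTopic <#me> .

      <#me> a foaf:Person;
         foaf:name "Jane Example";                              # from the ldap cn attribute
         foaf:mbox <mailto:jane.example@sun.com>;               # from the ldap mail attribute
         foaf:phone <tel:+1-555-555-0100>;                      # from the ldap telephoneNumber attribute
         foaf:knows <https://foaf.sun.example/people/boss#me> . # from the ldap manager attribute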

    So what can one do with foaf that one could not so easily do with ldap? Well, the semantic web is all about linking and meshing information. So one really simple thing to do is to link an external foaf file with the internal one. I did this by adding an owl:sameAs statement to my public foaf file that links my public and my Sun ids. (It would be better to link the internal foaf file to the external one, but that would have required a bit more work internally.) As a result, by dragging and dropping my foaf file onto today's release of the AddressBook, someone inside the Sun firewall can follow both my internal and my external connections. Someone outside the firewall will not be able to follow the internal link.
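    The statement in question is just one triple - something like the following, where the internal URL is invented for the illustration (the real one only resolves behind the firewall):

      @prefix owl: <http://www.w3.org/2002/07/owl#> .

      <http://bblfish.net/people/henry/card#me>
          owl:sameAs <https://foaf.sun.example/people/henry#me> .

    Any foaf-aware client that can dereference both URLs can then merge the two graphs into a single view of the same person.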

    By extending the internal foaf server a little more one could easily give people inside of Sun a place to link to their external business connection, wherever they might be in the world. To allow other companies to do this too it would of course help if everyone in Sun had a minimally public foaf ID, which would return only minimal information, or whatever the employee was comfortable revealing about themselves. This would allow Sun to present a yet more human face to the world.

    Well, that's just a thought, and this is just an experiment. Hopefully it will make the semantic web more real for us here, and allow people to dream up some great ways of bringing all the open source world together, ever closer.

    PS. For people inside of Sun it may be easier to just drag my internal foaf file directly onto the AddressBook (started via jnlp). Otherwise, to get the internal foaf file to download, you need to click the "fetch" button next to the "same As" combo box when viewing my info. Then you need to switch to "Last Imported" and back to allow "Bernard Traversat" to appear in the second column. He appears as someone I foaf:know after the merger of the internal and the external foaf. I know this is clumsy, and I'll try thinking up a way to make this more user friendly very soon. You are welcome to participate in the Address Book Project.

    PPS. Sun internal users can get more info on the project home page.

    PPPS. We of course use the Firefox Tabulator plugin for tests too. It gives a different interface to my AddressBook: more flexible, but less specialised... The Tabulator web application does not work currently because we only produce Turtle output. This is to avoid developers trying to use DOM tools to process these pages, as we don't want to put work into an RDF crystallisation. ( Note: if at some later time you find that the plugin is not compatible with the latest version of Firefox, you can manually disable compatibility checks. )

    Monday Apr 21, 2008

    FOAF & SSL: creating a global decentralised authentication protocol

    Following on my previous post RDFAuth: sketch of a buzzword compliant authentication protocol, Toby Inkster came up with a brilliantly simple scheme that builds very neatly on top of the Secure Sockets Layer of https. I describe the protocol shortly here, and will describe an implementation of it in my next post.

    Simple global ( passwordless if using a device such as the Aladdin USB e-Token ) authentication around the web would be extremely valuable. I am currently crumbling under the number of sites asking me for authentication information, and for each site I need to remember a new id and password combination. I am not the only one with this problem, as the data portability video demonstrates. OpenId solves the problem, but the protocol consumes a lot of ssl connections. For hyperdata user agents this could be painfully slow, because they may need access to just a couple of resources per server as they jump from service to service.

    As before we have a very simple scenario to consider. Romeo wants to find out where Juliette is. Juliette's hyperdata Address Book updates her location on a regular basis by PUTing information to a protected resource which she only wants her friends and their friends to have access to. Her server knows from her foaf:PersonalProfileDocument who her friends are. She identifies them via dereferenceable URLs, as I do, which themselves usually (the web is flexible) return more foaf:PersonalProfileDocuments describing them, and pointing to further such documents. In this way the list of people able to find out her location can be specified in a flexible and distributed manner. So let us imagine that Romeo is a friend of a friend of Juliette's and he wishes to talk to her. The following sequence diagram continues the story...

    sequence diagram of RDF+SSL

    The stages of the diagram are listed below:

    1. First Romeo's User Agent HTTP GETs Juliette's public foaf file located at http://juliette.net/. The server returns a representation ( in RDFa perhaps ) with the same semantics as the following N3:

      @prefix : <#> . 
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix todo: <http://eg.org/todo#> .
      @prefix openid: <http://eg.org/openid/todo#> .
      
      <> a foaf:PersonalProfileDocument;
         foaf:primaryTopic :juliette ;
         openid:server <https://aol.com/openid/service> . # see The Openid Sequence Diagram
      
      :juliette a foaf:Person;
         foaf:name "Juliette";
         foaf:openid <>;
         foaf:blog </blog>;    
         rdfs:seeAlso <https://juliette.net/protected/location>; 
         foaf:knows <http://bblfish.net/people/henry/card#me>,
                    <http://www.w3.org/People/Berners-Lee/card#i> .
      
      <https://juliette.net/protected/location> a todo:LocationDocument .
      

      Romeo's user agent receives this representation and decides to follow the https protected resource because it is a todo:LocationDocument.

    2. The todo:LocationDocument is at an https URL, so Romeo's User Agent connects to it via a secure socket. Juliette's server, who wishes to know the identity of the requestor, sends out a Certificate Request, to which Romeo's user agent responds with an X.509 certificate. This is all part of the SSL protocol.

      In the communication in stage 2, Romeo's user agent also passes along his foaf id. This can be done either by:

      • Sending in the HTTP header of the request an Agent-Id header pointing to the foaf Id of the user. Like this:
        Agent-Id: http://romeo.net/#romeo
        
        This would be similar to the current From: header, but instead of requiring an email address, a direct name of the agent would be required. (An email address is only an indirect identifier of an agent).
      • The Certificate could itself contain the Foaf ID of the Agent in the X509v3 extensions section:
                X509v3 extensions:
                   ...
                   X509v3 Subject Alternative Name: 
                                   URI:http://romeo.net/#romeo
        

        I am not sure if this would be a correct use of the X509 Subject Alternative Name field, so it would require more standardization work with the X509 community. But it shows a way in which the two communities could meet. The advantage of having the id as part of the certificate is that this could add extra weight to the id, depending on the trust one gives the Certificate Authority that signed the Certificate.

    3. At this point Juliette's web server knows of the requestor (Romeo in this case):
      • his alleged foaf Id
      • his Certificate ( verified during the ssl session )

      If the Certificate is signed by a CA that Juliette trusts and the foaf id is part of the certificate, then she will trust that the owner of the User Agent is the entity named by that id. She can then jump straight to step 6 if she knows enough about Romeo that she trusts him.

      Having Certificates signed by CAs is expensive though. The protocol described here will work just as well with self-signed certificates, which are easy to generate.

    4. Juliette's hyperdata server then GETs the foaf document associated with the foaf id, namely <http://romeo.net/> . Romeo's foaf server returns a document containing a graph of relations similar to the graph described by the following N3:
      @prefix : <#> . 
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix wot: <http://xmlns.com/wot/0.1/> .
      @prefix wotodo: <http://eg.org/todo#> .
      
      <> a foaf:PersonalProfileDocument;
          foaf:primaryTopic :romeo .
      
      :romeo a foaf:Person;
          foaf:name "Romeo";
          is wot:identity of [ a wotodo:X509Certificate;
                               wotodo:dsaWithSha1Sig """30:2c:02:14:78:69:1e:4f:7d:37:36:a5:8f:37:30:58:18:5a:
                                                   f6:10:e9:13:a4:ec:02:14:03:93:42:3b:c0:d4:33:63:ae:2f:
                                                   eb:8c:11:08:1c:aa:93:7d:71:01""" ;
                             ] ;
          foaf:knows <http://bblfish.net/people/henry/card#me> .
      
    5. By querying the semantics of the returned document with a SPARQL query such as
      PREFIX wot: <http://xmlns.com/wot/0.1/> 
      PREFIX wotodo: <http://eg.org/todo#> 
      
      SELECT ?sig
      WHERE {
          [] a wotodo:X509Certificate;
             wotodo:dsaWithSha1Sig ?sig;
             wot:identity <http://romeo.net/#romeo> .
      }
      

      Juliette's web server can discover the certificate signature and compare it with the one sent by Romeo's user agent. If the two are identical, then Juliette's server knows that the User Agent - which has access to the private key of the certificate it sent, and which claims to be the person identified by the URI http://romeo.net/#romeo - agrees about the identity of the certificate with the person who has write access to the foaf file at http://romeo.net/. So by proving that it has access to the private key of the certificate sent to the server, the User Agent has also proven that it is the person described by the foaf file.

    6. Finally, now that Juliette's server knows an identity of the User Agent making the request on the protected resource, it can decide whether or not to return the representation. In this case we can imagine that my foaf file says that
       @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      
       <http://bblfish.net/people/henry/card#me> foaf:knows <http://romeo.net/#romeo> .  
       
      As a result of the policy of allowing all friends of Juliette's friends to read the location document (a sketch of such a rule follows below), the server sends out a document containing relations such as the following:
      @prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .
      @prefix : <http://juliette.org/#> .
      
      :juliette 
          contact:location [ 
                contact:address [ contact:city "Paris";
                                  contact:country "France";
                                  contact:street "1 Champs Elysees" ]
                           ] .
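
      The friend-of-a-friend policy applied here could itself be written as a simple N3 rule on Juliette's server. A sketch, with a made-up todo:mayRead relation:

      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix todo: <http://eg.org/todo#> .

      # whoever is known by one of Juliette's friends may read her location document
      { <http://juliette.net/#juliette> foaf:knows ?friend .
        ?friend foaf:knows ?agent . }
        => { ?agent todo:mayRead <https://juliette.net/protected/location> . } .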
      

    Todo

    • Create an ontology for X509 certificates (a first sketch follows this list).
    • test this. Currently there is some implementation work going on in the so(m)mer repository in the misc/FoafServer directory.
    • Can one use the Subject Alternative name of an X509 certificate as described here?
    • For self signed certificates, what should the X509 Distinguished Name (DN) be? The DN is really being replaced here by the foaf id, since that is where the key information about the user is going to be located. Can one ignore the DN in a X509 cert, as one can in RDF with blank nodes? One could I imagine create a dummy DN where one of the elements is the foaf id. These would at least, as opposed to DN, be guaranteed to be unique.
    • what standardization work would be needed to make this possible?
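
    As a first stab at the first item, reusing the placeholder wotodo: namespace from the examples above, such an ontology could start as small as:

      @prefix owl: <http://www.w3.org/2002/07/owl#> .
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix wotodo: <http://eg.org/todo#> .

      wotodo:X509Certificate a owl:Class;
          rdfs:comment "An X.509 certificate, such as the one presented during the SSL handshake." .

      wotodo:dsaWithSha1Sig a owl:DatatypeProperty;
          rdfs:domain wotodo:X509Certificate;
          rdfs:comment "The dsaWithSha1 signature of the certificate, as colon-separated hex." .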

    Discussion on the Web

    Friday Apr 18, 2008

    The OpenId Sequence Diagram

    OpenId very neatly solves the global identity problem within the constraints of working with legacy browsers. It is a complex protocol though as the following sequence diagram illustrates, and this may be a problem for automated agents that need to jump around the web from hyperlink to hyperlink, as hyperdata agents tend to do.

    The diagram illustrates the following scenario. Romeo wants to find the current location of Juliette. So his semantic web user agent GETs her current foaf file. But Juliette wants to protect information about her current whereabouts and reveal it only to people she trusts, so she configures her server to require the user agent to authenticate itself in order to get more information. If the user agent can prove that it is owned by one of her trusted friends, and Romeo in particular, she will deliver the information to it (and so to him).

    The steps numbered in the sequence diagram are as follows:

    1. A User Agent fetches a web page that requires authentication. OpenId was designed with legacy web browsers in mind, for which it would return a page containing an OpenId login box such as the one to the right. In the case of a hyperdata agent, as in our use case, the agent would GET a public foaf file, which might contain a link to an OpenId authentication endpoint. Perhaps with some rdf such as the following N3:
      <> openid:login </openidAuth.cgi> .
      
      Perhaps some more information would indicate which resources were protected.
    2. In current practice a human user notices the login box and types his identifying URL into it, such as http://openid.sun.com/bblfish. This is the brilliant invention of OpenId: getting hundreds of millions of people to find it natural to identify themselves via a URL, instead of an email address. The user then clicks the "Login" button.
      In our semantic use case the hyperdata agent would notice the above openid link and would deduce that it needs to login to the site to get more information. Romeo's Id ( http://romeo.net/ perhaps ) would then be POSTed to the /openidAuth.cgi authentication endpoint.
    3. The OpenId authentication endpoint then fetches the web page by GETing Romeo's url http://romeo.net/. The returned representation contains a link in the header of the page pointing to Romeo's OpenId server url. If the representation returned is html, then the header would contain the following:
       <link rel="openid.server" href="https://openid.sun.com/openid/service" />
      
    4. The representation returned in step 3 could contain a lot of other information too. A link to a foaf file may not be a bad idea, as I described in foaf and openid. The returned representation in step 3 could even be RDFa-extended html, in which case this step may not even be necessary (see the triple sketched after this list). For a hyperdata server the information may be useful, as it may suggest connections Romeo has to some other people, allowing it to decide whether it wishes to continue the login process.
    5. Juliette's OpenId authentication endpoint then sends a redirect to Romeo's user agent, directing it towards his OpenId Identity Provider. The redirect also contains the URL of the OpenId authentication cgi, so that in step 8 below the Identity Provider can redirect a message back.
    6. Romeo's user agent dutifully redirects to the identity provider, which then returns a form with a username and password entry box.
    7. Romeo's user agent could learn to fill the username and password pair in automatically, and even skip the previous step 6. In any case, given the username and password, the Identity Provider then sends back some cryptographic tokens to the User Agent to have it redirect to the OpenId Authentication cgi at http://juliette.net/openidAuth.cgi.
    8. Romeo's hyperdata user agent then dutifully redirects back to the OpenId authentication endpoint.
    9. The authentication endpoint sends a request to the OpenId Identity Provider to verify that the cryptographic token is authentic. If it is, a confirmation is sent back.
    10. The OpenId authentication endpoint finally sends a response back with a session cookie, giving access to various resources on Juliette's web site. Perhaps it even knows to redirect the user agent to a protected resource, though that would have required some information concerning this to have been sent in stage 2.
    11. Finally Romeo's user agent can GET Juliette's protected information if Juliette's hyperdata web server permits it. In this case it will, because Juliette loves Romeo.
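
    As an aside: if the representation returned in step 3 were RDF (or RDFa-extended html), the openid.server link element would boil down to a single triple, in the made-up openid ontology used in these posts:

      @prefix openid: <http://eg.org/openid/todo#> .

      <http://romeo.net/> openid:server <https://openid.sun.com/openid/service> .

    A hyperdata agent could then pick up this triple, and any foaf relations, from the same representation in one GET, making step 4 redundant.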

    All of the steps above could be automated, so from the user's point of view they may not be complicated. The user agent could even learn to fill in the username and password required by the Identity Provider. But there are still a very large number of connections between the User Agent and the different services. If these connections are to be secure, they need to be protected by SSL (as hinted at by the double-line arrows), and SSL connections are not cheap. So the above may be unacceptably slow. On the other hand it works with a protocol that is growing fast in acceptance.

    It is certainly worth comparing this sequence diagram with the very lightweight one presented in "FOAF & SSL: creating a global decentralised authentication protocol".

    Thanks again to Benjamin Nowack for moving the discussion on RDFAuth towards using the OpenId protocol directly, as described above. See his post on the semantic web mailing list. Benjamin also pointed to the HTTP OpenID Authentication proposal, which shows how some of the above can be simplified if certain assumptions about the capabilities of the client are made. It would be worth making a sequence diagram of that proposal too.

    Friday Mar 28, 2008

    RDFAuth: sketch of a buzzword compliant authentication protocol

    Here is a proposal for an authentication scheme that is even simpler than OpenId ( see sequence diagram ), more secure, more RESTful, and with fewer points of failure and fewer points of control - something that is needed in order to make Open Distributed Social Networks with privacy controls possible.

    Update

    The following sketch led to the even simpler protocol described in FOAF & SSL: creating a global decentralised authentication protocol. It is very close to what is proposed here, but builds very closely on SSL, so as to reduce what is new down to nearly nothing.

    Background

    Ok, so now that I have your attention, I would like to first mention that I am a great fan of OpenId. I have blogged about it numerous times and enthusiastically in this space. I came across the idea I will develop below not because I thought OpenId needed improving, but because I have chosen to follow some very strict architectural guidelines: it had to satisfy RESTful, resource-oriented hyperdata constraints. With the Beatnik Address Book I have proven - to myself at least - that the creation of an Open Distributed Social Network (a hot topic at the moment; see the Economist's recent article on online social networks) is feasible and easy to do. What was missing was a way for people to keep some privacy, clearly a big selling point for the large Social Network Providers such as Facebook. So I went in search of a solution to create an Open Distributed Social Network with privacy controls. And initially I had thought of using OpenId.

    OpenId Limitations

    But OpenId has a few problems:

    • First, it is really designed to work with the limitations of current web browsers. It is partly because of this that there is a lot of hopping around from the service to the Identity Provider with HTTP redirects - hoops that hyperdata agents such as the Tabulator, Knowee or Beatnik should not have to jump through.
    • Parts of OpenId 2, and especially the Attribute Exchange spec, really don't feel very RESTful. There is a method for PUTing new property values in a database, and a way to remove them, that uses neither the HTTP PUT method nor the DELETE method.
    • The OpenId Attribute Exchange is nice but not very flexible. It can keep some basic information about a person, but it does not make use of hyperdata. And the way it is set up, it would only be able to do so with great difficulty. A RESTfully published foaf file can give the same information, is a lot more flexible and extensible, whilst also making use of Linked Data, and as it happens also solves the Social Network Data Silo problems. Just that!
    • OpenId requires an Identity Server. There are a couple of problems with this:
      • This server provides a dynamic service but not a RESTful one, i.e. the representations sent back and forth to it cannot be cached.
      • The service is a control point. Anyone owning such a service will know which sites you authenticate onto. True, you can set up your own service, but that is clearly not what is happening. The big players are offering their customers OpenIds tied to particular authentication servers, and that is what most people will accept.
    As I found out by developing what I am here calling RDFAuth, for want of a better name, none of these restrictions are necessary.

    RDFAuth, a sketch

    So, following my strict architectural guidelines, I came across what I am here calling RDFAuth, for want of a better name; but like everything else here this is a sketch and open to change. I am not a security specialist nor an HTTP specialist. I am like someone who comes to an architect in order to build a house on some land he has, with a sketch of what he would like the house to look like, some ideas of the functionality he needs, and the price he is willing to pay. What I want here is something very simple that can be made to work with a few perl scripts.

    Let me first present the actors and the resources they wish to act upon.

    • Romeo has a Semantic Web Address Book, his User Agent (UA). He is looking for the whereabouts of Juliette.
    • Juliette has a URL identifier ( as I do ) which returns a public foaf representation and links to a protected resource.
    • The protected resource contains information she only wants some people to know, in this instance Romeo. It contains information as to her current whereabouts.
    • Romeo also has a public foaf file. He may have a protected one too, but it does not make an entrance in this scene of the play. His public foaf file links to a public PGP key. I described how that is done in Cryptographic Web of Trust.
    • Romeo's Public key is RESTfully stored on a server somewhere, accessible by URL.

    So Romeo wants to find out where Juliette is, but Juliette only wants to reveal this to Romeo. Juliette has told her server to only allow Romeo, identified by his URL, to view the site. She could also have had a more open policy, allowing any of her or Romeo's friends to have access, as specified by their foaf files. The server could then crawl their respective foaf files at regular intervals to see if it needed to add anyone to the list of people having access to the site. This is what the DIG group did in conjunction with OpenId. Juliette could also have a policy that decides Just In Time, as the person presents herself, whether or not to grant them access: she could use the information in that person's foaf file, relating it to some trust metric, to make her decision. How Juliette specifies who gets access to the protected resource is not part of this protocol. This is completely up to Juliette and the policies she chooses her agent to follow.

    So here is the sketch of the sequence of requests and responses.

    1. First Romeo's User Agent knows that Juliette's foaf name is http://juliette.org/#juliette, so it sends an HTTP GET request to Juliette's foaf file, located of course at http://juliette.org/.
      The server responds with a public foaf file containing a link to the protected resource, perhaps with the N3:
        <> rdfs:seeAlso <protected/juliette> .
      
      Perhaps this could also contain some relations describing that resource as protected, which groups may access it, etc... but that is not necessary.
    2. Romeo's User Agent then decides it wants to check out protected/juliette. It sends a GET request to that resource, but this time receives a challenge based on a variation of the Basic Authentication scheme, perhaps something like:
      HTTP/1.0 401 UNAUTHORIZED
      Server: Knowee/0.4
      Date: Tue, 1 Apr 2008 10:18:15 GMT
      WWW-Authenticate: RdfAuth realm="http://juliette.org/protected/*" nonce="ILoveYouToo"
      
      The idea is that Juliette's server returns a nonce (in order to avoid replay attacks), and a realm over which this protection will be valid. But I am really making this up here. Better ideas are welcome.
    3. Romeo's web agent then encrypts some string (the realm?) and the nonce with Romeo's private key. Only an agent trusted by Romeo can do this.
    4. The User Agent then sends a new GET request with the encrypted string, and his identifier, perhaps something like this
      GET /protected/juliette HTTP/1.0
      Host: juliette.org
      Authorization: RdfAuth id="http://romeo.name/#romeo" key="THE_REALM_AND_NONCE_ENCRYPTED"
      Accept: application/rdf+xml, text/rdf+n3
      
      Since we need an identifier, why not just use Romeo's foaf name? It happens to also point to his foaf file. All the better.
    5. Because Juliette's web server can then use Romeo's foaf name to GET his public foaf file, which contains a link to his public key, as explained in "Cryptographic Web of Trust".
    6. Juliette's web server can then query the returned representation, perhaps meshed with some other information in its database, with something equivalent to the following SPARQL query
      PREFIX wot: <http://xmlns.com/wot/0.1/>
      SELECT ?pgp
      WHERE {
           [] wot:identity <http://romeo.name/#romeo>;
              wot:pubkeyAddress ?pgp .
      } 
      
      The nice thing about working at the semantic layer is that it decouples the spec a lot from the representation returned. Of course as usage grows, those representations that are understood by the most servers will create a de facto convention. Initially I suggest using RDF/XML of course. But it could just as well be N3, RDFa, perhaps even some microformat dialect, or even some GRDDLable XML, as the POWDER working group is proposing to do.
    7. Having found the URL of the PGP key, Juliette's server can GET it - and, as with much else in this protocol, cache it for future use.
    8. Having the PGP key, Juliette's server can now decrypt the encrypted string sent to her by Romeo's User Agent. If the decrypted string matches the expected string, Juliette will know that the User Agent has access to Romeo's private key. So she decides this is enough to trust it.
    9. As a result Juliette's server returns the protected representation.
    Now Romeo's User Agent knows where Juliette is, displays it, and Romeo rushes off to see her.
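
    For steps 5 and 6 to work, Romeo's public foaf file needs to link his foaf name to his public key. A minimal sketch of such a file, shaped to match the SPARQL query above (the key's URL is invented):

      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix wot: <http://xmlns.com/wot/0.1/> .

      <> a foaf:PersonalProfileDocument;
         foaf:primaryTopic <#romeo> .

      <#romeo> a foaf:Person;
          foaf:name "Romeo" .

      # the link Juliette's server looks for in step 6;
      # <#romeo> resolves to http://romeo.name/#romeo when served from http://romeo.name/
      [] a wot:PubKey;
          wot:identity <#romeo>;
          wot:pubkeyAddress <http://romeo.name/pubkey.asc> .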

    Advantages

    It should be clear from the sketch what the numerous advantages of this system are over OpenId. (I can't speak of other authentication services as I am not a security expert).

    • The User Agent has no redirects to follow. In the above example it needs to request one resource http://juliette.org/ twice (2 and 4), but that may only be necessary the first time it accesses this resource; the second time the UA can immediately jump to step 3. [but see the problem with replay attacks raised in the comments by Ed Davies, and my reply] Furthermore it may be possible - this is a question for HTTP specialists - to merge steps 1 and 2: would it be possible for request 1 to return a 20x code with the public representation, plus a WWW-Authenticate header, suggesting that the UA can get a more detailed representation of the same resource if authenticated? In any case the redirect rigmarole of OpenId, which is really there to overcome the limitations of current web browsers, is not needed.
    • There is no need for an Attribute Exchange type service. Foaf deals with that in a clear and extensible RESTful manner. This simplifies the spec dramatically.
    • There is no need for an identity server, so there is one less point of failure and one less point of control in the system. The public key plays that role in a clean and simple manner.
    • The whole protocol is RESTful. This means that all representations can be cached, meaning that steps 5 and 7 need only occur once per individual.
    • As RDF is built for extensibility, and we are being architecturally very clean, the system should be able to grow cleanly.

    Contributions

    I have been quietly exploring these ideas on the foaf and semantic web mailing lists, where I received a lot of excellent suggestions and feedback.

    Finally

    So I suppose I am now looking for feedback from a wider community. PGP experts, security experts, REST and HTTP experts, semantic web and linked data experts, only you can help this get somewhere. I will never have the time to learn these fields in enough detail by myself. In any case all this is absolutely obviously simple, and so completely unpatentable :-)

    Thanks for taking the time to read this

    Tuesday Jan 15, 2008

    Data Portability: The Video

    Here is an excellent video to explain the problem faced by Web 2.0 companies and what Data Portability means. It is amazing how a good video can express something so much more powerfully, so much more directly than words can. Sit back and watch.


    DataPortability - Connect, Control, Share, Remix from Smashcut Media on Vimeo.

    Feeling better? You are gripped by the problem? Good. You should now find that my posts of the previous years start making a lot more sense :-)

    Will the Data Portability group get the best solution together? I don't know. The problem with the name they have chosen is that it is so general that one wonders whether XML is not the solution to their problem. Won't XML make data portability possible, if everyone agrees on what they want to port? Of course, getting that agreement on all the topics in the world is a never ending process.... Had they retained the name of the original group this stemmed from, Social Network Portability, then one could see how to tackle this particular issue. And this particular issue seems to be the one this video is looking at.

    But the question is also whether portability is the right issue. Well, in some ways it is. Currently each web site has information locked up in html formats, in natural language, or even sometimes in jpegs (see the previous story of Scoble and Facebook), in order to make it difficult to export the data, which each service wants to hold onto as if it were theirs to own.

    Another way of looking at this is that the Data Portability group cannot so much be about technology as policy. The general questions it has to address are questions of who should see what data, who should be able to copy that data, and what they should be able to do with it. This does indeed involve identity technology, insofar as all of the above questions turn around questions of identity ("who?"). Now if every site requires one to create a new identity in order to access one's data, one has the nightmare scenario depicted in the video, where one has to maintain one's identity across innumerable sites. As a result, the policy issue of Data Portability does require one to solve the technical problem of distributed identity: how can people maintain the minimum number of identities on the web (i.e. not one per site)? Another issue follows right upon the first: if one wants information to only be visible to a select group of people - the "who sees what" part of the question - then one also needs a distributed way to specify group membership, be it friendship based or other. The video again makes very clear why having to recreate one's social network on every site is impractical.

    What may be misleading about the term Data Portability is that it may lead one to think that what one wants is to copy one's social information from one social service to another. That would just automate the job the video illustrates people having to do by hand currently. But that is not a satisfactory solution, because one cannot extract a graph of information from one space to another without loss. If I extract my friends from LinkedIn into Facebook, it is quite certain that Facebook will not recognise a large number of the people I know on LinkedIn. Furthermore the ported information on Facebook would soon be out of date, as people updated their network and profiles on LinkedIn. Unless of course Facebook were able to make a constant copy of the information on LinkedIn. But that's impossible, right? Wrong! That is the difference between copy by value and copy by reference. If Facebook can refer to people on LinkedIn, then the data will always be as up to date as it can be. So this is how one moves from Data Portability to Linked Data, also known as hyperdata.
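
    In RDF, copy by reference is the default: a graph on one site can simply point at a person's URI on another site instead of holding a stale copy of their data. Something like this, with made-up URLs:

      @prefix foaf: <http://xmlns.com/foaf/0.1/> .

      # a profile on one social network pointing at an identity hosted on another
      <http://facebook.example/joe#me> foaf:knows <http://linkedin.example/in/jane#me> .

    Whoever dereferences Jane's URI always gets the current description of her; nothing needs to be re-imported.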
