Wednesday Jan 13, 2010

Faviki: social bookmarking for 2010


Faviki is, simply put, the next-generation social bookmarking service. "A bookmarking service? You must be kidding?!" I can hear you say in worried exasperation. "How can one innovate in that space?" Not only is it possible to innovate here - let me explain why I moved all my bookmarks from delicious over to Faviki.

Like delicious, digg, twitter and others, Faviki uses crowdsourcing to allow one to share interesting web pages one has found, stay up to date on a specific topic of interest, and keep one's bookmarks synchronized across computers. So there is nothing new at that level: if you know del.icio.us, you won't be disoriented.

What is new is that instead of being one crowdsourced application, it is in fact two. It builds on Wikipedia to help you tag your content intelligently, with concepts taken from DBpedia. Instead of tagging with strings whose meaning only you understand, and only at that time, you can have tags that make sense, backed by a real, evolving encyclopedia. Sounds simple? Don't be deceived: there is huge potential in this.

Let us start with the basics: what is tagging for? It is there to help us find information again, to categorize our resources into groups in a rapidly growing information space. I now have close to ten years of bookmarks saved away. As a result I can no longer remember what strings I used previously to tag certain categories of resources. Was it "hadopi", "paranoia", "social web", "socialweb", "web", "security", "politics", "zensursula", "bigbrother", "1984", ...? If I tag a document about a city, should I tag it "Munich", "München", "capital", "Bavaria", "Germany", "town", "agglomeration", "urbanism", "living", ...? As time passed I found it necessary to add more and more tags to my bookmarks, hoping that I would be able to find a resource again in the future by accidentally choosing one of those tags. But clearly that is not the solution. Any of those tags could furthermore be used very differently by other people on delicious. Crowdsourcing only partially works here, because there is no clear understanding of what is meant by a tag, and there is no space to discuss that. Is "bank" the bank of a river, or the bank you put money in? Wikipedia has a disambiguation page for this, which took some time to put together. No such mechanism exists on delicious.

Faviki neatly solves this problem by using the work done by another crowdsourced application, and allowing you to tag your entries with concepts taken from there. Before you tag a page, Faviki suggests possible DBpedia concepts that could fit its content. As you choose the tags, the definition from Wikipedia is made visible, so that you can pick which meaning of the tag you want. Finally, when you tag, you don't tag with a string but with a URI: the DBpedia URI for that concept. You can always go back and check the detailed meaning of your tags.
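To make this concrete, here is a rough sketch of what such a semantically tagged bookmark could look like in Turtle (the ex: bookmarking vocabulary is hypothetical - Faviki's internal model may well differ):

@prefix ex: <http://example.org/bookmark#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .

<http://example.org/bookmarks/42>
    a ex:Bookmark ;
    ex:recalls <http://www.muenchen.de/> ;  # the page being bookmarked
    ex:tag dbpedia:Munich .                 # a concept with a URI, not a string

The crucial point is the last triple: the tag is a resource about which an open, growing body of knowledge exists.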

But that is just the beginning of the neatness of this system. Imagine you tag a page with http://dbpedia.org/resource/Munich (the user does not see this URL, of course!). Then, by using the growing linked data cloud, Faviki or other services will be able to start doing some very interesting inferencing on this data. Since the above resource is known to be a town and a capital, to be in Germany which is in Europe, to have more than half a million inhabitants, to lie along a certain river, to contain certain museums, to have different names in a number of other languages, and to be related in certain ways to certain famous people (such as the current Pope)... it will be possible to improve the service to allow you to search for things in a much more generic way: you could ask Faviki for resources that were tagged with some European town and the concept Art. If you search for "München", Faviki will be able to enlarge the search to Munich, since they will be known to be tags for the same city...
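To give a flavour of this, here is a sketch of such a query in SPARQL, reusing the hypothetical ex: vocabulary from above and some illustrative class and property names from the DBpedia ontology (the exact names may differ):

PREFIX ex: <http://example.org/bookmark#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>

# find pages tagged with the concept Art and with some German town
SELECT DISTINCT ?page
WHERE {
   ?b ex:recalls ?page ;
      ex:tag dbpedia:Art ;
      ex:tag ?town .
   ?town a dbo:Town ;
         dbo:country dbpedia:Germany .
}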

I will leave it as an exercise to the reader to think about other interesting ways to use this structured information to make finding resources easier. Here is an image of the state of the linked data cloud 6 months ago to stimulate your thinking :-)


But think about it the other way now. Not only are you helping your future self find information bookmarked semantically - let's use the term now - you are also making that information clearly available to wikipedia editors in the future. Consider for example the article "Lateralization of Brain Function" on wikipedia. The Faviki page on that subject is going to be a really interesting place to look to find good articles on the subject appearing on the web. So with Faviki you don't have to work directly on wikipedia to participate. You just need to tag your resources carefully!

Finally I am particularly pleased by Faviki, because it is exactly the service I described on this blog 3 years ago in my post Search, Tagging and Wikis, at the time when the folksonomy meme was in full swing, threatening, according to its fiercest proponents, to put the semantic web enterprise into the dustbin of history.

Try out Faviki, and see who makes more sense.


Sunday Nov 29, 2009

Web Finger proposals overview

If all you had was an email address, would it not be nice to have a mechanism to find someone's home page or OpenId from it? Two proposals have been put forward to show how this could be done. I will look at them and add a sketch of my own that hopefully should lead us to a solution that takes the best of both.

The WebFinger GoogleCode page explains what webfinger is very well:

Back in the day you could, given somebody's UNIX account (email address), type
$ finger email@example.com 
and get some information about that person, whatever they wanted to share: perhaps their office location, phone number, URL, current activities, etc.

The new ideas generalize this to the web, by following a very simple insight: if you have an email address like henry.story@sun.com, then the owner of sun.com is responsible for managing the email. That is the same organization responsible for managing the web site http://sun.com. So all that is needed is some machine readable pointer from http://sun.com/ to a lookup giving more information about the owner of the email address. That's it!

The WebFinger proposal

The WebFinger proposed solution showed the way so I will start from here. It is not too complicated, at least as described by John Panzer's "Personal Web Discovery" post.

John suggests a convention that servers have a file at the /host-meta location in the root of the HTTP server to describe metadata about the site. (This seems to me to break web architecture. But never mind: the resource http://sun.com/ can have a link to some file that describes a mapping from email ids to information about their owners.) The WebFinger solution is to have that resource be in a new application/host-meta file format (not XML, by the way). This would contain a mapping of the form:

Link-Pattern: <http://meta.sun.com/?q={%uri}>; 
    rel="describedby";type="application/xrd+xml"
So if you wanted to find out about me, you'd be able to do a simple HTTP GET request on http://meta.sun.com/?q=henry.story@sun.com, which would return a representation of the user in another new format, application/xrd+xml.

The idea is really good, but it has three more or less important flaws:

  • It seems to require, by convention, that all web sites set up a /host-meta location on their web servers. Such a global requirement seems a bit strong, and does not in my opinion follow web architecture: it is not up to a spec to define the meaning of URIs, especially those belonging to other people.
  • It seems to require a new, non-XML application/host-meta format.
  • It creates yet another file format to describe resources: application/xrd+xml. It is better to describe resources at a semantic level using the Resource Description Framework, and not enter the format battle zone. To describe people there is already the widely known friend of a friend ontology, which can be extended by anyone. Luckily it would be easy for the XRD format to participate in this, by simply creating a GRDDL mapping to the semantics.

All this new format creation is a real pain. New formats require new parsers, testing of the spec, mapping to semantics, etc. There is no reason to do this anymore: it is a solved problem.

But lots of kudos for the good idea!

The FingerPoint proposal

Toby Inkster, co-inventor of foaf+ssl, authored the fingerpoint proposal, which avoids the problems outlined above.

Fingerpoint defines one useful relation, sparql:fingerpoint (available at the relation's namespace of course, as all good linked data should be), defined as:

sparql:fingerpoint
	a owl:ObjectProperty ;
	rdfs:label "fingerpoint" ;
	rdfs:comment """A link from a Root Document to an Endpoint Document 
                        capable of returning information about people having 
                        e-mail addresses at the associated domain.""" ;
	rdfs:subPropertyOf sparql:endpoint ;
	rdfs:domain sparql:RootDocument .
It is then possible to have the root page link to a SPARQL endpoint that can be queried very flexibly for information. Because the link is defined semantically, there are a number of ways to point to the SPARQL endpoint:
  • Using the up and coming HTTP-Link HTTP header,
  • Using the well tried html <link> element.
  • Using RDFa embedded in the html of the page
  • By having the home page return any other representation that may be popular or not, such as rdf/xml, N3, or XRD...
Toby does not mention those last two options in his spec, but the beauty of defining things semantically is that one is open to such possibilities from the start.
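For example, using the well tried html <link> element, the root document could carry something like this (a sketch: the rel value should really be the full URI of the sparql:fingerpoint relation, abbreviated here for readability, and the endpoint path is made up):

     <link rel="sparql:fingerpoint" href="/sparql" />

The href can point to whatever SPARQL endpoint the site operator chooses.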

So Toby gets more power than the WebFinger proposal by inventing only one new relation! All the rest is already defined by existing standards.

The only problem one can see with this is that SPARQL, though not that difficult to learn, is perhaps a bit too powerful for what is needed. You can really ask anything of a SPARQL endpoint!

A possible intermediary proposal: semantic forms

What is really going on here? Let us think in simple HTML terms, and forget about machine readable data a bit. If this were done for a human being, what we really would want is a page that looks like the webfinger.org site, which currently is just one query box and a search button (just like Google's front page). Let me reproduce this here:

Here is the html for this form at its purest, without styling:

     <form  action='/lookup' method='GET'>
         <img src='http://webfinger.org/images/finger.png' />
         <input name='email' type='text' value='' />         
         <button type='submit' value='Look Up'>Look Up</button>
     </form>

What we want is some way to make it clear to a robot, that the above form somehow maps into the following SPARQL query:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
   [] foaf:mbox ?email;
      foaf:homepage ?homepage
}

Perhaps this could be done with something as simple as an RDFa extension such as:

     <form  action='/lookup' method='GET'>
         <img src='http://webfinger.org/images/finger.png' />
         <input name='email' type='text' value='' />         
         <button type='submit' value='homepage' 
                sparql='PREFIX foaf: <http://xmlns.com/foaf/0.1/> 
                 GET ?homepage
                 WHERE {
                   [] foaf:mbox ?email;
                      foaf:homepage ?homepage
                 }">Look Up</button>
     </form>

When the user (or robot) submits the form, the page he ends up on is the result of the SPARQL query in which the identically named variables have been replaced by the values of the form fields. So if I entered henry.story@sun.com in the form, I would end up on the page http://sun.com/lookup?email=henry.story@sun.com, which could perhaps just be a redirect to this blog page... This would then be the answer to the SPARQL query

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
   [] foaf:mbox "henry.story@bblfish.net";
      foaf:homepage ?homepage
}
(note: that would be wrong as far as the definition of foaf:mbox goes, which relates a person to an mbox, not a string... but let us pass on this detail for the moment)
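A version that respects the definition of foaf:mbox would use the mailto: URI rather than a plain string:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
   [] foaf:mbox <mailto:henry.story@sun.com>;
      foaf:homepage ?homepage
}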

Here we would be defining a new GET method in SPARQL, which finds the type of web page that the form submission would end up landing on: namely a page that is the homepage of whoever has the given email address.

The nice thing about this is that, as with Toby Inkster's proposal, we would only need one new relation from the home page to such a finder page; and once such a SPARQL form mapping mechanism is defined, it could be used in many other ways too, so it would make sense for people to learn it. For example it could be useful to make web sites available to shopping agents, as I had started thinking about in RESTful semantic web services before RDFa was out.

But most of all, something along these lines would allow services to answer such a query with a very simple CGI, without needing to invest in a full-blown SPARQL query engine. At the same time it makes the mapping to the semantics of the form very clear. Perhaps someone has a solution for this already. Perhaps there is a better way of doing it. But it is along these lines that I would be looking for a solution...
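To show just how simple such a CGI could be, here is a minimal sketch in Python (the email-to-homepage table is of course hypothetical - a real service would consult its user database):

#!/usr/bin/env python
# A minimal sketch of the /lookup CGI: it answers the one SPARQL query
# that the form encodes, using a plain dictionary in place of a full
# SPARQL engine.
import cgi

HOMEPAGES = {
    "henry.story@sun.com": "http://blogs.sun.com/bblfish/",
}

form = cgi.FieldStorage()
email = form.getfirst("email", "")
homepage = HOMEPAGES.get(email)
if homepage:
    # the answer to the query: redirect to the homepage found
    print "Status: 303 See Other"
    print "Location: " + homepage
    print
else:
    print "Status: 404 Not Found"
    print "Content-Type: text/plain"
    print
    print "no homepage known for " + email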

(See also an earlier post of mine SPARQLing AltaVista: the meaning of forms)

How this relates to OpenId and foaf+ssl

One of the key use cases for such a WebFinger service comes from the difficulty people have thinking of URLs as identifiers of people. Such a proposal, if successful, would allow people to type their email address into an OpenId login box, and from there the Relying Party (the server the user wants to log into) could find their homepage (usually the same as their OpenId page), and from there find their FOAF description (see "FOAF and OpenID").

Of course this user interface problem does not come up with foaf+ssl, because by using client side certificates, foaf+ssl does not require the user to remember his WebID. The browser does that for him - it's built in.

Nevertheless it is good that OpenId is creating the need for such a service. It is a good idea, and could be very useful even for foaf+ssl, but for different reasons: making it easy to help people find someone's foaf file from the email address could have many very neat applications, if only for enhancing email clients in interesting new ways.

Updates

It was remarked in the comments to this post that the format for /host-meta is now XRD. So that removes one criticism of the first proposal. I wonder how flexible XRD is now. Can it express everything RDF/XML can? Does it have a GRDDL mapping?

Thursday Nov 19, 2009

http://openid4.me/ -- OpenId ♥ foaf+ssl

OpenId4.me is the bridge between foaf+ssl and OpenId we have been waiting for.

OpenId and foaf+ssl have a lot in common:

  • They both allow one to log into a web site without requiring one to divulge a password to that web site
  • They both allow one to have a global identifier to log in, so that one does not need to create a username for each web site one wants to identify oneself at.
  • They also allow one to give more information to the site about oneself, automatically, without requiring one to type that information into the site all over again.

OpenId4.me allows a person with a foaf+ssl profile to automatically log in to the millions of web sites that enable authentication with OpenId. The really cool thing is that this person never has to set up an OpenId service. OpenId4.me does not even store any information about that person on its server: it uses all the information in the user's foaf profile and authenticates him with foaf+ssl. OpenId4.me does not yet implement attribute exchange I think, but it should be relatively easy to do (depending on how easy it is to hack the initial OpenId code, I suppose).

If you have a foaf+ssl cert (get one at foaf.me) and are logging into an openid 2 service, all you need to type in the OpenId box is openid4.me. This will then authenticate you using your foaf+ssl certificate, which works with most existing browsers without change!

If you then want to own your OpenId, then just add a little html to your home page. This is what I placed on http://bblfish.net/:

    <link rel="openid.server" href="http://openid4.me/index.php" />
    <link rel="openid2.provider openid.server" href="http://openid4.me/index.php"/>
    <link rel="meta" type="application/rdf+xml" title="FOAF" href="http://bblfish.net/people/henry/card%23me"/>

And that's it. Having done that you can then in the future change your openid provider very easily. You could even set up your own OpenId4.me server, as it is open source.

More info at OpenId4.me.

Thursday Oct 15, 2009

November 2nd: Join the Social Web Camp in Santa Clara

The W3C Social Web Incubator Group is organizing a free Bar Camp in the Santa Clara Sun Campus on November 2nd to foster a wide ranging discussion on the issues required to build the global Social Web.

Imagine a world where everybody could participate easily in a distributed yet secure social web. In such a world every individual would control their own information, and every business could enter into a conversation with customers, researchers, government agencies and partners as easily as they can now start a conversation with someone on Facebook. What is needed to go in the direction of The Internet of Subjects Manifesto? What existing technologies can we build on? What is missing? What could the W3C contribute? What could others do? To participate in the discussion, meet other people with similar interests, and push the conversation further, visit the Santa Clara Social Web Camp wiki.

If you are looking for a reason to be in the Bay Area that week, then here are some other events you can combine with coming to the Bar Camp:

  • The W3C is meeting in Santa Clara that week for its Technical Plenary.
  • The following day, the Internet Identity Workshop is taking place in Mountain View until the end of the week. Go there to push the discussion further by meeting up with the OpenId, OAuth and Liberty crowds - all technologies that can participate in the development of the Social Web.
  • You may also want to check out ApacheCon which is also taking place that week.

If you can't come to the west coast at all due to budget cuts, then not all is lost. :-) If you are on the East coast, go and participate in the ISWC Building Semantic Web Applications for Government tutorial, and watch my video on The Social Web, which I gave at the Free and Open Source Conference this summer. Think: if the government wants to play with Social Networks, it certainly cannot put all its citizens' information on Facebook.

Monday Oct 12, 2009

One month of Social Web talks in Paris

Poster for the Social Web Bar Camp @LaCantine

As I was in Berlin preparing to come to Paris, I wondered if I would be anywhere near as active in France as I had been in Germany. I had lived for 5 years in Fontainebleau, an hour from Paris, close but just too far to be in the swing of things. And from that position, I got very little feel for what was happening in the capital. This is what had made me long to live in Paris. So this was the occasion to test it out: I was going to spend one month in the capital. On my agenda there was just a Social Web Bar Camp and a few good contacts.

The Social Web Bar Camp at La Cantine, which I blogged about in detail, was like a powder keg for my stay here: it launched the whole next month of talks, which I detail below. It led me to make a very wide range of contacts, which in turn led to my giving talks at 2 major conferences, 2 universities and one other Bar Camp, presenting to a couple of companies, getting one implementation of foaf+ssl in Drupal, and meeting a lot of great people.

Through other contacts, I also had an interview with a journalist from Le Monde, and met the very interesting European citizen journalism agency Cafe Babel (for more on them see this article).

Here follows a short summary of each event I presented the Social Web at during my short stay in Paris.

Friday, 18 September 2009
Arrived by plane from Berlin, and met the journalists at the Paris offices of Cafe Babel, after reading an article on them in the July/August issue of Internationale Politik, "Europa aus Erster Hand".
Saturday, 19 September 2009
Went to the Social Web Bar Camp at La Cantine, which I blogged about in detail. Here I met many people, who connected me up with the right people in the Paris conference scene, where I was then able to present. A couple of opportunities did not work out due to calendar clashes, such as an attempted meeting with engineers and users of Elgg, a distributed open source social networking platform popular at universities in France and the UK.
Monday, 21 September 2009
Visited the offices of Le Monde, and had lunch with a journalist there. I explained my vision of the Social Web and the functioning of foaf+ssl. He told me he won't be writing about it directly, but will develop these ideas over time in a number of articles. (I'll post updates here, though it is sadly very difficult to link to articles in Le Monde: they change the URLs of their articles, put them behind a paywall after a period of time, and then don't even make an abstract available to non-paying readers.)
Friday, 25 September 2009
I visited the new offices of af83.com, a startup with a history: they participated in building the web site of Ségolène Royal, who ran against Nicolas Sarkozy in the last French presidential elections.
There I met up with Damien Tournoud, an expert Drupal developer, explained the basics of foaf+ssl, pointed him to the open source project foaf.me, and let him work on it. With a bit of help from Benjamin Nowack, the creator of the ARC2 Semantic Web library for PHP, Damien had a working implementation the next day. We waited a bit before announcing it the following Wednesday on the foaf-protocols mailing list.
Tuesday 29 September, 2009
La Cantine organised another Bar Camp, on a wide range of topics, which I blogged about in detail. There I met people from Google and Firefox, and reconnected with others. We also had a more open round table discussion on the Social Web.
Thursday 1st and Friday 2nd October, 2009
I visited the Open World Forum, which opened with, among other things, a track on the Semantic Desktop, "Envisioning the Open Desktop of the future", headed by Prof. Stefan Decker, with examples of implementations in the latest KDE (K Desktop Environment).
I met a lot of people here, including Eric Mahé, previously Technology Advisor at Sun Microsystems France. In fact I met so many people that I missed most of the talks. One really interesting presentation, by someone from a major open source code search engine, explained that close to 60% of open source software comes from Eastern and Western Europe combined. (Anyone with a link to the talk?)
Saturday, 3rd October 2009
I presented The Social Web in French at the Open Source Developer Conference France which took place in La Villette.
I was really happily surprised to find that I was part of a 3 hour track dedicated to the Semantic Web. This started with a talk by Oliver Berger, "Bugtracking sur le web sémantique". Oliver has been working on the Baetle ontology as part of the 2-year government-financed HELIOS project. This is something I talked about a couple of years ago and wrote about here in my presentation Connecting Software and People. It is really nice to see this evolving. I really look forward to seeing the first implementations :-)
Oliver's talk was followed by one from Jean-Marc Vanel on Software and Ontology Development, which introduced many of the key Semantic Web concepts.
Tuesday 6th October, morning
Milan Stankovitch, whom I had met at the European Semantic Web Conference and again at the Social Web Bar Camp, invited me to talk to the developers of hypios.com, a very interesting web platform to help problem seekers find problem solvers. The introductory video is really worth watching. I gave them the talk I keep presenting, but with a special focus on how this could, in the longer term, help them make it easier for people to join and use their system.
Tuesday 6th October, afternoon
I talked and participated in a couple of round table discussions at the 2nd Project Accelerator on Identity at the University of Paris 1, organised by the FING. Perhaps the most interesting talk there was the one by François Hodierne, who works for the Open Source Web Applications & Platforms company h6e.net, and who presented the excellent project La Distribution, whose aim is to make installing the most popular web applications as easy as installing an app on the iPhone. This is the type of software needed to make The Internet of Subjects Manifesto a reality. In a few clicks everyone should be able to get a domain name, install their favorite web software on it - Wordpress, mail, wikis, social network, photo publishing tool - and get on with their life, whilst owning their data, so that if they later find the need to move they can, and so that nobody can kick them off their network. This will require rewriting each of the applications a little, so as to enable them to work with the distributed secure Social Web made possible by foaf+ssl: an application without a social network is no longer very valuable.
Thursday 9th October, 2009
Pierre Antoine Champin from the CNRS, the French national research organisation, had invited me to Lyon to present The Social Web. So I took the TGV from Paris at 10:54 and was there 2 hours later - a journey of 464km (288.3 miles) by car according to Google Maps. The talk was very well attended, with close to 50 students showing up, and the session lasted two full hours: one hour of talk followed by many good questions.
After a chat and a few beers, I took the train back to Paris where the train arrived just after 10pm.
Saturday October 10, 2009
I gave a talk on the Social Web at Paris-Web, on the last day of a 3 day conference. This again went very well.
After lunch I attended two very good talks that complemented mine perfectly:
  • David Larlet had a great presentation on Data Portability, which sparked a very lively and interesting discussion. Issues of data ownership, security, confidentiality, and centralization versus decentralization came up. One of his slides made the point very well, by showing the number of Web 2.0 sites that no longer exist - some having disappeared by acquisition, others through simple technical meltdown - leaving the data of all their users lost forever. (Also see David's blog summary of Paris-Web.)
  • Right after coffee we had a great presentation on the Semantic Web by Fabien Gandon, who managed to give, in the limited time available to him, an overview of the Semantic Web stack from bottom to top, including OWL 1 and 2, Microformats, RDFa, and Linked Data, plus various very cool applications, from which even I learned a lot. His slides are available here. He certainly inspired a lot of people.
Tuesday, 13 October 2009
Finally I presented at the hacker space La suite Logique, which is housed in a very well organized, very low cost lodging space in Paris. They had presentations on a number of projects happening there:
  • One project is to build a grid by taking pieces from the remains of computers that people have brought them. They have a room stashed full of those.
  • Another project is to add wifi to the lighting, to remotely control the projectors for theatrical events taking place there.
  • There was some discussion on how to add sensors to dancers, as the Japanese artist Daito Manabe has done, in order to create a high tech butoh dance (see the great online videos).
  • Three engineers presented the robots they are constructing for a well known robot fighting competition.
Certainly a very interesting space to hang out in, meet other hackers, and get fun things done.
All of these talks were of course framed by some great evenings out, meeting people, and much more, which I just don't have time to write down here. Those were the highlights of my month's stay in Paris. I must admit I really had no idea it would be so active!

Wednesday Oct 07, 2009

Sketch of a RESTful photo Printing service with foaf+ssl

Let us imagine a future where you own your data. It's all on a server you control, under a domain name you own, hosted at home, in your garage, or on some cloud somewhere. Just as your OS gets updates, so your server software will be updated and patched automatically. The user interface for installing applications may be as easy as installing an app on the iPhone (as La Distribution is doing).

A few years back, with one click, you installed a myPhoto service, a distributed version of fotopedia. You have been uploading all your work, social, and personal photos there. These services have become really popular and all your friends are working the same way too. When your friends visit you, they are automatically and seamlessly recognized using foaf+ssl in one click. They can browse the photos you made with them, share interesting tidbits, and more... When you organize a party, you can put up a wiki where friends of your friends get write access, leave notes as to what they are going to bring, and say whether or not they are coming. Similarly your colleagues have access to your calendar schedule, your work documents and your business related photos. Your extended family, defined through linked data describing family relationships (every member of your family just needs to describe their relation to their close family network), can see photos of your family, see the videos of your new born baby, and organize Christmas reunions, as well as tag photos.

One day you wish to print a few photos. So you go to a web site we will provisionally call print.com. Print.com is neither a friend of yours, nor a colleague, nor family. It is just a company, and so it gets minimal access to the content on your web server: it can't see your photos, and all it may know of you is a nickname you like to use, and perhaps an icon you like. So how are you going to allow print.com access to the photos you wish to print? This is what I would like to sketch a solution for here. It should be very simple, RESTful, and work in a distributed and decentralized environment, where everyone owns and controls their data, and is security conscious.

Before looking at the interactions detailed in the UML sequence diagram below, let me describe the user experience at a general level.

  1. You go to the print.com site after clicking on a link a friend of yours suggested on a blog. On the home web page is a button you can click to add your photos.
  2. You click it, and your browser asks you which WebID you wish to use to identify yourself. You choose your personal ID, as you wish to print some personal photos of yours. Having done that, you are authenticated, and print.com welcomes you using your nickname and displays your icon on the resulting page.
  3. When you click a button that says "Give Print.com access to the pictures you wish us to print", a new frame is opened on your web site
  4. This frame displays a page from your server, where you are already logged in. The page recognizes you and asks if you want to give print.com access to some of your content. It gives you information about print.com's current stock value on NASDAQ, and recent news stories about the company. There is a link to more information, which you don't bother exploring right now.
  5. You agree to give Print.com access, but only for 1 hour.
  6. When your web site asks you which content you want to give it access to, you select the pictures you would like it to have. Your server knows how to do content negotiation, so even though copying each one of the pictures over is feasible, you'd rather give print.com access to the photos directly, and let the two servers negotiate the best representation to use.
  7. Having done that you drag and drop an icon representing the set of photos you chose from this frame to a printing icon on the print.com frame.
  8. Print.com thanks you, shows you icons of the pictures you wish to print, and tells you that the photos will be on their way to the address of your choosing within 2 hours.

In more detail then we have the following interactions:

  1. Your browser GETs print.com's home page, which returns a page with a "publish my photos" button.
  2. You click the button, which starts the foaf+ssl handshake. The initial ssl connection requests a client certificate, which leads your browser to ask for your WebID in a nice popup as the iPhone can currently do. Print.com then dereferences your WebId in (2a) to verify that the public key in the certificate is indeed correct. Your WebId (Joe's foaf file) contains information about you, your public keys, and a relation to your contact addition service. Perhaps something like the following:
    :me xxx:contactRegistration </addContact> .
    Print.com uses this information when it creates the resulting html page to point you to your server.
  3. When you click the "Give Print.com access to the pictures you wish us to print" you are sending a POST form to the <addContact> resource on your server, with the WebId of Print.com <https://nasdaq.com/co/PRNT#co> in the body of the POST. The results of this POST are displayed in a new frame.
  4. Your web server dereferences Print.com's WebId, getting some information about the company from the NASDAQ URL. Your server puts this information together (4a) in the html it returns to you, asking what kind of access you want to give this company, and for how long.
  5. You give print.com access for 1 hour by filling in the forms.
  6. You give access rights to Print.com to your individual pictures using the excellent user interface available to you on your server.
  7. When you drag and drop the resulting icon depicting the collection of the photos accessible to Print.com, onto its "Print" icon in the other frame - which is possible with html5 - your browser sends off a request to the printing server with that URL.
  8. Print.com dereferences that URL which is a collection of photos it now has access to, and which it downloads one by one. Print.com had access to the photos on your server after having been authenticated with its WebId using foaf+ssl. (note: your server did not need to GET print.com's foaf file, as it still had a fresh version in its cache). Print.com builds small icons of your photos, which it puts up on its server, and then links to in the resulting html before showing you the result. You can click on those previews to get an idea what you will get printed.

So all the above requires very little in addition to foaf+ssl. Just one relation, to point to a contact-addition POST endpoint. The rest is just good user interface design.
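To make the access-giving step (5 and 6 above) a little more concrete, your server could record the grant with something like the following Turtle, using the W3C Web Access Control vocabulary (the ex:expires property is a hypothetical addition for the 1 hour time limit, and the photo collection URL is made up):

@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix ex:  <http://example.org/terms#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] a acl:Authorization ;
   acl:agent <https://nasdaq.com/co/PRNT#co> ;        # print.com's WebId
   acl:mode acl:Read ;
   acl:accessTo <https://joe.example/photos/setA/> ;  # the chosen photos
   ex:expires "2009-10-07T15:00:00Z"^^xsd:dateTime .  # the 1 hour limit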

What do you think? Have I forgotten something obvious here? Is there something that won't work? Comment on this here, or on the foaf-protocols mailing list.

Notes

print.com sequence diagram by Henry Story is licensed under a Creative Commons Attribution 3.0 United States License.
Based on a work at blogs.sun.com.

Thursday Jun 11, 2009

The foaf+ssl world tour

As you can see from the map here I have been cycling from Fontainebleau to Vienna (covering close to 1000km of road), and now around Cyprus in my spare time. At various points along my journey I have had the occasion to present foaf+ssl and combine it with a hands-on session, where members of the audience were encouraged to create their own foaf file and certificates, and also to start looking into what it takes to develop foaf+ssl enabled services. This seems like a very good way to proceed: it gives people some hands-on experience which they can then hopefully pass on to others, it helps me prioritize what needs to be done next, and it should also lead to the development of foaf+ssl services that will increase the network value of the community, creating, I hope, a viral effect.

I started this cycle tour in order to lose some weight. I still have 10kg or so to lose, which at the rate of 3kg per 1000km will require me to cycle another 3000km or so. So that should enable me to visit quite a few places yet. I will be flying back to Vienna where I will stay 10 days or so, after which I will cycle to Prague for a Kiwi meeting on the 3rd of July. After that I could cycle on to Berlin. But really it's up to you to decide: if you know a good hacker group that I can present to and cycle to, let me know, and I'll see how I can fit it into my timetable. So please get in contact! :-)

Wednesday May 20, 2009

You are a Terrorist!

Every country in Europe seems to be on the verge of introducing extremely powerful legislation for state monitoring of the internet, bringing us a lot closer to the dystopia described in George Orwell's novel Nineteen Eighty-Four. Under the guise of laws to help combat terrorism or pedophilia - emotional subjects that immediately get everybody's unthinking assent - massive powers are to be given to the state, powers which could very easily be misused. As internauts we all need to make it our duty to follow these debates very closely, and to participate actively in them, if we do not want to find ourselves waking up one morning in a world that is the exact opposite of what we have been dreaming of.

Germany

In Germany a new Data Retention law, passed it seems already in 2008, allows the state (quote):

to trace who has contacted whom via telephone, mobile phone or e-mail for a period of six months. In the case of mobile calls or text messages via mobile phone, the user's location is also logged. Anonymising services will be prohibited as of 2009.

To increase awareness of this law Alexander Lehmann put together this excellent presentation, with English subtitles, Du bist Terrorist!:

Du bist Terrorist (You are a Terrorist) english subtitles from lexela on Vimeo.

France

The passage of the hadopi law in France will create a strong incentive for citizens to place state-built snooper software on each of their computers, in order to make it possible to defend themselves against accusations of copyright infringement. But that is nothing compared to the incredibly broad powers the state wishes to give itself with the Loppsi 2 law (detailed article in Le Monde, and Ars Technica), which would give the president the power to insert spyware onto users' computers (which could record anything being done, of course), create a very large database of people's activities, help link information from various databases, and much more... The recent case of the sacking of the web site director of the once national, now private, TF1 television channel for having communicated his doubts on Hadopi privately to his Member of Parliament - as reported on Slashdot recently - does not give one much faith in the way privacy is currently being handled by the government.

The United Kingdom

In the UK the Home Secretary Jacqui Smith had proposed to create a database dubbed Big Brother to log every single activity of every one of its citizens - in order, of course, to root out the very 21st-century crimes of pedophilia and terrorism (did the IRA not operate before the internet? Are pedophile rings something that only emerged with the internet, or is it that they just became more visible?). She had to pull back somewhat from the initial proposal, and now wishes all that information still to be tracked, but only to be kept in the service providers' databases, as reported by the Daily Mail, The Telegraph, The Independent...

Conclusion

So are we now all suspected terrorists, pornographers, pedophiles, murderers, subversives, ... that governments must know all about us? We may have voted for the current government and have complete faith in their use of these tools. But what happens when the opposition comes in and takes hold of those same powers? Will we be as comfortable then? The excellent 2006 film The Lives of Others shows just how intrusive the East German state was on its own citizens during the cold war - and that with the very limited tools it had available. With modern computing tools, that type of spy operation could be done at a much, much lower cost, and so perhaps even be viable for the state.

If you feel things just can't go this wrong, then I would also recommend watching Julie Taymor's adaptation of Shakespeare's Titus Andronicus. It really is important to realize that things can go badly, very very badly wrong. Ignoring a problem, not taking responsibility for fighting it, will lead to disaster, as the current economic crisis - predicted years before it occurred, but without any action being taken - should have amply proven by now. Sadly for people who predict danger, if people do act on the danger and avoid it, nobody may even notice how close to danger they really were. So our actions may remain unsung. But at least we may put some chances on our side not to wake up in a new form of dictatorship, worse than any ever dreamed of by those who helped forge our democracies.

Thursday May 14, 2009

FOAF+SSL: RESTful Authentication for the Social Web

The European Semantic Web Conference (ESWC) will be held in Heraklion on the Island of Crete in Greece from 31 May to 4 June. I will be presenting the paper "FOAF+SSL: RESTful Authentication for the Social Web" which I co-authored with Bruno Harbulot, Ian Jacobi and Mike Jones. Here is the abstract:

We describe a simple protocol for RESTful authentication, using widely deployed technologies such as HTTP, SSL/TLS and Semantic Web vocabularies. This protocol can be used for one-click sign-on to web sites using existing browsers — requiring the user to enter neither an identifier nor a password. Upon this, distributed, open yet secure social networks and applications can be built. After summarizing each of these technologies and how they come together in FOAF+SSL, we describe declaratively the reasoning of a server in its authentication decision. Finally, we compare this protocol to others in the same space.

The paper was accepted by the Trust and Privacy on the Social and Semantic Web track of the ESWC. There are quite a number of interesting papers there.

I have never been to Greece, so I have a feeling I will really enjoy this trip. Hope to see many of you there.

Tuesday May 12, 2009

A Simple foaf+ssl Identity Provider (IdP)

In order to help people get started with foaf+ssl, we have put together a very simple Identity Provider service (IdP). This removes the need for web services to deal with setting up https certificates or to change much in their current web setup. With a few lines of server side code any server can now easily find the WebId of a user, and try out some interesting ideas at little cost. If the experiment is useful, a business case can then be made for integrating a full foaf+ssl stack for extra security and reliability.

The protocol is very much as we outlined in an earlier post entitled "Sketch of a foaf+ssl+openid service". The details of the API are listed directly at the root of the first foaf+ssl IdP service, available here: https://foafssl.org/srv/idp. All the Service Provider - that is, the consumer of the IdP - needs to do is add a login button or link to his web page that points to the above IdP, with an authreqissuer=$url parameter pointing back to a CGI controlled by the Service Provider that can parse the redirect containing the user's WebId. That URL comes with a timestamp to avoid replay attacks, and is signed to ensure authenticity.
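For example, the login link could look something like this (a sketch - the callback URL and its path are of course up to the Service Provider):

     <a href="https://foafssl.org/srv/idp?authreqissuer=https://example.com/srv/webidCallback">
         Log in with your WebId
     </a>

On success, the IdP redirects the browser to the given authreqissuer URL with the user's WebId, the timestamp and the signature in the query parameters.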

Bruno Harbulot wrote the code and published it under a BSD licence by the University of Manchester where he studies. The code is available on the So(m)mer Subversion repository. You can download it with:

$ svn checkout https://sommer.dev.java.net/svn/sommer/foafssl/trunk foafssl --username guest
and start your own IdP if you want. Please feel free to contribute back improvements, or ping us about missing features.

Update September 14, 2009

The IdP is now RDFa enabled, using Damian Steer's RDFa parser for Jena, which I ported to Sesame. The war file can be downloaded directly from the dev.java.net Maven repository. To set up your own IdP, use that war and follow the foaf+ssl setup instructions for Tomcat. This war may only work with Tomcat 7.

Tuesday Apr 07, 2009

Sun Initiates Social Web Interest Group

I am very pleased to announce that Sun Microsystems is one of the initiating members of the Social Web Incubator Group launched at the W3C.

Quoting from the Charter:

The mission of the Social Web Incubator Group, part of the Incubator Activity, is to understand the systems and technologies that permit the description and identification of people, groups, organizations, and user-generated content in extensible and privacy-respecting ways.

The topics covered with regards to the emerging Social Web include, but are not limited to: accessibility, internationalization, portability, distributed architecture, privacy, trust, business metrics and practices, user experience, and contextual data. The scope includes issues such as widget platforms (such as OpenSocial, Facebook and W3C Widgets), as well as other user-facing technology, such as OpenID and OAuth, and mobile access to social networking services. The group is concerned also with the extensibility of Social Web descriptive schemas, so that the ability of Web users to describe themselves and their interests is not limited by the imagination of software engineers or Web site creators. Some of these technologies are independent projects, some were standardized at the IETF, W3C or elsewhere, and users of the Web shouldn't have to care. The purpose of this group is to provide a lightweight environment designed to foster and report on collaborations within the Social Web-related industry or outside which may, in due time affect the growth and usability of the Social Web, rather than to create new technology.

I am glad we are supporting this along with the other prestigious initiating members.

This should certainly help create a very interesting forum for discussing what I believe is one of the most important issues on the web today.

Thursday Feb 12, 2009

sketch of a foaf+ssl+openid service

While discussing foaf+ssl with Melvin Carvalho, he pointed out that we need a service to help non-https-enabled servers participate in our distributed, open, secure social network. This discussion led me to sketch out the following simple protocol, where I make use of parts of the OpenId protocol at key points. The result does what OpenId does, but without the need for users to remember their URL, and so without many of the problems that plague that protocol. And all this with minimal protocol invention.

So first here is the UML sequence diagram for what I am calling here tentatively foaf+ssl+openid.

  1. First Romeo arrives on a public page with a login button.
    • On an OpenId server there would be a field for the user to enter their ID; with foaf+ssl this is not needed, so we have a simple login button.
    • That button's action attribute points to some foaf+ssl+openid service that the server trusts (it is therefore an https URL). It can be any such service. In OpenId, the Id entered by the user points the server to a web page that points the service to an openid server the user (Romeo here) trusts. All of this is no longer needed with this protocol. The html for the login button can be static.
    • The URL has to encode information for the foaf+ssl service to know who to contact back. One should use exactly the same URL format here as OpenId does (minus the need to encode the user's URL, since that will be in the X.509 certificate).
  2. When Romeo clicks the login button he opens an https request to the foaf+ssl+openid service.
  3. The foaf+ssl+openid service, on opening the connection, asks for the client's certificate after sending its own. The client's certificate would contain:
    • The User's Public key
            Subject Public Key Info:
                  Public Key Algorithm: rsaEncryption
                  RSA Public Key: (1024 bit)
                      Modulus (1024 bit):
                          00:b6:bd:6c:e1:a5:ef:51:aa:a6:97:52:c6:af:2e:
                          71:94:8a:b6:da:9e:5a:5f:08:6d:ba:75:48:d8:b8:
                          01:50:d3:92:11:7d:90:13:89:48:06:2e:ec:6e:cb:
                          57:45:a4:54:91:ee:a0:3a:46:b0:a1:c2:e6:32:4d:
                          54:14:4f:42:cd:aa:05:ca:39:93:9e:b9:73:08:6c:
                          fe:dc:8e:31:64:1c:f7:f2:9a:bc:58:31:0d:cb:8e:
                          56:d9:e6:da:e2:23:3a:31:71:67:74:d1:eb:32:ce:
                          d1:52:08:4c:fb:86:0f:b8:cb:52:98:a3:c0:27:01:
                          45:c5:d8:78:f0:7f:64:17:af
                      Exponent: 65537 (0x10001)
      
    • The Subject's Alternative Name WebId
              X509v3 extensions:
                 ...
                 X509v3 Subject Alternative Name: 
                                 URI:http://romeo.net/#romeo
      
    This is all part of the standard TLS handshake, available in most browsers for a few generations now.
  4. The server looks in the client certificate for the Subject Alternative Name in the X509v3 extensions, and fetches the foaf file at that URL.
  5. The service then does a simple match on the information from the foaf file and the information from the certificate. If they match, the foaf+ssl+openid service knows that the user <http://romeo.net/#romeo> controls the <http://romeo.net/> web page. This is enough for simple authentication; a sketch of this check as a SPARQL query appears after this list. (For more on this see Creating a Web of Trust without Key Signing Parties.)
  6. Depending on the result, the foaf+ssl+openid service can return a redirect with an authentication token to the original service Romeo wanted to log into. This can also be done using the patterns developed in the OpenId community.
  7. The browser then redirects to the Original service.
  8. The service now has Romeo's URL. But to avoid man-in-the-middle or replay attacks, it follows the OpenId protocol and does a little check with the foaf+ssl+openid service on a token sent to it in the redirect in step 6.
    ((Perhaps this step could be avoided if the foaf+ssl+openid service made its public key public, and encrypted some token passed via the client to the server. But we could just stick closely to the well trodden OpenId path and reuse their libraries.))
  9. Having verified the identity of the user, the service could optionally GET the user's foaf file, for public information about him.
  10. Or it could check the relation that user has to its trusted graph of friends,
  11. and return a personalised resource.
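The match in step 5 can be expressed as a simple SPARQL ASK query against Romeo's foaf file - a sketch, using the cert and rsa ontologies that foaf+ssl relies on (the modulus is abbreviated here):

PREFIX cert: <http://www.w3.org/ns/auth/cert#>
PREFIX rsa:  <http://www.w3.org/ns/auth/rsa#>

# does the foaf file claim the public key presented in the certificate?
ASK {
   ?key a rsa:RSAPublicKey ;
        cert:identity <http://romeo.net/#romeo> ;
        rsa:modulus [ cert:hex "b6bd6ce1a5ef51aa..." ] ;
        rsa:public_exponent [ cert:decimal "65537" ] .
}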

One could also imagine a foaf+ssl+openid server enabled with attribute exchange functionality, which it could get access to simply by reading the foaf file.

I am not sure how much of a problem it really is for servers not to have SSL access. But this could easily fill that gap.

Saturday Dec 13, 2008

Typealizer: analyzing your personality through your blog

Thanks to Mark Dixon I discovered Typealizer, a service that reads your blog and finds your psychological type. So of course I tried it on my own blog, as you will on yours shortly :-). This is what it had to say:

INTJ - The Scientists

The long-range thinking and individualistic type. They are especially good at looking at almost anything and figuring out a way of improving it - often with a highly creative and imaginative touch. They are intellectually curious and daring, but might be pshysically hesitant to try new things.

The Scientists enjoy theoretical work that allows them to use their strong minds and bold creativity. Since they tend to be so abstract and theoretical in their communication they often have a problem communcating their visions to other people and need to learn patience and use conrete examples. Since they are extremly good at concentrating they often have no trouble working alone.

Well, that's not bad for flattery. So I reward them with this blog post.

They accompany their analysis with a brain activity diagram. This is the one I got:

Brain activity diagram for main blog

There is a lot in the cross section of intuition and thinking, with some, but not a lot of, positioning in the practical. So, being all happily scientific, I decided to try out what it would say if I pointed Typealizer at the Travel category of this blog. This is what it had to say on that aspect of my personality - perhaps it is true, a little in retreat recently:

ESTP - The Doers

The active and play-ful type. They are especially attuned to people and things around them and often full of energy, talking, joking and engaging in physical out-door activities.

The Doers are happiest with action-filled work which craves their full attention and focus. They might be very impulsive and more keen on starting something new than following it through. They might have a problem with sitting still or remaining inactive for any period of time.

This also came with a brain activity diagram for that part of the blog.

So clearly a lot more biased towards action, as a travel blog should be.

Still, both of these parts of my blog fail to capture around half of my brain activity: the spiritual, idealistic side is not very visible. I wonder if that means I should speak more about open source and linux? ;-) I tried the Art category of my blog, but that did not move me more towards the feeling type, nor did the philosophy section make me more idealistic - just, again, more of a thinker, which they characterise like this:

INTP - The Thinkers

The logical and analytical type. They are espescially attuned to difficult creative and intellectual challenges and always look for something more complex to dig into. They are great at finding subtle connections between things and imagine far-reaching implications.

They enjoy working with complex things using a lot of concepts and imaginative models of reality. Since they are not very good at seeing and understanding the needs of other people, they might come across as arrogant, impatient and insensitive to people that need some time to understand what they are talking about.

Now what could be interesting would be some way to do the inverse search: find out what your brain's activity diagram should look like, and ask for blogs that fit those categories, which one could then use as a guide to help develop that aspect of one's personality - or find a partner :-)

PS. A thought: after categorizing people into 16 different groups, this still leaves you with 8 billion people / 16 = 500 million people to choose from, and if every person had just 1000 web pages, that would leave you with half a trillion pages to look at. So this character analysis can be useful, but there still have to be a lot of other criteria to make a good judgement call.

PPS. Oddly enough - or not - Ken Wilber's blog is categorised as being of the "executive type".

Sunday Nov 30, 2008

variation on @timoreilly: hyperdata is the new intel outside

Context: Tim O'Reilly said "Data is the new Intel Inside".

Recently he wrote in a post entitled "Why I love Twitter":

What's different, of course, is that Twitter isn't just a protocol. It's also a database. And that's the old secret of Web 2.0, Data is the Intel Inside. That means that they can let go of controlling the interface. The more other people build on Twitter, the better their position becomes.

The meme was launched in the well known "What is Web 2.0" paper, in the section entitled "Data is the next Intel Inside":

Applications are increasingly data-driven. Therefore: For competitive advantage, seek to own a unique, hard-to-recreate source of data.

Most of the data is outside your database. It can only be that way: the world is huge, and you are just one small link in the human chain. Linking that data is knowledge and value creation. Hyperdata is the foundation of Web 3.0.

Tuesday Nov 11, 2008

REST APIs must be hypertext driven

Roy Fielding recently wrote in "REST APIs must be hypertext-driven"

I am getting frustrated by the number of people calling any HTTP-based interface a REST API. Today's example is the SocialSite REST API. That is RPC. It screams RPC. There is so much coupling on display that it should be given an X rating.

That was pretty much my thought when I saw that spec. In a comment to his post he continues:

The OpenSocial RESTful protocol is not RESTful. It could be made so with some relatively small changes, but right now it is just wrapping RPC results in common Web media types.

Clarification of Roy's points

Roy then goes on to list some key criteria for what makes an application RESTful.

  • REST API should not be dependent on any single communication protocol, though its successful mapping to a given protocol may be dependent on the availability of metadata, choice of methods, etc. In general, any protocol element that uses a URI for identification must allow any URI scheme to be used for the sake of that identification.

    In section 2.2 of the O.S. protocol we have the following JSON representation for a Person.

    {
        "id" : "example.org:34KJDCSKJN2HHF0DW20394",
        "displayName" : "Janey",
        "name" : {"unstructured" : "Jane Doe"},
        "gender" : "female"
    }
    

    Note that the id is not a URI. Further down, in the XML version of the above JSON, it is made clear that by prepending "urn:guid:" you can turn this string into a URI. By doing this the protocol has in essence tied itself to one URI scheme, since there is no way of expressing another URI type in the JSON (the JSON being the key representation in this JavaScript-specific API, the aim of the exercise being to make the writing of social network widgets interoperable). Furthermore this scheme has some serious limitations: it limits one to one social network per internet domain, it is tied to the quite controversial XRI spec that has been rejected by OASIS, and it provides no clear mechanism for retrieving information about the thing named. But that is not the main point. The definition of the format is tying itself unnecessarily to a URI scheme, and moreover one that commits it to what is clearly a client/server model. A sketch of the alternative follows.
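
    By way of contrast, here is a minimal sketch - the URI is made up - of the same person identified by a plain dereferenceable URI: any URI scheme would do, and the identifier doubles as a place to GET further data.

    @prefix foaf: <http://xmlns.com/foaf/0.1/> .

    # hypothetical: any URI the social network cares to mint would do
    <http://example.org/people/34KJDCSKJN2HHF0DW20394#me> a foaf:Person;
        foaf:nick "Janey";
        foaf:name "Jane Doe";
        foaf:gender "female" .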

  • A REST API should not contain any changes to the communication protocols aside from filling-out or fixing the details of underspecified bits of standard protocols, such as HTTP's PATCH method or Link header field.
  • A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

    Most of these so-called RESTful APIs spend a huge amount of time specifying what response a certain resource should give to a certain message. Note for example section 2.1, entitled Responses.

  • A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC's functional coupling].

    In section 6.3 one sees this example:

    /activities/{guid}/@self                -- Collection of activities generated by given user
    /activities/{guid}/@self/{appid}        -- Collection of activities generated by an app for a given user
    /activities/{guid}/@friends             -- Collection of activities for friends of the given user {guid}
    /activities/{guid}/@friends/{appid}     -- Collection of activities generated by an app for friends of the given user {guid}
    /activities/{guid}/{groupid}            -- Collection of activities for people in group {groupid} belonging to given user {uid}
    /activities/{guid}/{groupid}/{appid}    -- Collection of activities generated by an app for people in group {groupid} belonging to given user {uid}
    /activities/{guid}/@self/{appid}/{activityid}   -- Individual activity resource; usually discovered from collection
    /activities/@supportedFields            -- Returns all of the fields that the container supports on activity objects as an array in json and a repeated list in atom.
    

    For some reason this protocol seems to require a very precise layout of the URL patterns. Now it is true that this layout is then meant to be specified in an XRDS document. But that document is not linked to from any of the representations, as far as I can see. So some "out of band" information exchange has taken place, on which the rest of the protocol relies. Furthermore it ties the whole service once again to one server. How open is a service which ties you to one server? The sketch below shows the hypermedia alternative.
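
    By contrast, a hedged sketch (the relation and the URIs are made up) of the hypermedia way: the client discovers the collection by following a typed link found in a representation, and the server remains free to mint whatever URIs it likes:

    @prefix todo: <http://eg.org/todo#> .   # hypothetical link relation, not part of OpenSocial

    <http://any.example/people/janey#me>
        todo:activityStream <http://any.example/f/a83Hq2> .   # no URI pattern for the client to know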

  • A REST API should never have "typed" resources that are significant to the client. Specification authors may use resource types for describing server implementation behind the interface, but those types must be irrelevant and invisible to the client. The only types that are significant to a client are the current representation's media type and standardized relation names. [ditto]

    Now clearly one does want URIs to name resources - things - and these things have types. I think Roy is here warning against the danger of placing expectations on types that depend on the resources themselves. This seems tied to the previous point: one should not rely on fixed resource names or hierarchies, as we saw above. To see how this is possible, check out my foaf file:

    
    $ cwm http://bblfish.net/people/henry/card --ntriples | grep knows | head
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://axel.deri.ie/~axepol/foaf.rdf#me> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://b4mad.net/FOAF/goern.rdf#goern> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://bigasterisk.com/foaf.rdf#drewp> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://crschmidt.net/foaf.rdf#crschmidt> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://danbri.org/foaf.rdf#danbri> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://data.boab.info/david/foaf.rdf#me> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://davelevy.info/foaf.rdf#me> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://dblp.l3s.de/d2r/page/authors/Christian_Bizer> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://dbpedia.org/resource/James_Gosling> .
        <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://dbpedia.org/resource/Roy_Fielding> .
    

    Notice that there is no pattern in the URIs on the right. (As it happens there are no ftp URLs there, but it would work just as well if there were.) Yet the Tabulator extension for Firefox knows from the relations above alone that (if it believes my foaf file, of course) the URIs on the right refer to people. This is because the foaf:knows relation is defined as:

    
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .

    foaf:knows a rdf:Property, owl:ObjectProperty;
        rdfs:comment "A person known by this person (indicating some level of reciprocated interaction between the parties).";
        rdfs:domain foaf:Person;
        rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/>;
        rdfs:label "knows";
        rdfs:range foaf:Person .
    

    This information can then be used by a reasoner (such as the JavaScript one in the Tabulator) to deduce that the resources pointed to by the URIs on either side of the foaf:knows relation are members of the foaf:Person class.
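
    In N3, the two inferences at work can themselves be written down as rules. A minimal sketch of what such a reasoner applies (these correspond to the standard RDFS domain and range entailments):

    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # whatever appears as subject of p falls in p's domain class;
    # whatever appears as object of p falls in p's range class
    { ?p rdfs:domain ?C . ?x ?p ?o . } => { ?x a ?C . } .
    { ?p rdfs:range  ?C . ?x ?p ?o . } => { ?o a ?C . } .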

    Note also that there is no knowledge as to how those resources are served. In many cases they may be served by simple web servers sending resources back. In other cases the RDF may be generated by a script. Perhaps the resources could be generated by Java objects served up by Jersey. The point is that the Tabulator does not need to know.

    Furthermore, the ontology information above is not out of band: it is GETable at the foaf:knows URI itself. The name of the relation links to the information about the relation, which gives us enough to deduce further facts. This is hypertext - hyperdata in this case - at its best. Compare that with the JSON example given above: there is no way to tell what that JSON means outside the context of the totally misnamed 'OpenSocial RESTful API'. This is a limitation of JSON, or at least of this namespace-less version of it. One would have to add a mime type to the JSON to make clear that it is to be interpreted in a particular manner for this application, but I doubt most JSON tools would know what to do with mime-typed JSON versions. And do you really want to go through a mime type registration process every time a social networking application wants to add a new feature or interact with new types of data?

    As Roy summarizes in one of the replies to his blog post:

    When representations are provided in hypertext form with typed relations (using microformats of HTML, RDF in N3 or XML, or even SVG), then automated agents can traverse these applications almost as well as any human. There are plenty of examples in the linked data communities. More important to me is that the same design reflects good human-Web design, and thus we can design the protocols to support both machine and human-driven applications by following the same architectural style.

    To get a feel for this, it really helps to play with hyperdata applications other than those residing in web browsers. The Semantic Address Book, which I spent some time writing, is one such.

  • A REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API). From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations or implied by the user's manipulation of those representations. The transitions may be determined (or limited by) the client's knowledge of media types and resource communication mechanisms, both of which may be improved on-the-fly (e.g., code-on-demand). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

    That is the out-of-band point made previously, and it confirms the warning about protocols that depend on URI patterns or on resources that are somehow typed at the protocol level. You should be able to pick up a URI and just go from there. With the Tabulator plugin you can in fact do just that on any of the URLs listed in my foaf file, or in any other RDF.

What's the point?

Engineers under the spell of the client/server architecture will find some of this very counter-intuitive. This is indeed why Roy's thesis - and the work done by the people who engineered the web before it, whose wisdom is distilled in various writings of the Technical Architecture Group - was something exceedingly original. These very simple principles, which can feel unintuitive to someone not used to thinking at a global information scale, make a lot of sense once you do think at that level.

When you build such an open system, one that allows people to access information globally, you want to be able to send anyone a URI for any resource you are working with, so that both of you can speak about the same resource. What that resource is about should be discoverable by GETting the URL. If the meaning of that URL depends on the way you accessed it, then you can no longer just send one URL; you have to send 8 or 9 URLs, with explanations of how to jump from one representation to the other. If out-of-band information is needed - if one has to inspect the URL itself to understand what it is about - then you are not setting up an open protocol but a secret one. Secret protocols may indeed be very useful in some circumstances, and so, as Roy points out, may non-RESTful ones:

That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. That’s fine with me as long as you don’t call the result a REST API. I have no problem with systems that are true to their own architectural style.
But note: it is then much more difficult to make use of the network effect: the value of information grows exponentially with its ability to be linked to other information. In another reply to a comment, Roy puts this very succinctly:
encoding knowledge within clients and servers of the other side’s implementation mechanism is what we are trying to avoid.

Monday Nov 10, 2008

Possible Worlds and the Web

Tim Berners-Lee, pressed to define his creation, said recently (from memory): "...my short definition is that the web is a mapping from URIs onto meaning".

Meaning is defined in terms of possible interpretations of sentences, also known as possible worlds. Possible worlds, under the guise of the 5th and higher dimensions, are fundamental components of contemporary physics. When logic and physics meet we are in the realm of metaphysics. To find these two meeting in the basic architecture of the web should give anyone pause for thought.

The following extract from the RDF Semantics spec is a good starting point:

The basic intuition of model-theoretic semantics is that asserting a sentence makes a claim about the world: it is another way of saying that the world is, in fact, so arranged as to be an interpretation which makes the sentence true. In other words, an assertion amounts to stating a constraint on the possible ways the world might be. Notice that there is no presumption here that any assertion contains enough information to specify a single unique interpretation. It is usually impossible to assert enough in any language to completely constrain the interpretations to a single possible world, so there is no such thing as 'the' unique interpretation of an RDF graph. In general, the larger an RDF graph is - the more it says about the world - then the smaller the set of interpretations that an assertion of the graph allows to be true - the fewer the ways the world could be, while making the asserted graph true of it.

A few examples may help here. Take the sentence "Barack Obama is the 44th president of the U.S.A.". There are many, many ways the world - the universe, the complete four-dimensional space-time continuum from the beginning of the universe to its end, if there is one - could be and that sentence still be true. For example, I could have not bothered to write this article now; I could have written it a little later, or perhaps not at all. There is a world in which you did not read it. There is a world in which I went out this morning to get a baguette from one of the many delicious local French bakeries. The world could be all these ways, and yet Barack Obama still be the 44th president of the United States.

In N3 we speak about the meaning of a sentence by quoting it with '{' '}'. So for our example we can write:

@prefix : <#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .

{ dbpedia:Barack_Obama a dbpedia:President_of_the_United_States . } = :g1 .

:g1 is the set of all possible worlds in which Obama is president of the USA. The only worlds that are not part of that set are the worlds where Obama is not president, but, say, McCain or Sarah Palin is. That McCain might have become president of the United States is quite conceivable. Both those meanings are understandable, and we can speak about both of them:

@prefix : <#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .

{ dbpedia:Barack_Obama a dbpedia:President_of_the_United_States . } = :g1 .
{ dbpedia:John_McCain a dbpedia:President_of_the_United_States . } = :g2 .
:g1 :hopedBy :george .
:g2 :fearedBy :george .
:g1 :fearedBy :jane .

I.e. we can say that George hopes that Barack Obama is the 44th president of the United States, but that Jane fears it.

Assume wikipedia had a resource for each member of the list of presidents of the USA, and that we were pointing to the 44th element above. Then even though we can speak about :g1 and :g2, there is no world that fits them both: the intersection of :g1 and :g2 is { }, the empty set, whose extension, according to David Lewis' book on Mereology, is the fusion of absolutely all possibilities - the thing that is everything, everywhere and around at all times. I.e. you make no distinction when you say that: you don't say anything.

The definition of meaning in terms of possible worlds makes a few things very simple to explain, implication being one of them. If every president has to be human, then:


@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

{ dbpedia:Barack_Obama a dbpedia:President_of_the_United_States . } log:implies { dbpedia:Barack_Obama a dbpedia:Human . } .

I.e. the set of possible worlds in which Obama is president of the United States is a subset of the set of worlds in which he is human. There are worlds, after all, where Barack is just living a normal lawyer's life.
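
In possible-world terms this is just the subset relation. Writing $W(G)$ for the set of worlds in which a graph $G$ is true (my notation, not the spec's):

\[ G_1 \ \text{log:implies}\ G_2 \iff W(G_1) \subseteq W(G_2) \]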

So what is this mapping from URIs to meaning that Tim Berners-Lee is talking about? I interpret him as speaking of the log:semantics relation.


@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

log:semantics a rdf:Property;
         rdfs:label "semantics";
         rdfs:comment """The log:semantics of a document is the formula
          achieved by parsing the representation of the document.
          For a document in Notation3, log:semantics is the
          log:parsedAsN3 of the log:contents of the document.
          For a document in RDF/XML, it is parsed according to the
          RDF/XML specification to yield an RDF formula [snip]""";
         rdfs:domain foaf:Document;
         rdfs:range log:Formula .

Of course it is easier to automate the mapping for resources that return RDF-based representations, but log:semantics can be applied to any document. Any web page, even one written in a natural language, has some semantics. It is just that such pages currently require very advanced wetware processors to interpret them. These can indeed be very specialised wetware processors, such as those one meets at airports.
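
As a minimal sketch of the relation in use (the rule is mine, written for cwm; log:includes comes from the same swap/log vocabulary):

@prefix log: <http://www.w3.org/2000/10/swap/log#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# if parsing a document yields a formula that includes a foaf:knows statement,
# then deduce that the parties it relates are people
{ ?doc log:semantics ?F .
  ?F log:includes { ?a foaf:knows ?b . } .
} => { ?a a foaf:Person . ?b a foaf:Person . } .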

Friday Sep 12, 2008

RDF: Reality Distortion Field

Here is Kevin Kelly's presentation on the next 5000 days of the web, in clear easy English that every member of the family can watch and understand. It explains what the semantic web, also known as Web 3.0, is about, and how it will affect technology and life on earth. Where is the web going? I can find no fault in this presentation.

This is a great introduction. He explains how Metcalfe's law brought us to the web of documents and is leading us inexorably to a web of things, in which we will be the eyes and the hands of this machine called the internet that never stops running.

For those with a more technical mind, who want to see how this is possible, follow this up with a look at the introductory material on RDF.

Warning: This may change the way you think. Don't Panic! Things will seem normal after a while.

Thursday Jul 24, 2008

My Semantic Web BlogRoll

I have not had time to automate my blogroll publication yet; here is the first step down that path. The following are the semantic web blogs I follow closely. I am sure I must be missing many other interesting ones, though I am already way past the point of information overload. (For those in the same position, here are some tips (via Danny).)

AI3:::Adaptive Information - Atom
Mike Bergman on the semantic Web and structured Web
About the social semantic web - RSS
Web 2.0 - what's next?
Bnode - atom
bobdc.blog - RSS
Bob DuCharme's weblog, mostly on technology for representing and linking information.
Bill de hOra - atom
Bill de HOra's blog
captsolo weblog - RSS 1.0
CaptSolo weblog
connolly's blog - RSS
Dan Connolly's blog
Cloudlands - RSS
John Breslin's Blog
Daniel Lewis - RSS
A technological, personal, spiritual, and academic blog.
Dave Beckett - Journalblog - RSS 1.0
RDF and free software hacking
David Seth - RSS
Semantic Web & my backyard
dowhatimean.net - RSS
Richard Cyganiak's Weblog
Elastic Grid Blog - RSS
The ultimate blog about the Elastic Grid solution...
Elias Torres - RSS
I'm working on a tagline. I promise.
Inchoate Curmudgeon - RSS
I'm getting there. What's the rush? It's about the journey, right?
Internet Alchemy - RSS
Seeing the world through RDF goggles since 2007
Kashori - RSS
Kingsley Idehen's Blog Data Space - RSS atom
Data Space Endpoint for - Knowledge, Information, and Raw Data
Les petites cases - Fourre-tout personnel virtuel de Got - RSS
Lost Boy - RSS 1.0
A journal of no fixed aims or direction by Leigh Dodds. If you see him wandering, point him in the direction of home.
Mark Wahl, CISA - RSS
Discussions on organizing principles for identity systems
Michael Levin's Weblog and Swampcast! - RSS
Software development, technobuzz, and everything else.
Minding the Planet - RSS
Nova Spivack's Journal of Unusual News & Ideas
More News - RSS
Nodalities - RSS
From Semantic Web to Web of Data
opencontentlawyer.com - RSS
copyright, content, and you
Perspectives - RSS
Interfaces, web sémantique, hypermédia
Planet Kiwi - RSS
... where all the KiwiKnows is!
Planet RDF - RSS
It's triples all the way down
Planete Web Semantique - RSS
French Semantic Web planet
Raw - RSS 1.0
Danny's linkiness
Rinke Hoekstra - RSS
"Time is nature's way to keep everything from happening at once." - John Wheeler
S is for Semantics - Atom
Dean Allemang's Blog - Check out our new book on the Semantic Web!
Semantic Focus - RSS
On the Semantic Web, Semantic Web technology and computational semantics
Semantic Wave - RSS
News feeds and commentary maintained by semantic web developer Jamie Pitts.
Semantic Web Interest Group Scratchpad - RSS
Semantic Web Interest Group IRC scratchpad where items mentioned and commented on in IRC get collected.
Semantic Web Wire - RSS
Comprehensive News Feed for Semantic Web.
semantic weltbild 2.0 (Building the Semantic Web is easier together) - RSS 1.0
Building the Semantic Web is easier together
SemanticMetadata.net - Atom
Speaking my mind - RSS
The whole is more than the sum
TagCommons - RSS
toward a basis for sharing tag data
TechBrew - RSS
Informative geekery on software and technology
Technical Ramblings - RSS
Ramblings of a GIS Hacker
Thinking Clearly - RSS
Make lots of money through stealth in shadows
W3C Semantic Web Activity News - RSS

I automated the creation of this blogroll by transforming the OPML file of my blog reader with the following XQuery:

declare namespace loc = "http://test.org/";

(: coerce an attribute to a string; the '?' allows outlines that lack it :)
declare function loc:string($t as xs:string?) {
             $t
};

<html>
<body>
<dl>
{
   (: the OPML export is expected to be the context document of the query :)
   for $outline in //outline
   order by $outline/@title
   return
     <span>
          <dt><a href="{ $outline/@htmlUrl }">{ loc:string($outline/@text) }</a> - <a href="{ $outline/@xmlUrl }">{ loc:string($outline/@version) }</a></dt>
          <dd>{ loc:string($outline/@description) }</dd>
     </span>
}
</dl>
</body>
</html>

I then had to edit the generated HTML a bit by hand to make it presentable.

Thanks to the Oxygen editor for making this really easy to do.

Wednesday Jun 18, 2008

Firefox 3 is out

Firefox 3.0 is out. It looks really, really good! Get it here! and help set a world record :-)

Monday Apr 21, 2008

FOAF & SSL: creating a global decentralised authentication protocol

Following on from my previous post, RDFAuth: sketch of a buzzword compliant authentication protocol, Toby Inkster came up with a brilliantly simple scheme that builds very neatly on top of the Secure Sockets Layer of https. I describe the protocol briefly here, and will describe an implementation of it in my next post.

Simple global (and, with a device such as the Aladdin USB e-Token, passwordless) authentication around the web would be extremely valuable. I am currently crumbling under the number of sites asking me for authentication information: for each one I need to remember a new id and password combination. I am not the only one with this problem, as the data portability video demonstrates. OpenID solves the problem, but the protocol consumes a lot of SSL connections, which for hyperdata user agents could be painfully slow: they may need access to just a couple of resources per server as they jump from service to service.

As before, we have a very simple scenario to consider. Romeo wants to find out where Juliette is. Juliette's hyperdata Address Book updates her location on a regular basis by PUTting information to a protected resource, which she wants only her friends and their friends to have access to. Her server knows from her foaf:PersonalProfileDocument who her friends are. She identifies them via dereferenceable URLs, as I do, which themselves usually (the web is flexible) return further foaf:PersonalProfileDocuments describing their subjects and pointing to yet more such documents. In this way the list of people able to find out her location can be specified in a flexible and distributed manner. So let us imagine that Romeo is a friend of a friend of Juliette's and wishes to talk to her. The following sequence diagram continues the story...
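
Juliette's access rule could itself be written in N3. Here is a hedged sketch, with a made-up acl: vocabulary, of the "friends and friends of friends" policy:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix acl: <http://eg.org/acl#> .   # hypothetical access-control vocabulary

# Juliette's friends may read her location document...
{ <http://juliette.net/#juliette> foaf:knows ?f . }
    => { ?f acl:mayRead <https://juliette.net/protected/location> . } .

# ...and so may their friends
{ <http://juliette.net/#juliette> foaf:knows ?f . ?f foaf:knows ?ff . }
    => { ?ff acl:mayRead <https://juliette.net/protected/location> . } .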

sequence diagram of RDF+SSL

The stages of the diagram are listed below:

  1. First Romeo's User Agent HTTP GETs Juliette's public foaf file located at http://juliette.net/. The server returns a representation (in RDFa perhaps) with the same semantics as the following N3:

    @prefix : <#> . 
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix todo: <http://eg.org/todo#> .
    @prefix openid: <http://eg.org/openid/todo#> .
    
    <> a foaf:PersonalProfileDocument;
       foaf:primaryTopic :juliette ;
       openid:server <https://aol.com/openid/service> . # see The OpenId Sequence Diagram
    
    :juliette a foaf:Person;
       foaf:name "Juliette";
       foaf:openid <>;
       foaf:blog </blog>;    
       rdfs:seeAlso <https://juliette.net/protected/location>; 
       foaf:knows <http://bblfish.net/people/henry/card#me>,
                  <http://www.w3.org/People/Berners-Lee/card#i> .
    
    <https://juliette.net/protected/location> a todo:LocationDocument .
    

    Romeo's user agent receives this representation and decides to follow the https-protected resource, because it is a todo:LocationDocument.

  2. The todo:LocationDocument is at an https URL, so Romeo's User Agent connects to it via a secure socket. Juliette's server, who wishes to know the identity of the requestor, sends out a Certificate Request, to which Romeo's user agent responds with an X.509 certificate. This is all part of the SSL protocol.

    In the communication in stage 2, Romeo's user agent also passes along his foaf id. This can be done in either of two ways:

    • Sending in the HTTP header of the request an Agent-Id header pointing to the foaf Id of the user. Like this:
      Agent-Id: http://romeo.net/#romeo
      
      This would be similar to the current From: header, but instead of an email address it would carry a direct name of the agent. (An email address is only an indirect identifier of an agent.)
    • The Certificate could itself contain the Foaf ID of the Agent in the X509v3 extensions section:
              X509v3 extensions:
                 ...
                 X509v3 Subject Alternative Name: 
                                 URI:http://romeo.net/#romeo
      

      I am not sure whether this would be a correct use of the X509 Subject Alternative Name field, so this would require more standardization work with the X509 community. But it shows a way in which the two communities could meet. The advantage of having the id as part of the certificate is that it could add extra weight to the id, depending on the trust one places in the Certificate Authority that signed the certificate.

  3. At this point Juliette's web server knows of the requestor (Romeo in this case):
    • his alleged foaf Id
    • his Certificate (verified during the SSL session)

    If the Certificate is signed by a CA that Juliette trusts, and the foaf id is part of the certificate, then she will trust that the owner of the User Agent is the entity named by that id. She can then jump straight to step 6, if she knows enough about Romeo to trust him.

    Having certificates signed by CAs is expensive, though. The protocol described here works just as well with self-signed certificates, which are easy to generate.

  4. Juliette's hyperdata server then GETs the foaf document associated with the foaf id, namely <http://romeo.net/>. Romeo's foaf server returns a document containing a graph of relations similar to the one described by the following N3:
    @prefix : <#> . 
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix wot: <http://xmlns.com/wot/0.1/> .
    @prefix wotodo: <http://eg.org/todo#> .
    
    <> a foaf:PersonalProfileDocument;
        foaf:primaryTopic :romeo .
    
    :romeo a foaf:Person;
        foaf:name "Romeo";
        is wot:identity of [ a wotodo:X509Certificate;
                             wotodo:dsaWithSha1Sig """30:2c:02:14:78:69:1e:4f:7d:37:36:a5:8f:37:30:58:18:5a:
                                                 f6:10:e9:13:a4:ec:02:14:03:93:42:3b:c0:d4:33:63:ae:2f:
                                                 eb:8c:11:08:1c:aa:93:7d:71:01""" ;
                           ] ;
        foaf:knows <http://bblfish.net/people/henry/card#me> .
    
  5. By querying the semantics of the returned document with a SPARQL query such as:
    PREFIX wot: <http://xmlns.com/wot/0.1/> 
    PREFIX wotodo: <http://eg.org/todo#> 
    
    SELECT ?sig
    WHERE {
        [] a wotodo:X509Certificate;
           wotodo:dsaWithSha1Sig ?sig;
           wot:identity <http://romeo.net/#romeo> .
    }
    

    Juliette's web server can discover the certificate signature and compare it with the one sent by Romeo's user agent. If the two are identical, then Juliette's server knows that the User Agent - which has access to the private key of the certificate it sent, and which claims to be the person identified by the URI http://romeo.net/#romeo - agrees about the identity of the certificate with the person who has write access to the foaf file at http://romeo.net/. So by proving that it has access to the private key of that certificate, the User Agent has also proven that it is the person described by the foaf file.

  6. Finally, now that Juliette's server knows the identity of the User Agent making the request on the protected resource, it can decide whether or not to return the representation. In this case we can imagine that my foaf file says:
     @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    
     <http://bblfish.net/people/henry/card#me> foaf:knows <http://romeo.net/#romeo> .  
     
    As a result of the policy of allowing all friends of Juliette's friends to read the location document, the server sends out a document containing relations such as the following:
    @prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .
    @prefix : <http://juliette.net/#> .
    
    :juliette 
        contact:location [ 
              contact:address [ contact:city "Paris";
                                contact:country "France";
                                contact:street "1 Champs Elysees" ]
                         ] .
    

Todo

  • Create an ontology for X509 certificates. (A minimal sketch of the placeholder vocabulary used above follows this list.)
  • Test this. Currently there is some implementation work going on in the so(m)mer repository, in the misc/FoafServer directory.
  • Can one use the Subject Alternative name of an X509 certificate as described here?
  • For self-signed certificates, what should the X509 Distinguished Name (DN) be? The DN is really being replaced here by the foaf id, since that is where the key information about the user is located. Can one ignore the DN in an X509 cert, as one can with blank nodes in RDF? One could, I imagine, create a dummy DN where one of the elements is the foaf id. Such DNs would at least be guaranteed to be unique.
  • What standardization work would be needed to make this happen?
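
As promised in the first item above, here is a minimal sketch of what the placeholder wotodo: vocabulary used in this post might look like (eg.org being, of course, a stand-in namespace):

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix wotodo: <http://eg.org/todo#> .

wotodo:X509Certificate a rdfs:Class;
    rdfs:comment "An X.509 public key certificate." .

wotodo:dsaWithSha1Sig a rdf:Property;
    rdfs:domain wotodo:X509Certificate;
    rdfs:comment "The certificate's DSA-with-SHA1 signature, as a hex string." .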
