Sunday Nov 29, 2009

Web Finger proposals overview

If all you had was an email address, would it not be nice to be able to have a mechanism to find someone's home page or OpenId from it? Two proposals have been put forward to show how this could be done. I will look at them and add a sketch of my own that hopefully should lead us to a solution that takes the best of both proposals.

The WebFinger GoogleCode page explains what webfinger is very well:

Back in the day you could, given somebody's UNIX account (email address), type
$ finger email@example.com 
and get some information about that person, whatever they wanted to share: perhaps their office location, phone number, URL, current activities, etc.

The new ideas generalize this to the web, by following a very simple insight: If you have an email address like henry.story@sun.com, then the owner of sun.com is responsible for managing the email. That is the same organization responsible for managing the web site http://sun.com. So all that is needed is some machine readable pointer from http://sun.com/ to a lookup giving more information about owner of the email address. That's it!

The WebFinger proposal

The WebFinger proposed solution showed the way so I will start from here. It is not too complicated, at least as described by John Panzer's "Personal Web Discovery" post.

John suggests that there should be a convention that servers have a file in the /host-meta root location of the HTTP server to describe metadata about the site. (This seems to me to break web architecture. But never mind: the resource http://sun.com/ can have a link to some file that describes a mapping from email ids to information about it.) The WebFinger solution is to have that resource be in a new application/host-meta file format. (not xml btw). This would have mapping of the form

Link-Pattern: <http://meta.sun.com/?q={%uri}>; 
    rel="describedby";type="application/xrd+xml"
So if you wanted to find out about me, you'd be able to do a simple HTTP GET request on http://meta.sun.com/?q=henry.story@sun.com, which will return a representation in another new application/xrd+xml format about the user.

The idea is really good, but it has three more or less important flaws:

  • It seems to require by convention all web sites to set up a /host-meta location on their web servers. Making such a global requirement seems a bit strong, and does not in my opinion follow web architecture. It is not up to a spec to describe the meaning of URIs, especially those belonging to other people.
  • It seems to require a non xml application/host-meta format
  • It creates yet another file format to describe resources the application/xrd+xml. It is better to describe resources at a semantic level using the Resouces Description Framework, and not enter the format battle zone. To describe people there is already the widely known friend of a friend ontology, which can be clearly extended by anyone. Luckily it would be easy for the XRD format to participate in this, by simply creating a GRDDL mapping to the semantics.

All these new format creation's are a real pain. They require new parsers, testing of the spec, mapping to semantics, etc... There is no reason to do this anymore, it is a solved problem.

But lots of kudos for the good idea!

The FingerPoint proposal

Toby Inkster, co inventor of foaf+ssl, authored the fingerpoint proposal, which avoids the problems outlined above.

Fingerpoint defines one useful relation sparql:fingerpoint relation (available at the namespace of the relation of course, as all good linked data should), and is defined as

sparql:fingerpoint
	a owl:ObjectProperty ;
	rdfs:label "fingerpoint" ;
	rdfs:comment """A link from a Root Document to an Endpoint Document 
                        capable of returning information about people having 
                        e-mail addresses at the associated domain.""" ;
	rdfs:subPropertyOf sparql:endpoint ;
	rdfs:domain sparql:RootDocument .
It is then possible to have the root page link to a SPARQL endpoint that can be used to query very flexibily for information. Because the link is defined semantically there are a number of ways to point to the sparql endpoint:
  • Using the up and coming HTTP-Link HTTP header,
  • Using the well tried html <link> element.
  • Using RDFa embedded in the html of the page
  • By having the home page return any other represenation that may be popular or not, such as rdf/xml, N3, or XRD...
Toby does not mention those last two options in his spec, but the beauty of defining things semantically is that one is open to such possibilities from the start.

So Toby gets more power as the WebFinger proposal, by only inventing 1 new relation! All the rest is already defined by existing standards.

The only problem one can see with this is that SPARQL, though not that difficult to learn, is perhaps a bit too powerful for what is needed. You can really ask anything of a SPARQL endpoint!

A possible intermediary proposal: semantic forms

What is really going on here? Let us think in simple HTML terms, and forget about machine readable data a bit. If this were done for a human being, what we really would want is a page that looks like the webfinger.org site, which currently is just one query box and a search button (just like Google's front page). Let me reproduce this here:

Here is the html for this form as its purest, without styling:

     <form  action='/lookup' method='GET'>
         <img src='http://webfinger.org/images/finger.png' />
         <input name='email' type='text' value='' />         
         <button type='submit' value='Look Up'>Look Up</button>
     </form>

What we want is some way to make it clear to a robot, that the above form somehow maps into the following SPARQL query:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
   [] foaf:mbox ?email;
      foaf:homepage ?homepage
}

Perhaps this could be done with something as simple as an RDFa extension such as:

     <form  action='/lookup' method='GET'>
         <img src='http://webfinger.org/images/finger.png' />
         <input name='email' type='text' value='' />         
         <button type='submit' value='homepage' 
                sparql='PREFIX foaf: <http://xmlns.com/foaf/0.1/> 
                 GET ?homepage
                 WHERE {
                   [] foaf:mbox ?email;
                      foaf:homepage ?homepage
                 }">Look Up</button>
     </form>

When the user (or robot) presses the form, the page he ends up on is the result of the SPARQL query where the values of the form variables have been replaced by the identically named variables in the SPARQL query. So if I entered henry.story@sun.com in the form, I would end up on the page http://sun.com/lookup?email=henry.story@sun.com, which could perhaps just be a redirect to this blog page... This would then be the answer to the SPARQL query

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
   [] foaf:mbox "henry.story@bblfish.net";
      foaf:homepage ?homepage
}
(note: that would be wrong as far as the definition of foaf:mbox goes, which relates a person to an mbox, not a string... but let us pass on this detail for the moment)

Here we would be defining a new GET method in SPARQL, which find the type of web page that the post would end up landing on: namely a page that is the homepage of whoever's email address we have.

The nice thing about this is that as with Toby Inkster's proposal we would only need one new relation from the home page to such a finder page, and once such a sparql form mapping mechanism is defined, it could be used in many other ways too, so that it would make sense for people to learn it. For example it could be useful to make web sites available to shopping agents, as I had started thinking about in RESTful semantic web services before RDFa was out.

But most of all, something along these lines, would allow services to have a very simple CGI to answer such a query, without needing to invest in a full blown SPARQL query engine. At the same time it makes the mapping to the semantics of the form very clear. Perhaps someone has a solution to do this already. Perhaps there is a better way of doing it. But it is along these lines that I would be looking for a solution...

(See also an earlier post of mine SPARQLing AltaVista: the meaning of forms)

How this relates to OpenId and foaf+ssl

One of the key use cases for such a Web Finger comes from the difficulty people have of thinking of URLs as identifiers of people. Such a WebFinger proposal if successful, would allow people to type in their email address into an OpenId login box, and from there the Relying Party (the server that the user wants to log into), could find their homepage (usually the same as their OpenId page), and from there find their FOAF description (see "FOAF and OpenID").

Of course this user interface problem does not come up with foaf+ssl, because by using client side certificates, foaf+ssl does not require the user to remember his WebID. The browser does that for him - it's built in.

Nevertheless it is good that OpenId is creating the need for such a service. It is a good idea, and could be very useful even for foaf+ssl, but for different reasons: making it easy to help people find someone's foaf file from the email address could have many very neat applications, if only for enhancing email clients in interesting new ways.

Updates

It was remarked in the comments to this post that the format for the /host-meta format is now XRD. So that removes one criticism of the first proposal. I wonder how flexible XRD is now. Can it express everything RDF/XML can? Does it have a GRDDL?

Saturday Jul 25, 2009

Saving Face: The Privacy Architecture of Facebook

In his very interesting thesis draft Saving Face: The Privacy Architecture of Facebook, Chris Peterson, describes through a number of real life stories some very subtle and interesting issues concerning privacy and context that arose during the rapid evolution of the now 250 million member social network.

Perhaps the most revealing of these stories is that of Junior High School student Rachel who broadcast the following distress status message my grandmother just friend requested me. no Facebook, you have gone too far! Chris Peterson develops: Rachel and her grandmother are close. She trusts her grandmother. She confides in her grandmother. She tells her grandmother "private" things. She is certainly closer to her grandmother than many of her Facebook Friends. So what's the big deal? Rachel explains:

Facebook started off as basically an online directory of COLLEGE STUDENTS. I couldn't wait until I had my college email so that I could set up an account of my own, since no other emails would give you access to the site. Now, that was great. One could [meet] classmates online or stay in touch with high school mates [but it] has become a place, no longer for college students, but for anyone. [About] five days ago, the worst possible Facebook scenario occurred, so bizarre that it hadn't even crossed my mind as possible. MY GRANDMOTHER!? How did she get onto facebook?...As my mouse hovered between the accept and decline button, images flashed through my mind of sweet Grandma [seeing] me drinking from an ice luge, tossing ping pong balls into solo cups full of beer, and countless pictures of drunken laughter, eyes half closed. Disgraceful, I know, but these are good memories to me. To her, the picture of my perfectly angelic self, studying hard away at school, would be shattered forever.

The paper is full of legally much more serious stories, but this one is especially revealing as it makes apparent how the flat friendship relation on Facebook does not take into account the context of the relationship. Not all frienships are equal. Most people have only very few friends they can tell everything to. And most often one tells very different stories to different groups of friends. In the physical world we intuitively understand how to behave in different contexts. One behaves one way in church, another in the bar, and yet another way in front of one's teachers, or parents. The context in real life is set by the architecture of the space we are in (something Peter Sloterdijk develops at length in his philosophical trilogy Spheres). The space in which we are speaking and the distance others have to us guides us in what we should say, and how loud we can say it. On Facebook all your friends get to see everything you say.

It turns out that it is possible to create an equivalent contextual space on Facebook using a little know and recently added feature, which allows one to build groups of friends and specify access control policies on posts per group. Chris shows clearly that this by itself is not enough: it requires a much more thorough embedding in the User Interface so that the intuitive feel one has in real life for who hears what and to whom one is speaking is available with the same clarity in the digital space. In the later part of the thesis Chris explores what such a User Interface would need to do to enable a similarly intuitive notion of space to be available.

Applications to the Social Web

One serious element of the privacy architecture of Facebook (and other similar social networks) not covered by this thesis, yet that has a very serious impact in a very large number of domains, is the constant presence of a third party in the room: Facebook itself. Whatever you say on these Social Networks, is visible not only to your group of friends, but also to Facebook itself, and indirectly to its advertisers. Communicating in Facebook puts one then in a similar frame of mind to what people in the middle ages would have been in, when mankind was under the constant, omnipotent and omniscient presence of God who could read every thought, even the most personal. Except that this God is incorporated and has a stock market value fluctuating daily.

For those who wish to escape such an omni-presence yet reap the benefits of online electronic communication, the only solution lies in the development of distributed secure social networks, of a Social Web where every body could own what they say and control who sees it. It turns out that this is possible with semantic web technologies such as foaf and access control mechanisms based on ssl.

One very positive element I take from this thesis is that the minimal technical building blocks for reconstituting a sense of context is the notion of a group and access control of resources. In a the Social Web we should be able to reconstitute this using the foaf:Group class and foaf+ssl for access control. On this basis Chris Peterson's user interface suggestions should be applicable in a distributed social network.

All in all then I found this thesis to be very rewarding and a very interesting read. I recommend it to all people interested in the Social Web.

Saturday Jan 17, 2009

foaf+ssl: creating a web of trust without key signing parties

The concept of a Web of Trust is most closely associated with Phil Zimmerman and PGP. The basic idea is that by signing each other's keys, usually at things like key signing parties, people could grow the network of keys they trusted to sign or encrypt documents such as email, sign legal documents, etc... The distributed system of trust feels right, but the idea never really took off - even though the keysigning parties must have been fun - probably because they still required physical presence. Another problem with the PGP web of trust, is that the signers of your key, or your signature on someone else's key will forever be published on one of the many key servers, making it close to impossible to revoke an association once published.

In foaf+ssl we are also using a Web of Trust mechanism, but as I will show here, this does not require key signing. It should therefore be able to grow much faster, and hopefully give us the same benefits. The friendship relations are furthermore not embedded in the signature. They can be made to be only visible to those people you wish to make it visible to, and these can be changed at any moment.

I wrote this rather long post as I was starting to answer a question John Kemp asked in the comments of my duck rabbit post on the topic of authorization in foaf+ssl. As I found the answer was getting long and longer, I decided this justified its own blog entry. So I published it here instead.

John Asked a question that forced me to detail how the trust mechanism in foaf+ssl works. Here it is:

The problem (I think) with how you use the certs [in foaf+ssl] though is that the real trust (if Juliet does not know Romeo a priori) is that Juliet's friends know Romeo, and when I say "know", I don't mean that in any cryptographic sense (Juliet's friends haven't signed Romeo's key/key fingerprint for example). Why wouldn't it then be enough for Juliet to base her trust on the appearance of Romeo's OpenID in her friends' FOAF files, for example?

I like the web of trust model, but in order for there to be verifiable trust based on certs/keys, don't you also need key/cert/fingerprint signing parties?

Since this is going to require thinking carefully about the foaf+ssl protocol, we may as well have its UML sequence diagram in front of our eyes. Here it is:

Remember that at stage 5 Juliet's server knows only the following about <https://romeo.net/#romeo>

  1. The Agent connecting via ssl has access to the private key that matches the public key sent in the cert (because otherwise he could not have signed the cert, and could not have established the ssl connection)
  2. The Agent wishes to be identified as "https://romeo.net/#romeo"
  3. Dereferencing the information resource <https://romeo.net/#romeo> returns the document <https://romeo.net/> which states that anyone who can proove to have the private key for the given public key is <https://romeo.net/#romeo>

Juliet's server can then conclude that the Agent making the request is indeed <https://romeo.net/#romeo> - whoever or whatever that is. Juliet's server can be as confident in this fact as the cryptography algorithms allow her to, which is pretty good.

So at this point we have something between identification and authentication, I am not sure.

When Basing Trust on OpenId does not work

Other statements returned by <https://romeo.net/> are extra claims and are as trustworthy as statements made by <https://romeo.net/#romeo>. They may just by themselves solve some really interesting puzzles for Juliet's server that by could make it trust <https://romeo.net/#romeo> more than anyone else it ever met -- but web servers of this intelligence are unlike any I have yet seen.

More realistically, the question for Juliet's server is whether it should authorize access to the protected resource. If Juliet's friends identify Romeo indirectly via his OpenId, and using that only, with something like the following relations in N3

  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  @prefix : <#> .

  :anne foaf:knows _:bN .
  _:bN foaf:name "Romeo";
       foaf:openid <https://openid.sun.com/romeo> .

and Romeo publishes a relation he has to an OpenId too such as

  @prefix : <#> .  

  :romeo foaf:openid <https://openid.sun.com/openid> .

Then if Juliet's server wants to motivate access to the protected resource it would need to believe the following

  :anne foaf:knows :romeo .

which appears nowhere. It would have to be inferred from a statement such as

  _:bN = :romeo .

This can indeed be inferred from the statements in Romeo's file giving his OpenId, and the statement in :anne's file stating the relation of the blank node _:bN to the same OpenID, since foaf:openid is an inverse functional property. But what is the confidence Juliet's server can have in that? Well Juliet's server can be as confident of that as she is of the assertion made by <https://romeo.net/#romeo> that his openid is <https://openid.sun.com/romeo>. This can hardly be said to count as confirmatory evidence. If Juliet's server thinks like that, it might as well make the resource public. For it would be the equivalent of a prison officer freeing a prisoner solely on the basis of the claim that he is himself an officer.

Who should one trust?

On the other hand if Juliet's friend :anne had claimed that

   :anne foaf:knows <https://romeo.net/#romeo> . 

then Juliet's server would have had the piece of information needed to authorize access to the protected resource, because that information came from a trusted party.

So in summary, when Juliet's server is looking to evaluate the trust it can have in :romeo it should not ask :romeo himself . It should ask other people in the social networks she trusts. So the graphs it needs to search is everything except what is said by :romeo . Juliet's server can go on the following:

  • that it is speaking to <https://romeo.net/#romeo>
  • what Juliet believes
  • what Juliet believes of what her friends claim
  • the consequences of the claims it is able to or willing to calculate

Now the OpenId can in fact come in useful, but not directly as may have been hoped initially. Imagine that Juliet now has another resource that she only gives access to, to people known by two of her friends. If only one of her friends, say :jane makes the assertion that

  :romeo foaf:openid <https://openid.sun.com/romeo> .

but all other of her friends refer to :romeo indirectly, then her web server could use that information with the statement made by :anne to deduce that indeed at least two of Juliet's friends know :romeo.

The value of information published by Romeo

Now is it absolutely true that Juliet's server can do nothing with the information returned by the document <https://romeo.net/>? Not at all! It is just that it has to be used in an exploratory manner. Imagine a third resource that is accessible to friends of friends of Juliet's friends. We could imagine that <https://romeo.net/> had returned a list of friends for :romeo, perhaps with the following relations:

:romeo foaf:knows <https://jack.name/#jack>, <https://john.name/#john>, <https://duffy.duck/p/#dd> .

Juliet's server could decide to dereference (HTTP GET) both <https://jack.name/#jack> and <https://john.name/#john> because they were known by her friends and see if any of those claimed to know <https://romeo.net/#romeo>.

Special case: when the OpenId published by Romeo is trustworthy

This highlights then a special case where the OpenId published by Romeo can work correctly out of the box. This is when his OpenId is his foaf file. Ie when Romeo publishes something like:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://romeo.net/> a foaf:PersonalProfileDocument;
            foaf:primaryTopic <http://romeo.net/#romeo> .

<http://romeo.net/#romeo> a foaf:Person;
            foaf:openid <http://romeo.net/> .
Here the foaf personal profile document is the same as the one returned when dereferencing Romeo's web id <http://romeo.net/#romeo>. It is returned in the same representation as the statements we are trusting on :romeo's public key, and assigning this relation to the same person. Since an OpenId is a resource controlled by the person whose openid it is, and since we believe that this resource is now controlled by <http://romeo.net/#romeo> I think we can safely say that Juliet's server knows :romeo's OpenId in this particular case too.

There is another case where the OpenId published by :romeo could work, even when the OpenId is not the same as the foaf:PersonalProfileDocument, but in this case it would require some verification. So let us get back to the situatuon described earlier in this post where Romeo's WebID claims that

  :romeo foaf:openid <https://openid.sun.com/romeo> .
For Juliet to believe this she would have to verify that http://romeo.net/#romeo controls https://openid.sun.com/romeo. This could be done very simply if Romeo has that OpenId page link back in some way to his foaf ID. Perhaps, as I suggested in "Foaf and Openid" this could be done simply by adding the following to the header of the OpenId page:
<link rel="meta" type="application/rdf+xml" title="FOAF" href="http://romeo.net/"/>
Juliet's web server would then be able to fetch the OpenId page, and having found the above link back to the :romeo's Personal Profile Document, it would be able to conclude that he controlled https://openid.sun.com/romeo, and so that it is a legitimate OpenId for him. With foaf+ssl Juliet's server can verify an OpenId with only one extra connection!

Web of Trust and Key Signing

All the above clearly shows that you can create a web of trust without key signing parties. Parties are nice, but requiring Key Signing parties is something that has seriously dampened the adoption of PGP and the web of trust. By just dragging and droping a URL from a web page, an email, another application into your foaf Address Book (be it web based or not), you can grow your web of trust much faster than Keysigning can. Furthermore you can change your public key when you no longer need it, loose it or whatever without needing to re-sign all your keys. This therefore adds to the security of your web of trust.

This is not to say that foaf+ssl is incompatible with key signing, btw. and it may be interesting to find out where this remains useful.

Wednesday Dec 17, 2008

python and php implementations of foaf+ssl

We now have two new implementations of foaf+ssl authentication protocol, in addition to the java one I blogged about earlier. If you have followed the procedure there to create your certificate, add it to your browser, and publish a minimal foaf file you can then try out these two servers.

Melvin Carvhalo, who owns the great domain name foaf.me, has implemented this in PHP in a very nicely layered fashion. In recent mail to the foaf protocols list he published the following end points:

  1. a test ssl resource will from a simple ssl connection that asks for the client certificate:
    • Display the output of the $_SERVER global variable
    • Display the details in the supplied Client Certificate
    • Display the Client Public Key info
    • Function returning the Client Public Key info in HEX
    • Function returning the subjectAltName in the Client Certificate
  2. foaf tester that after getting the URI in your certificate from the X509 v3 extensions section will fetch the foaf at that URL and
    • Convert the FOAF into an array of triples which it displays
    • Find the RSA Key of the declared subject ("owner") within a FOAF file
    • Get the list of friends in a FOAF file
  3. and finally the foaf+ssl tester, which Melvin pointed to in another email to the list, which will use the foaf+ssl protocol to log you into a server in one https connection. The server only does authentication and the minimal authorization: if it can authenticate you, then you are authorized

These three minimal services are very helpful as they allow us to detect and debug each stage in the protocol carefully. I highly recomment this step by step approach (and will therefore have to add this to my own examples!)

Ian Jacobi from MIT, has worked on extending authorization more with his python based server to also check your identity in a social network. See his detailed post on this "TAAC in action". Ian was in fact the first to have a running implementation I'd like to point out.

Keep these coming!

In the meantime I am working on authorization schemes, and am currently reading a complex paper Vladimir Kolovski, James Hendler, and Bijan Parsia entitled "Formalizing XACML Using Defeasible Description Logics". Clark Kendall is blogging about this under the policy management tag, which contains a less mathematical overview of the paper. I'll report back when I have managed to digest this. Read it if you need an antidote to twitter.

Thursday Dec 04, 2008

video on distributed social network platform NoseRub

I just came across this video on Twitter by pixelsebi explaining Distributed social networks in a screencast, and especially a php application NoseRub. Here is the video.


Distributed Social Networking - An Introduction from pixelsebi on Vimeo.

On a "Read Write Web" article on his video, pixelsebi summarizes how all these technologies fit together:

To sum it up - if I would have to describe it somebody who has no real clue about it at all:
  1. Distributed Social Networking is an architecture approach for the social web.
  2. DiSo and Noserub are implementations of this "social web architecture"
  3. OpenSocial REST API is one of many ways to provide data in this distributed environment.
  4. OpenOScial based Gadgets might run some time at any node/junction of this distributed environment and might be able to handle this distributed social web architecture.

So I would add that foaf provides semantics for describing distributed social networks, foaf+ssl is one way to add security to the system. My guess is that the OpenSocial Javascript API can be decoupled from the OpenSocial REST API and produce widgets however the data is produced (unless they made the mistake of tying it too closely to certain URI schemes)

About

bblfish

Search

Archives
« April 2014
MonTueWedThuFriSatSun
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    
       
Today