Data Portability: Scoble Right or Wrong and beyond
By bblfish on Jan 06, 2008
In this video Scoble explains how he got thrown off Facebook.
Here is a short summary, but the video is well worth watching as the emotions come through much better...
Facebook asks its users for their Gmail password in order to extract all the contacts from their mail history and build up a list of possible friends, and it scans the web for information to suggest friendships you may have. Yet that same Facebook does not want anyone, including YOU, to be able to extract the data in your account on its web site, even if only into your own electronic address book. To prevent this it encodes all email addresses as images, which are very difficult for a computer to decode, and so makes it tedious to move and use that information. So when Scoble tried to extract his 5000 friends using Optical Character Recognition - an idea suggested by Plaxo, which wants to be a hub of people information - Facebook noticed this and cut off his account. (I think he may have been reinstated now, but whether there is a point in belonging to such a service is a serious question now.) As a result Scoble and others have asked people to join the conversation on the Data Portability group.
This clearly is a very important issue. But his solution to the issue was not the best one. By using Plaxo - which wants to be the social graph hub of the web - to extract his data, he would have been able to do what he clearly should be able to do, namely add his contact information easily to Outlook. But he did this at the cost of allowing a third entity to gather a lot of information about him and his contacts. CNET's "The Scoble scuffle: Facebook, Plaxo at odds over data portability" touches on the issue. Allowing a third service provider to extract all your data in order to give you access to it does not improve your freedom. It just gives another commercial entity access to a huge network of information about you. And the more a company knows about its users, the more valuable the advertising it sells becomes. There is no mystery here as to why Social Networking sites have had so much money pumped into them over the last few years. So you have jumped out of the frying pan right into the fire. If you are concerned with the security of your information, the situation is clearly worse: with Facebook you had one commercial entity that had a lot more information about you than it should; now you have two.
Really what you want is the following:
- Selectivity in who gets what information about you:
  - Strangers should be able to see the minimum information I want to make public.
  - Acquaintances should see more.
  - Family should see other information.
  - ... these policies should be flexible and determinable by the owner of the information, by the person making the speech act of affirming it.
- Link to friends wherever they are. After all, if you have to go through one central aggregator of relationship information, then that aggregator will have a view of all the relationship information available, giving one actor a complete and overwhelming advantage over everyone else. You need distributed data, also known as linked data or hyperdata.
- An Open Data structure so as to allow ecosystems to grow and use that information. I want the tools on my computer to all be able to work with my social network information.
- A way to determine trust
Allowing different people to see more or less information (point 1 above) should be quite easy to set up by having the server return different representations depending on who is viewing the information, as determined by their having logged in to your site with something like OpenID. Linking information in a distributed way is easy using Semantic Web technologies, and is demonstrated by tools such as Beatnik. Beatnik is just one of the tools that could use such information on my desktop (thereby fulfilling point 3 above).
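To make the idea of viewer-dependent representations concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the profile fields, the policy table, the relationship table and the function names are all hypothetical, and the viewer's identifier is assumed to have already been verified by something like an OpenID login, which the sketch does not implement.

```python
# Hypothetical profile data published by the owner of the information.
PROFILE = {
    "name": "Alice Example",
    "homepage": "https://alice.example/",
    "email": "alice@alice.example",
    "home_phone": "+1-555-0100",
}

# Hypothetical owner-defined policy: which audiences may see each field.
# Audiences are not assumed to be strictly nested, since family may see
# "other" information rather than simply "more".
POLICY = {
    "name": {"stranger", "acquaintance", "family"},
    "homepage": {"stranger", "acquaintance", "family"},
    "email": {"acquaintance", "family"},
    "home_phone": {"family"},
}

# Hypothetical mapping from an already-authenticated identifier
# (for example an OpenID URL) to the owner's view of that person.
RELATIONSHIPS = {
    "https://bob.example/": "acquaintance",
    "https://carol.example/": "family",
}


def audience_for(viewer_id):
    """Classify a viewer; anonymous or unknown viewers are strangers."""
    if viewer_id is None:
        return "stranger"
    return RELATIONSHIPS.get(viewer_id, "stranger")


def representation_for(viewer_id):
    """Return only the fields the policy allows this viewer to see."""
    audience = audience_for(viewer_id)
    return {
        key: value
        for key, value in PROFILE.items()
        if audience in POLICY.get(key, set())
    }


if __name__ == "__main__":
    for viewer in (None, "https://bob.example/", "https://carol.example/"):
        print(viewer, "->", sorted(representation_for(viewer)))
```

Fields without a policy entry are hidden from everyone, so forgetting to classify a new field errs on the side of saying less.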
What you say, out loud or on your web site, is a speech act. All information is the speech act of someone, and it is this that allows us to determine our level of trust in it. This is also why one should try to say less rather than more, since every piece of information one publishes is information one may have to defend. It is therefore much better if we have a system where everyone can look after the small part of the graph of information they are responsible for, and defend it. They can then point to information maintained by other people, who will have to defend their piece. But since pointing to information maintained by others is a vote of confidence in them, an economy of links will emerge whereby people want to increase the number of quality links to them, which will only happen if they are deemed trustworthy. So the system allows for distributed trust. For a simple but excellent example see the Distributed Information Group wiki's policy for allowing people to post.
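As a rough illustration of links acting as votes of confidence, the sketch below counts how many distinct other people point to each profile. This is my own simplification, not a mechanism described by any of the tools mentioned here: the link table and the function name are hypothetical, and a raw inbound-link count is only a crude proxy for trust, since it ignores how trustworthy the people doing the linking are.

```python
# Hypothetical link data: each person publishes, on their own site,
# the list of people they are willing to vouch for.
LINKS = {
    "https://alice.example/": ["https://bob.example/", "https://carol.example/"],
    "https://bob.example/": ["https://carol.example/"],
    "https://carol.example/": [],
    "https://mallory.example/": ["https://mallory.example/"],
}


def inbound_link_counts(links):
    """Count, for each profile, how many distinct others link to it."""
    counts = {owner: 0 for owner in links}
    for source, targets in links.items():
        for target in set(targets):  # a repeated link counts only once
            if target != source:     # vouching for yourself does not count
                counts[target] = counts.get(target, 0) + 1
    return counts


if __name__ == "__main__":
    for profile, count in sorted(inbound_link_counts(LINKS).items()):
        print(profile, count)
```

Each person maintains only their own list of outgoing links, so the picture of who is vouched for is assembled from many small, separately defended pieces rather than held by one aggregator.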