Data Portability: The Video
By bblfish on Jan 15, 2008
Here is an excellent video to explain the problem faced by Web 2.0 companies and what Data Portability means. It is amazing how a good video can express something so much more powerfully, so much more directly than words can. Sit back and watch.
DataPortability - Connect, Control, Share, Remix from Smashcut Media on Vimeo.
Feeling better? You are gripped by the problem? Good. You should now find that my previous years posts start making a lot more sense :-)
Will the Data Portability group get the best solution together? I don't know. The problem with the name they have chosen is that it is so general, one wonders whether XML is not the solution to their problem. Won't XML make data portability possible, if everyone agrees on what they want to port? Of course getting that agreement on all the topics in the world is a never ending process.... Had they retained the name of the original group this stemmed from, Social Network Portability then one could see how to tackle this particular issue. And this particular issue seems to be the one this video is looking at.
But the question is also whether portability is the right issue. Well in some ways it is. Currently each web site has information locked up in html formats, in natural language (or even sometimes in jpegs (see the previous story of Scoble and Facebook), in order to make it difficult to export the data, which each service wants to hold onto as if it was theirs to own.
Another way of looking at this is that the Data Portability group cannot so much be about technology as policy. The general questions it has to address are question of who should see what data, who should be able to copy that data, and what they should be able to do with it. This does indeed involve identity technology insofar as all of the above questions turn around questions of identity ("who?"). Now if every site requires one to create a new identity in order to access one's data one has the nightmare scenario depicted in the video, where one has to maintain one's identity across innumerable sites. As a result the policy issue of Data Portability does require one to solve the technical problem of distributed identity: how can people maintain the minimum number of identities on the web? (ie not one per site) Another issue that follows right upon the first is that if one wants information to only be visible to a select group of people - the "who sees what" part of the question - then one also needs a distributed way to be able to specify group membership, be it friendship based or other. The video again makes that point very clearly why having to recreate one's social network on every site is impractical.
What may be misleading about the term Data Portability is that it may lead one to think that what one wants is to copy one's social information from one social service to another. That would just automate the job of what the video illustrates people having to do by hand currently. But that is not a satisfactory solution. Because one cannot extract a graph of information from one space to another without loss. If I extract my friends from LinkedIn into FaceBook, it is quite certain that Facebook will not recognise a large number of the people I know on LinkedIn. Furthermore the ported information on FaceBook would soon be out of date, as people updated their network and profiles on LinkedIn. Unless of course Facebook were able to make a constant copy of the information on LinkedIn. But that's impossible right? Wrong! That is the difference between copy by value and copy by reference. If FaceBook can refer to people on LinkedIn, then the data will always be as up to date as it can be. So this is how one moves from DataPortability to Linked Data, also known as hyper data.