Open up your databases! Web 2 is coming.

I have had a few questions regarding SPARQL which suggest that people misunderstand who the intended end user of this technology is. Clearly the target of a SPARQL service is not your mom and pop, who would need a point-and-click interface on a device adapted to their needs. No, the end users are developers. By making your database available through a SPARQL endpoint you allow software engineers to create very powerful apps from your data that may, in different ways, help grow your business. This is what Web 2 is all about. Here are a few examples.

Imagine you are Amazon. You know that you don't have the resources to think of all the different ways you can sell your products. By making your catalog available for query in such a general way you enable some clever programmer to create a tool that will end up helping you sell your goods, be they books, CDs, electronic equipment or any of the other numerous things Amazon makes available. Amazon already does this with their web services. SPARQL would just allow them to generalise on that, by making the query interface much more flexible, and dramatically reduce the bandwidth required to extract information from their database.

Imagine that you are IMDb, the amazingly useful Internet Movie Database. By making the whole movie collection available through SPARQL you give some entrepreneurial programmer out there the opportunity to create services that you had never yourself considered, or for which you don't possess all the elements, as that service may require bringing data together from disparate sources that you don't control [1]. If this new service that makes use of your data is successful, you will see it in your logs, and the author of the application will want quality of service guarantees so that he can satisfy his own clients: he may end up paying you some money for that guarantee.

Imagine you are running the French railways and you make all your timetable info available this way. Other travel services could use this information in the process of helping people organize their trips, and so increase usage of trains. Someone else could write some small apps specifically aimed at a particular audience whose aims and habits they understand better than you do. Just think of how trainspotters could put this information to use...

Imagine you are a hotel chain. Here again the travel agencies could find out if your rooms are available, and send you customers. GPS enabled phones are not far away: someone will certainly be prepared to integrate this information in order to help owners of cell phones locate a hotel in their vicinity that has rooms to their liking.

And if you are a student at a university searching for a quick project, why not create a simple courses ontology that would describe where lectures take place, which classes need attending, what subject they are about, and whatever else seems to make sense. Then put all the data up on a server and write a few cool little apps that make use of this information. One nice one would clearly be an OSX dashboard widget that queries the database to let you know which course to go to next. :-) That would give you some good hands-on experience with all the relevant technologies.
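To make the widget idea a little more concrete, here is a rough sketch in Python of the kind of query it might send. The `uni:` namespace and the property names `uni:partOf`, `uni:room` and `uni:start` are invented for illustration; a real courses ontology would define its own terms.

```python
from datetime import datetime

# Hypothetical namespace for the student's courses ontology (an assumption).
UNI_NS = "http://example.org/university#"


def next_lecture_query(now: datetime) -> str:
    """Build a SPARQL query for the first lecture starting after `now`."""
    stamp = now.strftime("%Y-%m-%dT%H:%M:%S")
    return f"""PREFIX uni: <{UNI_NS}>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?course ?room ?start
WHERE {{
  ?lecture uni:partOf ?course ;
           uni:room   ?room ;
           uni:start  ?start .
  FILTER (?start > "{stamp}"^^xsd:dateTime)
}}
ORDER BY ?start
LIMIT 1"""


# Example: what should I attend after 9:00 on 13 November 2006?
query = next_lecture_query(datetime(2006, 11, 13, 9, 0))
print(query)
```

The widget would only need to send this text to the server and display the single row that comes back.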

If you are big or have information that is valuable you can create your own ontology by yourself. If you are smaller you can be the first to do it, and others will likely follow. Otherwise you can work together with partners in your industry to create an ontology that reflects the objects your business is about. The beauty of SPARQL is that all the developer needs to know is your ontology, and he will know how to query your service. SPARQL does not deal with the booking part of it, for sure. That is where another technology will have to take over. But it need not be very complicated. A link pointing the user to a "buy" or "book" form would be enough for many cases.
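On the developer's side the plumbing is small. The sketch below, in Python, encodes a query as a SPARQL protocol GET request and flattens a result document in the SPARQL JSON results format. The endpoint URL is a placeholder and the sample response is hand-written, so nothing here talks to a real service.

```python
import json
from urllib.parse import urlencode


def sparql_get_url(endpoint: str, query: str) -> str:
    """Encode a query as a SPARQL protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


def rows(results_json: str) -> list:
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


# Hypothetical endpoint and a hand-written sample response (no network needed).
url = sparql_get_url("http://example.org/sparql",
                     "SELECT ?room WHERE { ?h ?p ?room }")
sample = ('{"head": {"vars": ["room"]}, "results": {"bindings": '
          '[{"room": {"type": "literal", "value": "double"}}]}}')

print(url)
print(rows(sample))
```

Nothing in these two functions is specific to one publisher: only the query text changes from one ontology to the next.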

The main point is that there are too many ways your information (if it is at all valuable) can be used for you to have the resources to explore even a small fraction of them. By opening up your database you enable the vast talent pool of humanity to use your resources in ways you could never have imagined. By allowing your information to be transformed into myriad new services, the value of the goods you offer, or of the information you provide, will be dramatically increased.

[1] The way housingmaps used information from craigslist and google maps to create an entirely original service.

Update: there have been some very good comments on this post, which have helped me tune my arguments.


proprietary companies like google, ebay, imdb/amazon, value highly their data silos. this is a bit like asking microsoft to open-source office. youre better building an alternative than appealing to their 'interesting mashup' sensibilities and hoping they give you a christmas gift. i do believe we can eliminate the need for google/yahoo/msft in the next 5 years but we'll need to solve some problems like efficient distributed hash tables of semantic data, caching and the like. sort of mashing up bittorrent/emule with rdf/sparql. it can be done..

Posted by carmen on November 13, 2006 at 09:50 PM CET #

Amazon is a good example that makes my point very well. The Amazon API exposes a huge amount of data to the end user in a machine-processable form. By opening up their databases they enable new services to be built in micro markets that are too small for them to be bothered with. Since the bulk of the internet is in the long tail, they gain by allowing others to explore the many micro markets that take place therein. By exposing their data, they leverage Metcalfe's law to their advantage: selling more of their products.

The semantic web tools such as RDF and SPARQL would simply have made their life easier, in that they would not have needed to invent both an XML format and a query language to do their job. At the time this was not an option as SPARQL was not available. They are big enough for this not to be a big problem for them. But as the number of these services grows, and as more people copy their work, the advantages of using Semantic Web technologies, to both the publishers and the consumers of this information, will become more and more evident.

Posted by Henry Story on November 13, 2006 at 11:03 PM CET #
