By bblfish on May 26, 2007
I woke up this morning with a large number of comments to my previous post "Duck Typing Done right" . It would be confusing to answer them all together in the comments section there, so I aggregated my responses here.
I realize the material covered here is very new to many people. Luckily it is very easy to understand. For a quick introduction see my short Video introduction to the Semantic Web.
Also I should mention that the RDF is a declarative framework. So its relationship to method Duck Typing is not a direct one. But nevertheless there is a lot to learn by understanding the simplicity of the RDF framework.
On the reference of Strings
Kevin asks why the URI "http://a.com/Duck" is less ambigous than a string "Duck". In one respect Kevin is completely correct. In RDF they are both equally precise. But what they refer to is quite different from what one expects. The string "Duck" refers to the string "Duck". A URI on the other hand refers to the resource identified by it; URIs stand for Universal Resource Identifiers after all. The URI "http://a.com/Duck", as defined above, refers to the set of Ducks. How do you know? Well you should be able to GET <http://a.com/Duck> and receive a human or machine representation for it, selectable via content negotiation. This won't work in the simple examples I gave in my previous post, as they were just quick examples I hacked together by way of illustration. But try GETing <http://xmlns.com/foaf/0.1/knows> for a real working example. See my longer post on this issue GET my meaning?
Think about the web. Everyday you type in URLs into a web browser and you get the page you want. When you type "http://google.com/" you don't sometimes get <http://altavista.com>. The web works as well as it does, because URLs identify things uniquely. Everyone can mint their own if they own some section of the namespace, and PUT the meaning for that resource at that resource's location.
On Ambiguity and Vagueness
Phil Daws is correct to point out that URIs don't remove all fuzziness or vagueness. We can have fuzzy or vague concepts, and that is a good thing. foaf:knows (<http://xmlns.com/foaf/0.1/knows> ) whilst unambigous is quite a fuzzily defined relation. If you click on it's URL this is what you will get:
We take a broad view of 'knows', but do require some form of reciprocated interaction (ie. stalkers need not apply). Since social attitudes and conventions on this topic vary greatly between communities, counties and cultures, it is not appropriate for FOAF to be overly-specific here.
If someone foaf:knows a person, it would be usual for the relation to be reciprocated. However this doesn't mean that there is any obligation for either party to publish FOAF describing this relationship. A foaf:knows relationship does not imply friendship, endorsement, or that a face-to-face meeting has taken place: phone, fax, email, and smoke signals are all perfectly acceptable ways of communicating with people you know.
You probably know hundreds of people, yet might only list a few in your public FOAF file. That's OK. Or you might list them all. It is perfectly fine to have a FOAF file and not list anyone else in it at all. This illustrates the Semantic Web principle of partial description: RDF documents rarely describe the entire picture. There is always more to be said, more information living elsewhere in the Web (or in our heads...).
Since foaf:knows is vague by design, it may be suprising that it has uses. Typically these involve combining other RDF properties. For example, an application might look at properties of each foaf:weblog that was foaf:made by someone you "foaf:knows". Or check the newsfeed of the online photo archive for each of these people, to show you recent photos taken by people you know.
For more information on this see my post "Fuzzy thinking in Berkeley"
Paddy worries that this requires a Universal Class Hierarchy. No worries there. The Semantic Web is designed to work in a distributed way. People can grow their vocabularies, just like we all have grown the web by each publishing our own files on it. The Semantic Web is about linked data. The semantic web does not require UFOs (Unified Foundational Ontologies) to get going, and it may never need them at all, though I suspect that having one could be very helpful. See my longer post UFO's seen growing on the Web.
Relations are first class objects
Paddy and Jon Olson were mislead by my uses of classes to think that RDF ties relations/properties to classes. They don't. Relations in RDF are first class citizens, as you may see in the Dublin Core metadata initiative, which defines a set of very simple and very general relations to describe resources on the web, such as
dc:created etc... I think we need a
:sparql relation that would relate anything to an authoritative SPARQL endpoint, for example. There clearly is no need to constrain the domain of such a relation in any way.
Scalability and efficiencyJon Olson agrees with me that duck typing is good enough for some very large and good software projects. One of my favorite semantic web tools for example is cwm, which is written in python. When I say Duck Typing does not scale as implemented in those languages, I mean really big scale, like you know, the web. URIs is what has allowed the web to scale to the whole planet, and what will allow it to scale into structured data way beyond what we may even be comfortable imagining right now. This is not over engineered at all as Eric Biesterfeld fears. In fact it works because it gets the key elements right. And they are very simple as I demonstrated in my recent JavaOne BOF. The key concepts are:
- URIs refer to resources,
- resources return representations,
- to describe something on the web one needs to
- refer to the thing one wishes to describe, and that requires a URI,
- second specify the property relation one wishes to attribute to it (and that also requires a URI)
- and finally specify the value of that property.
An anonymous writer mentions the "ugliness" of the syntax. This is not a problem. The semantic web is about semantic (see the illustration on this post) It defines the relationship of a string to what it names. It does not require a specific syntax. If you don't like the xml/rdf syntax, which most people think is overly complicated, then use the N3 syntax, or come up with something better.
On Other Languages
As mentioned above there need not be one syntax for RDF. Of course it helps in communication if we agree on something, and currently, for better of for worse that is rdf/xml.
But that does not mean that other procedural languages cannot play well with it. They can since the syntax is not what is important, but the semantics, and those are very well defined.
There are a number of very useful bindings in pretty much every language. From Franz lisp to the redland library for c, python, perl, ruby, to Prolog bindings, and many Java bindings such as Sesame and Jena. Way too much to list here. For a very comprehensive overview see Mike Bergman's full survey of Semantic tools.