Duck Typing done right
By bblfish on May 25, 2007
Dynamic Languages such as Python, Ruby and Groovy, make a big deal of their flexibility. You can add new methods to classes, extend them, etc... at run time, and do all kinds of funky stuff. You can even treat an object as of a certain type by looking at it's methods. This is called Duck Typing: "If it quacks like a duck and swims like a Duck then it's a duck", goes the well known saying. The main criticism of Duck Typing has been that what is gained in flexibility is lost in precision: it may be good for small projects, but it does not scale. I want to show here both that the criticism is correct, and how to overcome it.
Let us look at Duck Typing a little more closely. If something is a bird that quacks like a duck and swims like a duck, then why not indeed treat it like a duck? Well one reason that occurs immediately, is that in nature there are always weird exceptions. It may be difficult to see the survival advantage of looking like a duck, as opposed to say looking like a lion, but one should never be surprised at the surprising nature of nature.
Anyway, that's not the type of problem people working with duck typing ever have. How come? Well it's simple: they usually limit the interactions of their objects to a certain context, where the objects being dealt with are such that if any one of them quacks like a duck, then it is a duck. And so here we in essence have the reason for the criticism: In order for duck typing to work, one has to limit the context, one has to limit the objects manipulated by the program, in such a way that the duck typing falls out right. Enlarge the context, and at some point you will find objects that don't fit the presuppositions of your code. So: for simple semantic reasons, those programs won't scale. The more the code is mixed and meshed with other code, the more likely it is that an exception will turn up. The context in which the duck typing works is a hidden assumption, usually held in the head of the small group of developers working on the code.
A slightly different way of coming to the same conclusion, is to realize that these programming languages don't really do an analysis of the sound of quacking ducks. Nor do they look at objects and try to classify the way these are swimming. What they do is look at the name of the methods attached on an object, and then do a simple string comparison. If an object has the
swim method, they will assume that
swim stands for the same type of thing that ducks do. Now of course it is well established that natural language is ambiguous and hence very context dependent. The methods names gain their meaning from their association to english words, which are ambiguous. There may for example be a method named
swim, where those letters stand for the acronym "See What I Mean". That method may return a link to some page on the web that describes the subject of the method in more detail, and have no relation to water activities. Calling that method in expectation of a sound will lead to some unexpected results
But once more, this is not a problem duck typing programs usually have. Programmers developing in those languages will be careful to limit the execution of the program to only deal with objects where
swim stand for the things ducks do. But it does not take much for that presupposition to fail. Extend the context somewhat by loading some foreign code, and at some point these presuppositions will break down and nasty difficult to locate bugs will surface. Once again, the criticism of duck typing not being scalable is perfectly valid.
So what is the solution? Well it requires one very simple step: one has to use identifiers that are context free. If you can use identifiers for swimming that are universal, then they will alway mean the same thing, and so the problem of ambiguity will never surface. Universal identifiers? Oh yes, we have those: they are called URIs.
Here is an example. Let us
- name the class of ducks
<http://a.com/Duck> a owl:Class; rdfs:subClassOf <http://a.com/Bird>; rdfs:comment "The class of ducks, those living things that waddle around in ponds" .
- name the relation
<http://a.com/swimming>which relates a thing to the time it is swimming
<http://a.com/swimming> a owl:DatatypeProperty; rdfs:domain <http://a.com/Animal> ; rdfs:range xsd:dateTime .
- name the relation
<http://a.com/quacking>which relates a thing to the time it is quacking (like a duck)
<http://a.com/quacking> a owl:DatatypeProperty; rdfs:domain <http://a.com/Duck> ; rdfs:range xsd:dateTime .
- state that an duck is an animal
<http://a.com/Duck> rdfs:subClassOf <http://a.com/Animal> .
:d1 <http://a.com/quacking> "2007-05-25T16:43:02"\^\^xsd:dateTime .then you know that :d1 is a duck ( or that the relation is false, but that is another matter ), and this will be true whatever the context you find the relation in. You know this because the url
http://a.com/quackingalways refers to the same relation, and that relation was defined as linking ducks to times.
Furthermore notice how you may conclude many more things from the above statement. Perhaps you have an ontology of animals written in OWL, that states that Ducks are one of those animals that always has two parents. Given that, you would be able to conclude that
:d1has two parents, even if you don't know which they are. Animals are physical beings, you may discover by clicking on the
http://a.com/AnimalURL, and in particular one of those physical things that always has a location. It would therefore be quite correct to query for the location of
You can get to know a lot of things with just one simple statement. In fact with the semantic web, what that single statement tells you gets richer and richer the more you know. The wider the context of your knowledge the more you know when someone tells you something, since you can use inferencing to deduce all the things you have not been told. The more things you know, the easier it is to make inferences (see Metcalf's law).
In conclusion, duck typing is done right on the semantic web. You don't have to know everything about something to work with what you have, and the more you know the more you can do with the information given to you. You can have duck typing and scale.