Wednesday Jan 13, 2010

Faviki: social bookmarking for 2010


Faviki is, simply put, the next-generation social bookmarking service. "A bookmarking service? You must be kidding?!" I can hear you say in worried exasperation. "How can one innovate in that space?" Not only is it possible to innovate here, but let me explain why I moved all my bookmarks from delicious over to faviki.

Like delicious, digg, twitter and others, Faviki uses crowd sourcing to allow one to share interesting web pages one has found, stay up to date on a specific topic of interest, and keep one's bookmarks synchronized across computers. So there is nothing new at that level. If you know those services, you won't be disoriented.

What is new is that instead of this being one crowd sourced application, it is in fact two. It builds on wikipedia to help you tag your content intelligently with concepts taken from dbpedia. Instead of tagging with strings the meaning of which you only understand at that time, you can have tags that make sense, backed by a real evolving encyclopedia. Sounds simple? Don't be deceived: there is a huge potential in this.

Let us start with the basics: what is tagging for? It is there to help us find information again, to categorize our resources into groups so that we can find them again in the rapidly increasing information space. I now have close to ten years of bookmarks saved away. As a result I can no longer remember what strings I used previously to tag certain categories of resources. Was it "hadopi", "paranoia", "social web", "socialweb", "web", "security", "politics", "zensursula", "bigbrother", "1984", ...? If I tag a document about a city, should I tag it "Munich", "München", "capital", "Bavaria", "Germany", "town", "agglomeration", "urbanism", "living", ...? As time passed I found it necessary to add more and more tags to my bookmarks, hoping that I would be able to find a resource again in the future by accidentally choosing one of those tags. But clearly that is not the solution. Furthermore, any of those tags could be used very differently by other people on delicious. Crowd sourcing only partially works, because there is no clear understanding of what is meant by a tag, and there is no space to discuss that. Is "bank" the bank of a river, or the bank you put money in? Wikipedia has a disambiguation page for this, which took some time to put together. No such mechanism exists on delicious.

Faviki neatly solves this problem by using the work done by another crowd sourced application, and allowing you to tag your entries with concepts taken from there. Before you tag a page, Faviki finds some possible dbpedia concepts that could fit the content of the page to tag. When you then choose the tags, the definition from wikipedia is made visible so that you can choose which meaning of the tag you want to use. Finally when you tag, you don't tag with a string, but with a URI: the DBPedia URI for that concept. Now you can always go back and check the detailed meaning of your tags.

But that is just the beginning of the neatness of this system. Imagine you tag a page with (the user does not see this URL of course!). Then by using the growing linked data cloud Faviki or other services will be able to start doing some very interesting inferencing on this data. So since the above resource is known to be a town, a capital, to be in Germany which is in Europe, to have more than half a million inhabitants, to be along a certain river, that contains certain museums, to have different names in a number of other languages, to be related in certain ways to certain famous people (such as the current Pope)... it will be possible to improve the service to allow you to search for things in a much more generic way: you could search by asking Faviki for resources that were tagged with some European Town and the concept Art. If you are searching for "München" Faviki will be able to enlarge the search to Munich, since they will be known to be tags for the same city...
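To make the idea concrete, here is a minimal sketch in Python of what such concept-based search could look like. Everything in it is an assumption made for illustration: the `ex:` identifiers stand in for DBpedia URIs, the tiny hand-written tables stand in for the linked data cloud, and the function names are hypothetical. It says nothing about how Faviki itself is implemented.

```python
# Toy stand-ins for linked data; all identifiers are hypothetical.
LABELS = {                      # concept -> names in different languages
    "ex:Munich": {"Munich", "München"},
    "ex:Paris": {"Paris"},
    "ex:Kyoto": {"Kyoto"},
    "ex:Art": {"Art", "Kunst"},
}
TYPES = {                       # concept -> kinds of thing it is
    "ex:Munich": {"ex:Town"},
    "ex:Paris": {"ex:Town"},
    "ex:Kyoto": {"ex:Town"},
}
LOCATED_IN = {                  # concept -> enclosing region
    "ex:Munich": "ex:Germany", "ex:Germany": "ex:Europe",
    "ex:Paris": "ex:France", "ex:France": "ex:Europe",
    "ex:Kyoto": "ex:Japan", "ex:Japan": "ex:Asia",
}
BOOKMARKS = {                   # bookmarked page -> concept tags
    "http://example.org/pinakothek": {"ex:Munich", "ex:Art"},
    "http://example.org/louvre": {"ex:Paris", "ex:Art"},
    "http://example.org/kyoto-temples": {"ex:Kyoto", "ex:Art"},
    "http://example.org/oktoberfest": {"ex:Munich"},
}


def concept_for(label):
    """Map a name in any known language to its concept, or None."""
    for concept, names in LABELS.items():
        if label in names:
            return concept
    return None


def is_within(concept, region):
    """Follow the LOCATED_IN chain upwards, looking for the region."""
    seen = set()
    while concept in LOCATED_IN and concept not in seen:
        seen.add(concept)
        concept = LOCATED_IN[concept]
        if concept == region:
            return True
    return False


def towns_in(region):
    """All concepts typed as a town that lie inside the region."""
    return {c for c, kinds in TYPES.items()
            if "ex:Town" in kinds and is_within(c, region)}


def bookmarks_tagged(any_of, all_of=frozenset()):
    """Pages carrying at least one tag from any_of and every tag in all_of."""
    return sorted(url for url, tags in BOOKMARKS.items()
                  if tags & set(any_of) and set(all_of) <= tags)


if __name__ == "__main__":
    # "some European town" combined with the concept Art
    print(bookmarks_tagged(towns_in("ex:Europe"), {"ex:Art"}))
    # searching for "München" also finds pages tagged via "Munich"
    print(bookmarks_tagged({concept_for("München")}))
```

Because the tags are concepts rather than strings, the generic query and the cross-language query both fall out of a simple lookup; with plain string tags neither would be possible without guessing.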

I will leave it as an exercise to the reader to think about other interesting ways to use this structured information to make finding resources easier. Here is an image of the state of the linked data cloud 6 months ago to stimulate your thinking :-)


But think about it the other way now. Not only are you helping your future self find information bookmarked semantically - let's use the term now - you are also making that information clearly available to wikipedia editors in the future. Consider for example the article "Lateralization of Brain Function" on wikipedia. The Faviki page on that subject is going to be a really interesting place to look to find good articles on the subject appearing on the web. So with Faviki you don't have to work directly on wikipedia to participate. You just need to tag your resources carefully!

Finally I am particularly pleased by Faviki, because it is exactly the service I described on this blog 3 years ago in my post Search, Tagging and Wikis, at the time when the folksonomy meme was in full swing, threatening according to its fiercest proponents to put the semantic web enterprise into the dustbin of history.

Try out Faviki, and see who makes more sense.


Wednesday Apr 29, 2009

Adding twitter to my blog using Scala

Having added javascript widgets to my blog a few months ago, I found that this slowed the page downloads a lot. Here is a way to speed this up again, by pre-processing the work with a Scala script, and using iFrames to include the result.

Here are the short steps to do this:

  1. I wrote a Scala Program (see source) to take the twitter Atom feed, and generate xhtml from it.
  2. I wrote a shell script to run the compiled scala jar
    export CP="$HOME/java/scala/lib/scala-library.jar:$HOME/java/scala/lib/learning.jar"
    /usr/bin/java -cp "$CP" learning.BlogIFrame "$@"
  3. Then I just started a cron job on my unix server to process the script every half an hour
    $ crontab -l
    5,36 * * * * $HOME/bin/ $HOME/htdocs/tmp/
  4. Finally I added the iFrame to my blog here pointing to the produced html <IFRAME src="" height="300" frameborder="0"></IFRAME>
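The transformation in step 1 can be sketched roughly as follows. This is not the original Scala program: it is a small Python illustration of the same idea, using only the standard library. The function name `atom_to_xhtml`, the sample feed and the `example.org` link are all assumptions, and a real script would read the feed from the network and write the result to the file served by the iFrame.

```python
import xml.etree.ElementTree as ET
from html import escape

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom 1.0 namespace


def atom_to_xhtml(atom_xml, limit=10):
    """Turn the first `limit` entries of an Atom feed into a small XHTML page."""
    feed = ET.fromstring(atom_xml)
    items = []
    for entry in feed.findall(ATOM + "entry")[:limit]:
        title = entry.findtext(ATOM + "title", default="")
        link = ""
        # A link without a rel attribute counts as rel="alternate".
        for el in entry.findall(ATOM + "link"):
            if el.get("rel", "alternate") == "alternate":
                link = el.get("href", "")
                break
        items.append('<li><a href="%s">%s</a></li>'
                     % (escape(link, quote=True), escape(title)))
    return ('<html xmlns="http://www.w3.org/1999/xhtml"><body><ul>\n'
            + "\n".join(items)
            + "\n</ul></body></html>")


# Hypothetical feed used only to exercise the function.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>updates</title>
  <entry>
    <title>Learning Scala &amp; Atom</title>
    <link rel="alternate" href="http://example.org/status/1"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    print(atom_to_xhtml(SAMPLE))
```

Generating static markup once per cron run is what moves the cost away from every page view: the browser only ever fetches a plain file.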

As a result there is a lot less load on the twitter server - it only has to serve one atom feed every half an hour instead of 1000 or so a day - and my html blog page does not stall if the twitter site itself is overloaded.

Also I learnt a lot about Scala by doing this little exercise.

Saturday Dec 13, 2008

Typealizer: analyzing your personality through your blog

Thanks to Mark Dixon I discovered Typealizer, a service that reads your blog and finds your psychological type. So of course I tried it on my own blog, as you will on yours shortly :-) . This is what it had to say:

INTJ - The Scientists

The long-range thinking and individualistic type. They are especially good at looking at almost anything and figuring out a way of improving it - often with a highly creative and imaginative touch. They are intellectually curious and daring, but might be physically hesitant to try new things.

The Scientists enjoy theoretical work that allows them to use their strong minds and bold creativity. Since they tend to be so abstract and theoretical in their communication they often have a problem communicating their visions to other people and need to learn patience and use concrete examples. Since they are extremely good at concentrating they often have no trouble working alone.

Well, that's not bad for flattery. So I reward them with this blog post.

They accompany their analysis with a brain activity diagram. This is the one I got:

Brain activity diagram for main blog

There is a lot in the cross section of intuition and thinking, with some but not a lot of positioning in the practical. So being all happily scientifical I decided to see what it would say if I pointed Typealizer to the Travel category on this blog. This is what it has to say on that aspect of my personality, which has perhaps, it is true, been a little in retreat recently.

ESTP - The Doers

The active and play-ful type. They are especially attuned to people and things around them and often full of energy, talking, joking and engaging in physical out-door activities.

The Doers are happiest with action-filled work which craves their full attention and focus. They might be very impulsive and more keen on starting something new than following it through. They might have a problem with sitting still or remaining inactive for any period of time.

This also came with a brain activity diagram for that part of the blog

So clearly a lot more biased towards action, as a travel blog should be.

Still both of these blogs are not allowing me to capture around half of my brain activity. The spiritual idealistic side is not very visible. I wonder if that means I should speak more about open source and linux? ;-) I tried the Art category of my blog but that did not move me more to the feeling type, nor did the philosophy section make me more idealistic, just again more of a thinker, which they characterise like this:

INTP - The Thinkers

The logical and analytical type. They are especially attuned to difficult creative and intellectual challenges and always look for something more complex to dig into. They are great at finding subtle connections between things and imagine far-reaching implications.

They enjoy working with complex things using a lot of concepts and imaginative models of reality. Since they are not very good at seeing and understanding the needs of other people, they might come across as arrogant, impatient and insensitive to people that need some time to understand what they are talking about.

Now what could be interesting would be some way to do the inverse search. Find out what your brain's activity diagram should look like, and ask to find blogs that fit those categories, which one could then use as a guide to help one develop that aspect of one's personality - or find a partner :-)

Ps. a thought: after categorizing people into 16 different groups this still leaves you with 8 billion people/16 = 500 million people to choose from, and if every person had just 1000 web pages that would leave you with half a trillion pages to look at. So this character analysis can be useful, but there still have to be a lot of other criteria to make a good judgement call.

PPS. Oddly enough - or not - Ken Wilber's blog is categorised as being of the "executive type".

Thursday Jan 03, 2008

Scoble gets thrown off Facebook


Scoble, who became very famous for getting blogging started at Microsoft, got ejected from Facebook for crawling his network of friends. This is the problem with closed social networks and data silos in general. He seems to think the solution is data portability. More than that: the solution is Open Social Networks. You should be able to use a simple web server and just link to your friends' friend of a friend (foaf) files, whichever service they are using, be it their own machine located in their basement, a service provider, a government owned machine, ... Just as I can link from this blog to any blog. This would allow people to own their piece of the network, like they can own their blogs.

This is what Beatnik, a friend of a friend browser, which I described in this email to the social network portability group, will make it easy for anyone to do.

Everyone is welcome to help on this open source project: artists, documenters, Swing experts, testers, RESTafarians, ...

Friday Nov 02, 2007

Vote for Java6 on Leopard!

As mentioned previously, a lot of Java developers on OSX are upset at Apple's silence as to its intentions with respect to the release of Java 6. There used to be a developer preview available, which was pulled recently with no indication as to when a replacement would be available. People like me who upgraded in the hope of having the latest and greatest - which we have been waiting for very patiently for over a year - are very disappointed. It creates all kinds of annoyances, like not being able to run Java Tutorial examples. Some who are working on Java 6 projects cannot easily use their computer to do their job without resorting to installing a separate OS in a virtual machine. We all like OSX: it's a beautiful, easy to use Unix that usually really helps us get our work done. I have been very happily using it since 2004.

The first solution of course is to have our voice heard. One way to do this is to file a bug with Apple. Please do this! The only problem I have with it is that as opposed to the Java bug database which is completely open, the Apple bug database is completely closed. So there's no real way of verifying how many people have posted a report. We must therefore complement that action with an equal Open action. Following the noble example given to us by Nova Spivack, when he asked for people to make their voice heard in support of the Burmese people and got some real results, let us do the same to help Apple make the right decision.
Anybody who would like to support this issue in the blogosphere, should help post a blog with the string


The first part of the string is the decimal notation for 0xCAFEBABE [1], the magic cookie for Java class files (thanks David for the number and the pointer to Fredericiana's photo). Then post similar instructions on your blog or point people here. Let's see how far this gets us! [2]

We should then be able to use any search engine, Google is a good choice, to search for this string [3], and hopefully motivate the managers at Apple to invest more time on Java and be more open about their plans with the community.

Your vote may also be an energizer to those groups that are starting to port the OpenJDK to OSX (via the mac java community).


  1. Oops I just noticed a mistake here. 13949712720901 in dec = 0xCAFEBABE405 in Hex. Even better. So that's CAFEBABE + the HTTP 405 Response, which means "Method Not Allowed". :-)
  2. If you know a foreign language then please translate the instructions and explanations so that more people can understand what is going on. Always post a link to some instructions. Language is a Virus, but it is most virulent when it is understandable and hyperlinked, of course.
  3. A search on Google Web returns more results - more than AllTheWeb or AltaVista - but Google Blog Search contains fewer duplicates. The real number of votes is somewhere between those two numbers, as some people are voting on their open source web sites, which are not always feed enabled. Simon is keeping count.
  4. Karussell is keeping a list of related articles.
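
The arithmetic in note 1 is easy to check. The following Python lines are only a verification sketch, and the variable names are made up for the occasion: they split the hexadecimal form of the number into the class file magic cookie and the three digits left over.

```python
vote_number = 13949712720901

# Upper-case hexadecimal digits of the number used in the vote string.
hex_digits = format(vote_number, "X")
print(hex_digits)                      # CAFEBABE405

# The first eight digits are the magic cookie, the rest are the leftover.
magic, leftover = hex_digits[:8], hex_digits[8:]
print(magic, leftover)                 # CAFEBABE 405

# The decimal notation of 0xCAFEBABE on its own is a different number.
print(int(magic, 16))                  # 3405691582
```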


Tuesday Nov 13: Landon Fuller has been able to get a very nice hello world GUI app running on OSX using the FreeBSD jdk1.6 port. It runs under X Windows only. Excellent work!

Nov 20th, 2007: Dave Dirbin publishes the first beta release of the open source java 6. This campaign has gathered 105 blog votes if we count the results from Google Blog Search, placing it easily among the top 10 bug reports at the Java Bug database. The Google web search returns 256 results, which will contain the blog search, many duplicate pages pointing to blogs + some extra votes people may have placed on the web. I guess that those extra votes may pop this bug report up to the top 5 position.

Wednesday Dec 19: Apple has put a developer preview of Java 6 up on Apple Developer Connection. It is nice to see things progress on that side. As a result of this conflict, Java development on OSX has become a lot richer, with an open source JDK starting to compete with the closed one from Apple. This can only be good for both, and for developer and customer confidence in the platform.

Saturday Sep 08, 2007

the limits of a free flickr account

It took some time, but I just hit one of the limits of my free flickr account. Here is the message I am seeing at the top of my account:

You've run into one of the limits of a free account. Your free account will only display the most recent 200 photos you've uploaded. All of your photos beyond 200 will remain hidden from view until you either delete newer photos, or upgrade to a Pro account.

None of your photos have been deleted, and if you upgrade, they'll all come back unharmed.

To get a pro account I need to shell out $24.95 per year. This is not unreasonable, but it is making me pause and think how much I want to continue with this service...

What are the alternatives? Well, I have my own server, and I am already paying for that service, so I might as well use it more fully and upload my pictures there. But pictures take up quite a lot of space, so this is not going to end up being cheaper: it will use up more space and force me at some point either to increase the space on my rented server or even to buy my own server. What would I get for that price? I would certainly get more control over my work, but at the cost of more work on my part. Publishing photos could be done simply with an improved BlogEd, which can push those photos to the file system. This would make publishing easy, but would not give me the interactive collaborative features of flickr, where people can directly add notes to the pictures, tag them, etc... Adding that functionality is not a huge amount of work, but it makes maintenance of the service more difficult, which means costs in developer time, which has to be paid somehow too...

What is the price of freedom? Owning your content at URLs you control may long term be worth quite a lot, a bet Tim Bray clearly is making, by hosting everything on his own server...

Thursday Apr 05, 2007

Scoble Gets the Semantic Web

Scoble just wrote a post "I finally get 'semantic' Web" after seeing a preview of what Radar Networks is about to unveil. I could have had a look a few months back when I was in SF, but I like to not have to keep secrets... Still for Scoble to change his mind like this is quite remarkable.

It is interesting to see, in the comments on his post, all the reasons people put forward to doubt that the semantic web could come about at all. The most vocal objection made recently is that it can't work because you just need one liar to muddy the waters. This was the gist of a pretty nasty Register article Tim Berners Lee goes postal on spam. But the point Tim makes is clear and easy to make, in a simple question and answer session:

  • Don't we all communicate?
  • Yes
  • Are there not many liars amongst us?
  • Too many
  • So how can we communicate then, if the above reasoning is correct? After all if it just needs a few liars to mess up communication, and there are many liars, then it follows that communication must be badly messed up. In fact what do you do when you encounter a liar?
  • When I recognise him I never talk to him again.
  • So that's what one does with spam web sites and junk news sources. One just avoids them, and all their friends.
The point is that liars are like software bugs: "Given enough eyeballs, all bugs are shallow", and the same is true of liars. Funnily enough, in many cases everyone knows who the liars are. That information travels quickly.

Why is it easy to uncover liars? Because what they say does not mesh. It does not mesh with the rest of reality. So the more information you have at your disposal, the more meshing of different points of view you do, the more the inconsistencies are brought to light, and the easier it is to pinpoint the source of the inconsistency. Trust is a very valuable resource. Those who squander it are doomed to eternal silence, as their words become meaningless and unheard by all.

Another noteworthy example of an up and coming, but public and visible, Semantic Web application is DBPedia, described at great length and with a lot of pictures by Michael K. Bergman in his Did You Blink? The Structured Web Just Arrived. If you don't want to wait for Radar to unveil their service, go check it out to get a taste of what's happening.

Or of course you could go to the Semantic Web Conference in San Jose coming up in May.

Thursday Feb 15, 2007

sorry for the updates!

Roller, the engine behind this weblog, does not give one the ability to make minor changes to a post. That is, every change, however minor - be it just adding a new tag to a post - changes the updated time stamp in the associated rss 1.0 or atom feed. For people reading this post from my official blog site this won't be noticeable at all. But those reading this with aggregators such as JNN or BlogBridge, or through web aggregators such as Planet RDF, will be forced either to never take account of updates and only order posts by created time stamps, or to suffer what may seem like spam behavior. So this is a plea for forgiveness from my Planet RDF readers.

To solve this problem Roller's editing window should have a small checkbox with the text "minor update" next to it. Ticking that checkbox would leave the updated time stamp as is, though the change could be noted by an edited time stamp. Note that having an edited time stamp is not at all necessary. But it would make a couple of things easier, such as helping clients to synchronize with minor edits on the server without disturbing run of the mill readers. The app:edited element was added to the Atom Publishing Protocol for just this reason, in fact.
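
The behaviour being asked for can be sketched in a few lines of Python. This is an illustration only, not Roller's code (Roller is a Java application) and not its data model: the `Entry` class, the `save_edit` function and the `minor` flag are hypothetical names standing in for the post, the save action and the proposed checkbox.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Entry:
    title: str
    updated: datetime  # the time stamp feed readers use to order posts
    edited: datetime   # the time stamp of the last change of any kind


def save_edit(entry, now, minor=False):
    """Record an edit; a minor edit leaves the updated time stamp alone."""
    entry.edited = now
    if not minor:
        entry.updated = now
    return entry


if __name__ == "__main__":
    posted = datetime(2007, 2, 15, 10, 0)
    post = Entry("sorry for the updates!", updated=posted, edited=posted)

    # Adding a tag with the checkbox ticked: readers are not disturbed.
    save_edit(post, datetime(2007, 2, 15, 11, 0), minor=True)
    print(post.updated == posted, post.edited > posted)

    # A substantial rewrite: the post moves up in aggregators again.
    save_edit(post, datetime(2007, 2, 16, 9, 0))
    print(post.updated)
```

A client that wants to stay in sync with the server would watch the edited time stamp, while an ordinary reader's aggregator keeps sorting on the updated one.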

Update: I have created feature enhancement request 1358 on the roller JIRA site. Please add your comments or support for this feature there. It should require only a very small amount of coding.



