Running Mahout on Elastic MapReduce
By searchguy on May 01, 2009
The Apache Mahout project was started to build the algorithms described in the paper on the Hadoop MapReduce framework (the original paper describes running the algorithms on multicore processors.) They've also brought in the Taste Collaborative Filtering framework, which is interesting to us as recommendation folks. As it turns out, they had just released Mahout 0.1. around the time we were going to read the paper.
Coincidentally, Amazon had just announced their Elastic MapReduce (EMR) service that lets you run a MapReduce job on EC2 instances, so I decided to see what it would take to get Mahout running on EMR.
I didn't manage to get it running in time for the reading group, but one Mahout issue and a few "Oh, that's the way it works"es later, I had it running.
Apparently I'm the first person to have run Mahout on Elastic MapReduce, which just shows, as my father used to say, that brute force has an elegance all its own.
If you're interested the details are on the Mahout wiki.