By Melvin Koh on May 05, 2008
Apache Hadoop is gaining a lot of attention in the web community, with notable support from Yahoo. It has a distributed filesystem and supports data-intensive distributed applications using the MapReduce computational model. It has been viewed as an important piece of the puzzle in cloud computing, but it can also be very useful for data-mining applications. I think it won't be long before it catches attention in HPC, if it hasn't already. With its high scalability and fault-tolerant nature, I think it has a lot of uses in HPC. Given its data-intensive nature, I wonder whether there is any value in using Hadoop with Lustre. If anyone has any insight into the I/O characteristics, I'll be glad to hear about it.