Cameron, Meet Larry
By templedf on Mar 28, 2007
Last Friday, Oracle announced plans to acquire Tangosol, one of the three leading distributed data cache companies. (The other two being GigaSpaces and GemStone.) Oracle is now preaching about XTP, eXtreme Transaction Processing, which is their name for what Gartner calls E-OLTP, Extreme Online Transaction Processing.
E-OLTP is a special case of a compute grid. In a compute grid, work is matched to workers in an attempt to get the maximum resource utilization. The work-worker matching is done by some kind of scheduler that is essentially trying to solve the classic bin-packing problem. The bin-packing problem is NP-hard, which means no known algorithm solves it optimally in time that scales reasonably with the size of the problem. Translation: as the number of things to schedule and the number of places to schedule them go up, the time required to do the scheduling optimally gets rapidly out of hand. The Grid Engine scheduler is very highly tuned to be as scalable as possible, but even it can be overwhelmed with a large enough workload.
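To make the bin-packing connection a little more concrete, here is a minimal sketch of first-fit decreasing, a well-known greedy heuristic that gives up on the optimal answer in exchange for running quickly. It is only an illustration, not how Grid Engine or any other real scheduler works; the function name, the job sizes, and the single "capacity" per worker are all assumptions made up for the example.

```python
def first_fit_decreasing(job_sizes, capacity):
    """Greedily pack jobs onto workers of equal capacity (illustrative only)."""
    bins = []  # bins[i] holds the job sizes assigned to worker i
    free = []  # free[i] is the capacity still unused on worker i
    # Placing the biggest jobs first tends to leave fewer awkward gaps.
    for size in sorted(job_sizes, reverse=True):
        if size > capacity:
            raise ValueError(f"job of size {size} fits on no worker")
        for i, room in enumerate(free):
            if size <= room:
                # First worker with enough room wins; no search for the best fit.
                bins[i].append(size)
                free[i] -= size
                break
        else:
            # No existing worker had room, so bring another one into use.
            bins.append([size])
            free.append(capacity - size)
    return bins


# Hypothetical workload: six jobs, workers that can each hold 10 units.
jobs = [4, 8, 1, 4, 2, 1]
packed = first_fit_decreasing(jobs, capacity=10)
print(packed)
```

Even this shortcut sorts the jobs and scans the workers for every job, so the cost of deciding still grows with the workload. Finding the provably best packing is far more expensive than that.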
In some segments, such as the financial sector, the workload is very large. Moreover, the size of individual pieces of work is very small, making the scheduling overhead that much more significant. This is the land of E-OLTP. Because each job is so small, in the end it really doesn't make much difference where you schedule it. In a couple hundred milliseconds, it'll be done anyway. It turns out that in such an environment it's better to forgo the scheduling part altogether and just feed whatever work comes in to whatever worker can handle it.
Enter the distributed data cache. Instead of the traditional compute grid architecture where a centralized scheduler decides where work should go, the distributed data cache fills the role of a schedulerless work queue. As work comes in, it gets put in the queue. As workers finish what they're doing, they go to the queue and grab the next piece of work. The distributed data cache makes sure that none of the data is lost and that none of the workers step on each other's toes.
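The pull model described above can be sketched in a few lines. This is a single-process stand-in, assuming Python's thread-safe queue.Queue in place of a real distributed cache, so it shows only the "workers grab the next job" idea and none of the replication or failover a product would provide. The run_workers function, the worker names, and the squaring "transaction" are all hypothetical.

```python
import queue
import threading


def run_workers(jobs, num_workers=4):
    """Let idle workers pull jobs from a shared queue; nothing assigns work."""
    work = queue.Queue()
    for job in jobs:
        work.put(job)  # as work comes in, it goes on the queue

    results = {}
    results_lock = threading.Lock()

    def worker(name):
        while True:
            try:
                # The queue hands each job to exactly one worker.
                job_id, payload = work.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            outcome = payload * payload  # stand-in for a tiny transaction
            with results_lock:
                results[job_id] = (name, outcome)

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Hypothetical workload: 100 small jobs identified by number.
results = run_workers([(i, i) for i in range(100)])
print(len(results), "jobs completed")
```

Notice that nothing in the sketch decides which worker gets which job. Whichever worker is free next takes the next item, which is exactly the trade described above: a little lost optimality in exchange for no scheduling overhead.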
There are a variety of ways to achieve a distributed data cache. GigaSpaces does it through a JavaSpaces implementation. Tangosol does it through a special set of Collections classes. The interesting thing about that solution is that the developer doesn't really need to know that he or she is working with a distributed data cache. To the developer, it's just another Collections object to store data in. Tangosol transparently takes care of all of the complex issues of maintaining data availability, persistence, and coherency in a data grid.
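To illustrate the "it's just another collection" idea, here is a rough sketch in Python rather than Java, and it is emphatically not Tangosol's API. The PartitionedMap class is hypothetical: it implements the standard mapping interface while quietly spreading entries across several in-memory dictionaries that stand in for cluster nodes. Real products also handle replication, persistence, and coherency, none of which is attempted here.

```python
from collections.abc import MutableMapping
import zlib


class PartitionedMap(MutableMapping):
    """Looks like a dict to callers; stores each key on one of several 'nodes'."""

    def __init__(self, num_nodes=3):
        # Each plain dict stands in for the storage on one cluster member.
        self._nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, key):
        # A stable hash of the key picks the owning node.
        digest = zlib.crc32(repr(key).encode("utf-8"))
        return self._nodes[digest % len(self._nodes)]

    def __getitem__(self, key):
        return self._node_for(key)[key]

    def __setitem__(self, key, value):
        self._node_for(key)[key] = value

    def __delitem__(self, key):
        del self._node_for(key)[key]

    def __iter__(self):
        for node in self._nodes:
            yield from node

    def __len__(self):
        return sum(len(node) for node in self._nodes)


# The calling code reads exactly as it would with an ordinary dict.
cache = PartitionedMap()
cache["trade-1"] = {"symbol": "ORCL", "qty": 100}
cache["trade-2"] = {"symbol": "JAVA", "qty": 250}
print(len(cache), cache["trade-1"]["qty"])
```

The point is only that the partitioning lives behind a familiar interface, so application code doesn't change when the storage stops being local.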
I spoke with Cameron Purdy, the president of Tangosol and one of the founders, at SuperComputing last year. He seemed like a really nice guy, and he really believed in the Tangosol technology. He told me that when it comes down to hard numbers, Tangosol beats the pants off GigaSpaces and GemStone. Of course, I wouldn't have expected him to say anything different. I wish him the best of luck with his new best friend, Larry.