Grid Engine: The World's First Cloud-aware Distributed Resource Manager
By user12601629 on Jan 13, 2010
Sun has just released a new version of Grid Engine. Grid Engine is a market-leading product in the Distributed Resource Management space, but this new release really brings the product to the next level. Specifically, it brings it up into the cloud!
So, what's so exciting about this release? There are a number of things, but I'll focus on two. The first is dynamic resource reallocation, including the ability to use on-demand resources from Amazon EC2. The second is deep integration with Apache Hadoop -- one of the most popular workloads in the cloud today.
A new feature in Grid Engine allows you to manage resources across logical clusters (or even clouds). These could be two collections of systems inside a corporation, or they could include non-local cloud resources (such as EC2). Why would you want to do this? Let's look at a scenario.
Many auto companies use Grid Engine to coordinate the resources on the Grid/Cluster/Cloud they use for mechanical design and simulation. Users across the company submit jobs (e.g. a crash simulation), and Grid Engine queues them and dispatches them based on priority and policy. But what happens when submissions start to outpace what your systems can handle? In the traditional model, you'd have to buy new hardware and add it to your Grid/Cluster/Cloud. With Grid Engine, you can now configure rules that allow you to "cloud burst" these workloads out to another cloud. With Amazon EC2 specifically, you pre-configure a set of AMIs on EC2 that contain your application software and register them with Grid Engine. You also give Grid Engine the credentials to manage your EC2 account. Then, based on your policy, Grid Engine will:
- Fire up new EC2 instances on demand (using your supplied AMIs)
- Automatically set up a secure VPN tunnel between your network and your EC2 instances
- Join them to the Grid Engine cluster
- Dispatch work to them
- Take them back down once demand has subsided
It's a great example of on-demand resource management, and it has the potential to save customers real money by avoiding over-provisioning of their internal clouds.
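To make the "based on your policy" step a bit more concrete, here is a minimal Python sketch of the kind of decision a burst policy might encode. It is not Grid Engine's actual configuration or API: the `BurstPolicy` fields, the function names, and the rule itself (one extra instance per fixed amount of backlog, up to a cap) are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class BurstPolicy:
    # Hypothetical knobs for illustration only; not Grid Engine settings.
    max_pending_per_host: int = 4   # backlog one host is expected to absorb
    max_cloud_hosts: int = 10       # ceiling on rented EC2 instances


def cloud_hosts_needed(pending_jobs: int, local_hosts: int,
                       policy: BurstPolicy) -> int:
    """Return how many cloud instances the current backlog calls for."""
    local_capacity = local_hosts * policy.max_pending_per_host
    overflow = max(0, pending_jobs - local_capacity)
    # Ceiling division: each extra host absorbs max_pending_per_host jobs.
    wanted = -(-overflow // policy.max_pending_per_host)
    return min(wanted, policy.max_cloud_hosts)


def plan_action(pending_jobs: int, local_hosts: int,
                running_cloud_hosts: int, policy: BurstPolicy):
    """Decide whether to start instances, stop instances, or do nothing."""
    target = cloud_hosts_needed(pending_jobs, local_hosts, policy)
    delta = target - running_cloud_hosts
    if delta > 0:
        return ("start", delta)   # demand exceeds capacity: burst out
    if delta < 0:
        return ("stop", -delta)   # demand has subsided: take instances down
    return ("hold", 0)
```

With the default policy and 10 local hosts, a backlog of 50 jobs overflows local capacity by 10 jobs, so `plan_action(50, 10, 0, BurstPolicy())` asks to start 3 instances. Once the backlog drops to 10 jobs, the same call with 3 instances running asks to stop all 3.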
The next thing that's really exciting is Grid Engine's new integration with Hadoop. Hadoop is a popular open-source implementation of Map-Reduce. Map-Reduce is the fundamental building block that powers the internal clouds at Yahoo and Google, and it's commonly used as a way to enable applications that can process huge collections of data.
While Hadoop has seen a large amount of deployment in the web space (at companies like Facebook and others), it's only starting to see adoption in the Enterprise. This new Grid Engine release can help change that. Grid Engine is now a key ingredient in making Hadoop enterprise ready. At a technical level, Hadoop applications can now be submitted to Grid Engine, just like any other kind of parallel computation job. This means you can now more easily share a single set of physical resources between Hadoop and other traditional applications (financial risk modeling, crash simulations, weather prediction, batch processing -- you name it). That means reduced cost to the customer. Beyond that, Grid Engine now has a deep understanding of Hadoop's distributed file system (HDFS), which means that Grid Engine can send work to the right part of the cluster (where the data lives locally) to make it ultra-efficient -- even when sharing. And lastly, Grid Engine has a mature usage accounting and billing feature (ARCo) built in. That means you can now track and (internally) charge back for Hadoop jobs -- giving IT a real way to interact with the business.
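The idea of sending work "where the data lives locally" can be illustrated with a small Python sketch. This is only a toy model of data-aware placement, not how Grid Engine or Hadoop implement it: the function name, the block-to-host mapping, and the host names are all hypothetical.

```python
from collections import Counter


def rank_hosts_by_locality(block_locations, free_hosts):
    """Order free hosts by how many of the job's input blocks they hold.

    block_locations: dict mapping a block id to the hosts storing a replica
                     (hypothetical input; a real system would ask the file system).
    free_hosts:      set of hosts that currently have a free slot.
    Returns the free hosts, most local blocks first, ties broken by name.
    """
    local_blocks = Counter()
    for hosts in block_locations.values():
        for host in set(hosts):          # count each replica host once per block
            if host in free_hosts:
                local_blocks[host] += 1
    return sorted(free_hosts, key=lambda h: (-local_blocks[h], h))


# Example: three input blocks spread over a shared cluster.
blocks = {
    "block-1": ["node1", "node2"],
    "block-2": ["node2", "node3"],
    "block-3": ["node2"],
}
print(rank_hosts_by_locality(blocks, {"node1", "node2"}))
```

Here `node2` holds a replica of all three blocks, so it is ranked ahead of `node1`. A host with no local data is still a valid choice and simply ranks last, which is what lets Hadoop jobs share the cluster with other workloads.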
There's a lot more to this release and you can read all about it over at Dan Templeton's blog, so I won't try to go into all the details. Let it suffice to say that I'm really excited about this release. Grid Engine has a future that makes it an increasingly important part of the infrastructure for Cloud Computing going forward.