Wednesday Jul 01, 2009

Run an HPC Cluster...On your Laptop

With one free download, you can now turn your laptop into a virtual three-node HPC cluster that can be used to develop and run HPC applications, including MPI apps. We've created a pre-configured virtual machine that includes all the components you need:

• Sun Studio C, C++, and Fortran compilers with performance analysis, debugging tools, and a high-performance math library
• Sun HPC ClusterTools -- MPI and runtime based on Open MPI
• Sun Grid Engine -- distributed resource management and cloud connectivity

Inside the virtual machine, we use OpenSolaris 2009.06, the latest release of OpenSolaris, to create a virtual cluster using Solaris zones technology, and we have pre-configured Sun Grid Engine to manage it so you don't need to. MPI is ready to go as well; we've configured everything in advance.

If you haven't tried OpenSolaris before, this will also give you a chance to play with ZFS, with DTrace, with Time Slider (like Apple's Time Machine, but without the external disk) and a host of other cool new OpenSolaris capabilities.

For full details on Sun HPC Software, Developer Edition for OpenSolaris, check out the wiki.

To download the virtual image for VMware, go here. (VirtualBox image coming soon.)

If you have comments or questions, send us a note at

Thursday Dec 18, 2008

Beta Testers Wanted: Sun Grid Engine 6.2 Update 2

A busy day for fresh HPC bits, apparently...

The Sun Grid Engine team is looking for experienced SGE users interested in taking their latest Update release for a test drive. The Update includes bug fixes as well as some new features. Two features in particular caught my eye: a new GUI-based installer and optimizations to support very large Linux clusters (think TACC Ranger).

Full details are below in the official call for beta testers. The beta program will run until February 2nd, 2009. Look no further for something to do during the upcoming holiday season. :-)

Sun Grid Engine 6.2 Update 2 Beta (SGE 6.2u2beta) Program

This README contains important information about the target audience of this beta release, its new functionality, the duration of the SGE beta program, and how to get support and provide feedback.

  1. Audience of this beta program
  2. Duration of the beta program and release date
  3. New functionality delivered with this release
  4. Installing SGE 6.2u2beta in parallel to a production cluster
  5. Beta program feedback and evaluation support

  1. Audience of this beta program

    This Beta is intended for users who already have experience with the Sun Grid Engine software or with DRM (Distributed Resource Management) systems from other vendors. This beta adds new features to the SGE 6.2 software. Users who are new to DRM systems or who are seeking a production-ready release should use the Sun Grid Engine 6.2 Update 1 (SGE 6.2u1) release, which is available from here.

    For the shipping SGE 6.2u1 release we are offering free 30-day evaluation support by email.

  2. Duration of the Beta program and release date

    This beta program lasts until Monday, February 2, 2009. The final release of Sun Grid Engine 6.2 Update 2 is planned for March 2009.

  3. New functionality delivered with this release

    Sun Grid Engine 6.2 Update 2 (SGE 6.2u2) is a feature update release for SGE 6.2 that adds the following new functionality to the product:

    • a GUI-based installer that helps new users install the software more easily. It complements the existing CLI-based installation routine.
    • new support for 32-bit and 64-bit editions of Microsoft Windows Vista (Enterprise and Ultimate Edition), Windows Server 2003R2 and Windows Server 2008.
    • a client- and server-side Job Submission Verifier (JSV) that allows an administrator to control, enforce and adjust job requests, including rejecting jobs. JSV scripts can be written in any scripting language, e.g. Unix shells, Perl or Tcl.
    • consumable resource attributes can now be requested per job. This makes resource requests for parallel jobs much easier to define, especially when using slot ranges.
    • on Linux, the use of the 'jemalloc' malloc library improves performance and reduces memory requirements.
    • the use of the poll(2) system call instead of select(2) on Linux systems improves scalability of qmaster in very large clusters.
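
To illustrate the last item: select(2) can only watch file descriptors numbered below FD_SETSIZE (commonly 1024), while poll(2) takes an arbitrary list of descriptors, which matters once a single daemon holds thousands of connections. The following is a minimal sketch of waiting on sockets with poll(2) through Python's standard select module. It is not qmaster code; the helper name wait_readable and the socket pairs standing in for client connections are assumptions made for illustration only.

```python
import select
import socket


def wait_readable(socks, timeout_ms=1000):
    """Return the sockets in `socks` that have data ready to read.

    Hypothetical helper: uses poll(2), so it is not bound by the
    FD_SETSIZE limit that applies to select(2).
    """
    poller = select.poll()
    by_fd = {}
    for s in socks:
        # Register each descriptor for "data available to read" events.
        poller.register(s.fileno(), select.POLLIN)
        by_fd[s.fileno()] = s
    # poll() returns a list of (fd, event_mask) pairs for ready descriptors.
    ready = poller.poll(timeout_ms)
    return [by_fd[fd] for fd, events in ready if events & select.POLLIN]


if __name__ == "__main__":
    # Two local socket pairs stand in for two client connections.
    a, b = socket.socketpair()
    c, d = socket.socketpair()
    b.sendall(b"report")  # only the first "connection" has pending data
    print([s is a for s in wait_readable([a, c], timeout_ms=100)])
    for s in (a, b, c, d):
        s.close()
```

The same loop written with select(2) would fail as soon as any descriptor number reached FD_SETSIZE, regardless of how many descriptors were actually being watched.
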

  4. Installing SGE 6.2u2 in parallel to a production cluster

    As with every SGE release, it is safe to install multiple Grid Engine clusters running multiple versions in parallel if all of the following settings are different:

    • directory
    • ports (environment variables) for qmaster and execution daemons
    • unique "cluster name" - starting with SGE 6.2, the cluster name is appended to the names of the system-wide startup scripts
    • group id range ("gid_range")
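
The checklist above can be sketched as a small pre-installation check. This is only an illustration under stated assumptions: the ClusterSettings type, the conflicts function, and all values are hypothetical, and treating any overlap between two group id ranges as a conflict is my reading of "different", not something the README spells out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ClusterSettings:
    """Hypothetical record of the four settings that must differ."""
    directory: str              # installation directory
    qmaster_port: int           # port used by qmaster
    execd_port: int             # port used by the execution daemons
    cluster_name: str
    gid_range: Tuple[int, int]  # (low, high), inclusive


def conflicts(a: ClusterSettings, b: ClusterSettings) -> List[str]:
    """Return the names of the settings that two clusters share."""
    problems = []
    if a.directory == b.directory:
        problems.append("directory")
    # Any port reused across the two clusters counts as a clash.
    if {a.qmaster_port, a.execd_port} & {b.qmaster_port, b.execd_port}:
        problems.append("ports")
    if a.cluster_name == b.cluster_name:
        problems.append("cluster name")
    # Assumption: overlapping ranges conflict, not just identical ones.
    if a.gid_range[0] <= b.gid_range[1] and b.gid_range[0] <= a.gid_range[1]:
        problems.append("gid_range")
    return problems


if __name__ == "__main__":
    # Illustrative values only.
    prod = ClusterSettings("/opt/sge62u1", 6444, 6445, "prod", (20000, 20100))
    beta = ClusterSettings("/opt/sge62u2beta", 7444, 7445, "beta", (21000, 21100))
    print(conflicts(prod, beta))
```

An empty result means the two installations differ in all four settings; any returned name points at a setting to change before installing the beta.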

    Starting with SGE 6.2, the Accounting and Reporting Console (ARCo) accepts reporting data from multiple Sun Grid Engine clusters. If you follow the installation directions for ARCo and use a unique cluster name for this beta release, there is no risk of losing or mixing reporting data from multiple SGE clusters.

  5. Beta Program Feedback and Evaluation Support

    We welcome your feedback and questions on this Beta. We ask you to restrict your questions to this Beta release only. If you need general evaluation support for the Sun Grid Engine software, please subscribe to the free evaluation support by downloading and using the shipping version of SGE 6.2 Update 1.

    The following email aliases are available:

Wednesday May 14, 2008

Growing Flowers with Datacenter Heat

The Open Source Grid and Cluster Conference is being held this week in Oakland, California. I attended the first day of the conference before flying home to meet a personal commitment. My favorite talk of the day was Paul Brenner's presentation titled Grid Heating: Dynamic Thermal Allocation via Grid Engine Tools.

Brenner, who works as a scientist in the University of Notre Dame's Center for Research Computing, is exploring innovative ways to exploit the waste heat generated by HPC and other datacenters via partnerships with various municipal entities in the South Bend area. His first prototype, currently in progress, involves placing a rack of HPC compute nodes at a local municipal greenhouse, the South Bend Greenhouse and Botanical Garden.

The greenhouse had recently been forced to close a portion of its facility due to high natural gas heating costs. Brenner wondered if he could help. Since current datacenters can be viewed as massive electricity-to-heat converters (with a computational byproduct), it seemed there might be an opportunity to exploit the waste heat in some useful way. But transferring heat, especially low-grade waste heat, over distances is very inefficient. Was there a way to overcome this barrier?

Enter grid computing with its ability to harness remotely located compute resources. If Brenner couldn't transport the heat to the greenhouse, why not place the datacenter at the greenhouse? The garden gets the heat and Notre Dame gets the compute resources via established grid computing capabilities like Sun's Grid Engine distributed resource manager, which is already in use at Notre Dame. Cool idea? Hot idea!

Based on early prototype work, which involves placing a single rack in the greenhouse, the idea looks like a promising way to reduce natural gas heating requirements for the facility. Brenner has shown he can use grid scheduling software to deliver a desired temperature (within a range, of course) by simply adding or throttling compute jobs on the greenhouse cluster, which communicates with Notre Dame via a wide-area wireless broadband connection.
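
To make the idea concrete, here is a rough sketch of what "adding or throttling compute jobs" to hold a temperature range could look like as a control rule. This is not Brenner's implementation, which I have not seen; the function name, the one-job-at-a-time step, and the thresholds are all hypothetical.

```python
def target_running_jobs(temp_c, running, low_c, high_c, max_jobs):
    """Return how many jobs should run next, given the current temperature.

    Hypothetical rule: below the range, admit one more job (more heat);
    above it, hold one back (less heat); inside it, change nothing.
    """
    if temp_c < low_c:
        return min(running + 1, max_jobs)
    if temp_c > high_c:
        return max(running - 1, 0)
    return running


if __name__ == "__main__":
    # Illustrative numbers only: keep the room between 18 and 24 degrees C.
    for temp in (15.0, 21.0, 27.0):
        print(temp, target_running_jobs(temp, running=10, low_c=18.0,
                                        high_c=24.0, max_jobs=40))
```

The gap between the two thresholds keeps the scheduler from flapping between adding and removing jobs when the temperature hovers near a single set point.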

He has looked at humidity issues and so far they don't seem to be a problem given the ranges supported by typical compute gear. And he points out that while the greenhouse environment does not offer the highly filtered environment of a controlled datacenter, the particulate tolerance for typical compute gear is far in excess of EPA guidelines for people.

Phase II will involve placing three full racks of gear at the greenhouse to significantly reduce heating costs. Notre Dame will pay the electrical costs and use the compute resources. The city saves money on heating.

While the greenhouse is an interesting experiment, it is not ideal since its heating requirements will fluctuate seasonally. There are, however, other installations that have constant heating requirements--for example, hospitals have a 24x7 need for hot water. Sites like this could be interesting for future deployments.

Brenner's full presentation is available [PDF].


Josh Simons

