Monday Jun 29, 2009

Sun HPC Software, Developer Edition 1.0 for OpenSolaris

Just a couple of days ago, Sun announced the Sun HPC Software, Developer Edition 1.0 for OpenSolaris. It's a fully featured HPC development environment that is distributed as a virtual machine (VM). It is based on OpenSolaris 2009.06 and includes the following, all pre-configured:

  • Sun Grid Engine 6.2u3
  • Sun HPC ClusterTools 8.1
  • Sun Studio 12u1
  • Sun Studio IDE Plugins
  • Examples of DTrace and performance analysis
  • Accounting and Reporting Console (ARCo)

The compelling feature for HPC developers is that this VM can add cluster resources from Amazon EC2 when the local simulated 3-node cluster is insufficient. See the architecture diagram here.

If you wish to give it a try on VirtualBox rather than on VMware, follow this blog post.

Additional links:

Download page
Wiki documentation
Release notes

Wednesday Jun 24, 2009

Sun Grid Engine 6.2u3 Released

Sun Grid Engine 6.2u3 is yet another product update. This means that, besides bug fixes, it introduces several new features. But there is another very important difference from previous releases: the license has changed!

Without a valid Sun Grid Engine license, evaluation use is permitted for only 90 days. If you want to use Grid Engine after that, you need to replace the Sun binaries with unsupported courtesy binaries. However, the courtesy binaries do not include the new Amazon EC2 adapter and the SGE Inspect modules.

New features in Sun Grid Engine 6.2u3:

Amazon EC2 Adapter

The Service Domain Manager (SDM) adds connectivity to Amazon Elastic Compute Cloud (EC2) and the ability to add execution hosts on demand.

Initial Power Saving Support

A new power saving scheme in SDM enables the creation of a special spare pool in which systems are powered on or off as they are added to or removed from the pool.

Service Domain Manager (SDM) Simple Install

It is now possible to install and run an SDM system with only one JVM per (managed or master) host. Previously, the system was using up to three separate JVMs per host. This new feature simplifies installation, configuration and maintenance.

SGE Inspect

A new Java-based Sun Grid Engine Inspect module lets you monitor SGE clusters and the Service Domain Manager (SDM). The similarity to VisualVM or NetBeans is not coincidental.

Exclusive Host Scheduling

Exclusive host scheduling allows users to request that jobs and parallel tasks run exclusively on a host if allowed by an administrator.

Microsoft Windows Vista Display Support

The display_win_gui feature is now fully supported. This feature allows a job to launch a GUI on the currently visible desktop on the Windows host that displays job information. This works only if the job is a native Windows application.

As always, you may test this release in parallel with your old cluster to try out the new features. Optionally, you may use the upgrade procedure (clone configuration method) to install SGE 6.2u3 while keeping all your other cluster configuration settings. See the upgrade video. While the video shows an upgrade from 6.1 to 6.2, the procedure works the same way for cloning any 6.0u2 or later release to 6.2u3.


Additional links:

Download page
Wiki documentation
Release notes
Fixed bugs
Patch matrix

Tuesday Jun 09, 2009

Grid Engine Workshop 2009 (Sep 7-10, 2009)

Grid Engine Workshop 2009 - Sun HPC SW Workshop has the same format as in previous years, i.e. 2.5 days of technology, user and solution provider presentations. It is augmented by a separate SGE Advanced Seminar prior to the workshop and by parallel workshop tracks about OpenStorage and Sun Developer Tools. It also builds on the very successful tradition of holding the workshop in the beautiful and charming city of Regensburg, Germany.

Register soon to take advantage of early bird fees and to secure your room at the conference hotel (limited capacity). Also submit your presentation proposals ASAP to apply for further discounts as a speaker and to make your work known to the Grid Engine community.

More information at http://hpcworkshop.com

Friday Apr 24, 2009

Sun Grid Engine 6.2u3 Beta Program Started

What's New in this 6.2u3 Beta Release

  • Sun Grid Engine Inspect - SGE Inspect, a new Monitoring and Configuration Console, lets you monitor Sun Grid Engine clusters and monitor and configure the Service Domain Manager (SDM).

  • Service Domain Manager (SDM) Cloud Adapter and Initial Power Saving Support - The new SDM service adapter interface adds support for managing external virtual resources. The implementation provides an interface to manage Amazon's EC2 AMIs. Those AMIs can include SGE execution hosts, which can be added to a local Sun Grid Engine cluster. The enhanced spare pool implementation can be configured to power cycle machines when they are added to or removed from the spare pool.

  • "Exclusive Job" Scheduling Enhancement - Jobs, including all parallel tasks of a job, can request exclusive scheduling on a host. This is useful for jobs that need resources available to only one job per host, and for jobs whose access patterns to hardware resources such as CPUs or memory make it sensible to run only one job per machine.
  • Complete Microsoft Windows Vista Support - A native Windows job can now open a GUI on a Windows Vista and Windows Server 2008 desktop (SGE 6.2u2 added support for those Windows operating systems).

This Beta lasts until Wednesday, 5/20/2009.

The following email aliases are available:

Technical support alias: sge-beta_at_sun_dot_com

Feedback on documentation: sge-beta-doc_at_sun_dot_com

General feedback: sge-beta-feedback_at_sun_dot_com

Sunday Mar 15, 2009

Using the New Installer in SGE 6.2u2

The screencast below demonstrates the capabilities of the new installer in SGE 6.2u2. See part 1 of this screencast for an example of how to configure the environment needed by the new installer.

Download for iPod

Preparing the Environment for the New Graphical Installer in SGE 6.2u2

I've prepared a screencast showing an example of how to prepare the environment needed by the new installer in Sun Grid Engine 6.2u2. You can see screenshots of the installer in my previous post.


Download for iPod

See part 2 of the screencast, which demonstrates the powerful capabilities of the new installer.

Thursday Mar 05, 2009

Sun Grid Engine 6.2 Update 2 Available

The new product update of Sun Grid Engine is now available. It's called simply Update 2, but in addition to many bug fixes, it includes the following new features:

GUI Installer

It allows an easy interactive installation of a whole cluster.

See more on the Installing With GUI Installer page.

Microsoft Windows Vista Support

New support for Microsoft Windows Vista Enterprise and Ultimate Edition, Windows Server 2003R2 and Windows Server 2008 (both 32-bit and 64-bit versions).

Job Submission Verifier (JSV)

JSVs allow users and administrators to define rules that determine which jobs are allowed to enter a cluster and which jobs should be rejected immediately. A JSV is a script or binary that can verify, modify, or reject a job, either on the client side at job submission time or on the master host.

See more on the Using Job Submission Verifiers wiki page or the jsv man page.
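To make the verify / modify / reject idea concrete, here is a minimal sketch of the kind of rule a JSV might encode. It is plain Python and does not use the real JSV script protocol or the helper libraries shipped with SGE; the job dictionary, its keys (`runtime_limit`, `slots`), the `Verdict` type and the slot cap are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    action: str        # "accept", "modify" (accepted with changes) or "reject"
    job: dict          # the job as it should enter the cluster
    reason: str = ""   # human-readable explanation


def verify(job: dict, max_slots: int = 64) -> Verdict:
    """Apply two illustrative site rules to a job description."""
    job = dict(job)  # work on a copy, leave the caller's job untouched

    # Rule 1 (hypothetical): jobs without a runtime limit are rejected.
    if "runtime_limit" not in job:
        return Verdict("reject", job, "no runtime limit requested")

    # Rule 2 (hypothetical): oversized slot requests are capped, not rejected.
    if job.get("slots", 1) > max_slots:
        job["slots"] = max_slots
        return Verdict("modify", job, f"slots capped at {max_slots}")

    return Verdict("accept", job)


if __name__ == "__main__":
    print(verify({"slots": 4}))
    print(verify({"slots": 128, "runtime_limit": 3600}))
    print(verify({"slots": 4, "runtime_limit": 3600}))
```

A real JSV would read the job parameters from SGE and report its decision back through the JSV interface; see the jsv man page for the actual commands.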

Consumable Resource per Job

Consumable complex attributes can now be configured as per job. Such consumables are consumed as requested and are no longer multiplied by the requested slots. This makes resource requests for parallel jobs much easier to define, especially when using slot ranges.

See more on the Defining Consumable Resources page (Multiplied Resource Requests Versus Non-Multiplied Resource Requests) or the complex man page (look for the consumable section).
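The difference between the two accounting modes is simple arithmetic. The sketch below only models that arithmetic; the function names and the "4 licenses, 2 to 8 slots" example are made up for illustration and are not SGE configuration syntax.

```python
from typing import Tuple


def consumed(request: int, slots: int, per_job: bool) -> int:
    """Amount of a consumable debited for one job.

    per_job=False models the classic behaviour: the request is
    multiplied by the number of granted slots.
    per_job=True models a per-job consumable: the request is debited once.
    """
    if slots < 1:
        raise ValueError("a job needs at least one slot")
    return request if per_job else request * slots


def consumed_for_range(request: int, slot_range: Tuple[int, int],
                       per_job: bool) -> Tuple[int, int]:
    """Minimum and maximum debit when the job asks for a slot range."""
    low, high = slot_range
    return consumed(request, low, per_job), consumed(request, high, per_job)


if __name__ == "__main__":
    # A parallel job requesting 4 licenses and a slot range of 2 to 8.
    print(consumed_for_range(4, (2, 8), per_job=False))  # (8, 32)
    print(consumed_for_range(4, (2, 8), per_job=True))   # (4, 4)
```

With a multiplied consumable the amount debited depends on how many slots the scheduler finally grants, so the user has to divide by a slot count that is not yet known. With a per-job consumable the request is the same for every point in the range.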

Jemalloc Library (on Linux x64)

Most Linux distributions ship a default memory allocator that is less efficient than jemalloc, the open source memory allocator library developed by Jason Evans for FreeBSD 7 and also used by Firefox 3. Sun Grid Engine 6.2 Update 2 replaces the native Linux malloc library with the jemalloc library on x64 platforms. This reduces the memory footprint by up to 20% and significantly improves master host performance in large, high-throughput Sun Grid Engine clusters on Linux x64.

As always, you may test this release in parallel with your old cluster to try out the new features. Optionally, you may use the upgrade procedure (clone configuration method) to install SGE 6.2u2 while keeping all your other cluster configuration settings. See the upgrade video. While the video shows an upgrade from 6.1 to 6.2, the procedure works the same way for cloning any 6.0u2 or later release to 6.2u2.


Additional links:

Download page
Wiki documentation
Release notes
Fixed bugs
Patch matrix

Friday Dec 19, 2008

Sun Grid Engine 6.2 Update 2 Beta Program Started

The SGE 6.2u2 beta program has just started and will last until Monday, February 2, 2009.

The beta program is intended for users who already have experience with the Sun Grid Engine software or DRM (Distributed Resource Management) systems of other vendors. This beta adds new features to the SGE 6.2 software.

Sun Grid Engine 6.2 Update 2 (SGE 6.2u2) is a feature update release for SGE 6.2 which adds the following new functionality to the product:

  • a GUI-based installer that helps new users install the software more easily. It complements the existing CLI-based installation routine
  • new support for 32-bit and 64-bit editions of Microsoft Windows Vista (Enterprise and Ultimate Edition), Windows Server 2003R2 and Windows Server 2008
  • a client and server side Job Submission Verifier (JSV) that allows an administrator to control, enforce and adjust job requests, including job rejection. JSV scripts can be written in any scripting language, e.g. Unix shells, Perl or Tcl
  • consumable resource attributes can now be requested per job. This makes resource requests for parallel jobs much easier to define, especially when using slot ranges
  • on Linux, the use of the 'jemalloc' malloc library improves performance and reduces memory requirements
  • the use of the poll(2) system call instead of select(2) on Linux systems improves scalability of qmaster in extremely large clusters
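As background for the last item: one well-known limitation of select(2) is that its descriptor sets are bounded by FD_SETSIZE (commonly 1024 on Linux), while poll(2) takes an array of descriptors with no such fixed bound. The sketch below is not qmaster code; it only shows the poll-style interface through Python's standard `select` module on a Unix-like system. The helper name `wait_readable` is made up for illustration.

```python
import select
import socket


def wait_readable(socks, timeout_ms=1000):
    """Return the sockets that have data to read, using poll(2).

    Unlike select(2), poll(2) is not limited to descriptors below
    FD_SETSIZE, so the same loop keeps working with many connections.
    """
    poller = select.poll()
    by_fd = {}
    for s in socks:
        poller.register(s.fileno(), select.POLLIN)
        by_fd[s.fileno()] = s
    return [by_fd[fd] for fd, event in poller.poll(timeout_ms)
            if event & select.POLLIN]


# A connected socket pair stands in for a client talking to a daemon.
server_end, client_end = socket.socketpair()
client_end.sendall(b"ping")
ready_fds = [s.fileno() for s in wait_readable([server_end])]
expected_fds = [server_end.fileno()]
message = server_end.recv(4)
server_end.close()
client_end.close()
```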

The SGE 6.2u2 Beta can be downloaded from http://www.sun.com/software/gridware/get_it.jsp.

As with every SGE release, it is safe to install multiple Grid Engine clusters running multiple versions in parallel if all of the following settings are different:

  • <sge_root> directory
  • ports (environment variables) for qmaster and execution daemons
  • unique "cluster name" - from SGE 6.2 the cluster name is appended to the name of the system wide startup scripts
  • group id range ("gid_range")
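The checklist above can be sketched as a small pre-installation check. This is only an illustration: the `ClusterSettings` type and `conflicts` function are hypothetical names, the example paths, ports and ranges are made-up values, and treating "different" group id ranges as "non-overlapping" is my assumption, not something the installer is documented to enforce in this form.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ClusterSettings:
    sge_root: str
    qmaster_port: int
    execd_port: int
    cluster_name: str
    gid_range: Tuple[int, int]  # (first, last), inclusive


def conflicts(a: ClusterSettings, b: ClusterSettings) -> List[str]:
    """List the settings that would clash if both clusters ran side by side."""
    problems = []
    if a.sge_root == b.sge_root:
        problems.append("same <sge_root> directory")
    if {a.qmaster_port, a.execd_port} & {b.qmaster_port, b.execd_port}:
        problems.append("shared qmaster/execd port")
    if a.cluster_name == b.cluster_name:
        problems.append("same cluster name")
    # Assumption: two ranges clash if they overlap at all.
    if a.gid_range[0] <= b.gid_range[1] and b.gid_range[0] <= a.gid_range[1]:
        problems.append("overlapping gid_range")
    return problems


if __name__ == "__main__":
    old = ClusterSettings("/opt/sge61", 6444, 6445, "prod", (20000, 20100))
    beta = ClusterSettings("/opt/sge62u2", 7444, 7445, "beta", (20050, 20150))
    print(conflicts(old, beta))  # ['overlapping gid_range']
```

An empty list means the four settings are distinct and the two clusters should be able to coexist on the same hosts.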

Optionally, you may use the upgrade procedure (clone configuration method) to install SGE 6.2u2 with the different settings mentioned above, while keeping all your other cluster configuration settings. See the upgrade video. While the video shows an upgrade from 6.1 to 6.2, the procedure works the same way for cloning 6.2(u1) to the 6.2u2 beta.

We welcome your feedback and questions on this Beta. Please restrict your questions to this Beta release. If you are looking for general evaluation support for the Sun Grid Engine software, please subscribe to the free evaluation support by downloading and using the shipping version, SGE 6.2 Update 1.

The following email aliases are available:

Technical support alias: sge-beta_at_sun_dot_com

Feedback on documentation: sge-beta-doc_at_sun_dot_com

General feedback: sge-beta-feedback_at_sun_dot_com

Sunday Aug 17, 2008

Upgrading to Sun Grid Engine 6.2

I've just posted a video showing how to upgrade to SGE 6.2 while keeping the old cluster running throughout the upgrade process and beyond.

Download for iPod

Monday Aug 04, 2008

Sun Grid Engine 6.2 Is Here

Sun Grid Engine 6.2 will be released tomorrow. Check out DanT's blog entry describing the new features.

Here's a quick list of new features in SGE 6.2:

  • Advance Reservation
  • Multi-Clustering with Service Domain Manager (SDM)
  • Scalability Improvements (Scheduler as a Thread, New Interactive Job Support, etc.)
  • Array Task Dependencies
  • Accounting and Reporting Console (ARCo) Improvements (Multi-Cluster support, DBwriter is now up to 10x faster)
  • Solaris Enhancements (Service Tags and SMF support)
  • New Upgrade Procedure

Since Dan has already described most of them, I'll just blog about the New Upgrade Procedure and SMF support today.

New Upgrade Procedure

The original upgrade procedure had many restrictions. The most troublesome in my opinion was that you couldn't simply leave the old cluster running and in parallel start a new SGE version with the same configuration. This and many other issues were solved by the new upgrade procedure.

You can now create a backup of the whole cluster configuration and later, at any time, restore it while the qmaster is running! The old upgrade procedure required shutting down the qmaster before the configuration could be loaded into the upgraded cluster.

Upgrading or updating to a newer version should now be easier than ever before (hello 6.0 users!). The complete description of the upgrade procedure can be found here.

SMF Support

Service Management Facility (SMF) was introduced with Solaris 10 and provides an alternative service management model to Run Control (RC) scripts.

It solves many problems; I'll list just my three favourites:

  • service dependencies (services can depend on each other)
  • service fail-overs (services can be automatically restarted on failure)
  • single place for all log files

With SMF, services on your system start up faster and are generally more reliable.

For Sun Grid Engine, we've introduced the following services:

  • qmaster service
  • shadowd service
  • execd service

If any of those gets killed or fails (e.g. dumps core), SMF will detect this and automatically restart the failing service. It basically reduces your cluster downtime for free.

SMF is now the installation default on all Solaris 10+ machines.

For more information, refer to:
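The restart-on-failure behaviour can be illustrated with a toy supervisor. This is not how SMF is implemented and it is no substitute for it; it is a minimal Python sketch, with a made-up `supervise` function and restart limit, of the idea that something watches a process and starts it again when it exits abnormally.

```python
import subprocess
import sys


def supervise(cmd, max_restarts=3):
    """Run cmd and restart it whenever it exits with a non-zero status.

    Returns (number_of_starts, last_exit_status). A process killed by a
    signal has a negative status in Python and is treated as a failure.
    """
    starts = 0
    while True:
        starts += 1
        status = subprocess.run(cmd).returncode
        if status == 0 or starts > max_restarts:
            return starts, status


if __name__ == "__main__":
    # A stand-in "daemon" that always fails: started once, restarted twice.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(failing, max_restarts=2))  # (3, 1)
```

The point of SMF is that the administrator gets this supervision, plus dependency handling and logging, from the operating system without writing any such loop.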

Installing SMF Services

Managing SMF Services


Sunday Jun 29, 2008

Managing Grid Engine SMF services

Hi guys, I just added a section about managing Grid Engine SMF services to wikis.sun.com. There's not much there now, but it's a start. Let me know what you would like to add.

About

Lubomir Petrik
