Tuesday Jan 18, 2011

Full Speed Ahead

Last week I had the opportunity to do a webcast with Moe Fardoost, our marketing director, on the future direction for the Oracle Grid Engine product. If you're curious about where Grid Engine is headed, take a look. For the very lazy among you, the summary is that we're focused on three major themes: core infrastructure and feature improvements, tighter integrations with other Oracle products, and a richer cloud feature set.

Thursday Dec 23, 2010

Oracle Grid Engine: Changes for a Bright Future at Oracle

For the past decade, Oracle Grid Engine has been helping thousands of customers marshal the enterprise technical computing processes at the heart of bringing their products to market. Many customers have achieved outstanding results with it via higher data center utilization and improved performance. The latest release of the product provides best-in-class capabilities for resource management, including Hadoop integration, topology-aware scheduling, and on-demand connectivity to the cloud.

Oracle Grid Engine has a rich history, from helping BMW Oracle Racing prepare for the America’s Cup to helping isolate and identify the genes associated with obesity; from analyzing and predicting the world's financial markets to producing the digital effects for the popular Harry Potter series of films. Since 2001, the Grid Engine open source project has made Oracle Grid Engine functionality available for free to open source users. The Grid Engine open source community has grown from a handful of users in 2001 into the strong, self-sustaining community that it is now.

Today, we are entering a new chapter in Oracle Grid Engine’s life. Oracle has been working with key members of the open source community to pass the torch for maintaining the open source code base to the Open Grid Scheduler project hosted on SourceForge. This transition will allow the Oracle Grid Engine engineering team to focus their efforts more directly on enhancing the product. In the coming days, we will take definitive steps to roll out this transition. To ensure ongoing communication with the open source community, we will provide the following services:

  • Upon the decommissioning of the current open source site on December 31st, 2010, we will begin to transition the information on the open source project to Oracle Technology Network’s home page for Oracle Grid Engine. This site will ultimately contain the resources currently available on the open source site, as well as a wealth of additional product resources.
  • The Oracle Grid Engine engineering team will be available to answer questions and provide guidance regarding the open source project and Oracle Grid Engine via the online product forum.
  • The Open Grid Scheduler project will carry on the tradition of the Grid Engine open source project. While the Open Grid Scheduler project will remain independent of the Oracle Grid Engine product, it will have the support of the Oracle team, including making available artifacts from the original Grid Engine open source project.

Oracle is committed to enhancing Oracle Grid Engine as a commercial product and has an exciting road map planned. In addition to developing new features and functionality to continue to improve the customer experience, we also plan to release game-changing integrations with several other Oracle products, including Oracle Enterprise Manager and Oracle Coherence. Also, as Oracle's cloud strategy unfolds, we expect that the Oracle Grid Engine product's role in the overall strategy will continue to grow. To discuss our general plans for the product, we would like to invite you to join us for a live webcast on Oracle Grid Engine’s new road map. Click here to register.

Thank you to everyone in the community for their support over the last decade and their continued support going forward!

Tuesday Nov 30, 2010

JARYBA Achieves Oracle Validated Integration and Announces Support for Oracle Grid Engine With SmartSuspend v2.0

I am very pleased to announce that we've signed up our first partner to the Oracle Validated Integration program for Oracle Grid Engine. Jaryba's SmartSuspend product is a clever way to allow jobs suspended by Grid Engine to release all of the resources they're holding, even memory and FLEXlm licenses. And it works without requiring any changes to the applications. You don't even have to recompile.

If you've ever run into the issue of running out of swap space because of preempted jobs holding onto their memory, SmartSuspend might be the answer you're looking for. It works by inserting itself between the application and the OS so that it can track the memory and license usage. When a job is suspended, SmartSuspend first uses its knowledge of the resources requested by the application to let all of those resources go. When the job is resumed, SmartSuspend first attempts to recapture those resources before allowing the application to run. From the application's perspective, nothing changes. From the administrator's perspective, the difference is huge.

Thursday Feb 11, 2010

Intro to Service Domain Manager

Let's take a break from the Sun Grid Engine 6.2u5 feature posts and talk about something that's been in the product since 6.2. (It's actually the foundation of two of the remaining three features, so consider this ground work for finishing my u5 features series.)

Service Domain Manager (or the open source Project Hedeby (formerly Project Haithabu)) is an add-on component for Sun Grid Engine that enables multiple clusters to share resources. It was designed to allow for services of all types to share resources with each other. The basic idea is this: each cluster has a set of performance metrics specified via service level objectives (SLOs). If at any point a cluster is in violation of its SLOs, it appeals to the SDM resource provider service for additional resources. The resource provider will look for resources wherever they're available: in spare resource pools, from cloud service providers, or from other less-loaded clusters. If resources are available, the resource provider will (re)assign the resources to the cluster in need. From the users' perspective, nothing really changes, except that the overloaded cluster is now feeling better. Let's get into a little more detail.

A Little More Detail

The resource provider is the heart and brain of SDM. Its job is to keep track of services and resources and adjust resource assignments as needed. At the level of the resource provider, everything is very abstract. It doesn't know (or care) what any of its managed services do, as long as they implement the required interface. It also doesn't care about the details of the resources it's managing, beyond the fact that there are details, and that the services it's managing may care about those details.

One other abstract concept that the resource provider understands is a need. When a service managed by the resource provider needs more resources, it tells the resource provider about its need. That need is expressed as a description of the desired resources to satisfy the need (including quantity) and how important the need is. For example, a managed service might say to the resource provider, "Hey! I want two OpenSolaris x86 resources with at least 4GB memory each. This need is critical to me continuing to service my users!" To satisfy this request, the resource provider will look around at the other services it's managing to see who could potentially give up the requested resources. Among those services there might be spare pools (basically just holding tanks for idle resources), cloud service providers (e.g. Amazon EC2), or other managed services. If the requested resources are free, they will be reassigned to the requesting service. With a spare pool, the decision is easy: any resources in the spare pool are fair game. Same for the cloud. With other services, though, it's not so simple. In general, if a service is still holding a resource, that's because it's still using it to some degree. How do we know when it's OK to take a resource away from a service? Well, the resource provider has a set of policies that govern the relative importance of the services. Using those policies, the resource provider will decide whether the importance of the requesting service plus the criticality of its need outweighs the importance of the potential donor service and how much it's using the resources in question. If, in the end, there are no resources that can reasonably be reassigned to the needy cluster, then the request stays pending and will be reevaluated later.

On the service side of things there is a service adapter. The job of the service adapter is to be the shim between the service itself and the resource provider. It implements the abstract service interface that the resource provider expects and translates the abstract concepts we just talked about into concrete artifacts understood by the service. In particular, it's up to the service adapter to define and implement the SLOs for the service. Why? Well, consider this use case. Imagine you have a cluster of application servers and a Sun Grid Engine cluster, and you want to share resources between them. The service level criteria will be very different between them, and it wouldn't make any sense to expect the resource provider to understand them all. Instead, it's more flexible and more scalable to allow the service adapters to manage the SLOs and only report the results (e.g. needs) to the resource provider.

Let's use the Sun Grid Engine adapter to illustrate how a service adapter works. Starting with 6.2, the Sun Grid Engine qmaster includes a JMX interface known as JGDI. (While JGDI is openly accessible, we don't really advertise it because it's not really abstract enough for public consumption.) The Sun Grid Engine service adapter uses the JGDI interface to monitor the state of the qmaster. The service adapter implements one unique policy: maximum number of pending jobs. (It actually inherits a couple of other policies from the service adapter SDK that are universally applicable, such as the minimum number of resources that should be assigned.) When the state of the cluster changes, the qmaster sends an event to the service adapter. The service adapter then checks the new cluster state against the SLOs that have been configured to see if any SLO has been violated. If an SLO has been violated, the SLO configuration specifies what kind of resource is needed to address the issue. For example, suppose there's an SLO that states that there should never be more than 100 pending Solaris x86 jobs. If the service adapter finds that the 101st Solaris x86 job is pending, it will appeal to the resource provider and request an additional Solaris x86 resource.

When the resource provider assigns a resource to the service, the service adapter is responsible for prepping the resource and adding it into the service. Now, here's the interesting part. After the new resource takes on its share of the workload and the service is happy again, we don't take the resource away. The resource stays with the service until someone else needs it more. Resources are shared, not leased. It is possible to configure SDM to behave in a fashion that is in effect leasing, but it's something you have to explicitly set up.

On the other side of the coin, when the resource provider is asked for a resource, it talks to the service adapters for the managed services to find out who has something that can be borrowed. The resource provider keeps a map of where all the resources are assigned, so it can immediately tell which services are currently holding resources that are candidates for reassignment. It then contacts those services' service adapters to find out whether the resources are in use. The service adapter's job is to look at the service and place a numerical value on how well the resources are being used by the service. Once the resource provider has collected the usage values for all the candidate resources, it applies policies (such as relative importance of the services) and picks the resources that seem most available. This process applies equally to services, spare pools*, and cloud service providers. (* There is a built-in spare pool in the resource provider that doesn't actually have its own service adapter, but it works as though it did.)

With the 6.2u5 release, we have two service adapter implementations. One is for the Sun Grid Engine software itself. The other is a generic cloud adapter that comes with integration scripts for use with Amazon EC2 and with IPMI power management. Out of the box, you can use SDM to manage Sun Grid Engine clusters and to resource those clusters on demand from EC2. You can also configure a spare pool* that powers down idle or underutilized machines. (* It's not technically a spare pool, but it behaves like one.) The intention is to add additional service adapter implementations as we uncover concrete demand for them. In addition, the original plan was to make the service adapter API clean, public, and well-documented. So far, it's fairly clean and fairly well documented, but only public insofar as the Hedeby Project is open source. If you have interest in seeing or (even better) developing a service adapter for a particular service, please do let us know, and we'll see what we can do to help.

Hopefully this overview gives you a pretty good idea of what SDM does and at least an inkling of how it does it. If not, let me know!

Wednesday Feb 03, 2010

Self Control

Good day, and welcome to week four of my continuing attempt to cover all the features added in the latest release (6.2u5) of Sun Grid Engine. This week we'll talk about array task throttling.

Sun Grid Engine supports four classes of jobs. Interactive jobs are the equivalent of doing an rsh/rlogin/ssh to a node in the cluster, except that the connection is managed by Sun Grid Engine. Batch jobs are your traditional "go run this somewhere" type of job. They represent a single instance of an executable. Parallel jobs consist of multiple processes working in collaboration. All of the processes need to be scheduled and running at the same time in order for the job to run. Parametric or array jobs are like what you see in Apache Hadoop, where multiple copies of the same executable are run across multiple nodes against different parts of the data set. The important characteristic that distinguishes array jobs from parallel jobs is that the tasks of an array job are completely independent from each other and hence do not need to all be running together.
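
To make the four classes concrete, here is roughly what submitting each one looks like (the "mpi" parallel environment name is only a placeholder; PE names are whatever the administrator has configured):

% qrsh                        # interactive job: a managed login on some node
% qsub myjob.sh               # batch job: run this script somewhere
% qsub -pe mpi 8 myjob.sh     # parallel job: 8 slots scheduled together
% qsub -t 1-1000 myjob.sh     # array job: 1000 independent tasks of one script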

The way that Sun Grid Engine processes array jobs is particularly efficient. In fact, a common trick to improve cluster throughput is to bundle many batch jobs together to be submitted as a single array job. Because array jobs are so efficient, users use lots of them, sometimes with huge task counts. There is no explicit limit on the number of tasks that an array job can contain. Hundreds of thousands of tasks in a single array job are not uncommon.

There is a problem, however. From the Sun Grid Engine scheduler's perspective, all of the tasks of an array job are equal. That means that if the highest priority job waiting to execute is an array job, then all of that job's tasks are higher priority than any other job (or task) waiting to run. If that job has a million tasks, then the cluster is going to have to process all million of those tasks before anything else will be executed. Now, the policies do come into play here, and if a higher priority job is submitted or if the array job loses priority through some policy (like the fair share policy), then it and its remaining tasks will fall back in the execution order. Nonetheless, this approach makes it possible for a user to unintentionally execute a denial of service attack on the cluster.

For quite some time there has been an option that an administrator can configure to set a limit on the maximum number of tasks that can be simultaneously executed from a single array job (max_aj_instances in sge_conf(5)). That solves the problem, but only in a very general and somewhat suboptimal way. As with any such global setting, the administrator has to make a trade-off between having a limit that works well for the majority and having a limit that doesn't unduly restrict certain users. (The default is 2000 tasks per array job.) Well, it turns out that, given the opportunity, most users will willingly set such a limit themselves, both to avoid being bonked on the head by the administrator for abusing the cluster and out of self-interest: a limit allows several of their array jobs to share cluster time rather than running sequentially. So, with 6.2u5, we've given users exactly that ability.

Let's look at an example:

% qsub -t 1-100000 myjob.sh

will submit an array job that will run the myjob.sh script one hundred thousand times. Each time it runs, an environment variable ($SGE_TASK_ID) will be set to tell that instance which task number it is. The myjob.sh script must be able to translate that task ID into a pointer to its portion of the data set. In a cluster with default settings, up to 2000 of the tasks of this job will be allowed to be running at a time. If the cluster only has 2000 slots, that could be a bad thing.
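
As an aside, here's a sketch of how myjob.sh might translate the task ID into its slice of the data (the file layout is invented purely for illustration):

#!/bin/sh
# Sun Grid Engine sets $SGE_TASK_ID for each task of an array job.
# Map the task ID to this task's input and output files.
INPUT=data/chunk.$SGE_TASK_ID
OUTPUT=results/chunk.$SGE_TASK_ID
./process $INPUT > $OUTPUT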

% qsub -t 1-100000 -tc 20 myjob.sh

submits the same job, except that it places a limit of 20 on the number of tasks allowed to be running simultaneously. In our fictitious 2000-slot cluster, that's a quite neighborly thing to do. If you try to set the limit above the global limit set by the administrator, the global limit prevails.
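
If you're curious what limit your administrator has configured, it's visible in the global cluster configuration:

% qconf -sconf | grep max_aj_instances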

While this feature is pretty simple, it can mean a large difference in job throughput for some clusters. I know one customer in particular that went way out of their way to implement this feature themselves using clever configuration tricks. The massive headache of hacking together a solution was worth it to them to be able to set per-job task limits.

Wednesday Jan 20, 2010

Topology-Aware Scheduling

Continuing in my feature deep dives, let's talk about topology-aware scheduling. Some applications have serious resource needs. Not only do they need raw CPU cores, but they also beat the snot out of the local cache or burn up the I/O channels. These sorts of applications don't play well with others. In fact, they often don't play well with themselves. For these applications, how the threads/processes are distributed across the CPUs makes a huge difference. If, for example, all the threads/processes have their own core but are all sharing a socket, they might end up fighting over cache space or I/O bandwidth. Depending on the CPU architecture, the conflicts may be more subtle, such as only the processes on specific groups of cores colliding. The price for making a bad choice of how to assign these applications to cores is poor performance, in some cases doubling the time to completion.

It's not just the powerhouse apps that care about CPU topology, though. Most operating systems will schedule processes and threads to execute on available cores rather willy-nilly, with no sense of core affinity. Because an average OS does context switches at a rather high frequency, an application may find itself executing on a different CPU and core every time it gets the chance to run. If that application makes any use of the CPU cache, for example, its performance will suffer for it. The performance might not suffer much, but the difference is usually measurable.

For these reasons, we've added topology-aware scheduling to Sun Grid Engine 6.2 update 5. With topology-aware scheduling, the user who submits the job can specify how that job should be laid out across a machine's CPUs. Users are allowed to specify three different flavors of distribution strategy: linear, striding, or explicit. In linear distribution, the execution daemon will place the job's threads/processes on consecutive cores if possible. If it can't fit the entire job on a single socket, it will span the job across sockets. The striding strategy tells the execution daemon to place the job on every nth core, e.g. every 4th core or every other core. The explicit strategy lets the user decide exactly which cores will be assigned to the job. Note that the core binding is a request, not a requirement. If for some reason the execution daemon can't fulfill the request, the job will still be executed; it just won't be bound.

In addition to the three binding strategies, there are also three possible binding mechanisms. You can either allow Sun Grid Engine to do the binding automatically as part of the job execution, or you can have Sun Grid Engine add the binding parameters to the machines file for OpenMPI jobs, or you can have Sun Grid Engine just describe the intended binding in an environment variable with the expectation that the job will bind itself based on that information. When the job is bound by Sun Grid Engine during execution, the job will be tied to specific CPU cores using an OS-specific system call. On Linux, the bound processors may be shared with other processes. On Solaris, the bound processors are used exclusively for the job. In either case, the job will only be allowed to execute on the bound processors.
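
If I have the syntax right (check the submit(1) man page to be sure), the mechanism is selected with an extra keyword between -binding and the strategy, with set being the default:

% qsub -binding set linear:2 job.sh   # execution daemon binds the job itself
% qsub -binding env linear:2 job.sh   # only sets $SGE_BINDING; the job binds itself
% qsub -binding pe linear:2 job.sh    # core list goes into the pe_hostfile (e.g. for OpenMPI)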

In order to allow users to tell what kinds of topologies are provided by the machines in the cluster, some new default complexes have been added that describe the socket/core/thread layouts of the machines. These new complexes can be used during job submission to request specific topologies, or they can be used with qhost to report what's available.
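
For example, the following should list the topology complexes for every host in the cluster (qconf -sc will show you the exact set of complex names defined on your installation):

% qhost -F m_socket,m_core,m_thread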

Let's look at a couple of examples (taken from the docs).

% qsub -binding linear:4 -l m_core=8 -l m_socket=2 -l arch=lx26-amd64 job.sh

This example will look for a machine with 8 cores and 2 sockets (i.e. dual-socket, quad-core) and try to bind to four consecutive cores. The execution daemon will try to put all four cores on the same socket, but if that's not possible, it will spread the job out over as many sockets as required (but as few as possible).

% qsub -binding striding:2:4 -l m_core=8 -l m_socket=2 -l arch=lx26-amd64 job.sh

This example will again look for a dual-socket, quad-core machine, but this time the job will occupy the third core on both sockets. (The first core is number 0.) If the third core on either socket is occupied, the job will not be bound.

% qsub -binding explicit:0,0:0,3:1,0:1,3 -l m_core=8 -l m_socket=2 -l arch=lx26-amd64 job.sh

This last example will yet again look for a dual-socket, quad-core machine. This time the job will be bound to the first and fourth cores on both sockets. Again, if any of those cores are already bound to another job, the job will not be bound.

It's clear that jobs that benefit from specific process placement with respect to CPU cores will perform much better in a 6.2u5 cluster, thanks to this new feature. Even for regular old run-of-the-mill jobs, though, submitting with -binding linear:1 should provide a small performance bump because it will keep them from being jostled around between context switches. In fact, I won't be surprised if 12 months from now I include adding that switch to the sge_request file in my top 10 list of best practices.
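
For the record, that would amount to adding a single line to the default request file, normally $SGE_ROOT/default/common/sge_request (with "default" replaced by your cell name):

-binding linear:1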

Wednesday Jan 06, 2010

Welcome Sun Grid Engine 6.2 update 5

The Sun Grid Engine 6.2 update 5 release is now available. Don't let the unassuming version number fool you; there are quite a few interesting features packed into this release. Let's talk about them, shall we?

Integration with Apache Hadoop

SGE 6.2u5 gets to claim the title of first workload manager with direct support for Apache Hadoop applications. What does that mean? First, it means that you can submit Hadoop applications to an SGE cluster just like you would any other parallel job. The cluster will take care of setting up the Hadoop jobtracker and tasktrackers for you. Second, it means that the SGE scheduler knows about the HDFS data locality such that it can route Hadoop jobs to nodes where the jobs' data already lives. The net result is that you can now realistically consolidate your Hadoop cluster into your SGE cluster, saving you time, money, and lots of headaches. See the docs for more info. [Also see my next post.]
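
Submission looks like any other parallel job. Assuming the integration's parallel environment is named "hadoop" (the name is up to the administrator), it would be something like:

% qsub -pe hadoop 16 my-hadoop-job.sh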

Topology-aware Scheduling

Many applications benefit greatly by being tied to specific CPU sockets and/or cores. For example, some cache-hungry applications will execute in half the time if run on four cores on different sockets versus running on four cores in the same socket. With SGE 6.2u5, we've added the ability to specify these topology preferences when submitting your jobs. Whenever possible, the scheduler will honor the topology preferences when assigning jobs to nodes. For topology-sensitive applications and clusters with lots of Nehalem boxes, SGE 6.2u5 can speed up application execution considerably. See the docs for more info. [Also see my follow-up post.]

Slotwise Subordination

The SGE preemption model is what I call "after-market preemption," meaning that it's not an inherent aspect of every cluster. You have to take preemption (AKA subordination) into account when designing your cluster layout. Prior to SGE 6.2u5, the preemption model was rather coarse-grained. SGE could only suspend an entire queue instance at a time, meaning that one high-priority job might be suspending two or four or sixteen or more lower-priority jobs. With SGE 6.2u5, we're introducing finer-grained preemption. Now, rather than declaring that just Queue A is subordinated to Queue B, you can say that between Queues A and B there shouldn't be more than 4 jobs running, and given a conflict, Queue B wins. This new finer-grained preemption model means that you can now use subordination without paying for it with utilization. See the docs for more info. [Also see my follow-up post.]

User-controlled Array Task Throttling

One of the unique things about Sun Grid Engine is that it handles array jobs extremely efficiently. In many cases users will consolidate individual batch jobs together into array jobs to take advantage of that fact. The down side is that all tasks within an array job are considered equal with regard to scheduling policies. If an array job is the highest priority job in the system, all of its tasks are also higher priority than any other jobs. If that array job has ten thousand tasks (something not uncommon or really even all that stressful for SGE), then all ten thousand tasks will be run before any other jobs (unless another job later becomes higher priority), at least by default. An administrator can configure a global limit on the number of tasks from a single array job that are allowed to execute at a time. Better than nothing, but global policies always leave something to be desired.

With SGE 6.2u5, we've introduced the ability for a user to apply self-imposed limits to his individual array jobs. Why would a user voluntarily set limits? In most cases it turns out that users want to do the right thing and will gladly do so given the chance. Self-imposed limits help the cluster run more smoothly, meaning that everyone gets what they want faster, and no one gets bonked on the head by the administrator. Additionally, if a user has more than one large array job pending, setting self-imposed limits allows them all to make progress instead of completing them serially. For more than one customer I know about, this feature alone will be reason enough to upgrade. [See my follow-up post for more info.]

Extended SGE Inspect

SGE Inspect, the new UI introduced in SGE 6.2u3, was previously only a monitoring tool. With SGE 6.2u5, we've added the ability to manage parallel environments. Going forward, we will continue adding management functionality. See the docs for more info.

Improved Cloud Connectivity

With SGE 6.2u3, we added the ability through the Service Domain Manager component to automatically provision additional cluster nodes from Amazon EC2 during peak periods. With SGE 6.2u5, we've expanded that functionality a bit and made it easier to use. See the docs for more info.

Improved Power Management

Same story as the cloud connectivity, really. We introduced the ability to automatically power down idle or underused nodes with SGE 6.2u3 through the Service Domain Manager component. With SGE 6.2u5, we've fleshed it out a bit more and made it easier to use.

Over the next couple of weeks I'll try to write some posts about these features individually. If you're already Grid Engine savvy, go grab a copy and get started. If you need more info, try starting with the beginner's guide.

Thursday Jul 30, 2009

Sun HPC Software Workshop '09 -- Early Bird's Almost Over!

Just wanted to remind everyone that the early bird registration for the Sun HPC Software Workshop '09, Sept 7-10 in Regensburg, Germany, ends tomorrow (31 July 2009). It's your last chance to sign up at the discounted rate. After tomorrow, you will still be able to register, but the cost of registration will be higher.

In a nutshell, the Sun HPC Software Workshop '09 is a combination of our annual Grid Engine Workshop, a European edition of the popular Lustre Users Group meeting, and a conference on developing applications and services for HPC and cloud environments. The Workshop lasts three days, with a presentation track representing each of these topics. On the day before the main Workshop starts, we're also holding deeper technology seminars: a Lustre Deep Dive, a Grid Engine admin training, and a class on parallel application development taught by Ruud van der Pas. The Workshop and the preceding seminars are an excellent opportunity to learn more about these technologies and connect with the product engineers, partners, and other community members.

There is an open Call for Presentations for the Workshop, but it also closes tomorrow. If you're interested in proposing a talk for the Workshop (and getting a discounted registration fee if it's accepted), send a title, duration, and brief summary to the email address listed on the Agenda page. But, hurry. We'll be making our final decisions and notifying the speakers soon.

I look forward to seeing you there!

Tuesday Jul 21, 2009

Lies, Damned Lies, & DRMs

Some of our competitors seem to be very fond of spreading the rumor that the Sun Grid Engine product team has been laid off and/or that the product has been discontinued. It would appear that since they can't claim to have a better, more scalable, or more cost-effective product, they're willing to go with lying through their teeth to make the sale. Since I keep getting asked this question, I figured it would be worthwhile to post an official response.

To plagiarize Mark Twain, the rumors of our death have been greatly exaggerated. We're still here and going strong. The team is now roughly four times the size it was when I joined six years ago. It spans six offices in five countries on three continents. The product has a road map that reaches out past 2012 (which is as far as we're willing to speculate). We have a massive (if not leading) share in both the open source and licensed DRM system markets, and we're not planning to go away any time soon.

Of course, with the deal with Larry pending, nothing is certain. The only comment I can make there is "no comment." That said, for now at least, it's business as usual. We're still writing code, preparing releases, doing trainings, holding our annual Workshop, etc. Look for the next update this quarter. Look for the next release next year. And look for a whole lot more good stuff coming from our team over the next several updates and releases. With the features that have been added in the 6.2, 6.2u2 and 6.2u3 releases, Sun Grid Engine is in a great position. With what's coming up, I'd resort to lying too, if I worked for one of our competitors.

Monday Jul 20, 2009

European Students: Want a Free Laptop?

Are you a student in Europe*? Do you want a new Toshiba laptop? Willing to write some code to get it? Good. Read on.

The OpenSolaris HPC team is currently running a programming contest for European students that was launched at ISC in Hamburg last month. The contest is to write the most performant and scalable implementation of a distributed hash table. Submissions may come from teams of up to three people. The top prize is a new Toshiba laptop for each member of the winning team.

For more information, check out the contest site. Better hurry, though, because the contest deadline is coming up quick!

* Contest participation is limited to legal residents of a specific list of European countries. See the contest site for details.


OFFICIAL RULES
NO PURCHASE NECESSARY

1. DESCRIPTION OF THE CONTEST: The Sun HPC Software Student Programming Challenge ISC 2009 ("Contest") is designed to promote the use of the Sun HPC Software, Developer Edition 1.0 for OpenSolaris among students by having them compete to design and implement the most scalable and best-performing implementation of a common parallel algorithm. Prizes will be awarded to those who submit the best entries as determined by the judges in accordance with these Official Rules.

2. ELIGIBILITY: This contest is open only to teams of 1 to 3 currently-enrolled, full- or part-time, undergraduate or graduate, university or college students, who are the legal age of majority in their country, province or state of legal residence and residents of Denmark, France, Germany, Italy, Poland, Russia, Spain, Sweden, Switzerland, and the United Kingdom. Void in Puerto Rico, Quebec and where prohibited by law. Persons in any of the following categories are not eligible to participate or win the prize(s) offered: (a) Employees or agents of Sun Microsystems, their parent companies, affiliates and subsidiaries, participating advertising and promotion agencies, application development partner companies, and prize suppliers; (b) immediate family members (defined as parents, children, siblings and spouse, regardless of where they reside) and/or those living in the same household as any person in (a) above; and (c) employees of any government entity. You must also have access to the Internet and a valid email address in order to enter or win.

3. HOW TO ENTER: This contest begins at 12:01 P.M. Pacific Time (PT) Zone in the United States (e.g. San Francisco time) which is 5:01 A.M. Greenwich Mean Time (GMT) on the 29th of June 2009 and ends at 11:59 P.M. (PT) which is 4:59 A.M. (GMT) on 10th of August 2009 ("Contest Period"). IMPORTANT NOTICE TO ENTRANTS: ENTRANTS ARE RESPONSIBLE FOR DETERMINING THE CORRESPONDING TIME ZONE IN THEIR RESPECTIVE JURISDICTIONS.

4. THE SUBMISSION: Create an implementation of a fault-tolerant distributed hash table as described at http://wikis.sun.com/display/HPCContest/Sun+HPC+Software+Student+Programming+Challenge+ISC+2009. The implementation must be written in C for the OpenSolaris 2009.06 operating environment using the Sun HPC ClusterTools 8.1 OpenMPI implementation and must be submitted as a Sun Studio 12 project. All Entries must include a valid and complete Sun Studio 12 project that builds without errors on an unmodified instance of the Sun HPC Software, Developer Edition 1.0 for OpenSolaris. Entries may be submitted either electronically or via mail. All Entries must be comprised of original work of the submitter(s). No participant may submit an Entry as a member of more than one team.

Electronic Entries must include a 1-3 page written summary of the implementation approach and the name(s) of the submitter(s). The electronic file must be a gzipped tar file that includes the Sun Studio 12 project directory, including all required files, and must be no larger than 5MB in size. If the electronic file is larger than 5MB in size, it must be submitted by mail in accordance with the instructions below. The electronic entry must be sent via email to hpccontest@sun.com and received no later than 11:59 PM (PDT) on August 10th, 2009 in the United States.

Mailed Entries must include a 1-3 page written summary of the implementation approach and the name(s) of the submitter(s), and a CD or DVD containing the project code as described above. All mailed Entries must be sent to Sun HPC Software Programming Challenge, c/o Sun Microsystems, Inc., 17 Network Circle, Menlo Park, CA 94025, MS-MPK17-207, and must be received no later than 11:59 PM (PDT) on August 10th, 2009 in the United States.

All Entries must be in English. Registration or Entries that are in any other language will not be considered. Entries that are lewd, obscene, pornographic, disparaging of the Sponsor or otherwise contain objectionable material may be disqualified in the Sponsor's sole and unfettered discretion.

5. JUDGING: All Entries will be judged by a panel of experts based on the following equally weighted judging criteria: data retrieval throughput for requests coming from a single node, data retrieval throughput for parallel requests coming from multiple nodes, ability to withstand processing node failure, and scalability with respect to number of processing nodes and number of data items. In the event of a tie, the person or team among the tied Entries with the highest score in scalability with respect to number of processing nodes and number of data items will be declared the winner. In the event that no entries are received, no prize will be awarded. Decisions of judges are final and binding. Winner will be notified by email.

6. PRIZES AND APPROXIMATE RETAIL VALUE: First prize: Toshiba OpenSolaris laptop valued at approximately $2,000. Second and third prizes: Apple iPod valued at approximately $150. Up to three Toshiba laptops and six Apple iPods may be awarded. Prize includes round-trip coach air transportation for one person from major airport nearest winner's residence and hotel accommodations for one person for four nights. Hotel accommodations at Sponsor's discretion. Certain black out dates apply. In the event the Sun HPC Software Workshop is cancelled or postponed for any reason, Sponsor reserves the right to award the remainder of the prize with no further obligation to the winner. All other expenses not specified herein are the responsibility of the winner. ALL TAXES AND ANY APPLICABLE WITHHOLDING AND REPORTING REQUIREMENTS ARE THE SOLE RESPONSIBILITY OF THE WINNER. Cash prizes will be awarded in US Dollars. All costs associated with currency exchange are the sole responsibility of the winner.

7. CONDITIONS OF PARTICIPATION. Sponsor reserves the right to substitute a prize for an item of equal or greater value in the event all or part of a prize becomes unavailable. Prizes are awarded without warranty of any kind from Sponsor, express or implied, without limitation, except where this would be contrary to federal, state, provincial, or local laws or regulations. All federal, state, provincial and local laws and regulations apply. Submission of entry into this Contest deems that entrants agree to be bound by the terms of these Official Rules and by the decisions of Sponsor, which are final and binding on all matters pertaining to this Contest. Return of any prize/prize notification may result in disqualification and selection of an alternate winner. Any potential winner who cannot be contacted within 15 days of attempted first notification will forfeit his/her prize. Potential prize winner(s) may be required to sign and return an Affidavit or Declaration of Eligibility/Liability & Publicity Release within 30 days following the date of first attempted notification. Failure to comply within this time period may result in disqualification and selection of an alternate winner. Travel companion of winner must also execute an Affidavit of Eligibility/Liability & Publicity Release prior to ticketing and must possess required travel documents (e.g. valid photo I.D.) prior to departure. Once the travel schedule has been arranged, it cannot be altered and failure of winner to follow such schedule shall not obligate Sponsor in any way to provide the winner with alternate arrangements. The intellectual and industrial property rights to the contest submission, if any, will remain with the participants, except that these terms do not supersede any other assignment or grant of rights according to any other separate agreements between participants and other parties. As a condition of entry, participants agree that Sun shall have the right to use, copy, modify and make available the application or code in connection with the operation, conduct, administration, and advertising and promotion of the Contest via communication to the public, including, but not limited to the right to make screenshots, animations and video clips available to the public for promotional and publicity purposes. Notwithstanding the foregoing, ownership of and all intellectual and industrial property rights in and to the application and code shall remain with the participant. Acceptance of the prize constitutes permission for, and winners consent to, Sponsor and its agencies to use a winner's name and/or likeness and entry for advertising and promotional purposes without additional compensation, unless prohibited by law. To the extent permitted by law, entrants, agree to hold Sponsor, its parent, subsidiaries, agents, directors, officers, employees, representatives and assigns harmless from any injury or damage caused or claimed to be caused by participation in the Contest and/or use or acceptance of any prize won, except to the extent that any death or personal injury is caused by the negligence of the Sponsor. Sponsor is not responsible for any typographical or other error in the printing of the offer, administration of the Contest or in the announcement of the prize. 
A participant may be prohibited from participating in this Contest if, in the Sponsor's sole discretion, it reasonably believes that the participant has attempted to undermine the legitimate operation of this Contest by cheating, deception, or other unfair playing practices or annoys, abuses, threatens or harasses any other participants, the Sponsor or associated agencies. In the event a winner/potential winner's employer has a policy, which prohibits the awarding of a prize to an employee, the prize will be forfeited and an alternate winner will be selected.

8. NO RECOURSE TO JUDICIAL OR OTHER PROCEDURES: To the extent permitted by law, the rights to litigate, to seek injunctive relief or to make any other recourse to judicial or any other procedure in case of disputes or claims resulting from or in connection with this contest are hereby excluded, and any participant expressly waives any and all such rights.

Participants agree that these Official Rules are governed by the laws of California, USA.

9. DATA PRIVACY: Participants agree that personal data, especially name and address, may be processed, stored and otherwise used for the purposes and within the context of the contest and any other purposes outlined in these Official Rules. The data may also be used by the Sponsor in order to check participants' identity, their postal address and telephone number, or to otherwise verify their eligibility to participate in the Contest and to receive any prize. Participants have a right to access, review, rectify or cancel any personal data held by the Sponsor by writing to Sponsor (Attention: Daniel Templeton) at the address listed below. If participant's data is not provided or is canceled participants' Entries will be ineligible.

10. WARRANTY AND INDEMNITY: Entrants certify that their entry is original and that they are the sole and exclusive owner and right holder of the submitted entry and that they have the right to submit the Entry in the Contest. Each participant agrees not to submit any Entry that (1) infringes any 3rd party proprietary, intellectual property, industrial property, personal rights or other rights, including without limitation, copyright, trademark, patent, trade secret or confidentiality obligation; or (2) otherwise violates applicable law in any countries in the world. To the maximum extent permitted by law, each participant indemnifies and agrees to keep indemnified the Sponsor its parent, subsidiaries, agents, directors, officers, employees, representatives and assigns harmless at all times from and against any liability, claims, demands, losses, damages, costs and expenses resulting from any act, default or omission of the participant and/or a breach of any warranty set forth herein. To the maximum extent permitted by law, each participant indemnifies and agrees to keep indemnified the Sponsor, its parent, subsidiaries, agents, directors, officers, employees, representatives and assigns harmless at all times from and against any liability, actions, claims, demands, losses, damages, costs and expenses for or in respect of which the Sponsor will or may become liable by reason of or related or incidental to any act, default or omission by a participant under these Official Rules including without limitation resulting from or in relation to any breach, non-observance, act or omission whether negligent or otherwise, pursuant to these official rules by a participant.

11. ELIMINATION: Any false information provided within the context of the Contest by any participant concerning identity, postal address, telephone number, ownership of right or non-compliance with these rules or the like may result in the immediate elimination of the participant from the Contest. Sponsor further reserves the right to disqualify any Entry that it believes in its sole and unfettered discretion infringes upon or violates the rights of any third party or otherwise does not comply with these official rules.

12. INTERNET: Sponsor is not responsible for electronic transmission errors resulting in omission, interruption, deletion, defect, delay in operations or transmission. Sponsor is not responsible for theft or destruction or unauthorized access to or alterations of entry materials, or for technical, network, telephone equipment, electronic, computer, hardware or software malfunctions or limitations of any kind. Sponsor is not responsible for inaccurate transmissions of or failure to receive entry information by Sponsor on account of technical problems or traffic congestion on the Internet or at any Web site or any combination thereof, except to the extent that any death or personal injury is caused by the negligence of the Sponsor. If for any reason the Internet portion of the program is not capable of running as planned, including infection by computer virus, bugs, tampering, unauthorized intervention, fraud, technical failures, or any other causes which corrupt or affect the administration, security, fairness, integrity, or proper conduct of this Contest, Sponsor reserves the right at its sole discretion to cancel, terminate, modify or suspend the Contest. Sponsor reserves the right to select winners from eligible entries received as of the termination date. Sponsor further reserves the right to disqualify any individual who tampers with the entry process. Caution: Any attempt by a contestant to deliberately damage any Web site or undermine the legitimate operation of the game is a violation of criminal and civil laws and should such an attempt be made, Sponsor reserves the right to seek damages from any such contestant to the fullest extent of the law.

13. If any provision(s) of these Official Rules are held to be invalid or unenforceable, all remaining provisions hereof will remain in full force and effect.

14. WINNER'S LIST: For winner's name, log onto http://wikis.sun.com/display/HPCContest on or about August 14th, available for a period of up to 60 days.

15. SPONSOR: The Sponsor of this Contest is Sun Microsystems, Inc., 4220 Network Circle, Santa Clara, CA 95054.

Monday Nov 03, 2008

SGE Blog Planet

There's a new Grid Engine blog aggregator on planets.sun.com. The idea is to capture all of the relevant Grid Engine blogs in a single place for easy access. It's similar to the aggregator on the OpenSolaris HPC Community site, except that the HPC one also contains general HPC blogs and blogs on other Sun HPC products as well. If you have suggestions for a blog that should be included in either, let me know.

Monday Aug 11, 2008

Sun Grid Engine 6.2 Information

Since there's actually quite a lot of information out there about the Sun Grid Engine 6.2 release, I thought it might be useful to provide a single source for where to find it. (Actually, the completely revamped Sun Grid Engine site is already a single source for this information, but you have to browse a bit to find it all.) Here ya go:

There are still a couple more things in the coming soon category. As they go live, I will update the above list.

Thursday Jun 26, 2008

Xen and the Art of Cluster Scheduling

I keep finding myself talking about this paper, and I keep having to search for it. To save everyone the trouble in the future, here it is.

Where Not to Run

Reuti just reminded me of a nice application of one of the new features we added in Grid Engine 6.1. Before 6.1, resource requests were limited to simple boolean AND and OR expressions. For example, when submitting a job, a user might request "-l a=sol-x*|sol-amd64 -l mem_free=4G -l exclusive=TRUE", meaning that the job must run on a Solaris i386 or AMD64 machine, and the machine must have at least 4GB of memory free, and the job wants exclusive access to the host. (AND is represented by multiple -l switches.) There was no way, however, to request, for example, Solaris on anything but x86.

Enter 6.1. With 6.1 we introduced full boolean expressions for resource requests. A user can now make requests like "-l a=sol-*&!sol-sparc*". (The job must run on Solaris, but not on SPARC or SPARC64.) Even better, you can create complex boolean statements, like "-l a=(sol-*&!*-x86)|(lx2[46]-*&!(*-x86|*-ia64))". (The job must run on either Solaris on anything but x86, or Linux on anything except x86 or Itanium.)

Now, to the title problem. In the email that prompted this post, Reuti responded to a question about how to submit a job to any host, except for one. With 6.1, the answer is simple. Grid Engine has a built-in complex called hostname, or h for short. Using the new boolean expressions, it's very simple to request "-l h=!badhostname", which allows the job to run on any machine except the one named badhostname.
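
As a complete command line, that's simply the following (the quotes keep the shell from trying to interpret the ! and any globbing characters):

% qsub -l 'h=!badhostname' myjob.sh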

Monday Jun 23, 2008

Announcing Grid Engine 6.2 Beta 2 Binaries

I'm a little slow on the draw, but in case you haven't noticed already, Grid Engine 6.2 Beta 2 is now ready for download! Go pull it down and give it a whirl!

You should also have a look at my slide deck from SuperComputing '07 talking about what's new in 6.2. You can find it on the OpenSolaris HPC Community's presentations page.
