Tuesday Apr 22, 2008

Parallel Programming Patterns: Part 5

This is the last (but not least!) design pattern in our series of parallel programming design patterns: the ring pattern.

This pattern applies to problems that can be modeled as a ring of processes communicating with each other in a circular fashion. Such applications require a set of data to be operated on repeatedly by a fixed set of operations. The pattern can be considered an extension of the pipeline pattern in which the output of the last process becomes the input of the first, so the data keeps rotating through the initial set of processes.
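To make the rotation concrete, here is a minimal sketch in plain Python (not MPI, and not the plugin's sample code). The names `run_ring` and `ops` are hypothetical, and we assume one data block per process: each simulated process applies its own operation to the block it holds and then hands the block to its neighbour, with the last process feeding the first.

```python
def run_ring(blocks, operations, rounds=1):
    """Simulate a ring of len(operations) processes.

    Process i applies operations[i] to the block it currently holds and then
    forwards the block to process (i + 1) % n. One round is n such steps,
    after which every block is back where it started.
    """
    n = len(operations)
    assert len(blocks) == n, "this sketch assumes one data block per process"
    held = list(blocks)  # held[i] is the block currently at process i
    for _ in range(rounds * n):
        # Every process works on its current block (conceptually in parallel).
        held = [operations[i](held[i]) for i in range(n)]
        # Rotate: the output of process i becomes the input of process i + 1,
        # and the output of the last process goes to the first.
        held = [held[(i - 1) % n] for i in range(n)]
    return held


# Hypothetical operations, one per process in the ring.
ops = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
result = run_ring([0, 10, 20], ops, rounds=1)
print(result)  # [-1, 18, 36]
```

After one full round each block has been visited by every process exactly once. In a real MPI program the rotation step would be a send to the next rank and a receive from the previous one.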

Now that we have seen five parallel programming design patterns in this series, we should get our hands dirty with MPI code. Next time we will look at an example MPI program that performs matrix-vector multiplication (for very large matrices).

Wednesday Apr 02, 2008

Parallel Programming Patterns: Part 4

As promised, we will now look at the divide and conquer pattern as implemented in the MPI Plugin for Netbeans.

This pattern is employed in solving many sequential problems where a problem can be split into a number of smaller problems that can be solved independently. The intermediate solutions are then merged to get the final solution. The subproblems are generally independent of each other and are structured so that, at the lowest level, they cannot be subdivided further. If the program's correctness does not depend on whether the subproblems are solved sequentially or concurrently, a hybrid system can be designed that sometimes solves them sequentially and sometimes concurrently, depending on which approach is likely to be more efficient. With this strategy, the subproblems can be solved directly, or they can in turn be solved using the same divide-and-conquer strategy, leading to an overall recursive program structure. In summary, a program following this pattern should have recursive process creation, a mechanism for solving the base case, and a step for merging the results. The recursion depth and problem size may need to be tuned.
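As an illustration only, here is a small Python sketch of the hybrid idea using merge sort. It is not the plugin's sample: `dc_sort`, `BASE_CASE_SIZE` and `MAX_PARALLEL_DEPTH` are hypothetical names, and Python threads stand in for MPI processes (so this shows the structure, not a real speed-up). The top levels of the recursion are solved concurrently, deeper levels sequentially, and small inputs are solved directly.

```python
import threading

BASE_CASE_SIZE = 4       # hypothetical tuning knob: solve directly at or below this size
MAX_PARALLEL_DEPTH = 2   # hypothetical tuning knob: only spawn workers this deep


def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def dc_sort(data, depth=0):
    """Divide-and-conquer sort: concurrent near the root, sequential below."""
    if len(data) <= BASE_CASE_SIZE:
        return sorted(data)  # base case: solve the small problem directly
    mid = len(data) // 2
    halves = [data[:mid], data[mid:]]
    results = [None, None]

    def solve(k):
        results[k] = dc_sort(halves[k], depth + 1)

    if depth < MAX_PARALLEL_DEPTH:
        # Solve the two independent subproblems concurrently.
        workers = [threading.Thread(target=solve, args=(k,)) for k in (0, 1)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
    else:
        # Deep in the recursion, spawning more workers is not worth it.
        solve(0)
        solve(1)
    return merge(results[0], results[1])  # merge the intermediate solutions


sample = [9, 3, 7, 1, 8, 2, 6, 5, 4, 0, 11, 10]
print(dc_sort(sample))
```

The two constants are exactly the kind of values that would need tuning for a given problem and machine.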

The final pattern of our series will be the ring pattern. For more information, please see the MPI Plugin design patterns.

Tuesday Apr 01, 2008

Parallel Programming Patterns: Part 3

The pipeline pattern implemented in the MPI Plugin can be applied to problems that can be modeled as a set of data flowing through multiple sets of computations.

The computations are ordered and independent, and can also be seen as a series of time-step operations. In a sequential execution scenario, the output of the first step of computation serves as input to the second step, and so on for all the steps. Parallelism is introduced by overlapping the operations of different time steps. The first-step component starts operating as soon as input is available, and its output is passed to the second-step component. Now, during the next time unit, the first-step component is free to accept more input, and it does so, making its output available to the second-step component in the next iteration. In that iteration, the second-step component passes its output on to the third-step component and accepts the output just produced by the first-step component. This cycle continues until all input is exhausted and all sets of operations have been applied to the ordered data. Note also that the computation steps must be comparable in size so that the time steps are of equal length; only then can substantial parallelism be achieved.
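The overlap of time steps can be sketched with a small simulation in plain Python (again not MPI and not the plugin's sample; `run_pipeline` and the stage functions are hypothetical). In each time step every stage that has an item works on it, and then all results shift one stage to the right.

```python
EMPTY = object()  # marker for "this stage has nothing to do in this time step"


def run_pipeline(stages, inputs):
    """Simulate a pipeline and return (outputs, number_of_time_steps)."""
    n = len(stages)
    in_flight = [EMPTY] * n  # in_flight[s]: item waiting at the input of stage s
    pending = list(inputs)
    outputs, steps = [], 0
    while pending or any(item is not EMPTY for item in in_flight):
        # The first stage accepts new input as soon as it is free.
        if pending and in_flight[0] is EMPTY:
            in_flight[0] = pending.pop(0)
        # All busy stages operate during the same time step
        # (conceptually in parallel, one process per stage).
        produced = [
            stages[s](in_flight[s]) if in_flight[s] is not EMPTY else EMPTY
            for s in range(n)
        ]
        if produced[-1] is not EMPTY:
            outputs.append(produced[-1])  # the last stage emits a finished item
        # The output of stage s becomes the input of stage s + 1.
        in_flight = [EMPTY] + produced[:-1]
        steps += 1
    return outputs, steps


# Three hypothetical stages of comparable cost.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
outputs, steps = run_pipeline(stages, [1, 2, 3, 4])
print(outputs, steps)  # [1, 3, 5, 7] 6
```

With 4 items and 3 stages the simulation finishes in 4 + 3 - 1 = 6 time steps, whereas running every stage on every item one after another would take 12 stage executions. That gain only holds if the stages take roughly equal time.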

The next pattern we will see in this series is the Divide and Conquer pattern. Stay tuned!!

Monday Mar 31, 2008

Parallel Programming Patterns: Part 2

This is the second pattern implemented in the MPI Plugin, called the Master Worker pattern.

This pattern is used to solve the class of problems that require the same set of operations to be performed over multiple data sets. The operations are generally independent of each other and can be performed concurrently. Parallelism is achieved by dividing the computations among the available processes, with each process handling an identical number of them. There is generally a master process (also called a managerial component) that is responsible for distributing the work among the worker processes and then collating the data as the computation completes. The data can generally be distributed among the worker processes in any order, but it is important to preserve the order of the processed data. The responsibility of each worker task is to perform the computation repeatedly on the multiple sets of data given to it by the master process. The decisive factors for choosing this pattern include (but are not limited to) load balancing, data integrity, and data distribution.
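A rough sketch of the master and worker roles, using Python's standard thread pool in place of MPI processes (the names `master` and `worker` and the squaring operation are hypothetical, chosen only for illustration). The master splits the data into chunks, hands them to the workers, and collates the results in the original order.

```python
from concurrent.futures import ThreadPoolExecutor


def worker(chunk):
    """Worker task: perform the same computation on every item it is given."""
    return [x * x for x in chunk]


def master(data, num_workers=3, chunk_size=4):
    """Master: distribute chunks to workers, then collate the results."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map returns results in the order of the inputs, regardless
        # of which worker finishes first, so the data order is preserved.
        processed = list(pool.map(worker, chunks))
    return [y for chunk in processed for y in chunk]


squares = master(list(range(10)))
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Because idle workers pick up the next pending chunk, a smaller `chunk_size` tends to balance the load better at the cost of more distribution overhead.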

In the next post, we will have a look at the Pipeline pattern.

Thursday Mar 06, 2008

Parallel Programming Patterns: Part 1

Recently we released the MPI Development environment for Netbeans IDE, and this series is a consolidated summary of the Parallel Programming Patterns implemented in the plugin. The first pattern we will see is the SPMD (Single Program, Multiple Data) pattern, a technique used to achieve data-level parallelism. It is one of the dominant styles of parallel programming: all processors run the same program, though each has its own data. The SPMD pattern exploits data parallelism in applications where a large mass of data of a uniform type needs the same instruction performed on it. The data is divided among processes, which operate on it independently. The example provided in the MPI Netbeans plugin shows the following:
  1. An array of elements is created on the main process and then distributed among the other processes.
  2. All processes independently process the data sent to them.
  3. If the main process wants, it can collect the data from the other processes for some final processing, etc.
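The three steps above can be sketched in plain Python as follows. This is a simulation rather than the plugin's actual example: `scatter`, `program` and `run_spmd` are hypothetical names, and the ranks are run one after another instead of as separate MPI processes.

```python
def scatter(data, size):
    """Split data into `size` nearly equal contiguous parts, one per rank."""
    base, extra = divmod(len(data), size)
    parts, start = [], 0
    for rank in range(size):
        end = start + base + (1 if rank < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def program(rank, local_data):
    """The single program that every rank runs; only its data differs."""
    return [x * 2 for x in local_data]


def run_spmd(size, data):
    parts = scatter(data, size)                                      # step 1: distribute
    results = [program(rank, parts[rank]) for rank in range(size)]   # step 2: process
    return [y for part in results for y in part]                     # step 3: collect


print(scatter(list(range(10)), 4))   # [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
print(run_spmd(4, list(range(10))))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

In an MPI program, steps 1 and 3 would typically be collective scatter and gather operations rooted at the main process.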
For more details, please refer to the MPI Plugin download page and its development guide, as well as the Parallel Programming Patterns documentation.

Tuesday Aug 14, 2007

MPI Development environment for Netbeans IDE

Recently we released an MPI plugin for Netbeans. The purpose of this plugin is to allow application developers to use the Netbeans platform to develop, test, and debug MPI applications for the Sun Grid Compute Utility. This release includes an early access version of the new MPI Development Plugin for NetBeans(tm) IDE, which is targeted at C/C++ developers who are working with MPI applications that can be modeled as a set of independent, compute-bound tasks. The software is published under the GNU General Public License.

The MPI Development Plugin for NetBeans(tm) IDE project offers the following in its current early access state:

  • An MPI programming model to simplify the design and development of C/C++ MPI applications.
  • Built-in features of the Netbeans IDE framework, enhanced to support the efficient execution of C/C++ MPI applications on the Sun Grid Compute Utility.
  • An MPI testing plug-in for the NetBeans IDE to ease local development and testing of C/C++ MPI applications.
  • A prebuilt collection of sample MPI applications illustrating effective use of Parallel Programming Patterns to build C/C++ MPI applications for Sun Grid.

In upcoming posts, look out for the Parallel Programming Patterns we have developed and related examples for this plugin.

Monday May 14, 2007

GNU Linear Programming Kit available on Sun Grid

GNU Linear Programming Kit

The GLPK (GNU Linear Programming Kit) package is intended for solving large-scale linear programming (LP), mixed integer programming (MIP), and other related problems. It is a set of routines written in ANSI C and organized in the form of a callable library.

How to utilize GLPK on Sun Grid?

Detailed steps are available at https://gnu-glpk.dev.java.net/. As usual, to run GLPK on Sun Grid, you need an account on http://www.network.com.

Sample data and example files

Example data files are available on the developer page of GLPK.


GLPK download: http://www.gnu.org/software/glpk/
Running GLPK on Sun Grid: https://gnu-glpk.dev.java.net/

Sunday May 13, 2007

New application on Sun Grid: Calculix


CalculiX is a software package used to solve field problems using the finite element method. With CalculiX, finite element models can be built, calculated, and post-processed. The pre- and post-processor is an interactive 3D tool using the OpenGL API. The solver is able to do linear and non-linear calculations. Static, dynamic, and thermal solutions are available.

Why Calculix on Sun Grid?

Calculix is an ideal choice for running on Sun Grid as it is a compute-intensive application. Making it available as a service would greatly benefit scientists, mathematical solvers, etc. You only need to have your input files ready, without worrying about any other aspects of running Calculix.

How to run Calculix on Sun Grid?

Detailed steps are available at https://calculix.dev.java.net/. To run Calculix on Sun Grid, you need an account on http://www.network.com. If you don't have an account on Sun Grid, you can request an account here. Now that Sun Grid is available in 24 countries, Calculix has become much more accessible to end users in these countries.

Sample data and example files

To get a head start in running Calculix, example data files are available on the developer page of Calculix.


Calculix download: http://www.dhondt.de/
Calculix home page: http://www.calculix.de/
Running Calculix on Sun Grid: https://calculix.dev.java.net

Friday May 11, 2007

Sun Grid compute utility gets feature rich

Sun Grid compute utility has added a bunch of new features, making it more flexible and powerful. The new release of Sun Grid has the following capabilities:

International access

Prior to this release, Sun Grid was available only in the United States, but it is now available in 24 countries across the globe: United States, Australia, Austria, Belgium, Canada, China, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, India, Ireland, Italy, Japan, New Zealand, Poland, Portugal, Singapore, Spain, Sweden, and the United Kingdom. This is a significant feature addition considering the legal implications.

Internet access (bidirectional)

A job running on Sun Grid was helpless if it needed to access information or any service outside network.com. Now, with Internet access added to the feature list, an application running in the Sun Grid compute environment can access information and services in the outside world. While the use of this feature may not be immediately evident, it opens up a host of opportunities for creating innovative applications. For example:

  • A bioinformatics application can access any public database.
  • Results of your application can be delivered to a specific destination.
  • An application can be hosted in the form of a mashup to offer a myriad of services.
  • And the possibilities are endless.
Security is easy to achieve by pairing Internet access with a tunnel technology such as ssh.

Job submission API

Although in limited beta release, this feature provides an API for submitting and managing jobs programmatically. network.com jobs can also be accessed via a command-line interface. As an alternative to the web-based interface, the Job submission API gives end users the much-desired flexibility of a programmatic interface. This feature has to be requested from Sun Grid Customer Care (only possible if you have an account on Sun Grid Compute Utility).


What's new on Sun Grid http://www.sun.com/service/sungrid/whatsnew.jsp
Jim Parkinson's blog http://blogs.sun.com/jpblog
Network.com news http://network.com/news.html

Saturday Mar 17, 2007

Which is more expensive: an eBook or a printed book?

Recently, while reading an eBook on Beowulf clusters, this struck me: which is the more expensive affair, reading an eBook or owning a printed book? Let us run through some quick calculations for a book of 300 pages.

Reading an eBook:

Assuming you have a standard PC (including a 17" monitor, cable modem, etc.), it will typically draw 330 watts, that is, 0.33 kWh for every hour of use (for more details, refer to How much electricity do computers use?). Now, if you have a reading speed of 100 words per minute (fairly average for technical texts), a page of 1000 words takes (1000 words per page / 100 words per minute) = 10 minutes to read. If the book has 300 pages, you will take 300 pages per book * 10 minutes per page = 3000 minutes, or 50 hours, to complete the book.

So the cost of reading the book once would be (50 hrs * 330 watts / 1000) * Rs 3.40 per kWh = about Rs 56, and the cost repeats for each subsequent reading. The costs we have neglected (as we don't pay them out of our own pockets) are those of the servers that host the book (electricity, data center maintenance, etc.) and the electricity used by intermediate routers, proxy servers, and so on. Adding those would increase the cost manyfold, and my guess is that they would make the original cost negligible.
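For anyone who wants to plug in their own numbers, the calculation above can be written out as a few lines of Python. The constant names are ours; the values are the assumptions used in this post.

```python
# Assumptions from the post; change them to match your own setup.
POWER_WATTS = 330        # typical draw of a PC with monitor and modem
WORDS_PER_PAGE = 1000
WORDS_PER_MINUTE = 100   # reading speed for technical text
PAGES = 300
RATE_RS_PER_KWH = 3.40   # electricity tariff in rupees per kWh

minutes_per_page = WORDS_PER_PAGE / WORDS_PER_MINUTE  # 10 minutes
hours = PAGES * minutes_per_page / 60                 # 50 hours
energy_kwh = hours * POWER_WATTS / 1000               # 16.5 kWh
cost_rs = energy_kwh * RATE_RS_PER_KWH                # about Rs 56

print(f"{hours:.0f} hours, {energy_kwh:.1f} kWh, Rs {cost_rs:.2f} per reading")
```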

Owning a physical book

A good technical book of 300 pages would cost anywhere between Rs 300 and Rs 500, but this affords us multiple readings at no extra cost. Some more facts which come to my mind:

- Often eBooks are not legal, whereas we are assured of the authenticity of hard copies of books.
- Often one hard copy of a book is read by multiple readers at no extra cost. Online books can also be shared easily, but each reading of an online book brings recurring electricity costs.
- I find reading online books a big strain on the eyes. I haven't ever successfully completed an online book!!
- An advantage of online books is more flexibility in organizing the contents (we can separate out the important parts and print them if necessary). We have no such flexibility with hard copies.
- We can read books in bed but not ebooks :)

Wednesday Mar 14, 2007

Welcome App Catalog..

Are you an open source developer who has created a cool application and wants to give it enhanced visibility? Are you a research scientist lacking the infrastructure and service-provider know-how to run complex applications? Welcome to Sun Grid! Additional features were announced for Network.com, adding muscle to the already cool pay-per-use utility offering, which enables end users to tap into high performance computing (HPC), enterprise applications, and infrastructure for complex computations as a service.

From creating your own application for Sun Grid to publishing it for other end users (and earning some bounty in the process, if you choose to do so), Sun Grid also enables you to instantly access popular ISV and open source applications on a pay-per-use basis. You can choose an existing application in the Catalog, or you can create and publish your own application. (A how-to is available here.)

Tuesday Mar 13, 2007

ClustalW on Sun Grid !!

What is ClustalW?

ClustalW is a general purpose multiple sequence alignment program for DNA or proteins. It produces biologically meaningful multiple sequence alignments of divergent sequences. There are three main steps for achieving alignment: pairwise alignment, guide-tree generation, and progressive alignment.

Why ClustalW?

ClustalW-mpi is an ideal choice for running on Sun Grid as it has inherent support for MPI. In the MPI version of ClustalW, both the pairwise and progressive alignment stages are parallelized.

How to run ClustalW on Sun Grid?

Detailed steps are available at https://clustal-w.dev.java.net/. To run ClustalW on Sun Grid, you need an account on http://www.network.com. If you don't have an account on Sun Grid, you can request an account here.

Sample data and example files

To get a head start in running ClustalW, example data files are available on the developer page of ClustalW.


ClustalW download: ftp://ftp.ebi.ac.uk/pub/software/unix/clustalw/
ClustalW MPI home page: http://packages.debian.org/unstable/science/clustalw-mpi
Running ClustalW on Sun Grid: https://clustal-w.dev.java.net

Monday Dec 04, 2006

Sun Open Sources Java !!!

Well, that is slightly old news, but what better news to start a blog with :). For readers who are unaware, have a look here. Sun is releasing its key Java implementations, Java Platform Standard Edition (Java SE), Java Platform Micro Edition (Java ME), and Java Platform Enterprise Edition (Java EE), under the GNU General Public License version 2 (GPLv2), the same license as GNU/Linux. Read more here.

More about the Grid Engine @ FOSS.in in my next post !! Stay tuned !!


Hardik Dave

