By swdeveloper on Nov 11, 2008
The Sun Studio Express 11/08 release went live on November 12, 2008. Although it is an Express release, it brings many important new features. The list includes an updated IDE based on NetBeans 6.5, remote development and debugging, a new standalone lightweight GUI debugging tool, an integrated DTrace plug-in, and compiler enhancements that generate machine code with better run-time performance on both SPARC and x86 systems. Any one of these features deserves an article of its own. In this blog I would like to discuss two very important tools for parallel programming: OpenMP 3.0 and the MPI performance analysis tool.
Both tools matter a great deal in their own parallel programming domain. OpenMP 3.0 is a major leap from the previous version because it adds dynamic tasking to the shared memory programming model, and the MPI performance analyzer is an indispensable tool for distributed programming with MPI. Here I will explain why serious parallel application developers should be excited about having these two tools to help design their parallel applications.
OpenMP 3.0 introduces a new feature, dynamic tasking. Instead of parallelizing the code statically with OpenMP parallel loops or parallel regions, developers can now create parallel tasks dynamically in the logic flow and expect these tasks to run concurrently. This gives developers much needed programmability. Graph traversal is a practical example that illustrates the power of dynamic tasking. Many applications are designed to find an optimal solution to a very complex problem. Usually the problem can be modeled as a large state space graph, and the algorithm is implemented as a search for the optimal graph node by traversing the graph. The programmer can map the processing of a graph node to an OpenMP task. While it runs, a task creates one new task for each child node of the current node. A complex graph traversal problem can therefore be implemented simply as a task that processes a graph node and creates new tasks for its child nodes. The program starts with a single task for the root node, and that task generates more and more tasks until they eventually cover all the graph nodes.
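Here is a minimal sketch of that idea in C. It is not taken from the Sun Studio documentation; the `Node` type and the `visit` and `traverse` functions are hypothetical names I made up for illustration, and I assume the state space is tree-shaped, so every node is reached exactly once (a general graph would also need a visited marker). Processing a node is reduced to adding its value to a shared sum.

```c
#include <stddef.h>

/* Hypothetical node type: each node stores a value and its child nodes. */
typedef struct Node {
    int value;
    size_t n_children;
    struct Node **children;
} Node;

/* Process one node, then create one new task per child node. */
void visit(const Node *node, long *sum)
{
    /* "Processing" here is just an atomic update of the shared sum. */
    #pragma omp atomic
    *sum += node->value;

    for (size_t i = 0; i < node->n_children; i++) {
        const Node *child = node->children[i];
        /* Each task gets its own copy of the two pointers. */
        #pragma omp task firstprivate(child, sum)
        visit(child, sum);
    }
}

/* Start with a single task for the root node. */
long traverse(const Node *root)
{
    long sum = 0;
    #pragma omp parallel
    {
        /* One thread starts the traversal; the whole team runs the tasks. */
        #pragma omp single
        visit(root, &sum);
    } /* All generated tasks have finished at the end of the region. */
    return sum;
}
```

If the code is built without OpenMP enabled, the pragmas are ignored and `traverse` is an ordinary recursive traversal that returns the same sum. With Sun Studio, OpenMP is enabled with the `-xopenmp` compiler option.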
OpenMP 3.0 also includes other very helpful features, such as loop collapse and new environment variables and routines for runtime scheduling. The Sun Studio compiler implements all of these new features too. More importantly, the Sun Studio compiler keeps its tradition as a very high quality performance tool and supports the new OpenMP 3.0 with top run-time performance in the industry.
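As a small sketch of two of these features, the C code below collapses a two-level loop nest into one parallel iteration space with `collapse(2)` and leaves the choice of schedule to run time. The function names `scale_matrix` and `use_dynamic_schedule` are hypothetical and chosen only for this illustration; `omp_set_schedule` is the OpenMP 3.0 routine.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Ask for a dynamic schedule with the given chunk size at run time.
   The call is guarded so the file still builds without OpenMP. */
void use_dynamic_schedule(int chunk)
{
#ifdef _OPENMP
    omp_set_schedule(omp_sched_dynamic, chunk);
#else
    (void)chunk; /* nothing to configure in a serial build */
#endif
}

/* Multiply every element of a row-major rows x cols matrix by factor. */
void scale_matrix(double *a, int rows, int cols, double factor)
{
    /* collapse(2) merges both loops into one iteration space, which
       helps when rows alone is too small to keep all threads busy.
       schedule(runtime) defers the scheduling policy to run time. */
    #pragma omp parallel for collapse(2) schedule(runtime)
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            a[(size_t)i * cols + j] *= factor;
        }
    }
}
```

A caller would invoke `use_dynamic_schedule(4)` once and then call `scale_matrix`. The schedule could also be left to the `OMP_SCHEDULE` environment variable instead of the routine.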
MPI is the de facto standard for the distributed programming model. Today, when people talk about HPC, MPI is the first thing that comes to mind. MPI programming is not easy for most developers, but the harder problem is scaling up the parallel performance of an MPI application. An application has to need a speedup of several times to justify adopting the MPI programming model, so failing to scale up an MPI program's parallel performance is unacceptable. That makes an MPI performance analyzer the most fundamental tool for MPI application developers besides the MPI library itself.
There are several MPI performance tools on the market, ranging from very expensive ones to free open source ones. Why does Sun Studio offer its own MPI performance analyzer? What is unique about this tool? The quick answer is that the Sun Studio MPI performance analyzer is the best of its kind. Here I would like to discuss its unique features in two areas: core capability and ease of use. When you analyze and tune the performance of a distributed program, you need to analyze both the computation cost and the communication cost and find the right balance between the two. The Sun Studio Analyzer is one of the best tools in the industry for analyzing computation performance. It has the most comprehensive instrumentation mechanism for collecting all the key run-time performance data, and it lets programmers analyze that data through various views, from the time line, call tree and source code window to the data space view. The MPI performance analyzer adds the complementary capability of tracing MPI messages to analyze the communication cost. In addition, in collaboration with the Sun ClusterTools team (which provides the MPI library), the MPI performance tool adds MPI states that make the measurement of communication cost more precise. The Sun Studio performance analyzer therefore has the best core capability for analyzing both computation and communication cost.
The Sun Studio MPI performance analyzer also presents a new and simple user model for MPI developers. The instrumentation is similar to the Analyzer's original data collection interface for a sequential program, so it is easy to run an experiment and collect the performance data. The user interface is simple as well. The main time-line window, shown below, provides a two-dimensional view of each MPI process's computation time and the messages between the processes. It is quite handy to zoom in and out along either dimension through the control console in the corner.
The tool also offers a two-dimensional chart window, shown below, in which both the X axis and the Y axis can be configured with any performance parameter. With the chart window it is very convenient to measure the impact of any performance parameter or to evaluate the interaction between any pair of performance parameters.
Parallel programming remains relatively hard for most application developers, even with today's advanced software technology. If you are developing a parallel application or plan to develop one, the most important thing is to use the right tools to help you. I hope this blog sparks your curiosity to download and try out the Sun Studio Express 11/08 release.