Wednesday Jan 02, 2008

NEW BOOK: Solaris Application Programming

Solaris Application Programming, by Sun engineer Darryl Gove, has just been published by Sun Press/Prentice Hall.

Here's the back-of-the-jacket blurb:

Solaris™ Application Programming is a comprehensive guide to optimizing the performance of applications running in your Solaris environment. From the fundamentals of system performance to using analysis and optimization tools to their fullest, this wide-ranging resource shows developers and software architects how to get the most from Solaris systems and applications.

Whether you’re new to performance analysis and optimization or an experienced developer searching for the most efficient ways to solve performance issues, this practical guide gives you the background information, tips, and techniques for developing, optimizing, and debugging applications on Solaris.

The text begins with a detailed overview of the components that affect system performance. This is followed by explanations of the many developer tools included with Solaris OS and the Sun Studio compiler, and then it takes you beyond the basics with practical, real-world examples. In addition, you will learn how to use the rich set of developer tools to identify performance problems, accurately interpret output from the tools, and choose the smartest, most efficient approach to correcting specific problems and achieving maximum system performance.

Coverage includes

  • A discussion of the chip multithreading (CMT) processors from Sun and how they change the way that developers need to think about performance
  • A detailed introduction to the performance analysis and optimization tools included with the Solaris OS and Sun Studio compiler
  • Practical examples for using the developer tools to their fullest, including informational tools, compilers, floating point optimizations, libraries and linking, performance profilers, and debuggers
  • Guidelines for interpreting tool analysis output
  • Optimization, including hardware performance counter metrics and source code optimizations
  • Techniques for improving application performance using multiple processes, or multiple threads
  • An overview of hardware and software components that affect system performance, including coverage of SPARC and x64 processors


You can get it at Powell's Books online (my favorite online bookstore).


Sunday Dec 09, 2007

Setting up memcached on Solaris

These days everyone seems to use memcached, a high-performance, distributed memory object caching system intended to speed up web applications. Performance can improve greatly when data is fetched from RAM instead of from disk. Here is an excellent article explaining memcached, taking LiveJournal as a case study.
Did you know that memcached daemons can be set up on Solaris Zones too? For this you need to download the memcached package from the Cool Stack site.
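
The disk-versus-RAM point can be sketched as a cache-aside lookup. This is only an illustration: a local HashMap stands in for memcached, and the class name, the loader function, and the key are hypothetical, not part of any memcached client.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Cache-aside sketch: a local map stands in for memcached (not thread-safe). */
class CacheAsideSketch {
    private final Map<String, String> cache = new HashMap<>();
    // Hypothetical slow fetch, for example a disk or database read.
    private final Function<String, String> slowLoader;
    private int slowFetches = 0;

    CacheAsideSketch(Function<String, String> slowLoader) {
        this.slowLoader = slowLoader;
    }

    /** Return the cached value, calling the slow loader only on a miss. */
    String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = slowLoader.apply(key);
            slowFetches++;
            cache.put(key, value);
        }
        return value;
    }

    int slowFetches() {
        return slowFetches;
    }

    public static void main(String[] args) {
        CacheAsideSketch c = new CacheAsideSketch(k -> "row-for-" + k);
        c.get("user:42"); // miss: goes to the slow loader
        c.get("user:42"); // hit: served from memory
        System.out.println("slow fetches: " + c.slowFetches());
    }
}
```

The second lookup never reaches the slow loader, which is the whole benefit a cache such as memcached is meant to provide.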

What do you need to get?
1. Cool Stack 1.2
2. memcached Java Client APIs (Get this jar and this jar).

Create and Boot Zones
Create three zones (zonea, zoneb, and zonec) to test memcached. For information on creating Solaris zones, read this article.

I'm using SXDE 9/07
OK you don't have SXDE? Get it!

Here is the status of my zones:

# zoneadm list -vc
0 global running /            native shared
4 zonea  running /zones/zonea native shared
5 zoneb  running /zones/zoneb native shared
6 zonec  running /zones/zonec native shared

Start memcached on all Zones

Follow the instructions on the Cool Stack site to install Cool Stack on your zones. When you run pkgadd -d <memcached*.pkg>, memcached is installed in all the available zones, even those that are not in a running state.

When everything is set, these commands should work fine:

zonea# ./opt/coolstack/bin/memcached -u phantom -d -m 100 -l -p 11111
zoneb# ./opt/coolstack/bin/memcached -u phantom -d -m 100 -l -p 11112
zonec# ./opt/coolstack/bin/memcached -u phantom -d -m 100 -l -p 11113

These commands start memcached as a non-root user (-u) in daemon mode (-d) on all zones, each with a 100 MB memory limit (-m 100). This should be OK for testing, but ideally in a production setup it should be around a terabyte.

If you don't know already, each Solaris zone can bind to an IP and port through the virtual interface. So you don't need 3 NICs or 3 machines - but just 3 zones.

Test memcached

Set the classpath pointing to the downloaded jars. Use NetBeans for simplicity. Store an object in-memory and retrieve it from the memcached daemon running on zonea:

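    // Interact with zonea
    MemcachedClient c = null;
    try {
        // The host string is zonea's address; the port matches the -p value used above
        c = new MemcachedClient(new InetSocketAddress("", 11111));

        String test = "I'm going to be cached!";
        c.set("mykey", 180, test);      // expires after 180 seconds
        Object obj = c.get("mykey");
        System.out.println(obj);
    } catch (IOException ex) {
        ex.printStackTrace();
    } finally {
        if (c != null) c.shutdown();    // release the client's I/O thread
    }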


We are storing the object for 3 minutes (180 seconds). After retrieving the object, you can clean the cache. When you compile and run the program, the output will look like:

2007-10-03 10:38:49.615 INFO net.spy.memcached.MemcachedConnection: Connected to {QA sa=/, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} immediately
I'm going to be cached!
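
For readers curious what such a store looks like on the wire, here is a rough sketch of the request line in memcached's text protocol, which has the form "set <key> <flags> <exptime> <bytes>". It assumes flags of 0 and a value sent as raw UTF-8 bytes; the Java client used above may encode and flag values differently, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of the request line for a memcached text-protocol "set". */
class SetCommandSketch {
    /** Build "set <key> <flags> <exptime> <bytes>\r\n" for the given value. */
    static String setHeader(String key, int flags, int expirySeconds, byte[] value) {
        return "set " + key + " " + flags + " " + expirySeconds + " " + value.length + "\r\n";
    }

    public static void main(String[] args) {
        byte[] value = "I'm going to be cached!".getBytes(StandardCharsets.UTF_8);
        // The data block and a trailing "\r\n" follow this line on the wire;
        // the server answers "STORED\r\n" when the set succeeds.
        System.out.print(setHeader("mykey", 0, 180, value));
    }
}
```

The 180 in the request line is the same expiry, in seconds, that the client call passes.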

From your code, you can also connect to multiple memcached servers and store objects.
This is quite interesting. You can halt one zone and then try to store an object in memory across all three zones.

# zoneadm -z zonea halt
# zoneadm list -vc
0 global running   /            native shared
5 zoneb  running   /zones/zoneb native shared
6 zonec  running   /zones/zonec native shared
- zonea  installed /zones/zonea native shared

Now zonea is no longer running memcached because the zone is down.
Here is the modified code:

    MemcachedClient c;
    try {
        //zonea, zoneb and zonec
        c=new MemcachedClient(AddrUtil.getAddresses

        String test = "I'm going to be cached on all zones!";
        c.set("mykey2", 180, test);     // expires after 180 seconds
        Object obj = c.get("mykey2");
        System.out.println(obj);
    } catch (IOException ex) {
        ex.printStackTrace();
    }

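We are giving the client all three servers, but zonea is offline. Keep in mind that a memcached client hashes each key to one server in its list; it does not copy the object to every server.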
Here is the output:

2007-10-03 11:18:27.678 INFO net.spy.memcached.MemcachedConnection: Added {QA sa=/, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2007-10-03 11:18:27.682 INFO net.spy.memcached.MemcachedConnection: Connected to {QA sa=/, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} immediately
2007-10-03 11:18:27.684 INFO net.spy.memcached.MemcachedConnection: Connected to {QA sa=/, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} immediately

I'm going to be cached on all zones!

The object is queued for insertion whenever zonea comes up. It would be interesting to test the automatic failover behavior of memcached, considering that memcached is the mother of all hashtables and that some fail-safe plumbing is needed across the running memcached daemons. You can also use DTrace for memcached debugging.
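
One way a client could handle a missing server is sketched below: hash the key to a starting server, then walk the list until a live one is found. This is simple modulo hashing over a hypothetical server list (the zone names stand in for real addresses), and it is an assumption for illustration, not necessarily what the Java client above does.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch: choose a server for a key, skipping servers known to be down. */
class ServerPickSketch {
    static String pick(String key, List<String> servers, Set<String> down) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int start = Math.floorMod(key.hashCode(), servers.size());
        for (int i = 0; i < servers.size(); i++) {
            String candidate = servers.get((start + i) % servers.size());
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no memcached server available");
    }

    public static void main(String[] args) {
        // Hypothetical host names; the ports match the memcached commands above.
        List<String> servers = Arrays.asList("zonea:11111", "zoneb:11112", "zonec:11113");
        Set<String> down = new HashSet<>(Collections.singletonList("zonea:11111"));
        System.out.println("mykey2 -> " + pick("mykey2", servers, down));
    }
}
```

With this scheme a key that normally lands on zonea moves to the next server while zonea is halted, so it is found there only until zonea returns. That trade-off is why failover behavior is worth testing.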

Saturday Jun 23, 2007

Technical Articles on Performance Tuning

Here's a list of the current SDN articles dealing with tuning and optimization of applications on Solaris using Sun Studio compilers: 

Here are examples of using a compiler flag or inline assembly language with Sun Studio compilers to increase the performance of C, C++, and Fortran programs. (June 4, 2007)
This article describes how to profile an IBM WebSphere Application Server (WAS) runtime environment with the Sun Studio Performance Analysis Tools, Collector and Analyzer. (January 30, 2007)

The SHADE library is an emulator for SPARC hardware. The particular advantage of using SHADE is that it is possible to write an analysis tool that gathers information from the application being emulated. The SHADE library comes with some example analysis tools that track things like the number of instructions executed or how often each type of instruction is executed. A more advanced analysis tool might look at the cache misses that the application encounters for a given cache structure. (September 29, 2006)


Friday Jun 08, 2007

New Article: Performance Tuning with Sun Studio and Inline Assembly Code

 There's a new article on the SDN Sun Studio portal:

Performance Tuning With Sun Studio Compilers and Inline Assembly Code

By Timothy Jacobson, Sun Microsystems, June 2007  
For developers who need faster performance out of C, C++, or Fortran programs, Sun Studio compilers provide several efficient methods. Performance tuning has always been a difficult task requiring extensive knowledge of the machine architecture and instructions. To make this process easier, the Sun Studio C, C++, and Fortran compilers provide easy-to-use performance flags.

By using performance flags, developers can quickly improve execution speed. However, sometimes compiler flags alone do not result in optimum performance. For this reason, Sun Studio compilers also allow inline assembly code to be placed in critical areas. The inline code behaves similarly to a function or subroutine call, which enables cleaner, more readable code and also enables variables to be directly accessed in the inline assembly code.

This paper provides a demonstration of how to measure the performance of a critical piece of code. An example using a compiler flag and another example using inline assembly code are provided. The results are compared to show the benefits and differences of each approach.



For demonstration purposes, this paper uses an academic program to generate the Mandelbrot set. The example Mandelbrot program is written in C. Computing all the pixel values of the Mandelbrot set using the Sun Studio compiler is timed. Then, optimization flags are used and the computations are timed again. Finally, example Sun Studio inline assembly code is used and the computations are timed again and compared with the previous timings. The examples demonstrate two different methods for improving performance with the Sun Studio compiler: using flags and using inline assembly code.
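
The measurement step described in the abstract can be sketched as follows. This is a Java illustration rather than the article's C program, and the class name, grid size, and iteration limit are assumptions chosen for the example; it shows only how one might time the pixel computation, not the compiler flags or inline assembly the article covers.

```java
/** Sketch: time the escape-iteration computation for a small Mandelbrot grid. */
class MandelbrotTimingSketch {
    /** Count iterations of z = z*z + c until |z| exceeds 2 or maxIter is reached. */
    static int escapeIterations(double cr, double ci, int maxIter) {
        double zr = 0.0, zi = 0.0;
        int n = 0;
        while (n < maxIter && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        int width = 200, height = 200, maxIter = 1000; // assumed sizes
        long total = 0;
        long start = System.nanoTime();
        for (int py = 0; py < height; py++) {
            for (int px = 0; px < width; px++) {
                // Map the pixel to a point in the region [-2, 1] x [-1.5, 1.5].
                double cr = -2.0 + 3.0 * px / width;
                double ci = -1.5 + 3.0 * py / height;
                total += escapeIterations(cr, ci, maxIter);
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("total iterations: " + total + ", elapsed ms: " + elapsedMs);
    }
}
```

Timing the same loop before and after a change is the comparison the article makes between an unoptimized build, a build with flags, and a build with inline assembly.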




Application development on Solaris OS

