Friday Jul 04, 2008

Performance improvements in J2SE 6

One of the principal design goals of Java SE 6 (Mustang) was improved performance and scalability. Most of the gains come from better runtime optimizations, improved garbage collection and a few client-side features.

The runtime optimizations include biased locking, lock coarsening, adaptive spinning, support for large page heaps, background compilation and a new register allocation algorithm.

Biased locking avoids the cost of acquiring and releasing an object's lock when that cost is unnecessary. It rests on the observation that most synchronized blocks are never contended, and that many locks are only ever acquired by a single thread during their lifetime. In that scenario, having the same thread acquire and release the lock again and again with atomic operations is simply a waste of resources. With biased locking, the lock is biased towards the thread that acquires it first. Whenever the same thread enters the synchronized block again, it does not need to go through the full lock acquisition. However, if any other thread tries to acquire the lock, the bias has to be revoked, which is itself expensive. For the optimization to be profitable, the benefit of eliminating the atomic operations must exceed the penalty of revocation.
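
The kind of code that stands to benefit looks roughly like the sketch below: a synchronized object that only one thread ever touches. The class name UncontendedCounter and its methods are made up for illustration, and whether the lock actually gets biased depends on the JVM version and its settings.

```java
// Hypothetical example: a monitor that is only ever acquired by one thread.
class UncontendedCounter {
    private int count;

    // Every call acquires and releases the monitor of 'this'.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Calls increment() 'times' times, always from the calling thread.
    // No other thread ever competes for the counter's lock.
    static int countTo(int times) {
        UncontendedCounter counter = new UncontendedCounter();
        for (int i = 0; i < times; i++) {
            counter.increment();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countTo(1000000));
    }
}
```

The code is correct with or without biased locking. The optimization only changes how cheaply the repeated lock acquisitions are carried out.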

Lock coarsening applies where no significant work is done between releasing a lock and reacquiring it. For example, consider code that adds a few objects to a Vector one after another. Each add call acquires the Vector's lock and then releases it, which is wasteful. The compiler can observe that a sequence of adjacent blocks operates on the same lock and merge them into a single block. This reduces the locking overhead, and the whole sequence runs as one synchronized block.
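
The Vector case can be sketched as follows. The first method is what a programmer would normally write. The second is a hand-written approximation of what the code behaves like after coarsening. The class and method names are illustrative, and the JIT compiler decides on its own whether to apply the transformation.

```java
import java.util.Vector;

class LockCoarseningSketch {
    // As written: each add() locks and unlocks the vector separately,
    // so the lock is acquired and released three times.
    static void addSeparately(Vector<String> names) {
        names.add("alpha");
        names.add("beta");
        names.add("gamma");
    }

    // Roughly what the coarsened code amounts to: one acquisition of the
    // vector's lock around all three calls. The inner add() calls still
    // synchronize, but re-entering a lock the thread already holds is cheap.
    static void addCoarsened(Vector<String> names) {
        synchronized (names) {
            names.add("alpha");
            names.add("beta");
            names.add("gamma");
        }
    }

    public static void main(String[] args) {
        Vector<String> first = new Vector<String>();
        Vector<String> second = new Vector<String>();
        addSeparately(first);
        addCoarsened(second);
        System.out.println(first.equals(second));
    }
}
```

Both methods leave the vector with the same contents. The difference is only in how often the lock changes hands.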

Adaptive spinning comes into play when a thread tries to enter a synchronized block but cannot get the lock. Instead of immediately paying for an expensive context switch, the thread spins for a while in the hope that the lock is released soon. The length of time for which the thread spins is adaptive: it depends on factors such as recent spin successes and failures and the state of the current lock owner.
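
To make the idea concrete, here is a toy user-level lock that spins for a bounded number of attempts and adjusts that bound after each acquisition. This is not how HotSpot implements it inside the VM. The class name, the constants and the doubling/halving policy are all assumptions chosen for illustration, and Thread.yield() merely stands in for really blocking the thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch only; not the HotSpot algorithm.
class AdaptiveSpinLockSketch {
    private static final int MIN_SPINS = 16;
    private static final int MAX_SPINS = 4096;

    private final AtomicBoolean held = new AtomicBoolean(false);
    // Spin budget. Threads may race when updating it, which is harmless
    // because the value is only a tuning hint.
    private volatile int spinLimit = 256;

    void lock() {
        int limit = spinLimit;
        for (int i = 0; i < limit; i++) {
            if (held.compareAndSet(false, true)) {
                // Got the lock within the budget: allow more spinning next time.
                spinLimit = Math.min(MAX_SPINS, limit * 2);
                return;
            }
        }
        // Budget used up without success: spin less next time and back off.
        spinLimit = Math.max(MIN_SPINS, limit / 2);
        while (!held.compareAndSet(false, true)) {
            Thread.yield(); // stand-in for blocking the thread
        }
    }

    void unlock() {
        held.set(false);
    }

    // Runs several threads that increment a plain int under the lock
    // and returns the final value.
    static int demo(int threadCount, final int incrementsPerThread) {
        final AdaptiveSpinLockSketch lock = new AdaptiveSpinLockSketch();
        final int[] shared = new int[1];
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < incrementsPerThread; i++) {
                        lock.lock();
                        try {
                            shared[0]++;
                        } finally {
                            lock.unlock();
                        }
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 5000));
    }
}
```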

Java SE 6 also supports large page heaps. Large pages help avoid costly translation lookaside buffer (TLB) misses, so memory-intensive applications perform better. On the downside, large page heaps can significantly slow down the system or cause excessive paging in other applications running on it.

Background compilation is now also enabled in the Java SE 6 HotSpot client compiler, which means that hyperthreaded and multiprocessor systems can use spare CPU cycles to optimize Java code execution speed.

On the client side, Swing performance has improved significantly. The well-known “Gray Rect” problem has been fixed. The problem showed up when a Swing-based application was exposed after being hidden by another application: there was a noticeable delay between the window background being cleared and the Swing contents being painted. The fix was to add true double buffering support to Swing. An off-screen image is always kept in sync with the on-screen image, so when the Swing application is exposed, its contents are copied directly from the off-screen image.
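
The principle can be sketched with plain BufferedImage objects. This is not Swing's internal code: the class DoubleBufferSketch and its methods are hypothetical, and a second image stands in for the screen so the example runs without a display.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Illustrative sketch of double buffering, not Swing's implementation.
class DoubleBufferSketch {
    private final BufferedImage backBuffer;

    DoubleBufferSketch(int width, int height) {
        backBuffer = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
    }

    // Normal painting path: the (potentially slow) rendering is done
    // into the off-screen image.
    void render(Color background) {
        Graphics2D g = backBuffer.createGraphics();
        try {
            g.setColor(background);
            g.fillRect(0, 0, backBuffer.getWidth(), backBuffer.getHeight());
        } finally {
            g.dispose();
        }
    }

    // Expose path: nothing is re-rendered, the finished off-screen image
    // is simply copied to the target.
    void onExpose(BufferedImage screen) {
        Graphics2D g = screen.createGraphics();
        try {
            g.drawImage(backBuffer, 0, 0, null);
        } finally {
            g.dispose();
        }
    }

    public static void main(String[] args) {
        DoubleBufferSketch buffer = new DoubleBufferSketch(4, 4);
        buffer.render(Color.BLUE);
        BufferedImage screen = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        buffer.onExpose(screen);
        System.out.println(screen.getRGB(0, 0) == Color.BLUE.getRGB());
    }
}
```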

Further, the Java Virtual Machine's boot and extension class loaders have been enhanced to improve the cold-start time of Java applications.

One key feature added to the Java SE 6 HotSpot VM is an integrated DTrace provider.

DTrace is a dynamic troubleshooting and analysis tool first introduced in Solaris 10 and OpenSolaris. It can be used to track down software bugs, observe devices (network or disk activity, for example), observe applications and capture profiling data for performance analysis.

The DTrace provider makes available numerous probes that can be used to monitor the JVM's internal state and activities as well as the Java application that is running. Examples include the VM life cycle, thread life cycle, garbage collection and class loading probes.

Another key feature of Java SE 6 is StAX (Streaming API for XML), a bidirectional API for reading and writing XML. It is based on “pull parsing”, which combines the ease of use of a DOM parser with the efficiency of a SAX parser. In Java SE 6, SJSXP (Sun Java Streaming XML Parser) is the default implementation of StAX.
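
A minimal sketch of pull parsing with the standard javax.xml.stream API might look like this. The class name StaxPullSketch, the titles method and the sample document are made up for illustration. The point is that the application asks for the next event in a loop instead of receiving callbacks.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxPullSketch {
    // Collects the text of every <title> element in the given document.
    static List<String> titles(String xml) {
        List<String> result = new ArrayList<String>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                // The application drives the parser: it pulls one event at a time.
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "title".equals(reader.getLocalName())) {
                        // Reads the element's text and moves to its end tag.
                        result.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>Dune</title></book>"
                + "<book><title>Emma</title></book></books>";
        System.out.println(titles(xml));
    }
}
```

Because the loop is ordinary application code, it can stop early or skip whole subtrees, and the document is never loaded into memory as a tree.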



