Tuesday Jul 28, 2009

Java SE tools for Performance Analysis and debugging GlassFish

For effective performance analysis, it is very important to collect data, and to collect it consistently. To collect quality data, you need good tools for monitoring the JVM. Here is a collection of resources you can use to do performance analysis effectively.

If you are dealing with JVM heap or synchronization issues, you can start with tools such as jconsole, jstack, and jmap to obtain monitoring, management, and troubleshooting data from the JVM. This tech note page is a good starting point (http://java.sun.com/javase/6/docs/technotes/tools/) for tools such as jvisualvm and jconsole. It lists all the available tools, but a performance engineer will mostly be interested in the sections "Java Troubleshooting, Monitoring, Profiling..." and "Monitoring Tools".
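As a quick sketch of a first data-collection pass with these tools (all of them ship in the JDK's bin directory; the PID 12345 is a hypothetical placeholder you would get from jps):

```shell
# Sketch: a first JVM data-collection pass with the JDK's own tools.
# PID is a hypothetical placeholder; get the real one from `jps -l`.
PID=12345

CMDS="jps -l
jstack $PID
jmap -histo $PID
jstat -gcutil $PID 5s 10"

# jps lists running JVMs, jstack takes a thread dump (lock issues),
# jmap -histo prints a heap histogram, jstat -gcutil samples GC usage.
# Printed as a dry run here; run each line against a live JVM instead.
printf '%s\n' "$CMDS"
```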

Starting with the Java SE platform at JDK 6 Update 7, you can use VisualVM (http://java.sun.com/javase/6/docs/technotes/guides/visualvm/index.html), since it is bundled with the platform. VisualVM incorporates a plug-in called Visual GC, which presents GC information in a GUI for easy analysis. For more information on VisualVM, visit the development site (https://visualvm.dev.java.net). Apart from Visual GC, you can also use the Profiler and Threads tabs to look at profiling data and monitor thread stacks.

GChisto (https://gchisto.dev.java.net), still under development, is another visual tool you can use if you are only interested in GC histograms; it gives a good understanding of tenuring and related GC behavior.

You can use VisualVM for debugging and analyzing GlassFish issues, since GlassFish released a VisualVM plug-in that interacts with GlassFish and lets you monitor the GlassFish Application Server. If you want to get more creative, look into BTrace; there is a BTrace plug-in for VisualVM called BTrace4VisualVM that you can experiment with.

Wednesday Jun 17, 2009

Java Performance Tuning resources

A lot of folks, mostly engineers I meet outside our company, have asked me about tuning references. So I thought I would list, and keep collecting, resources that are good starting points if you are dealing with performance issues (in random order). Once you have a good understanding of the JVM, you can use profiling tools to figure out the performance problem. The site http://java.sun.com/javase/technologies/performance.jsp was maintained by my team while I was managing both the Java SE and EE performance groups a couple of years back, and it has some EE resources too (though it is a little dated now). Currently, our performance team takes care of "Middleware" product performance, including GlassFish, so check out the blogs from my team members.

One more note: if you are on JDK 6, I highly recommend that you pick up JDK 6.0_14 (Update 14), since it has improvements for large 64-bit heaps when you turn on compressed OOPs (-XX:+UseCompressedOops together with the -d64 switch).
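A minimal sketch of such a launch line (the 12g heap size and the MyServer class name are illustrative placeholders; -XX:+UseCompressedOops only applies to a 64-bit JVM started with -d64):

```shell
# Illustrative 64-bit launch line; 12g and MyServer are placeholders.
JVM_FLAGS="-d64 -Xmx12g -XX:+UseCompressedOops"

# Shown as a dry run; drop the echo to actually launch the server class.
echo "java $JVM_FLAGS MyServer"
```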

Monday Apr 20, 2009

GlassFish releases another record open-source software stack SPECjAppServer2004 result

Sun published a new SPECjAppServer2004 result of 2925.18 JOPS@Standard on April 20th, 2009. This highlights the scalability of GlassFish on the OpenSolaris and Java platforms using our new Sun Fire X2270 and Sun Fire X4170 servers based on Intel processors. The details are available in this press release.

This result of 2925.18 JOPS@Standard improves on our previously published result of 1197.10 JOPS@Standard from September 2008 on Sun Fire X4150 servers, highlighting the GlassFish Application Server's commitment to improving performance and scalability. The 1197.10 JOPS publication established a price/performance metric of $33.90 per JOPS@Standard.

For more details about this new benchmark result and its configuration, visit Tom Daly's blog, Kevin Kelly's blog, and Jennifer Glore's blog. We use $/JOPS as the price/performance metric to compare the cost incurred to achieve a single transaction of work, measured as JOPS@Standard.

The following table gives you a comparison of $/JOPS for both the publications released by Sun using GlassFish Application Server.

SPECjAppServer2004 Complete Solution (All prices in $US)

                                                  Sun Fire X4150/GF v2U2   Sun Fire X2270/GF v2.1
                                                  (Sep. 2008)              (Apr. 2009)
Performance (SPECjAppServer2004 JOPS@Standard)    1197.10                  2925.18
Total Acquisition Price
Software Acquisition Price (includes database)
Price/Performance ($/JOP; smaller is better)      $33.90                   $26.95

Based on the above information, it is clear that the GlassFish solution achieves high performance and the best price/performance metric, costing about $26.95/JOP, which beats our previously published price/performance of $33.90/JOP. This demonstrates how an open-source software stack from Sun can be used to achieve the best price/performance result.
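A quick arithmetic check of the improvement, using only the numbers from the two publications above:

```shell
# Compare the two publications: throughput ratio and cost per JOP.
awk 'BEGIN {
  old_jops = 1197.10; new_jops = 2925.18   # JOPS@Standard, Sep. 2008 vs Apr. 2009
  old_ppj  = 33.90;   new_ppj  = 26.95     # $/JOP for each publication
  printf "throughput improvement: %.2fx\n", new_jops / old_jops
  printf "cost per JOP: %.1f%% of the 2008 result\n", 100 * new_ppj / old_ppj
}'
```

In other words, throughput roughly 2.4x higher while each JOP costs about 20% less.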

Required Disclosure : SPEC and SPECjAppServer are registered trademarks of Standard Performance Evaluation Corporation. Results from www.spec.org as of 04/20/2009. Sun GlassFish Enterprise Server v2.1 on Sun Fire X2270 with MySQL 5.1 on OpenSolaris 2008.11. Application Server: 1 x X2270 8 x cores (2 Chips) and Database Server: 1 x X4170 8 x Cores (2 Chips) 2925.18 SPECjAppServer2004 JOPS@Standard; 2xSun Fire X4150 (8 cores, 2chips) and 1xSun Fire X4150 (4 cores, 1 chip) 1197.10 SPECjAppServer2004 JOPS@Standard

Tuesday Jul 10, 2007

SJSAS (Application Server) 9.1 hits the top spot on SPECjAppServer Performance on T2000

Today, Sun Microsystems, Inc. posted a stellar SPECjAppServer2004 benchmark performance number, 883.66 JOPS@Standard, on SJSAS 9.1 (GlassFish V2) on a Sun Fire T2000 server. A single Sun Fire T2000 server with a 1.4 GHz 8-core UltraSPARC T1 processor ran a single instance of the Java System Application Server on the Solaris 10 operating system to deliver the record-setting performance. In addition, a Sun Fire T2000 server with a single 1.0 GHz 6-core UltraSPARC T1 processor ran IBM DB2 9.1 on the Solaris 10 operating system. The full details of the publication are available here; visit Scott Oaks' blog for a detailed competitive analysis. Visit The Aquarium if you want more details on GlassFish V2.

We have improved all aspects of the application server to achieve this "top spot" on the Sun Fire T2000 server. We would like to hear from our users and developers about what areas you would like us to improve in future releases. We have a performance forum at java.net which you can use to communicate with the Application Server performance team.

Disclosure Statement: Sun Fire T2000 (1 chips, 8 cores) 883.66 SPECjAppServer2004 JOPS@Standard. SPEC, SPECjAppServer reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org as of 7/10/07.

Sunday May 14, 2006

Performance Analysis in 30 Minutes

The title is not about doing "complete" performance analysis in 30 minutes. Performance analysis is more an art than a science; you don't know what you are getting into or its magnitude. Whether it is art or science is the subject of a great debate, but let me stick to my title. I am trying to address situations where you have only 30 minutes to do your analysis because your production system is crawling. You get called into a production issue, your system or customer is experiencing a performance problem, and they want a solution in 30 minutes. What do you do?

Here, I am going to talk about what options you have and how to approach a performance issue without having to go through the application code. This is called system performance analysis. Most experts say that the application code is usually the culprit, but can you blame the application code without first eliminating possible systemic performance problems? If you have only half an hour, my recommendation is to start with the systemic approach. If you want to dive deep, then I recommend using DTrace; see the Solaris Dynamic Tracing Guide or visit the DTrace Community. The DTrace framework lets you trace any point of interest, with thousands of built-in probes ready for use, including probes for tracing Java methods and JVM issues.
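As a taste, the classic first DTrace one-liner counts system calls per process name. It is shown here as a dry run, since actually running it requires root privileges on a Solaris system:

```shell
# Classic DTrace starter: count system calls per process name.
# Stored and echoed as a dry run; remove the echo and run as root on Solaris.
ONE_LINER="dtrace -n 'syscall:::entry { @num[execname] = count(); }'"
echo "$ONE_LINER"
```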

While most of the following information applies directly to Unix derivatives such as Solaris or Linux, the principles can be applied generically. So, what do you need, and where do you start? In terms of what you need, the answer is tools: built-in tools on the system that you can use quickly. "vmstat" is a good utility which gives a full overview of processes, memory, paging/swap, I/O, system, and CPU. In terms of where to start, you need to divide the problem into different areas, and I am going to start with the areas below. I will also point out what I think are good general guidelines based on my experience, but this is highly subjective, so your mileage may vary.

CPU: The first thing to look at is CPU utilization. Provided that you have loaded up the users gradually, you should see a steady increase in CPU utilization. If you are not able to max out the CPUs in spite of loading up users, you could have locking and synchronization issues.

Use "vmstat" and pay attention to proc and cpu statistics.

1. Look for runnable processes, reported as 'r' under procs, which counts processes in the run queue. If you have processes in the run queue and idle time on the CPU, then you have a scalability issue. It is OK for the run queue to be equal to, or a little more than, the number of CPUs as long as the system is running at full CPU utilization; in fact, this is the ideal scalability situation, meaning we are able to utilize the system to its full potential. Also, make sure that you do not have any processes reported as 'b' or 'w' under procs ('b': blocked for resources such as I/O or paging; 'w': process swapped out).

2. You want the CPUs to spend more time in user land (us) than in kernel land (sy). A good rule of thumb is about 4:1 (us:sy). Look at 'us' and 'sy' under cpu.
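The two checks above can be scripted against vmstat output. A minimal sketch, using a canned and simplified sample (real vmstat prints many more columns) and assuming a hypothetical 4-CPU box:

```shell
# Canned, simplified vmstat-style sample (real output has more columns).
# r = run queue; us/sy/id = user/system/idle CPU percentages.
sample='r  us sy id
8   78 20  2
10  80 18  2'

# Heuristics from the checklist above, assuming a hypothetical 4-CPU box:
# runnable threads piling up while CPUs sit idle, and a us:sy ratio under 4:1.
findings=$(echo "$sample" | awk 'NR > 1 {
  if ($1 > 4 && $4 > 10) print "possible scalability issue: run queue deep while CPUs are idle"
  if ($2 < 4 * $3)       print "high system time: us:sy ratio below 4:1"
}')
echo "$findings"
```

Here the first sample row trips only the us:sy check (78:20 is below 4:1), while neither row shows idle CPU alongside a deep run queue.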

Use "mpstat" to get more details. A typical mpstat output looks like this:

CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys  wt idl
  0  378   0 4422  403  300  162    3   13   23   0   966   0   4   0  95
  1  366   0 3374  111  100  283   10   34    8   0  1188   0   5   0  95

Lock contention is one of the major causes of performance problems. There are four types of locks in Solaris: mutexes, semaphores, condition variables, and multiple-readers/single-writer (read-write) locks.

1. Look at the "smtx" value, which indicates the number of times a CPU failed to obtain a mutex immediately. If the smtx value exceeds 500 for any CPU and system time is greater than 20%, it is possible that mutex contention is happening on the system.

2. Look at the "srw" value, which indicates the number of times a CPU failed to obtain a read-write lock immediately.

You can use the "lockstat sleep 15" command to help identify contention on your system. Look for large counts (indv) with long locking times (nsec) for adaptive mutex blocks.
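The smtx heuristic above can also be scripted. A sketch against mpstat-style data, where the column subset and the second, contended row are invented for illustration:

```shell
# mpstat-style sample: the second data row is an invented, contended CPU.
sample='CPU smtx sys
0    23    4
1   812   31'

# Flag CPUs matching the heuristic: smtx > 500 and system time > 20%.
flagged=$(echo "$sample" | awk 'NR > 1 && $2 > 500 && $3 > 20 {
  printf "CPU %s: possible mutex contention (smtx=%s, sys=%s%%)\n", $1, $2, $3
}')
echo "$flagged"
```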

You could also have CPU cycles chewed up by other subsystems such as I/O ("iostat -xnz 5"), the network ("netstat -an"), or memory swapping ("vmstat -S", i.e., use the -S switch with vmstat to see swap statistics). Run these commands at regular intervals and spool the output to a file so that you can review it for potential problems.

Okay, after all this, let's assume that everything looks good, but you have a situation where your CPU is being consumed by a single thread in your Java program; how do you figure out what that thread is running? Use "prstat" to find the CPU-hogging process, then "pstack pid" to note the LWP number that is hogging the CPU at that particular time. Once you know the LWP number, use the "kill -3 pid" command and look for the corresponding "nid" value (the LWP number in hex) in the thread dump to find the stack trace and the method consuming the CPU in that Java process. I will cover this later in more detail once I have some example snapshots.
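The one fiddly step is matching the decimal LWP number against the hex "nid" field in the thread dump; a one-line conversion handles it (the LWP value here is a hypothetical example):

```shell
# prstat/pstack report the LWP id in decimal; the thread dump from kill -3
# reports it as a hex "nid". Convert the decimal LWP to match them up.
LWP=25   # hypothetical hot LWP
msg=$(printf 'look for nid=0x%x in the thread dump' "$LWP")
echo "$msg"
```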


Tuesday Dec 06, 2005

Today is the day

Today is the day we are announcing the Sun Fire T1000 and Sun Fire T2000. Sun's first chip multithreading (CMT) processor is released in the Sun Fire T1000 and Sun Fire T2000 servers. The official name is UltraSPARC T1, but our internal code name is Niagara.

I have been planning to blog for some time but never sat down to do it. However, I am very happy to sit down today and write about the UltraSPARC T1 (CMT) servers, especially about performance and scalability. Our team makes Java SE and Java EE scream on these CMT servers. I will begin by highlighting what these CMT servers mean to the Java world, and I would like to follow up with more blogs later. See Richard McDougall's blog for CMT information.

In one sentence, this is all about multithreading. If you do multithreaded programming and you use Java(TM) for web programming, then these servers are for you. So what do we have in store for you? We have the best performance. Not only the best performance, but world-record performance and price/performance on comparable systems. See David Dagastine's post on Java performance. If you care about Java performance tuning on these systems, see Brian Doherty's weblog.

If you are interested in Java EE and Application Server performance on these servers, then visit Scott Oaks' blog for analysis. Worried about total cost of ownership (TCO) on the application tier? Then visit our announcement to see Sun Java(TM) System Application Server 8.2 Platform Edition (AS 8.2 PE) as the price/performance leader.

We have scalability, rather excellent scalability, on these servers. If you are into web services, visit Bharath Mundlapudi's blog to see scalability results. Interested in XML processing and parser performance? Then look at Kim LiChong's blog. If you have specific questions on Java performance, visit us at the Java Performance Community website.

Stay tuned....

