Analysis against JVM Thread Dump - CPU High Usage Issue

One important duty of mine during years' of technical support is to analyze JVM thread dump, which is an interesting research to help understand how the logic in source code works and more important is that you will learn how to design a good Java EE applications to avoid scenario of high CPU usage, resource contention or something like these. One typical bad design it dead lock, though it seldom happens in modern Java development, but I will demonstrate all of them with sample source code, and show how to analyze the root cause from Thread Dump. All will be practical with no abstract theory.

This article will discuss one scenario you will always see, High CPU Usage like 99% or 100%. This case is quite easy to identify the root cause.

Let's read this sample source code first, it is just an example and nobody developer would code such.

package zigzag.research.threaddump;

public class HighCPU {
    public static void main(String[] args)
    {
        while(true){
            // do nothing
        }
    }
}


On Linux

Run it and top command shows you the problematic parent process (attention it is process in Linux, not thread).

Now we look at the JVM thread dump (I removed some unusual information section that I do not care). What problematic thread will you get from below? WAITING status? No! In Java VM, there are many daemon threads and in most of time they are WAITING.

"Finalizer" daemon prio=1 tid=0x08fb6428 nid=0xbea in Object.wait() [0xb59d5000..0xb59d60b0]
        at java.lang.Object.wait(Native Method)
        - waiting on(a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
        - locked(a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=1 tid=0x08fb5ea8 nid=0xbe9 in Object.wait() [0xb5a56000..0xb5a56e30]
        at java.lang.Object.wait(Native Method)
        - waiting on(a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:474)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked(a java.lang.ref.Reference$Lock)

"main" prio=1 tid=0x08f128d0 nid=0xbe5 runnable [0xff80c000..0xff80c5e8]
        at zigzag.research.threaddump.HighCPU.main(HighCPU.java:6)


On Linux, we can press Shift-H on top command and the individual Threads will be displayed instead of parent PID. The man page for top command is:

       -H : Threads toggle
            Starts top with the last remembered 'H' state reversed.  
			When this toggle is On, all individual threads will be displayed. 
			Otherwise, top displays a summation of all threads in a process.

Now we get above top command screenshot with Shift-H is toggled on. The problematic Thread (PID) occurs as 3045 which is Decimal and the Hexadecimal value is 0xBE5. So you got it?

"main" prio=1 tid=0x08f128d0 nid=0xbe5 runnable [0xff80c000..0xff80c5e8]
        at zigzag.research.threaddump.HighCPU.main(HighCPU.java:6)
		


On Windows

It was difficult to get the OS level Thread on Windows platform to anlyze thread dump. Luckily we have 3rd part tool like Process Explorer to get the problematic TID like below Screenshot.

You will see the CPU usage increase to almost 100% if your computer CPU is single core. In my example it consumes 24.71% because mine is 4 core CPU.

As above screenshot shows a problematic TID which takes 24.71% CPU time and never goes down, which is abnormal. Here the TID is native thread id in operation system level and it is not the "tid" of JVM thread dump, they are definitely different. The TID in OS is the nid (Native Thread ID) in JVM Thread Dump stack.

In this example the TID (Native Thread ID) is 2708 (Decimal), which is 0xA94(hexadecimal). So you will get the real problematic thread stack if you have:

"main" prio=6 tid=0x000000000054b000 nid=0xa94 runnable [0x00000000025cf000]
   java.lang.Thread.State: RUNNABLE
        at zigzag.research.threaddump.HighCPU.main(HighCPU.java:6)


Summary

Java nid(Native Thread ID) = Linux Process = Windows Thread


Comments:

Hello,

I am getting Error message while starting the server. Please help for that. Please see detailed error message below:

******************************************************************
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Windows\system32>g:

G:\>cd Agile\Agile932\agileDomain\bin

G:\Agile\Agile932\agileDomain\bin>startAgile

CLASSPATH="G:\Oracle\MIDDLE~1\patch_wls1211\profiles\default\sys_manifest_classp
ath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\patch_oepe101\profiles\default\sys_man
ifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\patch_ocp371\profiles\defa
ult\sys_manifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\JDK160~1\lib\to
ols.jar;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\lib\weblogic_sp.jar;G:\Oracle\MIDDL
E~1\WLSERV~1.1\server\lib\weblogic.jar;G:\Oracle\MIDDLE~1\modules\features\weblo
gic.server.modules_12.1.1.0.jar;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\lib\webserv
ices.jar;G:\Oracle\MIDDLE~1\modules\ORGAPA~1.1/lib/ant-all.jar;G:\Oracle\MIDDLE~
1\modules\NETSFA~1.0_1/lib/ant-contrib.jar;"

PATH="G:\Oracle\MIDDLE~1\patch_wls1211\profiles\default\native;G:\Oracle\MIDDLE~
1\patch_oepe101\profiles\default\native;G:\Oracle\MIDDLE~1\patch_ocp371\profiles
\default\native;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\native\win\32;G:\Oracle\MID
DLE~1\WLSERV~1.1\server\bin;G:\Oracle\MIDDLE~1\modules\ORGAPA~1.1\bin;G:\Oracle\
MIDDLE~1\JDK160~1\jre\bin;G:\Oracle\MIDDLE~1\JDK160~1\bin;G:\Agile\Agile932\jdk\
jre\bin;G:\Agile\Agile932\jdk\jre\bin\client;;G:\app\Gitanjali\product\11.2.0\db
home_1\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\Sy
stem32\WindowsPowerShell\v1.0\;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\native\win\3
2\oci920_8"

Your environment has been set.
A subdirectory or file G:\Agile\Agile932/agileDomain/servers/Gitanjali-PC-AgileS
erver/logs already exists.

G:\Agile\Agile932\agileDomain>"C:\Program Files\Java\jrockit-jdk1.6.0_45-R28.2.7
-4.1.0\bin\java" -server -ms3072M -mx3072M -XX:MaxPermSize=512M -XX:NewSize=1300
M -XX:MaxNewSize=1300M -XX:+UseConcMarkSweepGC -classpath "G:\Agile\Agile932\agi
leDomain\lib/ojdbc6.jar;G:\Oracle\MIDDLE~1\patch_wls1211\profiles\default\sys_ma
nifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\patch_oepe101\profiles\de
fault\sys_manifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\patch_ocp371\
profiles\default\sys_manifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\JD
K160~1\lib\tools.jar;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\lib\weblogic_sp.jar;G:
\Oracle\MIDDLE~1\WLSERV~1.1\server\lib\weblogic.jar;G:\Oracle\MIDDLE~1\modules\f
eatures\weblogic.server.modules_12.1.1.0.jar;G:\Oracle\MIDDLE~1\WLSERV~1.1\serve
r\lib\webservices.jar;G:\Oracle\MIDDLE~1\modules\ORGAPA~1.1/lib/ant-all.jar;G:\O
racle\MIDDLE~1\modules\NETSFA~1.0_1/lib/ant-contrib.jar;;;G:\Agile\Agile932\agil
eDomain\lib/agbase.jar;G:\Agile\Agile932\agileDomain\lib/wlsauth.jar;G:\Agile\Ag
ile932\agileDomain\lib/crypto.jar;G:\Agile\Agile932\agileDomain\lib/xercesImpl.j
ar;G:\Agile\Agile932\agileDomain\lib/jdom.jar;;G:\Agile\Agile932\agileDomain\lib
/log4j.jar;;G:\Agile\Agile932\agileDomain\lib/jobaccess.jar;;G:\Agile\Agile932\a
gileDomain\lib/colt.jar;../ldaplib/ldaputil.jar;../ldaplib/commons-cli.jar;../ld
aplib/groovy-all-1.5.6.jar;../ldaplib/ldapbp.jar;G:\Agile\Agile932/agileDomain/c
onfig" -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9899 -
Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.
ssl=false -Dweblogic.system.BootIdentityFile=G:\Agile\Agile932/agileDomain/conf
ig/boot.properties -Dweblogic.Name=Gitanjali-PC-AgileServer "-Dbea.home=G:\Oracl
e\Middleware" -Dweblogic.ProductionModeEnabled=true -Dweblogic.log.StdoutSeverit
y=Error "-Djava.security.policy==G:\Oracle\Middleware\wlserver_12.1/lib/weblogic
.policy" -Dagile.log.dir=G:\Agile\Agile932/agileDomain/servers/Gitanjali-PC-Agil
eServer/logs -Djava.awt.headless=true weblogic.Server
[WARN ][jrockit] MaxPermSize=512M ignored: Not a valid option for JRockit
[WARN ][jrockit] NewSize=1300M ignored: Not a valid option for JRockit
[WARN ][jrockit] MaxNewSize=1300M ignored: Not a valid option for JRockit
Unknown option or illegal argument: -XX:+UseConcMarkSweepGC.
Please check for incorrect spelling or review documentation of startup options.

Could not create the Java virtual machine.

G:\Agile\Agile932\agileDomain>goto finish

G:\Agile\Agile932\agileDomain>ENDLOCAL

G:\Agile\Agile932\agileDomain\bin>

******************************************************************

Regards,
Vikas Kumar

Posted by guest on June 15, 2013 at 04:43 PM CST #

Hello Vikas Kumar

It appears you are referencing jrockit to start Agile 9.3.2. However Agile does not support jrocket JVM, it only supports Oracle Java (Sun). See below message.

--
>"C:\Program Files\Java\jrockit-jdk1.6.0_45-R28.2.7
-4.1.0\bin\java" -server -ms3072M -mx3072M
--

You can modify agileDomain/bin/setEnv.cmd file to point JAVA_HOME to correct Oracle Java home instead of jrockit

Regards
Jie

Posted by Jie Chen on June 15, 2013 at 06:36 PM CST #

Post a Comment:
  • HTML Syntax: NOT allowed
About


Jie Chen is the L3 memeber in Oracle Agile Support.
This blog focuses on the Maintenance, Diagnosis and Tuning related technical skills.
The technology covers Java/JavaEE, Weblogic, Security, Clustering, and Database of course.

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today