X

This blog introduces how to diagnose Agile PLM related issues.

  • April 4, 2013

Analysis against JVM Thread Dump - CPU High Usage Issue

Jie Chen
Senior Principal Technical Support Engineer

One important duty of mine during years' of technical support is to analyze JVM thread dump, which is an interesting research to help understand how the logic in source code works and more important is that you will learn how to design a good Java EE applications to avoid scenario of high CPU usage, resource contention or something like these. One typical bad design it dead lock, though it seldom happens in modern Java development, but I will demonstrate all of them with sample source code, and show how to analyze the root cause from Thread Dump. All will be practical with no abstract theory.

This article will discuss one scenario you will always see, High CPU Usage like 99% or 100%. This case is quite easy to identify the root cause.

Let's read this sample source code first, it is just an example and nobody developer would code such.

package zigzag.research.threaddump;
public class HighCPU {
public static void main(String[] args)
{
while(true){
// do nothing
}
}
}



On Linux

Run it and top command shows you the problematic parent process (attention it is process in Linux, not thread).

Now we look at the JVM thread dump (I removed some unusual information section that I do not care). What problematic thread will you get from below? WAITING status? No! In Java VM, there are many daemon threads and in most of time they are WAITING.

"Finalizer" daemon prio=1 tid=0x08fb6428 nid=0xbea in Object.wait() [0xb59d5000..0xb59d60b0]
at java.lang.Object.wait(Native Method)
- waiting on(a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
- locked(a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
"Reference Handler" daemon prio=1 tid=0x08fb5ea8 nid=0xbe9 in Object.wait() [0xb5a56000..0xb5a56e30]
at java.lang.Object.wait(Native Method)
- waiting on(a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:474)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked(a java.lang.ref.Reference$Lock)
"main" prio=1 tid=0x08f128d0 nid=0xbe5 runnable [0xff80c000..0xff80c5e8]
at zigzag.research.threaddump.HighCPU.main(HighCPU.java:6)


On Linux, we can press Shift-H on top command and the individual Threads will be displayed instead of parent PID. The man page for top command is:

       -H : Threads toggle
Starts top with the last remembered 'H' state reversed.

When this toggle is On, all individual threads will be displayed.

Otherwise, top displays a summation of all threads in a process.

Now we get above top command screenshot with Shift-H is toggled on. The problematic Thread (PID) occurs as 3045 which is Decimal and the Hexadecimal value is 0xBE5. So you got it?

"main" prio=1 tid=0x08f128d0 nid=0xbe5 runnable [0xff80c000..0xff80c5e8]
at zigzag.research.threaddump.HighCPU.main(HighCPU.java:6)



On Windows

It was difficult to get the OS level Thread on Windows platform to anlyze thread dump. Luckily we have 3rd part tool like Process Explorer to get the problematic TID like below Screenshot.

You will see the CPU usage increase to almost 100% if your computer CPU is single core. In my example it consumes 24.71% because mine is 4 core CPU.

As above screenshot shows a problematic TID which takes 24.71% CPU time and never goes down, which is abnormal. Here the TID is native thread id in operation system level and it is not the "tid" of JVM thread dump, they are definitely different. The TID in OS is the nid (Native Thread ID) in JVM Thread Dump stack.

In this example the TID (Native Thread ID) is 2708 (Decimal), which is 0xA94(hexadecimal). So you will get the real problematic thread stack if you have:

"main" prio=6 tid=0x000000000054b000 nid=0xa94 runnable [0x00000000025cf000]
java.lang.Thread.State: RUNNABLE
at zigzag.research.threaddump.HighCPU.main(HighCPU.java:6)



Summary

Java nid(Native Thread ID) = Linux Process = Windows Thread


Join the discussion

Comments ( 4 )
  • guest Saturday, June 15, 2013

    Hello,

    I am getting Error message while starting the server. Please help for that. Please see detailed error message below:

    ******************************************************************

    Microsoft Windows [Version 6.1.7601]

    Copyright (c) 2009 Microsoft Corporation. All rights reserved.

    C:\Windows\system32>g:

    G:\>cd Agile\Agile932\agileDomain\bin

    G:\Agile\Agile932\agileDomain\bin>startAgile

    CLASSPATH="G:\Oracle\MIDDLE~1\patch_wls1211\profiles\default\sys_manifest_classp

    ath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\patch_oepe101\profiles\default\sys_man

    ifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\patch_ocp371\profiles\defa

    ult\sys_manifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\JDK160~1\lib\to

    ols.jar;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\lib\weblogic_sp.jar;G:\Oracle\MIDDL

    E~1\WLSERV~1.1\server\lib\weblogic.jar;G:\Oracle\MIDDLE~1\modules\features\weblo

    gic.server.modules_12.1.1.0.jar;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\lib\webserv

    ices.jar;G:\Oracle\MIDDLE~1\modules\ORGAPA~1.1/lib/ant-all.jar;G:\Oracle\MIDDLE~

    1\modules\NETSFA~1.0_1/lib/ant-contrib.jar;"

    PATH="G:\Oracle\MIDDLE~1\patch_wls1211\profiles\default\native;G:\Oracle\MIDDLE~

    1\patch_oepe101\profiles\default\native;G:\Oracle\MIDDLE~1\patch_ocp371\profiles

    \default\native;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\native\win\32;G:\Oracle\MID

    DLE~1\WLSERV~1.1\server\bin;G:\Oracle\MIDDLE~1\modules\ORGAPA~1.1\bin;G:\Oracle\

    MIDDLE~1\JDK160~1\jre\bin;G:\Oracle\MIDDLE~1\JDK160~1\bin;G:\Agile\Agile932\jdk\

    jre\bin;G:\Agile\Agile932\jdk\jre\bin\client;;G:\app\Gitanjali\product\11.2.0\db

    home_1\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\Sy

    stem32\WindowsPowerShell\v1.0\;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\native\win\3

    2\oci920_8"

    Your environment has been set.

    A subdirectory or file G:\Agile\Agile932/agileDomain/servers/Gitanjali-PC-AgileS

    erver/logs already exists.

    G:\Agile\Agile932\agileDomain>"C:\Program Files\Java\jrockit-jdk1.6.0_45-R28.2.7

    -4.1.0\bin\java" -server -ms3072M -mx3072M -XX:MaxPermSize=512M -XX:NewSize=1300

    M -XX:MaxNewSize=1300M -XX:+UseConcMarkSweepGC -classpath "G:\Agile\Agile932\agi

    leDomain\lib/ojdbc6.jar;G:\Oracle\MIDDLE~1\patch_wls1211\profiles\default\sys_ma

    nifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\patch_oepe101\profiles\de

    fault\sys_manifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\patch_ocp371\

    profiles\default\sys_manifest_classpath\weblogic_patch.jar;G:\Oracle\MIDDLE~1\JD

    K160~1\lib\tools.jar;G:\Oracle\MIDDLE~1\WLSERV~1.1\server\lib\weblogic_sp.jar;G:

    \Oracle\MIDDLE~1\WLSERV~1.1\server\lib\weblogic.jar;G:\Oracle\MIDDLE~1\modules\f

    eatures\weblogic.server.modules_12.1.1.0.jar;G:\Oracle\MIDDLE~1\WLSERV~1.1\serve

    r\lib\webservices.jar;G:\Oracle\MIDDLE~1\modules\ORGAPA~1.1/lib/ant-all.jar;G:\O

    racle\MIDDLE~1\modules\NETSFA~1.0_1/lib/ant-contrib.jar;;;G:\Agile\Agile932\agil

    eDomain\lib/agbase.jar;G:\Agile\Agile932\agileDomain\lib/wlsauth.jar;G:\Agile\Ag

    ile932\agileDomain\lib/crypto.jar;G:\Agile\Agile932\agileDomain\lib/xercesImpl.j

    ar;G:\Agile\Agile932\agileDomain\lib/jdom.jar;;G:\Agile\Agile932\agileDomain\lib

    /log4j.jar;;G:\Agile\Agile932\agileDomain\lib/jobaccess.jar;;G:\Agile\Agile932\a

    gileDomain\lib/colt.jar;../ldaplib/ldaputil.jar;../ldaplib/commons-cli.jar;../ld

    aplib/groovy-all-1.5.6.jar;../ldaplib/ldapbp.jar;G:\Agile\Agile932/agileDomain/c

    onfig" -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9899 -

    Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.

    ssl=false -Dweblogic.system.BootIdentityFile=G:\Agile\Agile932/agileDomain/conf

    ig/boot.properties -Dweblogic.Name=Gitanjali-PC-AgileServer "-Dbea.home=G:\Oracl

    e\Middleware" -Dweblogic.ProductionModeEnabled=true -Dweblogic.log.StdoutSeverit

    y=Error "-Djava.security.policy==G:\Oracle\Middleware\wlserver_12.1/lib/weblogic

    .policy" -Dagile.log.dir=G:\Agile\Agile932/agileDomain/servers/Gitanjali-PC-Agil

    eServer/logs -Djava.awt.headless=true weblogic.Server

    [WARN ][jrockit] MaxPermSize=512M ignored: Not a valid option for JRockit

    [WARN ][jrockit] NewSize=1300M ignored: Not a valid option for JRockit

    [WARN ][jrockit] MaxNewSize=1300M ignored: Not a valid option for JRockit

    Unknown option or illegal argument: -XX:+UseConcMarkSweepGC.

    Please check for incorrect spelling or review documentation of startup options.

    Could not create the Java virtual machine.

    G:\Agile\Agile932\agileDomain>goto finish

    G:\Agile\Agile932\agileDomain>ENDLOCAL

    G:\Agile\Agile932\agileDomain\bin>

    ******************************************************************

    Regards,

    Vikas Kumar


  • Jie Chen Saturday, June 15, 2013

    Hello Vikas Kumar

    It appears you are referencing jrockit to start Agile 9.3.2. However Agile does not support jrocket JVM, it only supports Oracle Java (Sun). See below message.

    --

    >"C:\Program Files\Java\jrockit-jdk1.6.0_45-R28.2.7

    -4.1.0\bin\java" -server -ms3072M -mx3072M

    --

    You can modify agileDomain/bin/setEnv.cmd file to point JAVA_HOME to correct Oracle Java home instead of jrockit

    Regards

    Jie


  • mike_jack Tuesday, October 16, 2018
    Informative blog on JVM Thread Dump. OUniversal Thread Dump Analysis
  • Frank Bergmann Wednesday, October 17, 2018
    For Linux there's a small helper tool for that: threadcpu

    See http://www.tuxad.de/blog/archives/2018/10/01/threadcpu_-_show_cpu_usage_of_threads/index.html
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha
Oracle

Integrated Cloud Applications & Platform Services