Detailed Analysis of a Stuck WebLogic Execute Thread Running JDBC Code
By Laurent Goldsztejn on Jun 16, 2014
The following thread was extracted from a thread dump taken on a JVM instance running WebLogic Server.
In this post I will deconstruct this thread and describe the data it contains and the potential issues it may illuminate.
[STUCK] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)' id=73 idx=0x128 nid=13410 prio=1 alive, in native, daemon
WebLogic considers this thread stuck because it has been running for longer than the time defined in MaxStuckThreadTime (600 seconds by default). WebLogic Server marks a thread as stuck only when it is still working on the same request after this time has elapsed. If you consider 600 seconds too long to wait before a running thread is flagged as stuck, you can change the value of this parameter in the WebLogic Console, or use setMaxStuckThreadTime from the ServerFailureTriggerMBean interface.
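The rule described above can be sketched in a few lines of Java. This is only an illustration of the threshold comparison, not WebLogic's actual implementation; the class name StuckThreadCheck and the method isStuck are hypothetical, and the only value taken from the text is the 600-second default.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration of the MaxStuckThreadTime rule; not WebLogic code.
final class StuckThreadCheck {
    // Default threshold mentioned in the article: 600 seconds.
    static final Duration DEFAULT_MAX_STUCK_THREAD_TIME = Duration.ofSeconds(600);

    // A thread counts as stuck once it has worked on one request
    // for strictly longer than the configured threshold.
    static boolean isStuck(Instant requestStarted, Instant now, Duration maxStuckThreadTime) {
        return Duration.between(requestStarted, now).compareTo(maxStuckThreadTime) > 0;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2014-06-16T10:00:00Z");
        // 599 seconds of work: still under the threshold.
        System.out.println(isStuck(start, start.plusSeconds(599), DEFAULT_MAX_STUCK_THREAD_TIME)); // false
        // 601 seconds of work: over the threshold.
        System.out.println(isStuck(start, start.plusSeconds(601), DEFAULT_MAX_STUCK_THREAD_TIME)); // true
    }
}
```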
An error including BEA-000337 will be logged in the server log file when the thread changes its status to stuck but the server won't take further action on this thread. However, you might want to investigate why this thread is taking such a long time to process the work assigned to it.
Let's now look at the thread itself. From its header, you can spot the execute thread number (2 in this example) and the queue where it originated. The term self-tuning indicates that the associated thread pool continuously monitors overall throughput to determine whether the thread count should change.
id (or tid) is the thread identifier, a unique process-wide number that identifies this thread within the JVM process. This id is unique but can be reused by another thread once this thread is terminated.
nid is the OS-level native thread identifier. It can be used effectively to correlate with high CPU usage threads identified at the OS level (e.g. with Linux watch command). See Unexpected High CPU Usage with WebLogic Server (WLS) Support Pattern (Doc ID 779349.1) for detailed steps.
idx is the thread index in the threads array.
prio refers to the thread priority, a number inherited from the thread that created it. You can learn more about thread priorities at Class Thread; in short, threads with higher priority are executed in preference to threads with lower priority.
alive refers to the fact that this thread has not ended yet and is still active.
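in native means that the thread is currently executing native (non-Java) code rather than Java bytecode. Here that is consistent with the most recent invocation on the stack, socketRead0, which is a native method.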
daemon indicates that this thread can't prevent the JVM from exiting.
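When many dumps need to be compared, it can help to pull these header fields out programmatically. The following is a minimal sketch that parses a header in the format shown above. The class name ThreadHeader and its regular expression are assumptions made for this example and are only known to match the sample line from this post; other JVMs and versions print headers differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses a thread header in the format shown in this post.
final class ThreadHeader {
    // name, then id=, idx= (hex), nid= (decimal), prio=, then comma-separated state flags.
    private static final Pattern HEADER = Pattern.compile(
        "^\"?(?<name>.*?)\"?\\s+id=(?<id>\\d+)\\s+idx=(?<idx>0x[0-9a-fA-F]+)"
        + "\\s+nid=(?<nid>\\d+)\\s+prio=(?<prio>\\d+)\\s+(?<flags>.*)$");

    final String name;
    final int id;
    final int idx;
    final long nid;
    final int prio;
    final List<String> flags;

    private ThreadHeader(String name, int id, int idx, long nid, int prio, List<String> flags) {
        this.name = name;
        this.id = id;
        this.idx = idx;
        this.nid = nid;
        this.prio = prio;
        this.flags = flags;
    }

    static ThreadHeader parse(String line) {
        Matcher m = HEADER.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized thread header: " + line);
        }
        return new ThreadHeader(
            m.group("name"),
            Integer.parseInt(m.group("id")),
            Integer.decode(m.group("idx")),      // handles the 0x prefix
            Long.parseLong(m.group("nid")),
            Integer.parseInt(m.group("prio")),
            Arrays.asList(m.group("flags").trim().split("\\s*,\\s*")));
    }

    // WebLogic prefixes the thread name with [STUCK] once the thread is marked stuck.
    boolean isStuck() {
        return name.startsWith("[STUCK]");
    }

    public static void main(String[] args) {
        ThreadHeader h = parse("[STUCK] ExecuteThread: '2' for queue: "
            + "'weblogic.kernel.Default (self-tuning)' id=73 idx=0x128 nid=13410 "
            + "prio=1 alive, in native, daemon");
        System.out.println("stuck=" + h.isStuck() + " id=" + h.id + " nid=" + h.nid
            + " prio=" + h.prio + " flags=" + h.flags);
    }
}
```

The nid value extracted this way is the one to compare against the thread identifiers reported by OS-level tools.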
The thread header is followed by the full Java stack, which lists each method invoked from the moment work was first assigned to the thread up to its most recent action. In this case, the thread obtained a connection to an Oracle database using a Type 4 JDBC driver and then issued a call, but received no response from the back-end database server. Because the thread is now considered stuck, it has probably been waiting in the same state, with an unchanged and non-progressing Java stack, for a long time; its most recent invocation is java.net.SocketInputStream.socketRead0.
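To see why a silent peer leaves a thread parked in a socket read, the situation can be reproduced with plain java.net sockets. The sketch below is not JDBC code: a local ServerSocket stands in for a database listener that accepts the connection but never answers, and the class name SilentBackendDemo is hypothetical. Without a read timeout the client would block indefinitely, just like the stuck thread; with Socket.setSoTimeout the read fails with a SocketTimeoutException instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo: a blocking read against a peer that never responds.
final class SilentBackendDemo {

    // Returns true if the read gave up after timeoutMillis because no data arrived.
    static boolean readTimesOut(int timeoutMillis) {
        if (timeoutMillis <= 0) {
            // A timeout of 0 means "wait forever", which is the stuck-thread case.
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort());
             Socket accepted = silentServer.accept()) { // accepted but never written to
            client.setSoTimeout(timeoutMillis);
            try {
                client.getInputStream().read(); // blocks waiting for a response
                return false;                   // data or end-of-stream arrived
            } catch (SocketTimeoutException e) {
                return true;                    // no response within the timeout
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(500));
    }
}
```

JDBC drivers typically expose their own settings for bounding network reads; check the documentation of the driver in use rather than assuming a particular property name.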
At this point, the back-end database needs to be checked to understand why it is not responding to the Java thread's request. A starting point could be to query v$session to find potential blocking sessions at the database level.
Blocking sessions occur when one session holds an exclusive lock on an object and does not release it before another session needs to lock the same object.
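As one possible starting point, the check can be run over JDBC. The sketch below assumes an account with SELECT privilege on v$session and an Oracle JDBC driver on the classpath; the class name BlockingSessions and the choice of columns are illustrative only. It lists sessions whose blocking_session column is populated, meaning they are waiting on another session.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists sessions that are currently blocked by another session.
final class BlockingSessions {
    static final String QUERY =
        "SELECT sid, blocking_session, event, seconds_in_wait "
        + "FROM v$session WHERE blocking_session IS NOT NULL";

    static List<String> find(Connection connection) throws SQLException {
        List<String> rows = new ArrayList<>();
        try (PreparedStatement ps = connection.prepareStatement(QUERY);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                rows.add(String.format("sid=%d blocked by sid=%d, waiting %d s on '%s'",
                    rs.getLong("sid"), rs.getLong("blocking_session"),
                    rs.getLong("seconds_in_wait"), rs.getString("event")));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length != 3) {
            System.err.println("usage: BlockingSessions <jdbc-url> <user> <password>");
            return;
        }
        // The JDBC URL, user and password are supplied by the caller; none are assumed here.
        try (Connection c = DriverManager.getConnection(args[0], args[1], args[2])) {
            for (String row : find(c)) {
                System.out.println(row);
            }
        }
    }
}
```

An empty result does not prove the database is healthy; it only rules out lock contention as seen from v$session at that moment.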
Needless to say, communication with the database needs to be confirmed as healthy, with little or no latency. Firewall issues should be ruled out as well: a firewall can time out idle sockets used by JDBC connections to the database and silently drop them, so the socket the JDBC driver is using is never properly closed and the driver keeps waiting on it.