Work Manager Leash for Slow JSPs in WebLogic Server

dogOnLeash

If part of your application goes bad, you do not want issues to cascade to the rest of your application or the rest of the server.  Based on my recent discovery of Michael Nygard’s architecture writings, this post will be somewhat applicable to his Bulkhead pattern (see page 39) of his excellent presentation.   I’ve been told that his ReleaseIt! book is phenomenal by peers I respect, so it’s already on it’s way to my bookshelf.  WebLogic has a way to put a leash on a JSP with only a few minor deployment descriptor edits – read on for an example on how to not let one resource intensive JSP bring you down.

Actual Customer Problem

One of my customers has an interesting problem with a long-running JSP.  They use a 3rd party java library exposed via a JSP that brings back large amounts of content from a backend file system.  This can take tens of minutes to process each request.  Within the same application (WAR file) they also have many short-lived requests hitting normal JSPs like a traditional web application.  In some cases they were having stability problems when too many threads were busy (STUCK threads) on the long-running JSP, taking up lots of server memory and starving threads from the fast JSPs.  In some cases, they would have to restart the WLS instance to recover stability.  The reason - by default the application components and even the rest of WebLogic Server are all using threads from the same default Work Manager – which is used by WebLogic Server’s self-tuning thread pool.

Potential Solution Using Custom Work Manager and Constraint

In 9.x and above of WebLogic, you have the ability to use a WorkManager with different types of constraints to control the shared threads in WebLogic Server.  Here is a detailed article on Work Managers on OTN.

Work Managers can be used to put some constraints on a troublesome JSP such as the max number of threads and the max number of requests that are being serviced and queued (called a capacity constraint) simply by editing the web.xml and weblogic.xml deployment descriptors.  Requests over the capacity constraint will result in a 503 error code.  Using web.xml a special page can be shown in this case that provides a helpful message as to why this condition exists instead of the default 503 page and the application can stay up while limited access to the JSPs that hog server resources.

Simple Example

2 JSPs - normal.jsp and veryslow.jsp - the veryslow.jsp has a 30 second java.lang.Thread.sleep()

A simple Filter that applies to both JSPs that simply logs to standard out – This is included just to show that Filters are not affected by Work Managers – you don’t have to worry about a Russian Nested Doll issue to target a work manager successfully at a JSP, even if it has Filters.

503error.html page (for the informational message in the overload condition).  I included this to show you that this can be a custom pretty page and does not have to be the ugly Internal Server Error page that you see by default.

weblogic.xml with a custom Work Manager defined in  with constraints:

<?xml version="1.0" encoding="UTF-8"?>

<wls:weblogic-web-app xmlns:wls="http://xmlns.oracle.com/weblogic/weblogic-web-app" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd http://xmlns.oracle.com/weblogic/weblogic-web-app http://xmlns.oracle.com/weblogic/weblogic-web-app/1.0/weblogic-web-app.xsd">

    <wls:weblogic-version>10.3.2</wls:weblogic-version>

    <wls:context-root>WorkManagerWeb</wls:context-root>

     <wls:work-manager>

        <wls:name>myCustomWorkManager</wls:name>

        <wls:max-threads-constraint>

            <wls:name>5MaxThreads</wls:name>

            <wls:count>5</wls:count>

        </wls:max-threads-constraint>

        <wls:capacity>

            <wls:name>7MaxCapacity</wls:name>

            <wls:count>7</wls:count>

        </wls:capacity>

    </wls:work-manager>

</wls:weblogic-web-app>


web.xml – this simply shows how you can target a custom Work Manager directly at a JSP/Servlet – It is probably more common to target a Work Manager at an entire web application – but we have a very targeted use case in this example.  In EJB modules, work managers are targeted per EJB or MDB.  The important part is looking at the init-param of the veryslow.jsp.  The 503 error code page is also defined here.



<?xml version="1.0" encoding="UTF-8"?>

<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" id="WebApp_ID" version="2.5">

  <display-name>WorkManagerWeb</display-name>

  <welcome-file-list>

    <welcome-file>normal.jsp</welcome-file>

  </welcome-file-list>

  <error-page>

    <error-code>503</error-code>

    <location>/503error.html</location>

  </error-page>

  <filter>

    <display-name>BarFilter</display-name>

    <filter-name>BarFilter</filter-name>

    <filter-class>foo.BarFilter</filter-class>

  </filter>

  <filter-mapping>

    <filter-name>BarFilter</filter-name>

    <url-pattern>*.jsp</url-pattern>

  </filter-mapping>

  <servlet>

    <servlet-name>VerySlowJSP</servlet-name>

    <jsp-file>veryslow.jsp</jsp-file>

    <init-param>

         <param-name>wl-dispatch-policy</param-name>

         <param-value>myCustomWorkManager</param-value>

    </init-param>

   </servlet>

</web-app>


Here’s an illustration with a max thread constraint of 5 and a capacity constraint of 7 showing what will happen when 10 requests hit veryslow.jsp at the same time.

10threadDiagram

Here is a JMeter report using 10 threads instantaneously hitting normal.jsp which uses the default Work Manager from WLS.  Notice that all finish executing right away which is exactly as we expect because there are no relevant constraints in the default Work Manager.

normalJMeter

Here is a JMeter report showing 10 requests instantaneously hitting veryslow.jsp which uses the custom Work Manager with the constraints discussed before.  In this case we see 3 rejected requests (the last 3 in), 5 that complete in 30 seconds, and 2 that complete in 60 seconds.  This is exactly what we expect given our constraints since 2 threads will wait not only for their 30 second sleep, but also for the 30 second sleep for the thread that they are waiting on.

veryslowJMeter

When you go to the browser and hit veryslow.jsp while the veryslow test is being run, you get the 503error.html page defined in web.xml to come up.

Here’s the access.log from the test – notice the 503 codes:



127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 503 348 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 503 348 

127.0.0.1 - - [30/Dec/2009:11:35:32 -0600] "GET /WorkManagerWeb/normal.jsp HTTP/1.1" 200 304 

127.0.0.1 - - [30/Dec/2009:11:35:32 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 503 348 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 200 396 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 200 396 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 200 396 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 200 396 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 200 396 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 200 396 

127.0.0.1 - - [30/Dec/2009:11:35:31 -0600] "GET /WorkManagerWeb/veryslow.jsp HTTP/1.1" 200 396


Here’s the Standard Out log – notice that both JSP’s use the Filter and it is not affected by using different work managers for JSPs.



Dec 30, 2009 11:33:20 AM - BarFilter [ACTIVE] ExecuteThread: '14' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:20 AM - BarFilter [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:20 AM - BarFilter [ACTIVE] ExecuteThread: '10' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:20 AM - BarFilter [ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '8' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '24' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '11' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '24' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '8' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '14' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '11' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '10' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '10' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '11' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '14' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - BarFilter [ACTIVE] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - Normal JSP hit [ACTIVE] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - VerySlow JSP Starting to sleep [ACTIVE] ExecuteThread: '10' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - VerySlow JSP Starting to sleep [ACTIVE] ExecuteThread: '14' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - VerySlow JSP Starting to sleep [ACTIVE] ExecuteThread: '11' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - VerySlow JSP Starting to sleep [ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:21 AM - VerySlow JSP Starting to sleep [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:51 AM - VerySlow JSP awake [ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:51 AM - VerySlow JSP awake [ACTIVE] ExecuteThread: '11' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:52 AM - VerySlow JSP awake [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:52 AM - BarFilter [ACTIVE] ExecuteThread: '11' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:52 AM - BarFilter [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:52 AM - VerySlow JSP awake [ACTIVE] ExecuteThread: '10' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:52 AM - VerySlow JSP awake [ACTIVE] ExecuteThread: '14' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:52 AM - VerySlow JSP Starting to sleep [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:33:52 AM - VerySlow JSP Starting to sleep [ACTIVE] ExecuteThread: '11' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:34:22 AM - VerySlow JSP awake [ACTIVE] ExecuteThread: '11' for queue: 'weblogic.kernel.Default (self-tuning)'

Dec 30, 2009 11:34:22 AM - VerySlow JSP awake [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)'


I’ve attached the code for this example that includes both the war file and my Oracle Enterprise Pack for Eclipse project.

As you can see, it’s pretty easy to put constraints on individual JSPs, Servlets, EJB’s, MDB’s, and entire web applications using only a few deployment descriptor edits.  That’s a very powerful feature when you need it to keep hogs from bogging down other parts of your application or your WLS instance.

Comments:

Perfect example, James. I always find tough to read BEA e-docs. Thanks for explaining this in simpler way which I can understand easily. Many thanks again!

Posted by Kannan Pasupathy on January 31, 2010 at 03:07 PM PST #

Hi James , it probably couldn't be depicted in a simpler manner - thanks. It just hits the Albert Enstein's saying: "keep things as simple as possible, but not simpler" ;-) Piotr

Posted by Piotr on July 28, 2010 at 06:12 PM PDT #

James, What exception will be thrown from EJB A for the following scenario? Thanks. - Http Load Testing Client ==> WebApp A==> EJB B - Similar constraints: max-threads-constraint (5) and capacity (7) applied to EJB B - More than 7 concurrent requests from WebApp A to EJB B Cheers, Shaun

Posted by Shaun on January 12, 2011 at 11:10 AM PST #

Shaun, I'm fairly confident (I recommend trying a simple test) that the first entry point into WLS will control which work manager gets used. So in your case with a Servlet from WebApp A receiving the request, it's work manager should be used instead of EJB B's work manager. If your test client directly invoked the EJB as an RMI client, then I would expect it's work manager to be used. Thanks, James

Posted by james.bayer on January 12, 2011 at 11:32 AM PST #

James,
I am facing following issue regarding 'thread pool has stuck threads '

Issue Description: We have java application deployed on weblogic server 10.3 which is integrated with BPEL (SOA Oracle 11g). The application is particluary used to create a task through BPEL(Managed Server). We have configured thread time out to 600 secs (default). Now after certain time (5-7 hours) the application get gradually slower. When we checked the admin server (weblogic) we found that both the managed server has warning "Thread pool has stuck threads" and the free memory on both the servers (where the managed servers are deployed) got very low. On restarting the apps the memory and speed return to initial position.

When we tried to nail down the issue we found on certain blogs this issue primarily because of thread initialization. Is there any way we can configure weblogic to clear down the stuck threads after a certain duration. Is there any other performance tuning we should consider to avoid this?

Thanks in Advance.

Note: PFA thread dumps (these thread dumps should be taken at the time you facing the problem).

Posted by guest on July 18, 2011 at 09:25 PM PDT #

When STUCK threads are identified, the Java threading model does not allow them to be stopped arbitrarily an outside agent. Therefore, you should view STUCK threads as a notification that something might be wrong and you need to investigate. See http://blogs.oracle.com/jamesbayer/entry/weblogic_server_work_manager_demo

The best course of action is to take several thread dumps several seconds apart (kill -3 on linux, control-break on windows, jrcmd with JRockit, jstack with Hotspot, the WLS console Server->Monitoring->Threads ) and have someone that can read a thread dump (like Oracle Support) see which code is executing that is not returning the thread to the thread pool.

Thanks, James

Posted by james.bayer on July 19, 2011 at 04:03 AM PDT #

Very nice article. I've noticed that specifying goal-ms in response-time-request-class has absolutely no effect. Any idea why? There are no compilation or runtime errors, it simply has no effect. Using WebLogic 10.

Pages will not exit, or redirect to a 503, after the milliseconds specified in goal-ms has elapsed.

See the goal-ms WorkManager setting referenced here:

http://middlewaremagic.com/weblogic/?cat=31

Posted by guest on August 24, 2011 at 02:56 PM PDT #

Sam,

The response-time-request-class does not reject work when going beyond the response time goal, it is used to prioritize work managers with a ratio as described here:

Response Time Request Class – A response time request class specifies the target response time in milliseconds. Using the previous example, imagine that Application A uses a work manager with a response time of 3000 milliseconds and Application B uses a work manager with a response time of 5000 milliseconds. When the server is receiving a steady stream of requests from both applications that exceed the number of execute threads, the server will keep the average response times of the two applications in a 3 to 5 ratio, where the actual response times will be some fraction or multiple of the response time goal.

Hope this helps explain it.

Thanks, James

Posted by james.bayer on August 24, 2011 at 04:36 PM PDT #

Post a Comment:
Comments are closed for this entry.
About

James Bayer Image
I was formerly a Product Manager on the WebLogic Server team based out of Oracle HQ. You can find my new blog at http://iamjambay.com.
Follow Me on Twitter
Oracle WebLogic

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today