Sun Java System Web Server 7.0 monitoring - Part 1.

     Web Server 7.0 offers some interesting ways to monitor the server. In previous versions of the Web Server, we typically configured a URI such as /.perf to obtain perfdump output, and sent a /.perf request whenever we wanted that output. The problem with this approach is that if the Web Server is hanging because of a faulty application, or is very busy processing requests, the perfdump request hangs too, and we have to wait for the /.perf request to be processed. In complicated situations, we were sometimes never able to obtain the perfdump output. Web Server 7.0 comes to the rescue in such situations.
    Let us do a simple experiment to simulate a hanging Web Server. First, let us set both the minimum and maximum number of daemon session threads to 2 so that the Web Server can process only 2 requests at a time. Here is the thread-pool element of my server.xml:
  <thread-pool>
    <min-threads>2</min-threads>
    <max-threads>2</max-threads>
  </thread-pool>

After changing server.xml, let us restart the Web Server. Now let us create a small CGI script (test.pl) that takes about 5 minutes to execute. This will help us simulate a hanging Web Server.

#!/usr/bin/perl
# Send the CGI response headers, then stall for 5 minutes before the body.
print "Content-Type: text/html\r\n";
print "\r\n";
sleep(300);
print "Hello\r\n";

Now let us send two test.pl requests in parallel, either from two browser windows or with a command-line tool such as curl:
$ curl --dump-header - -o - http://localhost:8080/cgi-bin/test.pl &
$ curl --dump-header - -o - http://localhost:8080/cgi-bin/test.pl &

The above two requests will be served only after 5 minutes. Now let us see whether we can obtain the perfdump.
$ curl --dump-header - -o - http://localhost:8080/.perf
<hung>
Ctrl-Z
[3]+  Stopped                 curl --dump-header - -o - http://localhost:8080/.perf
$ bg
[3]+ curl --dump-header - -o - http://localhost:8080/.perf &

Our perfdump (/.perf) request hung. Now let us start the admin server and try to get the perfdump of the instance through the admin server.
$ cd admin-server
$ bin/startserv
...
$ ./bin/wadm get-perfdump --config=myhostname --node=localhost
...
Sessions:
---------------------------------------------------------------------------------------
Process  Status    Client         Age  VS          Method  URI               Function

4952     response  192.168.1.115  110  myhostname  GET     /cgi-bin/test.pl  send-cgi
4952     response  192.168.1.115  109  myhostname  GET     /cgi-bin/test.pl  send-cgi


Voila! It worked. It showed me that two test.pl requests are running and that they have been running for about 110 seconds.
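When the Sessions table is long, it can help to filter it for old requests. The following is only a rough Python sketch: the function name `long_running_sessions` is made up for illustration, and it assumes that each data row has exactly the eight whitespace-separated columns shown above. Rows in other states or from other perfdump versions may not fit that assumption.

```python
def long_running_sessions(sessions_text, min_age_seconds):
    """Return rows from a perfdump Sessions table at least min_age_seconds old."""
    columns = ["process", "status", "client", "age", "vs", "method", "uri", "function"]
    rows = []
    for line in sessions_text.splitlines():
        fields = line.split()
        # Assumption: a data row has exactly 8 fields, with numeric Process and Age.
        # Header, separator, and blank lines fail this check and are skipped.
        if len(fields) != len(columns):
            continue
        if not (fields[0].isdigit() and fields[3].isdigit()):
            continue
        row = dict(zip(columns, fields))
        row["age"] = int(row["age"])
        if row["age"] >= min_age_seconds:
            rows.append(row)
    return rows


# The Sessions rows from the get-perfdump output above.
SAMPLE = """\
Process  Status    Client         Age  VS          Method  URI               Function

4952     response  192.168.1.115  110  myhostname  GET     /cgi-bin/test.pl  send-cgi
4952     response  192.168.1.115  109  myhostname  GET     /cgi-bin/test.pl  send-cgi
"""

for row in long_running_sessions(SAMPLE, min_age_seconds=60):
    print(row["age"], row["method"], row["uri"], row["function"])
```

With the sample above, both test.pl rows are reported because both are older than 60 seconds.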

So how does it work? Why did the /.perf request hang, and why did the get-perfdump wadm command work? Here is the explanation.

We configured 2 daemon session threads and then sent 2 test.pl requests, so all daemon session threads were busy processing requests. When we sent the /.perf request, no daemon session thread was available, so the request went into the connection queue and waited there to be served.
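The queuing behaviour described above can be imitated outside the Web Server. The following is a minimal Python sketch, not Web Server code: a pool of two threads stands in for the daemon session threads, a short sleep stands in for test.pl, and the names `simulate`, `slow_request`, and `perf_request` are made up for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def simulate(pool_size=2, slow_seconds=0.5):
    """Return how long a quick request waits behind slow ones in a full pool."""
    # The barrier lets the main thread know that every worker is occupied.
    started = threading.Barrier(pool_size + 1)

    def slow_request():
        started.wait()              # this worker is now busy
        time.sleep(slow_seconds)    # stands in for the sleeping test.pl
        return "Hello"

    def perf_request(submitted_at):
        # Runs only once a worker thread becomes free.
        return time.monotonic() - submitted_at

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        slow = [pool.submit(slow_request) for _ in range(pool_size)]
        started.wait()              # all workers are busy from here on
        perf = pool.submit(perf_request, time.monotonic())
        waited = perf.result()      # blocks until a slow request finishes
        results = [f.result() for f in slow]
    return waited, results


if __name__ == "__main__":
    waited, results = simulate()
    print(f"quick request waited {waited:.2f}s behind {len(results)} slow requests")
```

The quick request does almost no work of its own, yet it cannot start until one of the slow requests finishes. That is the same position the /.perf request was in.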

This information is also available in perfdump. Here is the snippet.
ConnectionQueue:
-----------------------------------------
Current/Peak/Limit Queue Length            1/1/1226
Total Connections Queued                   17

Note that there is 1 connection in the connection queue and that should be our "/.perf" request.
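If you want to watch these counters from a script, the two lines can be pulled out with regular expressions. This is a small Python sketch that assumes the perfdump text has exactly the labels shown in the snippet above; the function name `parse_connection_queue` and the dictionary keys are my own choices, not part of any Web Server tool.

```python
import re


def parse_connection_queue(perfdump_text):
    """Extract the ConnectionQueue counters from perfdump text as a dict of ints."""
    stats = {}
    # Assumption: the labels match the snippet above character for character.
    match = re.search(r"Current/Peak/Limit Queue Length\s+(\d+)/(\d+)/(\d+)", perfdump_text)
    if match:
        stats["current"], stats["peak"], stats["limit"] = map(int, match.groups())
    match = re.search(r"Total Connections Queued\s+(\d+)", perfdump_text)
    if match:
        stats["total_queued"] = int(match.group(1))
    return stats


# The ConnectionQueue snippet from the perfdump output above.
SAMPLE = """\
ConnectionQueue:
-----------------------------------------
Current/Peak/Limit Queue Length            1/1/1226
Total Connections Queued                   17
"""

print(parse_connection_queue(SAMPLE))
# prints {'current': 1, 'peak': 1, 'limit': 1226, 'total_queued': 17}
```

A non-zero `current` value while requests appear stuck suggests that requests are waiting for a free daemon session thread.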

So it is now clear that if all daemon session threads are busy, the Web Server cannot respond to URI-based monitoring requests. That was a big limitation in Web Server 6.1 and earlier. When the Web Server hung, we could not collect the perfdump output, and it was not easy to find out which requests the Web Server was processing at the time of the hang.

The next question is how the get-perfdump wadm command works. The answer lies in the monitoring architecture. There is a communication channel from the admin server to the Web Server instance, and all monitoring data is passed to the admin server through that channel. In addition, the Web Server worker process has a dedicated thread for serving monitoring requests. Even when all daemon session threads are busy, this thread can generate the performance data and pass it to the admin server. This is exactly what happened when we ran the get-perfdump wadm command.
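The idea of a dedicated monitoring thread can be illustrated with a toy model. The Python sketch below is not how the Web Server is implemented; it only assumes the general shape described above. The class `ToyServer`, its attributes, and the `demo` function are all hypothetical names, and two in-process queues stand in for the connection queue and the admin channel.

```python
import queue
import threading
import time


class ToyServer:
    """Toy model: a fixed worker pool plus one dedicated monitoring thread."""

    def __init__(self, workers=2):
        self.requests = queue.Queue()       # stands in for the connection queue
        self.monitor_inbox = queue.Queue()  # stands in for the admin channel
        self.active = {}                    # worker name -> URI being served
        self.lock = threading.Lock()
        for i in range(workers):
            threading.Thread(target=self._worker, name=f"worker-{i}", daemon=True).start()
        threading.Thread(target=self._monitor, name="monitor", daemon=True).start()

    def _worker(self):
        name = threading.current_thread().name
        while True:
            uri, seconds = self.requests.get()
            with self.lock:
                self.active[name] = uri
            time.sleep(seconds)             # simulated request processing
            with self.lock:
                del self.active[name]

    def _monitor(self):
        # Reads only its own inbox, so busy workers cannot block it.
        while True:
            reply = self.monitor_inbox.get()
            with self.lock:
                snapshot = sorted(self.active.values())
            reply.put({"active": snapshot, "queued": self.requests.qsize()})

    def get_stats(self, timeout=1.0):
        """Ask the monitoring thread for a snapshot, bypassing the worker pool."""
        reply = queue.Queue(maxsize=1)
        self.monitor_inbox.put(reply)
        return reply.get(timeout=timeout)


def demo():
    server = ToyServer(workers=2)
    server.requests.put(("/cgi-bin/test.pl", 1.0))
    server.requests.put(("/cgi-bin/test.pl", 1.0))
    server.requests.put(("/.perf", 0.0))    # stuck behind the two slow requests
    time.sleep(0.2)                         # give the workers time to pick up work
    return server.get_stats()


if __name__ == "__main__":
    print(demo())
```

In this model the `/.perf` entry stays queued while both workers are busy, but `get_stats` still returns promptly because it is answered by a thread that never serves ordinary requests.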

There are several interesting things to note in the above exercise. The admin server was not running when the Web Server hung; we simply started the admin server and ran the get-perfdump command. The communication channel between the admin server and the instance is set up during admin server startup. This setup is independent of the startup order of the admin server and the Web Server instance; they can be started in any order. Even while the Web Server was busy processing requests, the channel was initialized successfully.
There are other means of monitoring, e.g. stats-xml, wadm CLI commands, and the administration GUI. They all use the same channel to access monitoring data from the instance.



Comments:

it's good ..

Posted by bhkim on April 19, 2011 at 05:19 PM PDT #

About

Basant Kukreja
