Connection pooling in Sun Java System Web Server

Web Server Architecture : Connection pooling
Connection pooling is one of the cool features of Sun Web Server, and it has been there for a long time. When a request comes to Web Server, an acceptor thread accepts the connection and puts it into the connection queue. One of the daemon session threads (also called worker threads) pulls the connection from the connection queue and starts processing the request. After the request is served, if the client has asked for the connection to be kept alive, the connection goes to the keep-alive poll manager. Keep-alive threads poll all keep-alive connections; whenever a new request arrives on one of these connections, a keep-alive thread puts the connection back into the connection queue. If no further request arrives on a connection within some period (the keep-alive timeout), the keep-alive threads close the connection.
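
The hand-off between acceptor and daemon session threads can be sketched in a few lines of Python. This is only a toy model of the flow described above, not Web Server code: the names run_demo, acceptor and worker are made up for the illustration, integers stand in for connections, and the queue size of 8 is an arbitrary value.

```python
import queue
import threading


def run_demo(num_connections=20, num_workers=2):
    """Toy model: one acceptor feeds a bounded queue drained by worker threads."""
    connection_queue = queue.Queue(maxsize=8)  # stands in for the connection queue
    served = []                                # (worker name, connection id) pairs
    served_lock = threading.Lock()
    stop = object()                            # sentinel telling a worker to exit

    def acceptor():
        # Pretend each integer is an accepted connection.
        for conn_id in range(num_connections):
            connection_queue.put(conn_id)      # blocks while the queue is full

    def worker(name):
        # Daemon-session stand-in: pull a connection, "serve" it, repeat.
        while True:
            conn = connection_queue.get()
            if conn is stop:
                break
            with served_lock:
                served.append((name, conn))

    workers = [threading.Thread(target=worker, args=(f"worker-{i}",))
               for i in range(num_workers)]
    for t in workers:
        t.start()
    acc = threading.Thread(target=acceptor)
    acc.start()
    acc.join()
    for _ in workers:
        connection_queue.put(stop)             # one sentinel per worker
    for t in workers:
        t.join()
    return sorted(conn for _, conn in served)


if __name__ == "__main__":
    print(run_demo())
```

The point of the model is that the number of connections handled (20 here) is independent of the number of worker threads (2 here); the queue absorbs the difference.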


The reason Sun Web Server performs so well under high load, e.g. in specweb, is the connection pooling architecture described above. In Apache Web Server, prefork and worker are the most common MPMs. One problem with these MPMs is that a connection is bound to a thread (or to a process, in the case of prefork). This means that the number of concurrent connections is typically limited to the number of processing threads. If the load increases beyond some point, connections start timing out very soon. The event MPM tries to address this problem.


Tunables :
Here are some of the configuration parameters which directly affect the connection queue:
(a) thread-pool --> queue-size
(b) http-listener --> listen-queue-size
(c) keepalive --> timeout
(d) keepalive --> pollinterval

listen-queue-size
As I wrote before, acceptor threads accept connections and put them into the connection queue. The question is what happens to new connections when the system is busy and the OS has not yet scheduled the acceptor thread. The OS kernel maintains the TCP connections on behalf of the Web Server process. listen-queue-size is the number of connections the kernel will accept before the application accepts them. If more than listen-queue-size connections are pending before Web Server calls accept, new connections will be rejected. This is not at all a common situation, but it can happen on very busy systems. Here is a small experiment to demonstrate this situation:

Step 1 : I sent a stop signal to my child Web Server process so that the acceptor thread
         won't be able to accept any new connections.
Step 2 : I sent 200 simultaneous requests (using the Apache benchmark tool ab)
$ ab -c 200 -n 200 http://hostname/index.html
Step 3 : I ran the netstat -an command to see the connections.
Here is how a connection line looks:
192.18.120.211.3014  129.150.16.164.48150  5312      0 49437      0 ESTABLISHED

As expected, the kernel rejects the new connections. Here is the ab output :
Benchmarking chilidev4.red.iplanet.com (be patient)
apr_poll: The timeout specified has expired (70007)
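
The kernel-side queue can also be observed without a Web Server at all. The following is a minimal sketch using Python's standard socket module on the loopback interface; the function name connect_without_accept and the backlog value of 5 are assumptions made for the illustration. The server socket listens but never calls accept, yet the client still gets an established connection, because the kernel completes the TCP handshake and parks the connection in the listen queue.

```python
import socket


def connect_without_accept(backlog=5):
    """Return True if a client connects although the server never calls accept()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(backlog)          # backlog plays the role of listen-queue-size
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5.0)
    try:
        # The kernel finishes the handshake on behalf of the (idle) server.
        client.connect(server.getsockname())
        return True
    except OSError:
        return False
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(connect_without_accept())
```

What happens once the queue is full (how many extra connections are tolerated, and whether the excess ones are refused or simply left to time out) depends on the operating system, so the sketch deliberately stops at a single connection.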

What happens if we disable the connection pool :
If we disable the thread-pool, there is no connection queue and no separate pool of daemon session threads: the threads that accept connections also process the requests. Here is how a typical call stack looks (note that DaemonSession::GetConnection calls accept directly) :
-----------------  lwp# 12 / thread# 12  --------------------
 fd5c1134 pollsys  (fc8efc80, 1, fc8efc00, 0)
 fd55d028 poll     (fc8efc80, 1, 1388, 10624c00, 0, 0) + 7c
 fe3d862c pt_poll_now (fc8efcec, 1, fc8efd1c, fc8efc80, 20, ffffffff) + 4c
 fe3d9ed0 pt_Accept (247408, 1bd990, ffffffff, 15, 0, ffffffff) + cc
 ff17db5c PRFileDesc*ListenSocket::accept(PRNetAddr&,const unsigned) (1bd708, 1bd990, ffffffff, 2060, 245a88, 1bd988) + c
 ff16e67c int DaemonSession::GetConnection(unsigned) (2b7648, ffffffff, ff2ff800, ffffe800, ff26f000, ff2abc00) + 64
 ff16edb0 void DaemonSession::run() (6c0008, 2b7648, 2b7668, 2b76f0, 2000, ffffffff) + 150
 fef76d48 ThreadMain (2b7648, 2a6eb0, 0, fef89c00, ff16ec60, ff2abc64) + 1c
 fe3de024 _pt_root (2a6eb0, fef76d2c, 400, fe3f6ba4, 0, fe3f891c) + d4
 fd5c0368 _lwp_start (0, 0, 0, 0, 0, 0)


A few call stacks :
Let us look at the pstack output to see what the threads look like. In my test configuration, I set the min/max threads to 2 and disabled the j2ee plugin. Then I dumped the stack using :
$ pstack <child_webservd_pid> | c++filt > pstack.txt

Let us see what the various threads are doing :
1. Acceptor thread's call stack :

Here is the call stack of an acceptor thread.
-----------------  lwp# 13 / thread# 13  --------------------
 fd5c1134 pollsys  (fc8bfc88, 1, fc8bfc08, 0)
 fd55d028 poll     (fc8bfc88, 1, 1388, 10624c00, 0, 0) + 7c
 fe3d862c pt_poll_now (fc8bfcf4, 1, fc8bfd24, fc8bfc88, 20, ffffffff) + 4c
 fe3d9ed0 pt_Accept (245368, 20ce50, ffffffff, 15, 1, ffffffff) + cc
 ff17db5c PRFileDesc*ListenSocket::accept(PRNetAddr&,const unsigned) (1bd708, 20ce50, ffffffff, 3, 2ab988, 2ab988) + c
 ff1782a4 void Acceptor::run() (12e488, 245908, 20ce48, 6, 3e8, 45) + 184
 fef76d48 ThreadMain (12e488, 11d828, 0, fef89c00, ff178120, ff2ac444) + 1c
 fe3de024 _pt_root (11d828, fef76d2c, 400, fe3f6ba4, 0, fe3f891c) + d4
 fd5c0368 _lwp_start (0, 0, 0, 0, 0, 0)

2. Idle Daemon session thread's call stack :
Here is the stack trace of an idle daemon session thread :
-----------------  lwp# 16 / thread# 16  --------------------
 fd5c0408 lwp_park (0, 0, 0)
 fd5ba49c cond_wait_queue (50a10, 2cfec8, 0, 0, 0, 0) + 28
 fd5baa1c cond_wait (50a10, 2cfec8, 0, 1c, 0, fcd52d00) + 10
 fd5baa58 pthread_cond_wait (50a10, 2cfec8, 1, fe3f8518, 5fc, 400) + 8
 fe3d79e8 PR_WaitCondVar (50a08, ffffffff, 2a7700, 0, 2ab988, 0) + 64
 ff17797c Connection*ConnectionQueue::GetReady(unsigned) (8bcc8, ffffffff, ffffffff, 8bcc8, 5fc, 2ab968) + c4
 ff16e630 int DaemonSession::GetConnection(unsigned) (2b7648, ffffffff, ff2ff800, 0, ff26f000, ff2abc00) + 18
 ff16edb0 void DaemonSession::run() (746008, 2b7648, 2b7668, 2b76f0, 2000, ffffffff) + 150
 fef76d48 ThreadMain (2b7648, 2a7700, 0, fef89c00, ff16ec60, ff2abc64) + 1c
 fe3de024 _pt_root (2a7700, fef76d2c, 400, fe3f6ba4, 1, fe3f891c) + d4
 fd5c0368 _lwp_start (0, 0, 0, 0, 0, 0)

As the connection queue is empty, this daemon session thread is waiting for a request to arrive in the connection queue.
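
The wait seen in this stack (GetReady blocking in PR_WaitCondVar) is the classic condition-variable pattern. Here is a rough Python sketch of that pattern; the class below only borrows the ConnectionQueue name from the trace, and its methods add_ready and get_ready are hypothetical stand-ins, not the real Web Server interface.

```python
import collections
import threading


class ConnectionQueue:
    """Illustrative queue: consumers sleep on a condition variable until work arrives."""

    def __init__(self):
        self._items = collections.deque()
        self._ready = threading.Condition()

    def add_ready(self, conn):
        # Producer side (acceptor or keep-alive thread): enqueue and wake one waiter.
        with self._ready:
            self._items.append(conn)
            self._ready.notify()

    def get_ready(self, timeout=None):
        # Consumer side (daemon session thread): sleep until a connection is queued.
        with self._ready:
            has_item = self._ready.wait_for(lambda: len(self._items) > 0,
                                            timeout=timeout)
            if not has_item:
                return None            # timed out with nothing to do
            return self._items.popleft()


if __name__ == "__main__":
    q = ConnectionQueue()
    print(q.get_ready(timeout=0.1))    # nothing queued yet
    q.add_ready("conn-1")
    print(q.get_ready(timeout=0.1))
```

An idle thread parked this way consumes no CPU, which is why a mostly idle server shows its worker threads sitting in lwp_park.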

3. Processing Daemon session thread's call stack :
Here is another daemon session thread (which is processing the "/cgi-bin/test.pl" request) :
 fd5c1248 read     (1d, 758088, 2000)
 ff167588 int ChildReader::read(void*,int) (3c9c4, 758088, 2000, 0, b71b00, 1) + 1c
 ff0f2e30 INTnetbuf_next (758028, 1, 2000, 2001, 60, 758028) + 2c
 ff13a364 int cgi_scan_headers(Session*,Request*,void*) (11ee70, 11eee8, 758028, 27c, ff29bbd8, 0) + 84
 ff13abd4 int cgi_parse_output(void*,Session*,Request*) (758028, 11ee70, 11eee8, ff2a1c50, ff29bbd8, 6e) + 1c
 ff13b8cc cgi_send (2c6dc8, 11ee70, 11eee8, 3c948, 1400, ff2a25cc) + 514
 ff10df68 func_exec_str (2354f8, 2c6dc8, 0, fc348, 11eee8, 11ee70) + 1c0
 ff10edc0 INTfunc_exec_directive (3d948, 2c6dc8, 11ee70, 11eee8, 280a28, 1) + 2a0
 ff113b60 INTservact_service (0, 11ee70, 2c6dc8, 0, 11eee8, 27b470) + 374
 ff11472c INTservact_handle_processed (0, 11eee8, 11eee8, 11ee70, 0, 2d6328) + 8c
 ff172064 void HttpRequest::UnacceleratedRespond() (11edc8, 3, 1, ff26f400, 0, 20) + e34
 ff170a24 int HttpRequest::HandleRequest(netbuf*,unsigned) (11edc8, 754008, 11edf8, 2000, 754068, 11edd0) + 7a8
 ff16f10c void DaemonSession::run() (ffffffff, 280a08, 280a28, 280ab0, 1, ffffffff) + 4ac
 fef76d48 ThreadMain (280a08, 266728, 0, fef89c00, ff16ec60, ff2abc64) + 1c
 fe3de024 _pt_root (266728, fef76d2c, 400, fe3f6ba4, 1, fe3f891c) + d4
 fd5c0368 _lwp_start (0, 0, 0, 0, 0, 0)

Note that the above thread is waiting for data from the child CGI process (test.pl).

4. Keep alive thread's call stack :
Let us see the call stack of a keep-alive thread :
-----------------  lwp# 9 / thread# 9  --------------------
 fd5c0408 lwp_park (0, 0, 0)
 fd5ba49c cond_wait_queue (2c6e98, 7092c8, 0, 0, 0, 0) + 28
 fd5baa1c cond_wait (2c6e98, 7092c8, 0, 1c, 0, fcd51100) + 10
 fd5baa58 pthread_cond_wait (2c6e98, 7092c8, 1, fe3f8518, 5fc, 400) + 8
 fe3d79e8 PR_WaitCondVar (2c6e90, ffffffff, 13e790, 0, 0, 0) + 64
 ff176024 void PollArray::GetPollArray(int*,void**) (25e4c8, fcc3fec0, fcc3fec4, 3, 0, ff2ac000) + 5c
 ff176984 void KAPollThread::run() (15c5a8, 5, 4, ff2ac000, ff2ff8cc, fcc3fec4) + 6c
 fef76d48 ThreadMain (15c5a8, 13e790, 0, fef89c00, ff176918, ff2ac3b8) + 1c
 fe3de024 _pt_root (13e790, fef76d2c, 400, fe3f6ba4, 1, fe3f891c) + d4
 fd5c0368 _lwp_start (0, 0, 0, 0, 0, 0)
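
In this dump the keep-alive thread is simply parked because there are no keep-alive connections to poll. When there are, its job is the one described at the start of this post: hand connections with new data back to the connection queue and close the ones that stay idle past the keep-alive timeout. A minimal sketch of that loop body, using Python's standard selectors module, could look like the following. The class KeepAlivePoller and its methods are invented for the illustration and make no claim about how KAPollThread is actually implemented.

```python
import selectors
import socket
import time


class KeepAlivePoller:
    """Illustrative keep-alive poller: report readable sockets, expire idle ones."""

    def __init__(self, timeout):
        self.timeout = timeout                  # stands in for the keep-alive timeout
        self.selector = selectors.DefaultSelector()

    def add(self, sock):
        # Remember when the connection became idle.
        self.selector.register(sock, selectors.EVENT_READ, data=time.monotonic())

    def poll_once(self, poll_interval):
        # Connections with a new request would go back to the connection queue.
        ready = []
        for key, _events in self.selector.select(timeout=poll_interval):
            self.selector.unregister(key.fileobj)
            ready.append(key.fileobj)
        # Connections idle for longer than the timeout are closed.
        expired = []
        now = time.monotonic()
        for key in list(self.selector.get_map().values()):
            if now - key.data >= self.timeout:
                self.selector.unregister(key.fileobj)
                key.fileobj.close()
                expired.append(key.fileobj)
        return ready, expired


if __name__ == "__main__":
    server_side, client_side = socket.socketpair()
    poller = KeepAlivePoller(timeout=0.2)
    poller.add(server_side)
    client_side.sendall(b"GET / HTTP/1.1\r\n\r\n")
    ready, expired = poller.poll_once(poll_interval=0.05)
    print(len(ready), len(expired))
```

A shorter poll interval notices new requests sooner at the cost of more wakeups, which is the trade-off behind the pollinterval tunable listed earlier.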


Comments:

Thanks for the post... really helpful.

Question though:
"One of the daemon session threads (also called worker threads) pulls the request from the connection queue and starts processing the request."

What happens if the connection has been opened, but the other side is delayed in sending the actual request?

For example, say I netcat to the http port. It sounds like an acceptor thread would accept my connection and hand it off to the request queue. At this point a session thread would pick it up... but say it takes me 15 seconds to type in the GET request. During this time, will the session thread just block?

Posted by John Apple on February 11, 2009 at 03:27 AM PST #

You are right that if the HTTP request is delayed, the daemon session
thread will block. This could be a genuine client or a malicious one;
there is no easy way for Web Server to know. However, there are timeouts
available to tune. If the request doesn't arrive within the timeout interval,
Web Server closes the connection.
<http>--><io-timeout> , <http>--><request-header-timeout>
<http>--> <request-body-timeout>
are a few tunables to get rid of hanging clients.

Also note that this is a generic problem for all Web Servers. For Sun Web
Server, yes, in these kinds of scenarios more daemon session threads are required.

Have a look at the published specweb results:
http://www.spec.org/web2005/results/res2007q4/web2005-20071008-00096.html#Ecommerce%20Details

Note that for 49500 Ecommerce user sessions, only 164 daemon sessions were
used.

Posted by Basant Kukreja on February 11, 2009 at 03:44 AM PST #

About

Basant Kukreja
