Troubleshooting memory leaks and memory corruptions in Sun Java System Web Server 7.0

This post shows how to use libumem and watchmalloc to gather more information about memory leaks and memory corruption in Sun Java System Web Server 7.0.

Using libumem to find memory leaks and memory corruptions

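1) Disable pools
Add this line to magnus.conf:
Init fn="pool-init" disable="true"
Note that this is not a supported public interface. Web Server core normally takes memory from an internal pool instead of calling malloc and free each time, so memory returned to the pool is not actually freed and libumem cannot check it.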
2) Set environment variable UMEM_DEBUG
$export UMEM_DEBUG="default"
Refer to the umem_debug(3MALLOC) man page for more details.
3) Start the Web Server instance
$./bin/startserv
4) Note down the pid of the webservd process
$ps -eaf |grep webservd
You will see two webservd processes; note down the higher pid.
5) Dump the initial core
$gcore -o core.pid.start pid
(gcore appends the process ID to the file name given with -o, so adjust the core file names used below accordingly.)
6) Run tests: send some requests that cause the memory corruption or leaks.
7) Dump core again
$gcore -o core.pid.end pid
8) Compare the leaks reported for these two cores; the difference should show the leaks.
$mdb core.pid.start  
>::findleaks -d ! cat > findleaks.start.log
(Alternatively, run it non-interactively: $echo "::findleaks -d" | mdb core.pid.start > findleaks.start.log)

$mdb core.pid.end
>::findleaks -d ! cat > findleaks.end.log

Compare these two logs manually to see if there are leaks.
If the C++ symbol names in these logs are unreadable, you can filter the logs through Sun Studio's c++filt, or enable demangling in mdb as shown below:
> $G
C++ symbol demangling enabled
>

9) For memory corruptions, try the umem_status, bufctl_audit and umem_verify commands on the last core:
$mdb core.pid.end
>::umem_status
...
If it shows a corrupted buffer, try looking at its contents
(for example, >26a3640/40X)
or run
>bufferaddress::bufctl_audit
(for example, >26a4618::bufctl_audit)
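
The manual comparison in step 8 can be partly scripted. The following is a minimal sketch, not part of the original procedure: new_leaks is a hypothetical helper name, and it assumes that a leak already present in the first core produces identical lines in both logs (same process, same addresses). It is written for a POSIX-style shell such as ksh or bash and assumes mktemp, sort and comm are available.

```shell
# new_leaks START_LOG END_LOG   (hypothetical helper, not a Web Server tool)
# Prints the lines that appear in END_LOG but not in START_LOG.
# Assumption: both logs come from "::findleaks -d" on cores of the same
# process, so a leak that already existed at startup yields the same line
# in both logs and is filtered out here.
new_leaks() {
    start_sorted=$(mktemp) || return 1
    end_sorted=$(mktemp) || { rm -f "$start_sorted"; return 1; }
    sort -u "$1" > "$start_sorted"
    sort -u "$2" > "$end_sorted"
    # comm -13 drops lines unique to the first file and lines common to both,
    # leaving only the lines unique to the second file.
    comm -13 "$start_sorted" "$end_sorted"
    rm -f "$start_sorted" "$end_sorted"
}
```

For example, running new_leaks findleaks.start.log findleaks.end.log prints only the lines that are new in the second log. The output is sorted, so stack frames lose their grouping; use it to spot new addresses and then look them up in findleaks.end.log.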

The umem_status command also shows whether a buffer was modified after it had already been freed. For example, it once showed me:
> ::umem_status
Status: ready and active
Concurrency: 2
Logs: (inactive)
Message buffer:
umem allocator: buffer modified after being freed
modification occurred at offset 0x8 (0xdeadbeefdeadbeef replaced by 0xde20be20de20be20)
...
(0xdeadbeef is the pattern libumem writes into freed buffers when debugging is enabled, so any other value at that offset means something wrote to the buffer after it was freed.)

You can also use the umem_verify command to check whether any of the umem caches contains a corrupted buffer.
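
The corruption checks can also be collected in one pass by piping each command into mdb, in the same way as the non-interactive findleaks form in step 8. This is only a sketch: umem_report and the MDB override variable are hypothetical names introduced here, and the script simply saves whatever mdb prints without interpreting it.

```shell
# MDB is a hypothetical override so a different binary can be substituted;
# by default the real mdb is used.
MDB=${MDB:-mdb}

# umem_report CORE OUTDIR   (hypothetical helper)
# Runs ::umem_status and ::umem_verify against CORE and saves each
# command's output to OUTDIR/<command>.log.
umem_report() {
    core=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    for dcmd in umem_status umem_verify; do
        # mdb reads its commands from standard input when it is not a terminal
        echo "::$dcmd" | "$MDB" "$core" > "$outdir/$dcmd.log" 2>&1
    done
}
```

For example, umem_report core.pid.end umem-logs leaves umem_status.log and umem_verify.log in the umem-logs directory for later inspection.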

Using watchmalloc for memory corruptions

1) Disable pools
Add the following line in magnus.conf
Init fn="pool-init" disable="true"
2) In the start script, preload watchmalloc with LD_PRELOAD and set MALLOC_DEBUG=WATCH, and remove or comment out the libumem and libCld portions:

$diff -u startserv startserv.watchmalloc
--- startserv     Wed Dec  7 14:03:40 2005
+++ startserv.watchmalloc     Thu Mar 15 13:19:06 2007
@@ -9,6 +9,9 @@
 SERVER_BIN_DIR="/space/wsDec7/iplanet/ias/server/work/B1/SunOS5.8_DBG.OBJ/bin"
 SERVER_LIB_DIR="/space/wsDec7/iplanet/ias/server/work/B1/SunOS5.8_DBG.OBJ/lib"
 SERVER_BIN=webservd-wdog
+LIBWMC=/usr/lib/watchmalloc.so.1
+LD_PRELOAD_32="${LIBWMC} ${LD_PRELOAD}"; export LD_PRELOAD_32
+MALLOC_DEBUG="WATCH"; export MALLOC_DEBUG

 # Add path to server binaries to PATH
 PATH="${SERVER_BIN_DIR}:${SERVER_LIB_DIR}:/bin:${PATH}"; export PATH
@@ -47,26 +50,6 @@
 # Add instance-specific information to SHLIB_PATH for HP-UX
 SHLIB_PATH="${SERVER_LIB_PATH}:${SERVER_JVM_LIBPATH}:${SHLIB_PATH}"; export SHLIB_PATH

-# Preload libumem to improve performance on Solaris 10
-LIBUMEM_32=/usr/lib/libumem.so
-if [ -f "${LIBUMEM_32}" ] ; then
-    if [ `uname -r | sed s/\\\\\\.//` -ge 510 ] ; then
-        LD_PRELOAD_32="${LIBUMEM_32} ${LD_PRELOAD_32}"; export LD_PRELOAD_32
-    fi
-fi
-LIBUMEM_64=/usr/lib/64/libumem.so
-if [ -f "${LIBUMEM_64}" ] ; then
-    if [ `uname -r | sed s/\\\\\\.//` -ge 510 ] ; then
-        LD_PRELOAD_64="${LIBUMEM_64} ${LD_PRELOAD_64}"; export LD_PRELOAD_64
-    fi
-fi
-
-# Preload libCld to resolve -compat=4/-compat=5 C++ ABI issues on Solaris
-LIBCLD="${SERVER_LIB_DIR}/libCld.so"
-if [ -f "${LIBCLD}" ] ; then
-    LD_PRELOAD_32="${LIBCLD} ${LD_PRELOAD_32}"; export LD_PRELOAD_32
-fi
-
 if [ $# -eq 0 ] ; then
     COMMAND=--start;
 elif [ "$1" = "-i" ] ; then
3) Comment out the compat4/compat5 NSAPI Init plugins in magnus.conf if they exist (most probably they do not).
4) Start the Web Server instance
$./bin/startserv
Running under watchmalloc makes the server extremely slow.
5) Run tests: send requests to the server.
The server may now crash at the point of the bad memory access rather than at a random later point.
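
Before starting the server in step 4, it can help to confirm that the edited start script no longer preloads libumem or libCld. The sketch below is an assumption-labeled convenience, not part of the product: active_preloads is a hypothetical helper, and it assumes a grep that supports POSIX character classes.

```shell
# active_preloads SCRIPT   (hypothetical helper)
# Prints every line of SCRIPT that mentions LD_PRELOAD and is not a
# comment, prefixed with its line number.
active_preloads() {
    grep -n 'LD_PRELOAD' "$1" | grep -v '^[0-9]*:[[:space:]]*#'
}
```

Running active_preloads startserv.watchmalloc on a script edited as in the diff above should list only the LD_PRELOAD_32 line that adds watchmalloc; any remaining libumem or libCld line still needs to be removed or commented out.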


In general, if you want to see what happens when you send a request, you can use truss:
$truss -o truss.log -u "*" -d -fael -rall -vall -wall -p <pid>
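
The pid passed to truss (and to gcore earlier) is found by reading the ps output by hand. As a rough sketch, the selection of the highest webservd pid can be scripted. highest_webservd_pid is a hypothetical helper, it assumes the PID is the second column of ps -eaf output, and its pattern matches any command line that contains webservd.

```shell
# highest_webservd_pid   (hypothetical helper)
# Reads "ps -eaf" output on standard input and prints the highest PID
# among the lines that mention webservd. The grep process itself is
# skipped. Assumes the PID is the second column.
highest_webservd_pid() {
    awk '/webservd/ && !/grep/ && $2 ~ /^[0-9]+$/ && $2 + 0 > max { max = $2 + 0 }
         END { if (max > 0) print max }'
}
```

A possible use is pid=$(ps -eaf | highest_webservd_pid), followed by the truss command with -p "$pid". Check the result against the ps listing before relying on it, since the highest pid is only a heuristic.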

More information about truss is at http://docs.sun.com/app/docs/doc/816-5165/truss-1?l=ja&a=view

Note that the -u option enables user-level function call tracing.
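
A truss log taken with these options gets large quickly. One way to get a first overview is to count the failed system calls, which truss marks with Err# followed by the errno number and name. The following is a small sketch under that assumption; truss_errors is a hypothetical helper name.

```shell
# truss_errors LOG   (hypothetical helper)
# Counts failed system calls in a truss log, grouped by errno name.
# Relies on truss reporting failures as "... Err#<number> <NAME>".
truss_errors() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^Err#[0-9]+$/) count[$(i + 1)]++
    }
    END { for (name in count) print count[name], name }' "$1" | sort -rn
}
```

Running truss_errors truss.log prints one line per errno name with its count, most frequent first. Many failed calls are harmless, so treat the counts as a starting point for reading the log rather than as a diagnosis.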

Links:
http://sunsolve.sun.com/search/document.do?assetkey=1-9-70641
http://developers.sun.com/solaris/articles/libumem_library.html
Comments:

Meena,

Thanks a lot for this nice technote. Some questions:

1. Why do we need to

"
1) Disable pools
Add the following line in magnus.conf
Init fn="pool-init" disable="true"
"

Is it because, if the pool is used, memory is allocated inside the pool and cannot easily be identified by libumem as leaked?

Please explain.

2. I assume the pool is enabled by default, so we need to disable it as above.
But internally, can we ask where the pool is used?
E.g. in the JDBC pool? In the SSL cache pool? In the ACL cache pool? In some other pools?

3. Sometimes, even when we use libumem, we cannot see some of the memory usage. E.g. gcore has 1.5 GB and pstack showed heap = 1,100 MB, but libumem showed only 300 MB with mdb ::umausers, so we do not know where the remaining 1,100 MB - 300 MB was used.
(Besides the Java JVM, what else can it be? Memory assigned in some kind of pool?)

Thanks,

Walter

Posted by Walter Lee on April 01, 2008 at 01:20 AM IST #

Web Server core doesn't call malloc and free every time it needs memory; it uses an internal pool. This pool is enabled by default. Libumem catches double-free bugs when the memory is freed, but if you use pools the memory is not actually freed. NOTE: this pool is internal to Web Server and is not a supported interface. Nobody should use this interface when using Web Server.

I do not know the answer to question #3.

Posted by Meena on April 03, 2008 at 03:31 PM IST #

About

Meena Vyas
