Wednesday Aug 22, 2007

Memory leak in scdpmd

The scdpmd (Sun Cluster disk path monitor daemon) has a memory leak when the reboot_on_path_failure flag is enabled. This is a known issue, reported as bug 6563949, which will be fixed soon. The workaround is to keep reboot_on_path_failure at its default setting, which is disabled. Only Sun Cluster 3.2 is affected because the flag is a new feature of Sun Cluster 3.2.
For details, see Administering Disk-Path Monitoring.
Update 8.Feb.2008: The bug 6563949 is now fixed in the patches
126106-04 Sun Cluster 3.2: CORE patch for Solaris 10
126107-04 Sun Cluster 3.2: CORE patch for Solaris 10_x86
126105-04 Sun Cluster 3.2: CORE patch for Solaris 9
Update 30.Jun.2008: Bug 6682663 can prevent the reboot. This is fixed in revision -15 of the Sun Cluster 3.2 CORE patches listed above.



How to identify if reboot_on_path_failure is enabled?

t2000d# scdpm -p all:all
t2000e:reboot_on_path_failure enabled
t2000e:/dev/did/rdsk/d1 Ok
t2000e:/dev/did/rdsk/d2 Ok
t2000e:/dev/did/rdsk/d4 Ok
t2000e:/dev/did/rdsk/d5 Ok
t2000e:/dev/did/rdsk/d6 Ok
t2000e:/dev/did/rdsk/d7 Ok
t2000d:reboot_on_path_failure enabled
t2000d:/dev/did/rdsk/d10 Ok
t2000d:/dev/did/rdsk/d11 Ok
t2000d:/dev/did/rdsk/d13 Ok
t2000d:/dev/did/rdsk/d14 Ok
t2000d:/dev/did/rdsk/d6 Ok
t2000d:/dev/did/rdsk/d7 Ok
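
On a cluster with many disk paths the flag lines are easy to miss in this output. As a minimal sketch, assuming the output format shown above (node name, colon, flag name, state), a small filter can print only the nodes where the flag is enabled. The function name enabled_nodes is made up for this example and is not part of Sun Cluster.

```shell
# Print the name of every node whose reboot_on_path_failure flag is enabled.
# Reads output in the format of `scdpm -p all:all` on stdin.
# enabled_nodes is a hypothetical helper name, not a Sun Cluster command.
enabled_nodes() {
    # Split on ':' so that $1 is the node name and $2 is "flag state".
    awk -F: '$2 ~ /^reboot_on_path_failure +enabled/ { print $1 }'
}
```

It could be used as `scdpm -p all:all | enabled_nodes`. An empty result means the flag is disabled on all nodes.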



How to find out if scdpmd consumes too much memory?

t2000d# ps -ef | grep scdpmd
root 5355 1 0 Aug 20 ? 390:26 /usr/cluster/lib/sc/scdpmd
t2000d# prstat
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
5355 root 952M 8520K sleep 59 0 6:29:59 4.4% scdpmd/14
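
To check the SIZE column from a script instead of by eye, the value has to be converted to a plain number first. The following is only a sketch: it assumes a POSIX shell (ksh or bash, not the old /bin/sh), assumes sizes are whole numbers with a K, M or G suffix as in the output above, and uses an arbitrary limit of 512 MB. The function names size_to_kb and size_exceeds are made up for this example.

```shell
# Convert a prstat-style size such as 8520K, 952M or 1G to kilobytes.
# Assumes whole numbers with a K, M or G suffix.
size_to_kb() {
    case "$1" in
        *K) echo "${1%K}" ;;
        *M) echo $(( ${1%M} * 1024 )) ;;
        *G) echo $(( ${1%G} * 1024 * 1024 )) ;;
        *)  echo "unknown size: $1" >&2; return 1 ;;
    esac
}

# Return success (exit status 0) if the size in $1 is above the limit in $2.
# The limit is given in KB; the default of 524288 KB (512 MB) is an
# arbitrary, hypothetical threshold and not a documented value.
size_exceeds() {
    limit_kb=${2:-524288}
    kb=$(size_to_kb "$1") || return 2
    [ "$kb" -gt "$limit_kb" ]
}
```

With the SIZE value from the prstat output above, `size_exceeds 952M && echo "scdpmd is too large"` prints the warning, because 952M is 974848 KB.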



How to disable the reboot_on_path_failure flag?

t2000d# clnode set -p reboot_on_path_failure=disabled t2000d t2000e
t2000d# scdpm -p all:all
t2000e:reboot_on_path_failure disabled
t2000e:/dev/did/rdsk/d1 Ok
t2000e:/dev/did/rdsk/d2 Ok
t2000e:/dev/did/rdsk/d4 Ok
t2000e:/dev/did/rdsk/d5 Ok
t2000e:/dev/did/rdsk/d6 Ok
t2000e:/dev/did/rdsk/d7 Ok
t2000d:reboot_on_path_failure disabled
t2000d:/dev/did/rdsk/d10 Ok
t2000d:/dev/did/rdsk/d11 Ok
t2000d:/dev/did/rdsk/d13 Ok
t2000d:/dev/did/rdsk/d14 Ok
t2000d:/dev/did/rdsk/d6 Ok
t2000d:/dev/did/rdsk/d7 Ok



How to restart the scdpm service to free the leaked memory?

t2000d# svcadm restart svc:/system/cluster/scdpm:default


Additional information and best practices are available in
Infodoc 1004119.1: Sun[TM] Cluster 3.2 Disk Path Monitoring and how to test for losing path to storage
