Tuesday Aug 07, 2007

Debugging the Data service Resource Type Implementation

Why Debugging?

Computer programs often need to be examined to determine the cause of apparent errors or to gain a better understanding of their source code structure and control flow. This examination is called debugging, since its usual objective is to locate and remove program errors (bugs). The need arises both during the development phase of the software life cycle and during subsequent software maintenance.

In this blog we are interested in examining a program to understand its behavior. This requires examining the program source and its input and output behavior, and may also require examining the internal state of the program at interesting points of execution. The purpose is to help formulate hypotheses about the cause of the perceived aberrant behavior. There are various ways to approach this: single stepping, inserting breakpoints to suspend and examine the program, adding print statements (such as printf), using the syslog interfaces provided by UNIX, and so on. Error messages and warnings differ from debug statements: they are events from the program that notify the user of abnormal behavior. Debug statements, on the other hand, contain function names, variable names, and similar internals, so they are better understood by developers than by administrators.
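To make the distinction between error messages and debug statements concrete, here is a minimal sketch in C using the standard syslog() interface. The helper names format_trace() and log_start_failure(), the resource name, and the message texts are hypothetical and chosen only for illustration; they do not come from any agent source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <syslog.h>

/*
 * Hypothetical helper: build a developer-oriented trace line that names
 * the function and the variable being reported. Returns the value
 * returned by snprintf().
 */
static int
format_trace(char *buf, size_t len, const char *func, const char *var,
    int value)
{
        assert(buf != NULL && len > 0);
        return (snprintf(buf, len, "%s(): %s = %d", func, var, value));
}

/*
 * Hypothetical helper: log one administrator-facing error and one
 * developer-facing debug message for the same event.
 */
static void
log_start_failure(const char *resource, int rc)
{
        char trace[256];

        /* Error message: states the event in terms an administrator can act on. */
        syslog(LOG_DAEMON | LOG_ERR, "Resource %s failed to start.", resource);

        /* Debug message: exposes internals (function and variable names). */
        (void) format_trace(trace, sizeof (trace), "log_start_failure",
            "rc", rc);
        syslog(LOG_DAEMON | LOG_DEBUG, "%s", trace);
}
```

Both messages go to the daemon facility. Whether the LOG_DEBUG one is actually written anywhere depends on how /etc/syslog.conf is configured.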

The below diagram illustrates the debug model using syslog interface on UNIX.

debug_model.gif 

For more information on syslog and the message format, refer to http://docs.sun.com/app/docs/doc/817-0403/6mg741c7i?l=en&q=syslog&a=view

With this brief introduction in place, let's move forward: Sun Cluster HA agent developers are offered a set of DSDL (Data Service Development Library) syslog APIs to assist in debugging data services.

To start, download the Open HA Cluster agent source, build tools, and related binaries from http://opensolaris.org/os/community/ha-clusters/ohac/downloads/. These help you understand the source code and use the APIs that DSDL provides.

Now let's move on to the DSDL built-in features for syslog.

DSDL provides the scds_syslog_debug() utility for adding debugging statements to the resource type implementation. The debugging level (a number from 1 to 9) can be set dynamically for each resource type implementation on each cluster node. To set it, create a file named /var/cluster/rgm/rt/<rtname>/loglevel that contains only an integer between 1 and 9. This file is read by all the resource type callback methods: the DSDL function scds_initialize() reads it and sets the debug level accordingly. The scds_syslog_debug() function logs at priority LOG_DEBUG, using the facility returned by the scha_cluster_getlogfacility() function. You can configure where these debug messages go in the /etc/syslog.conf file.
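The level check described above can be pictured with a small sketch. This is not the DSDL implementation. It is only an illustration, and it assumes that a message is emitted when its level is less than or equal to the level read from the loglevel file. The names read_loglevel(), debug_log(), and debug_level are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>

/* Hypothetical stand-in for the level that an initialization step would set. */
static int debug_level = 0;     /* 0 means debug messages are off */

/*
 * Read a single integer from the given file. Returns the level (1-9),
 * or 0 if the file is missing or does not hold a value in that range.
 */
static int
read_loglevel(const char *path)
{
        FILE    *fp;
        int     level = 0;

        if ((fp = fopen(path, "r")) == NULL)
                return (0);
        if (fscanf(fp, "%d", &level) != 1 || level < 1 || level > 9)
                level = 0;
        (void) fclose(fp);
        return (level);
}

/*
 * Log the message at LOG_DEBUG only when msg_level does not exceed the
 * configured debug_level. Returns 1 if logged, 0 if suppressed.
 */
static int
debug_log(int msg_level, const char *fmt, ...)
{
        char    msg[512];
        va_list ap;

        assert(fmt != NULL);
        if (msg_level > debug_level)
                return (0);

        va_start(ap, fmt);
        (void) vsnprintf(msg, sizeof (msg), fmt, ap);
        va_end(ap);
        syslog(LOG_DAEMON | LOG_DEBUG, "%s", msg);
        return (1);
}
```

In this sketch, if debug_level is set from a loglevel file containing 9, a call such as debug_log(9, "In svc_validate()") is logged. With no loglevel file present, debug_level stays 0 and the same call is suppressed.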

Example usage of scds_syslog_debug() in wls.c

<wls.c>
int
svc_validate(scds_handle_t scds_handle, bea_extn_props_t *beaextnprops,
                boolean_t print_messages)
{
        int                     err = 0, rc = 0;
        char                    *config_dir_name;
        char                    *start_script_dir = NULL;
        char                    wls_config_file[MAXPATHLEN];
        struct stat             statbuf;
        scds_hasp_status_t      hasp_status;
        scds_net_resource_list_t        *snrlp = NULL;

        /* This message is used to trace entry into the function */
        scds_syslog_debug(DBG_LEVEL_HIGH,
                "In svc_validate() method of WLS resource");

        /* check for HAStoragePlus resources */
        rc = scds_hasp_check(scds_handle, &hasp_status);

        if (rc != SCHA_ERR_NOERR) {
                /* scds_hasp_check() logs msg when it fails */
                if (print_messages) {
                        (void) fprintf(stderr,
                                gettext("HASP check failed.\n"));
                }
                return (1);
        }

        /* ... remainder of svc_validate() not shown ... */
<wls.c>

You can add as many debug statements as you find useful, at appropriate places in the code, to help you debug the data service agent.

Now edit /etc/syslog.conf: either change daemon.notice to daemon.debug, or add a separate daemon.debug entry. Before the change, the daemon entries look like this:
# grep daemon /etc/syslog.conf
*.err;kern.debug;daemon.notice;mail.crit        /var/adm/messages
*.alert;kern.err;daemon.err                             operator
#

Add the daemon.debug entry. The output of grep daemon /etc/syslog.conf below shows that daemon.debug has been set.

# grep daemon /etc/syslog.conf
*.err;kern.debug;daemon.notice;mail.crit        /var/adm/messages
*.alert;kern.err;daemon.err                             operator
daemon.debug                                                  /var/adm/ds.out

Create a file named loglevel under /var/cluster/rgm/rt/<rtname>/ on each node where you want debug output, and put the single value 9 in it as the debug level.

Signal the syslog daemon so that it re-reads /etc/syslog.conf.
# pkill -HUP syslogd

You should now see the debug messages directed to /var/adm/ds.out.

Perform failovers and switchovers (using the cluster admin commands) to see how the agent behaves. Reading these debug messages and correlating them with the code gives you a much better understanding of the control flow of these agents.

You can share your experience of debugging OHAC Agents and ask questions at: http://opensolaris.org/jive/forum.jspa?forumID=195

Hemachandran

Data services Team 
