Tuesday Aug 07, 2007

Debugging the Data service Resource Type Implementation

Why Debugging?

Computer programs often need to be examined to determine the cause of apparent errors or to gain a better understanding of their source code structure and control flow. This examination is called debugging, since its usual objective is the location and removal of program errors (bugs). The need arises both during the development phase of the software life cycle and during subsequent software maintenance.

In this blog we are interested in examining the program to understand the phenomenon involved. This requires examination of the program source and its input and output behavior, and may also require examination of the internal state of the program at interesting points of execution. The purpose of this activity is to assist in formulating hypotheses about the reason for the perceived aberrant behavior. There are various ways of approaching this: single stepping, inserting breakpoints to suspend and examine the program, inserting print statements (such as printf), using the syslog interfaces provided by UNIX, and so on. Error messages and warnings differ from debug statements: they are events emitted by the program to notify the user of abnormal behavior. Debug statements, on the other hand, contain function names, variable names, and the like, and are therefore more meaningful to developers than to administrators.

The below diagram illustrates the debug model using syslog interface on UNIX.

[Figure: debug_model.gif — the debug model using the syslog interface on UNIX]

For more information on syslog and the message format, refer to http://docs.sun.com/app/docs/doc/817-0403/6mg741c7i?l=en&q=syslog&a=view

With this brief introduction behind us, let us move forward: Sun Cluster HA agent developers are offered a set of DSDL syslog APIs to assist in debugging data services.

To start, download the Open HA Cluster agent source, build tools, and related binaries from http://opensolaris.org/os/community/ha-clusters/ohac/downloads/. These will help you understand the source code and use the APIs that the DSDL provides.

Now let's move on to using the DSDL built-in features for syslog.

    DSDL provides the scds_syslog_debug() utility for adding debugging statements to a resource type implementation. The debugging level (a number between 1 and 9) can be set dynamically for each resource type implementation on each cluster node, by creating a file named /var/cluster/rgm/rt/<rtname>/loglevel that contains only an integer between 1 and 9. This file is read by all the resource type callback methods: the DSDL function scds_initialize() reads the file and sets the debug level accordingly. The scds_syslog_debug() function uses the facility returned by the scha_cluster_getlogfacility() function, at a priority of LOG_DEBUG. You can configure where these debug messages go in the /etc/syslog.conf file.

Example usage of scds_syslog_debug() in wls.c

<wls.c>
int
svc_validate(scds_handle_t scds_handle, bea_extn_props_t *beaextnprops,
                boolean_t print_messages)
{
        int                     err = 0, rc = 0;
        char                    *config_dir_name;
        char                    *start_script_dir = NULL;
        char                    wls_config_file[MAXPATHLEN];
        struct stat             statbuf;
        scds_hasp_status_t      hasp_status;
        scds_net_resource_list_t        *snrlp = NULL;

        /* This debug message traces entry into the function. */
        scds_syslog_debug(DBG_LEVEL_HIGH,
                "In svc_validate() method of WLS resource");

        /* check for HAStoragePlus resources */
        rc = scds_hasp_check(scds_handle, &hasp_status);

        if (rc != SCHA_ERR_NOERR) {
                /* scds_hasp_check() logs msg when it fails */
                if (print_messages) {
                        (void) fprintf(stderr,
                                gettext("HASP check failed.\n"));
                }
                return (1);
        }
<wls.c>

You can add as many debug statements as you find necessary, at appropriate places, to help you debug the data service agent.

Now edit /etc/syslog.conf and either change daemon.notice to daemon.debug or add a daemon.debug line. The current configuration looks like this:
# grep daemon /etc/syslog.conf
*.err;kern.debug;daemon.notice;mail.crit        /var/adm/messages
*.alert;kern.err;daemon.err                             operator
#

After daemon.debug has been added and syslogd restarted, the output of grep daemon /etc/syslog.conf shows that daemon.debug is set:

# grep daemon /etc/syslog.conf
*.err;kern.debug;daemon.notice;mail.crit        /var/adm/messages
*.alert;kern.err;daemon.err                             operator
daemon.debug                                                  /var/adm/ds.out

Create a loglevel file under /var/cluster/rgm/rt/<rtname>/. Edit the loglevel file and add "9" as the debug level.
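These two steps can be sketched as shell commands; SUNW.wls is used here only as a placeholder for your agent's actual resource type name:

```shell
# Placeholder resource type name; substitute your agent's actual <rtname>.
RTNAME=SUNW.wls

# Create the directory that the resource type callback methods read from.
mkdir -p /var/cluster/rgm/rt/${RTNAME}

# Write the debug level: an integer between 1 and 9, with 9 the most verbose.
echo 9 > /var/cluster/rgm/rt/${RTNAME}/loglevel
```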

Restart the syslog daemon:
# pkill -9 syslogd

You should now see the debug messages directed to /var/adm/ds.out.

Perform failovers and switchovers (using the cluster admin commands) to see how the agent behaves. Watching these debug messages and correlating them with the code gives you a much better understanding of the control flow of these agents.
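As a sketch of such an exercise with the Sun Cluster 3.2 command set, where wls-rg and phys-node-2 are placeholder resource group and node names:

```shell
# Switch the (placeholder) resource group wls-rg over to node phys-node-2,
# which drives the agent's Stop and Start callback methods.
clresourcegroup switch -n phys-node-2 wls-rg

# Watch the DSDL debug messages arrive as the callbacks run.
tail -f /var/adm/ds.out
```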

You can share your experience of debugging OHAC Agents and ask questions at: http://opensolaris.org/jive/forum.jspa?forumID=195

Hemachandran

Data services Team 

Tuesday Nov 14, 2006

Sun Cluster Data Services

Sun Cluster software provides the framework and API required to make applications highly available on the Solaris OS. A software application designed for Solaris does not have to be modified to use the Sun Cluster API. Instead, you write an agent that acts as the interface between the Sun Cluster core software and the application. Sun Cluster comes bundled with a rich portfolio of agents, also called data services, which make applications highly available on Sun Cluster software. These agents, designed and developed by the Sun Cluster Engineering team, have undergone rigorous testing by internal Quality Assurance engineers.

Agents for the following software products are available in the Sun Cluster 3.1 and upcoming Sun Cluster 3.2 releases. The list is in no particular order:

Oracle Database
Sybase ASE
Oracle's Siebel CRM (server and gateway)
BEA WebLogic Server
DNS
NFS Server (SC 3.2 supports the latest NFS V4)
Kerberos
Apache Web Server
Sun Java ES Web Server
Sun Java ES Application Server
Sun Java ES Message Queue Broker
Agfa IMPAX
Solaris DHCP
Oracle E-Business Suite
Oracle Application Server
Apache Tomcat
MySQL
SWIFTAlliance Access
SWIFTAlliance Gateway
IBM WebSphere MQ
Sun N1 Grid Engine
Sun N1 Service Provisioning system
PostgreSQL
SAP Web Application Server
SAP liveCache
MaxDB (previously called SAP DB)
Samba

In some cases I might not have listed the actual name of the product, as listed in the product manuals of the ISV. Please check with the ISV for the exact product name. The above list is provided to give you a high-level view of the rich application support on Sun Cluster software. If you want more details on how to configure and administer the agents for the above applications, please refer to the Data Service administration guides available at:

Sun Cluster 3.1 8/05 Software Collection for Solaris OS

As you can see, the list is very long. In addition to the agents from Sun Cluster Engineering, some internal Sun product groups and ISVs have agents for Sun Cluster software. IBM has designed and developed agents for some components of the IBM Informix and IBM DB2 product families. Symantec supports a Sun Cluster agent for Veritas NetBackup. The Sun Java ES Directory Server, Sun Java ES Messaging Server, and Sun Java ES Calendar Server groups have written agents for Sun Cluster software which are bundled with the respective products.

If your application is not in the above list, then there is nothing to worry about. It is very easy to write an agent for Sun Cluster software by using the Data Service Development tools available in the product.

You first need to check whether your application can be made highly available on Sun Cluster software. This chapter in the Sun Cluster Data Services Developer's Guide lists everything you need to verify your application's cluster readiness. Most applications can be integrated with Sun Cluster software right out of the box; sometimes you might have to enhance the application a little to make it fit. Once the application is ready for integration, use the Sun Cluster Agent Builder to generate an agent for you. The Sun Cluster Agent Builder not only generates the code but also generates the Makefiles to compile it and build a proper Solaris package. The generated package can then be installed with the pkgadd utility.
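As a sketch of that workflow, assuming the developer tools are installed under /usr/cluster/bin, and using placeholder directory and package names:

```shell
# Launch the Sun Cluster Agent Builder GUI (part of the developer tools).
/usr/cluster/bin/scdsbuilder &

# After Agent Builder has generated and built the agent, install the
# resulting Solaris package. Both the working-directory path and the
# package name SUNWmyapp below are placeholders.
pkgadd -d /path/to/working-dir/pkg SUNWmyapp
```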

If you do not want to write a separate agent for your application, you can use the Generic Data Service (GDS) agent to make your application HA on Sun Cluster software. The Generic Data Service, as the name suggests, is a generic agent designed by Sun Cluster Engineering. GDS takes as input scripts to start, stop, validate, and probe an application. Instead of writing a separate agent, you simply write these scripts for your application and supply them as extension properties to the GDS resource at creation time.
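A minimal sketch of creating a GDS-based service with the Sun Cluster 3.2 command set; the resource group, resource, and script paths below are all placeholder names:

```shell
# Register the Generic Data Service resource type (once per cluster).
clresourcetype register SUNW.gds

# Create a failover resource group for the application (placeholder name).
clresourcegroup create myapp-rg

# Create the GDS resource, supplying the application's start, stop,
# probe, and validate scripts (placeholder paths) as extension properties.
clresource create -g myapp-rg -t SUNW.gds \
    -p Start_command="/opt/myapp/bin/start" \
    -p Stop_command="/opt/myapp/bin/stop" \
    -p Probe_command="/opt/myapp/bin/probe" \
    -p Validate_command="/opt/myapp/bin/validate" \
    myapp-rs

# Bring the resource group online and under management.
clresourcegroup online -M myapp-rg
```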

You can refer to the Sun Cluster Data Services Developer's Guide and the Sun Cluster Concepts Guide for further details.

Prasad Dharmavaram
Sun Cluster Engineering
 
