Resource Dependencies

Before the release of the Sun Cluster 3.1 9/04 software, the resource and resource group dependency model used by Sun Cluster 3.x was simple and minimalist, providing a method for ordering the starting and stopping of different resources and resource groups. Within the same resource group, you could make one resource dependent on another, meaning that the depended-on resource had to be started before the dependent resource could be started.

Just two types of resource dependencies were provided in the early Sun Cluster 3.x releases: strong dependencies, which require a depended-on resource to start before its dependent can be started, and weak dependencies, which wait for a depended-on resource to start before starting the dependent, but then start the dependent even if the depended-on resource fails to start.

A real-world case where you'd want to use a strong dependency is an application that depends on a database. The application needs to delay starting until after the database is online. Conversely, the application should stop before the database is taken offline. To achieve this, you make the application resource depend on the DBMS resource, for example:

clresource set -p resource_dependencies=oracle-server-rs application-rs

A weak dependency might be used by an application that prefers to use the underlying service, but can run without that underlying service. For example, if an application finds that the database is down, it might start and run in a reduced-functionality mode. In the clresource command above, you would set the resource_dependencies_weak property instead of resource_dependencies. Then, application-rs will wait for oracle-server-rs to start first; but if Oracle remains offline or fails to start, application-rs will start anyway.
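
To make the difference between the two start-time behaviors concrete, here is a small Python sketch. It is only a toy model of the rules described above, not the cluster's actual logic. The names Dep and may_start_dependent and the two boolean flags are invented for illustration; only the property names come from the commands above.

```python
from enum import Enum


class Dep(Enum):
    # Values are the property names used with clresource above.
    STRONG = "resource_dependencies"
    WEAK = "resource_dependencies_weak"


def may_start_dependent(dep, attempt_finished, depended_on_online):
    """Decide whether the dependent resource may start (toy model).

    attempt_finished: the depended-on resource's start attempt has
        concluded, whether it succeeded or failed (hypothetical flag).
    depended_on_online: the depended-on resource is currently online.
    """
    if not attempt_finished:
        # Both dependency types wait for the start attempt to conclude.
        return False
    if dep is Dep.STRONG:
        # Strong: the depended-on resource must actually be online.
        return depended_on_online
    # Weak: start anyway, even if the depended-on resource failed.
    return True


if __name__ == "__main__":
    for dep in Dep:
        print(dep.value, may_start_dependent(dep, True, False))
```

In this model the two types differ only once the depended-on resource's start attempt has concluded without it coming online: the strong dependent keeps waiting, while the weak dependent starts.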

With the release of Sun Cluster 3.1 9/04 software, this model was enhanced in several ways. Starting with that release, all types of resource dependencies are allowed to span across resource group boundaries, and therefore between nodes. A new dependency type called the restart dependency was added, which is similar to the strong dependency, with the additional feature that if the depended-on resource is stopped and then restarted, the dependent resource will also be restarted.

A restart dependency would be used for an application that has to be restarted in order to re-establish its connection to the underlying depended-on service. An example of such an application was WebSphere Business Information Message Broker v5, which had to be restarted if the underlying DBMS was restarted. (Note, this requirement was lifted in Message Broker v6.)

In Sun Cluster 3.2, we are introducing another flavor of resource dependency in addition to the existing strong, weak, and restart flavors. The new kind of dependency is called an offline-restart dependency. This is similar to the restart dependency, except that the dependent resource goes offline as soon as the depended-on resource goes offline.

An example will help clarify the distinction between the restart dependency and the offline-restart dependency. Suppose that a resource r_app has a restart dependency on a resource r_dbms, and both resources are initially online. If r_dbms goes offline, r_app remains online. When r_dbms goes back online, r_app is then stopped and restarted.

Now instead of a restart dependency, suppose that r_app has an offline-restart dependency on r_dbms, and both resources are initially online. If r_dbms goes offline, r_app is also brought offline at the same time. If r_dbms later goes back online, then r_app is automatically brought back online.
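
The two scenarios above can be replayed in a short Python sketch. This is a simplified model written for illustration, not the cluster's implementation: the function name simulate, the event strings, and the action strings are all made up here, and the model assumes that r_app starts out online and that r_dbms events alternate between "offline" and "online".

```python
def simulate(dep_kind, dbms_events):
    """Replay r_dbms state changes and record what happens to r_app.

    dep_kind: "restart" or "offline_restart" (illustrative labels).
    dbms_events: sequence of "offline" / "online" events for r_dbms.
    Returns a list of (event, actions_on_r_app) tuples.
    """
    if dep_kind not in ("restart", "offline_restart"):
        raise ValueError("unknown dependency kind: %r" % dep_kind)
    app_online = True  # assumption: both resources start out online
    timeline = []
    for event in dbms_events:
        actions = []
        if event == "offline":
            # Only the offline-restart dependency reacts at this point.
            if dep_kind == "offline_restart" and app_online:
                actions.append("stop r_app")
                app_online = False
        elif event == "online":
            if dep_kind == "restart":
                # r_app stayed online, so it is stopped and restarted now.
                if app_online:
                    actions.append("stop r_app")
                actions.append("start r_app")
                app_online = True
            elif not app_online:
                # r_app was already stopped; just bring it back online.
                actions.append("start r_app")
                app_online = True
        else:
            raise ValueError("unknown event: %r" % event)
        timeline.append((event, actions))
    return timeline


if __name__ == "__main__":
    for kind in ("restart", "offline_restart"):
        print(kind, simulate(kind, ["offline", "online"]))
```

Both dependency types end up stopping and starting r_app once. What differs is when the stop happens: at the moment r_dbms comes back (restart) or at the moment r_dbms goes away (offline-restart).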

The offline-restart dependency is useful when a fault in the underlying depended-on service renders the dependent service unusable. Instead of leaving the dependent service online but faulted when the underlying service goes offline, the offline-restart dependency will cause the dependent service to also be taken offline. When the underlying service recovers and is restarted, the dependent service automatically starts again.

The original restart dependency can still be useful if the dependent service is able to run in a degraded mode after the depended-on service goes offline. In this case, you wouldn't want to take the dependent service offline immediately. However, after the depended-on resource comes back online, you might still need to restart the dependent resource to re-establish full service. The restart dependency provides this behavior.

In Oracle 10g RAC configurations, offline-restart dependencies are used. We configure a scalable mount point (ScalMountPoint) resource to control availability of a file system mount point that is accessible from multiple nodes of the cluster. Underlying the ScalMountPoint resource is a ScalDeviceGroup resource, which controls availability of the disk resources on which the mount point is created. We configure an offline-restart dependency of the ScalMountPoint upon the ScalDeviceGroup:

# clresource create -t SUNW.ScalMountPoint -g scal-mp-rg \
-p resource_dependencies_offline_restart=scal-dg-rs ... mp-resource

Now if the ScalDeviceGroup resource (scal-dg-rs) goes offline due to a fault, the ScalMountPoint resource (mp-resource) is immediately taken offline as well. If at a later time the ScalDeviceGroup is re-enabled and goes online, then the ScalMountPoint automatically goes online. Application resources higher up the dependency tree can in turn declare offline-restart dependencies on ScalMountPoint, so that they too will be taken offline if the disk resource fails, and will be brought back online when it recovers.
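
The cascading effect described above can be sketched in Python as a walk over the chain of offline-restart dependencies. This is a toy model, not how the cluster computes anything: the function name resources_taken_offline and the dictionary layout are invented here, and app-rs is a hypothetical application resource added to the example. Only scal-dg-rs and mp-resource come from the command above.

```python
def resources_taken_offline(offline_restart_deps, failed):
    """Return every resource that ends up offline when `failed` goes offline.

    offline_restart_deps maps each dependent resource to the list of
    resources it has an offline-restart dependency on.
    """
    offline = {failed}
    changed = True
    while changed:
        changed = False
        for dependent, targets in offline_restart_deps.items():
            # A dependent goes offline if anything it depends on is offline.
            if dependent not in offline and offline.intersection(targets):
                offline.add(dependent)
                changed = True
    return offline


if __name__ == "__main__":
    deps = {
        "mp-resource": ["scal-dg-rs"],  # from the clresource command above
        "app-rs": ["mp-resource"],      # hypothetical application resource
    }
    print(sorted(resources_taken_offline(deps, "scal-dg-rs")))
```

In this model a fault in scal-dg-rs takes the whole chain offline, while a fault in mp-resource alone leaves scal-dg-rs untouched.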

Martin Rattner
Sun Cluster Engineering

Comments:

Hi, Martin. For Sun Cluster 3.1 9/04, if the database is configured in failover mode, and there is an HTTP server that must run on the same host as the database instance, could we use the restart dependency to make the HTTP server fail over to another host when the database fails over? Thanks.

Posted by Gulf Zhou on December 29, 2006 at 11:28 AM PST #

Hi Gulf Zhou, Looking at your case, I suggest you use the strong positive affinity with delegated failover. Check the blog article titled "How does sun cluster decide on which node a service runs?" posted on Nov 29, 2006 on Sun Cluster Oasis for more details on how you can accomplish this.

Posted by Harish Mallya on January 03, 2007 at 03:31 AM PST #

Can the offline restart dependency be used between resource groups, or only by resources within the same resource group?

Posted by Jeroen on January 11, 2007 at 08:44 PM PST #

Offline restart dependencies can be used between resources that are in different resource groups (termed inter-rg) and also between resources within the same resource group (termed intra-rg).

Posted by Harish on January 17, 2007 at 11:42 PM PST #

Is there a way to specify dependency between two resource groups, where one runs at the global zone, and the other needs to be in a local zone on the same node?

e.g.
I have "myzone" configured for fail-over using SUNWsczone, node list for the RG is "node1,node2".
I would like to have apache running in myzone, using SUNW.apache. Since the node list is "node1:myzone,node2:myzone", I have to put it in a separate RG.
But I don't see how I can say "if the zone fails over to the other node, the apache service has to follow onto the same node"...
If I follow this:
http://docs.sun.com/app/docs/doc/820-2578/fumsy?a=view
it means I'll have to write my own start/stop/probe scripts. Is there any way to use the available Data Service packages in such a setup?

HXY

Posted by HXY on November 27, 2008 at 09:12 PM PST #

I see that HXY's question has gone unanswered. You can achieve the desired functionality by declaring a strong positive affinity with failover delegation (+++ affinity) between the RGs. For example,
# clrg set -p rg_affinities=+++RG2 RG1
where RG1 is the apache resource group and RG2 is the HA container (sczone) resource's group.

The +++ affinity assures that both RGs will always run on the same node. You should also declare a resource dependency of the apache resource upon the zone resource, so that apache does not try to start until the zone has come up.

Posted by Martin Rattner on July 27, 2009 at 06:34 AM PDT #
