Wednesday Apr 25, 2007

Sun Cluster 3.2 Documentation Responses to Voice-of-the-Customer Feedback

Sun Cluster 3.2 is one of the most feature-rich releases in this product's history. The largest and most ambitious feature in the release was the new object-oriented command set. See A New Command Set for Solaris Cluster 3.2 for more information. This new set of commands required that the Sun Cluster writers modify every procedure in the Sun Cluster documentation suite and create over 30 new man pages. While this was a daunting task (over 2,000 procedures), the Sun Cluster writers decided to take advantage of this opportunity to address many of the comments and requests we have received from customers over the past few years.

First Request: Include More Examples

Many customers have asked us to include more examples in our documentation, especially in our man pages. In Sun Cluster 3.2, the Sun Cluster writers added examples to all of our new procedures and man pages. In addition, we went back and added examples to many of our existing procedures. We also added a new Quick Start Guide, which is a cookbook or example of how to install a common Sun Cluster configuration. Finally, we added examples of how to deploy hardware and data service (agent) configurations.

Examples of our new and improved documentation include:

  1. Quick Start Guide

    This guide is an example of a simplified Sun Cluster installation, using a configuration adopted by approximately 80% of our Sun Cluster customers.

  2. "Sample Hardware Deployments" and "Sample Data Service Deployments" in Sun Cluster 3.2 Documentation Center

    Deployment examples provide samples of hardware and data service configurations.

  3. How to Install and Configure a Two-Node Cluster

    This "How To" provides simple instructions and examples for installing a two-node cluster.

  4. Examples in Object Oriented (1CL) man Pages and Quorum Server man Pages

    All man pages for the new object-oriented command line interface and the new quorum server feature contain comprehensive examples.

  5. Examples in All New Procedures Added for Sun Cluster 3.2 Features

    We added examples to the procedures for all new features in the Sun Cluster 3.2 release. In addition, we added examples to many of the existing procedures, as shown in Example 8.1 Live Upgrade to Sun Cluster 3.2 Software.

Second Request: Make Information Easier to Find

Someone wise once said that if you can't find the information, it might as well not be there. Many of our customers complained that it was difficult to find information in the Sun Cluster documentation. To help customers both identify and find the information they need, we created a Sun Cluster Documentation Center. This center provides a list of links to useful information. If you are a first-time user, the new Getting Started section will help you ramp up on Sun Cluster. Feel free to add your comments about this new navigation tool.

We also added information about using external browsers to search the documentation. Now you can use your favorite search engine to search the Sun Cluster documentation.

Finally, we added the ability to view man pages from the procedural documentation. This ability was available in our core Sun Cluster documentation, but now we've added this ability to all our Sun Cluster product documentation. If there's a link to a man page in our documentation, just click on the link and the man page is displayed in the browser window. To view all of the man pages, click on the reference collection for the appropriate Sun Cluster product.

Examples of our improved documentation include:

  1. Sun Cluster Documentation Center

    This document provides a topic-based list of links to Sun Cluster information, to help you find what you need in the Sun Cluster doc set.

  2. Searching Sun Product Documentation in Sun Cluster 3.2 Release Notes

    We added information to the Sun Cluster 3.2 Release Notes explaining how to use external browsers to search index entries. This information will help you find the product information quickly and easily.

  3. Reference Collections for All Sun Cluster Products

    The addition of reference collections for all Sun Cluster products enables you to view relevant man page information in a browsable format from links in the user documentation.

    Sun Cluster 3.2 Reference Manual
    Sun Cluster Data Services Reference Manual

See my entry here next month for two other categories of customer comments we addressed in our Sun Cluster 3.2 documentation.

Rita McKissick
Sun Cluster Documentation

Wednesday Feb 28, 2007

SSH Support for Cluster Console Panel


The Cluster Console Panel (CCP) utility has long been a favorite of administrators of multi-node systems. It provides a single access point for interacting simultaneously with many nodes, saving a lot of effort.

In releases of Sun Cluster software before 3.2, the access methods available with the CCP utility were rlogin, telnet, and console access over telnet. What was missing was secure connections to the nodes and to their consoles.

With the increasing focus on security in production environments, this gap became a real limitation. The newer breed of servers from Sun have platform managers, such as service processors, that offer secure connections and allow users to manage nodes remotely. The cconsole tool, however, was not equipped to use them. Customers repeatedly asked us to incorporate secure connections via Secure Shell (SSH) into cconsole.

The patch to Sun Cluster 3.2 software will add SSH support to both the GUI and command line variants of cconsole. The revamped CCP features include:

  • SSH support for cconsole: The cconsole tool will support connections to node consoles over SSH. This is in addition to the already existing standard telnet connections to consoles. The utility can be used in either of the following ways:

    - Launch the CCP GUI using the ccp command and then click on the cconsole button. The graphical interface for cconsole will have a new check box called "Use SSH" under the "Options" menu. Select this check box to connect to the node consoles over SSH. By default, the check box is deselected, meaning that the default mode of connecting to consoles is not secure. Refer to Figure 1.

- Launch cconsole directly from the command line. The command line options for cconsole are:

  -s                 New option for enabling SSH while connecting to a node's console. The /etc/serialports database has the console access device's name and the port number to be used for the SSH connection. Specify 22 as the port number if using the default SSH configuration on the console access device; otherwise, specify a custom port number.

  -l user            Optional SSH user name. By default, the user launching the cconsole or ccp command is effective.

If either the cconsole or the ccp command is launched with the "-s" command line option, the "Use SSH" check box is automatically selected. If the "-s" option is not specified, select the "Use SSH" check box under the "Options" menu to enable the SSH connection.
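For reference, each /etc/serialports entry maps a node name to its console access device and the port to use for the connection. Here is a sketch of what such entries look like; the host names are hypothetical examples, and the port is 22 per the default SSH configuration noted above:

```
pnode1    node1-sp    22
pnode2    node2-sp    22
```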

  • A new "cssh" command: CCP software will include a new cssh command that can be used to connect to nodes using standard SSH connections, in either of the following ways:

- Launch the CCP GUI with the ccp command, then click on the new cssh button (which is next to the existing crlogin, ctelnet, and cconsole buttons).

- Issue the cssh command directly from the command line. The cssh command takes the following options:

  -l user            Optional SSH user name. By default, the user launching the command is effective.

  -p port            Optional port number to use for the SSH connections. Port 22 is used by default.

Here is a screenshot of the modified Cluster Console Panel. It shows the new “cssh” button on the panel for the cssh command. It also shows the new “Use SSH” check box under the Options menu when the cconsole button is clicked.


                Figure 1. Cluster Console Panel GUI

  • Shared options: The ccp command will accept options at the command line that are used by crlogin, cssh, and cconsole. Values passed to the options take effect for all the commands that are subsequently launched by clicking the icons in the CCP GUI. For more details about the commands and their options, refer to the cconsole(1M) man page.

As an example, if one launches ccp in this manner:

      # ccp -l joe -s -p 123

then this will be the effect on individual tools that are launched from the buttons on the CCP GUI:

  - ctelnet: This command ignores all of the -l, -p, and -s options and treats everything else on the command line as cluster or node names.

  - crlogin: The user name for rlogin would be "joe".

  - cssh: The SSH user name would be "joe" and the SSH port number would be "123".

  - cconsole: The cconsole tool would use SSH to connect to the node consoles due to the "-s" option. The user name for the SSH connection to the console access device (as determined by the entry in /etc/serialports) would be "joe". The port number, however, is taken from the serialports database and not from the command-line value of the "-p" option. In addition, the user can deselect the "Use SSH" check box to override the command-line "-s" option, in which case the console is accessed over a telnet connection to the console access device.

With all these changes, the CCP, and cconsole in particular, will be equipped to act as a full-fledged tool for multi-node administration, further adding to ease of use of Sun Cluster 3.2 software.

Subhadeep Sinha
Sun Cluster Engineering

Tuesday Jan 09, 2007

A New Command Set for Solaris Cluster 3.2

The 3.2 release of Solaris Cluster introduces an all-new Command Line Interface (CLI) for managing the cluster. The new command set includes many features that make it both easier to use and more powerful than the command set found in earlier releases of the product.

Of course, the familiar command set from earlier releases is still there in 3.2 and is still fully supported for those users who are not ready to switch. All commands are still located in /usr/cluster/bin, so users need not modify their PATH settings to find commands from either set. Command names in the older command set all begin with the prefix "sc", and command names in the new CLI all begin with the prefix "cl". The two command sets are fully compatible with one another, so users can intermix commands from both sets in the same shell script or on the command line.

Early response from users of the new command set has been overwhelmingly positive. But we are sometimes asked, "Why a new CLI for Solaris Cluster? What was wrong with the old one?" Not too long ago, Sun conducted a comprehensive survey of Solaris Cluster system administrators. The study revealed that the top issues administrators had with the cluster mostly concerned the command set. Sun felt that the best solution to the issues raised in the study was to introduce an altogether new command set. Data from that study, along with other customer feedback, was used to begin design of a new CLI. The design underwent several phases of refinement as we went back to customers with our ideas. Finally, in-depth usability studies were conducted before the final design was ready for implementation.

The new command set is "object-oriented". That is, there is a different command for each type of cluster object that an administrator might need to manage. For example, a user managing just resource groups need only use the new "clresourcegroup" command to manage groups. The "clresourcegroup" command can be used to create/delete groups, set group properties, perform resource group switchovers, print configuration and status reports, etc... Each new command supports full management of all objects of the type that it controls.

The new command interfaces all use the same basic format:

cl<object_type> [<subcommand>] [<options>] [<objects>]

So, for example, to create a new resource group "rg1", one might use either one of the following commands:

clresourcegroup create rg1
clresourcegroup create --property Description="My rg" rg1

Users will rely on subcommands for most things, although some options, such as --help and --version, can be used without subcommands.
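The general shape of this interface can be sketched with a toy parser. This is purely an illustration in Python, not Sun Cluster code, and it assumes every option takes a value (which, as noted above, is not true of --help or --version):

```python
def parse_clip(argv):
    """Split a CLIP-style command line into its four parts:
    cl<object_type> [<subcommand>] [<options>] [<objects>]."""
    command, rest = argv[0], argv[1:]
    subcommand, options, operands = None, {}, []
    i = 0
    # The first word after the command, if not an option, is the subcommand.
    if rest and not rest[0].startswith("-"):
        subcommand = rest[0]
        i = 1
    while i < len(rest):
        arg = rest[i]
        if arg.startswith("-"):
            # Long (--property) or short (-p) option; assume it takes a value.
            options[arg.lstrip("-")] = rest[i + 1]
            i += 2
        else:
            operands.append(arg)  # e.g. a resource group name
            i += 1
    return command, subcommand, options, operands

print(parse_clip(["clrg", "create", "-p", "Description=My rg", "rg1"]))
# ('clrg', 'create', {'p': 'Description=My rg'}, ['rg1'])
```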

Most of the new commands also have built-in "aliases", or "short names". Our customers told us that they wanted command names that were descriptive and meaningful. But, they also told us that they wanted command names that were short and easy to type. The Solaris Cluster team felt that the best solution was to provide both. So, most commands actually have two names to choose from, a descriptive name and a short name. For example, "clrg" is the same as "clresourcegroup":

clresourcegroup create rg1
clrg create rg1

From the examples above, you might have noticed that the new commands accept long option names (e.g., --property, --help, --version). Long option names are useful, especially in shell scripts, as they tend to be self-documenting. But they can also get in the way when issuing commands directly from the command line. All of the new commands support both long names and single letters for options. When specifying options, users may either use long option names, with a double dash (--), or short option letters, with a single dash (-). As an example, the following two commands do exactly the same thing:

clrg create --property Description="My rg" rg1
clrg create -p Description="My rg" rg1

By now, you may be thinking that this is pretty basic stuff, very similar to other commands that you have used. And that is exactly what we intended. The new Solaris Cluster command line interface is designed to present a familiar interface. Commands are GNU-like, but actually conform to the slightly stricter conventions of Sun's "Command Line Interface Paradigm" (CLIP).

There isn't enough space here to describe all of the useful features packed into the new command set. But, let's use a couple of simple examples to quickly illustrate some of the other features that we haven't yet touched upon.

Our first example deletes, then re-creates, all resources and groups in a cluster:

# cluster export >clusterconfig.xml
# clrg delete --force +
# clrg create --input clusterconfig.xml +
# clrs create --input clusterconfig.xml +
# clrg online +

The first command in this example is "cluster export". Most of the new commands support an "export" subcommand for exporting a copy of selected cluster configuration data in XML format. The "cluster" command is something of an "umbrella" command and can, among other things, generate status and configuration reports for the entire cluster by using it with its "status", "show", or "export" subcommands.

Next, "clrg delete --force +" is used to delete all resources and resource groups from the cluster. The force option tells the command to delete all resource groups, even if groups still contain resources. The "+" symbol can be used as an operand to most commands as a sort of "wildcard character", to indicate all objects of the type managed by the command.

The next two commands, "clrg create" and "clrs create" are used to re-create all of the resource groups and resources described in the "clusterconfig.xml" file created in the first step. As a different approach, we could have actually left out this "clrg" step and used the "--automatic" option in the "clrs" step to automatically create any groups needed by the new resources.

Finally, all resource groups in the cluster are brought online using "clrg online +".

This next example shows how a "string array" resource property can be updated:

# clrs list-props --verbose myresource
Property Name       Description
-------------  -----------
myuserlist          This is a list of user names
# clrs set -p myuserlist+=user9,user10 myresource

"list-props" is a fairly helpful subcommand. It is used to list property names and, if used with --verbose, their descriptions. With "clrs", the default is to just list extension properties; but, options are available to list standard properties as well. In this example, we have used "clrs list-props" to list descriptions of all extension properties for the resource called "myresource".

Finally, "clrs set" is used to update the "myuserlist" extension property of resource "myresource". Notice that it is no longer necessary to distinguish between "extension" and "standard" properties when updating them on the command line (unless a resource type uses an "extension" property name which collides with a "standard" property name). Another useful feature is that "string array" properties can now be updated without having to re-specify the unchanged portion of the array. The --property (-p) option to "clrs" supports all of the following syntax for updating a "string array" resource property:

--property <property name>=<property value list>
--property <property name>+=<property value list>
--property <property name>-=<property value list>
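The three operators behave just as the syntax suggests: "=" replaces the whole array, "+=" appends values, and "-=" removes them. A minimal sketch of these semantics, modeled with a plain Python list (an illustration only, not Sun Cluster code):

```python
def update_string_array(current, op, values):
    """Apply one --property operator to a string-array property value."""
    if op == "=":
        return list(values)   # replace the entire array
    if op == "+=":
        return current + list(values)   # append the new values
    if op == "-=":
        return [v for v in current if v not in values]   # remove values
    raise ValueError("unknown operator: " + op)

users = ["user1", "user2"]
users = update_string_array(users, "+=", ["user9", "user10"])
print(users)  # ['user1', 'user2', 'user9', 'user10']
```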

Online man pages for the new Solaris Cluster CLI can be found in the "1CL" section of the Sun Cluster Reference Manual for Solaris OS.

There are many more powerful features in the new command set that we have not been able to touch on here. Let us know what you like best about the new command set. And, of course, send us your suggestions for improvements, too.

John Cummings
Sr. Staff Engineer
Solaris Cluster Engineering

Saturday Dec 23, 2006

Solaris Cluster 3.2 is out there!!!

There was great excitement at the office last week! We are all really proud and happy to see our product suite, Solaris Cluster, roll out before the end of the year 2006! Mid-year '06, the schedules were modified and the release dates pulled in... there was commitment in the team to meet this challenge, and we are all really glad that we made it!

Solaris Cluster is the framework that extends the high availability features of Solaris. It includes the latest releases of the Sun Cluster software: Sun Cluster 3.2, Sun Cluster Geographic Edition 3.2 and Sun Cluster Agents 3.2. Solaris Cluster 3.2 is available for download here. The media will be available with the Expanded Solaris(TM) 10 11/06 Media Kit, early 2007. Here is a quick update on the release and its features.


Ease of Use
\* New Command Line Interfaces
\* Oracle 10g improved integration and administration
\* Agent configuration wizards
\* Flexible IP address scheme

Higher Availability
\* Cluster support for SMF services
\* Quorum server
\* Extended flexibility for fencing protocol

Greater Flexibility
\* Expanded support for Solaris Containers
\* HA ZFS - agent support for Sun's new file system
\* Extended support for Veritas software components

Better Operations and Administration
\* Dual-partition software update
\* Live upgrade
\* Optional GUI installation

With Solaris Cluster Geographic Edition, new features include:
\* Support for x64 platforms
\* Support for EMC SRDF replication software

Solaris Cluster is supported on Solaris 9 9/05 and Solaris 10 11/06.


You should be able to register for the Sun Cluster training courses:

\* ES-345 Sun Cluster 3.2 Administration (5 day Instructor Led Training)
\* VC-ES-345 Sun Cluster 3.2 Administration (5 day Live Virtual Class)
\* ES-445 Sun Cluster 3.2 Advanced Administration (5 day Instructor Led Training)

For a complete list of Sun Cluster courses and more details, watch this space. We will blog about this in future too.


Check out the Business Continuity Learning centre here. See our director, Keith White, in a video, talking about the features and benefits of Sun Cluster. There is also an on-line demo and a how-to install Sun Cluster guide that you will find interesting.


We had a few days to do some fun activities as a team before our holiday break. There was a "YouTube challenge" suggested by Keith. As part of the fun, we put on our creative hats, made videos, and designed T-shirts with our product as the theme. We had less than 3 days to make the videos. On Thursday, 21 December, the team met over lunch to view the 7 videos that were made. Some of them are really funny. I am in awe of the creativity in this group!

Check out one of our fun videos, "Get A Life"


GetALife is also on YouTube. We will share some of the others as well. Some are bound to make you laugh!

Next year promises to be an interesting year! We will be back in '07 with renewed energy!

Happy Holidays!

Meenakshi Kaul-Basu
Sr. Manager, Solaris Cluster Engineering

Sunday Dec 10, 2006

Making SMF services Highly Available with Sun Cluster

If you have written an SMF service for an application that runs on a single node and want to make the service highly available across multiple nodes, Sun Cluster can do it for you. Even though the Sun Cluster service model is different from the SMF service model, adding HA takes only a few simple commands, with no need to write a new agent script or code.

To support the SMF services, Sun Cluster 3.2 provides the following three new resource types.
- Proxy_SMF_failover resource type allows an SMF service to be managed as a failover resource.
- Proxy_SMF_multimaster resource type allows an SMF service to be online on more than one node simultaneously (without any load balancing).
- Proxy_SMF_scalable resource type allows running an SMF service as a scalable resource (with Sun Cluster load balancing facility).

Here is an example to show how easy it is to make a DNS server (an SMF service) highly available. We encapsulate the DNS server SMF service in an SMF proxy resource, dns-rs. Note that SC 3.1u4 already provides an HA-DNS agent, which is the recommended option for making a DNS server highly available on Sun Cluster; this is just an example.

1. Create a text file specifying the name of the SMF service and the location of its manifest, as shown in the example below and save it in any convenient location, say /tmp.

# cat /tmp/dns_svcs

2. Register the SUNW.Proxy_SMF_failover resource type.

# clrt register SUNW.Proxy_SMF_failover

3. Create a resource group dns-rg, specifying the list of cluster nodes on which the service can run; in this example, plift1 and plift2.

# clrg create -n plift1,plift2 dns-rg

4. Add a resource dns-rs to manage the SMF service mentioned above. Specify the path and name of the text file created in step 1 as the value for the extension property Proxied_service_instances.

# clrs create -g dns-rg -t Proxy_SMF_failover -x Proxied_service_instances=/tmp/dns_svcs dns-rs

5. Bring the resource group to a managed state and bring it online.

# clrg online -M dns-rg

The service is now up and running, with all the supervision and failover capability provided by Sun Cluster. If a failure occurs, the resource is automatically restarted, or the resource group is switched over to a different node.

6. Verify the status of the resource group and the SMF proxy resource.

# clrg status

=== Cluster Resource Groups ===

Group Name      Node Name    Suspended    Status
----------      ---------    ---------    ------
dns-rg          plift1       No           Online
                plift2       No           Offline

# clrs status

=== Cluster Resources ===

Resource Name    Node Name    State                       Status Message
-------------    ---------    -----                       --------------
dns-rs           plift1       Online but not monitored    Online
                 plift2       Offline                     Offline

7. Verify that the DNS server SMF service is online on plift1 and offline on plift2.

Log on to plift1 and run the commands below to verify.

# svcs -a | grep dns
online 11:38:14 svc:/network/dns/server:default

# svcs -l svc:/network/dns/server:default
fmri svc:/network/dns/server:default
name BIND DNS server
enabled true
state online
next_state none
state_time Wed Nov 15 11:38:14 2006
restarter svc:/system/cluster/sc_restarter:default
dependency require_all/none file://localhost/etc/named.conf (online)
dependency require_all/none svc:/system/filesystem/minimal (online)
dependency require_any/error svc:/network/loopback (online)
dependency optional_all/error svc:/milestone/network (online)

("named" is a DNS-specific daemon that is started by the DNS service)

# ps -efj | grep name
root 7773 1 7773 7773 0 11:38:15 ? 0:00 /usr/sbin/named
root 7785 7598 7784 7594 0 11:40:06 pts/4 0:00 grep named

Log on to plift2, repeat the above commands, and verify that the DNS SMF service is offline.
# svcs -a | grep dns
offline 11:38:09 svc:/network/dns/server:default

# svcs -l svc:/network/dns/server:default
fmri svc:/network/dns/server:default
name BIND DNS server
enabled true
state offline
next_state none
state_time Wed Nov 15 11:38:09 2006
restarter svc:/system/cluster/sc_restarter:default
dependency require_all/none file://localhost/etc/named.conf (online)
dependency require_all/none svc:/system/filesystem/minimal (online)
dependency require_any/error svc:/network/loopback (online)
dependency optional_all/error svc:/milestone/network (online)

# ps -efj | grep name
root 15318 15287 15317 15283 0 11:40:11 pts/1 0:00 grep name

8. You can switch over the resource group dns-rg from node plift1 to plift2, whereby it goes offline on plift1 and comes online on plift2.

# clrg switch -n plift2 dns-rg

Check the status of the resource group.

# clrg status

=== Cluster Resource Groups ===

Group Name      Node Name    Suspended    Status
----------      ---------    ---------    ------
dns-rg          plift1       No           Offline
                plift2       No           Online

# clrs status

=== Cluster Resources ===

Resource Name    Node Name    State                       Status Message
-------------    ---------    -----                       --------------
dns-rs           plift1       Offline                     Offline
                 plift2       Online but not monitored    Online

Verify that the SMF service has been stopped on plift1 and is running on plift2.

On plift1.

# svcs -a | grep dns
disabled Nov_07 svc:/network/dns/client:default
offline 11:42:22 svc:/network/dns/server:default

# ps -efj | grep name
root 7808 7598 7807 7594 0 11:43:47 pts/4 0:00 grep name

On plift2.

# svcs -a | grep dns
disabled Nov_04 svc:/network/dns/client:default
online 11:42:22 svc:/network/dns/server:default

# ps -efj | grep name
root 15331 15287 15330 15283 0 11:43:52 pts/1 0:00 grep name
root 15324 1 15324 15324 0 11:42:23 ? 0:00 /usr/sbin/named

In case you want to decouple the SMF service from Sun Cluster control, all you need to do is disable the SMF proxy resource and delete it.
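Using the example resource from this post, that decoupling step amounts to just two commands:

```
# clrs disable dns-rs
# clrs delete dns-rs
```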

Now that is as cool as it can get. So go ahead and try it out.

Harish Mallya
Sun Cluster Engineering.


Oracle Solaris Cluster Engineering Blog

