Sun Cluster 3.2 is now available!
By relling on Dec 28, 2006
Sun Cluster version 3.2 has arrived just in time for the holidays. There are a number of features in this release that we have been waiting for. Here is my list of favorites, in no particular order:
- ZFS support - ZFS is now supported as a fail-over file system (HAStoragePlus). This is an important milestone for the use of ZFS in mission critical systems.
- Sun Cluster Quorum Server - this is another feature I've wanted for many years. Prior to Sun Cluster 3.2, we used a shared storage device as a voting member for the quorum algorithm. The quorum algorithm ensures that only one cluster is operating with the storage at any time. For a two-node cluster, the shared storage was the tie-breaking vote. For three or more nodes, you can configure the nodes themselves to break the tie with or without storage votes. Over the years we've been stung many times by the implementations of SCSI reservations in storage devices. There are many uhmm... inconsistent implementations of SCSI reservations out there and if they didn't quite work as expected, it would cause a problem with the quorum voting. This is one reason why qualifying shared storage devices for Sun Cluster is so time consuming and frustrating. Now, with the quorum server, you can select some other server on the network to handle the quorum voting tie breakers. This provides additional implementation flexibility and can help eliminate the vagaries of array or disk software/firmware implementations of SCSI reservations.
- Fencing protocol flexibility - use SCSI-2 or SCSI-3 reservations. Prior to Sun Cluster 3.2, the default behavior was to use SCSI-2 (non-persistent) reservations for two-node clusters and SCSI-3 (persistent) reservations for more-than-two-node clusters. This was another source of frustration (as above). Being able to choose the protocol means one less compatibility speed bump to overcome in cluster designs, and SCSI-2 reservations are going away anyway.
- Disk-path failure handling - a node can be configured to reboot if all of the paths to the shared storage have failed. In a correctly configured cluster, losing every path takes more than one failure, so this is a policy for a multiple-failure scenario. It is also a commonly requested multiple-failure "test," so it should create more smiles than frowns.
- More Solaris Zones support - a number of data services have been modified to work in zones. We introduced basic zone failover support in Sun Cluster 3.1. Now we have over a dozen data services which are ready to work inside zones. This allows you to compartmentalize services to a finer grain than ever before, while still providing highly available services. For example, you might run the Sun Java System Application Server in one zone on one node and the PostgreSQL, MySQL, or Oracle database in another zone on another node. If one node fails, then the zone moves to the surviving node. Since the services are in zones, you can manage the resource policies at the zone level. Very cool.
- Better integration with Oracle RAC 10g - continued improvements in integrating RAC into a fully functional, multi-tier, highly available platform. We are often asked why use Sun Cluster with RAC when "RAC doesn't need Sun Cluster." One answer is that most people deploy RAC as part of a larger system and Sun Cluster can manage all of the parts of the system in concert (Sun Cluster as the conductor and the various services as performers.)
- More flexibility in IP addresses - no more class B networks for private interconnects. Believe it or not, some of our customers get charged for each possible IP address that a server may directly address! No kidding! Prior to Sun Cluster 3.2, we reserved a private class B IP address range for each interconnect (up to six). This was an old design decision made when the vision was that there would be thousands of nodes in a Sun Cluster and we'd better be prepared to handle that case. In reality, closely coupled clusters with thousands of nodes aren't very practical. So we've changed this to allow more flexibility and smaller address space. Note: these weren't routable IP address ranges anyway, but that argument didn't always make it past the network police.
- Improved upgrade procedures - now dual-partition and Live Upgrade procedures are supported. This eliminates a long standing requirements gotcha: Sun Cluster "supports" Solaris; Solaris has Live Upgrade; Sun Cluster didn't "support" Live Upgrade; huh?
- Improved installation and administration - many changes here which will make life easier for system administrators.
- Improved performance and reliability - faster, better failure detection and decision making and many more hours of stress testing have improved the Sun Cluster foundation and agents. Much of this work occurs back in the labs, largely unseen by the masses. We make detailed studies of how the services work under failure conditions and use those measurements to drive product improvement and test for regressions. This is part of our high quality process built into the DNA of Sun Cluster engineering.
Whew! And this is just my favorite list! I encourage you to check out the Sun Cluster 3.2 docs, especially the Release Notes. And, of course, you can download and try Sun Cluster 3.2 for yourself (and yes, it does work on AMD/Intel platforms, including laptops!)
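For readers who like to see the arithmetic, here is a minimal Python sketch of the majority-vote idea behind the quorum bullet above. It is a simplified model, not Sun Cluster code: the function names are made up for illustration, and it assumes one vote per node plus a single tie-breaker vote, which could come from a shared disk or from a quorum server.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest vote count that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def partition_has_quorum(node_votes: int, tiebreaker_votes: int,
                         total_votes: int) -> bool:
    """True if a partition holds a strict majority of the configured votes.

    node_votes       -- votes from the nodes inside this partition
    tiebreaker_votes -- votes this partition won from the tie-breaker
                        (hypothetical stand-in for a quorum disk or server)
    total_votes      -- all votes configured in the cluster
    """
    return node_votes + tiebreaker_votes >= votes_needed(total_votes)


# Assumed two-node layout: one vote per node plus one tie-breaker vote.
TOTAL_VOTES = 3

# The interconnect fails and each node ends up in its own partition.
# Only one of them can claim the tie-breaker vote.
winner_survives = partition_has_quorum(1, 1, TOTAL_VOTES)
loser_survives = partition_has_quorum(1, 0, TOTAL_VOTES)

print(winner_survives, loser_survives)
```

With three votes in play, a partition needs two to continue, so the node that claims the tie-breaker keeps running and the other one must stop. That is why the tie-breaker has to behave predictably, whatever device or server provides it.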