Friday Oct 16, 2009

New White Paper: Practicing Solaris Cluster using VirtualBox

For developers it is often convenient to have all tools necessary for their work in one place, ideally on a laptop for maximum mobility.

For system administrators, it is often critical to have a test system on which to try out things and learn about new features. Of course the system needs to be low cost and transportable to anywhere they need to be.

HA Clusters are often perceived as complex to setup and resource hungry in terms of hardware requirements.

This white paper explains how to set up a single x86 based system (like a laptop) running OpenSolaris as a training and development environment for Solaris 10 / Solaris Cluster 3.2, using VirtualBox to build a two node cluster. The configuration can then be used to practice various technologies:

OpenSolaris technologies are used on the host system to support the VirtualBox guests running Solaris 10 / Solaris Cluster 3.2: Crossbow (to create virtual network adapters), COMSTAR (to export iSCSI targets from the host, which the Solaris Cluster nodes use as iSCSI initiators for shared storage and the quorum device), ZFS (to export ZFS volumes as iSCSI targets and to provide a failover file system within the cluster) and IPsec (to secure the cluster private interconnect traffic).
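
As a rough sketch of the host-side preparation (the etherstub, vnic and ZFS volume names below are illustrative, not taken from the paper), a Crossbow virtual NIC and a COMSTAR iSCSI target could be created like this:

# dladm create-etherstub etherstub0
# dladm create-vnic -l etherstub0 vnic1
# zfs create -V 512m rpool/iscsi/quorum
# sbdadm create-lu /dev/zvol/rdsk/rpool/iscsi/quorum
# stmfadm add-view <GUID printed by sbdadm>
# itadm create-target

The white paper itself contains the exact commands and parameters for the complete setup.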

Solaris Cluster technologies like software quorum and zone clusters are used to set up HA MySQL and HA Tomcat as failover services running in one virtual cluster. A second virtual cluster is used to show how to set up Apache as a scalable service.

The instructions can be used as a step-by-step guide for any x86 64-bit based system that is capable of running OpenSolaris. A CPU that supports hardware virtualization is recommended, as well as at least 3GB of main memory. To find out if your system works, simply boot the OpenSolaris live CD-ROM and confirm with the Device Driver Utility (DDU) that all required components are supported; the hardware compatibility list can be found online. The role model for such a system is the Toshiba Tecra M10 with 4GB main memory.

If you ever missed an opportunity to just try out Solaris 10 and Solaris Cluster 3.2 and explore new features - this is your chance :-)

Monday Oct 05, 2009

Security Aspects of High Availability Clusters

Recently I got the opportunity to contribute an article to the newsletter of the IT security trade fair it-sa.

The article gives an overview of methods for security hardening and minimization of HA systems based on Solaris and Solaris Cluster.

Friday Mar 06, 2009

Zone Clusters Blueprint available - How to deploy virtual clusters and why

If you have read the introductory blog about the new Zone Clusters feature, which was released with Solaris Cluster 3.2 01/09, then you might also be interested to know that a new blueprint has been published: Zone Clusters - How to deploy virtual clusters and why, again written by the tech lead himself: Dr. Ellard Roush.

Note that you need to login with your Sun Online Account.

The following topics are covered:

  • Some background on virtualization technologies
  • High level overview of Zone Cluster
  • Zone Cluster customer use cases
  • Zone Cluster design overview
  • Creation of an example Zone Cluster
  • Detailed examples for the many clzonecluster commands
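
To give a feeling for the commands covered in the blueprint, here is a minimal, hypothetical zone cluster configuration - the zone cluster name, zonepath, hostname and address are just examples:

# clzonecluster configure zc1
clzc:zc1> create
clzc:zc1> set zonepath=/zones/zc1
clzc:zc1> add node
clzc:zc1:node> set physical-host=phys-node1
clzc:zc1:node> set hostname=zc1-node1
clzc:zc1:node> add net
clzc:zc1:node:net> set address=192.168.10.11
clzc:zc1:node:net> set physical=e1000g0
clzc:zc1:node:net> end
clzc:zc1:node> end
clzc:zc1> commit
clzc:zc1> exit
# clzonecluster install zc1
# clzonecluster boot zc1

The blueprint walks through these steps and the remaining clzonecluster subcommands in detail.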

Wednesday Jan 28, 2009

Unsung features of the Solaris Cluster 3.2 1/09 (U2) release

By now I am sure you have seen that Solaris Cluster 3.2 1/09 got released yesterday. The set of new features is impressive; the Solaris Containers Cluster for Oracle RAC clearly stands out. Read the Release Notes and the new set of product documentation for more details.

But I also want to mention the additional agent qualifications that are part of this new update release:

  • Informix Dynamic Server (IDS) version 9.4, 10, 11 and 11.5 on Solaris 10 (SPARC and x64) - a new agent that was developed within a project of the HA Clusters Community Group for Open HA Cluster
  • PostgreSQL agent support for PostgreSQL WAL shipping - again developed within the corresponding project for Open HA Cluster
  • SAP version 7.1
  • MaxDB version 7.7
  • WebLogic Server version 9.2, 10.0 and 10.2 in Solaris Container
  • SwiftAlliance Access version 6.2
  • SwiftAlliance Gateway version 6.1
  • Sun Java System Message Queue version 4.1 and 4.2
  • Sun Java System Application Server version 9.1UR2, Glassfish V2 UR2
  • Sun Java System Web Server version 7.0u4
  • MySQL version 5.1 - have also a look at the corresponding project page for Open HA Cluster
  • Apache Proxy Server version 2.2.5 and versions bundled with Solaris 10 10/08 and 5/08
  • Apache Web Server version 2.2.5 and versions bundled with Solaris 10 10/08 and 5/08
  • Agfa IMPAX version 6.3
  • IBM Websphere MQ version 7
  • Solaris 9 Container support within the HA Containers agent - again developed as part of the corresponding project for Open HA Cluster

Clearly anyone deploying Solaris Cluster wants to achieve high availability for an application, or a set of applications. Therefore I believe we can be proud of our rich data services portfolio for various standard applications!

Monday Jan 12, 2009

White paper on implementing Oracle 11g RAC with Solaris Cluster Geographic Edition and EMC

EMC2 published a white paper about "Implementing Oracle 11g RAC on Sun Solaris Cluster Geographic Edition Integrated with EMC SRDF". The abstract reads:

 "This white paper  documents a proof of concept for running Oracle 11g RAC in a Sun Solaris Cluster Geographic Edition (SCGE) framework. The paper outlines the steps in configuring the test environment and also describes the system's functionality and corresponding administrative task."

It is a result of a collaboration between Sun and EMC2 at Oracle Open World 2008.

Monday Nov 24, 2008

Some Blueprints and Whitepapers for Solaris Cluster

Recently some Blueprints and Whitepapers have been made available which also cover Solaris Cluster within their various topics:

  1. Blueprint: Deploying MySQL Database in Solaris Cluster Environments for Increased High Availability
  2. Whitepaper: High Availability in the Datacenter with the Sun SPARC Enterprise Server Line
  3. Community-Submitted Article on BigAdmin: Installing HA Containers With ZFS Using the Solaris 10 5/08 OS and Solaris Cluster 3.2 Software

Hope you find them useful!

Thursday Oct 16, 2008

Business Continuity and Disaster Recovery Webcast

In the past months I have given various presentations about Open HA Cluster and Solaris Cluster. The emphasis has been on introducing the Solaris Cluster architecture and the fact that this product is now fully open source, describing the various possibilities to contribute and giving an overview of already existing projects.

Most talks started with a note that in order to achieve high availability for a given service, it is not enough to simply deploy a product like Solaris Cluster. The same is true if you are looking for business continuity and disaster recovery solutions. Besides the service stack in the backend, it is necessary not only to analyze the infrastructure end-to-end to identify and eliminate single points of failure (SPOFs), but also to have a close look at people (education), processes, policies and clearly defined service level agreements.

Thus I am happy to see a webcast hosted by Hal Stern about Business Continuity and Disaster Recovery, which gives a nice introduction to this holistic topic. More information can be found on a dedicated page about Sun Business Continuity & Disaster Recovery Services.

Start with a Plan Not a Disaster! :-)

Monday Jun 23, 2008

Solaris 8 and 9 Containers on Solaris Cluster

If you are still running applications on Solaris 8 using SPARC hardware, and maybe even using Sun Cluster 3.0, then you should get a plan ready to upgrade to more recent releases like Solaris 10 and Solaris Cluster 3.2 02/08.

As you might know, the last ship date for Solaris 8 was 02/16/07; the end of Phase 1 support is scheduled for 3/31/09.

Sun Cluster 3.0 is also reaching its end of life as announced within the Sun Cluster 3.2 Release Notes for Solaris OS.

In case you cannot immediately upgrade to a newer Solaris release, Sun recently announced the Solaris 8 Container, which introduces the solaris8 brand type for non-global zones on Solaris 10. The packages can be freely downloaded for evaluation; a subscription is required for the right to use (RTU) and support.

While the solaris8 brand type does NOT extend the support life for Solaris 8, it allows a phased approach for migrating to Solaris 10 and leveraging new hardware platforms while the application still runs within a Solaris 8 runtime environment.
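
As a sketch, creating a solaris8 branded zone from a flash archive of the original system could look like this (the zone name, zonepath and archive path are examples, not fixed values):

# zonecfg -z s8-zone
zonecfg:s8-zone> create -t SUNWsolaris8
zonecfg:s8-zone> set zonepath=/zones/s8-zone
zonecfg:s8-zone> commit
zonecfg:s8-zone> exit
# zoneadm -z s8-zone install -p -a /export/archives/s8-system.flar
# zoneadm -z s8-zone boot

Consult the Solaris 8 Container product documentation for the authoritative procedure.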

The Sun Cluster Data Service for Solaris Containers does support the solaris8 brand type for Sun Cluster 3.1 08/05 with Patch 120590-06 and for Solaris Cluster 3.2 with Patch 126020-02 and newer.

Before going through the physical-to-virtual (p2v) migration, the existing Sun Cluster 3.0 configuration and packages need to be removed. See the Sun Cluster 3.0 System Administration Guide for more details on how to achieve that. This also means that there is no cluster framework running within the solaris8 brand type zone. Therefore existing standard agents cannot be used. However, the sczsh component of the HA Container agent can be used to manage an application running within that solaris8 branded zone.

Of course any migration should be carefully planned.

The same approach works for the recently announced Solaris 9 Containers: Patch 126020-03 introduces support for the solaris9 brand type for the HA Container agent on Solaris Cluster 3.2.

Wednesday Apr 02, 2008

Detailed Deployment and Failover Study of HA MySQL on a Solaris Cluster

Krish Shankar from ISV engineering published a very nice and detailed blog illustrating the deployment process of MySQL on a Solaris Cluster configuration. It also focuses on regression and failover testing of HA MySQL, and explains in detail the tests that were performed. Solaris 10 fully supports MySQL, as does the HA cluster application agent for MySQL on Solaris Cluster.

Monday Jan 21, 2008

BigAdmin article on how to implement IBM DB2 on Solaris Cluster 3.2

My colleague Neil Garthwaite from Availability Engineering and Cherry Shu from ISV Engineering wrote an article on BigAdmin about implementing IBM DB2 UDB V9 HA in a Solaris Cluster 3.2 environment.

This paper provides step-by-step instructions on how to install, create, and enable DB2 Universal Database (UDB) V9 for high availability (HA) in a two-node Solaris Cluster 3.2 environment. The article demonstrates how to use ZFS as a failover file system for a DB2 instance and how to implement DB2 failover across Solaris Containers in the Solaris 10 Operating System.
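
The ZFS failover part of such a setup boils down to putting the DB2 instance's zpool under the control of a SUNW.HAStoragePlus resource. A minimal sketch (the resource group, resource and pool names are invented for illustration):

# clresourcetype register SUNW.HAStoragePlus
# clresourcegroup create db2-rg
# clresource create -g db2-rg -t SUNW.HAStoragePlus -p Zpools=db2pool db2-hasp-rs
# clresourcegroup online -M db2-rg

On failover, the zpool is exported on one node and imported on the other; the article describes the full procedure including the DB2 resources themselves.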

Friday Nov 09, 2007

SWIFTAlliance Access and SWIFTAlliance Gateway 6.0 support on SC 3.2 / S10 in global and non-global zones

The Solaris 10 packages for the Sun Cluster 3.2 Data Service for SWIFTAlliance Access and the Sun Cluster 3.2 Data Service for SWIFTAlliance Gateway are available from the Sun Download page. They introduce support for SWIFTAlliance Access and SWIFTAlliance Gateway 6.0 on Sun Cluster 3.2 with Solaris 10 11/06 or newer. It is now possible to configure the data services in resource groups that can fail over between the global zones of the nodes or between non-global zones. For more information consult the updated documentation, which is part of the PDF file in the downloadable tar archive.

The data services were tested and verified in a joint effort between SWIFT and Sun Microsystems at the SWIFT labs in Belgium. Many thanks to the SWIFT engineering team and our Sun colleagues in Belgium for the ongoing help and support!

For completeness, here is the support matrix for SWIFTAlliance Access and SWIFTAlliance Gateway with Sun Cluster 3.1 and 3.2 software:

Failover Services for Sun Cluster 3.1 SPARC

Application            Application Version   Solaris                  SC version   Comments
-----------            -------------------   -------                  ----------   --------
SWIFTAlliance Access   5.0                   8, 9, 10 11/06 or newer  3.1 SPARC    Requires Patch 118050-05 or newer
SWIFTAlliance Gateway  5.0                   10 11/06 or newer        3.1 SPARC    Requires Patch 118984-04 or newer

Failover Services for Sun Cluster 3.2 SPARC

Application            Application Version   Solaris                  SC version   Comments
-----------            -------------------   -------                  ----------   --------
SWIFTAlliance Access   5.9                   10 11/06 or newer        3.2 SPARC    Requires Patch 126085-01 or newer; Package available for download
SWIFTAlliance Gateway  5.0                   10 11/06 or newer        3.2 SPARC    Package available for download

If you want to study the data services source code, you can find it online for SWIFTAlliance Access and SWIFTAlliance Gateway on the community page for Open High Availability Cluster.

Thursday Oct 18, 2007

Oracle certifies Sun Cluster 3.2 for RAC 10gR2 64 & 32-bit on x86-64

Finally, Oracle has now also certified RAC 10gR2 64 & 32-bit for Sun Cluster 3.2 running on the Solaris 10 x86/x64 platform. You can verify this if you have a Metalink account: in the "Certify" column, click on the section "View Certifications by Platform", select "Solaris Operating System x86-x64" and then select "Real Application Clusters".

Together with the certification mentioned in my previous blog on the SPARC platform, you can fully leverage the recently documented "Maximum Availability Architecture (MAA) from Sun and Oracle" solution. Details are available within the published white paper and presentation.

The BigAdmin document "Installation Guide for Solaris Cluster 3.2 Software and Oracle 10g Release 2 Real Application Clusters" describes a detailed, step-by-step guide for installing the Solaris 10 11/06 Operating System, Solaris Cluster (formerly Sun Cluster) 3.2 software, the QFS 4.5 cluster file system, and Oracle 10g Release 2 Real Application Clusters (Oracle 10gR2 RAC). It also provides detailed instructions on how to configure QFS and Solaris Volume Manager so they can be used with Oracle 10gR2 RAC. Those instructions are valid for SPARC and x86-x64.

Saturday Jul 14, 2007

"Secure by default" and Sun Cluster 3.2

If you chose the "Secure by default" option when installing Solaris 10 11/06 (which is equivalent to running "netservices limited" later on), then you need to perform the following steps prior to installing Sun Cluster 3.2:

  1. Ensure that the local_only property of rpcbind is set to false:
    # svcprop network/rpc/bind:default | grep local_only

    If local_only is not set to false, run:

    # svccfg
    svc:> select network/rpc/bind
    svc:/network/rpc/bind> setprop config/local_only=false
    svc:/network/rpc/bind> quit
    # svcadm refresh network/rpc/bind:default

     It is needed for cluster communication between nodes.

  2. Ensure that the tcp_listen property of webconsole is set to true:
    # svcprop /system/webconsole:console | grep tcp_listen

    If tcp_listen is not true, run:

    # svccfg
    svc:> select system/webconsole
    svc:/system/webconsole> setprop options/tcp_listen=true
    svc:/system/webconsole> quit
    # svcadm refresh svc:/system/webconsole:console
    # /usr/sbin/smcwebserver restart

    It is needed for Sun Cluster Manager communication.

    To verify that the port is listening on *.6789 you can execute
    # netstat -a | grep 6789
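
For completeness, the same two changes can also be applied non-interactively with single svccfg commands, which is handy for scripting:

# svccfg -s network/rpc/bind setprop config/local_only=false
# svcadm refresh network/rpc/bind:default
# svccfg -s system/webconsole setprop options/tcp_listen=true
# svcadm refresh svc:/system/webconsole:console
# /usr/sbin/smcwebserver restart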

Wednesday Jul 11, 2007

Oracle certifies Sun Cluster 3.2 for RAC 9i/10g on S9/S10 SPARC

Finally, Oracle has officially certified RAC 9.2/10gR1/10gR2 64-bit on Solaris 9 and Solaris 10 SPARC running with Sun Cluster 3.2. You can verify this if you have a Metalink account: in the "Certify" column, search in the section "Product Version and Other Selections: RAC for Unix On Solaris Operating System (SPARC)".

An Installation Guide for Solaris Cluster 3.2 Software and Oracle 10g Release 2 Real Application Clusters has also been published on the BigAdmin System Administration Portal.

It is a detailed, step-by-step guide for installing the Solaris 10 11/06 Operating System, Sun Cluster 3.2 software, the QFS 4.5 cluster file system, and Oracle 10g Release 2 Real Application Clusters (Oracle 10gR2 RAC). It also provides detailed instructions on how to configure QFS and Solaris Volume Manager so they can be used with Oracle 10gR2 RAC.

Last but not least, I want to reference the white paper Making Oracle Database 10G R2 RAC Even More Unbreakable, which explains many reasons why the combination of Sun Cluster and Oracle RAC is very beneficial.

Friday Jun 01, 2007

Removing and unregistering a diskset from Sun Cluster

Today I realized that the procedure "How to Remove and Unregister a Device Group (Solaris Volume Manager)" lacks a specific example.

Let's assume the following diskset configuration on a Sun Cluster with two nodes named pplanet1 and pplanet2:

# cat /etc/lvm/
test_ds/d1 -m test_ds/d10
test_ds/d10 1 1 /dev/did/rdsk/d4s0

# metaset -s test_ds -a -h pplanet1 pplanet2
# metaset -s test_ds -a /dev/did/rdsk/d4
# metaset -s test_ds -a -m pplanet1 pplanet2
# metainit test_ds/d10
test_ds/d10: Concat/Stripe is setup
# metainit test_ds/d1
test_ds/d1: Mirror is setup

# cldg show

=== Device Groups ===                         

Device Group Name:                              test_ds
  Type:                                            SVM
  failback:                                        false
  Node List:                                       pplanet1, pplanet2
  preferenced:                                     true
  numsecondaries:                                  1
  diskset name:                                    test_ds

# cldg status

=== Cluster Device Groups ===

--- Device Group Status ---

Device Group Name     Primary      Secondary    Status
-----------------     -------      ---------    ------
test_ds               pplanet1     pplanet2     Online

# metaset -s test_ds

Set name = test_ds, Set number = 2

Host                Owner
  pplanet1           Yes

Mediator Host(s)    Aliases

Driv Dbase

d4   Yes 

And now assume you want to remove and unregister this diskset again. Generally speaking, prior to performing this you want to make sure that

  • no file system is mounted on any node from this diskset
  • no entry on any node for this diskset is active in /etc/vfstab
  • no SUNW.HAStoragePlus resource is using this diskset or a file system from this diskset
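
These pre-checks can be done with a few commands like the following (the resource type filter assumes SUNW.HAStoragePlus resources are the only consumers of the diskset):

# mount -p | grep test_ds
# grep test_ds /etc/vfstab
# clresource show -t SUNW.HAStoragePlus

Run the first two commands on every node.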

Find out on which node the diskset is primary/online:

# cldg status

=== Cluster Device Groups ===

--- Device Group Status ---

Device Group Name     Primary      Secondary    Status
-----------------     -------      ---------    ------
test_ds               pplanet1      pplanet2     Online

Perform all following on the node where the diskset is primary/online (here: pplanet1):

Remove all metadevices on that diskset:

# metaclear -s test_ds -a
test_ds/d1: Mirror is cleared
test_ds/d10: Concat/Stripe is cleared

Remove all devices from that diskset (you need the -f option for the last one):

# metaset -s test_ds -d -f /dev/did/rdsk/d4

On a two node cluster, if mediators are configured, remove them:

# metaset -s test_ds -d -m pplanet1 pplanet2 

Then remove the hosts from the diskset, leaving the node where the diskset is primary until last:

# metaset -s test_ds -d -h pplanet2
# metaset -s test_ds -d -h pplanet1

In /var/adm/messages you see the following after the last command:

Jun  2 02:21:33 pplanet1 Cluster.Framework: [ID 801593 daemon.notice] stdout: no longer primary for test_ds

And you can confirm that the diskset is now removed and unregistered:

# cldg list


This Blog is about my work at Availability Engineering: Wine, Cluster and Song :-) The views expressed on this blog are my own and do not necessarily reflect the views of Sun and/or Oracle.

