Sunday Dec 31, 2006

Sun Cluster 3.2

SC 3.2 has just been released; one can get all the documents at the links below.

What's new is in the following link

The download is in the following link

It supports Solaris 9 09/05 on SPARC, and Solaris 10 11/06 on SPARC and x86.

The main features:

  • HA-ZFS
  • New object-oriented CLI
  • Support for many standard agents in zones
  • Quorum Server
  • Oracle 10g RAC framework support

Wednesday Jul 12, 2006

HA in Sun env

Comments on the various ways Sun supports HA/DR.

Sunday Dec 05, 2004

Active-active Java ES messaging server in a cluster environment

In setting up a highly available Java ES messaging server environment, one would like to have an active-active configuration.

This setup is fully supported in Java ES 2004Q2.

One can set up a 1+1 or N+1 environment.

For this to work, one will set up multiple message stores and will need MMP to mask the multiple stores from the clients.

One will need to set up HAStoragePlus (FFS), multiple logical hostnames (LH), and resource groups.

It is a best practice to configure the instances on a shared failover file system (FFS).
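As a rough sketch of the cluster side of such a setup (node names, group and resource names, and mount points below are invented for illustration, not taken from the Java ES docs), the per-store resource groups might be created with SC 3.x commands along these lines:

```shell
# Sketch only: node names, group/resource names, and mount points are
# invented for illustration.
# One failover resource group per message store instance.
scrgadm -a -g store-rg-1 -h node1,node2
scrgadm -a -g store-rg-2 -h node2,node1

# One logical hostname per store; clients only see the MMP, which
# fans out to the stores behind these names.
scrgadm -a -L -g store-rg-1 -l store-lh-1
scrgadm -a -L -g store-rg-2 -l store-lh-2

# HAStoragePlus resources for the shared failover file systems that
# hold each store's instance directory.
scrgadm -a -t SUNW.HAStoragePlus
scrgadm -a -j store-hasp-1 -g store-rg-1 -t SUNW.HAStoragePlus \
    -x FilesystemMountPoints=/shared/store1
scrgadm -a -j store-hasp-2 -g store-rg-2 -t SUNW.HAStoragePlus \
    -x FilesystemMountPoints=/shared/store2

# The messaging server resources themselves are then added to each
# group, and the groups brought online on their primary nodes:
scswitch -Z -g store-rg-1
scswitch -Z -g store-rg-2
```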

Messaging Server: Planning for service availability

High Availability Using Sun Cluster Software

Sunday Sep 26, 2004


The ARCO feature in N1GE6 requires a database to store its data. Right now one can use PostgreSQL or Oracle. If one wants to protect the database server in a Solaris environment, one can use the HA-Oracle agent to protect an Oracle database.

For a PostgreSQL database, one can use the Java ES Cluster software's SunPlex Agent Builder to build a failover agent, based on GDS, to protect it.

We assume the uid for PostgreSQL is postgres, the data is stored in /shared/pgsql/data, and the binaries are installed in /usr/local/pgsql.

We create two scripts, one to start and one to stop the database:


    su - postgres -c "/usr/local/pgsql/bin/pg_ctl start -D /shared/pgsql/data -l /shared/pgsql/logfile"


    su - postgres -c "/usr/local/pgsql/bin/pg_ctl stop -D /shared/pgsql/data"

Please check our weblog for the other steps to create the HA-pgsql agent.
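Besides start and stop, a GDS-based agent can also take a probe command. As a sketch (the agent builder works without one), a minimal probe could rely on pg_ctl status, which exits nonzero when the server is not running:

```shell
#!/bin/sh
# Hypothetical probe script for the HA-pgsql agent (optional for GDS).
# pg_ctl status exits 0 when the server is running in the given data
# directory, and nonzero otherwise, which is exactly what a probe needs.
su - postgres -c "/usr/local/pgsql/bin/pg_ctl status -D /shared/pgsql/data" \
    > /dev/null 2>&1
```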

Sunday Sep 12, 2004


Degreework from SunGard is a DSS (decision-support system) that allows students to do what-if analysis on the courses related to their future degree.

In a Solaris/SPARC environment, it can be implemented as a three-tier setup.

  • web tier: a farm of many web servers behind a load balancer (LB)
  • app tier: the application server, plus a server for fat PC clients
  • database back end
To protect the system, and for scalability, one can add the Java Enterprise Cluster software:
  • HA-Oracle to protect the back end
  • the web front ends are protected by the LB
  • for the app server and the PC-client server, one can use the SunPlex Agent Builder with
    1. webstart, webstop and webshow for the web app
    2. dapstart, dapstop and dapshow for the PC apps

HA-ALEPH of Ex Libris library system

ALEPH of Ex Libris Library system

The ALEPH 500 system can be implemented in a three-tier environment:

  • web tier: a few servers running the Apache web server behind an LB
  • application tier: servers that run the main application server, plus fat-PC-client servers
  • database server, e.g. Oracle 9i
In a Solaris/SPARC server environment, one can also add the Java Enterprise Cluster software to make Ex Libris' ALEPH 500 library system highly available:
  • the DB server can be protected by the HA-Oracle agent
  • for the app server, one can use the SunPlex Agent Builder to build an HA agent for the AS using the scripts:
    1. /usr/local/bin/start_aleph
    2. /usr/local/bin/stop_aleph
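As a sketch, building that HA agent with the agent builder's CLI might look like this (the vendor id EXL and the /tmp working directory are our own choices, not mandated by ALEPH or the cluster software):

```shell
# Sketch: vendor id EXL and the /tmp working directory are arbitrary.
# Generate a ksh-based resource type named EXL.aleph:
scdscreate -V EXL -T aleph -d /tmp -k
# Wire in the start and stop scripts:
scdsconfig -s /usr/local/bin/start_aleph \
    -t /usr/local/bin/stop_aleph -d /tmp
# The generated package under /tmp/EXLaleph/pkg is then installed on
# each node with pkgadd.
```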

Friday Sep 03, 2004

VxVM 4.0 new features

Finally, in VxVM 4.0 rootdg is no longer required. This makes it much more natural to use SVM to mirror the root disk and VxVM to control the other disks.

Nowadays most storage is protected by hardware RAID, so the question is: when will you still need a volume manager?

  • mirroring root disks: right now most root disks are not yet protected by HW RAID; even though SAN boot is supported, the SAN Foundation software is not part of the Solaris software, so the task is not very straightforward
  • the need to control the shared storage (diskgroup/diskset, deport/import, take/release) in a failover or RAC environment
  • the need to mirror disks between arrays
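For the first bullet, the classic SVM root-mirror procedure looks roughly like this (disk, slice, and metadevice names are examples only; check your own layout before running anything):

```shell
# Sketch: c0t0d0 is the current root disk, c0t1d0 the mirror; slice 7
# holds the state database replicas. Adjust names for your system.
metadb -a -f -c 2 c0t0d0s7 c0t1d0s7   # create state database replicas

metainit -f d10 1 1 c0t0d0s0          # submirror on the mounted root slice
metainit d20 1 1 c0t1d0s0             # submirror on the second disk
metainit d0 -m d10                    # one-sided mirror over current root
metaroot d0                           # update /etc/vfstab and /etc/system
lockfs -fa                            # flush logs before rebooting

# After a reboot onto d0, attach the second side:
# metattach d0 d20
```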

Thursday Jun 10, 2004

The tale of an HA-Oracle installation problem

The Oracle uid and group id with the Sun Cluster HA-Oracle agent

Recently we encountered an issue at a customer site. They were using Sun servers and storage to run Oracle EBS. After the cluster software and HA-Oracle were installed,


the HA-Oracle probe had problems talking to the Oracle server, so it would just shut Oracle down. Our HA-Oracle did not work.

Since we know that HA-Oracle has been installed on many, many systems, we believed there must be some setup problem with the customer's Oracle installation.

The customer had used some cloning software to clone the whole installation from a Solaris 8 machine to a Solaris 9 machine.

The customer kept telling us this procedure is fully supported by Oracle. But our knowledge of Oracle is limited, so one cannot argue without real knowledge.

We had the back-end HA support engineers and the VOS support engineers engage with Oracle to try to resolve the problem. Since the source code of the Oracle probe is not available to me, to Oracle, or to the VOS engineers, it was not easy to find out where the problem lay.


The customer used many different Oracle uids and dba group ids for the different Oracle instances, besides oracle and dba:
  • oracle1/dba1 for instance one, with dba1 as the primary group
  • oracle2/dba2 for instance two, with dba2 as the primary group

Each uid and gid owned all of that instance's binaries and data files (from the ls -l output).

From the Oracle installation guide, we noticed that there needs to be one primary group, say dba, that owns all the binaries and datafiles. This means that in the ls -l output we should see dba as the group. After this change was made, things went back to normal and everything worked fine.



IMHO, one should just use one group id (dba) and as many different Oracle uids as you want.
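A quick way to spot this problem (the user and group names follow the example above; the Oracle path is a placeholder) is to compare the primary gid printed by id against the group shown by ls -l on the Oracle files:

```shell
# Show the uid and primary gid of an installation owner; for HA-Oracle
# the gid printed here should be the dba group that ls -l shows on the
# binaries and datafiles.
id oracle1
# expected form: uid=101(oracle1) gid=102(dba1)

# Illustrative fix (run as root): make dba the primary group and
# repair the group ownership of the Oracle files.
# usermod -g dba oracle1
# chgrp -R dba /path/to/oracle/files
```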
Building a custom agent using the SunPlex Agent Builder

One of the best features of the Sun Cluster 3.x software is the built-in SunPlex Agent Builder.

You can use this simple tool to build a failover agent or a scalable agent, either with the full GUI version (scdsbuilder) or with the CLI scdscreate and scdsconfig commands.

You can build an agent based on SUNWgds (Generic Data Service) or generate source code in C or ksh.

Please refer to the Sun Cluster Data Services Developer's Guide (817-4227) for full details.


We give an example that uses the CLI tools. Let's assume we have an application with the following scripts

  • start script: /usr/local/bin/startapps
  • stop script: /usr/local/bin/stopapps

and we would like to create a failover agent.


First we run the following command:

    scdscreate -V BIG -T apps -d /tmp -k

    ...... Creating the RTR file BIG.apps .... done.

Here we use vendor_id=BIG, resource_type_name=apps, working_directory=/tmp, and generate ksh source code (-k).

It is always a good idea to generate the ksh code; one can learn a lot just by examining it.


Next we run the following command:

    scdsconfig -s /usr/local/bin/startapps -t /usr/local/bin/stopapps -d /tmp

and we will see:

    The package for the apps service has been created in /tmp/BIGapps/pkg

You can tar up the BIGapps directory and it is ready to use.


The next step is to install BIGapps with the command pkgadd -d /tmp/BIGapps/pkg on one node, then copy the BIGapps directory to the second node and install it there as well.

Register the resource type

In the configuration process you will need to register the BIGapps resource type, and then you are on your way to configuring the failover agent for apps:

    scrgadm -a -t BIG.apps

That is all...
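The rest of the configuration follows the usual failover pattern. As a sketch (group, node, and resource names are invented; we assume the resource type was registered as BIG.apps):

```shell
# Sketch: names (apps-rg, apps-lh, apps-rs, node1, node2) are invented.
# Create a failover resource group spanning both nodes.
scrgadm -a -g apps-rg -h node1,node2
# Add a logical hostname for the application's clients.
scrgadm -a -L -g apps-rg -l apps-lh
# Add a resource of the newly registered type.
scrgadm -a -j apps-rs -g apps-rg -t BIG.apps
# Bring the group online.
scswitch -Z -g apps-rg
```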

Tuesday Jun 08, 2004

Manual failover procedure

Manual failover

For many business-critical applications, we are used to relying on the vendor's cluster software to meet the HA requirement.

But in some cases, a user may just want to set up a manual failover procedure, so that if one node needs to be repaired, one can manually switch the application to the second node.


This note describes the main requirements for a manual failover in a two-node situation.
  • The storage needs to be shared, but it will be controlled by one node at a time.
  • The simplest way to do this is with SVM (Solaris Volume Manager) in Solaris 9, or with Veritas VM (VxVM). Please refer to the hardware setup documents in the cluster software for how to set up SCSI or FC storage.

    You will need to set up a diskset in SVM or a diskgroup in VxVM, and your script will contain the procedure to release (deport) or take (import) the diskset (diskgroup) and mount the file systems.

  • Your application will need to use the logical hostname (LH), not the physical node name.
  • It is always a good idea to use the LH; this way you can always relocate the application to a different physical host.

How to set up the LH?

You use a logical interface on your NIC: e.g., if your physical interface is ce0, then the logical interface will be ce0:3, for example.


The script will include the procedure to plumb and unplumb the logical interface (LI), and to associate the LH with the LI on the different physical hosts.

The script will do the following to switch over from node1 to node2:

  1. at node1: bring down the LI, unmount the FS, release the diskset
  2. at node2: take the diskset, mount the FS, plumb the LI and bring it up
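These two steps can be sketched as a small helper script, assuming SVM. All names (diskset ds1, file system /export/app, interface ce0:3, logical hostname apps-lh) are examples, and with DRYRUN=1 it only prints the commands it would run:

```shell
#!/bin/sh
# Manual-failover helper (sketch). All names are examples:
# diskset ds1, file system /export/app, logical interface ce0:3.
# Set DRYRUN=1 to print the commands instead of running them.
run() {
    if [ "$DRYRUN" = 1 ]; then echo "$@"; else "$@"; fi
}

# Step 1: run on the node giving up the service.
release() {
    run ifconfig ce0:3 down            # bring down the logical interface
    run ifconfig ce0:3 unplumb         # remove the LH's logical interface
    run umount /export/app             # unmount the shared file system
    run metaset -s ds1 -r              # release the SVM diskset
}

# Step 2: run on the node taking over.
take() {
    run metaset -s ds1 -t              # take the diskset
    run mount /dev/md/ds1/dsk/d0 /export/app
    run ifconfig ce0 addif apps-lh up  # plumb the LH and bring it up
}

# Dry-run demo of the release half:
DRYRUN=1
release
```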



We describe an effort to make the Blackboard software highly available.

Standard BB 6.x has a two-tier setup: the front end is based on Apache and Tomcat; the back end is Oracle (as an example) and an NFS server.

To integrate with the Sun Cluster software, one can use the HA-Oracle agent to protect Oracle, and the HA-NFS agent to protect NFS.


One can set up a two-node cluster with two logical hosts: one LH responsible for HA-Oracle and one LH responsible for HA-NFS. One node will own the LH for Oracle and the other will own the LH for NFS.


The front end can be protected by the HA-Apache and HA-Tomcat agents, or one can just set up multiple front ends behind a load balancer (LB).

NFS failover is transparent to the clients.

Every time Oracle fails over to the other host, one needs to restart Apache and Tomcat on the front ends.

Custom agent

To achieve this, one can create a custom agent whose only job is to ssh to each front end and restart Apache and Tomcat.


Of course, one needs to set up ssh in such a way that one can run ssh commands between the back end and the front ends without having to enter a password or passphrase.
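Such an agent's restart action can be sketched as follows. Host names and init-script paths are examples; by default the script only prints the ssh commands it would run, and one would set SSH=ssh for real use:

```shell
#!/bin/sh
# Hypothetical helper for the custom agent: after an Oracle failover,
# ssh to each front end and restart Apache and Tomcat.
# Host names and init-script paths are examples. SSH defaults to a
# dry run that just prints the commands; set SSH=ssh for real use.
SSH=${SSH:-"echo ssh"}

restart_frontends() {
    for host in web1 web2 web3; do
        $SSH "$host" /etc/init.d/apache restart
        $SSH "$host" /etc/init.d/tomcat restart
    done
}

restart_frontends
```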


If one uses the collab server, then one needs to configure a custom agent for it. With the Sun Java ES Cluster software, one can easily make an HA agent if one has the start, stop and probe scripts for the collab server. We will describe the procedure in a separate posting.


