Friday Mar 06, 2009

SailFin Talk at Sun Tech Days, Q and A, useful links...

I presented a session on SailFin at the recently concluded Sun Tech Days in Hyderabad. It was a last-minute replacement, as Sang Shin couldn't make it for his session. The session was titled "Adding Convergence of Media to your Enterprise Application using Sun GlassFish Communication Server" and introduced SailFin to the audience by bringing Java EE and SIP together.

A similar presentation is available at . A really nice introduction to Java EE and SIP (via SailFin) is also available at: .

Many students were really excited about the Call-Setup Demo that Prasad Subramanian did a day before (see ). A live demo was not part of my session due to the short notice, but I did manage to go over the recorded demo created by my colleague Bhavani (on a Loan Processing Banking Application) available at .

To round off this post, here's a sampling of the questions asked during the session:

  1. How popular is SIP?
    SIP is very popular as an industry standard for providing telecom services such as VoIP.

  2. What are the other competing or comparable protocols?
    Search the net for comparisons. There are many out there, and they can be confusing :-)

  3. Are users limited to SIP-enabled phones when working with SailFin?

  4. Can I deploy a SIP Servlets application on SailFin and make it talk with Skype or GTalk?
    Although Skype does not support SIP, you may still be able to make a SailFin application talk to it. A quick search on the internet brings up many software products that provide SIP-to-Skype (and vice versa) functionality.

  5. How do you find out whether a phone is SIP-enabled?
    The product must provide that information.

  6. Do application creators have to know the SIP address of the phone they need to contact? Isn't that hard to remember?
    No. The application can use the phone number of the SIP-enabled phone for this purpose. The TelURL API provided by JSR 289 is available for that.
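
To make the last answer a little more concrete, here is a minimal sketch of turning a phone number into a tel: URI string using only the JDK. The class name, helper method and phone number are made up for illustration; they are not part of SailFin or JSR 289. Inside a SIP servlet, a string like this could be passed to SipFactory.createURI(...), which for the tel scheme returns a javax.servlet.sip.TelURL. That step is left as a comment because it needs a SIP container.

```java
import java.net.URI;

// Illustrative only: builds a tel: URI from a phone number with the plain JDK.
// TelUriSketch and toTelUri are hypothetical names, not a SailFin API.
class TelUriSketch {

    // Strips common visual separators and prefixes the "tel:" scheme.
    // Assumption: the input is a global number that starts with '+'.
    static URI toTelUri(String phoneNumber) {
        String digits = phoneNumber.replaceAll("[\\s().-]", "");
        if (!digits.matches("\\+\\d+")) {
            throw new IllegalArgumentException(
                "Expected a global number starting with '+': " + phoneNumber);
        }
        return URI.create("tel:" + digits);
    }

    public static void main(String[] args) {
        URI uri = toTelUri("+1 (555) 010-0199"); // made-up number
        System.out.println(uri.getScheme());
        System.out.println(uri.getSchemeSpecificPart());
        // In a SIP servlet, uri.toString() could be handed to
        // SipFactory.createURI(...) to obtain a javax.servlet.sip.TelURL.
        // Not done here because no SIP container is available.
    }
}
```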

Tuesday Feb 17, 2009

Notes on Sailfin Cluster Failure Management and GMS

Here are some short notes on how a SailFin cluster deals with instance failures. They are useful for troubleshooting and for debugging failure scenarios. But first, if you are unfamiliar with SailFin clustering, please read Quick Start with Sailfin Clustering.

SailFin relies on the Group Management Service (GMS) for its failure management. This includes detecting an instance's failure and sending the appropriate notification. Below is a list of some types of instance failure that GMS helps detect:
  1. Software Failure:
    a. Node Agent and instance process dying
    b. Instance process alone dying [Transient Failure]

  2. Hardware Failure:
    a. Network Failure [cable snap at the machine's end or at the router's end]
    b. Power Failure [of the machine hosting a SailFin instance]

Notes on how GMS works:
  1. Each instance in a SailFin cluster runs its own GMS service, which starts when the instance starts. The GMS services running on all the instances of a cluster together form a logical GMS group.

  2. Through the GMS service, each member of the group can send and receive signals. Using a heartbeat mechanism, the GMS services can detect state changes such as the addition, failure or recovery of a group member.

  3. These state changes are registered as events and logged in the instance's server.log file at <sailfin-installation>/nodeagents/<agent-name>/<instance-name>/logs/server.log. For example, if an instance is shut down using the "asadmin stop-instance <instance-name>" command, all other instances in the group detect this shutdown, and a PEER_STOP_EVENT is registered in their server log files.
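
As a rough illustration of the heartbeat idea in item 2, the sketch below tracks the last heartbeat seen from each member and classifies the member by how long it has been silent. This is not the GMS implementation or its API: the class name, the state names and the two thresholds are assumptions chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical heartbeat tracker. NOT the GMS implementation or API;
// names and thresholds are assumptions made for illustration.
class HeartbeatSketch {
    enum State { UNKNOWN, ALIVE, SUSPECTED, FAILED }

    private final long suspectAfterMillis; // silence before a member is suspected
    private final long failAfterMillis;    // silence before a member is declared failed
    private final Map<String, Long> lastBeat = new HashMap<>();

    HeartbeatSketch(long suspectAfterMillis, long failAfterMillis) {
        if (suspectAfterMillis <= 0 || failAfterMillis < suspectAfterMillis) {
            throw new IllegalArgumentException("need 0 < suspectAfter <= failAfter");
        }
        this.suspectAfterMillis = suspectAfterMillis;
        this.failAfterMillis = failAfterMillis;
    }

    // Records that a heartbeat from the member arrived at the given time.
    void heartbeat(String member, long nowMillis) {
        lastBeat.put(member, nowMillis);
    }

    // Classifies a member by the time elapsed since its last heartbeat.
    State stateOf(String member, long nowMillis) {
        Long last = lastBeat.get(member);
        if (last == null) {
            return State.UNKNOWN; // never heard from this member
        }
        long silence = nowMillis - last;
        if (silence >= failAfterMillis) {
            return State.FAILED;
        }
        if (silence >= suspectAfterMillis) {
            return State.SUSPECTED;
        }
        return State.ALIVE;
    }

    public static void main(String[] args) {
        HeartbeatSketch tracker = new HeartbeatSketch(2000, 6000);
        tracker.heartbeat("instance1", 0);
        for (long t : new long[] {1000, 3000, 6000}) {
            System.out.println("t=" + t + "ms: " + tracker.stateOf("instance1", t));
        }
    }
}
```

Timestamps are passed in explicitly rather than read from the system clock, which keeps the classification logic easy to exercise without waiting for real time to pass.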

Below is a list of some important GMS events along with their significance:
  1. PEER_STOP_EVENT: Indicates a planned shutdown of an instance (using the asadmin stop-instance command).
  2. ADD_EVENT: Indicates that an instance has been started (using the asadmin start-instance command) and that its GMS service has joined the logical GMS group.
  3. JOINED_AND_READY_EVENT: Indicates that startup of an instance is complete.
  4. IN_DOUBT_EVENT: Indicates that GMS suspects that an instance has failed. (To try this, kill the process of an instance together with the process of its node agent, and watch the messages in the logs of the other instances.)
  5. FAILURE_EVENT: Indicates confirmation of failure of an instance by GMS.
These log messages also indicate the instance associated with the event. This information is quite handy when debugging failure based scenarios.
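
When going through a long server.log, a small helper that counts how often each of these events appears can be a handy first step. The sketch below is one way to do that. It only looks for the event names listed above as plain substrings and makes no assumption about the rest of the log line format; the class name and the sample lines in main are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: counts occurrences of the GMS event names from this post
// in a list of log lines. It does not parse the actual server.log line format.
class GmsLogScanSketch {
    static final List<String> EVENTS = Arrays.asList(
        "PEER_STOP_EVENT", "ADD_EVENT", "JOINED_AND_READY_EVENT",
        "IN_DOUBT_EVENT", "FAILURE_EVENT");

    // Returns a count per event name, in the order listed in EVENTS.
    static Map<String, Integer> countEvents(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String event : EVENTS) {
            counts.put(event, 0);
        }
        for (String line : lines) {
            for (String event : EVENTS) {
                if (line.contains(event)) {
                    counts.merge(event, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path to a server.log as the first argument;
        // without one, a few made-up sample lines are used instead.
        List<String> lines = args.length > 0
            ? Files.readAllLines(Paths.get(args[0]))
            : Arrays.asList(
                "sample line: IN_DOUBT_EVENT for instance2",
                "sample line: FAILURE_EVENT for instance2",
                "sample line: unrelated message");
        countEvents(lines).forEach((event, n) -> System.out.println(event + ": " + n));
    }
}
```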

Node-Agent as a WatchDog:
One other failure detection mechanism does not involve GMS. The node agent acts as a watchdog for the instance: it detects failure of the instance process and attempts to restart the instance. This is the transient failure listed above as item 1 (b).

CLB as a Listener of GMS events:
The CLB (Converged Load Balancer) is a listener of GMS events, and it adjusts its behavior according to each event's significance. For example, the CLB considers an instance to be available to serve requests until it receives a FAILURE_EVENT for that instance from GMS. On receiving a FAILURE_EVENT, the CLB stops forwarding requests to the failed instance. The instance is added back to the CLB's list of available instances only after the CLB receives a JOINED_AND_READY_EVENT for that instance.
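
The behavior described above can be pictured as a small event-driven availability list. The sketch below is only a model of that description, not the CLB's code: the class and method names are hypothetical, and events whose effect on the CLB is not described in this post (for example PEER_STOP_EVENT or IN_DOUBT_EVENT) are simply left as no-ops.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the availability list described in this post.
// NOT the CLB implementation; names are made up for illustration.
class ClbAvailabilitySketch {
    private final Set<String> available = new TreeSet<>();

    // Applies a GMS event for the given instance to the availability list.
    void onEvent(String eventName, String instance) {
        switch (eventName) {
            case "FAILURE_EVENT":
                available.remove(instance); // stop forwarding requests
                break;
            case "JOINED_AND_READY_EVENT":
                available.add(instance);    // instance may serve requests again
                break;
            default:
                // Other events are not covered by this post; leave the list as is.
                break;
        }
    }

    boolean isAvailable(String instance) {
        return available.contains(instance);
    }

    public static void main(String[] args) {
        ClbAvailabilitySketch clb = new ClbAvailabilitySketch();
        clb.onEvent("JOINED_AND_READY_EVENT", "instance1");
        clb.onEvent("IN_DOUBT_EVENT", "instance1");
        System.out.println("after IN_DOUBT_EVENT: " + clb.isAvailable("instance1"));
        clb.onEvent("FAILURE_EVENT", "instance1");
        System.out.println("after FAILURE_EVENT: " + clb.isAvailable("instance1"));
    }
}
```

Note that in this model a suspected instance stays in the list until the failure is confirmed, which matches the statement that the CLB keeps treating an instance as available until it receives a FAILURE_EVENT.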

GMS failure detection and notification times can vary depending on the hardware used and on certain configurable GMS and SailFin settings. For information on this and other functionality provided by GMS, please read the documentation available at and

Quick Start with Sailfin Clustering

Sailfin Clustering


Sailfin, Glassfish and more....

