Friday Jan 11, 2008

GlassFish Hidden Nugget: Automatic Distributed Transaction Recovery Service

GlassFish v2 and v2 ur1 releases (and later) have support for transaction recovery (both manual and automated) in the sense that incomplete transactions at the time of an instance failure can be committed either manually or automatically.

Part of the new feature set in the cluster profile is a little known feature called Automated Distributed Transaction Recovery that comes out of Project Shoal's support for it. 

Essentially, Automatic Distributed Transaction Recovery in GlassFish works as follows :

Consider the following :

  • a cluster of three instances : instance1, instance2, and instance3
  • Two XA resources used by each GlassFish instance
  • a transaction starts on instance 1,
  • Transaction Manager on instance1 asks resource X to pre-commit,
  • Transaction Manager on instance1 asks resource Y to pre-commit,
  • Transaction Manager on instance1 asks resource X to do a commit,

Now, instance1 crashes

The Transaction Service component in one of the surviving members, instance2 and instance3, gets a notification signal that a failure recovery operation needs to be performed for a instance1. This signal from Shoal is called FailureRecoverySignal.

This notification signal comes to the Transaction Service component in only one particular selected instance as a result of a selection algorithm run in Shoal's GMS component that takes advantage of the identically ordered cluster view provided to it by the underlying group communication provider (default provider is Jxta).

The Transaction Service component in this instance, say instance2, would now go into its autorecovery block. It starts by waiting for a designated time (default to 60 seconds) to allow for the failed instance1 to start back up.

If instance1 is starting up, its own Transaction Service component would do self recovery to complete phase 1 transactions.

In instance2, after the wait timeout occurs, the transaction service component would now see if instance1 is part of the group view and if not try to acquire a lock for the failed instance's transaction logs through Shoal's FailureRecoverySignal and if successful (indicating that the failed instance did not startup), acquire the transaction log and start recovery of transactions i.e complete the commit operations for the pre-commit transactions. If the acquisition of the lock fails, then it gives up, and checks that the failed instance did startup through Shoal's group view and logs this fact.

If, during the recovery operations  being performed by instance2, the failed instance1 starts up, the transaction service component in this instance would first check with Shoal if a recovery operation is in progress for its resources by any other instance in the group and if yes, it waits for the recovery operations to be completed and then completes startup. This ability to check for such recovery operations in progress is through a related Shoal feature called Failure Fencing[1].  If there are no recovery operations in progress, then the startup proceeds with a self recovery which recovers any incomplete transactions in instance1's logs.

Now during recovery of instance1's transaction logs, instance2 fails, then the fact that this instance was in the process of recovering for instance1 is known to the remaining members of the group (i.e. instance3) through the failure fencing recovery state recorded in Shoal's Distributed State Cache. As a result, when instance3's transaction service gets the failure recovery signal, not only does it get it for instance2's failure, but also for instance1. This facility covers for cases where cascading failures or multiple failures occur.

Note that, for the automatic distrbuted transaction recovery to work, access to the transaction logs for all instances in the cluster for
purposes of auto recovery requires that the logs be mounted on a shared/mirrored disk[2].

[1] More on Shoal's Automated Delegated Recovery Selection
[2] Distributed Transaction Recovery




Thursday Nov 09, 2006

Introducing Project Shoal : another open source contribution from Sun

Project Shoal was started in the java-enterprise community incubator a few months ago. We are happy to report that over the last week, we have commited sources to the project's CVS repository. The sources are reasonably well tested and are a good starting point towards building a quality product over the ensuing months.

The project's goal is to produce a Java-based dynamic clustering framework that can be plugged into any product requiring clustering functionality as an in-process component.  One can think of several important use cases that such a library will serve ranging from basic group membership service to building fault tolerance and reliability oriented solutions to distributed caches to high availability  infrastructure.

The heart of Shoal lies in its Group Management Service which provides a group membership management infrastructure such as group and member discovery, detecting failures, planned shutdowns, etc. in addition to value add features such as recovery oriented support and lightweight caching. Group members can also send messages to an individual member, a collection of members or all members. Group members can be cluster or non-cluster members identified by their member type. Members communicate by simply using their application level member identity and do not have to know anything about the network level locational details of themselves or other group members.

Shoal exposes an easy-to-use client API for consuming client components in each JVM process and provides an effective abstraction shielding clients from complexity of bootstrapping into a process group and its networking semantics. This, it achieves, through a service provider interface (a still-evolving SPI) that allows group communication technologies to be integrated. The default communication provider is based on JXTA peer-to-peer technology.

During the last few months, while the internal approval processes were ongoing, we built a very productive relationship with the JXTA community. JXTA has several inherent strengths in the group communication space by virtue of its being in the peer-to-peer (p2p) area.  JXTA provides very good security (authentication and encryption) support allowing pluggable keystores, dynamic route mapping (especially useful when peers are mobile), transport agnosticity (uses virtual multicast through rendezvous services if UDP multicast is not supported, TCP or HTTP transports that are dynamically switchable when a given transport is unavailable), WAN capable, simple addressing semantic (application instance name encoded peer identifiers), etc.


Shoal is part of the GlassFish Community and will be incorporated for value added features in GlassFish v2. GlassFish code base forms the core matter within the Java EE SDK and the Sun branded and supported Sun Java System Application Server 9.x.


Watch out for more blogs on Shoal in the coming weeks. We welcome interested folks to join the project and contribute to the success of Shoal in various ways including contributing code, bugs, patches, documentation, spreading the word, code reviews, etc.

Here are some Shoal related blogs:

by Mohamed Abdelaziz  (JXTA)

by Bernard Traversat (JXTA)

Blog by
Masood (Max) Mortazavi (Java DB)



Shreedhar Ganapathy


« July 2016