GlassFish Hidden Nugget: Automatic Distributed Transaction Recovery Service
By Shreedhar Ganapathy on Jan 11, 2008
GlassFish v2 and v2 ur1 releases (and later) have support for transaction recovery (both manual and automated) in the sense that incomplete transactions at the time of an instance failure can be committed either manually or automatically.
Part of the new feature set in the cluster profile is a little known feature called Automated Distributed Transaction Recovery that comes out of Project Shoal's support for it.
Essentially, Automatic Distributed Transaction Recovery in GlassFish works as follows :
Consider the following :
- a cluster of three instances : instance1, instance2, and instance3
- Two XA resources used by each GlassFish instance
- a transaction starts on instance 1,
- Transaction Manager on instance1 asks resource X to pre-commit,
- Transaction Manager on instance1 asks resource Y to pre-commit,
- Transaction Manager on instance1 asks resource X to do a commit,
Now, instance1 crashes
The Transaction Service component in one of the surviving members, instance2 and instance3, gets a notification signal that a failure recovery operation needs to be performed for a instance1. This signal from Shoal is called FailureRecoverySignal.
This notification signal comes to the Transaction Service component in only one particular selected instance as a result of a selection algorithm run in Shoal's GMS component that takes advantage of the identically ordered cluster view provided to it by the underlying group communication provider (default provider is Jxta).
The Transaction Service component in this instance, say instance2, would now go into its autorecovery block. It starts by waiting for a designated time (default to 60 seconds) to allow for the failed instance1 to start back up.
If instance1 is starting up, its own Transaction Service component would do self recovery to complete phase 1 transactions.
In instance2, after the wait timeout occurs, the transaction service component would now see if instance1 is part of the group view and if not try to acquire a lock for the failed instance's transaction logs through Shoal's FailureRecoverySignal and if successful (indicating that the failed instance did not startup), acquire the transaction log and start recovery of transactions i.e complete the commit operations for the pre-commit transactions. If the acquisition of the lock fails, then it gives up, and checks that the failed instance did startup through Shoal's group view and logs this fact.
If, during the recovery operations being performed by instance2, the failed instance1 starts up, the transaction service component in this instance would first check with Shoal if a recovery operation is in progress for its resources by any other instance in the group and if yes, it waits for the recovery operations to be completed and then completes startup. This ability to check for such recovery operations in progress is through a related Shoal feature called Failure Fencing. If there are no recovery operations in progress, then the startup proceeds with a self recovery which recovers any incomplete transactions in instance1's logs.
Now during recovery of instance1's transaction logs, instance2 fails, then the fact that this instance was in the process of recovering for instance1 is known to the remaining members of the group (i.e. instance3) through the failure fencing recovery state recorded in Shoal's Distributed State Cache. As a result, when instance3's transaction service gets the failure recovery signal, not only does it get it for instance2's failure, but also for instance1. This facility covers for cases where cascading failures or multiple failures occur.
Note that, for the automatic distrbuted transaction recovery to work, access to the transaction logs for all instances in the cluster for
purposes of auto recovery requires that the logs be mounted on a shared/mirrored disk.
 More on Shoal's Automated Delegated Recovery Selection
 Distributed Transaction Recovery