By fkieviet on Jan 15, 2007
A few months ago, in my blog entry Transactions,
disks, and performance
I went into the importance of minimizing the number of writes.
Transaction logging is one of those cases where minimizing the number
of writes greatly enhances performance. In this entry, I'll describe a
way to avoid transaction logging altogether.
What is transaction logging? Transaction logging
refers to persisting the state of a two-phase transaction so that in
the event of a crash, the transaction can either be committed or rolled
back (recovered). I won't go into the details of what XA is; more
information about XA transactions can be found elsewhere, e.g. in Mike Spille's XA
Let me illustrate what recovery is using a
Consider an XA two phase transaction with three Resource Managers (RMa, RMb, and RMc). To
indicate what happens at what time, I'll put all actions in a table;
each row corresponds to a different time.
||delete from log
the transaction manager records the decision
to commit to the log.
Let's say that the system crashes after t10, say between t11 and t12. When the system restarts,
it will call recover() on
all known Resource Managers and it will read the transaction log. In
the transaction log it will find that xid1x was marked for
Through recover() it will
find that xid1b
and xid1c are in doubt. It knows that these two
need to be committed because of the commit decision in the log.
What happens if the system crashes before the
commit decision is written to the log, for example between t8 and t9? Upon recovery, the recover() method of RMa, RMb and RMc return xid1a and xid1b (but not xid1c because
prepare was not
called on RMc
transaction manager will rollback RMa
because no commit
decision was found in the log.
SeeBeyond's Logless XA Transactions
Let's take a look at the recover() method on the XAResource. This method returns
an array of Xid objects.
Each Xid object holds two
arrays. These two arrays represent the global transaction ID and the
qualifier. They are typically random numbers picked by the transaction
manager. The Resource Managers that receive these Xids should use these objects
as identifiers and return them in the recover() method unmodified.
At SeeBeyond, Jerry Waldorf and Venugopalan
Venkataraman came up with an idea to use the storage space in the byte arrays of the Xid as a way to persist the
transaction state. Here's how it works. Let's modify the above
example by removing transaction logging:
A commit decision is still being made, but this decision is no longer persisted in a separate transaction log. In stead, it is persisted in xid1a. If the system finds xid1a upon recovery, it knows that a commit decision was made. If it doesn't find xid1a, it knows that a commit decision was not made. Note that the order in which both prepare and commit are called on the three Resource Managers is very important.
As in the first example, if the system crashes before a commit
decision has been made, it will rollback any resources upon recovery.
E.g. if the system crashes between t8 and t9, it will encounter xid1c and xid1b and will call rollback() on these because it
cannot find a record of a commit-decision for xid1, i.e. it cannot find xid1a. Hence, xid1b and xid1c need to be
If the system crashes after a commit decision has been made, for
example between t10 and t11, it will find xid1b and xid1a. Since xid1a signifies a
decision, both xid1b
should be committed.
So far so good. But how does the transaction manager know that if it
look for xida to figure out if a
decision was made? This is where the transaction manager uses the byte of the Xid: it stores this information
in one of them.
A problem in this scheme occurs when the prepare(xid1a) method returns XA_RDONLY. If that happens, commit(xid1a, false)
called, and RMa will not return xid1a
upon calling recover().
Recall that xid1a had special significance!
Hence it is important to order the Resource Managers such that the
first one on which prepare()
is called, is both reliable and will not return XA_RDONLY. However, in normal
EE applications, the application prescribes in which order resources
are enlisted in a transaction. Hence, to use this logless transaction
scheme, the application server either needs to be extended with a way
specify resources a priori, or the application server needs to be
extended with a learning capability so that it knows which resources
are enlisted in a particular operation so that it can pick the right
resource manager to write the commit decision to.
The SeeBeyond logless transaction approach is one of the ways that
transaction logging can be made less exensive. In a future blog, I'll
cover additional ones.