Developing Event-Driven, Auto-Compensating Transactions With Oracle Database Sagas and Lock-free Reservations

March 21, 2024 | 15 minute read
Paul Parkinson
Architect and Developer Evangelist, Microservices and Converged Database
Text Size 100%:

I have presented many times, and written numerous blogs and source code, on sagas and event-driven microservices over the past decade leading up to this point where I can describe the availability of a truly one-of-a-kind event-driven, auto-compensating saga framework.

In those blogs, I’ve discussed the need for sagas in microservices architectures, the preferred and increased use of event-driven patterns and communication for microservices, and the difficulties in implementing sagas, particularly around developing saga participant code for compensating transactions. These are addressed in the product solution I will describe including example source code here and soon an update of the beta version of the saga workshop showing the same.  The features are in the Oracle Database Free docker container and soon in the Oracle Autonomous Database.

Part of what makes the new Oracle Saga Framework so powerful is its combined usage of other one-of-a-kind features in the Oracle Database including the TxEventQ transactional messaging system and lock-free reservations, therefore I will describe them and how they contribute to the overall comprehensive solution as well.

A video walkthrough of this content is also available here...

 

Quick background (skip if you know what sagas are)...

The saga pattern is used to provide data integrity between multiple services and to do so for potentially long-running transactions. There are many, cursory as they tend to be, blogs written on sagas and long-running transactions. In short, XA and 2PC require distributed locks (con) which manage ACID properties such that the user can simply execute rollback or commit (pro) whereas sagas use local transactions only and so do not require distributed locks (pro) but require the user to implement compensation, etc. logic (con). My previous blog showed examples of compensation logic and the need to explicitly maintain journals and handle a number of often subtle but important complexities.  Indeed most agree that data integrity is perhaps the most difficult and critical of challenges when taking advantage of a microservices architecture.

saga flow

  1. The Transfer Service receives a request to transfer money from one account to another. 

  2. The “transfer” method that is called is annotated with @LRA(value = LRA.Type.REQUIRES_NEW, end = false), therefore, the underlying LRA client library makes a call to the Coordinator/Orchestrator service which creates a new LRA/saga and passes the LRA/saga id back to the Transfer service.

  3. The Transfer Service makes a call to the (Bank)Account (for account 66) service to make the withdraw call.  The LRA/saga id is propagated as a header as part of this call.

  4. The “withdraw” method that is called is annotated with @LRA(value = LRA.Type.MANDATORY, end = false), therefore, the underlying client library makes a call to the Coordinator/Orchestrator service which recognizes the LRA/saga id and enlists/joins the Account Service endpoint (address) to the LRA/saga started by the Transfer Service.  This endpoint has a number of methods including the Complete and Compensate methods that will be called when the saga/LRA is terminated/ended.

  5. The “withdraw” method is executed and control returns to the Transfer Service.

  6. This is repeated with a call from the Transfer Service to the Account service (for account 67) to make the deposit call.

  7. Depending on the returns from the Account Service calls, the Transfer Service determines if it should close or cancel the saga/LRA. Close and cancel being somewhat analogous to commit or rollback.

  8. The Transfer Service issues the close or cancel call to the Coordinator (I will get into more details on how this is done implicitly when looking closer at the application).

  9. The Coordinator in turn issues complete (in the case of close) or compensate calls on the participants that were joined in saga/LRA previously.


Oracle TxEventQ (formerly AQ) messaging system in the database.

There are several advantages to event-driven microservices particularly those that are close to the (potentially critical) data including scalability and QoS levels, reliability, transactionality, integration and routing, versioning, etc.

Of course there needs to be a messaging system/broker in order to provide event-driven sagas the TxEventQ messaging system (formerly called AQ) has been part of the Oracle database decades (long before Kafka existed). It provides some key differentiators not available in other messaging systems, in particular the ability to do messaging and data operations in the same local transaction which is required to provide transactional outbox, idempotent producers and consumers, and in particular the robust sagathings  simply can't do. These are described in the blog Apache Kafka vs. Oracle Transactional Event Queues as Microservices Event Mesh, but the following table gives an idea of the common scenario that would require extra developer and admin handling or is simply not possible in Kafka and other messaging and database systems. The scenario involves an Order micoservice inserting an order in the database and sending a message to an Inventory microservice, the Inventory microservices receives the message, updates the Inventory table, and sends a message back to the Order service which receives that message and updates the Order in the database. Notice how the Oracle Database and TxEventQ handle all failures scenarios automatically.

TxEventQ vs Kafka


Auto-compensating data types via Lock-free reservations 

Saga and Escrow History

There is little debate that the saga pattern is currently the best approach to data integrity between microservices and long-running transactions/activities. This comes as little surprise as it has a long history starting with the original paper that was published in 1987 which also states that a simplified and optimal implementation of the saga pattern is one where the coordinator is implemented in the database(s).  

The concept of escrow concurrency and compensation-aware transactions were described even earlier in 1985.  The new Oracle database feature is named "Lock-free Reservations" as a reservation journal acts as an intermediary to the actual data table for any fields marked with the keyword RESERVABLE. Here is an example of how easy it is to simply label a column/field as reservable...
 

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
CREATE TABLE bankA (
  ucid VARCHAR2(50),
  account_number NUMBER(20) PRIMARY KEY,
  account_type VARCHAR2(15) CHECK (account_type IN ('CHECKING', 'SAVING')),
  balance_amount decimal(10,2) <strong>RESERVABLE</strong> constraint balance_con check(balance_amount >= 0),
  created_at TIMESTAMP DEFAULT SYSTIMESTAMP
);


An internal reservation journal table is created and managed automatically by the database (with nomenclature SYS_RESERVJRNL_<object_number_of_base_table>) which tracks the actions made on the reservable field by concurrent transactions.  

Changes requested by each transaction are verified against the journal value (not the actual database table) and thus promises of the change are made to the transactions based on the reservation/journal. The changes are not flushed/processed on the underlying table until commit of the transaction(s).  The modifications made on these fields must be commutative. That is relative increment/decrement operations such as quantity = quantity + 1, not absolute assignments such as quantity = 2. This is the case in the vast majority of data hot spots and indeed even state machines, work on this principle. Along with the fine-grained/column-level nature of escrow/reservations, high throughput for hot spots of concurrent transactions is attained and likewise transactions do not block for long-running transactions.  

A customer in a store no longer locks all of a particular type of item just because one of the item is in their cart, nor can they take items from another person's cart.

 

A good way to understand is to compare and contrast lock-free reservations/escrow with the concurrency mechanisms, drawbacks, and benefits of pessimistic and optimistic locking…
A video describing the same can be found here.
 

pessimistic locking

 

 

 

 

 

 

 

 

 

 

Concurrency mechanism

  • Check inventory => exclusive locking at Update 

Drawbacks

  • No parallel Update on Shared data  => Serialized
  • Limited concurrency/scalability
  • Not so good for Microservices

Benefits

  • ACID => No dirty Reads  

 

optimistic locking

 

 

 

 

 

 

 

 

 

 

 

Concurrency mechanism 

  • Check inventory  => shared locks
  • Actual update at transaction commit

Drawback

  • Lack of Isolation => Eventual consistency (dirty Reads)
  • Insufficient Inventory at commit time => rolling back older Tx

Benefits

  • High concurrency
  • Parallel Update on shared data

 

escrow locking

 

 

 

 

 

 

 

 

 

 

 

Concurrency mechanism

  • Check  inventory   => shared locks 
  • Journal  promises => allow/reject based on journal
  • Actual update at transaction commit

Drawback

  • Lack of Isolation => Eventual consistency (dirty Reads)
  • Insufficient Inventory at commit time => rolling back younger Tx

Benefits

  • High concurrency
  • Parallel updates 
  • Can use the  journal to compensate
     

What is extremely interesting is the fact that the journaling, etc. conducted by lock-free reservations is also used by the Oracle Saga Framework to provide auto-compensating/compensation-aware data.

The Saga framework performs compensating actions during a Saga rollback. Reservation journal entries provide the data that is required to take compensatory actions for Saga transactions. The Saga framework sequentially processes the saga_finalization$ table for a Saga branch and executes the compensatory actions using the reservation journal.

In other words, it removes the burden of coding the compensation logic.
As described in the Developing Saga Participant Code For Compensating Transactions blog and video.

Quick Feature Comparison In Saga Implementations...

In my previous blog, I used the extremely versatile and powerful Oracle MicroTx product, written by the same stellar team that wrote the famous Tuxedo transaction processing monitor. I've provided this table of comparison features to show what is provided by LRA in general and what unique features currently exist (others are in development) between the two Oracle Saga coordinator implementations.

MicroTx OSaga comparison

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


Application Setup and Code...
 

Setup

There are a few simple setup steps on the database side that just need to be issued once to initialize the system. Full doc can be found here.  It is possible to use a number of different configurations for microservices and all are supported by the Oracle Database Saga Framework.  For example, there can be schema or another isolation level between microservices, or there can be a strict database-per-service isolation.  We will show the latter here and use a Pluggable Database (PDB) per service.  A PDB is  a devoted database than can be managed as a unit/CDB for HA, etc. making it perfect for microservices) per service.

1. Create database links between each database for message propagation, forming an event mesh. The command looks like this

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
CREATE PUBLIC DATABASE LINK PDB2_LINK CONNECT TO admin IDENTIFIED BY test USING 'cdb1_pdb2';
CREATE PUBLIC DATABASE LINK PDB3_LINK CONNECT TO admin IDENTIFIED BY test USING 'cdb1_pdb3';
CREATE PUBLIC DATABASE LINK PDB4_LINK CONNECT TO admin IDENTIFIED BY test USING 'cdb1_pdb4';

2. Grant saga related privileges to the saga coordinator/admin  

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
grant saga_adm_role to admin;
grant saga_participant_role to admin;
grant saga_connect_role to admin;

3. add_broker and add_coordinator

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
exec dbms_saga_adm.add_broker(broker_name => 'TEST', broker_schema => 'admin');
exec dbms_saga_adm.add_coordinator(coordinator_name => 'CloudBankCoordinator', mailbox_schema => 'admin', broker_name => 'TEST', dblink_to_coordinator => 'pdb1_link');
exec dbms_saga_adm.add_participant(participant_name => 'CloudBank', coordinator_name => 'CloudBankCoordinator' , dblink_to_broker => 'pdb1_link' , mailbox_schema => 'admin' , broker_name => 'TEST', dblink_to_participant => 'pdb1_link');

4. add_participant (s)

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
exec dbms_saga_adm.add_participant(participant_name=> 'BankB' ,dblink_to_broker => 'pdb1_link',mailbox_schema=> 'admin',broker_name=> 'TEST', dblink_to_participant=> 'pdb3_link');

 

Application Dependencies

On the Java application side, we just need to add these two dependencies to the the maven pom.xml

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
<dependency>
  <groupId>com.oracle.database.saga</groupId>
  <artifactId>saga-core</artifactId>
  <version>[23.3.0,)</version>
</dependency>
<dependency>
  <groupId>com.oracle.database.saga</groupId>
  <artifactId>saga-filter</artifactId>
  <version>[23.3.0,)</version>
</dependency>

 

Application source code

As the Oracle Database Saga Framework implements the MicroProfile LRA (Long Running Actions) specification, much of the code, annotations, etc. that I've presented in previous blogs apply to this one.

However, the LRA specification, though future support has been discussed, does not support messaging/eventing, only Rest (it supports Async Rest but that is of course not messaging/eventing), and so a few additional annotations have been furnished to provide such support and take advantage of the TxEventQ transactional messaging and auto-compensation functionality already described. Full documentation can be found here , but the key two additions are @Request in the saga/LRA participants and @Response in the initiating service/participant as described in the following.

Note that the Oracle Database Saga Framework also provides access to the same saga functionality via direct API calls (eg SagaInitiator beginSaga()Saga sendRequestcommitSagarollbackSaga, ...) and so can be used not only in JAX-RS clients but any Java client and of course in PL/SQL as well. As shown in the previous blog and code repos, JAX-RS can also be used in Spring Boot.
The example code snippets below shows the classic TravelAgency saga scenario (with Airline, etc. participants) while the example I've been using in previous blog and in the GitHub repos provided continues the bank transfer scenario.  The same principles apply of course, just using different use cases to illustrate.

The Initiator (the initiating Participant)

@LRA for demarcation of the saga/LRA indicating whether the method should start, end, or join an LRA

@Response is an Oracle Saga-specific annotation indicating this method collects responses from Saga participants (that were enrolled into a Saga using the sendRequest() API and the name of the participant ("Airline" in this case).

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
@Participant(name = "TravelAgency")
/* @Participant declares the participant’s name to the saga framework */
public class TravelAgencyController extends SagaInitiator {
/* TravelAgencyController extends the SagaInitiator class */
 @LRA(end = false)
 /* @LRA annotates the method that begins a saga and invites participants */
 @POST("booking")
 @Consumes(MediaType.TEXT_PLAIN)
 @Produces(MediaType.APPLICATION_JSON)
  public jakarta.ws.rs.core.Response booking( 
      @HeaderParam(LRA_HTTP_CONTEXT_HEADER) URI lraId,
      String bookingPayload) {
    Saga saga = this.getSaga(lraId.toString());
    /* The application can access the sagaId via the HTTP header
       and instantiate the Saga object using it */
    try {
      /* The TravelAgency sends a request to the Airline sending
         a JSON payload using the Saga.sendRequest() method */
      saga.sendRequest ("Airline", bookingPayload);
      response = Response.status(Response.Status.ACCEPTED).build();
    } catch (SagaException e) {
      response=Response.status(Response.Status.INTERNAL_SERVER_ERROR).build();
    }
  }
  @Response(sender = "Airline.*")
  /* @Response annotates the method to receive responses from a specific
     Saga participant */
  public void responseFromAirline(SagaMessageContext info) {
    if (info.getPayload().equals("success")) {
      saga.commitSaga ();
      /* The TravelAgency commits the saga if a successful response is received */
    } else {
      /* Otherwise, the TravelAgency performs a Saga rollback  */
      saga.rollbackSaga ();
    }
  }
}

 

The Participant services

@Request is an Oracle Saga-specific annotation to indicate the method that receives incoming requests from Saga initiators.

@Complete the completion callback (called by the coordinator) for the saga/LRA

@Compensate the compensate callback (called by the coordinator) for the saga/LRA

The Saga framework provides a SagaMessageContext object as an input to the annotated method which includes convenience methods to get the Saga, SagaId, Sender, Payload, and Connection (to use transactionally and as an auto-compensating data type as part of the saga as describe earlier).

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
@Participant(name = "Airline")
/* @Participant declares the participant’s name to the saga framework */
public class Airline extends SagaParticipant {
/* Airline extends the SagaParticipant class */
  @Request(sender = "TravelAgency")
  /* The @Request annotates the method that handles incoming request from a given
     sender, in this example the TravelAgency */
  public String handleTravelAgencyRequest(SagaMessageContext    
      info) {
     
    /* Perform all DML with this connection to ensure 
       everything is in a single transaction */  
    FlightService fs = new 
      FlightService(info.getConnection());
    fs.bookFlight(info.getPayload(), info.getSagaId());
    return response;
    /* Local commit is automatically performed by the saga framework.  
       The response is returned to the initiator */  
  }
  
  @Compensate 
  /* @Compensate annotates the method automatically called to roll back a saga */
  public void compensate(SagaMessageContext info) {
    fs.deleteBooking(info.getPayload(), 
        info.getSagaId());
  }
  @Complete 
  /* @Complete annotates the method automatically called to commit a saga */
  public void complete(SagaMessageContext info) {
    fs.sendConfirmation(info.getSagaId());
  }
}

 

APEX Workflow with Oracle Saga Framework

Oracle's new APEX Workflow product has been designed to account for and include sagas. More blogs with details are coming but to give an idea, the following shows the same bank transfer saga we've been discussing but defined in a workflow and with the inclusion of a manual step in the flow for approval of the transfer (a common use case in finance and other workflows).

Saga workflow

You can read more about the workflow product in the blogs here and here.


Other topics: Observability, Optimizations, and Workshop

Event-driven application are of course different from blocking/sync/Rest applications and lend to different patterns and advantages particularly as far as parallelizism. Therefore settings for pool size, number of publishers and listeners, etc. are part of the saga framework in order to optimize.

As the journaling and bookkeeping is stored in the database along with the data and messaging, completion and compensation can be conducted locally there, making it in many cases unnecessary to make the callbacks to the application code that are otherwise necessary.  This again greatly simplifies development and also drastically cuts down on the costs and considerations of network calls.

Microservices, and especially those containing sagas, require effective observability and especially those containing sagas needs this observability not only in the application but also the saga coordinator, communication infrastructure, and the database.  Oracle has a, once again one-of-a-kind, OpenTelemetry based solution for this that is coordinated across all tier.  A "DevOps meets DataOps" video explains this Unified Observability architecture and how it can be used with entirely open source products such as Kubernetes (including eBPF), Prometheus, Loki and Promtail, ELK stack, Jaeger and Zipkin, Grafana, etc.

Finally, note that the existing beta workshop will soon be updated to include the now GA release of the Saga Framework and will be announced at this same blog space.

Conclusion

Thank you for reading and of course please feel free to reach out to me with any questions or feedback.

I want to give credit to the Oracle TxEventQ and Transaction Processing teams for all the amazing work they've done to conquer some of if not the most difficult areas of data-driven microservices and simplify it for the developers. I'd like to give special credit to Oracle's Dieter Gawlick who started the work in both escrow (lock-free reservations) and compensation-aware datatypes and sagas 40 years ago and who is also the original architect of TxEventQ (formerly AQ) with its ability to do messaging and data manipulation in the same local transaction.

 

Paul Parkinson

Architect and Developer Evangelist, Microservices and Converged Database

Paul is Architect and Developer Evangelist for Microservices and the Converged Database

His focus includes data and transaction processing, service mesh and event mesh, observability, and polyglot data models, languages, and frameworks.

The 18 years prior to this he was the Transaction Processing Dev Lead for the Mid-tier and Microservices (including WebLogic and Helidon)

Creator of the workshop "Building Microservices with Oracle Converged Database" @ http://bit.ly/simplifymicroservices

Holds 20+ patents and has given numerous presentations and publications over the past 20+ years.

Show more
Oracle Chatbot
Disconnected