OWB 11gR2 – Trickle Feed Data Acquisition and Delivery
By David Allan-Oracle on Sep 13, 2009
The OWB 11gR2 release introduced near real time data warehousing capabilities. OWB now supports a continuous trickle feed loading approach through the concept of trickle feed mappings. A trickle feed mapping is a mapping with an operator designated as the trickle feed driver, representing the source from which data trickles in. The feature is built on the Oracle Streams feature in the Oracle database server.
The only operator type considered as a candidate for the trickle feed driver is the Oracle Advanced Queue (AQ) operator type. The rest of the mapping represents the transformation and loading to be performed on each message/event that trickles in from the AQ source. Such a message could come from an application publishing it, or from a Streams CDC setup, in which case the message encapsulates a logical change record (LCR). OWB has been enhanced to trigger execution of a trickle feed mapping on the arrival of a message in the trickle feed driver, thereby providing continuous processing.
Advanced Queue is back! Not only can queues be processed in the traditional batch mode, they can be targets in mappings, propagations between queues can be designed and trickle feed mappings can be constructed.
OWB 11gR2 supports both typed AQs and ANYDATA AQs in the Oracle database. A typed AQ is an advanced queue whose payload is strongly typed; it can be created or imported within an Oracle module under the 'Queues' node.
An ANYDATA AQ is an advanced queue capable of staging heterogeneous payloads. Streams queues are a special type of ANYDATA AQ that is secure and transactional. The Oracle database supports Streams queues that can stage both LCRs and non-LCR messages. However, OWB requires you to declare whether an AQ will stage LCRs or messages; in other words, OWB does not support AQs staging both LCRs and messages.
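To make the two flavours concrete, here is a sketch of how each kind of queue is created in the database using the standard DBMS_AQADM and DBMS_STREAMS_ADM packages. All of the queue, table and type names here are illustrative only, not names generated by OWB:

```sql
begin
  -- Typed AQ: payload is a user-defined object type (PRODUCT_MSG_T is hypothetical)
  dbms_aqadm.create_queue_table(
    queue_table        => 'product_msg_qt',
    queue_payload_type => 'product_msg_t',
    multiple_consumers => true);
  dbms_aqadm.create_queue(
    queue_name  => 'product_msg_q',
    queue_table => 'product_msg_qt');
  dbms_aqadm.start_queue(queue_name => 'product_msg_q');

  -- Streams queue: a secure, transactional ANYDATA AQ for heterogeneous payloads
  dbms_streams_adm.set_up_queue(
    queue_table => 'streams_qt',
    queue_name  => 'streams_q');
end;
/
```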
A new mapping execution mode is now available, indicated by the execution type property, which can have the value Batch or Trickle.
- Trickle or Batch? This determines whether trickle feed processing or batch processing is used. In trickle feed processing, the operator designated as the 'trickle feed driver' represents a single message; in batch processing, each staging component in the map represents a set of records/messages.
The figure below illustrates the difference between batch and trickle mappings and how trickle feed maps integrate with Oracle Streams:
The Streams framework in the Oracle database server provides the infrastructure to implement trickle feed mappings. The Streams framework supports the concept of an Apply daemon that is constantly listening for arrival of messages in a Streams Queue (a secure, transactional Advanced Queue). On arrival of the message, the Apply process will pass control to a user defined handler procedure with the message as input. Custom transformations can be performed on the message in the user defined handler procedure (the OWB mapping).
In addition to providing a continuously listening daemon, the Streams framework also handles data and transactional integrity by guaranteeing that every message is processed exactly once.
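Conceptually, what a trickle feed deployment sets up in the database looks something like the following sketch: an apply process on the queue whose message handler is a generated procedure wrapping the mapping logic. This is illustrative, not the actual code OWB generates; the queue, handler and schema names are assumptions:

```sql
begin
  dbms_apply_adm.create_apply(
    queue_name      => 'streams_q',
    apply_name      => 'PROCESS_MSG_QUEUE',
    message_handler => 'owb_rt.map_product_upd_handler', -- generated wrapper (hypothetical name)
    apply_captured  => false);  -- user-enqueued messages, not captured LCRs
  dbms_apply_adm.start_apply(apply_name => 'PROCESS_MSG_QUEUE');
end;
/
```

Once the apply process is started, the Streams framework invokes the message handler for every message that arrives, with the message passed as input.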
A mapping is a candidate to be a trickle feed mapping if and only if it has exactly one AQ source operator (mappings can also write to AQs). To create a trickle feed mapping, invoke the AQ operator creation wizard in the mapping editor. The wizard asks you to choose the advanced queue acting as the source operator, the source type (batch or real-time), and the type of data to be staged in the queue (as mentioned earlier, LCRs or non-LCR messages, for example a user-defined message type).
You must then define the target type for the AQ operator (what the payload type is). In this example there is an object type MSG_PRODUCT_UPD containing updated product information that will be the message on the AQ.
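For reference, a payload type like MSG_PRODUCT_UPD would be an ordinary object type in the database; the attributes shown here are invented for illustration, the original post does not list them:

```sql
create type msg_product_upd as object (
  product_id   number,
  product_name varchar2(100),
  list_price   number,
  upd_date     date
);
/
```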
If the AQ is defined as a real-time source, the mapping is automatically created as a trickle feed mapping. This can be verified by inspecting the mapping property 'Execution type', which has either the value 'Batch' or 'Trickle'.
When the AQ is added to the mapping, the output attribute of the AQ operator has the type selected above when the AQ was added, MSG_PRODUCT_UPD in our case. We can use the Expand Object operator to retrieve the attributes of the type and use them to update the target PRODUCTS table, for example.
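The per-message logic that such a mapping performs can be sketched as a handler procedure that unwraps the ANYDATA payload and merges the attributes into PRODUCTS. Again this is a hand-written sketch, not the OWB-generated code, and the column names are assumed:

```sql
create or replace procedure map_product_upd_handler(msg in sys.anydata) is
  v_upd msg_product_upd;
  v_rc  pls_integer;
begin
  v_rc := msg.getObject(v_upd);  -- unwrap the typed payload from the ANYDATA wrapper
  merge into products p
  using (select v_upd.product_id   as product_id,
                v_upd.product_name as product_name,
                v_upd.list_price   as list_price
           from dual) s
  on (p.product_id = s.product_id)
  when matched then update
    set p.product_name = s.product_name,
        p.list_price   = s.list_price
  when not matched then insert (product_id, product_name, list_price)
    values (s.product_id, s.product_name, s.list_price);
end;
/
```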
When the mapping is started (like other mappings), the map is shown in a 'Running' state in the 'Control Center Jobs' panel, and the queue listener invokes the logic in the mapping whenever messages are pushed onto the queue.
The listener can be stopped by right clicking on the mapping in the 'Control Center Jobs' panel and selecting 'Stop'.
There are some other details to be aware of, defined in the configuration of the mapping. These include the schema where the Streams definitions will be created (as opposed to the location where the mapping itself resides; these can be, and more than likely will be, different places). Other properties include the apply process name, the parallelism degree and the redo value for the Streams apply process.
These values are used when building the Streams apply process with the DBMS_APPLY_ADM package. For example, the default parallelism degree is used by OWB to set the number of concurrent transactions that can be applied, by generating code such as the following when deploying the mapping:

begin
  dbms_apply_adm.set_parameter(
    apply_name => '"PROCESS_MSG_QUEUE"',
    parameter  => 'parallelism',
    value      => '5');
end;
These properties let you configure and control how the apply handler operates. As you can see, the OWB 11gR2 release has introduced near real time data warehousing capabilities through its support for continuous trickle feed mappings.