ODI 11g - Pipelines
By David Allan on Jul 05, 2011
Carrying on from the simple, flexible, powerful posting on the code generation capabilities of ODI 11g which illustrated the SQL as a Source option we can see how additional new code generation components can be defined (which let us capture information about specific transformations) and assemble in a pipeline. This is the kind of info I covered in my ODTUG KScope11 'Pushing the Envelope with ODIEE' talk at Long Beach this year - had a great time meeting everyone there!
The example we will illustrate here is using table functions (could be other transformations such as SQL unpivot) in the Oracle database – table functions are functions that can be queried like a table and support parallelization capabilities along with piping of data which are critical elements for improving performance and leveraging capabilities of the server (the same approach can be used for DB2 table functions or Microsoft SQLServer CROSS APPLY style capabilities). The flow can be constructed by piping SQL and transformations (like table functions) in an assembly like manner by constructing interfaces in a modular manner.
Below is an illustration of a number of interfaces assembled in a pipeline which can leverage the code generation capabilities in ODI 11g for assembling components.
We have seen from other posts how the interfaces can be organized into flows which are resolved into a single SQL (for example) with sub-queries. In this post you will see how transformations such as table functions can be incorporated into a design and the components assembled. The ODI 11g capability to define the SQL subquery generator within a KM opens the door to this flexibility. Table functions themselves are useful for many things;
- parallelizing arbitrarily complex (PLSQL, SQL PL) code in a SQL manner
- eliminating intermediate storage
- SQL-izing the stored procedure
In the illustration below I have a pipelined table function FX which is a
temporary target data store process data from datastores T1 and T2. The output
of the table function is then joined with T3 and further consumed in target
FX_TAB. The temporary datastore FX (representing the table function) has user
defined properties set which indicate the inputs and outputs for FX (A and B are
inputs and have expressions, X and Y are outputs).
The image has the two interfaces side by side, on the left we have a temporary interface for T1 and T2 joined and providing data for table function FX. On the right, the result of the table function is joined with T3 and written to FX_TAB.
The above illustrates how table functions can be used in pipelined manner with query inputs. The meaning of the columns in the FX temporary target datastore is inferred from the user defined property settings – this lets us distinguish the inputs from the outputs. Below in the KM definition (SQL_TableFunction) we can see how the SQL for the transformation type is generated.
Things to note are that the KM supports both cursor and scalar definitions (so a query can be the input to the table function or alternatively you can provide scalar values such as ODI variables, the option CURSOR_INPUT provides the switch between cursor and scalar). So, importantly table functions can be used as a data source, as transformation pipes and and targets (since there is a step in the KM for this). A good use case for a target table function is when the results of the SQL are being written in parallel to file for example.
In my case I used the cursor approach, so the CURSOR_INPUT option for the SQL_TableFunction KM assignment is set to true (the default);
The cursor option in the KM lets us generate either the following style for the cursor;
…from TABLE ( <tablefunction> ( CURSOR(SELECT <columns> FROM <tables> where <conditions>
or for scalar inputs;
…from TABLE ( <tablefunction> ( <variables> ) )
Simple and flexible. As you can see we can assemble a flow of ODI interfaces into a series of pipes that interconnect leveraging ODIs declarative description for the capabilities that are supported well out of the box and extend to encompass table functions for incorporating pipelining or other transformation capabilities that you can think of. A topical area of assembly style problem solving is the divide and conquer Map-Reduce style problem, check out Jean-Pierre’s example from the Oracle Data Warehousing blog MapReduce + Oracle = Table Functions, we can see how this maps to the components we have described above;
each mapper and reducer is an ODI interface.
Where is the sample KM? It's on the Oracle Code Sample site here;
The example above was illustrated using SQL and table functions to chain pipes together, there are other avenues for creating named pipes between interfaces to do much, much more.