Thursday Sep 25, 2014

ODI 12c - Migration from OWB to ODI - PLSQL Procedures Pt 2

In the first part of this blog post I showed you how PLSQL procedures could be invoked from ODI and help you benefit from supporting such functionality in the tool and increasing performance. Here in part 2, I'll show how the map can be changed in OWB and subsequently migrated using the migration utility.

Remember the original mapping design in OWB used a transformation operator to invoke the PLSQL procedure, the same mapping can be modified by replacing that transformation component with a construct operator to build the ref cursor and also a table function operator, see below for the original map design at the top and the modified one below with the ref cursor and table function;


The mapping can be configured to be further optimized by specifying extraction and loading hints in order to tune performance for your system;


You will have to enable the parallel DML capabilities for the OWB mapping also. With this design you can test/ensure it works and then use it in a migration. Why go through all this? You may want to try and optimize existing mappings in OWB, or when migration your map may be more complex and you do not wish to reconstruct it (perhaps the mapping is large and complex upstream from the PLSQL procedure). Doing this will save that work, you need to remove the PLSQL procedure plus target attribute mappings and insert the cursor and table function.

When the migration utility migrates this, it will migrate the entire logical design and also the hints you have specified. You will see the following mapping in ODI;

You will have to add the enable parallel DML code into the begin mapping command and then the code will be equivalent and performing as such. For details of the OWB to ODI migration utility see here, it's also worth checking various other useful resources such as the migration in action demo here and Stewart Bryson's 'Making the Move from Oracle Warehouse Builder to Oracle Data Integrator 12c' article here (useful tip on database link mechanics in ODI).

ODI 12c - Migrating from OWB to ODI - PLSQL Procedures

The OWB to ODI migration utility does a lot, but there are a few things it doesn't handle today. Here's one for our OWB customers moving to ODI who are scratching their heads on the apparent lack of support for PLSQL procedures in ODI... With a little creative work you can not only get those mappings into ODI but there is potential to dramatically improve performance (my test below improves performance by 400% and can be easily further tuned based on hardware).

This specific illustration will cover the support for PLSQL procedures (not functions) in OWB and what to do in ODI, OWB takes care of invoking PLSQL procedures mid-flow - it did this by supporting PLSQL row-based code for certain (not all) map designs out of the box - PLSQL procedure invocation was one case this was done. The PLSQL that OWB generated was pretty much as efficient as it could have been for PLSQL (used bulk collect and many other best practices you didn't have to worry about as a map designer) but it was limited in the logical map support (couldn't have a set-based operator such as join after a row based only operator such as a PLSQL transformation) - it was also PLSQL not SQL.

Here we see how with a simple pipelined, parallel enabled table function wrapper around your PLSQL procedure call, how you capture the same design in ODI 12c and/or get the mapping migrated from OWB. I think the primary hurdle customers have is what is the option going forward. To solve this, we will just leverage more of the Oracle database; table functions and parallelize the <insert your favorite word> out of it!

The mapping below calls a PLSQL procedure and OWB generated PLSQL row based code for this case, the target is getting loaded with data from the source table and the 2 output parameters of the PLSQL procedure;

When you try and migrate such a mapping using the OWB to ODI migration utility, you'll get a message indicating that the map cannot be migrated using the utility. Let's see what we can do! The vast majority of mappings are set-based, generally a very small subset are row based PLSQL mappings. Let's see how this is achieved in ODI 12c.

I did a test using the generated code from OWB - no tuning just the raw code for the above mapping - it took ">">">12 minutes 32 seconds to process about 32 million rows and invoke the PLSQL procedure and perform a direct path insert into the target. With my ODI 12c design using a very simple table function wrapper around the PLSQL procedure I can cut the time to 3 minutes 14 seconds!! Not only can I do this, but I can easily further optimize it to better leverage the Oracle database server by quickly changing the hints - I had a 4 processor machine, so that's about as much as I could squeeze out of it.

Here is my map design in ODI 12c;

The table function wrapper to call the PLSQL procedure is very simple, line 7 is where I call the PLSQL procedure, I use the object instance in the call and pipe the data when the call is made;

  1. create or replace function TF_CALL_PROC(input_values sys_refcursor) return TB_I_OBJ pipelined parallel_enable(PARTITION input_values BY ANY) AS
  2.   out_pipe i_obj := i_obj(null,null,null,null,null,null,null,null,null);
  3. BEGIN
  4.   LOOP
  5.     FETCH input_values INTO out_pipe.prod_id, out_pipe.cust_id,out_pipe.time_id,out_pipe.channel_id,out_pipe.promo_id,out_pipe.quantity_sold,out_pipe.amount_sold;
  6.     EXIT WHEN input_values%NOTFOUND;
  7.     MYPROC(out_pipe.prod_id,out_pipe.status,out_pipe.info);
  8.     PIPE ROW(out_pipe);
  9.   END LOOP;
  10.   CLOSE input_values;
  11.   RETURN;
  12. END;

This is a very simple table function (with enough metadata you could generate it), it uses table function pipelining and parallel capabilities - I will be able to parallelize all aspects the generated statement and really leverage the Oracle database. The above table function uses the types below, it has to project all of the data used downstream - whereas OWB computed this, you will have to do that.

  1. create or replace type I_OBJ as object (
  2.  prod_id number,
  3.  cust_id number,
  4.  time_id date,
  5.  channel_id number, 
  6.  promo_id number,
  7.  quantity_sold number(10,2),
  8.  amount_sold number(10,2),
  9.  status varchar2(10),
  10.  info number
  11.   );
  12. create or replace type TB_I_OBJ as table of I_OBJ; 

The physical design in ODI has the PARALLEL(4) hints on my source and target and I enable parallel DML using the begin mapping command within the physical design.

You can see in above image when using Oracle KMs there are options for hints on sources and targets, you can easily set these to take advantage of the hardware resources, tweak these to pump the performance throughput!

To summarize, you can see how we can leverage the database to really speed the process up (remember the 400%!), also we can still capture the design in ODI and on top of that unlike in OWB, this approach let's us carry on doing arbitrary data flow transformations after the table function component which is invoking our PLSQL procedure - so we could join, lookup etc. Let me know what you think of this, I'm a huge fan of table functions I think they afford a great extensibility capability.  

Monday Apr 01, 2013

ODI - Tip of the day, Connect/Disconnect call outs

The connect/disconnect call outs are a useful mechanism for executing code at the start and end of connections - you can write SQL, PLSQL and all sorts through this callout. The code is useful for example for recording your own additional audit information. The code is executed using the execute function call via the java driver for your system. This lets you execute arbitrary valid calls for your driver. For Oracle this can include anonymous PLSQL blocks of code which can contain many statements including SQL and PLSQL. You can also use odiRef APIs so you can generate the code to be executed in the context of where it is executed.

To illustrate if I want to record the session ID in my own audit/trace table I can define an on-connect block of code to be executed. You also define which ODI transactions the code is pertinent to. So for example below I have defined an anonymous PLSQL block of code for on connect applicable to transaction 1 - this is the transaction referenced from the KMs I am using, on the right you can see the IKM SQL Control Append is 'Transaction 1'.


Notice how I can use the odiRef calls to get the session id and add it into my block of code. I have also used the commit option to save the details. When an interface or scenario is executed that uses this data server the code will be executed and can be inspected from the ODI operator under the Command On Connect node in the tree;

Again, the code is executed through the java jdbc driver so it must be of the technology that the data server is, but this is a useful exit/call out point for invoking code. You can even use variables within this block to add another level of flexibility.

Thursday Dec 06, 2012

ODI 11g - Cleaning control characters and User Functions

In ODI user functions have a poor name really, they should be user expressions - a way of wrapping common expressions that you may wish to reuse many times - across many different technologies is an added bonus. To illustrate look at the problem of how to remove control characters from text. Users ask these types of questions over all technologies - Microsoft SQL Server, Oracle, DB2 and for many years - how do I clean a string, how do I tokenize a string and so on. After some searching around you will find a few ways of doing this, in Oracle there is a convenient way of using the TRANSLATE and REPLACE functions. So you can convert some text using the following SQL;

  • replace( translate('This is my string'||chr(9)||' which has a control character', chr(3)||chr(4)||chr(5)||chr(9), chr(3) ), chr(3), '' )

If you had many columns to perform this kind of transformation on, in the Oracle database the natural solution you'd go to would be to code this as a PLSQL function since you don't want the code splattered everywhere. Someone tells you that there is another control character that needs added equals a maintenance headache. Coding it as a PLSQL function will incur a context switch between SQL and PLSQL which could prove costly.

In ODI user functions let you capture this expression text and reference it many times across your mappings. This will protect the expression from being copy-pasted by developers and make maintenance much simpler - change the expression definition in one place.

Firstly define a name and a syntax for the user function, I am calling it UF_STRIP_BAD_CHARACTERS and it has one parameter an input string; 

We then can define an implementation for each technology we will use it, I will define Oracle's using the inputString parameter and the TRANSLATE and REPLACE functions with whatever control characters I want to replace;

I can then use this inside mapping expressions in ODI, below I am cleaning the ENAME column - a fabricated example but you get the gist.

 Note when I use the user function the function name remains in the text of the mapping, the actual expression is not substituted until I generate the scenario. If you generate the scenario and export the scenario you can have a peak at the code that is processed in the runtime - below you can see a snippet of my export scenario;

 That's all for now, hopefully a useful snippet of info.

About

Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products-- including Oracle Data Integrator, Oracle GoldenGate and Oracle Enterprise Data Quality

Search

Archives
« July 2015
SunMonTueWedThuFriSat
   
3
4
5
6
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 
       
Today