By David Allan on Apr 11, 2014
The ODI 12c mapping designer let's you design a mapping using components and define expressions for each of those components. Like 11g, in 12c there are hints to let ODI know how you would like to compute a physical plan for your design and from this generate code. I think the rules about how some of this work are not known - both in 11g and 12c - people second guess how they think it works. There's no magic to it, let's have a look at it. Underpinning the mapping are the technology definitions, it's here that datatypes are defined and datatype mappings between technologies. This let's ODI be very flexible in support for arbitrary data transformations between systems and how such data and its datatype is mapped across heterogeneous systems.
Putting the heterogeneous nature aside we can look at how datatypes are transformed just in a distributed example for Oracle (the example is for demonstration, in reality the database link LKM will be used which will do no staging). The example has 2 columns in a source table that are both VARCHAR2(30), one of those columns has an actual string, the other has a date. The target system has 2 columns in our target table that are VARCHAR2(10) and DATE. Note the target for one is shorter than its source and the other is a DATE datatype and not a string.
We can define the simple table to table mapping as below and define the expressions on the target table.
By default the expressions have no hint defined and will execute where the table is executed - in this case on the target system. We can see how the C$ table would be defined by previewing the DDL code in the physical design, we will see the type/DDL in the syntax of the particular technology. Below you can see the source datatype information is propagated - the length is still 30.
If we look at the target table we can see the expressions defined on it, in the example below I have selected the TGTEMP table and I can see the 2 expressions, I could actually change where the expression is defined for this particular physical design, I will not do that though, I will go back to the logical designer and set the hint there - then all of my potential physical designs leverage it.
Use the 'Execute on hint' property for the attribute, you can see the different options there, just now it has value no hint.
Below I have selected 'Source' to indicate I want the SUBSTR expression executed on the source system.
After this change has been made, if you look at the physical design you will see that the datatype information on our AP is now different. Because the data has been prepared on the source then the datatype for our C$ column now takes on the definition of the target (VARCHAR2(10)).
This gives you some idea as to how the hints works for expressions in a datastore. ODI 12c also has an expression component that let's you define groups of expressions. I generally think this is good for when an expression is reused within a mapping, but I know everyone works differently with tools and I have heard that some people like using this for all complex expressions rather than it being defined within a component such as a target table as they can easily 'see' where complex expressions are defined. Each to their own, the good thing is that you can do whatever you like. One benefit with using the expression component is that ODI by default will push that expression component as close to the source as possible and you can easily grab the entire component and push it to the stage, target or wherever.
The mapping below defines the expressions on the expression component, again there are no hints defined on the component or individual expressions.
When the physical design for this mapping is inspected we can see the expression component is by default on the source execution unit. This goes for other components too (filter, join etc.). In this case you can see both of the columns in the AP take on the downstream target table's datatypes (VARCHAR2(10) and DATE).
Changing the hint on the logical design for the expression component to stage will place the expression in the downstream execution unit. If I had just switched the hint to be stage for the expression component then in the physical design the expression would go in TARGET_UNIT_1. In 11g, ODI also supported a concept where the stage was different for the target. This is still available in 12c and is configured by defining what you want to execute in the stage by using these hints plus defining what the stage is (so similar to 11g apart from you don't have to switch tabs and the gestures are more simple). So firstly, define the expression to execute on the stage using that hint. Then on the logical mapping if you click on the canvas background you will see a property named 'Staging Location Hint', you can set this to the logical schema location for the staging area if you have one. By default it is not set as the staging area is the same as the target.
Let's change this to MEMORY_ENGINE just to see what the physical design looks like. We see we now have multiple execution units and the middle one where we executed the expression component is the 'stage' executing on the MEMORY_ENGINE location.
The hints are done on the logical design. You can also hard-wire physical changes in the physical design, I will go into that in a subsequent post but wanted to stick to the logical hints here to demystify how this works. I hope this is some useful background, I think for ODIers from 11g it will help.