Friday Apr 11, 2014

ODI 12c - Expression Hints

The ODI 12c mapping designer let's you design a mapping using components and define expressions for each of those components. Like 11g, in 12c there are hints to let ODI know how you would like to compute a physical plan for your design and from this generate code. I think the rules about how some of this work are not known - both in 11g and 12c - people second guess how they think it works. There's no magic to it, let's have a look at it. Underpinning the mapping are the technology definitions, it's here that datatypes are defined and datatype mappings between technologies. This let's ODI be very flexible in support for arbitrary data transformations between systems and how such data and its datatype is mapped across heterogeneous systems.

Putting the heterogeneous nature aside we can look at how datatypes are transformed just in a distributed example for Oracle (the example is for demonstration, in reality the database link LKM will be used which will do no staging). The example has 2 columns in a source table that are both VARCHAR2(30), one of those columns has an actual string, the other has a date. The target system has 2 columns in our target table that are VARCHAR2(10) and DATE. Note the target for one is shorter than its source and the other is a DATE datatype and not a string.

We can define the simple table to table mapping as below and define the expressions on the target table.

By default the expressions have no hint defined and will execute where the table is executed - in this case on the target system. We can see how the C$ table would be defined by previewing the DDL code in the physical design, we will see the type/DDL in the syntax of the particular technology. Below you can see the source datatype information is propagated - the length is still 30.

 If we look at the target table we can see the expressions defined on it, in the example below I have selected the TGTEMP table and I can see the 2 expressions, I could actually change where the expression is defined for this particular physical design, I will not do that though, I will go back to the logical designer and set the hint there - then all of my potential physical designs leverage it.

Use the 'Execute on hint' property for the attribute, you can see the different options there, just now it has value no hint. 

Below I have selected 'Source' to indicate I want the SUBSTR expression executed on the source system. 

After this change has been made, if you look at the physical design you will see that the datatype information on our AP is now different. Because the data has been prepared on the source then the datatype for our C$ column now takes on the definition of the target (VARCHAR2(10)). 

This gives you some idea as to how the hints works for expressions in a datastore. ODI 12c also has an expression component that let's you define groups of expressions. I generally think this is good for when an expression is reused within a mapping, but I know everyone works differently with tools and I have heard that some people like using this for all complex expressions rather than it being defined within a component such as a target table as they can easily 'see' where complex expressions are defined. Each to their own, the good thing is that you can do whatever you like. One benefit with using the expression component is that ODI by default will push that expression component as close to the source as possible and you can easily grab the entire component and push it to the stage, target or wherever.

The mapping below defines the expressions on the expression component, again there are no hints defined on the component or individual expressions.

When the physical design for this mapping is inspected we can see the expression component is by default on the source execution unit. This goes for other components too (filter, join etc.). In this case you can see both of the columns in the AP take on the downstream target table's datatypes (VARCHAR2(10) and DATE).

Changing the hint on the logical design for the expression component to stage will place the expression in the downstream execution unit. If I had just switched the hint to be stage for the expression component then in the physical design the expression would go in TARGET_UNIT_1. In 11g, ODI also supported a concept where the stage was different for the target. This is still available in 12c and is configured by defining what you want to execute in the stage by using these hints plus defining what the stage is (so similar to 11g apart from you don't have to switch tabs and the gestures are more simple). So firstly, define the expression to execute on the stage using that hint. Then on the logical mapping if you click on the canvas background you will see a property named 'Staging Location Hint', you can set this to the logical schema location for the staging area if you have one. By default it is not set as the staging area is the same as the target.

Let's change this to MEMORY_ENGINE just to see what the physical design looks like. We see we now have multiple execution units and the middle one where we executed the expression component is the 'stage' executing on the MEMORY_ENGINE location.

The hints are done on the logical design. You can also hard-wire physical changes in the physical design, I will go into that in a subsequent post but wanted to stick to the logical hints here to demystify how this works. I hope this is some useful background, I think for ODIers from 11g it will help.

Tuesday Apr 01, 2014

ODI 12c - Mapping Builder

A few years ago I posted a utility (see interface builder post here) to build interfaces from driver files, here I have updated it for 12c to build mappings from driver files. The example uses a tab delimited text file to control the mapping creation, but it could be easily taken and changed to drive from whatever you wanted to capture the design of the mapping.

The mapping can be as complete or incomplete as you’d like, so could just contain the objects or could be concise and semantically complete.

The control file is VERY simple and just like ODI requests the minimal amount of information required. The basic format is as follows;So for example the control file below can define the sources, target, joins, mapping expressions etc;

Directive Column2 Column3 Column4 Column5 Column6
 source  <model>  <datastore>  <alias>
 .....can add many
 target  <model>  <datastore>
 filter  <filter_condition>  <filter_datastore_alias>  <alias>
 lookup  <model> <datastore>  <driver_alias> <lookup_condition>  <alias>
 join  <join_condition>  <alias>
 ....can add many of the components above.
 mapping  <column>  <expression>

So for example the control file below can define the sources, target, joins, mapping expressions etc;

  • source SOURCE EMP EMP
  • source SOURCE DEPT DEPT
  • target TARGET_MODEL TGTEMP
  • join EMP.DEPTNO = DEPT.DEPTNO AJOIN
  • filter EMP.SAL > 1 EMP AFILTER
  • lookup SOURCE BONUS EMP BONUS.ENAME = EMP.ENAME ALOOKUP
  • mapping ENAME UPPER(EMP.ENAME) 
  • mapping DEPTNO ABS(DEPT.DEPTNO) 
  • mapping COMM ABS(BONUS.COMM)

When executed, this generates the mapping below with the join, filter, lookup and target expressions from the file;


You should be able to join the dots between the control file sample and the mapping design above. You will need to compile and execute the code in OdiMappingBuilder;

java –classpath <cp> OdinterfaceBuilder jdbc:oracle:thin:@localhost:1521:ora112 oracle.jdbc.OracleDriver ODI_MASTER mypwd WORKREP1 SUPERVISOR myodipwd DEMOS SDK DEMO1 < mymappingcontrolfile.tab

The mapping to be created is passed from the command line. You can intersperse other documentation lines between the control lines so long as the control keywords in first column don’t clash. See the driver file below viewed from within Excel;

Anyway some useful snippets of code for those learning the SDK (download OdiMappingBuilder here), or for those wanting to capture the design outside and generate ODI mappings. Have fun!

Wednesday Mar 12, 2014

ODI 12c - Data Input Experts

Back in the olde days of OWB I blogged about a few utilities (see here) that were useful for collecting user input data in custom flows, users build such flows to implement accelerators to take the mundane tasks out of common activities. In ODI you can also use groovy SwingBuilder, this let's you build useful dialogs very easily. I posted some examples such as the one below for model creation in ODI and a launchpad example;

The utilities for OWB I mentioned in the blog are just basic java classes that were invoked from OWB via tcl/jacl. These utilities are written in java and can still be used from ODI via groovy. Still as useful, still as functional. Let's see how we call them now!

The required JARs need to be put on the groovy classpath, which is under the ODI IDE's Tools->Preferences option, and then under ODI->System->Groovy and set the groovy classpath to include jexpert.jar, tcljava.jar and jacl.jar. For example I have the following referencing the JARs from my 11gR2 database which has the OWB code;

  • D:\app\dallan\product\11.2.0\dbhome_1\owb\lib\int\jexpert.jar;D:\app\dallan\product\11.2.0\dbhome_1\owb\lib\int\tcljava.jar;D:\app\dallan\product\11.2.0\dbhome_1\owb\lib\int\jacl.jar

I can then launch the shuttle dialog for example as follows;

  1. import oracle.owb.jexpert.ShuttleObjects
  2. arrayOfString = [ "PRODUCT_ID", "PRODUCT_NAME", "PRODUCT_COLOR", "PRODUCT_DESC", "PRODUCT_LONG_DESC", "CATEGORY_ID", "CATEGORY_NAME", "CATEGORY_DESCRIPTION", "SUBCATEGORY_ID", "SUBCATEGORY_NAME", "SUBCATEGORY_DESCRIPTION" ]
  3. sels = ShuttleObjects.getselection("Select dimension levels", "Select columns to identify levels:", "Columns:", "Levels", (String[]) arrayOfString.toArray())

  4. println sels

I can use the returned variable sels and do whatever ODI stuff I need, you can see the code above executed from within ODI and the dialog appearing with the information;

Likewise the data entry dialog works as is, when that dialog is executed from groovy, just like in OWB we can get the information displayed, the user can enter data, we can collect it and action it in our groovy using the ODI SDK;

The blog on the 12c mapping SDK here has a good SDK reference table that gives you pointers for all parts of the product into the SDK areas. This is definitely a handy one to bookmark, I often use it myself. Learn some scripting it'll help save you are your teams a lot of time.

Friday Mar 07, 2014

Using the Unpivot Component in ODI 12c

As we’ve seen last week the latest ODI 12c patches added several new Mapping Components such as Pivot or Unpivot. In this blog post we will walk you through an example of how to use the new Unpivot Component. 

You can use the following SQL statements to recreate this example in your environment. It will create the source (UNPIVOT_TEST) and target (TRG_UNPIVOT_TEST) tables used in this article in your database then you can reverse engineer them in ODI.

CREATE TABLE unpivot_test (
year NUMBER,
q1_sales NUMBER,
q2_sales NUMBER,
q3_sales NUMBER,
q4_sales NUMBER
);
insert into unpivot_test values (2012, 10.5, 11.4, 9.5, 8.7);
insert into unpivot_test values (2013, 9.5, 10.5, 10.3, 7.6);
commit;
CREATE TABLE trg_unpivot_test (
year NUMBER,
quarter VARCHAR2(255),
sales NUMBER
);

Our goal is to unpivot the data on the Quarter columns when going from UNPIVOT_TEST into TRG_UNPIVOT_TEST as shown below


Follow these steps to add and configure an Unpivot Component in a Mapping:

  1. First add the Source table UNPIVOT_TEST into your Mapping, to do so drag and drop the datastore from the Models into the Mapping
  2. Next add an Unpivot component into the Mapping. This is done by clicking on the Unpivot Component in the Component palette and then clicking on the Mapping diagram. A new UNPIVOT component will appear in the Mapping:

  3. Drag and drop the YEAR column from UNPIVOT_TEST into the UNPIVOT component. There is no need to add the remaining source attributes yet, we will be using them in the Unpivot Transforms section later.

  4. Then click on the UNPIVOT component and select the Attributes panel. We will add 2 new attributes representing the row and value locator attributes: QUARTER (VARCHAR) and SALES (NUMERIC).

  5. Click on the General panel and select QUARTER in the Row Locator drop-down list. We will be performing the unpivot operation using this column.

  6. Next add the transformations to unpivot the data. Click on the Unpivot Transforms panel and create a transformation rule for each of the 4 quarters and match the appropriate QUARTER and SALES values as shown below:

  7. Finally add the Target datastore TRG_UNPIVOT_TEST and link the UNPIVOT component to it. Leave the default settings in the Attributes Matching panel and click OK

  8. In this example you can use the default Physical settings for your Mapping. Integration Type is set to Control Append by default and the IKM Oracle Insert is used
  9. Finally click on Run to execute the Mapping, 8 inserts are performed and you should see the following data in your target table:

  10. If you review the generated code you will notice that ODI leverages the UNPIVOT function on Oracle to perform such operation. The Unpivot component supports Oracle as well as any database

You can recreate the following example using the ODI 12c Getting Started VirtualBox image which is available on OTN: http://www.oracle.com/technetwork/middleware/data-integrator/odi-demo-2032565.html

Friday Feb 28, 2014

Pivoting Data in ODI 12c

We have recently added several new Mapping Components in Oracle Data Integrator 12c such as Pivot or Unpivot. In this blog post we will walk you through an example of how to use the new Pivot Component.

You can use the following SQL statements to recreate this example in your environment. It will create the source (PIVOT_TEST) and target (TRG_PIVOT_TEST) tables used in this article in your database then you can reverse engineer them in ODI.

CREATE TABLE pivot_test (
year NUMBER,
quarter VARCHAR2(255),
sales NUMBER
);

insert into pivot_test values (2012, 'Q1', 10.5);
insert into pivot_test values (2012, 'Q2', 11.4);
insert into pivot_test values (2012, 'Q3', 9.5);
insert into pivot_test values (2012, 'Q4', 8.7);
insert into pivot_test values (2013, 'Q1', 9.5);
insert into pivot_test values (2013, 'Q2', 10.5);
insert into pivot_test values (2013, 'Q3', 10.3);
insert into pivot_test values (2013, 'Q4', 7.6);
commit;

CREATE TABLE trg_pivot_test (
year NUMBER,
q1_sales NUMBER,
q2_sales NUMBER,
q3_sales NUMBER,
q4_sales NUMBER
);

Our goal is to pivot the data on the Quarter column when going from PIVOT_TEST into TRG_PIVOT_TEST as shown below:


Follow these steps to add and configure a Pivot Component in an ODI 12c Mapping:

  1. First add the Source table PIVOT_TEST into your Mapping, to do so drag and drop the PIVOT_TEST datastore from the Models into the Mapping.
  2. Next add a Pivot component into the Mapping. This is done by clicking on the Pivot Component in the Component palette and then clicking on the Mapping diagram. A new PIVOT component will appear in the Mapping:

  3. Drag and drop the YEAR column from PIVOT_TEST into the PIVOT component. There is no need to add the QUARTER and SALES attributes yet, they will be used later in the Row Locator and Attributes sections.

  4. Click on the PIVOT component and in the Properties window select the Row Locator panel. In our example the Row Locator will be the QUARTER column which is transposed from rows into 4 columns in our target table TRG_PIVOT_TEST.

  5. Open up the Expression Editor next to the Row Locator field and select the QUARTER column from our source table PIVOT_TEST. Then click OK.


  6. Now specify the various values the QUARTER column can take. This is done using the Row Locator Values table. Click on the + icon under Row Locator Values and add the 4 possible values: ‘Q1’, ‘Q2’, ‘Q3’ and ‘Q4’.

  7. Then click on the Attributes panel and add the 4 output attributes which correspond to each Row Locator values: Q1_SALES, Q2_SALES, Q3_SALES and Q4_SALES.

  8. Next select a Matching Row for the output attributes you just created. The Matching Row values come from the Row Locator Values entered earlier.
    Pick ‘Q1’ for Q1_SALES, ‘Q2’ for Q2_SALES, ‘Q3’ for Q3_SALES and ‘Q4’ for Q4_SALES.
    Finally enter an expression for each of the new attributes, use PIVOT_TEST.SALES for all of them as we are interested in getting the Sales data into those columns. You can type the expression using the Expression Editor or drag and drop the SALES column from PIVOT_TEST into each of the newly created attributes.

  9. Finally add the target table TRG_PIVOT_TEST and connect the PIVOT component to it. Unselect the Create Attributes on Source checkbox in the Attribute Matching window and click OK to finish the Mapping configuration.

  10. In this example you can use the default Physical settings for your Mapping. Integration Type is set to Control Append by default and the IKM Oracle Insert is used.
  11. Click on Run to execute the Mapping, 2 inserts are performed and you should see the following data in your target table.

  12. If you review the generated code you will notice that ODI leverages the PIVOT function on Oracle to perform such operation. The Pivot component supports Oracle as well as any other database supported by ODI 12c.

You can recreate the following example using the ODI 12c Getting Started VirtualBox image which is available on OTN: http://www.oracle.com/technetwork/middleware/data-integrator/odi-demo-2032565.html

Tuesday Feb 18, 2014

Recap of Oracle GoldenGate 12c Webcast with Q&A

Simply amazing! That’s how I would summarize last week’s webcast for Oracle GoldenGate 12c.  It was a very interactive event with hundreds of live attendees and hundreds of great questions. In the presentation part my colleagues, Doug Reid and Joe deBuzna, went over the new features of Oracle GoldenGate 12c. They explained Oracle GoldenGate 12c key new features including:

  • Integrated Delivery for Oracle Database,
  • Coordinated Delivery for non-Oracle databases,
  • Support for Oracle Database 12c multitenant architecture,
  • Enhanced high availability via integration with Oracle Data Guard Fast-Start Failover,
  • Expanded heterogeneity, i.e. support for new databases and operating systems,
  • Improved security,
  • Low-downtime database migration solutions for Oracle E-Business Suite,
  • Integration with Oracle Coherence.

We also had a nice long and live Q&A section. In the previous Oracle GoldenGate webcasts, we could not respond to all audience questions in a 10-15 minute timeframe at the end of the presentation. This time we kept the presentation part short and left more than 30 minutes for Q&A. To our surprise, we could not answer even half of the questions we received. 

If you missed this great webcast discussing the new features of Oracle GoldenGate 12c,  and more than 30 minutes of Q&A with GoldenGate Product Management, you can still watch it on demand via the link below.

On Demand Webcast: Introducing Oracle GoldenGate 12c: Extreme Performance Simplified

On this blog post I would like to provide brief answers from our PM team  for some of the questions that we were not able to answer during the live webcast.

1) Does Oracle GoldenGate replicate DDL statements or DML for Oracle Database?

    Oracle GoldenGate replicates DML and DDL operations for Oracle Database and Teradata.

2) Where do we get more info on how to setup integration with Data Guard Fast-Start Failover (FSFO)?

     Please see the following blog posts or documents on My Oracle Support:

Best Practice - Oracle GoldenGate and Oracle Data Guard - Switchover/Fail-over Operations for GoldenGate    [My Oracle Support Article ID   1322547.1] 

Best Practice - Oracle GoldenGate 11gr2 integrated extract and Oracle Data Guard - Switchover/Fail-over Operations  [My Oracle Support Article ID 1436913.1] 

3) Does GoldenGate support SQL Server 2012 extraction? In the past only apply was supported.

Yes, starting with the new 12c release GoldenGate captures from SQL Server 2012 in addition to delivery capabilities.

4) Which RDBMS does GoldenGate 12c support?

GoldenGate supports all major RDBMS. For a full list of supported platforms please see Oracle GoldenGate certification matrix.

5) Could you provide some more details please on Integrated Delivery for dynamic parallel threads at Target side?

Please check out our white papers on Oracle GoldenGate 12c resource kit for more details on the new features, and how Oracle GoldenGate 12c works with Oracle Database. 

6) What is the best way to sync partial data (based on some selection criterion) from a table between databases?

 Please refer to the article: How To Resync A Single Table With Minimum Impact To Other Tables' Replication? [Article ID 966211.1]

7) How can GoldenGate be better than database trigger to push data into custom tables?

Triggers can cause high CPU overhead, in some cases almost double compared to reading from redo or transaction logs. In addition, they are intrusive to the application and cause management overhead as application changes. Oracle GoldenGate's log-based change data capture is not only low-impact in terms of CPU utilization, but also non-intrusive to the application with low maintenance requirements.

8) Are there any customers in the manufacturing industry using GoldenGate and for which application?

We have many references in manufacturing. In fact, SolarWorld USA was our guest speaker in the executive video webcast last November. You can watch the interview here. RIM Blackberry uses Oracle GoldenGate for multi-master replication between its global manufacturing systems. Here is another manufacturing customer story from AkzoNobel.

9) Does GoldenGate 12c support compressed objects for replication? Also does it supports BLOB/CLOB columns?

Yes, GoldenGate 12c and GoldenGate 11gR2 both support compressed objects. GoldenGate has been supporting BLOB/CLOB columns since version 10.

10) Is Oracle Database 11.2.0.4 mandatory to use GoldenGate 12c Integrated Delivery? Not earlier versions?

Yes. To use GoldenGate 12c’s Integrated Delivery, for the target environment Oracle Database 11.2.04 and above is required .

11) We have Oracle Streams implementation for more than 5 years. We would like to migrate to GoldenGate, however older version of GoldenGate were not supporting filtering individual transactions. Is it supported in GoldenGate 12c?

      Yes, it is supported in GoldenGate 12c.


In future blog posts I will continue to provide answers for common questions we received in the webcast. In the meanwhile I highly recommend watching the Introducing Oracle GoldenGate 12c: Extreme Performance Simplified webcast on demand.

Friday Feb 14, 2014

ODI 12c - Table Functions, Parallel Unload to File and More

ODI 12c includes a new component for integrating and transformation data programmatically, there have been plenty of examples through the years of such implementations, recent examples include SQL access to R from Mark Hornick (see an example blog here). As well as a great integration technique they have fantastic performance and scalability options - hence you see posts and talks from Kuassi Mensah on in-database map-reduce; all about leveraging the Oracle database's parallel query engine and the skills you already have (SQL and PLSQL/java).

The table function component in ODI 12c lets you integrate an existing table function implementation into a flow - the parameters for the table function can be scalar or a ref cursor, you can see how the examples from the AMIS posting here are defined within the mapping designer below, there are multiple table functions chained together, used as both a data source and a transformation;

In the above image you can see the table function name defined in the database is specified in the component's general properties (property is Function Name). The signature for the function must be manually defined by adding input/output connector points and attributes. Check the AMIS blog and reflect on the design above.

Regarding performance, one of the examples I blogged (OWB here and ODI here) was parallel unload to file. The table function examples from those previous blogs were fairly rudimentary, in this blog we will see what happens when we tweak the implementation of such functions - we can get much better performance. Here is the table function implementation I will use within the ODI examples (the type definitions used come from the OWB blog post above).

  1. create or replace function ParallelUnloadX (r SYS_REFCURSOR) return NumSet 
  2.    PIPELINED  PARALLEL_ENABLE (PARTITION r BY ANY) is 
  3.    TYPE row_ntt IS TABLE OF VARCHAR2(32767);
  4.    v_rows row_ntt;
  5.    v_buffer VARCHAR2(32767);
  6.    i binary_integer := 0; 
  7.    v_lines pls_integer := 0;
  8.    c_eol CONSTANT VARCHAR2(1) := CHR(10); 
  9.    c_eollen CONSTANT PLS_INTEGER := LENGTH(c_eol); 
  10.    c_maxline CONSTANT PLS_INTEGER := 32767; 
  11.    out utl_file.file_type; 
  12.    filename varchar2(256) := 'dbunload'; 
  13.    directoryname varchar2(256) := 'MY_DIR'; 
  14.    vsid varchar2(120); 
  15. begin 
  16.    select sid into vsid from v$mystat where rownum=1; 
  17.    filename := filename || vsid || '.dat'; 
  18.    out := utl_file.fopen (directoryname, filename , 'w');

  19.    loop 
  20.      fetch r BULK COLLECT INTO v_rows; 
  21.      for i in 1..v_rows.COUNT LOOP
  22.        if LENGTH(v_buffer) + c_eollen + LENGTH (v_rows(i)) <= c_maxline THEN
  23.          v_buffer := v_buffer || c_eol || v_rows(i);
  24.        else
  25.          IF v_buffer IS NOT NULL then
  26.            utl_file.put_line(out, v_buffer);
  27.          end if;
  28.          v_buffer := v_rows(i);
  29.        end if;
  30.      end loop;
  31.      v_lines := v_lines + v_rows.COUNT;
  32.      exit when r%notfound;
  33.    end loop;
  34.    close r;
  35.    utl_file.put_line(out, v_buffer); 

  36.    utl_file.fclose(out); 
  37.    PIPE ROW(i); 
  38.    return ;
  39. end; 
  40. /

The function uses PARALLEL_ENABLE and PARTITION BY keywords - these 2 are critical to performance and scalability. In addition, this function is further optimized; it uses the PLSQL BULK COLLECT capability and also buffers data in PLSQL variables before writing to file (this avoids IO calls). This was not rocket science to tune (plenty of posts on PLSQL IO tuning such as this) yet you can see the impact it has on performance further below.

My mapping using the table function as a target is shown below, 

In the physical design I define the parallel hints, this will then perform parallel unloads to file and you can easily leverage the hardware and power of the Oracle database. Using the hints to tweak the physical design let's the designer very easily compare and tune performance - you do not have to design the parallelism in your own flows.

In the table below you can see the performance difference when I use the PARALLEL(4) hint on a 4 CPU machine;

No Hint PARALLEL(4)
5 million rows  16s  6s
32 million rows 200s  47s 

If I execute the agent based SQL to file LKM, the time taken out of the box is 398 seconds (slower than 47s above when a hint is used) on the 32 million row example, the only divide and conquer techniques with the LKM are building a custom workflow to do such. With the table function approach if your database is on a bigger, more powerful host you can easily take advantage of the system by tweaking the hints.

As you see, the ODI table function component provides another custom exit point in a flow which let's you not only provide some useful integration capabilities but you can also do it in a very efficient manner - leveraging and exploiting the database you are running on. Hopefully this gives you a little insight and an interesting use case or two.

Wednesday Feb 12, 2014

ODI 12c - Mapping SDK Auto Mapping

The ODI 12c release has the new flow based mapping designer, this comes with new concepts to make the mapping design as efficient as possible as well as the runtime execution of such! The 12c release also has a new SDK for mapping, the 11g SDK is still available for backwards compatibility, but if if you want to properly leverage the 12c release the new mapping designer and SDK is the way forward. I posted a bunch of SDK examples (here and here) which demonstrated different mapping designs - the examples were in groovy and the column/attribute level mapping expressions were all done explicitly, I did not illustrate any auto mapping capabilities. So... I thought I should do it here. In doing so I'll show some other APIs within the mapping area that are very useful.

The 12c release introduced mapping components and categorized such components so that we can minimize the column level mapping expressions. If you compare ODI 12c with OWB, ODI 12c has a lot less inter component and cross component information, OWB capture a lot of information in a very explicit manner (it was very verbose, concise, but verbose).

One of the useful capabilities in the UI is to perform auto mapping, the function createExpressions below will use all available in-scope attributes that are upstream from the target component and match with attributes in the component you are targeting. The match can be done by equality, ends or starts and ignore case or exact match. Quite a simple piece of code and you can see the use of the function getUpstreamLeafAttributes for components or even getUpstreamInScopeAttributes for connector points. Some components have multiple input connector points with different graphs, for example set component, other ones are simple.

  1. enum MatchTypes {EQUALS,SRCENDSWITH, TGTENDSWITH, SRCSTARTSWITH, TGTSTARTSWITH}
  2. enum MatchCaseTypes {MATCH,IGNORECASE}

  3. def createExpressions(component, conPoint, matchType, matchCaseType) { 
  4.   atts = null
  5.   if (conPoint != null)   atts = conPoint.getUpstreamInScopeAttributes()
  6.   else atts = component.getUpstreamLeafAttributes(component)
  7.   tatts = component.getAttributes()
  8.   for (MapAttribute tgt_attr : tatts) {
  9.     attr_str = tgt_attr.getName()
  10.     if (matchCaseType == MatchCaseTypes.IGNORECASE) {
  11.       attr_str = attr_str.toLowerCase()
  12.     }
  13.     sourceCol = null;
  14.     for (MapAttribute src_attr : atts) {
  15.       src_attr_str = src_attr.getName()
  16.       if (matchCaseType == MatchCaseTypes.IGNORECASE) {
  17.        src_attr_str = src_attr_str.toLowerCase()
  18.       }
  19.       if ( (matchType == MatchTypes.SRCENDSWITH && src_attr_str.endsWith( attr_str )) ||
  20.            (matchType == MatchTypes.SRCSTARTSWITH && src_attr_str.startsWith( attr_str )) ||
  21.            (matchType == MatchTypes.TGTSTARTSWITH && attr_str.startsWith( src_attr_str )) ||
  22.            (matchType == MatchTypes.TGTENDSWITH && attr_str.endsWith( src_attr_str )) ||
  23.            (matchType == MatchTypes.EQUALS && attr_str.equals( src_attr_str )) ) {
  24.        sourceCol = src_attr
  25.        break
  26.       }
  27.     }
  28.     if (sourceCol != null && conPoint != null)  tgt_attr.setExpression( conPoint, sourceCol, null )      
  29.     else if (sourceCol != null)  tgt_attr.setExpression( sourceCol )      
  30.   }
  31. }

You can then call this function on a datastore to auto map all attribute expressions in the component as follows;

  • createExpressions(tgtempDatastoreComponent, null,MatchTypes.EQUALS,MatchCaseTypes.MATCH);

To illustrate the set component, you can code the population of each connector point as follows;

  • createExpressions(setComponent, inConnectorPoint1,MatchTypes.EQUALS,MatchCaseTypes.MATCH);
  • createExpressions(setComponent, inConnectorPoint2,MatchTypes.EQUALS,MatchCaseTypes.IGNORECASE);

this will auto map the attributes in a set component for each connector point with different rules (just for illustration purposes). You can see below the result of calling these 2 functions on the set component for each connector point. All upstream in-scope attributes are considered.

These APIs to get scoping attributes make it simple to build customized accelerators for building expressions when auto mapping. Its a little different than in 11g, have a look at the examples I posted above and the snippets above, there's a lot you can do and its easy to utilize.

Wednesday Feb 05, 2014

Introducing Oracle GoldenGate 12c: Extreme Performance Simplified

Oracle GoldenGate 12c was released last fall with a long list of new features that simplify configuration and increase flexibility, while delivering easy-to-use, advanced solutions with multi-fold performance gain.  We have been discussing these new features in various blog posts including:

· Advanced Replication for The Masses – Oracle GoldenGate 12c for the Oracle Database.

· GoldenGate 12c - What is Coordinated Delivery?

· GoldenGate 12c - Coordinated Delivery Example

· Oracle GoldenGate 12c - Announcing Support for Microsoft and IBM

And you can find new white papers about the 12c release in the Oracle GoldenGate resource kit.

Following the executive video webcast launching Oracle Data Integrator 12c and Oracle GoldenGate 12c in November 2013, we have set up another webcast for Oracle GoldenGate 12c where our product management team discusses the key new features in more depth and takes live questions from the audience.

If you would like to learn more about GoldenGate 12c I invite you to join us on Feb 12th in a webcast with product experts. You can register for this free event via the link below.

Live Webcast: Introducing Oracle GoldenGate 12c: Extreme Performance Simplified

February 12, 2014 -  10am PT/ 1pm ET

If you have missed our executive launch webcast I highly recommend that you watch it on demand via the link below. It talks about the tighter integration between Oracle Data Integrator 12c and Oracle GoldenGate and features customer and partner speakers from SolarWorld, BT, and Rittman Mead Consulting.

On-Demand Video Webcast: Introducing 12c for Oracle Data Integration

Friday Jan 03, 2014

ODI 12c - Components and LKMs/IKMs

Here I'd like to illustrate some nice capabilities of ODI 12c's knowledge module framework in combination with the new component based mapper. Some of this was prompted from recent questions from Holger Friedrich (on migration from OWB..'where is delete'), Stewart Bryson and Mark Rittman (on components and KMs here) and a general nagging thought I had that people were generally unaware of the capabilities. There's a lot more we can illustrate and talk about in this area, so this is a little taster...

Customized loading, integration and error management can be defined using knowledge modules. There is a text based substitution library that makes building such knowledge modules very easy - especially with the abundance of examples - plus the substitution reference documentation is a good reference guide (intro here, reference here). This can be used in harmony with ODI 12c components. In 12c we have modularized a lot and introduced components (components are logical and describe the WHAT of the transformation, component KMs describe the HOW, just like regular KMs), also we have formalized some parts of the framework but it is still as open as ever.

In supporting the odiRef substitution library we have chosen a similar approach to how the sub-select worked in 11g. If you want to build your own integration you can use the odiRef methods to get the target shape, details, connection etc. The source for the target, just like in 11g is provided via the odiRef methods also and may be rolled into the odiRef.getFrom method.

To illustrate, if you want to perform an integration to do a delete operation, then the following snippet is sufficient - in 11g or 12c (the code in red is boiler plate). In ODI 12c you can now have an arbitrary mapping graph leading up to the target;

  1. DELETE FROM <%=odiRef.getTable("L","TARG_NAME","A")%> T
  2. WHERE (<%=odiRef.getTargetColList("", "[COL_NAME]", ", ", "\n", "UK")%> )
  3. IN
  4. (
  5. select * from (
  6.  <%for (int i=odiRef.getDataSetMin(); i <= odiRef.getDataSetMax(); i++){%>
  7.  <%=odiRef.getDataSet(i, "Operator")%>
  8.    SELECT  <%=snpRef.getColList(i,"", "\t[COL_NAME]", ",\n", "", "UK")%>
  9.    FROM <%=odiRef.getFrom(i)%>
  10.    WHERE (1=1) <%=snpRef.getJoin(i)%> <%=snpRef.getFilter(i)%> <%=snpRef.getGrpBy(i)%> <%=snpRef.getHaving(i)%>
  11.  <%}%>
  12. ) S
  13. )

For example in the mapping below, I am using the subquery filter component to identify some rows that I then want to DELETE from the CONTACT_LIST target. The integration type property on target datastores was introduced in 12c and supports a set number of integrations to help filter the IKMs, if you select None, you can pick any IKM applicable for the technology.

If you look at the physical design for the above mapping you can now pick the IKM which performs the delete, the subquery filter also has a component KM which produces code, this code is rolled into the odiRef methods in the template. This then lets arbitrary map designs to be created and still build customized KMs for loading, integrating and error management. The odiRef substitution methods are simple text based APIs for producing text based on simple primitives.

This then produces the following SQL statement to perform the DELETE DML, our IKM has no knowledge of how to construct subquery filter, pivot or any other complex transformation - the code has been modularized. The code produced is below;
  1. DELETE FROM OE.CONTACT_LIST T WHERE (CUSTOMER_ID)
  2. IN (
  3. SELECT   CUSTOMER_ID
  4. FROM OE.ADDRESSES ADDRESSES
  5. WHERE (1=1)   and ( ( ADDRESSES.CUSTOMER_ID ,  ADDRESSES.PHONE_NUMBER )
  6.                = SOME  (  
  7. SELECT 
  8.   CUSTOMERS.CUSTOMER_ID  CUSTOMER_ID ,   CUSTOMERS.CUST_FIRST_NAME  PHONE_NUMBER  
  9. FROM
  10.  OE.CUSTOMERS CUSTOMERS  
  11.  WHERE  CUSTOMERS.MARITAL_STATUS = 'married' AND CUSTOMERS.CREDIT_LIMIT > 1000  AND  ADDRESSES.CUSTOMER_ID=CUSTOMERS.CUSTOMER_ID 
  12.     )
  13.    )  
  14. )

Another useful illustration of things working in harmony is related to the improved code generation for Oracle connectivity - the basics of database links have been greatly improved (no more requirements for view on source, but still supported). The framework change in order to support this will have more use cases down the line too. Now you can use the 12c database link LKMs to pull data from tables over a database link and leverage custom IKMs on the target. One of the 11g issues was related to the requirements the 11g Oracle to Oracle database link LKM placed on users, this has been greatly improved, you can now take advantage of this, plus use existing IKMs, build new ones etc.

The mapping above uses new 12c LKMs to access remote Oracle datastores and a customized IKM to illustrate the mix and match capabilities. 

Hopefully this gives you some background and insight into the ODI 12c mapping and knowledge module capabilities that you weren't aware of. Looking forward to any questions and additional ideas, insights that you have.

Tuesday Dec 17, 2013

ODI 12c - Slowly Changing Dimensions

Here we see how to setup a slowly changing dimension load in ODI 12c, everything from defining the metadata on the datastore to loading the data. I illustrated how this was done in ODI 11g here, for ODI 12c this has been simplified, the ODI 12c viewlet here provides a very quick look at setting up a datastore for supporting slowly changing dimension data loading. It uses the IKM Slowly Changing Dimension and shows dimension members being versioned when for example a marital status change happens.

The datastore editor has been enhanced to be able to quickly annotate attributes in the datastore with SCD metadata - information such as the surrogate key attribute, or the attribute that triggers history.

This used to be a particular pain to setup, so now you can quickly define this information and also get a convenient display/overview of the attributes and what they mean and are used for. Check out the viewlet above, see how it hangs together. I plan on illustrating thus further, so we have seen how you can load slowly changing dimensions for a data warehouse, we can take this even further and use the temporal validity feature of the Oracle 12c database - how you load temporal data, what does the KM give you etc. 

Wednesday Dec 11, 2013

GoldenGate 12c - What is Coordinated Delivery?

In Oracle GoldenGate 12c, the parallel apply is improved with two features: Integrated Delivery and Coordinated Delivery. Different from the Integrated Delivery, Coordinated Delivery is not limited to Oracle Database.This blog discusses the new Coordinated Delivery feature by answering two questions:


  • Why do I need Coordinated Delivery?

  • What is Coordinated Delivery?
  • [Read More]

    Thursday Nov 21, 2013

    Recap of Oracle GoldenGate 12c and Oracle Data Integrator 12c Launch Webcast

    Last week we had a great video webcast for Oracle GoldenGate 12c and Oracle Data Integrator 12c. Our development executives, Brad Adelberg and Tim Hall,  talked about the new features and how the new release helps with delivering future-ready data integration solutions with extreme performance and high IT productivity. 

    If you missed the webcast you can watch it on demand via the following page:
    On-Demand Video Webcast: Introducing 12c for Oracle Data Integration

    In previous blogs we have talked about the new features of Oracle GoldenGate 12c and Oracle Data Integrator 12c, so I am not going to repeat them here. But worth repeating are what our customers and partners say about the new 12c release.

    SolarWorld’s Senior Database Administrator, Russ Toyama was in the studio with our executives for the launch webcast, and Russ discussed their GoldenGate implementation for SolarWorld's manufacturing process. SolarWorld is the largest U.S. solar panel manufacturer for more than 35 years and needed operational intelligence to continuously improve the quality of its products, while ensuring the systems operate with high performance and stability.

    SolarWorld uses GoldenGate to move data from multiple manufacturing databases into a single decision support system database in real time, freeing up the OLTP systems for transaction processing, which improves their performance and stability. The DSS database is very flexible in meeting reporting needs, and provides a comprehensive view of multiple manufacturing processes. This provides the traditional roles of reporting and engineering analysis to continuously improve product quality, yield and efficiency, and enables real time monitoring of the production manufacturing process. Using this real-time monitoring capability SolarWorld is able to detect the deviations from the norm right away, and take action to remedy or understand the situation. This not only improves production quality, but also improves cost management. The manufacturing process is a series of steps building upon the previous, so if there is an issue, it needs to be corrected as soon as possible to prevent waste and reduce manufacturing costs.

    For GoldenGate 12c we heard from Surren Partabh, CTO for Technology Services for BT. Surren explained the role of Oracle GoldenGate for their private cloud initiative and how they have improved customer experience and availability, while managing costs as well.  Surren highlighted that Oracle's data integration product family is one of the cornerstones for BT"s cloud migration project and enable to migrate to cloud simply, and in an agile manner. BT used GoldenGate to build a replication hub to help with the migration from legacy systems to the cloud with a reliable fallback strategy. Surren also commented on the tight integration between GoldenGate 12c and Oracle Data Integrator 12c,  saying that it is a "step in the right direction" as it simplifies the actual installation, configuration, management, and monitoring of solutions. You can watch the interview with BT's Surren Partabh here.

    In the launch webcast we also heard from Mark Rittman, CTO of Rittman Mead Consulting. Mark talked about the Oracle Data Integrator 12c new features in great depth, given that he was closely involved with the Beta testing program. Mark shared his opinion that the new flow-based design interface is the most critical feature of this 12c release and will bring major productivity gains for developers. He added that interoperability with Oracle Warehouse Builder for easier migration and tighter integration with Oracle GoldenGate are very valuable for customers. Mark also discussed the new release of Oracle BI Applications and how its use of Oracle Data Integrator for data movement and transformations simplifies life for Oracle BI Apps customers.  The complete interview with Mark Rittman is available for you as well.

    If you missed the launch webcast last week, I hope you take the time to watch it on demand and discover how Oracle has changed the data integration and replication technology space with Oracle Data Integrator 12c and Oracle GoldenGate 12c. For more information including white papers and podcasts you can also download free resources.

    Wednesday Nov 06, 2013

    Oracle Data Integration 12c: Perspectives of Industry Experts, Customers and Partners

    As you may have seen from our recent blog posts on Oracle Data Integrator 12c and Oracle GoldenGate 12c, we are very excited to share with you the great new features the 12c release brings to Oracle’s data integration solutions. And, fortunately we are not alone in this sentiment. Since the press announcement October 17th, which incorporates our customers' and experts' testimonials, we have seen positive comments in leading technology publications and social media as well. Here are some examples:

    • In CIO and PCWorld you can find Joab Jackson’s article, Oracle Data Integrator 12c ready for real-time analysis, where wrote about the tight integration between Oracle Data Integrator and Oracle GoldenGate . He noted “Heeding the call from enterprise customers who clamor for more immediacy in their data-driven reports, Oracle has updated its data-integration software portfolio so that it can more rapidly deliver data to data warehouses and analysis applications.”
    • Integration Developer News’ Vance McCarthy wrote the article Oracle Ships ‘Future Proofs’ Integration Tools for Traditional, Cloud, Big Data, Real-Time Projects and mentioned that “Oracle Data Integrator 12c and Oracle GoldenGate 12c sport a wide range of improvements to let devs more easily deliver data integration for cloud, analytics, big data and other new projects that leverage multiple datasets for business.
    • InformationWeek’s Doug Henschen gave a great overview to several key features including the new flow-based UI in Oracle Data Integrator. Doug said “Oracle Data Integrator 12c introduces a complete makeover of the job-building experience, while real-time oriented GoldenGate 12c introduces performance gains “.
    • In Database Trends and Applications’ article Oracle Strengthens Data Integration with Release of Oracle Data Integrator 12c and Oracle GoldenGate 12c highlighted the productivity aspect of the new solution with his remarks: “tight integration between Oracle Data Integrator 12c and Oracle GoldenGate 12c enables developers to leverage Oracle GoldenGate’s low overhead, real-time change data capture completely within the Oracle Data Integrator Studio without additional training”.
    We are also thrilled about what our customers and partners have to say about our products and the new release. And we are equally excited to share those perspectives with you in our upcoming launch video webcast on November 12th. SolarWorld Industries America’s Senior Database Manager, Russ Toyama will join our executives in our studio in Redwood Shores to discuss GoldenGate’s core benefits and the new release, while Surren Partharb, CTO of Strategic Technology Services for BT, and Mark Rittman, CTO of Rittman Mead, will provide their comments via the interviews conducted in the UK. This interactive panel discussion in the video webcast will unveil the new release with the expertise of our development executives and the great insight from our customers and partners. In addition, our product experts will be available online to answer chat questions. This is really a great opportunity to learn how Oracle's data integration offering has changed the integration and replication technology space with the new release, and established itself as the new leader.

    If you have not registered for this free event yet, you can do so via this link. We will run the live event at 8am PT/4pm GMT, followed by a replay of the event with live chat for Q&A  at 10am PT/6pm GMT. The replay will be available on-demand for those who register but cannot attend either session on November 12th.

    Tuesday Nov 05, 2013

    ODI 12c - Aggregating Data

    This posting will look at the aggregation component that was introduced in ODI 12c. For many ETL tool users this shouldn't be a big surprise, its a little different than ODI 11g but for good reason. You can use this component for composing data with relational like operations such as sum, average and so forth. Also, Oracle SQL supports special functions called Analytic SQL functions, you can use a specially configured aggregation component or the expression component for these now in ODI 12c. In database systems an aggregate transformation is a transformation where the values of multiple rows are grouped together as input on certain criteria to form a single value of more significant meaning - that's exactly the purpose of the aggregate component.

    In the image below you can see the aggregate component in action within a mapping, for how this and a few other examples are built look at the ODI 12c Aggregation Viewlet here - the viewlet illustrates a simple aggregation being built and then some Oracle analytic SQL such as AVG(EMP.SAL) OVER (PARTITION BY EMP.DEPTNO) built using both the aggregate component and the expression component.

    In 11g you used to just write the aggregate expression directly on the target, this made life easy for some cases, but it wan't a very obvious gesture plus had other drawbacks with ordering of transformations (agg before join/lookup. after set and so forth) and supporting analytic SQL for example - there are a lot of postings from creative folks working around this in 11g - anything from customizing KMs, to bypassing aggregation analysis in the ODI code generator.

    The aggregate component has a few interesting aspects.

    1. Firstly and foremost it defines the attributes projected from it - ODI automatically will perform the grouping all you do is define the aggregation expressions for those columns aggregated. In 12c you can control this automatic grouping behavior so that you get the code you desire, so you can indicate that an attribute should not be included in the group by, that's what I did in the analytic SQL example using the aggregate component.

    2. The component has a few other properties of interest; it has a HAVING clause and a manual group by clause. The HAVING clause includes a predicate used to filter rows resulting from the GROUP BY clause. Because it acts on the results of the GROUP BY clause, aggregation functions can be used in the HAVING clause predicate, in 11g the filter was overloaded and used for both having clause and filter clause, this is no longer the case. If a filter is after an aggregate, it is after the aggregate (not sometimes after, sometimes having). 

    3. The manual group by clause let's you use special database grouping grammar if you need to. For example Oracle has a wealth of highly specialized grouping capabilities for data warehousing such as the CUBE function. If you want to use specialized functions like that you can manually define the code here. The example below shows the use of a manual group from an example in the Oracle database data warehousing guide where the SUM aggregate function is used along with the CUBE function in the group by clause.

    The SQL I am trying to generate looks like the following from the data warehousing guide;

    1. SELECT channel_desc, calendar_month_desc, countries.country_iso_code,
    2.       TO_CHAR(SUM(amount_sold), '9,999,999,999') SALES$
    3. FROM sales, customers, times, channels, countries
    4. WHERE sales.time_id=times.time_id AND sales.cust_id=customers.cust_id AND
    5.   sales.channel_id= channels.channel_id
    6.  AND customers.country_id = countries.country_id
    7.  AND channels.channel_desc IN
    8.   ('Direct Sales', 'Internet') AND times.calendar_month_desc IN
    9.   ('2000-09', '2000-10') AND countries.country_iso_code IN ('GB', 'US')
    10. GROUP BY CUBE(channel_desc, calendar_month_desc, countries.country_iso_code);

    I can capture the source datastores, the filters and joins using ODI's dataset (or as a traditional flow) which enables us to incrementally design the mapping and the aggregate component for the sum and group by as follows;

    In the above mapping you can see the joins and filters declared in ODI's dataset, allowing you to capture the relationships of the datastores required in an entity-relationship style just like ODI 11g. The mix of ODI's declarative design and the common flow design provides for a familiar design experience. The example below illustrates flow design (basic arbitrary ordering) - a table load where only the employees who have maximum commission are loaded into a target. The maximum commission is retrieved from the bonus datastore and there is a look using employees as the driving table and only those with maximum commission projected.

    Hopefully this has given you a taster for some of the new capabilities provided by the aggregate component in ODI 12c. In summary, the actions should be much more consistent in behavior and more easily discoverable for users, the use of the components in a flow graph also supports arbitrary designs and the tool (rather than the interface designer) takes care of the realization using ODI's knowledge modules.

    Interested to know if a deep dive into each component is interesting for folks. Any thoughts? 

    About

    Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products-- including Oracle Data Integrator, Oracle GoldenGate and Oracle Enterprise Data Quality

    Search

    Archives
    « April 2014
    SunMonTueWedThuFriSat
      
    2
    3
    5
    6
    7
    8
    9
    10
    12
    13
    14
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
       
           
    Today