Friday Feb 28, 2014

Pivoting Data in ODI 12c

We have recently added several new Mapping Components in Oracle Data Integrator 12c such as Pivot or Unpivot. In this blog post we will walk you through an example of how to use the new Pivot Component.

You can use the following SQL statements to recreate this example in your environment. It will create the source (PIVOT_TEST) and target (TRG_PIVOT_TEST) tables used in this article in your database then you can reverse engineer them in ODI.

CREATE TABLE pivot_test (
year NUMBER,
quarter VARCHAR2(255),
sales NUMBER

insert into pivot_test values (2012, 'Q1', 10.5);
insert into pivot_test values (2012, 'Q2', 11.4);
insert into pivot_test values (2012, 'Q3', 9.5);
insert into pivot_test values (2012, 'Q4', 8.7);
insert into pivot_test values (2013, 'Q1', 9.5);
insert into pivot_test values (2013, 'Q2', 10.5);
insert into pivot_test values (2013, 'Q3', 10.3);
insert into pivot_test values (2013, 'Q4', 7.6);

CREATE TABLE trg_pivot_test (
year NUMBER,
q1_sales NUMBER,
q2_sales NUMBER,
q3_sales NUMBER,
q4_sales NUMBER

Our goal is to pivot the data on the Quarter column when going from PIVOT_TEST into TRG_PIVOT_TEST as shown below:

Follow these steps to add and configure a Pivot Component in an ODI 12c Mapping:

  1. First add the Source table PIVOT_TEST into your Mapping, to do so drag and drop the PIVOT_TEST datastore from the Models into the Mapping.
  2. Next add a Pivot component into the Mapping. This is done by clicking on the Pivot Component in the Component palette and then clicking on the Mapping diagram. A new PIVOT component will appear in the Mapping:

  3. Drag and drop the YEAR column from PIVOT_TEST into the PIVOT component. There is no need to add the QUARTER and SALES attributes yet, they will be used later in the Row Locator and Attributes sections.

  4. Click on the PIVOT component and in the Properties window select the Row Locator panel. In our example the Row Locator will be the QUARTER column which is transposed from rows into 4 columns in our target table TRG_PIVOT_TEST.

  5. Open up the Expression Editor next to the Row Locator field and select the QUARTER column from our source table PIVOT_TEST. Then click OK.

  6. Now specify the various values the QUARTER column can take. This is done using the Row Locator Values table. Click on the + icon under Row Locator Values and add the 4 possible values: ‘Q1’, ‘Q2’, ‘Q3’ and ‘Q4’.

  7. Then click on the Attributes panel and add the 4 output attributes which correspond to each Row Locator values: Q1_SALES, Q2_SALES, Q3_SALES and Q4_SALES.

  8. Next select a Matching Row for the output attributes you just created. The Matching Row values come from the Row Locator Values entered earlier.
    Pick ‘Q1’ for Q1_SALES, ‘Q2’ for Q2_SALES, ‘Q3’ for Q3_SALES and ‘Q4’ for Q4_SALES.
    Finally enter an expression for each of the new attributes, use PIVOT_TEST.SALES for all of them as we are interested in getting the Sales data into those columns. You can type the expression using the Expression Editor or drag and drop the SALES column from PIVOT_TEST into each of the newly created attributes.

  9. Finally add the target table TRG_PIVOT_TEST and connect the PIVOT component to it. Unselect the Create Attributes on Source checkbox in the Attribute Matching window and click OK to finish the Mapping configuration.

  10. In this example you can use the default Physical settings for your Mapping. Integration Type is set to Control Append by default and the IKM Oracle Insert is used.
  11. Click on Run to execute the Mapping, 2 inserts are performed and you should see the following data in your target table.

  12. If you review the generated code you will notice that ODI leverages the PIVOT function on Oracle to perform such operation. The Pivot component supports Oracle as well as any other database supported by ODI 12c.

You can recreate the following example using the ODI 12c Getting Started VirtualBox image which is available on OTN:

Monday Feb 24, 2014

Highlighting Oracle Data Integrator 12c (ODI12c)

Towards the last two months of 2013 we highlighted several features of ODI12c's various features with full length blogs for each of the features. This was so popular that we bring you a one stop shop where you can browse through the various entries at your convenience. This is a great page to bookmark, even if we say it ourselves if you are using or thinking of using ODI12c.

Blog Title

Blog Description

Kicking off the ODI12c Blog Series

Talks about ODI12c themes and features at a high level shedding light on the new releases focus areas.

ODI 12c's Mapping Designer - Combining Flow Based and Expression Based Mapping

Talks about ODI's new declarative designer with the familiar flow based designer.

Big Data Matters with ODI12c

Talks about ODI12c enterprise solutions for the movement, translation and transformation of information and data heterogeneously and in Big Data Environments.

ODI 12c - Aggregating Data

Look at the aggregation component that was introduced in ODI 12c for composing data with relational like operations such as sum, average and so forth.

ODI 12c - Parallel Table Load

Looks at the ODI 12c capability of parallel table load from the aspect of the mapping developer and the knowledge module developer - two quite different viewpoints.

In-Session Parallelism in ODI12c

Discusses the new in-session parallelism, the intelligence to concurrently execute part of the mappings that are independent of each other, introduced in the ODI12c release.

ODI 12c - Mapping SDK the ins and outs

Talks about the ODI 12c SDK that provides a mechanism to accelerate data integration development using patterns and the APIs in the SDK

ODI 12c - XML improvements

Explains ODI support to advanced XML Schema constructs including union, list, substitution groups, mixed content, and annotations.

ODI 12c - Components and LKMs/IKMs

Illustrates capabilities of ODI 12c's knowledge module framework in combination with the new component based mapper.

Welcome Oracle Management Pack for Oracle Data Integrator! Let’s maximize the value of your Oracle Data Integrator investments!

To help you make the most of Oracle Data Integrator, and to deliver a superior ownership experience in an effort to minimize systems management costs, Oracle recently released Oracle Management Pack for Oracle Data Integrator.

ODI 12c - Mapping SDK Auto Mapping

If you want to properly leverage the 12c release the new mapping designer and SDK is the way forward.

ODI 12c - Table Functions, Parallel Unload to File and More

Helps you integrate an existing table function implementation into a flow.

And below is a list of “how to” and “hands on” blogs about ODI 12c and to get started.

ODI 12c - Getting up and running fast

A quick A-B-C to show you how to quickly get up and running with ODI 12c, from getting the software to creating a repository via wizard or the command line, then installing an agent for running load plans and the like.

Time to Get Started with Oracle Data Integrator 12c!

We would like to highlight for you a great place to begin your journey with ODI.  Here you will find the Getting Started section for ODI,

ODI 12c - Slowly Changing Dimensions

Helps setup a slowly changing dimension load in ODI 12c, everything from defining the metadata on the datastore to loading the data.

ODI 12c - Temporal Data Loading

The temporal validity feature in 12c of the Oracle Database is a great feature for any time based data. If you are thinking dimensional data that varies over time.... the temporal validity capabilities of 12c are a great fit, worth checking it out.

ODI 12.1.2 Demo on the Oracle BigDataLite Virtual Machine

Oracle's big data team has just announced the Oracle BigDataLite Virtual Machine, a pre-built environment to get you started on an environment reflecting the core software of Oracle's Big Data Appliance 2.4. BigDataLite is a VirtualBox VM that contains a fully configured Cloudera Hadoop distribution CDH 4.5, an Oracle DB 12c, Oracle's Big Data Connectors, Oracle Data Integrator 12.1.2, and other software.

Webcast - Oracle Data Integrator 12c and Oracle Warehouse Builder

If you missed the recent Oracle Data Integrator 12c and Oracle Warehouse builder live webcast. You can catch up on the events and connect with us with your feedback here. Here we discuss customer examples,ODI12c new features, Big Data compatibility, Oracle Warehouse Builder Migration Utility and Support and live Q and A among other topics.  

 For more information on Oracle Data Integrator visit the ODI Resource Center.

Tuesday Feb 18, 2014

Recap of Oracle GoldenGate 12c Webcast with Q&A

Simply amazing! That’s how I would summarize last week’s webcast for Oracle GoldenGate 12c.  It was a very interactive event with hundreds of live attendees and hundreds of great questions. In the presentation part my colleagues, Doug Reid and Joe deBuzna, went over the new features of Oracle GoldenGate 12c. They explained Oracle GoldenGate 12c key new features including:

  • Integrated Delivery for Oracle Database,
  • Coordinated Delivery for non-Oracle databases,
  • Support for Oracle Database 12c multitenant architecture,
  • Enhanced high availability via integration with Oracle Data Guard Fast-Start Failover,
  • Expanded heterogeneity, i.e. support for new databases and operating systems,
  • Improved security,
  • Low-downtime database migration solutions for Oracle E-Business Suite,
  • Integration with Oracle Coherence.

We also had a nice long and live Q&A section. In the previous Oracle GoldenGate webcasts, we could not respond to all audience questions in a 10-15 minute timeframe at the end of the presentation. This time we kept the presentation part short and left more than 30 minutes for Q&A. To our surprise, we could not answer even half of the questions we received. 

If you missed this great webcast discussing the new features of Oracle GoldenGate 12c,  and more than 30 minutes of Q&A with GoldenGate Product Management, you can still watch it on demand via the link below.

On Demand Webcast: Introducing Oracle GoldenGate 12c: Extreme Performance Simplified

On this blog post I would like to provide brief answers from our PM team  for some of the questions that we were not able to answer during the live webcast.

1) Does Oracle GoldenGate replicate DDL statements or DML for Oracle Database?

    Oracle GoldenGate replicates DML and DDL operations for Oracle Database and Teradata.

2) Where do we get more info on how to setup integration with Data Guard Fast-Start Failover (FSFO)?

     Please see the following blog posts or documents on My Oracle Support:

Best Practice - Oracle GoldenGate and Oracle Data Guard - Switchover/Fail-over Operations for GoldenGate    [My Oracle Support Article ID   1322547.1] 

Best Practice - Oracle GoldenGate 11gr2 integrated extract and Oracle Data Guard - Switchover/Fail-over Operations  [My Oracle Support Article ID 1436913.1] 

3) Does GoldenGate support SQL Server 2012 extraction? In the past only apply was supported.

Yes, starting with the new 12c release GoldenGate captures from SQL Server 2012 in addition to delivery capabilities.

4) Which RDBMS does GoldenGate 12c support?

GoldenGate supports all major RDBMS. For a full list of supported platforms please see Oracle GoldenGate certification matrix.

5) Could you provide some more details please on Integrated Delivery for dynamic parallel threads at Target side?

Please check out our white papers on Oracle GoldenGate 12c resource kit for more details on the new features, and how Oracle GoldenGate 12c works with Oracle Database. 

6) What is the best way to sync partial data (based on some selection criterion) from a table between databases?

 Please refer to the article: How To Resync A Single Table With Minimum Impact To Other Tables' Replication? [Article ID 966211.1]

7) How can GoldenGate be better than database trigger to push data into custom tables?

Triggers can cause high CPU overhead, in some cases almost double compared to reading from redo or transaction logs. In addition, they are intrusive to the application and cause management overhead as application changes. Oracle GoldenGate's log-based change data capture is not only low-impact in terms of CPU utilization, but also non-intrusive to the application with low maintenance requirements.

8) Are there any customers in the manufacturing industry using GoldenGate and for which application?

We have many references in manufacturing. In fact, SolarWorld USA was our guest speaker in the executive video webcast last November. You can watch the interview here. RIM Blackberry uses Oracle GoldenGate for multi-master replication between its global manufacturing systems. Here is another manufacturing customer story from AkzoNobel.

9) Does GoldenGate 12c support compressed objects for replication? Also does it supports BLOB/CLOB columns?

Yes, GoldenGate 12c and GoldenGate 11gR2 both support compressed objects. GoldenGate has been supporting BLOB/CLOB columns since version 10.

10) Is Oracle Database mandatory to use GoldenGate 12c Integrated Delivery? Not earlier versions?

Yes. To use GoldenGate 12c’s Integrated Delivery, for the target environment Oracle Database 11.2.04 and above is required .

11) We have Oracle Streams implementation for more than 5 years. We would like to migrate to GoldenGate, however older version of GoldenGate were not supporting filtering individual transactions. Is it supported in GoldenGate 12c?

      Yes, it is supported in GoldenGate 12c.

In future blog posts I will continue to provide answers for common questions we received in the webcast. In the meanwhile I highly recommend watching the Introducing Oracle GoldenGate 12c: Extreme Performance Simplified webcast on demand.

Friday Feb 14, 2014

ODI 12c - Table Functions, Parallel Unload to File and More

ODI 12c includes a new component for integrating and transformation data programmatically, there have been plenty of examples through the years of such implementations, recent examples include SQL access to R from Mark Hornick (see an example blog here). As well as a great integration technique they have fantastic performance and scalability options - hence you see posts and talks from Kuassi Mensah on in-database map-reduce; all about leveraging the Oracle database's parallel query engine and the skills you already have (SQL and PLSQL/java).

The table function component in ODI 12c lets you integrate an existing table function implementation into a flow - the parameters for the table function can be scalar or a ref cursor, you can see how the examples from the AMIS posting here are defined within the mapping designer below, there are multiple table functions chained together, used as both a data source and a transformation;

In the above image you can see the table function name defined in the database is specified in the component's general properties (property is Function Name). The signature for the function must be manually defined by adding input/output connector points and attributes. Check the AMIS blog and reflect on the design above.

Regarding performance, one of the examples I blogged (OWB here and ODI here) was parallel unload to file. The table function examples from those previous blogs were fairly rudimentary, in this blog we will see what happens when we tweak the implementation of such functions - we can get much better performance. Here is the table function implementation I will use within the ODI examples (the type definitions used come from the OWB blog post above).

  1. create or replace function ParallelUnloadX (r SYS_REFCURSOR) return NumSet 
  3.    TYPE row_ntt IS TABLE OF VARCHAR2(32767);
  4.    v_rows row_ntt;
  5.    v_buffer VARCHAR2(32767);
  6.    i binary_integer := 0; 
  7.    v_lines pls_integer := 0;
  8.    c_eol CONSTANT VARCHAR2(1) := CHR(10); 
  9.    c_eollen CONSTANT PLS_INTEGER := LENGTH(c_eol); 
  10.    c_maxline CONSTANT PLS_INTEGER := 32767; 
  11.    out utl_file.file_type; 
  12.    filename varchar2(256) := 'dbunload'; 
  13.    directoryname varchar2(256) := 'MY_DIR'; 
  14.    vsid varchar2(120); 
  15. begin 
  16.    select sid into vsid from v$mystat where rownum=1; 
  17.    filename := filename || vsid || '.dat'; 
  18.    out := utl_file.fopen (directoryname, filename , 'w');

  19.    loop 
  20.      fetch r BULK COLLECT INTO v_rows; 
  21.      for i in 1..v_rows.COUNT LOOP
  22.        if LENGTH(v_buffer) + c_eollen + LENGTH (v_rows(i)) <= c_maxline THEN
  23.          v_buffer := v_buffer || c_eol || v_rows(i);
  24.        else
  25.          IF v_buffer IS NOT NULL then
  26.            utl_file.put_line(out, v_buffer);
  27.          end if;
  28.          v_buffer := v_rows(i);
  29.        end if;
  30.      end loop;
  31.      v_lines := v_lines + v_rows.COUNT;
  32.      exit when r%notfound;
  33.    end loop;
  34.    close r;
  35.    utl_file.put_line(out, v_buffer); 

  36.    utl_file.fclose(out); 
  37.    PIPE ROW(i); 
  38.    return ;
  39. end; 
  40. /

The function uses PARALLEL_ENABLE and PARTITION BY keywords - these 2 are critical to performance and scalability. In addition, this function is further optimized; it uses the PLSQL BULK COLLECT capability and also buffers data in PLSQL variables before writing to file (this avoids IO calls). This was not rocket science to tune (plenty of posts on PLSQL IO tuning such as this) yet you can see the impact it has on performance further below.

My mapping using the table function as a target is shown below, 

In the physical design I define the parallel hints, this will then perform parallel unloads to file and you can easily leverage the hardware and power of the Oracle database. Using the hints to tweak the physical design let's the designer very easily compare and tune performance - you do not have to design the parallelism in your own flows.

In the table below you can see the performance difference when I use the PARALLEL(4) hint on a 4 CPU machine;

5 million rows  16s  6s
32 million rows 200s  47s 

If I execute the agent based SQL to file LKM, the time taken out of the box is 398 seconds (slower than 47s above when a hint is used) on the 32 million row example, the only divide and conquer techniques with the LKM are building a custom workflow to do such. With the table function approach if your database is on a bigger, more powerful host you can easily take advantage of the system by tweaking the hints.

As you see, the ODI table function component provides another custom exit point in a flow which let's you not only provide some useful integration capabilities but you can also do it in a very efficient manner - leveraging and exploiting the database you are running on. Hopefully this gives you a little insight and an interesting use case or two.

Wednesday Feb 12, 2014

ODI 12c - Mapping SDK Auto Mapping

The ODI 12c release has the new flow based mapping designer, this comes with new concepts to make the mapping design as efficient as possible as well as the runtime execution of such! The 12c release also has a new SDK for mapping, the 11g SDK is still available for backwards compatibility, but if if you want to properly leverage the 12c release the new mapping designer and SDK is the way forward. I posted a bunch of SDK examples (here and here) which demonstrated different mapping designs - the examples were in groovy and the column/attribute level mapping expressions were all done explicitly, I did not illustrate any auto mapping capabilities. So... I thought I should do it here. In doing so I'll show some other APIs within the mapping area that are very useful.

The 12c release introduced mapping components and categorized such components so that we can minimize the column level mapping expressions. If you compare ODI 12c with OWB, ODI 12c has a lot less inter component and cross component information, OWB capture a lot of information in a very explicit manner (it was very verbose, concise, but verbose).

One of the useful capabilities in the UI is to perform auto mapping, the function createExpressions below will use all available in-scope attributes that are upstream from the target component and match with attributes in the component you are targeting. The match can be done by equality, ends or starts and ignore case or exact match. Quite a simple piece of code and you can see the use of the function getUpstreamLeafAttributes for components or even getUpstreamInScopeAttributes for connector points. Some components have multiple input connector points with different graphs, for example set component, other ones are simple.

  2. enum MatchCaseTypes {MATCH,IGNORECASE}

  3. def createExpressions(component, conPoint, matchType, matchCaseType) { 
  4.   atts = null
  5.   if (conPoint != null)   atts = conPoint.getUpstreamInScopeAttributes()
  6.   else atts = component.getUpstreamLeafAttributes(component)
  7.   tatts = component.getAttributes()
  8.   for (MapAttribute tgt_attr : tatts) {
  9.     attr_str = tgt_attr.getName()
  10.     if (matchCaseType == MatchCaseTypes.IGNORECASE) {
  11.       attr_str = attr_str.toLowerCase()
  12.     }
  13.     sourceCol = null;
  14.     for (MapAttribute src_attr : atts) {
  15.       src_attr_str = src_attr.getName()
  16.       if (matchCaseType == MatchCaseTypes.IGNORECASE) {
  17.        src_attr_str = src_attr_str.toLowerCase()
  18.       }
  19.       if ( (matchType == MatchTypes.SRCENDSWITH && src_attr_str.endsWith( attr_str )) ||
  20.            (matchType == MatchTypes.SRCSTARTSWITH && src_attr_str.startsWith( attr_str )) ||
  21.            (matchType == MatchTypes.TGTSTARTSWITH && attr_str.startsWith( src_attr_str )) ||
  22.            (matchType == MatchTypes.TGTENDSWITH && attr_str.endsWith( src_attr_str )) ||
  23.            (matchType == MatchTypes.EQUALS && attr_str.equals( src_attr_str )) ) {
  24.        sourceCol = src_attr
  25.        break
  26.       }
  27.     }
  28.     if (sourceCol != null && conPoint != null)  tgt_attr.setExpression( conPoint, sourceCol, null )      
  29.     else if (sourceCol != null)  tgt_attr.setExpression( sourceCol )      
  30.   }
  31. }

You can then call this function on a datastore to auto map all attribute expressions in the component as follows;

  • createExpressions(tgtempDatastoreComponent, null,MatchTypes.EQUALS,MatchCaseTypes.MATCH);

To illustrate the set component, you can code the population of each connector point as follows;

  • createExpressions(setComponent, inConnectorPoint1,MatchTypes.EQUALS,MatchCaseTypes.MATCH);
  • createExpressions(setComponent, inConnectorPoint2,MatchTypes.EQUALS,MatchCaseTypes.IGNORECASE);

this will auto map the attributes in a set component for each connector point with different rules (just for illustration purposes). You can see below the result of calling these 2 functions on the set component for each connector point. All upstream in-scope attributes are considered.

These APIs to get scoping attributes make it simple to build customized accelerators for building expressions when auto mapping. Its a little different than in 11g, have a look at the examples I posted above and the snippets above, there's a lot you can do and its easy to utilize.

Wednesday Feb 05, 2014

Introducing Oracle GoldenGate 12c: Extreme Performance Simplified

Oracle GoldenGate 12c was released last fall with a long list of new features that simplify configuration and increase flexibility, while delivering easy-to-use, advanced solutions with multi-fold performance gain.  We have been discussing these new features in various blog posts including:

· Advanced Replication for The Masses – Oracle GoldenGate 12c for the Oracle Database.

· GoldenGate 12c - What is Coordinated Delivery?

· GoldenGate 12c - Coordinated Delivery Example

· Oracle GoldenGate 12c - Announcing Support for Microsoft and IBM

And you can find new white papers about the 12c release in the Oracle GoldenGate resource kit.

Following the executive video webcast launching Oracle Data Integrator 12c and Oracle GoldenGate 12c in November 2013, we have set up another webcast for Oracle GoldenGate 12c where our product management team discusses the key new features in more depth and takes live questions from the audience.

If you would like to learn more about GoldenGate 12c I invite you to join us on Feb 12th in a webcast with product experts. You can register for this free event via the link below.

Live Webcast: Introducing Oracle GoldenGate 12c: Extreme Performance Simplified

February 12, 2014 -  10am PT/ 1pm ET

If you have missed our executive launch webcast I highly recommend that you watch it on demand via the link below. It talks about the tighter integration between Oracle Data Integrator 12c and Oracle GoldenGate and features customer and partner speakers from SolarWorld, BT, and Rittman Mead Consulting.

On-Demand Video Webcast: Introducing 12c for Oracle Data Integration


Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products-- including Oracle Data Integrator, Oracle GoldenGate and Oracle Enterprise Data Quality


« February 2014 »