Monday Dec 02, 2013

Planning the journey from Oracle Warehouse Builder to Oracle Data Integrator

by Julien Testut and David Allan

Introduction:

In this post we are going to discuss how Oracle Warehouse Builder (OWB) customers can easily migrate to Oracle Data Integrator 12c (ODI12c) thanks to two key new features introduced recently.

As mentioned in our statement of direction, Oracle Data Integrator is Oracle's strategic product for heterogeneous data integration. Recognizing that many Oracle Database customers have a significant investment in OWB, ODI now provides mechanisms to support a phased migration, specifically from OWB 11gR2 (11.2.0.4) to ODI 12c (12.1.2).

The following features are provided to make the transition to ODI easier:

      · ODI 12c supports the execution and administration of OWB 11gR2 jobs directly within ODI Studio, ODI Console and Enterprise Manager. This provides a single orchestration and monitoring solution and allows companies to continue running OWB Mappings and perform a phased migration to ODI.

       · A new migration utility is provided that will automatically translate many OWB objects and mappings into their ODI equivalents. The migration utility is a command-line tool and requires two patches: one for OWB (17547241) and one for ODI (17053768); both can be found on My Oracle Support.

Phased Migration using OWB and ODI run time Integration

The run time integration of Oracle Warehouse Builder and Oracle Data Integrator can be used as a first step towards aligning with Oracle’s data integration strategy. You can start new developments in ODI 12c while continuing to run the two products side by side, and begin migrating OWB Mappings into ODI Mappings using the migration utility.

In Oracle Data Integrator 12c it is now possible to define a connection to an Oracle Warehouse Builder workspace in Topology Navigator. Storing the connection and credential details in the ODI repository allows developers to invoke OWB processes in Packages with the OdiStartOwbJob tool (OdiStartOwbJob documentation). It also allows developers and operators to monitor the execution of OWB processes in ODI Studio, ODI Console or Enterprise Manager along with the rest of the ODI jobs.

The Technical Details

Invoking OWB processes in ODI 12c is simple: first add the OdiStartOwbJob tool to a Package, then point the tool to the OWB workspace entry in Topology. You can then browse the content of the OWB workspace and select the specific processes you’d like to invoke, as you can see in the following screenshot:


Once the Package design is complete, you can execute it and monitor its execution in ODI. The OWB and ODI logs have been integrated allowing end users to centrally monitor their ETL processes.


OWB to ODI Migration Utility

In addition to the integration of OWB jobs, there is also a metadata migration utility to migrate metadata from OWB 11gR2 to ODI 12c. A wide range of metadata can be migrated with this initial release (you can find a complete list of objects in the documentation), and a large selection of core mapping operators are supported. In addition to the patch containing the utility for the OWB migration, there is also a patch for ODI 12c, which includes new Knowledge Modules to support some of the OWB mapping capabilities. The user documentation explores these considerations in great detail:

     1. The migration process itself, a good read to be prepared

     2. Details for preparing to migrate

     3. Information on using the migration utility

     4. Reviewing the migration – what to look for in the logs and so on

     5. Error reference guide and a list of migrated components

The Technical Details

Below you can see a project in OWB 11gR2; this has a number of modules including Oracle and flat file modules; the Oracle one has mappings for loading a data warehouse from external tables based on flat files.



The migration utility is a command-line tool that is executed from the OWB home. The utility uses a driver file to specify which modules and objects to migrate. It also has three modes of operation, which let you discover how much of your project will be migrated. The modes are:

      · FAST_CHECK – performs a read-only check; this is the fastest way to get an understanding of how much will be migrated.

      · DRY_RUN – a sanity-check mode that tries to create the objects in memory using the ODI 12c SDK but does not commit them to the ODI repository.

      · RUN – performs the migration and commits the newly migrated objects to the 12c repository.

The driver properties file specifies the objects to migrate, either individually or using wildcards and lists of objects. The utility produces a report and a log file from its execution. For example, when I migrated the OWB project above, I used the following driver file:

      · MIGRATION_OBJECTS=PROJECT.FERRARI_DEMO

      · ODI_MASTER_USER=DEV_ODI_REPO

      · ODI_MASTER_URL=jdbc:oracle:thin:@localhost:1521:orcl11204

      · ODI_MASTER_DRIVER=oracle.jdbc.OracleDriver

      · ODI_USERNAME=SUPERVISOR

      · ODI_WORK_REPOSITORY_NAME=WORKREP

      · OWB_WORKSPACE_OWNER=OWB_REPOS

      · OWB_URL=localhost:1521:orcl11204

      · OWB_WORKSPACE_NAME=OWB_WKSP

      · MIGRATION_MODE=RUN

The utility driver parameters are fully documented in the user documentation. First I ran the fast check just to see if there were any issues reported. There were none, which was both good news and bad: I wanted to illustrate an example that wasn’t migrated, so I will create another to show you what happens.

When I executed the migration, the report file contained a convenient summary of what happened: you can quickly see what was migrated, the number of objects, and a column indicating whether any were not migrated.


You can see the migration utility has also migrated external tables. As part of the patch for ODI there is a new component KM to define an external table on top of the flat file definition; this supports the external table capabilities that were in OWB (both bound and unbound external tables). Below you can see the mapping loading the sales data shown above. It looks identical to what was in OWB: the datastore representing the external table is still the source, and the data is split and written to multiple targets.


The mapping has been created and configured to mimic the OWB behavior. The utility doesn’t configure the multi-insert IKMs, as this was an optimization within the OWB code generator; in ODI 12c you simply change the assignment of the IKM on each target. In the example above my targets were each assigned IKM Oracle Insert, and I changed them both to IKM Oracle Multi-Insert after the migration. You can see the physical configuration below:


You can then generate the code and, in simulation mode, see the generated Oracle multi-table insert processing the external table (also created in the mapping) as its source.
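To give a flavor of the SQL shape involved, here is a small sketch (not ODI-generated code) that assembles an Oracle multi-table INSERT ALL statement of the kind IKM Oracle Multi-Insert produces. All table and column names are invented for illustration:

```python
def build_multi_table_insert(source_query, targets):
    """Build an Oracle multi-table INSERT ALL statement.

    `targets` is a list of (condition, table, columns) tuples; a None
    condition inserts unconditionally into that target.
    """
    clauses = []
    for condition, table, columns in targets:
        cols = ", ".join(columns)
        when = f"WHEN {condition} THEN " if condition else ""
        clauses.append(f"{when}INTO {table} ({cols}) VALUES ({cols})")
    return "INSERT ALL\n  " + "\n  ".join(clauses) + "\n" + source_query

# Hypothetical example: split rows from an external-table source into
# two targets depending on the sales channel.
sql = build_multi_table_insert(
    "SELECT prod_id, amount, channel FROM sales_ext",
    [
        ("channel = 'WEB'", "WEB_SALES", ["prod_id", "amount"]),
        ("channel = 'STORE'", "STORE_SALES", ["prod_id", "amount"]),
    ],
)
print(sql)
```

The generated statement is a single pass over the source that fans rows out to both targets, which is exactly the optimization the OWB code generator applied.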

I mentioned that the report details the objects which cannot be migrated. An example is a pure row-based PL/SQL mapping (for example, one where a PL/SQL procedure was used); this kind of mapping is not automatically migrated, and you would see errors such as the following in the migration report:

      · MAP_ROW_BASED

            · [ERROR][Migration][MU-5003]Unable to migrate mapping with mapping operator TRANSFORMATION:GEN_DATA_PROC with OUT parameter DATA_NAME.

            · [ERROR][Migration][MU-5003]Unable to migrate mapping with mapping operator TRANSFORMATION:GEN_DATA_PROC with OUT parameter DATA_VALUE.

            · [ERROR][Migration][MU-5006]Unable to migrate mapping with mapping operator TRANSFORMATION:GEN_DATA_PROC that does not define return attribute.
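When a large project produces many such lines, it can help to triage the report by error code. Here is a hedged sketch of one way to do that; the regular expression matches the line format shown above, but the parser itself is not part of the utility:

```python
import re
from collections import Counter

# Matches report lines like:
#   [ERROR][Migration][MU-5003]Unable to migrate mapping ...
ERROR_LINE = re.compile(r"\[ERROR\]\[Migration\]\[(MU-\d+)\]")

def summarize_report(lines):
    """Count migration report errors per MU error code."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# The three error lines from the example report above.
report = [
    "[ERROR][Migration][MU-5003]Unable to migrate mapping with mapping operator TRANSFORMATION:GEN_DATA_PROC with OUT parameter DATA_NAME.",
    "[ERROR][Migration][MU-5003]Unable to migrate mapping with mapping operator TRANSFORMATION:GEN_DATA_PROC with OUT parameter DATA_VALUE.",
    "[ERROR][Migration][MU-5006]Unable to migrate mapping with mapping operator TRANSFORMATION:GEN_DATA_PROC that does not define return attribute.",
]
print(summarize_report(report))  # Counter({'MU-5003': 2, 'MU-5006': 1})
```

Grouping by code makes it easy to spot which class of mapping (here, procedures with OUT parameters) accounts for most of the failures, and to tackle them with one remediation.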

With these kinds of cases, one approach to moving forward is to wrap the PL/SQL procedure in a function where possible. The utility will serve multiple purposes: along with migrating the metadata, for OWB users it will also serve as a way of understanding how concepts map from OWB to ODI.

Summary

In this post we discussed how Oracle Warehouse Builder customers can easily migrate to Oracle Data Integrator 12c thanks to the run time integration and metadata migration features introduced recently.

That’s a little taste of the migration capabilities; more detailed information on specific components will keep coming. There are many exciting new capabilities in each release. These components are no longer Oracle-only but can be used with many different systems, so there are a lot of fun and exciting possibilities. Stay tuned!

Monday Nov 04, 2013

Big Data Matters with ODI12c

contributed by Mike Eisterer

On October 17th, 2013, Oracle announced the release of Oracle Data Integrator 12c (ODI12c).  This release signifies improvements to Oracle’s Data Integration portfolio of solutions, particularly Big Data integration.

Why Big Data = Big Business

Organizations are gaining greater insight and actionability through the increased storage, processing and analytical benefits offered by Big Data solutions.  New technologies and frameworks like HDFS, NoSQL, Hive and MapReduce deliver these benefits today. As more data is collected, analytical requirements increase, the complexity of managing transformations and aggregations of data compounds, and organizations need scalable Data Integration solutions.

ODI12c provides enterprise solutions for the movement, translation and transformation of information and data across heterogeneous and Big Data environments through:

  • The ability for existing ODI and SQL developers to leverage new Big Data technologies.
  • A metadata focused approach for cataloging, defining and reusing Big Data technologies, mappings and process executions.
  • Integration between many heterogeneous environments and technologies such as HDFS and Hive.
  • Generation of Hive Query Language.

Working with Big Data using Knowledge Modules

 ODI12c provides developers with the ability to define sources and targets and visually develop mappings to effect the movement and transformation of data.  As the mappings are created, ODI12c leverages a rich library of prebuilt integrations, known as Knowledge Modules (KMs).  These KMs are contextual to the technologies and platforms to be integrated.  Steps and actions needed to manage the data integration are pre-built and configured within the KMs. 

The Oracle Data Integrator Application Adapter for Hadoop provides a series of KMs, specifically designed to integrate with Big Data Technologies.  The Big Data KMs include:

  • Check Knowledge Module
  • Reverse Engineer Knowledge Module
  • Hive Transform Knowledge Module
  • Hive Control Append Knowledge Module
  • File to Hive (LOAD DATA) Knowledge Module
  • File-Hive to Oracle (OLH-OSCH) Knowledge Module 

Nothing beats an Example:

To demonstrate the use of the KMs which are part of the ODI Application Adapter for Hadoop, a mapping may be defined to move data between files and Hive targets. 

The mapping is defined by dragging the source and target into the mapping, performing the attribute (column) mapping (see Figure 1) and then selecting the KM which will govern the process. 

In this mapping example, movie data is being moved from an HDFS source into a Hive table.  Some of the attributes, such as “CUSTID to custid”, have been mapped over.
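The Hive KMs ultimately generate Hive Query Language from mappings like this one. As an illustrative sketch (not ODI's actual generator), the following shows the kind of INSERT ... SELECT HiveQL such a column mapping could produce; the table names and the RATING cast are invented, while the CUSTID-to-custid mapping comes from the example above:

```python
def build_hive_insert(target, source, column_map):
    """Build a HiveQL INSERT OVERWRITE ... SELECT statement from a
    mapping of target columns to source expressions."""
    select_list = ",\n  ".join(
        f"{expr} AS {col}" for col, expr in column_map.items()
    )
    return (
        f"INSERT OVERWRITE TABLE {target}\n"
        f"SELECT\n  {select_list}\nFROM {source}"
    )

# Hypothetical movie example: HDFS-backed source into a Hive target,
# echoing the "CUSTID to custid" attribute mapping described above.
hql = build_hive_insert(
    "movie_hive",
    "movie_ext",
    {"custid": "CUSTID", "rating": "CAST(RATING AS INT)"},
)
print(hql)
```

In the real product the KM assembles this statement from the mapping metadata, so developers design visually and never hand-write the HiveQL.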