Friday Oct 24, 2014

Automating ODI development tasks using the SDK

By Ayush Dash, Oracle Consulting Services

Oracle Data Integrator (ODI) 11g uses interfaces to move data from a source to a target datastore. ODI Studio provides a very convenient drag-and-drop UI to build, test, and execute such interfaces. These interfaces can have different processing logic and complex transformations for disparate requirements, but there can also be interfaces that behave in exactly the same manner except that the source and target differ.

Let’s say I have these requirements defined, am able to identify the different buckets of processing required, and develop the respective interfaces. That’s easily done! However, if the requirement changes such that I need 10 interfaces for each bucket, things get a little more complex and I face an increased level of effort. What about 100 such interfaces for each bucket? Much more effort is required now! It’s the same repetitive set of tasks, but it needs to be done for each interface in each bucket. The problem we face here is to somehow expedite and automate the entire sequence of steps for each bucket and reduce the redundant, manual development of ODI interfaces. As the number of interfaces grows, our problem (the effort) compounds.

Note that this is not limited to interfaces; it can be extended to generate scenarios, packages, etc.

Use Case:

In one of my ODI engagements, we had the requirements below, with aggressive timelines.

  1. Incremental loads from a Source Database to an Incremental Database (ODI interfaces)
  2. Data masking on the Incremental Database (not an ODI task)
  3. Incremental loads from the Incremental Database to a Target Database (ODI interfaces)

This had to be done for Oracle and PeopleSoft (2 buckets) and a total of 2,300 tables, so 4,600 interfaces in all (2,300 for step 1 and 2,300 for step 3), and eventually the respective scenarios.

ODI SDK Groovy Scripts:

ODI Studio provides a Groovy editor as part of its install; Groovy is a Java-based scripting language that can be leveraged to work with the ODI SDK and build automation scripts. Below is the list of scripts (a minimal SDK sketch follows the list):

  • CreateProject – Creates an ODI project with a project name and folder name.
  • ImportKnowledgeModules – Imports the specified Knowledge Modules into the KM directories.
  • CreateModels – Creates the source and target models for existing datastores.
  • CreateModelsandDatastore – Creates datastores and models.
  • CreateInterfaceIterative – Iterates through all the source datastores and generates an interface for each with the respective target datastore.
  • CreateInterfaceIterativeandSetKMOptions – Creates interfaces and sets KM options iteratively.
  • CreateScenario – Creates scenarios for all the interfaces.
  • ExecuteScenario – Executes all the scenarios under all the interfaces.
  • CountInterface – Counts the number of interfaces; can be used as validation.
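
To give a feel for the approach, here is a minimal Groovy sketch along the lines of the CreateProject script. It assumes it is run inside ODI Studio's Groovy editor, where the odiInstance variable is pre-bound to the connected repositories; the project and folder names are placeholder assumptions, not the names used in the actual engagement.

  import oracle.odi.core.persistence.transaction.support.DefaultTransactionDefinition
  import oracle.odi.domain.project.OdiProject
  import oracle.odi.domain.project.OdiFolder

  // Start a transaction against the work repository
  // (odiInstance is pre-bound in ODI Studio's Groovy editor).
  def txnDef = new DefaultTransactionDefinition()
  def tm = odiInstance.getTransactionManager()
  def txnStatus = tm.getTransaction(txnDef)

  // Create a project and a folder inside it (names are placeholders).
  def project = new OdiProject("INCR_LOADS", "INCR_LOADS")
  def folder = new OdiFolder(project, "ORACLE_BUCKET")

  // Persist the project (the folder is saved with it) and commit.
  odiInstance.getTransactionalEntityManager().persist(project)
  tm.commit(txnStatus)
  println "Created project ${project.name} with folder ${folder.name}"

The iterative scripts follow the same pattern: look up the source model, loop over its datastores, and persist one interface per datastore inside the transaction.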

The scripts and guide have been uploaded to the Oracle Data Integration project on Java.net: https://java.net/projects/oracledi.

All the scripts can be downloaded from here: https://java.net/projects/oracledi/downloads/download/ODI/SDK%20Samples/ODI%2011g%20SDK%20Automation/Automation%20Scripts/Automation%20Scripts.zip

Please refer to the ODI SDK Automation Guide for detailed steps: https://java.net/projects/oracledi/downloads/download/ODI/SDK%20Samples/ODI%2011g%20SDK%20Automation/ODI%2011g%20SDK%20Automation%20Guide.doc

Thursday Oct 23, 2014

Big Data Governance Webcast

Earlier this week we announced Oracle Enterprise Metadata Management (OEMM). Join us as we follow up the release with a webcast where we discuss the pressing issue of Data Governance and how it applies to Big Data. Jeff Pollock, Vice President of Product Management, talks about:

  • Applying Governance to Big Data
  • The role that managing metadata plays in Governance
  • How data quality fits into the whole Governance framework
  • The actual product, Oracle Enterprise Metadata Management

Register here and join us for the webcast on the 30th.

Monday Oct 20, 2014

Announcing Availability of Oracle Enterprise Metadata Management

Oracle today announced the general availability of Oracle Enterprise Metadata Management (OEMM), Oracle's comprehensive Metadata Management technology for Data Governance. With this release, Oracle underscores a product strategy that not only offers best-in-class Data Integration solutions like Oracle Data Integrator (ODI), Oracle GoldenGate (OGG), and Oracle Enterprise Data Quality (OEDQ), but also technology that ties together business initiatives like governance.

Data Governance Considerations

Organizations have long struggled to impose credible governance on their data, with ad hoc processes and technologies that are unwieldy and unscalable. There are a number of reasons why this has been the case:

  • Data Governance cannot be done without managing metadata.
  • Data Governance cannot be done without extending across all platforms, irrespective of technologies.
  • Data Governance cannot be done without a business- and IT-friendly interface.

Complete Stewardship - Data Transparency from Source to Report

The biggest advantages of an airtight Data Governance program are reduced data risk, increased security, and control over your organization's data life cycle. Any governance tool should be able to surface lineage, impact analysis, and data flow not just within a Business Analytics tool or a Data Warehouse, but across all these systems, no matter what technology is in use. This increased transparency allows risks and impacts to be assessed accurately when data changes.


Data Flow Diagram across platforms.

With a focus on stewardship, OEMM is designed to be intuitive and search based. Its search catalog allows easy browsing of all objects, with collaboration and social features for the Data Steward.

Search-based catalog and Business Glossary for easy browsing of objects.

Big Data Governance

OEMM, along with Oracle Data Integrator, provides a powerful combination for governing Big Data standards including HBase, Sqoop, and JSON. With ODI providing complete support for these data standards for data loading and transformation, OEMM harvests the ODI metadata to stitch together a complete data map that even traverses any Big Data Reservoir organizations have in place.

Oracle and 3rd Party Metadata

OEMM is truly heterogeneous. It is designed to pull in and manage metadata from Oracle and third-party databases, data warehouses, ETL, business intelligence, and other reporting tools.

Visit the OEMM homepage for more information about Oracle Enterprise Metadata Management.

Friday Oct 17, 2014

Upcoming Webinar: Data Transformation and Acquisition Techniques, to Handle Petabytes of Data

Many organizations have become aware of the importance of big data technologies such as Apache Hadoop, but are struggling to determine the right architecture for integrating them with their existing analytics and data processing infrastructure. As companies implement Hadoop, they need to learn new skills and languages, which can impact developer productivity. Oftentimes they resort to hand-coded solutions, which can be brittle and can hurt both developer productivity and the efficiency of the Hadoop cluster.

To truly tap into the business benefits of big data solutions, it's necessary to ensure that business and IT have simple, tools-based methods to get data in, change and transform it, and keep their data warehouse continuously updated.

In this webinar you’ll learn how the Oracle and Hortonworks solution can:

  • Accelerate developer productivity
  • Optimize data transformation workloads on Hadoop
  • Lower cost of data storage and processing
  • Minimize risks in deployment of big data projects
  • Provide proven industrial scale tooling for data integration projects

We will also discuss how technologies from both Oracle and Hortonworks can be used to deploy the big data reservoir, or data lake: an efficient, cost-effective way to handle petabyte-scale data staging, transformations, and aged-data requirements while reclaiming compute power and storage from your existing data warehouse.

Speakers:
Jeff Pollock, Vice President, Oracle
Tim Hall, Vice President, Hortonworks

Hosted by:
Tim Matteson, Co-Founder, Data Science Central

Click Here to Register.

Wednesday Oct 15, 2014

Oracle Data Integrator Certified with Hortonworks HDP 2.1

Too often, companies fall into what they perceive as the path of least resistance by using custom, hand-coded methods to create big data solutions, but with the rush to production these hand-coded solutions often perform more slowly and are more costly to maintain. To truly tap into the business benefits of big data solutions, a simple tools-based solution is required to move large volumes of data into Hadoop and efficiently transform it without the need for costly mid-tier servers. The Oracle Data Integration Solutions team is pleased to announce the certification of Oracle Data Integrator with Hortonworks HDP 2.1.

This collaboration between the Oracle Data Integrator and Hortonworks teams will provide customers a familiar and comprehensive data integration platform for Hadoop, covering high-volume, high-performance batch loads, agile transformations using the power of Hadoop, and a superior developer experience with the flow-based declarative user interface of Oracle Data Integrator.

To learn more, click here.    

Also, on November 11th, 2014, Jeff Pollock, VP of Oracle Data Integration Solutions, and Tim Hall, VP of Product Management at Hortonworks, will host a joint webinar to discuss the certification and how technologies from both Oracle and Hortonworks can be used to deploy big data reservoirs. To register, click here.

Friday Oct 10, 2014

Oracle Data Integrator Webcast Archives

Check out the recorded webcasts on Oracle Data Integrator! 

Each month the Product Management Team hosts a themed session for your viewing pleasure.  Recent topics include Oracle Data Integrator (ODI) and Big Data, Oracle Data Integrator (ODI) and Oracle GoldenGate Integration, BigData Lite, the Oracle Warehouse Builder (OWB) Migration Utility, the Management Pack for Oracle Data Integrator (ODI), along with other various topics focused on Oracle Data Integrator (ODI) 12c.

You can find the Oracle Data Integrator (ODI) Webcast Archives here.

The webcasts are publicized on the ODI OTN Forum if you want to view them live.  You will find the announcement at the top of the page, with the title and details for the upcoming webcast.

Thank you – and happy listening!

Wednesday Oct 08, 2014

Meet the Data Integration Winners of 2014 Oracle Excellence Awards for Fusion Middleware Innovation

Last Tuesday evening we had a special celebration at Oracle OpenWorld. In an Oscars-like ceremony we announced the winners of the 2014 Oracle Excellence Awards for Fusion Middleware Innovation. You can find an overview of the Fusion Middleware Innovation Awards Ceremony and the full list of winners across all 12 categories in the following blog post: And The Winners Are....

In this blog post I would like to introduce you to the 2014 Fusion Middleware Innovation Award winners in the Data Integration category: NET Serviços and Griffith University.

NET Serviços is Latin America’s largest multi-service cable company and has enjoyed steep growth in its business. However, the existing system architecture was not supporting this business growth, nor the new product areas and business models the company was expanding into. The NET Serviços team, led by Robin Michael Gray, built a single corporate real-time integration platform on a standards-based private cloud architecture with a high level of maturity in standardization and architecture documentation. They implemented the private cloud environment on Oracle Exadata and Oracle Exalogic, and for enterprise-wide real-time integration they are using Oracle Data Integrator (ODI), Oracle GoldenGate, and Oracle SOA Suite. This platform enabled:

  • 60% reduction in development time
  • 50% reduction in management costs via native solutions for monitoring and error handling
  • 80% reduction in data quality problems via data validation as part of integration
  • Increased agility in identifying and fixing problems, thanks to a very innovative ODI data quality firewall and the Oracle GoldenGate "error hospital"

Griffith University is based in Australia and is one of the most influential universities in the Asia-Pacific region. Bruce Callow, chief technology officer for Griffith University, and his team saw the opportunity to improve the student experience and attract more enrollment by building a modern web and mobile interface. Griffith University deployed Oracle Data Integrator, Oracle SOA Suite, and Oracle Service Bus with the help of their partner Integral to build a real-time data hub that supports key applications such as Oracle RightNow Service Cloud and enables a new web and mobile interface for students. The solution also reduced the load on the core Oracle PeopleSoft system during peak load windows. With this solution Griffith University:

  • Achieved a unified view of students by using Oracle Data Integrator, Oracle SOA Suite, and Oracle Service Bus
  • Improved visibility into course demand, optimizing enrollments and course scheduling in real time to best meet that demand
  • Increased student satisfaction with the enrollment process from 20% to 86%

You can learn more about Griffith University's solution here: Griffith University Manages Enrollment Peaks 4x Faster—Boosts Student Satisfaction and Retention Rate

Congratulations to NET Serviços and to Griffith University with their partner Integral for winning this year's Oracle Excellence Awards for Fusion Middleware Innovation in the Data Integration category!

A Recap of the Data Integration Track's Final Day in OpenWorld 2014

Last week during OpenWorld, my colleague Madhu provided a summary of the first 3 days of the Data Integration track in his blog post: Data Integration At OOW14 - Recapping Days 1, 2 and 3. Today I would like to mention a few key sessions we presented on Thursday.

We kicked off the last day of OpenWorld with the cloud topic. In the Oracle Data Integration: A Crucial Ingredient for Cloud Integration [CON7926] session, Julien Testut from the Data Integration product management team presented with Sumit Sarkar from Progress DataDirect. Julien provided an overview of the various data integration solutions for cloud deployments, including integration between databases and applications hosted in the cloud, as well as loading cloud BI infrastructures. Sumit followed Julien with a live demo using Oracle Data Integrator and the Progress DataDirect JDBC drivers to load and transform data on Amazon Redshift and extract data from Salesforce.com. All of us in the audience were amazed that the demo worked seamlessly using only the OpenWorld wi-fi network.

For Oracle GoldenGate we had 2 key sessions on Thursday:

Achieving Zero Downtime During Oracle Applications Upgrades and System Migrations was in the Fusion Middleware AppAdvantage track and featured Oracle GoldenGate’s zero-downtime upgrade and migration solution with customer Symantec and partner Pythian. In this session, Doug Reid presented GoldenGate's 3 different deployment models for zero-downtime application upgrades. In his slides, Doug highlighted GoldenGate’s certified solution for Siebel CRM, and mentioned the support for zero-downtime application upgrades for JD Edwards and Billing and Revenue Management (BRM) as well. Following Doug, Symantec’s Rohit Muttepawar came to the stage and talked about the database migration project for their critical licensing database. And Pythian’s Gleb Otochkin and Luke Davies presented how using Oracle GoldenGate in a 6-node active-active replication environment helped Western Union achieve application releases with zero downtime, database patches and upgrades with zero downtime, and a real-time reporting database with no impact on online users.

The other key GoldenGate session was Managing and Monitoring Oracle GoldenGate. Joe deBuzna from the GoldenGate product management team provided an overview of the new monitoring and management capabilities included in the Oracle GoldenGate Plug-in for Enterprise Manager and Oracle GoldenGate Monitor. Both of these products are included in the Oracle Management Pack for Oracle GoldenGate license. With the new 12c release, they can now control the starting, stopping, and configuration of existing Oracle GoldenGate processes, and they include many new metrics that strengthen how users can monitor their Oracle GoldenGate deployments.

The Data Integration track at OpenWorld closed with "Insight into Action: Business Intelligence Applications and Oracle Data Integrator". Jayant Mahto from the ODI development team and Gurcan Orhan from Wipro Technologies focused on the latest Oracle BI Applications release, which embeds Oracle Data Integrator for data loading and transformations and provides the option to use Oracle GoldenGate for real-time data feeds into the reporting environment. The session highlighted how the new Oracle BI Apps release provides greater strategic insight quickly, efficiently, and with a low total cost of ownership, as well as the role of Oracle Data Integrator as the data integration foundation. Jayant and Gurcan presented how Oracle Data Integrator enables users to increase IT efficiency and reduce costs by increasing data transparency, easing setup and maintenance, and improving real-time reporting.

In case you missed it, I'd like to remind you of the press announcement that went out on September 29th, which gives a summary of the key developments in the Fusion Middleware product family and Oracle's data integration offering. As mentioned in the press announcement, we now have a new offering for metadata management. An overview of this product was delivered in the Oracle Data Integration and Metadata Management for the Seamless Enterprise [CON7923] session. We will post a dedicated blog entry on this topic later in the week. Stay tuned for more on that.

Wednesday Oct 01, 2014

Data Integration At OOW14 - Recapping Days 1, 2 and 3

It has been an action-packed three days for Data Integration here in San Francisco at Oracle OpenWorld 2014.

Day 1

It all started with the keynote presentations by the executives, stressing Oracle's focus on Cloud, Big Data, and Mobile developments. Thomas Kurian's session addressed Data Integration's key role in binding applications together, moving data, and simplifying access to Big Data.

Jeff Pollock, in his session "Unlocking Big Data Silos in the Cloud", then addressed the various customer best practices and architectural considerations for implementing a full or partial cloud-based business solution, and how the Oracle Data Integration products fit in. An interesting real-life example was that of the UK Conservatory using Oracle Data Integration in the cloud.

Product update sessions for Oracle Data Integrator and Oracle GoldenGate revealed how much the upcoming releases focus not just on integrating the products and covering heterogeneous data source compatibility, but also on cloud and Big Data features. Customer testimonials from well-known brands like DirecTV, Canon, and LinkedIn further cemented the Oracle Data Integration solutions' success.

Day 2

The highlight of the day was the Fusion Middleware Innovation Awards, which recognize the most innovative uses of Oracle technology to solve business needs, showcased as best practices for other customers in the industry to emulate. Net Servicos de Communicacao from Brazil and Griffith University from Australia walked away with the awards in the Data Integration category.

Day 2 also saw product updates on Data Quality and on how Oracle is now extending its capabilities considerably into the metadata management space to help govern and bring transparency to data within the organization. Enterprise Metadata Management was an area of particular interest (its demonstration at the product booth was always packed with eager visitors).

Day 3

Oracle Data Integration is one of four pillars key to Oracle's Big Data strategy, stressed Senior Vice President Balaji Yelamanchili. Building on the concept of the Data Reservoir, the session "Tapping into the Big Data Reservoir with all Data" explored how it is fully supported by Oracle, with Oracle GoldenGate providing a real-time incremental feed into the reservoir and Oracle Data Integrator performing the data transformations within it.

The Big Data Lite page offers a great environment that can be downloaded and explored to see all these concepts and technologies in action.

Day 4 is packed with sessions on Cloud Data Integration, Metadata Management, achieving zero-downtime migrations, and a host of other insightful topics. Check out the full list here to plan accordingly.


Streaming Relational Transactions to Flume using Oracle GoldenGate

In prior articles, we have introduced architectures for streaming transactions from Oracle GoldenGate to HDFS, Hive, and HBase. In this article we are adding to this list by presenting a solution for streaming transactions into the Flume service. 

Apache Flume is a distributed, fault-tolerant, and available service for efficiently collecting, aggregating, and moving large amounts of streaming data into HDFS. It can be configured for a variety of distribution mechanisms, such as delivery to multiple clusters, or rolling HDFS files based on time or size.

As shown in the diagram below, streaming database transactions to Flume is accomplished by developing a custom handler using Oracle GoldenGate's Java API and Flume's Avro RPC APIs.

The custom handler is deployed as an integral part of the Oracle GoldenGate Pump process. The Pump process and the custom adapter are configured through the Pump parameter file and the custom adapter's properties file. The Pump process executes the adapter in its address space. The Pump reads the Trail File created by the Oracle GoldenGate Capture process and passes the transactions to the adapter. The adapter then writes the transactions to a Flume Avro RPC source at the host/port given in the configuration file. The Flume Agent streams the data to the final destination; in the supplied example, Flume writes into an HDFS directory for subsequent consumption by Hive. A minimal sketch of the delivery step follows.
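
For illustration only, here is a rough Groovy sketch of the delivery step using Flume's standard Avro RPC client API (org.apache.flume.api). The host, port, and payload format are placeholder assumptions; in the real adapter the payload is built from the trail-file operations handed over by the Pump process via the Oracle GoldenGate Java API callbacks.

  import org.apache.flume.Event
  import org.apache.flume.api.RpcClient
  import org.apache.flume.api.RpcClientFactory
  import org.apache.flume.event.EventBuilder
  import java.nio.charset.StandardCharsets

  // Connect to the Flume Avro source; host/port below stand in for the
  // values defined in the adapter's properties file.
  RpcClient client = RpcClientFactory.getDefaultInstance("flume-host", 41414)
  try {
      // In the real adapter this payload is derived from a trail-file operation.
      String payload = "SALES,INSERT,prod_id=101,amount_sold=29.99"
      Event event = EventBuilder.withBody(payload, StandardCharsets.UTF_8)
      client.append(event) // deliver one record to the Flume agent
  } finally {
      client.close()
  }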

A sample implementation of the Flume Adapter for Oracle GoldenGate is provided on the My Oracle Support site as Knowledge Base article 1926867.1.

Are you at Oracle OpenWorld?

The week isn’t over just yet!

Oracle Data Integration has nine pods located throughout Moscone South where you can meet the product management and development teams. You will find us in the Fusion Middleware, Database, Application and Big Data areas this year.

For additional details on the Oracle Data Integration Solutions track at OpenWorld, make sure to check us out!  There are still many more sessions to attend before the end of the conference!

Friday Sep 26, 2014

Oracle GoldenGate and Oracle Data Integrator on the Oracle BigDataLite 4.0 Virtual Machine

Oracle's big data team has just announced the Oracle BigDataLite Virtual Machine 4.0, a pre-built environment reflecting the core software of Oracle's Big Data Appliance 4.0 to get you started. BigDataLite is a VirtualBox VM that contains a fully configured Cloudera Hadoop distribution CDH 5.1.2, Oracle DB 12.1.0.2.0 with Big Data SQL, Oracle's Big Data Connectors, Oracle Data Integrator 12.1.3, Oracle GoldenGate, and other software.

The demos for Oracle GoldenGate and Oracle Data Integrator show an end-to-end use case for the fictional Oracle MoviePlex online movie streaming company. They show how to load data into a Data Reservoir in real time, process and transform the data using the power of Hadoop, and utilize the data in an Oracle data warehouse, either by loading it with Oracle Loader for Hadoop or by querying Hive tables from Oracle SQL with Oracle Big Data SQL.

Please follow the demo instructions to try out all these steps yourself. If you would like to build the ODI mappings from scratch, try our ODI and Big Data Hands-on Lab.  Enjoy! 

Data Quality, Data Governance & Metadata Management @OOW14

As you pack your bags for OpenWorld this year, the importance of Data Quality and Data Governance has only increased with the addition of Big Data and Cloud technologies. This OpenWorld includes sessions that range from traditional on-premise standardization and cleansing of data to understanding how managing your enterprise metadata increases your ability to understand data provenance. Also meet with customers, partners, and product experts who have built complete data management platforms that include not just data movement but also enterprise data quality and data governance initiatives, and learn about the benefits that accrue from them.

Here are some hot Data Quality and Data Governance sessions:

Data Quality Maturity Journey: Building Toward Strong Enterprise Data Quality [CON7776]

Oracle Enterprise Data Quality: Product Overview and Roadmap [CON7780]

The Essential Core of Data Governance with Oracle Enterprise Data Quality [CON7775]

Oracle Enterprise Data Quality Introduction [HOL9438]

Demos:

Oracle Data Integrator, Oracle GoldenGate, and Oracle Enterprise Data Quality

Trusted Data with Oracle Enterprise Data Quality Solutions

Metadata Management Sessions:

Oracle Data Integration and Metadata Management for the Seamless Enterprise [CON7923]

Demo: Oracle Enterprise Metadata Management: Trust Your Data

We will keep bringing you regular updates on the happenings from OpenWorld. Meanwhile you can see all the Data Integration related sessions right here.

Thursday Sep 25, 2014

ODI 12c - Migration from OWB to ODI - PLSQL Procedures Pt 2

In the first part of this blog post I showed how PL/SQL procedures can be invoked from ODI, letting you keep the benefit of such functionality in the tool while increasing performance. Here in part 2, I'll show how the map can be changed in OWB and subsequently migrated using the migration utility.

Remember, the original mapping design in OWB used a transformation operator to invoke the PL/SQL procedure. The same mapping can be modified by replacing that transformation component with a construct operator to build the ref cursor, plus a table function operator; see below for the original map design at the top and the modified one underneath with the ref cursor and table function:


The mapping can be further optimized by specifying extraction and loading hints to tune performance for your system:


You will also have to enable the parallel DML capabilities for the OWB mapping. With this design you can test and ensure it works, and then use it in a migration. Why go through all this? You may want to optimize existing mappings in OWB, or when migrating, your map may be more complex and you may not wish to reconstruct it (perhaps the mapping is large and complex upstream of the PL/SQL procedure). Doing this saves that work: you only need to remove the PL/SQL procedure plus the target attribute mappings and insert the cursor construct and table function.

When the migration utility migrates this, it will migrate the entire logical design along with the hints you have specified. You will see the following mapping in ODI:

You will have to add the enable-parallel-DML code into the begin mapping command, and then the code will be equivalent and perform as such. For details of the OWB to ODI migration utility, see here. It's also worth checking various other useful resources, such as the migration-in-action demo here and Stewart Bryson's 'Making the Move from Oracle Warehouse Builder to Oracle Data Integrator 12c' article here (with a useful tip on database link mechanics in ODI).

ODI 12c - Migrating from OWB to ODI - PLSQL Procedures

The OWB to ODI migration utility does a lot, but there are a few things it doesn't handle today. Here's one for our OWB customers moving to ODI who are scratching their heads at the apparent lack of support for PL/SQL procedures in ODI... With a little creative work you can not only get those mappings into ODI, but also dramatically improve performance (my test below improves performance by 400% and can easily be tuned further based on hardware).

This specific illustration covers the support for PL/SQL procedures (not functions) in OWB and what to do in ODI. OWB takes care of invoking PL/SQL procedures mid-flow - it did this by supporting PL/SQL row-based code for certain (not all) map designs out of the box, and PL/SQL procedure invocation was one such case. The PL/SQL that OWB generated was pretty much as efficient as PL/SQL could be (it used bulk collect and many other best practices you didn't have to worry about as a map designer), but it was limited in its logical map support (you couldn't have a set-based operator such as a join after a row-based-only operator such as a PL/SQL transformation) - and it was PL/SQL, not SQL.

Here we see how, with a simple pipelined, parallel-enabled table function wrapper around your PL/SQL procedure call, you can capture the same design in ODI 12c and/or get the mapping migrated from OWB. I think the primary hurdle customers face is knowing what the option going forward is. To solve this, we will just leverage more of the Oracle database: table functions, and parallelize the <insert your favorite word> out of it!

The mapping below calls a PL/SQL procedure, and OWB generated PL/SQL row-based code for this case; the target is loaded with data from the source table plus the 2 output parameters of the PL/SQL procedure:

When you try to migrate such a mapping using the OWB to ODI migration utility, you'll get a message indicating that the map cannot be migrated using the utility. Let's see what we can do! The vast majority of mappings are set-based; generally only a very small subset are row-based PL/SQL mappings. Let's see how this is achieved in ODI 12c.

I did a test using the generated code from OWB - no tuning, just the raw code for the above mapping - and it took 12 minutes 32 seconds to process about 32 million rows, invoke the PL/SQL procedure, and perform a direct-path insert into the target. With my ODI 12c design using a very simple table function wrapper around the PL/SQL procedure I can cut the time to 3 minutes 14 seconds! Not only can I do this, but I can easily optimize it further to better leverage the Oracle database server by quickly changing the hints - I had a 4-processor machine, so that's about as much as I could squeeze out of it.

Here is my map design in ODI 12c:

The table function wrapper that calls the PL/SQL procedure is very simple; line 7 is where I call the procedure, using the object instance in the call and piping the data once the call is made:

  1. create or replace function TF_CALL_PROC(input_values sys_refcursor) return TB_I_OBJ pipelined parallel_enable(PARTITION input_values BY ANY) AS
  2.   out_pipe i_obj := i_obj(null,null,null,null,null,null,null,null,null);
  3. BEGIN
  4.   LOOP
  5.     FETCH input_values INTO out_pipe.prod_id, out_pipe.cust_id,out_pipe.time_id,out_pipe.channel_id,out_pipe.promo_id,out_pipe.quantity_sold,out_pipe.amount_sold;
  6.     EXIT WHEN input_values%NOTFOUND;
  7.     MYPROC(out_pipe.prod_id,out_pipe.status,out_pipe.info);
  8.     PIPE ROW(out_pipe);
  9.   END LOOP;
  10.   CLOSE input_values;
  11.   RETURN;
  12. END;

This is a very simple table function (with enough metadata you could generate it); it uses table function pipelining and parallel capabilities, so I will be able to parallelize all aspects of the generated statement and really leverage the Oracle database. The table function uses the types below; it has to project all of the data used downstream - whereas OWB computed this for you, here you will have to do it.

  1. create or replace type I_OBJ as object (
  2.  prod_id number,
  3.  cust_id number,
  4.  time_id date,
  5.  channel_id number, 
  6.  promo_id number,
  7.  quantity_sold number(10,2),
  8.  amount_sold number(10,2),
  9.  status varchar2(10),
  10.  info number
  11.   );
  12. create or replace type TB_I_OBJ as table of I_OBJ; 

The physical design in ODI has the PARALLEL(4) hints on my source and target and I enable parallel DML using the begin mapping command within the physical design.

You can see in the image above that when using the Oracle KMs there are options for hints on sources and targets; you can easily set these to take advantage of the hardware resources and tweak them to pump up the performance throughput! To make this concrete, a rough sketch of the resulting statement follows.
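
For illustration only, here is a Groovy-over-JDBC sketch of the shape of the statement this setup produces. The connection details and table names (SALES, SALES_TGT) are placeholder assumptions; at run time the ODI agent issues the equivalent statement for you.

  import groovy.sql.Sql

  // Placeholder connection details.
  def db = Sql.newInstance("jdbc:oracle:thin:@//dbhost:1521/orcl",
                           "odi_user", "password", "oracle.jdbc.OracleDriver")

  // The "begin mapping command" from the physical design.
  db.execute("alter session enable parallel dml")

  // Direct-path, parallel insert selecting from the pipelined table function,
  // which wraps the PL/SQL procedure call (hint degrees match the PARALLEL(4) setup).
  db.execute("""
      insert /*+ APPEND PARALLEL(TGT, 4) */ into SALES_TGT TGT
      select prod_id, cust_id, time_id, channel_id, promo_id,
             quantity_sold, amount_sold, status, info
      from table(TF_CALL_PROC(cursor(
          select /*+ PARALLEL(SRC, 4) */ prod_id, cust_id, time_id, channel_id,
                 promo_id, quantity_sold, amount_sold
          from SALES SRC)))
  """)
  db.close()

Note how the inner cursor projects exactly the columns the table function fetches, while the outer select projects the full I_OBJ shape, including the two procedure outputs.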

To summarize, you can see how we can leverage the database to really speed the process up (remember the 400%!). We can still capture the design in ODI, and on top of that, unlike in OWB, this approach lets us carry on doing arbitrary data flow transformations after the table function component that invokes our PL/SQL procedure - so we could join, lookup, etc. Let me know what you think of this; I'm a huge fan of table functions and think they afford a great extensibility capability.

About

Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products, including Oracle Data Integrator, Oracle GoldenGate, and Oracle Enterprise Data Quality.
