Monday Jan 12, 2015

ODI 12c - Eclipse and Updated Mapping Builder Example

The 12c ODI release offers a full SDK which allows great opportunities for automation. You can reduce development times and work smarter. Here I will show how to use Eclipse to configure and setup the the SDK to start working smarter. In this example I will use an updated version of Mapping Builder which is a sample SDK illustration I blogged about previously. The builder has been updated to include auto mapping of attributes, multiple targets, loading modes and more. The updated OdiMappingBuilder.java code used in this example is here

There is a viewlet illustrating the using Eclipse and the ODI SDK here. The viewlet walks through the entire configuration of an Eclipse project using the 12.1.3 release of ODI. 


The important areas are with respect to the libraries you need to define for the project, below you can see the libraries used from the ODI installation, I have installed my software in C:\Oracle\ODI_1213 in these examples.

Once the libraries are configured you can import the mapping builder java source code...

Then you must define how the code is executed, you specify the parameters as arguments. This is of course an example for illustration, you can define how you would like this to be done in whatever way you like. You specify all of the mapping builder parameters here. I also pass the control file as a parameter rather than via stdin (which is what I did in previous blog posts).

I mentioned the mapping builder code now supports auto mapping, multiple targets and loading mode, you can see an example driver file below. The line starting with >mapping has 2 qualifiers, below I have used EQUALS and IGNORECASE. The first parameter  supports the following values;

  • EQUALS - supports matching EMPNO with EMPNO (or empno with EMPNO when IGNORECASE used for example)
  • SRCENDSWITH - supports matching source XEMPNO with EMPNO (plus case option as above)
  • TGTENDSWITH - supports matching source EMPNO with XEMPNO (plus case option as above)

The second parameter supports;

  • MATCH - exact match, matches EMPNO with EMPNO or empno with empno
  • IGNORECASE - supports mismatched case matching EMPNO matches empno

As well as auto mapping you can also specify specific column level mapping explicitly as was previously supported, the viewlet has an example. 

 You can also see the control file has a target_load directive which can be given a condition or else in the final field, this let's you load multiple conditional based targets. This is a new addition to the existing target directive. So the example has 3 directives now;

  • target - load via insert/append the data
  • targetinc - incremental update / merge the data
  • target_load - insert multiple targets 

In the example above, the following mapping is created, you can see the multiple conditional based targets and data is mapped all from a simple driver file;

Using Eclipse you can quickly build and debug your utilities and applications using the SDK and get the benefits of using an IDE such as code insight and auto completion.

Check out the viewlet and the mapping builder example plus the blogs on the SDK such as the recent one on the SDK overview and work smarter. Also, the updated OdiMappingBuilder.java code used in this example is here - get an overview of the mapping builder here.

Friday Jan 09, 2015

ODI 12c - Mapping SDK Overview

In this post I'll show some of the high level concepts in the physical design and the SDKs that go with it. To do this I'll cover some of the logical design area so that it all makes sense. The conceptual model for logical mapping in ODI 12c is shown below (it's quite a change from 11g), the model below allows us to build arbitrary flows. Each entity below you can find in the 12c SDK. Many of these have specialized classes - for example MapComponent has specializations for the many mapping components available from the designer - these classes may have specific business logic or specialized behavior. You can use the strongly typed, highly specialized classes like DatastoreComponent or you can write applications in a generic manner using the conceptual high level SDK - this is the technique I used in the mapping builder here.

The heart of the SDK for this area of the model can be found here;

If you need to see these types in action, take the mapping illustration below as an example, I have annotated the different items within the mapper. The connector points are shown in the property inspector, they are not shown in the graphical design. Some components have many input or output connector points (for example set component has many input connector points). Some components are simple expression based components (such as join and filter) we call these selector components, other components project a specific shape, we call those projectors - that's just how we classify them. 

In 12c we clearly separated the physical design from the logical, in 11g much of this was blended together. In separating them we also allow many physical designs for a logical mapping design. We also had to change the physical SDK and model so that we could support multiple targets and arbitrary flows. 11g was fairly rigid - if you look at the 'limitations' sections of the KMs you can see some of that. KMs are assigned on map physical nodes in the physical design, there are some helper methods on execution unit so you can set/get KMs.

The heart of the SDK for this logical mapping area of the model can be found here;

If we use the logical mapping shown earlier and look at the physical design we have for it, we can annotate the items below so you can envisage how each of the classes above is used in the design;

The MapPhysicalDesign class has all of the physical related information such as the ODI Optimization Context and Default Staging Location (there also exists a Staging Location Hint on the logical mapping design) - these are items that existed in ODI 11g and are carried forward.

To take an example if I want to change the LKMs or IKMs set on all physical designs, one approach would be to iterate through all of the nodes in a physical design and you can check whether an LKM or an IKM is assigned for that node - this then let;s you do all sorts - from get the current setting, to setting it with a new value. The snippet below gives a small illustration using groovy of the methods from the ODI SDK;

  1.         PhysicalDesignList=map.getPhysicalDesigns()
  2.          for (pd in PhysicalDesignList){
  3.             PhysicalNodesList=pd.getPhysicalNodes()
  4.             for (pn in PhysicalNodesList){
  5.                 if (pn.isLKMNode()){
  6.                     CurrentLKMName=pn.getLKMName()
  7. ...
  8.                          pn.setLKM(myLKM) 
  9.                 }else if (pn.isIKMNode()){
  10.                     CurrentIKMName=pn.getIKMName()
  11. ...
  12.                      pn.setIKM(myIKM)

There are many other methods within the SDK to do all sorts of useful stuff - first example is the getAllAPNodes method on a MapPhysicalDesign. This gives all of the nodes in a design which will have LKMs assigned - so you can quickly set or check. The second example is the getTargetNodes method on MapPhysicalDesign - this is handy to get all target nodes to set IKMs on, Final example is to find an AP node in the physical design for a logical component in your design - use the method findNode to achieve this.

Hopefully there are some useful pointers here, worth being aware of the ODI blog on Mapping SDK the ins and outs which provides an overview and cross reference to the primary ODI objects and the underpinning SDKs. If there are any other specific questions let us know.

Wednesday Dec 17, 2014

Oracle Partition Exchange Blog from the ODI A-Team

More great information from the ODI A-Team!

Check out the A-Team’s most recent blog about the Oracle Partition Exchange – it does come in two parts:

Using Oracle Partition Exchange with ODI

Configuring ODI with Oracle Partition Exchange

The knowledge module is on Java.Net, and it is called “IKM Oracle Partition Exchange Load”.  To search for it, enter “PEL” or “exchange” in the Search option of Java.Net.

A sample ODI 12.1.3 Repository is available as well.  The ODI sample repository has great examples of how to perform both initial and incremental data upload operations with Oracle Partition Exchange.  This repository will help users to understand how to use Oracle Partition Exchange with ODI.

Happy reading!

Wednesday Nov 12, 2014

ODI 12c - Spark SQL and Hive?

In this post I'll cover some new capabilities in the Apache Spark 1.1 release and show what they mean to ODI today. There's a nice slide shown below from the Databricks training for Spark SQL that pitches some of the Spark SQL capabilities now available. As well as programmatic access via Python, Scala, Java, the Hive QL compatibility within Spark SQL is particularly interesting for ODI...... today. The Spark 1.1 release supports a subset of the Hive QL features which in turn is a subset of ANSI SQL, there is already a lot there and it is only going to grow. The Hive engine today uses map-reduce which is not fast today, the Spark engine is fast, in-memory - you can read much more on that elsewhere.

Figure taken from from the Databricks training for Spark SQL, July 2014.

In the examples below I used the Oracle Big Data Lite VM, I downloaded the Spark 1.1 release and built using Maven (I was on CDH 5.2). To use Spark SQL in ODI, we need to create a Hive data server - the Hive data server masquerades as many things, it can can be used for Hive, for HCatalog or for Spark SQL. Below you can see my data server, note the Hive port is 10001, by default 10000 is the Hive server port - we aren't using Hive server to execute the query, here we are using the Spark SQL server. I will show later how I started the Spark SQL server on this port (Apache Spark doc for this is here).

I started the server using the Spark standalone cluster that I configured using the following command from my Spark 1.1 installation;

./sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.bind.host bigdatalite --hiveconf hive.server2.thrift.port 10001 --master spark://192.168.122.1:7077

You can also specify local (for test), Yarn or other cluster information for the master. I could have just as easily started the server using Yarn by specify the master URI as something like --master yarn://192.168.122.1:8032 where 8032 is my Yarn resource manager port. I ran using the 10001 port so that I can run both Spark SQL and Hive engines in parallel whilst I do various tests. To reverse engineer I actually used the Hive engine to reverse engineer the table definitions in ODI (I hit some problems using the Spark SQL reversing, so worked around it) and then changed the model to use my newly created Spark SQL data server above.

Then I built my mappings just like normal - and used the KMs in ODI for Hive just like normal. For example the mapping below aggregates movie ratings and then joins with movie reference data to load movie rating data - the mapping uses the datastores from a model obtained from the Hive metastore;

If you look at the physical design the Hive KMs are assigned but we will execute this through the Spark SQL engine rather than through Hive. The switch from engine to engine was handled in the URL within our our Hive dataserver.

When the mapping is executed you can use the Spark monitoring API to check the status of the running application and Spark master/workers.

You can also monitor from the regular ODI operator and ODI console. Spark SQL support uses the Hive metastore for all the table definitions be they internally or externally managed data. 

There are other blogs from tools showing how to access and use Spark SQL, such as the one here from Antoine Amend using SQL Developer. Antoine has also another very cool blog worth checking out Processing GDELT Data Using Hadoop. In this post he shows a custom InputFormat class that produces records/columns. This is a very useful post for anyone wanting to see the Spark newAPIHadoopFile api in action. It has a pretty funky name, but is a key piece (along with its related methods) of the framework.

  1. // Read file from HDFS - Use GdeltInputFormat
  2. val input = sc.newAPIHadoopFile(
  3.    "hdfs://path/to/gdelt",
  4.    classOf[GdeltInputFormat],
  5.    classOf[Text],
  6.    classOf[Text]

Antoine also provides the source code to GdeltInputFormat so you can see the mechanics of his record reader, although the input data is delimited data (so could have been achieved in different ways) it's a useful resource to be aware of.

If you are looking at Spark SQL, this post was all about using Spark SQL via the JDBC route - there is another whole topic on transformations using the Spark framework alongside Spark SQL that is for future discussion. You should be aware of and check out the Hive QL compatibility documentation here, check what you can do can't do within Spark SQL today. Download the BDA Lite VM and give it a try.

Tuesday Oct 28, 2014

ODI 12c and DBaaS in the Oracle Public Cloud

This article illustrates how to connect ODI on premise to Oracle in the cloud (OPC), specifically the Database as a Service (DBaaS, see doc here) offering. You will see how easy it is to configure connectivity from within ODI and the use of familiar tools gives you the same consistency from on premise use to the cloud. A big concern for cloud computing is security and ensuring access is restricted and as secure as possible. For ODI on premise the challenge is how to connect to such a secure service. ODI provides tasks for transferring files to and from locations - databases are generally accessed via JDBC.

The initial state of an Oracle DBaaS service restricts remote access to SSL - so you can't just remotely connect by default to an Oracle database listener for example (it is possible to open that up by configuring this within DBaaS). File transfer to the cloud can be done out of the box using sftp capabilities, access to the database in order to load it with data, to transform data within it and to extract data from it can be done with a small bit of SSL tunneling - let's see how. The examples discussed in this article have been developed with ODI 12.1.3, a copy of the driver which performs the SSL tunneling can be found on java.net here. With this driver it is effortless to work with an on premise ODI and an Oracle database in the cloud.

Before we get to the ODI parts let's look at the basics, this is mentioned in the DBaaS documentation but sometimes it's simpler to read what someone has done than follow the doc.....

If you want to be using ODI or other remote capabilities such as ssh, sftp then before creating the Oracle database instance in the cloud you should generate a secure private key-public key pair. The public key gets used when you create the Oracle database instance in the cloud and the private key is used by SSL tools (such as sftp, ssh or the driver used) to securely connect to the cloud. 

When you create the key using something like PUTTY, then ensure you save the public key, private key and export the key using the OpenSSH key option also. ODI actually needs the OpenSSH format right now as the version of a library it depends on supports this.


You can see where the public key is provided below in the Instance Configuration section.....


The great news about the DBaaS capabilities is that it is all very familiar for Oracle database folks also the service itself can be managed from the command line - so as well as the web pages console and EM etc, you can use the command line and work the way you are used to.

Anyway, back on course... when you have the instance up and running it's time to have some fun with it!

File Transfer to the Oracle Public Cloud

In ODI you can use the OdiSftpPut/Get tool in a package, procedure or KM to transfer data to/from the cloud. You can see in the example below the OdiSftpPut tools is being used to transfer a file 'm.csv' from the local filesystem (d:\data) to the directory (/home/oracle) in the cloud. The private key is specified in the property 'SSH Identity File' and the key file password is specified in 'Remote User Password'. The OS user to use for the ftp is specified as 'oracle' in the property 'Remote User Name'.

Very simple. The DBaaS instance has OS users created when initialized you can read more about the users 'opc' and 'oracle' in the DBaaS documentation.

Transforming Data in the Oracle Public Cloud

Mappings are used to transform data from one representation to another. In this example you will see how the file staged in the Oracle Public Cloud is integrated with an Oracle target - just like standard ODI on premise use cases. It is no different. Below you can see the image has the logical mapping at the top, with the file being mapped to an Oracle target table, then the middle part of the image shows the physical design, the map uses the LKM File to Oracle (External Table) KM to create an external table on top of the file in the cloud and then the target is integrated with the Oracle Insert KM. 

When I execute the mapping all of the statements to transform are executed in the OPC - in this particular design everything is executed inside the Oracle database.

The ODI data server definition is using a custom driver (here) which extends the Oracle JDBC driver. The driver creates a SSH tunnel between the host executing the connect and the instance in the cloud. This means all ODI objects such as procedures, mappings and so forth that execute statements on regular Oracle systems can execute them on the cloud instances too. I actually created my demonstration schemas and granted all the permissions using a procedure in ODI. The procedure is shown below, see the snippet of the statements creating the users - the target logical schema was my DBAAS_INSTANCE.

Let's dig under the covers and have a look at how the physical schema is defined. You'll need the driver, and have it copied into your oracledi/userlib directory (or wherever your agent is installed if using an agent). You can then define the connection, specify the database user name and password for that database user;

Then you specify the driver, the driver you need to download and mentioned above. The URL is of the form of the Oracle JDBC driver. The difference is in how you specify the host, port and sid/service. The sid/service are your actual cloud service details. Since we are using the SSH tunnel technique, we actually specify the local host and a port number (default 5656) on the local host.

The properties to configure the SSH tunnel are defined either in the properties panel or in a file specified in the properties panel. I've decided here to use the file approach, so can specify the file name in the property propertiesFile.

In my example, this file contains;

  • sslUser=oracle
  • sslHost=<my_cloud_ip_address>
  • sslRHost=<my_cloud_ip_address>
  • sslRPort=1521
  • sslPassword=my_private_key_password
  • sslPrivateKey=D:\\credentials\\dbcloud12c_private_key_openssh.ppk

That is all that is needed and you can be very creative using all the powers of Oracle in the cloud and ODI for integration. Here is a summary of the properties supported by the driver.

Property Name Property Description
sslUser The OS user to connect via SSL with.
sslHost The address of the host to SSL to.
sslRHost The tunnel can be made between the client through the SSL host to this host. In my case this was the same as the SSL host.
sslRPort The port to tunnel to. The Oracle listener is often run on 1521, so this is the default if this property is not specified.
sslPassword The password for the private key file. In ODI you must use OpenSSH formatted private key file.
sslPrivateKey The SSL private key file location.
sslLPort By default the local port used is 5656, it can be changed with this property. You must reference this port number in the URL also.

The driver is a fairly simple wrapper around the Oracle JDBC driver, it leverages SSL tunneling to forward requests on a secure port to the Oracle TNS listener. This technique enables a very familiar way of connecting and interacting with the Oracle database, the driver is on java.net and is available to try and get feedback on. So try it and let us know what you think. Familiarity and consistency are very important both from the stance of the tooling and leveraging existing knowledge (including modules). This allows ODI users to work with the Oracle Public Cloud DBaaS instance just as they do with their on premise systems. Have fun!

Friday Oct 24, 2014

Automating ODI development tasks using the SDK

By Ayush Dash, Oracle Consulting Services

Oracle Data Integrator (ODI) 11g uses Interfaces to move data from a source to a target datastore. ODI Studio is a very convenient drag and drop UI to build, test and execute such interfaces. These interfaces can have different processing logic and complex transformations for disparate requirements, but there could be interfaces which behave in the exact same manner except the source and target are different.

Let’s say, I have these requirements defined, able to identify the different buckets of processing required and develop respective interfaces. That’s easily done! However, if I change the requirement such that I need 10 interfaces for each bucket, the requirement gets little complex and am I face increased level of effort. What about 100 such interfaces for each buckets? Much more effort required now! It’s the same repetitive set of tasks but it needs to be done for each interface for each bucket. The problem we face here is to somehow expedite and automate the entire sequence of steps for each bucket and reduce the redundant, manual development of ODI interfaces. As the number of interface grows, our problem (increase in effort) compounds.

Note, this is not limited to interfaces only, it can be extended to generate scenarios, packages etc.

Use Case:

In one of my ODI engagements, we had the below requirements with aggressive timelines.

  1. Incremental Loads from a Source Database to Incremental Database. (ODI interfaces)
  2. Data masking on Incremental Database (not an ODI task)
  3. Incremental loads from Incremental Database to Target Database. (ODI Interfaces)

This had to be done for Oracle and PeopleSoft (2 buckets) and a total of 2300 tables (So a total of 4600 interfaces. 2300 interfaces for step 1 and 2300 for step 3) and eventually respective scenarios.

ODI SDK Groovy Scripts:

ODI Studio provides a Groovy Editor; a java based scripting language as part of its install. Groovy can be leveraged to work with ODI SDK and build automation scripts. Below is the list of scripts;

  • CreateProject – Creates an ODI Project with a ProjectName and FolderName.
  • ImportKnowledgeModules – Imports the specified Knowledgemodules to the KM directories.
  • CreateModels – Creates the source and target Models for existing Datastores.
  • CreateModelsandDatastore – Creates Datastores and Models.
  • CreateInterfaceIterative – Iterates through all the Source Datastores and generates an interface for each with respective Target Datastores.
  • CreateInterfaceIterativeandSetKMOptions – Creates Interfaces and set a KM options iteratively.
  • CreateScenario – Create scenarios for all the interfaces.
  • ExecuteScenario – Executes all the scenarios under all the interfaces.
  • CountInterface – Counts the no. of interfaces, can be used al validation.

The scripts and guide have been uploaded to the Oracle Data Integration project on Java.net: https://java.net/projects/oracledi.

All the scripts can be downloaded from here: https://java.net/projects/oracledi/downloads/download/ODI/SDK%20Samples/ODI%2011g%20SDK%20Automation/Automation%20Scripts/Automation%20Scripts.zip

Please refer to the ODI SDK Automation Guide for detailed steps: https://java.net/projects/oracledi/downloads/download/ODI/SDK%20Samples/ODI%2011g%20SDK%20Automation/ODI%2011g%20SDK%20Automation%20Guide.doc

Thursday Sep 25, 2014

ODI 12c - Migration from OWB to ODI - PLSQL Procedures Pt 2

In the first part of this blog post I showed you how PLSQL procedures could be invoked from ODI and help you benefit from supporting such functionality in the tool and increasing performance. Here in part 2, I'll show how the map can be changed in OWB and subsequently migrated using the migration utility.

Remember the original mapping design in OWB used a transformation operator to invoke the PLSQL procedure, the same mapping can be modified by replacing that transformation component with a construct operator to build the ref cursor and also a table function operator, see below for the original map design at the top and the modified one below with the ref cursor and table function;


The mapping can be configured to be further optimized by specifying extraction and loading hints in order to tune performance for your system;


You will have to enable the parallel DML capabilities for the OWB mapping also. With this design you can test/ensure it works and then use it in a migration. Why go through all this? You may want to try and optimize existing mappings in OWB, or when migration your map may be more complex and you do not wish to reconstruct it (perhaps the mapping is large and complex upstream from the PLSQL procedure). Doing this will save that work, you need to remove the PLSQL procedure plus target attribute mappings and insert the cursor and table function.

When the migration utility migrates this, it will migrate the entire logical design and also the hints you have specified. You will see the following mapping in ODI;

You will have to add the enable parallel DML code into the begin mapping command and then the code will be equivalent and performing as such. For details of the OWB to ODI migration utility see here, it's also worth checking various other useful resources such as the migration in action demo here and Stewart Bryson's 'Making the Move from Oracle Warehouse Builder to Oracle Data Integrator 12c' article here (useful tip on database link mechanics in ODI).

ODI 12c - Migrating from OWB to ODI - PLSQL Procedures

The OWB to ODI migration utility does a lot, but there are a few things it doesn't handle today. Here's one for our OWB customers moving to ODI who are scratching their heads on the apparent lack of support for PLSQL procedures in ODI... With a little creative work you can not only get those mappings into ODI but there is potential to dramatically improve performance (my test below improves performance by 400% and can be easily further tuned based on hardware).

This specific illustration will cover the support for PLSQL procedures (not functions) in OWB and what to do in ODI, OWB takes care of invoking PLSQL procedures mid-flow - it did this by supporting PLSQL row-based code for certain (not all) map designs out of the box - PLSQL procedure invocation was one case this was done. The PLSQL that OWB generated was pretty much as efficient as it could have been for PLSQL (used bulk collect and many other best practices you didn't have to worry about as a map designer) but it was limited in the logical map support (couldn't have a set-based operator such as join after a row based only operator such as a PLSQL transformation) - it was also PLSQL not SQL.

Here we see how with a simple pipelined, parallel enabled table function wrapper around your PLSQL procedure call, how you capture the same design in ODI 12c and/or get the mapping migrated from OWB. I think the primary hurdle customers have is what is the option going forward. To solve this, we will just leverage more of the Oracle database; table functions and parallelize the <insert your favorite word> out of it!

The mapping below calls a PLSQL procedure and OWB generated PLSQL row based code for this case, the target is getting loaded with data from the source table and the 2 output parameters of the PLSQL procedure;

When you try and migrate such a mapping using the OWB to ODI migration utility, you'll get a message indicating that the map cannot be migrated using the utility. Let's see what we can do! The vast majority of mappings are set-based, generally a very small subset are row based PLSQL mappings. Let's see how this is achieved in ODI 12c.

I did a test using the generated code from OWB - no tuning just the raw code for the above mapping - it took ">">">12 minutes 32 seconds to process about 32 million rows and invoke the PLSQL procedure and perform a direct path insert into the target. With my ODI 12c design using a very simple table function wrapper around the PLSQL procedure I can cut the time to 3 minutes 14 seconds!! Not only can I do this, but I can easily further optimize it to better leverage the Oracle database server by quickly changing the hints - I had a 4 processor machine, so that's about as much as I could squeeze out of it.

Here is my map design in ODI 12c;

The table function wrapper to call the PLSQL procedure is very simple, line 7 is where I call the PLSQL procedure, I use the object instance in the call and pipe the data when the call is made;

  1. create or replace function TF_CALL_PROC(input_values sys_refcursor) return TB_I_OBJ pipelined parallel_enable(PARTITION input_values BY ANY) AS
  2.   out_pipe i_obj := i_obj(null,null,null,null,null,null,null,null,null);
  3. BEGIN
  4.   LOOP
  5.     FETCH input_values INTO out_pipe.prod_id, out_pipe.cust_id,out_pipe.time_id,out_pipe.channel_id,out_pipe.promo_id,out_pipe.quantity_sold,out_pipe.amount_sold;
  6.     EXIT WHEN input_values%NOTFOUND;
  7.     MYPROC(out_pipe.prod_id,out_pipe.status,out_pipe.info);
  8.     PIPE ROW(out_pipe);
  9.   END LOOP;
  10.   CLOSE input_values;
  11.   RETURN;
  12. END;

This is a very simple table function (with enough metadata you could generate it), it uses table function pipelining and parallel capabilities - I will be able to parallelize all aspects the generated statement and really leverage the Oracle database. The above table function uses the types below, it has to project all of the data used downstream - whereas OWB computed this, you will have to do that.

  1. create or replace type I_OBJ as object (
  2.  prod_id number,
  3.  cust_id number,
  4.  time_id date,
  5.  channel_id number, 
  6.  promo_id number,
  7.  quantity_sold number(10,2),
  8.  amount_sold number(10,2),
  9.  status varchar2(10),
  10.  info number
  11.   );
  12. create or replace type TB_I_OBJ as table of I_OBJ; 

The physical design in ODI has the PARALLEL(4) hints on my source and target and I enable parallel DML using the begin mapping command within the physical design.

You can see in above image when using Oracle KMs there are options for hints on sources and targets, you can easily set these to take advantage of the hardware resources, tweak these to pump the performance throughput!

To summarize, you can see how we can leverage the database to really speed the process up (remember the 400%!), also we can still capture the design in ODI and on top of that unlike in OWB, this approach let's us carry on doing arbitrary data flow transformations after the table function component which is invoking our PLSQL procedure - so we could join, lookup etc. Let me know what you think of this, I'm a huge fan of table functions I think they afford a great extensibility capability.  

Tuesday Sep 09, 2014

ODI 12c - Models in Data Modeler and ODI

Ever wondered how to get your models from Oracle SQL Developer Data Modeler (SDDM) in to ODI data models? The most common practice is to generate and execute the physical DDL scripts from SDDM to the target database. Another technique is possible leveraging the ODI SDK and the SDDM SDK - that's what I will illustrate here. There is an example script posted here on the java.net site.

There is an end to end demo viewlet below to see the script in action, check it out;

https://blogs.oracle.com/dataintegration/resource/viewlets/odi_models_from_sddm_viewlet_swf.html

The viewlet shows the transformation script in action creating the ODI model from the Data Modeler design, here's a peek at the script in action;

In the viewlet you will see how I added the groovy scripting engine as an engine for SDDM - I can then leverage my groovy skills and the ODI SDK to build the script and provide useful capabilities.

This script is publicly available at link above, take, enhance and comment! Join the communities on LinkedIn for Oracle Data Integration and the OTN forum to learn and exhange with other members of the community.

Wednesday Aug 06, 2014

OWB to ODI 12c Migration in action

The OWB to ODI 12c migration utility provides an easy to use on-ramp to Oracle's strategic data integration tool. The utility was designed and built by the same development group that produced OWB and ODI.

Here's a screenshot from the recording below showing a project in OWB and what it looks like in ODI 12c;


There is a useful webcast that you can play and watch the migration utility in action. It takes an OWB implementation and uses the migration utility to move into ODI 12c.

http://oracleconferencing.webex.com/oracleconferencing/ldr.php?RCID=df8729e0c7628dde638847d9511f6b46

It's worth having a read of the following OTN article from Stewart Bryson which gives an overview of the capabilities and options OWB customers have moving forward.
http://www.oracle.com/technetwork/articles/datawarehouse/bryson-owb-to-odi-2130001.html

Check it out and see what you think!

Monday Jul 28, 2014

New Security Enhancements in ODI 12.1.3

Oracle Data Integrator now uses Advanced Encryption Standard (AES) as the standard encryption algorithm for encrypting ODI objects such as Knowledge Modules, procedures, scenarios or actions as well as any password.

You can configure the encryption algorithm and key length to meet your specific requirements. By default ODI uses AES-128 but you can also use cryptographic keys of 192 and 256 bits.

Passwords and other sensitive information included in repository exports are also now encrypted and secured by an export key. This export key must be provided when importing the exported content, if it is not provided all the sensitive information is removed from the imported object.

You can find more information about these security enhancements in the ODI 12.1.3 documentation: Advanced Encryption Standard


This is just one of the many new features added to ODI 12.1.3! You can find a complete list in the following document: Oracle Data Integrator 12c New Features Overview.

Wednesday Jul 23, 2014

ODI 12c - Sessionizing Data with MATCH_RECOGNIZE and SQL Override

The Oracle 12c database introduced a very useful new pattern matching capability, useful for many different use cases. The example I will use in this blog post within ODI is sessionizing data - some log data with sample URLs, IP addresses initiating the clicks and their timestamps. To incorporate this in ODI, I will use one of the lesser known capabilities that many users often want to use - it is the ability to override the generated SQL. I blogged about how this could be done in ODI 11g here and here - in this post I will show how this can be done in 12c.

There are some great posts and presentations from Keith Laker and many other partners and users on the MATCH_RECOGNIZE capability, the post on Sessionization with 12c pattern matching is super fast is a great read, check it out! I will base the SQL used on that post. 

My source data has the following information;

  1. create table clicks (IP varchar2(16), Click_Timestamp timestamp, URL varchar2(200));
  2. insert into clicks values ('192.65.20.33','03-JUL-2014 11:33:15','stubhub.com');
  3. insert into clicks values ('192.65.20.33','03-JUL-2014 11:33:21','mlssoccer.com');
  4. insert into clicks values ('192.65.20.33','03-JUL-2014 11:59:08','netflix.com');
  5. insert into clicks values ('192.65.9.89','04-JUL-2014 09:02:19','amazon.com');
  6. insert into clicks values ('192.65.9.89','04-JUL-2014 09:02:26','amazon.com/travel');
  7. insert into clicks values ('192.65.9.89','04-JUL-2014 09:48:27','citibank.com');
  8. insert into clicks values ('192.65.20.33','04-JUL-2014 10:01:05','ebay.com');
  9. commit;

The type of query we need to generate a session ID using MATCH_RECOGNIZE is shown below. As in Keith's blog, we defined a session as a sequence of one or more events with the same partition key (IP address) where the gap between the timestamps is less than 10 seconds - in this demo, the gap is completely arbitrary and could be set to any number as required. Here we detect the various sessions and assign a surrogate session id to each session within each partition (IP). 

  1. SELECT ip, session_id start_time, no_of_events, duration
  2. FROM clicks MATCH_RECOGNIZE
  3.  (PARTITION BY ip ORDER BY click_TimeStamp 
  4.   MEASURES match_number() session_id, 
  5.            count(*) as no_of_events,
  6.            first(click_timestamp) start_time, 
  7.            last(click_timestamp) - first(click_timestamp) duration 
  8.   ONE ROW PER MATCH 
  9.   PATTERN (b s*) 
  10.   DEFINE 
  11.        s as ((s.click_TimeStamp - prev(s.click_TimeStamp)) <= TO_DSINTERVAL('0 0:00:10.0')) 
  12.  ); 

The Oracle syntax (...FROM clicks MATCH_RECOGNIZE....) for this operation is a little different from other SQL constructs, it isn't a join, it isn't a table function, it might be a custom component - but until that's public need another way. You could create a view and use that in ODI. Another option is to use the SQL override, custom template capability. This let's you define any arbitrary code for a source component in a mapping. Source datastores in mappings have a physical option named CUSTOM_TEMPLATE. You can define code here such as the SQL to produce the sessionized data above. The code can use ODI's substitution reference methods to make the code more generic (parameterized schemas etc.). For example the physical map design below defines a mapping with a source datastore defining the attributes projected from my query. I defined the datastore in an ODI model - it isn't actually in a database, it's just used in the mapping.

As it's description states 'Custom SQL statement as an inline view to override default code generation.' is what this is for. 

 

Note this option is under the extract options on the source datastore's physical node, the value for CUSTOM_TEMPLATE can be edited and arbitrary code, in this case SQL, defined. Double clicking the value field in the KM options table brings up an editor where you can see the entire code block I have defined. Note below, I have hard-wired the clicks table in the SQL;

To make this more generic and flexible, I can use the odiRef methods to get the name of the clicks table generated based on the topology defined in ODI. My clicks table is in a model in ODI and the use of odiRef.getObjectName below let's me get the name correctly generated depending on the generation context - perhaps in one context it is in schema DEVX and in another it is in DEVY, using the SDK shields us from such changes.

You will note also that there is another KM option named SELECT_HINT, this is used by the default 12c code generation capabilities but if you override the code using this custom code option described above you can still incorporate the hints supplied in the mapping. So if you define a value for the KM option SELECT_HINT you can have it included in your custom code by using this getOption method - pretty flexible! 

Hope this is a useful insight into some of the capabilities provided in ODI 12c. I used the latest and greatest 12.1.3 release in this post, that's what you will need to use this functionality.

Monday Jul 21, 2014

ODI 12.1.3: New Model and Topology Objects Wizard

Oracle Data Integrator 12.1.3 introduces a new wizard to quickly create Models. This wizard will not only help you create your Models more easily, if needed it will also create the entire required infrastructure in the ODI Topology: Data Servers, Physical and Logical Schemas.

In this blog article we will go through an example together and add a new Model to access the HR sample schema of an Oracle database. You can follow through this example using the ODI Getting Started VirtualBox image which is available here: http://www.oracle.com/technetwork/middleware/data-integrator/odi-demo-2032565.html

The ‘New Model and Topology Objects’ wizard can be accessed from the Models menu as shown below:

The wizard opens up and displays default settings. From there we can customize our objects before they actually get created in the ODI repositories.

In this example we want to access tables stored in the HR schema of an Oracle database so we name the Model ORACLE_HR. Note that the Logical Schema as well as the Schema and Work Schema fields in the Physical Schema section automatically default to the Model name:

Next we will give a new name to our Data Server: LINUX_LOCAL_ORACLE since we are connecting to a local Oracle database running on a Linux host.

We then fill in the User, Password and URL fields to reflect the environment we are in. To access the HR schema we use the ODI Staging area user which is ODI_STAGING. This is a best practice and it also ensures that the Work Schema field automatically gets updated with the right value for the Staging Area.

Note that the wizard also allows us to link a new Model to an existing Data Server.

Finally we click on Test Connection to make sure the parameters are correct.


Then we update the Schema field using the drop-down list to point to the HR schema at the database level.

Our Model is now fully set up, we click on OK to have it created along with its related Topology objects. The Model ORACLE_HR opens up allowing us to reverse-engineer the tables using the Selective Reverse-Engineering tab:

We pick all the tables and click on the Reverse Engineer button to start this process and save the Model at the same time. A new Model called ORACLE_HR was created as shown below as well as the appropriate objects in the Topology:


Thursday Jul 17, 2014

ODI 12c and Eloqua using DataDirect Cloud JDBC Driver

Sumit Sarkar from Progress DataDirect just posted a great blog on connecting to Eloqua in ODI 12c using the DataDirect Cloud JDBC driver. You can find the article here: http://blogs.datadirect.com/2014/07/oracle-data-integrator-etl-connectivity-eloqua-jdbc-marketing-data.html


The steps described in this tutorial also apply to other datasources supported by the DataDirect Cloud JDBC driver.

Wednesday Jul 09, 2014

ODI 11g - HDFS Files to Oracle, community post from ToadWorld

There is a new tutorial on using ODI to move HDFS files into an Oracle database using OLH-OSCH from Deepak Vohra on the ToadWorld blog. This article covers all the setup required in great detail and will be very helpful if you're planning on integrating with HDFS files.

http://www.toadworld.com/platforms/oracle/w/wiki/10957.integrating-hdfs-file-data-with-oracle-database-11g-in-oracle-data-integrator-11g.aspx

About

Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products-- including Oracle Data Integrator, Oracle GoldenGate and Oracle Enterprise Data Quality

Search

Archives
« March 2015
SunMonTueWedThuFriSat
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
    
       
Today