Friday Jan 09, 2015

ODI 12c - Mapping SDK Overview

In this post I'll show some of the high-level concepts in the physical design and the SDKs that go with it. To make sense of the physical side I'll also cover some of the logical design area. The conceptual model for logical mapping in ODI 12c is shown below (it's quite a change from 11g); this model allows us to build arbitrary flows. Each entity below can be found in the 12c SDK. Many of these have specialized classes - for example MapComponent has specializations for the many mapping components available from the designer - and these classes may carry specific business logic or specialized behavior. You can use the strongly typed, highly specialized classes like DatastoreComponent, or you can write applications in a generic manner using the conceptual high-level SDK - that's the technique I used in the mapping builder here.

The heart of the SDK for this area of the model can be found here;

If you need to see these types in action, take the mapping illustration below as an example; I have annotated the different items within the mapper. The connector points are shown in the property inspector rather than in the graphical design. Some components have many input or output connector points (for example, the set component has many input connector points). Some components are simple expression-based components (such as join and filter), which we call selector components; other components project a specific shape, and we call those projectors - that's just how we classify them.
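As a small illustration of the generic style mentioned earlier, the Groovy fragment below walks every component of a mapping's logical design without referring to any of the specialized classes. This is only a sketch: the map variable is assumed to already hold a Mapping loaded from the repository, and the getAllComponents accessor name is an assumption you should verify against the 12c SDK Javadoc for your release.

  // Generic walk over the logical design - no specialized component classes needed
  map.getAllComponents().each { comp ->
      // comp may really be a DatastoreComponent, JoinComponent, FilterComponent and so on
      println "${comp.getClass().getSimpleName()} : ${comp.getName()}"
  }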

In 12c we clearly separated the physical design from the logical; in 11g much of this was blended together. Separating them also allows many physical designs for one logical mapping design. We also had to change the physical SDK and model so that we could support multiple targets and arbitrary flows. 11g was fairly rigid - if you look at the 'limitations' sections of the KMs you can see some of that. KMs are assigned on map physical nodes in the physical design, and there are helper methods on the execution unit so you can get/set KMs.

The heart of the SDK for this physical design area of the model can be found here;

If we use the logical mapping shown earlier and look at the physical design we have for it, we can annotate the items below so you can envisage how each of the classes above is used in the design;

The MapPhysicalDesign class has all of the physical-related information, such as the ODI Optimization Context and Default Staging Location (there is also a Staging Location Hint on the logical mapping design) - these are items that existed in ODI 11g and are carried forward.

To take an example, if I want to change the LKMs or IKMs set on all physical designs, one approach is to iterate through all of the nodes in a physical design and check whether an LKM or an IKM is assigned for each node - this then lets you do all sorts of things, from getting the current setting to setting a new value. The snippet below gives a small illustration in Groovy of the methods from the ODI SDK;

  // Walk every physical design in the mapping and inspect/replace the KM on each physical node
  PhysicalDesignList = map.getPhysicalDesigns()
  for (pd in PhysicalDesignList) {
      PhysicalNodesList = pd.getPhysicalNodes()
      for (pn in PhysicalNodesList) {
          if (pn.isLKMNode()) {
              CurrentLKMName = pn.getLKMName()
              // ...
              pn.setLKM(myLKM)
          } else if (pn.isIKMNode()) {
              CurrentIKMName = pn.getIKMName()
              // ...
              pn.setIKM(myIKM)
          }
      }
  }

There are many other methods within the SDK to do all sorts of useful stuff. A first example is the getAllAPNodes method on a MapPhysicalDesign; it gives all of the nodes in a design which will have LKMs assigned, so you can quickly check or set them. A second example is the getTargetNodes method on MapPhysicalDesign, which is handy for getting all target nodes to set IKMs on. A final example is finding the AP node in the physical design for a logical component in your design - use the findNode method to achieve this.
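Putting those helpers together, a sketch might look like the following. It is only an illustration that assumes the getAllAPNodes, getTargetNodes and findNode methods behave as described above; myLKM, myIKM and filterComp are hypothetical placeholders for objects you would already have looked up elsewhere.

  pd = map.getPhysicalDesigns()[0]
  // Set a (hypothetical) LKM on every AP node of the design
  pd.getAllAPNodes().each { apNode ->
      apNode.setLKM(myLKM)
  }
  // Set a (hypothetical) IKM on every target node of the design
  pd.getTargetNodes().each { tgtNode ->
      tgtNode.setIKM(myIKM)
  }
  // Locate the AP node corresponding to a logical component (e.g. a filter) in the design
  apNode = pd.findNode(filterComp)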

Hopefully there are some useful pointers here. It's also worth being aware of the ODI blog post 'Mapping SDK the ins and outs', which provides an overview of and cross-reference to the primary ODI objects and the underpinning SDKs. If there are any other specific questions, let us know.

Monday Dec 29, 2014

Oracle Data Enrichment Cloud Service (ODECS) - Coming Soon

What are your plans around Big Data and Cloud?

If your organization has already begun to explore these topics, you might be interested in a new offering from Oracle that will dramatically simplify how you use your data in Hadoop and the Cloud:

Oracle Data Enrichment Cloud Service (ODECS)

There is a perception that most of the time spent in Big Data projects is dedicated to harvesting value. The reality is that 90% of the time in Big Data projects is really spent on data preparation. Data may be structured, but more often it will be semi-structured, such as weblogs, or fully unstructured, such as free-form text. The content is vast, inconsistent, incomplete, often off topic, and comes in multiple differing formats from many sources. In this environment each new dataset takes weeks or months of effort to process, frequently requiring programmers to write custom scripts. Minimizing data preparation time is the key to unlocking the potential of Big Data.

Oracle Data Enrichment Cloud Service (ODECS) addresses this very reality. ODECS is a non-technical, web-based tool that sets out to minimize data preparation time in an effort to quickly unlock the potential of your data. The ODECS tool provides an interactive set of services that automate, streamline, and guide the process of data ingestion, preparation, enrichment, and governance without costly manual intervention.

The technology behind this service is amazing; it intuitively guides the user with a machine learning driven recommendation engine based on semantic data classification and natural language processing algorithms. But the best part is that non-technical staff can use this tool as easily as they use Excel, resulting in a significant cost advantage for data intensive projects by reducing the amount of time and resources required to ingest and prepare new datasets for downstream IT processes.

Curious to find out more? We invite you to view a short demonstration of ODECS below:


Let us know what you think!

Stay tuned as we write more about this offering…

Wednesday Dec 17, 2014

Oracle Partition Exchange Blog from the ODI A-Team

More great information from the ODI A-Team!

Check out the A-Team’s most recent blog about the Oracle Partition Exchange – it does come in two parts:

Using Oracle Partition Exchange with ODI

Configuring ODI with Oracle Partition Exchange

The knowledge module is on Java.Net, and it is called “IKM Oracle Partition Exchange Load”.  To search for it, enter “PEL” or “exchange” in the Search option of Java.Net.

A sample ODI 12.1.3 Repository is available as well.  The ODI sample repository has great examples of how to perform both initial and incremental data upload operations with Oracle Partition Exchange.  This repository will help users to understand how to use Oracle Partition Exchange with ODI.

Happy reading!

Thursday Dec 11, 2014

Recap of Oracle GoldenGate 12c for the Enterprise and the Cloud webcast

Last week I hosted a webcast on Oracle GoldenGate 12c's latest features and its solutions for cloud environments. For those of you who missed it, I wanted to give a quick recap and remind you that you can watch it on demand via the following link:

Oracle GoldenGate 12c for the Enterprise and the Cloud

In this webcast my colleague Chai Pydimukkala, senior director of product management for Oracle GoldenGate, and I talked about some of the key challenges in cloud deployments and how Oracle GoldenGate addresses them. We discussed examples of cloud-specific data integration use cases, such as synchronizing data between on-premises systems and Oracle Cloud or Amazon Cloud environments. We also discussed zero-downtime consolidation to the cloud using Oracle GoldenGate.

In the webcast, Chai also presented  the latest features of Oracle GoldenGate 12.1.x including:

  • New database support, including Informix, SQL Server 2014, MySQL Community Edition
  • Real-time data integration between on-premises and cloud with SOCKS5 compliance
  • New features in Oracle GoldenGate Veridata especially the new data repair capabilities
  • Enhancements to Integrated Delivery, and support for capturing data from Active Data Guard standby system
  • The new migration utility to help with the move from Oracle Streams to Oracle GoldenGate.
As with previous GoldenGate webcasts we had a very interactive Q&A where we received tons of questions. We tried to answer as many as possible in the available time but could not get to all of them. Below are some of the commonly asked questions we received during the webcast, with brief answers:

Question: Does GoldenGate replace ODI? When shall we use an ETL tool vs GoldenGate?

Answer: GoldenGate is designed for real-time change data capture, routing, and delivery. It performs basic, row-level transformations. For complex transformation requirements you still need ETL/E-LT solutions. Our customers augment their existing ETL/E-LT solutions by adding GoldenGate for real-time, low-impact change data capture and delivery. GoldenGate can deliver data for ETL in flat-file format, feed staging tables, or publish JMS messages. Oracle Data Integrator's E-LT architecture creates a perfect combination, as GoldenGate can capture changed data non-intrusively with low impact and deliver it to staging tables in the target with sub-second latency. With the ability to perform transformations within the target (or source) database, ODI takes this change data, performs transformations in micro-batches and loads user tables with high performance. Because of this natural and strategic fit between the products, we have tightly integrated ODI and GoldenGate. To learn more about how GoldenGate and ODI are integrated and work together, please watch this on-demand webcast. I also recommend reading the following white paper on real-time data warehousing best practices.

Here you can see a demo of ODI, and for customer examples you can watch the Paychex, RBS, and Raymond James videos.

Question: Is there a plan to sell GoldenGate as a service soon?

Answer: Yes, it is in the plans. We are working with the Oracle Cloud team. But we are not able to give a timeline.

Question: Are Integrated Capture and Delivery only available for Oracle Database or this can be used for non-Oracle databases?

Answer: Integrated Capture and Delivery are only available for Oracle Database and truly differentiate Oracle GoldenGate from other data integration and replication vendors. We offer Coordinated Delivery for all supported databases; it simplifies configuration significantly and works with non-Oracle databases too. You can read more about Coordinated Delivery in a related blog, via the Oracle GoldenGate 12c Release 1 New Features Overview white paper, or in the documentation.

Question: Is GoldenGate available for download for trial?

Answer: Yes, you can download GoldenGate on OTN for education and development purposes: http://www.oracle.com/technetwork/middleware/goldengate/downloads/index.html.  For big data use case, you can use Big Data Lite virtual environment to experiment with Oracle GoldenGate. 

Question: Does GoldenGate replace Active Data Guard? 

Answer: No. The products are complementary. Data Guard is a physical replication solution designed for Oracle Database disaster recovery, and it delivers that with great simplicity and performance. Oracle GoldenGate offers logical/transactional data replication, which supplements Active Data Guard by eliminating downtime during planned outages (migration, consolidation, maintenance) and by enabling active-active data center synchronization for maximum availability. The license for Oracle GoldenGate for Oracle Database also includes Active Data Guard. As mentioned in the webcast, GoldenGate 12c can now capture data from Active Data Guard's standby system too.

Question: Does GoldenGate Veridata repair a subset of data instead of doing a full sync? For example: I want to repair only missed deletes.

Answer: Yes. Oracle GoldenGate Veridata can do granular repair for out-of-sync records. Please see our Oracle GoldenGate Veridata data sheet for more info.

Question: How do we use Enterprise Manager for GoldenGate? 

Answer: The Oracle Management Pack for Oracle GoldenGate license includes an Enterprise Manager plug-in that allows you to use your Oracle Enterprise Manager solution to monitor and manage Oracle GoldenGate deployments.

If you did not attend the webcast live, I highly recommend watching Oracle GoldenGate 12c for the Enterprise and the Cloud on demand and listening to the long Q&A session with Chai. During the webcast we covered many other frequently asked questions.


Wednesday Dec 10, 2014

Oracle Enterprise Metadata Management 12.1.3.0.1 is now available!

As a quick refresher, Metadata Management is essential to solving a wide variety of critical business and technical challenges: understanding how report figures are calculated, understanding the impact of changes to upstream data, providing reports in a business-friendly way in the browser, and providing reporting capabilities on the entire metadata of an enterprise for analysis and improvement. Oracle Enterprise Metadata Management is built to solve all these pressing needs for customers in a lightweight browser-based interface. Today, we announce the availability of Oracle Enterprise Metadata Management 12.1.3.0.1 as we continue to enhance this offering.

With Oracle Enterprise Metadata Management 12.1.3.0.1, you will find business glossary updates, user interface updates for a better experience, as well as improved and new metadata harvesting bridges including Oracle SQL Developer Data Modeler, Microsoft SQL Server Integration Services, SAP Sybase PowerDesigner, Tableau and more. There are also new dedicated web pages for tracing data lineage and impact! At a more granular level you will also find new customizable action menus per repository object type for more personalization. For a full read on the new features, please read here. Additionally, view here for the certification matrix details.

Download Oracle Enterprise Metadata Management 12.1.3.0.1!

Tuesday Dec 09, 2014

Big Data Governance– Balancing Big Risks and Bigger Profits

To me, nothing exemplifies the real value that Big Data brings to life better than the role it played in the last edition of the FIFA soccer World Cup. Stephen Hawking predicted that England's chance of winning a game drops by 60 percent every time the temperature increases by 5ºC. Moreover, he found that England plays better in stadiums situated 500 meters above sea level, and performs better if games kick off later than 3 PM local time. In short, England's soccer team struggles to cope with the conditions in hot and humid countries.

We have all heard, meditated and opined about the value of Big Data, the panacea for all problems. And it is true: Big Data has started delivering real profits and wins to businesses. But as with any data management program, the profit benefits should be maximized while striving to minimize potential risks and costs.

Customer Data is Especially Combustible Data

The biggest lift for businesses using Big Data is obtained through the mining of customer data. By storing and analyzing seemingly disparate customer attributes and running analytic models across the whole data set (data sampling is dying a painful demise), businesses are able to accurately predict buying patterns and customer preferences, and to create products and services that cater to today's demanding consumers. But this veritable mine of customer information is combustible. By that I mean that a small leak is enough to undo any benefits hitherto extracted, through ensuing blowbacks like financial remuneration, regulatory constrictions and, most important of all, immense reputational damage. And this is why Big Data should always be well governed. Data governance is an aspect of data security that helps with safeguarding Big Data in business enterprises.

Big Data Governance

Big Data Governance is but a part (albeit a very important part) of a larger Big Data security strategy. Big Data security should involve considerations around the efficient and economic storage, retrieval and consumption of data. It should also deal with backups, disaster management and other traditional considerations.

When properly implemented, a good governance program serves as a crystal ball into the data flows within the organization. It will answer questions about how safe the data is and who can and should be able to lay their hands on it, and it will proactively prevent data leakage and misuse. Because when dealing with Big Reservoirs of Data, small leakages can go unnoticed.

Thursday Nov 20, 2014

Let Oracle GoldenGate 12c Take You to the Cloud

If your organization is like the ~80% of the global business community, you are most likely working on a cloud computing strategy, or actively implementing one. The cloud computing growth rate is 5X the overall IT growth rate because of the clear and already proven cost savings, agility, and scalability benefits of cloud architectures.

When organizations decide to embark on their cloud journey, they notice there are several questions and challenges to be addressed, involving data accessibility, security, availability, system management, performance etc. Oracle GoldenGate's real-time data integration and bi-directional transactional replication technology addresses critical challenges such as:

  • How to move my systems to the cloud without interrupting operations?
  • How to enable timely data synchronization between the systems on the cloud and on-premises to ensure access to consistent data for all end users?
  • How do I run operational reports with the data I have in cloud environments, or feed my analytical systems in cloud solutions?
  • In managed or private clouds, how do I keep the cloud platform highly available when I need to do maintenance, upgrades?

 On Tuesday,  December 2nd we will tackle these questions in a free webcast:

Live Webcast: Oracle GoldenGate 12c for the Enterprise and the Cloud

Tuesday, December 2nd, 2014 10am PT/ 1pm ET 

In this webcast, you will not only hear about Oracle GoldenGate's strong solutions for cloud environments, but also the latest features that strengthen its offering. The new features we will discuss include:

  • Support for Informix, SQL Server 2014, MySQL Community Edition, and big data environments
  • Real-time data integration between on premises and cloud with SOCKS5 compliance
  • New data repair functionality to help ensure database consistency across heterogeneous systems
  • Moving from Oracle Streams to GoldenGate with the new migration utility

I would like to invite you to join me and my colleague Chai Pydimukkala, Senior Director of Product Management for Oracle GoldenGate, in this session to learn the latest on GoldenGate 12c and ask your questions in a live Q&A.

Hope to see you there!

Tuesday Nov 18, 2014

Oracle GoldenGate for Informix is Released

Oracle GoldenGate for Informix 12.1.2.1.0 is available on OTN and Oracle eDelivery. This is a new addition to Oracle GoldenGate's heterogeneous database support.[Read More]

Wednesday Nov 12, 2014

ODI 12c - Spark SQL and Hive?

In this post I'll cover some new capabilities in the Apache Spark 1.1 release and show what they mean to ODI today. There's a nice slide shown below from the Databricks training for Spark SQL that pitches some of the Spark SQL capabilities now available. As well as programmatic access via Python, Scala and Java, the Hive QL compatibility within Spark SQL is particularly interesting for ODI... today. The Spark 1.1 release supports a subset of the Hive QL features, which in turn is a subset of ANSI SQL; there is already a lot there and it is only going to grow. The Hive engine today uses MapReduce, which is not fast; the Spark engine is fast and in-memory - you can read much more on that elsewhere.

Figure taken from the Databricks training for Spark SQL, July 2014.

In the examples below I used the Oracle Big Data Lite VM; I downloaded the Spark 1.1 release and built it using Maven (I was on CDH 5.2). To use Spark SQL in ODI, we need to create a Hive data server - the Hive data server masquerades as many things; it can be used for Hive, for HCatalog or for Spark SQL. Below you can see my data server; note the Hive port is 10001, whereas by default 10000 is the Hive server port - we aren't using the Hive server to execute the query, here we are using the Spark SQL server. I will show later how I started the Spark SQL server on this port (the Apache Spark doc for this is here).
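For reference, the JDBC URL on such a data server points at the Thrift server port rather than the default Hive server port. The URL below is an assumed example for the Big Data Lite host used in this post (it is not taken from the screenshots), so adjust the host, port and database for your own environment;

  jdbc:hive2://bigdatalite:10001/default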

I started the server using the Spark standalone cluster that I configured using the following command from my Spark 1.1 installation;

./sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.bind.host=bigdatalite --hiveconf hive.server2.thrift.port=10001 --master spark://192.168.122.1:7077

You can also specify local (for test), YARN or other cluster information for the master. I could have just as easily started the server on YARN by specifying the master URI as something like --master yarn://192.168.122.1:8032, where 8032 is my YARN resource manager port. I ran using the 10001 port so that I could run both the Spark SQL and Hive engines in parallel whilst I did various tests. To reverse engineer, I actually used the Hive engine to reverse engineer the table definitions in ODI (I hit some problems using the Spark SQL reversing, so worked around it) and then changed the model to use my newly created Spark SQL data server above.
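Before switching the model over, it can also be worth sanity-checking the Thrift endpoint from the command line. The Beeline session below is an assumed example (Beeline ships with both Hive and Spark); the host and port are the ones used in this setup;

  ./bin/beeline
  beeline> !connect jdbc:hive2://bigdatalite:10001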

Then I built my mappings just like normal - and used the KMs in ODI for Hive just like normal. For example the mapping below aggregates movie ratings and then joins with movie reference data to load movie rating data - the mapping uses the datastores from a model obtained from the Hive metastore;

If you look at the physical design, the Hive KMs are assigned, but we will execute this through the Spark SQL engine rather than through Hive. The switch from engine to engine is handled in the URL within our Hive data server.

When the mapping is executed you can use the Spark monitoring API to check the status of the running application and Spark master/workers.

You can also monitor from the regular ODI Operator and ODI Console. Spark SQL uses the Hive metastore for all table definitions, whether the data is internally or externally managed.

There are other blogs showing how to access and use Spark SQL from other tools, such as the one here from Antoine Amend using SQL Developer. Antoine also has another very cool blog worth checking out, Processing GDELT Data Using Hadoop. In that post he shows a custom InputFormat class that produces records/columns. It is a very useful post for anyone wanting to see the Spark newAPIHadoopFile API in action. The API has a pretty funky name, but it is a key piece (along with its related methods) of the framework.

  import org.apache.hadoop.io.Text

  // Read file from HDFS - use GdeltInputFormat to produce key/value Text records
  val input = sc.newAPIHadoopFile(
      "hdfs://path/to/gdelt",
      classOf[GdeltInputFormat],
      classOf[Text],
      classOf[Text])

Antoine also provides the source code to GdeltInputFormat so you can see the mechanics of his record reader; although the input data is delimited (so this could have been achieved in different ways), it's a useful resource to be aware of.

This post was all about using Spark SQL via the JDBC route - there is another whole topic on transformations using the Spark framework alongside Spark SQL that is for future discussion. You should check out the Hive QL compatibility documentation here to see what you can and can't do within Spark SQL today. Download the Big Data Lite VM and give it a try.

Monday Nov 10, 2014

Big Data Governance and Metadata Management - A Recap

On the 30th of November we held a webcast on governing Big Data. It was the second in the series on Big Data (if you missed the first, you can register for it here). We discussed the importance of bringing transparency to the Big Data Reservoir architecture and how to improve and enrich data within the reservoir using Oracle's Enterprise Data Quality (OEDQ). Oracle also announced Oracle Enterprise Metadata Management (OEMM), a comprehensive metadata management tool built with a business-friendly, search-driven interface.

Here is a quick recap of some of the questions that came through. 

Do these principles and technology of Metadata Management, Data Governance and Data Quality apply to Big Data as well as traditional Data?

All the technologies are equally applicable to Big Data as well as traditional data warehousing. In fact, Oracle Enterprise Data Quality and Oracle Enterprise Metadata Management are designed to bridge these two worlds.   

 Does Oracle Enterprise Metadata Management work with 3rd party metadata?

Yes. We recognize that to truly govern the data life cycle, Oracle Enterprise Metadata Management should be able to harvest metadata across multiple technologies and platforms, including Oracle and non-Oracle databases, business analytics, data warehouses and ETL engines.

Is Oracle Enterprise Metadata Management compatible with 11g?

Oracle Enterprise Metadata Management is compatible with many 11g products too.

Where can I get more information about Oracle’s Data Integration products?

The best resources for Oracle Data Integration products are:

The Oracle Data Integration Home Page,

The Oracle Data Integration Technology Network,

The Oracle Data Integration Blog

Also connect with us on Facebook and Twitter (#OEDQ, #OEMM, ORCLGoldenGate, ODI12c)

Thursday Nov 06, 2014

Oracle Data Integrator and Hortonworks

Check out Oracle's Alex Kotopoulis being featured on the Hortonworks blog, discussing how Oracle Data Integrator is the best tool for data ingest into Hadoop!

Remember to register for the November 11th joint webinar presented by Jeff Pollock, VP Oracle, and Tim Hall, VP Hortonworks.  Click here to register.  

Tuesday Oct 28, 2014

ODI 12c and DBaaS in the Oracle Public Cloud

This article illustrates how to connect on-premise ODI to the Oracle Public Cloud (OPC), specifically the Database as a Service (DBaaS, see doc here) offering. You will see how easy it is to configure connectivity from within ODI, and how the use of familiar tools gives you the same consistency from on-premise use to the cloud. A big concern for cloud computing is security and ensuring access is restricted and as secure as possible. For on-premise ODI the challenge is how to connect to such a secure service. ODI provides tasks for transferring files to and from locations - databases are generally accessed via JDBC.

The initial state of an Oracle DBaaS service restricts remote access to SSL - so you can't just remotely connect by default to an Oracle database listener, for example (it is possible to open that up by configuring this within DBaaS). File transfer to the cloud can be done out of the box using sftp capabilities; access to the database - in order to load it with data, to transform data within it and to extract data from it - can be done with a small bit of SSL tunneling. Let's see how. The examples discussed in this article have been developed with ODI 12.1.3; a copy of the driver which performs the SSL tunneling can be found on java.net here. With this driver it is effortless to work with on-premise ODI and an Oracle database in the cloud.

Before we get to the ODI parts, let's look at the basics. This is mentioned in the DBaaS documentation, but sometimes it's simpler to read what someone has done than to follow the doc...

If you want to use ODI or other remote capabilities such as ssh or sftp, then before creating the Oracle database instance in the cloud you should generate a private/public key pair. The public key is used when you create the Oracle database instance in the cloud, and the private key is used by SSL tools (such as sftp, ssh or the driver used here) to securely connect to the cloud.

When you create the key using something like PuTTY, ensure you save the public key and private key, and also export the key using the OpenSSH key option. ODI currently needs the OpenSSH format, as the version of the library it depends on supports this format.


You can see where the public key is provided below in the Instance Configuration section.....


The great news about the DBaaS capabilities is that it is all very familiar to Oracle database folks, and the service itself can be managed from the command line - so as well as the web console, EM and so on, you can use the command line and work the way you are used to.

Anyway, back on course... when you have the instance up and running it's time to have some fun with it!

File Transfer to the Oracle Public Cloud

In ODI you can use the OdiSftpPut/Get tools in a package, procedure or KM to transfer data to/from the cloud. In the example below, the OdiSftpPut tool is being used to transfer a file 'm.csv' from the local filesystem (d:\data) to the directory (/home/oracle) in the cloud. The private key is specified in the property 'SSH Identity File' and the key file password is specified in 'Remote User Password'. The OS user to use for the sftp transfer is specified as 'oracle' in the property 'Remote User Name'.
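If you prefer the command form of the tool (for example in a package step or an ODI procedure), the call might look something like the sketch below. The parameter names are assumptions derived from the property labels above - check the ODI Tools Reference for the exact spelling in your release;

  OdiSftpPut "-HOST=<my_cloud_ip_address>" "-USER=oracle" "-PASSWORD=<key_file_password>" "-IDENTITY_FILE=D:\credentials\dbcloud12c_private_key_openssh.ppk" "-LOCAL_DIR=d:\data" "-LOCAL_FILE_NAME=m.csv" "-REMOTE_DIR=/home/oracle"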

Very simple. The DBaaS instance has OS users created when it is initialized; you can read more about the users 'opc' and 'oracle' in the DBaaS documentation.

Transforming Data in the Oracle Public Cloud

Mappings are used to transform data from one representation to another. In this example you will see how the file staged in the Oracle Public Cloud is integrated with an Oracle target - just like standard on-premise ODI use cases. It is no different. In the image below, the logical mapping is at the top, with the file being mapped to an Oracle target table; the middle part of the image shows the physical design, where the map uses the LKM File to Oracle (External Table) to create an external table on top of the file in the cloud, and the target is then integrated with the Oracle Insert KM.

When I execute the mapping all of the statements to transform are executed in the OPC - in this particular design everything is executed inside the Oracle database.

The ODI data server definition uses a custom driver (here) which extends the Oracle JDBC driver. The driver creates an SSH tunnel between the host executing the connection and the instance in the cloud. This means all ODI objects such as procedures, mappings and so forth that execute statements on regular Oracle systems can execute them on the cloud instances too. I actually created my demonstration schemas and granted all the permissions using a procedure in ODI. The procedure is shown below; see the snippet of the statements creating the users - the target logical schema was my DBAAS_INSTANCE.

Let's dig under the covers and have a look at how the physical schema is defined. You'll need the driver, copied into your oracledi/userlib directory (or into the equivalent directory wherever your agent is installed, if using an agent). You can then define the connection, specifying the database user name and password for that database user;

Then you specify the driver - the one you need to download, mentioned above. The URL is of the same form as the Oracle JDBC driver's. The difference is in how you specify the host, port and SID/service. The SID/service are your actual cloud service details. Since we are using the SSH tunnel technique, we actually specify the local host and a port number (5656 by default) on the local host.
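As an assumed example (the service name is a placeholder for your own cloud database service), the URL might look like;

  jdbc:oracle:thin:@localhost:5656/<your_cloud_service_name>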

The properties to configure the SSH tunnel are defined either in the properties panel or in a file specified in the properties panel. I've decided here to use the file approach, so I specify the file name in the property propertiesFile.

In my example, this file contains;

  • sslUser=oracle
  • sslHost=<my_cloud_ip_address>
  • sslRHost=<my_cloud_ip_address>
  • sslRPort=1521
  • sslPassword=my_private_key_password
  • sslPrivateKey=D:\\credentials\\dbcloud12c_private_key_openssh.ppk

That is all that is needed and you can be very creative using all the powers of Oracle in the cloud and ODI for integration. Here is a summary of the properties supported by the driver.

  • sslUser – The OS user to connect via SSL with.
  • sslHost – The address of the host to SSL to.
  • sslRHost – The tunnel can be made from the client, through the SSL host, to this host. In my case this was the same as the SSL host.
  • sslRPort – The port to tunnel to. The Oracle listener is often run on 1521, so this is the default if the property is not specified.
  • sslPassword – The password for the private key file. In ODI you must use an OpenSSH formatted private key file.
  • sslPrivateKey – The SSL private key file location.
  • sslLPort – By default the local port used is 5656; it can be changed with this property. You must reference this port number in the URL also.

The driver is a fairly simple wrapper around the Oracle JDBC driver; it leverages SSL tunneling to forward requests over a secure port to the Oracle TNS listener. This technique enables a very familiar way of connecting to and interacting with the Oracle database. The driver is on java.net and is available to try and give feedback on, so try it and let us know what you think. Familiarity and consistency are very important, both from the standpoint of the tooling and of leveraging existing knowledge (including modules). This allows ODI users to work with an Oracle Public Cloud DBaaS instance just as they do with their on-premise systems. Have fun!

Monday Oct 27, 2014

Updated Statement of Direction for Oracle Business Intelligence Analytics (OBIA)

Oracle's product strategy around the Oracle Business Intelligence Analytics (OBIA) has been published this October in the latest Statement of Direction.

Interesting points relative to the BI Applications around data integration:

  • Oracle’s strategic development for ELT for BI Applications will focus on the Oracle Data Integrator and related technologies. Since the fielding of the ODI compatible version of BI Applications in the 11g series, customers have realized substantial financial and operational benefits from reduced time to value and improved speed of operations. Oracle continues to evolve and develop ODI, and Oracle’s BI Applications will take advantage of the latest capabilities as they become available.

  • Oracle will continue to support the 7.9.6.x product series according to the Oracle Lifetime Support policy including certifications of databases, operating systems, and enabling 3rd  party technologies.  However, Oracle will no longer develop new content for this series, nor extend the 7.9.6.x support or any series based on an ETL architecture with Informatica.

You can find the related blog entry with additional details from the BI Team here.

Raymond James Financial Leverages Oracle Data Integration

Hot off the press! 

Raymond James Financial shares how it uses Oracle Data Integrator and Oracle GoldenGate to establish an enterprise information platform that integrates data from multiple heterogeneous sources, such as HP NonStop and Microsoft SQL Server, and provides a consolidated company view. This solution provides quicker access to actionable, timely data that improves operational efficiency for its 6,000 financial advisers by enabling repeatable processes for data migration and loading.

Read the press release.

For more information on this solution, this blog may be of interest also.

Friday Oct 24, 2014

Automating ODI development tasks using the SDK

By Ayush Dash, Oracle Consulting Services

Oracle Data Integrator (ODI) 11g uses Interfaces to move data from a source to a target datastore. ODI Studio is a very convenient drag and drop UI to build, test and execute such interfaces. These interfaces can have different processing logic and complex transformations for disparate requirements, but there could be interfaces which behave in the exact same manner except the source and target are different.

Let's say I have these requirements defined, am able to identify the different buckets of processing required, and develop the respective interfaces. That's easily done! However, if the requirement changes such that I need 10 interfaces for each bucket, it gets a little more complex and I face an increased level of effort. What about 100 such interfaces for each bucket? Much more effort is required now! It's the same repetitive set of tasks, but it needs to be done for each interface in each bucket. The problem we face here is to somehow expedite and automate the entire sequence of steps for each bucket and reduce the redundant, manual development of ODI interfaces. As the number of interfaces grows, our problem (the increase in effort) compounds.

Note that this is not limited to interfaces only; it can be extended to generate scenarios, packages, etc.

Use Case:

In one of my ODI engagements, we had the below requirements with aggressive timelines.

  1. Incremental Loads from a Source Database to Incremental Database. (ODI interfaces)
  2. Data masking on Incremental Database (not an ODI task)
  3. Incremental loads from Incremental Database to Target Database. (ODI Interfaces)

This had to be done for Oracle and PeopleSoft (2 buckets) and a total of 2,300 tables (so a total of 4,600 interfaces: 2,300 interfaces for step 1 and 2,300 for step 3), and eventually the respective scenarios.

ODI SDK Groovy Scripts:

ODI Studio provides a Groovy editor as part of its install; Groovy is a Java-based scripting language. Groovy can be leveraged to work with the ODI SDK and build automation scripts. Below is the list of scripts;

  • CreateProject – Creates an ODI Project with a ProjectName and FolderName.
  • ImportKnowledgeModules – Imports the specified Knowledgemodules to the KM directories.
  • CreateModels – Creates the source and target Models for existing Datastores.
  • CreateModelsandDatastore – Creates Datastores and Models.
  • CreateInterfaceIterative – Iterates through all the Source Datastores and generates an interface for each with respective Target Datastores.
  • CreateInterfaceIterativeandSetKMOptions – Creates Interfaces and set a KM options iteratively.
  • CreateScenario – Create scenarios for all the interfaces.
  • ExecuteScenario – Executes all the scenarios under all the interfaces.
  • CountInterface – Counts the number of interfaces; can be used as validation. (A minimal sketch of the repository connection these scripts share appears after this list.)
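
All of the scripts above share the same boilerplate: connect to the master and work repositories, authenticate, then use the SDK finders. The sketch below shows that shared skeleton applied to a simple interface count; the repository URL, credentials and project code are placeholders, and it assumes the standard ODI 11g SDK classes (OdiInstance, the finder interfaces) documented in the SDK Javadoc.

  import oracle.odi.core.OdiInstance
  import oracle.odi.core.config.MasterRepositoryDbInfo
  import oracle.odi.core.config.OdiInstanceConfig
  import oracle.odi.core.config.PoolingAttributes
  import oracle.odi.core.config.WorkRepositoryDbInfo
  import oracle.odi.core.security.Authentication
  import oracle.odi.domain.project.OdiInterface
  import oracle.odi.domain.project.finder.IOdiInterfaceFinder

  // Placeholder connection details - replace with your own repository settings
  def masterInfo = new MasterRepositoryDbInfo("jdbc:oracle:thin:@host:1521:orcl",
          "oracle.jdbc.OracleDriver", "ODI_MASTER", "master_password".toCharArray(), new PoolingAttributes())
  def workInfo = new WorkRepositoryDbInfo("WORKREP", new PoolingAttributes())
  def odiInstance = OdiInstance.createInstance(new OdiInstanceConfig(masterInfo, workInfo))

  // Authenticate as an ODI user before using the SDK
  Authentication auth = odiInstance.getSecurityManager().createAuthentication("SUPERVISOR", "password".toCharArray())
  odiInstance.getSecurityManager().setCurrentThreadAuthentication(auth)

  // Use the finder pattern to count the interfaces in a project (the project code is a placeholder)
  def finder = (IOdiInterfaceFinder) odiInstance.getTransactionalEntityManager().getFinder(OdiInterface.class)
  def interfaces = finder.findByProject("MY_PROJECT_CODE")
  println "Interface count: ${interfaces.size()}"

  odiInstance.close()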

The scripts and guide have been uploaded to the Oracle Data Integration project on Java.net: https://java.net/projects/oracledi.

All the scripts can be downloaded from here: https://java.net/projects/oracledi/downloads/download/ODI/SDK%20Samples/ODI%2011g%20SDK%20Automation/Automation%20Scripts/Automation%20Scripts.zip

Please refer to the ODI SDK Automation Guide for detailed steps: https://java.net/projects/oracledi/downloads/download/ODI/SDK%20Samples/ODI%2011g%20SDK%20Automation/ODI%2011g%20SDK%20Automation%20Guide.doc

About

Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products, including Oracle Data Integrator, Oracle GoldenGate and Oracle Enterprise Data Quality.
