Thursday Sep 26, 2013

Early Arriving Facts in ODI

A nice set of posts from Ben on early arriving facts in ODI! Check them out for an approach to loading facts whose dimension references arrive before the dimension definitions.

http://www.ateam-oracle.com/implementing-early-arriving-facts-in-odi-part-i-proof-of-concept-overview

http://www.ateam-oracle.com/implementing-early-arriving-facts-in-odi-part-ii-implementation-steps/


Friday Jul 26, 2013

The Best Data Integration for Oracle Exadata Comes from Oracle

In a previous blog post I talked about how Oracle Exadata customers can migrate and consolidate their systems without downtime. In that post I mentioned that Oracle Data Integrator and Oracle GoldenGate offer unique and optimized data integration solutions for Oracle Exadata. For example, customers that choose to feed their data warehouse or reporting database with near real-time data throughout the day can do so without decreasing the performance or availability of source and target systems. And if you ask why real-time, the short answer is: in today's fast-paced, always-on world, business decisions need to use more relevant, timely data to be able to act fast and seize opportunities. A longer response to the "why real-time" question can be found in a related blog post.

If we look at the solution architecture, as shown in the diagram below, Oracle Data Integrator and Oracle GoldenGate are both uniquely designed to take full advantage of the power of the database and to eliminate unnecessary middle-tier components. Oracle Data Integrator (ODI) is the best bulk data loading solution for Exadata. ODI is the only ETL platform that can leverage the full power of Exadata, integrate directly on the Exadata machine without any additional hardware, and provide by far the simplest setup and fastest overall performance on an Exadata system.

We regularly see customers achieve a 5-10x performance boost when they move their ETL to ODI on Exadata. For some companies the gain is even higher. For example, a large insurance company ran a proof of concept comparing ODI against a traditional ETL tool (one of the market leaders) on Exadata. The same process that took 5 hours and 11 minutes to complete with the competing ETL product took 7 minutes and 20 seconds with ODI. Oracle Data Integrator was 42 times faster than the conventional ETL tool when running on Exadata. This shows that Oracle's own data integration offering helps you get the most out of your Exadata investment with a truly optimized solution.

GoldenGate is the best solution for streaming data from heterogeneous sources into Exadata in real time. Oracle GoldenGate can also be used together with Oracle Data Integrator for hybrid use cases that demand non-invasive capture and high-speed real-time replication. Oracle GoldenGate captures real-time data feeds from heterogeneous sources non-invasively and delivers them to the staging area on the target Exadata system. ODI runs directly on Exadata, using the database engine's power to perform in-database transformations. Enterprise Data Quality is integrated with Oracle Data Integrator and enables ODI to load trusted data into the data warehouse tables. Only Oracle can offer all these technical benefits wrapped into a single, integrated data warehouse solution that runs on Exadata.


Compared to traditional ETL with add-on CDC, this solution offers:

  • Non-invasive data capture from heterogeneous sources that avoids any performance impact on the source
  • No mid-tier server; set-based transformations use the database's power
  • Mini-batches throughout the day or bulk processing nightly, which means maximum availability for the DW
  • An integrated solution with Enterprise Data Quality that enables loading trusted data into the data warehouse

In addition to Starwood Hotels and Resorts, Morrison Supermarkets, the United Kingdom's fourth-largest food retailer, has seen the power of this solution for its new BI platform and shared its story with us. Morrisons needed to analyze data across a large number of manufacturing, warehousing, retail, and financial applications with the goal of achieving a single view into operations for improved customer service. The retailer deployed Oracle GoldenGate and Oracle Data Integrator to bring new data into Oracle Exadata in near real time and replicate the data into reporting structures within the data warehouse, extending visibility into operations. Using Oracle's data integration offering for Exadata, Morrisons produced financial reports in seconds rather than minutes and improved staff productivity and agility. You can read more about Morrisons' success story here and hear from Starwood here.

I also recommend you watch our on demand webcast on Zero-Downtime Migration to Oracle Exadata Using Oracle GoldenGate: A Customer Case Study and download free resources on Oracle Data Integration products to learn more about their powerful architecture and solutions for data-driven enterprises.

Friday Jun 21, 2013

What Comes Next After You Decide on Using Oracle Exadata

As Oracle Exadata continues to expand its footprint for both transaction and analytical processing, moving existing systems to Exadata and feeding it with enterprise data on an ongoing basis have become important discussion topics for Exadata customers. Consolidation and migration form the first step of this powerful journey with Exadata, and I'd like to start there in today's blog post.

The systems that benefit from Exadata's extreme performance and reliability are typically business-critical systems that carry major risks when it comes to migration. Any downtime or data loss can have a significant impact on the business in terms of revenue generation, customer loyalty, and productivity. As the Oracle GoldenGate user community knows well, GoldenGate's heterogeneous, real-time, and bidirectional replication capabilities enable robust zero-downtime migration and consolidation solutions for major databases and platforms, including Oracle, IBM DB2 (z/OS, iSeries, and LUW), HP NonStop, SQL Server, Sybase ASE, MySQL, and Teradata.

We discussed GoldenGate's zero-downtime migration to Exadata offering and best practices with our customer IQNavigator in a webcast that is now available on demand:

Zero-Downtime Migration to Oracle Exadata Using Oracle GoldenGate: A Customer Case Study

If you have not watched it, I highly recommend listening to the discussion, as it makes clear that there should be no concern about business interruption when moving to Oracle Exadata using GoldenGate. GoldenGate's failback option to the old environment is a great tool for minimizing risk, and many organizations adopt that approach for their business-critical systems.

In addition to migrating to Oracle Exadata, customers use GoldenGate, and Oracle Data Integrator, with Exadata in a variety of ways, leveraging the natural fit between these technologies:

  • Active-active database synchronization across the globe for data distribution, continuous availability, and zero downtime maintenance purposes.
  • Real-time or near real-time data loading to a data warehouse, or consolidated database, on Oracle Exadata from heterogeneous sources. Oracle Data Integrator plays the major role in this use case, as it integrates with GoldenGate and loads the data warehouse in near real time after performing transformations within the Exadata machine. This use case will be another blog topic soon, as it is a strong best practice for performing ETL/E-LT for Exadata.
  • Moving change data from an OLTP application running on Exadata in real time, for downstream consumption by other systems, including support for service integration.

As additional resources on best practices for migrating to Exadata, I'd like to point you to a couple of great white papers: Zero-Downtime Migration to Oracle Exadata Using Oracle GoldenGate and Oracle GoldenGate on Exadata Database Machine.

Friday May 31, 2013

Improving Customer Experience for Segment of One Using Big Data

Customer experience has been one of the top focus areas for CIOs in recent years. A key requirement for improving customer experience is understanding the customer: their past and current interactions with the company, their preferences, demographic information, etc. This capability helps the organization tailor its services and products for different customer segments to maximize their satisfaction. This is not a new concept. However, there have been two parallel changes in how we approach and execute on this strategy.

The first is the big data phenomenon, which brought the ability to obtain a much deeper understanding of customers, especially by bringing in social data. As the Forbes article "Six Tips for Turning Big Data into Great Customer Experiences" mentions, big data has especially transformed online marketing. With the large volume and different types of data now available, companies can run more sophisticated analysis in a more granular way. This leads to the second change: the size of customer segments. It is shrinking down to one, where each individual customer is offered a personalized experience based on their individual needs and preferences. This notion brings more relevance into day-to-day interactions with customers and takes customer satisfaction and loyalty to a level that was not possible before.

One of the key technology requirements for improving customer experience at such a granular level is obtaining a complete and up-to-date view of the customer. That requires integrating data across disparate systems in a timely manner. The data integration solution should move and transform large data volumes stored in heterogeneous systems in geographically dispersed locations. Moving data with very low latency to the customer data repository or a data warehouse enables companies to have relevant and actionable insight for each customer. Instead of relying on yesterday's data, which may no longer be pertinent, the solution should analyze the latest information and turn it into a deeper understanding of that customer. With that knowledge the company can formulate real opportunities to drive higher customer satisfaction.

Real-time data integration is a key enabling technology for real-time analytics. Oracle GoldenGate's real-time data integration technology has been used by many leading organizations to get the most out of their big data and build a closer relationship with customers. One good example in the telecommunications industry is MegaFon, Russia's top provider of mobile internet solutions. The company deployed Oracle GoldenGate 11g to capture billions of monthly transactions from eight regional billing systems. The data was integrated and centralized onto Oracle Database 11g and distributed to business-critical subsystems. The unified, up-to-date view into customers enabled more sophisticated analysis of mobile usage information and facilitated more targeted customer marketing. As a result, the company increased the revenue generated from its current customer base. Many other telecommunications industry leaders, including DIRECTV, BT, TataSky, SK Telecom, and Ufone, have improved customer experience by leveraging real-time data integration.

Telecommunications is not the only industry where a single view of the customer drives more personalized interactions. Woori Bank implemented Oracle Exadata and Oracle GoldenGate. In the past, it had been difficult to revise and incorporate changes to marketing campaigns in real time because the bank was working with the previous day's data. Now, users can immediately access and analyze transactions for specific trends in the data mart access layer and adjust campaigns and strategies accordingly. Woori Bank can also send tailored offers to customers.

This is just one example of how real-time data integration can transform business operations and the way a company interacts with its customers. I would like to invite you to learn more about data integration facilitating improved customer experience by reviewing our free resources here and following us on Facebook, Twitter, YouTube, and LinkedIn.

Image courtesy of jscreationzs at FreeDigitalPhotos.net

Thursday May 16, 2013

Sabre Holdings Case Study Webcast Recap

Last week at Oracle we had a very important event. In addition to a visit by the Roaming Gnome, who really enjoyed posing for pictures on our campus, I had the privilege of hosting a webcast with guest speaker Amjad Saeed from Sabre Holdings. We focused on Sabre's data integration solution leveraging Oracle GoldenGate and Oracle Data Integrator for its enterprise travel data warehouse (ETDW).

Amjad, who leads the development effort for Sabre's enterprise data warehouse, presented how they approached various data integration challenges, such as a growing number of sources and data volumes, and what results they were able to achieve. He shared how using Oracle's data integration products in heterogeneous environments enabled right-time market insights, reduced complexity, and decreased time to market by 40%. Sabre was also able to standardize development for its global DW development team, achieve a real-time view into the execution of the integration processes, and gain the ability to manage data warehouse and BI performance on demand. I would like to thank Amjad again for taking the time to share Sabre's data integration best practices with us on this webcast.

In this webcast, my colleague Sandrine Riley and I provided an overview of the differentiators of Oracle Data Integration products. We explained the architectural strengths that deliver a complete and integrated platform with high performance, fast time to value, and low cost of ownership. If you did not have a chance to attend the live event, the webcast is now available on demand via this link for you to watch at your convenience:

Webcast Replay: Sabre Holdings Case Study: Accelerating Innovation using Oracle Data Integration

There were many great questions from our audience. Unfortunately, we did not have enough time to respond to all of them. While we are individually following up with the attendees, I also want to post answers to some of the commonly asked questions here.

    Question: How do I learn Oracle Data Integrator or GoldenGate? Is training the only option?

    Answer: We highly recommend training through Oracle University. The courses cover all the foundational components needed to get up and running with ODI or GoldenGate. Oracle University offers instructor-led and online training; you can go to http://education.oracle.com for a complete listing of the available courses. Additionally, though not as a replacement for training, you can get started with a guided 'getting started' tutorial, which you can find in the 'Getting Started' section of our OTN page. There are also some helpful 'Oracle by Example' exercises and videos on the same page.

    For Oracle GoldenGate, we recommend watching the instructional videos on its YouTube channel: Youtube/oraclegoldengate. A good example is here.

    Last but not least, at Oracle OpenWorld there are opportunities to learn in depth by attending our hands-on labs, even though they do not compare to or replace taking training.

    Question: Compare and contrast Oracle Data Integrator with the Oracle Warehouse Builder ETL process. Is ODI repository-driven and based on the creation of "maps" when creating ETL modules?

    Answer: ODI has been built from the ground up to be heterogeneous, so it excels on both Oracle and non-Oracle platforms. OWB has been a more Oracle-centric product. ODI mappings are developed with a declarative, design-based approach, and processes are executed via an agent, which orchestrates and delegates the workload in a set-based manner to the database. OWB deploys packages of code on the database and produces more procedural code. For more details, please read our ODI architecture white paper.

    Question: Is the metadata navigator for ODI flexible and comprehensive enough to be used as a documentation tool for the ETL?

    Answer: Oracle Data Integrator's Metadata Navigator has been renamed and is now called ODI Console. The ODI Console is a web-based interface for viewing, in a more graphical manner, what is inside the ODI repository, and it could be used as documentation. Beyond the ODI Console, ODI provides full documentation from ODI Studio: for any given process, project, etc., you can right-click and choose the 'print' option, which produces a PDF document with details down to the transformations. These documents may be a more appropriate method of documentation. Also, please check out the white paper on Managing Metadata with ODI.

If you would like to learn more about Oracle Data Integration products, please check out our free resources here. I also would like to remind you to follow us on social media if you do not already: you can find us on Facebook, Twitter, YouTube, and LinkedIn.



Wednesday May 01, 2013

When The Roaming Gnome Conquers Data Integration

It is always fascinating to see how our customers turn Oracle Data Integration products into a major force for their critical initiatives. I particularly like the success stories that tie back to products or services I use in my personal life. A little gnome that travels around the world is my new hero when it comes to seeing Oracle Data Integration in action in day-to-day life. Of course, I am referring to the Travelocity Gnome we know from TV ads. And we know that behind this little gnome is a great, innovative IT team at Sabre Holdings, which owns Travelocity. They deserve the praise for supporting business innovation with cutting-edge data warehousing/BI solutions.

The Traveling Gnome in the office (photo courtesy of Flickr user Ian Kershaw, available under CC BY-NC-SA 2.0)

Sabre Holdings demonstrated its ability to excel in implementing data integration solutions by winning an Oracle Excellence Award for Fusion Middleware Innovation in the Data Integration category in 2011. We now have the privilege of hearing directly from Sabre how they used Oracle Data Integrator and Oracle GoldenGate for the critical enterprise data warehouse that drives all kinds of innovative products and services for Sabre employees, partners, and customers. Sabre partnered with Oracle to achieve major improvements, from reducing complexity and better handling growing data volumes to decreasing time to market by 40%.

Next week, on May 8th, we will host a free webcast where you can hear Sabre's Amjad Saeed, who manages development for their enterprise data warehouse, present how they leveraged advanced data integration approaches to achieve their data warehouse solution goals.

Sabre Holdings Case Study: Accelerating Innovation with Oracle Data Integration

May 8th, 1pm ET / 10am PT

If you have not seen Oracle Data Integration in action, this is a must-see event. I also would like to remind you that this year's Oracle Excellence Awards for Oracle Fusion Middleware Innovation are open for submissions. You can submit your nomination by June 18th here.


Tuesday Apr 02, 2013

Starwood Hotels Presents in Customer Success Forum on April 11th

Register before April 5th for the online Customer Success Forum with Starwood Hotels and Resorts (April 11th, 11am ET), and ask questions about how they used Oracle Exadata, Oracle GoldenGate, and Oracle Data Integrator for data warehousing and operational reporting. [Read More]

Tuesday Feb 26, 2013

ODI - Slowly Changing Dimensions in 0 to 60

Want to understand how to set up your datastores to support dimension loading concepts such as Type 2 slowly changing dimensions? See the viewlet here. The viewlet provides a very quick look at setting up a datastore for slowly changing dimension data loading using ODI 11g. It uses the IKM Slowly Changing Dimension and shows dimension members being versioned when, for example, a marital status changes.

Tuesday Oct 16, 2012

Sabre Manages Fast Data Growth with Oracle Data Integration Products

Last year at OpenWorld we announced Sabre Holdings as a winner of the Fusion Middleware Innovation Awards. The Sabre team did an excellent job of leveraging cutting-edge technologies to manage the rapid data growth and exponential scalability demands they have experienced in the travel industry.

Today we announced the details and specific benefits of Sabre's new real-time data integration solution in a press release. Please take a look if you haven't seen it yet: Sabre Holdings Deploys Oracle Data Integrator and Oracle GoldenGate to Support Rapid Customer Growth.

Sabre achieved benefits in three areas by using Oracle Data Integration products:

  • Managed a 7x increase in data sources for the enterprise data warehouse
  • Reduced infrastructure complexity
  • Decreased time to market for new products and services by 30 percent

This shows that using the latest technologies helps companies build robust solutions to today's key data management challenges. And the benefit of using a next-generation data integration technology is seen not only in IT operations but also on the business side. A better data integration solution for the enterprise data warehouse delivered the platform Sabre needed to accelerate how it services customers, improving its competitive advantage.

Tomorrow I will give another great example of innovation with next-generation data integration from Oracle, as we discuss the Fusion Middleware Innovation Awards 2012 winners and the results they achieved with Oracle's data integration products.

Wednesday Mar 07, 2012

New productivity features in ODI 11.1.1.6: Global KMs, Variable tracking, Groovy Editor

ODI 11.1.1.6 introduces a number of features that make the everyday tasks of E-LT developers easier. This blog post explains the new features: Global Knowledge Modules, Variable and Sequence Tracking, and the Groovy Editor.

[Read More]

Friday Feb 10, 2012

Eliminating Batch Windows Using Real-Time Data Integration

When we invest in technology solutions, we expect improvements in productivity, agility, performance, and more. We don't want to be limited by the technology we select. While data warehouses are designed to give us the freedom to access complete and reliable information, the underlying data integration architecture and the type of solution used can place significant constraints on how we manage our critical production systems.

With data warehousing solutions, one of the most common constraints is the time window available for batch extract processing on the source systems. The resource-intensive extract process typically has to run in off-business hours and restricts access to critical source systems.

A low-impact, real-time data integration solution can liberate your systems from batch windows. When the extract component uses a non-intrusive method, such as reading database transaction logs to capture only the changed data, it does not burden source systems. Hence, data extraction can happen at any time of day, and throughout the day, while all users are online.

SunGard is a great example of achieving a major transformation in data warehousing by using log-based real-time data integration. The company removed its batch-window constraint by using Oracle GoldenGate to capture all of the intraday changes that take place on the selected tables. As a result, it reduced the nightly extract process from the Oracle E-Business Suite application by 9 hours. In addition, Oracle GoldenGate feeds E-Business Suite data to the Oracle BI Applications for Finance throughout the day, so end users have access to up-to-date information on their Oracle Business Intelligence dashboards as changes occur.

To read more about how real-time data integration can free your critical systems from batch window constraints and how SunGard leveraged Oracle GoldenGate in their data warehousing implementation, I invite you to check out the SunGard case study and the article “Freedom from Batch Windows Using Real-Time Data Integration” we wrote for TDWI's What Works in Emerging Technologies publication. As always, you can find more resources on Oracle GoldenGate on our recently redesigned website.

Monday Jun 20, 2011

Oracle Data Integrator 11.1.1.5 Complex Files as Sources and Targets

Overview

ODI 11.1.1.5 adds the new Complex File technology for use with file sources and targets. The goal is to read or write file structures that are too complex to be parsed using the existing ODI File technology. This includes:

    • Different record types in one list that use different parsing rules
    • Hierarchical lists, for example customers with nested orders
    • Parsing instructions in the file data, such as delimiter types, field lengths, type identifiers
    • Complex headers, such as multiple header lines or parseable information in the header
    • Skipping of lines
    • Conditional or choice fields

Similar to the ODI File and XML technologies, complex file parsing is done through a JDBC driver that exposes the flat file as relational table structures. Complex files are mapped to one or more table structures, as opposed to the (simple) File technology, which always has a one-to-one relationship between file and table. The resulting set of tables follows the same concept as the ODI XML driver: table rows have additional PK-FK relationships to express hierarchy, as well as order values to maintain the original file order in the resulting tables.

[Figure: a complex file mapped to multiple related table structures]

The parsing instruction format used for complex files is the nXSD (native XSD) format that is already in use with Oracle BPEL. This format extends the XML Schema standard by adding parsing instructions to each element. Using nXSD parsing technology, the native file is converted into an internal XML format. It is important to understand that the XML is streamed to improve performance; there is no size limitation on the native file based on memory size, because the XML data is never fully materialized. The internal XML is then converted to a relational schema using the same mapping rules as the ODI XML driver.

How to Create an nXSD file

Complex file models depend on the nXSD schema for the given file. This nXSD file has to be created using a text editor or the Native Format Builder Wizard that is part of Oracle BPEL. BPEL is included in the ODI Suite, but not in standalone ODI Enterprise Edition. The nXSD format extends the standard XSD format through nxsd attributes; an nXSD file is still a valid XML Schema, since the XSD standard allows extra attributes with their own namespaces.

The following is a sample nXSD schema, blog.xsd:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:nxsd="http://xmlns.oracle.com/pcbpel/nxsd"
            xmlns:tns="http://xmlns.oracle.com/pcbpel/demoSchema/csv"
            targetNamespace="http://xmlns.oracle.com/pcbpel/demoSchema/csv"
            elementFormDefault="qualified" attributeFormDefault="unqualified"
            nxsd:encoding="US-ASCII" nxsd:stream="chars" nxsd:version="NXSD">
  <xsd:element name="Root">
    <xsd:complexType><xsd:sequence>
      <xsd:element name="Header">
        <xsd:complexType><xsd:sequence>
          <xsd:element name="Branch" type="xsd:string" nxsd:style="terminated" nxsd:terminatedBy=","/>
          <xsd:element name="ListDate" type="xsd:string" nxsd:style="terminated" nxsd:terminatedBy="${eol}"/>
        </xsd:sequence></xsd:complexType>
      </xsd:element>
      <xsd:element name="Customer" maxOccurs="unbounded">
        <xsd:complexType><xsd:sequence>
          <xsd:element name="Name" type="xsd:string" nxsd:style="terminated" nxsd:terminatedBy=","/>
          <xsd:element name="Street" type="xsd:string" nxsd:style="terminated" nxsd:terminatedBy=","/>
          <xsd:element name="City" type="xsd:string" nxsd:style="terminated" nxsd:terminatedBy="${eol}"/>
        </xsd:sequence></xsd:complexType>
      </xsd:element>
    </xsd:sequence></xsd:complexType>
  </xsd:element>
</xsd:schema>

The nXSD schema annotates elements to describe their position and delimiters within the flat text file. The schema above uses almost exclusively the nxsd:terminatedBy instruction to look for the next terminator characters. There are various other constructs in nXSD to parse fixed-length fields, look ahead in the document for string occurrences, perform conditional logic, use variables to remember state, and more.
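
For illustration, here is a minimal, hypothetical fragment showing the fixedLength style next to the terminatedBy style used above. The element names are invented for this example; the attribute usage follows the nXSD documentation in the Application Server Adapter Users Guide:

<!-- Hypothetical nXSD fragment: a 10-character fixed-length account code
     followed by a description that runs to the end of the line -->
<xsd:element name="AccountCode" type="xsd:string"
             nxsd:style="fixedLength" nxsd:length="10"/>
<xsd:element name="Description" type="xsd:string"
             nxsd:style="terminated" nxsd:terminatedBy="${eol}"/>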

nXSD files can either be written manually using an XML Schema editor or created using the Native Format Builder Wizard. Both the Native Format Builder Wizard and the nXSD language are described in the Application Server Adapter Users Guide. The way to start the Native Format Builder in BPEL is to create a new File Adapter; in step 8 of the Adapter Configuration Wizard, a new Schema for Native Format can be created:

[Screenshot: creating a new Schema for Native Format in the Adapter Configuration Wizard]

The Native Format Builder guides you through a number of steps to generate the nXSD based on a sample native file. If the format is complex, it is often a good idea to “approximate” it with a similar simple format and then add the complex components manually. The resulting *.xsd file can be copied and used as the format for ODI; other BPEL constructs, such as the file adapter definition, are not relevant for ODI. Using this technique it is also possible to parse the same file format in SOA Suite and ODI, for example using SOA Suite for small real-time messages and ODI for large batches.

The nXSD schema in this example describes a file with a header row, followed by customer rows of three comma-delimited string fields each, for example blog.dat:

Redwood City Downtown Branch, 06/01/2011
Ebeneezer Scrooge, Sandy Lane, Atherton
Tiny Tim, Winton Terrace, Menlo Park
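
Conceptually, the driver streams blog.dat through the nXSD parser into an internal XML form before applying the relational mapping. The intermediate representation would look roughly like the sketch below; keep in mind this XML is only streamed, never materialized, and whitespace after delimiters is shown trimmed for readability:

<!-- Illustrative sketch of the streamed internal XML for blog.dat -->
<Root xmlns="http://xmlns.oracle.com/pcbpel/demoSchema/csv">
  <Header>
    <Branch>Redwood City Downtown Branch</Branch>
    <ListDate>06/01/2011</ListDate>
  </Header>
  <Customer>
    <Name>Ebeneezer Scrooge</Name>
    <Street>Sandy Lane</Street>
    <City>Atherton</City>
  </Customer>
  <Customer>
    <Name>Tiny Tim</Name>
    <Street>Winton Terrace</Street>
    <City>Menlo Park</City>
  </Customer>
</Root>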

The ODI Complex File JDBC driver exposes the file structure through a set of relational tables with PK-FK relationships. The tables for this example are:

Table ROOT (1 row):

  ROOTPK        – Primary key for the root element
  SNPSFILENAME  – Name of the file
  SNPSFILEPATH  – Path of the file
  SNPSLOADDATE  – Date of load

Table HEADER (1 row):

  ROOTFK        – Foreign key to the ROOT record
  HEADERORDER   – Order of the row in the native document
  BRANCH        – Data
  BRANCHORDER   – Order of Branch within the row
  LISTDATE      – Data
  LISTDATEORDER – Order of ListDate within the row

Table CUSTOMER (2 rows):

  ROOTFK        – Foreign key to the ROOT record
  CUSTOMERORDER – Order of the row in the native document
  NAME          – Data
  NAMEORDER     – Order of Name within the row
  STREET        – Data
  STREETORDER   – Order of Street within the row
  CITY          – Data
  CITYORDER     – Order of City within the row

Every table has PK and/or FK fields to reflect the document hierarchy through relationships. In this example this is trivial, since the HEADER and all CUSTOMER records point back to the PK of ROOT; deeper nested documents require this to identify parent elements. All child element tables also have an order field (HEADERORDER, CUSTOMERORDER) to define the order of rows, as well as order fields for each column, in case the order of columns varies in the original document and needs to be maintained. If order is not relevant, these fields can be ignored.

How to Create a Complex File Data Server in ODI

After creating the nXSD file and a test data file, and storing them on a local file system accessible to ODI, you can go to the ODI Topology Navigator to create a Data Server and Physical Schema under the Complex File technology.

[Screenshot: creating a Complex File data server in the ODI Topology Navigator]

This technology follows the conventions of other ODI technologies and is very similar to the XML technology. The parsing settings, such as the source native file, the nXSD schema file, and the root element, as well as the external database, can be set in the JDBC URL:

[Screenshot: JDBC URL settings for the Complex File data server]

The use of an external database defined by dbprops is optional, but it is strongly recommended for production use; ideally, the staging database should be used for this. Also, when using a complex file exclusively for read purposes, it is recommended to set the ro=true property to ensure the file is not unnecessarily synchronized back from the database when the connection is closed. A data file is always required to be present at the filename path during design time; without this file, operations like testing the connection, reading the model data, or reverse-engineering the model will fail.
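
As a rough sketch, assuming files stored under /data and a relational schema named BLOG (both hypothetical), a Complex File JDBC URL could look like the line below. The property names here (f for the native file, d for the nXSD schema, re for the root element, s for the schema, ro for read-only) mirror the ODI XML driver conventions; consult the reference below for the authoritative list:

jdbc:snps:complexfile?f=/data/blog.dat&d=/data/blog.xsd&re=Root&s=BLOG&ro=true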

All properties of the Complex File JDBC driver are documented in the Oracle Fusion Middleware Connectivity and Knowledge Modules Guide for Oracle Data Integrator, in Appendix C: Oracle Data Integrator Driver for Complex Files Reference.

David Allan has created a great viewlet, Complex File Processing - 0 to 60, which shows the creation of a Complex File data server as well as a model based on this server.

How to Create Models Based on a Complex File Schema

Once the physical schema and logical schema have been created, the Complex File can be used to create a Model as if it were based on a database. When reverse-engineering the Model, datastores (tables) are created for each XSD element of complex type. Use of complex files as sources is straightforward; when using them as targets, you have to make sure that all dependent tables have matching PK-FK pairs. The same applies to the XML driver as well.

Debugging and Error Handling

There are different ways to test an nXSD file. The Native Format Builder Wizard can be used even if the nXSD wasn't created in it; it will show issues related to the schema and/or the test data. In ODI, the nXSD is parsed and run against the existing test data file when you test a connection in the data server. If the nXSD has an error or the data does not comply with the schema, an error is displayed.

Sample error message:

Error while reading native data.
[Line=1, Col=5] Not enough data available in the input, when trying to read data of length "19" for "element with name D1" from the specified position, using "style" as "fixedLength" and "length" as "". Ensure that there is enough data from the specified position in the input.

Complex File FAQ

Is the size of the native file limited by available memory?
No, since the native data is streamed through the driver, only the available space in the staging database limits the size of the data. There are limits on individual field sizes, though; a single large object field needs to fit in memory.

Should I always use the complex file driver instead of the file driver in ODI now?
No. Use the File technology for all simple file parsing tasks, for example any fixed-length or delimited files that have just one row format and can be mapped into a simple table. Because of its narrow assumptions, the ODI File driver is easy to configure within ODI and can stream file data without writing it into a database. The Complex File driver should be used whenever the use case cannot be handled by the File driver.

Should I use the complex file driver to parse standard file formats such as EDI, HL7, FIX, SWIFT, etc.? 
The Complex File driver is technically able to parse most standard file formats; however, the user would have to develop an nXSD schema to parse the expected message. In some instances the use case also requires a supporting infrastructure, such as message validation, acknowledgement messages, routing rules, etc. In these cases, products such as Oracle B2B or Oracle Service Bus for Financial Services are better suited and could be combined with ODI.

Are we generating XML out of flat files before we write it into a database?
We don’t materialize any XML as part of parsing a flat file, either in memory or on disk. The data produced by the XML parser is streamed through Java objects that just use the nXSD schema as their type system. We use the nXSD schema because it is the standard for describing complex flat file metadata in Oracle Fusion Middleware, and it enables users to share schemas across products.

Is the nXSD file interchangeable with SOA Suite?
Yes, ODI can use the same nXSD files as SOA Suite, allowing mixed use cases with the same data format.

Can I start the Native Format Builder from the ODI Studio?
No, the Native Format Builder has to be started from a JDeveloper instance with BPEL. You can get BPEL as part of the SOA Suite bundle. Users without SOA Suite can manually develop nXSD files using XSD editors.

When is the database data written back to the native file?
Data is synchronized using the SYNCHRONIZE and CREATE FILE commands, and when the JDBC connection is closed. It is recommended to set the ro or read_only property to true when a file is used exclusively for reading, so that no unnecessary write-backs occur.

Is the nXSD metadata part of the ODI Master or Work Repository?
No, the data server definition in the master repository contains only the JDBC URL with file paths; the nXSD files have to be accessible on the file systems where the JDBC driver executes in production, either by copying them or by using a network file system.

Where can I find sample nXSD files?
The Application Server Adapter Users Guide contains nXSD samples for various different use cases.

Friday May 20, 2011

Real-Time Data Integration for Operational Data Warehousing

Gone are the days when data warehouses were just for reporting, strategic analytics, and forecasting. Today, more companies are using their data warehouses for operational decision making, which makes those warehouses more critical to the business. To influence operational decisions, the analytical environment needs to stay current with the business events happening right now. An important requirement, therefore, is the lowest possible latency at which new data is delivered to the data warehouse, ideally in real time.

For example, when a snowstorm hits a certain area, the operational data warehouse can help monitor snow shovel sales and then provide information to help determine whether other stores in the affected areas should move more merchandise to the shelves and also offer other related products at a discounted price. There is a lot less value in reacting to data for a snowstorm that occurred 24 hours ago – or even 6 hours ago.

There are many data integration technologies that serve the data acquisition needs of a data warehouse; however, only a few offer real-time data delivery with no impact on source systems' performance. The challenge for the IT group is to determine which solution, or combination of data integration solutions, will meet their data delivery and performance needs to help propel the move to operational data warehousing. Such data integration evaluations are aided by an understanding of:

  • Selection Criteria – This should include considerations for acceptable latency, data quantity/volumes, data integrity, transformation requirements, and processing overhead/impact on availability.
  • “Right Time” vs. “Real Time” – When evaluating solutions, the technology should deliver real-time data capabilities and let the user choose the “right time” as a business decision. Right time should be a component of decision latency – a user preference, not a technical constraint.
  • Transformations – As the data warehouse approaches real time, transformations ideally should take place within the data warehouse in order to reduce data and analysis latency. This eliminates the need for additional steps that aggregate changed data on a middle-tier server until it is batch processed, not to mention the TCO savings of not acquiring or maintaining a middle-tier transformation server.

Oracle offers a complete and certified data integration solution for implementing an operational data warehouse on Oracle Exadata. Oracle GoldenGate provides low-impact, real-time change data capture and delivery, while Oracle Data Integrator EE provides high-performance transformations within Oracle Exadata. Oracle Data Integrator also offers an integrated solution for data profiling and data quality to enable analysis with trusted data. Here is a great customer example of how Oracle's products enable the move to operational data warehousing. You can read more about Oracle's data integration solution for operational data warehousing in our white paper. If you would like to read more about how to use Oracle Data Integration products with Oracle Exadata, please check out our recent data sheet.
