OWB 11gR2 – Architecture Overview
By David Allan on Sep 20, 2009
This post gives an overview of the OWB 11gR2 code template framework and runtime infrastructure. The code template framework is a key component in the product's customization and open connectivity capabilities. This framework enables OWB to integrate much more data than before, faster and in many more ways, and gives customers the capability to leverage their existing assets efficiently. There are lots of new features to check out; see the list within the Concepts guide here.
Oracle Warehouse Builder is built on top of a metadata repository that captures your design. Once you have created your design in the repository, Warehouse Builder generates the code, and you deploy it somewhere. Prior to the 11gR2 release, for data integration routines, the tool generated a combination of SQL and PL/SQL (plus ABAP and SQL*Loader), allowing you to perform complex transformations on the data you are moving. It was primarily focused on Oracle, files and SAP sources, with heterogeneous connectivity to all other systems using the Database Gateways. It was never possible to roll your sleeves up and reconfigure the physical execution of the mapping to make better use of the assets at hand. It was never possible to change the built-in templates to support features that were not supported out of the box. As we know, technology and the world around us are forever changing! There is a need to be able to integrate new types of data in new and different ways.
With Warehouse Builder 11gR2 it all changes.
The 11gR2 release supports both the existing code generation capabilities and an open, extensible code template and connectivity framework based on the Oracle Data Integrator Knowledge Module framework. The benefit of open, extensible code template support is that you can:
- Have much greater architectural flexibility; you can apply technology at the right place and at the right time (push code to execute wherever it can be optimally executed).
- Use alternative data movement strategies (bulk extract, ftp, scp, piped etc).
- Integrate with additional systems through the Open Connectivity framework (platforms and code templates).
- Provide implementations for standard design patterns for these systems (CDC, control, slowly changing dimensions etc).
- Integrate database features that were not incorporated in the core product (for example, data pump external tables, pre-process files for external table) outside of product release cycles.
- Construct templates in a modular manner to share across common patterns.
There are different categories of code templates which serve different purposes; there are templates for moving data over the wire (Load), templates for capturing changes (CDC), templates for some data quality tasks (Control) and for data integration (Integrate and Oracle Target). The illustration below gives an overview of how they fit together.
A number of code templates are pre-seeded in the OWB 11gR2 repository. There are also a number of seeded platforms for integration, such as Oracle, File and DB2 UDB. This set can be extended, so the reach of data can grow dynamically and the supporting integration capabilities are extensible - a truly pluggable framework.
Code Templates (CTs) are components of Oracle Warehouse Builder's open connectivity framework. CTs contain the knowledge required by Oracle Warehouse Builder to perform a specific set of tasks against a specific system or set of systems. Combined with a connectivity layer such as JDBC, CTs define an open connector that performs tasks against a system, such as connecting to this system, extracting data from it, transforming the data, checking it, integrating it, etc.
Open connectors contain a combination of:
- Connection strategy (JDBC, database utilities for instance).
- Correct syntax (SQL etc.) for the platforms involved.
- Control over the creation and deletion of all temporary/work tables, views, triggers, etc.
- Data processing and transformation strategies.
- Data movement options (create target table, insert/delete, update etc.).
With these concepts the template builder is in complete control of fulfilling the implementation. When extending the reach of the data to be integrated, a new platform can be added. Defining a new platform allows system-specific code templates to be built that leverage specific features of that system. The platform defines the characteristics of a system, such as how to connect to it, its SQL characteristics (column alias separators, null keyword and so on) and the data types the system supports.
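To make the idea of a platform a little more concrete, here is a minimal sketch of the kind of information a platform definition carries: how to connect, a couple of SQL characteristics and the supported data types. This is not OWB's metadata model. The class, its field names, the driver class and the EXAMPLE_DB system are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Illustrative platform definition; every field name here is hypothetical."""
    name: str
    jdbc_driver: str              # how to connect: driver class
    url_template: str             # how to connect: URL pattern
    column_alias_separator: str   # SQL characteristic
    null_keyword: str             # SQL characteristic
    data_types: frozenset         # data types the system supports

    def connection_url(self, **parts) -> str:
        """Fill the URL pattern with host, port and so on."""
        return self.url_template.format(**parts)

    def alias(self, expression: str, alias: str) -> str:
        """Render an aliased column using this platform's separator."""
        return f"{expression}{self.column_alias_separator}{alias}"

    def supports(self, data_type: str) -> bool:
        """Case-insensitive check against the supported data types."""
        return data_type.upper() in self.data_types


# A made-up system, not one of the seeded OWB platforms.
EXAMPLE_DB = Platform(
    name="EXAMPLE_DB",
    jdbc_driver="com.example.jdbc.Driver",
    url_template="jdbc:exampledb://{host}:{port}/{database}",
    column_alias_separator=" AS ",
    null_keyword="NULL",
    data_types=frozenset({"INTEGER", "VARCHAR", "DATE"}),
)

print(EXAMPLE_DB.connection_url(host="srv1", port=5000, database="sales"))
print(EXAMPLE_DB.alias("UPPER(cust_name)", "CUST_NAME_UC"))
print(EXAMPLE_DB.supports("varchar"))
```

A code template written for such a platform could then ask the platform for its alias separator or null keyword instead of hard-coding them, which is roughly what keeps templates portable across systems.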
Code Template Mappings
The mapping design is mostly as before: a single logical design. The major shift is that you can now configure the physical design and assign customizable code templates, or assign the existing non-customizable Oracle Target template. The mapping can include a mixture of both, so you can leverage the flexibility offered for staging data via the open connectivity framework together with the transformation and data warehouse operators provided by the traditional Oracle Target template. For example, you can use the CDC code templates for capturing changes and the data warehouse operators to build the warehouse.
The illustration below shows how a logical map design is physically configured into execution units, which are assigned code templates. The physical design is defined per configuration.
When we were building this we drew an analogy to islands and bridges: some code templates are bridges that move data from one island to another, and transformations or integration tasks can be performed on an island.
The code template based mappings reside in a different module, since they are executed in a different engine (the Control Center Agent). This module (just like other modules in OWB) is assigned a location, in this case an Agent. The agent is where the mapping will be deployed, and when the mapping is executed this agent will orchestrate its execution.
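The islands and bridges analogy can be sketched in a few lines. In this rough illustration, operators are grouped into execution units (islands) by the location they run on, and every data flow that crosses from one island to another needs a bridge, which is where a Load-style code template would be assigned. Grouping purely by location is an assumption made for the example (in OWB you configure the execution units in the physical design), and the function and operator names are hypothetical.

```python
def plan_execution_units(operators, flows):
    """Group operators into islands and find the bridges between them.

    operators: list of (operator_name, location) pairs
    flows:     list of (source_operator, target_operator) pairs
    Returns (units, bridges): units maps each location to its operators,
    bridges lists each (from_location, to_location) crossing once.
    """
    location_of = dict(operators)

    # Islands: one execution unit per location, in first-seen order.
    units = {}
    for name, location in operators:
        units.setdefault(location, []).append(name)

    # Bridges: flows whose two ends sit on different islands.
    bridges = []
    for source, target in flows:
        crossing = (location_of[source], location_of[target])
        if crossing[0] != crossing[1] and crossing not in bridges:
            bridges.append(crossing)
    return units, bridges


# Hypothetical mapping: filter orders at the source, join and load on the target.
operators = [
    ("ORDERS_SRC", "db2_source"),
    ("FILTER", "db2_source"),
    ("CUSTOMERS", "oracle_target"),
    ("JOIN", "oracle_target"),
    ("ORDERS_FACT", "oracle_target"),
]
flows = [
    ("ORDERS_SRC", "FILTER"),
    ("FILTER", "JOIN"),
    ("CUSTOMERS", "JOIN"),
    ("JOIN", "ORDERS_FACT"),
]

units, bridges = plan_execution_units(operators, flows)
print(units)
print(bridges)
```

Here the filter stays on the source island, only the filtered rows cross the single bridge, and the join and load happen on the target island.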
Control Center Agent
The code template based mappings and web service support are facilitated through the Control Center Agent, a program that can be run on a machine to orchestrate the tasks in the mappings.
A mapping executes on a single Control Center Agent. If you want to remotely execute parts of the map on different systems, you will have to use remote API calls, including JDBC invocations, remote execute, database jobs using agent and so on.
The agent itself could be placed on a source or transformation system, for example if you wanted to unload data using a native unloader. The code templates are based on the Oracle Data Integrator 10g Substitution Reference interface. This interface (it's only an interface; you can't deploy OWB maps to an ODI agent) has been implemented based on the metadata OWB deploys for locations, connectors, mappings, web services and OWB Control Center Agent components.
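The general idea of a substitution interface is that a template contains placeholders, and the engine resolves them from the deployed metadata at execution time. Here is a minimal sketch of that idea using Python's standard string.Template. It does not use the ODI substitution syntax or method names, and the placeholder names, metadata keys and table names are all assumptions for illustration.

```python
from string import Template

# A toy "integrate" template; the placeholders are hypothetical.
INTEGRATE_TEMPLATE = Template(
    "INSERT INTO $target_table ($column_list)\n"
    "SELECT $column_list FROM $staging_table"
)


def render(template, metadata):
    """Resolve the template's placeholders from mapping metadata."""
    values = {
        "target_table": metadata["target"],
        "staging_table": metadata["staging"],
        "column_list": ", ".join(metadata["columns"]),
    }
    # substitute() raises KeyError if a placeholder has no value,
    # which surfaces incomplete metadata early.
    return template.substitute(values)


# Made-up metadata standing in for what a deployed mapping would provide.
mapping_metadata = {
    "target": "DW.ORDERS_FACT",
    "staging": "STG.ORDERS",
    "columns": ["ORDER_ID", "AMOUNT"],
}

sql = render(INTEGRATE_TEMPLATE, mapping_metadata)
print(sql)
```

The point of the separation is that the same template text can serve many mappings; only the metadata fed into it changes.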
The figure below illustrates the mechanics and dependencies of code template based mappings within the Control Center Agent.
The mappings are deployed to this OWB agent in a completely headless manner. They are deployed as J2EE applications but are not heavyweight Java applications - the application is basically a very small driver script and some metadata about the map. The connectors (from agent to source and target system) are deployed as J2EE data sources.
The objects follow the same pattern as previous design items; you just get much more control over which capabilities of a system you would like to leverage. I can envisage some interesting templates leveraging the different loading capabilities of DBFS into the database machine in the future.
Anyway, that's a quick update, which was easier to do than laying bamboo flooring and building closets (how many times did I hammer my hand? I know, get a nail gun next time).