Developers using Oracle Warehouse Builder have always been able to leverage the features and scalability of Oracle Database to load, transform, and store their data warehouse data. Oracle Warehouse Builder provides a graphical point-and-click environment that helps you easily map data between sources and targets with a set of rich SQL and PL/SQL transformations and manage the full data warehouse project lifecycle, from capture of the data model to deployment across multiple environments.
Oracle Warehouse Builder 11g Release 2 extends this capability to non-Oracle sources and targets by leveraging technology from another Oracle product, Oracle Data Integrator. Code templates based on Oracle Data Integrator knowledge modules implement best practices for connecting to, extracting, validating, and integrating data from a wide range of databases and middleware sources. Code templates use a mix of languages and substitution variables to leverage the native capabilities of various datasources. You can also write new code templates for features or datasources not built into Oracle Warehouse Builder.
Code templates that handle various aspects of the extract, transform, and load (ETL) process come in six categories:
Load code templates, for extracting data from sources Integration code templates, for integrating (loading) data into targets
Control code templates, for validating and checking data quality
Change data capture (CDC) code templates, for tracking new and changed data in source tables
Oracle target code templates, for encapsulating database-resident mapping (formerly known as Oracle Warehouse Builder mapping) logic and making it available in a code template mapping
Function code templates, for deploying functions, packages, and procedures in a code template mapping
Code templates provide the technology foundation for three key new Oracle Warehouse Builder features: heterogeneous database support, support for CDC, and support for code template mappings.
Heterogeneous database support. Oracle Warehouse Builder 11g Release 2 supports access to a range of Oracle Database instances and non-Oracle databases through the code template framework. Source platforms that work out of the box with code templates include Oracle Database, files, IBM DB2 UDB, and Microsoft SQL Server, and support for other platforms such as Oracle Essbase and MySQL can be added through Oracle metabase scripting.
Developers can still choose to access non-Oracle databases through gateways or Open Database Connectivity (ODBC), but the new code template framework makes setting up connections to other databases easier and does not require configuration steps outside of Oracle Warehouse Builder.
Support for CDC. Oracle Database has had native capabilities for the identification and transportation of new and changed data from datasources since Oracle9i Database. CDC code templates remove the complexity of identifying such data, through techniques such as mining the source database redo log or applying triggers to capture changes prior to a database commit. The code templates that ship with Oracle Warehouse Builder 11g Release 2 provide this functionality for several Oracle and non-Oracle database datasources. Select the tables for which to track changes when defining your database module, and Oracle Warehouse Builder will handle the technical details in the background.
Code template mappings. Probably the most significant new feature for Oracle Warehouse Builder developers, though, is support for code template mappings. Code template mappings combine the familiarity and flexibility of traditional Oracle Warehouse Builder mappings (now called database-resident mappings) with the integration possibilities provided by code templates.
For example, with code template mappings, you can extract data natively from a non-Oracle database; perform a JOIN between tables directly on the source database platform; and then load the result, using a MERGE statement, into an Oracle Database instance, thereby following SQL best practices at both the source and the target. The remainder of this article steps through a simple example.Code Template Mapping Example: Cross-Platform Data Integration
Let’s take two Microsoft SQL Server 2005 tables containing customer and address data, join them, and then incrementally load data from them into an Oracle Database 11g table.
To work through this example, you need Oracle Database 11g Release 2 (which comes with Oracle Warehouse Builder 11g Release 2 already installed) or you can install the Oracle Warehouse Builder 11g Release 2 standalone software in an Oracle Database 10g Release 1 (or higher) instance. To use the default Oracle Warehouse Builder schema in an Oracle Database 11g Release 2 installation, unlock the OWBSYS and OWBSYS_AUDIT accounts. Among other things, the OWBSYS account (or another schema you might have created for the same purpose) consists of the repository that houses the workspaces and workspace users. In turn, each workspace contains one or more projects. To get started, run the Repository Assistant to create a workspace and a project to use for this example. See the Oracle Warehouse Builder Installation and Administration Guide for details.
Preliminary setup. In addition to Oracle Warehouse Builder 11g Release 2, you need the source Microsoft SQL Server database and Microsoft’s JDBC driver, which enables Oracle Warehouse Builder 11g Release 2 to connect natively to Microsoft SQL Server 2005. Download Microsoft SQL Server 2005 JDBC Driver 1.0 from the Microsoft Website. Uncompress the archive, and copy the sqljdbc.jar file from the package to the $OWBHOME/owb/lib/ext directory.
The built-in code templates that access non-Oracle database data consist of Java code that executes in a Java 2 Platform, Enterprise Edition (J2EE) container on the Control Center Agent (also new in this release), the J2EE runtime included with Oracle Warehouse Builder 11g Release 2. Code template mappings are deployed to the Control Center Agent. A module that ties connection and process to the agent is defined during the course of this example, so go ahead and start the Control Center Agent before continuing.
To start the Control Center Agent, execute the ccastart script within the $OWBHOME/owb/bin/unix or $OWBHOME/owb/bin/windows directory:
oracle@ora11g cd /u01/app/oracle/product/11.2.0/owb11gR2/owb/bin/unixoracle@ora11g ./ccastart
The first time the Control Center Agent is started, you must enter a password for the Oracle Containers for J2EE (OC4J) Administrator account. You will need the password later in this example, so make a note of it.
With the server components in place, we can turn to the Oracle Warehouse Builder client application, Design Center, to get started creating the modules—the containers that identify a location and define the load, transform, or other actions—we’ll need for this integration.
Step 1: Create the SQL Server source module. To begin, do the following:
Log in to the Design Center application as a workspace user. In the Projects pane, open your project and locate the Databases node.
Double-click the Databases node to display the list of available platforms.
Right-click the SQL Server node, and select New -> SQL Server Module to launch the Create Module wizard.
Name the module SS_SOURCE , and select Native Database Connection for the access method to use the native JDBC drivers.
Click OK to continue. The wizard prompts you to create a new database location.
Accept the default name, SS_SOURCE_LOCATION1 , and complete the other connection details for the SQL Server database, as shown in Figure 1. For this example, the location details are as follows:
User Name: SA (the default system administrator account in SQL Server 2005)
Password: password (change this to the relevant password for your source database)
Host: 172.16.233.130 (again, change as appropriate)
Database Name: SampleDB (change as appropriate)
Schema: dbo (the default “database owner” schema in SQL Server 2005)
Version: 9.0 (also supports 7.0 and 8.0)
|Figure 1: Specifying the SQL Server database location|
When the connection details are correct, click OK to close the dialog box. On the next page of the wizard, check the Import after Finish checkbox to automatically start the metadata import process when this wizard finishes.
Click Next , and click Finish to close the Create Module wizard. The Import Metadata wizard launches.
Select Table for Object Type , and then select the tables whose metadata you want to import. For this example, my tables are CUSTOMERS and ADDRESSES.
Step 2: Create the Oracle Database target module. The Oracle Database target module identifies the location of the Oracle Database instance and the table for the transformed data. For this example, because we are using a MERGE transformation to incrementally load the data, the table requires a primary key. To create this module in your project
Right-click the Oracle node in the Databases list (under the Project Navigator in Design Center).
Select New -> Oracle Module to launch the Create Module wizard.
Name the module CUST_DW.
Enter the connection details for the target database. Select Import after Finish to launch the Import Metadata wizard when the Create Module wizard finishes.
After defining the module, create (or import) the metadata for your target table. For this example, I’ve named the target CUSTOMERS_FULL .
Step 3: Create the template mapping module. To take advantage of the new heterogeneous connectivity features in Oracle Warehouse Builder, you must create your data mapping as a code template mapping. Code template mappings are created in the Template Mappings area of a project. Before you can create a code template mapping, however, you must define a template mapping module that connects to a Control Center Agent, as follows:
In the Project Navigator of the Design Center, locate the Template Mappings node within your project.
Right-click the Template Mappings node, and select New Mapping Module to launch the Create Module wizard.
Enter a name for the new module. For this example, we’ll use DEFAULT_AGENT.
Click Next to continue. You are prompted to create a location pointing to a Control Center Agent.
Click the Edit button to create the location, and enter the connection details for the Control Center Agent, as shown in Figure 2. For example,
User Name: oc4jadmin
Password: welcome1 (password you entered in “Preliminary Setup”)
Host: ora11g.rittman (change as appropriate)
Port Type: RMI
Instance Name: <leave blank>
Application Name: jrt
HTTP Port: 8888
|Figure 2: Creating the template mapping module|
With the source, target, and container modules defined, you can now create the code template mapping.
From the Projects pane, right-click the DEFAULT_AGENT template mapping module created in the previous steps and select New Mapping .
Enter a name for the mapping, such as SAMPLE_CODE_TEMPLATE_MAP .
Click OK to create the empty mapping.
Add the two SQL Server source tables to the mapping canvas.
Add a joiner transformation operator to the mapping canvas.
Connect the two source tables to the joiner.
Define the JOIN condition.
Add the Oracle Database target table to the mapping.
Connect the Oracle Database target table to the joiner transformation, so that the mapping looks like that in Figure 3.
|Figure 3: Code template mapping logical view|
The logical view is complete, so we can now define the execution view.
Step 4: Create the execution view for the code template mapping. Execution units are new in this release of Oracle Warehouse Builder. They are used to associate related mapping operations with a specific code template. This particular code template mapping requires two execution units: one that executes on the SQL Server source and another that executes on the Oracle Database target.
After defining the execution units, you will use two code templates to perform the data mapping:
LCT_SQL_TO_ORACLE, a load code template that extracts from any generic SQL database source and loads into a temporary Oracle Database table
DEFAULT_ORACLE_TARGET_CT, an Oracle Database target code template that acts as a “wrapper” around classic Oracle Warehouse Builder SQL and PL/SQL mapping functionality
To create these two execution units and assign their code templates, return to the Design Center. With your mapping open and with Execution View selected, click the Default Execution Units button in the top left corner. The two execution units needed for this project are created automatically.
With Execution View selected, click the SS_SOURCE_LOCATION1_EU execution unit. A multitabbed panel appears by default, under Execution View .
Locate the SS_SOURCE_LOCATION_EU - Code Template tab under Execution View .
Select the Integration/Load Code Template subtab, and then choose the PUBLIC_PROJECT/BUILT_IN_CT/LCT_SQL_TO_ORACLE code template.
Repeat this process for the CUST_DW_LOCATION_EU execution unit (which runs on the target Oracle Database instance ), but select the PUBLIC_PROJECT/BUILT_IN_CT/DEFAULT_ORACLE_TARGET_CT code template.
Step 5: Deploy the code template mapping. The mapping is now ready to deploy to the Control Center Agent, where it will run. There are a couple of different ways to do this; the simpler one is to open the Control Center Manager from the Tools menu of Design Center and then do the following in the Control Center navigation tree:
Expand the project node under which you created the code template mapping, and then expand the location node associated with the mapping module containing the code template mapping.
Expand the mapping module node containing the code template mapping.
Select the code template mapping created in Step 4, and in the Object Details panel, select Create as the Deploy Action .
Click the deploy icon.
Check for any warnings or errors that may be generated during deployment, but you can safely ignore any VLD_2753 and VLD-2752 warnings on the source execution unit—they will be suppressed in a future patch release.
Finally, run your code template mapping and check that data has been transferred as expected into your Oracle Database instance. As with database-resident mappings, you can start and then monitor the progress of your mapping, using the Control Center Manager, or you can start it directly from within the Project Navigator. For this example, we’ll run the template mapping from the Project Navigator and then display the results of the execution in the log.
Step 6: Execute the code template mapping. In the final step, do the following:
Select the SAMPLE_CODE_TEMPLATE mapping within the Projects pane, right-click it, and select Start.
The code template mapping executes. You can watch the progress of the execution in the Log pane.
As you have seen in this example, the new code template mapping capability enables you to leverage the strengths of non-Oracle source databases without losing the benefits of classic Oracle Warehouse Builder integration, load, and control code mappings. Oracle Database-specific transformation operators—such as dimension operators, cube operators, and match-merge operators—can be used with the new code templates to support both heterogeneous and Oracle Database-specific data transformations.
By mixing and matching database-resident and code template mappings in your process flows, you can create data integration processes that leverage the capabilities of both types of mapping. Check out the Oracle Warehouse Builder 11g Release 2 documentation on Oracle Technology Network for more details.
Code templates bring the heterogeneous data connectivity and data integration capabilities of Oracle Data Integrator into the Oracle Warehouse Builder toolset, giving you the ability to load, validate, and integrate data across a wider range of datasources. Best of all, because code templates are written with a well-documented template language and a set of substitution variables, you can extend them or write your own to suit your own particular circumstances.
DOWNLOAD Oracle Database 11g Release 2 for Linux, Oracle Solaris, HP-UX, or AIX
Photography by Ricardo Gomez, Unsplash