Developers working on data integration projects are often required to load numerous database tables in a particular sequence, with parts of the load process carried out in parallel to reduce load times. Ideally such load routines should be configurable—so that, for example, a data warehouse can be reloaded or refreshed with new data—and it should be possible to restart a failed load routine once the reason for the failure has been addressed.
To handle these requirements, the 184.108.40.206 release of Oracle Data Integrator 11g introduces load plans. Load plans—building on the interfaces, packages, procedures, and scenarios already present in Oracle Data Integrator projects—provide the ability to create hierarchical data integration processes that enable conditional execution, parallel execution of integration tasks, and plan restartability after a failure.Creating Your First Load Plan
So how do load plans work, and how do they differ from packages, the traditional way to sequence integration steps in Oracle Data Integrator 11g? To find out, let’s work through a scenario in which data is sourced from the OE (Order Entry) sample schema that comes with most Oracle Database releases and is loaded into product and customer dimension tables as well as an ORDERS fact table in another schema. If you want to try this new feature yourself, download and install Oracle Data Integrator 220.127.116.11, access a database with the OE sample schema installed, and download and install the load plan project files. Follow the instructions in the zip file for installing the load plan project files.
In the initial version of this article’s Oracle Data Integrator project, a package loads each table in turn via a set of interfaces. Now let’s enhance this load routine, so that (1) the two dimension tables are loaded in parallel before the fact table and (2) the user has the option to load just the fact table, skipping the dimension table load.
To do this, follow these steps:
With Oracle Data Integrator’s Studio integrated development environment (IDE) open, click the Designer navigator tab and navigate to the Load Plans and Scenarios pane. At the right of the pane header, select New Load Plan.
The load plan editor opens on the right-hand side of the screen. Ensure that the Definition tab is selected, and then enter the following details:
Name : OELoadPlan
Description : Load Plan to load
customer, product, and order data
Click Save to save the load plan’s initial definition.
To add a new first step to the plan that will run a procedure called Trunc Error Table to truncate the error table, first select the Steps tab in the left column. Then click the add step button (the green plus [+] sign), select Serial Step from the menu, and rename it Initialize.
To add the Trunc Error Table procedure to the step, locate it on the Projects panel and drag and drop it on top of the new Initialize step. Load plans run only scenarios, the compiled form of procedures, and other Oracle Data Integrator integration objects, so when you drop the step, the load plan editor automatically creates the scenario and adds it to the load plan for you.
Figure 1: Adding a Case step to the load plan
The Case Step wizard launches. Click Lookup Variable, select the variable to use in the Case step—LoadOrdersOnly in this case—and then click Finish.
A Case step is accompanied by one or more When steps that test for individual values and an Else step that covers all other values. Here’s how to add a When step that loads just the fact table when this variable value is set to 1: With Case Step selected, click the add step button and select When Step from the menu. Then on the Step Properties panel, enter and select the following values:
Name : When Value = 1
Operator : Equals (=)
Value : 1
Then go back to Case Step and click the add step button to add an Else step to it. Finally, click When Step, Else Step, and the add step button to add a new Serial step to each one—ready for you to start adding project interfaces to each of the new steps.
The first interface you’ll add is for loading just the fact table. To do this, drag and drop the Pop.Fact_Orders interface onto the Serial step under the When Value = 1 step.
For full loads handled by the Else step, you first want to load the two dimension tables in parallel and then load the fact table. To load the two dimensions in parallel, click the add step button to add a new Parallel step under the Serial step under the Else step and then drag and drop the Pop.Dim_Products and Pop.Dim_Customers interfaces onto this new Parallel step.
Then click back on Serial step under the Else step, click the add step button to add a new Serial step under it, and then drop the Pop.Fact_Orders interface onto it. Once complete, your load plan should look like the one in Figure 2.
Figure 2: The initial load plan
Now that you’ve created the basic load plan, let’s test it out. Click Save to save your load plan details, and ensure that you have a standalone agent running (because you cannot use the built-in agent that comes with Oracle Data Integrator’s Studio to run load plans). Click the Execute button at the top of the load plan editor, enter 0 as the startup value of the LoadOrdersOnly variable to trigger a full load, and then switch to the Load Plan Executions pane within the Operator navigator to see the outcome of the load plan run.
Double-click the load plan run under the Agent folder on the Load Plan Executions pane. A window opens, showing the actual steps that were executed by this load plan run. In this case, because you passed 0 as the variable value when executing the load plan, the Else part of the plan executed and performed a full load. If you executed the load plan again but this time passed 1 as the LoadOrdersOnly variable value, you would see the When part executed instead.Exceptions and Plan Restartability
So far you’ve seen the conditional execution part of load plans in action, but what about exceptions and restartability?
Let’s continue this scenario by considering how you might handle a situation in which the load plan tries to process rows for the ORDERS fact table but those orders reference product dimension IDs that don’t exist, a common scenario for data warehouse developers.
To simulate this situation, let’s first disable the constraint on the OE.ORDER_ITEMS table that stops you from entering invalid product ID values into the PROD_ID column. (You might want to back up your OE schema before doing this, so that you can restore it to its original values afterward.)
ALTER TABLE order_items
DISABLE CONSTRAINT order_items_
Now let’s add new values into the ORDERS and ORDER_ITEMS tables that reference a PROD_ID that doesn’t exist in the OE.PRODUCT_DESCRIPTIONS table:
INSERT INTO orders
INSERT INTO order_items
INSERT INTO order_items
Now execute the load plan again, passing 1 as the LoadOrdersOnly value to trigger a full load. This time the load plan fails at the step where it tries to load the fact table, because the product key lookup fails and Oracle Database raises an error when the load plan subsequently tries to insert a NULL value into the OE_TARGET .FACT_ORDERS.PROD_ID column, which has a NOT NULL constraint on it, as shown in Figure 3.
Figure 3: The load plan showing the error caused by an invalid product ID
To deal with this type of data issue, you need to do two things:
Create an exception with an Exception step that, in turn, runs an Oracle Data Integrator procedure that moves any such rows out of the OE.ORDER_ITEMS table into an error table in the OE_TARGET schema.
Associate this exception with the scenarios in the load plan that load the data warehouse fact table, so that when you try to restart the failed load plan, it will complete successfully.
With the load plan open in the Designer navigator, click the Exceptions tab, click the add step button, and select Exception Step from the menu.
Double-click the new Exception step to rename it, and call it Load Order Exception. To add the Oracle Data Integrator procedure that moves the rows to the error table, drag and drop the Move Offending Items procedure from the Projects pane onto the new Load Order Exception step, so that it is added as a scenario to the load plan, as shown in Figure 4.
Figure 4: Defining the Exception step
Now locate the steps in your load plan that run the scenarios that load the FACT_ORDERS table—in Figure 2, these are steps 6 and 13—and change their restart type in the Property Inspector from Restart from New Session to Restart from Failed Step.
Navigate in turn to each of these step’s parent steps—in Figure 2, these are steps 5 and 12—and in the Property Inspector, change the Exception step value to Load Order Exception, the exception you defined in the previous step.
Choosing these settings ensures that in the event of an error, the Move Offending Items procedure will run to remove the erroneous rows and the load plan can be restarted at this point, skipping all the previous steps.
Figure 5: The restarted load plan, rerunning only the failed step
Load plans in Oracle Data Integrator 11g give you the ability to define data warehouse and other data integration load routines that enable conditional execution and support exceptions and restartability. Available in the 18.104.22.168 release and with additional features in the 22.214.171.124 release, load plans build on the existing interface, procedure, and package features in Oracle Data Integrator and provide a new way to orchestrate and manage your data loading routines.
READ more about Oracle Data Integrator
READ more Rittman
Photography by Igor Ovsyannykov, Unsplash