Oracle Data Integrator 11gR1 PS1 introduces a major new feature called Load Plans. This post will give you an overview of this feature.
The documentation defines the load plan as an executable object that can contain a hierarchy of steps that can be executed conditionally, in parallel or in series. The leaves of this hierarchy are Scenarios. Packages, interfaces, variables, and procedures can be added to Load Plans for execution in the form of scenarios.
In a nutshell, Load Plans are extremely powerful objects for organizing and launching scenarios in a production context. They should help you get rid of manual scripts coded for starting scenarios in the correct order, and of packages used to launch other scenarios in cascade.
Creating a Load Plan
Load Plans appear in both the Designer and Operator Navigators, as shown below. They can be edited in both development and production repositories, and can be organized into scenario folders.
Creating a load plan is straightforward: right-click and select New (Load Plan), then specify a Name for the Load Plan. As a Load Plan will be launching scenarios, you can define at that level how these scenarios will be logged (the Log Session, Log Session Step, and related options).
The real work with Load Plans takes place on the Steps tab. There, you can define a hierarchy of steps. The leaves of this hierarchy will be Scenarios that will be started in sequence, in parallel, and/or conditionally based on the values of variables.
In the example below, the Datawarehouse Load Plan does the following in a sequence (serial step).
- First it runs an Initialization step (this step starts a scenario called INITIALIZATION),
- It does a Refresh [of the] Dimensions in Parallel (more information below.)
- Then it evaluates the value of the IS_LOAD_FACT variable. This variable is passed as a startup parameter of this load plan.
- If this value is 1, it runs the LOAD_SALES then the FINALIZE_FACT_LOADING scenarios
- If this value is 2, it runs the LOAD_SALES scenario only
- Otherwise, it runs the FINALIZE_FACT_LOADING scenario
Refreshing the Dimensions in Parallel implies that we perform two actions simultaneously:
- Load the Products (this is done by the LOAD_PRODUCTS scenario)
- Load Geographies (This is a package loading a set of country/regions/cities tables) and then load Customers (this is a second package).
This step embeds a serial step within a parallel step. Load plans let you embed steps within steps, which creates a complete execution flow in the hierarchy.
To add steps, you can either use the wizards (available with the "+" button in the toolbar) or drag and drop scenarios, interfaces, procedures, etc. directly from the Designer tree view into the Step hierarchy. Dropping a component automatically creates a scenario for it and adds this scenario as a step in the load plan.
If you prefer top-down development, you can create a load plan and use the wizard to add scenarios that do not exist yet. In the example below, the scenario added with the wizard does not exist yet, and by using version number -1, we simply tell the load plan to execute the latest version of this scenario.
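To make the hierarchy concrete, here is a minimal Python sketch of the Datawarehouse example. It only models the execution flow and is not ODI code: the classes `Scenario`, `Serial`, `Parallel` and `Case` are hypothetical, and so are the scenario names LOAD_GEOGRAPHIES and LOAD_CUSTOMERS (the post does not name the scenarios generated from the two packages).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Scenario:
    """Leaf step: stands in for starting one scenario."""
    def __init__(self, name):
        self.name = name

    def run(self, variables, log, lock):
        with lock:  # the log is shared between parallel branches
            log.append(self.name)


class Serial:
    """Runs its children one after the other."""
    def __init__(self, *children):
        self.children = children

    def run(self, variables, log, lock):
        for child in self.children:
            child.run(variables, log, lock)


class Parallel:
    """Starts all children at once and waits for all of them."""
    def __init__(self, *children):
        self.children = children

    def run(self, variables, log, lock):
        with ThreadPoolExecutor(max_workers=len(self.children)) as pool:
            futures = [pool.submit(c.run, variables, log, lock) for c in self.children]
            for future in futures:
                future.result()  # re-raise any error from a branch


class Case:
    """Picks one branch from the value of a variable (When / Else)."""
    def __init__(self, variable, when, otherwise):
        self.variable, self.when, self.otherwise = variable, when, otherwise

    def run(self, variables, log, lock):
        branch = self.when.get(variables.get(self.variable), self.otherwise)
        branch.run(variables, log, lock)


# The root of a load plan is a serial step.
datawarehouse = Serial(
    Scenario("INITIALIZATION"),
    Parallel(  # Refresh Dimensions
        Scenario("LOAD_PRODUCTS"),
        Serial(Scenario("LOAD_GEOGRAPHIES"), Scenario("LOAD_CUSTOMERS")),
    ),
    Case(
        "IS_LOAD_FACT",
        when={
            1: Serial(Scenario("LOAD_SALES"), Scenario("FINALIZE_FACT_LOADING")),
            2: Scenario("LOAD_SALES"),
        },
        otherwise=Scenario("FINALIZE_FACT_LOADING"),
    ),
)


def run_load_plan(plan, **variables):
    """Run the plan and return the scenario names in the order they started."""
    log = []
    plan.run(variables, log, threading.Lock())
    return log
```

Calling `run_load_plan(datawarehouse, IS_LOAD_FACT=1)` always starts with INITIALIZATION and ends with LOAD_SALES followed by FINALIZE_FACT_LOADING. The order of the three dimension scenarios in between can vary from run to run, except that LOAD_GEOGRAPHIES always precedes LOAD_CUSTOMERS.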
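The idea behind version -1 can be sketched in a few lines of Python. This is an illustration only: `resolve_version` is a hypothetical helper, and the assumption that "latest" means the highest zero-padded version string is mine, not something ODI guarantees.

```python
def resolve_version(available_versions, requested):
    """Return the scenario version a step should run.

    Assumption: versions compare as zero-padded strings ("001" < "002"),
    and "latest" means the highest version generated so far.
    """
    if requested != "-1":
        return requested  # an explicit version is used as-is
    if not available_versions:
        raise LookupError("no version of this scenario has been generated yet")
    return max(available_versions)
```

Because the lookup happens when the step runs, the step can be added to the load plan before any version of the scenario exists, which is what makes the top-down approach possible.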
In addition, you can, from a load plan step, access the object from which the scenario was created or regenerate the scenario.
Reorganizing the load plan is also extremely simple, as it is just a matter of drag and drop!
Running Load Plans
After saving your Load Plan, you can run it by clicking the execute button in the toolbar. The running load plan is shown in the Load Plan Executions accordion in the Operator. The Steps tab of the Load Plan Run shows the steps executed, their status and statistics. This whole tab reflects the executions in progress, and can be refreshed while the executions take place.
The sessions started by the load plan still appear in the Sessions list, but the Steps tab is far more useful for monitoring the overall execution of all these sessions. By clicking on the Session ID link (in blue) in this tab, you open the Session editor and can drill down into the session.
Like Scenarios, Load Plans can also be started from a command line or a web service interface. They can of course be scheduled using an external scheduler or the built-in scheduler.
Note that Load Plans require a JEE or Standalone agent for running. They cannot run within the Studio Local Agent. This is due to the fact that the Load Plan execution flow is distributed across the agents running the sessions started from the Load Plan. With this architecture, there is no single point of failure that may prevent a load plan from proceeding with its execution flow when the execution takes place on multiple agents.
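As an illustration of the command-line route, the Python sketch below assembles a call to the `startloadplan` script that comes with the agent. The argument layout (load plan name, context code, `-AGENT_URL=...`, then `variable=value` pairs) is an assumption based on the ODI documentation and should be checked against your release. The script path, load plan name, context code, agent URL and the `GLOBAL.` variable prefix are all placeholders.

```python
def build_startloadplan_command(script, load_plan, context, agent_url, variables=None):
    """Build the argument list for a startloadplan call (assumed syntax)."""
    command = [script, load_plan, context, f"-AGENT_URL={agent_url}"]
    for name, value in (variables or {}).items():
        command.append(f"{name}={value}")  # startup parameters of the load plan
    return command


command = build_startloadplan_command(
    script="./startloadplan.sh",                       # placeholder path
    load_plan="DATAWAREHOUSE",                         # hypothetical load plan name
    context="PROD",                                    # hypothetical context code
    agent_url="http://localhost:20910/oraclediagent",  # placeholder agent URL
    variables={"GLOBAL.IS_LOAD_FACT": 1},
)
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine where the agent scripts are installed, or printed and pasted into an external scheduler.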
Exception Handling and Restartability are among the coolest things in Load Plans.
An Exception is simply a group of steps (like a mini Load Plan) that can be executed in case of a failure.
In the example above, I have defined two Exceptions (Minor Exception and Major Exception). Both start a scenario that mails the administrator. The major one also starts a scenario to dump the log. These exceptions can be triggered on step failure.
Every step has a property that indicates whether the exception should be executed when this step fails, and whether the failure on this step should be raised to the parent step in the hierarchy. By raising a failure, you can escalate the failure up to the root step, which fails the whole Load Plan. By ignoring the failure, you flag this step's failure as a minor error.
In the example below, if any of the parallel steps refreshing the dimensions fails ("Max Error Child Count=0"), the Refresh Dimensions step is considered as failed. In the event of such a failure, I will run the Minor Exception and continue the load. Even if not all dimensions are refreshed, the facts can still be loaded, as I am using the ODI data integrity framework to isolate facts that would reference dimensions not correctly refreshed.
This example also illustrates the restartability for such a step. If I decide to restart this load plan, only the failed children would be restarted, as defined by the Restart Type option.
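The failure rules for this step can be sketched in Python as follows. This is a simplified model rather than ODI code: the function names, the "DONE"/"ERROR" status strings and the "raise"/"ignore" labels are hypothetical stand-ins for the step properties described above.

```python
def parallel_step_failed(child_statuses, max_error_child_count=0):
    """A parallel step fails once more children are in error than the threshold accepts."""
    errors = sum(1 for status in child_statuses if status == "ERROR")
    return errors > max_error_child_count


def on_step_failure(exception, behavior, run_exception):
    """Run the exception attached to the step (if any), then report whether
    the failure must be escalated to the parent step."""
    if exception is not None:
        run_exception(exception)
    return behavior == "raise"


started_exceptions = []
dimension_statuses = ["DONE", "ERROR"]  # one of the two parallel branches failed

escalate = False
if parallel_step_failed(dimension_statuses, max_error_child_count=0):
    # Run the Minor Exception, then ignore the failure so the load continues.
    escalate = on_step_failure("Minor Exception", "ignore", started_exceptions.append)
```

With a threshold of 0, a single failed child is enough to fail the parallel step. Because the behavior is "ignore", `escalate` stays `False` and the parent serial step moves on to the fact loading.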
Note that when restarting an existing load plan, ODI does not overwrite the first load plan run, but copies it and restarts the copy. Each Load Plan Run is preserved for error identification and tracking.
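The copy-then-restart behavior can be pictured with a small Python sketch. It is only a model under stated assumptions: a run is reduced to a dictionary of child step statuses, and the step names and the "DONE"/"ERROR"/"WAITING" values are hypothetical.

```python
import copy


def restart(previous_run):
    """Return a new run in which only the children left in error are scheduled again.

    previous_run maps a child step name to its status and is never modified,
    so the history of the first run stays available.
    """
    new_run = copy.deepcopy(previous_run)
    for step, status in new_run.items():
        if status == "ERROR":
            new_run[step] = "WAITING"  # will be executed again in the copy
    return new_run


first_run = {"LOAD_PRODUCTS": "DONE", "GEOGRAPHIES_AND_CUSTOMERS": "ERROR"}
second_run = restart(first_run)
```

After the call, `first_run` still records the error, while `second_run` keeps LOAD_PRODUCTS as done and re-schedules only the failed branch.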
Load Plans vs. Packages
Users who already know ODI may now wonder: are Load Plans a new type of package? Although there are similarities between these two objects, they do not have the same objective:
- Packages are (simply said) technical workflows with a strong transactional nature, produced mainly by data integration developers.
- Load Plans aim to make the functional design and administration of a production environment easier, and are produced by production users and data integration project leads/architects.
Let's discuss the differences:
|| Feature || Load Plans || Packages || Recommendation ||
|| Available at || Design-Time and Run-Time || Design-Time only. Packages are compiled into scenarios for run-time. || If production needs to modify the execution flow, it is preferable to deliver a load plan. ||
|| Launched from || UI, Command-Line, Web Services, Scheduling || UI, Command-Line, Web Services, Scheduling || Both features are equivalent. ||
|| Transactions || Each Load Plan Step contains its own transactions. || Package Steps may share a transaction. || If the workflow requires a transaction that spans several steps, use a package. ||
|| Parallel execution || Yes, using Parallel Steps. Parallel execution is easy to follow in the Operator. || Yes, by starting other scenarios. Parallel execution is hard to follow in the Operator. || When there is a strong need for parallel step execution, prefer Load Plans. ||
|| Restartability || Yes. Status of previous runs is persisted. || Yes. Status of previous executions is overwritten. Database transactions are not continued, hence restarting the whole package is often needed. || Due to their transactional nature and the fact that their execution is overwritten by the new execution, packages are often restarted as atomic units of work. Load Plans provide better flexibility for restartability. ||
|| Branching and looping || Branching (Case/When) is supported. Loops are not supported. || Branching and Looping are supported. || If there is a need for looping in a workflow, prefer packages. ||
That's all for today. Stay tuned for more deep dives into the new features of this release!