June 10, 2009

File staging using external table preprocessor

Here is a quick write up of how to take advantage of the external table preprocessor feature which was squeezed into the 11.1.0.7 and 10.2.0.5 database patch sets from within OWB. OWB has rich metadata support for file design and its manipulation from the file metadata itself to the external table definition, SQL*Loader mappings and PLSQL support, it also provides a flexible approach (more like tips and tricks) for incorporating aspects of the database that are not directly designed in the UI, such objects include views, unbound operators in mapping etc. For external table, the access specification can be hand-crafted rather than generated from the metadata or imported from the data dictionary. It is this flexibility that allows you to incorporate the preprocessor directive into external table designs within production OWB releases.

For details of the preprocessor directive see the article ‘Using the Preprocessor Feature with External Tables in Oracle Database 11g’ on OTN.

The example we will illustrate here is using a compressed data file as the data file source for the external table. The external table represents customer details and is defined in a database schema. We will setup the metadata location for the Oracle module and select ‘Import’ (external tables can also be designed, you do not need to base it on a file defined in OWB, if you hit ‘Finish’ on the file selection page of the wizard you will create an ‘unbound’ external table);

exttab_gzip1

From the import wizard we can select the external table to be imported;

exttab_gzip2

When the table has been imported we get the signature of the table with all of the columns and data types defined;

exttab_gzip3

In the access specification section we see the PREPROCESSOR directive that our external table was defined with also imported. The script gunzipdb is used to encapsulate the unzipping and write the resultant file to the standard output stream to be consumed by the external table access driver. As mentioned from the above  post although commonly used for compressed files it could perform any manipulation on the file in preparation for it being consumed by the external table.

exttab_gzip4

The gunzipdb is just a shell script I created to wrapper the options to the gzip executable, it performs the unzip using the –d and –c options (on Windows I included the echo off command, also had much better success with large compressed files on Linux than on Windows).

We can inspect the external tables configuration parameters and customize the external table’s treatment for missing values, trimming etc.

exttab_gzip5

We can also see the data file imported with the external table definition, note it is a gzip’d file and the SQL directory location can be set. We can also add additional data files if we wanted to process many files.

exttab_gzip6

 

We can now either just use this external table in a mapping or generate and deploy the external table definition off to another schema if desired.

exttab_gzip7

That’s a quick run through of using the external table preprocessor capabilities from within OWB, if there are other interesting use-cases it’d be interesting to get some feedback.

June 9, 2009

Using Oracle Warehouse Builder with Exadata: Success Story at Allegro Group

Oracle Magazine’s feature story, “Lead with Intelligence”, documents the successful experience of Oracle Exadata at Allegra Group, the largest e-commerce company in Eastern Europe. Warehouse Builder was selected as the ETL tool for this multi-terabyte warehouse project on Oracle’s most advanced data warehouse architecture delivering extreme data warehouse performance.

The HP Oracle Database Machine, announced at Oracle OpenWorld in September 2008, is a grid of eight database servers with 64 Intel processor cores running Oracle Database 11g and Oracle Real Application Clusters on Oracle Enterprise Linux. This data warehouse solution also includes a storage grid of 14 HP Oracle Exadata Storage Servers with 112 processor cores that’s connected to database servers over InfiniBand. …

In December 2008, Allegro Group deployed the HP Oracle Database Machine and began generating operational and statistical reports using Oracle Business Intelligence Suite, Enterprise Edition Plus. Allegro Group also created an interface to its existing systems using Oracle Warehouse Builder, a core component of Oracle Database 11g. Each day, Oracle Warehouse Builder automatically loads production data into Allegro Group’s Oracle data warehouse, which already holds more than 7TB of data on the HP Oracle Database Machine.

More at Oracle Magazine.

May 14, 2009

DW Migration and Upgrades: Cheaper, Faster, Safer with OWB

Handcoding may look cheaper and quicker sometimes, but OWB gets you greater flexibility and lower costs over the long term.  This news item from ZDNet down under showcases an OWB customer winning big in a quick migration of their data warehouse from Oracle Database 10.2 to Database 11.1:

Retail distributor Metcash, owner of IGA distribution and IGA Fresh, this week revealed it had migrated to Oracle's Database 11g from 10.2g earlier this year.

The upgrade decision came as part of a wider data warehouse hardware consolidation program in late 2008, according to Carl Young, Metcash's technical lead for the project, with the company ditching its HP Superdome Unix servers for IBM P6 servers running the AIX operating system.

"I thought it best to roll up the two testing cycles into one and move on to the latest Oracle software," Young said of Metcash's motivation to upgrade to Database 11g.

The migration to 11g took a total of 12 weeks, while testing in Metcash's production environment with the systems running in parallel took six weeks. Metcash's 11g went into production in February this year.

Metcash's use of Oracle's programming tool Warehouse Builder, which automatically regenerated code from the source 10.2g database to the target 11g database, meant programmers weren't required for the taskBenefits to Metcash included reducing batch report production time, cutting out the use of Excel spreadsheets, and the ability to store more data online. Other performance benefits came from query caching, which helped avoid duplicating retrieval as well as the ability to make structural changes to the database without having to rebuild the "materialised database view aggregate" — a process that used to take 24 hours.

So… the takeaway: OWB made it easy to build a faster, more flexible data warehouse that leveraged the 11g database on a short schedule, with minimal development costs, low overall migration costs, and low project risk.

May 6, 2009

Headless Operation – OWB code generation, going it alone

There are a few different implementation scenarios for moving designs from OWB into an execution environment, out of the box OWB has a runtime service component which supports deployment and execution of all of OWB’s design objects. The standard deployment through the control center is the approach the majority take, it is the one we document and include in training etc. This post will illustrate how you can roll your sleeves up and go it alone in a headless mode, using the deploy to file utility (on exchange here). In the OWB designer you can generate code for any object, take the code by saving it to file or copy-pasting and run where ever you want. This utility speeds up this process and saves scripts for a bunch of objects into a directory and provides a convenient top-level wrapper script.

The deploy to file utility uses some OMB scripting commands to generate a directory of SQL scripts and control files for objects in OWB’s designer. Not all objects can be deployed in this way, but common ones can;

  • tables/views/materialized views
  • dimensions/sequences
  • procedures/functions/packages
  • mappings (PLSQL/SQLLoader)

The utility will generate the code as a bunch of SQL scripts (and SQLLoader control files) and be installed independently of the OWB product (there is still the need for the repository tables for the runtime etc.). The utility provides the ability to deploy from a collection or an Oracle module. It uses the OMBDEPLOY ....AS SCRIPT TO ... grammar to generate a script that can be executed (deployed) in SQLPlus like any other Oracle SQL script. For SQLLoader mappings a control file is generated and this control file can be used by invoke SQLLoader command line and passing the data input file name on the command line. For execution of the PLSQL mappings use the MAIN function method (not the procedure entry point). Process flows cannot be deployed since the code is XPDL (XML) that is interpreted by the runtime adapter and transformed into Oracle Workflow for example. Executing in this manner will also not give the full audit capabilities of OWB, so there are a bunch of caveats.

Users can use collections to group sets of objects together and use this as a deployment unit. For example to deploy all objects in a collection simply right click on the collection node and select 'Deploy To File', the example below will specifically only generate code for the table and mappings in the collection DEP_TEST;

deploy2file3

Next you will be prompted for the directory to generate the SQL scripts in;

deploy2file4

That's it. The utility will generate all of the SQL scripts and save them to file also generating a master create.sql script (dependent order is not guaranteed). The utility is an expert, so can be altered, enhanced to provide such capabilities.

The utility is also available from the Oracle module node :

deploy2file5

When executed from the Oracle module node you will be prompted to select the types you wish to deploy, so you can decide to only generate tables or mappings or any combination can be selected;

deploy2file6

A handy little utility.

I mentioned that not all of the OWB metadata is included since manual deployment was chosen, so you cannot see the deployment information in the OWB runtime tables, or the executions from the control center. The audit details such as errors and row counts are still performed. For example taking the following query;

  • select task_type,exec_location_uoid,object_name,object_type, number_task_errors,number_task_warnings from all_rt_audit_executions

The task_type, exec_location_uoid are normally populated when executing an object deployed through OWB, this ties all the metadata together that the control center manager and browser use, When you manually deploy the script this is unavailable, audit rows are still recorded so the number of task errors and warnings are still reported. The auditing is still done for PLSQL maps, for SQLLoader maps, scraping of the audit details from the log into the audit tables is done by the control center service, so manual execution will not have such audit info.

There is a SQLDeveloper report for viewing the execution information included in the zip. Import the file owb_sqldev_headless.xml from the 'User Defined Reports' node in the Reports panel of SQLDeveloper. This will create a report 'Mappings (headless)' under 'User Defined Reports/OWB/Execution' folders. The report is a master detail report showing;

  • throughput
  • mapping audit
  • mapping errors
  • throughput chart
  • timing analysis

deploy2file9

This approach may be useful for you when you are happy manually building the system and leveraging the code generation capabilities of OWB. This is definitely for the more savvy OWB user since you will be responsible for the database links/directories for example that are used by the mappings and for orchestration of the mappings etc.

April 17, 2009

LinkedIn Group Breaks 500 Member Mark

522members

Wow, you folks move fast. In three days we’ve added over 30 LinkedIn members, which has to be a good thing. Obviously somebody’s reading the blog… :)

More real content soon, I promise—we’re just in a busy time right now, working on OWB 11.2 and a few other surprises. Meanwhile, we’re also thinking ahead to OpenWorld planning… Let us know what you want to see there.

April 14, 2009

LinkedIn Group Closing in on 500 Members

The Oracle Data Integration Linkedin Group (originally just an OWB group, now open to all users of Data Integration technologies with the Oracle database) is closing in on 500 members now.

Come join us at LinkedIn.

Group members include product management, development and marketing from the OWB and ODI teams, as well as users of each product, and recruiters and hiring managers looking for candidates with OWB skills. People post job leads, questions, and questions and discussions about the future of data integration at Oracle.

Will you be the one to put us over the top? (I sure hope someone is...)

April 3, 2009

User-Contributed Script: Export All Projects in a Repository

Oracle Warehouse Builder user Hanming Tu contrinbuted a Korn Shell script and OMB*Plus script to export all the projects in a OWB repository to MDL. (It should be possible to write a Win32-specific wrapper for use with the OMB*Plus script as well, if you don’t have a korn shell implementation on your Windows box.)

Known limitations:

  • it requires single user mode to connect to the repository
  • it stops the export if users are still in a project

You can download the script here

April 2, 2009

User-Contributed Utility: Source to Target Impact Analysis Expert

Submitted by Anthony Chow of Octagon Research, this expert locates all target tables for any given source table and output the results to a CSV.

Through OMB+, I used the technique of parsing the result strings from OMBIMPACT DEPENDENCYTYPE calls.
Some assumptions used during development, which you can certainly modify:
· "Project" is equivalent to projects shown as top nodes in the Project Explorer
· "Study" is equivalent to Oracle databases under each project
· Select to analyze multiple source tables by using the Ctrl key
· Target tables have "_FINAL" as the suffix in the name
Known limitations:
· Patience is certainly virtuous; great results take time to produce
· OMBIMPACT DEPENDENCYTYPE is unable to detect impacts against columns embedded into expressions; and the pivot and unpivot operators

March 30, 2009

User-Contributed OWB Utility: Post them if you have them!

For some years we have had our own OWB Utility Exchange on OTN where we post useful scripts and experts for the OWB user base. We are planning to reorganize that content to make it easier to access.

We also receive from time to time contributions of scripts from our user community that other customers may find valuable. Until recently, we have encouraged posting those on the official Oracle Wiki page for OWB (http://wiki.oracle.com/page/OWB+Utilities+and+Experts). However, to maintain a certain amount of control over that content and to concentrate it where our traffic actually goes, we have decided to start posting them on the blog instead.

Check the User-Contributed Content category to find such material. 

Contributing a Utility or Expert for Oracle Warehouse Builder

      • To submit material, send an email message to antonio.romero@oracle.com and tell me what you have.
      • Describe the expert or other script, the version of OWB it was developed for (including patch level), how to install and use it, and who to contact with any questions.
      • If we think it’s of general interest and well-crafted etc., we’ll run it and link back to your site.

      March 20, 2009

      Process Flow Variables – To Quote or Not To Quote

      Here is some information on the process flow variable literal quote or not quote query. Its not exactly intuitive, so hopefully this helps! The flow designer has to know what is a PLSQL activity and what is not. Process flow variables are added in the same manner as process flow parameters (see the post here for details)


      Literal = FALSE

      When Literal = FALSE is set then the value entered must be a valid PL/SQL expression which is evaluated at the Control Center e.g.
      'Hello World!'
      22 / 7


      Literal = TRUE

      When Literal = TRUE then the value is dependent on the the type of Activity.  If the activity is a PL/SQL object i.e. Mapping or Transformation, then the value is PL/SQL snippet.  The critical difference here is that the value is macro substituted into the call for the object.  The format of the value is identical to that entered as default value is the Mapping editor. e.g.
      'Hello World!'
      sysdate()

      If the activity type is not a PL/SQL object then the value is language independent. e.g.
      Hello World
      3.1427571

      Some illustrations

      So what happens when you get it wrong? The following variable definitions and default value

      variables0

      will give the following errors when the flow is executed:

      Error
      RPE-01003: An infrastructure condition prevented the request from completing.
      PFLOWCON
      Error
      RPE-01038: Failed to evaluate expression declare "$LOOP_DETECT$" NUMBER := 0;function "PFLOWCON" return VARCHAR2 is "WARNING" NUMBER := 2;"RETURN_RESULT_NUMBER" NUMBER := NULL;"NUMBER_OF_ERRORS" NUMBER := 0;"SUCCESS" NUMBER := 1;"ITEM_TYPE" VARCHAR2(4000) := (owbsys.wb_rt_task_variable.get_value(12137));"ERROR" NUMBER := 3;"EVAL_LOCATION" VARCHAR2(4000) := (owbsys.wb_rt_task_variable.get_value(12140));"OK" NUMBER := 1;"FAILURE" NUMBER := 3;"AUDIT_ID" NUMBER := 12136;"RETURN_CODE" NUMBER := NULL;"NUMBER_OF_WARNINGS" NUMBER := 0;"ITEM_KEY" VARCHAR2(4000) := (owbsys.wb_rt_task_variable.get_value(12139));"RETURN_RESULT" VARCHAR(64) := NULL;"PARENT_AUDIT_ID" NUMBER := NULL;"OK_WITH_WARNINGS" NUMBER := 2;begin "$LOOP_DETECT$" := "$LOOP_DETECT$" + 1;if "$LOOP_DETECT$" > 2 then raise_application_error(-20001, 'Loop detected calling "PFLOWCON"');end if;return to_char(defaultvalue);end;begin :result := "PFLOWCON";end;.  Please modify the expression, redeploy and retry again.
      PFLOWCON
      Error
      ORA-06550: line 1, column 831:
      PLS-00201: identifier 'DEFAULTVALUE' must be declared
      ORA-06550: line 1, column 816:
      PL/SQL: Statement ignored

      This error happens when you have a variable defined as a STRING, with literal defined as false, and the value of the variable is not quoted.

      So the following two variable definitions are equivalent and valid:

      variables1

      So watch out for the RPE-01038 exception and check the values and literal definitions.

      About

      My Profile

      Powered by
      Movable Type and Oracle