SQL and OWB - Accelerated Map Construction? Part 1

For those folk who know SQL even a little SQL and then come to OWB, this would be a great helper; build your map from SQL. How do I build a map that looks like this, how do you support this etc. The reality is that although there are a percentage of SQL statements that fit nicely into the OWB mapping paradigm, there are times when you must rethink the SQL to fit into the capabilities of OWB.

Imagine the possibilities, you could use it to learn how to build maps, to create maps quickly, to construct maps from SQL generators such as summary advisors etc. Before getting carried away though... this is a demo, a demo built using OWB experts and an ANSI SQL parser, like most generators there are various caveats, with this demo there are even more, but I'll do my best to maximize the demo appeal;)

A Simple Example
Let's take a simple example from the SQL reference manual, this is an interesting and small example:

 SELECT department_id, MIN(salary) MIN_SAL, MAX (salary) MAX_SAL
   FROM HR.employees
    WHERE HIRE_DATE > '01-JAN-98'
    GROUP BY department_id
    HAVING MIN(salary) < 5000
    ORDER BY department_id;

With OWB this would be represented by a number of operators with various properties set. How long would it take to build this by hand? How many mistakes would you make? Would you forget the order in which the operators should be created?  Let's look at this in an accelerated way using copy-paste!

On the 'Mappings' node in the tree, select 'Build from SQL...'. With this we can copy-paste some SQL (ANSI SQL) and get the map automatically built. I did this as a demo there are a few caveats but with what is here you will see the great potential time saver plus resource of information that this can be.

Launch the dialog to create via SQL:

Map Builder 1:

Below we copy-paste the SQL we are interested in

Map Builder 2:

The dialog as well as prompting for the map name and SQL also included an option for the code generation. Since I generally build SQL maps, I defaulted the expert to use set based mode as the default (this set both the generation mode and default operating mode to set based).  After completing this dialog, the generator is off and running - it has conveniently generated the map including all of the operators with respective properties to represent the SQL entered:

Map Builder 3:

So what did the generator do?
  1. Added tables for all of the FROM tables (schema name is mapped to module, alias is operator name)
  2. Added filter operator for WHERE clause if a single table
  3. Added joiner operator for WHERE clause if multiple tables
  4. Added expression operator for any expressions in select list or
  5. Added aggregation operator if using an aggregated function (including group by clause, having clause)
  6. Added constant operator for literals in select list
  7. Added set operator for any set operation such as UNION, MINUS etc
  8. Added sorter for any ordering defined
  9. Inserted a target table if an INSERT clause is used
This demo works nicely for INSERTs and SELECT statements, it breaks down on the richness of the SQL parsing support (I used one that I googled that had a nice API and I could quickly build a demonstration from). It would be better to have rich support for all the Oracle specifics including specific grammar, hints etc. Later I'll cover DELETEs and UPDATEs.

Taking this Further
OK, taking this further it would be great to then click on a view or materialized view and construct the map from the view/materialized view query - then you can get really rich lineage/impact analysis information.

Map Builder 4:

This example gets fairly complex data warehouse SQL reverse engineered into an OWB mapping all with an effortless copy/paste;

Map Builder 5:

The SQL happens to be straight from the Oracle Data Warehousing guide (Example 20-8 GROUPING combined with HAVING), I made it slightly more complex by union the query with itself to illustrate the union operator in the map (just for fun);

SELECT channel_desc, calendar_month_desc, country_iso_code,
SUM(amount_sold), '9,999,999,999') SALES$,
       GROUPING(channel_desc) CH,
(calendar_month_desc) MO,
       GROUPING(country_iso_code) CO
FROM sales, customers, times, channels, countries

WHERE sales.time_id=times.time_id AND sales.cust_id=customers.cust_id
AND customers.country_id = countries.country_id
AND sales.channel_id= channels.channel_id
AND channels.channel_desc IN ('Direct Sales', 'Internet')
AND times.calendar_month_desc IN ('2000-09', '2000-10')
AND country_iso_code IN ('GB', 'US')
GROUP BY CUBE(channel_desc, calendar_month_desc, country_iso_code)
HAVING (GROUPING(channel_desc)=1 AND GROUPING(calendar_month_desc)= 1
AND GROUPING(country_iso_code)=1) OR (GROUPING(channel_desc)=1
AND GROUPING (calendar_month_desc)= 1) OR (GROUPING(country_iso_code)=1
AND GROUPING(calendar_month_desc)= 1);

Constructing this would have involved quite a number of mouse clicks I bet. As you see this demos really nicely for a few examples, even as an accelerator for building a percentage of the map it is quite nice ie. quickly get a bunch of operators on the mapping canvas with basic joiners, sorters etc..

Deletes and Updates
I guess this is where it gets tough. How can something so simple in SQL be tricky for me to understand in OWB? Well with any tool it will have a few nuances, a few rough edges that you can forgive for a while. How can the following queries be so tricky to model;


2. DELETE FROM TGTAB where somedate is null;

3. DELETE FROM TGTAB where somedate < '01-JAN-1995';

This partly falls down to modeling for the complex scenario and not making some simple ones simple. Back to generating the maps, you can  imagine parsing something like this and generating a map becomes a little trickier task.

To build the OWB mapping for the above involves a little more work than you might think, and there are a few ways to do it, here is the SQL for (3) implemented in 2 ways with OWB;

Delete Map 1:

As well as the table operator which has the DELETE loading type, you will need the same table as a source. There are 2 options to the SQL where clause for the date, you can either include a FILTER operator between the tables or define the filter (option 2 above) on the target table using the 'Target Filter for Delete' property (option 1 above). You must also either select match by constraints or set the match column when delete property for each column to be matched. The other SQL statements above are variations on the above either with different filters or no filter at all (1).

The UPDATE SQL statement is another one similar in style to DELETE because of the matching columns.

1. UPDATE TGTAB set somestatus=NULL;
2. UPDATE employees
SET salary = salary * 1.1
WHERE department_id = 100

Add the table also as a source, and ensure that the match column when updating and load column when updating properties are set appropriately.  In this example I made sure SALARY was match when updating was false and load when updating was true. For EMPLOYEE_ID I set as match when updating to true and load when updating to false.

Update Map 1:

Even with some of these simple DELETE/UPDATE statements there are some nuances, so you can imagine the complexity when you get some complex scenarios, generation of the map from SQL needs some more thought. The earlier map generations were a good fit for the OWB mapping paradigm, and were easy to build in a generator, the UPDATE/DELETE statements need some more thought, but there is hope. The table operator when used for update/delete has a lot of properties from the table operator down to each attribute - all of which must be tweaked to ensure you get what you want. These kind of issues will be resolved in OWB to make the
SQL to OWB transition much simpler. When this happens it will not only be simpler to construct such maps but there will be a much simpler model for generating the maps.

In Summary
For some statements at times manual construction of the map is a fine art, so automation via a generator is a little tricky, tricky that is until we resolve some of the ease of use issues, which will make the generator a reality for more than INSERT and SELECT statements.. I thought I'd share some of these ideas anyway and see what people thought.


Very interesting article!! We started to implement some tcl scripts to create mappings from views and I am very interested in the SQL parser you used. Can you send me some more information on that? Thanks and regards, Maren

Posted by Maren Eschermann on July 01, 2007 at 07:09 AM PDT #

The demo looks really interesting. Is this this based on shared experts which we can download to play with, or is this still in development?

Posted by Robbert Michel on July 05, 2007 at 10:34 AM PDT #

It was developed using experts, it was a back burner activity that still needs some work.

Posted by David Allan on July 05, 2007 at 12:48 PM PDT #

Very handy tool!! I hope this feature will be bundled in the OWB next release.. In the mean time, could you please share us your lovely work.. :D Thanks & regards, iqbalm@inco.com

Posted by iqbal on July 13, 2007 at 07:51 PM PDT #

We will post some more here, however due to vacations I think it will towards the end of the month.

Posted by jean-pierre dijcks on July 18, 2007 at 08:11 AM PDT #

Any update? Regards, Iqbal

Posted by iqbal on August 12, 2007 at 07:12 PM PDT #

I've added a new post (part 2, following on from this) with the code, some notes and examples.

Posted by David Allan on August 14, 2007 at 11:08 AM PDT #

sounds excellent, I am trying to use the demo struggling with this part : 4. In the main tree add the expert as a shortcut to a map. Right click map, select Add/Remove experts enable check box for build map from SQL, click OK. The expert should now appear on the menu.

Posted by Mark on April 15, 2008 at 10:45 PM PDT #

I think this option is only available to repository administrators/owners. So if you log in as a regular user you will not see the add/remove but if the expert is added by the owner, you will see it and be able to execute it as a user. So login as the owner add the expert to the tree.

Posted by David Allan on April 18, 2008 at 05:22 AM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed

ETL, CDC, Real-Time DI and Data Quality for the Oracle Database from the inside.


« June 2016