File staging using external table preprocessor
By David Allan-Oracle on Jun 10, 2009
Here is a quick write up of how to take advantage of the external table preprocessor feature which was squeezed into the 220.127.116.11 and 10.2.0.5 database patch sets from within OWB. OWB has rich metadata support for file design and its manipulation from the file metadata itself to the external table definition, SQL*Loader mappings and PLSQL support, it also provides a flexible approach (more like tips and tricks) for incorporating aspects of the database that are not directly designed in the UI, such objects include views, unbound operators in mapping etc. For external table, the access specification can be hand-crafted rather than generated from the metadata or imported from the data dictionary. It is this flexibility that allows you to incorporate the preprocessor directive into external table designs within production OWB releases.
For details of the preprocessor directive see the article 'Using the Preprocessor Feature with External Tables in Oracle Database 11g' on OTN.
The example we will illustrate here is using a compressed data file as the data file source for the external table. The external table represents customer details and is defined in a database schema. We will setup the metadata location for the Oracle module and select 'Import' (external tables can also be designed, you do not need to base it on a file defined in OWB, if you hit 'Finish' on the file selection page of the wizard you will create an 'unbound' external table);
From the import wizard we can select the external table to be imported;
When the table has been imported we get the signature of the table with all of the columns and data types defined;
In the access specification section we see the PREPROCESSOR directive that our external table was defined with also imported. The script gunzipdb is used to encapsulate the unzipping and write the resultant file to the standard output stream to be consumed by the external table access driver. As mentioned from the above post although commonly used for compressed files it could perform any manipulation on the file in preparation for it being consumed by the external table.
The gunzipdb is just a shell script I created to wrapper the options to the gzip executable, it performs the unzip using the -d and -c options (on Windows I included the echo off command, also had much better success with large compressed files on Linux than on Windows).
We can inspect the external tables configuration parameters and customize the external table's treatment for missing values, trimming etc.
We can also see the data file imported with the external table definition, note it is a gzip'd file and the SQL directory location can be set. We can also add additional data files if we wanted to process many files.
We can now either just use this external table in a mapping or generate and deploy the external table definition off to another schema if desired.
That's a quick run through of using the external table preprocessor capabilities from within OWB, if there are other interesting use-cases it'd be interesting to get some feedback.