OWB 11gR2 – MySQL Bulk Extract
By David Allan on Jan 10, 2010
Next on the MySQL series, let's look at bulk extract to file from MySQL. We'll see how the OWB 11gR2 out of the box functionality can be enhanced to utilize some native MySQL capabilities and boost the performance! All with some baby steps into the code template world...
To make it a little more interesting we will create multiple files from a single map, one with the high bonuses and another with the low bonuses.
The mapping will have 2 execution units and we will utilize the SQL to File integration code template which is shipped with OWB 11gR2. Look at the mapping below, operators can appear in multiple execution units, you see the EMP_BONUS table is used multiple times and that is reflected with the angle brackets < EMP_BONUS >
Deploying and executing the mapping we see the elapsed timing, 23 seconds for the writing of the files.
As mentioned, this is using the unload to file code template which is shipped with OWB 11gR2, essentially a JDBC program writing to disk. A quick check of the MySQL reference manuals you can find that there is a SELECT <stuff> INTO OUTFILE <filename> command in MySQL. This is where writing custom code templates really kicks in! Let's see.....
We can easily add a new template that uses the 'SELECT ... INTO OUTFILE ...' for any map we design from MySQL to file! Creating a new integration template with MySQL as the source and File as the target containing a single step is simple. We essentially copy-paste the SQL part of the template from another mapping, I have used Jython as the task type to do the job (I tried JDBC at first but the MySQL JDBC driver choked when it executed SELECT...INTO OUTFILE using the executeUpdate API...luckily we can workaround these kinds of challenges).
Now we can assign the MySQL to File code template to each execution unit and compare the performance.
Executing this is substantially faster!
In the audit information panel we can see the elapsed time is less than 1 second! Also note the SQL that was generated and executed.
So you can see from these baby steps into building code templates its very flexible and powerful too. This has endless possibilities for further optimizations; for example we can include options for the delimiters etc for the file within the code template (all possible with the MySQL unload) and also rather than just unloading to file, we can optimize for other systems such as Oracle and create Load Code Templates that extract in bulk from MySQL (or whatever) transfer to the Oracle system and create an external table as the staging table. The framework for this pattern once constructed means for each system added thereafter that the work to add is very small.
Hopefully this gives you a taster for the capabilities of the bulk extract capabilities using MySQL as an illustration.