
Welcome to All Things Warehouse Builder

ODI 11g – Faster Files

David Allan
Architect

Deep in the trenches of ODI development, I raised my head above the parapet to read a few odds and ends and found myself thinking: why don't people know about this? Take this article here: in the past, customers (see forum) were told to use a staging route, which has a big overhead for large files. This KM is an example of the great extensibility capabilities of ODI. It's quite simple, just a new KM that:


  1. improves the out of the box experience – just build the mapping and the appropriate KM is used
  2. improves out of the box performance for file to file data movement.

This improvement to the out of the box handling of File to File data integration cases (from the 11.1.1.5.2 companion CD onward) dramatically speeds up file integration. In the past I had seen some consultants write Perl versions of the file to file integration case; now Oracle ships this KM to fill the gap. You can find the documentation for the IKM here. The KM uses pure Java to perform the integration, using java.io classes to read and write the file in a pipe. It uses Java threading to super-charge the file processing, and can process several source files at once when the datastore's resource name contains a wildcard. This is a big step for regular file processing on the way to super-charging big data files using Hadoop, and the KM works with the lightweight agent and regular filesystems.
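To make the idea concrete, here is a minimal sketch of streaming several source files to target files with java.io and a fixed pool of worker threads. It is not the code the KM generates: the class name FileToFileSketch, the upper-casing transform, the ".out" target naming, and the one-task-per-source-file split are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FileToFileSketch {

    // Hypothetical transformation (an assumption): upper-case each line.
    static String transform(String line) {
        return line.toUpperCase();
    }

    // Stream one source file to one target file, line by line,
    // so the whole file never has to fit in memory.
    static long copyTransformed(File source, File target) throws IOException {
        long rows = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(source));
             BufferedWriter out = new BufferedWriter(new FileWriter(target))) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(transform(line));
                out.newLine();
                rows++;
            }
        }
        return rows;
    }

    // Submit one task per source file to a fixed-size thread pool
    // and return the total number of rows written.
    static long processAll(List<File> sources, File targetDir, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (File source : sources) {
                File target = new File(targetDir, source.getName() + ".out");
                results.add(pool.submit(() -> copyTransformed(source, target)));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // rethrows any failure from the worker
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("usage: FileToFileSketch <threads> <targetDir> <source>...");
            return;
        }
        int threads = Integer.parseInt(args[0]);
        File targetDir = new File(args[1]);
        List<File> sources = new ArrayList<>();
        for (int i = 2; i < args.length; i++) {
            sources.add(new File(args[i]));
        }
        System.out.println(processAll(sources, targetDir, threads) + " rows written");
    }
}
```

With this layout the thread count only helps when there are at least as many source files as threads; how the real KM divides work between its threads is not covered here.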

In my design below, which transforms a bunch of files, the IKM File to File (Java) knowledge module was assigned by default. I pointed the KM at my JDK (since the KM generates and compiles Java) and increased the thread count to 2 to take advantage of my 2 processors.

For my illustration I transformed (you can also filter if desired) and moved about 1.3 GB with 2 threads in 140 seconds; with a single thread it took 220 seconds, so the second thread made the run roughly 1.6 times faster. This was by no means on a supercomputer, by the way. The great thing here is that it worked well out of the box, from design to execution, without any funky configuration, and, a big plus, it was much faster than before.

So if you are doing any file to file transformations, check it out!


Comments ( 16 )
  • David Monday, June 25, 2012

    There is a bug (13646250) in this IKM on Windows: the single quotes wrapping the javac command cause problems in the Compile Program task and the Execute Program task. Removing the quotes from both tasks makes it work for me; you can change the IKM yourself and try.

    Change the Compile Program task from ...

    OdiOSCommand "-COMMAND='<%= odiRef.getOption("JAVA_HOME") %>/bin/javac' <%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>.java" "-ERR_FILE=<%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>_err.txt" "-OUT_FILE=<%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>_out.txt"

    to

    OdiOSCommand "-COMMAND=<%= odiRef.getOption("JAVA_HOME") %>/bin/javac <%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>.java" "-ERR_FILE=<%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>_err.txt" "-OUT_FILE=<%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>_out.txt"

    Change the Execute Program task from....

    OdiOSCommand "-COMMAND='<%= odiRef.getOption("JAVA_HOME") %>/bin/java' -cp <%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %> <?= getOdiClassName() ?>" "-ERR_FILE=<%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>_err.txt" "-OUT_FILE=<%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>_out.txt"

    to

    OdiOSCommand "-COMMAND=<%= odiRef.getOption("JAVA_HOME") %>/bin/java -cp <%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %> <?= getOdiClassName() ?>" "-ERR_FILE=<%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>_err.txt" "-OUT_FILE=<%= odiRef.getSrcTablesList("", "[WORK_SCHEMA]", "", "") %>/<?= getOdiClassName() ?>_out.txt"

    Cheers

    David


  • Ashok Thursday, November 22, 2012

    Hi David,

    Can you provide the link to download the IKM File to File (Java)?

    Thanks.

    Regards

    Ashok


  • guest Friday, November 23, 2012

    Hi David,

    When I am using this KM, I am getting one error in the log file

    Column C1 : is mandatory

    What could be the reason behind this? All the data is going to the bad file.

    Please suggest.

    Thanks


  • David Monday, November 26, 2012

    Have you specified data for all mandatory target columns? It sounds like you haven't.

    Cheers

    David


  • guest Monday, November 26, 2012

    Yes, I have mappings for all target columns; it's a one-to-one mapping. Setting a key column on the target side of the interface makes no difference, and I get the same error. I reversed two text files with a pipe delimiter, did a one-to-one mapping, and selected the IKM File to File (Java), but no luck so far.

    When I hard-code a value such as "11" in the target column, I can see a few records loaded, but in a different format.

    Input file

    C1|C2|C3|C4

    1412|tom|333|4455

    2414|alex|333|4455

    103|aaac|33|4435

    4525|aaad|33|4415

    5525|aaae|333|445

    6525|aaaf|333|55

    7|aaag|333|4955

    8|aaah|33|4625

    9|aaai|33|325

    10|aaaj|336|4545

    11|aaak|334|65

    output file has blank rows

    output log file contains

    Oracle Data Integrator * File to File:

    Copyright (c) Oracle Corporation. All rights reserved.

    Number of threads: 1

    Discradmax: 1

    OutputFile:

    D:/FF/fjava_tgt/fjava_tgt.txt

    BAD file:

    D:/FF/fjava_tgt/fjava_tgt.txt.bad

    Pattern: D:\\FF/fjava1\.txt

    Input file:

    D:\FF\fjava1.txt

    Error line: 2

    Column C1 : is mandatory

    Maximum number of errors reached

    Number of lines read for this file:

    2

    *************************************************************************

    2 Rows successfully read.

    1 Rows skipped (Header).

    0 Rows successfully loaded.

    ==>0 Rows loaded with warning.

    1 Rows not loaded due to data errors.

    0 Rows not loaded because of filter.

    Run began on Fri Nov 23 18:39:56 IST 2012

    Run ended on Fri Nov 23 18:39:56 IST 2012

    Elapsed time was:

    140 milliseconde


  • David Monday, November 26, 2012

    I found the problem: the Java code tokenizes the string using the split function, which actually takes a regular expression, and pipe (|) is a special regular expression character.

    To fix it, escape the pipe character: change the delimiter for your source file to \|

    Cheers

    David
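The behavior described in the comment above can be reproduced with plain String.split, outside ODI. The following is a small sketch, not the KM's generated code; the class name PipeSplitDemo and its helper methods are made up for illustration, and the sample line is taken from the input file posted earlier. Pattern.quote is shown as an alternative way to treat any delimiter literally.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PipeSplitDemo {

    // Unescaped: "|" is the regex alternation operator, so the pattern
    // matches the empty string and the line is cut between characters
    // instead of at the pipes.
    static String[] splitNaive(String line) {
        return line.split("|");
    }

    // Escaped: the regex \| (written "\\|" in Java source) matches a literal pipe.
    static String[] splitEscaped(String line) {
        return line.split("\\|");
    }

    // Pattern.quote turns any delimiter string into a literal pattern.
    static String[] splitQuoted(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String line = "1412|tom|333|4455";
        System.out.println(Arrays.toString(splitNaive(line)));
        System.out.println(Arrays.toString(splitEscaped(line)));
        System.out.println(Arrays.toString(splitQuoted(line, "|")));
    }
}
```

The escaped and quoted variants both return the four fields 1412, tom, 333 and 4455. The exact tokens from the unescaped call vary between Java versions, but in no case are they the four columns.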


  • David Monday, November 26, 2012

    I raised a bug to track this...15918659


  • Ashok Tuesday, November 27, 2012

    Thanks David for the information.


  • Pavan KUmar Thursday, February 21, 2013

    Hi David,

    can you please provide me the link for downloading this IKM file to file (java) as the same scenario came to me.

    Kindly do the needful..

    Many Thanks,

    Pavan Kumar


  • David Thursday, February 21, 2013

    Hi Pavan

    You can get it from the ODI companion CD from OTN.

    Cheers

    David


  • guest Tuesday, October 27, 2015

    Hi David,

    The article says that filters can be applied, but whenever we apply one it does not work. A physical filter is not even accepted in the mapping. Using Java code such as equals() does not work either.

    Did you try applying filters with this KM?

    Thanks,

    Chaitanya


  • David Tuesday, October 27, 2015

    I may have changed the File technology to allow filters. If you are using an 11g ODI that may be the issue you are hitting.

    Cheers

    David


  • Chaitanya Tuesday, October 27, 2015

    Could you please elaborate on that. We have both 11g and 12c but could not apply any filters.

    Thanks,

    Chaitanya


  • David Tuesday, October 27, 2015

    Hi Chaitanya

    If you edit the File technology you will probably find that the WHERE capability is not enabled; that's my guess at what your problem is. Enabling it would at least allow you to define the interface/mapping.

    Cheers

    David


  • Chaitanya Tuesday, October 27, 2015

    Thanks David.

    Worked like a charm after enabling that.

    Regards,

    Chaitanya


  • David Tuesday, October 27, 2015

    That's great, thanks for the feedback.

    Cheers

    David

