Process Flow - Execute a map for all files in a directory

How to call a map for every file in a directory? Flat file maps have the data file name as a parameter. The map can be called from a flow and the data file name set, the map will be wrapped in a while loop that iterates over every file in a directory. Here is an illustration, it uses the functions getfilecount and getnthfile from the post here.

The flow looks like you'd expect for a basic programming loop;

  • an initialization block to get the number of files in a directory, I have the directory as a parameter to the process flow too. The initialization is the GETFILECOUNT activity which uses the function from the blog post linked above.
  • the while statement with the condition (iterate for every file in directory using the index)
    • get the nth file in the directory using the function GETNTHFILE
    • execute the map (LOAD_F2) passing the fully qualified name of the file
    • increment the index
  • end loop
The process flow looks like this;

Process All Files:

The process flow uses the following variables and parameters;

Process All Files Parameters:

In the above image DIR_PATH is a process flow parameter, so the flow can be executed and directory specified upon execution. There are a few variables; one for holding an index for the loop (FILE_INDEX), one for holding the file name returned from getnthfile (FILE_NAME) and another variable for the number of files (NO_OF_FILES).

The while condition for the loop is defined as loop while index is less than the number of files;

Process All Files While:

The increment index (ASSIGN) activity inside the while loop looks like

Process All Files Increment:

This lets us easily execute a single map many times with a group of files. Its also possible to separate the control file/log file location so they can be written to one directory and source data files processed from another (so getnthfile will not process control files etc..).


I have an OMB script that builds the pattern described above, I'll tidy this up and share. This will just let you click on a map and build the flow components based upon the selected map and some input (module/process flow/directory).

Posted by David Allan on July 05, 2007 at 12:43 PM PDT #

See post$160 for details on the expert.

Posted by David Allan on July 06, 2007 at 06:59 AM PDT #

Following the same logic, what's needed to get parameters from column's cursor/view/table ??? declare -- Look for parameters for mapping LOAD_SALES cursor log is select period, store from log where status = 'OK'; begin for rec in cursor loop -- Start mapping for each record load_sales(rec.period,; end loop; end;

Posted by Antonio Akiyama on September 18, 2008 at 01:51 AM PDT #

Dave, I just tried the link for the OMB+ example you mentioned and was unable to get to this. Can you help me out?

Posted by Paul on September 23, 2008 at 10:56 PM PDT #

Hi, Do you have any idea how to modify this to use the post from Tom Kyte a couple of years ago: This basically stores the dirlist in a table , and thus allows you to do sorting the way you want. So that would mean that the flow had to be adapted to use the rows returned from a select as input to the while loop. Can this be done ?

Posted by Michael Reitsma on January 13, 2011 at 04:58 AM PST #

Hi Michael You could follow the same principles as in the post; But rather than the getfilecount and getnthfile being written in java they would be PLSQL routines that query the table populated from code like in Tom's post, the get nth would get nth based on the sorted data. Cheers David

Posted by David Allan on January 13, 2011 at 05:45 AM PST #

Another option to this dynamic nature is to use the external table preprocessor which can basically do whatever you want and stream the files on the standard output stream for the external table processor. See this post for some info... Cheers David

Posted by David Allan on January 13, 2011 at 05:48 AM PST #

hi ,I cant see any pictures... can you upload them back? kinda hard to get it without any pictures..

Posted by guest on May 09, 2011 at 05:04 PM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed

ETL, CDC, Real-Time DI and Data Quality for the Oracle Database from the inside.


« April 2014