Event Based Execution - File Arrival Example

If you want to do truly event based execution based on arrival of a file there is a convenient package supplied with the Oracle database scheduler component that can be used, the package can be downloaded from OTN here.

For example if you have an OWB mapping that uses an external table as a source and you wish to execute the mapping when a file arrives on the filesystem (that the external table is dependent on), then you can do this a number of ways, here we will illustrate how the database scheduler is used to execute the map.

After the package otn_sched_file_watch has been installed from the Oracle scheduler download described earlier we can use it to execute any job we have privilege to. For example given the following mapping;

owb_event1.JPG

which uses the file in STAGING_DIR named geog.csv;

owb_event2.JPG

If we wanted to execute the mapping when geog.csv arrives on the file system then we can create a file watch schedule and then create a job, this is described in the readme file for the file watch package. For an 10gR2 mapping this would be like the following (11g would also include the set_workspace code);

begin
otn_sched_file_watch.create_file_watch_schedule('GEOGFILES','STAGING_DIR','geog.csv',FALSE);
dbms_scheduler.create_job(job_name=>'FA_JOB',
schedule_name=>'GEOGFILES',
job_action=>'declare v varchar2(4000);r varchar2(4000); begin load_geog0.main(v,''false'',r); end;',
job_type=>'PLSQL_BLOCK',
auto_drop=>false,
enabled=>true);
end;
/

The mapping is a PLSQL implemented mapping so we can use the MAIN procedure to execute the mapping (as defined in the job_action parameter), we could also have used the OWB runtime APIs to execute the map or any process flow for example based upon events like this. The action in this case is a PLSQL block as defiend by the job_type parameter. We use the schedule GEOGFILES in the job creation which was defined using the api otn_sched_file_watch.create_file_watch_schedule

The file wait package will poll every 5 minutes for a while, as defined in the documentation the dba can change the setting using s function (otn_sched_file_watch_int.set_file_watch_intv) or each job owner can alter the frequency interval using dbms_scheduler.set_attribute using attribute repeat_interval.
for example
exec dbms_scheduler.set_attribute('OTN_SCHED$_FILE_WATCH_JOB', 'REPEAT_INTERVAL', 'FREQ=MINUTELY;INTERVAL=1');

will change the repeat interval to every minute, a bit too frequent but ok for demos and getting the environment setup. See the DBMS_SCHEDULER documentation for the syntax for this property.

As you can see this is fairly straightforward to leverage and allows you to move data upon events rather than time based schedules. There are also some ftp based routines that are worth checking out in dbms_scheduler, these could be embedded as pre-post mapping triggers for example all leveraging the components of the Oracle database.


Comments:

Hi! Is there any way, we can make sure that the file is fully uploaded to our server ? For example, if the file is huge (5GB) and the package polls the directory while the file is being uploaded, will the job start executing ? If that`s the case then how can we make sure that we didn`t miss any data (particularly for files without a trailer) ? Thanks... In response: ========== The job will only be activated when the timestamp has been modified and the file size has changed. So you will not get multiple jobs launched to process the same file. Cheers David

Posted by user on September 09, 2008 at 02:34 AM PDT #

Blogs for the Oracle database 11gR2 feature for initiating jobs upon file arrival;
http://awads.net/wp/2011/03/29/did-you-know-about-file-watchers/

Cheers
David

Posted by David on March 20, 2013 at 02:27 PM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

ETL, CDC, Real-Time DI and Data Quality for the Oracle Database from the inside.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today