Parallel Unload to File

Here we see how to construct a parallel unload mapping, leveraging the Oracle Database parallel table function capabilities to write the results from a PL/SQL ref cursor to a file. The example also illustrates how table function arguments are constructed within OWB. The format of the file is determined by the user in the mapping, via an expression defining the data to be written.

The example is a simple table-to-file scenario, just to illustrate the basic mechanics; the query/operator set can obviously be more complex. Below we see an example writing the results of a SALES table to a file. The target is a PL/SQL table function with a ref cursor parameter; the ref cursor produces the records to be written to the file (the file name and directory name are embedded in the function definition).

[Image: parallel_unload8]

The record itself is a string formed by concatenating the columns. Below, the fields are delimited by commas; if you want specific locale formats, or fields wrapped in quotes for example, you can do that here.
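As an illustration, the record expression could look like the following; the SALES column names here are hypothetical, and the TO_CHAR calls show where you would control locale formats:

PROD_ID || ',' || CUST_ID || ',' ||
TO_CHAR(TIME_ID, 'YYYY-MM-DD') || ',' ||
TO_CHAR(AMOUNT_SOLD, '999999.99')

Wrapping a field in quotes is just more concatenation, for example '"' || PROD_NAME || '"'.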

[Image: parallel_unload2]

The table function operator itself has details of the underlying PL/SQL function used, and an indicator that it is being used as a target in the mapping (you must set this; it is not inferred automatically).

[Image: parallel_unload3]

Now you see that the table function operator has two groups: one input group and one output group. There is a property that defines the type of each group, and if you have a mixture of ref cursor inputs and scalars you will need more than one input group. Here I have one group (INGRP1) for the ref cursor that produces the record to be written; it is defined with a type of ref cursor, so the operators sourcing the attributes in this group will be passed to the function via a PL/SQL ref cursor.

[Image: parallel_unload9]

The output group simply returns a table of scalars, so I made sure the group has that property set.

[Image: parallel_unload6]

That's about it. When we generate the code we see a call to the table function ParallelUnload wrapped in a TABLE call, with the parameter passed in as a ref cursor containing the query representing the operators sourcing the attribute (the 'select blah from SALES' statement). Note also the PARALLEL hint, which I configured on the SALES table with 5 as the number of concurrent servers.

[Image: parallel_unload10]
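For reference, the generated statement is along these lines (the projected expression is illustrative, not the exact OWB output):

select *
from table(ParallelUnload(
       cursor(select /*+ PARALLEL(SALES, 5) */
                     PROD_ID || ',' || CUST_ID || ',' || TO_CHAR(AMOUNT_SOLD)
              from   SALES)));

Each row returned by the TABLE call is the count of records one parallel server wrote to its file.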

Running this performs the parallel unload to file. The function definition is as follows:

create or replace type NumSet as table of number;

/

create or replace package RefCur_Unload as
    type Line is record (vc varchar2(32767));
    type MultiLine_cur is ref cursor return Line;
end;

/

create or replace function ParallelUnload
       (r RefCur_Unload.MultiLine_cur) return NumSet
   PIPELINED PARALLEL_ENABLE (PARTITION r BY ANY) is
   i binary_integer := 0;
   rec2 varchar2(32767);
   fh utl_file.file_type;  -- renamed from "out"; OUT is a PL/SQL keyword, best avoided as an identifier
   filename varchar2(256) := 'dbunload';
   directoryname varchar2(256) := 'MY_DIR';
   vsid varchar2(120);
begin
   -- suffix the file name with the session id so each parallel server writes its own file
   select sid into vsid from v$mystat where rownum = 1;
   filename := filename || vsid || '.dat';
   fh := utl_file.fopen(directoryname, filename, 'w');

   loop
     fetch r into rec2;
     exit when r%notfound;

     -- write the pre-formatted record and count it
     utl_file.put_line(fh, rec2);
     i := i + 1;

   end loop;

   close r;

   utl_file.fclose(fh);
   -- return the number of rows this server wrote
   PIPE ROW(i);
   return;

end;
/

The function uses the PIPELINED and PARALLEL_ENABLE keywords to declare its table function and parallel capabilities; the Oracle Database Data Warehousing Guide covers this in the 'Loading and Transformation' chapter if you want to dive into more detail. Each parallel server executing the function writes its own data file, so the files will be named dbunload&lt;sessionid&gt;.dat. There was an earlier post on table functions (here) as source and target operators which is also worth checking out; they are a useful way of adding custom transformation code into the mapping design, all leveraging Oracle PL/SQL.
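Note that the function writes via UTL_FILE, so the directory object it names must exist and be writable by the executing user before the mapping runs; a setup sketch (the path and grantee are placeholders, not from the example):

-- create the directory object referenced as MY_DIR in the function
create directory MY_DIR as '/u01/app/unload';
grant read, write on directory MY_DIR to scott;

A quick sanity check after a run is to sum the NumSet values returned by the TABLE call, which gives the total rows unloaded across all parallel servers.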

Comments:

Hi David, One more remark. A few years ago I've implemented a similar "universal" pl/sql procedure for file unloading (though not parallel but in serial). I came up to similar approach - fetching from single-column varchar2 cursor. (1) Well, one day I had to generate a file from quite wide table (800 columns, many of them - quite long). So as you can guess - some limits have been reached: a query with so long concatenation was not supported... Finally I've implemented unloading this file in Java (concatenating the multiple column values has been performed dynamically). However that worked significantly slower than the pl/sql implementation. (2) Even earlier I had fights with another limitation: with utl_file buffers when writing too long lines. As far as I remember - working with utl_file.put_raw instead of put_line helped in that case. Regards, Yavor

Posted by Yavor Nikolov on August 29, 2009 at 04:53 PM PDT #

Hi, it works. But if you execute the mappings in parallel you get the error ORA-01422. The reason is "SELECT 1 INTO owb_temp_variable1 ...". In a parallel execution eg parallel 4 the statment returns 4 rows. How do you solve this using owb? Maybe this is a bug or feature in owb. Regards, Bert.

Posted by Bert on March 14, 2010 at 08:15 PM PDT #
