A table function is a function you can use like a table in the FROM clause of a SELECT statement. A common usage of table functions is to stream data directly from one process or transformation to the next process without intermediate staging, and a table function used in this way is called a streaming table function. As you might be able to tell from the reference to transformation above, this technique is most often used in data warehouses as part of extract, transform, and load (ETL) operations.
My previous article, “When Is a Function Like a Table? When It’s a Table Function!,” includes an overview of table functions. In this article, I will show you the basic steps needed to create a streaming table function, but before diving into the details, let’s look at a streaming table function example:
INSERT INTO tickers SELECT * FROM TABLE (doubled (CURSOR (SELECT * FROM stocks)))
What’s going on here? Let’s take it step by step from the inside out:
|SELECT * FROM stocks||Gets all the rows from the stocks table|
|CURSOR ()||Creates a cursor variable with the CURSOR expression that points to the result set|
|()||Passes that cursor variable to the doubled table function|
|doubled ()||Performs its transformation and returns a nested table of object type instances|
|SELECT * FROM TABLE(...)||Converts the collection into a relational set of rows and columns|
|INSERT INTO tickers||Inserts those rows into the tickers table|
Sometimes (often?) you will need to perform more than one transformation as part of the streaming process. No problem. You can certainly string together multiple invocations of table functions:
INSERT INTO tickers SELECT * FROM TABLE (transform2 ( CURSOR (SELECT * FROM TABLE (transform1 ( CURSOR (SELECT * FROM stocks ))))))
Setting Up Tables for Transformation
To transform data from one table to another, you need tables and the data in those tables. In this article, I will start with the stocks table, each row of which contains the opening and closing trading prices for each stock ticker symbol:
CREATE TABLE stocks ( ticker VARCHAR2 (20), trade_date DATE, opening_price NUMBER, closing_price NUMBER) /
My transformation is simple: for each row in the stocks table, generate two rows for the tickers table (one row each for the opening and closing prices):
CREATE TABLE tickers ( ticker VARCHAR2 (20), pricedate DATE, pricetype VARCHAR2 (1), price NUMBER ) /
Before continuing, I feel obligated to point out that for this particular transformation (one row in stocks to two rows in tickers), you don’t need a table function to get the job done. For example, you can use INSERT ALL to insert into tickers twice:
INSERT ALL INTO tickers (ticker, pricedate, pricetype, price) VALUES (ticker, trade_date, ‹O›, opening_price) INTO tickers (ticker, pricedate, pricetype, price) VALUES (ticker, trade_date, ‹C›, closing_price) SELECT * FROM stocks /
You could also use UNPIVOT (thanks, Chris Saxon @chrisrsaxon, for making me aware of this technique!):
INSERT INTO tickers (ticker, pricedate, pricetype, price) SELECT * FROM stocks UNPIVOT (price FOR price_type IN (opening_price AS 'O', closing_price AS 'C')) /
SQL is an extraordinarily powerful language. It is quite likely that the transformation you would like to perform is doable in pure SQL. And if you can avoid the use of a table function, implementing your requirement in SQL instead, then you should by all means do so. To strengthen your SQL skills, take advantage of workouts, classes, and quizzes at the Oracle Dev Gym, especially the SQL analytics class by Connor McDonald.
If, however, the transformation requires the use of procedural logic (hence, PL/SQL) or if you can’t sort through the SQL syntax, table functions offer a powerful, straightforward way to get the job done.
For the purposes of this article, assume that the transformation is much more complex and requires use of a table function.
Types and Package for Table Functions
As shown in my previous article on table functions, when you need a table function to return more than one piece of data in each collection element (for example, more than just a list of strings or numbers), you need to create an object type and a collection of those object types.
In this article’s example, I want to move stock data from the stocks table to the tickers table, so I need an object type that “looks like” the tickers table.
Ideally, I would create a collection type like this:
CREATE TYPE tickers_nt AS TABLE OF tickers%ROWTYPE; /
But %ROWTYPE is a PL/SQL declaration attribute and is not known to the SQL engine, so this statement fails with
PLS-00329: schema-level type has illegal reference to TICKERS
Instead, I create an object type that mimics the structure of my table, as follows:
CREATE TYPE ticker_ot AUTHID DEFINER IS OBJECT ( ticker VARCHAR2 (20), pricedate DATE, pricetype VARCHAR2 (1), price NUMBER ); /
I then create a nested table of those object types:
CREATE TYPE tickers_nt AS TABLE OF ticker_ot; /
I plan to use this table function in a streaming process. This means that I will be passing in a set of data (rows and columns) from SQL. To do this, I will also need to define a strong REF CURSOR type that will be used as the data type of the parameter accepting the dataset inside the SQL statement. In the package specification below, I create two REF CURSOR types, one for rows from the stocks table and another for the tickers table.
CREATE OR REPLACE PACKAGE stock_mgr AUTHID DEFINER IS TYPE stocks_rc IS REF CURSOR RETURN stocks%ROWTYPE; TYPE tickers_rc IS REF CURSOR RETURN tickers%ROWTYPE; END stock_mgr; /
The variable you declare based on a REF CURSOR type is a cursor variable. Within PL/SQL, you might write code like this:
DECLARE l_cursor stock_mgr.stocks_rc; l_stock stocks%ROWTYPE; BEGIN /* With a static SQL statement */ OPEN l_cursor FOR SELECT * FROM stocks; LOOP FETCH l_cursor INTO l_stock; EXIT WHEN l_cursor%NOTFOUND; END LOOP; CLOSE l_cursor; /* Or with a dynamic SQL statement */ OPEN l_cursor FOR 'select * from stocks'; LOOP FETCH l_cursor INTO l_stock; EXIT WHEN l_cursor%NOTFOUND; END LOOP; CLOSE l_cursor; END; /
Note that you can use all the usual cursor attributes and operations on cursor variables: FETCH, %FOUND, CLOSE, and so on.
As you will see later, the way you use this REF CURSOR type will be a bit different for streaming table functions in the SQL context.
Define the Table Function
The main distinction between streaming table functions and “normal” table functions such as those addressed in the previous article is that at least one parameter to that streaming table function is a cursor variable. The table function could have more than one cursor variable input and other parameters of other types such as a string or date. In this article, I will stick with the minimum: a single cursor variable parameter.
Generally, the flow within a streaming table function is
Now let’s see how this pattern unfolds in my doubled function—one stocks row doubled to two ticker rows—and the line descriptions that follow.
CREATE OR REPLACE FUNCTION doubled (rows_in stock_mgr.stocks_rc) RETURN tickers_nt AUTHID DEFINER IS TYPE stocks_aat IS TABLE OF stocks%ROWTYPE INDEX BY PLS_INTEGER; l_stocks stocks_aat; l_doubled tickers_nt := tickers_nt (); BEGIN LOOP FETCH rows_in BULK COLLECT INTO l_stocks LIMIT 100; EXIT WHEN l_stocks.COUNT = 0; FOR l_row IN 1 .. l_stocks.COUNT LOOP l_doubled.EXTEND; l_doubled (l_doubled.LAST) := ticker_ot (l_stocks (l_row).ticker, l_stocks (l_row).trade_date, ‹O›, l_stocks (l_row).opening_price); l_doubled.EXTEND; l_doubled (l_doubled.LAST) := ticker_ot (l_stocks (l_row).ticker, l_stocks (l_row).trade_date, ‹C›, l_stocks (l_row).closing_price); END LOOP; END LOOP; CLOSE rows_in; RETURN l_doubled; END;
|1||Use the REF CURSOR type defined in the package for the rows passed in. Because I am selecting from the stocks table, I use the stocks_rc type.|
|2||Return an array, each of whose elements looks just like a row in the tickers table.|
|5, 6||Declare an associative array to hold rows fetched from the rows_in cursor variable.|
|8||Declare the local variable to be filled and then returned back to the SELECT statement.|
|10||Start up a simple loop to fetch rows from the cursor variable. It’s already open—the CURSOR expression takes care of that.|
|11, 12||Use the BULK COLLECT feature to retrieve as many as 100 rows with each fetch. I do this to avoid row-by-row processing, which is not efficient enough. Exit the loop when the associative array is empty.|
|14||For each element in the array (row from the cursor variable)...|
|16–21||Use EXTEND to add another element at the end of the nested table, then call the object type constructor to create the first (“opening”) of the two rows for tickers table, and put the element into the new last index value of the collection.|
|23–28||Do the same for the second (“closing”) row of the tickers table.|
|31||Close the cursor variable, now that all the rows have been fetched. Note: this step is optional. When you use a CURSOR expression to pass in the result set, the cursor will be closed automatically when the function terminates.|
|33||Send the nested table back to the SELECT statement for streaming.|
Regarding FETCH-BULK COLLECT-LIMIT
I used a value of 100 for the LIMIT clause. That’s a decent default value—it’s the number of rows retrieved by cursor FOR loops with each fetch. But if you are processing an extremely large number of rows and want to squeeze better performance out of your function, you might try a larger LIMIT value. Note, however, that this will consume more Process Global Area (PGA) memory and that, at some point, your code will slow down due to excessive memory consumption.
You should also pass the LIMIT value as a parameter to give you the ability to modify the performance profile without recompiling your function, as in
CREATE OR REPLACE FUNCTION doubled ( rows_in stock_mgr.stocks_rc, limit_in IN INTEGER DEFAULT 100) ... BEGIN LOOP FETCH rows_in BULK COLLECT INTO l_stocks LIMIT limit_in;
OK, let’s do some streaming. Assume that I previously loaded the stocks table with 1,000 rows.
SELECT COUNT (*) c FROM tickers / C ———— 0 INSERT INTO tickers SELECT * FROM TABLE (doubled (CURSOR (SELECT * FROM stocks))) / 2000 row(s) inserted. SELECT COUNT (*) FROM tickers / C ———— 2000 SELECT * FROM tickers FETCH FIRST 10 ROWS ONLY / TICKER PRICEDATE PRICETYPE PRICE STK214 06-JUL-18 C 229 STK215 06-JUL-18 O 215 STK215 06-JUL-18 C 230 STK216 06-JUL-18 O 216 STK216 06-JUL-18 C 231 STK217 06-JUL-18 O 217 STK217 06-JUL-18 C 232 STK218 06-JUL-18 O 218 STK218 06-JUL-18 C 233 STK219 06-JUL-18 O 219
A Two-Step Transformation
In my two-step transformation, I will create a function that returns a nested table of elements that matches the stocks table. So I will need an object type and a nested table type.
CREATE OR REPLACE TYPE stock_ot AUTHID DEFINER IS OBJECT ( ticker VARCHAR2 (20), trade_date DATE, opening_price NUMBER, closing_price NUMBER ) / CREATE OR REPLACE TYPE stocks_nt AS TABLE OF stock_ot; /
Now I create a table function that accepts a cursor variable, each of whose rows contains ticker data and returns a nested table, each of whose elements looks like a row in the stocks table. Because it follows precisely the same pattern as the doubled function, I will not describe the individual lines.
CREATE OR REPLACE FUNCTION singled ( tickers_in IN stock_mgr.tickers_rc) RETURN stocks_nt AUTHID DEFINER IS TYPE tickers_aat IS TABLE OF tickers%ROWTYPE INDEX BY PLS_INTEGER; l_tickers tickers_aat; l_singles stocks_nt := stocks_nt (); BEGIN LOOP FETCH tickers_in BULK COLLECT INTO l_tickers LIMIT 100; EXIT WHEN l_tickers.COUNT = 0; FOR indx IN 1 .. l_tickers.COUNT LOOP l_singles.EXTEND; l_singles (l_singles.LAST) := stock_ot (l_tickers (indx).ticker, l_tickers (indx).pricedate, l_tickers (indx).price, l_tickers (indx).price * .5); END LOOP; END LOOP; RETURN l_singles; END; /
Now let’s do a two-step transformation: stocks -> tickers -> more_stocks!
CREATE TABLE more_stocks AS SELECT * FROM TABLE ( singled ( CURSOR ( SELECT * FROM TABLE (doubled (CURSOR (SELECT * FROM stocks)))))) / SELECT COUNT (*) FROM more_stocks /
Streaming table functions play a crucial role in data warehouse ETL operations. Oracle Database makes building such functions easy, through its implementation of PL/SQL cursor variables and the CURSOR expression.
Remember that the collection constructed and returned by a streaming table function will consume PGA memory, so very large datasets passed into the function via the cursor variable may result in errors.
What can you do about that? Make that streaming table function a pipelined streaming table function, which I’ll cover in the next article in this series.
READ more about table functions.
LEARN more about PL/SQL table functions (a Dev Gym class).
Illustration by Wes Rowell