My mentor told me that when querying data I should always use a cursor FOR loop, even for a single row lookup. He says it’s the easiest way to fetch data, and Oracle Database automatically optimizes it in Oracle Database 10g and above. Do you recommend this practice?
You may want to find a new mentor. It’s great to learn from others, and it’s especially wonderful when the lesson you learn is simple and easy to remember. It’s not so great, however, when the advice is simplistic and results in suboptimal code. That’s the case with this cursor FOR loop recommendation.
I have a different set of recommendations about cursor FOR loops.
Let’s take a closer look at these five cursor FOR loop recommendations.
Never use a cursor FOR loop if the loop body executes non-query DML (INSERT, UPDATE, DELETE, MERGE). Even when the query itself is automatically optimized to return 100 rows with each fetch, the INSERT or UPDATE will happen on a row-by-row basis. This is one of the worst performance “anti-patterns” in database programming. The result is too many context switches and generally awful performance. Instead, use BULK COLLECT to fetch the data into one or more collections, then use FORALL to do the DML operations in bulk.
Having said that, I can identify two circumstances in which using a cursor FOR loop would do little harm.
First, because Oracle Database automatically optimizes cursor FOR loops to execute similarly to BULK COLLECT, as long as the body of your loop does not push changes back to the database, your “read-only” cursor FOR loop will almost always suffice. If you are fetching very large numbers of rows (10s of 1000s and more), it is possible that switching to FETCH-BULK COLLECT and a high LIMIT value will make your code somewhat faster. The gains will likely not, however, be great.
Next, if you’re writing a “one-off” script, or a program that is run only occasionally and is not in the critical path of operations, you may want to choose the simplicity and readability of the cursor FOR loop over the incremental improvement in performance (and additional complexity of code) that BULK COLLECT offers.
Use an implicit SELECT INTO for single-row fetches. Developers often tell me that they write a cursor FOR loop to fetch a single row. Why not? Oracle Database does so much of the work for you, saving several lines of code and several minutes of typing.
But there’s a problem with using a cursor FOR loop for a single-row fetch: the resulting code is very misleading. It looks like you expect to retrieve multiple rows, yet you get just one.
Will this cause a problem? Maybe not. However, from a best practices standpoint, writing code that is transparent in purpose and easy to read and understand is most important. You should not write code that appears to do one thing while it, in fact, does another.
If you need to retrieve a single row and you know that at most one row should be retrieved, you should use a SELECT INTO statement, as in the following:
PROCEDURE process_employee ( id_in IN employees.employee_id%TYPE) IS l_last_name employees.last_name%TYPE; BEGIN SELECT e.last_name INTO l_last_name FROM employees e WHERE e.employee_id = process_employee.id_in; ... END process_employee;
The implicit SELECT INTO offers the most-efficient means of returning that single row of information to your PL/SQL program. In addition, the use of SELECT INTO states very clearly that you expect at most one row, and the statement will raise exceptions (NO_DATA_FOUND or TOO_MANY_ROWS) if your expectations are not met.
I further recommend that you encapsulate your SELECT INTO statements into their own functions whose sole purpose is to retrieve this one row of information, as shown in Listing 1.
Code Listing 1: Encapsulating SELECT INTO in a function
PACKAGE employees_qp IS FUNCTION last_name (id_in IN employees.employee_id%TYPE) RETURN employees.last_name%TYPE; END employees_qp; PROCEDURE process_employee (id_in IN employees.employee_id%TYPE) IS l_last_name employees.last_name%TYPE; BEGIN l_last_name := employees_qp.last_name (id_in); END process_employee;
With this approach, you are much more likely to reuse that SELECT INTO rather than write it repeatedly in your code. This reuse is important, because it will make optimizing the SQL statements in your application much easier. In particular, you will be able to take advantage of Oracle Database 11g’s function result cache feature more quickly and smoothly.
BULK COLLECT into a varray when you know the upper limit. A varray is a collection that has an upper limit on the number of elements that can be defined in the collection. This upper limit is specified when the varray type is declared; it can also be modified afterward, if you are using Oracle Database 10g Release 2 or higher.
Listing 2 shows how to define a varray type and declare a variable based on this type. If you try to add a 13th month to this list, Oracle Database will raise an exception:
Code Listing 2: Defining a varray type and declaring a variable
DECLARE TYPE at_most_twelve_t IS VARRAY (12) OF VARCHAR2 (100); l_months at_most_twelve_t := at_most_twelve_t ('January' , 'February' , 'March' , 'April' , 'May' , 'June' , 'July' , 'August' , 'September' , 'October' , 'November' , 'December' );
BEGIN l_months.extend; l_months(13) := 'Extra-ember'; END; ORA-06532: Subscript outside of limit
Varrays offer a very nice mechanism when you need to retrieve multiple rows of data efficiently and the number of rows should never exceed a certain limit. Suppose, for example, that I have a table (training_months) of the months in a year in which I am available to provide training on the PL/SQL language. There cannot be more than 12 rows in this table, but there may certainly be fewer.
I need to retrieve these months into a collection and then display each month. I can use a varray to ensure that the absolute limit of 12 is respected, as shown in Listing 3. If, for some reason, more than 12 rows are found in the table, Oracle Database will raise an exception.
Code Listing 3: sing a varray to ensure a limited number of rows
DROP TABLE training_months / CREATE TABLE training_months (month_name VARCHAR2(100)) / BEGIN /* No trainings in the depths of summer and winter... */ INSERT INTO training_months VALUES ('March'); INSERT INTO training_months VALUES ('April'); INSERT INTO training_months VALUES ('May'); INSERT INTO training_months VALUES ('June'); INSERT INTO training_months VALUES ('September'); INSERT INTO training_months VALUES ('October'); INSERT INTO training_months VALUES ('November'); COMMIT; END; / DECLARE TYPE at_most_twelve_t IS VARRAY (12) OF VARCHAR2 (100); l_months at_most_twelve_t; BEGIN SELECT month_name BULK COLLECT INTO l_months FROM training_months; FOR indx IN 1 .. l_months.COUNT LOOP DBMS_OUTPUT.put_line (l_months (indx)); END LOOP; END; /
However, there’s one concern with this varray approach: what if you know the maximum number of elements that can appear in the varray and that maximum is 1,000,000? This technique will “work,” but the program will consume a dangerously large amount of per-session memory. In this case, you should forsake the varray. Instead, switch to an associative array or nested table and use the LIMIT clause with BULK COLLECT (as described in the next section).
By the way, if you are writing back-end code that serves up data to a stateless user interface, as is often the case with browser-based applications, you would never retrieve so many rows at once. Instead, consider a “next page” paradigm, in which the page has a preset maximum number of rows and the query that retrieves the next set of rows includes a WHERE clause that specifies that range of rows.
BULK COLLECT with LIMIT when you don’t know the upper limit. BULK COLLECT helps retrieve multiple rows of data quickly. Rather than retrieve one row of data at a time into a record or a set of individual variables, BULK COLLECT lets us retrieve hundreds, thousands, even tens of thousands of rows with a single context switch to the SQL engine and deposit all that data into a collection. The resulting performance improvement can be an order of magnitude or greater.
However, in such a case, that boost in performance results in an increase in the amount of per-session memory consumed by the collection populated by the query. In addition, each session connected to Oracle Database has its own per-session memory area. Therefore, a BULK COLLECT that fetches numerous rows can seriously affect memory management on your database server.
So for scenarios in which you need to fetch multiple rows of data and the number of rows fetched can be large or grow over time after your program is put into production, you should use BULK COLLECT with the LIMIT clause. Listing 4 shows the use of the LIMIT clause.
Code Listing 4: Using the LIMIT clause
PROCEDURE bulk_with_limit ( dept_id_in IN employees.department_id%TYPE , limit_in IN PLS_INTEGER DEFAULT 100 ) IS CURSOR employees_cur IS SELECT * FROM employees WHERE department_id = dept_id_in; TYPE employee_tt IS TABLE OF employees_cur%ROWTYPE INDEX BY PLS_INTEGER; l_employees employee_tt; BEGIN OPEN employees_cur; LOOP FETCH employees_cur BULK COLLECT INTO l_employees LIMIT limit_in; FOR indx IN 1 .. l_employees.COUNT LOOP process_each_employees (l_employees (indx)); END LOOP; EXIT WHEN employees_cur%NOTFOUND; END LOOP; CLOSE employees_cur; END bulk_with_limit;
The limit value can be a variable; in this case, I provide a default value of 100 (retrieve up to 100 rows with each fetch). As for the appropriate limit value, you will theoretically use the largest number that can be accommodated within the per-session memory your DBA feels can be allocated per connection.
In practice, I haven’t seen significant differences between limits of 100, 500, or even 1,000. However, I have heard that for retrieval of very large data sets, developers have obtained optimal performance with limits as high as 25,000. Experiment with your own data sets and various limit values, and see if the performance varies.
Convert existing cursor FOR loops only when necessary. To finish this answer, let’s consider the question of what you should do about all those existing cursor FOR loops in your applications.
Oracle Database does automatically optimize the performance of cursor FOR loops. They do not generally execute as efficiently as explicitly coded BULK COLLECT statements, but they are much more performant than single-row fetches. Consequently, I suggest that you convert cursor FOR loops to BULK COLLECT retrievals only if you identify a performance bottleneck in that part of your code. Otherwise, leave the cursor FOR loop in place.
That’s it for my recommendations. But an unattributed programmers’ axiom has it that rules exist to serve, not to enslave. And a variation on this theme adds a paradox: all rules were meant to be broken—including these. This concept is expressed more usefully as the first [meta]principle of PL/SQL best practice: Do not deviate from any of the principles without first discussing with a more experienced programmer the use case that seems to warrant such a deviation.Next Steps
DOWNLOAD Oracle Database 11g
Photography by Ricardo Gomez, Unsplash