I understand read consistency—the fact that a result set returned by a single query is entirely consistent with respect to some point in time—but recently I’ve noticed something strange. A certain query is returning inconsistent results: the rows returned contain values from the database as of different points in time. Why is this? The query is a simple query against a single table, and it calls a PL/SQL function, which in turn runs another query.
This is an interesting question. First I’ll make sure everyone is on the same page and demonstrate what read consistency means. Then I’ll demonstrate this issue and suggest some approaches to correct it.
First, read consistency is an attribute of every query ever executed in Oracle Database. It ensures that every row in a result set presented to an application comes from the database as of the same point in time. If a simple query such as SELECT * FROM a_really_big_table takes one hour to complete, the first row returned from the query will be consistent timewise with respect to the last row returned from the query. For example, suppose the “really big table” is an ACCOUNTS table and has columns ACCOUNT# and BALANCE. Further, assume that the data in the table looks like this before the query is executed:
|… millions of rows here||….|
So, I have a very large table (ACCOUNTS) and a query (SELECT * FROM ACCOUNTS) that will take an hour to read from row 1 (with ACCOUNT# 123) to the last row (with ACCOUNT# 789). All the rows are not read at exactly the same instant, of course, but instead—during the course of the full table scan—over time. The results should be some set of results that existed in the database at some point in time. (That seems reasonable, doesn’t it? To get an answer that exists?)
As Oracle Database reads this data, it will be checking every time it reads to ensure that the data it is looking at is as of the point in time the query began. By default, read consistency ensures that the result set is as of the point in time the query began (but later I’ll mention another way to set the value of the point in time). If the data the database reads from the table represents data that was modified after the query began—for example, the data might still be locked by another transaction or it might have been changed and committed by another transaction—Oracle Database will utilize undo (rollback) information to put the row back the way it was when the statement began.
As an example, suppose your program is querying this ACCOUNTS table, your program has read and printed out row 1 for ACCOUNT# 123, and the end user has seen the $500 amount for the BALANCE. Now suppose that while your report is printing out the rest of the data—rows 2, 3, and so on —someone uses an automated teller machine to transfer $100 from account 123 to 789 and commits the work. The last row in the table will now look like this:
When your query—which has been running for an hour—reaches this last record, what should it do? Should it read and print out $400 for ACCOUNT# 789? According to Oracle Database, no, it should not (and I happen to agree 100 percent with that decision). If the query did return $400 for ACCOUNT# 789, you would have a report that shows the account balances to be $100 more than ever existed in the bank at any time—because the first row and the last row in the report would have double-counted that same money. Upon encountering the data that has been changed since the query began, Oracle Database will roll back the change—rolling it back as many times as necessary (because it could have been changed many times in the last hour)—until it contains the data that was committed in the database when the query began. That is what gets returned to your application: the value of the last row that is consistent with the value of the first row returned to you. This is true regardless of the type of query you execute.
Suppose you had issued select sum(balance) from accounts instead of selecting every row individually. You would get only one row back from this query (not millions), but it still would take some measurable time to execute (to read and sum up all of those BALANCE values). In this case, read consistency is perhaps even more relevant. What if during the processing of the select sum query—after you read the first record with a $500 BALANCE but before you read the last record with a $300 BALANCE—someone did the $100 transfer? Without read consistency, if you just read whatever value you found in the last row ($400 in this example), the SUM(BALANCE) returned would be $100 more than ever existed. That would be a rather obvious error on a balance sheet. Instead of returning that erroneous value, Oracle Database returns a SUM(BALANCE) value that incorporates BALANCE values as of the same point in time. The answer you get back from Oracle Database incorporates the $300 balance for account 789, not the “modified after you began” $400 balance.
Now, let’s suppose you are executing a simple query such as select a, b, c, f(x) from t, where a, b, c, and x are columns of a table t and f is a PL/SQL function you’ve written that contains its own SQL statements. And you are seeing what appear to be inconsistent results—that is, the data returned for the first row is inconsistent with the data returned for the last row, something that sounds impossible in Oracle Database. Well, the rules of read consistency have not been violated here: every query executed in this case will be read-consistent. However, each query executed will be read-consistent with respect to itself by default. If you use SQL to execute a function that in turn runs its own SQL, each query executed is by default consistent with itself but not with the original driving SQL. I can demonstrate this rather easily by using a copy of the EMP and DEPT tables from the SCOTT schema. I know there should be a grand total of 4 departments in the DEPT table and 14 employees in the EMP table—3 in DEPTNO 10, 5 in DEPTNO 20, 6 in DEPTNO 30, and 0 in DEPTNO 40. I write a little logic in PL/SQL to count the employees by department (a PL/SQL function that does some SQL to return a result):
SQL> create or replace function f( p_deptno in number ) return number 2 as 3 l_cnt number; 4 begin 5 select count(*) 6 into l_cnt 7 from emp 8 where emp.deptno = p_deptno; 9 10 do_work_in_another_transaction(); 11 12 return l_cnt; 13 end; 14 / Function created.
Note that this function (F) calls another procedure: DO_WORK_IN_ANOTHER_TRANSACTION(). This is a stored procedure I wrote just for this example that simulates a situation in which other concurrent users in the database are doing some work (modifying the data I am reading) and committing that work. This represents real life, where other users are in the database making modifications to the data I am currently reading. Here is the implementation of the DO_WORK_IN_ANOTHER_TRANSACTION() stored procedure:
SQL> create sequence s start with 8000; Sequence created. SQL> create or replace procedure do_work_in_another_transaction 2 as 3 pragma autonomous_transaction; 4 begin 5 for i in 1 .. 4 6 loop 7 insert into emp( empno, deptno ) 8 values ( s.nextval, i*10 ); 9 end loop; 10 commit; 11 end; 12 / Procedure created.
Every time the procedure is called, I “hire” a new set of employees—one each in each of the departments—and then commit. Now, let’s query my data. I’ll select the DEPTNO and DNAME columns from DEPT and invoke my function (F) to get the count by DEPTNO. To show the effects of read consistency, I’ll also get the “true” count with a scalar subquery—I’ll just take the SELECT COUNT(*) from my function (F) and invoke that directly in the query itself:
SQL> select deptno, dname, 2 (select count(*) 3 from emp 4 where emp.deptno= dept.deptno) cnt1, 5 f(deptno) cnt2 6 from dept 7 / DEPTNO DNAME CNT1 CNT2 ————————— —————————————— ————— ————— 10 ACCOUNTING 3 3 20 RESEARCH 5 6 30 SALES 6 8 40 OPERATIONS 0 3
Note the differing results between CNT1 and CNT2. This points out the read consistency issue clearly. The scalar subquery on lines 2 to 4 is part of the outer query; it runs at the same point in time as the outer query itself. It shares the same point in time with the parent query simply because it is part of the parent query. But note the results from the function (F) that runs the same SQL. For the first row, the counts agree (returning 3 for both CNT1 and CNT2), for the simple reason that no updates have been performed yet. But by the time of the second row in the result set, the counts differ by 1 (5 for CNT1 and 6 for CNT2). The function can see the result of the previous transaction’s INSERT and COMMIT (from the DO_WORK_IN_ANOTHER_TRANSACTION() procedure). The function counts that new employee in DEPTNO 20. The third row shows the results diverging by 2 (6 for CNT1 and 8 for CNT2), and the last row shows the results diverging by 3 (0 for CNT1 and 3 for CNT2). Each row returned shows a different read-consistent count for the function (F) but not so for the scalar subquery. Every SQL statement in this example is read-consistent (there is no getting around that), but each query executed as a standalone statement is read-consistent with respect only to its own start time—not to the start time of the originating SQL statement.
So what can you do when you’ve got multiple read-consistent SQL statements starting at different times as standalone SQL and via PL/SQL and you need the same read consistency across all your statements? The best approach, in my opinion, is to not call the PL/SQL from SQL in the first place. I have a mantra that starts with “If you can do it in a single SQL statement, do it.” In this particular example, there was no need to invoke PL/SQL from SQL—it was unnecessary overhead (there is measurable overhead in invoking PL/SQL from SQL) that added nothing to the query that could not have been done directly in the query. In fact, it detracted from the query, in that it returned read-inconsistent results.
Another valid approach would be to use serializable (or read-only, if applicable) transactions (see oracle.com/technetwork/issue-archive/2005/05-nov/o65asktom-082389.html for a description of the various isolation levels). Using serializable transactions, all SQL statements will execute as of the transaction’s start time—they will not see any work that was committed during their transaction until they themselves commit. This will ensure that all the SQL in your transaction is as of the same point in time. So if you reset the example (re-create the EMP and DEPT tables in your schema) and execute the query again:
SQL> set transaction isolation level serializable; Transaction set. SQL> select deptno, dname, 2 (select count(*) 3 from emp 4 where emp.deptno= dept.deptno) cnt1, 5 f(deptno) cnt2 6 from dept 7 / DEPTNO DNAME CNT1 CNT2 ————————— —————————————— ————— ————— 10 ACCOUNTING 3 3 20 RESEARCH 5 5 30 SALES 6 6 40 OPERATIONS 0 0
both CNT1 and CNT2 will be as of the same point in time—they will be read-consistent. This query will return this same consistent result set until you commit your transaction. Then and only then will your query be able to see the work of other transactions.
Another approach would be to use flashback query. This is not widely applicable, though. It would be applicable if and only if you were performing a flashback query in the outer parent query in the first place. In this case, you would have to pass the as-of time or SCN (system change number) as well as the DEPTNO to the function. Then the outer parent query and the queries executed by the function would use the same as-of time consistently.
In short, the best approach, to me, would be to use a single SQL statement (and avoid the overhead of calling PL/SQL repeatedly from SQL). If you are concerned about code maintenance (you hid the count in the function for maintainability purposes), I encourage you to use a view. You can use views to hide reusable code as easily as using a function. And for my simple example, I wouldn’t even use a view or a scalar subquery, but rather
SQL> select dept.deptno, dept.dname, count(emp.empno) 2 from dept left outer join emp 3 on (dept.deptno = emp.deptno) 4 group by dept.deptno, dept.dname 5 order by dept.deptno 6 / DEPTNO DNAME COUNT(EMP.EMPNO) ————————— —————————————— ————————————————————— 10 ACCOUNTING 3 20 RESEARCH 5 30 SALES 6 40 OPERATIONS 0
That’s just a single, simple bulk SQL statement. And it will always be read-consistent.
A column-level constraint’s scope is only the column on which it is defined. A table-level constraint, on the other hand, can see every column in the table. That is the only difference between the two constraints.
Any column-level constraint—with the exception of a NOT NULL constraint—can be expressed at the table level, but the opposite is not true. That does not mean that all constraints should be expressed at the table level but rather that it is possible to express them at the table level. It is my strong opinion that a column-level constraint should be used unless it cannot be used. That is, if you can express your constraint at the column level, you should do so at the column level. A column-level constraint is syntactically clearer—it is obvious that it applies to that single column. It is more meaningful, therefore, when applied at the column level.
The use of a table-level constraint would imply that more than one column was involved. You gain information—context—via the constraint’s definition. If you have a column-level constraint, you know that it affects a single column without even looking at it. On the other hand, if you have a table-level constraint and follow this thinking, you know that it must utilize at least two columns in its definition.
As for their runtime implementations, column- and table-level constraints are identical. There is no runtime difference between a column-level constraint and a table-level constraint.
Some might point out that a NOT NULL constraint is implemented technically in the database as a CHECK constraint. For example, if you create the following table
SQL> create table t 2 ( x int not null, 3 y int check (y is not null), 4 z int, 5 constraint z_check check(z is not null) 6 ); Table created.
you’ll discover that all three constraints end up looking very much the same in the data dictionary:
SQL> select constraint_type, 2 search_condition 3 from user_constraints 4 where table_name = 'T'; C SEARCH_CONDITION — —————————————————————————— C "X" IS NOT NULL C y is not null C z is not null
A NOT NULL constraint is implemented under the covers as a CHECK constraint (CONSTRAINT_TYPE is C for Check). Do not let that fact lure you into using a CHECK constraint to verify that a column is NOT NULL, however. If a column is, in fact, NOT NULL, you should use the NOT NULL constraint, because the optimizer recognizes the existence of NOT NULL but not the existence of a CHECK constraint you write yourself that checks for NOT NULL. For example, if you index column X of the table, you’ll see that the optimizer can use that index to perform a COUNT(*), as shown in Listing 1.
Code Listing 1: The optimizer knows that X is NOT NULL. SQL> set autotrace on explain
SQL> create index t_x_idx on t(x); Index created. SQL> select /*+ index_ffs( t t_x_idx ) */ count(*) from t; COUNT(*) ——————————————— 0 ————————————————————————————————————————————————————————————————————————— | Id | Operation | Name | Rows | Cost (%CPU)| Time | ————————————————————————————————————————————————————————————————————————— | 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| T_X_IDX | 1 | 2 (0)| 00:00:01 |
However, if you replace that index with an index on Y, you’ll see that the optimizer cannot use that new index to perform a COUNT(*), as shown in Listing 2.
Code Listing 2: The optimizer does not know that Y is NOT NULL.
SQL> drop index t_x_idx; Index dropped. SQL> create index t_y_idx on t(y); Index created. SQL> select /*+ index_ffs( t t_y_idx ) */ count(*) from t; COUNT(*) ——————————————— 0 ——————————————————————————————————————————————————————————————————— | Id | Operation | Name | Rows | Cost (%CPU)| Time | ——————————————————————————————————————————————————————————————————— | 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T | 1 | 2 (0)| 00:00:01 |
In the first case, with the index on X, the optimizer knew that X was NOT NULL and that the index on X therefore pointed to every row in the table, so it was safe to count rows via the index. In the second case, with the index on Y, the optimizer was not aware that Y was NOT NULL, because the CHECK constraint didn’t provide enough information for the optimizer to know that. Because entirely null entries are not made into B-tree indexes in Oracle Database and because Y was presumed by the optimizer to allow for NULLs, the optimizer could not use the index on Y, having assumed that the index might not point to every row in the table.
This demonstrates why you want to be specific. You want to use the more specific column-level constraints over table constraints whenever possible, because the choice of constraints conveys information to the maintainers of the system when they review the schema later. You want to use the more specific NOT NULL constraint over an equivalent (to you and me, anyway) CHECK constraint, because it conveys more information to the optimizer at parse time.And Something “Unlearned” Sometimes new learning is unlearning. There are things I have to unlearn, because what used to be true has changed over time. Once upon a time—and I remember the day I learned this, during a benchmark in 1993—I learned that unindexed foreign keys had some locking implications. Specifically, if you
Then Oracle9i Database was released, and I had to relearn the rule. The rule in Oracle9i Database was still as above, other than having been modified in terms of the duration of the lock. (Many people think the restriction actually went away, but it did not.) In Oracle9i Database and later, for a parent table with an unindexed foreign key, if you update the parent table or delete from it, the child table will still be locked, but just for the duration of the UPDATE or DELETE. The lock is released after the statement is processed—not when you commit. This is better than in 1993, but the lock still exists.
Sometime during my use of Oracle9i Database, I learned yet another modification to the “unindexed foreign keys and locking implications” rule above. The rule in Oracle9i Database has to include
As I was getting ready to add that to the second edition of Expert Oracle Database Architecture, I learned something new: the rule has changed again. The MERGE doesn’t always lock the table in Oracle Database 11g Release 1 and above. If you are interested in the full test case demonstrating this, I encourage you to read tkyte.blogspot .com/2010/04/something-i-recently-unlearned.html, where I first wrote this up (the example is too large to fit here).
So, in short, with releases prior to Oracle Database 11g Release 1, you will want an index on the foreign key of the child table if you do any of the following:
And the rule in Oracle Database 11g Release 1 and later is that you will want an index on the foreign key of the child table if you do any of the following:
See bit.ly/aFqMGm for more information on unindexed foreign keys, including a script for detecting whether you have any.
Photography by Scott Webb, Unsplash