I have a table with the following definition:
SQL> desc OAUTH2_REFRESH_TOKEN ID RAW(16) TOKEN_ID VARCHAR2(255 CHAR) TOKEN BLOB AUTHENTICATION BLOB
ID is a primary key. There are many deletes on this table using ID, and I’ve noticed that they are doing a lot of consistent gets. For example, here is a runtime profile of a delete:
SQL> DELETE 2 FROM OAUTH2_REFRESH_TOKEN 3 WHERE "ID"= '8A81B295444C2881014452803709648E' 4 / 1 row deleted. —————————————————————————————————————————————————— |Id|Operation |Name | —————————————————————————————————————————————————— | 0|DELETE STATEMENT| | | 1|DELETE |OAUTH2_REFRESH_... | |*2| INDEX FULL SCAN|SYS_C00203982 | —————————————————————————————————————————————————— Predicate Information (identified by operation id): —————————————————————————————————————————————————— 2 - filter(RAWTOHEX("ID")= '8A81B295444C2881014452803709648E') Statistics —————————————————————————————————————————————————— 10 db block gets 10592 consistent gets … 1 rows processed
I tried to improve performance by rebuilding the primary key index, and the size of the index went from 88 MB to 10 MB. This did decrease the number of consistent gets from 10,592 to 1,228, but you would expect a delete by a primary key value to do very few consistent gets, wouldn’t you?
I don’t want to be rebuilding this index all the time, so I am wondering if there is a better way to set it up.
The good news for you is that you will never really have to rebuild this primary key index again—because it is not the problem. The problem is in your code. Your developers have programmed a very nasty bug, and the bug is called “implicit conversions.”
SQL and PL/SQL are very “datatype friendly.” They attempt—whenever possible—to do what you ask them to do. Compare a number with a string, and SQL and PL/SQL will do that by converting the string to a number implicitly—behind the scenes—and then comparing them. Compare a date with a string, and SQL and PL/SQL will do that by converting the string to a date implicitly. I consider all these implicit conversions to be bugs in the developed code, for the following reasons:
SQL> create or replace procedure inj( p_date in date ) 2 as 3 l_rec all_users%rowtype; 4 c sys_refcursor; 5 l_query long; 6 begin 7 l_query := ' 8 select * 9 from all_users 10 where created = ''' ||p_date ||''''; 11 12 dbms_output.put_line( l_query ); 13 open c for l_query; 14 15 for i in 1 .. 5 16 loop 17 fetch c into l_rec; 18 exit when c%notfound; 19 dbms_output.put_line 20 ( l_rec.username || '.....' ); 21 end loop; 22 end; 23 / Procedure created.
Line 10 of this code is the culprit here. It has not one but two implicit conversions. The code in line 10 is equivalent to
10 where created = TO_DATE(''' || TO_CHAR(p_date) ||''')';
When I concatenate the P_DATE variable with a VARCHAR2 variable, it must be converted from a DATE datatype to a VARCHAR2 datatype, and that is done by implicitly applying TO_CHAR —using the default NLS_DATE_FORMAT for the current session—to the P_DATE variable.
Later, when the query is executed, it compares the CREATED database column, which is a DATE column, with the string produced by the TO_CHAR implicit conversion. To make that possible, the string returned by the TO_CHAR function must be converted back into a date with the default NLS_DATE_FORMAT setting once again.Logic Bombs and Default Settings
Think about the logic bombs inherent in such an approach. By default, the NLS_DATE_FORMAT in a database is DD-MON-RR. If I were to pass a date such as TO_DATE( ’18-JUN-1812 13:00:00’, ‘DD-MON-YYYY HH24:MI:SS’ ), what date would I be searching for in the above query? Would it be the 18th of June, 1812, at 1 p.m.? Almost certainly not, but it could be! That is the disturbing part: You cannot answer the question I just posed. No matter what you answer, I’ll say you are wrong and prove it.
By default, in a “normal” Oracle database with default settings, searching with the date TO_DATE( ’18-JUN-1812 13:00:00’, ‘DD-MON-YYYY HH24:MI:SS’ ) will search for the 18th of June, 2012. And it will continue to do that for a few more years, at which point the code will start searching for the 18th of June, 2112 (all by itself, without a change in a line of code!). That is because the default NLS_DATE_FORMAT currently is DD-MON-RR. By using the RR format, you present the century and the year in a date converted into a string with just the year component. So 1812 will become simply 12. Later, when you convert it back to a date with DD-MON-RR, you’ll take that two-character year and look at what the current year is. Using the rules documented in Oracle Database SQL Language Reference, the string “12” will be converted to the century and year 2012, because this article was written in 2014. (If I’d written this article in 1949, it would have returned 1912. If I were to write this article in the future, in 2050, it would return 2112). The very passage of time affects, by default, what data your application returns! This is a logic bomb just waiting to happen.
The other part of the logic bomb is that, by default, the time component is stripped off. I specified “13:00:00” in my date input to the procedure. With the default NLS_DATE_FORMAT, that time component is stripped away.
I have encountered developers who say “we want the date to be a sliding date based on the current year, and we want the time component stripped off.” To them I say “great, perfect; then write that code—be explicit.” The reason? A simple change by a DBA at some point in the future will change the default behavior the developers are implicitly relying on. That is, the code works today—by accident—but breaks tomorrow because a default has changed. For example, the application might have been developed in Europe, where displaying the date as day-month-year is the standard. Now the same application is to be deployed in the US, where the standard date display is month-day-year. So, to accommodate that change, the DBA changes the NLS_DATE_FORMAT to MM/DD/YYYY. Now that you are no longer using the RR format, the full year is preserved and the behavior of the application changes over time. Or perhaps the default NLS_DATE_FORMAT is changed to MM/DD/YYYY HH24. Now, all of a sudden, a portion of the time is preserved and the outcome of your application changes.
This is the nightmare that is reliance on default implicit conversions. An external change can and will affect you; it will change the behavior of your code. I have seen this happen hundreds, if not thousands, of times in my career.
So, what potential “fix” could you put in place? One quick one would be to use explicit conversions and explicit date masks. If the developer’s goal was to have the date be the full YYYY (enabling you to search for 1812, 1912, 2012, and so on whenever you want to), then explicitly code it:
10 where created = TO_DATE(''' || TO_CHAR(trunc( p_date, 'd'), 'dd-mm-yyyy' ) ) ||''', ''dd-mm-yyyy'' )';
By using the TRUNC( p_date, ‘d’ ) function, you explicitly remove the time component. By explicitly using the DD-MM-YYYY format in the TO_CHAR call, you save just the day, the month, and the full year (including the century). Later, in the TO_DATE call, you use the same format to reconstitute your date. Now it no longer matters what NLS_DATE_FORMAT might be set in your database—your code will execute in a known manner, regardless. You are no longer subject to external changes. Your code is stable, predictable, and free of the logic bomb.
But is this the right way? Is it the correct approach? Probably not, because there is a way to completely avoid the conversions altogether. You have a date input, you have a date column, and there should be no need to convert the date to a string and back into a date. You can—and should—achieve this easily by using bind variables:
7 l_query := ' 8 select * 9 from all_users 10 where created = :x'; 11 12 dbms_output.put_line( l_query ); 13 open c for l_query using trunc(p_date,'d');
If you bind in the date, you will be comparing a date with a date. There are no conversions, and there is no chance of a loss of information (as there can be with the RR date format). Less work is performed by the code overall (so this is more performant). In my experience, this is the correct solution: avoid the logic bomb, avoid doing as much work, and avoid some other nasty surprises I’ll talk about in a moment.SQL Injection Issues
Many years ago, in the January/February 2005 issue of Oracle Magazine, I wrote about the issue of SQL injection in code. The problem of SQL injection is just as rampant today in 2014 as it was almost a decade ago—if anything, it is worse today than a decade ago, with many major attacks against websites and databases involving SQL injection. Therefore, it is paramount to protect yourself from SQL injection vulnerabilities wherever possible.
If you consider the original INJ procedure code, the one with the double implicit conversion, you’ll find a major SQL injection flaw in it. Study that code, and see if you can find it before reading further. There is a very high probability that the vast majority of readers will not be able to express how the code is “SQL-injectable.” Therein lies a serious problem. If you can look at a piece of code you know is SQL-injectable (I’ve told you it is) and you cannot tell why it is SQL-injectable, how can you write code that is free from SQL injection? I’ll give you a method for writing code that cannot be SQL-injected in a moment, but let’s look at the INJ stored procedure first and discover how to exploit it.
If there is a user who has CREATE SESSION privileges in your database and EXECUTE privileges on the INJ stored procedure, that user will have read access to virtually every object the owner of that procedure has read access to. That is right—a user with just CREATE SESSION privileges and the ability to execute that procedure can read any table the owner of that procedure can read. That is a pretty big security hole (and it gets worse: if the user also has CREATE PROCEDURE privileges in that database, the user can use that procedure to do anything the owner of that procedure can do!).
So how does someone exploit this SQL injection flaw? Easy. It’s just a matter of playing around with the default NLS_DATE_FORMAT.
OPS$TKYTE is the owner of the INJ stored procedure. By default, if someone else ran the INJ procedure in a “normal” Oracle database with all default settings, that user—user A in this example—would expect to see
SQL> create user a identified by a; User created. SQL> grant create session to a; Grant succeeded. SQL> grant execute on inj to a; Grant succeeded. SQL> connect a/a Connected. SQL> exec ops$tkyte.inj(sysdate) select * from all_users where created = '06-JUN-14' PL/SQL procedure successfully completed.
That is a well-formed query. SYSDATE has been converted into a sensible string and can be converted back into a date to be compared with the CREATED column. But what if user A changes the NLS_DATE_FORMAT? (Note: You need no privileges to change the NLS_DATE_FORMAT. Every user who can connect to the database can do this). Perhaps the user can also change the meaning of the SQL statement. For example,
SQL> alter session set 2 nls_date_format = 'dd-mon-yyyy"'' or 1=1 --'; Session altered. SQL> select sysdate from dual; SYSDATE —————————————————————————————— 06-jun-2014' or 1=1 –
All of a sudden, the default conversion of SYSDATE no longer results in anything resembling a date. It now looks a bit like SQL! I (user A) run the buggy INJ stored procedure again:
SQL> exec ops$tkyte.inj(sysdate) select * from all_users where created = '06-jun-2014' or 1=1--’ FB_DEMO..... BIG_TABLE..... SCOTT..... B..... CLONEDEV..... PL/SQL procedure successfully completed.
Notice how different the resultant SQL text is. It now shows all the rows in that table—not just the rows matching some date input. You might say, “That is OK; the user is allowed to see that table,” but what about this:
SQL> alter session set 2 nls_date_format = '''"union all select ename,0,null from emp_t"--'; Session altered. SQL> select sysdate from dual; SYSDATE ————————————————————————————————— 'union all select ename,0,null from emp_t-—’
That looks interesting. If OPS$TKYTE, the owner of the INJ stored procedure, has a table called EMP_T, the following should query it and I might be able to see the data:
SQL> select * from ops$tkyte.emp_t; select * from ops$tkyte.emp_t * ERROR at line 1: ORA-00942: table or view does not exist SQL> exec ops$tkyte.inj(sysdate) select * from all_users where created = ''union all select ename,0,null from emp_t--' SCOTT..... ALLEN..... JONES..... BLAKE..... CLARK..... PL/SQL procedure successfully completed.
The first query shows that I (user A) do not have access to OPS$TKYTE.EMP_T, so I cannot select from it. However, I can trick the stored procedure owned by OPS$TKYTE, who does have access, to show it to me. Using UNION, I can see anything I want that OPS$TKYTE can see. If I don’t know what else OPS$TKYTE can see, I can add UNION to a query against the dictionary views to see what is available:
SQL> alter session set 2 nls_date_format = '''"union all select tname,0,null from tab"--'; Session altered. SQL> exec ops$tkyte.inj(sysdate) select * from all_users where created = ''union all select tname,0,null from tab--' ALL_OBJECTS_UNLOAD..... ALREADY_EXISTING_TABLE..... AUDIT_TRAIL..... BIG_TABLE..... T..... PL/SQL procedure successfully completed.
Now I know all the tables owned by OPS$TKYTE, so I can start querying away.
A default implicit conversion has opened this code to a SQL injection attack, one that is hard to see just by looking at the code (unless you know what to look for). If I’m the code owner, how can I protect myself? The answer is rather simple. I want to (a) avoid the use of defaults at all times and (b) avoid implicit conversions. If I used
10 where created = TO_DATE(''' || TO_CHAR(trunc( p_date, 'd'), 'dd-mm-yyyy' ) ) ||''', ''dd-mm-yyyy'' )';
my INJ procedure code would not be subject to SQL injection. No matter what the attacker set the NLS_DATE_FORMAT to, it would not affect me. However, there is an infinitely better way, and that is to once again use bind variables. If the code uses bind variables (demonstrated above), it will be impossible for the code to be SQL-injected. Let me repeat this:
If you use bind variables, your code cannot be SQL-injected. If you do not concatenate inputs sent to you from outside your code (an end-user submission from a form on a screen, inputs from another developer, whatever) and if you use bind variables to send those values to SQL, your code cannot be SQL-injected—it is not possible. On the other hand, if you use string concatenation and concatenate into your SQL inputs from some external source (any parameter to your subroutine example), your code will be subject to SQL injection issues. You might not be able to see the SQL injection risks, but they likely exist!
So, by using bind variables of the correct datatype, you can avoid implicit conversion issues (you never have to convert the types!) and you can avoid SQL injection attacks. It’s sort of a two-for-one reason to use bind variables.Reduced Access Paths
As if the preceding issues were not enough, there are yet more reasons to avoid implicit conversions at all costs. Implicit conversions can easily remove many access paths from consideration and prevent partition elimination.
Take, for example, a rather simple table T:
SQL> create table t 2 ( x varchar2(20) constraint t_pk primary key, 3 y varchar2(30) 4 ); Table created. SQL> insert into t 2 select user_id, username 3 from all_users; 47 rows created. SQL> commit; Commit complete.
Table T has a primary key; column X is that primary key, and it is a VARCHAR2. However, the application that places data into table T puts only numbers in there. So, even though T.X is a VARCHAR2, the only thing end users and developers see is numbers. As far as they are concerned, T.X is a number. So, when they query it, they tend to use a number as an input. This will, of course, cause an implicit conversion, and you can easily see this in action. Let’s use a small stored procedure P, which queries table T, as shown in Listing 1.
Code Listing 1: Procedure P, with implicit conversion and compilation warnings
SQL> alter session set plsql_warnings='enable:all'; SQL> create or replace procedure p authid definer 2 as 3 l_rec t%rowtype; 4 l_key number := 5; 5 begin 6 select * into l_rec from t where x = l_key; 7 for x in (select plan_table_output 8 from TABLE( dbms_xplan.display_cursor())) 9 loop 10 dbms_output.put_line( x.plan_table_output ); 11 end loop; 12 end; 13 / SP2-0804: Procedure created with compilation warnings
This simple procedure queries table T, but it uses NUMBER (L_KEY) to compare the VARCHAR2 column T.X values. This will cause an implicit conversion, but it is not readily apparent in the code. That is why I enabled PL/SQL warnings (a feature of PL/SQL since Oracle Database 10g), and the PL/SQL warning facility will actually alert me to these evil implicit conversions:
SQL> show errors Errors for PROCEDURE P: LINE/COL ERROR ———————— —————————————————————————— 6/42 PLW-07204: conversion away from column type may result in sub-optimal query plan
PL/SQL is warning me that on line 6 (“WHERE x = l_key”), I have an implicit conversion issue.
That this fact can easily be pointed out is a great bonus of using PL/SQL. Before I look at solutions, let’s run the procedure as is, as shown in Listing 2.
Code Listing 2: Execution of P, with full table scan
SQL> exec p SQL_ID 18796jgha0hwz, child number 0 ————————————————————————————————————— SELECT * FROM T WHERE X = :B1 Plan hash value: 1601196873 —————————————————————————————————————————————————————————————————————————— | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | —————————————————————————————————————————————————————————————————————————— | 0 | SELECT STATEMENT | | | | 3 (100)| | |* 1 | TABLE ACCESS FULL| T | 1 | 29 | 3 (0)| 00:00:01 | —————————————————————————————————————————————————————————————————————————— Predicate Information (identified by operation id): —————————————————————————————————————————————————————————————————————————— 1 – filter(TO_NUMBER("X")=:B1)
Notice that I am performing a full scan of table T, which is unexpected, because my predicate is “WHERE X = :B1” and X is the primary key. I would typically expect to see an index unique scan to retrieve this data, but the predicate information section of the plan report shows what the issue is. The true predicate is “TO_NUMBER(“X”) = :B1”—not just “X = :B1”. The implicit conversion leaps into view. That implicit conversion negates the use of the index on T.X to access the data. The index is on T.X, not on TO_NUMBER(T.X).
If I modify the procedure to compare strings with strings (as I should!), I will see something completely different, as shown in Listing 3.
Code Listing 3: Revised procedure P, with no compilation warnings
SQL> create or replace procedure p authid definer 2 as 3 l_rec t%rowtype; 4 l_key varchar2(5) := '5'; 5 begin 6 select * into l_rec from t where x = l_key; 7 for x in (select plan_table_output 8 from TABLE( dbms_xplan.display_cursor())) 9 loop 10 dbms_output.put_line( x.plan_table_output ); 11 end loop; 12 end; 13 / Procedure created. SQL> show errors No errors.
The first thing to notice in Listing 3 is that there are no warnings at all—the code compiles cleanly. But even better, the code in Listing 4 executes the updated procedure P and the expected plan is used, with the expected index unique scan.
Code Listing 4: Execution of revised P, with index unique scan
SQL> exec p SQL_ID 18796jgha0hwz, child number 1 ———————————————————————————————————————————————————— SELECT * FROM T WHERE X = :B1 Plan hash value: 1303508680 ———————————————————————————————————————————————————— | Id | Operation | Name | Rows | ———————————————————————————————————————————————————— | 0 | SELECT STATEMENT | | | | 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | |* 2 | INDEX UNIQUE SCAN | T_PK | 1 | ———————————————————————————————————————————————————— Predicate Information (identified by operation id): ———————————————————————————————————————————————————— 2 - access("X"=:B1)Back to the Beginning
And that is a very long answer to the original question. Here is the original query plan:
SQL> DELETE 2 FROM OAUTH2_REFRESH_TOKEN 3 WHERE "ID"= '8A81B295444C2881014452803709648E' 4 / 1 row deleted. ———————————————————————————————————————————————————— |Id| Operation | Name | ———————————————————————————————————————————————————— | 0| DELETE STATEMENT| | | 1| DELETE | OAUTH2_REFRESH.. | |*2| INDEX FULL SCAN| SYS_C00203982 | Predicate Information (identified by operation id): ——————————————————————————————————————————————————— 2 - filter(RAWTOHEX("ID")= '8A81B295444C2881014452803709648E')
Recall that ID is a primary key and a RAW(16). I would expect “WHERE ID = :x” to use an index unique scan—not an index full scan—but it doesn’t. And the reason it doesn’t use an index unique scan stands out in the predicate information section. An implicit conversion is being applied to the ID column, because ID is a RAW(16) and you are comparing it with a VARCHAR2(32). You are forced to implicitly convert the RAW(16) to a VARCHAR2(32). That implicit conversion negates the use of the index!
The solution for the performance issue lies in the application. The application developers should avoid the implicit conversion being applied to the ID column, and they can do that by Using WHERE “ID” = hextoraw(‘8A81B295444C2881014452803709648E’ ) or Binding a RAW(16) instead of binding a VARCHAR2(32), which is the correct approach (use bind variables!) or Binding a VARCHAR2(32) but wrapping the bind in a HEXTORAW call, such as WHERE ID = hextoraw(:x)Next Steps
Tom Kyte answers your most difficult technology questions. Highlights from that forum appear in this column.
READ more Tom
DOWNLOAD Oracle Database 12c
EXPLORE Oracle Learning Library
READ more about SQL injection
“On Injecting and Comparing”
FOLLOW Tom on Twitter
Photography by Ricardo Gomez, Unsplash