Let me provide a little background to this question before answering it.
This application is a home-grown Java based app and the SQL statement in questions looks like this:
INSERT /*+APPEND*/ INTO t1
(ROW_ID, MODIFICATION_NUM, OPERATION, LAST_UPD)
SELECT ROW_ID ,1 ,'I' ,LAST_UPD
WHERE t2.LAST_UPD > :1;
The table T2 has approximately 1.4 million rows and one index IND_T2 on the LAST_UPD and ROW_ID columns. The execution plans are quite simple and involve just the IND_T2 index. The paln in SQL* Plus does an index range scan on the IND_T2 index.
While the application does an index full scan of the same index.
As you can see the SQL statement is using one bind variable, but bind peeking has been disabled disabled (_optim_peek_user_binds = false) and the same bind value is used in both SQL*Plus and the application. In fact, the SQL*Plus session uses the same db user, SQL_ID and the optimizer_env_hash_value as the application.
Statistics exist for both the table and the index, they are not stale and are not changing between the time the application executes the SQL statement and the time it is executed through SQL*Plus. Dynamic sampling is not kicking in for this SQL statement in either scenario.
The SQL statement is executed only once during the day and is not present in the shared pool by the next morning so it is always hard parsed. And this issue is occurring in multiple databases, across 2 versions (10.2.0.4. and 220.127.116.11).
So why are the plans different?
The plan difference is actually due to a datatype mismatch between the bind variable and the column. The Java application declares the bind variable to be a timestamp data type while the column (LAST_UPD) in the table T2 is defined as a date. In the SQL*Plus session, the bind variable was being declared correctly as a date.
Since the application was using an incorrect datatype the optimizer rewrote the where clause predicate to be;
The internal_function is wrapped around the column LAST_UPD to convert it to a timestamp data type. The presence of the function around the column causes the cardinality estimate to be a guess. In this cause the guess is 5% of the rows in the table or 70,414, instead of the correct estimate of 12,782.
By changing the application datatype to be a date instead of a timestamp, they were able to get the application to do an index range scan just like SQL*Plus!Maria Colgan+