
Insights into Statistics, Query Optimization and the Oracle Optimizer

Tips on SQL Plan Management and Oracle Database In-Memory - Part 2

Nigel Bayliss
Product Manager

In Part 1 of this series of tips on SQL Plan Management (SPM) and Oracle Database In-Memory, I covered what happens to a SQL plan baseline for a full-table-scan query once the table it scans is populated into the In-Memory column store.

In this part I’m going to cover a scenario where a query has more than one SQL plan baseline: 

  • There is a query (called Q2, for short).
  • Q2 queries a table called MYSALES, which has not yet been populated into the In-Memory column store.
  • Q2 filters rows in MYSALES using a predicate on the SALE_TYPE column.
  • Data in SALE_TYPE is skewed, so there’s an index and a histogram on this column.
  • Because there is data skew, Q2 has two accepted SQL plan baselines: one with a full table scan and one with an index range scan (a sketch of this setup follows below).
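
If you want to picture the setup, a minimal sketch along these lines would reproduce it. The row counts and values here are assumptions for illustration; the real scripts are on GitHub (linked at the end of this post):

-- Assumed setup sketch: MYSALES with a skewed SALE_TYPE column,
-- an index on that column, and a histogram to expose the skew
CREATE TABLE mysales (sale_type NUMBER, val NUMBER);

-- Skewed data: one very popular value (2,000,000 rows) and one
-- rare value (20,001 rows), matching the row counts quoted below
INSERT INTO mysales
SELECT CASE WHEN ROWNUM <= 2000000 THEN 1 ELSE 2 END, ROWNUM
FROM   dual
CONNECT BY LEVEL <= 2020001;
COMMIT;

CREATE INDEX si ON mysales (sale_type);

-- Gather statistics, requesting a histogram on SALE_TYPE
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'MYSALES',
    method_opt => 'FOR COLUMNS sale_type SIZE 254');
END;
/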

You’ve probably come across this situation many times: the Oracle Optimizer must choose between a full table scan and an index range scan depending on predicate selectivity. The ability to change the execution plan based on the value of bind variables is called adaptive cursor sharing. If you’ve not come across that before, you’ll find it useful to check out the section on this topic in the Database SQL Tuning Guide.
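
As an aside, you can see adaptive cursor sharing at work by checking the bind-sensitivity flags on a query’s cursors in V$SQL. A quick sketch, using the SQL_ID you will see in the plan output below:

-- Bind-sensitive cursors may warrant different plans for different
-- bind values; bind-aware cursors actually get them
SELECT child_number, is_bind_sensitive, is_bind_aware, is_shareable
FROM   v$sql
WHERE  sql_id = 'd3u63rk540w0r';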

What’s great about SPM is that it allows you to have multiple SQL plan baselines for individual queries, so you're not forced to pick one plan in preference to another. This capability is most relevant in environments where SQL statements use bind variables and there is a good deal of data skew. Queries like this are likely to have their plans affected by Oracle Database In-Memory because in-memory full table scans have a lower cost than storage-resident table scans. Clearly, the In-Memory column store will shift the point of inflection at which a full table scan becomes more efficient than an index range scan. How is this going to work with SPM?
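
In case you are wondering how a query comes to have two accepted baselines in the first place, one way is to load both plans from the cursor cache using DBMS_SPM (plans loaded like this are accepted by default). A sketch, using the SQL_ID and plan hash values you will see below:

-- Load the full scan and index range scan plans for Q2 as
-- accepted SQL plan baselines (the return value is a plan count)
DECLARE
  n PLS_INTEGER;
BEGIN
  n := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
         sql_id          => 'd3u63rk540w0r',
         plan_hash_value => 3292460164);   -- full table scan
  n := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
         sql_id          => 'd3u63rk540w0r',
         plan_hash_value => 1266559460);   -- index range scan
END;
/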

Take a look at the following example. Q2 executes and matches 2 million rows because I picked the value of bind variable “:val” to do just that. The Optimizer chooses a full table scan: 

PLAN_TABLE_OUTPUT
-----------------
SQL_ID  d3u63rk540w0r, child number 1
-------------------------------------
select /* SPM */ count(*),sum(val) from mysales where sale_type = :val
 
Plan hash value: 3292460164
 
------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |       |       |  2475 (100)|          |
|   1 |  SORT AGGREGATE    |         |     1 |    17 |            |          |
|*  2 |   TABLE ACCESS FULL| MYSALES |  2000K|    32M|  2475   (1)| 00:00:01 |
------------------------------------------------------------------------------

 
Predicate Information (identified by operation id):
---------------------------------------------------
  2 - filter("SALE_TYPE"=:VAL)

Note
-----
  - SQL plan baseline SQL_PLAN_93ct9zmnvtbuhc69cec1f used for this statement
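
Incidentally, plan output like this comes from DBMS_XPLAN.DISPLAY_CURSOR. The run itself looks broadly like the following sketch, in which the bind value of 1 is an assumption (any value matching the 2 million popular rows will do):

-- Run Q2 with a "popular" bind value, then display the plan
-- for the most recently executed cursor
VARIABLE val NUMBER
EXEC :val := 1

select /* SPM */ count(*),sum(val) from mysales where sale_type = :val;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);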

For the second execution, the value of “:val” is set so that it matches only 20,001 rows. This time the Optimizer chooses an index range scan:

PLAN_TABLE_OUTPUT
-----------------
SQL_ID  d3u63rk540w0r, child number 2
-------------------------------------
select /* SPM */ count(*),sum(val) from mysales where sale_type = :val
 
Plan hash value: 1266559460
 
------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |         |       |       |   133 (100)|          |
|   1 |  SORT AGGREGATE                      |         |     1 |    17 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| MYSALES | 20001 |   332K|   133   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | SI      | 20001 |       |    44   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - access("SALE_TYPE"=:VAL)
 
Note
-----
  - SQL plan baseline SQL_PLAN_93ct9zmnvtbuh5d8bf80c used for this statement
 

As you will have figured out, the Optimizer has calculated that the index is less efficient than a full table scan when Q2 matches a large number of rows (2 million in this case), so we have two viable SQL execution plans for this query. Before I ran the queries above, I accepted two SQL plan baselines for Q2. You can see in the “Note” sections above that two different baselines are used (one ending in “80c” and one ending in “c1f”). They can be seen in the dba_sql_plan_baselines view:

SELECT plan_name,sql_text,enabled, accepted 
FROM   dba_sql_plan_baselines
WHERE  sql_text LIKE '%SPM%';
 
PLAN_NAME                           SQL_TEXT                                ENA ACC
----------------------------------- --------------------------------------- --- ---
SQL_PLAN_93ct9zmnvtbuhc69cec1f      select /* SPM */ count(*),sum(val)      YES YES
                                    from mysales where sale_type = :val
SQL_PLAN_93ct9zmnvtbuh5d8bf80c      select /* SPM */ count(*),sum(val)      YES YES
                                    from mysales where sale_type = :val
 

We’re in good shape here. The Optimizer is adapting the query execution plan to take into account bind variable values and data skew. What’s more, SPM is working with us and not against us because it is not forcing Q2 to use a single SQL execution plan.

What happens if we populate MYSALES into the In-Memory column store? 

-- Mark MYSALES with the In-Memory attribute

ALTER TABLE mysales INMEMORY;
 

-- Access MYSALES to trigger population into In-Memory column store

SELECT count(*) FROM mysales;
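
You can confirm that population has completed (and see whether any of the table remains un-populated) by querying V$IM_SEGMENTS, for example:

-- Check In-Memory population status for MYSALES
SELECT segment_name, populate_status, bytes_not_populated
FROM   v$im_segments
WHERE  segment_name = 'MYSALES';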
 

If we execute Q2 to match 2 million rows, the Optimizer continues to choose a full table scan: 

PLAN_TABLE_OUTPUT
-----------------
SQL_ID  d3u63rk540w0r, child number 1
-------------------------------------
select /* SPM */ count(*),sum(val) from mysales where sale_type = :val
 
Plan hash value: 3292460164
 
---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |       |       |   115 (100)|          |
|   1 |  SORT AGGREGATE             |         |     1 |    17 |            |          |
|*  2 |   TABLE ACCESS INMEMORY FULL| MYSALES |  2000K|    32M|   115  (20)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  2 - inmemory("SALE_TYPE"=:VAL)
      filter("SALE_TYPE"=:VAL)
 
Note
-----
  - SQL plan baseline SQL_PLAN_93ct9zmnvtbuhc69cec1f used for this statement
 

The full table scan is now annotated with INMEMORY, so we know that some or all of the data for MYSALES is scanned via the In-Memory column store. The “note” section reports that the same baseline is being used as before (ending in “c1f”). This is good news, and it’s the scenario that was covered in Part 1 of this series.  

What if we execute the query to match only 20,001 rows? You can probably guess what’s coming: the Optimizer judges that the In-Memory full table scan is more efficient than the index range scan:

PLAN_TABLE_OUTPUT
-----------------
SQL_ID  d3u63rk540w0r, child number 2
-------------------------------------
select /* SPM */ count(*),sum(val) from mysales where sale_type = :val
 
Plan hash value: 3292460164
 
---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |       |       |   115 (100)|          |
|   1 |  SORT AGGREGATE             |         |     1 |    17 |            |          |
|*  2 |   TABLE ACCESS INMEMORY FULL| MYSALES | 20001 |   332K|   115  (20)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  2 - inmemory("SALE_TYPE"=:VAL)
      filter("SALE_TYPE"=:VAL)
 
Note
-----
  - SQL plan baseline SQL_PLAN_93ct9zmnvtbuhc69cec1f used for this statement
 

Since there is a SQL plan baseline that allows a full table scan, Q2 can use this access method straight away, and we get immediate benefit from scanning the In-Memory column store!

Hold on a minute! Wasn’t that just a little bit too convenient? I arranged it so that there was a handy full-table-scan SQL plan baseline ready and waiting for when I "flipped the switch" and started using the In-Memory column store. This example might seem a little contrived, but it reflects a real-world scenario, and I chose it to illustrate how SPM works together with both Oracle Database In-Memory and adaptive cursor sharing (and if you want more, there's an earlier blog post on how adaptive cursor sharing interacts with SPM).

If, instead, I had started out with a single baseline that specified an index range scan, then this is the plan that would have been used even after MYSALES was populated into the In-Memory column store (and we would not have had an INMEMORY FULL scan). That’s not a bad thing; it is exactly what plan stability means and it is how SPM is meant to work. In the example above, I made use of a couple of SQL execution plans that were validated and accepted before I populated the In-Memory column store. In the more general case, where the Optimizer identifies a brand-new execution plan for use with the In-Memory column store, we might want to validate it before we allow the database to use it in our critical application. How can we do that? Happily, this is exactly what SPM plan evolution was built for, and it goes all the way back to the initial scenario I mentioned in Part 1. I'll cover the details in Part 3.
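
As a taster, the classic evolution API looks something like the following sketch. Treat it as an outline only: the SQL handle placeholder would come from dba_sql_plan_baselines, and the verify/commit settings shown here just produce a report without accepting anything:

-- Verify any unaccepted plans for Q2 against its baseline
DECLARE
  report CLOB;
BEGIN
  report := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE(
              sql_handle => '&sql_handle',   -- substitute Q2's SQL handle
              verify     => 'YES',           -- test-execute unaccepted plans
              commit     => 'NO');           -- report only; do not accept
  DBMS_OUTPUT.PUT_LINE(report);
END;
/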

If you want to try out this example for yourself, the scripts are on GitHub.
