X

Insights into Statistics, Query Optimization and the Oracle Optimizer

Dynamic sampling and its impact on the Optimizer

By: Nigel Bayliss | Product Manager

Dynamic sampling (DS) was introduced to improve the optimizer's ability to generate good execution plans. This feature was enhanced and renamed Dynamic Statistics in Oracle Database 12c. The most common misconception is that DS can be used as a substitute for optimizer statistics, whereas the goal of DS is to augment optimizer statistics; it is used when regular statistics are not sufficient to get good quality cardinality estimates.

So how and when will dynamic statistics be used? During the compilation of a SQL statement, the optimizer decides whether to use DS or not by considering whether the available statistics are sufficient to generate a good execution plan. If the available statistics are not enough, DS will be used. It is typically used to compensate for missing or insufficient statistics that would otherwise lead to a very bad plan. For the case where one or more of the tables in the query does not have statistics, DS is used by the optimizer to gather basic statistics on these tables before optimizing the statement. The statistics gathered in this case are not as high a quality or as complete as the statistics gathered using the DBMS_STATS package. This trade off is made to limit the impact on the compile time of the statement.

DS may be used if a SQL statement contains a complex predicate expression and extended statistics are not available. Extended statistics were introduced in Oracle Database 11g Release 1 with the goal of helping the optimizer get good quality cardinality estimates for complex predicate expressions. For example, if you had a simple query that has where clause predicates on two correlated columns, standard statistics would not be sufficient. Consider the following SQL statement and its execution plan :

EXPLAIN PLAN FOR
SELECT *
FROM   CUSTOMERS
WHERE  cust_city           = 'Los Angeles'
AND    cust_state_province = 'CA';

SELECT * FROM table(dbms_xplan.display(format=>'basic rows'));

-----------------------------------------------
| Id  | Operation         | Name      | Rows  |
-----------------------------------------------
|   0 | SELECT STATEMENT  |           |     8 |
|   1 |  TABLE ACCESS FULL| CUSTOMERS |     8 |
-----------------------------------------------

With standard statistics the optimizer estimates the cardinality as 8 rows. The actual number of rows returned by this query is 916. Without extended statistics, the optimizer does not know that there is a perfect correlation between "Los Angeles" and the state "CA" but by setting optimizer_dynamic_sampling to level 4, the optimizer will use dynamic sampling to gather additional information about this predicate expression. The additional information provided by dynamic sampling allows the optimizer to generate a more accurate cardinality estimate.

ALTER SESSION SET optimizer_dynamic_sampling=4;

EXPLAIN PLAN FOR
SELECT *
FROM   customers
WHERE  cust_city           = 'Los Angeles'
AND    cust_state_province = 'CA';

SELECT * FROM table(dbms_xplan.display(format=>'basic rows'));

-----------------------------------------------
| Id  | Operation         | Name      | Rows  |
-----------------------------------------------
|   0 | SELECT STATEMENT  |           |   916 |
|   1 |  TABLE ACCESS FULL| CUSTOMERS |   916 |
-----------------------------------------------

In this case the improved cardinality estimate has not affected our SQL execution plan, but for more complicated queries a better cardinality estimate will often result in a better SQL execution plan, which will in turn result in a faster query execution time.

You may now be wondering why we had to set the parameter optimizer_dynamic_sampling to 4 .  The dynamic statistics feature is controlled by the parameter optimizer_dynamic_sampling, which can be set to different levels (0-11). These levels control two different things; when dynamic sampling kicks in and how large a sample size will be used to gather the statistics. The greater the sample size the bigger impact DS has on the compilation time of a query.

Level When Dynamic Statistics will be used Sample size (blocks)
0 Switches off dynamic statistics N/A
1 At least one non-partitioned table in the statement has no statistics 32
2 (default)

One or more tables in the statement have no statistics

This setting is suitable for the majority of systems

64
3 Any statement that meets level 2 criteria and any statement that has one or more expressions used in the where clause predicates e.g. Where substr(CUSTLASTNAME,1,3) or Where a + b =5 64
4 Any statement that meets level 3 criteria and any statement that has complex predicates. An OR or AND operator between multiple predicates on the same table 64
5 Any statement that meets level 4 criteria 128
6 Any statement that meets level 4 criteria 256
7 Any statement that meets level 4 criteria 512
8 Any statement that meets level 4 criteria 1024
9 Any statement that meets level 4 criteria 4086
10 All statements All Blocks
11 The database determines automatically if dynamic statistics are required Automatically determined


There more information on dynamic sampling levels, refer to the Chapter 12 of the SQL Tuning Guide for Oracle Database 12c Release 2.

In our example we had an AND operator between two simple predicates on the CUSTOMERS table. From the above table you can now see why I chose to set optimizer_dynamic_sampling to level 4 in order to have dynamic sampling kick in for our example.

From Oracle Database 11g Release 2 onwards, the optimizer automatically decides if dynamic sampling will be useful and what dynamic sampling level will be used for SQL statements executed in parallel. This decision is based on size of the tables in the statement and the complexity of the predicates. You can tell if dynamic sampling kicks in by looks in the note section of the execution plan.

auto_dynamic_sampling.png


For serial SQL statements the dynamic sampling level is controlled by the optimizer_dynamic_sampling parameter but note that from Oracle Database 12c Release 1 the existence of SQL plan directives can also initiate dynamic statistics gathering when a query is compiled. This is a feature of adaptive statistics and is controlled by the database parameter optimizer_adaptive_features (OAF) in Oracle Database 12c Release 1 and  optimizer_adaptive_statistics (OAS) in Oracle Database 12c Release 2.  In other words, from Oracle Database 12c Release 1 onwards, DS will be used if certain adaptive features are enabled by setting the relevant parameter to TRUE. To summarize:

Database Version Name of Database Parameter Controlling Relevant Adaptive Features Default Value
Oracle Database 12c Release 1 optimizer_adaptive_features (OAF) TRUE
Oracle Database 12c Release 2 Onwards optimizer_adaptive_statistics (OAS) FALSE

 

Serial statements are typically short running and any DS overhead at compile time can have a large impact on overall system performance (if statements are frequently hard parsed). For systems that match this profile, setting OAF=FALSE is recommended. For Oracle Database 12c Release 2 onwards, using the default OAS=FALSE is recommended.

Parallel statements are generally more resource intensive, so it's often worth investing in additional overhead at compile time to potentially find a better SQL execution plan.

In our original example the SQL statement is serial, which is why we needed to manual set the value for optimizer_dynamic_sampling parameter (and we'll assume that there were no relevant SQL plan directives).  If we were to issue a similar style of query against a larger table that had the parallel attribute set we can see the dynamic sampling kicking in.

When should you use dynamic sampling? DS is typically recommended when you know you are getting a bad execution plan due to complex predicates. However, you should try and use an alter session statement to set the value for optimizer_dynamic_sampling parameter as it can be extremely difficult to come up with a system-wide setting.

When is it not a good idea to use dynamic sampling? If the queries compile times need to be as fast as possible, for example, unrepeated OLTP queries where you can't amortize the additional cost of compilation over many executions.

Join the discussion

Comments ( 28 )
  • Babu Thursday, March 10, 2011
    An excellent article on Dynamic Sampling. Before this i didn't have exact idea on DS. now i'm very clear about DS. Thank you very much.
  • Suraj Thursday, April 14, 2011
    This link is simply amazing.It has all the information about DS a person should be knowing. It really helped me to start off with DS.Thank you for the information.
  • Bhanu Sunday, May 8, 2011
    Excellent.
  • Guest Monday, June 6, 2011
    Excellent post ,on stop information for DS stuff
  • anjeet Thursday, July 14, 2011

    excellent, i was very confused about DS. Now this is very clear.

    thank you very much.


  • guest Wednesday, November 23, 2011

    What about stale statistics? We are facing issues because when we go live with less data and during initial loads, it changes data distribution drastically and dynamic stats would help.


  • guest Wednesday, February 8, 2012

    Maria,

    I have a question about dynamic sampling vs extended statistics. Would like to know how extended statistics affects dynamic sampling.

    Here is the scenario that i have observed.

    1. Dynamic Sampling set to default and no extended statistics

    The query uses dynamic sampling at level 5 and comes back with wrong cardinality and query takes long time to execute

    2. Dynamic Sampling hint with a value of 4 and no extended statistics

    The query uses dynamic sampling and comes back with right cardinality and executes pretty fast.

    3. Dynamic Sampling set to default and extended statistics collected

    The query still uses dynamic sampling and comes back with wrong cardinality and takes long time to execute

    4. Dynamic Sampling hint with a value of 0 and extended statistics collected

    The query comes back with right cardinality and executes pretty fast.

    Would like to know how dynamic sampling and extended statistics impact each other.

    Pranav


  • Pranav Wednesday, February 8, 2012

    Maria,

    I would like to understand how extended statistics impact dynamic sampling

    Pranav


  • guest Wednesday, March 21, 2012

    "From Oracle Database 11g Release 2 onwards the optimizer will automatically decide if dynamic sampling will be useful and what dynamic sampling level will be used for SQL statements executed in parallel"

    Other than setting any other degree of dynamic sampling, how can this be turned off?


  • pvd_jha Tuesday, March 27, 2012

    Really helpful .


  • guest Thursday, March 29, 2012

    Thank you Maria, what a good explanation :)


  • guest Friday, June 22, 2012

    SIMPLEY GREAT


  • guest Thursday, July 26, 2012

    Good Article, Thank you.


  • dhruva Mishra Friday, August 10, 2012

    good article...


  • guest Saturday, September 22, 2012

    Excellent description for DS and very useful link

    Rgds

    Harvinder


  • Khalil Wednesday, December 5, 2012

    Hi Maria,

    Thanks very much for this article. it's very nice explained with example.


  • suvajit maity Monday, December 31, 2012

    Very good artical about dynamic sampaling...


  • Vicky Monday, January 28, 2013

    Very helpful post. Many Thanks !!!


  • Tom Saturday, February 2, 2013

    This is an article I intend to re-read a couple of times.

    Good material.

    Thanks,

    Tom


  • vin Thursday, July 18, 2013

    Hi,

    Excellent explanation. Now i can say, i know something about DS.

    One question on the example shown.(Correct me if i am wrong)

    Show parameter optimizer_dynamic_sampling is showing 2.

    In explain plan of the statement it shows 4 is this normal?


  • guest Monday, December 1, 2014

    Really useful material and very well described. Thanks guys


  • guest Wednesday, March 11, 2015

    A very good explanation of dynamic sampling which I have idea on this concept before. Thanks for this article.


  • JR Sunday, April 10, 2016

    Excellent article.


  • SRINI Friday, February 17, 2017

    Nice Stuff...Thanks for your time!

    Hope this is useful for DWH


  • Grit Sunday, March 12, 2017

    Beautiful article..!! with good explanation.

    Thank you,

    Grit


  • RAMA Wednesday, January 24, 2018
    Its worth reading .


    Thx ,
    Rama
  • Gerrit Haase Monday, February 12, 2018
    Nice table about what each level means.
    I wonder, why is this not part of the documentation?

    https://docs.oracle.com/database/121/REFRN/GUID-43655FC3-3C32-486B-8B11-8C20C152618D.htm#REFRN10140

    Regards,
    Gerrit
  • Nigel Bayliss Monday, February 12, 2018
    Hi Gerrit,

    Check out the SQL tuning guide. Chapter 12 in Oracle Database 12c Release 2...

    https://docs.oracle.com/en/database/oracle/oracle-database/12.2/tgsql/options-for-optimizer-statistics-gathering.html
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha
Oracle

Integrated Cloud Applications & Platform Services