Dynamic sampling and its impact on the Optimizer

Dynamic sampling (DS) was introduced in Oracle Database 9i Release 2 to improve the optimizer's ability to generate good execution plans. The most common misconception is that DS can be used as a substitute for optimizer statistics. The goal of DS is to augment the optimizer statistics; it is used when regular statistics are not sufficient to get good quality cardinality estimates.

So how and when will DS be use? During the compilation of a SQL statement, the optimizer decides whether to use DS or not by considering whether the available statistics are sufficient to generate a good execution plan. If the available statistics are not enough, dynamic sampling will be used. It is typically used to compensate for missing or insufficient statistics that would otherwise lead to a very bad plan. For the case where one or more of the tables in the query does not have statistics, DS is used by the optimizer to gather basic statistics on these tables before optimizing the statement. The statistics gathered in this case are not as high a quality or as complete as the statistics gathered using the DBMS_STATS package. This trade off is made to limit the impact on the compile time of the statement.

The second scenario where DS is used is when the statement contains a complex predicate expression and extended statistics are not available. Extended statistics were introduced in Oracle Database 11g Release 1 with the goal to help the optimizer get good quality cardinality estimates for complex predicate expressions. For example, if you had a simple query that has where clause predicates on two correlated columns, standard statistics would not be sufficient. Consider the following SQL statement and its execution plan :

Explain plan for
Select *

From Customers
Where cust_city='Los Angeles'
And  cust_state_province='CA';

Select * from table(dbms_xplan.display());

No_dynamic_sampling.png

With standard statistics the Optimizer estimates the cardinality as 90 rows. The actual number of rows returned by this query is 916. By setting OPTIMIZER_DYNAMIC_SAMPLING to level 4, the optimizer will use dynamic sampling to gather additional information about the complex predicate expression. The additional information provided by dynamic sampling allows the optimizer to generate a more accurate cardinality estimate and therefore a better performing execution plan.

Alter session set optimizer_dynamic_sampling=4;

Explain plan for
Select *
From Customers
Where cust_city='Los Angeles'
And  cust_state_province='CA';

Select * from table(dbms_xplan.display());

with_dynamic_sampling.png


You may now be wondering why we had to set the parameter OPTIMIZER_DYNAMIC_SAMPLING to 4 .  Dynamic sampling is controlled by the parameter OPTIMIZER_DYNAMIC_SAMPLING, which can be set to different levels (0-10). These levels control two different things; when dynamic sampling kicks in and how large a sample size will be used to gather the statistics. The greater the sample size the bigger impact DS has on the compilation time of a query.

Level When Dynamic Sampling will be used Sample size (blocks)
0 Switches off dynamic sampling N/A
1 At least one non-partitioned table in the statement has no statistics 32
2 (default) One or more tables in the statement have no statistics 64
3 Any statement that meets level 2 criteria and any statement that has one or more expressions used in the where clause predicates e.g. Where substr(CUSTLASTNAME,1,3) or Where a + b =5 64
4 Any statement that meets level 3 criteria and any statement that has complex predicates. An OR or AND operator between multiple predicates on the same table 64
5 Any statement that meets level 4 criteria 128
6 Any statement that meets level 4 criteria 256
7 Any statement that meets level 4 criteria 512
8 Any statement that meets level 4 criteria 1024
9 Any statement that meets level 4 criteria 4086
10 All statements All Blocks


In our example we had an AND operator between two simple predicates on the Customers table. From the above table you can now see why I chose to set OPTIMIZER_DYNAMIC_SAMPLING to level 4 in order to have dynamic sampling kick in for our example.

From Oracle Database 11g Release 2 onwards the optimizer will automatically decide if dynamic sampling will be useful and what dynamic sampling level will be used for SQL statements executed in parallel. This decision is based on size of the tables in the statement and the complexity of the predicates. However, if the OPTIMIZER_DYNAMIC_SAMPLING parameter is explicitly set to a non-default value, then that specified value will be honored. You can tell if dynamic sampling kicks in by looks in the note section of the execution plan.
auto_dynamic_sampling.png


For serial SQL statements the dynamic sampling level will depend on the value of the OPTIMIZER_DYNAMIC_SAMPLING parameter and will not be triggered automatically by the optimizer. The reason for this is that serial statements are typically short running and any overhead at compile time could have a huge impact on their performance. Where as we expect parallel statements to be more resource intensive, so the additional overhead at compile time is worth it to ensure we can be best execution plan.

In our original example the SQL statement is serial, which is why we needed to manual set the value for OPTIMIZER_DYNAMIC_SAMPLING parameter.  If we were to issue a similar style of query against a larger table that had the parallel attribute set we can see the dynamic sampling kicking in.

You should also note that setting OPTIMIZER_FEATURES_ENABLE to 9.2.0 or earlier will disable dynamic sampling all together.

When should you use dynamic sampling? DS is typically recommended when you know you are getting a bad execution plan due to complex predicates. However, you should try and use an alter session statement to set the value for OPTIMIZER_DYNAMIC_SAMPLING parameter as it can be extremely difficult to come up with a system-wide setting.

When is it not a good idea to use dynamic sampling? If the queries compile times need to be as fast as possible, for example, unrepeated OLTP queries where you can't amortize the additional cost of compilation over many executions.

Comments:

An excellent article on Dynamic Sampling. Before this i didn't have exact idea on DS. now i'm very clear about DS. Thank you very much.

Posted by Babu on March 10, 2011 at 03:41 AM PST #

This link is simply amazing.It has all the information about DS a person should be knowing. It really helped me to start off with DS.Thank you for the information.

Posted by Suraj on April 14, 2011 at 03:57 PM PDT #

Excellent.

Posted by Bhanu on May 07, 2011 at 07:52 PM PDT #

Excellent post ,on stop information for DS stuff

Posted by Guest on June 06, 2011 at 03:33 PM PDT #

excellent, i was very confused about DS. Now this is very clear.
thank you very much.

Posted by anjeet on July 13, 2011 at 08:18 PM PDT #

What about stale statistics? We are facing issues because when we go live with less data and during initial loads, it changes data distribution drastically and dynamic stats would help.

Posted by guest on November 23, 2011 at 02:26 AM PST #

Maria,

I have a question about dynamic sampling vs extended statistics. Would like to know how extended statistics affects dynamic sampling.

Here is the scenario that i have observed.

1. Dynamic Sampling set to default and no extended statistics
The query uses dynamic sampling at level 5 and comes back with wrong cardinality and query takes long time to execute
2. Dynamic Sampling hint with a value of 4 and no extended statistics
The query uses dynamic sampling and comes back with right cardinality and executes pretty fast.
3. Dynamic Sampling set to default and extended statistics collected
The query still uses dynamic sampling and comes back with wrong cardinality and takes long time to execute
4. Dynamic Sampling hint with a value of 0 and extended statistics collected
The query comes back with right cardinality and executes pretty fast.

Would like to know how dynamic sampling and extended statistics impact each other.

Pranav

Posted by guest on February 08, 2012 at 02:22 PM PST #

Maria,

I would like to understand how extended statistics impact dynamic sampling

Pranav

Posted by Pranav on February 08, 2012 at 02:24 PM PST #

"From Oracle Database 11g Release 2 onwards the optimizer will automatically decide if dynamic sampling will be useful and what dynamic sampling level will be used for SQL statements executed in parallel"

Other than setting any other degree of dynamic sampling, how can this be turned off?

Posted by guest on March 21, 2012 at 11:11 AM PDT #

Really helpful .

Posted by pvd_jha on March 26, 2012 at 10:57 PM PDT #

Thank you Maria, what a good explanation :)

Posted by guest on March 29, 2012 at 04:21 PM PDT #

SIMPLEY GREAT

Posted by guest on June 22, 2012 at 03:41 AM PDT #

Good Article, Thank you.

Posted by guest on July 26, 2012 at 11:04 AM PDT #

good article...

Posted by dhruva Mishra on August 10, 2012 at 08:09 AM PDT #

Excellent description for DS and very useful link

Rgds
Harvinder

Posted by guest on September 21, 2012 at 09:13 PM PDT #

Hi Maria,
Thanks very much for this article. it's very nice explained with example.

Posted by Khalil on December 05, 2012 at 03:30 PM PST #

Very good artical about dynamic sampaling...

Posted by suvajit maity on December 31, 2012 at 06:16 AM PST #

Very helpful post. Many Thanks !!!

Posted by Vicky on January 28, 2013 at 08:27 AM PST #

This is an article I intend to re-read a couple of times.
Good material.

Thanks,
Tom

Posted by Tom on February 02, 2013 at 11:43 AM PST #

Hi,

Excellent explanation. Now i can say, i know something about DS.

One question on the example shown.(Correct me if i am wrong)

Show parameter optimizer_dynamic_sampling is showing 2.

In explain plan of the statement it shows 4 is this normal?

Posted by vin on July 17, 2013 at 11:32 PM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

The Oracle Optimizer blog is written by members of the Optimizer development team. The goal of this blog is to provide an insight into the workings of the Optimizer and the statistics it relies on. The views expressed on this blog are our own and do not necessarily reflect the views of Oracle and its affiliates. The views and opinions expressed by visitors on this blog are theirs solely and may not reflect ours.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today