Dynamic sampling (DS) was introduced to improve the optimizer's ability to generate good execution plans. This feature was enhanced and renamed Dynamic Statistics in Oracle Database 12c. The most common misconception is that DS can be used as a substitute for optimizer statistics, whereas the goal of DS is to augment optimizer statistics; it is used when regular statistics are not sufficient to get good quality cardinality estimates.
So how and when will dynamic statistics be used? During the compilation of a SQL statement, the optimizer decides whether to use DS by considering whether the available statistics are sufficient to generate a good execution plan; if they are not, DS will be used. It is typically used to compensate for missing or insufficient statistics that would otherwise lead to a very bad plan. When one or more of the tables in the query has no statistics, the optimizer uses DS to gather basic statistics on those tables before optimizing the statement. The statistics gathered in this case are not as high quality or as complete as the statistics gathered using the DBMS_STATS package; this trade-off is made to limit the impact on the compile time of the statement.
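As a sketch of this behavior, consider a hypothetical staging table (SALES_STAGE is an assumed name, not from the example schema) created without gathering statistics. The Note section of the plan output reveals that dynamic sampling was used; the exact level and wording vary by database version:

```sql
-- Hypothetical table created with no optimizer statistics gathered
CREATE TABLE sales_stage AS SELECT * FROM sales;

EXPLAIN PLAN FOR
SELECT * FROM sales_stage WHERE amount_sold > 100;

SELECT * FROM table(dbms_xplan.display);

-- The Note section of the plan output will include a line such as:
--   - dynamic statistics used: dynamic sampling (level=2)
```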
DS may be used if a SQL statement contains a complex predicate expression and extended statistics are not available. Extended statistics were introduced in Oracle Database 11g Release 1 with the goal of helping the optimizer get good quality cardinality estimates for complex predicate expressions. For example, if you had a simple query with where clause predicates on two correlated columns, standard statistics would not be sufficient. Consider the following SQL statement and its execution plan:
```sql
EXPLAIN PLAN FOR
SELECT *
FROM   customers
WHERE  cust_city = 'Los Angeles'
AND    cust_state_province = 'CA';

SELECT * FROM table(dbms_xplan.display(format=>'basic rows'));

-----------------------------------------------
| Id  | Operation         | Name      | Rows |
-----------------------------------------------
|   0 | SELECT STATEMENT  |           |    8 |
|   1 |  TABLE ACCESS FULL| CUSTOMERS |    8 |
-----------------------------------------------
```
With standard statistics the optimizer estimates the cardinality as 8 rows, but the actual number of rows returned by this query is 916. Without extended statistics, the optimizer does not know that there is a perfect correlation between the city 'Los Angeles' and the state 'CA'. By setting optimizer_dynamic_sampling to level 4, we direct the optimizer to use dynamic sampling to gather additional information about this predicate expression, which allows it to generate a more accurate cardinality estimate.
```sql
ALTER SESSION SET optimizer_dynamic_sampling=4;

EXPLAIN PLAN FOR
SELECT *
FROM   customers
WHERE  cust_city = 'Los Angeles'
AND    cust_state_province = 'CA';

SELECT * FROM table(dbms_xplan.display(format=>'basic rows'));

-----------------------------------------------
| Id  | Operation         | Name      | Rows |
-----------------------------------------------
|   0 | SELECT STATEMENT  |           |  916 |
|   1 |  TABLE ACCESS FULL| CUSTOMERS |  916 |
-----------------------------------------------
```
In this case the improved cardinality estimate has not affected our SQL execution plan, but for more complicated queries a better cardinality estimate will often result in a better SQL execution plan, which will in turn result in a faster query execution time.
You may now be wondering why we had to set the parameter optimizer_dynamic_sampling to 4. The dynamic statistics feature is controlled by the parameter optimizer_dynamic_sampling, which can be set to different levels (0-11). These levels control two different things: when dynamic sampling kicks in, and how large a sample size will be used to gather the statistics. The greater the sample size, the bigger the impact DS has on the compilation time of a query.
| Level | When Dynamic Statistics will be used | Sample size (blocks) |
|-------|--------------------------------------|----------------------|
| 0 | Switches off dynamic statistics | N/A |
| 1 | At least one non-partitioned table in the statement has no statistics | 32 |
| 2 (default) | One or more tables in the statement have no statistics. This setting is suitable for the majority of systems | 64 |
| 3 | Any statement that meets level 2 criteria and any statement that has one or more expressions used in the where clause predicates, e.g. WHERE substr(CUSTLASTNAME,1,3) or WHERE a + b = 5 | 64 |
| 4 | Any statement that meets level 3 criteria and any statement that has complex predicates, e.g. an OR or AND operator between multiple predicates on the same table | 64 |
| 5 | Any statement that meets level 4 criteria | 128 |
| 6 | Any statement that meets level 4 criteria | 256 |
| 7 | Any statement that meets level 4 criteria | 512 |
| 8 | Any statement that meets level 4 criteria | 1024 |
| 9 | Any statement that meets level 4 criteria | 4096 |
| 10 | All statements | All blocks |
| 11 | The database determines automatically if dynamic statistics are required | Automatically determined |
For more information on dynamic sampling levels, refer to Chapter 12 of the SQL Tuning Guide for Oracle Database 12c Release 2.
In our example we had an AND operator between two simple predicates on the CUSTOMERS table. From the above table you can now see why I chose to set optimizer_dynamic_sampling to level 4 in order to have dynamic sampling kick in for our example.
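If you would rather not change the session setting, the same level can be requested for a single statement with the DYNAMIC_SAMPLING hint. A minimal sketch using the cursor-level form of the hint, which takes just the level:

```sql
SELECT /*+ dynamic_sampling(4) */ *
FROM   customers
WHERE  cust_city = 'Los Angeles'
AND    cust_state_province = 'CA';
```

A hint like this limits the additional compile-time cost to the one statement you are tuning, rather than every statement in the session.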
From Oracle Database 11g Release 2 onwards, the optimizer automatically decides whether dynamic sampling will be useful, and what dynamic sampling level to use, for SQL statements executed in parallel. This decision is based on the size of the tables in the statement and the complexity of the predicates. You can tell whether dynamic sampling kicked in by looking in the Note section of the execution plan.
For serial SQL statements the dynamic sampling level is controlled by the optimizer_dynamic_sampling parameter but note that from Oracle Database 12c Release 1 the existence of SQL plan directives can also initiate dynamic statistics gathering when a query is compiled. This is a feature of adaptive statistics and is controlled by the database parameter optimizer_adaptive_features (OAF) in Oracle Database 12c Release 1 and optimizer_adaptive_statistics (OAS) in Oracle Database 12c Release 2. In other words, from Oracle Database 12c Release 1 onwards, DS will be used if certain adaptive features are enabled by setting the relevant parameter to TRUE. To summarize:
| Database Version | Name of Database Parameter Controlling Relevant Adaptive Features | Default Value |
|------------------|-------------------------------------------------------------------|---------------|
| Oracle Database 12c Release 1 | optimizer_adaptive_features (OAF) | TRUE |
| Oracle Database 12c Release 2 onwards | optimizer_adaptive_statistics (OAS) | FALSE |
Serial statements are typically short running and any DS overhead at compile time can have a large impact on overall system performance (if statements are frequently hard parsed). For systems that match this profile, setting OAF=FALSE is recommended. For Oracle Database 12c Release 2 onwards, using the default OAS=FALSE is recommended.
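To see whether SQL plan directives that could trigger dynamic statistics gathering exist for a given schema, you can query the directive views. A sketch (the SH schema name is just an example, and the exact columns available vary slightly between 12c releases):

```sql
SELECT d.directive_id, o.owner, o.object_name, d.type, d.state, d.reason
FROM   dba_sql_plan_directives  d,
       dba_sql_plan_dir_objects o
WHERE  d.directive_id = o.directive_id
AND    o.owner = 'SH';
```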
Parallel statements are generally more resource intensive, so it's often worth investing in additional overhead at compile time to potentially find a better SQL execution plan.
In our original example the SQL statement is serial, which is why we needed to manually set the value of the optimizer_dynamic_sampling parameter (and we'll assume that there were no relevant SQL plan directives). If we were to issue a similar style of query against a larger table that had the parallel attribute set, we would see dynamic sampling kick in automatically.
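For example, a sketch assuming a hypothetical CUSTOMERS2 table large enough for the optimizer to consider dynamic sampling worthwhile (the level reported in the Note section varies by version):

```sql
-- Hypothetical larger copy of CUSTOMERS with the parallel attribute set
ALTER TABLE customers2 PARALLEL 4;

EXPLAIN PLAN FOR
SELECT *
FROM   customers2
WHERE  cust_city = 'Los Angeles'
AND    cust_state_province = 'CA';

SELECT * FROM table(dbms_xplan.display);

-- The Note section will include a line such as:
--   - dynamic statistics used: dynamic sampling (level=AUTO)
```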
When should you use dynamic sampling? DS is typically recommended when you know you are getting a bad execution plan due to complex predicates. However, try to use an ALTER SESSION statement to set the value of the optimizer_dynamic_sampling parameter, as it can be extremely difficult to come up with a system-wide setting that suits every statement.
When is it not a good idea to use dynamic sampling? When compile times need to be as fast as possible, for example, for unrepeated OLTP queries where you can't amortize the additional cost of compilation over many executions.