SQL Plan Management (SPM) is an Oracle database feature that allows you to establish a set of SQL execution plans that will be used even if the database is subject to changes that would otherwise cause execution plan changes to occur. For example, you might have an end-of-day batch run that operates in a business context where there are extreme peaks and troughs in daily volume, or perhaps you are upgrading a database and want to be sure that plans are carried over (at least initially). You do not have to fix execution plans in stone with SPM, you can use plan evolution to automate the process of finding improved plans, improving performance in a controlled way and at your own pace. If you’re not familiar with SPM, a very good place to start is to take a look at Maria Colgan’s four-part blog post on the subject. It gives you all the tools you need to get started.
If you are using SPM in Oracle Database 11gR2 or 12c, or if you are considering whether you should use it, then this blog post is for you. I decided to publish this post because I recently encountered a couple of environments that ran into, let’s say, “difficulties” with SPM when capturing SQL plan baselines automatically and continuously over a very long period of time (more than a year in fact). I’d like to give you a few pointers to avoid running into the same problems and why automatic SQL baseline capture was never intended to be used in that way.
The first step in using SPM is to create SQL plan baselines. You can capture SQL plan baselines in a number of different ways, but the key idea is that you should capture a representative set of SQL statements for the workloads you want to have baselines for. Typically, this means that you will capture SQL plan baselines for a fixed period of time – perhaps during the end of quarter batch run or for a weeks-worth of query processing. One way to capture SQL plan baselines is to set the database parameter optimizer_capture_sql_plan_baselines to TRUE so that all previously unseen repeatable SQL statements executed against the database will be captured automatically to create new “accepted” SQL plan baselines. This “auto capture” mechanism is very easy to use, but there is often a temptation to leave it enabled for an indefinite period of time. Don’t do this! It is of course possible to do it, but it rarely offers any significant advantage over a more targeted approach and it has implications for space usage in the data dictionary that you need to be aware of.
Capturing SQL plan baselines indefinitely means that the plan for every repeatable SQL statement must be stored in the data dictionary. Over the course of a year or more, a large database application can generate hundreds of thousands or even millions of repeatable SQL statements. This will not only consume a significant amount of space in the data dictionary, but you will probably collect a lot of “dead wood” that is unlikely to benefit you as much as you think. Also, if there are hundreds of thousands of baselines captured, it can take many hours to purge them if you are not regularly doing this.
Check your alert logs to see if you have messages like this, because it is the first indication that you have exceeded the configured space budget percent for SPM:
SPM: SMB space usage (99215979367) exceeds 10.000000% of SYSAUX size (1018594954366).
This message doesn't necessarily signify that there is an urgent issue and the database will continue to work normally. That's why it can go unnoticed! Note also that it is only a warning, and the database does not enforce any hard upper limit for the space consumed by SPM (this might be changed/enhanced in a future version, but don’t take this as any product commitment and base any product purchase decision on it!).
The data dictionary contains a number of tables that store all of the data associated with SPM. These tables and their associated indexes are stored in an area known as the SQL Management Base (SMB). This is the meaning of "SMB" in the alert log message, above. You can execute the following query to find out how much space the SMB is using:
SELECT occupant_desc, space_usage_kbytes FROM v$sysaux_occupants WHERE occupant_name='SQL_MANAGEMENT_BASE';
The tables and indexes associated with the SMB might be among the top space consumers inside the data dictionary if there are a very large number of baselines. Here’s an example of what that can look like:
SELECT * FROM ( SELECT bytes/(1024*1024) MBytes, tablespace_name, segment_name, segment_type FROM dba_segments WHERE owner = 'SYS' ORDR BY bytes DESC) WHERE rownum <= 20 /
MBYTES TABLESPACE_NAME SEGMENT_NAME SEGMENT_TYPE ---------- ----------------- ------------------ ------------- 66735 SYSAUX SQL$TEXT TABLE 25420 SYSAUX SQLOBJ$AUXDATA TABLE 10089 SYSAUX SQLOBJ$_PKEY INDEX 9447 SYSAUX I_SQLOBJ$NAME_TYPE INDEX 7665 SYSAUX SQLOBJ$DATA_PKEY INDEX …
The Oracle database includes an automatic mechanism for purging SPM baselines that have not been used for longer than the plan retention period, as identified by the LAST_EXECUTED column in the DBA_SQL_PLAN_BASELINES view (see below). Here's an example of how to set the retention period to 10 weeks:
The default retention period is 53 weeks (see sub-section 23.7.2 in the Oracle Database SQL Tuning Guide). Some 11g versions seem to suffer from a problem that auto-purging does not work successfully if there are a very large number of SQL plan baselines. If you have a large number of SQL plan baselines you might want to check out bug #16996646 (see My Oracle Support). Or simply use the work-around to execute the purge manually:
VARIABLE ret NUMBER; BEGIN :ret := sys.dbms_spm_internal.auto_purge_sql_plan_baseline; END; / PRINT ret;
The SMB records a LAST_EXECUTED time-stamp against every baseline, making it possible to find out when each one was last used by a SQL statement. For example, this query counts the number of query baselines, aggregating by the age of last execution in weeks:
SELECT FLOOR(EXTRACT(DAY FROM (SYSDATE - last_executed)) / 7) AS "Weeks Since Last Execute", COUNT(*) AS "Num Baselines" FROM dba_sql_plan_baselines WHERE accepted = 'YES' AND autopurge = 'YES' GROUP BY FLOOR(EXTRACT(DAY FROM (SYSDATE - last_executed)) / 7) ORDER BY 1;
Note that the “last executed” timestamp is not refreshed continuously but on a weekly basis, so expect to see the timestamps for active baselines to take a least a week to reflect that they have been used.
The following query counts how many baselines have not been used for a period exceeding the plan retention period. For example, if you find SQL plan baselines that have not been used for longer than the retention time, then this can indicate that SQL plan baselines are not being auto purged successfully:
SELECT count(*) AS " Purgeable Baselines Count" FROM dba_sql_plan_baselines WHERE autopurge = 'YES' AND EXTRACT(DAY FROM (SYSDATE - last_executed)) / 7 > (SELECT parameter_value FROM dba_sql_management_config WHERE parameter_name = 'PLAN_RETENTION_WEEKS');
If you find that you have a large number of SQL plan baselines that have not been used for a significant time, then it is worth considering purging them individually and (optionally) reclaiming the space that was used in the SYSAUX tablespace. See Doc IDs 790039.1 and 1499542.1 in My Oracle Support which cover how to do this. Alternatively, consider setting an appropriate retention period for the auto purge process and execute the purge procedure manually if you are using Oracle Database 11gR2. Don’t only clean up, but also spend a couple of minutes thinking about how you got such a large number of SQL plan baselines and why they were not used for a long time. Maybe blindly using auto-capture is the culprit, in which case you are asking for doing the same exercise in a year from now at the latest …
If you want to use automatic baseline capture then it is best to capture SQL plan baselines for representative and smoothly-running workloads for fixed periods of time. This is usually sufficient to achieve very good results. Nevertheless, if you do want to use if for periods of days or weeks, then it is particularly necessary to be aware of your SPM storage management settings and you should monitor your alert log for SPM space usage messages and use manual purge procedures if the auto purge job is not clearing down expired SQL plan baselines successfully.