Insights into Statistics, Query Optimization and the Oracle Optimizer

  • 12c
    March 23, 2017

Efficient Statistics Maintenance for Partitioned Tables Using Incremental Statistics – Part 3

Nigel Bayliss
Product Manager


This post covers how Oracle has improved incremental statistics for partitioned tables in Oracle Database 12c Release 2. If you’re not already familiar with synopses in the context of incremental statistics then take a look at Part 1 and Part 2 before you read on.

Beginning with Oracle Database 12c Release 2 there's a new synopsis format that’s significantly more compact than the format used in earlier releases.  For brevity, I’ll refer to the Oracle Database 12c Release 2 format as new and the previous format as old.

This enhancement is particularity relevant if you looked at incremental statistics maintenance in the past but decided not to use it because of the additional space usage in SYSAUX. Oracle Database 12c Release 2 resolves this issue and, in many cases, reduces the amount of system resource required to manage synopses.

A New Type of Synopsis

A synopsis is metadata stored in a couple of tables in the data dictionary (SYS.WRI$_OPTSTAT_SYNOPSIS$ and SYS.WRI$_OPTSTAT_SYNOPSIS_HEAD$). The metadata is managed automatically by the Oracle Database, so there’s generally no reason to be aware of the underlying implementation. Nevertheless, I’ll cover some of the details here because it will help you to see why the change is so significant.

Synopses in Oracle Database 12c Release 2 are now created (by default) using an algorithm called HyperLogLog (HLL). This is a state-of-the-art algorithm that calculates the approximate number of distinct values for table column values. Even though it is an approximation, it is nevertheless very accurate with a typical error rate of less than 2%. Prior to Oracle Database 12c Release 2, synopses consisted of rows stored in the WRI$_OPTSTAT_SYNOPSIS$ table. The number of rows in this table can be very large if there are a large number of partitions and table columns, and if the columns contain a large number of distinct values. New-style synopses do not store rows in this table. Instead, some additional (and compact) hash data is stored in the WRI$_OPTSTAT_SYNOPSIS_HEAD$ table (in the SPARE2 column).

How much smaller are the new synopses? Well, as you've probably guessed, it's the consultants' answer: "It Depends". As outlined above, the space used by synopses is a function of the number of partitions, columns and distinct values in columns. I can give you an example from one of our test systems containing an 8TB test table with 84-partitions. The total size of the old-style synopses was around 160MB and the new-style synopses totaled only 6MB. The example I created in GitHub (see below) was contrived to generate a particularly large amount of synopsis data. In the example I've given, new-style synopses take up virtually no space at all and the old-style synopses take up about 160MB (in SYSAUX).

How to Use Them

What do you need to do to use new-style synopses? Nothing! They are used by default if you choose to use incremental statistics and you don’t need to do anything different in Oracle Database 12c Release 2 compared to earlier releases.

Controlling Synopses

You can control the type of that will be created using a DBMS_STATS preference called APPROXIMATE_NDV_ALGORITHM.

The default is REPEAT OR HYPERLOGLOG: if a table is using old-style synopses then it will continue to do so, and tables using new-style synopses will continue to use those!

There’s no reason to use anything other than the default unless you are upgrading a database to Oracle Database 12c Release 2. If this is the case then you might want to consider the options. That's covered next.


If you are upgrading a database that’s using incremental statistics, then you will want to migrate to using the new-style synopses. How do you go about doing that? It’s worth noting from the outset that it’s possible to have partitions with old-style and new-style synopses in the same table. Also, the good news is that you can control when and how to transition from one type of synopses to the other.

There is a DBMS_STATS preference called INCREMENTAL_STALENESS. It the controls whether or not you want to allow partitions within an individual table to have different types of synopses during the transition period from old-style to new-style. Let’s look at the different scenarios and how to proceed after you have upgraded to Oracle Database 12c Release 2. There is a spectrum of choice from “very conservative” (i.e., maintaining old behaviors) to “aggressive” (i.e., taking advantage of new features immediately). The chart below describes the different scenarios from the most conservative cases to most aggressive cases.

Use-case Action
Initially, you want to continue to use old-format synopses for all tables.

We recommend that you use the new-style synopses, but can choose to use them later on if you prefer.
The algorithm used prior to HLL is called adaptive sampling.

            'ADAPTIVE SAMPLING')
You want tables using old-style synopses to continue to use them.

Newly created incrementally-managed tables will use new-style synopses. Incrementally-managed tables without synopses will use new-style when statistics are gathered.

Incrementally-managed tables with old-style synopses will continue to use them.

No action.

This is the default behavior.


You have some very large partitioned tables. They are using old-style synopses and you want to gradually replace the old with the new. Old-format synopses are not immediately replaced and new partitions will have synopses in the new format. Mixed formats will yield less accurate statistics but the advantage is that there is no need to re-gather all table statistics in the foreground. The statistics auto job will gradually re-gather statistics on partitions with old format synopses and generate new format synopses. Eventually, new format synopses will be used for all partitions and statistics will be accurate.


Note that INCREMENTAL_STALENESS preference must have the value ALLOW_MIXED_FORMAT, but it does not need to be set explicity (unless you've changed it) because it is the default setting.
You have time to re-gather all statistics. Incrementally managed tables are using old-style synopses and you want to replace the old-style with the new immediately.

If you have a window of time to completely re-gather statistics for partitioned tables, then this is the recommended approach.


You also need to specify that you don’t want a mix of old synopses and new synopses in the same table:


You need to take some care here. The preference value should be set to 'NULL' (in quotes) and not NULL (without quotes). NULL (without quotes) sets a preference to its default value, which in this case is ALLOW_MIXED_FORMAT.

Once these preferences are set you will need to re-gather the table’s statistics.

Remember that you can also set DBMS_STATS preferences (such as APPROXIMATE_NDV_ALGORITHM) at the database, global and schema-level as well as at the table level (as per the examples above).


The synopsis format in Oracle Database 12c Release 2 is much more compact than the previous format. If your database is very large, expect to save a lot of space in SYSAUX while maintaining very good accuracy for your statistics. You can expect  the system overhead required to manage synopses to drop too (for example, when you exchange partitions). For more on this and some example scripts, take a look at GitHub.

If you have comments on this post or the scripts in GitHub, please go ahead below.

Join the discussion

Comments ( 12 )
  • Rajeshwaran Tuesday, March 28, 2017

    Team - Thanks for this post.

    gone through the scripts at the github, very difficult to follow, expecially dont know in which order should i go through them.

    It would be really cool if you could walk through the set of scripts available in the gituhub, one by one, with some brief explanation.

  • guest Friday, April 7, 2017

    Hi nbayliss-oracle,

    Are You plan to add some possibility (for user) to tune HLL algorithm for example how much hash values can be stored in base HLL structure (for example by adding some parameters to dbms_stats and APPROX_COUNT_DISTINCT) ? As if we use bigger internal structure for HLL we can have much accurate data and for some tables (for example smaller tables) we can use much smaller structure to have very close estimate.

    Another question, APPROX_COUNT_DISTINCT use HLL or Adaptive Sampling algorithms as it was introduced in 12.1 but HLL as You said in 12.2 ?

    Best Regards

  • NigelBayliss Tuesday, April 11, 2017

    Hi - I agree with you. We've been migrating the scripts to a new location and I can see that I need to brush up a lot of the README files. Expect me to do that in the next week or two.

    This is the new location:


    I will update the links in the blog posts when I've made the changes.



  • NigelBayliss Tuesday, April 11, 2017

    Hi -

    Regarding HLL - we don't have plans to make HLL tunable at the moment. It works well for large data sets and incremental stats maintenance is not usually needed for small datasets. The hash structures are not particularly large anyway.

    Correct: approx count distinct was introduced in 12.1 and uses HLL, but new in 12.2 is HLL used for synopses.



  • kal Tuesday, April 28, 2020
    these is a bug which oracle does not drop orphaned partition on oracle internal table causing the reach maximal partition.

    Then you will need to get manual store procedure to drop them
  • Nigel Bayliss Thursday, April 30, 2020
    Hi Kal,

    You are on 12.1 and ran into bug 22600570, which is marked "fixed in 19.1".

  • RICKY HIRST Friday, April 16, 2021
    In Oracle 12c R1 if we enable the use of synopses by setting the incremental preference for a partitioned table to true, and gathering the statistics on each partition as they are created.
    Do we then have to manually gather Global Statistics as well?
  • Nigel Bayliss Monday, April 19, 2021
    Hi Ricky,

    It depends. If you gather stats on the new partition using the default granularity, then you will get refreshed global stats. If you gather stats on the partition with granularity=>'partition' then you won't get fresh global stats. I would look at using gather_table_stats with options=>'gather auto' and let the DB figure it out. You can - for example - create a new partition and simply use gather_table_stats(options=>'gather auto'...) without even specifying a partition name and it will give you stats on the new partition and at the global level - leaving the fresh stats in other partitions untouched.
  • Mathavan Friday, May 28, 2021
    Hi Nigel,

    How to know if a given partition is using the new HLL algorithm (or) old-style algorithm ?
  • Nigel Bayliss Friday, May 28, 2021
  • Pete Thursday, July 8, 2021
    Nigel, you said this in a reply above "...You can - for example - create a new partition and simply use gather_table_stats(options=>'gather auto'...)", would you do that instead of gather_schema_stats(options=>'gather auto') using a table filter list (from one of your other articles about gather stats only when stale) ??
  • Nigel Bayliss Friday, July 9, 2021
    Hi Pete - this is a good point. It is related to a recent comment in part 1 of this series. Regarding the comment above - yes - gather table stats with GATHER AUTO has the desired effect but it is not necessary. Incremental stats tables are 'different', and using gather table stats with default options will refresh stats where needed and not blindly on all partitions and table-level. GATHER AUTO used with schema gather is different - but that is not obvious and it is something I should clarify in the body of this series. Gather schema stats uses the STALE_STATS flag to decide if stats should be regathered, so it behaves differently to gather table stats. So the answer to your question is that you can use either - but gather table stats will update the table-level stats but gather shcema might not depending on whether or not the table is marked stale at the global level.
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.