Insights into Statistics, Query Optimization and the Oracle Optimizer

How do I create statistics to make ‘small’ objects appear ‘large’ to the Optimizer?

Maria Colgan
Distinguished Product Manager

I recently spoke with a customer who has a development environment that is a tiny fraction of the size of their production environment. His team has been tasked with identifying problem SQL statements in this development environment before new code is released into production.

The problem is that the objects in the development environment are so small that the execution plans selected there rarely reflect what actually happens in production.

To ensure the development environment accurately reflects production in the eyes of the Optimizer, the statistics used in the development environment must be the same as the statistics used in production. This can be achieved by exporting the statistics from production and importing them into the development environment. Even though the underlying objects are a fraction of the size of production, the Optimizer will see them as the same size and treat them the same way it would in production.

Below are the necessary steps to achieve this in their environment. I am using the SH sample schema as the application schema whose statistics we want to move from production to development.

Step 1. Create a staging table, in the production environment, where the statistics can be stored
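A sketch of this step using DBMS_STATS.CREATE_STAT_TABLE; the staging table name MY_STATS_TAB is a placeholder:

```sql
-- Run in production; MY_STATS_TAB is a hypothetical staging table name.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(
    ownname => 'SH',
    stattab => 'MY_STATS_TAB');
END;
/
```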

Step 2. Export the statistics for the application schema, from the data dictionary in production, into the staging table
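The export can be done with DBMS_STATS.EXPORT_SCHEMA_STATS; again, MY_STATS_TAB is a placeholder name:

```sql
-- Copies the SH schema statistics from the data dictionary
-- into the staging table created in Step 1.
BEGIN
  DBMS_STATS.EXPORT_SCHEMA_STATS(
    ownname => 'SH',
    stattab => 'MY_STATS_TAB');
END;
/
```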

Step 3. Create an Oracle directory on the production system where the export of the staging table will reside and grant the SH user the necessary privileges on it.
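For example (the directory name STATS_DIR and the file system path are placeholders):

```sql
-- Run as a privileged user on the production database.
CREATE DIRECTORY stats_dir AS '/u01/stats_exp';
GRANT READ, WRITE ON DIRECTORY stats_dir TO sh;
```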

Step 4. Export the staging table from production using data pump export
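A sketch of the Data Pump export, run from the operating system prompt on the production host (the dump file and log file names are placeholders):

```
expdp sh directory=stats_dir dumpfile=sh_stats.dmp \
      logfile=sh_stats_exp.log tables=my_stats_tab
```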

Step 5. Copy the dump file containing the stating table from production to development
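Any file transfer mechanism works here; for example, with scp (host name and paths are placeholders):

```
scp /u01/stats_exp/sh_stats.dmp devhost:/u01/stats_imp/
```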

Step 6. Create an Oracle directory on the development system where the export of the staging table resides and grant the SH user the necessary privileges on it. 
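This mirrors Step 3, now on the development database (directory name and path are placeholders):

```sql
-- Run as a privileged user on the development database.
CREATE DIRECTORY stats_dir AS '/u01/stats_imp';
GRANT READ, WRITE ON DIRECTORY stats_dir TO sh;
```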

Step 7. Import the staging table into the development environment using data pump import
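A sketch of the Data Pump import on the development host, matching the placeholder names used above:

```
impdp sh directory=stats_dir dumpfile=sh_stats.dmp \
      logfile=sh_stats_imp.log tables=my_stats_tab
```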

Step 8. Import the statistics from the staging table into the dictionary in the development environment.
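The final step uses DBMS_STATS.IMPORT_SCHEMA_STATS to move the statistics from the staging table into the development data dictionary:

```sql
-- After this call the Optimizer in development sees
-- the production statistics for the SH schema.
BEGIN
  DBMS_STATS.IMPORT_SCHEMA_STATS(
    ownname => 'SH',
    stattab => 'MY_STATS_TAB');
END;
/
```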

You can get a copy of the script I used to generate this post here.

+Maria Colgan


Comments ( 12 )
  • Connor McDonald Thursday, November 22, 2012

    Hi Maria,

    Here's my issue with this approach :-)

    It's making the presumption that merely by examination, you can assess the worthiness of an execution plan. That's just a reinforcement of the age-old problem:

    "I saw TABLE ACCESS FULL, so it must be a bad plan"

    Let's say I've copied the stats from my large environment back to my small environment, and the plan comes up as:


    | Id | Operation        | Name |
    |  0 | SELECT STATEMENT |      |

    Now what? Is that the best plan? Is it the worst one? I can hunt around the row estimates etc. for each line, but what if they are off (e.g. the moment I have a couple of joins in there)?

    I can't use 'gather_plan_statistics' because the actual-vs-estimate comparison wouldn't make sense because of the disparate stats.

    I can't actually *run* it, because (if my dev environment is tiny), then the response might be super fast (when it might not be in production)...or alternatively, it might be super slow (when it might have been super fast in production). I can't actually make a decision on that plan based on its response time.

    I'm only pontificating because (in my humble opinion) perhaps the largest threat to production performance is unrealistic or unrepresentative data sample sizes in non-production environments.

    Copying stats around serves to perpetuate that problem rather than resolve it.

    I would have pressed 'Post' 10 mins ago...but I'm still struggling on the proof-of-non-spammer maths question :-)



  • guest Tuesday, December 4, 2012

    Will those imported statistics be overwritten by the regular job that gathers statistics (usually during the night)? Or will running dbms_stats.import_schema_stats() somehow lock them?

  • guest Tuesday, December 4, 2012

    To Connor:

    I see your point. Our production system is OLTP, where 95% of the queries return less than 50 rows in less than 250ms. The database has hundreds of tables with 50+ million rows each. A full table scan of any such big table is absolutely the wrong thing to do in our case. Although Maria's solution may not work in all environments, it will work in our environment.

    My problem however is this: the development environment is ALWAYS ALWAYS ahead of the production environment, i.e. new columns, new tables, etc. How will this export/import stats approach work with this in mind?

  • Maria Colgan Tuesday, December 4, 2012

    Unfortunately, importing statistics will not automatically lock the statistics, so you will either have to disable the automatic statistics gathering job or manually lock the statistics after the import.
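
    For example, locking all of the SH schema statistics after the import can be sketched as:

    ```sql
    BEGIN
      DBMS_STATS.LOCK_SCHEMA_STATS(ownname => 'SH');
    END;
    /
    ```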

  • guest Wednesday, January 2, 2013


    I understand what you're saying, but I think that the purpose of transporting the stats is not to provide actual timings of different query execution path runtimes that can then be compared to production; but instead to provide comparisons of different query execution path runtimes within DEV only. E.g. does execution path A produce a faster or slower response compared to execution path B, when both A & B have access to the same stats.

    You can then attempt to predict the production execution path and tune the query on the smaller subset of data.

    I do have to admit that some of the 11g features, like SQL Plan Management, make this approach slightly redundant.

  • guest Wednesday, January 23, 2013


    I think you need to think about having a schema that is an exact copy of prod.

  • @Roderick Tuesday, January 29, 2013

    Replacing the stats is just not a controlled and predictable approach. The subject just makes this technique sound more sophisticated, but all knowledge is good :)

  • guest Monday, February 4, 2013

    Many shops simply can't afford to have large development environments, so replacing the stats seems to be about as close as some may get to replicating their production databases' object statistics.

  • guest Monday, March 11, 2013

    @guest with the extra columns: Sounds like you have a "developmentesting" environment and have put your finger on one of the many problems with this all-too-common setup.

  • guest Tuesday, April 9, 2013

    This only works well if dev/UAT have exactly the same DDL structures in place, i.e. prod could have additional partitions etc. Will this break the import of the stats?

  • guest Tuesday, April 9, 2013

    You are correct. The assumption here is that the DDL for the objects in the test environment is identical to production.



  • Richard Armstrong-Finnerty Saturday, June 22, 2013

    One problem with this approach is that any CBO mechanisms that relate to actual data values, such as Adaptive Cursor Sharing or Bind Variable Peeking, may not produce the same effects as with the parent table, as there could be swathes of absent values in the smaller table.
