X

Insights into Statistics, Query Optimization and the Oracle Optimizer

The Oracle Optimizer and ADWC - Statistics and Bulk Load

By: Nigel Bayliss | Product Manager

It's time for some posts on the Oracle Autonomous Data Warehouse Cloud and the enhancements we've made to Statistics Management and the Oracle Optimizer. This is Part 1, and it covers statistics maintenance during bulk load. I'll add links to the other posts when they become available. 

My scripts on the topic of autonomous are stored in GitHub here. The scripts for this post are here.

Statistics and Bulk Loading

Consider an empty fact table called FACT1. Let's populate it with 10,000 rows using a bulk transform and load operation from FACT1_SOURCE:

SQL> insert /*+ APPEND */ into fact1 select num0,1,txt1 from fact1_source;
10,000 rows inserted.
Elapsed: 00:00:00.519

-- Take a look at stats...

select table_name,num_rows,sample_size,stale_stats 
from   user_tab_statistics where  table_name = 'FACT1';

TABLE_NAME                       NUM_ROWS SAMPLE_SIZE STALE
------------------------------ ---------- ----------- -----
FACT1                               10000       10000 NO

We can see that there are valid statistics on FACT1, and this will come as no surprise if you are familiar with online statistics gathering

Now let's bulk load another 10,000 rows...

SQL> insert /*+ APPEND */ into fact1 select num0,2,txt1 from fact1_source;
10,000 rows inserted.
Elapsed: 00:00:00.414
...
TABLE_NAME                       NUM_ROWS SAMPLE_SIZE STALE
------------------------------ ---------- ----------- -----
FACT1                               20000       20000 NO

This is new behavior: ADWC maintains statistics on each subsequent direct path load. The demo script continues to insert more rows (from FACT1 so that we rapidly increase the row count) until finally...

SQL> insert /*+ APPEND */ into fact1 select num0,16,txt1 from fact1;
81,920,000 rows inserted.
Elapsed: 00:00:49.970
...
TABLE_NAME                     NUM_ROWS   SAMPLE_SIZE STALE
------------------------------ ---------- ----------- -----
FACT1                          163840000  163840000   NO

What about histograms?

SQL> select table_name,column_name,low_value,high_value,sample_size,histogram 
     from   user_tab_col_statistics 
     where  table_name = 'FACT1';

TABLE_NAME  COLUMN_NAME  LOW_VALUE  HIGH_VALUE      SAMPLE_SIZE HISTOGRAM
----------- ------------ ---------- --------------- ----------- ---------------
FACT1       NUM0         C102       C302            163840000   HYBRID
FACT1       NUM1         C102       C111            163840000   FREQUENCY
FACT1       TXT1         58585831   58585839393939  163840000   HYBRID

We have histograms too!

Summary

I connected to ADWC using the HIGH consumer group and within two or three minutes I populated FACT1 with over 160 million rows using a series of bulk load operations. Statistics (including histograms) were maintained automatically.

Not bad!

Part 2 of this series is here.

 

 

 

 

 

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha
Oracle

Integrated Cloud Applications & Platform Services