
Information, tips, tricks and sample code for Big Data Warehousing in an autonomous, cloud-driven world

Big Data SQL Quick Start. Predicate Push Down - Part 6.

Alexey Filanovskiy
Product Manager


Since version 3.0, Big Data SQL has an awesome feature - Predicate Push Down (PPD). It works over ORC files, Parquet files and a few NoSQL databases (like HBase). I plan to devote a separate topic to NoSQL databases and Big Data SQL; today I'm going to explain the ORC and Parquet file case. Both file formats use the same approach, so I'll explain the ORC case (Parquet works the same way). If you are not familiar with those file types, I advise you to read this documentation about the Parquet file format and this one about ORC. These files carry some metadata inside: ORC files store the minimum and maximum value for every 10 000 rows (that's the default, defined by the orc.row.index.stride parameter).
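If you ever need a different stride, it can be set as a table property when the ORC table is created in Hive. Here is a minimal sketch (the table and column names are invented for illustration):

CREATE TABLE stock_prices_orc (
  ticker STRING,
  price  DOUBLE,
  mnth   INT
)
STORED AS ORC
TBLPROPERTIES ("orc.row.index.stride"="10000"); -- keep min/max stats for every 10 000 rows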

This index means that with a fast lookup we can decide whether to scan a given stripe or not. If we filter on a fairly unique column (like a primary key), we can skip a lot of IO operations. Let's have an example: I have a table that contains two stripes and I want to scan it using a predicate on column B.

If I use the "B=6" predicate in the "WHERE" clause, I will skip the scan of the first stripe. If I use "B=15", I will skip both stripes.
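For illustration, suppose the index holds these (hypothetical) values:

stripe 1: min(B) = 7, max(B) = 14
stripe 2: min(B) = 1, max(B) = 6

For "B=6" the lookup shows that stripe 1 cannot contain the value (6 < 7), so only stripe 2 is read; for "B=15" neither range covers the value, so both stripes are skipped without any data reads.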

Big Data SQL workflow.

Now it's a good time to explain what the Big Data SQL workflow looks like. Let's imagine that we have a table stock_prices in HDFS, stored as ORC (or Parquet). Over this table we run the following query:

SQL> SELECT ticker, AVG(price) OVER (PARTITION BY ticker ORDER BY mnth) as avg_price
FROM stock_prices
WHERE
mnth < :x
AND mnth > :y
AND stx_xchange = :z
ORDER BY mnth

Here is the workflow:

1) Reading Data from disks to the External Table Services Level

On the first step we read data from disk, and we can do a few very significant optimizations here:

- Column pruning. If the table stock_prices has 100 columns but the query references only 4 of them (ticker, price, mnth, stx_xchange), the other 96 will not be read from disk.

- Predicate Push Down. In the given example we have two columns in the WHERE clause and the following predicates to be pushed:

mnth < :x AND mnth > :y
AND stx_xchange = :z

Using the ORC index we can eliminate part of the reads: we skip the data that definitely does not match our WHERE predicates and read only the data that may contain matching rows.

I borrowed a picture from this presentation about the Parquet format, which illustrates perfectly what we can achieve with Column Pruning and Predicate Push Down:

2) Transform data to Oracle RDBMS format and transfer it toward Smart Scan

On the second step we transform the data from the HDFS format (Binary, Avro, ORC, Parquet) to the Oracle RDBMS format. This is a very CPU-intensive step: here we apply the SerDe and convert the data to Oracle RDBMS types. If we pruned columns and rows on the previous step (thanks to the Predicate Push Down feature), we avoid burning a lot of CPU.
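To make this concrete, here is a rough sketch of what the external table for our example could look like (the column types and the Hive table name are my assumptions, not taken from a real system):

SQL> CREATE TABLE stock_prices (
  ticker      VARCHAR2(10),
  price       NUMBER,
  mnth        NUMBER,
  stx_xchange VARCHAR2(20)
)
ORGANIZATION EXTERNAL
  ( TYPE ORACLE_HIVE
    DEFAULT DIRECTORY "DEFAULT_DIR"
    ACCESS PARAMETERS (com.oracle.bigdata.tablename=orc.stock_prices)
  )
REJECT LIMIT UNLIMITED;

Every row that survives pruning still has to be converted from the Hive types (string, double, int) to the Oracle types above, and that conversion is where the CPU goes.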

3) Smart Scan

On the next step, Smart Scan is applied over the "WHERE" predicates:

mnth < :x AND mnth > :y
AND stx_xchange = :z

While Predicate Push Down eliminates part of the data, Smart Scan does the exact filtration. It can also perform more operations than just predicate filtering; the full list of offloadable functions can be found here:

SQL> SELECT * FROM v$sqlfn_metadata WHERE offloadable = 'YES';
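For example, you can check the function used in our query. The analytic (window) variant of AVG is the one that has to be finished on the database side, as the next step explains:

SQL> SELECT name, analytic, aggregate, offloadable
FROM v$sqlfn_metadata
WHERE name = 'AVG';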

4) Oracle RDBMS part.

This step is beyond my picture, but it exists. Not all functions can be applied on the Smart Scan level (like PL/SQL or window functions), and this part of the query passes to the database. In our example, AVG(price) OVER (PARTITION BY ticker ORDER BY mnth) as avg_price and the ORDER BY will be done on the database level.

The whole picture looks like this:

 

Good and bad use cases for Predicate Push Down.

For testing good and bad use cases for the Predicate Push Down feature, I took the store_sales table from Intel Big Bench. It's a relatively big table:

SQL> SELECT COUNT(1) FROM BDS.store_sales_orc;

6 385 178 703 rows

The data is compressed with the ZLIB codec (the default one for ORC). Note that hadoop fs -du shows two numbers: the raw data size and the space consumed including HDFS replication:

$ hadoop fs -du -s -h /user/hive/warehouse/orc.db/store_sales/
297.8 G  893.5 G  /user/hive/warehouse/orc.db/store_sales

From the statistics tables I found the most selective and the most unselective columns:

SQL> SELECT *  FROM (SELECT num_distinct,
               Column_name
          FROM user_tab_col_statistics
         WHERE table_name = UPPER('STORE_SALES_ORC')
         ORDER BY num_distinct FETCH FIRST 1 ROW ONLY)
UNION ALL
SELECT *
  FROM (SELECT num_distinct,
               Column_name
          FROM user_tab_col_statistics
         WHERE table_name = UPPER('STORE_SALES_ORC')
         ORDER BY num_distinct DESC FETCH FIRST 1 ROW ONLY);
NUM_DISTINCT  COLUMN_NAME
------------  ----------------
100           SS_QUANTITY
849805312     SS_TICKET_NUMBER

Now I will create two external tables: the first (STORE_SALES_ORC) has predicate pushdown enabled, the second (STORE_SALES_ORC_NOPPD) has it disabled. Predicate Push Down is enabled by default; if you want to disable it, you need to add the com.oracle.bigdata.ppd=FALSE parameter to the "create table" statement, like this:

CREATE TABLE STORE_SALES_ORC_NOPPD (
...
)
ORGANIZATION EXTERNAL
  ( TYPE ORACLE_HIVE
    DEFAULT DIRECTORY "DEFAULT_DIR"
    ACCESS PARAMETERS
    ( ...
      com.oracle.bigdata.ppd=FALSE
    )
  )
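To double-check which access parameters an external table was created with, you can query the data dictionary (a quick sanity check using the table we just created):

SQL> SELECT access_parameters
FROM user_external_tables
WHERE table_name = 'STORE_SALES_ORC_NOPPD';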

Now let's run the tests.

1) Filtering using an unselective predicate:

SQL> SELECT COUNT(1) FROM store_sales_orc WHERE SS_QUANTITY=6;

63849869 

done in 89 seconds

SQL> SELECT COUNT(1) FROM store_sales_orc_noppd WHERE SS_QUANTITY=6; 

63849869 

done in 84 seconds

2) Filtering using a selective predicate:

SQL> SELECT COUNT(1) FROM store_sales_orc WHERE SS_TICKET_NUMBER=187378860; 

2

done in 9 seconds

SQL> SELECT COUNT(1) FROM store_sales_orc_noppd WHERE SS_TICKET_NUMBER=187378860; 

2

done in 87 seconds

3) Joins.

The Predicate Push Down feature can even accelerate join operations. Not all joins, but joins where a predicate operator exists, even an implicit one, as in a nested loop.

Let's join our big fact tables (store_sales_orc and store_sales_orc_noppd, each one 6 385 178 703 rows) with the small table date_dim_orc (which is only 109 573 rows). I enforce the nested loop with the corresponding hint, but I did this for demo purposes only; don't do this on your production system, let the optimizer decide which join type to use.

Here is the join over the table with the PPD feature enabled:

SQL> SELECT /*+ use_nl(store_sales_orc)*/
 dt.d_year,
 SUM(ss_ext_sales_price) sum_agg
  FROM date_dim_orc    dt,
       store_sales_orc 
WHERE dt.d_date_sk = store_sales_orc.ss_sold_date_sk
AND dt.d_moy = 12
AND dt.d_dom = 24
AND dt.d_year = 2004
GROUP BY dt.d_year
ORDER BY dt.d_year,sum_agg DESC
FETCH FIRST 100 ROWS ONLY;

and over the table where the PPD feature is disabled:

SQL> SELECT /*+ use_nl(store_sales_orc_noppd)*/
 dt.d_year,
 SUM(ss_ext_sales_price) sum_agg
FROM date_dim_orc    dt,
store_sales_orc_noppd 
WHERE dt.d_date_sk = store_sales_orc_noppd.ss_sold_date_sk
AND dt.d_moy = 12
AND dt.d_dom = 24
AND dt.d_year = 2004
GROUP BY dt.d_year
ORDER BY dt.d_year,sum_agg DESC
FETCH FIRST 100 ROWS ONLY;

Both queries have the same query plan:

which includes the predicates on the store_sales table:

But the join over the table with the PPD feature enabled was done in 11 seconds, while the join over the table with the PPD feature disabled took 128 seconds. We got this effect only because we need just a small dataset from the store_sales fact table. The feature will not be as efficient if we need the whole dataset or a significant part of it.

Note: this happens only if the join type is Nested Loop.

4) Cumulative predicates

Predicates are cumulative, which means that even if each individual predicate is unselective (which is bad for Big Data SQL), together they can give you a selective dataset, and performance will be great. I took the 5 most unselective predicates from the previous example and ran the query:

SQL> SELECT /*+ MONITOR*/ COUNT(1)   FROM store_sales_orc SS
WHERE ss.Ss_Quantity = 84  
AND ss.Ss_Store_Sk = 21  
AND ss.Ss_Sold_Date_Sk = 38046  
AND ss.Ss_Promo_Sk = 90  
AND ss.ss_hdemo_sk = 3702;
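Since the query is run with the MONITOR hint, you can pull up its SQL Monitor report afterwards and see how much IO the pushed predicates saved. A minimal sketch (substitute the sql_id of your statement, e.g. from v$sql_monitor):

SQL> SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(sql_id => '&sql_id', type => 'TEXT')
FROM dual;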

How to monitor predicate pushdown.

To determine whether it will work or not, you have to check the query plan. As usual, there are two common ways: either Enterprise Manager:

or the explain plan statement:

 

In both cases you have to check that the predicate exists over the big table that you store in HDFS.
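From SQL*Plus that check could look like this (a minimal sketch reusing one of the test queries from above; the exact plan output will vary):

SQL> EXPLAIN PLAN FOR
SELECT COUNT(1) FROM store_sales_orc WHERE SS_QUANTITY=6;

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(format => 'BASIC +PREDICATE'));

Look for the access/filter predicates attached to the external table access step in the plan.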

Findings.

1) Predicate Push Down works perfectly for selective queries (which return only a few rows).

2) Predicate Push Down works a bit slower over unselective columns (when we need to scan every stripe, plus pay the extra overhead of the index lookup). But the difference is so small that we can just neglect it.

3) Column projection always applies. And that's great, because it saves a lot of CPU by preventing unnecessary data type conversion.

4) Joins where we need a small piece of the data from a big fact table can also be accelerated (when a nested loop is used).

Join the discussion

Comments (1)
  • JK Hong Wednesday, April 24, 2019
    Thanks for the nice explanation