Optimizer Transformation: Join Predicate Pushdown

Happy New Year to all of our readers! We hope you all had a great holiday season. We start the new year by continuing our series on Optimizer transformations. This time it is the turn of Predicate Pushdown. I would like to thank Rafi Ahmed for the content of this blog.

Normally, a view cannot be joined with an index-based nested loop (i.e., index access) join, since a view, in contrast with a base table, does not have an index defined on it. A view can only be joined with other tables using three methods: hash, nested loop, and sort-merge joins.

Introduction

The join predicate pushdown (JPPD) transformation allows a view to be joined with index-based nested-loop join method, which may provide a more optimal alternative. In the join predicate pushdown transformation, the view remains a separate query block, but it contains the join predicate, which is pushed down from its containing query block into the view. The view thus becomes correlated and must be evaluated for each row of the outer query block. These pushed-down join predicates, once inside the view, open up new index access paths on the base tables inside the view; this allows the view to be joined with index-based nested-loop join method, thereby enabling the optimizer to select an efficient execution plan.

 The join predicate pushdown transformation is not always optimal. The join predicate pushed-down view becomes correlated and it must be evaluated for each outer row; if there is a large number of outer rows, the cost of evaluating the view multiple times may make the nested-loop join suboptimal, and therefore joining the view with hash or sort-merge join method may be more efficient.

The decision whether to push down join predicates into a view is determined by evaluating the costs of the outer query with and without the join predicate pushdown transformation under Oracle's cost-based query transformation framework. The join predicate pushdown transformation applies to both non-mergeable views and mergeable views and to pre-defined and inline views as well as to views generated internally by the optimizer during various transformations. The following shows the types of views on which join predicate pushdown is currently supported.

  • UNION ALL/UNION view
  • Outer-joined view
  • Anti-joined view
  • Semi-joined view
  • DISTINCT view
  • GROUP-BY view

Examples

Consider query A, which has an outer-joined view V. The view cannot be merged, as it contains two tables, and the join between these two tables must be performed before the join between the view and the outer table T4.

A:
SELECT T4.unique1, V.unique3
FROM T_4K T4,
           (SELECT T10.unique3, T10.hundred, T10.ten
            FROM T_5K T5, T_10K T10
            WHERE T5.unique3 = T10.unique3) V
WHERE T4.unique3 = V.hundred(+)
AND       T4.ten = V.ten(+)
AND       T4.thousand = 5;


The following shows the non-default plan for query A generated by disabling join predicate pushdown.

pred_push_plan1.png




When query A undergoes join predicate pushdown, it yields query B. Note that query B is expressed in a non-standard SQL and shows an internal representation of the query.

B:
SELECT T4.unique1, V.unique3
FROM T_4K T4,
           (SELECT T10.unique3, T10.hundred, T10.ten
            FROM T_5K T5, T_10K T10
            WHERE T5.unique3 = T10.unique3
            AND T4.unique3 = V.hundred(+)
            AND T4.ten = V.ten(+)
) V
WHERE T4.thousand = 5;


The execution plan for query B is shown below.

pred_push_plan2.png



























In the execution plan BX, note the keyword 'VIEW PUSHED PREDICATE' indicates that the view has undergone the join predicate pushdown transformation. The join predicates (shown here in red) have been moved into the view V; these join predicates open up index access paths thereby enabling index-based nested-loop join of the view. With join predicate pushdown, the cost of query A has come down from 62 to 32.

As mentioned earlier, the join predicate pushdown transformation is cost-based, and a join predicate pushed-down plan is selected only when it reduces the overall cost.

Consider another example of a query C, which contains a view with the UNION ALL set operator.

C:
SELECT R.unique1, V.unique3
FROM T_5K R,
           (SELECT T1.unique3, T2.unique1+T1.unique1
            FROM T_5K T1, T_10K T2
            WHERE T1.unique1 = T2.unique1
            UNION ALL
            SELECT T1.unique3, T2.unique2
            FROM G_4K T1, T_10K T2
            WHERE T1.unique1 = T2.unique1) V
WHERE R.unique3 = V.unique3 and R.thousand < 1;


The execution plan of query C is shown below.

pred_push_plan3.png



In the above, 'VIEW UNION ALL PUSHED PREDICATE' indicates that the UNION ALL view has undergone the join predicate pushdown transformation. As can be seen, here the join predicate has been replicated and pushed inside every branch of the UNION ALL view. The join predicates (shown here in red) open up index access paths thereby enabling index-based nested loop join of the view.

Consider query D as an example of join predicate pushdown into a distinct view. We have the following cardinalities of the tables involved in query D: Sales (1,016,271), Customers (50,000), and Costs (787,766).

D:
SELECT C.cust_last_name, C.cust_city
FROM customers C,
           (SELECT DISTINCT S.cust_id
            FROM sales S, costs CT
            WHERE S.prod_id = CT.prod_id and CT.unit_price > 70) V
WHERE C.cust_state_province = 'CA' and C.cust_id = V.cust_id;


The execution plan of query D is shown below.

pred_push_plan4.png




















As shown in XD, when query D undergoes join predicate pushdown transformation, the expensive DISTINCT operator is removed and the join is converted into a semi-join; this is possible, since all the SELECT list items of the view participate in an equi-join with the outer tables. Under similar conditions, when a group-by view undergoes join predicate pushdown transformation, the expensive group-by operator can also be removed.

With the join predicate pushdown transformation, the elapsed time of query D came down from 63 seconds to 5 seconds.

Since distinct and group-by views are mergeable views, the cost-based transformation framework also compares the cost of merging the view with that of join predicate pushdown in selecting the most optimal execution plan.

Summary

We have tried to illustrate the basic ideas behind join predicate pushdown on different types of views by showing example queries that are quite simple. Oracle can handle far more complex queries and other types of views not shown here in the examples. Again many thanks to Rafi Ahmed for the content of this blog post.


Comments:

Hi

I have a situation that returned rows from subquery "ps" should match only to one table in subquery "du" (standard EBS tables) therefore there is additional filter "te.tab = du.tab" which eliminates id matches from other tables. from plan I see that retuned row id from subquery "ps" will be searched in all three subquerys tables and then results will be filtered by this condition "te.tab = du.tab". Is this all three table access is unavaoidable? Can this query be rewritten or hinted somehow that only one table is accessed for a row from "ps" as there is filter avaliable?

WITH ps AS
(SELECT /*+ materialize*/ 8380 tab_id, 'PA' tab FROM DUAL UNION ALL
SELECT 10096, 'AP' FROM DUAL UNION ALL
SELECT 1000, 'AR' FROM DUAL
)
SELECT *
FROM ps te
,(SELECT customer_trx_id tab_id
,'AR' tab
FROM ra_customer_trx_all
UNION ALL
SELECT project_id
,'PA'
FROM pa_projects_all
UNION ALL
SELECT invoice_id
,'AP'
FROM ap_invoices_all
) du
WHERE te.tab_id = du.tab_id
AND te.tab = du.tab

--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 18 | 12 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 18 | 12 (0)| 00:00:01 |
| 2 | VIEW | | 3 | 24 | 6 (0)| 00:00:01 |
| 3 | UNION-ALL | | | | | |
| 4 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 6 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
|* 7 | VIEW | | 1 | 10 | 2 (0)| 00:00:01 |
| 8 | UNION-ALL PARTITION| | | | | |
|* 9 | INDEX UNIQUE SCAN | RA_CUSTOMER_TRX_U1 | 1 | 6 | 2 (0)| 00:00:01 |
|* 10 | INDEX UNIQUE SCAN | PA_PROJECTS_U1 | 1 | 5 | 1 (0)| 00:00:01 |
|* 11 | INDEX UNIQUE SCAN | AP_INVOICES_U1 | 1 | 6 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

7 - filter("TE"."TAB"="DU"."TAB")
9 - access("CUSTOMER_TRX_ID"="TE"."TAB_ID")
10 - access("PROJECT_ID"="TE"."TAB_ID")
11 - access("INVOICE_ID"="TE"."TAB_ID")

----------------------------------------------

Tried to "trick" query to push predicate "CASE WHEN te.tab = du.tab THEN du.tab_id ELSE NULL END" which would retun NULL if id is looked up in wrong table there fore avoiding unique index scan as there is is not null contraint for access columns, but somehow it did not worked as it seems it does not like when there is columns from "du" subquery in both sides of predicate.

WITH ps AS
(SELECT /*+ materialize*/ 8380 tab_id, 'PA' tab FROM DUAL UNION ALL
SELECT 10096, 'AP' FROM DUAL UNION ALL
SELECT 1000, 'AR' FROM DUAL
)
SELECT *
FROM ps te
,(SELECT customer_trx_id tab_id
,'AR' tab
FROM ra_customer_trx_all
UNION ALL
SELECT project_id
,'PA'
FROM pa_projects_all
UNION ALL
SELECT invoice_id
,'AP'
FROM ap_invoices_all
) du
WHERE du.tab_id = CASE WHEN te.tab = du.tab THEN ps.tab_id ELSE NULL END

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 125 | 3651 (10)| 00:00:13 |
| 1 | NESTED LOOPS | | 5 | 125 | 3651 (10)| 00:00:13 |
| 2 | VIEW | | 3 | 24 | 6 (0)| 00:00:01 |
| 3 | UNION-ALL | | | | | |
| 4 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 6 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
|* 7 | VIEW | | 2 | 34 | 1215 (10)| 00:00:05 |
| 8 | UNION-ALL | | | | | |
| 9 | INDEX FAST FULL SCAN| RA_CUSTOMER_TRX_U1 | 1260K| 7384K| 433 (11)| 00:00:02 |
| 10 | INDEX FAST FULL SCAN| PA_PROJECTS_U1 | 110K| 537K| 38 (11)| 00:00:01 |
| 11 | INDEX FAST FULL SCAN| AP_INVOICES_U1 | 1785K| 10M| 744 (9)| 00:00:03 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

7 - filter("DU"."TAB_ID"=CASE WHEN ("TE"."TAB"="DU"."TAB") THEN "PS"."TAB_ID"
ELSE NULL END )

I know there can be flaws in my logic or asumpltions as I am beginer in this sql performance thing so please speare me :D .

My DB version is 11.2.0.2.0

Posted by guest on March 27, 2012 at 10:14 PM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

The Oracle Optimizer blog is written by members of the Optimizer development team. The goal of this blog is to provide an insight into the workings of the Optimizer and the statistics it relies on. The views expressed on this blog are our own and do not necessarily reflect the views of Oracle and its affiliates. The views and opinions expressed by visitors on this blog are theirs solely and may not reflect ours.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today