A SQL Tuning Case Study: Could we K.I.S.S. Please?

direct shot:

Preface

They say, it pays to K.I.S.S (Keep It Simple and Smart). The simplest ideas are the greatest ideas.This philosophy also applies to sql coding when we tell the oracle SQL engine what to fetch and how to fetch it (Indirectly). There are great rewards for simplistic thinking.
Lets discuss this principle through a case study in which the user was trying to fetch very simple data through an over-kill query, involving  RA_* views which were internally using some of the same tables. The SQL engine was therefore just trying to fetch and filter too much data from the same sources. It was not certainly efficient.
In this case study, we see how its important to understand the "Why" of a query, rather than the "How" of a query. Many times, just rethinking the whole approach pays great dividends.

The Poorly Performing Query

The client had a custom concurrent program which had this query as the worst offender.. it was taking 90% of the time and the program used to take ~ 1 hour to finish.
The formatted query is listed below:
SELECT SUBSTR(A_CUST.CUSTOMER_NUMBER,1,8) CUST_NUMBER, 
       HZ_CUST_SITE_USE.LAST_UPDATE_DATE LAST_UPDATE_DATE,
       SUBSTR(A_CUST.CUSTOMER_NUMBER,1,8) BILL_CUST_ID
FROM
    RA_CUSTOMERS A_CUST,
    HZ_CUST_ACCT_SITES_ALL HZ_CUST_ACCT_SITE,
    HZ_CUST_SITE_USES_ALL HZ_CUST_SITE_USE,
    RA_ADDRESSES_ALL RAA
WHERE
  A_CUST.CUSTOMER_NUMBER = :B1
  AND A_CUST.CUSTOMER_ID =  HZ_CUST_ACCT_SITE.CUST_ACCOUNT_ID
  AND HZ_CUST_ACCT_SITE.CUST_ACCT_SITE_ID = 
      HZ_CUST_SITE_USE.CUST_ACCT_SITE_ID
  AND HZ_CUST_ACCT_SITE.PARTY_SITE_ID =  RAA.PARTY_SITE_ID
  AND HZ_CUST_ACCT_SITE.STATUS = 'A'
  AND HZ_CUST_SITE_USE.STATUS = 'A'
  AND HZ_CUST_SITE_USE.SITE_USE_CODE =  'BILL_TO'
  AND RAA.ADDRESS1 = 'ICC LEASE'
/
Lets look at the TKPROF stats and runtime SQL plan used along with row counts.
plan of bad iw query:
Its interesting to observe that while only ~3000 rows are returned as the output of this sql, a huge number of rows were fetched from one of the HZ* tables.
Lets check the underlying tables being used by the Views RA_ADDRESSES_ALL and RA_CUSTOMERS. RA_ADDRESSES_ALL was a synonym for RA_ADDRESSES_MORG, while RA_CUSTOMERS was a synonym for RA_HCUSTOMERS.
  1  select referenced_name, referenced_type
  2  from dba_dependencies
  3* where name='RA_HCUSTOMERS'
SQL> /
REFERENCED_NAME         REFERENCED_TYPE
----------------------- -----------------
HZ_CUST_ACCOUNTS        TABLE
HZ_PARTIES              TABLE
SQL> l
  1  select referenced_name, referenced_type from dba_dependencies
  2  where name='RA_ADDRESSES_MORG'
  3* and referenced_type='TABLE'
SQL> /
REFERENCED_NAME                REFERENCED_TYPE
------------------------------ -----------------
HZ_PARTY_SITES                 TABLE
HZ_LOCATIONS                   TABLE
HZ_CUST_ACCT_SITES_ALL         TABLE
HZ_LOC_ASSIGNMENTS             TABLE
Lets also check the data volume involved in the base HZ* tables being referred and the percetange of estimation sample Percentage used.. (~70%) by Gather statistics program:
hz table stats information:

Do we need this complexity?


This is a good time for Reality Check: This is certainly overkill.
Why would you want to join complex views with 3-4 base tables being referred inside the views. Not an intelligent approach, shall we say? Better, why wouldnt one want to query the same data by understanding the data model relationship between the base tables involved and see if the same results can be achieved through them.
Spending time on checking the data volume involved, the % Estimation by Gather Stats, index cardinality etc would not lead to anything conclusive or direct inference in this case.
Another good learning from the http://etrm.oracle.com website (requires a Metalink Login: Note:150230.1) regarding the RA_* views is that they should not be used directly in ANY CUSTOMIZATIONS ever! All the more reason to eliminate the views from the query altogether and find an alternate way of getting the same result.
RA_ADDRESSES_MORG internal only:
ra_hcustomers internal only:

The Better Way...


Now, Only 1 column was being really queried. Inspite of this, after understanding the object model relationship diagrams for HZ* tables, at least five HZ* tables were required to be queried for a supplied customer_number value.

What was the SQL trying to achieve:


For a particular Customer_number, It wanted the find the Last_update_date for an ACTIVE Site which was being used as an ACTIVE BILL_TO location, with an address of ICC LEASE in its information.
Lets look at the index information:
select column_position, column_name
from dba_ind_columns
where index_name='&1'
/
Enter value for 1: HZ_LOCATIONS_N1
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 ADDRESS1
SQL> /
Enter value for 1: HZ_PARTY_SITES_U1
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 PARTY_SITE_ID
SQL> /
Enter value for 1: HZ_PARTY_SITES_N2
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 LOCATION_ID
SQL> /
Enter value for 1: HZ_CUST_ACCT_SITES_N1
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 PARTY_SITE_ID
SQL> /
Enter value for 1: HZ_CUST_SITE_USES_N1
COLUMN_POSITION COLUMN_NAME
--------------- ------------------------------
              1 CUST_ACCT_SITE_ID
              2 SITE_USE_CODE
              3 PRIMARY_FLAG
The same SQL query could be re-written such that it fetched the same data using base HZ_* tables, after connecting the foreign Key relationships.
Such an approach had many advantages:
  • RA_* views avoided as they are not supported except for in standard 11i products, also, its highly inefficient to access 7 tables/views for getting 1 column
  • The new query fetches less than half the number of consistent read buffers as compared to the previous query (verified using set autotrace on sql command)
  • The new query has just 14 access steps as opposed to 21 access steps for previous un-tuned query
  • The new query uses three unique Primery Key indexes (primary key) out of the five tables and two Foriegn key indexes for the remaining two, which is a really selective way of querying data
It simply cannot get better than that!
The performance improvement was very visible as the improved query ran within 2 minutes. Compare that with its previous cousin which used to run for ~1 hour.
SELECT SUBSTR(hca.account_NUMBER,1,8) CUST_NUMBER, 
       hcsu.LAST_UPDATE_DATE LAST_UPDATE_DATE,
       SUBSTR(hca.account_NUMBER,1,8) BILL_CUST_ID
FROM
   hz_cust_accounts       hca,
   hz_cust_acct_sites_all hcas,
   hz_cust_site_uses_all  hcsu,
   hz_party_sites         hps,
   hz_locations           loc
WHERE
--
-- Joining all the tables through primray keys
-- and foreign key relationships first
--
      hca.account_number = :B1
  and hca.cust_account_id = hcas.cust_account_id
  and hcas.party_site_id = hps.party_site_id
  and hca.party_id = hps.party_id
  and hps.location_id = loc.location_id
  and hcas.cust_acct_site_id = hcsu.cust_acct_site_id
--
-- the actual data query criteria starts here
--
  AND hcas.STATUS = 'A'
  AND hcsu.STATUS = 'A'
  AND hcsu.SITE_USE_CODE =  'BILL_TO'
  AND loc.ADDRESS1 = 'ICC LEASE'
--
-- Could perhaps add this criteria too ???
-- AND hcas.bill_to_flag = 'Y'
--
/
Before:
before - iw query plan and autotrace stats:
After:
improved iw query plan and stats:

Summary

It pays to ask the "why" questions sometimes, rather than focusing on "How" and "What" and going on a wild goose chase. The "Why" questions can get you more quickly to the answer by re-thinking the strategy.
Einstein himself said, "everything should be made as simple as possible, but no simpler."
The defence rests.

Related Links


Other Performance Related Articles

Comments:

Post a Comment:
  • HTML Syntax: NOT allowed
About

bocadmin_ww

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today